Ancestors is a web server allowing one to easily and quickly perform the last three steps of the ancestral genome reconstruction procedure. It implements several alignment algorithms, an indel maximum likelihood solver and a context-dependent maximum likelihood substitution inference algorithm. The results presented by the server include the posterior probabilities for the last two steps of the ancestral genome reconstruction and the expected error rate of each ancestral base prediction.
Armadillo is a workflow platform dedicated to conducting computational biology simulations. A number of important bioinformatics tools have been included in the first release. The Armadillo platform is open-source and allows users to develop their own modules and/or integrate existing computer applications. Different complex bioinformatics tasks can be modeled and presented in a single workflow without any prior knowledge of programming techniques. Thus, Life Science students and researchers can develop themselves, with minimum effort, workflow prototypes in order to automate their bioinformatics routines on local workstations. The following popular Bioinformatics tasks can be performed using Armadillo:
Multiple Sequence Alignment
Automatic BLAST (using the NCBI server)
Evolutionary model inference
Inference of phylogenetic tree using Maximum likelihood and Maximum Parsimony
Inference of phylogenetic trees using the PHYLIP package
High-throughput screening (HTS) is a novel and efficient technology for drug discovery. It allows for screening of more than 100,000 compounds a day per screen and requires effective procedures for quality control. We have developed a method for evaluating a background surface of an HTS assay; it can be used to correct raw HTS data. This correction is necessary to take into account systematic errors that may affect the procedure of hit selection. The described method allows one to analyze experimental HTS data and determine trends and local fluctuations of the corresponding background surfaces. For a large amount of plates of the same assay, the deviations of the background surface from a plane are caused by systematic errors. Their influence can be minimized by the subtraction of the systematic background from the raw data.
In this program, classical linear redundancy analysis (Rao, 1964) and canonical correspondence analysis (ter Braak, 1986, 1987) are computed using multiple regressions followed by direct eigenanalysis of the matrix of fitted values. The method of calculation is described in Chapter 11 of Legendre & Legendre (1998). Polynomial RDA and CCA, which are generalizations of the linear forms, are implemented using a new approach proposed by Makarenkov and Legendre (1999, 2002). The polynomial methods are based on the use of polynomial multiple regression, during the first stage of RDA and CCA, instead of the multiple linear regression used in the linear forms. The explanatory variables are limited to their quadratic form in any term of the polynomial. The program produces the output required to draw biplot diagrams for linear and polynomial RDA or CCA. The explanatory variables can be represented in biplots in two different ways: (1) the individual terms of the polynomial equation can be represented as separate variables, or (2) one can choose to represent an explanatory variable using the multiple correlations (rescaled as required by the selected scaling method) of the canonical ordination axes against the linear and quadratic forms of the variable. A permutation procedure allows one to test the significance of the two models (linear and polynomial) and of the difference between them.
This program performs optimal variable weighting for ultrametric and additive tree clustering, following the method proposed by De Soete (1986, 1988), as well as for K-means partitioning. It is described in Makarenkov & Legendre.
The program, which is available free of charge to academic users, provides some improvements and extra options, compared to De Soete's (1988) program OVWTRE, which only implemented fitting to the first two families of clustering methods mentioned above. Given a rectangular data matrix Y (n,m), containing measurements of n objects on m variables, the procedure computes variable weights w(m) such that the resulting matrix of inter-object dissimilarities D(n,n) obtained from Y optimally satisfies either the ultrametric or the additive inequality, or optimally corresponds to a K-means partition with fixed number of groups K. The weights w are constrained to be nonnegative and their sum is equal to one. We used Polak-Ribiere's Conjugate Gradient Method (Numerical Recipes, Press et al., 1986) to carry out the optimization process. Once the optimal variable weights are obtained in the case of the ultrametric or additive tree clustering, the resulting inter-object dissimilarities D can be subjected to any of the existing ultrametric or additive tree fitting procedures. See De Soete (1986) or Makarenkov and Leclerc (1999) for an overview of these methods. It is worth noting that sometimes only a local minimum can be obtained as a final result. Hence a good choice of initial weights is essential. According to our investigations an initial guess consisting of all equal weights (as implemented in De Soete's software OVWTRE) cannot guarantee reaching the global minimum. An interesting feature of program OVW, compared to OVWTRE, is that it allows users to restart the optimization procedure any number of times, using different random initial guesses for the weights. As a consequence, OVW usually obtains better results than OVWTRE in the case of ultrametric and additive clustering; optimization for K-means partitionning is not available in OVWTRE. Moreover, the optimization in OVW is carried out in the way allowing to avoid a degenerate trivial solution in the case of the additive clustering. Such a solution consists of giving a weight of 1 to any one variable and weights of 0 to all other variables. Another important feature not mentioned by De Soete (1986, 1988) is that the global minimum of a function to minimize can be reached with several different sets of optimal weights w. This may lead to different inter-object dissimilarities matrices D, from which different hierarchies or additive trees can be inferred.
This program allows to compute in the optimal time the Robinson and Foulds topological distance between two or more additive trees given their distance matrices. The program is based on the algorithm proposed by Makarenkov and Leclerc (1999, b) for calculation of this distance. The algorithm implemented in this program uses the notion of circular orders to compare the topology of two trees. The algorithm described in Makarenkov and Leclerc (1999, b) has optimal time complexity, requiring O(n2) time when performed on two nxn distance matrices. The Robinson and Foulds topological distance is an important and frequently used tool to compare additive (phylogenetic) tree structures (see for instance Robinson and Foulds (1981) or Makarenkov and Leclerc (1999, a,b)). This distance is equal to the minimum number of elementary operations, consisting of merging or splitting nodes, necessary to transform one tree into the other. As proved in Robinson and Foulds (1981), it is also the number of bipartitions, or Buneman's splits (1971), which belong to exactly one of the two trees. If we deal with two unrooted trees having no internal vertices labeled according to the elements of the set X, the Robinson and Foulds distance of on the set X of n elements varies between 0 (when the trees are isomorphic) and 2n-6 (when all non-trivial bipartitions in two trees are different; a trivial bipartition corresponds to an edge incident to a leaf).