Abstract: In many studies it is of interest to extract, from various datasets, the phylogenetic trees that describe common evolution. We present a natural method to detect such trees. The method is based on an original algorithm that builds evolutionary scenarios of genes and reconciles individual gene trees into a supertree [Lyubetsky, Gorbunov, Biology Direct, 2012; Molecular Biology (Mosk), 2009, 2012; Information Processes, 2010; Problems of Information Transmission, 2011]. The algorithm implements the novel concepts of a gene evolutionary scenario and of time slices imposed on the supertree to describe particular types of gene evolution events, such as gene duplications, losses and horizontal transfers. Unlike traditional approaches to the NP-hard reconciliation problem, the algorithm has cubic complexity with respect to the size of the input data and is mathematically proven to find the true supertree as the global minimum of the functional used, namely the total cost of the individual tree reconciliations. The method is used to detect compatibility thresholds and to extract trees suitable for correct reconciliation. We define a natural measure of compatibility between the current tree and the supertree in terms of the cost of the individual gene evolution events inferred during reconciliation: the ratio of this cost to the number of edges in the current tree (the “conditional edge” cost). During the procedure, trees with higher conditional edge costs are removed from the input set and the reconciliation is recomputed. Simulation experiments show that the quality of the current supertree becomes stationary once the compatibility value enters a certain range. This range defines the set of trees that are compatible enough to describe common evolution, i.e., the supertree. Analyses of biological trees support this observation: the known correct topology of the supertree is inferred only after a certain number of trees has been discarded according to this threshold.
Given the very low time complexity and the mathematical correctness of the algorithm, the method is applicable to large and very large sets of trees, which can be useful in many types of large-scale analyses of tree data.
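The filtering procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the reconciliation algorithm itself is not reproduced and is replaced by a caller-supplied function. The names `GeneTree`, `filter_trees`, `toy_reconcile`, the fixed `threshold` parameter, and the toy costs are all assumptions made for illustration; in the method itself the threshold is detected from the stationarity of the supertree quality rather than given in advance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class GeneTree:
    """Hypothetical stand-in for a gene tree: only its name and edge count."""
    name: str
    n_edges: int


# Assumed interface: reconcile(trees) -> (supertree, reconciliation cost per tree name).
Reconciler = Callable[[List[GeneTree]], Tuple[object, Dict[str, float]]]


def conditional_edge_cost(cost: float, n_edges: int) -> float:
    """Ratio of a tree's reconciliation cost to its number of edges."""
    return cost / n_edges


def filter_trees(trees: List[GeneTree], reconcile: Reconciler,
                 threshold: float) -> Tuple[object, List[GeneTree]]:
    """Drop trees whose conditional edge cost exceeds the threshold,
    recomputing the reconciliation after every removal round."""
    current = list(trees)
    supertree, costs = reconcile(current)
    while True:
        kept = [t for t in current
                if conditional_edge_cost(costs[t.name], t.n_edges) <= threshold]
        # Stop when nothing was removed, or when removal would empty the set.
        if len(kept) == len(current) or not kept:
            return supertree, current
        current = kept
        supertree, costs = reconcile(current)


# Toy reconciler with made-up, fixed costs; a real one would build the supertree.
TOY_COSTS = {"A": 4.0, "B": 6.0, "C": 30.0}


def toy_reconcile(trees: List[GeneTree]) -> Tuple[object, Dict[str, float]]:
    label = "supertree(" + ",".join(t.name for t in trees) + ")"
    return label, {t.name: TOY_COSTS[t.name] for t in trees}


if __name__ == "__main__":
    trees = [GeneTree("A", 8), GeneTree("B", 6), GeneTree("C", 10)]
    supertree, kept = filter_trees(trees, toy_reconcile, threshold=1.5)
    print(supertree, [t.name for t in kept])
```

With the toy costs, tree C has a conditional edge cost of 3.0 and is discarded in the first round, after which the reconciliation is recomputed on the remaining trees A and B. The loop always terminates because the set of trees shrinks strictly in every round that does not return.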