diff --git a/paper/paper.md b/paper/paper.md
index d0def282..4d7b332d 100644
--- a/paper/paper.md
+++ b/paper/paper.md
@@ -83,19 +83,7 @@ For a more detailed understanding, please refer to the comprehensive tutorial [p
 
 ![The workflow of the algorithm. The operations within this workflow include several blocks.\label{fig:figure1}](../img/workflow_en.png)
 
-The diagram below illustrates the workflow of the algorithm, consisting of several key blocks, each highlighted with a distinct color.
-
-- **First Block (Light Blue):** This block creates climate trees based on input climate data (CSV file) and validates the input parameters using a YAML file. More precisely, the climate trees were generated by calculating the pairwise differences between each value of the species' habitats, normalized between the minimum and maximum of the parameter. This process resulted in a symmetric square matrix. From this matrix, the climate tree was inferred using the Neighbor-Joining method. This involves processing climatic variables such as temperature, precipitation, and elevation to construct phylogenetic trees that represent the relationships between geographic locations based on their climatic similarity.
-
-- **Second Block (Light Green):** This block creates phylogenetic trees based on input genetic data and performs input parameter validation (refer to the [YAML file](../aphylogeo/params.yaml)). This entails aligning DNA or amino acid sequences, inferring phylogenetic relationships using various methods (e.g., maximum likelihood, Bayesian inference), and assessing the statistical support for the inferred tree topology.
-
-- **Third Block (Light Pink):** The third block, referred to as the phylogeography step, is the crux of the analysis. It compares the genetic trees (representing evolutionary relationships) with the climate trees (representing environmental similarity). This comparison utilizes either the Robinson-Foulds distance or the Least Squares distance to quantify the degree of congruence between the two tree types. The output of this step includes:
-
-- Topological congruence statistics: Quantifying the degree of similarity between the genetic and climate trees.
-- Co-phylogenetic visualizations: Graphical representations highlighting the associations between genetic lineages and climatic niches.
-- Statistical tests: Assessing the significance of the observed phylogeographic patterns.
-
-This third block is pivotal, forming the basis from which users obtain output data (i.e., name of gene, name of climate parameter, bootstrap value, Robinson-Foulds distance, entropy distance, least-square distance, the starting position and the ending position of windows, and climatic and genetic trees) with essential calculations (i.e., distances, tree inference, sequence alignment). Our approach is optimized to adapt to various computing environments through elasticity and utilize parallelism and available GPUs/CPUs based on resource usage per unit of computation. This flexibility enables efficient processing of a single genetic window, as outlined in the workflow below.
+The diagram below illustrates the workflow of the algorithm, consisting of several key blocks, each highlighted with a distinct color (refer to the [wiki page](https://github.com/tahiri-lab/aPhyloGeo/wiki/Worflow).
 
 ## Multiprocessing