reduce state of fields
TahiriNadia authored Jun 8, 2024
1 parent 4685ba1 commit baf9acc
Showing 1 changed file with 7 additions and 14 deletions.
21 changes: 7 additions & 14 deletions paper/paper.md
@@ -42,16 +42,13 @@ aas-journal: Astrophysical Journal <- The name of the AAS journal.
---

# Abstract
*aPhyloGeo* is a Python library designed to elucidate the complex relationship between species evolution and environmental pressures, with a particular focus on climate. By integrating genetic and climatic data, *aPhyloGeo* empowers researchers to investigate the mechanisms of evolutionary adaptation and pinpoint genetic regions potentially influenced by environmental factors.
*aPhyloGeo* is a Python library that explores the relationship between species evolution and environmental pressures, particularly climate. By integrating genetic and climatic data, it helps researchers investigate evolutionary adaptations and identify genetic regions influenced by environmental factors.

The software's core strength lies in its comprehensive phylogenetic analysis pipeline, encompassing three distinct levels of investigation: genetic relationships, climatic impact assessment, and biogeographic correlations. This multi-faceted approach facilitates a holistic understanding of how species evolve and adapt to their environments. For example, researchers studying the genetic basis of high-altitude adaptation in birds could utilize aPhyloGeo to construct phylogenetic trees from genetic data, analyze oxygen levels across different altitudes, and identify correlations between specific genes and hypoxic conditions.
The core feature of the software is a comprehensive phylogenetic analysis pipeline with three investigation levels: 1) genetic relationships, 2) climatic impact assessment, and 3) biogeographic correlations. This approach facilitates understanding how species adapt to their environments.

In another scenario, scientists investigating the impact of climate change on marine biodiversity could employ *aPhyloGeo* to examine the genetic diversity of coral species, assess changes in sea surface temperatures over time, and pinpoint genetic markers associated with thermal tolerance.
These examples demonstrate the wide range of research questions that *aPhyloGeo* can address, making it a valuable tool for evolutionary biologists, ecologists, and conservationists alike.
*aPhyloGeo* employs algorithms using metrics like least squares [@felsenstein1997alternating], Euclidean, and Robinson-Foulds [@robinson1981comparison] distances to ensure statistically sound correlations. Its modular structure offers flexibility, and its open-source nature promotes collaboration.

Underlying *aPhyloGeo*'s analyses are robust algorithms employing metrics such as least squares distance [@felsenstein1997alternating], Euclidean distance, and Robinson-Foulds distance [@robinson1981comparison] to quantify similarity across different levels. This approach ensures that the identification of correlations is statistically sound, while adhering to the principles of phylogenetic inference [@gascuel2006neighbor]. The software's modular design and Python interface offer flexibility, allowing users to tailor analyses to their specific research questions and datasets. Additionally, *aPhyloGeo*'s open-source nature fosters collaboration and transparency within the scientific community.
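
The three metrics named above can be illustrated with a minimal, standard-library-only sketch. It is not *aPhyloGeo*'s own implementation or API. The function names, the nested-tuple tree representation, and the pairwise-distance tables keyed by `frozenset` are all assumptions made for this example. The least-squares function assumes the common formulation as a sum of squared differences between the pairwise leaf distances of two trees.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def least_squares(d1, d2, taxa):
    """Sum of squared differences between two pairwise-distance tables.

    d1 and d2 map frozenset({taxon_i, taxon_j}) to a distance
    (hypothetical input format chosen for this sketch).
    """
    pairs = (frozenset(p) for p in combinations(taxa, 2))
    return sum((d1[p] - d2[p]) ** 2 for p in pairs)


def leaves(tree):
    """Set of leaf names in a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the tree, treated as unrooted."""
    all_leaves = leaves(tree)
    ref = min(all_leaves)  # reference leaf used to normalise each split
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        # Always store the side that does not contain the reference leaf.
        side = all_leaves - clade if ref in clade else clade
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            walk(child)

    walk(tree)
    return found


def robinson_foulds(t1, t2):
    """Number of bipartitions present in exactly one of the two trees."""
    if leaves(t1) != leaves(t2):
        raise ValueError("trees must share the same leaf set")
    return len(splits(t1) ^ splits(t2))


# Toy inputs (invented for illustration, not real data).
genetic_tree = ((("A", "B"), "C"), ("D", "E"))
climatic_tree = ((("A", "C"), "B"), ("D", "E"))
rf = robinson_foulds(genetic_tree, climatic_tree)
```

A Robinson-Foulds value of zero means the two trees have identical topologies, and larger values indicate more disagreement. In a phylogeographic setting, one tree would typically come from genetic data and the other from climatic data.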

By enabling researchers to explore the complex interplay between genetics and environment, *aPhyloGeo* contributes to a deeper understanding of evolutionary processes. This knowledge not only enhances our appreciation of the natural world but also informs conservation efforts in the face of climate change and other environmental challenges. By identifying genetic adaptations to changing environments, *aPhyloGeo* can help prioritize species and populations for conservation, ultimately contributing to the preservation of biodiversity on our planet.
Available as a PyPI package, *aPhyloGeo* enhances understanding of evolutionary processes and informs conservation efforts, helping prioritize species and populations for preservation in the face of climate change.

# Statement of Need

@@ -68,13 +65,9 @@ Given the urgency of the current climate crisis and the anticipated future chall
This research aims to make a significant contribution to our understanding of the evolving ecological landscape and provide the scientific community with robust tools for comprehensive analysis and interpretation. *aPhyloGeo* will enable researchers to unravel the complex interplay between genetics, geography, and environment, informing conservation strategies and predicting the impacts of ongoing environmental change.

# State of the Field - Advancements in Genomic Analysis
The field of genomic analysis has progressed significantly in recent years, notably in the creation of tools and algorithms to explore the intricate relationship between genetic variation and environmental factors. Our algorithm for identifying sub-sequences within genes [@nadia_tahiri-proc-scipy-2022], and its subsequent application to SARS-CoV-2 data in 2023 [@nadia_tahiri-proc-scipy-2023] enhance comprehension of the genetic underpinnings of adaptation across various species and environments.

In the broader field of phylogeography, substantial methodological advancements have also occurred. Several Python packages provide functionalities pertinent to phylogeographic analysis, but often in a fragmented way. Biopython [@cornish2021biopython], a cornerstone in bioinformatics, excels at handling genetic sequences and basic phylogenetic tasks, yet falls short in integrating environmental data. [DendroPy](https://pypi.org/project/DendroPy), a robust library for phylogenetic trees, aids in visualizing phylogeographic patterns but requires additional tools for comprehensive analysis. While [SciPy](https://pypi.org/project/scipy/)'s statistical utilities could be harnessed for custom analyses, its complexity demands a strong background in statistical programming. [GeoPandas](https://pypi.org/project/geopandas/), adept at handling geospatial data, is useful for mapping genetic or environmental distributions, but lacks seamless integration with genetic data analysis tools. In summary, while powerful individual tools exist, a comprehensive and user-friendly Python package specifically designed for phylogeographic analysis remains a gap to be filled.

Statistical approaches, including generalized linear models (GLMs) and mixed models, are increasingly used to investigate the relationship between genetic variation and environmental variables. These methods enable researchers to quantify the relative influence of various factors, such as climate, geography, and demography, on observed patterns of genetic diversity.
Advances in genomic analysis, notably an algorithm for identifying gene sub-sequences [@nadia_tahiri-proc-scipy-2022] and its application to SARS-CoV-2 data [@nadia_tahiri-proc-scipy-2023], enhance understanding of genetic adaptation across species and environments. In phylogeography, although several Python packages support parts of the analysis, none seamlessly integrates genetic and environmental data. Biopython handles genetic sequences well [@cornish2021biopython] but lacks environmental integration. DendroPy visualizes phylogeographic patterns but needs additional tools for comprehensive analysis (our team is currently developing [iPhyloGeo++](https://github.com/tahiri-lab/iPhyloGeo_plus_plus), a software tool). SciPy offers statistical utilities but requires expertise in statistical programming. GeoPandas manages geospatial data but lacks integration with genetic analysis tools. A user-friendly Python package designed specifically for phylogeographic analysis is still needed.

The continuous refinement of these tools and methodologies, coupled with the growing availability of high-throughput sequencing technologies and environmental data, has opened up exciting new research avenues in evolutionary biology, ecology, and conservation. *aPhyloGeo* builds upon these advancements, providing a unified platform for integrating genetic and climatic data to address a wide array of phylogeographic questions. By bridging the gap between genomics and environmental science, *aPhyloGeo* aims to contribute to a more comprehensive understanding of the forces shaping biodiversity in a changing world.
Statistical methods like generalized linear models and mixed models help quantify the relationship between genetic variation and environmental factors. The evolution of these tools alongside high-throughput sequencing and environmental data availability creates opportunities in evolutionary biology, ecology, and conservation. *aPhyloGeo* integrates genetic and climatic data, aiming to enhance understanding of biodiversity dynamics.

# Pipeline

@@ -125,6 +118,6 @@ By adhering to best practices in software development and embracing open-source

# Acknowledgements

This work was supported by the Natural Sciences and Engineering Research Council of Canada, Fonds de recherche du Québec - Nature et technologie, the University of Sherbrooke grant, and the Centre de recherche en écologie de l'UdeS (CREUS). The author would like to thank the Department of Computer Science, University of Sherbrooke, Quebec, Canada for providing the necessary resources to conduct this research. The computations were performed on resources provided by Compute Canada and Compute Quebec - the National and Provincial Infrastructure for High-Performance Computing and Data Storage. The authors would like to thank the students of the University of Sherbrooke and the Université du Québec à Montréal for their great contribution to the development of the software.
This work was supported by the Natural Sciences and Engineering Research Council of Canada, Fonds de recherche du Québec Nature et technologies, the University of Sherbrooke grant, and the Centre de recherche en écologie de l'UdeS (CREUS). The computations were performed on resources provided by Compute Canada and Compute Quebec, the national and provincial infrastructure for high-performance computing and data storage. The authors would like to thank the students of the University of Sherbrooke and the Université du Québec à Montréal for their significant contributions to the development of the software. Finally, the authors would like to thank the reviewers and the editor for their valuable comments.

# References
