ETE Toolkit: Phylogenomic Data Analysis and Visualization

Project ETE Toolkit

Jaime Huerta Cepas (Centro de Biotecnología y Genómica de Plantas)

Funding Cycle 2

Proposal Summary

To support the release and maintenance of a new version of the ETE Toolkit, including updated documentation and new features such as tree diff, tree-like regular expression searches, and large tree visualization.


ETE Toolkit

ETE is a Python framework that assists in the programmatic reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. These are the most common types of data obtained from the evolutionary analysis of genes and proteins coming from shotgun sequencing projects. ETE provides a comprehensive API to handle phylogenetic results and to integrate them into the scientific Python ecosystem (e.g. Jupyter notebooks). This includes a fully featured system for programmatic visualization of hierarchical trees that complements other Python libraries such as matplotlib, plotly, or seaborn. With more than ten years of continuous development, ETE is considered a standard tool for phylogenomic analysis, with more than 45,000 downloads from conda and over 850 citations in peer-reviewed papers, a number that is clearly increasing. The ETE team will accelerate the release of a new version of the ETE library that includes all the novel features and bug fixes accumulated over the past years; update the library documentation, tutorials, and cookbooks; and improve user support, unit tests, the continuous development system, and community engagement. These tasks are the weakest points in the project's sustainability: despite a large user community, ETE is developed by a small group of volunteer researchers without direct funding. The requested resources therefore represent the minimum needed to cope with the increasing demand for support, frequent updates, and academic use.

Key Personnel

Jaime Huerta Cepas
Renato Alves
Ziqi Deng
Ana Hernández
Francois Serra