Preparing genomic data for phylogeny reconstruction (GTN)

This workflow starts from a set of genome assemblies from different samples, strains, or species. Each genome is first annotated with Funannotate. The predicted proteins are further annotated with BUSCO. Next, Proteinortho finds orthologs across the samples and groups them into orthogroups. Orthogroups in which all samples are represented are extracted. The orthologs in each orthogroup are aligned with ClustalW. The alignments are cleaned with ClipKIT, and the concatenation matrix is built with PhyKIT. This matrix can then be used for phylogeny reconstruction.

## Associated Tutorial

This workflow is part of the tutorial [Preparing genomic data for phylogeny reconstruction (GTN)](https://training.galaxyproject.org/training-material/topics/ecology/tutorials/phylogeny-data-prep/tutorial.html), available in the [GTN](https://training.galaxyproject.org).

## Thanks to...

**Tutorial Author(s)**: [Miguel Roncoroni](https://training.galaxyproject.org/training-material/hall-of-fame/roncoronimiguel/), [Brigida Gallone](https://training.galaxyproject.org/training-material/hall-of-fame/brigidagallone/)

**Workflow Author(s)**: Miguel Roncoroni

[![gtn star logo followed by the word workflows](http://galaxy-training.s3-website.us-east-1.amazonaws.com/misc/gtn-workflows.png)](https://training.galaxyproject.org/training-material/)
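## Illustrative sketch

As a rough illustration of two steps in the workflow description above (extracting orthogroups in which all samples are represented, and joining per-orthogroup alignments into one concatenation matrix), the logic can be sketched in plain Python. This is not the workflow's implementation: in Galaxy these steps are performed by the tools named above. The sketch assumes a Proteinortho-style tab-separated table with three leading summary columns, one column per sample, and `*` marking a sample with no gene in that orthogroup. The function names, the sample names, and the `single_copy` option are hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical example table (assumed Proteinortho-style layout):
# three summary columns, then one column per sample; "*" means "no gene".
EXAMPLE = (
    "# Species\tGenes\tAlg.-Conn.\tA.faa\tB.faa\tC.faa\n"
    "3\t3\t1\ta1\tb1\tc1\n"
    "2\t2\t1\ta2\t*\tc2\n"
    "3\t4\t0.5\ta3\tb3,b4\tc3\n"
)


def complete_orthogroups(table_text, single_copy=False):
    """Return (samples, rows) for orthogroups in which every sample is present.

    With single_copy=True, orthogroups holding more than one gene for any
    sample (comma-separated cells) are also dropped. That extra filter is an
    assumption of this sketch, not something stated in the description.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    samples = header[3:]  # skip the three summary columns
    kept = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        cells = row[3:]
        if any(cell == "*" for cell in cells):
            continue  # at least one sample is missing from this orthogroup
        if single_copy and any("," in cell for cell in cells):
            continue  # a sample contributes paralogs
        kept.append(dict(zip(samples, cells)))
    return samples, kept


def concatenate(alignments, samples):
    """Join per-orthogroup alignments (dicts of sample -> aligned sequence)."""
    matrix = {sample: "" for sample in samples}
    for alignment in alignments:
        lengths = {len(alignment[sample]) for sample in samples}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        for sample in samples:
            matrix[sample] += alignment[sample]
    return matrix


if __name__ == "__main__":
    samples, kept = complete_orthogroups(EXAMPLE)
    print(len(kept), "orthogroups contain all of", samples)
```

Concatenation only makes sense when every sample appears in every alignment, which is why the filtering step comes first.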

License
https://spdx.org/licenses/CC-BY-4.0

Contents