This workflow starts from a set of genome assemblies from different samples, strains, or species. Each genome is first annotated with Funannotate, and the predicted proteins are further annotated with BUSCO. Next, Proteinortho finds orthologs across the samples and groups them into orthogroups. Orthogroups in which all samples are represented are extracted, and the orthologs in each orthogroup are aligned with ClustalW. The alignments are cleaned with ClipKIT, and the concatenation matrix is built with PhyKIT. This matrix can be used for phylogeny reconstruction.

## Associated Tutorial

This workflow is part of the tutorial [Preparing genomic data for phylogeny reconstruction](https://training.galaxyproject.org/training-material/topics/ecology/tutorials/phylogeny-data-prep/tutorial.html), available in the [GTN](https://training.galaxyproject.org).

## Thanks to...

**Workflow Author(s)**: Miguel Roncoroni

**Tutorial Author(s)**: [Miguel Roncoroni](https://training.galaxyproject.org/training-material/hall-of-fame/roncoronimiguel/), [Brigida Gallone](https://training.galaxyproject.org/training-material/hall-of-fame/brigidagallone/)
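
## Example: selecting complete orthogroups

The step that extracts orthogroups in which all samples are represented can be illustrated outside Galaxy with a few lines of Python. This is a minimal sketch, not the workflow's actual implementation. It assumes a Proteinortho-style tab-separated table in which the first three columns hold summary statistics, each remaining column corresponds to one sample, and `*` marks a sample with no protein in that orthogroup. The function name `complete_orthogroups`, the sample file names, and the protein identifiers are hypothetical.

```python
import csv
import io


def complete_orthogroups(table_text, missing="*"):
    """Keep only orthogroups in which every sample has at least one protein.

    Assumption: the first three columns are summary statistics and every
    following column holds the protein IDs of one sample.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    samples = header[3:]
    kept = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        members = row[3:]
        # A group is complete when no sample column is empty or marked missing.
        if len(members) == len(samples) and all(
            cell not in (missing, "") for cell in members
        ):
            kept.append(dict(zip(samples, members)))
    return samples, kept


# Hypothetical three-sample table; the second orthogroup lacks sampleB.
EXAMPLE = (
    "# Species\tGenes\tAlg.-Conn.\tsampleA.faa\tsampleB.faa\tsampleC.faa\n"
    "3\t3\t1\tA_001\tB_007\tC_013\n"
    "2\t2\t1\tA_002\t*\tC_020\n"
    "3\t4\t0.8\tA_003,A_004\tB_009\tC_021\n"
)

samples, kept = complete_orthogroups(EXAMPLE)
for group in kept:
    print(group)
```

In this sketch the third orthogroup is kept even though sampleA contributes two proteins, because the only criterion applied is that every sample is represented. Restricting the selection to single-copy orthogroups would be an additional filter that the description above does not state.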