Functional annotation helps bring biological meaning to genetic sequences. It is usually obtained through protein sequence similarity: if two sequences from two different organisms are very similar, one can infer that they likely encode the same biological function.
There are several main parameters that will impact the process of functional annotation:
- how distant the species being annotated is from the species that was actually annotated with experimental data (the reference)
- the correctness of the structural annotation of the sequences to which annotations will be transferred
- the type of functional annotation (full protein annotation, protein domains, etc.)
Several tools exist to functionally annotate sequences. They do not all require the same input, nor do they deliver the same output, and are therefore complementary.
# Requirements
In Ortho_KB, we integrate functional annotation from EggNOG, MapMan, TRAPID and InterPro.
## EggNOG
First, install emapper (eggNOG-mapper), for example with conda.
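A minimal sketch of such an installation, assuming the `eggnog-mapper` package from the bioconda channel. The environment name `eggnog` is a hypothetical choice and can be changed freely.

```shell
# Hypothetical environment name; pick any name you like.
ENV_NAME="eggnog"

# Assumption: eggnog-mapper is installed from the bioconda channel.
conda create -y -n "$ENV_NAME" -c conda-forge -c bioconda eggnog-mapper

# Activate the environment (requires conda to be initialised in your shell).
conda activate "$ENV_NAME"

# Check that the main executable is available.
emapper.py --version
```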
Then create the eggNOG database, which contains the ortholog groups and their functional annotation. In our case, we will choose the taxonomic group Viridiplantae, whose code is 33090 (see [http://eggnog5.embl.de/#/app/downloads](http://eggnog5.embl.de/#/app/downloads)).
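One possible way to do this is sketched below, using the helper scripts shipped with eggNOG-mapper. The directory `eggnog_data` and the database name `Viridiplantae` are hypothetical choices, and the flags shown are assumptions based on the eggNOG-mapper documentation: check `download_eggnog_data.py --help` and `create_dbs.py --help` for your installed version, as options may differ. Note that the download is large.

```shell
# Hypothetical location for the eggNOG data; the directory must exist beforehand.
DATA_DIR="$PWD/eggnog_data"
mkdir -p "$DATA_DIR"

# Download the eggNOG annotation database (-y answers "yes" to all prompts).
download_eggnog_data.py -y --data_dir "$DATA_DIR"

# Build a DIAMOND database restricted to Viridiplantae (taxonomic code 33090).
# Assumption: --taxids and --dbname behave as described in the eggNOG-mapper docs.
create_dbs.py -m diamond --dbname Viridiplantae --taxids 33090 --data_dir "$DATA_DIR"
```

Restricting the database to a single taxonomic group keeps searches faster and limits hits to plant orthologs, at the cost of missing more distant homologs.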