... | ... | @@ -6,6 +6,7 @@ title: Functional annotations |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Functional annotation helps bring biological meaning to genetic sequences. It is usually obtained through protein sequence similarity: if two sequences from different organisms are very similar, one can infer that they likely encode the same biological function.
|
|
|
There are several main parameters that will impact the process of functional annotation:
|
|
|
- how distant the reference species (the one actually annotated with experimental data) is
|
... | ... | @@ -21,16 +22,19 @@ In Ortho_KB, we integrate functional annotation from EggNOG, MapMan, InterPro an |
|
|
|
|
|
The annotation tool EggNOG-mapper (https://github.com/eggnogdb/eggnog-mapper) relies on the EggNOG databases to annotate genes with knowledge from orthologs in other genomes.
|
|
|
|
|
|
### Install
|
|
|
First, install emapper, for example with conda:
|
|
|
|
|
|
`conda create -n emapper -c bioconda eggnog-mapper`
|
|
|
|
|
|
_Note that version 2.1.12 does not currently work (https://github.com/eggnogdb/eggnog-mapper/issues/516). Until the issue is fixed, please use `conda create -n emapper_2.1.12 -c bioconda -c conda-forge eggnog-mapper=2.1.12 python=3.11.10`_
|
|
|
|
|
|
### Create the database
|
|
|
Then create the eggNOG database, which contains the ortholog groups and their functional annotation. In our case, we choose the taxonomic group Viridiplantae, whose code is 33090 (see [http://eggnog5.embl.de/#/app/downloads](http://eggnog5.embl.de/#/app/downloads)).
|
|
|
|
|
|
`download_eggnog_data.py --data_dir eggnog_db --dbname 33090`
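Before downloading, it can help to make sure the target directory exists and to see whether it already holds data, so that a large download is not repeated by accident. The helper below is only a sketch: the function name `prepare_db_dir` is hypothetical, and whether the download script needs the directory to exist beforehand is an assumption (creating it first is harmless either way).

```shell
# Create the data directory if needed and report whether it already holds files.
# $1: data directory (defaults to eggnog_db, the name used in this page)
# Prints "empty" or "populated". prepare_db_dir is a hypothetical helper name.
prepare_db_dir() {
    db_dir="${1:-eggnog_db}"
    mkdir -p "$db_dir" || return 1
    # ls -A lists hidden entries too, so any leftover file counts as content
    if [ -z "$(ls -A "$db_dir")" ]; then
        echo "empty"
    else
        echo "populated"
    fi
}
```

For example, `[ "$(prepare_db_dir eggnog_db)" = "empty" ] && download_eggnog_data.py --data_dir eggnog_db --dbname 33090` would only start the download when the directory holds nothing yet. The check does not verify that an earlier download completed.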
|
|
|
|
|
|
### Run the program
|
|
|
Finally, the emapper program can be run to annotate each genome:
|
|
|
|
|
|
`emapper.py --cpu 10 -i $input --output ${prefix}.tsv -m diamond --data_dir eggnog_db --database 33090`
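To annotate several genomes in one go, the command above can be wrapped in a loop. The following is a minimal sketch, assuming one protein FASTA file per genome with a `.fa` extension in a single directory; the function name `annotate_all`, the directory layout, and the `EMAPPER_CMD` override variable are all hypothetical and not part of emapper. The emapper options are copied unchanged from the command above.

```shell
# Run emapper on every protein FASTA in a directory (sketch, not an official wrapper).
# $1: input directory holding one *.fa file per genome (hypothetical layout)
# $2: output directory for the per-genome results
# EMAPPER_CMD: hypothetical override for the executable, e.g. "echo" for a dry run
annotate_all() {
    in_dir="$1"
    out_dir="$2"
    runner="${EMAPPER_CMD:-emapper.py}"
    mkdir -p "$out_dir" || return 1
    for input in "$in_dir"/*.fa; do
        # Skip the unexpanded pattern when the directory has no *.fa files
        [ -e "$input" ] || continue
        prefix="$(basename "$input" .fa)"
        # Same options as the single-genome command, with quoted paths
        "$runner" --cpu 10 -i "$input" --output "$out_dir/${prefix}.tsv" \
            -m diamond --data_dir eggnog_db --database 33090
    done
}
```

With the conda environment activated, `annotate_all proteomes annotations` would process every `proteomes/*.fa` file in turn, and `EMAPPER_CMD=echo annotate_all proteomes annotations` prints the arguments that would be passed without running anything. Quoting `"$input"` keeps file names containing spaces intact.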
|
... | ... | |