Ontologies

Overview of ontologies used for KMAP (7/7/20). This list is updated continuously, and new ontologies are created as they become relevant.

Ontology Description of sources
Human Genes A vocabulary of human gene names, along with their symbols and synonyms. This collection was downloaded from the HGNC site. Because many gene symbols and synonyms also occur as ordinary non-gene words, a TenWise proprietary gene-calling algorithm was used to discard false-positive hits.
 
Diseases A collection of terms describing human diseases. This collection was downloaded from the BioOntology Portal and the terms were optimized for text mining.
 
Phenotypes A collection of terms describing human phenotypes. As a result, this collection overlaps in part with the Diseases vocabulary. This collection was downloaded from the HPO GitHub repository and the terms were optimized for text mining.
Animals Organisms that are categorized as animals, excluding bacteria.
 
Plants & Fungi Organisms that are categorized as plants and fungi.
 
Bacteria This vocabulary is based on taxonomic names obtained from the NCBI FTP site. From this set, only the names for Archaea and Bacteria were used for text mining. In all cases, both the full name (e.g. Escherichia coli) and the abbreviated form (e.g. E. coli) were used for mining.
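As an illustration of the full-name and abbreviated-name pairing described above, the following is a minimal sketch and not the TenWise implementation. The function name `name_variants` is hypothetical, and the rule it applies (first letter of the genus, a period, then the rest of the name) is an assumption about how such abbreviations are usually formed.

```python
def name_variants(scientific_name: str) -> list[str]:
    """Return the full name and, where possible, its abbreviated form.

    Assumption: the first word is the genus, and the abbreviation is the
    genus initial followed by a period and the remaining words.
    """
    parts = scientific_name.split()
    full = " ".join(parts)  # normalise internal whitespace
    if len(parts) < 2:
        # A single word (e.g. a genus on its own) has no abbreviated form.
        return [full]
    genus, rest = parts[0], " ".join(parts[1:])
    abbreviated = f"{genus[0]}. {rest}"
    return [full, abbreviated]


variants = name_variants("Escherichia coli")
print(variants)
```

Both variants would then be added to the vocabulary as separate search terms. Note that abbreviated forms can be ambiguous across genera that share an initial, which this sketch does not address.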
 
Pathways A vocabulary of biological pathways. This vocabulary was constructed by TenWise using a proprietary method to detect pathway names, signaling routes and metabolic pathways in biomedical text. In addition, terms from the Gene Ontology Biological Process subset were curated for text mining and added to this set.
 
Metabolites This is a vocabulary constructed by TenWise by merging metabolite names from several repositories (such as KEGG, ChEBI and Reactome) and textbooks on biochemical pathways.
 
Drugs This is a vocabulary of pharmaceutical research compounds. This vocabulary was constructed by TenWise using a proprietary algorithm to detect these compounds in biomedical text.
 
Bacterial genes A set of bacterial gene names that was obtained by creating a non-redundant list of gene names from public GenBank repositories for bacterial genomes. Because many gene names are shared by bacteria across a wide range of species, these names are provided as general names and are not mapped back to any particular organism.
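The idea of a non-redundant, organism-independent name list can be sketched as follows. This is an illustrative example only: the function name `non_redundant_gene_names`, the genome labels and the input structure are hypothetical, and parsing of the GenBank records themselves is left out.

```python
def non_redundant_gene_names(genomes: dict[str, list[str]]) -> list[str]:
    """Collapse per-genome gene name lists into one sorted, unique list.

    The organism each name came from is deliberately dropped, so the
    result is a list of general names. Comparison is case-sensitive,
    since capitalisation is meaningful in bacterial gene names (recA).
    """
    unique = set()
    for names in genomes.values():
        for name in names:
            cleaned = name.strip()
            if cleaned:  # skip empty entries
                unique.add(cleaned)
    return sorted(unique)


# Hypothetical input: gene names already extracted per genome.
example = {
    "genome_A": ["recA", "dnaK", "gyrB"],
    "genome_B": ["recA", "gyrB", "rpoB"],
}
unique_names = non_redundant_gene_names(example)
print(unique_names)
```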
 
Research Workflow Ontology An ontology covering intervention options, sample collection methods and the analytical methods used to generate metabolomics, microbiomics, proteomics and transcriptomics outputs.