Vocabularies
Overview of the vocabularies used for KMAP. This list is updated continuously, and new vocabularies are added as they become relevant.
Ontology | Description of sources |
Human Genes (HGNC) | A vocabulary of human gene names, along with their symbols and synonyms. This collection was downloaded from the HGNC site. Because many gene symbols and synonyms also occur as ordinary, non-gene words, a TenWise proprietary gene calling algorithm was used to discard false positive hits. |
Human Diseases (TWDIS) | A collection of terms describing human diseases. This collection was downloaded from the BioOntology Portal and the terms were optimized for text mining. |
Tool compounds (TOOLC) | A set of pharmaceutical tool compounds. |
Bacteria (TAX) | This vocabulary is based on taxonomic names obtained from the NCBI FTP site. From this set, only the names for Archaea and Bacteria were used for text mining. In all cases, both the full name (e.g. Escherichia coli) and the abbreviated form (E. coli) were used for mining. |
Pathways (PATH) | A vocabulary of biological pathways. This vocabulary was constructed by TenWise using a proprietary method to detect pathway names, signaling routes and metabolic pathways in biomedical text. In addition, terms from the Gene Ontology Biological Process subset were curated for text mining and added to this set. |
Metabolites (TWMET) | This is a vocabulary constructed by TenWise by merging metabolite names from several repositories (such as KEGG, ChEBI and Reactome) and textbooks on biochemical pathways. |
Food terms (TWFOOD) | A set of terms describing food items. |
Bacterial genes (BACG) | A set of bacterial gene names obtained by creating a non-redundant list of gene names from public GenBank repositories for bacterial genomes. Because many gene names are shared across a wide range of bacterial species, these names are provided as general names and are not mapped back to any particular organism. |
Research Workflow Ontology (TWRWO) | The Research Workflow Ontology covers intervention options, sample collection methods and the analytical methods used to generate metabolomics, microbiomics, proteomics and transcriptomics outputs. |
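
The Bacteria (TAX) entry notes that both the full name and the abbreviated form of each organism are used for mining. The page does not say how the abbreviated forms were produced, so the following is only a minimal sketch of one way to derive them. It assumes the genus is the first word of the name and abbreviates it to its initial. The function names `name_variants` and `build_term_list` are hypothetical and are not part of KMAP or any TenWise tooling.

```python
from typing import Iterable, List


def name_variants(scientific_name: str) -> List[str]:
    """Return the full name and, for multi-word names, a genus-abbreviated form.

    Assumption: the first word is the genus and the rest is the species
    (and any further epithet), e.g. "Escherichia coli" -> "E. coli".
    """
    parts = scientific_name.split()
    variants = [scientific_name]
    # Only abbreviate names with at least two words and a capitalised genus.
    if len(parts) >= 2 and parts[0][:1].isupper():
        variants.append(f"{parts[0][0]}. {' '.join(parts[1:])}")
    return variants


def build_term_list(names: Iterable[str]) -> List[str]:
    """Expand every name into its variants and drop case-insensitive duplicates."""
    seen = set()
    terms = []
    for name in names:
        for variant in name_variants(name.strip()):
            key = variant.lower()
            if variant and key not in seen:
                seen.add(key)
                terms.append(variant)
    return terms


if __name__ == "__main__":
    for term in build_term_list(["Escherichia coli", "Bacillus subtilis"]):
        print(term)
```

Note that abbreviated forms are ambiguous by construction: different genera sharing an initial collapse to the same short form when the species epithet matches, so a real pipeline would likely need extra disambiguation that this sketch does not attempt.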