The philosophy behind our KMAP platform is that a lot of valuable data is already available in a large number of biological databases and in the scientific literature. TenWise takes this information and makes it suitable for text mining; where we identify gaps, we build our own ontologies.
Vocabularies: specially constructed ontologies
Our Vocabularies contain over 500,000 unique biological keywords that describe human physiology in health and disease, as well as the drugs and foods that impact it. See figure below:
Next, we generate KMAP by automated collection of data from the scientific literature. This is done by searching the entire PubMed collection of > 30 million abstracts with a set of > 0.5 million biological keywords, describing genes, diseases, metabolites, etc. This yields a large network, referred to as KMAP, consisting of over 200 million literature relations. See how we generate KMAP below:
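The idea of deriving literature relations from keyword matches can be illustrated with a minimal sketch. It assumes that a relation is simply two vocabulary keywords occurring in the same abstract, and it uses naive case-insensitive substring matching; the function name `build_relations`, the toy abstracts and the identifiers are hypothetical and do not describe the actual KMAP pipeline.

```python
from collections import Counter
from itertools import combinations


def build_relations(abstracts, vocabulary):
    """Count, per keyword pair, the abstracts in which both keywords occur.

    Assumption: co-occurrence within one abstract counts as a relation.
    Matching is a naive lowercase substring test, for illustration only.
    """
    relations = Counter()
    for text in abstracts.values():
        lowered = text.lower()
        # Keywords found in this abstract, sorted so each pair has one fixed order.
        found = sorted(k for k in vocabulary if k.lower() in lowered)
        for pair in combinations(found, 2):
            relations[pair] += 1
    return relations


# Toy data: invented abstracts and a three-keyword vocabulary.
abstracts = {
    "abstract-1": "Insulin resistance is a hallmark of type 2 diabetes.",
    "abstract-2": "Metformin improves insulin sensitivity in diabetes.",
    "abstract-3": "Metformin is absorbed in the gut.",
}
vocabulary = {"insulin", "diabetes", "metformin"}

relations = build_relations(abstracts, vocabulary)
for pair, count in sorted(relations.items()):
    print(pair, count)
```

At the scale described above, a substring scan per keyword would be far too slow; a real system would need an indexed or dictionary-based matcher, which this sketch leaves out.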
BioSets: precompiled filters for text analytics
Finally, we apply precompiled filters, which we call BioSets, to these 200 million relations, yielding only the relevant ones. These are characterized and visualised in many ways, e.g. by the number of abstracts in which they are described, by a statistical score that indicates the likely importance of the relation, or by extraction of the sentences in which the relation is described. These filters are developed by experts with relevant domain knowledge.
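As a rough illustration of filtering, the sketch below models a BioSet as a plain set of keywords and keeps every relation that involves at least one of them, ranked by abstract count. This representation, the function `apply_bioset`, the `min_abstracts` parameter and the toy numbers are all assumptions made for the example; the statistical score mentioned above is not specified here and is therefore left out.

```python
def apply_bioset(relations, bioset, min_abstracts=1):
    """Keep relations that touch the BioSet and rank them by abstract count.

    Assumption: a BioSet is a set of keywords, and a relation is relevant
    when at least one of its two keywords belongs to that set.
    """
    kept = [
        (pair, count)
        for pair, count in relations.items()
        if count >= min_abstracts and any(keyword in bioset for keyword in pair)
    ]
    # Most frequently described relations first; ties broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Toy relations: keyword pair -> number of abstracts (invented values).
relations = {
    ("diabetes", "insulin"): 2,
    ("diabetes", "metformin"): 1,
    ("insulin", "metformin"): 1,
    ("aspirin", "headache"): 5,
}
drug_bioset = {"metformin", "aspirin"}  # hypothetical "drugs" filter

ranked = apply_bioset(relations, drug_bioset)
for pair, count in ranked:
    print(pair, count)
```

Raising `min_abstracts` drops relations supported by only a few abstracts, a simple stand-in for ranking by importance.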
Access to KMAP can be given to end users via KMINE Literature Explorer, to CROs and Bio-IT departments via the KMINE REST-API, and to research groups via dedicated interactive KMINE Literature Reports and KMINE ~Omics Viewers. Finally, dedicated projects can be done on tailor-made content and UIs.