Data Sources
The biomedical databases that feed into OptimusKG.
OptimusKG integrates data from over 15 direct biomedical data sources, plus dozens of indirect sources that reach it through aggregated databases.
Direct Sources
| Source | Description | Contributes To |
|---|---|---|
| OpenTargets | Drug target identification and validation platform | Genes, Diseases, Drugs, Phenotypes, Pathways, and their relationships |
| Bgee | Gene expression across animal species | Gene expression in anatomical structures (Anatomy-Protein edges) |
| CTD | Environmental exposure impact on human health | Exposures and their links to diseases, genes, and biological processes |
| DrugBank | Comprehensive pharmaceutical knowledge base | Drug properties, drug-protein interactions, drug-drug interactions |
| DrugCentral | Drug-disease interaction resource | Drug-disease indications, drug properties |
| DisGeNET | Gene-disease association database | Disease-protein and phenotype-protein associations |
| OnSIDES | Adverse drug reaction data from FDA labels | Drug-phenotype adverse reaction edges |
| Reactome | Biological pathway knowledge base | Pathway hierarchy and pathway-protein relationships |
| Gene Names (HGNC) | Standardized human gene nomenclature | Gene identifier mapping and naming |
| GO | Gene Ontology | Biological processes, molecular functions, cellular components, and their hierarchies |
| HPO | Human Phenotype Ontology | Phenotype definitions and hierarchy |
| MONDO | Disease ontology | Disease definitions, hierarchy, and cross-references |
| UBERON | Anatomy ontology | Anatomical entity definitions and hierarchy |
| MeSH | Medical Subject Headings | Exposure entity identifiers |
| MedDRA | Medical Dictionary for Regulatory Activities | Phenotype/adverse event terminology |
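As a rough illustration of how the table above could be queried programmatically, the sketch below encodes a few of its rows as a lookup from source to contributed categories. The `DIRECT_SOURCES` mapping, the category labels, and the `sources_for` helper are hypothetical names chosen for this example; they are not part of OptimusKG's actual schema or API.

```python
# Hypothetical registry mirroring a few rows of the Direct Sources table.
# Keys and category labels are illustrative, not OptimusKG's real schema.
DIRECT_SOURCES = {
    "Bgee": {"Anatomy-Protein edges"},
    "OnSIDES": {"Drug-Phenotype edges"},
    "Reactome": {"Pathways", "Pathway-Protein edges"},
    "HPO": {"Phenotypes"},
    "MONDO": {"Diseases"},
    "UBERON": {"Anatomy"},
}


def sources_for(category: str) -> list[str]:
    """Return the direct sources that contribute to a category, sorted by name."""
    return sorted(
        name for name, categories in DIRECT_SOURCES.items() if category in categories
    )


if __name__ == "__main__":
    print(sources_for("Pathways"))  # ['Reactome']
```

A plain dictionary of sets is enough here because the table is small and static; a larger registry would more likely live in a configuration file.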
Indirect Sources
Several aggregated databases contribute indirect provenance to OptimusKG:
Protein-Protein Interaction Databases
Protein-protein interaction data is sourced through PrimeKG, which itself aggregates over 20 interaction databases, including APID, BioGRID, BioPlex, ENCODE, HINT, HI-Union, HIPPIE, InnateDB, InBioMap, IntAct, Interactome3D, MINT, PINA, and SignaLink.
OpenTargets Sub-sources
OpenTargets integrates evidence from dozens of upstream databases. OptimusKG tracks these as indirect sources; they include ChEMBL, ClinGen, Genomics England, Orphanet, UniProt, ClinicalTrials.gov, FDA, EMA, and PubMed, among others.
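The distinction between direct and indirect sources can be pictured as a provenance record attached to each edge. The following is a minimal sketch under stated assumptions: the `Provenance` and `Edge` classes, their field names, the `upstream_counts` helper, and the placeholder identifiers such as `GENE_A` are all hypothetical and do not reflect how OptimusKG actually stores provenance.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    direct_source: str                      # database ingested directly, e.g. OpenTargets
    indirect_sources: tuple[str, ...] = ()  # upstream databases behind the direct source


@dataclass(frozen=True)
class Edge:
    head: str
    relation: str
    tail: str
    provenance: Provenance


def upstream_counts(edges: list[Edge]) -> dict[str, int]:
    """Count how many edges each indirect source contributed to."""
    counts: Counter = Counter()
    for edge in edges:
        counts.update(edge.provenance.indirect_sources)
    return dict(counts)


if __name__ == "__main__":
    # Placeholder identifiers; real edges would use ontology or database IDs.
    edges = [
        Edge("GENE_A", "associated_with", "DISEASE_X",
             Provenance("OpenTargets", ("ClinGen", "Orphanet"))),
        Edge("GENE_B", "associated_with", "DISEASE_X",
             Provenance("OpenTargets", ("ClinGen",))),
    ]
    print(upstream_counts(edges))
```

Keeping the direct source and its upstream databases in separate fields makes it possible to filter or audit edges at either level without parsing a combined string.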