Data Sources

The biomedical databases that feed into OptimusKG.

OptimusKG integrates data from more than 15 direct biomedical data sources, plus dozens of indirect sources contributed through aggregated databases.

Direct Sources

| Source | Description | Contributes To |
| --- | --- | --- |
| OpenTargets | Drug target identification and validation platform | Genes, Diseases, Drugs, Phenotypes, Pathways, and their relationships |
| Bgee | Gene expression across animal species | Gene expression in anatomical structures (Anatomy-Protein edges) |
| CTD | Environmental exposure impact on human health | Exposures and their links to diseases, genes, and biological processes |
| DrugBank | Comprehensive pharmaceutical knowledge base | Drug properties, drug-protein interactions, drug-drug interactions |
| DrugCentral | Drug-disease interaction resource | Drug-disease indications, drug properties |
| DisGeNET | Gene-disease association database | Disease-protein and phenotype-protein associations |
| OnSIDES | Adverse drug reaction data from FDA labels | Drug-phenotype adverse reaction edges |
| Reactome | Biological pathway knowledge base | Pathway hierarchy and pathway-protein relationships |
| Gene Names (HGNC) | Standardized human gene nomenclature | Gene identifier mapping and naming |
| GO | Gene Ontology | Biological processes, molecular functions, cellular components, and their hierarchies |
| HPO | Human Phenotype Ontology | Phenotype definitions and hierarchy |
| MONDO | Disease ontology | Disease definitions, hierarchy, and cross-references |
| UBERON | Anatomy ontology | Anatomical entity definitions and hierarchy |
| MeSH | Medical Subject Headings | Exposure entity identifiers |
| MedDRA | Medical Dictionary for Regulatory Activities | Phenotype/adverse event terminology |

DrugBank requires a registered account and license acceptance to access raw data. When DrugBank data is unavailable, the Optimus pipeline gracefully continues with public data only.
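
The fallback described above can be pictured with a minimal sketch. This is not the Optimus pipeline's actual code: the function name `select_sources`, the source lists, and the DrugBank file path are all hypothetical, chosen only to illustrate skipping a license-gated source when its raw data is absent.

```python
from pathlib import Path

# Hypothetical subset of public sources, for illustration only.
PUBLIC_SOURCES = ["OpenTargets", "Bgee", "CTD", "Reactome"]

# Hypothetical mapping of license-gated sources to the raw file they need.
LICENSED_SOURCES = {"DrugBank": "drugbank/full_database.xml"}


def select_sources(raw_dir: Path) -> list[str]:
    """Return the sources to process, skipping licensed ones whose raw file is missing."""
    selected = list(PUBLIC_SOURCES)
    for name, rel_path in LICENSED_SOURCES.items():
        if (raw_dir / rel_path).is_file():
            selected.append(name)
        else:
            # Report the gap and carry on instead of raising an error.
            print(f"{name} raw data not found under {raw_dir}; continuing with public data only")
    return selected
```

Calling `select_sources(Path("data/raw"))` without the DrugBank file in place would return only the public sources. The point of the design is that a missing licensed input is reported rather than treated as a failure.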

Indirect Sources

Several aggregated databases contribute indirect provenance to OptimusKG:

Protein-protein Interaction Databases

Protein-protein interaction data is sourced through PrimeKG, which itself aggregates over 20 interaction databases including APID, BioGRID, BioPlex, ENCODE, HINT, HI-Union, HIPPIE, InnateDB, InBioMap, IntAct, Interactome3D, MINT, PINA, and SignaLink.

OpenTargets Sub-sources

OpenTargets integrates evidence from dozens of upstream databases. These are tracked as indirect sources in OptimusKG, including ChEMBL, ClinGen, Genomics England, Orphanet, UniProt, ClinicalTrials.gov, FDA, EMA, and PubMed, among others.
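
One way to picture indirect provenance is a record that names both the aggregated database the data was ingested from and the upstream databases it credits. The sketch below is an assumption made for illustration: the `Provenance` class, its field names, and the `indirect_sources` helper are hypothetical, and the page does not state how OptimusKG stores this information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record for one piece of ingested data."""
    aggregator: str                 # database the data was ingested from
    upstream: tuple[str, ...] = ()  # databases the aggregator itself draws on


def indirect_sources(records: list[Provenance]) -> set[str]:
    """Collect every upstream database named across the records."""
    return {name for record in records for name in record.upstream}


# Example records using database names mentioned on this page.
records = [
    Provenance("PrimeKG", ("BioGRID", "IntAct")),
    Provenance("OpenTargets", ("ChEMBL", "ClinGen")),
    Provenance("Reactome"),  # ingested directly, no upstream databases recorded
]
```

With these records, `indirect_sources(records)` returns the four upstream names, while `Reactome` contributes none because it was ingested directly.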
