Data / Tools

SPOKE is populated with several openly available data sources and tools, and efforts to incorporate more information are ongoing.

List of SPOKE resources

Bgee is a database to retrieve and compare gene expression patterns in multiple animal species, produced from multiple data types (RNA-Seq, Affymetrix, in situ hybridization, and EST data).

BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules.


ChEMBL is a manually curated chemical database of bioactive molecules with drug-like properties.



ClinicalTrials is a database of privately and publicly funded clinical studies conducted around the world.

Disease Ontology

The Disease Ontology semantically integrates disease and medical vocabularies through extensive cross mapping of DO terms to MeSH, ICD, NCI’s thesaurus, SNOMED and OMIM.


DISEASES is a weekly updated web resource that integrates evidence on disease-gene associations from automatic text mining, manually curated literature, cancer mutation data, and genome-wide association studies. 


DisGeNET is a discovery platform containing one of the largest publicly available collections of genes and variants associated with human diseases.


Disease Ontology Annotation Framework (DOAF) provides a collection of disease-gene mappings that link Disease Ontology terms to genes.

DrugBank 4.2

The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug data with comprehensive drug target information.


Drug efficacy targets, indications, and pharmacologic class.

Entrez Gene

The NCBI website hosts databases of molecular data, including nucleotide sequences (GenBank), protein sequences, macromolecular structures, molecular variation, gene expression, and mapping data.

Evolutionary Rate Covariation

ERC measures correlated rates across a phylogeny, allowing for extraction of genes with similar evolutionary histories.

Gene Ontology

The Gene Ontology (GO) project is a major bioinformatics initiative to develop a computational representation of our evolving knowledge of how genes encode biological functions at the molecular, cellular and tissue system levels.

GWAS Catalog

Catalog of published genome-wide association studies.


This repository hosts data for the disease-associated genes project on

Human Interactome Database

A reference of binary protein-protein interactions generated by systematically interrogating all pairwise combinations of predicted gene products in defined search spaces using proteome-scale technologies.

Incomplete Interactome



iRefIndex provides an index of protein interactions available in a number of primary interaction databases including BIND, BioGRID, CORUM, DIP, HPRD, InnateDB, IntAct, MatrixDB, MINT, MPact, MPIDB and MPPI.


The LINCS L1000 dataset is a comprehensive resource for gene expression changes observed in human cell lines perturbed with small molecules and genetic constructs. The L1000 experiments systematically measure the changes in gene expression after small molecule exposure, gene knockdown by RNAi, and gene overexpression.


Medical Subject Headings (MeSH) is the NLM's curated medical vocabulary resource, providing a hierarchically organized terminology for indexing and cataloging biomedical information in MEDLINE/PubMed and other NLM databases.

Pathway Interaction Database

The Pathway Interaction Database is a highly structured, curated collection of information about known biomolecular interactions and key cellular processes assembled into signaling pathways.


REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database.


SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts. The available information includes side effect frequency, drug and side effect classifications, and links to further information, such as drug–target relations.


STAR provides a powerful search engine across samples, experiments, and attributes from GEO in order to Search, Tag, Analyze & Resource.


TISSUES is a weekly updated web resource that integrates evidence on tissue expression from manually curated literature, proteomics and transcriptomics screens, and automatic text mining. 


Uberon is an integrated cross-species ontology covering anatomical structures in animals.


UniProt is a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.


WikiPathways is a database of biological pathways maintained by and for the scientific community.