Open Data in Biomedicine

The Information Gold Mine for Pharma

The biomedical field generates large volumes of information, and a substantial share of that data is made publicly available online by scientists around the world. In addition, policies enforced by medical regulatory agencies require the registration of clinical trials and the publication of their results. This combination of big data and open access broadens the scope for data science applications.

Open data in biomedicine is a gold mine that strengthens innovation in pharma R&D. In combination with the right analytics, public data helps identify therapeutic targets and ligands, enhance clinical development, and boost portfolio management efficiency. The challenge is to integrate, in a purposeful way, abundant and heterogeneous data scattered across many sources.

Our methodology makes it possible to navigate that maze of information and harness the value of biomedical databases. It enables the meaningful selection and integration of data, which is the foundation of actionable knowledge discovery. Natural language processing is at the core of this process and thus at the interface of biomedicine and data science. Customised to the biomedical corpus, it is the centrepiece of next-generation information processing engines.


The power of open data for pharma R&D

Navigating the open data landscape

Choosing the right data sources

    • Genes & proteins
    • Compounds & chemical information
    • Biomedical literature
    • Clinical trials
    • Terminologies & ontologies

Harnessing the value of biomedical databases

    • The challenges of connecting data across sources
    • Natural language processing in biomedicine
    • A case study: drugs and targets in precancerous indications
    • Data science for biomedical discovery

List of analysed data sources

  • European Nucleotide Archive
  • Ensembl
  • OMIM
  • UniProt
  • RCSB Protein Data Bank
  • ChEMBL
  • DrugBank
  • Drugs@FDA
  • European public assessment reports
  • PubMed
  • PubMed Central
  • Europe PMC
  • Cochrane Database of Systematic Reviews
  • EU Clinical Trials Register
  • International Clinical Trials Registry Platform
  • Medical Subject Headings (MeSH)
  • Chemical Entities of Biological Interest (ChEBI)
  • ICD-10
  • Orphanet