A staggering 85% of the human proteome has been deemed ‘undruggable’. But, spurred on by new techniques, enhanced data and diminishing returns from existing research avenues, we are beginning to explore these uncharted seas.
Shortly after the first publication of the Human Genome Project, in 2002, Hopkins and Groom coined the term “druggable genome” – to describe the proportion of the genome they believed could be addressed with small molecule drugs. Back then they considered about 3,000 proteins, ~15% of the human proteome, druggable.
What was it that made a protein druggable, in their eyes? To borrow some philosophy from Tolstoy, Hopkins and Groom believed all “druggable” proteins are alike, but every “undruggable” protein is undruggable in its own way. Those proteins they labelled druggable exhibited one key facet: a structural ability to be targeted by small molecules.
But being druggable doesn’t make a protein a drug target – for that it also needs to have a disease-relevant function. Looking only at disease-relevant proteins (again around 3,000), Hopkins and Groom found an overlap of 600-1,500 potential drug targets, of which only 120 were realised as targets of approved drugs.1 Since then, more than 500 proteins have been added to the list of drug targets; even so, the proportion of the human proteome targeted by approved drugs remains below 5%.2,3
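The proportions quoted above can be checked with some back-of-the-envelope arithmetic. The sketch below assumes that the ~3,000 druggable proteins correspond to exactly 15% of the proteome, and that "more than 500" added targets can be rounded down to 500. Both are simplifications for illustration, not figures taken from the cited studies.

```python
# Back-of-the-envelope check of the figures quoted in the text.
# Assumption: 3,000 druggable proteins are taken as exactly 15% of the proteome.
DRUGGABLE = 3_000
DRUGGABLE_FRACTION = 0.15

# Implied proteome size under this assumption.
proteome_size = round(DRUGGABLE / DRUGGABLE_FRACTION)

# Targets of approved drugs: 120 in 2002, plus "more than 500" added since.
# Assumption: 500 is used as a round lower bound.
targets_2002 = 120
targets_added = 500
targets_now = targets_2002 + targets_added

# Share of the proteome targeted by approved drugs.
fraction_targeted = targets_now / proteome_size

# Overlap between druggable and disease-relevant proteins (600-1,500),
# expressed as a share of the druggable set.
overlap_low, overlap_high = 600, 1_500
overlap_share = (overlap_low / DRUGGABLE, overlap_high / DRUGGABLE)

print(f"Implied proteome size: {proteome_size:,}")
print(f"Share targeted by approved drugs: {fraction_targeted:.1%}")
print(f"Overlap as share of druggable set: {overlap_share[0]:.0%}-{overlap_share[1]:.0%}")
```

Under these assumptions, roughly 620 of about 20,000 proteins are targeted by approved drugs, which is consistent with the "below 5%" figure.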
What about the 85% considered undruggable? Proteins within this group range from well-documented players that drive disease but elude inhibition by small molecule drugs, to the “dark proteome”, the uncharted territory of unexplored proteins.2
Although scientific literature has labelled these proteins undruggable for some time, calling them “yet to be drugged” or “potential future drug targets” seems a more apt description. The classification of proteins as undruggable is based on their lack of features that lend themselves to classical small molecule inhibitors1 and, more recently, biologics such as antibodies. Emerging therapies such as PROTACs and new classes of biologics will overcome many of these structural challenges.
Proteins that have no clear disease function assigned to them are also not explored as drug targets, but this is likely to change as the knowledge gaps on protein function close.
Finding proteins that satisfy druggability requirements in relevant disease contexts remains a key challenge in drug discovery. And it’s one in which artificial intelligence and predictive algorithms will be an invaluable ally, enabling researchers to navigate this complex space and rank proteins – by relevance as disease players and suitability as drug targets.
Privileged proteins – druggability’s poster children
Certain classes of proteins have emerged as success stories in druggability. These privileged protein families, which combine structural features of druggability with disease-relevant functions, include nuclear receptors, G-protein coupled receptors, ion channels and kinases.3 Notably, however, some of the protein classes that once accounted for the highest clinical success rates, such as nuclear receptors, have seen little innovation of late. This suggests the upper limit of druggability may have been reached for this class. In contrast, newer classes of small molecule drug targets, such as kinases, have lower historical success rates but exhibit a higher degree of innovation, with a broader range of targets explored in clinical trials.3,4
Which features make a protein a suitable drug target?
- Protein structure: Most proteins that are effectively targeted by small molecule inhibitors have a small, hydrophobic binding pocket or enzyme active site that enables specific interactions with the drug. However, many signalling functions are carried out by interactions between proteins. The contact areas of these interactions often lack defined grooves or pockets5 and may contain intrinsically disordered domains, whose architecture is labile and depends on interaction with functional partners.6
- Protein function: For a protein to serve as a drug target it needs to have a disease-relevant function, with disease drivers initially being the most attractive class. A protein may also be target-worthy if it is related to disease symptoms or is active in adjacent and connected signalling pathways. Many proteins, however, have multiple functions. They may have important roles in a disease context while at the same time taking part in normal cellular processes, and this increases the risk of side effects.
- Protein tissue expression: Targeting proteins that are ubiquitously expressed in many vital tissues is likely to result in serious on-target, off-tissue side effects. Proteins that are expressed exclusively, or at elevated levels, in diseased tissues make more suitable drug targets because these side effects are kept to a minimum.
- Protein accessibility: A protein can be localised in the extracellular space, within the plasma membrane, in the cytoplasm and other cellular organelles, or in the nucleus. Not all drugs can reach all of those compartments. For example, antibodies don’t pass the plasma membrane and can only act on proteins that have some part of their structure in the extracellular space. Notably, many proteins change their cellular location dynamically (for example cytokines or growth factors being secreted, or receptors being internalised), which should be considered when targeting them. Beyond a protein’s place in the cell, its tissue localisation is also important. Proteins that are solely expressed in difficult-to-reach tissues, such as the brain, might not be reached by most small molecules.
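The four features above can be thought of as inputs to a ranking of candidate targets. The following is a minimal sketch of that idea, not an established scoring method: the `TargetProfile` class, the 0-to-1 scores, the weights and the two example proteins are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetProfile:
    """Hypothetical per-protein scores, each between 0 (poor) and 1 (ideal)."""
    name: str
    structure: float      # defined binding pocket or active site
    function: float       # strength of disease-relevant function
    expression: float     # restriction of expression to diseased tissue
    accessibility: float  # reachability of the protein's compartment and tissue


# Illustrative weights (an assumption, not taken from the article); they sum to 1.
WEIGHTS = {"structure": 0.3, "function": 0.3, "expression": 0.2, "accessibility": 0.2}


def suitability(profile: TargetProfile) -> float:
    """Weighted average of the four criteria."""
    return sum(getattr(profile, key) * weight for key, weight in WEIGHTS.items())


def rank(profiles: list[TargetProfile]) -> list[TargetProfile]:
    """Order candidates from most to least suitable."""
    return sorted(profiles, key=suitability, reverse=True)


# Made-up example values for two hypothetical proteins.
candidates = [
    TargetProfile("protein_B", structure=0.2, function=0.9, expression=0.5, accessibility=0.4),
    TargetProfile("protein_A", structure=0.9, function=0.8, expression=0.6, accessibility=0.7),
]

for profile in rank(candidates):
    print(f"{profile.name}: {suitability(profile):.2f}")
```

In this toy example, protein_B has the stronger disease link but ranks below protein_A because it lacks a well-defined binding pocket, which mirrors the situation of many well-known but hard-to-drug disease drivers.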
A landscape of undruggable proteins
Different groups of proteins exist in the vast space of the currently undruggable proteome. One group comprises proteins with well-known functions whose structure and/or accessibility, such as a lack of catalytic centers or enzymatic activity, makes them difficult for classic small molecules or biologics to target. This group includes the holy grail of drug targeting, the products of so-called cancer driver genes. While a number of those drivers are targeted by mainstay cancer therapies – including, for example, the estrogen and androgen receptors, or receptor tyrosine kinases such as EGFR – the overlap between existing cancer drugs and supposed cancer driver genes is modest.3 Chief among those undruggable cancer targets are the family of RAS proteins, which are mutated in ~20-30% of cancers, and transcription factors acting as oncogenes (such as MYC) or tumor suppressors (such as p53).7
Other dysfunctional proteins that act as clear disease drivers are present in 5,000-8,000 monogenic human diseases.8 While for some of those diseases, such as cystic fibrosis, the mutated protein can be tackled with small molecules,9 others, such as the devastating neurodegenerative disorder Huntington’s disease, still have no disease-modifying treatment options.10
Proteins without intrinsic, disease-driving function can also be effective drug targets. The transformative success of immune checkpoint inhibitors for some cancer types highlights the relevance of modulating the interface between the tumor and the immune system. Moreover, novel targets emerge from redundancies in biological pathways, exploited in synthetic lethality approaches. Here, the inhibition of one protein target alone is ineffective, but the silencing of two individual targets (either by combination therapy or in the context of an existing mutation) is effective at killing cancer cells. This principle is exemplified by PARP inhibitors, which show excellent responses in tumors that have mutations in the BRCA1/2 genes.11
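The logic of synthetic lethality can be written down as a small truth table. The sketch below is a deliberately simplified toy model, assuming two fully redundant proteins where either one alone keeps the cell alive. The function names are hypothetical, and real PARP/BRCA biology is far more nuanced than a boolean switch.

```python
def cell_survives(target_a_active: bool, target_b_active: bool) -> bool:
    """Toy model: two redundant proteins, either of which can keep the cell alive."""
    return target_a_active or target_b_active


def survives_parp_inhibitor(brca_functional: bool) -> bool:
    """Toy model of PARP inhibition: PARP is switched off, so survival
    depends entirely on whether BRCA is still functional."""
    return cell_survives(target_a_active=brca_functional, target_b_active=False)


# Enumerate all four combinations of the two redundant targets.
for a_active in (True, False):
    for b_active in (True, False):
        outcome = "survives" if cell_survives(a_active, b_active) else "dies"
        print(f"A active={a_active!s:5}  B active={b_active!s:5}  -> cell {outcome}")

# Normal cells keep functional BRCA; BRCA-mutated tumor cells do not.
print("Normal cell survives PARP inhibitor:", survives_parp_inhibitor(brca_functional=True))
print("BRCA-mutated cell survives PARP inhibitor:", survives_parp_inhibitor(brca_functional=False))
```

Only the combination in which both targets are inactive kills the cell, which is why a single inhibitor can be selective for tumor cells that already carry the second defect as a mutation.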
When it comes to druggability, the frontier is constantly shifting. Protein classes such as the BCL-2 family had long been considered undruggable, but new structural knowledge led to the development of venetoclax, a BCL-2 inhibitor used for treating certain types of blood cancer.5 At the same time, acquired resistance mechanisms can turn a protein from druggable to undruggable; these include mutations in the catalytic domain that binds small molecule inhibitors, or a rewiring of the surrounding signal transduction environment that makes the target inhibition ineffective.
Navigating the space of drug targets
Notably, the reason many proteins have not yet been explored as drug targets doesn’t always relate to their druggability; researchers in academia and the pharmaceutical industry tend to concentrate their efforts on a small number of well-known targets,2 often ignoring those that appear more challenging. Alongside chemical and biological screening efforts, intelligent mining of the vast amounts of information on proteins and drugs will be key to reversing this trend and shedding light on the dark parts of the human proteome.
In the age of personalized medicine, the collection of patients’ genetic, proteomic and other omics data is becoming more and more frequent, while large-scale initiatives assign protein functions and localisation, and in silico prediction of protein 3D structures seems within close reach.7,12 Combined, the data in scientific publications, databases and consortia represent a treasure trove of information, which academic and industry researchers are tapping into to categorise and characterise the understudied proportion of the proteome.13-18
Examples of this include:
- The Target 2035 initiative, which aims to create chemical and biological probes against all proteins of the human proteome.13
- The Illuminating the Druggable Genome (IDG) initiative, launched in 2014 by the US National Institutes of Health,15 which employs a dual strategy of developing novel experimental approaches and a Knowledge Management Center for exploring the understudied proteome.2,15,16 Pharos, a key resource from the IDG program, is a web interface that aggregates protein information from several sources to help researchers identify proteins that might be interesting drug targets in different contexts.16
A new age of druggability
While the druggability or otherwise of a protein has long been determined by the presence of structural features that enable interaction with small molecule inhibitors, the emergence of novel therapies, such as PROTACs, might change all that. Making large parts of the proteome accessible will shift the frontiers of drug development. Finding drug targets with disease-relevant functions for these novel therapies will be crucial to harnessing the yet-to-be-drugged proteome.
References
1. Opinion article by Hopkins and Groom in 2002, which coined the term “druggable genome” and analysed druggable proteins, as well as proteins which could become drug targets.
2. Nature research article by Oprea et al. 2017, describing how the Illuminating the Druggable Genome (IDG) initiative Knowledge Management Center can help to categorise different drug classes on evidence-based criteria.
3. Nature research article by Santos et al. 2018, providing a curated landscape of drug targets based on various databases and manual curation.
4. Research article from 2014, which describes the development of a dataset of clinical trial drug-target interactions, identifying 475 potentially novel clinical trial drug targets.
5. Review article describing how protein-protein interactions can be utilised as drug targets.
6. Nature scientific report, in 2018, analysing protein-protein interactions of intrinsically disordered proteins by machine-learning approaches.
7. Review article published in 2020, describing the challenges for drugging cancer genomes and how personalised medicine approaches and novel treatment strategies might overcome them.
8. 2014 report on the State of the Art of Rare Disease Activities in Europe.
9. Nature opinion article on drugs targeting CFTR, the gene which is mutated in cystic fibrosis.
10. Review article discussing novel treatment options in Huntington’s disease.
11. Review article by Lord and Ashworth (who are among the founding fathers of the PARP field) in Science 2017, describing the concept of synthetic lethality underlying the action of PARP inhibitors in clinical application.
12. Article from DeepMind describing AlphaFold’s astounding performance at the CASP14 challenge.
13. Article describing Target 2035, a global federation from academia and industry aiming to create chemogenomic libraries, chemical probes, and/or functional antibodies for the entire proteome.
14. Research article published in 2020, which analyses human proteins bound to drug-like ligands in the protein databank to draw a landscape of the human druggable genome.
15. Homepage of the Illuminating the Druggable Genome (IDG) initiative, which was established in 2014.
16. Article and landing page for the Pharos Webtool, which enables data accumulated within the IDG initiative to be explored.
17. Research article published in Science Translational Medicine in 2017, which connects disease- and biomarker-associated loci from genome-wide association studies to a set of genes encoding druggable human proteins.
18. Scientific report published in Nature 2020, in which data mining and integration are leveraged to inspect target innovation trends in drug discovery.