Into the unknown: biotech’s quest to target the 'undruggable' proteome

A staggering 85% of the human proteome has been deemed ‘undruggable’. But, spurred on by new modalities, enhanced data and diminishing returns from existing research avenues, the biotech industry is beginning to explore the uncharted seas of potential drug targets.

In 2002, shortly after the first publication of the Human Genome Project, Hopkins and Groom coined the term “druggable genome” to describe the proportion of the genome they believed could be addressed with small molecule drugs. Back then they considered about 3,000 proteins, ~15% of the human proteome, druggable because they exhibited one key facet: a structure that small molecules can bind.

Since then, more than 500 proteins have been added to the list of actual drug targets; even so, the proportion of the human proteome targeted by approved drugs remains below 5%.

What about the 85% deemed undruggable back in 2002? Proteins within this group range from well-documented players that drive disease – including holy grail cancer drivers that have eluded inhibition by small molecule drugs – to the “dark proteome”, the uncharted territory of proteins with unknown function. And the race to harness the full scope of the proteome as drug targets is heating up, driven by three developments.

New routes to explore the undruggable genome

First of all, many so-called “privileged protein classes”, which have seen high success rates in previous decades, seem to have been exhausted. This includes nuclear receptors, an initially promising target class in which little progress has been made in recent years.

Secondly, emerging modalities such as PROTACs and new classes of biologics seem to be opening up the scope of targetable proteins beyond the “privileged”. Indeed, in 2021 the FDA approved two innovative drugs: a first-in-class allosteric inhibitor of hypoxia-inducible factor‑2α, and a covalent inhibitor of the notorious cancer driver KRAS. Both targets had long been considered undruggable.

And thirdly, as our understanding of biology advances, new links between proteins and disease are being discovered, opening up new territories of highly attractive drug targets. As exemplified by the transformative clinical success of immune checkpoint inhibitors, proteins without intrinsic, disease-driving function can be effective drug targets. Moreover, novel targets emerge from redundancies in biological pathways, which synthetic lethality approaches exploit. PARP inhibitors, for example, show excellent responses in tumours that have mutations in the BRCA1/2 genes.

These developments fuel the biotech industry’s quest to identify proteins that satisfy druggability requirements. It is a quest in which artificial intelligence and predictive algorithms will be invaluable allies, enabling researchers to prioritise proteins by their relevance as disease players and their suitability as drug targets.

Finding new drug targets with artificial intelligence

A Cambrian explosion of biological data in recent years has given researchers a much stronger foundation for exploring the parts of the proteome previously considered undruggable.

This data explosion is driven by two factors: high-throughput screening methods and improved predictive tools. DeepMind’s AlphaFold, an example of the latter, predicts protein structures based on publicly available structure and sequence information. Its predicted structures are freely available in a public database.

This vast amount of accessible information makes it possible to augment drug discovery with data/AI-driven methods, which excel at rapid large-scale scanning. The entire proteome can thus be examined in an objective and transparent manner, before embarking on in-depth qualitative analysis of individual drug targets. But what should those objective criteria be?

To formulate criteria that can be rolled out algorithmically, we need to ask ourselves what makes a protein a good drug target for the chosen modality and disease pathway.

In essence, a protein’s suitability as a drug target is determined by:

structural features that make it addressable by a specific modality (be it a small molecule active site inhibitor, a PROTAC or an antibody)
function in or adjacent to a relevant disease pathway
expression levels in diseased vs. healthy tissue
accessibility to the chosen modality within the relevant tissue and cell
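
To make the idea concrete, the four criteria above could be combined into a simple weighted score for a first-pass ranking. The following is only a minimal sketch: the class name, field names, weights and the assumption that each criterion has already been scaled to a value between 0 and 1 are all hypothetical, and are not taken from any particular tool or dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetEvidence:
    """Hypothetical per-protein evidence; every value is assumed to lie in 0..1."""
    name: str
    structural_fit: float           # addressable by the chosen modality
    pathway_relevance: float        # function in or adjacent to the disease pathway
    differential_expression: float  # diseased vs. healthy tissue
    accessibility: float            # reachable by the modality in tissue and cell


# Illustrative weights only (an assumption); they would be tuned per strategy.
WEIGHTS = {
    "structural_fit": 0.3,
    "pathway_relevance": 0.3,
    "differential_expression": 0.2,
    "accessibility": 0.2,
}


def priority_score(evidence: TargetEvidence) -> float:
    """Weighted sum of the four criteria."""
    return sum(w * getattr(evidence, field) for field, w in WEIGHTS.items())


def rank_targets(candidates, min_score=0.0):
    """Return candidate names, best score first, dropping those below min_score."""
    ranked = sorted(candidates, key=priority_score, reverse=True)
    return [c.name for c in ranked if priority_score(c) >= min_score]
```

A ranking like this is only a starting point: it narrows the proteome to a shortlist, which then still needs in-depth qualitative analysis.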

Although these factors cannot all be fully encoded in algorithms, this “pre-prioritisation” keeps human input focused and efficient. It also helps us to discover potential targets that we might otherwise miss because of our individual blind spots.

Moreover, while there are some modality-agnostic aspects that are important to most biotechs (such as relevance in a specific disease pathway), the real power of this approach comes from tailoring it to a specific R&D strategy/modality. This approach further filters the list of suitable candidates.

For example, if you’re looking for good targets for PROTACs, there are certain key factors to bear in mind. These include structural features that allow for efficient ternary complex formation, functionality that is not driven by a single enzymatic centre, adequate protein turnover rates, and coexpression with suitable E3 ligases.
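
As a rough illustration of such modality-specific filtering, the PROTAC factors listed above could be encoded as pass/fail checks. This is a sketch under stated assumptions: the profile fields, the half-life window used as a stand-in for “adequate turnover”, and the choice of CRBN and VHL as the usable E3 ligases are all illustrative placeholders rather than recommended thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtacProfile:
    """Hypothetical annotations for one protein; field names are assumptions."""
    name: str
    ternary_complex_feasible: bool
    single_enzymatic_centre: bool
    half_life_hours: float
    coexpressed_e3_ligases: frozenset


# Illustrative defaults only (assumptions, not validated cut-offs).
USABLE_LIGASES = frozenset({"CRBN", "VHL"})
HALF_LIFE_WINDOW_HOURS = (1.0, 100.0)


def failed_criteria(profile, usable_ligases=USABLE_LIGASES,
                    half_life_window=HALF_LIFE_WINDOW_HOURS):
    """Return the names of the PROTAC checks this protein does not pass."""
    low, high = half_life_window
    checks = {
        "ternary_complex": profile.ternary_complex_feasible,
        "beyond_single_enzymatic_centre": not profile.single_enzymatic_centre,
        "turnover": low <= profile.half_life_hours <= high,
        "e3_coexpression": bool(profile.coexpressed_e3_ligases & usable_ligases),
    }
    return [check for check, passed in checks.items() if not passed]


def protac_shortlist(profiles):
    """Keep only the proteins that pass every check."""
    return [p.name for p in profiles if not failed_criteria(p)]
```

Returning the failed checks, rather than a bare yes or no, keeps the filter transparent: a reviewer can see why a protein was excluded and override the decision where the annotation is doubtful.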

Such an approach is not limited to AI-first biotech companies. We can help you bring the power of target prioritisation to bear on your specific technology and are happy to walk you through our approach. Just get in touch.

Further reading

Opinion article by Hopkins and Groom in 2002, which coined the term "druggable genome" and analysed druggable proteins, as well as proteins which could become drug targets.

Review article published in 2020, describing the challenges for drugging cancer genomes and how personalised medicine approaches and novel treatment strategies might overcome them.

Scientific report published in Nature 2020, in which data mining and integration are leveraged to inspect target innovation trends in drug discovery.