SGC Utilizes AI-driven hit-finding technologies to discover novel small molecule ligands for the WDR protein family

From left to right: Shabbir Ahmad, Levon Halabelian and Serah Kimani at the SGC-Toronto laboratory.

A recent study led by researchers at the SGC-Toronto identified a first-in-class small molecule ligand for WDR91 to better understand the role of this protein in physiology and viral infection.

Among the approximately 20,000 human proteins encoded by genes, not all are amenable to modulation by drug-like small molecules. Only a small fraction of them can bind to a small molecule and serve as potential drug targets. This subset of proteins is known as the druggable proteome.

WD40 repeat domain proteins (WDRs) represent one of the largest protein families in the human genome and are involved in a wide spectrum of cellular processes, including epigenetics, ubiquitin signaling, DNA repair, RNA splicing, and immune system signaling, many of which are perturbed in human diseases. While the Structural Genomics Consortium was the first to demonstrate that WDR proteins appear to be druggable, most human WDRs have remained largely uncharted territory in drug discovery, especially when compared to other major drug target families, such as kinases and GPCRs.

“WDRs are doughnut-shaped β-propeller domains featuring druggable central pockets, often referred to as ‘doughnut holes’. These sites frequently interact with peptide regions of key interaction partners, rendering them attractive targets for small molecule drug discovery”, explains Dr. Levon Halabelian, Principal Investigator at the SGC-Toronto.

Dr. Halabelian’s group has been working on shedding light on this family of proteins and evaluating hit-finding technologies. In his latest study published in the Journal of Medicinal Chemistry, his group, along with collaborators, discovered a novel small-molecule ligand targeting the WDR domain of WDR91 by using affinity-mediated DNA-encoded chemical library (DEL) selection followed by machine learning (ML).

To identify the initial hits, the WDR domain of WDR91 was screened against the X-Chem DEL deck of more than 125 billion different small molecules, which were tested simultaneously for their ability to bind to the protein target. “The X-Chem library deck contains billions of drug-like and lead-like compounds and is a proven resource for the efficient initiation of drug-discovery projects,” says Dr Anthony D Keefe, SVP of Innovation at X-Chem. He continues, “It is sometimes hard to fully harness the richness and scale of the selection output datasets for compound design, and it is often worthwhile to include workflows that utilize machine learning. These generate algorithms with predictive power for compounds outside of the screened library that can, in turn, be sourced and validated, as was done in the WDR91 project.”

An ML model by Google was trained with this screening data and predicted diverse putative drug-like ligands from Enamine’s publicly available database. “We saw that training an ML model using DEL selection output data can enable and accelerate the discovery of small molecule ligands from readily accessible compounds, without the need for expensive, custom, off-DNA chemical synthesis,” says Dr. Shabbir Ahmad, a post-doctoral fellow in Halabelian’s group and first author of the study.

Dr. Shabbir Ahmad, post-doctoral fellow in Halabelian’s group at SGC-Toronto.

The SGC validated these hits and identified a first-in-class small molecule ligand of WDR91. “We believe that these compounds could guide medicinal chemistry efforts to develop potent and selective WDR91 chemical probes to further characterize its function in cellular disease models and evaluate its therapeutic potential as an antiviral against coronaviruses and related viruses”, says Dr. Ahmad.

The success of this project relied heavily on collaborations with SGC’s industry partners including Google, Relay and X-Chem, as these capabilities are not available (at the scale required) in academia, and such data is normally not released to the public. “We engage the community to work with us and create more opportunities for collaboration by providing a window into the multidisciplinary drug discovery process”, says Dr. Halabelian. “Our partners take advantage of SGC’s expertise in protein production, biophysical assays and crystallography and at the same time they get important feedback for their screening methods”.

Notably, this isn't SGC's first foray into using DEL-ML for hit finding. Dr. Halabelian’s team, in collaboration with Relay (formerly ZebiAI) and X-Chem, used this DEL-ML approach to discover a micromolar-affinity small molecule hit targeting the WDR central pocket of DCAF1. DCAF1 is a key player in the Ubiquitin Proteasome System (UPS), acting as the substrate recruitment component of two distinct E3 ligases, and is a potential target for cancer therapies. By screening a vast compound library against the WDR domain of DCAF1, they identified a promising hit compound that binds to the WDR central pocket. In collaboration with the OICR, Dr. Halabelian’s team further advanced this molecule into a nanomolar-affinity ligand, which serves as a valuable tool for future development of DCAF1-based PROTACs.

Earlier last year, Dr. Halabelian’s group published a study in collaboration with Recursion Canada (formerly Cyclica) in the Journal of Chemical Information and Modeling. In this study, Recursion used its proprietary deep learning platform to assess commercially available libraries for potential small molecule binders of three WDR proteins.

“SGC has expertise in structural and chemical biology, and protein bioinformatics and importantly has the infrastructure and expertise to support crystallography. Recursion's technology and workflows enabled the rapid discovery of the first-ever publicly deposited small molecule co-crystallized with DCAF1. Working with an open science consortium, such as the SGC, allows knowledge to be shared as quickly as possible that will hopefully drive new treatments”, says Julie Owen, Director of Chemistry and one of the lead authors for this study.

Dr. Serah Kimani, one of the lead authors of this study and a research scientist in Halabelian's group, added, “Utilizing computational methods has a big potential to identify small molecule binders for protein targets with no known ligands and not enough data deposited publicly”.

Dr. Serah Kimani, a research scientist in Halabelian's group at SGC-Toronto.

These collaborative efforts showcase how the industry can advance its internal drug discovery pipeline while contributing to SGC’s open science model. “We were the first to publicly release the co-crystal structure of DCAF1 bound to a small molecule in the Protein Data Bank (PDB), and we saw the field rapidly evolving. While our manuscript was under revision, two series of DCAF1 ligands were reported, highlighting the importance of having our data available for the scientific community to use”, says Dr. Kimani.

Looking ahead, Dr. Halabelian’s work and X-Chem's contributions align with the ambitious Target 2035, an open science movement that aims to develop a pharmacological modulator for every protein in the human proteome by 2035. “Delineating the druggable genome and using high-throughput methods for the screening of small molecules that bind to proteins will help us contribute to the mission of Target 2035”, explains Dr. Halabelian.


This work was supported by Genome Canada.
