Are there drugs beyond the rule of 5?

A J. Med. Chem. paper from the Kihlberg and Dobritzsch groups at Uppsala University takes a look at drugs and clinical candidates beyond Lipinski’s rule of 5. The authors analyze how such ligands bind to their targets and how these interactions differ from those of Ro5 compounds.

The first cool thing is that the authors apply principal component analysis to the analyzed compound datasets (I hadn’t seen that before in the literature). What’s remarkable is that three out of four parameters from the original Ro5 (MW, HBA, and HBD) and some commonly used extensions (PSA and rotatable bonds) actually correlate very well with each other. This in fact makes perfect sense: the bigger the compound, the more likely it is to have more hydrogen bond donors/acceptors and rotatable bonds. cLogP stands somewhat apart, probably because it is not cumulative. As a result, the six parameters can be reduced to two principal components accounting for 92% of the variance, or to three, which cover 97%.

Spearman correlation coefficients for physicochemical parameters

Next, the authors turn their attention to the opposite side of the drug discovery business: the biological targets. It appears that in the relatively new target classes (kinases, proteases, transferases, etc.) non-Ro5 drugs and clinical candidates actually outnumber their Ro5 counterparts. Not surprisingly, parenteral dosing also prevails for these compounds (but 30% are still orally available).

Finally, to bring the drugs and targets together, the authors analyze the binding modes of three identified compound clusters (Ro5, ‘extended’ eRo5 and ‘beyond’ bRo5). As a metric for compound interaction, they use the proportion of buried surface, which actually differs a lot between the drug clusters. The bigger the deviation from Ro5, the lower the proportion. It seems that bigger molecules don’t need that much coverage by their targets to bind tightly. At the same time, the interfaces between drugs and targets do not differ regardless of compliance with Ro5. Neither do the affinity measures (IC50, Kd and the like). The authors draw two major conclusions from these observations:

Firstly, drugs outside Ro5 space do not require higher affinities for their targets compared to Ro5 compliant drugs to compensate for any perceived or actual unfavourable pharmacokinetics. Secondly, despite being perceived as “difficult”, binding sites that are larger and more open can be modulated by drugs with similar affinities as drugs directed to sites traditionally considered highly “druggable”.

While the second conclusion is nothing new (it’s not the affinity per se that should be different, it’s how to reach that high affinity that bothers medicinal chemists), the first one casts a shadow on the application of metrics such as LE and LLE to compounds beyond Ro5. LE is predictably lower for bigger molecules and correlates with compound shape. Frankly, the latter correlation seems redundant, as shape was also well correlated with compliance with Ro5.
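For reference, here are the commonly used definitions of the two metrics, written out as a sketch. The numbers in the worked example are purely illustrative and are not taken from the paper; N_heavy denotes the number of non-hydrogen atoms, and the factor 1.37 kcal/mol assumes a temperature of about 298 K.

```latex
\mathrm{LE} = -\frac{\Delta G}{N_\mathrm{heavy}}
            \approx \frac{1.37 \times \mathrm{pIC_{50}}}{N_\mathrm{heavy}}
            \ \text{kcal mol}^{-1}\ \text{per heavy atom},
\qquad
\mathrm{LLE} = \mathrm{pIC_{50}} - \mathrm{cLogP}

% Illustrative example (hypothetical compounds, same affinity, pIC50 = 8):
%   N_heavy = 25:  LE \approx 1.37 \times 8 / 25 \approx 0.44
%   N_heavy = 60:  LE \approx 1.37 \times 8 / 60 \approx 0.18
```

If affinities really are similar across the clusters, LE has to drop as the heavy atom count grows, simply because of the denominator.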

Finally, the authors analyze macrocycles as a representative subclass of bRo5 ligands. Quite counterintuitively, they claim that macrocycles are not more rigid than their acyclic bRo5 (pun intended). But that just means that acyclic compounds obtain their rigidity from other sources (e.g. amide and double bonds, aromatic rings, etc.). In general, it follows from the discussion that there’s nothing too special about macrocycles. Their most peculiar feature is an exceptional ability to bind flat protein surfaces. Hence, they are excellent tools for the right problem. So is the rule of five.

In conclusion, the authors propose to extend the boundaries of the original Ro5, which seem to be too tight. This raises a logical question: is the next extension just a matter of time?

P.S. Extra kudos to the authors for using R/ggplot2 for graphics in the main text of the paper (I just wonder why they don’t use it in the SI).


Studying memory, top-down

It is sometimes useful to raise your head from the ground and have a look at the opposite side of a scientific field. Conceptually, most research is done via two approaches: top-down and bottom-up. In memory research, while some scientists are trying to identify the right receptor or gene and manipulate it with molecular precision (bottom-up), others put electrodes into different brain areas and fire entire groups of neurons (top-down).

Naturally, both approaches have their pros and cons. The greatest question in going bottom-up is “Will the mechanism work on the next level of complexity?” When you go from the top, however, you will always be left with the question “How in the world did it work?”

A recent overview in Nature prompted me to look at the websites of groups doing that kind of research. There’s certainly lots of mathematical modeling and pattern recognition involved, which can lead to quite remarkable results in the reverse-engineering of neural circuitry. At the same time, reading the publication titles left me with a sense of how little we know even about such seemingly trivial circuits as CA1-CA3 in the hippocampus, which has been known since the 1960s or 1970s to be crucial for memory formation. This kind of argument puts a big question mark over the Human Brain Project. Will the neuron-by-neuron reverse-engineering of the brain help us understand its function? Probably not, unless someone digs from the opposite side.


AMPAR trafficking is complicated

A team from Johns Hopkins reported in PNAS a new pathway regulating the trafficking of the AMPA receptor subunit GluA1. It involves a previously unknown phosphorylation of the receptor by the PAK3 kinase. However, the mechanism is not straightforward, and some of the reported data appear contradictory. It appears that stimulation of EphrinB2, another player in activity-based synaptogenesis, leads to phosphorylation of the GluA1 subunits and increases their recruitment to the synaptic membrane. On the other hand, mutation of serine S863, which is supposed to be phosphorylated, to either alanine or aspartate leads to the same increased surface recruitment of the subunit. The aspartate result can be explained by the similarity of its carboxylate group to a phosphorylated serine, but alanine is the obvious outlier.

All in all, the discovery of PAK3 as an AMPA receptor trafficking regulator is unambiguous and may provide a mechanistic rationale for X-linked intellectual disability. But the data are still insufficient to build a robust regulatory pathway, and in my opinion the scheme proposed by the authors doesn’t explain all the observations.

Compiling structures into a PDF file (Linux)

During my job-searching campaign I was once asked to show all the structures that I had synthesized. Drawing 200+ molecules seemed no fun to me. Even opening all the .cdx files generated in 3.5 years, to copy-paste them into a single one, was too boring. So I used Open Babel for this job.

Once I had all the .cdx files in one folder, I ran

babel *.cdx allStruc.svg -xe -xl -xC
rsvg-convert -f pdf -o allStruc.pdf allStruc.svg

But the output was weird. All the charged molecules were assigned unrealistic charges over +2000, so all my potassium trifluoroborate and ammonium salts were crap.

Then I turned to the molconvert tool from ChemAxon, which is free for academic non-commercial use. To convert all the .cdx files to correct SMILES I used a simple script:

# convert every .cdx file in the current folder and collect the SMILES in one file
for i in *.cdx; do
    ~/marvin/bin/molconvert smiles "$i" -o tmp.smi
    cat tmp.smi >> smiles.smi
done

This was followed by Open Babel (I decided to sort the molecules by molecular weight so that the complexity increases more or less steadily down the list):

babel smiles.smi allStruc.svg -xe -xl -xC --sort MW
rsvg-convert -f pdf -o allStruc.pdf allStruc.svg

Still, the conversion wasn’t ideal. In particular, BF3¯ groups were represented as BF2·F¯. Fortunately, simply replacing the SMILES fragment ‘B(F)F’ with ‘[B-](F)(F)F’ and removing the extra fluoride (‘[F-].’ in SMILES) solved the problem.
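The BF3¯ fix can be scripted with sed, roughly as sketched below. This is only a sketch: the function name fix_bf3 and the output file name are made up for illustration, and it assumes the stray fluoride always appears exactly as ‘[F-].’ (fluoride first, then the dot) and that nothing else in the file legitimately contains that fragment.

```shell
# fix_bf3: hypothetical helper, reads SMILES on stdin, writes fixed SMILES to stdout
fix_bf3() {
    # 1st expression: drop the detached fluoride component '[F-].'
    # 2nd expression: rewrite the neutral 'B(F)F' as the anionic '[B-](F)(F)F'
    # (in basic sed regexes, parentheses are literal; brackets and dots are escaped)
    sed -e 's/\[F-\]\.//g' -e 's/B(F)F/[B-](F)(F)F/g'
}

# Usage (file names are assumptions):
#   fix_bf3 < smiles.smi > smiles_fixed.smi
```

Writing to a new file rather than editing in place keeps the raw molconvert output around, in case the assumption about the fluoride position doesn’t hold for some entries.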

So, here we go, the work of 3.5 years as an almost square 15×14 matrix:

The final result

ASOs for spinal muscular atrophy

In a new paper, scientists from Isis Pharmaceuticals report on the development of a new mouse model for spinal muscular atrophy types I and II. The disease emerges from corrupted splicing of the SMN genes. The problem with previous models was that they were either too severe (with complete knockout of the ‘good’ protein) or too mild. So the authors attempted to balance the copy number of the protein and create an ‘intermediate’ mouse line. They achieved that by combining ‘mild’ and ‘severe’ alleles and inserting an additional human SMN2 gene into the corresponding murine locus. The resulting mice could live long enough to develop the expected neuromuscular pathology with a relatively late onset of symptoms.

What’s more exciting is that when the mutant mice were treated with an antisense oligo (ASO) targeting the pre-mRNA of the SMN2 gene, both lethality and symptoms improved. Even more surprising was the finding that delivery of the drug into the CNS was not required for the improvement. The question remains whether this finding translates to patients. Potentially this can lead to a better understanding of the SMA pathology, namely whether the disease originates in muscles or in neurons and what the feedback loops between the two cell types are.