NMR strategies to support medicinal chemistry workflows for primary structure determination

https://doi.org/10.1016/j.bmcl.2016.11.066

Abstract

Central to drug discovery is the correct characterization of the primary structures of compounds. In general, medicinal chemists make great synthetic and characterization efforts to deliver the intended compounds. However, there are occasions on which incorrect compounds are presented, such as those reported for Bosutinib and TIC10. This may be due to a variety of reasons, such as uncontrolled reaction schemes, reliance on limited characterization techniques (LC–MS and/or 1D 1H NMR spectra), or even a lack of availability or knowledge of characterization strategies. Here, we present practical NMR approaches that support medicinal chemistry workflows for addressing compound characterization issues and allow for reliable primary structure determinations. These strategies serve to differentiate between regioisomers and geometric isomers, distinguish between N- versus O-alkyl analogues, and identify rotamers and atropisomers. Overall, awareness and application of these available NMR methods (e.g. HMBC/HSQC, ROESY and VT experiments, to name only a few) should help practicing chemists to reveal chemical phenomena and avoid mis-assignment of the primary structures of compounds.

Introduction

The art of medicinal chemistry involves a wide range of synthetic and characterization avenues. It therefore stands to reason that the delivery of designed compounds that have accurate primary structures depends on careful considerations of both. However, the inherent complexities of this field of science can be overwhelmingly challenging, and thus prone to error.

To identify the potential sources of error, one must first better understand a typical medicinal chemistry workflow. It commences with defining a synthetic scheme, then initiating the reactions and monitoring reaction progress. The progress of one or multiple reaction steps is, in general, monitored with accessible tools such as thin-layer chromatography (TLC), liquid chromatography and mass spectrometry (LC–MS) and 1D 1H NMR spectroscopy. These tools can be used alone or in combination for monitoring reactions and every subsequent stage (i.e. reaction work-up, compound purification and final compound characterization).

Unfortunately, errors can occur at any of the steps involved in this workflow: inherently uncontrollable reaction schemes are widespread, the monitoring and characterization tools have their inherent limitations, and small-molecule compounds have fascinating elusive properties.1 This can be exacerbated by the lack of one or more of the above tools due to finances or due to pressures to produce compounds at an ever faster rate. The focus on speed as a measure of productivity can lead to sacrifices in the quality control of compounds. Lastly, there can be limited knowledge of how to use modern strategies for appropriately characterizing the primary structures and properties of compounds.

The mis-assignment of the primary structures of compounds is a serious issue facing the pharmaceutical industry, academia and research institutes. In the past decade, about 160 compounds (synthetic and natural) were reported with incorrect primary structures and then officially revised.2 This highlights potential issues with the chemical literature, and it exposes the real and likely possibility that many other compounds (reported and unreported) are also mis-assigned.

Two example cases are mentioned here to sensitize the reader to the problem, namely the cases of Bosutinib and TIC10. Bosutinib is a tyrosine kinase inhibitor marketed under the trade name Bosulif as an anticancer agent.3 In 2012, C&EN warned consumers that stocks of the incorrect isomer (compound 1a, Fig. 1) were being sold rather than the correct isomer (compound 1b, Fig. 1).4 It was later revealed that the situation arose because the incorrect starting material 3,5-dichloro-4-methoxyaniline (1c, Fig. 1) was used instead of the correct starting material 2,4-dichloro-5-methoxyaniline (1d, Fig. 1).4

Another example involving TIC10 is even more striking. The compound was first patented in 1973 by “company A”,5 and then patented again in 2013 by “company B” based on its broad spectrum activity against multiple malignancies.6 Independently, the Scripps research group was studying the role of TIC10 in inducing apoptosis in cancer cells, and discovered that TIC10 prepared in their lab gave an unexpected negative result while that obtained from the National Cancer Institute (NCI) library gave the expected positive result. Detailed primary structure analyses of TIC10 from both sources were then undertaken, and the findings revealed that the wrong chemical structure had twice been patented as compound 2a (Fig. 2), which is inactive, and that the actual active compound was the isomer shown as compound 2b in Fig. 2.7

It is impossible to know the full financial and health burden that incorrect compounds place on our society. Although many errors are reported and structures revised, many are not disclosed or are not yet discovered. Nonetheless, it is interesting that the judicious use of NMR HSQC, HMBC and ROESY experiments could, in many cases, easily confirm the accurate primary structure and avoid many issues. That said, it should also be kept in mind that 2D NMR methods only became available to chemists after the late 1980s.

Our aim is to raise awareness about this topic and to sensitize medicinal chemists to potential sources of the errors and, more importantly, demonstrate some available tools for compound characterization. One needs to be aware of structural ambiguity that can arise from the following: (i) regioisomerism, (ii) stereoisomerism, (iii) atropisomer/rotamer formation, (iv) adduct formation, (v) salt formation, (vi) geometric isomerism, (vii) chemical exchange, (viii) aggregation, (ix) reactions that proceed via unusual mechanisms like SN2′ in place of SN2 and neighboring group participation in place of SN2, (x) N-vs-O alkylation, (xi) tautomerism, (xii) ring slippage, and (xiii) rearrangement.

This paper focuses on the atomic-level advantages of NMR methods for proper characterizations, including 1H 1D, 1H selective decoupling, 1H–1H COSY, 1H–13C/15N HMQC/HSQC, 1H–13C/15N/19F HMBC (and variants), 1H ROESY, 1D 19F and variable temperature (VT) NMR. Furthermore, significant progress has been made in applying computer-assisted methods as valuable structural elucidation tools such as ACD and CASE. Here, we provide some examples of the use of some of these NMR methods for the accurate determination of the primary structures of compounds.

Uncontrolled reaction schemes and unpredictable behavior of small molecules can cause a functional group or moiety to be linked to a molecule in an unusual fashion, yielding regioisomeric products. When this occurs, it is difficult to distinguish the regioisomers by running only LC–MS and/or 1D 1H NMR. Furthermore, the two isomeric products are often not both available for comparison purposes. On the other hand, regioisomers can be clearly characterized by readily available NMR methods such as HMQC (HSQC), which gives single-bond 1JH,C correlations.8 A valuable complement to this method is the through-bond HMBC experiment, which reveals multi-bond correlations over two and three bonds (2JH,C and 3JH,C). In addition, correlations can also be transferred across heteroatoms and quaternary carbon atoms, which is important in characterizing compounds with different spin systems.9

Fig. 3 illustrates the application of HMBC which helped to distinguish between the two regioisomers 3a and 3b (Fig. 3). In compound 3a, the ester moiety is meta to atom 2. In the HMBC spectrum (on the right), H2/C1 is observed as the only low-field correlation to atom 2. No H2/C3 crosspeak is observed whereas H4/C3 appears as the lowest crosspeak in the vertical f1-axis. This is consistent with the structure of compound 3a. For compound 3b, the ester moiety is ortho to C2, and two low-field crosspeaks (H2/C1 and H2/C3) with atom 2 are observed in the HMBC spectrum.

In the above example, a chemist would be considered fortunate given that the two regioisomers are available for comparison purposes. Often, a chemist has only one product available for full characterization and therefore must be careful to consider all NMR data/crosspeaks for assignment and characterization. Knowledge of the reaction scheme employed and consistency of the NMR data with the proposed primary structure are often sufficient for conclusions to be made.
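The consistency check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the atom labels follow Fig. 3, the expected and absent crosspeaks are only those named in the text, and the function names are hypothetical. Because missing HMBC peaks can have other causes, a real analysis would treat an absent crosspeak as weaker evidence than an observed one.

```python
# Crosspeaks are (proton, carbon) label pairs, using the atom labels of Fig. 3.
# The sets below only encode the correlations named in the text; a real
# analysis would list every expected 2J/3J correlation for each candidate.
CANDIDATES = {
    "3a": {"expected": {("H2", "C1"), ("H4", "C3")}, "absent": {("H2", "C3")}},
    "3b": {"expected": {("H2", "C1"), ("H2", "C3")}, "absent": set()},
}


def check_candidate(observed, expected, absent):
    """Return (missing, conflicting) crosspeaks for one candidate structure."""
    missing = expected - observed        # predicted but not seen
    conflicting = absent & observed      # seen but predicted to be absent
    return missing, conflicting


def consistent_candidates(observed):
    """List candidates whose expected peaks are all seen and none conflict."""
    names = []
    for name, rule in CANDIDATES.items():
        missing, conflicting = check_candidate(
            observed, rule["expected"], rule["absent"]
        )
        if not missing and not conflicting:
            names.append(name)
    return names


# Hypothetical peak lists mirroring the two spectra described for Fig. 3.
observed_a = {("H2", "C1"), ("H4", "C3")}
observed_b = {("H2", "C1"), ("H2", "C3")}
print(consistent_candidates(observed_a))
print(consistent_candidates(observed_b))
```

When only one isomer is in hand, the same comparison still applies: the single observed peak list is tested against every plausible candidate rather than against a second spectrum.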

As with all experimental methods, HMBC experiments have shortcomings. The through-bond coupling constants can vary significantly, making it difficult to distinguish between crosspeaks that arise from 3JH,C and 2JH,C couplings.10 The coupling constant (3JH,C) depends on the dihedral angle11 among other factors, and when the dihedral angle approaches 90°, crosspeaks may be apparently absent; thus the absence of peaks must be considered in a full analysis. Another issue with HMBC is the appearance of 1JH,C and sometimes 4JH,C crosspeaks,12 which can potentially confuse initial interpretations. Fortunately, ambiguities in HMBC data can sometimes be resolved using newly available variants of the HMBC and HSQC experiments. Complementary experiments such as ROESY (Fig. 4, vide infra) should also be considered; it is therefore best to acquire a series of NMR experiments when attempting to confirm the primary structures of compounds.
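To illustrate why a dihedral angle near 90° can make an HMBC crosspeak vanish, the following sketch combines a Karplus-type relation with the sine dependence of long-range transfer. Everything numerical here is an assumption for illustration: the coefficients are one literature-style parametrization for a simple hydrocarbon fragment and will differ for real compounds, the 8 Hz optimization value is a commonly used default rather than anything stated in this article, and relaxation is ignored.

```python
import math

# Illustrative Karplus-type coefficients (Hz) for 3J(C,H). These are an
# assumption; real values depend on substituents and bond geometry.
A, B, C = 4.26, -1.00, 3.56


def j3_ch(dihedral_deg):
    """Estimate 3J(C,H) in Hz from the dihedral angle in degrees."""
    phi = math.radians(dihedral_deg)
    return A + B * math.cos(phi) + C * math.cos(2 * phi)


def hmbc_transfer(j_hz, j_opt_hz=8.0):
    """Relative transfer sin(pi * J * delay) for delay = 1 / (2 * J_opt).

    j_opt_hz is the coupling the long-range delay is tuned for (assumed 8 Hz).
    """
    delay = 1.0 / (2.0 * j_opt_hz)
    return abs(math.sin(math.pi * j_hz * delay))


for angle in (0, 60, 90, 120, 180):
    j = j3_ch(angle)
    print(f"{angle:3d} deg  3J ~ {j:4.1f} Hz  relative transfer ~ {hmbc_transfer(j):.2f}")
```

With these assumed coefficients the estimated coupling drops below 1 Hz at 90°, so only a small fraction of the signal is transferred when the delay is tuned for a much larger coupling.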

ROESY experiments are valuable for structure elucidation because they provide through-space 1H–1H distance information between hydrogens that are within about 5 Å of each other.13 Fig. 4 shows an example where ROESY experiments were necessary to discriminate between isomeric versions of the alkylated azabenzimidazoyl moiety.

Upon inspection, regioisomers 4a and 4b differ only by their attachment point on the azabenzimidazoyl moiety. From a simple combination of 1D 1H and ROESY NMR data (top of Fig. 4), one can identify correlations H2/H1 and H2/H6, which are consistent only with compound 4a and not 4b. On the other hand, the H4/H1 and H4/H6 crosspeaks observed in the ROESY spectrum (bottom of Fig. 4) are consistent only with the primary structure of compound 4b and not 4a. Fortunately, this example shows distinct NMR spectra for 4a and 4b that are consistent with their corresponding structures, which allows for cross-verification. Often, only one isomer is available for full NMR analysis.

Note that a third isomer is also possible (atom 6 linked via the pyridine nitrogen atom), but an analysis of the ensemble of NMR data (ROESY, HMQC, HMBC) was clearly inconsistent with this possibility.

Several issues can arise with ROESY data, of which chemists should be aware. Strongly J-coupled hydrogens can give rise to COSY-like artifacts, which produce mixed absorptive/dispersive crosspeaks that cannot be used for distance information. Also, compounds that experience slow chemical exchange can produce apparent artifacts: ROESY crosspeaks can be found between the corresponding hydrogens of the entities involved in chemical exchange. Such peaks can readily be identified because they have the same sign as the diagonal, opposite to that of distance-related peaks.13
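The sign rule above lends itself to a small helper. This is a simplified sketch with a hypothetical function name and inputs; it assumes each crosspeak has already been phased and reduced to a sign and a flag for its lineshape, which in practice requires inspection of the spectrum.

```python
def classify_roesy_crosspeak(crosspeak_sign, diagonal_sign, pure_absorption=True):
    """Classify a ROESY crosspeak from its sign relative to the diagonal.

    Signs are +1 or -1. pure_absorption=False marks a peak with mixed
    absorptive/dispersive character, as described for COSY-like artifacts.
    """
    if crosspeak_sign not in (1, -1) or diagonal_sign not in (1, -1):
        raise ValueError("signs must be +1 or -1")
    if not pure_absorption:
        return "possible J-coupling artifact: do not use for distances"
    if crosspeak_sign == diagonal_sign:
        return "chemical exchange"
    return "through-space ROE (distance related)"


# Example: positive diagonal, negative crosspeak with a clean lineshape.
print(classify_roesy_crosspeak(-1, +1))
```

The lineshape check comes first because a dispersive artifact can carry either sign locally, which would make the sign comparison unreliable.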

Geometric (E/Z) isomers are formed as a result of restricted rotation about a double bond. Fig. 5 shows an example of cis/trans isomers (colored red). In simple scenarios, these types of isomers can easily be identified from their coupling constants, which usually lie in the range of ∼3–13 Hz for the cis and ∼12–20 Hz for the trans isomer. Many times, however, couplings to other hydrogen atoms result in the appearance of resonance multiplets that can hinder a simple observation of the coupling across the double bond (i.e. hydrogen 2 with 3). In such cases, selective decoupling is required. This is demonstrated for compound 5a (Fig. 5), where the 1H NMR spectrum of the cis isomer reveals H2 as a multiplet (middle) due to its coupling to the two H1 protons and H3. H3, on the other hand, is obscured by overlap with H5. Because of this, 3JH3,H2 cannot be determined directly. However, by selectively decoupling H1 (right), proton H2 simplifies to a doublet from which its coupling to H3 (3JH2,H3 = 9.8 Hz) can be derived. This coupling falls within the range typically expected for the cis orientation (5a, Fig. 5). For compound 5b, H3 is isolated but is coupled to H4 (Fig. 5). However, selective decoupling simplified the resonance, from which the large coupling with H2 can be extracted as 3JH3,H2 = 15.2 Hz, consistent with the trans isomer. In this case, the sample for compound 5b also contained the minor cis impurity, which made it impossible to determine 3JH2,H3 directly.
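The range-based assignment can be written as a short function. This is only a sketch: the limits are the approximate ranges quoted above, the function name is hypothetical, and values in the overlap region are deliberately reported as ambiguous rather than forced into one class.

```python
CIS_RANGE_HZ = (3.0, 13.0)     # approximate cis range quoted in the text
TRANS_RANGE_HZ = (12.0, 20.0)  # approximate trans range quoted in the text


def assign_alkene_geometry(j_hz):
    """Suggest cis/trans from a vicinal 1H-1H coupling across a double bond."""
    in_cis = CIS_RANGE_HZ[0] <= j_hz <= CIS_RANGE_HZ[1]
    in_trans = TRANS_RANGE_HZ[0] <= j_hz <= TRANS_RANGE_HZ[1]
    if in_cis and in_trans:
        return "ambiguous"
    if in_cis:
        return "cis"
    if in_trans:
        return "trans"
    return "outside typical ranges"


# Couplings reported for compounds 5a and 5b after selective decoupling.
print(assign_alkene_geometry(9.8))
print(assign_alkene_geometry(15.2))
```

A coupling that lands in the 12–13 Hz overlap would need supporting evidence, for example ROESY contacts across the double bond.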

Interesting properties can result from compounds that experience restricted rotation about single bonds or hindered ring flipping. Slowly exchanging orientations of the same compound can develop due to, for example, steric factors, electronic effects, hydrogen bonding, and others.14 If the barriers to rotation/flipping are >20 kcal/mol then chirality can be generated, thus forming distinct compounds called atropisomers. Detailed discussions on the detection and prediction of atropisomers are covered elsewhere.15

Restricted rotation of <20 kcal/mol results in rotamers (conformers) that can have the appearance of distinct compounds. These slow exchanging entities cannot be separated by LC methods, due to rapid return to equilibrium. Nonetheless, their slow rotation on the NMR timescale can lead to resonance broadening or even doubling of signals, which can result in confusion and mis-interpretations as impurities and, potentially, even as other isomers like diastereoisomers.
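To give a feel for what a 20 kcal/mol barrier means in practice, the sketch below converts a barrier into an interconversion half-life with the Eyring equation. It assumes a transmission coefficient of 1 and simple first-order kinetics, and the function names are hypothetical; it is an order-of-magnitude guide, not a substitute for measurement.

```python
import math

R_KCAL = 1.987204e-3     # gas constant in kcal/(mol*K)
KB_OVER_H = 2.083662e10  # Boltzmann constant / Planck constant in 1/(s*K)


def exchange_rate(barrier_kcal, temp_k=298.15):
    """First-order rate constant (1/s) from the Eyring equation, kappa = 1."""
    return KB_OVER_H * temp_k * math.exp(-barrier_kcal / (R_KCAL * temp_k))


def half_life(barrier_kcal, temp_k=298.15):
    """Half-life in seconds for first-order interconversion."""
    return math.log(2) / exchange_rate(barrier_kcal, temp_k)


for barrier in (15.0, 17.5, 20.0, 25.0):
    print(f"{barrier:4.1f} kcal/mol -> t1/2 ~ {half_life(barrier):.3g} s at 25 C")
```

Under these assumptions a 20 kcal/mol barrier corresponds to a half-life on the order of tens of seconds at 25 °C, which is why such species re-equilibrate during a typical LC run yet still appear as separate or broadened signals on the much faster NMR timescale.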

Acquiring data for samples in different solvents can sometimes help in assessing the presence of rotamers. Rotamers can also be identified by NMR ROESY and/or VT experiments. Fig. 6 shows three compounds that have tertiary amides, which typically exhibit rotameric behavior. All have two sets of peaks as a result of hindered rotation about the amide bond, differentially modulated by the red-colored substituents. Fig. 6 shows how VT is employed for rotamer characterization. As the temperature is raised for compound 6a, the two equally sized signals observed in the 1H NMR spectrum at 27 °C coalesce at 67 °C. This corresponds to rotation kinetics with a barrier of 17.5 kcal/mol. For compound 6b, two signals are also observed, but one is much smaller in intensity, which reflects the thermodynamic influence of the red-colored substituent. Interestingly, compound 6c, with the least bulky methyl group, exhibits a higher barrier to rotation given that the two resonances must coalesce at >67 °C. It is striking how subtle structural changes (benzyl 6a, isopropyl 6b and methyl 6c) can have such an impact on both kinetic and thermodynamic properties.
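A barrier can be estimated from a coalescence temperature with the standard two-site approximation, sketched below. The rate at coalescence is taken as pi times the peak separation divided by the square root of 2, which assumes two equally populated, uncoupled sites (reasonable for 6a, not for 6b), and the result is fed into the Eyring equation. The peak separation is not given in the text, so the 18.2 Hz used here is a hypothetical value chosen only for illustration.

```python
import math

R_KCAL = 1.987204e-3     # gas constant in kcal/(mol*K)
KB_OVER_H = 2.083662e10  # Boltzmann constant / Planck constant in 1/(s*K)


def barrier_at_coalescence(tc_kelvin, delta_nu_hz):
    """Free energy barrier (kcal/mol) from coalescence of two equal signals.

    tc_kelvin: coalescence temperature in kelvin.
    delta_nu_hz: peak separation in Hz in the slow-exchange limit.
    """
    k_c = math.pi * delta_nu_hz / math.sqrt(2)  # exchange rate at coalescence
    return R_KCAL * tc_kelvin * math.log(KB_OVER_H * tc_kelvin / k_c)


tc = 67.0 + 273.15      # coalescence temperature reported for compound 6a
delta_nu = 18.2         # hypothetical separation in Hz, not from the article
print(f"barrier ~ {barrier_at_coalescence(tc, delta_nu):.1f} kcal/mol")
```

Because the separation enters only through a logarithm, the estimate is not very sensitive to it: doubling the assumed separation changes the result by less than 0.5 kcal/mol at this temperature.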

ROESY NMR data offer another reliable way to detect rotamers. ROESY data are typically used to monitor inter-hydrogen distances of small molecules to determine primary structures and conformations.16 ROESY data can also report exchange phenomena17 such as those found for rotamers. Fig. 7 shows the ROESY spectrum of compound 6a, in which crosspeaks that carry distance information have the opposite sign (colored blue) to the red diagonal peaks that run from the top right to the bottom left. On the other hand, peaks that result from intermediate conformational exchange (such as that from rotamers) have the same sign (colored red) as the diagonal peaks.

N- versus O-alkylation

Synthetic alkylation reactions are perhaps the most prone to characterization error. In general, this relatively uncontrolled reaction results in unpredictable O- and N-alkyl analogues. Detailed discussions are provided elsewhere on how we proposed to employ two of three NMR techniques, including ROESY and HSQC/HMBC, to distinguish them.18 Interestingly, 13C chemical shifts can reliably be employed to differentiate N- versus O-alkylation products formed from ambident ligands. The example below (

Acknowledgments

We would like to thank Norman Aubry for his advice and hard work over the years. We also thank medicinal chemistry colleagues from Boehringer Ingelheim, to many of whom Norman and SRL provided 23 years of NMR support for primary structure elucidation.

References (18)

  • B.J. Davis et al., Bioorg Med Chem Lett (2013)
  • S.R. LaPlante et al., J Med Chem (2013)
  • M. Puttini et al., Cancer Res (2006)
  • A. Vultur et al., Mol Cancer Ther (2008)
  • H. Friebolin, Basic One and Two Dimensional NMR Spectroscopy (1998)
  • A.D. Bax et al., J Magn Reson (1985)
  • C.P. Butt et al., Org Biomol Chem (2011)
  • S.R. LaPlante et al., J Med Chem (2014)
  • G.F. Pauli et al., J Org Chem (2016)
  • N.M. Levinson et al., PLoS One (2012)
  • Ingelheim BS. DE 2150062 A1;...
  • El-Deiry WS, Allen JE, Wu GDS. U.S. Patent 8673923B2;...
  • N.T. Jacob et al., Angew Chem Int Ed (2014)
There are more references available in the full text version of this article.
