Introduction

Formaldehyde (HCHO), the simplest aldehyde, is present in blood, intercellular tissue, and within cells1. Exogenous HCHO sources include pharmaceuticals (including prodrugs), cosmetics, paper, furniture, cigarette smoke, plywood and foodstuffs2,3. Endogenous cellular HCHO production occurs via enzyme-catalysed reactions involving semicarbazide-sensitive amine oxidases4, serine hydroxymethyltransferases5, dimethylglycine dehydrogenases6, lipid peroxidases7, P450 oxidases8 and N-methyl group demethylases9,10,11,12. Recent work also suggests HCHO is involved in folate metabolism13. Above threshold levels, HCHO manifests toxicity in humans, with acute HCHO exposure associated with pain, dermatitis14, nausea, arrhythmia15, coma and renal failure16; chronic exposure increases the risk of cancer. Studies with mice have revealed correlations between compromised HCHO metabolism and organ dysfunction, bone marrow failure, leukaemia and liver cancer17. Impaired HCHO metabolism is lethal in models of Fanconi anaemia, a condition characterised by defects in DNA damage repair18.

From a chemical perspective, HCHO (in its unhydrated form) is a potent electrophile that can react with biological nucleophiles in proteins and DNA. HCHO reportedly facilitates formation of intra-strand and DNA–protein cross-links in vitro19,20. It is probable that related reactions of HCHO in cells are responsible for its toxic/carcinogenic effects, although the precise chemistry of such reactions and their cellular prevalence is poorly defined. Given HCHO is produced endogenously, it is possible that it, or its adducts, play sensing or regulatory roles including in redox metabolism21; such roles might include dynamically regulating interactions between biomolecules, and/or forming functionally important adducts to nucleic acids and associated proteins22.

To better understand how HCHO and its derivatives influence cellular functions in health and disease, it is important that its reactions with biomolecules are defined. Von Hippel and McGhee pioneered studies on reactions of nucleosides/nucleotides with HCHO19,23; these, together with more recent work18,22, reveal substantial differences in the rates of reaction of HCHO with nucleobases, and in the stabilities of the HCHO-nucleobase adducts (including when produced by enzymatic oxidation of N-methyl groups).

Following pioneering work in the 1920s, numerous studies have been reported on reactions of HCHO with amino acids (AAs) and peptides24,25,26,27,28,29,30,31,32,33,34,35,36,37,38. However, few detailed systematic studies have been reported on reactions of HCHO with amino acids. Following from recent work investigating some of the products of the reactions of HCHO with arginine, lysine and short peptides39,40, we report systematic nuclear magnetic resonance (NMR) studies on reactions of HCHO with common proteinogenic and other biologically relevant AAs. The results reveal HCHO reacts with different AAs under a range of conditions, giving hydroxymethylated, cyclised, N-methylated and N-formylated products of very different stabilities. Comparison of the reactions of HCHO with those of other tested biologically relevant aldehydes/ketones implies HCHO reacts fastest and, in general, forms the most stable products. The results highlight the potential of HCHO-derived adducts to be involved in healthy biology, disease and evolution.

Results

Product identification in reactions of AAs and HCHO

Initially, we investigated reactions of HCHO with the following common proteinogenic and other biologically relevant AAs: alanine (Ala), cysteine (Cys), serine (Ser), threonine (Thr), homocysteine (Hcy), penicillamine (Pen), homoserine (Hse), allo-threonine (allo-Thr), lysine (Lys), ornithine (Orn), arginine (Arg), histidine (His), tryptophan (Trp), tyrosine (Tyr), asparagine (Asn), glutamine (Gln), Nε-methyllysine (Lys(Me)), Nε,Nε-dimethyllysine (Lys(Me)2), 5-hydroxylysine (5hLys) and proline (Pro). Standard reaction conditions comprised AA (4.34 mM) with HCHO (10 equivalents) in D2O. Reactions were monitored over 48 h by 1H NMR.
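As a rough illustration of the standard conditions, the following sketch converts the stated AA concentration and HCHO equivalents into absolute amounts per NMR sample. It assumes the 570 µL sample volume quoted in the figure captions; the function name and argument names are hypothetical and chosen only for this example.

```python
def sample_amounts(aa_mM, equivalents, volume_uL):
    """Return (AA in µmol, HCHO in µmol, HCHO in mM) for one sample.

    Assumes ideal mixing and that 'equivalents' is defined
    relative to the amino acid (AA).
    """
    # mM x µL gives nmol; divide by 1000 to obtain µmol
    aa_umol = aa_mM * volume_uL * 1e-3
    hcho_umol = aa_umol * equivalents
    hcho_mM = aa_mM * equivalents
    return aa_umol, hcho_umol, hcho_mM


# Standard conditions: 4.34 mM AA, 10 equivalents HCHO, 570 µL (assumed volume)
aa_umol, hcho_umol, hcho_mM = sample_amounts(4.34, 10, 570)
print(f"AA: {aa_umol:.2f} µmol; HCHO: {hcho_umol:.2f} µmol ({hcho_mM:.1f} mM)")
```

Under these assumptions, each sample contains about 2.5 µmol of AA and a nominal HCHO concentration of 43.4 mM.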

The most reactive proteinogenic AA studied was Cys, which gave thiazolidine 1 as the ‘thermodynamic’ product (Fig. 1, black circle), and hemithioacetal 2 as the apparent ‘kinetic’ product (Fig. 1, white diamond, and Fig. 2 (1)). The observation of hemithioacetal 2 was relatively prolonged under acidic conditions (DCl, 2 equivalents), possibly relating to increased Nα-amino group protonation. Cyclisation to give thiazolidine 1 was enhanced at elevated temperatures (308 and 318 K) and on addition of base (NaOD, 1 equivalent). A similar reactivity was observed with homocysteine (Hcy), which gave cyclic thiazinane 3 as the thermodynamic product and hemithioacetal 4 as the kinetic product (Fig. 2 (2))41. The reaction of the Cys analogue penicillamine (Pen) with HCHO gave a thiazolidine, 5, and kinetic formation of the hydroxymethylated thiol, 6. Stable hemiaminal-type adducts or imines were not directly observed within detection limits by reaction of HCHO with the Nα-groups of Cys/Hcy/Pen, or with any of the tested AAs; however, hemithioacetals 2/4/6 may not be intermediates in formation of 1/3/5 (see below).

Fig. 1
The reaction between cysteine and formaldehyde (HCHO). HCHO reacts with cysteine to give hemithioacetal and thiazolidine products. a Reaction of cysteine (4.34 mM, white circle) with HCHO (2 equiv.) under standard conditions, monitored by 1H NMR over time, reveals formation of hemithioacetal 2 (white diamond) and thiazolidine 1 (black circle); b monitoring the reaction of cysteine (4.34 mM) with HCHO (10 equiv.) under standard conditions, at different pDs: 6, 7.5 and 9

Fig. 2
Products of the reactions of HCHO with thiol/alcohol-containing amino acids. Cysteine, homocysteine, penicillamine, serine, threonine, allo-threonine, homoserine, asparagine and glutamine were allowed to react with excess HCHO, in non-buffered D2O (570 µL). a Amino acid (4.34 mM), HCHO (10 equivalents); b amino acid (43.4 mM), HCHO (10 equivalents); (I) required addition of base (NaOD, 1 equivalent); (II) was accelerated by the addition of base (NaOD, 1 equivalent) and at higher concentrations (43.4 mM amino acid, and HCHO)

No direct evidence for product formation was accrued in initial reactions between Ser and HCHO under our standard conditions. However, new broad 1H resonances were identified on addition of base (NaOD, 1 equivalent), implying formation of a metastable product in equilibrium with the Ser starting material. This product was assigned as oxazolidine 7 (Fig. 2 (4)), which is structurally equivalent to thiazolidine 142. The slower formation rate of 7 from Ser, relative to that of 1 from Cys, correlates with the higher pKa value of the serinyl hydroxyl group (predicted at 13.4; ACD/I-Lab, 5.0.0.184, Advanced Chemistry Development, Inc., Toronto, ON, Canada, www.acdlabs.com, 2019) compared to that of the cysteinyl thiol group (~8.4; AAs and acidity values, http://academics.keene.edu/rblatchly/Chem220/hand/npaa/aawpka.htm, accessed 1 May 2019), suggesting hydroxyl/thiol deprotonation might be (partially) rate limiting39. The apparent rapid interconversion between unreacted Ser and 7, as indicated by their broadened 1H resonances, was slowed by cooling (278 K), which enabled structural assignment of 7 (Supplementary Fig. 1). Investigation into potential stereochemical purity loss under basic conditions, using quantitative 1H NMR, showed no 2H incorporation (<2%) at the Hα position, implying maintenance of stereochemistry (Supplementary Fig. 2).
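The proposed link between side-chain pKa and reactivity can be illustrated with a simple Henderson–Hasselbalch estimate of the deprotonated fraction of each nucleophile. This is a minimal sketch only: it treats each group as an isolated monoprotic acid, ignores the difference between pH and pD, and uses the pKa values quoted above (13.4 for the serinyl hydroxyl, ~8.4 for the cysteinyl thiol). The function name is hypothetical.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values as quoted in the text; pH 7.5 is chosen as an illustrative neutral value
PKA_SER_HYDROXYL = 13.4
PKA_CYS_THIOL = 8.4

for ph in (6.0, 7.5, 9.0):
    thiolate = fraction_deprotonated(PKA_CYS_THIOL, ph)
    alkoxide = fraction_deprotonated(PKA_SER_HYDROXYL, ph)
    print(f"pH {ph}: thiolate {thiolate:.3g}, alkoxide {alkoxide:.3g}")
```

With a pKa difference of 5 units, this simple model predicts a thiolate fraction several orders of magnitude larger than the alkoxide fraction at near-neutral pH, which is qualitatively in line with the faster reaction of Cys.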

In contrast to Ser, reaction between Thr and HCHO led to observation of oxazolidine 8 under standard conditions, i.e. without addition of base (Fig. 2 (5))42. Conversion to 8 was, however, accelerated by addition of base (NaOD, 1 equivalent) and was concentration dependent. The subtle difference between the Thr secondary alcohol and the Ser primary alcohol thus appears sufficient to favour formation of 8 from Thr, potentially due to a Thorpe–Ingold effect, and/or increase in oxazolidine product stability (8 compared to 7). Reaction of HCHO with the Thr epimer allo-Thr gave oxazolidine 9 (Fig. 2 (6)) but reaction was less efficient than with Thr. This observation likely reflects a steric clash in the cis-substituted oxazolidine ring of 9.

On addition of HCHO to homoserine (Hse), 1,3-oxazinane 10 was formed (Fig. 2 (7)). Reaction to 10 was observed in the absence of acid or base, suggesting formation of the 1,3-oxazinane is more efficient than formation of 1,3-oxazolidines with Ser, Thr or allo-Thr. However, addition of base was required for full conversion to 10.

The reactions of Asn and Gln with HCHO were considerably slower than those with Cys, Hcy, Pen, Ser, Thr, allo-Thr or Hse. Reaction of Asn and Gln with HCHO over 48 h led to formation of lactams 11 (60%) and 14 (12%), respectively (Fig. 2 (8/9))43. Cyclisation to 11 and 14 was accelerated using more concentrated conditions, by heating, or by addition of base. Conducting the reaction between Asn and HCHO under more concentrated conditions (43 mM Asn; 10 equivalents HCHO) resulted in further hydroxymethylation, forming N-hydroxymethylated lactams 12 and 13 (Fig. 2 (8)).

Under standard conditions, His was observed to undergo rapid reaction with HCHO to form the cyclised Nπ- and Nα-bridged compound 17 (<20 min, Fig. 3 (1)). At higher concentrations, hydroxymethylation of the imidazole ring occurred rapidly, forming 15 and 16. After 24 h, subsequent formation of the C-Nα-cyclised molecule spinacine (19), a natural product present in many foods44,45, was observed. Formation of 19, which is likely more stable than 17, correlated with a decrease in the level of 17. Formation of 19 was increased with addition of base; however, an additional hydroxymethylated product 20 was also observed under basic conditions45.

Fig. 3
Products of the reactions of HCHO with aromatic amino acids. Histidine and tryptophan were reacted with HCHO in excess, in non-buffered D2O (570 µL). a Amino acid (4.34 mM), HCHO (10 equivalents); b amino acid (43.4 mM), HCHO (10 equivalents); (I) was accelerated under basic conditions (NaOD, 1 equivalent) and at higher concentrations (43.4 mM amino acid); (II) formation of product was promoted under acidic conditions (DCl, 1 equivalent)

Reaction of Trp with HCHO resulted in initial indole ring N-hydroxymethylation giving 21; over time, C-Nα cyclisation was subsequently observed, affording 22 (Fig. 3 (2)). Addition of acid (DCl, 1 equivalent) suppressed observation of 21 and promoted formation of 22; addition of NaOD (1 equivalent) promoted formation of 21 over 22. Further N-hydroxymethylation of the indole nitrogen of 22 yielded 23, which has been reported in reaction of Trp with HCHO under strongly acidic conditions (Fig. 3 (2))46.

When Arg was dissolved under standard conditions, a moderately alkaline solution (pH 9) was obtained; reaction with HCHO led to the observation of 1,6,8-triazabicyclodecane 24, presumably via the transient cyclic intermediate 25 (Fig. 4 (1)), which was initially observed at lower concentrations. Increasing the Arg concentration (to 43.4 mM) resulted in observation of additional low-level 1H resonances, which might correspond to acyclic hemiaminals, as previously suggested40.

Fig. 4
Products of the reactions of HCHO with sidechain amine-containing amino acids. Arginine, lysine, 5-hydroxylysine and ornithine were allowed to react with HCHO in excess, in non-buffered D2O (570 µL). a Amino acid (4.34 mM), HCHO (10 equivalents); b amino acid (43.4 mM), HCHO (10 equivalents), base (NaOD, 1 equivalent); c amino acid (4.34 mM), HCHO (10 equivalents), base (NaOD, 1 equivalent); d amino acid (4.34 mM), base (NaOD, 1 equivalent); (I) present at lower concentrations (4.34 mM), and limited HCHO (1 equivalent); (II) present at higher concentrations (43.4 mM), and limited HCHO (1 equivalent); (III) conversion only under alkaline conditions (pD 9) and more product formation at higher concentrations (43.4 mM); (IV) broadened signals prevented full assignment

With Orn, reaction was only observed on addition of 1 equivalent of NaOD. At early time-points (5 min), cyclic aminal 26 was the only observed product; 26 is presumably formed via the intermediate aminal 29, which was observed when exposing Orn to stoichiometric amounts of HCHO (Fig. 4 (2)). Addition of one equivalent of HCHO at high Orn concentrations (43.4 mM) resulted in formation of 29 (31%) and a new species, which was assigned as the dimeric aminal 30 (27%). At a lower Orn concentration (4.34 mM), 29 was observed as the major product (56%). Interestingly, prolonged incubation of Orn (43.4 mM) and HCHO (10 equivalents over 24 h) produced N6-formyl-N1-methyl and N1-formyl-N6-methyl adducts 27 and 28 (Fig. 4 (2)); no di-formylated or di-methylated species were observed, suggesting 27 and 28 are formed via intramolecular disproportionation of 26 (likely via an intramolecular hydride shift, as previously proposed for the analogous reaction with simpler alkyldiamines47 (Fig. 5a)). After 48 h a small portion (~5%) of 27 and 28 had lost their bridging methylene, affording linear versions of the disproportionated products, i.e., 31 and 32, respectively. Sufficient levels of 31 and 32 for characterisation were observed after several days. The observations with Orn illustrate the potential for complex and dynamic outcomes in reactions of HCHO with simple AAs.

Fig. 5
Possible mechanism and pH profile of the reactions of HCHO with lysine or ornithine. a Possible mechanisms for formation of 27 and 28 from 26 (Fig. 4) under basic conditions (ornithine (4.34 mM), HCHO (10 equiv.) in the presence of base (NaOD, 1 equiv.)). b 1H NMR time course of the reaction between ornithine (43.4 mM) and HCHO (10 equiv.) at different pHs; concentrations were determined by peak integration and referenced to the original ornithine concentration. c Possible outline mechanism for formation of 33 via imine 33b

As with Orn, Lys only reacted with HCHO to form detectable products under basic conditions (NaOD, 1 equivalent), forming low levels of Nε-methylated Lys 33 (<5%, Fig. 4 (3), Supplementary Figs. 143–146)40. No evidence for formation of the Nα-methylated product was accrued. By contrast with Orn, no cyclic or formylated products were observed with Lys, suggesting the formation of such species is disfavoured when the AA side-chain length increases from 3 to 4 methylenes.

We then explored the reaction of HCHO with 5-hydroxylysine (5hLys, 4.34 mM), which is produced via post-translational modifications in collagen and other proteins (Fig. 4 (4))48. No reaction was observed under neutral or acidic conditions with HCHO (10 equivalents); however, new broad resonances appeared with base (1 equivalent of NaOD, Supplementary Fig. 3). The broad nature of the new resonances precluded product characterisation, although it was possible to tentatively assign two resonances to diastereotopic protons of the C6-methylene group of 34 based on chemical shift analysis (δH 3.02 and 2.41 ppm, respectively, Supplementary Figs. 4 and 5). These two resonances possessed COSY correlations to a single broadened resonance at δH 3.61 ppm, which was tentatively assigned to the proton attached to C5. HMBC analyses using 13C-labelled HCHO revealed correlations between the three resonances and a carbon resonance at δC 83 ppm, potentially indicative of oxazolidine 34. Therefore, the available data collectively imply formation of the oxazolidine ring in 34, as observed with Ser/Thr/allo-Thr; the broad nature of the resonances suggests a high degree of reversibility under the tested conditions.

No detectable reaction between Tyr and HCHO was observed. However, previous reports suggest Tyr-residues in peptides can react with HCHO and Gly to form bridged Tyr-CH2-Gly adducts49, and HCHO-mediated methylene bridged Tyr-Gly cross-links have been observed in protein crystal structures50. No reactions to give stable products were observed between HCHO and either Ala or Pro. These observations imply reactions of the α-amino group to give NMR detectable products require the involvement of additional side-chain groups, at least under our conditions.

pH and concentration dependence studies

Studies then focussed on investigating effects of altering the AA concentration and pH. NMR analyses were conducted on samples containing either 4.34 or 43.4 mM AA, pre-adjusted to either pD 6, 7.5 or 9, prior to addition of HCHO (10 equivalents) and monitored up to 48 h. Cys and Hcy reacted with HCHO rapidly at all three pHs and at both concentrations, forming thiazolidine 1 and thiazinane 3, respectively, within one hour (Supplementary Table 1 (1/2)). Note that in the absence of HCHO, disulfide formation was observed; thus HCHO has potential to modulate thiol/disulfide equilibria in cells. Ser, Thr, allo-Thr and Hse also reacted to form cyclic products (7–10), but less efficiently than Cys, reaching equilibrium within 20 min. In all cases cyclisation was greater at the higher tested AA and HCHO concentrations and at pD 9 (Supplementary Table 1 (1–6), 0–86%). When reacting Asn and Gln with HCHO, slow conversion was observed to produce lactams 11 and 14. Hydroxymethylated lactams 12 and 13 were observed under the more alkaline conditions at both AA concentrations after 48 h (Supplementary Table 1 (7/8)). Formation of 17 by reaction of His and HCHO was preferred at the lower His concentration, and at a lower pH (pD = 6). The C–N cyclised products were more prevalent under basic conditions, with 19 and 20 being observed at pD 9 after 48 h (Supplementary Table 1 (9)). With Trp, the hydroxymethylated adduct 21 was the major observed product at the lower Trp concentration; with higher Trp concentrations, almost exclusive formation of the cyclised products 22 and 23 was observed after 48 h (Supplementary Table 1 (10)). Arg and Lys remained largely unreactive at both tested AA concentrations and across the tested pH range; the only detectable product with Arg was 24, which was observed only at pD 9 after 48 h (12% and 23% (4.34 and 43.4 mM of AA, respectively), Supplementary Table 1 (11)). With Lys, trace formation of the Nε-methylated product 33 was observed (Fig. 5c); at pD 9 after 48 h (43.4 mM AA), there was evidence for formation of another, previously undetected adduct (broad 1H resonance at 6.8 ppm), potentially arising from an imine or hemiaminal (33b, Fig. 5c); however, the low concentration of this species precluded full characterisation (Supplementary Table 1 (12), Supplementary Figs. 143–146). No further methylation was observed in reactions of HCHO with Nε-methyllysine (Lys(Me)) or Nε,Nε-dimethyllysine (Lys(Me)2) over 48 h (43.4 mM AA, 10 equivalents of HCHO, pD 9). Upon exposing Orn to HCHO at pD 9, two N-formyl amide conformational isomers of each of 27 and 28 were observed. The formation of 27 and 28 appeared to be accelerated at a higher Orn (43.4 mM) and HCHO concentration and at higher pH (pD 9; Fig. 5b).
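Percent conversions of the kind reported here are typically derived from 1H NMR peak integrals. The following is a minimal sketch of one way such values could be computed; it is not the authors' processing workflow. It assumes mass balance (all AA-derived species are observed and integrated), and the integrals, proton counts and function name are hypothetical illustrations.

```python
def species_percentages(integrals, protons):
    """Convert raw 1H integrals into mole percentages of AA-derived species.

    integrals: mapping of species label -> integral of a resolved resonance
    protons:   mapping of species label -> number of protons in that resonance
    Assumes every AA-derived species is included (mass balance).
    """
    # Normalise each integral by the number of protons it represents
    per_proton = {name: integrals[name] / protons[name] for name in integrals}
    total = sum(per_proton.values())
    return {name: 100.0 * value / total for name, value in per_proton.items()}


# Hypothetical integrals for unreacted Orn and adducts 27 and 28
example = species_percentages(
    integrals={"Orn": 1.0, "27": 0.5, "28": 0.5},
    protons={"Orn": 1, "27": 1, "28": 1},
)
for name, percent in example.items():
    print(f"{name}: {percent:.0f}%")
```

Referencing to the original AA concentration then amounts to multiplying each percentage by the starting concentration (e.g. 43.4 mM), again under the mass-balance assumption.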

To further investigate effects of pH on formation of oxazolidines and oxazinanes, we performed reactions with Ser, Thr, Hse and 5hLys (4.34 mM, Supplementary Figs. 6–11) with HCHO (10 equivalents) at different pHs (25 mM phosphate buffer pD 6.4, 7.0, 8.0, 9.0, 9.8, 11.0 and 12.0). Reactions were equilibrated before analysis. Oxazolidine/oxazinane formation was more prevalent under alkaline conditions; for Ser, formation of oxazolidine 7 was most significant in the sample at pD 12 (>50% conversion), while the Thr-derived oxazolidine 8 was comparably prevalent at pD 8. Similar reactivities were observed with Ser/Thr (2.5 mM) and 20 equivalents of HCHO (Supplementary Fig. 10), suggesting HCHO concentrations are saturating. By contrast, reaction of Hse and HCHO to form oxazinane 10 (>50% conversion) was observed at pD 7, suggesting oxazinane formation is more favourable than oxazolidine formation at neutral pH. Observed reaction of HCHO with 5hLys was most significant at pD 12 (Supplementary Fig. 9); this reactivity is similar to that observed with Ser. Finally, no significant difference in reactivity was observed in samples with Thr or allo-Thr (at 2.5 mM, Supplementary Fig. 11).

Various values for in vivo HCHO concentrations have been reported1,51 likely reflecting difficulties in measuring localised concentrations of a reactive electrophile. To increase the biological relevance of our results, which we initially carried out with a substantial HCHO excess, we performed reactions with Cys or His at a lower HCHO:AA ratio and at more dilute conditions (AA 25 µM; HCHO 30 µM). Despite the dilute concentrations and reduced HCHO excess (1:1.16), Cys still manifested thiazolidine 1 formation; however, no products were observed with His under these conditions (Supplementary Figs. 12 and 13).

Assessing product stabilities

We then investigated the stability of selected AA-HCHO adducts. Firstly, the life-times of the adducts were monitored after removal of residual excess HCHO. This was achieved by either: (i) treatment with the HCHO scavenger 1,3-diketocyclohexane (DCH)9, or (ii) lyophilisation followed by re-addition of D2O. The reaction mixtures of AA (4.34 mM) and HCHO (10 equivalents) were allowed to reach equilibrium (3–24 h) before HCHO removal. Applying either of the methods to mixtures containing 17 and 18 resulted in rapid re-formation of His, within 1 h, implying fast degradation of the two adducts (Supplementary Figs. 14 and 15). Similarly, fast degradation of oxazolidines 7 and 8 was observed, re-forming Ser and Thr respectively, within 1 and 4 h (Supplementary Figs. 16–18). When mixtures containing the lactams 11 and 12 (from the reaction of Asn and HCHO), and 24 (from Arg and HCHO) were exposed to either of these methods, significantly slower and incomplete formation of the parent AAs was observed (after 72 h), suggesting these lactams (11, 12) are more stable than the cyclised adducts derived from His, Ser and Thr (Supplementary Figs. 19–22). Treating mixtures of Trp-derived 22 and 23 with DCH manifested loss of the N-hydroxymethyl groups; however, the HCHO-derived C,N-linked methylene groups were stable over 48 h (Supplementary Fig. 23). The most stable AA-HCHO adduct of those analysed (Cys adduct 1, Ser 7, Thr 8, Asn 11, 12 and 13, Trp 21, 22 and 23, Arg 24) was the Cys-derived thiazolidine 1, which was stable throughout the analysis time (Supplementary Fig. 24). When 13C-labelled HCHO (H13CHO) was added to an aqueous solution of thiazolidine 1, slow incorporation of the 13C-label into 1 was observed (>48 h) demonstrating that the product 1 is in slow dynamic equilibrium with Cys (Supplementary Fig. 25). Incorporation of the 13C label was also observed when Trp-derived 21 and 23 were treated with H13CHO over 16 h (Supplementary Fig. 23B); 13C incorporation occurred at the N-hydroxymethyl groups.

Competition between Cys and different AAs for reaction with HCHO

We then investigated competition between AAs for reaction with HCHO, initially focussing on the effects of adding other AAs to reaction mixtures of Cys and HCHO. In competition experiments, where 1 equivalent of HCHO was added to reaction mixtures containing Cys and either Thr, Hse, Asn, His, Trp, Arg, Lys, or Orn (1:1 ratio), exclusive formation of Cys-derived thiazolidine 1 was observed within 12 h; with the exception of His, no other products were observed during this time period (Supplementary Figs. 26–34). With His, N,N-cyclised 17 was observed initially, but this product reverted to His alongside the formation of thiazolidine 1 over 5 h.

We then conducted competition experiments with the three thiol-containing AAs, Cys, Hcy and Pen. To a mixture of Cys, Hcy and Pen (1:1:1, at 4.34 mM) was added HCHO (1 equivalent), and the mixture was allowed to react over 6 h (Supplementary Fig. 35). The Pen-derived thiazolidine 5 was the major product after 6 h (55% of total product), while the Cys-derived thiazolidine 1 (30%) and the Hcy-derived thiazinane 3 (15%) were also observed (Supplementary Fig. 35). These studies suggest that, under the tested conditions, thiazolidine ring formation is favoured over thiazinane ring formation, implying faster formation rates and/or stability. The preferred formation of 5 over 1 is potentially due to the Thorpe–Ingold effect.

Competition experiments with the alcohol-containing AAs, Ser, Thr, allo-Thr, and Hse and HCHO (1 equivalent) did not reveal significant formation of adducts. Addition of excess HCHO (100 equivalents) resulted in observable formation of Thr-derived oxazolidine 8 (39%), allo-Thr-derived oxazolidine 9 (18%) and Hse-derived oxazinane 10 (43%, Supplementary Fig. 36). No Ser-derived oxazolidine 7 was observed.

Competition experiments with more complex mixtures of AAs were then conducted. Ser, Thr, allo-Thr, Hse, His, Trp, Asn, Orn, Arg and Lys were mixed and reacted with HCHO (1 equivalent) for 24 h. A number of HCHO-derived adducts were observed; the His-derived adduct 17 was the predominant adduct (29%), while compounds 8 (7%), 10 (9%), Asn-derived 11 (10%), Trp-derived 21 (7%) and His-derived 19 (4%) were also observed (some HCHO was unreacted, Supplementary Figs. 37 and 38). On addition of Cys (1 equivalent, 36 h), adducts 8, 10 and 17 disappeared from the mixture, while formation of 1 was observed. However, adducts 11, 19 and 21 persisted, implying kinetic stability (Supplementary Figs. 39 and 40, note: these adducts are not observed when Cys was initially present in the mixture, see above).

We then tested whether rates of reaction between Cys and HCHO are affected by the presence of other AAs. Mixtures containing Cys (5 mM), HCHO (1 equivalent) and one of either Ala, Ser, Thr or His (10 equivalents) in phosphate buffer (50 mM, pD 7.5), were monitored over 1 h (Supplementary Fig. 41). With Ala or Ser, only 1 and 2 were observed. With Thr or His, some formation of 8 or 17 was observed, respectively; formation of 8 did not affect the initial formation rate of 1, while formation of 1 was slowed with addition of His, presumably as a consequence of competing formation of 17.

The afore-described results imply that thiol-containing AAs are the most reactive towards HCHO. To test whether this apparently special reactivity requires an Nα-amino group, we reacted two tripeptides, glutathione and l-δ-(2-aminoadipoyl)-l-cysteinyl-d-valine (ACV), with HCHO under our standard conditions. In agreement with previous studies with glutathione52, both these tripeptides formed S-hydroxymethylated adducts at neutral pH (Supplementary Figs. 42 and 43).

Comparison of the reactivity of Cys with HCHO using different electrophiles

Studies were then conducted to investigate the potential reactions of other biologically relevant carbonyl compounds with Cys. Samples were prepared containing Cys and either acetaldehyde (AcH), acetone, glyoxal or methylglyoxal. The reactions produced a set of thiazolidines analogous to the product observed with HCHO (35a, 35b, 36, 37a, 37b, 38a and 38b, Fig. 6). Reaction between Cys and acetone gave incomplete formation of 36 (33%), even after addition of base (NaOD, 1 equivalent, 56% of 36). When reacting with glyoxal to give thiazolidines 37a and 37b, formation of the trans substituted thiazolidine was moderately preferred (37a:37b, 1.1:1). Formation of a bisthiazolidine was not observed, even with excess Cys (2 equivalents). Methylglyoxal reacted to give thiazolidines 38a and 38b, again favouring the trans-substituted 38a (1.7:1).
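For readers more used to percentages than ratios, the diastereomer ratios quoted above can be converted into the share of the major (trans) isomer. The sketch below does only this arithmetic; the function name is hypothetical and the ratios are those stated in the text.

```python
def major_isomer_percent(ratio_major, ratio_minor=1.0):
    """Percentage of the major isomer for a major:minor ratio."""
    return 100.0 * ratio_major / (ratio_major + ratio_minor)


# Ratios as reported: 37a:37b = 1.1:1 (glyoxal), 38a:38b = 1.7:1 (methylglyoxal)
for label, ratio in (("37a (glyoxal)", 1.1), ("38a (methylglyoxal)", 1.7)):
    print(f"{label}: {major_isomer_percent(ratio):.0f}% trans")
```

These ratios correspond to roughly 52% and 63% of the trans-substituted thiazolidine, respectively, underlining that the preference is modest in both cases.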

Fig. 6
Products of reactions between cysteine and aldehydes and ketones. Cysteine (4.34 mM) was allowed to react with aldehydes and ketones in non-buffered D2O (570 µL). a HCHO (10 equivalents); b acetaldehyde (10 equivalents); c acetone (10 equivalents); d glyoxal (10 equivalents); e methylglyoxal (10 equivalents); f acetaldehyde (100 equivalents); g acetone (100 equivalents); h glyoxal (100 equivalents); i methylglyoxal (100 equivalents); j HCHO (100 equivalents)

We then compared the reactivities of the different electrophiles. Cys (4.34 mM) was treated with each of the electrophiles (10 equivalents) and the reaction allowed to proceed to completion, after which HCHO (100 equivalents) was added. The relative levels of the thiazolidines were then assessed. Displacement of acetone by HCHO from 36 was fastest, resulting in complete formation of thiazolidine 1 within 5 min (Supplementary Figs. 44 and 45). The glyoxal products 37a and 37b were displaced to give 1 within an hour, while the methylglyoxal products 38a and 38b were present in solution for 6 h after HCHO addition (Supplementary Figs. 46–49). Fragmentation of the acetaldehyde adducts to give thiazolidine 1 was complete in 5 h (Supplementary Figs. 50 and 51). Notably, when the displacement reactions were performed in reverse, i.e., the other electrophiles (100 equivalents) were added to solutions of 1, no displacement of HCHO from thiazolidine 1 was observed.

Reductive Nα-methylation

Although we did not detect direct evidence for reactions between HCHO and unmodified AA Nα-amino groups under standard conditions, such reactions likely occur (given the nucleophilicity of amino groups) and are potential intermediate steps in the formation of cyclic products. The proposed products of such reactions, i.e., hemiaminals and imines, are presumably unstable/transient, precluding their observation in our NMR analyses. We, therefore, attempted to ‘trap’ Nα-reaction products by well-precedented reductive N-methylation. Cys, Ala and His were individually mixed with HCHO (1 equivalent) and sodium cyanoborohydride (NaCNBH3, 5 equivalents, Supplementary Figs. 52–54). Notably, with Ala and His, both mono- and di-Nα-methylated products were observed within 1 h, suggesting reactions between HCHO and AA α-amino groups occur under our aqueous conditions (Supplementary Figs. 53 and 54). With Cys, Nα-methylated thiazolidine 1 was observed.

Discussion

We have carried out systematic NMR studies on the reactions of HCHO with common proteinogenic and other AAs, focusing on those with nucleophilic sidechains. The analyses have identified multiple products, revealing the potential for complexity in the reactions of HCHO with even simple biologically relevant small molecules.

Our studies indicate that the Nα-amino groups of AAs do not react to give stable (i.e., directly detectable/assignable by NMR) hemiaminal or imine products under our assay conditions. These observations are consistent with previous studies, including work on the mechanism of Nε-methyllysine demethylases, where the proposed intermediate hemiaminals undergo rapid fragmentation9. Although we did not directly observe Nα-hemiaminal type adducts with any proteinogenic AAs, indirect evidence, including signal broadening with Ser, Thr, allo-Thr and Hse, suggests these do exist. Indeed, such hemiaminals and the associated imines are likely intermediates in the formation of the cyclised adducts observed by us and others with AAs and peptides9,47. In the case of the reaction of Cys with HCHO, formation of thiazolidine 1 likely proceeds via the unobserved (protonated) Nα-linked imine intermediate rather than by the S-hydroxymethyl product 2. Transient imines/hemiaminals are likely intermediates in the efficient reductive N-methylation reactions of AAs, as observed by us with Cys, Ala and His, as well as by others with many proteins and peptides using HCHO and NaCNBH353.

Some hemiaminals can be observed in aqueous solution, such as the HCHO adduct with Tris buffer and one product derived from the inhibitor/competitive substrate Meldonium after its reaction with γ-butyrobetaine hydroxylase (this hemiaminal is in equilibrium with an oxazolidine)54. Thus, since they possess multiple nucleophiles, proteins, nucleic acids, and many other bio-molecules (or complexes thereof) have the potential to sequester HCHO or other reactive carbonyl compounds or to act as reservoirs for those reactive species.

Of the reactive carbonyl compounds we investigated for reaction with Cys, i.e. HCHO, acetone, acetaldehyde, glyoxal and methylglyoxal, HCHO appears to be the most reactive and gives the most stable product (thiazolidine 1). These observations, which presumably reflect the electrophilicity and small size of HCHO, suggest HCHO has a unique reactivity with biomolecules. A caveat on our work is that in most reactions we used an excess of HCHO and in some cases prolonged reaction times. Different concentration values (µg mL−1 range) for in vivo HCHO have been reported, likely reflecting difficulties in its analysis. Localised HCHO levels in specific biological environments (e.g., in chromatin) may be higher than generally perceived. Context-dependent constraints on concentrations or stereoelectronics may also enhance rates of reaction and/or product stabilities, and enzyme (or nucleic acid) catalysis may also be involved. It should also be noted that HCHO is highly reactive even though at neutral pH values it is substantially in its hydrated form, though the carbonyl form is also observed (<0.1%, Supplementary Fig. 55).

Formation of both the R2N–CH2–N bridged and RN/S–CH2–OH hydroxymethyl adducts is reversible, as shown by studies perturbing the equilibrium position by trapping, repeated lyophilisation, or by exchange of the methylenes using H13CHO. In some cases, HCHO derived product formation is likely much less reversible or irreversible, e.g., N-methylation and N-formylation (with Lys/Orn), though these products are arguably less biologically relevant, at least in healthy circumstances (see below). The apparent dynamic nature of most identified HCHO-derived adducts renders their identification in cells challenging. Thus, we perceive it is important to define the precise nature of the reaction products of HCHO with isolated biological components with the intention that the results will inform on biological observations.

It is notable that, of all the tested common proteinogenic AAs, Cys reacts most efficiently with HCHO, either as a free AA or in a peptide. This is consistent with the role of the Cys-containing tripeptide glutathione (or other Cys-containing molecules in some organisms) in HCHO detoxification13,52, which proceeds via formation of an S-hydroxymethyl adduct. Previous work has found that glutathione can react with HCHO to give cyclic products other than thiazolidines52,55. Although the biological relevance of the glutathione–HCHO-derived and related rings, if any, is unclear, the work with Cys and glutathione reveals the potential for complex reaction outcomes with HCHO and even simple biological molecules. We thus speculate that the reaction of HCHO with biopolymers to form rings may regulate activity (e.g., inhibit proteolysis/alter folding kinetics) or elicit new biochemical functions. The results reveal subtle differences in how the AA side-chains affect the rates of reaction and product stability, e.g. as observed with Thr/allo-Thr. As shown by studies with glutathione and another Cys containing tripeptide, we observed that under the tested conditions thiols do not form disulfides when treated with HCHO (Supplementary Figs. 42 and 43). Thus, variations in HCHO and other reactive carbonyl compounds have the potential to directly alter the redox levels in cells via altering thiol/disulfide equilibria. There are several studies showing the potential of histone demethylases to act as hypoxia sensors, a role proposed to be mediated via changes in histone methylation status56,57; our results suggest that HCHO generation by these enzymes should be considered in a regulatory context.

The potential for HCHO to regulate chromatin activity/genetic regulation is of particular current interest given the identification of multiple enzymes catalysing N-methyl group demethylation of proteins (principally to date, but not exclusively, histones) and nucleic acids (DNA and RNA) with concomitant HCHO production. Chromatin contains basic elements including nucleobases and histones, which have potential for reaction with HCHO. Indeed, work on the reactions of HCHO with nucleobases has revealed that in some, but not all, cases the N-hydroxymethyl adducts are relatively stable (i.e., spectroscopically observable) with substantial half-lives (hours), implying they may accrue over time22,58. Further, HCHO can crosslink the protein and nucleic acid components of chromatin via Mannich-type chemistry59,60.

In the context of chromatin biochemistry, it is interesting that HCHO reacts with basic AAs (Lys, Orn, His and Arg) to give R2N–CH2–X-type adducts including rings, and in some cases, N-methylated/formylated products. Such non-enzymatic methylation/formylation may be relevant in the context of long term toxic environmental exposure to HCHO, or e.g. by exposure due to treatment of patients with HCHO producing prodrugs54,61. It is also possible that such reactions occur in healthy cells or in the extracellular matrix, due to locally high HCHO levels, e.g. as a consequence of HCHO produced by N-methyl demethylase catalysis or during inflammation. As with the other reactions described here, it is also of potential interest with respect to the early stages of the evolution of current biochemistry.

In the case of some HCHO reactions with AAs with N-containing side-chains (e.g., Arg, Orn, His and Trp), formation of ring systems was observed; over time, His and Trp underwent cyclisation forming a C–C bond via a Pictet–Spengler type reaction (Fig. 3). The complex ring systems formed with Orn and Arg are particularly notable (Fig. 4).

It is also important to note that, as well as modulating interactions between biopolymers by formation of covalent links, both stable and transient HCHO adducts have the potential to modify biomolecular properties/functions (e.g., protein/nucleic acid folding) by altering non-covalent interactions, altering solvation patterns22, or by modulating HCHO availability, as evidenced by our competition experiments.

Our overall results reveal that even the reactions of HCHO with AAs can have complex and dynamic outcomes. Combined with other results, they reveal the potential for significant biochemical complexity in the reactions of HCHO and potentially other electrophiles with the nucleophilic components of chromatin and other biological systems. The reactions of HCHO with histone tails, which contain a high density of basic/nucleophilic residues, are the subject of ongoing investigations. Whether or not HCHO and its reactions on chromatin and other cellular systems are involved in the regulation of normal healthy biology is presently an open question. However, it is clear that, at least above threshold levels, exogenous HCHO can promote disease, e.g. cancer62. HCHO is also used to modulate pain response in animal models, likely at least in part via reaction with cysteine residues in transient receptor potential sensory proteins. HCHO is used as a preservative63,64, and its derivatives are also present in food (e.g., spinacine), prodrugs54,61, drug metabolites (e.g., of Meldonium54) and in natural products (e.g., tubulysin)65,66. It is likely that some of the reactions studied here are relevant to these roles of HCHO.

Methods

General methods

Alanine monohydrate, arginine, asparagine, 1,3-cyclohexadione, cysteine, dimedone, 5-hydroxylysine hydrochloride salt, homocysteine, homoserine, lysine hydrochloride salt, Nε-methyllysine hydrochloride salt, Nε,Nε-dimethyllysine hydrochloride salt, ornithine hydrochloride salt, proline, serine, threonine, allo-threonine, tryptophan and tyrosine, were from commercial suppliers and used without purification. Samples were prepared in D2O (purity 99.9 atom% D), unless stated otherwise. Starting materials and products formed during the reactions were monitored by NMR and characterised using a combination of 1D and 2D techniques. Samples were recorded at 298 K, unless stated otherwise, in a 5 mm tube, using a Bruker AVII 500 equipped with a TXI H/F/C probe, Bruker AVIII HD 500 equipped with a BBFO probe, Bruker AVII 500 equipped with a CPDUL He CryoProbe, Bruker AVIII HD 600 equipped with a BB-F/H N2 CryoProbe or a Bruker AVIII 700 equipped with a TCI H/C/N He CryoProbe, at their respective frequencies.

Standard conditions reaction setup

AAs (0.0247 mmol) were weighed and dissolved in D2O (1 mL) to give the 25 mM AA stock solution. HCHO (30.0 mg, 1.0 mmol) was suspended in D2O (3 mL) and heated with a hairdryer until a clear colourless solution was obtained, giving the 335 mM HCHO stock solution. In the cases of acetaldehyde, acetone, glyoxal or methylglyoxal, an equimolar solution was created by addition of an appropriate volume of one of the compounds to D2O (3 mL). A typical sample consisted of the AA stock solution (100 µL, 2.47 µmol, 1 equivalent) and D2O (400 µL); the solution was thoroughly mixed and the NMR spectrum was recorded. Optionally, DCl (1 M) or NaOD (1 M) was added, and the sample was mixed and its spectrum recorded. Subsequently, HCHO (7–70 µL, 2.47–24.7 µmol, 1–10 equivalents) was added and the progress of the reaction was monitored over time. Variable temperature (VT) studies required the samples to be adjusted to the appropriate temperature before the addition of the electrophile. Competition experiments were conducted using AA1 stock solution (100 µL), AA2 stock solution (100 µL) and D2O (300 µL), with a limited amount of HCHO (1 equivalent).
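The quantities in this setup can be cross-checked with simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the inputs (24.7 mM AA stock, 335 mM HCHO stock, 100 µL AA stock plus 400 µL D2O) are taken from the description above.

```python
def amount_umol(volume_uL, conc_mM):
    """Amount (µmol) contained in volume_uL of a conc_mM solution."""
    return volume_uL * conc_mM / 1000.0


def volume_for_equivalents_uL(substrate_umol, equivalents, stock_mM):
    """Volume (µL) of stock solution that delivers the requested equivalents."""
    return substrate_umol * equivalents / stock_mM * 1000.0


def final_conc_mM(substrate_umol, total_volume_uL):
    """Concentration (mM) of the substrate in the final sample volume."""
    return substrate_umol / total_volume_uL * 1000.0


# 0.0247 mmol in 1 mL is a 24.7 mM stock (quoted as 25 mM in the text)
aa_umol = amount_umol(100, 24.7)                                 # 2.47 µmol
v_1_equiv = volume_for_equivalents_uL(aa_umol, 1, 335.0)         # about 7.4 µL
v_10_equiv = volume_for_equivalents_uL(aa_umol, 10, 335.0)       # about 74 µL
aa_after_hcho = final_conc_mM(aa_umol, 100 + 400 + 70)           # about 4.3 mM

print(f"{aa_umol:.2f} µmol AA; {v_1_equiv:.1f}-{v_10_equiv:.1f} µL HCHO stock; "
      f"{aa_after_hcho:.2f} mM AA after adding 70 µL")
```

On these assumptions, the quoted 7–70 µL additions correspond to roughly 1–10 equivalents, and the AA concentration after a 70 µL addition is close to the 4.34 mM quoted in the product characterisation entries.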

Adjusted pH reactions

The appropriate stock solution was diluted to a minimum volume of 4 mL and the pH was adjusted to 5.60, 7.10 or 8.60, corresponding to pD 6.00, 7.50 or 9.00, using NaOD and DCl (0.1 M and 0.01 M). HCHO stock solution was prepared by suspending paraformaldehyde (30.0 mg, 1 mmol) in D2O and subsequently adjusting the pH with NaOD and DCl (0.1 M and 0.01 M) to the desired value (final total volume 3 mL). AA stock solution (400 µL, 2.47 µmol) was mixed with D2O (100 µL), and NMR spectra were recorded. Subsequently, HCHO (7–70 µL, 2.46–24.6 µmol, 1–10 equivalents) was added and reaction progress was monitored over time (maximum 48 h). See Supplementary Fig. 55 for analysis of the stock solution.
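The pH/pD pairs quoted above differ by a constant 0.40 units, which matches the commonly used glass-electrode correction pD = pH (meter reading) + 0.40. A minimal sketch, assuming that this constant offset is the conversion applied here (the function and constant names are hypothetical):

```python
PD_OFFSET = 0.40  # assumed constant correction, inferred from the pH/pD pairs in the text


def pd_from_meter_reading(ph_reading, offset=PD_OFFSET):
    """Convert a glass-electrode pH reading taken in D2O to pD."""
    return round(ph_reading + offset, 2)


for reading in (5.60, 7.10, 8.60):
    print(f"meter pH {reading:.2f} -> pD {pd_from_meter_reading(reading):.2f}")
```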

pH dependent equilibria experiments

To a buffered solution (144 µL, 25 mM phosphate buffer, at pH = 5.9, 6.6, 7.6, 8.6, 9.4, 10.6 and 11.6), an AA (16 µL, 50 mM stock solution; in the case of 2 AAs: 2 × 8 µL was used) was added. Subsequently, HCHO (2.38 µL, 3.35 M, 10 equivalents) was added. The mixture was allowed to equilibrate for at least 20 min before recording the NMR spectrum.
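The stated volumes can be checked in the reverse direction, i.e. by computing how many equivalents a given addition delivers. This is an illustrative sketch with a hypothetical helper name; all numbers come from the paragraph above.

```python
def delivered_equivalents(reagent_uL, reagent_M, substrate_uL, substrate_mM):
    """Equivalents of reagent delivered relative to the substrate."""
    reagent_umol = reagent_uL * reagent_M                # µL x mol/L = µmol
    substrate_umol = substrate_uL * substrate_mM / 1000.0
    return reagent_umol / substrate_umol


equiv = delivered_equivalents(2.38, 3.35, 16, 50)        # close to 10

# AA concentration in the final mixture (buffer + AA stock + HCHO stock)
total_uL = 144 + 16 + 2.38
aa_final_mM = (16 * 50 / 1000.0) / total_uL * 1000.0     # about 4.9 mM

print(f"{equiv:.2f} equivalents of HCHO; AA at {aa_final_mM:.1f} mM")
```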

Stability assays using 1,3-cyclohexadione

Upon completion of a standard reaction, as monitored by 1H NMR, a solution of 1,3-cyclohexadione (22.4 mg, 0.2 mmol, in 180 µL, 1.1 M; 85 µL, 40 equivalents) was added to the reaction mixture. Formation of the dimerised diketone and parent AA formation were both followed by NMR over time.

Lyophilisation to test stability of HCHO adducts

Upon completion of a standard reaction, as monitored by 1H NMR, the solution was frozen in liquid N2 and lyophilised overnight. The mixture was then redissolved in D2O and allowed to equilibrate for 1 h and analysed by NMR. The sample was then put through a cycle of lyophilisation and redissolving several times (5–8).

Using 13C labelled HCHO to test for exchange of HCHO adducts

A reaction mixture with AA was stirred under standard conditions using unlabelled HCHO (5 equivalents), until the species of interest was observed. Subsequently, labelled H13CHO (5 equivalents) was added to the solution and progress was monitored over time.
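For a fully reversible adduct, the 13C content of the exchangeable methylene should approach the composition of the total HCHO pool. The sketch below assumes complete exchange and negligible isotope effects, and ignores natural-abundance 13C; the function name is hypothetical.

```python
def expected_label_fraction(unlabelled_equiv, labelled_equiv):
    """Fraction of 13C expected in an exchangeable methylene at equilibrium."""
    total = unlabelled_equiv + labelled_equiv
    if total <= 0:
        raise ValueError("total HCHO equivalents must be positive")
    return labelled_equiv / total


# 5 equivalents of unlabelled HCHO followed by 5 equivalents of H13CHO
fraction = expected_label_fraction(5, 5)
print(f"expected 13C fraction at full exchange: {fraction:.2f}")
```

A labelled fraction that remains well below this value would be consistent with slow or incomplete exchange on the timescale of the experiment.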

Competition assays

A stock solution of one AA (25 mM) was prepared and added (100 µL) to D2O (300 µL). Cysteine (100 µL, or another AA) was then added, to give an equimolar solution (5 mM each, 1:1). This was followed by the addition of HCHO (1 equivalent, 3.35 M stock solution). Reactions were monitored by 1H NMR for 16 h until all HCHO was consumed.

For competition assays with more than four AAs, higher stock solution concentrations (200 mM) were used.

Spectator AA experiments

A solution of cysteine (5 mM) and a ‘spectator’ AA (25 mM), in phosphate buffer (50 mM, pH 7.10, in >95% D2O) was made and HCHO (1 equivalent with respect to cysteine, 3.35 M stock solution) was added. The reaction was followed by 1H NMR.

Reductive amination

A mixture of AA (5 mM) and NaBH3CN (5 equivalents) in D2O was prepared. Subsequently, HCHO (1.1 equivalents, 3.35 M stock solution) was added. The reaction was followed by 1H NMR and, upon completion, the mixture was analysed by 1H NMR.

Product characterisation

Full spectra of the following compounds are provided in Supplementary Note 1 and Supplementary Figs. 57–146.

Cysteine products

(S)-4-Carboxy-1,3-thiazolidine 1: The title compound was observed when cysteine (4.34 mM) was exposed to HCHO (10 equiv.) and left to react for 1 h. 1H NMR (700 MHz, D2O) δ: 4.33 (d, J = 10.0 Hz, 1H, H2), 4.29 (dd, J = 7.5, 6.0 Hz, 1H, H4), 4.21 (d, J = 10.0 Hz, 1H, H2’), 3.29 (dd, J = 12.0, 7.5 Hz, 1H, H5), 3.18 (dd, J = 12.0, 7.5 Hz, 1H, H5’). 13C NMR (176 MHz, D2O) δ: 172.0, 81.6, 63.9, 48.6, 32.9. Electrospray ionisation mass spectrometry (ESI–MS) calculated for C4H8O2NS [M + 1]+ 134.02703, found 134.02704. These results are in agreement with previously reported data67.
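The calculated ESI–MS values quoted in this section can be reproduced from monoisotopic atomic masses. The following sketch is an illustration rather than the procedure used by the authors: it takes the ion composition as written in the text (already including the added proton), subtracts one electron mass for the positive charge, and reports the deviation of the found value in ppm. The function names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}
ELECTRON = 0.00054858


def cation_mz(composition, charge=1):
    """m/z of a cation whose composition already includes any added protons."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in composition.items())
    return (neutral - charge * ELECTRON) / charge


def ppm_error(found, calculated):
    """Deviation of the found m/z from the calculated m/z, in ppm."""
    return (found - calculated) / calculated * 1e6


# Thiazolidine 1, ion composition C4H8O2NS as given in the text
mz_1 = cation_mz({"C": 4, "H": 8, "O": 2, "N": 1, "S": 1})
print(f"calculated m/z {mz_1:.5f}, error {ppm_error(134.02704, mz_1):+.2f} ppm")
```

For thiazolidine 1 this reproduces the quoted calculated value of 134.02703, with the found value lying within 1 ppm.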

(S)-Hydroxymethyl-l-cysteine 2: The title compound was observed after 30 min, when adding acid (DCl, 2 equiv.) to Cys (4.34 mM) followed by HCHO (2 equiv.) addition. 1H NMR (700 MHz, D2O) δ: 4.62 (q, J = 12.0 Hz, 2H, H8), 4.22 (dd, J = 7.0, 4.0 Hz, 1H, H2), 3.15 (dd, J = 15.5, 4.0 Hz, 2H, H6). 13C NMR (176 MHz, D2O) δ: 170.4, 81.6, 65.1, 52.7, 31.0.

Homocysteine products

(S)-4-Carboxy-1,3-thiazinane 3: The title compound was observed when homocysteine (4.34 mM) was exposed to HCHO (10 equiv.) and left to react for 1 h. 1H NMR (700 MHz, D2O) δ: 4.16 (d, J = 13.0 Hz, 1H, H2), 4.10 (d, J = 13.0 Hz, 1H, H2’), 3.52 (dd, J = 12.0, 3.0 Hz, 1H, H4), 2.85 (dt, J = 14.0, 3.0 Hz, 1H, H6), 2.69 (dt, J = 14.0, 3.0 Hz, 1H, H6’), 2.39 (dq, J = 15.0, 3.5 Hz, 1H, H5), 1.87 (dq, J = 15.0, 3.5 Hz, 1H, H5’). 13C NMR (176 MHz, D2O) δ: 170.0, 59.4, 45.2, 28.1, 26.0. These data are in agreement with those reported68.

(S)-Hydroxymethyl-l-homocysteine 4: The title compound was observed when adding acid (DCl, 2 equiv.) to Hcy (4.34 mM) followed by HCHO (2 equiv.) addition, after 30 min. 1H NMR (500 MHz, D2O) δ: 4.67 (s, 2H, H9), 4.15 (t, J = 6.5 Hz, 1H, H2), 2.75 (t, J = 7.5 Hz, 2H, H7), 2.28 – 2.21 (dt, J = 14.5, 7.5 Hz, 1H, H3), 2.14 (dt, J = 14.5, 7.5 Hz, 1H, H3’). 13C NMR (126 MHz, D2O) δ: 171.6, 64.6, 51.6, 30.0, 25.7.

Penicillamine products

(R)-5,5-Dimethylthiazolidine-4-carboxylic acid 5: The title compound was observed when penicillamine (4.34 mM) was allowed to react with HCHO (10 equiv.) for 30 min. 1H NMR (700 MHz, D2O) δ: 4.43 (d, J = 10.5 Hz, 1H, H8), 4.36 (d, J = 10.5 Hz, 1H, H8’), 3.91 (s, 1H, H2), 1.61 (s, 3H, H9/10), 1.37 (s, 3H, H9/10). 13C NMR (151 MHz, D2O) δ: 170.0, 73.2, 52.9, 27.3, 25.4. These data are in agreement with those reported69.

(S)-Hydroxymethyl-l-penicillamine 6: The title compound was observed when penicillamine (4.34 mM) was allowed to react with HCHO (10 equiv.) for 30 min. 1H NMR (700 MHz, D2O) δ: 4.87 (d, J = 13.0 Hz, 1H, H8), 4.75 (d, J = 13.0 Hz, 1H, H8’), 4.07 (s, 1H), 1.59 (s, 3H), 1.38 (s, 3H). 13C NMR (151 MHz, D2O) δ: 170.0, 73.2, 52.9, 27.3, 25.4.

Serine product

(S)-4-Carboxy-1,3-oxazolidine 7: The title compound was observed when serine (4.34 mM) was reacted with HCHO (10 equiv.) with base (NaOD, 1 equiv.). 1H NMR (600 MHz, D2O) δ: 4.49 (d, J = 6.0 Hz, 1H, H6), 3.99 (d, J = 6.0 Hz, 1H, H6’), 3.73 (t, J = 7.0 Hz, 1H, H4), 3.54 (t, J = 7.0 Hz, 1H, H2), 3.38 (t, J = 7.0 Hz, 1H, H4’). 13C NMR (151 MHz, D2O) δ: 178.2, 81.0, 67.7, 60.6. These data are in agreement with those reported70.

Threonine product

(4S,5R)-5-Methyloxazolidine-4-carboxylic acid 8: The title compound was observed within an hour, when exposing threonine (43.4 mM) to HCHO (10 equiv.). 1H NMR (700 MHz, D2O) δ: 4.50 (d, J = 6.0 Hz, 1H, H6), 4.23 (d, J = 6.0 Hz, 1H, H6’), 3.73–3.65 (m, 1H, H4), 3.10 (d, J = 7.0 Hz, 1H, H2), 1.19 (d, J = 6.1 Hz, 3H, H9). 13C NMR (151 MHz, D2O) δ: 177.8, 79.8, 76.8, 67.9, 18.8. These data are in agreement with those reported70.

allo-Threonine product

(4S,5S)-5-methyloxazolidine-4-carboxylic acid 9: The title compound is observed within an hour, when exposing allo-threonine (43.4 mM) to HCHO (10 equiv.). 1H NMR (700 MHz, D2O) δ: 5.02 (d, J = 6.0 Hz, 1H, H4), 4.51 (d, J = 6.0 Hz, 1H, H4’), 3.42–3.67 (m, 1H, H2), 4.09 (d, J = 7.0 Hz, 1H, H1), 1.05 (d, J = 6.0 Hz, 3H, H9). 13C NMR (151 MHz, D2O) δ: 170.6, 76.8, 76.7, 60.9, 14.5.

Homoserine product

(S)-4-Carboxy-1,3-oxazinane 10: The title compound was observed within an hour, when exposing homoserine (4.34 mM) to HCHO (10 equiv.). 1H NMR (700 MHz, D2O) δ: 4.86 (d, J = 9.5 Hz, 1H, H2), 4.41 (d, J = 9.5 Hz, 1H, H2’), 4.02 (dt, J = 11.5, 2.5 Hz, 1H, H6), 3.76 (dd, J = 12.0, 4.0 Hz, 1H, H4), 3.63–3.61 (m, 1H, H6’), 2.02–1.99 (m, 1H, H5), 1.89–1.84 (m, 1H, H5’). 13C NMR (176 MHz, D2O) δ: 174.3, 75.3, 66.9, 56.5, 26.2.

Asparagine products

(S)-6-Oxohexahydropyrimidine-4-carboxylic acid 11: The title compound was observed after 12 h, when exposing asparagine (4.34 mM) to HCHO (10 equiv.). 1H NMR (700 MHz, D2O) δ: 4.43 (d, J = 11.5 Hz, 1H, H2), 2.86 (d, J = 11.5 Hz, 1H, H2’), 4.05 (dd, J = 10.0, 6.5 Hz, 1H, H4), 2.86 (dd, J = 17.5, 10.0 Hz, 1H, H5), 2.64 (dd, J = 17.5, 10.0 Hz, 1H, H5’). 13C NMR (176 MHz, D2O) δ: 172.1, 171.0, 54.1, 52.6, 31.6. ESI–MS calculated for C5H9O3N2 [M + 1]+ 145.06077, found 145.06088. These data are in agreement with those reported43.

(S)-1-(Hydroxymethyl)-6-oxohexahydropyrimidine-4-carboxylic acid 12: The title compound was observed when exposing asparagine (43.4 mM) to HCHO (10 equiv.) and base (NaOD, 1 equiv.) after 24 h. 1H NMR (700 MHz, D2O) δ: 4.70 (s, 2H, H11), 4.30 (d, J = 12 Hz, 1H, H2), 4.21 (d, J = 12 Hz, 1H, H2’), 3.49 (dd, J = 11.0, 5.5 Hz, 1H, H4), 2.61–2.51 (m, 1H, H5), 2.27 (dd, J = 17.5, 11.0 Hz, 1H, H5’). 13C NMR (176 MHz, D2O) δ: 178.2, 171.2, 66.9, 59.9, 55.9, 34.7. ESI–MS calculated for C6H11O4N2 [M + 1]+ 175.07133, found 175.07137.

(S)-1,3-Bis(hydroxymethyl)-6-oxohexahydropyrimidine-4-carboxylic acid 13: The title compound was observed when exposing asparagine (43.4 mM) to HCHO (10 equiv.) and base (NaOD, 1 equiv.) after 24 h. 1H NMR (700 MHz, D2O) δ: 4.72 (s, 2H, H11), 4.40 (d, J = 12.0 Hz, 1H, H2), 4.35 (q, J = 11.0 Hz, 2H, H13), 4.27 (d, J = 12.0 Hz, 1H, H2’), 3.66 (t, J = 7.0 Hz, 1H, H4), 2.61–2.51 (m, 2H, H5). 13C NMR (176 MHz, D2O) δ: 178.2, 172.0, 77.2, 67.4, 62.8, 58.8, 33.5. ESI–MS calculated for C7H13O5N2 [M + 1]+ 205.08190, found 205.08169.

Glutamine product

(S)-7-Oxo-1,3-diazepane-4-carboxylic acid 14: The title compound was observed when exposing glutamine (43.4 mM) to HCHO (10 equiv.) after 24 h. 1H NMR (700 MHz, D2O) δ: 4.59 (d, J = 14.5 Hz, 1H, H8), 4.51 (d, J = 14.5 Hz, 1H, H8’), 3.83 (dd, J = 11.0, 3.5 Hz, 1H, H2), 2.75 (t, J = 14.5 Hz, 1H, H5), 2.53 (ddd, J = 15.5, 8.3, 1.6 Hz, 1H, H5’), 2.33–2.29 (m, J = 3.8 Hz, 1H, H4), 1.89 (q, J = 11.5 Hz, 1H, H4’). 13C NMR (176 MHz, D2O) δ: 180.3, 172.6, 62.9, 53.0, 32.8, 23.6. ESI–MS calculated for C6H11O3N2 [M + 1]+ 159.07642, found 159.07646. These results are in agreement with previously published data43.

Histidine products

(S)-5,6,7,8-Tetrahydroimidazo[1,5-c]pyrimidine-7-carboxylic acid 17: The title compound was observed when exposing histidine (4.34 mM) to HCHO (10 equiv.) after 20 min. 1H NMR (500 MHz, D2O) δ: 8.42 (s, 1H, H11), 7.12 (s, 1H, H9), 5.31 (d, J = 12.5 Hz, 1H, H8), 4.96 (d, J = 12.5 Hz, 1H, H8’), 3.61 (dd, J = 11.0, 5.0 Hz, 1H, H2), 3.18 (dd, J = 16.5, 5.0 Hz, 1H, H5), 2.74 (dd, J = 16.5, 5.0 Hz, 1H, H5’). 13C NMR (126 MHz, D2O) δ: 178.1, 130.7, 128.7, 115.8, 59.5, 54.6, 23.8.

(S)-5-(Hydroxymethyl)-4,5,6,7-tetrahydro-3H-imidazo[4,5-c]pyridine-6-carboxylic acid 18: The title compound was observed when exposing histidine (43.4 mM) to HCHO (10 equiv.) after 1 day. 1H NMR (700 MHz, D2O) δ: 8.46 (s, 1H, H11), 7.10 (s, 1H, H9), 5.20 (q, J = 13.0 Hz, 2H, H8), 4.42 (q, J = 11.0 Hz, 2H, H13), 3.94 (m, 1H, H2), 3.14–3.04 (m, 2H, H5). 13C NMR (176 MHz, D2O) δ: 176.1, 130.6, 128.9, 115.1, 77.4, 62.3, 56.9, 23.0.

Nπ-(Hydroxymethyl)-l-histidine 15: The title compound was observed when exposing histidine (43.4 mM) to HCHO (10 equiv.) after 1 day. 1H NMR (700 MHz, D2O) δ (two stereoisomers, A/B): 8.94 (s, 1H, HA10) 8.73 (s, 1H, HB10), 7.43 (s, 1H, HA8), 7.29 (s, 1H, HB8), 5.45 (s, 2H, HB12) 5.43 (s, 2H, HA12), 3.96–3.92 (m, 1H, HA2/B2), 3.29–3.22 (m, 2H, HB4), 3.12–3.05 (m, 2H, HA4). 13C NMR (176 MHz, D2O) δ: 176.0 (2, A3/B3)*, 135.5 (A10) 135.2 (B10), 129.3 (A7), 128.2 (B7), 120.38 (A8), 119.2 (B8), 70.3 (A12), 69.9 (B12), 53.0 (2)*, 25.0 (A4/B4)*. * = couplings found, though exact chemical shift not obtained.

Nτ-(Hydroxymethyl)-l-histidine 16: The title compound was observed when exposing histidine (43.4 mM) to HCHO (10 equiv.) after 1 day. 1H NMR (700 MHz, D2O) δ (two stereoisomers, A/B): 8.68 (s, 1H, HB10) 8.65 (s, 1H, HA10), 7.37 (s, 1H, HB8), 7.22 (s, 1H, HA8), 5.45 (s, 2H, HB12) 5.43 (s, 2H, HA12), 3.90–3.83 (m, 1H, HA2/B2), 3.22–3.14 (m, 2H, HB4). 13C NMR (176 MHz, D2O) δ: 176.0 (2, A3/B3)*, 134.4 (B10) 131.3 (A10), 129.5 (A7), 128.5 (B7), 119.0 (B8), 116.7 (A8), 72.0 (A12), 71.9 (B12), 53.0 (2)*, 24.4 (B4), 21.0 (A4). * = couplings found, though exact chemical shift not obtained.

(S)-4,5,6,7-Tetrahydro-3H-imidazo[4,5-c]pyridine-6-carboxylic acid 19: The title compound was observed when exposing histidine (43.4 mM) to HCHO (10 equiv.) after 1 day (several days for complete conversion). 1H NMR (500 MHz, D2O) δ: 7.66 (s, 1H, H8), 4.20 (q, J = 15.0 Hz, 2H, H2), 4.00 (dd, J = 11.0, 5.5 Hz, 1H, H4), 3.21 (dd, J = 16.5, 5.0 Hz, 1H, H5), 2.91 (dd, J = 16.5, 5.0 Hz, 1H, H5’). 13C NMR (126 MHz, D2O) δ: 173.2, 136.7, 124.6, 123.5, 56.7, 40.8, 22.9. These results are in agreement with previously published data45.

(S)-1-(Hydroxymethyl)-4,5,6,7-tetrahydro-1H-imidazo[4,5-c]pyridine-6-carboxylic acid 20: The title compound was observed when exposing histidine (43.4 mM) to HCHO (10 equiv.) after 1 day (several days for complete conversion). 1H NMR (500 MHz, D2O) δ: 7.71 (s, 1H, H8), 5.35 (s, 2H, H13), 4.23–4.21 (m, 2H, H2), 4.04 (dd, J = 10.5, 5.5 Hz, 1H, H4), 3.32–3.25 (m, 1H, H5), 2.99–2.92 (m, 1H, H5’). 13C NMR (126 MHz, D2O) δ: 172.8, 138.6, 128.5, 122.6, 67.2, 56.3, 41.2, 21.6. These results are in agreement with previously published data45.

Tryptophan products

1-(Hydroxymethyl)-l-tryptophan 21: The title compound was observed when tryptophan (4.34 mM) was exposed to HCHO (10 equiv.) after 2 h. 1H NMR (600 MHz, D2O) δ: 7.66 (d, J = 8.0 Hz, 1H, H12), 7.53 (d, J = 8.5 Hz, 1H, H15), 7.29 (t, J = 7.5 Hz, 1H, H14), 7.25 (s, 1H, H11), 7.19 (t, J = 7.5 Hz, 1H, H13), 5.57 (s, 1H, H16), 3.99 (dd, J = 9.0, 5.0 Hz, 1 H, H2), 3.39 (dd, J = 15.0, 5.0 Hz 1H, H6), 3.22 (dd, J = 15.0, 5.0 Hz, 1H, H6’). 13C NMR (151 MHz, D2O) δ: 174.3, 136.0, 128.0, 127.8, 122.7, 120.3, 118.9, 110.1, 108.7, 68.1, 54.9, 26.2.

(S)-2,3,4,9-Tetrahydro-1H-pyrido[3,4-b]indole-3-carboxylic acid 22: The title compound was observed when tryptophan (43.4 mM) was exposed to HCHO (10 equiv.) after 2 days. 1H NMR (500 MHz, D2O) δ: 7.56 (d, J = 8.0 Hz, 1H, H10), 7.49 (d, J = 8.2 Hz, 1H, H13), 7.18 (t, J = 7.9 Hz, 1H, H12), 7.10 (t, J = 7.5 Hz, 1H, H11), 4.46 (d, J = 15.0 Hz, 1H, H2), 4.38 (d, J = 15.0 Hz, 1H, H2’), 3.98 (dd, J = 9.5, 3.9 Hz, 1H, H4), 3.34 (dt, J = 16.5, 5.5 Hz, 1H, H5), 2.99 (dt, J = 16.5, 5.5 Hz, 1H, H5’). 13C NMR (126 MHz, D2O) δ: 173.7, 136.7, 126.3, 125.6, 123.0, 119.7, 118.2, 109.8, 105.9, 56.8, 39.5, 22.2.

(S)-9-(Hydroxymethyl)-2,3,4,9-tetrahydro-1H-pyrido[3,4-b]indole-3-carboxylic acid 23: The title compound was observed when tryptophan (43.4 mM) was exposed to HCHO (10 equiv.) after 2 days. 1H NMR (500 MHz, D2O) δ: 7.54 (d, J = 8.0 Hz, 1H, H10), 7.40 (d, J = 8.0 Hz, 1H, H13), 7.18 (t, J = 7.9 Hz, 1H, H12), 7.10 (t, J = 7.5 Hz, 1H, H11), 5.50 (dd, J = 36.8, 11.5 Hz, 2H, H17), 4.58 (d, J = 16.0 Hz, 1H, H2), 4.43 (d, J = 10.6 Hz, 1H, H2’), 4.01 (dd, J = 9.5, 4.0 Hz, 1H, H4), 3.34 (dt, J = 16.5, 5.5 Hz, 1H, H5), 2.99 (dt, J = 16.5, 5.5 Hz, 1H, H5’). 13C NMR (126 MHz, D2O) δ: 173.9, 136.5, 126.2, 125.7, 122.5, 120.6, 118.5, 111.6, 107.7, 65.25, 57.1, 40.39, 22.28.

Arginine products

(S)-1-Carbamimidoyl-1,3-diazepane-4-carboxylic acid 25: The title compound was observed when arginine (43.4 mM) was exposed to HCHO (3 equiv.) after 4 h. 1H NMR (700 MHz, D2O) δ: 4.35 (d, J = 14.5 Hz, 1H, H7), 4.11 (d, J = 14.5 Hz, 1H, H7’), 3.32 (q, J = 5.9 Hz, 2H, H5), 3.20–3.14 (m, 2H, H2), 1.98–1.91 (m, 1H, H3), 1.82–1.77 (m, 1H, H4), 1.70–1.63 (m, 1H, H4’), 1.62–1.56 (m, 1H, H3’). 13C NMR (176 MHz, D2O) δ: 180.7, 156.1, 61.6, 59.5, 46.4, 31.6, 25.0.

(2S)-7-Imino-1,6,8-triazabicyclo[4.3.1]decane-2-carboxylic acid 24: The title compound was observed when arginine (4.34 mM) was exposed to HCHO (10 equiv.) after 24 h. 1H NMR (700 MHz, D2O) δ: 4.37 (d, J = 12.0 Hz, 1H, H4), 4.28 (d, J = 14.0 Hz, 1H, H14), 4.19 (d, J = 14.0 Hz, 1H, H14’), 4.02 (dd, J = 12.0, 2.5 Hz, 1H, H4’), 3.65 (dt, J = 15.0, 4.5 Hz, 1H, H2), 3.08–3.03 (m, 2H, H6), 3.01–2.96 (m, 1H, H2’), 1.95–1.87 (m, 2H, H7), 1.83–1.78 (m, 2H, H7’). 13C NMR (176 MHz, D2O) δ: 181.1, 157.0, 67.9, 64.8, 63.9, 50.0, 28.8, 28.2. ESI–MS calculated for C8H15O2N4 [M + 1]+ 199.11895, found 199.11892.

Ornithine products

(S)-1,3-Bis(hydroxymethyl)-1,3-diazepane-4-carboxylic acid 26: The title compound was observed when ornithine (HCl salt) (43.4 mM) was exposed to HCHO (10 equiv.) and base (NaOD, 2 equiv.) after 3 min. 1H NMR (700 MHz, D2O) δ: 4.81 (d, J = 11.0 Hz, 1H, H11), 4.75 (d, J = 4.7 Hz, 1H, H12), 4.49 (d, J = 11.0 Hz, 1H, H11’), 4.36 (d, J = 11.0 Hz, 1H, H12’), 4.14 (d, J = 14.5 Hz, 1H, H6), 3.87 (d, J = 14.5 Hz, 1H, H6’), 3.37 (dd, J = 12.0, 4.5 Hz, 1H, H1), 2.97 (dt, J = 14.5, 2.0 Hz, 1H, H4), 2.65 (t, J = 14.0 Hz, 1H, H4’), 2.21–2.13 (m, 1H, H2), 2.05–1.98 (m, 1H, H3), 1.84–1.78 (m, 1H, H3’), 1.73–1.65 (m, 1H, H2’). 13C NMR (176 MHz, D2O) δ: 183.0, 86.2, 84.9, 69.1, 62.7, 55.1, 32.0, 29.2.

(S)-3-Formyl-1-methyl-1,3-diazepane-4-carboxylic acid 27: The title compound was observed when ornithine (HCl salt) (43.4 mM) was exposed to HCHO (10 equiv.) and base (NaOD, 2 equiv.) after 2 h. 1H NMR (600 MHz, D2O) δ (two isomers, C and D): 8.30 (s, 1H, HC11), 8.15 (d, J = 2.0 Hz, 1H, HD11), 5.14 (d, J = 13.0 Hz, 1H, HD2), 4.77–4.73 (m, 10H, HC2), 4.62 (d, J = 14.0 Hz, 1H, HC2’), 4.34–4.26 (m, 2H, HC4/D4/D2), 3.42–3.32 (m, 2H, HA7’/D7’), 3.21 (d, J = 13.5 Hz, 1H, HC7), 2.97 (t, J = 13.0 Hz, 1H, HD7), 2.86 (t, J = 13.0 Hz, 1H, HC7’), 2.80 (s, 4H, HC4/D4), 2.61 (s, 3H, HC14/D5), 2.48–2.42 (m, 1H, HC5), 2.08–2.02 (m, 1H, HA5’/D6), 2.00–1.94 (m, 1H, HC6), 1.93–1.76 (m, 4H, HB6/C5), 1.76–1.65 (m, 2H, HC6’). 13C NMR (151 MHz, D2O) δ: 177.9 (D8), 177.9 (C8), 167.2 (C11), 166.7 (D11), 65.9 (C2), 61.2 (D4), 60.1 (D2), 59.5 (C4), 58.3 (D7), 57.1 (C7), 40.3 (D14), 39.0 (C14), 29.4 (D5), 28.7 (C5), 23.4 (C6), 22.4 (D6).

(S)-1-Formyl-3-methyl-1,3-diazepane-4-carboxylic acid 28: The title compound was observed when ornithine (HCl salt) (43.4 mM) was exposed to HCHO (10 equiv.) and base (NaOD, 2 equiv.) after 2 h. 1H NMR (600 MHz, D2O) δ (two isomers, A and B): 8.21 (s, 1H, HA11), 8.14 (d, J = 2.0 Hz, 1H, HB11), 4.96 (dd, J = 14.0, 2.0 Hz, 1H, HA7), 4.86–4.81 (m, 2H, HB2), 4.70 (s, 24H, HB2’), 4.54 (d, J = 14.0 Hz, 1H, HA7’), 3.72–3.66 (m, 1H, HB4), 3.63–3.59 (m, 1H, HA4), 3.56–3.48 (m, 3H, HA7/B7), 3.42–3.32 (m, 2H, HA7’/D7’), 2.64 (s, 3H, HA14), 2.22–2.17 (m, 2H, HB5), 2.17–2.10 (m, 1H, HA5), 2.08–2.02 (m, 1H, HA5’/D6), 1.93–1.76 (m, 4H, HB6/C5). 13C NMR (151 MHz, D2O) δ: 175.5 (A8), 173.9 (B8), 167.6 (B11), 166.8 (A11), 85.7 (B2), 70.5 (B4), 69.1 (A4), 66.1 (A7), 60.8 (B2), 48.6 (B7), 44.1 (A7), 41.3 (B14), 38.9 (A14), 26.1 (B6), 25.9 (A5), 25.3 (B5), 23.3 (A6).

(S)-1,3-Diazepane-4-carboxylic acid 29: The title compound was observed when ornithine (HCl salt) (4.34 mM) was exposed to HCHO (1 equiv.) and base (NaOD, 2 equiv.) after 10 min. 1H NMR (700 MHz, D2O) δ: 3.71 (d, J = 13.5 Hz, 1H, H2), 3.40 (d, J = 13.5 Hz, 1H, H2’), 3.22 (dd, J = 9.0, 4.0 Hz, 1H, H4), 2.70–2.59 (m, 2 H, H7), 1.66–1.51 (m, 2H, H5), 1.50–1.40 (m, 2H, H6). 13C NMR (176 MHz, D2O) δ: 182.2, 61.2, 60.0, 46.0, 32.5, 27.9.

(1R,2S,6R)-8-((S)-4-Amino-4-carboxybutyl)-1,6,8-triazabicyclo[4.3.1]decane-2-carboxylic acid 30: The title compound was observed when ornithine (HCl salt) (43.4 mM) was exposed to HCHO (1 equiv.) and base (NaOD, 2 equiv.) after 20 min. 1H NMR (700 MHz, D2O) δ: 3.85 (d, J = 14.3 Hz, 1H, H2), 3.72 (d, J = 11.5 Hz, 1H, H11), 3.54 (d, J = 11.5 Hz, 1H, H12), 3.29 (dd, J = 12.0, 4.1 Hz, 1H, H4), 3.19 (d, J = 11.5 Hz, 1H, H11’), 3.10 (m, J = 11.5 Hz, 2H, H12’ and H17), 2.91–2.85 (m, 1H, H7), 2.65–2.59 (m, 1H, H7’), 2.20–2.12 (m, 2H, H6), 1.69–1.62 (m, 2H, H16), 1.55–1.46 (m, 2H, H14), 1.35–1.28 (m, 2H, H15). 13C NMR (176 MHz, D2O) δ: 183.6, 183.5, 76.1, 73.8, 69.1, 63.5, 55.0, 51.8, 32.8, 31.7, 28.5, 21.6.

(S)-2-Formamido-5-(methylamino)pentanoic acid 31: The title compound was observed when ornithine (HCl salt) (43.4 mM) was exposed to HCHO (10 equiv.) and base (NaOD, 2 equiv.) after several days, with further build-up over a period of a week. 1H NMR (700 MHz, D2O) δ: 8.04 (s, 1H, H10), 4.32 (dd, J = 8.0, 5.0 Hz, 1H, H2), 3.01–2.96 (m, 2H, H6), 2.62 (s, 3H, H13), 1.89–1.84 (m, 1H, H4), 1.75–1.68 (m, 1H, H4’), 1.70–1.63 (m, 2H, H5). 13C NMR (176 MHz, D2O) δ: 176.2, 163.9, 51.90, 48.4, 32.6, 28.1, 21.8.

(S)-5-Formamido-2-(methylamino)pentanoic acid 32: The title compound was observed when ornithine (HCl salt) (43.4 mM) was exposed to HCHO (10 equiv.) and base (NaOD, 2 equiv.) after several days, with further build-up over a period of a week. 1H NMR (700 MHz, D2O) δ: 7.96 (s, 1H), 3.56–3.53 (m, 1H, H1), 3.20–3.17 (m, 2H, H5), 2.62 (s, 3H, H13), 1.85–1.75 (m, 2H, H3), 1.57–1.44 (m, 2H, H4). 13C NMR (176 MHz, D2O) δ: 173.2, 164.3, 63.0, 37.3, 31.6, 24.0.

Lysine product

Nε-Methyl-l-lysine 33: The title compound was observed when lysine (HCl salt) (43.4 mM) was exposed to HCHO (10 equiv.) at pD 9 after 48 h. 1H NMR (700 MHz, D2O) δ: 3.61 (t, J = 6.1 Hz, 1H, H2), 2.92 (t, J = 10.0 Hz, 2H, H9), 2.56 (s, 3H, H11), 1.80–1.71 (m, 2H, H3), 1.62–1.55 (m, 2H, H8), 1.40–1.24 (m, 2H, H7). 13C NMR (176 MHz, D2O) δ: 174.5, 54.4, 48.5, 32.6, 29.8, 26.3, 21.3. These results are in agreement with previously reported data40.

Full spectra of the following compounds are provided in Supplementary Note 2 and Supplementary Figs. 148–160.

Cysteine and acetaldehyde products

(2S,4R)-2-Methylthiazolidine-4-carboxylic acid 35a: The title compound was observed when cysteine (43.4 mM) was exposed to acetaldehyde (10 equiv.) and base (NaOD, 1 equiv.) after 1 h. 1H NMR (700 MHz, D2O) δ: 4.82 (q, J = 6.5 Hz, 1H), 4.37 (t, J = 7.1 Hz, 1H), 3.40 (dd, J = 12.1, 7.6 Hz, 1H), 3.19 (dd, J = 12.1, 6.6 Hz, 1H), 1.53 (d, J = 6.5 Hz, 3H). 13C NMR (176 MHz, D2O) δ: 171.8, 63.49, 60.5, 32.6, 18.3.

(2R,4R)-2-Methylthiazolidine-4-carboxylic acid 35b: The title compound was observed when cysteine (43.4 mM) was exposed to acetaldehyde (10 equiv.) and base (NaOD, 1 equiv.) after 1 h. 1H NMR (700 MHz, D2O) δ: 4.74 (q, J = 6.4 Hz, 1H), 4.29 (dd, J = 7.7, 6.7 Hz, 1H), 3.34 (dd, J = 12.0, 7.8 Hz, 1H), 3.24 (dd, J = 12.0, 6.7 Hz, 1H), 1.55 (d, J = 6.5 Hz, 3H). 13C NMR (176 MHz, D2O) δ: 171.8, 64.3, 61.0, 32.8, 17.3.

Cysteine and acetone product

(R)-2,2-Dimethylthiazolidine-4-carboxylic acid 36: The title compound was observed when cysteine (43.4 mM) was exposed to acetone (10 equiv.) and base (NaOD, 1 equiv.) after 3 h. 1H NMR (600 MHz, D2O) δ: 4.52 (t, J = 7.5 Hz, 1H, H5), 3.57 (dd, J = 12.5, 8.0 Hz, 1H, H1), 3.41 (dd, J = 12.5, 8.0 Hz, 1H, H1’), 1.76 (d, J = 11.0 Hz, 6H, H9 and H10). 13C NMR (151 MHz, D2O) δ: 172.3, 73.2, 64.3, 32.9, 27.5, 26.9.

Cysteine and glyoxal products

(2S,4R)-2-(2-Oxoacetyl)thiazolidine-4-carboxylic acid 37a: The title compound was observed when cysteine (43.4 mM) was exposed to glyoxal (10 equiv.) after 1 h. 1H NMR (700 MHz, D2O) δ: 5.15 (d, J = 4.8 Hz, 1H, H11), 4.46 (dd, J = 7.0, 4.9 Hz, 1H, H3), 3.30 (dd, J = 11.3, 6.9 Hz, 1H, H4), 3.21 (dd, J = 12.1, 5.0 Hz, 1H, H4’). 13C NMR (176 MHz, D2O) δ: 171.3, 88.6, 66.9, 64.6, 32.2.

(2R,4R)-2-(2-Oxoacetyl)thiazolidine-4-carboxylic acid 37b: The title compound was observed when cysteine (43.4 mM) was exposed to glyoxal (10 equiv.) after 1 h. 1H NMR (700 MHz, D2O) δ: 5.19 (d, J = 6.2 Hz, 1H, H11), 4.39 (t, J = 7.2 Hz, 1H, H3), 3.33 (d, J = 7.2 Hz, 1H, H4), 3.25–3.23 (m, 1H, H4’). 13C NMR (176 MHz, D2O) δ: 171.1, 88.3, 67.6, 64.9, 32.3.

Cysteine and methylglyoxal products

(2S,4R)-2-Acetylthiazolidine-4-carboxylic acid 38a: The title compound was observed when cysteine (43.4 mM) was exposed to methylglyoxal (10 equiv.) after 1 h. 1H NMR (600 MHz, D2O) δ: 5.65 (s, 1H, H1), 4.45 (dd, J = 9.8, 6.6 Hz, 1H, H3), 3.44–3.41 (m, 1H, H4), 2.99 (dd, J = 12.5, 9.7 Hz, 1H, H4’), 2.34 (s, 3H, H11). 13C NMR (151 MHz, D2O) δ: 200.4, 170.8, 67.8, 66.8, 32.9, 25.1.

(2R,4R)-2-Acetylthiazolidine-4-carboxylic acid 38b: The title compound was observed when cysteine (43.4 mM) was exposed to methylglyoxal (10 equiv.) after 1 h. 1H NMR (600 MHz, D2O) δ: 5.72 (s, 1H, H1), 4.55 (dd, J = 7.6, 3.3 Hz, 1H, H3), 3.40–3.37 (m, 1H, H4), 3.18 (dd, J = 12.3, 7.2 Hz, 1H, H4’), 2.31 (s, 3H, H11). 13C NMR (151 MHz, D2O) δ: 200.7, 171.6, 67.7, 64.8, 32.9, 24.9.