Introduction

Herbal medicines have been used for the prevention and treatment of various diseases for thousands of years in China and their curative effects were also confirmed by modern pharmacological researches. Moreover, methods for profiling multiple chemical components in herbal medicines are of great concerns because these complex components are the basis for their therapeutic effects and quality control. Currently, high performance liquid chromatography coupling tandem mass spectrometry (LC-MSn) has been considered as the most sensitive high-throughput technique for profiling complex chemical matrices1. Usually, reference compounds and/or their reported mass spectrometric data are necessary and serve as aids for identification of components by LC-MSn  2,3,4. However, obtaining the reference compounds is a challenge and time consuming work. Moreover, natural products usually share with isomers in herbs. Therefore, it is challenging to distinguish chemical components by mass spectrometry because of the difficulty to determine the exact structures when multiple isomers are available.

Up to now, numerous efforts have been made to identify compounds more rapidly and accurately without or with few standards. Among them, mass defect filter (MDF) is a popular data processing tool applied for distinguishing metabolites in biosamples and chemical constituents in traditional Chinese medicines (TCMs)5,13. It could segregate and detect the compounds by imposing preset criteria around the mass defects and selected core substructures. In general, structural analogues in herb usually shared with a similar core substructure and characterized by various chemical groups including hydroxyls, formyls, methoxys, methyls, and glycosyls. Thus, the similar core substructure with significantly lower molecular mass could be utilized as MDF template compound. And the chemical groups provided the changes and limited ranges. Once the preset filtering setting applied, the characteristic compounds could be extracted and the heterogeneous ions could be removed5,13. An approach of liquid chromatography-hybrid ion trap time-of-flight mass spectrometry (LC-IT-TOF-MS) with MDF technique was developed to extract and classify peaks according to the structure types. By using this approach, 50 ophiopogonins and 27 ophiopogonones were structurally classified and determined from the extract of Ophiopogon japonicus 5. Also, a study developed an improved strategy using MDF in combination with LC-IT-TOF-MS analysis and theoretical calculations for identification and structural characterisation of dihydroindole-type alkaloids in processed Semen Strychni extracts. Twenty-four dihydroindole-type alkaloids, including four that were previously not described, were tentatively identified14. Moreover, ion fragment pathways were applied for identification. Based on LC-IT-TOF-MS, a novel and generally applicable approach to identifying nontarget components from herbal preparations was developed. By classified peaks into families based on the exact same fragment ions (mass error <5 mDa) and connected all families by peaks that presented in two or more families, this method was successfully applied to identify 43 compounds from an herbal preparation6. In our laboratory, 18 tetracyclic monoterpenoid oxindole alkaloids in Uncaria rhynchophylla were determined by analyzing their LC-MS2 cleavage pathways and metabolic relationships7. However, all of the above approaches are restricted due to the needs of the predefined ion fragments of known compounds summarized from standards and/or literatures. Because of the scarcity of the standard compounds for the majority of herbal medicines and natural products, it is urgent to develop a new strategy to identify compounds without standards.

Picrasma quassioides (D. Don) Benn. (Fam. Simaroubaceae), called Kumu in Chinese, is one of TCMs for the treatment of swollen sore throat, diarrhea and dysentery, eczema, sore and deep-rooted boil, and bite wound of insect or snake, or as a gastrointestinal vermifuge agent8,9. It has been listed in all versions of Chinese Pharmacopeia and has been used singly (Kumu Injection) or as an ingredient for many Chinese herbal preparations. Previous investigation showed that alkaloids, including β-carboline, canthinone and the dimers of them, were the principal active components in P. quassioides 10,11. A high performance liquid chromatography (HPLC) fingerprint has been constructed from the extract of P. quassioides. Only 7 alkaloids of 27 peaks were identified12. Also, in our laboratory, a method of total ion chromatogram combined with chemometrics and MDF was established for the prediction of antitumor active ingredients in P. quassioides samples. A total of 17 constituents were predicted as the potential antitumor active compounds, and only 12 of which were identified13. Because there are too many isomers and not enough standards could be available, it is difficult to identify and/or even predict the druggability of the alkaloids in P. quassioides. Therefore, it is a challenging work to develop a methodology for comprehensive identification of such components with numerous isomers in the absence of standards.

Herein, we described an efficient and practicable strategy to identify alkaloids, including structural analogues such as isomers, in P. quassioides based on developed self-feedback enhanced network by quadrupole-time of flight mass spectrometry (LC-Q-TOF MS). This strategy was aimed to establish a primary network with the most common fragment ions (the number of times the ion appeared ≥4) of alkaloids by MDF technique. Once a single alkaloid has been identified, its fragment ions presented in the primary network could be connected by the cleavage pathways, while some logical ions (the maximum tolerance of mass error <5 mDa with a rational predicted molecular formula) which were not presented in the primary network could be feedback to establish an enhanced network. The identifying capability of this network will become ever more powerful with more and more ions being feedback to it from the identified compounds. Based on the chemical information obtained from literatures, the structures of a particular type of constituents could be easily identified by this strategy without standards. It is useful for the quality control and component identification of various natural products, such as herbal medicines, preparations and other biological samples.

Results and Discussion

The workflow of this strategy was described in the method section and shown in Fig. 1.

Figure 1
figure 1

Workflow for the component identification with the strategy of self-feedback network.

Extraction of alkaloid candidates by MDF

The extract of P. quassioides was detected by LC-Q-TOF MS, and the total ion chromatogram (TIC) was shown in Fig. 2A. Under the selected conditions, most of the peaks were well separated with high resolution and good sensitivity. However, the alkaloids contained in P. quassioides were the major active components. In order to reduce interferences of other ions in matrix, the potential alkaloids, divided into single (β-carbolines and canthinones) and dimer alkaloids, were extracted by MDF technique.

Figure 2
figure 2

Total ion chromatogram obtained by LC-Q-TOF MS (A) and corresponding filtered chromatogram of single β-carboline alkaloids by mass defect filter (B).

For the single alkaloids, the mass defects of 9H-pyrido[3,4-b]indole (C11H8N2) (filter reference) and all its derivatives with various chemical substituents were summarized (Supplementary Table S1). Based on the information of Supplementary Table S1, the minimum and maximum values of mass defects were calculated as 0.0378 and 0.1790 Da, corresponding to the formula of C11H6N2O3 and C18H23N3O2, respectively. Therefore, the filter was set as C14.5H14.5N2.5O2.5 ± 70.6 mDa over the mass range of 160–400 Da. The filtered chromatogram was shown in Fig. 2B and its noise level (2.5 × 10 e6 counts per second, cps) was 36% of that of the original chromatogram (7 × 10 e6 cps) (Fig. 2A). After excluding the irrelevant ions by MDF, a total of 76 single candidates (Table 1) were detected in the filtered TIC profiles.

Table 1 Identified single alkaloids from the extract of P. quassioides.

Establishment of the primary network

Diagnostic ion pathways (cleavage pathway connected by diagnostic fragment ions (DFIs)) are useful in LC-MSn analysis and have been widely applied for the rapid identification of compounds in various studies2,3,4. Generally, the way to construct ion cleavage pathways was based on chemical standards containing same carbon skeletons or substructures, from which the same fragment ions (i.e., DFIs) can be produced. Lack of enough chemical standards for selecting reasonable DFIs limits the application of ion cleavage pathways to identify the chemical constituents in herbal productions. Herein, a strategy to select reasonable DFIs with a primary network of ion fragments rather than chemical standards was proposed.

Based on the Q-TOF MS analysis, the 76 single alkaloid candidates gave a total of 231 MS2 ion fragments with the maximum tolerance of mass error less than 5 mDa. They could be divided into two groups: Group-I with 79 ions (the number of times the ion appeared (NTAs) ≥4) and Group-II with 152 ions (NTAs <4). Among Group-I, 65 ions which had logical losses of molecular weight and corresponded to a certain chemical formula were selected to establish the primary network (Supplementary Table S2). The selection of NTAs was able to include most of the alkaloid candidates with the least fragment ions to facilitate the compound identification.

Identification of alkaloid candidates by primary network

The single alkaloid candidates captured by MDF were rearranged according to the number of isomers (including detected and reported) (Table 2). The candidates without isomers can be identified easily by the MS and/or MS2 ions with reported information. For example, compounds 8 and 32, two of single alkaloids with [M + H]+ m/z at 259 and 289, were unambiguously identified by the precise relative molecular mass, respectively. Sometimes, although a compound has more than two reported isomers, its structure is also easily identified by the precise relative molecular mass and the MS2 ion fragments inferred from the reported isomers, such as compounds 10, 52 and 57 ([M + H]+ m/z at 229, 185 and 273). After the identification of the above compounds, the MS2 ion fragments and their cleavage pathways of these compounds can be applied to extrapolate the compounds with the same ion fragments. Generally, the more the same ion fragments, the more the structure is similar. For example, there were 6 isomers at [M + H]+ m/z 257, but only compound 27 shared 4 of the same ion fragments with compound 10 at m/z 128.0495, 155.0604, 156.0444, 183.0553 in the primary network (Supplementary Table S2). Thus, compound 27 can be easily identified through the diagnostic ion pathways built with the above four ion fragments (Fig. 3). Besides, the metabolic relationship of two compounds in the plant is also useful for the structure identification. For example, compound 32, having one more methoxyl at C-8 than compound 8, can be considered as a metabolite of compound 8. Therefore, the cleavage patterns of compounds 8 and 32 were very similar for losing seven identical neutral fragments (-C2H5O2, -CH4O2, -C2H6O2, -CH, -CH2, -CH3O and -CO) at the corresponding same positions (Fig. 4). This suggested that the structures, especially the position of the substituent group, of the compounds which showed the similar cleavage patterns (i.e., with the same neutral losses) can be deduced through their metabolic pathways.

Table 2 Detected and reported isomers of β-carboline alkaloids in P. quassioides.
Figure 3
figure 3

Cleavage pathways of compounds 10 and 27.

Figure 4
figure 4

Cleavage pathways of compounds 8 and 32.

As a result, 31 of 76 single alkaloids (Table 1) were identified by the ion fragments in the primary network, combined with the structural or metabolic correlations between these compounds. The information of the ion fragments (including precise molecular mass, logical loss, percentage of relative abundance and tolerance) of the determined compounds can continually feed back to the primary network for correlating and analyzing more unidentified compounds.

Enhancement of the network with feedback DFIs from the identified compounds

In order to identify the rest of the candidate alkaloids, the fragment ions in Group-II (NTAs <4) that involved into the cleavage pathways of the identified compounds were selected and fed back to the primary network. This network could be continually enhanced with more and more compounds being identified. The enhanced network was effective at discriminating the unidentified alkaloid analogues, especially the position isomers.

Compounds 6, 19, 31, 55, 67 and 69 are six isomers with the same molecular weight ([M + H]+ at m/z 241). As an example, their structures were distinguished by the enhanced network and the procedure was presented as follows. In the primary network, there were 10 ion fragments for compound 6, 15 for 19, 15 for 31, 2 for 55, 29 for 67 and 11 for 69. They showed many same ion fragments because they are isomers. Among them, four isomers have been reported with the precise relative molecular mass of 240.2615 and a formula of C14H12N2O2 (Supplementary Table S1). By comparing the MS2 ion fragments, the structures of compounds 6, 19, 31, 55 could be assigned (Table 3). Compound 6 was easily determined as β-carboline-1-propanoic acid due to the ion fragments at m/z 223.0866 (C14H11N2O+, loss of H2O) and 195.0917 (C13H11N2 +, loss of CH2O2). In addition to having two ion fragments (m/z 223.0866 and 195.0917) that were the same as compound 6, compound 19 showed an ion fragment at m/z 226.0737 (C13H10N2O2 +, loss of CH3) in the primary network and an ion fragment at m/z 199.0628 (C12H9NO2 +, loss of C2H4N) in the secondary network (Group-II). Thus, compound 19 could be determined as 1-methenyl-4-methoxyl-β-carboline. Compound 31 also got an ion fragment at m/z 199.0628 (C12H9NO2 +), which could continually lose CO and C2H2 (199.0628 → 171.0679 → 145.0522) or lose CHO (199.0628 → 170.0600). But differing from compound 19, compound 31 had an ion fragment at m/z 225.0659 (C13H9N2O2 +, loss of CH4) in the primary network and its structure could be deduced as 1-ethenyl-4-methoxyl-8-hydroxyl-β-carboline. Compound 55 only provided two ion fragments of 167.0604 (C11H7N2 +, loss of C3H6O2, 70.26%) and 140.0495 (C10H6N+, loss of CHN from C11H7N2 +, 29.74%) with high relative abundance, indicating that the structure was easily losing all the substituents to yield structurally stable residue. Therefore, compound 55 could be determined as 1-ethoxymethenyl-β-carboline. All the ion fragments and the cleavage pathways of the four compounds fed back to the enhanced network in order to identify the rest isomers and other compounds. Compound 67 shared 13, 7 and 4 ion fragments that were the same as compounds 31 (1-ethenyl-4-methoxyl-8-hydroxyl-β-carboline), 40 (1-ethenyl-4-methoxyl-β- carboline) and 24 (1-ethenyl-β-carboline), respectively (Supplementary Table S2). By compared with the above three compounds, compound 67 was determined as 1-ethenyl-4-methoxyl-6-hydroxyl-β-carboline. It is a known compound but found in P. quassioides for the first time, and its structure was confirmed by the MS2 ion fragments listed in Table 1. In the same way, compound 69 was identified as 1-propenyl-4,8-dihydroxyl-β-carboline, which shared 8 and 5 fragments that were the same as compounds 47 and 13, respectively (Supplementary Table S2).

Table 3 Comparison of the structures and fragment ions from compounds 6, 19, 31, 55, 67 and 69, six isomers with the same [M + H]+ at m/z 241.

By utilizing the enhanced network, 39 of the 45 unidentified single alkaloids were determined. As shown in Table 2, a total of 70 single alkaloids were identified by the primary and enhanced networks. Among them, 20 compounds were reported for the first time and 19 known compounds were detected in P. quassioides for the first time.

Searching for missing alkaloids by the enhanced network

Ion fragments in the enhanced network can be used as DFIs to enlarge the screening ranges of alkaloids in P. quassioides. By using this strategy, three single alkaloids (kx1, kx2, kx3) that were not extracted by MDF technique were discovered and identified (Table 1). Compound kx1 was screened due to four of the same ion fragments (171.0553, 144.0444, 143.0604, 116.0495) as compound 61 (4-hydroxyl-1-methoxyl-β-carboline) (Supplementary Fig. S1a), and it was determined as 8-methoxyl-1, 2, 3, 4-tetrahydro-1, 3, 4-trioxo-β-carboline. Compound kx2 was discovered due to the ion fragment pathway from 184.0631 (C11H8N2O+) to 155.0604 (C10H7N2 +) by losing CHO, and it was determined as 4, 10-dihydroxyl-5-methoxylcanthin-6-one (Supplementary Fig. S1b). Compound kx3 showed 8 ion fragments (193.076, 192.0682, 169.076, 168.0682, 167.0604, 166.0651, 140.0495, 115.0542) belonging to the primary network (Supplementary Table S2). Compound kx3, with a [M + H]+ at m/z 211, was the isomers of compounds 5 (1-ethenyl-3-hydroxyl-β-carboline) and 13 (1-ethenyl-8-hydroxyl-β-carboline). By comparing their ions in the enhanced network, both compounds 5 and 13 had an ion fragment at m/z 131.0491 (C9H7O+) which compound kx3 did not show (Supplementary Fig. S1c), indicating that compounds 5 and 13 had a cleavage pathway to loss two nitrogen but compound kx3 did not. However, only compound 13 could lose a neutral fragment of CO (from 211.0866 to 183.0917). Thus, compound kx3 was determined as 1-ethanoyl-β-carboline.

Structure validation by chemical standards and literatures

As a result, totals of 79 single alkaloids were screened by the MDF technique and the enhanced network, and 73 of them were identified by the enhanced network (Table 1). Among them, the structures of 25, 28, 41, 56, 70 and kx3 were confirmed by comparing with the standards, respectively. Moreover, the rationality of this strategy was also verified by comparing the ion fragments and the cleavage pathways of the standards with the corresponding identified compounds. Consulting with the literatures, the structures of 28 known alkaloids (Table 1) were also confirmed by analyzing their reasonable cleavage pathways and/or plant metabolic pathways. Among the identified compounds, 20 single alkaloids have not been reported before. Although we have revealed reasonable cleavage pathways to prove their possible structures, it is necessary to get more information to confirm the results due to the varied structures for a single fragment ion, which may lead to a wrong combination of the fragments. However, the case of wrong combinations would be not very likely based on the enhanced network, because a mismatch combination is easily observed and excluded by the cleavage pathways and the metabolic pathways.

Six single alkaloids (1, 7, 21, 33, 66, 76) extracted by MDF technique were not successfully distinguished by this strategy. The main reason was the lack of enough fragment information. Among them, compound 76 was an alkaloid with two optical isomers which provide almost the same ion fragments and cannot be identified by this strategy.

In the same way, 16 of 35 dimer alkaloids were identified and 4 of them have not been reported before (Supplementary Fig. S2, Supplementary Table S3S6).

Conclusions

Alkaloids are one kind of secondary metabolites in natural products with significant biological and therapeutic activities. It is challenging to distinguish the position isomers of alkaloids because of the difficulty to determine the exact structure when multiple position isomers are available. In this paper, a strategy for determination of the structure of alkaloids has been developed based on self-feedback network of MS2 ions obtained from the LC-Q-TOF MS analysis, combining with the logical cleavage pathways and metabolic pathways. Taking P. quassioides as an example, 89 alkaloids, including 70 position isomers, had been successfully characterized by using this strategy. Moreover, 24 compounds had not been reported (new compounds), and 19 known compounds were detected in this herbal medicine for the first time. The results showed that this is an efficient and practical method to identify different kinds of secondary metabolites, especially some position isomers, in natural products.

Materials and Methods

Chemicals

Alkaloid standards of β-carboline-1-carboxylic acid, 3-methylcanthin-5,6-dione, 5-hydroxy-4-methoxycanthin-6-one, 4,5-dimethoxycanthin-6-one, 1-ethanoyl-β-carboline, canthin-4-one were isolated from P. quassioides in our laboratory and identified by MS2, 1H and 13C NMR spectral data. Their purities were determined to be more than 98% by HPLC method. HPLC grade acetonitrile was purchased from Shanghai Xingke Biochemistry Co., Ltd (Shanghai, China) and deionized water was prepared by a Milli-Q water purification system (Merck KGaA, Darmstadt, German). Other chemicals and solvents were of analytical grade and were obtained from Nanjing Chemical Reagent Co., Ltd. (Nanjing, China).

Samples preparation

The 70% methanolic extract of P. quassioides was provided by Qingfeng Medical Investment Group (Jiangxi, China). The extract was accurately weighed and dissolved in methanol to prepare a sample solution containing 10 mg·mL−1. The reference compounds were separately dissolved in methanol to produce the reference solutions ranged from 0.365 to 0.115 mg·mL−1. Aliquots of 10 μL of samples and reference solutions were injected for LC-Q-TOF MS analysis after filtered through 0.45 μm millipore membrane.

Chromatographic conditions

Mass spectrometric conditions

Mass spectra of test solutions were obtained from an Agilent 6520 Q-TOF spectrometry system (Agilent Corp., USA) equipped with an electrospray ionization interface. High purity nitrogen was used as the sheath gas and ultra high purity helium as the auxiliary gas. The sample was analyzed in positive ionization mode with the parameters as follows: drying gas flow at 8.0 L·min−1 and 325 °C, nebulizer 40 psi, sheath gas at 10 L·min−1 and 400 °C, capillary voltage at 3500 V, skimmer at 65 V, fragmentor voltage at 120 V. For MS/MS mode, the normalized collision energy was 35% with a resolution of 15000. Mass spectra were recorded in the range m/z 100–1700 for MS and 50–1700 for MS/MS. The activation time was 30 ms, and Agilent Mass Hunter Acquisition Software Ver. A.01.00 (Agilent Corp, USA) was used to control the parameters and to obtain the analytical data.

MDF Extraction and peak selection

For the single and dimer alkaloids contained in P. quassioides, 9H-pyrido[3,4-b]indole (C11H8N2) and 2 × 9H-pyrido[3,4-b]indole (C22H16N4) were selected as the filtering references, respectively, with calculated mass defects of 0.0687 Da and 0.1374 Da. The elemental compositions of target candidates ranged from 0 to 60 for carbon and hydrogen, from 0 to 8 for oxygen, from 2 to 4 for nitrogen and zero for other elements. Tolerance of predicted formula (mass error) less than 5 mDa. The candidate compounds were extracted according to the average element compositions of structural analogue ± mass defect tolerance (half width of mass defects range) by the qualitative analysis software Ver. B.04.00 (Agilent Corp., USA). And the relevant data of the candidate compounds including peak number, retention time, accurate mass, predicted chemical formula and the corresponding mass error were output.

Strategy for network construction and candidate alkaloid identification

For identification of the candidate alkaloids, the first step was to extract the rational MS fragment ions. All the fragment ions from all the target candidates with the exact molecular mass (the maximum tolerance of mass error <5 mDa) and predicted molecular formula was selected and rearranged according to the NTAs. Based on the NTAs, these ions were divided into two groups: Group-I with NTAs ≥4 and Group-II with NTAs <4. The ions in Group-I with logical losses of molecular weight corresponding to a definite chemical formula were selected to establish the primary network.

The secondary step was to determine the structures and the cleavage pathways of the ion fragments in the primary network. With the aids of the reported chemical information, the structures of known compounds could be easily identified. The fragment ions, which were obtained from these compounds and included in the primary network and can be linked by reasonable cleavage pathways, were utilized as DFIs. The DFIs can be utilized for determination of the same substructures.

The third step was to enhance the primary network with feedback DFIs. For the identified compounds, some fragment ions which can be linked by reasonable cleavage pathways were not occurred in Group-I but in Group-II. Such fragment ions, together with the cleavage pathways, could be fed back to the primary network to strengthen the identification function of the system.

After the identification of the known compounds, the unknown candidate alkaloids were mainly characterized by the enhanced network, combining with the cleavage pathways (including diagnostic central losses) and the metabolic relationships between the same type compounds. The new DFIs which were obtained from the identified alkaloids and included in Group-II could be continually fed back to the network.