Introduction

Cytochrome P450 monooxygenases (CYPs) play an essential role in metabolism of a wide range of xenobiotics and are involved in biosynthesis of prostaglandins and steroid hormones. CYPs catalyze hydroxylation, epoxidation, reduction and other oxidative reactions on substrates that range from alkanes to complex endogenous molecules such as drugs, steroids and fatty acids [1]. The many reactions that CYPs can catalyze make these enzymes highly versatile and very interesting as tools to generate rather diverse and unique medicinal compound libraries not always accessible via classical synthetic routes. Moreover, in many cases these compounds show improved physico-chemical properties which is of crucial importance during development of drug dosage forms. Metabolites produced by CYPs are well known to display important pharmacological activities or may be responsible for the toxicity or other unwanted side effects of xenobiotics [2, 3]. Systems enabling facile biosynthesis of sufficient quantities of metabolites for structural elucidation and pharmacological and toxicological evaluation are therefore highly desirable. In addition, for drug discovery purposes metabolites are very interesting during lead optimization as they may display improved properties, such as target selectivity or even solubility without significant loss of the pharmacological properties compared to the parent drug itself [4].

The ideal biocatalyst possesses high turnover rates, is very stable, can easily be expressed at high levels, and can function under extreme conditions of temperature, pH, buffer system or solvent. Although mammalian CYPs are capable of metabolizing a wide array of substrates, they require additional redox partners, display relatively low activities, give low expression yields and have poor stability which makes them less suitable for biocatalytic purposes [5]. The fatty acid-metabolizing CYP BM3 from Bacillus megaterium is a very interesting candidate because it is a natural fusion between a heme domain responsible for substrate oxidation and a diflavin reductase domain responsible for electron transport [6]. CYP BM3 possesses the highest activity ever measured and is a highly stable enzyme [7]. These properties, in combination with availability of crystal structures and large-scale expression and purification protocols, make CYP BM3 a highly suitable candidate for biocatalytic applications. Various protein engineering techniques have already been successfully employed to generate CYP BM3 mutants with increased activities and altered regio- and stereoselectivities [8].

An important step during development of novel CYP BM3 mutants for biocatalytic purposes is selection of mutants that display the desired changes. Therefore, mutant libraries which have been created using different protein engineering techniques need to be screened to select mutants with improved characteristics. In order to improve throughput for screening of CYP BM3 mutant libraries for different substrates, it may be useful to screen the different substrates in a cocktail format. Such an approach is already being used to study inhibitory effects of drugs and new chemical entities (NCE) on human CYP activities using a mixture of CYP-specific substrates in a single human microsomal incubation in combination with tandem mass spectrometry. These cocktail approaches have been shown to be less time-consuming, less labour-intensive, more cost-effective and thereby very powerful tools to study drug-drug interactions due to CYP inhibition [913].

The aim of the present study was to investigate if a cocktail approach could be used to screen CYP BM3 libraries for metabolic activity and diversity. Furthermore, it was of interest to know if such an approach could be used in combination with different chemometric tools in order to identify metabolites without prior knowledge of their structure and identity. Firstly, metabolic activity of a selection of well-characterized BM3 mutants from an in-house library was screened against two drugs for which also metabolic profiles were determined. Secondly, for a single BM3 mutant, the effect of co-administration of multiple drugs on the metabolic activity and diversity was investigated. Based on these experiments, a cocktail of drugs was selected against which the whole in-house CYP BM3 mutant library was screened. A chemometrical approach was subsequently used to analyse the data generated by the qualitative HPLC-MS/MS-based screening in order to determine if metabolites without prior knowledge of their structure and identity could be detected. It was demonstrated that a cocktail approach can be utilized to screen CYP BM3 libraries for metabolic activity and diversity and that a chemometrical approach is beneficial in order to visualize and analyse the results generated during the library screen.

Materials and methods

Chemicals

5-Hydroxybuspirone and 16β-hydroxy-17-epi-norethisterone were purchased from Toronto Research Chemicals (Toronto, Ontario, Canada). All other chemicals were of analytical grade and were purchased from Sigma Aldrich (Zwijndrecht, The Netherlands) unless stated otherwise.

Enzymes and plasmids

A total of 83 CYP BM3 mutants were used. Detailed information about mutations present in the different mutants is tabulated in the Electronic Supplementary Material (ESM), Table S1. For 44 of the CYP BM3 mutants, construction has been described previously [1420]. In total, 39 novel mutants were made by site-directed mutagenesis using the appropriate oligonucleotides (see ESM, Table S1). In general, mutations were introduced in the various templates in the pBluescript II KS(+) vector using the QuickChange Site-Directed Mutagenesis Kit (Stratagene). Forward primers that were used are mentioned in the ESM, Table S2. Reverse primers were exactly complementary to the forward primers. After mutagenesis, the presence of the desired mutations was confirmed by DNA sequencing (Service XS, Leiden, The Netherlands). Genes of the novel site-directed mutants were cloned from the pBluescript II KS(+) system, where they reside between the BamHI/EcoRI restriction sites, into the pET28a+ vector.

Expression of CYP BM3 mutants

His-tagged pET28a+ constructs of wild-type (WT) CYP BM3 and all mutants were transformed in Escherichia coli BL21 (DE3) cells using standard procedures. For expression, 600 mL Terrific Broth (TB) [21] medium (24 g/L yeast extract, 12 g/L tryptone, 2 g/L peptone, 20 mL/L glycerol) with 30 μg/mL kanamycin was inoculated with 15 mL of an overnight culture. Cells were grown at 175 rpm and 37 °C until OD600 reached 0.6. Protein expression was induced by addition of 0.6 mM isopropyl-β-d-thiogalactopyrasonide (IPTG). Temperature was lowered to 20 °C and 0.5 mM of the heme precursor δ-aminolevulinic acid was added. Expression was allowed to proceed for 18 h. Cells were harvested by centrifugation (4600×g, 4 °C, 25 min), and the pellet was resuspended in 20 mL KPi-glycerol buffer (100 mM potassium phosphate (KPi) pH = 7.4, 10 % glycerol, 0.5 mM EDTA and 0.25 mM dithiothreitol). Cells were disrupted using an EmulsiFlex-C3 (Avestin Inc., Ottawa, Ontario, Canada; 1000 psi, 3 repeats), and the cytosolic fraction was separated from the membrane fraction by ultracentrifugation of the lysate (120,000×g, 4 °C, 60 min). CYP concentrations were determined using the carbon monoxide (CO) difference spectrum assay as described by Omura et al. [22].

CYP BM3-mediated metabolism of the drugs amitriptyline and buspirone

Metabolic incubations for the drugs amitriptyline (AMI) and buspirone (BUS) were performed in 100 mM potassium phosphate (KPi) buffer at pH 7.4 with the cytosolic fraction containing 100 nM of BM3 mutant at a 40-μM substrate concentration. Final volume of the incubations was 250 μL, and the final DMSO concentration was always 5 %. Reactions were initiated by the addition of 50 μL of NADPH-regenerating system (NRS; final concentrations: 50 μM NADPH, 2.5 mM glucose-6-phosphate and 0.5 U/mL glucose-6-phosphate dehydrogenase). Reactions were allowed to proceed for 60 min at 24 °C and were terminated by addition of 250 μL ice-cold acetonitrile (ACN). Samples were subsequently centrifuged to remove precipitated protein (14,000 rpm for 15 min) and supernatants were analysed by LC-MS/MS.

Electrospray ionization LC-MS/MS analyses in the positive ion mode were carried out at the Division of Molecular Toxicology at the Vrije Universiteit in Amsterdam, The Netherlands. Metabolites and parent compounds were analyzed by reversed phase chromatography using a Synergi MAX-RP column (Synergi, 4 μm, 150 × 4.6 mm i.d.; Phenomenex, Amstelveen, The Netherlands) at a flow rate of 0.3 mL/min. The gradient was composed of solvent A (0.1 % v/v formic acid in water) and solvent B (0.1 % v/v formic acid in ACN). For the various substrates, different gradient programs were used. For identification of substrates and metabolites, a Finnigan LCQ Deca mass spectrometer (ThermoQuest-Finnigan) was used operating in the positive ion electrospray mode. N2 was used as a sheath gas (60 psi) and auxiliary gas (10 psi); the needle voltage was 5000 V and the heated capillary was at 150 °C. LC-MS/MS data of the substrates and bio transformation products were processed with Xcalibur/Qual Browser v 1.2 (ThermoQuest-Finnigan). Standard curves of the substrates were linear between 0.1 and 100 μM.

Effects of co-administration of multiple drugs

To investigate effects of co-administration of multiple drugs, incubations were performed with CYP BM3 mutant MT72 (M11 V87F) using the marketed drugs AMI, BUS, coumarine (COU), clozapine (CLZ), dextromethorphan (DEX) and norethisterone (NET) in different combinations. All incubations had a final volume of 250 μL and consisted of KPi containing 250 nM of the cytosolic fraction of MT72 while the final DMSO concentrations were always of 5 %. The concentration of each individual drug was set at 100 μM. Reactions were initiated by addition of 50 μL NRS and were allowed to proceed for 60 min at 24 °C after which 750 μL ice-cold ACN was added. Samples were subsequently centrifuged to remove precipitated protein (14,000 rpm for 15 min) and supernatants were analysed by LC-MS.

LC-ESI-MS analyses were carried out at the Division of Molecular Toxicology at the Vrije Universiteit in Amsterdam, The Netherlands. Extracts were separated by reversed phase chromatography using a C18 column (Luna C18(2), 5 μm, 4.6 × 150 mm i.d.; Phenomenex, Amstelveen, The Netherlands) at a flow rate of 0.5 mL/min and a temperature of 25 °C. The gradient was composed of solvent A (0.1 % formic acid in water) and solvent B (0.1 % formic acid in ACN). The samples were analyzed on an Agilent 1200 Series Rapid resolution LC equipped with a Time-Of-Flight (TOF) Agilent 6230 mass spectrometer (Agilent technologies, Waldbronn, Germany). The electrospray interface was operated at a capillary voltage of 3500 V with N2 as drying gas (12 L/min) and nebulizer gas (pressure 60 psig). The gas temperature was 350 °C during operation. The TOF was used in the positive ion mode, and data was acquired using the Mass Hunter workstation software (version B.06.00). Standard curves of the substrates were linear between 0.1 and 100 μM.

Screening of the CYP BM3 library using a cocktail approach

The degree of metabolism of six marketed drugs by the total CYP BM3 library was investigated using a cocktail approach. The six drugs used for this experiment were AMI, BUS, COU, DEX, diclofenac (DIC) and NET. The concentration of the six marketed drugs was set at 100 μM, the final DMSO concentration in the incubation was set at 5 % and all CYP BM3 mutants were incubated at a 250-nM enzyme concentration. Reactions were initiated by addition of 50 μL NRS and were allowed to proceed for 60 min at 24 °C. To terminate reactions, 250 μL of ice-cold ACN containing 50 μM minaprine was added after which samples were subsequently centrifuged to remove precipitated protein (14,000 rpm for 15 min). Each sample was analysed by UPLC-MS/MS while a selection of samples was also analysed by LC-MS.

UPLC-MS/MS analyses were carried out at QPS Netherlands B.V. in Groningen, The Netherlands. For UPLC-MS/MS, an AB SCIEX API 4000™ tandem MS System (AB SCIEX, Framingham, MA, USA) hyphenated with an Agilent 1290 Infinity Binary LC System (Agilent technologies, Waldbronn, Germany) was used. Electrospray ionization and a Multiple Reaction Monitoring (MRM) scan mode with a dwell time of 10 ms/s (see for more details the ESM, Table S3) was used. Instrument conditions were as follows: collision gas (CUR) 35, curtain gas (CAD), GS1 50, GS2 50, ion spray voltage 5500 and source temperature 550 °C. For sample pretreatment, 10 μL of supernatant from the incubations was mixed with 2000 μL mobile phase (90:10, A/B) to which 50 nM MIN was added. From this pretreated sample, 5 μL was injected for analysis. Separation was achieved with an Acquity UPLC BEH column (1.7 μm particles, 2.1 × 50 mm; Waters, Milford, USA) at a flow rate of 0.3 mL/min and a temperature of 40 °C. The data were processed with commercial available Analyst 1.4.1 (AB SCIEX) software. Standard calibration curves of all substrates and metabolites were linear between 0.1 and 100 μM.

LC-ESI-MS/MS analyses were carried out at the BioAnalytical Chemistry at the Vrije Universiteit in Amsterdam, The Netherlands. A Shimadzu (‘s Hertogenbosch, The Netherlands) ion-trap time-of-flight (IT-TOF) hybrid mass spectrometer equipped with a Shimadzu LC system (SIL-20 AC autoinjector, two LC-20AD pumps, CT-20AC column oven, SPD-AD UV/VIS detector and a CBM-20A controller) was used. The needle voltage was set to 4.5 kV, and the source heating block and the curved desolvation line were kept at 200 °C. A drying gas pressure of 62 kPa and a nebulizing gas (N2) flow rate of 1.5 L/min assisted the ionization. Full spectra were obtained in the positive ion mode between m/z 140 and 450. MS2 spectra were obtained in a data-dependent mode between m/z 100 and 450 with an ion accumulation time of 10 ms, a precursor width of 3 Da and a collision energy of 75 %. For sample pretreatment, 20 μL of supernatant from the incubations was mixed with 80 μL of 50 % ACN. From this pretreated sample, 12 μL was injected for analysis. Chromatography was performed on a Synergi MAX-RP column (Synergi, 4 μm, 150 × 4.6 mm i.d.; Phenomenex, Amstelveen, The Netherlands) at a flow rate of 0.4 mL/min and a temperature of 25 °C. The gradient was composed of solvent A (0.1 % formic acid in water) and solvent B (0.1 % formic acid in ACN). The data were processed with LCMS solution Version 3.60.361 (Shimadzu).

Chemometric analyses

Raw data (.cdf files) resulting from LC-MS analyses of the CYP BM3 library screening using the cocktail approach (BM3-Cocktail data) were read into the R data analysis environment (R, 2.15, http://cran.r-project.org/) using the R ncdf package (version 1.6.6) and binned according to a m/z range of ±0.5 using the MassView package. The MassView package has been made in-house for binning mass data according to a specified m/z range and uses the caMassClass package version 1.9 (http://cran.rproject.org/src/contrib/Archive/caMassClass/). Data between 10 and 26.2 min were used for further analysis. Alignment of spectra was carried out with Parametric Time Warping (PTW) [23] using the R ptw package [24]. After zero padding (addition of 1000 zeros before and after each chromatogram for each m/z), PTW was executed twice using the mean chromatograms of the zero padded data and the aligned data, respectively, as reference. Resulting aligned data were converted in Matlab format using the R.matlab library 1.6.1 [25]. The size of the resulting data was 84 × 310 × 9720 (samples × m/z × time-points). For peak detection (for each nominal m/z), the method of Zhang et al. was applied [26]. The method is based on wavelets (haar wavelet, scale factor 60). Peaks (noise) below a value of 50,000 were not taken into account. Before peak detection, the chromatograms for each m/z were smoothed using a wavelet smoother (dog wavelet, Matlab cwtft function). Detected peaks with a signal-to-noise ratio below 2 (mostly chemical noise) were removed (noise calculated at the start- and endpoint of the peak using the data before smoothing). Detected peaks were integrated and stored in an array combined with associated m/z and time of peak maximum. During this procedure, corrections were made for small residual misalignment of peaks between the different samples (up to 3 s). To correct for sample processing variations and mass detector sensitivity changes, the data were normalised to the total mass of the first sample. Isotopes of substances were removed for m/z + 1, m/z + 2, m/z + 3 and m/z + 4 with additional constraints on the peak height ratios (roughly based on the natural ratios of 1H/D and Cl35/Cl37). Small differences in retention time were allowed (<15 s). Peaks related to substances present in the eluent, solvent and incubation solution and peaks of the substrates and the internal standard were removed using the peaks present in the WT sample. The resulting data is expected to comprise of the metabolites formed by enzymatic conversion of the substrates.

Outlier analysis was executed using Robust PCA [27]. Implementation in the Matlab-based LIBRA package was applied [28]. ROBPCA was applied using default settings. For retrieval of the contributing variables, the partial decomposition methodology of Alcale et al. [29] was implemented. For determination of the number of latent variables, the estimate factors function of the PLS toolbox was applied (PLS toolbox Version 7.0, Eigenvector Research, Inc., Manson USA). The principle of this function is that it selects those principal components which appear to be stable during resampling of the data. Clustering of data was obtained using hierarchical clustering [30]. Built-in Matlab procedures were used (Matlab version R2012b, Matlab, The MathWorks Inc., Natick, Massachusetts, USA). Besides these procedures and functions, several in-house functions were constructed to easily visualize subsets of the data. Functions were constructed to visualize m/z traces (chromatograms) for one specific m/z value for all samples and to visualize the identified peaks for a specific m/z and retention time combination in all samples.

Results and discussion

Selection of the CYP BM3 mutants

The aim of the present study is to investigate if a cocktail approach can be used to screen CYP BM3 libraries for metabolic activity and diversity. It was therefore decided to employ a CYP BM3 library consisting of mutants with as much structural diversity as possible by selecting mutants with mutations that preferably were introduced throughout the whole protein structure. In addition, CYP expression levels of the selected mutants needed to be suitable to allow large-scale enzyme expression. Mutants that were used in the current study were a combination of 44 mutants that have been described previously [1420] complemented with 39 novel CYP BM3 mutants (see ESM, Table S1).

The set of 44 previously described mutants include M01, M02, M05 and M11 which were constructed by a combination of three site-directed mutations (R47L, F87V and L188Q) and subsequent rounds of random mutagenesis by error-prone PCR [31]. The 40 other mutants have all been based upon these four templates. This set of mutants has previously been tested against a range of substrates, including marketed drugs [14, 16, 32, 33], steroids [15, 18, 20], ionones [19], kinase inhibitors [34], alkoxyresorufins [15, 17] and endocrine disrupting chemicals [17, 35]. They metabolized these substrates with varying catalytic activities while also displaying significant differences in the metabolic profiles generated. The 39 mutants that have not been described previously were all created in-house by site-directed mutagenesis using the appropriate mutant templates (see ESM, Table S2). Mutations have been introduced at different positions throughout the protein in order to address, amongst others, effects upon regio- and stereo-selectivity (residues 72, 74, 75, 81, 82, 86, 87, 264, 328, 436, 437 and 440), organic solvent tolerability (residues 235, 471, 494 and 1024, see also [36]) and protein stability (residues 53, 176, 208 and 359). In-depth discussion of these properties is not a subject in this manuscript and will not be addressed.

Expression levels and stability of WT CYP BM3 and all 83 mutants were determined by measuring intensity of the characteristic Soret band at 450 nm upon reduction by dithionite and addition of CO. All cytosolic BM3 fractions were found to contain stable and active CYP protein. Although expression yields varied amongst the different mutants (data not shown), at least 50 nmols of active CYP protein was expressed per 600 mL of TB culture for each mutant and the WT enzyme. Cytosolic fractions were used to prepare 5 μM stock solutions in KPi-glycerol buffer for the WT enzyme and all mutants.

Characterization of the CYP BM3 library-using amitriptyline and buspirone

In order to investigate the applicability of a cocktail approach to screen CYP BM3 libraries for metabolic activity and diversity, information is needed about activity of the mutants towards single substrates that are present in the cocktail. Since AMI and BUS have previously been successfully used to probe differences between CYP BM3 mutants, showing significant differences in metabolite profiles [16], it was decided to screen the WT enzyme and a selection of 43 mutants against both drugs. Figure 1a, b shows the diversity of the 44 tested enzymes towards AMI and BUS, respectively. As can be seen, both drugs are metabolized by almost all CYP BM3 mutants with varying activities into different metabolites.

Fig. 1
figure 1

Product profile and activity of CYP BM3 engineered variants towards amitriptyline (A) and buspirone (B). Activities were determined by calculating the substrate depletion using the average peak area of the parent at 60 min and at time zero. Values represent the mean of two independent experiments with less than 10 % error. Product selectivity in % was calculated from the extracted ion chromatograms and is represented by different shades in each bar

For AMI (see Fig. 1a), the major product in all cases was nortriptyline (NOR; MA4) which is consistent with previous results [16]. Significant amounts of MA2 are only formed by a small number of mutants (MT35, MT59, MT66, MT67, MT68, MT72, MT90 and MT91). It is interesting to observe that all mutants that form MA2 contain a mutation at amino acid residue 87, residue 437 or mutations at both of these residues. MA1, MA3 and MA5 were formed by almost all mutants although at varying levels. The WT enzyme did not display any activity while MT76 only formed a very small amount of MA4. Identification of the metabolites of AMI is discussed in more detail in the ESM.

For BUS (see Fig. 1b), it was observed that significant differences existed between metabolite distribution profiles generated by the tested mutants. The WT enzyme, MT65 and MT76 did not display any activity towards BUS. MB4 (5-OH-BUS) was produced by all other mutants, although this was not the major metabolite in every incubation analysed. MB1 was not formed by MT64, MT65, MT66, MT70, MT71 and MT76, having in common that they contain a mutation at the 87 position (V87Q, V87E, V87G, V87K, V87M and V87W, respectively). When compared to their template M11, for MT70 and MT71, it appears that the mutation at position 87 resulted in a very large decrease of metabolic activity confirming that this position is crucial for efficient biotransformation of substrates [15]. MB2 was only formed by a selection of mutants (M01, M02, MT31, MT38, MT41, MT43, MT44, MT59, MT104 and MT107). The majority of the mutant library is derived from M11 (M11 + 28 M11-mutants; 66 %) while besides the WT enzyme, M02 and M05 the rest of the library is derived from M01 (M01 + 12 M01-mutants: 27 %). It is interesting to note that of the ten mutants that form MB2, only two are derived from M11 while the majority is derived from M01. These results suggest that the additional mutations in M11 compared to M01 and M02 (F81I, E143G, Y198C and H285Y) have an effect on the metabolism of BUS which is probably mainly caused by mutation of the active-site residue phenylalanine at position 81 into isoleucine. Significant amounts of MB3 were only produced by M01, M02, MT59, MT66 and MT80 while MB4 and MB5 were formed by almost all mutants although at varying levels. Identification of the metabolites of BUS is discussed in more detail in the ESM.

From all these data, it was again proven that the 87 position is crucial, and that significant differences in activity and metabolite profiles could be obtained with the mutant library.

Validation of the cocktail screening approach

Having an established CYP BM3 mutant library, the next step in development of the cocktail screening was to investigate effects of co-administration of multiple drug substrates. Substrate mixes of various compositions were made, and effects of co-administration of multiple drugs on the metabolic activity and metabolic product profile of AMI and/or BUS by a single CYP BM3 mutant were investigated. The CYP BM3 mutant chosen for this experiment was MT72 since it displayed good activity towards both AMI and BUS during the previous single substrate screening experiment (see Fig. 1a, b). Besides AMI and BUS, the marketed drugs clozapine (CLZ), coumarin (COU), dextromethorphan (DEX) and norethisterone (NET) were selected for this experiment based on previous studies [20, 31, 32] (see Fig. 2 for chemical structures). Concentrations of the substrates were set at 100 μM which is higher than the concentrations used during the single substrate screening experiment (40 μM). In the latter experiment, a number of mutants displayed very high conversions (above 80 %) towards both AMI and BUS. It was therefore decided to increase all substrate concentrations to 100 μM.

Fig. 2
figure 2

Structures of the marketed drug substrates that were used in this study. The molecular weight (MW) is indicated for each substrate

As can be seen from Table 1, substrate depletion was observed for all drugs screened in all incubations. When looking at the metabolic activity of MT72 towards AMI, it can be seen that the amount of AMI depletion varies between 34 and 64 % while the average AMI depletion in these incubations is 46.4 % with a standard deviation of 8.3 % (overall precision (CV) of 18 %). This amount of substrate depletion corresponds very well with the results from the single substrate screening experiment during which a depletion of 48.6 % was measured for MT72. There is no clear trend to observe in the effect of co-administration of the marketed drugs, as all drugs tested did not significantly induce or inhibit the metabolism of AMI and the number of drugs co-administered had no effect on the metabolic activity. Furthermore, it can be seen from Fig. 3a that the metabolite distribution is not affected by co-administration of multiple drugs. For AMI, the expected metabolites MA1-MA5 (based on the single substrate screening experiment) were formed while also the metabolic profile generated was similar as before (see Fig. 1a).

Table 1 Effect of co-administration of multiple drugs on metabolic activity of MT72
Fig. 3
figure 3

Effect of co-administration of multiple drugs on the metabolic profile of the drugs amitriptyline (A) and buspirone (B) generated by MT72. The sample numbers correspond to the incubations represented in Table 1

Considering the metabolic activity of MT72 towards BUS, it can be seen that the amount of BUS depletion varies between 53 and 77 % while the average BUS depletion in these incubations is 61.7 % with a standard deviation of 7.9 % (CV of 13 %). These findings correspond very well with the results from the single substrate screening in which a depletion of 60.2 % was measured. No significant alteration of the metabolite distribution occurs as result of the co-administration (see Fig. 3b). For BUS, the expected metabolites MB1, MB3, MB4, MB5 and MB6 (based on the single substrate screening experiment) were formed while also the metabolic profile generated was similar as before (see Fig. 1b).

For CLZ, DEX and NET, metabolic activity (see Table 1) and diversity (data not shown) were not affected by co-administration of multiple drugs. Metabolic activity was much lower than the activity observed for AMI and BUS (substrate depletion ranging between 3 and 9 %). CLZ was metabolized into two metabolites by MT72 which were identified as being clozapine N-oxide and N-desmethylclozapine, respectively. DEX was metabolized into a total of four products. Two products were a result of hydroxylation (MD3 and MD4) while the two other products were a result of demethylation (MD5 and MD6). NET was metabolized into one monohydroxylated product, MN1. Identification of the metabolites of CLZ, DEX and NET is discussed in more detail in the ESM.

For COU, substrate depletion was observed in all incubations. However, in none of the incubations, COU-related metabolites were detected. Besides the possibility that the absence of metabolites is caused by the simple fact that no biotransformation occurred, MS signals of metabolites might have been too low for detection caused by lack of optimal ionization conditions since MS signals of COU itself were also much lower (at least ten times) than the signals of the other drugs. Another possible explanation for the absence of COU-related metabolites is that COU might be metabolized into coumarin 3,4-epoxide (CE) which is a reactive intermediate that can react with proteins present in the incubation and could therefore not be detected. Subsequently, CE can be further metabolized into o-hydroxyphenylacetaldehyde (o-HPA). This metabolite has also not been detected in any of the incubations.

In conclusion, no inhibition of the biocatalytic activity of the CYP BM3 mutant MT72 for the applied substrates was observed. Neither did the molecular profiles of the metabolites change, indicating that the cocktail incubation is a valid approach for the generation of unique compound libraries and for the screening of unique biocatalytic properties of BM3 mutants.

Screening of the CYP BM3 library using a cocktail approach

The next step of our study was to screen the complete CYP BM3 library encompassing the 83 mutants and the WT enzyme in a cocktail approach to investigate if such an approach can be used to rapidly classify mutant CYP BM3 libraries for both metabolic activity and diversity. The substrates that were selected for the cocktail screening experiment were AMI, BUS, COU, DEX, diclofenac (DIC) and NET. Besides one case report where a life-threatening dextromethorphan intoxication was associated with interaction with amitriptyline in a poor CYP2D6 metabolizer [37], no reports have been published about the interactions of any two or more of these drugs. Based on this information and the co-administration experiment described above, occurrence of drug interactions is not expected to have an impact. In order to rapidly assess bioactivity of the mutants, for each substrate, one of the previously described biotransformations was monitored. Based upon the single substrate screening experiment, NOR (MA4) and 5-OH-BUS (MB4) were selected for AMI and BUS, respectively. 7-OH-COU was selected as probe metabolite for COU. The demethylated products dextrorphan (MD5) and MM (MD6) were selected as probe metabolites for DEX whereas 4-OH-DIC was selected for DIC. For NET, 16β-hydroxy-17-epi-norethisterone was used as a probe metabolite. Standard calibration curves of all substrates and metabolites were obtained, and the limit of detection was below 0.1 μM in all cases. The results of the cocktail screening experiment are shown in Table 2.

Table 2 Metabolic activity of the CYP BM3 mutant library towards six drugs

It can be seen that significant differences exist between metabolic activities of the different mutants towards the six drugs. The averaged standard deviation of the substrate depletions calculated were below 15 % for all substrates (see also ESM, Table S4). Averaged standard deviation of the metabolite concentrations determined were 3.0, 0.3, 1.0 and 0.4 μM for NOR, 5-OH-BUS, MM and OH-DIC, respectively. For AMI, detection of the metabolite NOR corresponded well with the amount of substrate depletion observed. This is in agreement with the previous finding that NOR was the major metabolite during the single substrate screening experiment. For BUS, the substrate depletion determined was in most cases significantly higher than the amount of 5-OH-BUS detected. This indicates that more metabolites are formed which agrees with the previous results of the single substrate screening experiments. For COU, significant amounts of substrate depletion were observed with MT24 being the most active mutant. However, no formation of 7-OH-COU was detected. For DEX, formation of the N-demethylated product MM was detected while formation of the O-demethylated product dextrorphan was not detected. When comparing the amount of substrate depletion with the concentration of metabolite detected, it can be seen that in many cases the concentration of MM formed is higher than the amount of DEX which has been consumed. In addition, for several mutants, a negative amount of substrate depletion was observed. A possible explanation for these effects is, again, that factors present (including formed metabolites) in the incubation mixture may influence the MS signals of DEX and MM. For DIC, formation of 4-OH-DIC was detected, and in most cases, the amount of substrate depletion observed was higher than the concentration of 4-OH-DIC detected. This could indicate that other products are formed and agrees with the previous finding that DIC can be converted by CYP BM3 mutants into reactive metabolites [38], although in the respective study the amount of GSH-conjugates formed by most mutants did not account for more than 10 % of the total amount of metabolites produced. For NET, significant differences in the amounts of substrate depletion were determined for the tested mutants. However, formation of 16β-hydroxy-17-epi-norethisterone was not detected in any of the incubations. For a number of mutants, a negative amount of substrate depletion was observed.

Selected incubations were qualitatively analyzed by LC-MS/MS on a Shimadzu IT-TOF in order to provide accurate mass information on the substrates, metabolites and their fragment ions and thereby get more information about the metabolic profiles generated by the different mutants for the six drugs tested. For each mutant, one sample of the incubation at 60 min was analyzed while for a selection of mutants also an incubation at time zero was analyzed.

Interpretation of the cocktail screening data

In order to scan the data in depth for known, but also unknown metabolites, several questions can be asked and several strategies can be followed. Regarding the questions, one can try, e.g. to: (1) look for all metabolites that are produced, (2) look for mutants that uniquely produce certain metabolites, (3) look for the most active mutants, (4) relate groups of enzyme mutations to certain metabolic processes, or (5) look for mutations which more efficiently produce certain metabolites. With respect to the strategies, one can (a) manually scan the data for all theoretical possible metabolic conversions (and combinations of conversions) for all substrates and (b) use certain procedures and techniques which are more specifically targeted at the different questions. The former strategy is a rather laborious strategy, especially if multiple substrates need to be investigated in combination with large numbers of mutant enzymes. In this case, an untargeted approach provides a better alternative. There are several simple data visualization and analytical techniques available to facilitate screening and investigation of data resulting from experiments in view of the various afore mentioned questions. First of all, raw LC-MS data files need to be pre-processed. Katajamaa and Orešič [39] and Draper et al. [40] describe the possible steps in the data (pre)processing (processing pipeline). In this work, we have followed a slightly different order. Firstly, data were aligned using PTW, followed by a peak detection procedure, as described in the ‘Materials and Methods’ section. The result of the application of such a processing pipeline is a reduced data set containing only peaks (including peak area or height, position in time and m/z dimension) of substances which could be of possible interest.

One of the most basic methods to obtain a visual overview of such multidimensional data is PCA, which can be combined with information on the most important variables (peaks) using a biplot [41]. PCA focusses on the variation in data. Frequently, this variation is due to metabolites that are produced in high quantities by most enzymes (cf. questions 1 and 3). If one wants to focus on those mutants that are specifically producing one or a few metabolites (cf. question 2), these cannot readily be identified using PCA and therefore another approach needs to be applied. Robust PCA is a methodology specifically constructed for identifying potential outliers in data sets. Selected mutant enzymes that produce specific metabolites differ from most other enzymes and thus can be viewed as being outliers. Moreover, clustering techniques can be applied to facilitate recognition of certain groupings in the data being either the detected peaks or the types and sites of mutations. For a comparison between the various chemometric analyses and manual data analysis, we have also manually screened all IT-TOF MS data files upon the presence of CYP-related metabolites. For quantification of the parents and metabolites, peak areas from the MS spectra were assumed to yield similar MS responses, although it is realized that MS signals generated by the different metabolites in most cases are not similar to those generated by the parent compound. However, as no UV data is available, quantification based upon the MS signal is the only alternative in order to generate metabolite profiles. Detailed results of the qualitative analysis of the metabolism of AMI, BUS, DEX, DIC and NET can be found in the ESM (Figures S1-S5).

The result of the application of PCA on pre-processed data from which substrates and their isotopes are removed is visible in Fig. 4a. This is a so-called biplot in which samples are presented together with (in this case a subset of) peaks mainly responsible for the spread in the data. From right to left, there is an increase in metabolic production. A zoomed-in version of the samples on the right is visible in Fig. 4b. Clearly, the samples can be identified that hardly or not display any significant activity or metabolite formation for any of the drugs tested.

Fig. 4
figure 4

Biplot of the pre-processed data. The samples are identified with the MT-number, the peaks indicated by lines and labelled with the corresponding m/z and retention time. The small numbers indicate the order of samples in the data set. 0 (84) indicates the WT. (A) The data are coloured according to a hierarchical clustering (average linkage) of the data based on 5 clusters. In order to prevent that too many peaks are plotted a user definable threshold of mostly 10 % of the largest peak was used; (B) zoomed-in version of the central right part of (A); (C) biplot of the pre-processed data with colour coding according to the clustering of the mutation patterns of the mutants (using hierarchical clustering, average linkage) based on 6 clusters; (D) zoomed-in version of the central right part of (c)

The data describing the mutations present in the CYP BM3 library (with zeros and ones, the latter indicating that a certain mutation is present in a mutant) can also be subjected to clustering. Based on the so-called cityblock distance and using average linkage hierarchical clustering, the data was clustered in 6/8 clusters: 5/7 small clusters and one large cluster. If the mutants in Fig. 4a are coloured according to this clustering, this results in Fig. 4c, d. This plot does not reveal much information, apart from one striking feature: the mutants belonging to the dark green cluster are located either on the right side of the figure (MT21, MT22, MT87, MT88; MT0 is the WT; see Fig. 4d), indicating that they do not produce any metabolite, or in the centre of the figure (MT16 and MT17). All these mutants contain only one to four mutations. The latter two mutants only contain the mutation A964V and G1049E, respectively, indicating that these mutations could be important for the production of metabolites.

In Fig. 4a, c, the main metabolites of the different substrates can be tentatively identified on basis of the mass spectrometric data. Several of the peaks in this plot are metabolites of AMI. The peak at m/z 264@13.93 (and its isotope at m/z 265@13.93) is the major metabolite MA4 (NOR). The peak at m/z 294@14.27 corresponds with MA3, while peak at m/z 250@13.78, corresponding with MA5, is also formed by most mutants. That they appear in this plot indicates that they explain the largest variation between the samples and that they are probably present in a large number of samples in high concentration. This is confirmed by Fig. 1a and ESM Fig S1. By checking the mass trace of m/z 294, more metabolites can be identified (MA1 at 12.3 min and MA2 at 12.8 min; see ESM Fig S6). By checking the corresponding mutants, it can be concluded that MA2 is formed mainly by mutants containing a mutation at position 87, position 437 or mutations at both of these positions which is in agreement with the results from the single substrate screening experiment.

The largest peak in Fig. 4a is the result of a DEX metabolite (see below for more details). When this DEX metabolite and the AMI metabolite MA4 are removed from the data, three metabolites of BUS appear in the biplot (data not shown). The most prominent peak at m/z 402@12.39 corresponds to MB4 (5-OH-BUS) while the peaks at m/z 402@12.87 and at m/z 360@12.14 correspond to MB5 and MB6, respectively. If the mass trace of m/z 402 is plotted (ESM Fig S7), five different metabolites appear in different quantities. Almost all mutants form MB4 (5-OH-BUS) while MB2 is only formed by a limited number of mutants. This is in agreement with the results from the single substrate screening experiment. For formation of MB2, again, the trend is observed that the majority of the mutants that form significant amounts of this metabolite do not contain the F81I mutations. The fact that mutant MT122, which is a mutant of M01 where the F81I mutation was introduced, does not form MB2 supports this hypothesis. In Table 3, a comparison has been made between the results of the screens for metabolic activity and diversity for a selection of mutants towards AMI and BUS. It can be seen from Table 3 that differences exist between the results from the different screenings. The substrate depletions calculated based on the results of the qualitative analysis by IT-TOF are in almost all cases (except for MT68-mediated conversion of BUS) lower than the substrate depletion calculated from the quantitative analysis by UPLC-MS/MS. This can be caused by the fact that MS signals generated by the different metabolites are not similar to those generated by the parent compound. The discrepancy between the percentage of depletion and the amount of metabolite quantified by UPLC-MS/MS can in most cases be explained by the formation of additional metabolites. This is especially the case for the MT66-mediated metabolism of BUS where the detected 5-OH-BUS (MB4) is only detected as a minor metabolite by IT-TOF. Trends of the metabolic activities and diversities measured by LCQ during the single substrate screening experiment and by IT-TOF during the cocktail screen correspond well. Mutants that displayed good activity during the single substrate screen also displayed good activity during the cocktail screen and metabolic profiles were very similar. This again demonstrates that co-administration of multiple drugs in a single incubation has no major effects on the metabolic activity and diversity of the tested mutants.

Table 3 Metabolic activity and diversity of selected CYP BM3 mutants towards amitriptyline and buspirone

The largest peak in Fig. 4a appears to be the N-demethylated product MM (MD6) (and parallel with it its m/z + 1 isotope) of DEX. When plotting the mass traces of m/z 258 (data not shown), it was found that the majority of the mutants produced the N-demethylated product MM (MD6) as the major product (see also ESM, Fig S3 and Table S5). Very small amounts of the O-demethylated product dextrorphan were also seen in this plot (at 11.55 min). These were formed by a selection of mutants (MT34, MT35, MT36, MT37, MT38 and MT42) which contained mutations at active-site residues S72, A74 or L437. However, also for these mutants, MM was the major product.

By using biplots, either directly or after removing the main peaks, the prominent metabolites of the various substrates can be retrieved (satisfying goal 1 and 5 when combined with mass traces or plots of the selected peaks for all mutants). Most of the times, the mutants that lie in the direction of such a peak produce high amounts of that specific metabolite. The procedure of removing the main peaks from the data, reconstructing and analysing the biplot, can be repeated, thereby retrieving also less prominent metabolites. Another approach is the ROBPCA-based outlier detection procedure. When applied on the data, it appeared that sample 32, which corresponds with MT66, was an outlying sample (see Fig. 5a). In Fig. 5b, the variables are plotted which are responsible for this outlying behaviour. The peaks as variable 202–204 and 206 appear to be peaks with a m/z value of 288 (the adjacent small peaks are their M + 1 isotopes), being four hydroxylated DEX metabolites which almost uniquely and in large amounts are produced by MT66 which displayed the highest activity towards DEX (see Fig. 5c).

Fig. 5
figure 5

(A) ROBPCA plot of data set. Samples positioned in the upper left quadrant are called orthogonal outliers. Also samples in the upper right quadrant can be investigated; (B) a plot of the variables responsible for the outlying behaviour of sample 32 = MT66; the variable numbers in this plot are linked to chromatographic peaks at specific m/z – RT combinations; (C) a mass trace of m/z 288 showing the five hydroxylated DEX metabolites MD1-MD5; the sample numbers indicate the order of the samples in the data set. In the biplot in Fig. 4a, they are linked to the mutant (MT) numbers, used in other figures and tables

UPLC overall has shown that a cocktail approach can be used to screen CYP BM3 mutants for metabolic activity and diversity. Different approaches have been used to assess the activities of the mutants. On the one hand, a UPLC-MS/MS-based method was used that measured depletion of the parent drugs and additionally monitored formation of at least one probe metabolite for each drug. This method made use of very short gradients (<5 min) and can be used to screen large amounts of samples in a relatively short time. The results of the UPLC-MS/MS-based screening were used to calculate substrate depletions for the tested drugs by the mutant library which, in most cases, provided reliable information about the activities of the mutants towards the tested drugs (except for COU). Inclusion of at least one probe metabolite for each drug makes the screening assay more reliable since it will make it easier to identify false positives when it is known which metabolite is expected to be formed. However, this is of course not always possible, especially not when the purpose of the screening is to identify novel mutants that can create chemical diversity by forming novel metabolites. Additional qualitative analysis of selected samples by an information-rich analysis method such as LC-MS/MS proved to be very valuable during this study as it provided much needed information about the metabolic profiles generated by the different mutants. The results of the screening suggest that the majority of the CYPs screened displayed a metabolic profile that is most similar to CYP3A4 (see ESM for more details). The chemometrical analysis facilitated an easy visualization and screening of the resulting data and a quick retrieval of the metabolites of the substrate cocktail. This appeared to be less laborious and time-consuming than manual analysis of the data.

Conclusions

With the experimental setup chosen, significant differences in the drug-metabolizing potential were encountered for the CYP BM3 mutants tested and large differences in the biotransformation profiles generated were established. Therefore, it can be concluded and proven based on the data presented that the proposed analytical strategy is valid, and that we successfully developed a LC-MS/MS-based cocktail screening method for the classification of CYP BM3 mutant libraries for metabolic activity in a quantitative assay method and diversity in a qualitative assay method. By combining all data, the classification of the mutants was made feasible using well-defined chemometrical methods for the rapid and valuable mining of the data. Hence, the classification is of crucial importance to select the proper mutants to generate predesigned compound libraries rather than screening all mutants for each sort of biotransformation. The mutants described in this study could be useful as a starting point for further site-directed or random mutagenesis, to further enhance metabolic efficiency and change substrate diversity and regioselectivity. More research to rationalize the effects of the introduced mutations would be very valuable in this process. In the current study, the presented strategy was used to evaluate the metabolic efficiency of a library of CYP BM3 mutants towards in total six drugs. The analytical approach and designed experiments described in this article will form the basis for future research in screening for CYP BM3 mutants with improved metabolic activity and diversity. This will aid to further improve lead diversification in the drug discovery process and biosynthesis of drug(like) metabolites.