INTRODUCTION

Elucidation of the mechanisms of cell functioning in living organisms, including changes occurring as a result of drug therapy, external factors, or physiological conditions in the organism itself, is one of the main goals of postgenome research and of the development of “omics” technologies and their further use in biology and clinical practice [1]. Achieving this goal will open opportunities for breakthrough changes in approaches to both basic and practical problems of medicine, such as early diagnosis of a variety of serious diseases or prediction of therapy efficiency [2, 3]. Qualitative and quantitative proteomics using high-performance liquid chromatography in combination with tandem mass spectrometry (HPLC-MS/MS) is one of the main approaches to determining the changes in protein expression associated with the above factors that play a key role in the development of a pathology or response [4, 5]. Note that the search for and analysis of the regulated proteins have become possible with the recent development of high-resolution mass analyzers [6, 7] and novel methods of chemical proteomics, such as thermal proteome profiling, TPP [8, 9]. However, the need for long and costly instrumentation time for proteome-wide profiling remains the bottleneck for further development of chemical proteomics methods and their applicability in biomedical research. Indeed, the typical time required for the proteome-wide analysis of cell lysates by state-of-the-art HPLC-MS/MS systems is in the range of several hours (ten or more hours for a sample analyzed in several technical and/or biological replicates). These large instrumentation time expenses are due primarily to the prolonged separation of the analyzed proteolytic mixture, which is necessary for performing tandem mass spectrometry measurement of as many of its components as possible [10, 11].
In particular, for a single “drug-to-proteome” model, implementation of the TPP method requires about 100 proteome-wide analyses, which translates into extended instrumentation time and limits wide acceptance of the method in practice. The ability to perform rapid analyses is becoming increasingly important for identification of the regulated proteins in large sample cohorts and in a variety of “drug-to-proteome” models. The problem could be partially solved using sample multiplexing based on isobaric tandem mass tags (TMT) [12]; yet, reducing the time required for proteome-wide analysis remains an issue because of the cost of the latter approach.

A number of first attempts to develop methods of ultrahigh-speed proteome-wide analysis for large-scale studies in expression proteomics have been reported in recent years [13-16]. The method of direct liquid chromatography/mass spectrometry-based protein identification, DirectMS1, is one of these approaches; it allows rapid quantitative proteome-wide analyses based on ultrashort separation gradients [17, 18]. Previously, the capabilities of the method and its comparison with the standard HPLC-MS/MS-based approaches were demonstrated for proteome profiling of the HeLa (human cervical cancer) cell line. In particular, the method has been shown to identify more than 2000 proteins within a time frame of several minutes using high-resolution Orbitrap mass spectrometers [19]. “Direct” in the method’s name means that the peptide fragmentation step, which is responsible for the long analysis time of the standard proteomic approaches, is excluded from the workflow. Proteins are identified using complementary data acquired for the peptides, such as elution time, ion mobility, etc. An important feature of the DirectMS1 method is its ability to identify proteins with large sequence coverage, which gives more accurate quantitation [19]. However, the feasibility of using the method in chemical proteomics to correctly identify drug targets and reveal the processes regulated by the treatments has not been comprehensively evaluated.

In this work, we performed a number of experimental studies to evaluate the feasibility of using the DirectMS1 method based on ultrashort separation gradients in chemical proteomics. The specific aims of the study were to evaluate the method’s ability to monitor drug targets and reveal proteins regulated by drug treatment, and to compare the proteins identified in proteomic samples obtained using different cell lysis approaches, including freeze-thaw lysis [20], which is widely used in chemical proteomics, and the standard approach based on cell destruction by ultrasonication. To evaluate the method’s ability to monitor drug targets, the following experimental datasets were used: the one obtained earlier at Karolinska Institute using standard deep quantitative proteome profiling with sample multiplexing for the A549 cell line treated with a large variety of anticancer drugs [21]; and DirectMS1 data obtained earlier and in this work for the HeLa [19] and A2780 cell lines, respectively, using ultrashort separation gradients. Further, to compare the two cell lysis techniques, we analyzed the data obtained in this work for the A2780 cell line with two proteome-wide analysis approaches: the standard hour-long HPLC-MS/MS and the ultrafast DirectMS1. Finally, we performed expression proteome profiling of the A2780 cell line treated with lonidamine, followed by gene ontology analysis of the differentially expressed proteins, to evaluate the ability of DirectMS1 to correctly reveal the processes regulated by drug treatment.

MATERIALS AND METHODS

Cell culture. A2780 human ovarian carcinoma cells obtained from the European Collection of Authenticated Cell Cultures (ECACC, UK) were cultured in standard RPMI 1640 medium (Gibco™, Ireland) containing 5% fetal calf serum (Gibco™, Brazil) and a mixture of antibiotics (penicillin, 100 µg/ml; streptomycin, 100 µg/ml; PanEco, Russia) at 37°C in an atmosphere of 5% CO₂. Cells after 5-7 passages were used in the experiments. To obtain cell mass, cells were seeded in 25 cm² culture flasks (TPP, Switzerland) (1×10⁶ cells in 5 ml of culture medium) and incubated for 24 h. On the day of the experiment, the medium was replaced with 5 ml of medium containing the active compound lonidamine (obtained at Moscow State University according to the literature method [22]) at a concentration of 150 µM (IC50). The cells were then incubated for 72 h, the medium was removed, and the cells were detached from the surface with a scraper, suspended in fresh culture medium, centrifuged (1500 rpm, 3 min), and washed twice with cold PBS (10 mM, pH 7.4). Cells in flasks that had not been treated with lonidamine were used as controls. All experiments were carried out in five replicates.

Cell lysis. Cell proteins were isolated using two methods for comparison: (i) ultrasonication-based lysis and (ii) freeze-thaw lysis. In the first method, a standard lysis protocol was used with a lysis buffer containing 50 mM ammonium bicarbonate (ABC), 10% acetonitrile, and 0.1% ProteaseMAX Surfactant (Promega, USA). One hundred microliters of the lysis buffer was added to one million cells, followed by incubation on a temperature-controlled shaker (500 rpm) at room temperature for 30 min. Further lysis was performed using an ultrasonic homogenizer (Qsonica Q125, Newtown, USA) for 3 min (in cycles of 10 s) on ice at an amplitude of 25%. For further enzymatic hydrolysis, the samples were diluted 2-fold with 50 mM ABC.

Freeze-thaw lysis was performed in a sodium phosphate buffer (1× concentration, pH 7.4, 1 million cells – 100 μl of buffer). Samples were placed in liquid nitrogen for 1 min, next, heated to 25°C on a temperature-controlled shaker (600 rpm) for 1 min. The freeze-thaw cycle was repeated three times. After that, the samples were centrifuged (10,000g) for 5 min, and the supernatant was transferred into new tubes for further enzymatic hydrolysis.

Sample preparation for liquid chromatography/mass spectrometry proteomic analysis. To reduce disulfide bonds, dithiothreitol (DTT, final concentration 10 mM) was added to the samples, followed by heating at 56°C for 25 min, after which alkylation was carried out using iodoacetamide (IAA, final concentration 15 mM) in the dark at room temperature for 30 min. To quench any excess of the alkylating reagent, DTT was added to the samples to a final concentration of 5 mM. Trypsin (Sequencing Grade Modified Trypsin, Promega) at a ratio of 1 : 75 was used for enzymatic hydrolysis; the reaction was carried out at 37°C for 15 h. To stop hydrolysis, acetic acid (3-5%, final pH < 3) was added to the samples. After 30 min, the samples were centrifuged (12,000g, 10 min). The supernatant was transferred into glass tubes and dried using a vacuum centrifuge evaporator. The obtained samples were stored at –80°C. Immediately before mass spectrometry analysis, the samples were dissolved in a 0.1% aqueous solution of formic acid to a peptide concentration of 1 μg/μl.

Liquid chromatography/mass spectrometry analysis of proteolytic mixtures and protein identification. Identification of proteins and their quantitation were carried out by two methods: (i) standard proteome-wide bottom-up analysis [23] based on separation of proteolytic mixtures of peptides by reverse-phase chromatography in combination with tandem mass spectrometry; and (ii) method of direct liquid chromatography/mass spectrometry identification of proteins, DirectMS1, based on ultrashort separation of peptide mixtures and acquisition of high-resolution MS1 mass spectra of peptide ions without the use of fragmentation [18].

Proteome analysis based on tandem mass spectrometry was performed in two technical replicas using an Ultimate 3000 RSLCnano HPLC system (Thermo Fisher Scientific, USA) coupled with a Q-Exactive HFX mass spectrometer (Thermo Fisher Scientific). The peptide mixture (500 ng) was loaded onto an Acclaim µ-Precolumn enrichment column (0.5 mm × 3 mm, particle size 5 µm, Thermo Fisher Scientific) at a flow rate of 10 µl/min for 4 min in isocratic mode using buffer “C” as mobile phase (2% acetonitrile, 0.1% formic acid in deionized water). Next, the peptides were separated on a C18 HPLC column (75 μm × 150 mm, Reprosil-Pur phase, 3 μm particle size, 100 Å pore size) in a gradient elution mode. A gradient was formed from mobile phase A (0.1% formic acid) and mobile phase B (80% acetonitrile, 0.1% aqueous formic acid) at a flow rate of 400 nl/min. The gradient was formed as follows: column was equilibrated with a 2% mobile phase B for 3 min, after which concentration of the mobile phase B was linearly increased to 35% over 28 min, then concentration of the phase B was linearly increased to 99% over 5 min; after 5-min washing with 99% buffer B, the buffer concentration was linearly reduced to the initial 2% over 1 min. Total analysis time was 45 min. Mass spectra were acquired using a Q-Exactive HFX mass spectrometer operated in the positive ionization mode using a NESI source (Thermo Fisher Scientific). For MS, the following settings were used: emitter voltage of 2.3 kV; capillary temperature of 285°C, mass range from m/z 350 to m/z 1500, and mass resolution of 120,000. For MS/MS used in DDA mode (data dependent acquisition [24]), mass resolution was set to 15,000 in the mass range from m/z 100 to the upper limit, which was determined automatically based on the mass of the precursor, but not more than m/z 2000. Precursor ions were isolated in a window of ±1 Da. 
The maximum number of ions selected for isolation in the MS/MS mode was set to 20, and the normalized collision energy (NCE) was 29. Only precursor ions with charge states from z = 2+ to z = 6+ were allowed for fragmentation. Maximum accumulation times for precursor and fragment ions were 50 ms and 110 ms, respectively. The AGC (automatic gain control) values for precursor and fragment ions were set to 1×10⁶ and 2×10⁵, respectively. All precursor ions already selected for fragmentation were dynamically excluded from the list of candidates for isolation and fragmentation for 30 s. DDA data from the proteome analysis of digests of the analyzed cell line lysates were then processed for protein identification.

DirectMS1 analyses were performed for each biological replica in three technical runs using an Ultimate 3000 RSLCnano HPLC system (Thermo Fisher Scientific) coupled with a Q-Exactive HFX mass spectrometer (Thermo Fisher Scientific). A peptide mixture (200 ng) was loaded onto an Acclaim µ-Precolumn enrichment column (0.5 mm × 3 mm, particle size 5 µm, Thermo Fisher Scientific) at a flow rate of 10 µl/min for 4 min in isocratic mode using buffer “C” as a mobile phase (2% acetonitrile, 0.1% formic acid in deionized water). Next, the peptides were separated on a C18 reverse-phase HPLC column (75 μm × 150 mm, Reprosil-Pur phase, 3 μm particle size, 100 Å pore size) in a gradient elution mode with mobile phases A (0.1% formic acid) and B (80% acetonitrile, 0.1% aqueous formic acid) at a flow rate of 1.5 μl/min. The gradient was formed as follows: first, the column was equilibrated with 8% mobile phase B for 0.8 min, after which concentration of the mobile phase B was linearly increased to 35% over 4.2 min, then concentration of the phase B was linearly increased to 99% over 0.9 min to wash the column at 99% buffer B, after that the phase B concentration was linearly reduced to the original 8% over 0.1 min. Total analysis time, including the column washing and equilibration, was 7.3 min.
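To put the two methods in perspective, the following minimal sketch multiplies the total run times stated above (45 min for MS/MS and 7.3 min for DirectMS1) by the roughly 100 analyses quoted in the Introduction for a single TPP experiment. The function name and the assumption of back-to-back runs with no additional overhead (loading, blanks, replicates) are illustrative and not part of the original protocol.

```python
def instrument_hours(n_runs, minutes_per_run):
    """Total instrument time in hours, assuming back-to-back runs with no extra overhead."""
    return n_runs * minutes_per_run / 60.0


N_RUNS = 100          # approximate number of analyses per TPP experiment (Introduction)
MSMS_MIN = 45.0       # total analysis time of the MS/MS method, min
DIRECTMS1_MIN = 7.3   # total analysis time of the DirectMS1 method, min

msms_h = instrument_hours(N_RUNS, MSMS_MIN)
direct_h = instrument_hours(N_RUNS, DIRECTMS1_MIN)
print(f"MS/MS: {msms_h:.1f} h; DirectMS1: {direct_h:.1f} h; "
      f"ratio: {msms_h / direct_h:.1f}x")
```

Under these assumptions, the estimate is about 75 h of instrument time for the MS/MS method versus about 12 h for DirectMS1.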

Protein identification. In all cases, the SwissProt database dated July 30, 2021, restricted to the species of the studied organism (Homo sapiens) and containing 20,368 proteins, was used. Protein identification in the MS/MS data was carried out using the IdentiPy v.0.3.3 search engine [25] and the post-search validation tool Scavager v.0.2.9 [26], which by default integrates the NSAF (normalized spectral abundance factor [27]) algorithm for quantitation; NSAF was chosen earlier as the most accurate algorithm for label-free quantitative proteome analysis based on tandem mass spectrometry [28]. The following search parameters were used: enzyme, trypsin; length of tryptic peptides from 7 to 30 amino acid residues; mass tolerances of ±5 ppm and ±0.01 Da for monoisotopic peptide masses in the MS1 spectra and their fragments in the MS/MS spectra, respectively; and up to two enzymatic miscleavages. Oxidation of methionine, N-terminal acetylation of the protein, and modification of cysteine with the alkylating reagent were considered as variable modifications. Peptide-spectrum matches (PSMs), peptides, and protein groups were filtered to a false discovery rate (FDR) of 1.0%. To compare proteins identified using different lysis methods, the data from technical and biological replicates were combined and filtered to the same FDR level (“union” function of the Scavager v.0.2.9 algorithm [26]).
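For readers unfamiliar with NSAF, the sketch below illustrates the commonly used definition: the spectral count of a protein divided by its length, normalized by the sum of this ratio over all proteins in the sample. The function name and the toy data are hypothetical; the actual implementation in Scavager may differ in details such as the handling of shared peptides.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for each protein.

    spectral_counts: {protein: number of identified spectra}
    lengths:         {protein: sequence length in residues}
    """
    # Spectral abundance factor: spectral count per residue.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    # Normalize so that the values over all proteins sum to 1.
    return {p: value / total for p, value in saf.items()}


# Hypothetical toy data, for illustration only.
counts = {"P1": 10, "P2": 20, "P3": 5}
lengths = {"P1": 100, "P2": 400, "P3": 50}
nsaf_values = nsaf(counts, lengths)
print(nsaf_values)
```

In this toy example, P2 has the largest spectral count but the lowest NSAF, because the count is spread over a much longer sequence.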

For the DirectMS1 data, proteins were identified using ms1searchpy v.2.0.3 [17], recently upgraded by integration of machine learning algorithms [18, 19]. Decoy sequences for validation of the results were generated by shuffling the target protein sequences. The following search parameters were used: enzyme, trypsin; at least three scans per peptide ion cluster formed by the peptide’s 13C isotopic distribution; at least two peaks in the cluster, including the monoisotopic one; initial mass accuracy of 8 ppm; peptide length from 7 to 30 amino acid residues (484,765 peptides) with zero enzymatic miscleavages; and charge states of peptide ions from 1+ to 6+. The list of proteins was filtered to 1.0% FDR. For comparison of the different lysis methods, all five biological replicates were analyzed, and proteins identified in at least four of them were used. The identified proteins were reported as leaders of protein groups, which included proteins with shared peptides. All proteins of the group were then subjected to quantitative analysis. In particular, for the MS1 data obtained by the DirectMS1 method, quantitative analysis was performed as follows [18]: first, a list of all reliably (at a given FDR level) identified proteins (leaders of protein groups) was generated for technical/biological replicates and samples; then, for these proteins, a list of all identified peptides was compiled; and, finally, for these peptides, a list of all proteins from the database to which these peptides correspond was generated. Thus, not only the leaders of protein groups but also all proteins that share peptides with them participated in the quantitative analysis for the subsequent statistical tests.
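The three-step expansion of the protein list described above can be sketched as follows. This is a minimal illustration, assuming a precomputed mapping from database proteins to the identified peptides they contain; the function name, the mapping, and the toy identifiers are hypothetical and do not reproduce the ms1searchpy code.

```python
def proteins_for_quantitation(leaders, protein_to_peptides):
    """Expand protein group leaders to all proteins sharing peptides with them.

    leaders:             set of reliably identified protein group leaders
    protein_to_peptides: {database protein: set of identified peptides it contains}
    """
    # Steps 1-2: collect all identified peptides of the group leaders.
    peptides = set()
    for protein in leaders:
        peptides.update(protein_to_peptides.get(protein, ()))
    # Step 3: keep every database protein that contains any of these peptides.
    return {
        protein
        for protein, protein_peptides in protein_to_peptides.items()
        if peptides & set(protein_peptides)
    }


# Hypothetical toy mapping, for illustration only.
mapping = {
    "A": {"pep1", "pep2"},
    "B": {"pep2"},
    "C": {"pep3"},
    "D": {"pep3", "pep4"},
}
quant_proteins = proteins_for_quantitation({"A"}, mapping)
print(sorted(quant_proteins))
```

Here protein B enters the quantitative analysis together with the leader A because the two share pep2, whereas C and D are left out.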

Analysis of chemical proteomics data from Karolinska Institute study. Chemical proteomics data for a large set of drug-to-proteome systems studied at the Molecular Biometrics Center of Karolinska Institute using TMT-based deep proteome analyses [21] were downloaded from the ProteomeXchange repository (ID PXD009775) and searched using IdentiPy [25]. For the peptide search, mass tolerances of 10 ppm and 0.01 Da were set for precursor ions and their fragments, respectively. Because the data were obtained using TMT tags (specifically, TMT10), the mass shift of 229.16 Da was set as a fixed modification of all peptide N-termini and lysine residues, as was carbamidomethylation of cysteines. Methionine oxidation and phosphorylation of serine, threonine, and tyrosine were set as variable modifications. Search results for individual fractions were pooled and validated using Scavager [26]. To assess the relative content of a protein, the sum of its intensities in all 10 TMT channels, averaged over three replicates, was used. This value does not reflect the actual protein content in a particular sample but characterizes its average concentration, so it was used to rank the known drug targets and select the top-15 targets.
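The ranking described above can be sketched as follows: channel intensities are summed within each replicate, the sums are averaged over replicates, and the known targets are sorted by this score. The function name, the input layout, and the toy numbers are assumptions made for illustration; the toy data use two channels instead of ten for brevity.

```python
def rank_known_targets(intensities, known_targets, top_n=15):
    """Rank known drug targets by summed TMT intensity averaged over replicates.

    intensities:   {protein: [[channel intensities] for each replicate]}
    known_targets: iterable of protein IDs annotated as drug targets
    """
    score = {}
    for protein, replicates in intensities.items():
        # Sum over TMT channels within each replicate, then average over replicates.
        channel_sums = [sum(channels) for channels in replicates]
        score[protein] = sum(channel_sums) / len(channel_sums)
    # Only known targets that were actually quantified can be ranked.
    quantified = [p for p in known_targets if p in score]
    ranked = sorted(quantified, key=score.get, reverse=True)
    return ranked[:top_n], score


# Hypothetical toy data: two replicates, two channels each.
toy = {
    "T1": [[1.0, 2.0], [3.0, 4.0]],
    "T2": [[10.0, 10.0], [10.0, 10.0]],
    "X": [[100.0, 100.0], [100.0, 100.0]],
}
top_targets, scores = rank_known_targets(toy, ["T1", "T2", "T3"], top_n=1)
print(top_targets, scores)
```

Protein X has the highest score but is not a known target and therefore is not ranked; T3 is a known target that was not quantified and is skipped.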

Identification of the differentially expressed proteins. Diffacto algorithm [29] was used for semi-quantitative analysis of protein identifications obtained by DirectMS1 method. Peptide intensities for the statistical analysis by Diffacto were extracted using Biosaur v.2.0 [30]. Protein p-values were calculated using the PECA method (Probe-level Expression Change Averaging) and adjusted for multiple comparisons using the Benjamini–Hochberg correction. Selection of the differentially regulated proteins was performed according to the following criteria: fdr_bh < 0.001, FCup > 1.2, FCdown < 1/1.2. When assessing response of the A2780 cells to lonidamine obtained by DirectMS1 method, gene ontology (GO) analysis was performed separately for upregulated (FCup > 1.2) and downregulated (FCdown < 1/1.2) proteins using GOrilla [31]. GO enrichments were calculated against all identified proteins included in the results of semi-quantitative analysis (background).
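The selection criteria above can be illustrated with a short sketch that applies the Benjamini–Hochberg adjustment to a list of p-values and then the fold-change thresholds. It assumes that the p-values have already been computed (PECA itself is not implemented here) and that fold changes are linear treated-to-control ratios; the function names and the toy values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_regulated(proteins, pvalues, fold_changes, alpha=0.001, fc=1.2):
    """Split proteins into up- and downregulated by adjusted p-value and fold change."""
    adjusted = benjamini_hochberg(pvalues)
    rows = list(zip(proteins, adjusted, fold_changes))
    up = [p for p, q, f in rows if q < alpha and f > fc]
    down = [p for p, q, f in rows if q < alpha and f < 1.0 / fc]
    return up, down


# Hypothetical toy values, for illustration only.
names = ["A", "B", "C", "D"]
pvals = [1e-6, 2e-5, 0.5, 1e-7]
fcs = [2.0, 0.5, 3.0, 1.1]
up, down = select_regulated(names, pvals, fcs)
print(up, down)
```

In this toy example, C fails the significance threshold despite its large fold change, and D is significant but changes by less than 1.2-fold, so neither is selected.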

RESULTS

To evaluate capabilities of the DirectMS1 method based on ultrashort separation gradients, we re-analyzed the chemical proteomics data for a large set of drug-to-proteome systems obtained using TMT multiplexing at the Molecular Biometry core facility of the Karolinska Institute [21]. The data were obtained for several dozens of antitumor drugs used in clinical practice or undergoing initial stages of preclinical trials. The drugs were used to treat human lung adenocarcinoma cell line A549 of the ATCC collection. Table 1 lists the drugs used in the study and their targets (columns “Drug” and “Target protein ID”). For comparison, only the known drug targets found in the results of the multiplex HPLC-MS/MS analyses among the 15 proteins identified as the targets of a particular drug with the highest values of relative concentrations (top-15) are listed.

The proteomic data obtained using the DirectMS1 method and ultrashort separation gradients were searched for the proteins belonging to the above list of top-15 targets. In particular, Table 1 shows the results of DirectMS1 analysis of the HeLa cell line digest data obtained previously using an Orbitrap Fusion mass spectrometer equipped with the FAIMS ion mobility system [19], and of the A2780 cell line data obtained in this work using the two lysis methods described above. Note that the results obtained for the standard HeLa digest are an example of the deepest proteome coverage achieved to date using DirectMS1 and 5-min separation gradients. As can be seen from the presented data, most of the drug targets from the list of top-15 proteins are also identified in the ultrafast proteome-wide analysis at the currently achieved sensitivity of the method.

Table 1 Chemical proteomics data for the “drug-to-proteome” systems studied using TMT-based HPLC-MS/MS quantitative proteomics [21], and evaluation of capabilities of the DirectMS1 method to monitor drug target proteins in the HeLa and A2780 cell lines digests

One of the objectives of the study was to compare different methods of cell lysis, including a protocol based on ultrasonic destruction of cells (US) and a method of cryogenic freezing of cells with subsequent thawing (freeze-thaw lysis, FT). The lysis methods were compared for the untreated A2780 cell line using standard proteomic HPLC-MS/MS analysis and HPLC-DirectMS1. The FT protocol is widely used in chemical proteomics to search for drug targets because it allows drugs to interact with proteins in their native forms [8]. Figure 1 shows the results of the comparison of the two lysis methods followed by MS/MS-based (Fig. 1a) and DirectMS1-based (Fig. 1b) proteome analyses. The results show that the use of FT lysis leads to a lower depth of proteome coverage (see Table S1 in the Online Resource 1). Note that the observed decrease in the number of identifications is mainly due to the membrane proteins. Indeed, according to the functional enrichment analysis using GOrilla [31], the vast majority of the proteins identified exclusively using the common US lysis method belong to the cell membrane components, such as cell walls and mitochondrial membranes (see Table 2). The overlap in the identified proteins (leaders of the corresponding protein groups) between the MS/MS and DirectMS1 methods was about 80% (Fig. 1, c and d), and enrichment of the membrane components was observed for both (Fig. 1e). It is worth noting that the proteins unique to the FT lysis method are primarily cytoplasmic and do not belong to a specific cellular compartment. Thus, their identification is probably associated with the decrease in complexity of the protein mixture due to the absence of a significant amount of membrane proteins.
It should be noted that the analysis of gene ontologies (Gene Ontology, GO) [32], the results of which are given in Table 2, was performed for the standard HPLC-MS/MS data using only protein group leaders, which are the most likely proteins present in the sample. At the same time, quantitative analysis in the Diffacto algorithm used in the DirectMS1 method was carried out for all proteins of the group [18, 19]. However, such difference in the quantitation approaches did not affect the results of comparison, since majority of the protein groups identified in the MS/MS analysis and lost when using FT lysis consisted of either a single protein (80% of protein groups), or two homologous proteins (10% of protein groups) with the same GO terms.

Fig. 1. Comparison of different lysis methods, a standard protocol based on ultrasonication (US) and a cryogenic freeze-thaw protocol (FT): proteome-wide analysis of A2780 cell line digests was performed using MS/MS-based (a) and DirectMS1 (b) methods; comparison of identified proteins (leaders of protein groups) obtained by DirectMS1 and MS/MS methods using US (c) and FT (d) lysis; (e) intersection of the 15 most statistically significant enriched cellular components corresponding to the proteins lost when using FT lysis (see Table 2 for details). The analysis was performed using GOrilla [31].

Table 2 Functional enrichment (cellular components) of proteins lost using the freeze-thaw lysis

Study of the capabilities of DirectMS1 method based on ultrashort separation gradients for identifying pathways of drug-activated intracellular interactions was performed using A2780 cancer cell line treated with lonidamine as an example. It is believed that the cancer cell metabolism is characterized by aerobic glycolysis to generate energy needed for cellular processes, a phenomenon known as the Warburg effect. Persistent activation of aerobic glycolysis in cancer cells may be associated with activation of oncogenes or loss of tumor suppressors, thereby significantly accelerating cancer progression. In this regard, inhibition of glycolytic capacity in the malignant cells can enhance the effect of combined antitumor chemotherapy [33]. Lonidamine is an indazole derivative, and it has been suggested that the mechanism of its action is based on inhibition of the mitochondrial-associated hexokinase II [34, 35]. Also, it was shown in a number of earlier studies that lonidamine inhibits glycolysis, which leads to the decrease in the synthesis of intracellular ATP [36]. The above-mentioned possible mechanisms of action of lonidamine allowed considering this drug as one of the components of combined chemotherapy for solid tumors, including ovarian cancer [37, 38].

Here, a comparative analysis of the proteomes of the ovarian cancer cell line treated with lonidamine (at a concentration of IC50) relative to the untreated control was performed using ultrafast proteome-wide profiling by the DirectMS1 method. It was of interest to examine the features of protein regulation in the metabolic pathways associated with glycolysis and ATP synthesis for such a relatively well-characterized drug-to-ovarian cancer system [39]. Figure 2 shows the results of proteome analysis of the lonidamine-treated A2780 ovarian cancer cell line compared to the untreated control using the 5-min HPLC-DirectMS1. Based on the measured fold changes in the content of identified proteins and the gene ontology analysis, it can be concluded that the identified differentially regulated proteins are, as expected, associated with the metabolic pathways of glycolysis and ATP synthesis (marked in bold in Table 3), in agreement with the known mechanism of lonidamine action.

Fig. 2. Volcano plot of proteins identified using the DirectMS1 method for the A2780 cell line treated with lonidamine, in coordinates of fold changes (FC) of measured concentrations versus control and statistical significance of these changes (p-value). Experiments were performed using HPLC-MS/MS system Orbitrap Q Exactive HF adapted for the DirectMS1 method based on ultrashort (5 min) HPLC gradients.

Table 3 Analysis of gene ontologies using GOrilla for proteins with statistically significant increase in concentration after drug treatment (proteins in the upper right quadrant of the scatterplot in Fig. 2)

To optimize threshold for the statistically significant changes in concentration of the proteins due to lonidamine treatment, we used the previously developed approach based on the Benjamini–Hochberg correction for multiple comparisons, analysis of density of the data points in the volcano plots of protein concentration changes, and gene ontologies [40]. Analysis of gene ontologies for the identified differentially regulated proteins, level of which significantly decreased after treatment, revealed enrichment of the biological processes associated with RNA metabolism and splicing (Table 4).

Table 4 Analysis of gene ontologies using GOrilla utility for the proteins with a statistically significant decrease in concentration after drug treatment (upper left quadrant in Fig. 2)

CONCLUSION

In summary, the study has shown the possibility of using the ultrafast method of quantitative proteome-wide analysis DirectMS1 in chemical proteomics to search for drug targets and their chemotherapeutic effects. Specifically, we re-analyzed chemical proteomics data previously obtained by deep quantitative proteome profiling for a large cohort of known anticancer drugs and demonstrated that the ultrafast 5-min long proteome-wide analysis with DirectMS1 correctly identifies target proteins and measures their quantitative changes. Moreover, even in the case of insufficient depth of analysis for direct quantitative identification of targets, as in the studied case of the A2780 ovarian cancer cell line treated with lonidamine, the method correctly reveals main pathways of the intracellular interactions activated by drug treatment.

The results obtained suggest that the new method of ultrafast proteome-wide quantitative analysis could find applications in proteomic analyses of large sample cohorts, such as clinical and population proteomics, validation of protein biomarkers of diseases, etc.