Abstract
Prostar is a software tool dedicated to processing quantitative data from mass spectrometry-based label-free proteomics. In practice, once biological samples have been analyzed by bottom-up proteomics, the raw mass spectrometer outputs are processed by bioinformatics tools to identify peptides and quantify them, notably by integrating precursor ion chromatograms. From that point, classical workflows aggregate this peptide-level information to infer protein-level identities and amounts. Finally, protein abundances are statistically analyzed to identify proteins that are significantly differentially abundant between the compared conditions. Prostar's original workflow was developed around this strategy. However, recent work has demonstrated that processing peptide-level information is often more accurate when searching for differentially abundant proteins, as the aggregation step tends to hide some of the variability and bias in the data. As a result, Prostar has been extended with workflows that handle peptide-level data, and this protocol details their use. The first one, termed “peptidomics,” conducts the differential analysis at the peptide level, independently of the peptide-to-protein relationship. The second workflow aggregates the peptide abundances after their preprocessing (i.e., after filtering, normalization, and imputation), so as to minimize the amount of protein-level preprocessing required before differential analysis.
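The order of operations in the second workflow (filter, normalize, and impute at the peptide level, then aggregate to proteins) can be illustrated with a minimal sketch. This is not Prostar's implementation: the Python function names, the toy data, and the specific method chosen for each step (a minimum-observation filter, median centering, per-sample minimum imputation, and sum aggregation) are assumptions made for illustration only. Abundances are taken to be log2 intensities, with None marking a missing value, and each peptide is assumed to map to exactly one protein.

```python
from math import log2
from statistics import median


def n_samples(abund):
    """Number of samples (columns) in a peptide -> values table."""
    return len(next(iter(abund.values())))


def filter_peptides(abund, min_observed):
    """Keep peptides quantified (non-None) in at least `min_observed` samples."""
    return {pep: vals for pep, vals in abund.items()
            if sum(x is not None for x in vals) >= min_observed}


def normalize_median(abund):
    """Subtract each sample's median so that all sample medians equal 0."""
    meds = [median(v[j] for v in abund.values() if v[j] is not None)
            for j in range(n_samples(abund))]
    return {pep: [None if x is None else x - meds[j] for j, x in enumerate(vals)]
            for pep, vals in abund.items()}


def impute_sample_min(abund):
    """Replace each missing value with the smallest observed value of its sample."""
    mins = [min(v[j] for v in abund.values() if v[j] is not None)
            for j in range(n_samples(abund))]
    return {pep: [mins[j] if x is None else x for j, x in enumerate(vals)]
            for pep, vals in abund.items()}


def aggregate_sum(abund, pep2prot):
    """Sum peptide intensities per protein on the linear scale, return log2 values."""
    totals = {}
    for pep, vals in abund.items():
        acc = totals.setdefault(pep2prot[pep], [0.0] * len(vals))
        for j, x in enumerate(vals):
            acc[j] += 2.0 ** x
    return {prot: [log2(t) for t in acc] for prot, acc in totals.items()}


# Hypothetical toy data: log2 intensities for 4 peptides across 4 samples.
peptides = {
    "PEP_A1": [20.0, 21.0, 22.0, 23.0],
    "PEP_A2": [18.0, None, 20.0, 21.0],
    "PEP_B1": [25.0, 26.0, 25.0, 26.0],
    "PEP_B2": [None, None, None, 19.0],
}
pep2prot = {"PEP_A1": "PROT_A", "PEP_A2": "PROT_A",
            "PEP_B1": "PROT_B", "PEP_B2": "PROT_B"}

# All preprocessing happens at the peptide level, before aggregation.
filtered = filter_peptides(peptides, min_observed=3)
normalized = normalize_median(filtered)
imputed = impute_sample_min(normalized)
proteins = aggregate_sum(imputed, pep2prot)
```

Because filtering, normalization, and imputation are all applied before `aggregate_sum`, the resulting protein table contains no missing values and can be passed directly to a differential analysis step.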
Acknowledgements
This work was supported by grants from the Agence Nationale de la Recherche under the ProFI project (Proteomics French Infrastructure, ANR-10-INBS-08); the GRAL project, a program of the Chemistry Biology Health (CBH) Graduate School of University Grenoble Alpes (ANR-17-EURE-0003); the DATA@UGA and SYMER projects (ANR-15-IDEX-02); and MIAI @ Grenoble Alpes (ANR-19-P3IA-0003).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Tardif, M., Fremy, E., Hesse, AM., Burger, T., Couté, Y., Wieczorek, S. (2023). Statistical Analysis of Quantitative Peptidomics and Peptide-Level Proteomics Data with Prostar. In: Burger, T. (eds) Statistical Analysis of Proteomic Data. Methods in Molecular Biology, vol 2426. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1967-4_9
DOI: https://doi.org/10.1007/978-1-0716-1967-4_9
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1966-7
Online ISBN: 978-1-0716-1967-4
eBook Packages: Springer Protocols