It is now clear that unraveling how the brain creates behavior, and how this understanding can better inform new therapies and interventions, will require sweeping new approaches to tackling the complex problems in Neuroscience and Psychiatry. Fortunately, tremendous advances in sample sizes, computing power, neuroimaging technologies, digital approaches to phenotyping, and computational modeling have rapidly brought in a new era of Big Data in Psychiatry. How we understand and apply computational approaches to enormous datasets, from genomics to neuroimaging to phenotyping, may well determine our success as a field in the coming years.

To address these issues, the 2021 Neuropsychopharmacology Reviews edition aims to cover a broad swath of topics, from digital data collection, to neuroimaging across development, to multiomics across disorders, to computational approaches to Big Data in Psychiatry. We have broken up these themes into several more focused sections, as briefly outlined below.

In our section on Computational Modeling in Psychiatry, Quentin Huys from University College London and colleagues from Oxford, Brown, and the Laureate Institute for Brain Research examine how computational psychiatry approaches can be used to leverage large datasets across Psychiatry. In particular, they describe advances in theory-driven computational approaches for understanding mechanisms in disorders of mood, anxiety, and addiction [1]. How explanatory models that integrate across biological, psychological, and social–environmental domains can be generated and tested on experimental data is a critical path forward for understanding mechanisms and advancing new treatments. Danielle Bassett and colleagues from the University of Pennsylvania examine these issues and discuss a number of exciting new approaches in their review of computational modeling in neuroscience [2]. Related to this, Sturman and colleagues review machine learning approaches to complex behavior in their manuscript examining deep phenotyping of rodent behavior [3]. Returning to human behavior, JP Onnela and colleagues from Harvard School of Public Health examine large-scale approaches to active and passive data collection and prediction (e.g., digital phenotyping) in Psychiatric Disorders [4].

Another section of the 2021 Neuropsychopharmacology Reviews edition covers Large-Scale ‘Omics Approaches to Understanding Psychiatric Disorders. Matt State and colleagues of the University of California, San Francisco, first describe large-scale international collaborative efforts to dissect genetic contributions of rare and de novo variation to Autism Spectrum Disorders, Tourette’s syndrome, and brain malformation syndromes [5]. These collaborative efforts may help traverse the complex path from genomics to therapeutics in autism spectrum disorder. Another very exciting tool for applying large-data approaches to understanding brain development and function uses induced human neurons and more complex organoids derived from stem cells. Flora Vaccarino and colleagues from Yale address these approaches, as well as related large-scale work from the PsychENCODE consortium, in their timely review [6]. The ability to perform large-scale single-cell sequencing of hundreds of thousands of individual cells in the brain is also revolutionizing our understanding of cell and microcircuit functioning.

An additional approach to using large data across species is the integration of large-scale genome-wide association study (GWAS) data with genomics in model systems. Elissa Chesler and colleagues address this issue in their review on interpretation of psychiatric GWAS with multispecies heterogeneous functional genomic data integration [7]. While genetic and mRNA analyses have transformed our understanding of brain function, in large part due to the genetics revolution and the ability to do large-scale inexpensive nucleotide sequencing, new approaches using rapid, high-throughput mass spectrometry have allowed large-scale proteomics to advance rapidly as well. Nick Seyfried and colleagues from Emory University provide a terrific review of analyses of the proteome in Alzheimer’s Disease, an area of great recent progress in large-scale data collection and analytic approaches [8]. Rob McCullumsmith and colleagues at Cincinnati address how such integrative ‘omics discoveries can be used to change and improve drug discovery, and how they can be implemented, in their review focused on big data approaches to systems pharmacology, drug repurposing, and translational research [9].

We increasingly appreciate that psychiatric disorders are likely much more biologically distinct than is captured by the clinical and self-report symptom clusters currently used to define, for example, “depression” or “schizophrenia”. This raises the question: how do we best carve nature at its joints using large-scale biological and phenotypic data? To address these important issues, we have invited several reviews of recent large-scale efforts, often through consortia, that are working to understand the distinct biological subtypes that make up psychiatric disorders. Deanna Barch and colleagues from Washington University discuss large-scale, collaborative, multisite efforts to examine the neural basis of disturbances in cognitive control and emotional processing in individuals with schizophrenia, and in those at risk for the development of schizophrenia and other psychiatric disorders, based on the Adolescent Brain Cognitive Development cohort study [10]. More broadly in the area of bipolar and psychotic disorders, Carol Tamminga and colleagues from the University of Texas review their work on multimodal biomarkers and biotypes as part of the large-scale Bipolar-Schizophrenia Network on Intermediate Phenotypes [11]. Great progress has also been made with these approaches in depression and mood disorders, as reviewed by Conor Liston and colleagues at Weill Cornell Medical School in their work on functional connectivity and biotypes of depression [12]. Of course, these enormously large datasets, and the application of unbiased approaches to dissecting them, are only made possible by new analytic tools. Great progress in this area is reviewed by Andreas Meyer-Lindenberg and colleagues from the Medical Faculty Mannheim at Heidelberg University in Germany, in their work describing machine learning methods, with a focus on deep and recurrent neural networks, and how these can be applied in the context of psychiatry [13].

In a final section of this edition, we further examine approaches to digital phenotyping through large-scale data collection that can advance our understanding of Psychiatric Syndromes. Lisa Marsch and colleagues at Dartmouth address how science informs the development, evaluation, and sustainable implementation of technology-based tools (that leverage web, mobile sensing, and/or social media approaches) for behavior change, targeting a wide array of populations and health behaviors [14]. Kathleen Merikangas and colleagues at the National Institute of Mental Health dive deeper into these approaches, in particular as they relate to Bipolar Disorder, in their use of real-time mobile monitoring and sensing devices [15]. And finally, Laura Germine and colleagues, at McLean Hospital and Harvard Medical School, describe profoundly exciting work in taking behavioral and cognitive testing “to the world” through web-based, large-scale data collection in their review on population-based macroscale data collection for behavioral research [16].

Together, we are confident that the 2021 Neuropsychopharmacology Reviews edition on Big Data in Psychiatry will provide an exciting update and discussion on how large-scale data approaches, spanning data collection, genetics, neuroimaging, and modeling, will transform the future of Neuroscience and Psychiatry.

Funding and disclosures

KJR is supported by NIH R21MH112956, P50MH115874, R01MH094757, and R01MH106595, and the Frazier Foundation Grant for Mood and Anxiety Research. KJR has received consulting income from Alkermes, research support from NIH, Genomind and Brainsway, and he is on scientific advisory boards for Janssen and Verily, all of which is unrelated to the present work.