Patterns
Volume 3, Issue 2, 11 February 2022, 100410
Perspective
Breaking away from labels: The promise of self-supervised machine learning in intelligent health

https://doi.org/10.1016/j.patter.2021.100410
Under a Creative Commons license
open access

The bigger picture

Machine learning (ML) touches every area of science, and medicine is especially well placed to benefit. Hospital and nonhospital settings generate unprecedented amounts of data that, if used correctly, can unlock advances in diagnostics and contribute to preventive medicine. The established ML paradigm (supervised learning) requires input data (such as vitals or imaging) paired with expert annotations (such as indications of arrhythmia). New self-supervised models promise to do without annotations by relying only on clever transformations of the input data, and they achieve remarkable performance in an array of clinical tasks. This perspective gives a brief overview of the fundamental methodologies that enable these advances and discusses further challenges and opportunities.
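
To make the idea of learning from transformations of the input alone more concrete, here is a minimal sketch of one common self-supervised setup, in which a model is asked to recognize which transformation was applied to an unlabeled signal window. It is an illustration under stated assumptions, not the method of any specific study discussed in this perspective: the three transformations, the function name `make_pretext_dataset`, and the synthetic windows are all hypothetical.

```python
import numpy as np

# Hypothetical, label-free transformations of a 1D signal window.
# The index of each transformation doubles as its pseudo-label.
TRANSFORMS = [
    lambda x: x,        # 0: identity
    lambda x: -x,       # 1: sign flip
    lambda x: x[::-1],  # 2: time reversal
]


def make_pretext_dataset(windows, seed=0):
    """Turn unlabeled windows into (input, pseudo-label) pairs.

    No expert annotation is used: each pseudo-label records only
    which transformation was applied to the window.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, len(TRANSFORMS), size=len(windows))
    inputs = np.stack([TRANSFORMS[k](w) for w, k in zip(windows, labels)])
    return inputs, labels


# Synthetic stand-in for unlabeled sensor data: 8 windows of 50 samples.
windows = np.random.default_rng(1).normal(size=(8, 50))
inputs, labels = make_pretext_dataset(windows)
```

A classifier trained to predict `labels` from `inputs` has to pick up on the structure of the signal, and the representation it learns can then be reused for a clinical task where only a few expert annotations exist.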

Summary

Medicine is undergoing an unprecedented digital transformation, as massive amounts of health data are being produced, gathered, and curated, ranging from in-hospital data (e.g., from the intensive care unit [ICU]) to person-generated data (e.g., from wearables). Annotating all of these data to train deep learning models for pattern recognition is impractical. Here, we discuss some exciting recent results of applying self-supervised learning (SSL) to high-resolution health signals. These examples leverage unlabeled data to learn meaningful representations that can generalize to situations where ground truth is inadequate or simply infeasible to collect because of the high burden or cost. The most prominent bottleneck of deep learning today is access to labeled, carefully curated datasets, and self-supervision on health signals opens up new possibilities to eliminate data silos through general-purpose models that can transfer to low-resource environments and tasks.

Data science maturity

DSML 3: Development/pre-production: Data science output has been rolled out/validated across multiple domains/problems

Keywords

machine learning
health signals
transfer learning
biomedical informatics

About the authors

Dimitris Spathis recently completed his PhD in computer science at the University of Cambridge and is now interning at Microsoft Research. He has degrees in AI and computer science and has previously worked at Telefonica Research, Qustodio, and Ocado. His research enables deep neural networks to learn richer and more label-efficient representations of high-dimensional real-world data (mobile sensors, time-series, audio, or other modalities), motivated by challenges in health. He serves on the program committees of top AI conferences such as AAAI, IJCAI, and KDD, and his research projects have been featured in international media (BBC, Guardian, Forbes, Times, NPR, VentureBeat).

Ignacio Perez Pozuelo is with the University of Cambridge and the Alan Turing Institute, focusing on human-activity recognition using multimodal wearable sensors. He uses these behavioural phenotypes to further understand the impact of physical activity and sleep on health. Ignacio has worked on deriving sleep inferences from multimodal data using deep learning approaches. He has also worked on time-series forecasting of digital biomarkers using physical activity, as well as on activity classification using semi-supervised and self-supervised learning approaches for large, unlabelled datasets.

Laia Marques-Fernandez is a doctor at Cambridge University Hospitals NHS Foundation Trust (Addenbrooke’s Hospital). She has a medical degree from the University of Barcelona with a year abroad at the University of Bologna. She is currently working as a junior doctor both in the hospital (in obstetrics and gynaecology, paediatrics, and emergency medicine) and in the community (in sexual and reproductive health at iCaSH Cambridge). Her recent research studies women’s health from a population level down to the biological characteristics of disease. She envisions a future where doctors and AI work together to improve patient care and physician workload through predictive medicine.

Cecilia Mascolo is a Full Professor of mobile systems in the Department of Computer Science and Technology, University of Cambridge, UK. She is co-director of the Centre for Mobile, Wearable System and Augmented Intelligence. She is also a fellow of Jesus College Cambridge and the recipient of an ERC Advanced Research Grant. Her research interests are in mobile systems and data for health, human mobility modelling, sensor systems and networking, and mobile data analysis. More details can be found at www.cl.cam.ac.uk/users/cm542.