LSM-2: Learning from incomplete wearable sensor data
July 22, 2025
Girish Narayanswamy and Maxwell A. Xu, Student Researchers, Google Research
We introduce LSM-2 with Adaptive and Inherited Masking (AIM), a novel self-supervised learning approach that learns directly from incomplete wearable sensor data, achieving strong performance across classification, regression, and generative tasks without explicit imputation.
Wearable devices have revolutionized health monitoring by providing continuous, multimodal physiological and behavioral data — from heart signals and sleep patterns to activity levels and stress indicators. Due to advances in sensor technology, it is increasingly feasible to capture a large volume of data, but the cost of labeling remains high, requiring real-time user annotations or laborious clinical studies. Self-supervised learning (SSL) addresses this limitation by directly using the unlabeled data to learn underlying structures, such as subtle physiological relationships. When applied at scale, SSL can enable the creation of foundation models that produce rich and generalizable representations useful for a wide variety of downstream health tasks.
However, applying SSL to the wearable domain exposes a critical limitation: state-of-the-art SSL methods assume complete, uninterrupted data. Real-world wearable sensor streams rarely meet this assumption, as gaps inevitably occur due to device removal, charging, intermittent loosening, motion artifacts, battery-saving modes, or environmental noise; we quantify these gaps as "missingness". In fact, we find that none of our 1.6 million day-long windows had 0% missingness. Historically, the challenge of fragmented data has forced researchers to rely on either imputation methods to fill missing segments or aggressive filtering to remove instances with incomplete data. Neither poses an optimal solution: the former may introduce unintended biases, while the latter discards valuable data.
Missing data is ubiquitous in wearable sensor recordings. Common modes of missingness are highlighted above in a day-long sample of multimodal wearable sensor data. We note that no samples amongst our 1.6 million day-long windows have 0% missingness.
In "LSM-2: Learning from Incomplete Wearable Sensor Data", we present Adaptive and Inherited Masking (AIM), a novel SSL training framework that learns directly from incomplete recordings, treating missingness as a natural artifact of real-world data rather than as erroneous measurements that must be filled in. Leveraging AIM, we develop a Large Sensor Model (LSM-2) that improves upon our previous foundation model for wearable sensor data (LSM-1, presented at ICLR '25). We demonstrate that LSM-2 achieves strong performance even when sensors fail or temporal windows are removed, exhibiting significantly less degradation than models trained on imputed data.
Taking AIM with adaptive inherited masking
At the heart of AIM's innovation is its unique approach to handling the inevitable gaps in real-world sensor data. Unlike traditional SSL methods that either discard incomplete data or attempt to fill in missing values, AIM embraces these gaps as natural features of wearable data. As an extension of the masked autoencoder (MAE) pre-training framework, AIM learns the underlying structure of sensor data by reconstructing masked input samples.
However, while traditional MAE methods rely on a fixed masking ratio to enable efficient dropout of masked tokens (i.e., a fixed number of masked tokens are not passed through the encoder, reducing computational complexity), fragmentation in sensor data is unpredictable, resulting in a variable number of masked tokens. AIM addresses this fundamental challenge of wearable data by pairing token dropout with attention masking. During pre-training, the set of masked tokens consists of those inherited from the data's naturally occurring gaps plus those deliberately masked for the reconstruction training objective.
AIM first drops out a fixed number of masked tokens, improving pre-training efficiency by reducing the sequence length processed by the encoder. It then adaptively handles any remaining masked tokens, whether naturally missing or part of the reconstruction task, via attention masking in the encoder's transformer blocks. During discriminative-task fine-tuning and evaluation, where masked tokens consist solely of naturally occurring data gaps, AIM employs attention masking for all masked tokens. Through this dual masking approach, and by treating naturally occurring and artificially masked tokens as equivalent, AIM teaches the model to handle the variable fragmentation inherent to wearable sensors.
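Conceptually, the dual masking step can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the function name, shapes, and the fixed dropout count `n_drop` are all illustrative assumptions.

```python
import numpy as np

def aim_masking(tokens, inherited_mask, artificial_ratio=0.3, n_drop=4, seed=0):
    """Sketch of AIM's dual masking (illustrative, not the paper's code).

    tokens: (N, D) array of token embeddings.
    inherited_mask: (N,) boolean, True where data is naturally missing.
    Returns the kept tokens, their indices, and a boolean attention mask
    over the kept sequence (True = token participates in attention).
    """
    rng = np.random.default_rng(seed)
    n = tokens.shape[0]

    # Artificially mask a random subset of the *observed* tokens for the
    # reconstruction objective; inherited (naturally missing) tokens are
    # already masked.
    observed = np.flatnonzero(~inherited_mask)
    n_art = int(round(artificial_ratio * n))
    art_idx = rng.choice(observed, size=min(n_art, observed.size), replace=False)
    masked = inherited_mask.copy()
    masked[art_idx] = True

    # 1) Token dropout: remove a *fixed* number of masked tokens so the
    #    encoder always processes the same, shorter sequence length.
    masked_idx = np.flatnonzero(masked)
    drop_idx = rng.choice(masked_idx, size=min(n_drop, masked_idx.size), replace=False)
    keep = np.setdiff1d(np.arange(n), drop_idx)

    # 2) Attention masking: the variable number of remaining masked tokens
    #    stay in the sequence but are excluded from attention.
    attn_mask = ~masked[keep]
    return tokens[keep], keep, attn_mask
```

At fine-tuning and evaluation time, the same idea applies with no artificial mask and no dropout: all naturally missing tokens are handled purely through the attention mask.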
AIM pre-training (A) and evaluation (B) for LSM-2. During pre-training, AIM uses the artificial mask to learn reconstruction and the inherited mask to model real-world missingness. Then, during evaluation, we can use the missingness-aware embedding to predict health targets, such as hypertension, directly from the inherently fragmented sensor data.
Training and evaluation
We leverage a dataset with 40 million hours of wearable data sampled from over 60,000 participants during the period from March to May 2024. The dataset was thoroughly anonymized or de-identified to ensure that participant information was removed and privacy was maintained. Subjects wore a variety of Fitbit and Google Pixel smartwatches and trackers and consented for their data to be used for research and development of new health and wellness products and services. The subjects were asked to self-report sex, age, and weight.
To pre-train LSM-2, we employ the AIM SSL technique introduced above. AIM implements a masked reconstruction training objective, learning to model data that is naturally missing and to impute data that is artificially masked. This unified framework allows LSM-2 to learn the underlying structure, including missingness, inherent in wearable sensor data.
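A masked reconstruction objective of this kind can be scored only where ground truth exists: naturally missing (inherited) tokens have no target values to reconstruct. A minimal sketch of such a loss, under that assumption (the exact objective details are in the paper):

```python
import numpy as np

def aim_reconstruction_loss(pred, target, artificial_mask, inherited_mask):
    """Sketch of an MAE-style reconstruction loss for AIM pre-training.

    The loss is computed only on artificially masked tokens, since
    naturally missing (inherited) tokens have no ground truth.
    pred, target: (N, D) token values; masks: (N,) boolean arrays.
    """
    # Score only tokens that were deliberately masked AND actually observed.
    score_mask = artificial_mask & ~inherited_mask
    diff = (pred - target)[score_mask]
    return float(np.mean(diff ** 2))
```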
We curate a set of downstream tasks to evaluate the pre-trained model, using metadata collected alongside the sensor signals for research and development purposes. These include user-annotated activities from a set of 20 categories (such as running, skiing, kayaking, and playing golf) and self-reported diagnoses of hypertension and anxiety. These data were split into fine-tuning and evaluation sets such that each individual's data appeared in only one of the two. Data from individuals used in the pre-training stage was likewise excluded from both the fine-tuning and evaluation sets.
The generative capabilities of LSM-2 are evaluated through the tasks of random imputation, temporal interpolation, temporal extrapolation (forecasting), and sensor imputation, described in our LSM-1 work.
The utility of the LSM-2 embeddings is evaluated via linear probe on a number of discriminative tasks. Specifically, we gauge the applicability of the LSM-2 embeddings to binary hypertension classification, binary anxiety classification, and 20-class activity recognition. We also evaluate LSM-2's ability to model physiology via age and BMI regression tasks.
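A linear probe fits only a linear head on top of frozen embeddings, so its accuracy directly measures embedding quality. For the regression tasks (age, BMI), this might look like the following closed-form ridge sketch; the function and its hyperparameters are illustrative, not the evaluation code used in the paper.

```python
import numpy as np

def linear_probe(train_emb, train_y, test_emb, l2=1e-2):
    """Sketch of a ridge-regression linear probe on frozen embeddings.

    train_emb: (N, D) frozen embeddings; train_y: (N,) continuous targets
    (e.g., age or BMI). The encoder is never updated; only this linear
    head is fit. Returns predictions for test_emb.
    """
    # Append a bias column, then solve the regularized normal equations:
    # w = (X^T X + lambda * I)^(-1) X^T y
    X = np.hstack([train_emb, np.ones((train_emb.shape[0], 1))])
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ train_y)
    Xt = np.hstack([test_emb, np.ones((test_emb.shape[0], 1))])
    return Xt @ w
```

The classification probes follow the same recipe with a logistic (or softmax) head in place of the ridge regressor.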
Key results
The AIM-based LSM-2 model demonstrates remarkable versatility, outperforming its predecessor, LSM-1, across three key areas: classifying health conditions and activities (like hypertension, anxiety, and 20-class activity recognition), reconstructing missing data (e.g., recovery of missing sensor signals), and predicting continuous health metrics (like BMI with improved correlation). Additional comparisons to supervised and pre-trained baselines may be found in our paper.
LSM-2 models real-world missingness without imputation, allowing it to achieve lower reconstruction error (left) and higher classification scores (right) as compared to LSM-1.
LSM-2 excels in realistic scenarios where sensors fail or data is incomplete. The figure below simulates situations where whole sensor feeds or data for entire portions of the day may be missing. This mirrors the reality that different wearables may host different sensor load-outs, or that an individual may only wear their device for portions of the day. Here we find that the AIM-based LSM-2 proves more robust to these ablations as compared to LSM-1.
LSM-2 is more robust to missing data than LSM-1, degrading less from its original performance (dotted line) than its predecessor when whole sensor feeds or periods of the day are ablated.
Finally, LSM-2 exhibits improved scaling across users, data volume, compute, and model size as compared to LSM-1. While its predecessor shows signs of plateauing, LSM-2 continues to improve with more data and has yet to saturate.
LSM-2 exhibits improved scaling over LSM-1 across subjects, data, compute, and model size.
Conclusion
The LSM-2 foundation model, pre-trained with AIM, represents progress towards more useful and usable wearable health technology. Fundamentally, AIM teaches LSM-2 to understand and leverage the natural gaps in real-world sensor streams, deriving reliable insights from imperfect data. This means wearable AI can embrace the messy reality of sensor data, preserving data integrity while utilizing all available information.
Acknowledgements
The research described here is joint work across Google Research, Google Health, Google DeepMind, and partnering teams. The following researchers contributed to this work: Maxwell A. Xu, Girish Narayanswamy, Kumar Ayush, Dimitris Spathis, Shun Liao, Shyam Tailor, Ahmed Metwally, A. Ali Heydari, Yuwei Zhang, Jake Garrison, Samy Abdel-Ghaffar, Xuhai Xu, Ken Gu, Jacob Sunshine, Ming-Zher Poh, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Mark Malhotra, Shwetak Patel, Yuzhe Yang, James M. Rehg, Xin Liu, and Daniel McDuff. We would also like to thank participants who contributed their data for this study.