How to Work (and Play) with Imperfect Sleep Data

Published in

The Slice

11 min read · May 27, 2021

Do you have sleep data, such as from a consumer wearable device with a sensor? Do you feel sort of meh about it? You’re definitely not alone.

Using sleep data is troublesome: it can emphasize things that you might already feel worried about, and it doesn’t necessarily help improve them.
Getting sleep data is a whole other can of worms: it is definitely biased, but in which ways it’s biased, or how consistently, is (kind of) anyone’s guess.
…but can it still be usable? Yes! In combination with other data, like a diary; and/or as a basis for self-exploratory art practice.

Most sleep data a person can collect about themselves is far from perfect, but it can still be helpful for understanding and improving your sleep. This article summarizes research articles about sleep tracking, as well as my experiences of different phases of collecting and analyzing my own sleep data. It is inspired, in part, by a talk I attended a few months ago by Sarah F. Schoch, PhD, psychologist and neuroscientist interested in sleep in relation to development, memory, and emotions. (You can check out her recent work on studying sleep in infants in this article and video, as well as her research on dreams.)

“You improve your sleep by knowing more about it, and not working against your own rhythms” -Sarah F. Schoch, PhD

The approach of building self-knowledge in order to work with your rhythms, not against them, is a powerful and positive antidote to the undercurrent of using body knowledge to exert control over ourselves. Sleep in particular is really hard to influence: Studies show that “established approaches like goal setting do not work well with sleep, because goals like falling asleep quicker or not waking up at night are typically not things a person can control” [1]. One study participant explained why the sleep goal was not motivating: “Of course I want to get 8 hours sleep every day. But how to control that? If I try to get 8 hours sleep, I have to go to bed early, and that’s just not feasible,” due to existing “work and family commitments” [1]. There is an assumption that the design of this tool should bring about greater control; meanwhile, these design attempts may instead highlight uncontrollable aspects of daily life.

Figure showing 30 years of actigraphy data. Notice, e.g., the impact of time zone shifts. From: Borbély et al. Three decades of continuous wrist‐activity recording: analysis of sleep duration. Journal of sleep research, 26(2), 188–194. © 2017 European Sleep Research Society

Sleep data can take a frustratingly long time to collect. Though Fitbit, one commonly-available wearable device that includes sleep tracking, has widespread and growing use, “5 percent [of all users] stop using their device within a week of buying it, and 12.5 percent stop within a month.” One month gives you 30 days, which inevitably looks like an inscrutable squiggly line.

Even with much longer datasets, it’s still more or less an inscrutable squiggly line in the absence of additional sources and interpretations to contextualize it. The figure above encompasses 30 years of wrist-worn tracking, analyzed and reported by the wearer, who is, unsurprisingly, a sleep researcher [2]. In Borbély et al.’s article [2], the analysis focuses on changes in sleep duration after retirement, and the author (and wearer) comments on using alarms: “The subject of the present study has not perceived the reduced sleep time on weekdays [pre-retirement] as a real problem … He regards the pre- and post-retirement phases essentially as different modes of living”.

Assuming you are interested in collecting somewhere between 30 days and 30 years of data, then, what are some helpful additional data sources to keep in mind as you approach your sleep tracking project?

No Alarms and No Surprises

After undertaking various body data projects, I believe collecting and analyzing data about oneself is best done in portions no bigger than they need to be. So when I decided to systematically consider my sleep, I first collected handwritten notes for two months. As you can see below, my initial neat, printed, detailed table quickly dissolved into hurried scribbles. It was messy, but interesting, so I decided to keep going and get a wrist-worn sleep tracking device. Tracking sleep using a device is certainly more viable long-term than a diary; however, there are biases that are important to keep in mind.

Some measures, like “fragmentation” (number of times woken up) were abandoned soon after starting. Other things I tracked included: feeling “full/hungry” before bed; feeling “rested” on waking; whether I had an alarm; my blood pressure / heart rate before bed; amount of coffee I had that day; ambient light/darkness.

Sleep tracking is a now-commonplace encounter with the invisible and uncontrollable body through wearable sensors. Most available devices use actigraphy, described below. Some devices also incorporate heart rate variability data, but, ultimately, every device uses proprietary algorithms to work with data from proprietary hardware. This means that any research that compares how well these devices work may or may not stay applicable: algorithms can be improved over time, after all. This might be a bit discouraging. Below are my experiences of using my own diary, and existing research, to understand the data that my wearable device produces, as an example of how one can cope with the inscrutability of these data sources.

Actigraphy works by collecting accelerometer data via “a portable wrist-worn sleep monitoring device” and using a classification algorithm to decide whether the wearer is sleeping; these are “used in clinical sleep medicine for assessing certain sleep disorders, such as circadian rhythm sleep-wake disorders, and for characterizing day-to-day patterns or sleep disturbances in insomnia” [3]. Actigraphy has a relatively high sensitivity and accuracy, but low specificity, compared to the gold standard approach, polysomnography [4]. Subjectively, the accuracy of a device feels uncertain [1]: in the words of one interview respondent, “it never takes me zero minute to fall asleep. I know that at the time that [my wrist-worn wearable] said I was asleep, I was actually reading.”
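To make these three terms concrete, here is a minimal sketch of how they can be computed when a device and a reference recording are scored epoch by epoch. It assumes the usual convention in this kind of validation, where sensitivity is the share of true sleep epochs the device also calls sleep, and specificity is the share of true wake epochs it also calls wake. The function name `epoch_agreement` and the ten-epoch night are made up for illustration; they are not from any of the cited studies.

```python
def epoch_agreement(reference, device):
    """Compare two equal-length sequences of per-epoch labels (True = asleep).

    `reference` stands in for the gold standard (e.g. polysomnography),
    `device` for the wrist-worn tracker. Both are hypothetical inputs.
    """
    if len(reference) != len(device):
        raise ValueError("both recordings must cover the same epochs")
    pairs = list(zip(reference, device))
    # Epochs where both agree on sleep, and where both agree on wake.
    true_sleep = sum(1 for r, d in pairs if r and d)
    true_wake = sum(1 for r, d in pairs if not r and not d)
    total_sleep = sum(1 for r in reference if r)
    total_wake = len(reference) - total_sleep
    nan = float("nan")
    return {
        "sensitivity": true_sleep / total_sleep if total_sleep else nan,
        "specificity": true_wake / total_wake if total_wake else nan,
        "accuracy": (true_sleep + true_wake) / len(pairs) if pairs else nan,
    }


# Made-up night of 10 epochs: 2 awake (say, lying still and reading), 8 asleep.
reference = [False, False, True, True, True, True, True, True, True, True]
# The device misreads one of the two quiet wake epochs as sleep.
device = [False, True, True, True, True, True, True, True, True, True]

scores = epoch_agreement(reference, device)
print(scores)  # sensitivity 1.0, specificity 0.5, accuracy 0.9
```

In this toy night the device catches every sleep epoch but only half of the wake epochs, and accuracy still looks high because most of the night is sleep. That is the same shape as the reading-in-bed complaint quoted above.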

So, this is observation number one: that the specific boundary between sleep and waking may not be correctly recorded. Aware of this pitfall, I kept an informal diary for the first few weeks of wearing my device, and found that my data was much more consistent if I only wore the device during sleep.

After a few months of using the tracker, I compared the sleep duration it reported to what I had found in my sleep diary. During my 55-day diary, I was using an alarm quite often to cut sleep potentially shorter than it would be otherwise; but during the tracked data collection, I had nearly completely stopped using an alarm. The tracker claimed I slept 6.4 hours a night on average; the diary said 7.5 hours! Because of the no-alarms policy, I know my sleep was definitely not worse during the tracker period than during the diary period, which means the tracker was consistently reporting over an hour too little for duration. When I did spot-checks of the tracked sleep data on particularly good or bad nights, I noticed the tracker was overly sensitive to movement throughout the night. Both datasets had similar variance otherwise, so this was quite encouraging: it meant that there was a bias in the tracker, compared to my hand-crafted gold standard, but at least it was consistent!
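The comparison described here can be sketched in a few lines. This is only an illustration under stated assumptions: the two lists are invented nights (chosen so their averages land on 7.5 and 6.4 hours), not my actual data, and `summarize` is a hypothetical helper name. Because the diary and the tracker cover different periods, the sketch compares summary statistics rather than matching nights one to one.

```python
from statistics import mean, stdev


def summarize(hours):
    """Return the mean and sample standard deviation of nightly sleep hours."""
    return {"mean": mean(hours), "stdev": stdev(hours)}


# Invented example nights, in hours of sleep (not real measurements).
diary_hours = [7.0, 8.0, 7.5, 6.5, 8.5]
tracker_hours = [6.0, 7.0, 6.5, 5.5, 7.0]

diary = summarize(diary_hours)
tracker = summarize(tracker_hours)

# A positive offset means the tracker reports less sleep than the diary.
offset = diary["mean"] - tracker["mean"]

print(f"diary:   {diary['mean']:.1f} h (sd {diary['stdev']:.2f})")
print(f"tracker: {tracker['mean']:.1f} h (sd {tracker['stdev']:.2f})")
print(f"offset:  {offset:.1f} h")
```

If the spreads are similar and only the means differ, the tracker is plausibly off by a roughly constant amount, which is much easier to live with than an error that changes from night to night.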

I use examples from two datasets I collected in 2020–2021: a diary and, later, a tracker. A tracker is imperfect, and so is a diary: over the 55 days I kept the diary, I managed to record only some of the measures, and only some of the time. Different datasets included different measures (the diary included heart rate and blood pressure before sleep), and both included Duration and Minutes after Midnight (time when I fell asleep).

There are other specific biases in measures that may come up: “[Two specific devices] have shown a tendency to overestimate [sleep onset latency, SOL: the amount of time it takes to fall asleep]. Patients with insomnia already tend to overestimate SOL, and data from these devices could perpetuate their cognitive errors” [3].

So, I dug into my worries using my data.

In my data, the closest measure I had to SOL was “Minutes after Midnight” (so 11:56 pm would be -4, and 12:43 am would be 43), which records when I fell asleep rather than how long it took, and I compared this as well. In the diary, my average time to bed was 15 minutes after midnight; according to the tracker, it was 56 minutes. I have my own biases about going to sleep “too late”, so my diary likely under-reported this measure, and the “true” value for me is likely somewhere in between.
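For anyone who wants to compute “Minutes after Midnight” from clock times, here is a minimal sketch. I made my own charts in a spreadsheet, so this is an equivalent illustration and not the method I used. The function name and the noon cutoff are assumptions: any time from noon onward is treated as belonging to the evening before midnight and comes out negative.

```python
from datetime import time


def minutes_after_midnight(t, cutoff_hour=12):
    """Convert a clock time to signed minutes relative to midnight.

    Times at or after `cutoff_hour` (an assumed cutoff, noon by default)
    count as "before midnight" and are returned as negative numbers.
    """
    minutes = t.hour * 60 + t.minute
    if t.hour >= cutoff_hour:
        minutes -= 24 * 60
    return minutes


print(minutes_after_midnight(time(23, 56)))  # -4
print(minutes_after_midnight(time(0, 43)))   # 43
```

Keeping the value signed means nights on either side of midnight can be averaged directly, which would not work with raw clock times.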

Returning to the advice to know more about sleep and not work against your own rhythms, I dug into my worries about “going to sleep too late” using my data. The results were not really surprising so much as something that I hadn’t really faced with clarity and acceptance before.

Research on sleep tracking points out that users of these devices struggle to interpret “whether their readings are normal, exceptional, or worrying” [5]; “I don’t know whether that’s normal, because I don’t know what’s normal for other people” [1]. In my own experience, I had a very normative idea of bed-time: I thought I should be asleep at midnight, or I would, presumably, turn into a pumpkin. Looking at my sleep tracking data actually helped me normalize behavior that I had previously pathologized.

A breakdown of how late I got to bed, by day of week, separating very late (>1:30am) and very early (<12:30am)

That initial 55-day period of messy data collection was, in fact, very successful: the act of taking notes was itself a kind of behavioral intervention. Within the first two weeks I realized that I didn’t need more data about eating too close to bedtime: I just needed to eat less close to bedtime. The experience also helped me even consider getting rid of an alarm clock, despite a lifetime of being alarmed every morning. On my original tracker table, I had fields for whether I felt “full” before sleep, and whether I felt “well-rested” upon waking, and these questions were difficult to face, because the answer was always the same. I felt uncomfortably full before bed; and after waking up with my alarm, I felt like my skeleton wanted to crawl out of my body (“rested: 0/5”). I tried over a weekend to shift dinner much earlier, as well as not setting an alarm, and, well, turns out some things are not complicated, they are just difficult. It was difficult to re-organize my life enough to allow not relying on alarm clocks or on very late meals (I am not perfect at either of these, but overall much better than at the beginning of this exploratory data-body project); but it was not a complicated, obscure data insight. A diary practice for just a few weeks made it very obvious.

These behavior changes meant that I was able to start physically removing my alarming electronic devices from my bedroom, which improved my life quality, not just my sleep quality. On the other hand, diving more in-depth into the data helped me to observe patterns that I could not observe through the diary. In particular, data analysis helped me understand my patterns of going to bed, and stop working against my own rhythms by trying to shift my sleep earlier. In her talk, Sarah Schoch suggested calculating the sleep midpoint to help find one’s chronotype, and this further supported the observation that I am naturally more of a “night owl” than I ever realized. As you can see below, during winter break, without the influences of a work schedule, my midpoint shifted almost an hour later.
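The sleep midpoint is simply the time halfway between falling asleep and waking up. Here is a minimal sketch of that calculation, assuming full date-times are available for both ends of the night; the function name and the example night are made up for illustration.

```python
from datetime import datetime


def sleep_midpoint(fell_asleep, woke_up):
    """Return the moment halfway between sleep onset and waking."""
    if woke_up <= fell_asleep:
        raise ValueError("wake time must come after sleep onset")
    return fell_asleep + (woke_up - fell_asleep) / 2


# Invented example night: asleep at 00:30, awake at 08:00.
midpoint = sleep_midpoint(
    datetime(2021, 1, 2, 0, 30),
    datetime(2021, 1, 2, 8, 0),
)
print(midpoint.strftime("%H:%M"))  # 04:15
```

Using full date-times rather than bare clock times means nights that start before midnight work without special handling.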

Data from my wrist-worn tracker, over about 4 months. I created these graphics in a spreadsheet, using my own calendar and memory to correct major errors. The x-axis represents days of wrist-worn sleep data collection, from 25 Nov 2020 until 30 March 2021. Gaps are missing data.

Just a Piece of the Puzzle

In talking about her motivation to study sleep, Sarah Schoch talked about the relationship between sleep and the quality of life. When we track sleep, it is easy to get bogged down in one particular dataset, but the reality is that most of the interesting observations are actually about how sleep relates to something. Maybe something (stress) affects sleep, or maybe sleep affects something (mood, creativity) — the point is, though, it is far more interesting when it is placed into a personally-meaningful narrative context.

For me, the bigger picture that animates my sleep study includes (1) dreams and (2) hormones. Like everyone else, I have been having completely ridiculous dreams for over a year, and in my 55-day diary, I ended up more interested in writing down my dreams than almost any other measure I initially wanted to collect. For the past 1.5 months, I’ve been keeping a journal of dreams. They are all dramatic, crowded, and clearly influenced by all the things I’ve binge-watched in recent months. More interestingly, the dreams have well-developed and recurring geographies, which I am recording to explore expanding the time-series sleep data into a spatial dimension.

The second subject, hormones, places the timeline of my sleep data collection in perspective. As shown below, this journey started in early 2019, so the recent sleep data only includes the most recent few months. Eventually, I would like to have more substantial longitudinal data to explore and visually represent gender-affirming hormone therapy through the lens of sleep.

“Desirable Range” (2019-ongoing, work in progress)

Shown above, “Desirable Range” so far includes sketches of bodily experience of gender-affirming hormone therapy. Labels and y-axis purposely omitted; the dots show lab test dates, and the orange band shows the “desirable range” for the specific hormone level being measured. In spring last year, my lab work was literally off the charts (not in the good way — it exceeded the maximum measure of the instrument). I knew something was wrong because it manifested in my physical well-being and energy level; this, in part, inspired the initial 55-day diary, shortly after getting the results and adjusting the hormone treatment schedule.

Sensors render an unseen, sleeping body more visible — through data and charts. These data and charts are not neutral, because the process of data collection and analysis in consumer devices is itself unobservable. Observing sleep, which hinges on a variety of factors and cannot be quickly or effortlessly influenced, can highlight the false expectation that visibility leads to manageability. However, sleep data from a black-box tracker can be effectively compared to data from a diary; and it can support the exploration of a personally-meaningful narrative. Sleep tracker data can be one of the tools that helps you work with your body, not against it.

References

[1] Liang, Zilu, and Bernd Ploderer. 2016. Sleep tracking in the real world: a qualitative study into barriers for improving sleep. In Proceedings of the 28th Australian Conference on Computer-Human Interaction (pp. 537–541).

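[2] Borbély et al. 2017. Three decades of continuous wrist-activity recording: analysis of sleep duration. Journal of sleep research, 26(2), 188–194.

[3] Kolla, Bhanu Prakash, Subir Mansukhani, and Meghna P. Mansukhani. 2016. Consumer sleep tracking devices: a review of mechanisms, validity and utility. Expert review of medical devices, 13(5), 497–506.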

[4] Marino, Miguel, Yi Li, Michael N. Rueschman, John W. Winkelman, J. M. Ellenbogen, Jo M. Solet, Hilary Dulin, Lisa F. Berkman, and Orfeu M. Buxton. 2013. Measuring sleep: accuracy, sensitivity, and specificity of wrist actigraphy compared to polysomnography. Sleep, 36(11), 1747–1755.

[5] Knowles, Bran, Alison Smith-Renner, Forough Poursabzi-Sangdeh, Di Lu, and Halimat Alabi. 2018. Uncertainty in current and future health wearables. Communications of the ACM, 61(12), 62–67.

Written by Kit Kuksenok