# Building a terrible 'IoT' temperature logger

I had approximately the following exchange with a co-worker a few days ago:

Them: “Hey, do you have a spare Raspberry Pi lying around?”
Me: [thinks] “..yes, actually.”
T: “Do you want to build a temperature logger with Prometheus and a DS18B20+?
M: “Uh, okay?”

It later turned out that that co-worker had been enlisted by yet another individual to provide a temperature logger for their project of brewing cider, to monitor the temperature during fermentation. Since I had all the hardware at hand (to wit, a Raspberry Pi 2 that I wasn’t using for anything and temperature sensors provided by the above co-worker), I threw something together. It also turned out that the deadline was quite short (brewing began just two days after this initial exchange), but I made it work in time.

## Interfacing the thermometer

As noted above, the core of this temperature logger is a DS18B20 temperature sensor. Per the manufacturer:

The DS18B20 digital thermometer provides 9-bit to 12-bit Celsius temperature measurements … communicates over a 1-Wire bus that by definition requires only one data line (and ground) for communication with a central microprocessor. … Each DS18B20 has a unique 64-bit serial code, which allows multiple DS18B20s to function on the same 1-Wire bus. Thus, it is simple to use one microprocessor to control many DS18B20s distributed over a large area.

Indeed, this is a very easy device to interface with. But even given the svelte hardware needs (power, data and ground signals), writing some code that speaks 1-Wire is not necessarily something I’m interested in. Fortunately, these sensors are very commonly used with the Raspberry Pi, as illustrated by an Adafruit tutorial published in 2013.

The Linux kernel provided for the Pi in its default Raspbian (Debian-derived) distribution supports bit-banging 1-Wire over its GPIOs by default, requiring only a device tree overlay to activate it. This is as simple as adding a line to /boot/config.txt to make the machine’s boot loader instruct the kernel to apply a change to the hardware configuration at boot time:

dtoverlay=w1-gpio

With that configuration, one simply needs to wire the sensor up. The w1-gpio device tree configuration by default uses GPIO 4 on the Pi as the data line, then power and grounds need to be connected and a pull-up resistor added to the data line (since 1-Wire is an open-drain bus).

The w1-therm kernel module already understands how to interface with these sensors- meaning I don’t need to write any code to talk to the temperature sensor: Linux can do it all for me! For instance, reading the temperature out in an interactive shell to test, after booting with the 1-Wire overlay enabled:

$modprobe w1-gpio w1-therm$ cd /sys/bus/w1/devices
$ls 28-000004b926f1 w1_bus_master1$ cat 28-000004b926f1/w1_slave
9b 01 4b 46 7f ff 05 10 6e : crc=6e YES
9b 01 4b 46 7f ff 05 10 6e t=25687

The kernel periodically scans the 1-Wire bus for slaves and creates a directory for each device it detects. In this instance, there is one slave on the bus (my temperature sensor) and it has serial number 000004b926f1. Reading its w1_slave file (provided by the w1-therm driver) returns the bytes that were read on both lines, a summary of transmission integrity derived from the message checksum on the first line, and t=x on the second line, where x is the measured temperature in milli-degrees Celsius. Thus, the measured temperature above was 25.687 degrees.

While it’s fairly easy to locate and read these files in sysfs from a program, I found a Python library that further simplifies the process: w1thermsensor provides a simple API for detecting and reading 1-wire temperature sensors, which I used when implementing the bridge for capturing temperature readings (detailed more later).

### 1-Wire details

I wanted to verify for myself how the 1-wire interfacing worked so here are the details of what I’ve discovered, presented because they may be interesting or helpful to some readers. Most documentation of how to perform a given task with a Raspberry Pi is limited to comments like “just add this line to some file and do the other thing!” with no discussion of the mechanics involved, which I find very unsatisfying.

The line added to /boot/config.txt tells the Rapberry Pi’s boot loader (a version of Das U-Boot) to pass the w1-gpio.dtbo device tree overlay description to the kernel. The details of what’s in that overlay can be found in the kernel source tree at arch/arm/boot/dts/overlays/w1-gpio-overlay.dts.

This in turn pulls in the w1-gpio kernel module, which is part of the upstream kernel distribution- it’s very simple, setting or reading the value of a GPIO port as requested by the Linux 1-wire subsystem.

Confusingly, if we examine the dts file describing the device tree overlay, it can take a pullup option that controls a rpi,parasitic-power parameter. The documentation says this “enable(s) the parasitic power (2-wire, power-on-data) feature”, which is confusing. 1-Wire is inherently capable of supplying parasitic power to slaves with modest power requirements, with the slaves charging capacitors off the data line when it’s idle (and being held high, since it’s an open-collector bus). So, saying an option will enable parasitic power is confusing at best and probably flat wrong.

Further muddying the waters, there also exists a w1-gpio-pullup overlay that includes a second GPIO to drive an external pullup to provide more power, which I believe allows implementation of the strong pull-up described in Figure 6 of the DS18B20 datasheet (required because the device’s power draw while reading the temperature exceeds the capacity of a typical parasitic power setup):

By also connecting the pullup GPIO to the data line (or putting a FET in there like the datasheet suggests), the w1-gpio driver will set the pullup line to logic high for a requested time, then return it to Hi-Z where it will idle. But for my needs (cobbling something together quickly), it’s much easier to not even bother with parasite power.

In conclusion for this section: I don’t know what the pullup option for the 1-Wire GPIO overlay actually does, because enabling it and removing the external pull-up resistor from my setup causes the bus to stop working. The documentation is confusingly imprecise, so I gave up on further investigation since I already had a configuration that worked.

## Prometheus scraping

To capture store time-series data representing the temperature, per the co-worker’s original suggestion I opted to use Prometheus. While it’s designed for monitoring the state of computer systems, it’s plenty capable of storing temperature data as well. Given I’ve used Prometheus before, it seemed like a fine option for this application though on later consideration I think a more robust (and effortful) system could be build with different technology choices (explored later in this post).

The Raspberry Pi with temperature sensor in my application is expected to stay within range of a WiFi network with internet connectivity, but this network does not permit any incoming connections, nor does it permit connections between wireless clients. Given I wanted to make the temperature data available to anybody interested in the progress of brewing, there needs to be some bridge to the outside world- thus Prometheus should run on a different machine from the Pi.

The easy solution I chose was to bring up a minimum-size virtual machine on Google Cloud running Debian, then install Prometheus and InfluxDB from the Debian repositories:

$influx Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring. Connected to http://localhost:8086 version 1.0.2 InfluxDB shell version: 1.0.2 > CREATE DATABASE prometheus; Examining the Prometheus logs again, now it was failing and complaining that the specified retention policy didn’t exist. Noting that the Influx documentation for the CREATE DATABASE command mentioned that the autogen retention policy will be used if no other is specified, setting the retention-policy flag to autogen and restarting Prometheus made data start appearing, which I verified by waiting a little while and making a query (guessing a little bit about how I would query a particular metric): > USE prometheus; > SELECT * FROM w1therm_temperature LIMIT 10; name: w1therm_temperature ------------------------- time id instance job value 1532423583303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423598303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423613303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423628303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423643303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423658303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423673303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423688303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423703303000000 000004b926f1 localhost:9000 w1therm 297.9 1532423718303000000 000004b926f1 localhost:9000 w1therm 297.9 ## Results A sample graph of the temperature over two days: The fermentation temperature is quite stable, with daily variation of less than one degree in either direction from the baseline. ## Refinements I later improved the temperature server to handle SIGHUP as a trigger to scan for sensors again, which is a slight improvement over restarting it, but not very important because the server is already so simple (and fast to restart). On reflection, using Prometheus and scraping temperatures is a very strange way to go about solving the problem of logging the temperature (though it has the advantage of using only tools I was already familiar with so it was easy to do quickly). Pushing temperature measurements from the Pi via MQTT would be a much more sensible solution, since that’s a protocol designed specifically for small sensors to report their states. Indeed, there is no shortage of published projects that do exactly that more efficiently than my Raspberry Pi, most of them using ESP8266 microcontrollers which are much lower-power and can still connect to Wi-Fi networks. ### Rambling about IoT security Getting sensor readings through an MQTT broker and storing them to be able to graph them is not quite as trivial as scraping them with Prometheus, but I suspect there does exist a software package that does most of the work already. If not, I expect a quick and dirty one could be implemented with relative ease. On the other hand, running a device like that which is internet-connected but is unlikely to ever receive anything remotely looking like a security update seems ill-advised if it’s meant to run for anything but a short amount of time. In that case having the sensor be part of a Zigbee network instead, which does not permit direct internet connectivity and thus avoids the fraught terrain of needing to protect both the device itself from attack and the data transmitted by the device from unauthorized use (eavesdropping) by taking ownership of that problem away from the sensor. It remains possible to forward messages out to an MQTT broker on the greater internet using some kind of bridge (indeed, this is the system used by many consumer “smart device” platforms, like Philips’ Hue though I don’t think they use MQTT), where individual devices connect only to the Zigbee network, and a more capable bridge is responsible for internet connectivity. The problem of keeping the bridge secure remains, but is appreciably simpler than needing to maintain the security of each individual device in what may be a heterogeneous network. It’s even possible to get inexpensive off-the-shelf temperature and humidity sensors that connect to Zigbee networks like some sold by Xiaomi, offering much better finish than a prototype-quality one I might be able to build myself, very good battery life, and still capable of operating in a heterogenous Zigbee network with arbitrary other devices (though you wouldn’t know it from the manufacturer’s documentation, since they want consumers to commit to their “platform” exclusively)! So while my solution is okay in that it works fine with hardware I already had on hand, a much more robust solution is readily available with off-the-shelf hardware and only a little bit of software to glue it together. If I needed to do this again and wanted a solution that doesn’t require my expertise to maintain it, I’d reach for those instead. 1. Hostname changed to an obviously fake one for anonymization purposes. [return] # Considering my backup systems With the recent news that Crashplan were doing away with their “Home” offering, I had reason to reconsider my choice of online backup backup provider. Since I haven’t written anything here lately and the results of my exploration (plus description of everything else I do to ensure data longevity) might be of interest to others looking to set up backup systems for their own data, a version of my notes from that process follows. ## The status quo I run a Linux-based home server for all of my long-term storage, currently 15 terabytes of raw storage with btrfs RAID on top. The choice of btrfs and RAID allows me some degree of robustness against local disk failures and accidental damage to data. If a disk fails I can replace it without losing data, and using btrfs’ RAID support it’s possible to use heterogenous disks, meaning when I need more capacity it’s possible to remove one disk (putting the volume into a degraded state) and add a new (larger) one and rebalance onto the new disk. btrfs’ ability to take copy-on-write snapshots of subvolumes at any time makes it reasonable to take regular snapshots of everything, providing a first line of defense against accidental damage to data. I use Snapper to automatically create rolling snapshots of each of the major subvolumes: • Synchronized files (mounted to other machines over the network) have 8 hourly, 7 daily, 4 weekly and 3 monthly snapshots available at any time. • Staging items (for sorting into other locations) have a snapshot for each of the last two hours only, because those items change frequently and are of low value until considered further. • Everything else keeps one snapshot from the last hour and each of the last 3 days. This configuration strikes a balance according to my needs for accident recovery and storage demands plus performance. The frequently-changed items (synchronized with other machines and containing active projects) have a lot of snapshots because most individual files are small but may change frequently, so a large number of snapshots will tend to have modest storage needs. In addition, the chances of accidental data destruction are highest there. The other subvolumes are either more static or lower-value, so I feel little need to keep many snapshots of them. I use Crashplan to back up the entire system to their “cloud”1 service for$5 per month. The rate at which I add data to the system is usually lower than the rate at which it can be uploaded back to Crashplan as a backup, so in most cases new data is backed up remotely within hours of being created.

Finally, I have a large USB-connected external hard drive as a local offline backup. Also formatted with btrfs like the server (but with the entire disk encrypted), I can use btrfs send to send incremental backups to this external disk, even without the ability to send information from the external disk back. In practice, this means I can store the external disk somewhere else completely (possibly without an Internet connection) and occasionally shuttle diffs to it to update to a more recent version. I always unplug this disk from power and its host computer when not being updated, so it should only be vulnerable to physical damage and not accidental modification of its contents.

### Synchronization and remotes

For synchronizing current projects between my home server (which I treat as the canonical repository for everything), the tools vary according to the constraints of the remote system. I mount volumes over NFS or SMB from systems that rarely or never leave my network. For portable devices (laptop computers), Syncthing (running on the server and portable device) makes bidirectional synchronization easy without requiring that both machines always be on the same network.

I keep very little data on portable devices that is not synchronized back to the server, but because it is (or, was) easy to set up, I used Crashplan’s peer-to-peer backup feature to back up my portable computers to the server. Because the Crashplan application is rather heavyweight (it’s implemented in Java!) and it refuses to include peer-to-peer backups in uploads to their storage service (reasonably so; I can’t really complain about that policy), my remote servers back up to my home server with Borg.

I also have several Android devices that aren’t always on my home network- these aren’t covered very well by backups, unfortunately. I use FolderSync to automatically upload things like photos to my server which covers the extent of most data I create on those devices, but it seems difficult to make a backup of an Android device that includes things like preferences and per-app data without rooting the device (which I don’t wish to do for various reasons).

### Summarizing the status quo

• btrfs RAID provides resilience against single-disk failures and easy growth of total storage in my server.
• Remote systems synchronize or back up most of their state to the server.
• Everything on the server is continuously backed up to Crashplan’s remote servers.
• A local offline backup can be easily moved and is rarely even connected to a computer so it should be robust against even catastrophic failures.

## Evaluating alternatives

Now that we know how things were, we can consider alternative approaches to solve the problem of Crashplan’s $5-per-month service no longer being available. The primary factors for me are cost and storage capacity. Because most of my data changes rarely but none of it is strictly immutable, I want a system that makes it possible to do incremental backups. This will of course also depend on software support, but it means that I will tend to prefer services with straightforward pricing because it is difficult to estimate how many operations (read or write) are necessary to complete an incremental backup. Some services like Dropbox or Google Drive as commonly-known examples might be appropriate for some users, but I won’t consider them. As consumer-oriented services positioned for the use case of “make these files available whenever I have Internet access,” they’re optimized for applications very different from the needs of my backups and tend to be significantly more expensive at the volumes I need. So, the contenders: • Crashplan for Small Business: just like Crashplan Home (which was going away), but costs$10/mo for unlimited storage and doesn’t support peer-to-peer backup. Can migrate existing Crashplan Home backup archives to Small Business as long as they are smaller than 5 terabytes.
• Backblaze: $50 per year for unlimited storage, but their client only runs on Mac and Windows. • Google Cloud Storage: four flavors available, where the interesting ones for backups are Nearline and Coldline. Low cost per gigabyte stored, but costs are incurred for each operation and transfer of data out. • Backblaze B2: very low cost per gigabyte, but incurs costs for download. • Online.net C14: very low cost per gigabyte, no cost for operations or data transfer in the “intensive” flavor. • AWS Glacier: lowest cost for storage, but very high latency and cost for data retrieval. The pricing is difficult to consume in this form, so I’ll make some estimates with an 8 terabyte backup archive. This somewhat exceeds my current needs, so should be a useful if not strictly accurate guide. The following table summarizes expected monthly costs for storage, addition of new data and the hypothetical cost of recovering everything from a backup stored with that service. Service Storage cost Recovery cost Notes Crashplan$10 0 "Unlimited" storage, flat fee.
Backblaze $4.17 0 "Unlimited" storage, flat fee. GCS Nearline$80 ~$80 Negligible but nonzero cost per operation. Download$0.08 to $0.23 per gigabyte depending on total monthly volume and destination. GCS Coldline$56 ~$80 Higher but still negligible cost per operation. All items must be stored for at least 90 days (kind of). B2$40 $80 Flat fee for storage and transfer per-gigabyte. C14 €40 0 "Intensive" flavor. Other flavors incur per-operation costs. Glacier$32 \$740 Per-gigabyte retrieval fees plus Internet egress. Reads may take up to 12 hours for data to become available. Negligible cost per operation. Minimum storage 90 days (like Coldline).

Note that for Google Cloud and AWS I’ve used the pricing quoted for the cheapest regions; Iowa on GCP and US East on AWS.

### Analysis

Backblaze is easily the most attractive option, but the availability restriction for their client (which is required to use the service) to Windows and Mac makes it difficult to use. It may be possible to run a Windows virtual machine on my Linux server to make it work, but that sounds like a lot of work for something that may not be reliable. Backblaze is out.

AWS Glacier is inexpensive for storage, but extremely expensive and slow when retrieving data. The pricing structure is complex enough that I’m not comfortable depending on this rough estimate for the costs, since actual costs for incremental backups would depend strongly on the details of how they were implemented (since the service incurs charges for reads and writes). The extremely high latency on bulk retrievals (up to 12 hours) and higher cost for lower-latency reads makes it questionable that it’s even reasonable to do incremental backups on Glacier. Not Glacier.

C14 is attractively priced, but because they are not widely known I expect backup packages will not (yet?) support it as a destination for data. Unfortunately, that means C14 won’t do.

Google Cloud is fairly reasonably-priced, but Coldline’s storage pricing is confusing in the same ways that Glacier is. Either flavor is better pricing-wise than Glacier simply because the recovery cost is so much lower, but there are still better choices than GCS.

B2’s pricing for storage is competitive and download rates are reasonable (unlike Glacier!). It’s worth considering, but Crashplan still wins in cost. Plus I’m already familiar with software for doing incremental backups on their service (their client!) and wouldn’t need to re-upload everything to a new service.

## Fallout

I conclude that the removal of Crashplan’s “Home” service effectively means a doubling of the monthly cost to me, but little else. There are a few extra things to consider, however.

First, my backup archive at Crashplan was larger than 5 terabytes so could not be migrated to their “Business” version. I worked around that by removing some data from my backup set and waiting a while for those changes to translate to “data is actually gone from the server including old versions,” then migrating to the new service and adding the removed data back to the backup set. This means I probably lost a few old versions of the items I removed and re-added, but I don’t expect to ever need any of them.

Second and more concerning in general is the newfound inability to do peer-to-peer backups from portable (and otherwise) computers to my own server. For Linux machines that are always Internet-connected Borg continues to do the job, but I needed a new package that works on Windows. I’ve eventually chosen Duplicati, which can connect to my server the same way Borg does (over SSH/SFTP) and will in general work over arbitrarily-restricted internet connections in the same way that Crashplan did.

## Concluding

I’m still using Crashplan, but converting to their more-expensive service was not quite trivial. It’s still much more inexpensive to back up to their service compared to others, which means they still have some significant freedom to raise the cost until I consider some other way to back up my data remotely.

As something of a corollary, it’s pretty clear that my high storage use on Crashplan is subsidized by other customers who store much less on the service; this is just something they must recognize when deciding how to price the service!