A Wayback Machine Bookmarklet

Sometimes I find it useful to be able to quickly save a page to the Wayback Machine, usually so I can provide a stable link to a page I don’t control- for instance, if I’m pointing somebody to a document that describes something they’re asking about, it’s nice to ensure there will still be an archived copy if the original goes away.

The 'Save Page Now' field in the wayback machine. There is a 'Save Page' button associated with a URL entry field.

While the landing page on web.archive.org has a “Save Page Now” form to quickly save a page given its URL, this is still more cumbersome than I’d like- it involves copying the desired URL, opening a new tab to web.archive.org, pasting the URL into the form and pressing the save button, in exactly that order.

Bookmarklets come to the rescue: little snippets of JavaScript stored in a bookmark that perform some function. It’s easy enough to open a new tab to a given URL with JavaScript, so by inspecting how the “Save Page” button works, I can automate it:

javascript:void(window.open('https://web.archive.org/save/'+location.href));

I’ve put that string into a bookmark that sits on my browser’s bookmarks bar, so a single click opens a new tab that saves the currently-shown page to the Wayback Machine. Much easier!

Of course this is still cumbersome to do with many URLs, such as if I want to archive all the links in a blog post. Fortunately AMBER automates that particular use case, and larger applications tend to be the realm of fetching WARCs with wget.
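
For a handful of URLs- more than I want to click through by hand, but not enough to justify a full crawl- a small script can hit the same save endpoint the bookmarklet uses. Here’s a minimal sketch, assuming the requests library and that web.archive.org accepts a plain GET of /save/<url> just as the bookmarklet does; the URL list is a placeholder:

import requests

URLS = [
    'https://example.com/some/document',  # hypothetical links to archive
]

for url in URLS:
    # Same endpoint the bookmarklet opens; a successful response should
    # mean the Wayback Machine accepted the capture request.
    response = requests.get('https://web.archive.org/save/' + url, timeout=60)
    response.raise_for_status()
    print('saved', url)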

Fomu: a beginner's guide

FPGAs are pretty cool pieces of hardware for tinkering with, and have become remarkably easy to approach as a hobbyist in recent years. Boards like the TinyFPGA BX don’t require any special hardware to use and can provide a simple platform for modestly-scoped projects or just for learning.

While the software tools for programming FPGAs have historically been proprietary and provided by the hardware manufacturer, Symbiflow (enabled and probably inspired by earlier work like Project IceStorm) provides completely free and open-source tooling and documentation for programming some FPGAs. This significantly lowers the cost of entry (most vendors provide some free version of their design software limited to lower-end devices; a license for the non-free version is well into the realm of “if you have to ask, you can’t afford it”) and appears to yield better results in many cases.1


As somebody who finds it fun to learn new things and experiment with new kinds of creations, I find FPGAs quite interesting- they’re complex devices that enable very powerful creations, with excellent depth for mastery. While I did some course lab work with Altera FPGAs in university (and a little bit of chip design/layout later), I’d call those mostly canned tasks with easily-understood requirements and problem-solving approaches; they were sufficient to familiarize myself with the systems, but not enough to be particularly useful.

The announcement of Fomu caught my interest because I was aware of the earlier Tomu but wasn’t sufficiently interested to try to acquire any hardware. With Fomu however, I’m rather more interested because it enables interesting capabilities for playing with hardware- others have already demonstrated small RISC-V CPUs running in that FPGA (despite its modest logic capacity), for instance.

Even more conveniently, I’ve been in contact with Mithro (who is approximately half of the team behind Fomu) and gotten access to a stockpile of “hacker edition” boards that have been hand-assembled but not programmed at all. With slightly early access to hardware, I’ve been able to do some exploration, re-familiarize myself with the world of digital logic design and figure out how the boards work.

Hardware

In summary, Fomu is a small (9.4 by 13 by 0.6 millimeters) circuit board with a Lattice ICE40UP5K-UWG30 FPGA, a 16-megabit SPI Flash for configuration (and other data) storage, a single RGB LED for blinkiness and a 48 MHz MEMS oscillator to provide a clock.

A photo of the board component side. There are seven integrated circuits and bare copper pads labelled clockwise from the top left 4, 3, 2, 1, G, R, O, I, C, S and V.

Board component side. The other side is mostly just the USB pads.

The whole thing is built so it can fit inside a standard USB port. Production boards are meant to ship with a USB bootloader that allows new configurations to be uploaded to the board only via that USB connection, but hacker boards are provided completely unprogrammed (and untested).

Schematic

Before we can make the hardware do something, we’ll need to understand how everything is put together:

A schematic for 'TomuUltraPlus' created in Kicad. The schematic is separated into 7 logical blocks: power regulation and decoupling, SPI flash, MEMS clock, RGB LED, ICE40 power, ICE40 PLL power filter and ICE40 IO.

Unfortunately, this schematic leaves some things to be desired. While it does allow us to see what parts are actually on the board and generally how they’re connected, it fails to clearly mark the external connections- power and data lines on the USB connector, test points and utility I/O pads.

The USB connections are easy to figure out, however; it’s a standard pinout so we can easily identify which physical pads on the board correspond to VUSB (5V supply), ground and the two data lines (USBP, USBN). Rather trickier to work out is the function of each of the test points on the board, though there is a provided template for laser-cutting a programming jig which provides some hints:

Four rounded squares with lines and shapes marking where laser cuts or engraving should be done. There are seven holes that align on two of the squares to permit pins to pass through, and one of them has text denoting the purpose of each pin.

Here red and green lines are cuts, while black is raster engraving for marking.

This jig is meant to be built up by stacking four layers of material, engraving a small pocket in the bottom to hold the board to be programmed and inserting pogo pins in the small holes to contact the test points on the board. Helpfully, the template labels each of the test points, so all of them are clearly identified- except that it’s unclear what voltage is expected on the power supply pin.

By inspecting the board myself, I eventually determined that the test point for supplying power (marked VCC on the programming jig template) is downstream of the 3.3V regulator (not connected to the USB power supply pad) so it expects 3.3 Volts for programming.

Convenient pinout diagram

By way of improving on the schematic, here’s that same photo of the board with each pad annotated with its signal name from the schematic, and the individual chips identified.

And the same in tabular form for easy searching:

Silkscreen Schematic Description
V +3V3 3.3V rail
S CS SPI chip select (active low)
C SCK SPI clock
I MISO SPI MISO
O MOSI SPI MOSI
R CRESET_B FPGA reset (active low)
G GND Ground
1 PIN1 User I/O 1
2 PIN2 User I/O 2
3 PIN3 User I/O 3
4 PIN4 User I/O 4

Bootstrapping

My first task in attempting to bootstrap a board and load some configuration on it was building a programming jig. Given there was already a template for a laser-cut acrylic one and I have access to a benchtop laser cutter, this was easy:

It’s a little bit ugly because the pogo pins I had ready access to are too small to fit nicely in the laser-cut holes, so I had to carefully glue them in place.


Actually programming a board can be done with the fomu-flash utility running on a Raspberry Pi. I conveniently had a Raspberry Pi 2 to hand, so after a little wiring to the Pi’s GPIO header I had a jig that should work. Unfortunately, it didn’t- all I got when trying to make it identify the on-board flash chip was 1s:

$ fomu-flash -i
Manufacturer ID: unknown (ff)
Memory model: unknown (ff)
Memory size: unknown (ff)
Device ID: ff
Serial number: ff ff ff ff
Status 1: ff
Status 2: ff
Status 3: ff

I gave up on that hardware after spending a while experimenting with it, and decided to design a custom programming jig that would make it easier to ensure good pin alignment. This is a little tricky because the minimum pitch of the test points is just 1.8 mm- not large enough for the 0.1-inch (2.54 mm) DuPont connectors commonly used for prototyping, which are desirable in this case because they’re very easy to connect to the Raspberry Pi’s GPIO header.

A better jig

Fortunately, I also have access to rather sophisticated prototyping tools and had some nice parts handy from other projects: in particular, a Form 2 stereolithographic 3D printer and some good pogo pins (Preci-dip 90155-AS). I modeled a jig to be 3D-printed that should be both compact and robust, pictured below (see the end of this post for downloadable resources):

Isometric view with all edges visible

Top view

Bottom view

Features

The design takes great advantage of the flexibility of 3d printing for fabrication: it is easy to install the pogo pins off-vertical by making the (press-fit) holes at an arbitrary angle; this would be very difficult with conventional fabrication, but it allows the spacing of the pins at the top of the jig to be large enough that 2.54mm connectors can be used, despite the pad spacing on the board being only 1.8mm.
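
To get a feel for how far the pins need to lean, here’s a rough back-of-the-envelope calculation; the numbers are my own assumptions (seven pins in a row, and roughly the 10mm pin length from the drawing further down), so the actual jig geometry may differ:

import math

board_pitch = 1.8   # mm, pad pitch on the board
top_pitch = 2.54    # mm, pitch needed for DuPont connectors at the top
pins = 7            # pogo pins in the row
pin_length = 10.0   # mm, approximate pogo pin length

# The outermost pins have to shift the farthest from their pads.
max_offset = (pins - 1) / 2 * (top_pitch - board_pitch)
angle = math.degrees(math.atan2(max_offset, pin_length))
print(f"outer pins lean out {max_offset:.2f} mm, about {angle:.1f} degrees")
# roughly 2.2 mm and 12-13 degrees: easy to accommodate in a printed part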

On the bottom side, there are several narrow features that act as a shelf to support the board (which is 0.6mm thick) so its outside surface is flush with the bottom surface of the jig. A semicircular boss on one side mates with the cutout on the PCB to key the jig so it is obvious when the board is correctly oriented in the jig. A small cutout on one edge allows a tool to be inserted to pull the board out if needed, because the fit is close enough that it might stick sometimes (or a tool could be pushed through from the top).

As a manufacturability consideration, the top surface has a slant between opposite corners. This improves the print quality on a Form 2: because dimensions on that side are not critical, the part is designed to be printed with that side “down” (actually up, once in the printer) with supports attached to that end. By allowing the printer to gradually build up a slope rather than immediately build a full plane, it can better produce the intended shape- an earlier version of the design with a flat top had a very rough finish, because large, thin layers of material tend to warp until enough material is built up to be self-supporting.


The choice of pogo pins in particular is key, since they’re made with a small shoulder and retaining barbs that allow them to be easily press-fit into a connector shell:

Mechanical drawing of a Preci-dip 90155-AS pogo pin. It is 10mm long, with 1.4mm stroke. The central area along its length has several barbs and a narrow shoulder with 10 micron tolerances.

The one downside of these pins is the short tail, intended for mounting to a circuit board. While the aforementioned DuPont connectors can be mated to the tail, they are not very secure and come off at the slightest force. A revised design choosing parts for their function and not just immediate availability might prefer to use a part like 90101-AS, which is intended for wire termination rather than board mounting- then wires can be securely attached to the pin rather than tenuously placed on it. My workaround that didn’t involve buying more parts was carefully gluing the wires in place, which seems to work okay.

Programming

Having built a jig that I could be confident would work correctly, we now return to the problem of actually programming the board. I connected the new jig to my Raspberry Pi in the same way as the first one, and it failed in the same way- reading all 1s.

At this point I was rather stumped, with a few possible explanations for the problems:

  1. Both jigs are unreliable
  2. I’m wiring the jigs up incorrectly
  3. Software on my Pi is configured incorrectly
  4. All of the Fomus I tried were faulty

To discount the first two possibilities, I was able to borrow Mithro’s professionally-built jig that already had a Raspberry Pi 3 connected to it. I didn’t have any credentials to log in to that Pi and use it interactively however, so I was limited to checking its wiring, carefully ensuring I connected my Pi to the jig in the same way, then trying to program again. This also failed.

Mithro’s jig. It seems very cleverly built to me, clearly designed by somebody with a lot of experience designing these kinds of fixtures.

Having tried that I had to assume my Pi was somehow misconfigured, since it seemed increasingly unlikely that I was doing anything wrong and it seemed implausible that all of my boards were faulty. I eventually took the SD card out of the other jig’s Pi and inspected the software it would run by connecting it to another computer. This amounted to the same fomu-flash program I was using, so I inspected the system configuration in /boot/config.txt and found a variety of non-default options that seemed plausibly useful. Ultimately, I found some magic words:

dtparam=spi=on

This option makes the kernel on the Pi expose a hardware-assisted SPI peripheral, which seems like an obvious missing option until you realize that fomu-flash actually bit-bangs SPI because the hardware support is insufficient for this application. In any case, I did find that turning that option on makes everything work correctly:

$ fomu-flash -i
Manufacturer ID: Adesto (1f)
Memory model: AT25SF161 (86)
Memory size: 16 Mbit (01)
Device ID: 14
Serial number: ff ff ff ff
Status 1: 02
Status 2: 00
Status 3: ff

I reported the bug and made a note of this in the documentation so hopefully nobody else has to deal with that problem in the future, even if the root cause is mystifying.
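
As an aside, the first three lines of the identification above appear to come from the standard SPI flash JEDEC ID read (command 0x9f), whose response bytes match the 1f 86 01 shown. With the Pi’s SPI peripheral enabled, that part of the check can be reproduced from Python with the spidev module- a sketch under the assumption that the flash is reachable on SPI bus 0, chip select 0:

import spidev

spi = spidev.SpiDev()
spi.open(0, 0)              # bus 0, chip select 0 (assumed wiring)
spi.max_speed_hz = 1000000

# 0x9F is the JEDEC "Read ID" command; the flash answers with
# manufacturer, memory type and capacity bytes.
resp = spi.xfer2([0x9F, 0, 0, 0])
print('JEDEC ID:', ' '.join('{:02x}'.format(b) for b in resp[1:]))
spi.close()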

Success!

With the ability to talk to the configuration flash, it’s then possible to write an actual bitstream. To avoid needing to write one myself, it’s easy to take the LED blinker example from the fomu-tests repository:

fomu-tests/blink$ make FOMU_REV=hacker
...

Info: constrained 'rgb0' to bel 'X4/Y31/io0'
Info: constrained 'rgb1' to bel 'X5/Y31/io0'
Info: constrained 'rgb2' to bel 'X6/Y31/io0'
Info: constrained 'clki' to bel 'X6/Y0/io1'
Warning: unmatched constraint 'spi_mosi' (on line 5)
Warning: unmatched constraint 'spi_miso' (on line 6)
Warning: unmatched constraint 'spi_clk' (on line 7)
Warning: unmatched constraint 'spi_cs' (on line 8)
Info: constrained 'user_1' to bel 'X12/Y0/io1'
Info: constrained 'user_2' to bel 'X5/Y0/io0'
Info: constrained 'user_3' to bel 'X9/Y0/io1'
Info: constrained 'user_4' to bel 'X19/Y0/io1'
Warning: unmatched constraint 'usb_dn' (on line 13)
Warning: unmatched constraint 'usb_dp' (on line 14)

...

Info: Device utilisation:
Info:            ICESTORM_LC:    33/ 5280     0%
Info:           ICESTORM_RAM:     0/   30     0%
Info:                  SB_IO:     5/   96     5%
Info:                  SB_GB:     1/    8    12%
Info:           ICESTORM_PLL:     0/    1     0%
Info:            SB_WARMBOOT:     0/    1     0%
Info:           ICESTORM_DSP:     0/    8     0%
Info:         ICESTORM_HFOSC:     0/    1     0%
Info:         ICESTORM_LFOSC:     0/    1     0%
Info:                 SB_I2C:     0/    2     0%
Info:                 SB_SPI:     0/    2     0%
Info:                 IO_I3C:     0/    2     0%
Info:            SB_LEDDA_IP:     0/    1     0%
Info:            SB_RGBA_DRV:     1/    1   100%
Info:         ICESTORM_SPRAM:     0/    4     0%

...

Built 'blink' for Fomu hacker

$ fomu-flash -w blink.bin
Erasing @ 018000 / 01969a  Done
Programming @ 01959a / 01969a  Done
$ fomu-flash -v blink.bin
Reading @ 01969a / 01969a Done

Programming that to the board yields a blinking LED as expected, so I’ve achieved success in the basic form of this project by getting the FPGA to do something. Further exploration will involve writing gateware with Migen rather than straight Verilog (because I find Verilog to be rather tedious to write) and trying to build a system around a RISC-V CPU (because that sounds interesting).
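
For a taste of what that might look like, here’s a rough Migen sketch of the same sort of blinker. It’s untested and the names are mine- it assumes a 48 MHz system clock and that led is a Signal provided by the surrounding platform code:

from migen import Module, Signal, If

class Blinker(Module):
    def __init__(self, led, clk_freq=48000000):
        # Count off one second of clock cycles, then toggle the LED.
        counter = Signal(max=clk_freq)
        self.sync += If(counter == clk_freq - 1,
                        counter.eq(0),
                        led.eq(~led)
                     ).Else(
                        counter.eq(counter + 1)
                     )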

Resources

If you want to make your own copy of the programming jig or just explore it, you’ve got several options:

  • View the model at OnShape. This will allow you to view and make changes to the parametric model, which is what you’ll need to make most useful changes to it.
  • View the STL online. A quick and dirty way to get an interactive view of the model.
  • Download the STL. If you just want to try to 3D print your own, this is all you need. It may also be useful if you want to make changes using a 3d modeling program (rather than a CAD program).
  • Download a Solidworks part file. This was just exported from OnShape, but you might prefer this if you want to use SolidWorks to edit the model.

All of the official documentation for Fomu is available on Github. For basic information (such as what I referred to when writing up this project), that’s a great starting point.

I designed the programming jig in OnShape which is a pretty good and very convenient CAD tool.


  1. FPGA vendors don’t publish all the information required to build configuration bitstreams for their hardware, possibly because they wish to support their side business in selling design tool licenses- this despite the fact that (anecdotally, since I can’t recall where I saw it) many FPGA developers say that vendor tooling is one of the biggest annoyances in their work. The open-source tools require a fair amount of painstaking reverse-engineering of chips to create! ↩︎

Temperature Logging: Redux

Previously, as I was experimenting with logging the temperature using a Raspberry Pi (to monitor the temperatures experienced by fermenting cider), I noted that the Pi was something of a terrible hack and that it should be possible to do the job more efficiently with some slightly less common hardware.

I decided that an improved version would be interesting to build for use at home, since it’s kind of fun to collect data like that, and actually knowing the temperature in the house is useful at times. The end result of this project is that I can generate graphs like the one below of conditions around the house:

Three graphs of temperature, barometric pressure and humidity spanning a week, where each graph has three lines; one each for the lounge, bedroom and entry. Temperature shows a diurnal cycle with mostly constant offsets between lines, pressure is equal for each and varies slowly over the entire week, and humidity is broadly similar between the three lines and varies somewhat more randomly.

Software requirements

My primary requirement for home monitoring of this sort is that it not depend on a proprietary hub (especially not one that depends on an external service that might go away without warning), and I’d also like something that can be integrated with my existing (but minimal) home automation setup that’s based around Home Assistant running on my home server.

Given my main software is open source, it should be possible to integrate an arbitrary solution with it, with varying amounts of reverse engineering and implementation required. Because reverse-engineering services like that is not my idea of fun, it’s much preferable to find something that’s already supported and take advantage of others’ work. While I don’t mind debugging, I don’t want to build an integration from scratch if I don’t need to.

Hardware selection

As observed last time, the “hub” model for connecting “internet of things” devices to a network seems to be the best choice from a security standpoint- the only externally-visible network device is the hub, which can apply arbitrary security policies to communications between devices and to public networks (in the simplest case, forbidding all communications with public networks). Indeed, recent scholarly work (PDF) suggests systems that work on this model but apply more sophisticated policies to communications passing through the hub.

With that in mind, I decided a Zigbee network for my sensors would be appropriate- the sensors themselves have no ability to talk to public networks because they don’t even run an Internet Protocol stack, and it’s already a fairly common standard for communication. Plus, I was able to get several of the previously-mentioned Xiaomi temperature, humidity and barometric pressure sensors for about $10 each; a quite reasonable cost, given they’re battery powered with very long life and good wireless range.

A small white square with rounded corners and a thermometer drawn on the front.

One of the Xiaomi temperature/humidity/pressure sensors.


Home Assistant already has some support for Zigbee devices; most relevant here seems to be its implementation of the Zigbee Home Automation application standard. Though the documentation isn’t very clear, it supports (or should support) any radio that communicates with a host processor over a UART interface and speaks either the XBee or EZSP serial protocol.

Since the documentation for Home Assistant specifically notes that the Elelabs Zigbee USB adapter is compatible, I bought one of those. Its documentation includes a description of how to configure Home Assistant with it and specifically mentions Xiaomi Aqara devices (which includes the sensors I had selected), so I was confident that radio would meet my needs, though unsure of exactly what protocol was actually used to communicate with the radio over USB at the time I ordered it.

Experimenting

Once I received the Zigbee radio-on-a-USB-stick, I immediately tried to drive it manually, using whatever libraries I could find to set up a network and get one of my sensors connected to it. This ended up not working, but I did learn a lot about how the radio adapter is meant to work.

For working with it in Python, the Elelabs documentation points to bellows, a library providing EZSP protocol support for the zigpy Zigbee stack. It also includes a command-line interface exposing some basic commands, perfect for the sort of experimentation I wanted to do.

Getting connected was easy; I plugged the USB stick into my Linux workstation and it appeared right away as a PL2303 USB-to-serial converter. Between this and noting that bellows implements the EZSP protocol, I inferred that the Elelabs stick is a Silicon Labs EM35x microcontroller running the EmberZNet stack in a network coordinator mode, with a PL2303 exposing a UART over USB so the host can communicate with the microcontroller (and the rest of the network) by speaking EZSP.

A network diagram showing multiple Zigbee routers and sleepy end devices. Text claims that the EmberZNet PRO stack delivers robust and reliable mesh networking, supporting all Zigbee device types.

SiLabs marketing does a pretty good job of selling their software stack.

Having worked that out and made sense of it, I printed out a label for the stick that says what it is (“Elelabs Zigbee USB adapter”) and how to communicate with it (EZSP at 57600 baud) since the stick is completely unmarked otherwise and being able to tell what it does just by looking at it is very helpful.


Trying out the bellows CLI, the status output seemed okay and the NCP was running. In order to connect one of my sensors, I then needed to figure out how to make the sensor join after using bellows permit to open the network to new devices. The sensors each came with a little instruction booklet, but it was all in Chinese. With the help of Google Translate, I was able to take photos of it and find the important bit: holding the button on the sensor for about 5 seconds until the LED blinks three times will reset it, at which point it will attempt to join an open network.

On trying to run bellows permit prior to resetting a sensor to get it on the network, I encountered an annoying bug- it didn’t seem to do anything, and Python emitted a warning: RuntimeWarning: coroutine 'permit' was never awaited. I dug into that a little more and found the libraries make heavy use of PEP 492 coroutines, and the warning was fairly clear that a function was declared async when it shouldn’t have been (or its coroutine wasn’t then given to an event loop) so the function actually implementing permit never ran. I eventually tracked down the problem, patched it locally and filed a bug which has since been fixed.
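
The failure mode is easy to reproduce outside of bellows; this little sketch (generic Python, nothing to do with zigpy) shows the same warning and the fix:

import asyncio

async def permit(duration):
    # Pretend to open the network for joining.
    await asyncio.sleep(duration)
    print("joining permitted for", duration, "seconds")

# Buggy: calling a coroutine function only creates a coroutine object.
# Nothing runs, and Python eventually warns
# "RuntimeWarning: coroutine 'permit' was never awaited".
permit(1)

# Fixed: hand the coroutine to an event loop so it actually executes.
asyncio.run(permit(1))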

Having fixed that bug, I continued to try to get a sensor on my toy network but was ultimately (apparently) unsuccessful. I could permit joins, reset the sensor and see debug output indicating something was happening on the network, but never any conclusive message saying a new device had joined- and rather a lot of messages along the lines of “unrecognized message.” I couldn’t tell if it was working or not, so I moved on to hooking up Home Assistant.

Setting up Home Assistant

Getting set up with Home Assistant was mostly just a matter of following the guide provided with the USB stick, but using my own knowledge of how to set up the software (not using hassio). Configuring the zha component and pointing it at the right USB path is pretty easy. I did discover that specifying the database_path for the zha component alone is not enough to make it work; if the file doesn’t already exist, setup just fails. Simply creating an empty file at the configured path is enough- apparently that file is an sqlite database that zigpy uses to track known devices.

Still following the Elelabs document, I spent a bit of time invoking zha.permit and trying to get a sensor online to no apparent success. After a little more searching, I found discussion on the Home Assistant forums and in particular one user suggesting that these particular sensors are somewhat finicky when joining a network. They suggested (and my findings agree) that holding the button on the sensor to reset it, then tapping the button approximately every second for a little while (another 5-10 seconds) will keep it awake long enough to successfully join the network.

The keep-awake tapping approach did eventually work, though I also found that Home Assistant sometimes didn’t show a new sensor (or parts of a new sensor, like it might show the temperature but not humidity or pressure) until I restarted it. This might be a bug or a misconfiguration on my part, but it’s minor enough not to worry about.

At this point I’ve verified that my software and hardware can all work, so it’s time to set up the permanent configuration.

Permanent configuration

As mentioned above, I run Home Assistant on my Linux home server. Since I was already experimenting on a Linux system, that configuration should be trivial to transfer over, but for one additional desire I had: I want more freedom in where I place the Zigbee radio, in particular not just plugged directly into a free USB port on the server. Putting it in a reasonably central location with other radios (say, near the WiFi router) would be nice.

A simple solution might be a USB extension cable, but I didn’t have any of those handy and strewing more wires about the place feels inelegant. My Internet router (a TP-Link Archer C7 running OpenWrt) does have an available USB port though, so I suspected it would be possible to connect the Zigbee radio to the router and make it appear as a serial port on the server. This turned out to be true!

Serial over network

To find a solution for running a serial port over the network, I first searched for existing protocols; it turns out there’s a standard one that’s somewhat commonly used in fancy networking equipment: RFC 2217, which defines a set of extensions to Telnet allowing serial port configuration (bit rate, data bits, parity, etc.) and flow control over a Telnet connection.

A computer is connected via Ethernet to a Device Server which runs an RFC 2217 server and is connected to multiple external modems via RS-232.

A diagram of RFC 2217 application from some IBM documentation.

Having identified a protocol that does what I want, it’s then a matter of finding software that works as a client (assuming I’ll be able to find or write a suitable server). Suitable clients are somewhat tricky however, since from an application perspective UART use on Linux involves making specialized ioctls to the device to configure it, then reading and writing bytes as usual. Making an RFC2217 network serial device appear like a local device would seem to involve writing a kernel driver that exports a new class of RFC2217 device nodes supporting the relevant ioctls- none exists.1

An alternate approach (not using RFC 2217) might be USB/IP, which is supported in mainline Linux and allows a server to bind USB devices physically connected to it to a virtual USB controller that can then be remotely attached to a different physical machine over a network. This seems like a more complex and potentially fragile solution though, so I put that aside after learning of it.

Since Linux doesn’t have any kernel-level support for remote serial ports, I needed to search for support at the application level. It turns out bellows uses pyserial to communicate with radios, and pySerial is a quite featureful library- while most users will only ever provide device names like COM1 or /dev/ttyUSB0, it supports a range of more exotic URLs specifying connections, including RFC 2217.

So given a suitable server running on a remote machine, I should be able to configure Home Assistant to use a URL like rfc2217://zigbee.local:25 to reach the Zigbee radio.
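
As a quick sanity check of that idea outside of Home Assistant, pyserial’s serial_for_url opens such a URL exactly as it would a local port; the host name and port here are the hypothetical ones from above:

import serial

# serial_for_url returns a Telnet/RFC 2217 connection instead of a local
# device node; the rest of the pyserial API works the same afterwards.
port = serial.serial_for_url('rfc2217://zigbee.local:25',
                             baudrate=57600, timeout=5)
print('connected to', port.name)
port.close()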

Serial server

The next step in setting up the Zigbee radio plugged into the router is finding an application that can expose a PL2303 over the network with the RFC 2217 protocol. That turned out to be a short search, where I quickly discovered ser2net which does the job and is already packaged for OpenWRT. Installing it on the router was trivial, though I also needed to be sure the kernel module(s) required to expose the USB-serial port were available:

# opkg install kmod-usb-serial-pl2303 ser2net

Having installed ser2net, I still had to figure out how to configure it. While the documentation describes its configuration format, I know from experience that configuring servers on OpenWRT is usually done differently (as something of a concession to targeting embedded systems without much storage). I quickly found that the package had installed a sample configuration file at /etc/config/ser2net:

config ser2net global
    option enabled 1

config controlport
    option enabled 0
    option host localhost
    option port 2000

config default
    option speed 115200
    option databits 8
    option parity 'none'
    option stopbits 1
    option rtscts false
    option local false
    option remctl true

config proxy
    option enabled 0
    option port 5000
    option protocol telnet
    option timeout 0
    option device '/dev/ttyAPP0'
    option baudrate 115200
    option databits 8
    option parity 'none'
    option stopbits 1
#   option led_tx 'tx'
#   option led_rx 'rx'
    option rtscts false
    option local false
    option xonxoff false

Unfortunately, this configuration doesn’t include any comments, so the reader is forced to guess the meaning of each option. They mostly correspond to words that appear in the ser2net manual, but I didn’t trust guesses, so I went digging in the OpenWRT packages source code and found the script responsible for converting /etc/config/ser2net into an actual configuration file when starting ser2net.

My initial guess at the configuration I wanted looked something like this:

config proxy
    option enabled 1
    option port 5000
    option protocol telnet
    option timeout 0
    option device '/dev/ttyUSB0'
    option baudrate 57600
    option remctl true

The protocol is specified as telnet because RFC 2217 is a layer on top of telnet (my first guess was that I actually wanted raw, until I read the RFC and saw it was a set of telnet extensions), and the device is the device name that I found the Zigbee stick appeared as when plugged into the router.2 Unfortunately, this configuration didn’t work, and pyserial gave back a somewhat perplexing error message: serial.serialutil.SerialException: Remote does not seem to support RFC2217 or BINARY mode [we-BINARY:False(INACTIVE), we-RFC2217:False(REQUESTED)].

The Elelabs stick plugged into my TP-link router, which is mounted on a wall.

Without much visibility into what the serial driver was trying to do, I opted to examine the network traffic with Wireshark. I first attempted to use the text-mode interface (tshark -d tcp.port==5000,telnet -f 'port 5000'), but quickly gave up and switched to the GUI instead. I captured the traffic passing between the server and router, but there was almost nothing! The client (pyserial) was sending some Telnet negotiation messages (DO ECHO, WILL suppress go ahead and COM port control), then nothing happened for a few seconds and the connection closed.

Since restarting Home Assistant for every one of these serial tests was quite cumbersome, at this point I checked if pyserial includes any programs suitable for testing connectivity. It happily does, provided in my distribution’s package as miniterm.py. Running miniterm.py rfc2217://c7:5000 failed in the same way, so I had a quicker debugging tool.


At this point the problem seemed to be on the server side, so I stopped the ser2net server on the router and started one in the foreground, with a custom configuration specified on the command line:

$ /etc/init.d/ser2net stop
$ ser2net -n -d -C '5000:telnet:0:/dev/ttyUSB0:57600 remctl'
ser2net[14914]: Unable to create network socket(s) on line 0

While ser2net didn’t outright fail, it did print a concerning error message. Does it work if I change the port it’s listening on?

$ ser2net -n -d -C '1234:telnet:0:/dev/ttyUSB0:57600 remctl'

And then running miniterm.py succeeds, leaving me with a terminal I could type into (but didn’t, since I don’t know how to speak EZSP with my keyboard).

$ miniterm.py rfc2217://c7:1234 57600
--- Miniterm on rfc2217://c7:1234  57600,8,N,1 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---

--- exit ---

I discovered after a little digging (netstat -lnp) that miniupnpd was already listening on port 5000 of the router, so changing the port fixes the confusing problem. A different sample port in the ser2net configuration would have prevented such an issue, as would ser2net giving up when it fails to bind to a requested port instead of printing a message and pretending nothing happened. But at least I didn’t have to patch anything to make it work.


With ser2net listening on port 2525 instead, Home Assistant can connect to it (hooray!). But it immediately throws a different error: NotImplementedError: write_timeout is currently not supported. Have I found another bug in a rarely-exercised corner of this software stack?

Well, kind of. Looking for that error message in the pyserial source shows that something is trying to set the write timeout to zero, and that’s simply not implemented in pyserial for RFC 2217 connections. This is ultimately because Home Assistant (as alluded to earlier with bellows and zigpy) is all coroutine-based, so it uses pyserial-asyncio to adapt the blocking APIs provided by pyserial to something that works nicely with coroutines running on an event loop. When pyserial-asyncio tries to set non-blocking mode by making the timeout zero, we find it’s not supported.

def _reconfigure_port(self):
    """Set communication parameters on opened port."""
    if self._socket is None:
        raise SerialException("Can only operate on open ports")

    # if self._timeout != 0 and self._interCharTimeout is not None:
        # XXX

    if self._write_timeout is not None:
        raise NotImplementedError('write_timeout is currently not supported')
        # XXX

While I could probably implement non-blocking support for RFC 2217 in pyserial, that seemed rather difficult and not my idea of fun. So instead I looked for a workaround- if RFC 2217 won’t work, does pyserial support a protocol that will?

The answer is of course yes: I can use socket:// for a raw socket connection to the ser2net server. This sacrifices the ability to change UART parameters (format, baud rate, etc) on the fly, but since the USB stick doesn’t support changing parameters on the fly anyway (as far as I can tell), this is no problem.

Final configuration

The ser2net configuration that I’m now using looks like this:

config proxy
    option enabled 1
    option port 2525
    option protocol raw
    option timeout 0
    option device '/dev/ttyUSB0'
    option baudrate 57600
    option remctl 0

And the relevant stanza in the Home Assistant configuration (the baud rate needs to be specified, but pyserial ignores it for socket:// connections):

zha:
  usb_path: 'socket://c7:2525'
  database_path: /srv/homeassistant/.homeassistant/zigbee.db
  baudrate: 57600

After ensuring the zigbee.db file exists and restarting Home Assistant to reload the configuration, I was able to pair all three sensors by following the procedure defined above: call the permit service in Home Assistant, then reset the sensor by holding the button until its LED blinks three times, then tap the button every second or so for a bit.

I did observe some strange behavior on pairing the sensors that made me think they weren’t pairing correctly, like error messages in the log (ERROR (MainThread) [homeassistant.components.sensor] Setup of platform zha is taking longer than 60 seconds. Startup will proceed without waiting any longer.) and some parts of each sensor not appearing (the temperature might be shown, but not humidity or pressure). Restarting Home Assistant after pairing the sensors made everything appear as expected though, so there may be a bug somewhere in there but I can’t be bothered to debug it since there was a very easy workaround.

Complaining about async I/O

It’s rather interesting to me that the major bugs I encountered in trying to set up this system in a slightly unusual configuration were related to asynchronous I/O running in event loops- this is an issue that’s become something of my pet problem, such that I will argue to just about anybody who will listen that asynchronous I/O is usually unnecessary and more difficult to program.

That I discovered two separate bugs in the tools that make this work relating to running asynchronous I/O in event loops seems to support that conclusion. If Home Assistant simply spawned threads for components I believe it would simplify individual parts (perhaps at the cost of some slightly more complex low-level communication primitives) and make the system easier to debug. Instead, it runs all of its dependencies in a way they are not well-exercised in, presumably in search of “maximum performance” that seems entirely irrelevant when considering the program’s main function is acting as a hub for a variety of external devices.

I have (slowly) been working on distilling all these complaints into a series of essays on the topic, but for now this is a fine opportunity to wave a finger at something that I think is strictly worse because it’s evented.

Conclusion

I’m pretty happy with the sensors and software configuration I have now- the sensors are tiny and unobtrusive, while the software does a fine job of logging data and presenting live readings for my edification.

I’d like to also configure a “real” database like InfluxDB to store my sensor readings over arbitrarily long time periods (since Home Assistant doesn’t remember data forever, reasonably so), which shouldn’t be too difficult (it’s supported as a module) but is somewhat unrelated to setting up Zigbee sensors in the first place. Until then, I’m pretty happy with these results despite the fact that I think the developers have made a terrible choice with evented I/O.
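
For reference, my understanding is that this would amount to a stanza along these lines in the Home Assistant configuration (untested on my setup; the host and database name are placeholders):

influxdb:
  host: localhost
  database: home_assistant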

A line of circles, each labelled with a sensor name and the value. The state of the sun and moon are shown, as well as temperature, pressure and humidity for each of the bedroom, entry and lounge Zigbee sensors.

Live sensor readings from Home Assistant; nice at a glance.


  1. I did find somebody asking for input on the implementation of exactly that, but it looks like nothing ever came of it. A reply proposing an application at the master end of a pty (pseudoterminal) offers an interesting alternate option, but it doesn’t appear to be possible to receive parameter change requests from a pty (though flow control is exposed when running in “packet mode”). ↩︎

  2. I was concerned at the outset that the router might be completely unable to see the Zigbee stick, since apparently the Archer C7 doesn’t include a USB 1.1 OHCI or UHCI controller, so it’s incapable of communicating at all with low-speed devices like keyboards! I’ve heard (but not verified myself) that connecting a USB 2.0 hub will allow the router to communicate with low-speed devices downstream of the hub as a workaround. ↩︎

Building a terrible 'IoT' temperature logger

I had approximately the following exchange with a co-worker a few days ago:

Them: “Hey, do you have a spare Raspberry Pi lying around?”
Me: [thinks] “..yes, actually.”
T: “Do you want to build a temperature logger with Prometheus and a DS18B20+?”
M: “Uh, okay?”

It later turned out that that co-worker had been enlisted by yet another individual to provide a temperature logger for their project of brewing cider, to monitor the temperature during fermentation. Since I had all the hardware at hand (to wit, a Raspberry Pi 2 that I wasn’t using for anything and temperature sensors provided by the above co-worker), I threw something together. It also turned out that the deadline was quite short (brewing began just two days after this initial exchange), but I made it work in time.

Interfacing the thermometer

As noted above, the core of this temperature logger is a DS18B20 temperature sensor. Per the manufacturer (https://www.maximintegrated.com/en/products/sensors/DS18B20.html):

The DS18B20 digital thermometer provides 9-bit to 12-bit Celsius temperature measurements … communicates over a 1-Wire bus that by definition requires only one data line (and ground) for communication with a central microprocessor. … Each DS18B20 has a unique 64-bit serial code, which allows multiple DS18B20s to function on the same 1-Wire bus. Thus, it is simple to use one microprocessor to control many DS18B20s distributed over a large area.

Indeed, this is a very easy device to interface with. But even given the svelte hardware needs (power, data and ground signals), writing some code that speaks 1-Wire is not necessarily something I’m interested in. Fortunately, these sensors are very commonly used with the Raspberry Pi, as illustrated by an Adafruit tutorial published in 2013.


The Linux kernel provided for the Pi in its default Raspbian (Debian-derived) distribution supports bit-banging 1-Wire over its GPIOs by default, requiring only a device tree overlay to activate it. This is as simple as adding a line to /boot/config.txt to make the machine’s boot loader instruct the kernel to apply a change to the hardware configuration at boot time:

dtoverlay=w1-gpio

With that configuration, one simply needs to wire the sensor up. The w1-gpio device tree configuration uses GPIO 4 on the Pi as the data line by default; power and ground then need to be connected and a pull-up resistor added to the data line (since 1-Wire is an open-drain bus).

DS18B20 VDD and GND connect to Raspberry Pi 3V3 and GND respectively; sensor DQ connects to Pi GPIO4. There is a 4.7k resistor between VDD and DQ.

The w1-therm kernel module already understands how to interface with these sensors- meaning I don’t need to write any code to talk to the temperature sensor: Linux can do it all for me! For instance, reading the temperature out in an interactive shell to test, after booting with the 1-Wire overlay enabled:

$ modprobe w1-gpio w1-therm
$ cd /sys/bus/w1/devices
$ ls
28-000004b926f1  w1_bus_master1
$ cat 28-000004b926f1/w1_slave
9b 01 4b 46 7f ff 05 10 6e : crc=6e YES
9b 01 4b 46 7f ff 05 10 6e t=25687

The kernel periodically scans the 1-Wire bus for slaves and creates a directory for each device it detects. In this instance, there is one slave on the bus (my temperature sensor) and it has serial number 000004b926f1. Reading its w1_slave file (provided by the w1-therm driver) returns the bytes that were read on both lines, a summary of transmission integrity derived from the message checksum on the first line, and t=x on the second line, where x is the measured temperature in milli-degrees Celsius. Thus, the measured temperature above was 25.687 degrees.
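
Pulling a reading out of that file programmatically is only a few lines of Python. A minimal sketch- the serial number is the one from my sensor above, and it skips any retry logic for failed checksums:

SENSOR_FILE = '/sys/bus/w1/devices/28-000004b926f1/w1_slave'

def read_temperature():
    with open(SENSOR_FILE) as f:
        crc_line, data_line = f.read().splitlines()
    if not crc_line.endswith('YES'):
        raise IOError('bad CRC on 1-Wire read')
    # The driver reports milli-degrees Celsius after 't='.
    return int(data_line.rsplit('t=', 1)[1]) / 1000.0

print(read_temperature())   # e.g. 25.687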

While it’s fairly easy to locate and read these files in sysfs from a program, I found a Python library that further simplifies the process: w1thermsensor provides a simple API for detecting and reading 1-wire temperature sensors, which I used when implementing the bridge for capturing temperature readings (detailed more later).
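
Basic usage looks roughly like this (from my reading of the library’s documentation; readings come back in degrees Celsius by default):

from w1thermsensor import W1ThermSensor

# Enumerate every 1-Wire temperature sensor the kernel has detected
# and print a reading from each.
for sensor in W1ThermSensor.get_available_sensors():
    print(sensor.id, sensor.get_temperature())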

1-Wire details

I wanted to verify for myself how the 1-wire interfacing worked so here are the details of what I’ve discovered, presented because they may be interesting or helpful to some readers. Most documentation of how to perform a given task with a Raspberry Pi is limited to comments like “just add this line to some file and do the other thing!” with no discussion of the mechanics involved, which I find very unsatisfying.

The line added to /boot/config.txt tells the Raspberry Pi’s boot firmware to pass the w1-gpio.dtbo device tree overlay description to the kernel. The details of what’s in that overlay can be found in the kernel source tree at arch/arm/boot/dts/overlays/w1-gpio-overlay.dts.

This in turn pulls in the w1-gpio kernel module, which is part of the upstream kernel distribution- it’s very simple, setting or reading the value of a GPIO port as requested by the Linux 1-wire subsystem.

Confusingly, if we examine the dts file describing the device tree overlay, it can take a pullup option that controls a rpi,parasitic-power parameter. The documentation says this “enable(s) the parasitic power (2-wire, power-on-data) feature”, which is puzzling: 1-Wire is inherently capable of supplying parasitic power to slaves with modest power requirements, with the slaves charging capacitors off the data line while it’s idle (and being held high, since it’s an open-collector bus). So saying an option will enable parasitic power is confusing at best and probably flat wrong.

Further muddying the waters, there also exists a w1-gpio-pullup overlay that includes a second GPIO to drive an external pullup to provide more power, which I believe allows implementation of the strong pull-up described in Figure 6 of the DS18B20 datasheet (required because the device’s power draw while reading the temperature exceeds the capacity of a typical parasitic power setup):

A secondary GPIO from a microprocessor provides a strong pull-up on the 1-Wire bus while power requirements exceed parasitic supply capabilities.

By also connecting the pullup GPIO to the data line (or putting a FET in there like the datasheet suggests), the w1-gpio driver will set the pullup line to logic high for a requested time, then return it to Hi-Z where it will idle. But for my needs (cobbling something together quickly), it’s much easier to not even bother with parasite power.

In conclusion for this section: I don’t know what the pullup option for the 1-Wire GPIO overlay actually does, because enabling it and removing the external pull-up resistor from my setup causes the bus to stop working. The documentation is confusingly imprecise, so I gave up on further investigation since I already had a configuration that worked.

Prometheus scraping

To capture and store time-series data representing the temperature, per the co-worker’s original suggestion I opted to use Prometheus. While it’s designed for monitoring the state of computer systems, it’s plenty capable of storing temperature data as well. Given I’ve used Prometheus before, it seemed like a fine option for this application, though on later consideration I think a more robust (and more effortful) system could be built with different technology choices (explored later in this post).

The Raspberry Pi with temperature sensor in my application is expected to stay within range of a WiFi network with internet connectivity, but this network does not permit any incoming connections, nor does it permit connections between wireless clients. Given I wanted to make the temperature data available to anybody interested in the progress of brewing, there needs to be some bridge to the outside world- thus Prometheus should run on a different machine from the Pi.

The easy solution I chose was to bring up a minimum-size virtual machine on Google Cloud running Debian, then install Prometheus and InfluxDB from the Debian repositories:

$ apt-get install prometheus influxdb

Temperature exporter

Having connected the thermometer to the Pi and set up Prometheus, we now need to glue them together such that Prometheus can read the temperature. The usual way is for Prometheus to make HTTP requests to its known data sources, where the response is formatted such that Prometheus can make sense of the metrics. There is some support for having metrics sources push their values to Prometheus through a bridge (which basically just remembers the values it’s given until they’re scraped), but that seems inelegant given it would require running another program (the bridge) and goes against how Prometheus is designed to work.

I’ve published the source for the metrics exporter I ended up writing, and will give it a quick description in the remainder of this section.


The easiest solution to providing a service over HTTP is using the http.server module, so that’s what I chose to use. When the program starts up it scans for temperature sensors and stores the result. This has the downside that a sensor disconnected at startup will never be reported, but detection is fairly slow, and only scanning at startup makes it more obvious when a sensor is accidentally disconnected during operation, since reading it will fail at that point.

#!/usr/bin/env python3

import socketserver
from http.server import HTTPServer, BaseHTTPRequestHandler
from w1thermsensor import W1ThermSensor

SENSORS = W1ThermSensor.get_available_sensors()

The request handler has a method that builds the whole response at once, which is just plain text based on a simple template.

class Exporter(BaseHTTPRequestHandler):
    METRIC_HEADER = ('# HELP w1therm_temperature Temperature in Kelvin of the sensor.\n'
                     '# TYPE w1therm_temperature gauge\n')

    def build_exposition(self, sensor_states):
        out = self.METRIC_HEADER
        for sensor, temperature in sensor_states.items():
            out += 'w1therm_temperature{{id="{}"}} {}\n'.format(sensor, temperature)
        return out

do_GET is called by BaseHTTPRequestHandler for all HTTP GET requests to the server. Since this server doesn’t really care what you want (it only exports one thing- metrics), it completely ignores the request and sends back metrics.

    def do_GET(self):
        response = self.build_exposition(self.get_sensor_states())
        response = response.encode('utf-8')

        # We're careful to send a content-length, so keepalive is allowed.
        self.protocol_version = 'HTTP/1.1'
        self.close_connection = False

        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; version=0.0.4')
        self.send_header('Content-Length', len(response))
        self.end_headers()
        self.wfile.write(response)

The http.server API is somewhat cumbersome in that it doesn’t try to handle setting Content-Length on responses to allow clients to keep connections open between requests, but at least in this case it’s very easy to set the Content-Length on the response and correctly implement HTTP 1.1. The Content-Type used here is the one specified by the Prometheus documentation for exposition formats.

The rest of the program is just glue, for the most part. The console_entry_point function is the entry point for the w1therm_prometheus_exporter script specified in setup.py. The network address and port to listen on are taken from the command line, then an HTTP server is started and allowed to run forever.
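
For completeness, here’s roughly what those remaining pieces might look like. This is my sketch based on the description above rather than the published source, so details (such as converting to Kelvin to match the metric help text) may differ:

# get_sensor_states belongs alongside do_GET in the Exporter class;
# console_entry_point lives at module level.
def get_sensor_states(self):
    # Read every sensor found at startup, converting Celsius to Kelvin.
    return {s.id: s.get_temperature() + 273.15 for s in SENSORS}

def console_entry_point():
    import sys
    host, port = sys.argv[1], int(sys.argv[2])
    # Serve metrics on the requested address until interrupted.
    HTTPServer((host, port), Exporter).serve_forever()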

As a server

As a Python program with a few non-standard dependencies, installation of this server is not particularly easy. While I could sudo pip install everything and call it sufficient, that’s liable to break unexpectedly if other parts of the system are automatically updated- in particular the Python interpreter itself (though Debian as a matter of policy doesn’t update Python to a different release except as a major update, so it shouldn’t happen without warning). What I’d really like is the ability to build a single standalone program that contains everything in a convenient single-file package, and that’s exactly what PyInstaller can do.

A little bit of wrestling with PyInstaller configuration later (included as the .spec file in the repository), I had successfully built a pretty heavy (5MB) executable containing everything the server needs to run. I placed a copy in /usr/local/bin, where it’s easy to run.

I then wrote a simple systemd unit for the temperature server to make it start automatically, installed as /etc/systemd/system/w1therm-prometheus-exporter.service:

[Unit]
Description=Exports 1-wire temperature sensor readings to Prometheus
Documentation=https://bitbucket.org/tari/w1therm-prometheus

[Service]
ExecStart=/usr/local/bin/w1therm-prometheus-exporter localhost 9000
Restart=always

StandardOutput=journal
StandardError=journal

# Standalone binary doesn't need any access beyond its own binary image and
# a tmpfs to unpack itself in.
DynamicUser=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true

[Install]
WantedBy=multi-user.target

Enable the service, and it will start automatically when the system boots:

systemctl enable w1therm-prometheus-exporter.service

This unit includes rather more protection than is probably very useful, given the machine is single-purpose, but it seems like good practice to isolate the server from the rest of the system as much as possible.

  • DynamicUser will make it run as a system user with ID semi-randomly assigned each time it starts so it doesn’t look like anything else on the system for purposes of resource (file) ownership.
  • ProtectSystem makes it impossible to write to most of the filesystem, protecting against accidental or malicious changes to system files.
  • ProtectHome makes it impossible to read any user’s home directory, preventing information leak from other users.
  • PrivateTmp gives the server its own private /tmp directory, so it can’t interfere with temporary files created by other things, nor can they be interfered with- preventing possible races which could be exploited.

Pi connectivity

Having built the HTTP server, I needed a way to get data from it to Prometheus. As discussed earlier, the Raspberry Pi with the sensor is on a WiFi network that doesn’t permit any incoming connections, so how can Prometheus scrape metrics if it can’t connect to the Pi?

One option is to push metrics to Prometheus, using the push gateway. However, I don’t like that option because the push gateway is intended mostly for jobs that run unpredictably, in particular where they can exit without warning. This isn’t true of my sensor server. PushProx provides a rather better solution, wherein clients connect to a proxy which forwards fetches from Prometheus to the relevant client, though I think my ultimate solution is just as effective and simpler.

What I ended up doing is using autossh on the Pi to maintain a reverse SSH tunnel to the Prometheus server, so that a port on that server connects back to the Raspberry Pi’s metrics server. Autossh is responsible for keeping the connection alive, managed by systemd. Code is going to be much more instructive here than a long-form description, so here’s the unit file:

[Unit]
Description=SSH reverse tunnel from %I for Prometheus
After=network-online.target
Wants=network-online.target

[Service]
User=autossh
ExecStart=/usr/bin/autossh -N -p 22 -l autossh -R 9000:localhost:9000 -i /home/autossh/id_rsa %i
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Installed as /etc/systemd/system/autossh-tunnel@.service, this unit file tells systemd that we want to start autossh once the network is online and to keep restarting it so the tunnel stays up. I’ve increased RestartSec from the default 100 milliseconds because I found that even with the dependency on network-online.target, ssh could fail with DNS lookup errors while the system was still booting, and after a few rapid failures systemd would give up. Increasing the restart delay means it takes much longer for systemd to give up, and in the meantime the network actually comes up.

The autossh process itself runs as a system user I created just to run the tunnels (useradd --system -m autossh), and opens a reverse tunnel from port 9000 on the remote host to the same port on the Pi. Authentication is with an SSH key I created on the Pi and added to the Prometheus machine in Google Cloud, so it can log in to the server without any human intervention. Teaching systemd that this should run automatically is a simple enable command away1:

systemctl enable autossh-tunnel@pitemp.example.com

Then it’s just a matter of configuring Prometheus to scrape the sensor exporter. The entire Prometheus config looks like this:

global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'w1therm'
    static_configs:
      - targets: ['localhost:9000']

That’s pretty self-explanatory; Prometheus will fetch metrics from port 9000 on the same machine (which is actually an SSH tunnel to the Raspberry Pi), and do so every 15 seconds. When the Pi gets the request for metrics, it reads the temperature sensors and returns their values.
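For the curious, what the exporter serves is just Prometheus’ plain-text exposition format. Going by the data that shows up in InfluxDB later, a scrape returns something along these lines (metric name, label and value taken from the query results below; any HELP/TYPE metadata omitted):

w1therm_temperature{id="000004b926f1"} 297.9

The values are evidently reported in kelvin (297.9 K is about 24.8 °C), which is consistent with the graph in the results below.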

Data retention

I included InfluxDB in the setup to get arbitrary retention of temperature data. Prometheus is designed primarily for real-time monitoring of computer systems, alerting human operators when things appear to be going wrong; consequently, in the default configuration it only retains captured data for a few weeks and doesn’t provide a convenient way to export data for archival or analysis. While the default retention is probably sufficient for this project’s needs, I wanted better control over how long the data was kept and the ability to save it as long as I liked. Prometheus doesn’t offer that control itself, but it does support reading and writing data to and from various other databases, including InfluxDB (which I chose only because a package for it is available in Debian without any additional work).

Unfortunately, the version of Prometheus available in Debian right now is fairly old: 1.5.2, where the latest release is 2.2. More problematically, while Prometheus now supports a generic remote read/write API, this was added in version 2.0 and is not yet available in the Debian package. Combined with the lack of documentation (as far as I could find) for the old remote-write feature, I was a little bit stuck.

Things ended up working out nicely though: I happened to notice flags relating to InfluxDB in the Prometheus web UI, most of which have no default values:

  • storage.remote.influxdb-url
  • storage.remote.influxdb.database (defaults to prometheus)
  • storage.remote.influxdb.retention-policy
  • storage.remote.influxdb.username

These can be specified to Prometheus by editing /etc/default/prometheus, which the Debian package provides so that command-line arguments can be passed to the server without directly editing the unit file that tells the system how to run Prometheus. I ended up with these options there:

ARGS="--storage.local.retention=720h \
      --storage.remote.influxdb-url=http://localhost:8086/ \
      --storage.remote.influxdb.retention-policy=autogen"

The first option just makes Prometheus keep its data longer than the default, whereas the others tell it how to write data to InfluxDB. I determined where InfluxDB listens for connections by looking at its configuration file /etc/influxdb/influxdb.conf and making a few guesses: a comment in the http section there noted that “these (HTTP endpoints) are the primary mechanism for getting data into and out of InfluxDB” and included the settings bind-address=":8086" and auth-enabled=false, so I guessed (correctly) that telling Prometheus to find InfluxDB at http://localhost:8086/ should be sufficient.

Or rather, it was almost enough: after setting the influxdb-url and restarting Prometheus, it was periodically logging warnings about errors coming back from InfluxDB. Given that the influxdb.database setting defaults to prometheus, I (correctly) assumed I needed to create a database with that name. A little browsing of the Influx documentation and a few guesses later, I had done that:

$ apt-get install influxdb-client
$ influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 1.0.2
InfluxDB shell version: 1.0.2
> CREATE DATABASE prometheus;

Examining the Prometheus logs again, it was now complaining that the specified retention policy didn’t exist. The Influx documentation for the CREATE DATABASE command notes that the autogen retention policy is used if no other is specified, so setting the retention-policy flag to autogen and restarting Prometheus made data start appearing, which I verified by waiting a little while and making a query (guessing a little about how to query a particular metric):

> USE prometheus;
> SELECT * FROM w1therm_temperature LIMIT 10;
name: w1therm_temperature
-------------------------
time                    id              instance        job     value
1532423583303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423598303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423613303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423628303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423643303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423658303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423673303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423688303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423703303000000     000004b926f1    localhost:9000  w1therm 297.9
1532423718303000000     000004b926f1    localhost:9000  w1therm 297.9

Results

A sample graph of the temperature over two days:

Temperature follows a diurnal cycle, starting at 23 degrees at 00:00, peaking around 24 at 06:00 and bottoming out near 22 at 21:00.

The fermentation temperature is quite stable, with daily variation of less than one degree in either direction from the baseline.

Refinements

I later improved the temperature server to handle SIGHUP as a trigger to scan for sensors again, which is a slight improvement over restarting it, but not very important because the server is already so simple (and fast to restart).
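The mechanism is nothing fancy; a minimal sketch of the idea, with illustrative names rather than the server’s actual ones, assuming the metrics-collection path checks a flag before reading sensors:

import signal

rescan_requested = False

def _handle_sighup(signum, frame):
    # Keep the signal handler minimal: just note that a rescan was requested.
    global rescan_requested
    rescan_requested = True

signal.signal(signal.SIGHUP, _handle_sighup)

# Elsewhere, before serving a scrape:
#   if rescan_requested:
#       sensors = rescan_sensors()   # hypothetical helper that re-enumerates 1-wire devices
#       rescan_requested = False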


On reflection, using Prometheus and scraping temperatures is a very strange way to go about solving the problem of logging the temperature (though it has the advantage of using only tools I was already familiar with so it was easy to do quickly). Pushing temperature measurements from the Pi via MQTT would be a much more sensible solution, since that’s a protocol designed specifically for small sensors to report their states. Indeed, there is no shortage of published projects that do exactly that more efficiently than my Raspberry Pi, most of them using ESP8266 microcontrollers which are much lower-power and can still connect to Wi-Fi networks.
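To give a flavour of what the push side might look like, here’s a rough sketch using the paho-mqtt client library; the broker address, topic layout and read_temperatures() helper are all made up for illustration:

import time
import paho.mqtt.client as mqtt

def read_temperatures():
    # Stand-in for the real 1-wire reading code; returns sensor id -> kelvin.
    return {"000004b926f1": 297.9}

client = mqtt.Client()
client.connect("broker.example.com", 1883)  # hypothetical broker
client.loop_start()                         # handle network traffic in the background

while True:
    for sensor_id, kelvin in read_temperatures().items():
        client.publish("sensors/%s/temperature" % sensor_id, str(kelvin))
    time.sleep(15)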

Rambling about IoT security

Getting sensor readings through an MQTT broker and storing them to be able to graph them is not quite as trivial as scraping them with Prometheus, but I suspect there does exist a software package that does most of the work already. If not, I expect a quick and dirty one could be implemented with relative ease.

On the other hand, running a device like that which is internet-connected but is unlikely to ever receive anything remotely resembling a security update seems ill-advised if it’s meant to run for anything but a short time. In that case it would be better to make the sensor part of a Zigbee network instead: since Zigbee devices have no direct internet connectivity, that sidesteps the fraught terrain of needing to protect both the device itself from attack and the data it transmits from unauthorized use (eavesdropping), by taking ownership of that problem away from the sensor.

It remains possible to forward messages out to an MQTT broker on the greater internet using some kind of bridge (indeed, this is the model used by many consumer “smart device” platforms, like Philips’ Hue, though I don’t think they use MQTT), where individual devices connect only to the Zigbee network and a more capable bridge is responsible for internet connectivity. The problem of keeping the bridge secure remains, but it is appreciably simpler than maintaining the security of each individual device in what may be a heterogeneous network.

It’s even possible to get inexpensive off-the-shelf temperature and humidity sensors that connect to Zigbee networks, like some sold by Xiaomi, offering much better finish than anything prototype-quality I could build myself, very good battery life, and the ability to operate in a heterogeneous Zigbee network with arbitrary other devices (though you wouldn’t know it from the manufacturer’s documentation, since they want consumers to commit to their “platform” exclusively)!

So while my solution is okay in that it works fine with hardware I already had on hand, a much more robust setup is readily available with off-the-shelf hardware and only a little bit of software to glue it together. If I needed to do this again and wanted something that doesn’t require my expertise to maintain, I’d reach for that instead.


  1. Hostname changed to an obviously fake one for anonymization purposes. ↩︎