Reverse-engineering Ren'Py packages

Some time ago (September 3, 2013, apparently), I had just finished reading Analogue: A Hate Story (which I highly recommend, by the way) and was particularly taken with the art. At that point my engineer's instincts kicked in, and it seemed reasonable to reverse-engineer the resource archives to extract the art for my own nefarious purposes.

A little examination of the game files revealed a convenient truth: it was built with Ren'Py, an open-source visual novel engine written in Python. Python is a language I'm quite familiar with, so the task promised to be well within my expertise.

Code

Long story short, I've built some rudimentary tools for working with compiled Ren'Py data. You can get them from my repository on BitBucket. Technically-inclined readers might also want to follow along in the code while reading.

Background

There are a large number of games designed with Ren'Py. It's an easy tool to get started with and hack on: the script language is fairly simple, and because the engine is open-source, more sophisticated users are free to bend it to their will. A few examples of (in my opinion) high-quality things built with the engine:

Since visual novels tend to live or die on the combination of art and writing, the ability to examine the assets outside the game environment offers interesting possibilities.

Since it was handy, I started my experimentation with Analogue.

RPA resource archives

The largest files distributed with the game were .rpa files, so I investigated those first in my search for the art. As it turned out, this was exactly the place I needed to look. Start by examining the raw data:

$ cd "Analogue A Hate Story/game"
$ ls
bytecode.rpyb
data.rpa
dlc1.rpa
nd.rpa
$ less data.rpa
RPA-3.0 00000000035f5c75 414154bb
<F0><AA><D6>^MީZ<A0><90><FB>^M6<B9><B7>^X<A3><82><F3>F<B0><DF>k8(<BF>ߦx<9C><D5>T
[snip]

There's an obvious file identifier (RPA-3.0), followed by a couple of numbers and a lot of compressed-looking data. The first number turns out to be very close to the total file size, so it's probably some size or offset field, while the other one looks like some kind of signature.

$ python -c 'print(0x35f5c75)'
56581237
$ stat -c %s data.rpa
56592058

At this point I simply referred to the Ren'Py source code, rather than waste time experimenting on the data itself. It turns out the first number is the file offset of the index, and the second is a key used for simple obfuscation of elements of the index (numbers are bitwise exclusive-or'd with the key). The archive index itself is a DEFLATE-compressed block of pickled Python objects: a dictionary mapping file names to tuples of offset and block length, specifying where within the archive file each file's data can be found.

With that knowledge in hand, it was short work to build a decoder for the index data and dump it all to files. This is rpa.py in my tools. Extracting the archives pulls out plenty of images and other media, as well as a number of interesting-looking .rpyb files, which we'll discuss shortly.

Cosplay

For a bit of amusement, I exercised my web-programming chops a little and built a standalone web page for playing with the extracted costumes and expressions of *Hyun-ae and *Mute, which I've included below. Here's a link to the bare page for standalone amusement as well.

Script guts

The basic format of compiled scripts (.rpyb files) is similar to that of the resource packages. The entire file is a tuple of (data, statements), where data is a dictionary of basic metadata and statements is a list of objects representing the script's code. The statements are just the Ren'Py abstract syntax tree, so all the objects come from the renpy.ast module. Unfortunately, as I'll discuss later, the pickle format makes this representation hard to work with.

The structure of AST members is designed such that each object can have attached bytecode, but in practice this appears never to happen in archives. From my investigation of the source, it appears that Ren'Py only writes Python bytecode as a performance enhancement, and most of it ends up in bytecode.rpyb.
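Before moving on to the scripts: the RPA-3.0 decoding described above (read the header line, seek to the index offset, inflate and unpickle the index, then XOR the entries with the key) can be sketched in a few lines of Python. This is a rough illustration of the format as just described, not the actual rpa.py from my tools, and it assumes simple (offset, length) index tuples:

```python
import pickle
import zlib

def read_rpa_index(path):
    """Decode the index of an RPA-3.0 archive (sketch; layout as
    described above: an "RPA-3.0 <offset> <key>" header line, then a
    zlib-compressed pickled dict at the given offset)."""
    with open(path, "rb") as f:
        magic, offset_hex, key_hex = f.readline().split()[:3]
        assert magic == b"RPA-3.0"
        offset, key = int(offset_hex, 16), int(key_hex, 16)
        f.seek(offset)
        index = pickle.loads(zlib.decompress(f.read()))
    # De-obfuscate: offsets and lengths are XOR'd with the key
    return {name: [(off ^ key, length ^ key, *rest)
                   for off, length, *rest in entries]
            for name, entries in index.items()}
```

Each decoded entry then tells you where to seek() in the archive and how many bytes to read to extract that file.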
bytecode.rpyb appears to provide a bytecode cache that overrides script files in certain situations. For the purposes of reverse-engineering this is fortunate: Python bytecode is documented, but rather more difficult to translate into something human-readable than the source code that is actually present in the .rpyb archives.

Here's some of the Act 1 script from Analogue run through the current version of my script decompiler:

## BEGIN Label 'dont_understand', params=None, hide=False
dont_understand:
    ## <class 'renpy.ast.Show'> -- don't know how to dump this!
    ## <class 'renpy.ast.Say'> -- don't know how to dump this!
    ## <class 'renpy.ast.Jump'> -- don't know how to dump this!
## BEGIN Label 'nothing', params=None, hide=False
nothing:
    python:
        shown_message = None
        for block in store.blocks:
            for message in block.contents:
                if message == _message:
                    shown_message = str(store.blocks.index(block)+1) + "-" + str(block.contents.index(message)+1)
        if shown_message:
            gray_out(shown_message)
    ## <class 'renpy.ast.With'> -- don't know how to dump this!
    ## <class 'renpy.ast.Show'> -- don't know how to dump this!
    ## <class 'renpy.ast.With'> -- don't know how to dump this!
    ## <class 'renpy.ast.Say'> -- don't know how to dump this!

Clearly there are a few things my decompiler needs to learn about. It does, however, handle the more common block elements such as If statements. In any case, for reverse-engineering purposes the Python code embedded in these scripts tends to be more interesting than the rest (which is mostly just dialogue and display manipulation). If you're more interested in spoiling the game for yourself, it's less useful.
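A dumper like this is essentially a recursive dispatch over renpy.ast node types. Here is a skeletal sketch of the idea; the attribute names (name, parameters, hide, block, code.source) are assumptions based on the output above, not the real renpy.ast API:

```python
def dump(node, indent=0):
    """Print a node in the style of the decompiler output above.
    Node shapes are illustrative assumptions, not renpy.ast's real API."""
    pad = "    " * indent
    kind = type(node).__name__
    if kind == "Label":
        print(f"{pad}## BEGIN Label {node.name!r}, params={node.parameters}, hide={node.hide}")
        print(f"{pad}{node.name}:")
        for child in node.block:
            dump(child, indent + 1)
    elif kind == "Python":
        # Embedded Python blocks carry their original source text
        print(f"{pad}python:")
        for line in node.code.source.splitlines():
            print(f"{pad}    {line}")
    else:
        # Mirror the decompiler's fallback for unknown node types
        print(f"{pad}## {type(node)!r} -- don't know how to dump this!")
```

Teaching the decompiler about a new statement type then amounts to adding one more branch to the dispatch.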
A few telling bits of logic from options.rpy:

init -3:
    ## BEGIN Python
    python hide:
        ## BEGIN PyCode, exec mode, from game/options.rpy
        renpy.demo_mode = True

init -1:
    ## BEGIN Python
    python hide:
        ## BEGIN PyCode, exec mode, from game/options.rpy
        config.developer = True

        config.screen_width = 1024
        if persistent.resolution == None:
            persistent.resolution = 2
        if persistent.old_resolution == None:
            persistent.old_resolution = persistent.resolution
        if not persistent.resolution:
            config.screen_height = 600
        else:
            config.screen_height = 640

        if renpy.demo_mode:
            config.window_title = u"Analogue trial"
        else:
            config.window_title = u"Analogue: A Hate Story"

        config.name = "Analogue"
        config.version = "0.0"

The spacing here is interesting; I suspect (but haven't attempted to verify) that the Ren'Py script compiler strips comments, since there haven't been any in any of the scripts I've examined, so it's likely that the unusual empty blocks in the code held comments in a former life.

I've yet to dig much into what determines when demo_mode is set, but I doubt it would be difficult to forcibly set (or clear) if one were so inclined. Not that I condone such an action.

A little bit of interesting game-critical logic, also from Analogue (caution: minor spoilers):

store.radiation = 0
store.reactor_enabled = True

def interact_cb():
    if store.radiation and store.reactor_enabled:
        store.radiation_levels += 0.01
        if (any_read("7-*") or (store.current_character == "mute" and get_m("6-9").enabled)) and store.radiation_levels < 0.65:
            store.radiation_levels = 0.65
        store.radiation_levels = min(store.radiation_levels, 0.8)

config.interact_callbacks.append(interact_cb)

You can get some idea of how specialized Ren'Py's execution environment is from this code. In particular, store is a magic value injected into the locals of Python script blocks which maps to the Ren'Py persistent variable store, holding most of the game state.
config is a similar magic value providing a handle to the engine configuration. In this instance, radiation refers to a hidden timer which, on expiration (assuming the preconditions have been met), forces the player to solve a puzzle and then make a decision on which the plot forks. Elsewhere in the code, I found a few developer switches which allow one to display the value of this countdown and reset or force it.

Conclusions

As the official documentation notes, the process of resource compilation is not very secure but is enough to deter casual copying. I've shown here that such a claim is entirely correct, though script decompilation may be somewhat harder than the developers envisioned, due to the choice of pickle as a serialization format. It's nothing particularly new to me, but a reminder to designers of software: if it runs on your attacker's system, it can be hacked. It's not a question of "if", but of "how fast".

I was mostly interested in extracting resources with this project, which turned out to be quite easy. In that matter, I think the designers of Ren'Py made a good tradeoff. The compiled archives and scripts are much more robust against accidental modification by curious users than uncompiled files would be, but the developers did not expend undue effort building something harder to break, which would eventually be broken anyway by a sufficiently determined attacker.

Portability

As I alluded to earlier, the pickle representation makes the Ren'Py AST hard to work with. This is because many of the objects contain references to the engine state, which in turn implies that most of the engine needs to be initialized when unpickling the AST. To say the least, this is not easy: engine initialization is not easily separated from game startup.
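One way to sidestep the engine-state problem entirely is a custom Unpickler that substitutes inert placeholder classes for anything in the renpy package, keeping only the structural data. This is a sketch of that idea under my own assumptions, not something my tools currently implement:

```python
import io
import pickle

class StubUnpickler(pickle.Unpickler):
    """Unpickle Ren'Py ASTs without importing the engine by returning
    a freshly-made placeholder class for every renpy.* reference."""
    def find_class(self, module, name):
        if module == "renpy" or module.startswith("renpy."):
            # A stand-in type that accepts any pickled state and keeps
            # it as plain attributes (structural information only).
            return type(name, (object,), {
                "__module__": module,
                "__setstate__": lambda self, state: self.__dict__.update(
                    state if isinstance(state, dict) else {"_state": state}),
            })
        return super().find_class(module, name)

def load_stubbed(data):
    """Load a pickle, stubbing out all renpy.* classes."""
    return StubUnpickler(io.BytesIO(data)).load()
```

This handles objects pickled via the common class-plus-state path; any nodes with custom __reduce__ logic would need extra handling.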
To illustrate the problem, observe that the Ren'Py developer kit is simply the engine itself, packaged with a game of sorts that helps get a new project set up by modifying the included scripts. There seems to be no part of the engine designed to run without the rest of it running as well. In experimenting with different products built with Ren'Py, I've had to make changes to some combination of the engine itself and my code in order to bootstrap the engine state to a point where the AST can be successfully unpickled.

Suffice to say, this has hampered my progress somewhat, and led me to consider slightly different avenues of attack. The most promising of these would involve a semi-custom unpickler which avoids instantiating actual Ren'Py objects; the only data that need be preserved is the structural information, rather than the many hooks into engine state that are also included in the pickle serialization. Further work on this project is likely to take that approach to deserialization.

GStreamer's playbin, threads and queueing

I've been working on a project that uses GStreamer to play back audio files in an automatically-determined order. My implementation uses a playbin, which is nice and easy to use. I had some trouble getting it to continue playback on reaching the end of a file, though. According to the documentation for the about-to-finish signal:

    This signal is emitted when the current uri is about to finish. You can set the uri and suburi to make sure that playback continues. This signal is emitted from the context of a GStreamer streaming thread.

Because I wanted to avoid blocking a streaming thread, under the theory that doing so might interrupt playback (the logic determining what to play next hits external resources, so it may take some time), my program simply forwarded that message out to be handled in the application's main thread by posting a message to the pipeline's bus.
Now, this approach appeared to work, except it didn't start playing the next URI, and the pipeline never changed state; it was simply wedged. It turns out that you must assign to the uri property from that same streaming thread, otherwise the assignment does nothing. Fortunately, it also turns out that blocking the streaming thread while waiting for data isn't an issue (determined experimentally, by simply blocking the thread for a while before setting the uri).

Newlib's git repository

Because I had quite the time finding it when I wanted to submit a patch to newlib: there's a git mirror of the canonical CVS repository for newlib, which all the patches I saw on the mailing list were based on. Maybe somebody else looking for it will find this note useful:

git clone git://sourceware.org/git/newlib.git

See also: the original mailing list announcement of the mirror's availability.

Matrioshka brains and IPv6: a thought experiment

Nich (one of my roommates) mentioned that discussion in his computer networking course this semester recently turned to IPv6, and we spent a short while coming up with interesting ways to consider the size of the IPv6 address pool.

Assuming 2^128 available addresses (an overestimate, since some number of them are reserved for certain uses and are not publicly routable), there are more IPv6 addresses than there are (estimated) grains of sand on Earth, by a factor of approximately 3 × 10^14 (Wolfram|Alpha says there are between 10^20 and 10^24 grains of sand on Earth).

A Matrioshka brain?

While Nich quickly lost interest in this diversion into math, I started venturing into cosmic scales to find numbers that compare to that very large address space. I eventually started working with the total mass of the Solar System, at which point I made the connection to a Matrioshka brain. "A what?" you might say.
A Matrioshka brain is a megastructure composed of multiple nested Dyson spheres, themselves megastructures of orbiting solar-power satellites dense enough to capture most of a star's energy output. A Matrioshka brain uses the captured energy to power computation at an incredible scale, probably to run an uploaded version of something evolved from contemporary civilization (as opposed to a more classical use like powering a laser death ray). Random note: a civilization capable of building a Dyson sphere would be at least Type II on the Kardashev scale.

I find Charlie Stross' novel Accelerando to be a particularly vivid example, beginning in a recognizable near-future setting and eventually progressing into a Matrioshka brain-based civilization. While the typical depiction of a Dyson sphere is a solid shell, it's much more practical to build a swarm of individual devices that together form a sort of soft shell, and this is how it's approached in Accelerando, where the Solar System's non-Solar mass is converted into "computronium": effectively a Dyson swarm of processors with integrated thermal generators. By receiving energy from the sunward side and radiating waste heat to the next layer out, each layer can perform computation.

Let's calculate

Okay, we've gotten the definitions out of the way. Now, what I was actually pondering: how does the number of routable IPv6 addresses compare to an estimate of the number of computing devices there might be in a Matrioshka brain? That is, would IPv6 be sufficient as a routing protocol for such a network, and how many devices might that be?

A silicon wafer used for manufacturing electronics, looking into the near future, has a diameter of 450 millimeters and a thickness of 925 micrometers (450mm wafers are not yet common, but mass-production processes for this size are being developed as the next standard).
These wafers are effectively pure crystals of elemental (that is, monocrystalline) silicon, which are processed to become semiconductor integrated circuits. Our first target, then, will be to determine the mass of an ideal 450mm wafer. First, we'll need the volume of that wafer (since I was unable to find a precise figure for a typical wafer's mass):

$$\pi \times \left( \frac{450 \; \mathrm{mm}}{2} \right)^2 \times 925 \;\mathrm{\mu m} = 147115 \;\mathrm{mm^3}$$

Given the wafer's volume, we then need its density in order to calculate its mass. I'm no chemist, but I know enough to be dangerous in this instance. A little research reveals that silicon crystals have the same structure as diamond, which is known as diamond cubic. It looks something like this:

Now, this diagram is rather difficult to make sense of, and I struggled to find a way to estimate the number of atoms in a given volume from it. A little more searching revealed a handy reference in a materials science textbook, however. The example I've linked here notes that there are 8 atoms per unit cell, which puts us in a useful position for further computation. Given that, the only remaining question is how large each unit cell is, which is given by the crystal's lattice constant. According to the above reference, and supported by the same figure from the ever-useful HyperPhysics, the lattice constant of silicon is 0.543 nanometers.

With this knowledge in hand, we can compute the average volume per atom in a silicon crystal, since the crystal structure fits 8 atoms into a cube with sides 0.543 nanometers long:

$$\frac{0.543^3 \mathrm{\frac{nm^3}{cell}}}{8 \mathrm{\frac{atoms}{cell}}} = 0.02001 \mathrm{\frac{nm^3}{atom}}$$

Now that we know the amount of space each atom (on average) takes up in the crystal, we can use the atomic mass of silicon to compute the density. Silicon's atomic mass is 28.0855 atomic mass units, or about 4.66371 × 10^-23 grams.
$$\frac{4.66371 \times 10^{-23} \mathrm{\frac{g}{atom}}}{0.02001 \mathrm{\frac{nm^3}{atom}}} = 2.3307 \mathrm{\frac{g}{cm^3}}$$

Thus we can easily compute the mass of a single wafer, given the volume we computed earlier:

$$\frac{147115 \;\mathrm{mm^3}}{1000 \mathrm{\frac{mm^3}{cm^3}}} \times 2.3307 \mathrm{\frac{g}{cm^3}} = 342.881 \;\mathrm{g}$$

Aside: impurities are negligible

We're going to ignore impurities in the crystal, both the undesired and the desired ones. Silicon is doped to control its semiconductor properties when manufacturing integrated circuits, and these intentional impurities will affect the crystal's density. For illustrative examples, we might refer to materials published by Chipworks, a company that specializes in reverse-engineering of integrated circuits. Below I've included one example from Chipworks with doped regions of the substrate annotated.

There's also the question of undesired impurities, but those concentrations should matter even less in our case. If we refer to some (slightly old) figures on the tolerances of a commercial wafer, well... I'm not entirely sure how to make sense of those numbers. We can observe, however, that the magnitude of undesired impurities in the wafer must be significantly less than that of the intentional ones (since they would otherwise affect the semiconductor properties in a hard-to-control fashion), and decide it's not worth worrying about. If you look around that tolerance sheet, though, you can get a good idea of how exact the mechanical specifications are. For example, local deviations in the surface must be less than 0.25 micrometers (although, rather disappointingly, it doesn't appear to include a definition for "local flatness").

More important than impurities in the silicon are the additional metal and insulator layers deposited on top of the wafer for interconnection.
Using material from Chipworks again, a complex integrated circuit is quite tall when considered in cross-section, mainly due to the numerous interconnects necessary.

How does this metal stack compare to the wafer's thickness? Chipworks don't publish many cross-sectional images like the one above, but here's one of the same Intel 22 nanometer process featured on the left side of the above image, this time with scale bars (and much higher magnification). From that, we can estimate a bit from the image we have of the metal layers. It looks like the 1x-pitch metal layers are each about 40 nanometers tall, since I know that the smallest serrated-looking bits at the bottom of the stack are the FETs. Working from that, and assuming the metal pitch is proportional to its thickness, the entire interconnect stack is about (1 + 1.4 + 2 + 3 + 4) × 40 = 456 nanometers tall. That's a small fraction of the wafer's overall thickness of 925000 nanometers.[1]

But enough of things that don't enter into our computations. Back to the real work!

A real-world reference CPU

To this point, we've computed the density of monocrystalline silicon and determined the volume of a 450mm silicon wafer. Next, we should determine how many useful computing devices can be obtained from a single wafer. As a possible model for a hypothetical processor to drive a computronium Dyson swarm, I'll refer to Texas Instruments' MSP430 microcontrollers. These devices include an efficient CPU core with a number of analog peripherals on-chip. The analog consideration is important, because the members of our Dyson swarm need some way to communicate. Here I'll assume some sort of radio is on the same die (the piece of a silicon wafer that makes up a single chip) as the CPU, allowing two-way communication with nearby processors. In addition, the power generation components (since these devices must gather energy from the Sun independently) will likely be at least partially analog.
This assumption of radio communication is perhaps not accurate, since optical communication may be much easier to control in such a dense network, with optical emitters (LEDs or laser diodes, probably) and receivers (photodiodes) constructed directly on the wafer. For our purposes, however, it's not terribly important: space taken by unneeded analog parts on the real-world MSP430 could easily be reused, for example to provide additional on-chip memory.

With a model chip to refer to, we must now determine the size of an MSP430's die. The smallest package any MSP430 is available in (that I can quickly find) appears to be a chip-scale BGA, on the MSP430F2370 (I'm providing a copy of the datasheet here, as well). (Integrated circuits are usually sold encased in epoxy packages, with exposed metal pads to make electrical connections.) A chip-scale package is perfect for die size estimation, since we can assume the chip-scale BGA package (YFF in TI's parlance) is essentially the same size as the die. This is supported by a note on the package drawing (note D) that the exact package size of a device is specified in the device-specific datasheet; since actual package dimensions are determined by the device contained therein, I believe it is safe to assume that the package is approximately the same size as the die. Referring to the Device Pinout section of our datasheet, Table 2 (on page 4) provides the overall package dimensions: 3.2 ± 0.05 millimeters on each side.

Now we must determine the number of dies that can be made from a single wafer. This turns into a geometric packing problem, where we want to fit as many squares (dies) as possible into a circle (the wafer), which is surprisingly hard. I found an interesting collection of records for some numbers of squares packed into the smallest enclosing circle, but there's no simple way to determine an optimal packing.
Wolfram|Alpha has some capability to estimate properties of such an optimal packing, and it says we could get 15279 dies out of a 450mm wafer, at 98.37% efficiency.

But wait! We're assuming somewhat more advanced manufacturing than is currently available. Certainly, I'd expect a computronium manufacturing effort that intends to convert the entire Solar System to recycle waste materials whenever possible, so per-wafer waste isn't really a concern: the portions of the wafer that cannot be made into usable dies can simply be recycled into new wafers. Thus, a simple area calculation can be used to determine the amortized number of dies yielded per wafer:

$$\frac{\pi \times \left( \frac{450 \;\mathrm{mm}}{2} \right) ^2}{3.2^2 \frac{\mathrm{mm^2}}{\mathrm{die}}} = 15531.6 \;\mathrm{dies}$$

A silicon Solar System

Now it's time to determine how many processors we could get by converting the entire non-Solar mass of the Solar System into integrated circuits. We will assume a way exists to efficiently form the requisite materials out of whatever is available, likely via nuclear fusion (particularly for converting the mostly-hydrogen gas giants into silicon). Our first order of business, since we're assuming all mass may be converted into whatever elements are required, is to determine the Solar System's mass, excluding that of the Sun itself. Wikipedia notes the following:

    The mass of the Solar System excluding the Sun, Jupiter and Saturn can be determined by adding together all the calculated masses for its largest objects and using rough calculations for the masses of the Oort cloud (estimated at roughly 3 Earth masses), the Kuiper belt (estimated at roughly 0.1 Earth mass) and the asteroid belt (estimated to be 0.0005 Earth mass) for a total, rounded upwards, of ~37 Earth masses, or 8.1 percent the mass in orbit around the Sun.
    With the combined masses of Uranus and Neptune (~31 Earth masses) subtracted, the remaining ~6 Earth masses of material comprise 1.3 percent of the total.

We could compute these figures manually, but such numbers are fairly well-known, so we'll just ask Wolfram|Alpha for them. It responds that the Solar System's mass (including the Sun) is 1.9911 × 10^30 kilograms, and the Sun's mass is 1.988435 × 10^30 kilograms. The non-Solar mass is then trivial to compute:

$$1.9911 \times 10^{30} \; \mathrm{kg} - 1.988435 \times 10^{30} \; \mathrm{kg} = 2.7 \times 10^{27} \; \mathrm{kg}$$

Now determine the number of dies we can make from that mass (converting to grams to match the wafer mass):

$$\frac{2.7 \times 10^{30} \; \mathrm{g}}{342.881 \mathrm{\frac{g}{wafer}}} \times 15531.6 \mathrm{\frac{dies}{wafer}} = 1.223 \times 10^{32} \;\mathrm{dies}$$

Final quantitative analysis

Having done all the physical computations, we finally have a sense of how a Matrioshka brain could use IPv6. We can make about 10^32 processors out of the Solar System, compared to about 10^38 (theoretically) available IPv6 addresses; that is, it would take on the order of a million Matrioshka brains to completely exhaust the IPv6 address pool. In practice, such a dense network would not be desirable, since the very large IPv6 address space allows a lot of slack in address allocation to make routing easier. For example, clearly-defined hierarchical address allocations allow routers to efficiently determine destinations for traffic via route aggregation or other methods.

Basically: once networks shift to IPv6, address exhaustion is not a concern for any foreseeable future. The IPv6 address pool could support Matrioshka brains around about 1% of the stars in our galaxy (estimating about 2 × 10^11 stars in the galaxy), all on a single network.
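As a sanity check, the entire chain of arithmetic above fits in a few lines of Python (same inputs as the text; small differences come from rounding):

```python
import math

# Wafer volume: 450 mm diameter, 925 um (0.925 mm) thick
volume_mm3 = math.pi * (450 / 2) ** 2 * 0.925            # ~147115 mm^3

# Density of monocrystalline silicon from the lattice constant:
# 8 atoms per 0.543 nm unit cell, 28.0855 u per atom
vol_per_atom_nm3 = 0.543 ** 3 / 8
mass_per_atom_g = 28.0855 * 1.66054e-24
density_g_cm3 = mass_per_atom_g / (vol_per_atom_nm3 * 1e-21)  # ~2.33 g/cm^3

wafer_mass_g = (volume_mm3 / 1000) * density_g_cm3       # ~342.9 g

# Amortized dies per wafer by area, for 3.2 mm square dies
dies_per_wafer = (math.pi * (450 / 2) ** 2) / 3.2 ** 2   # ~15531.6

# Non-Solar mass of the Solar System, converted to grams
nonsolar_mass_g = 2.7e27 * 1000

total_dies = nonsolar_mass_g / wafer_mass_g * dies_per_wafer  # ~1.22e32
addresses = 2 ** 128                                          # ~3.4e38

print(f"{total_dies:.3e} dies, {addresses / total_dies:.1e} brains per pool")
```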
Without superluminal communication, such a network would pose its own challenges (mainly in message latency), to the point where I believe it would make more sense to have interstellar communications running on a separate network that happens to be bridged with a local IP network. I had a bit of difficulty remembering which novels I was thinking of, but Charlie Stross' "Singularity Sky" and "Iron Sunrise" and Vernor Vinge's "A Fire Upon the Deep" involve interesting cases where sublight shipping of information remains relevant to galactic civilizations, with (non-transcended) civilizations maintaining local networks plus dedicated (comparatively high-cost) links for long-range communication. I think that is a logical way to approach the problem of networking a galactic civilization, given that any expected means of interstellar communication will have limited bandwidth, high latency, or both.

So what's my conclusion? Don't worry about IPv6 exhaustion. Addresses can (in theory) be reallocated if necessary, and even extremely inefficient allocation still allows a transcended Solar civilization to function comfortably on top of an IP network.

Epilogue

Wow, that was a lot of writing. Over about a week, I spent four evenings actively writing this post, for a total of approximately 10 hours. I wrote the math markup in this post with the Interactive LaTeX Editor, which is a really slick tool and lets me ensure MathJax (which I use to render math markup on this site, and which is itself magical) will handle the expressions as expected. Highly recommended!

Anybody who finds the very lowest levels of technology interesting (as I do) would probably do well to follow the Chipworks blogs. They also publish teardowns of consumer goods, if that's more your thing.
1. As a more informed estimate: somebody who works in the semiconductor industry estimated, in a talk from HOPE in 2012, that the total stackup on Intel's 22nm process is about 100 microns, still only about a tenth of the wafer thickness. [return]

"Four"ier transform

I liked the joke and am familiar enough with the math of working in unusual bases that I felt a need to implement a quick version of this in Python. Code follows.

#!/usr/bin/env python

def fourier(x, b):
    """Attempts to find a fourier version of x, working down from base b.
    Returns the fouriest base."""
    mostFours = 0
    bestBase = -1
    for base in range(b, 1, -1):
        fours = 0
        t = x
        while t != 0:
            if (t % base) == 4:
                fours += 1
            t //= base
        # Prefer lower bases
        if fours >= mostFours:
            print(baseconvert(x, base) + "_{0}".format(base))
            mostFours = fours
            bestBase = base
    return bestBase

BASE_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def baseconvert(x, base):
    s = ""
    while x != 0:
        s += BASE_CHARS[x % base]
        x //= base
    return ''.join(reversed(s))

if __name__ == '__main__':
    from sys import argv, exit
    if len(argv) < 2:
        print("""Usage: {0} <number>
Computes the "four"ier transform of <number>, printing the
optimizations to reach the "fouriest" form.""".format(argv[0]))
        exit(1)
    x = int(argv[1])
    # Base 36 is the largest sensible base to use
    base = fourier(x, 36)
    if base == -1:
        print("{0} is four-prime!".format(x))

This is Python 3.x code, using explicit integer division. It should work under the 2.x series if you change the `//=` floor divisions to `/=`. It can only go up to base 36, because I didn't want to deal with bases that are hard to represent in reasonable ways. Up to base 64 is an option, but in that case I would have wanted to use MIME base64, which puts digits at positions 52 through 61 and would be confusing to read. Thus it only supports up to base 36, but could be adjusted with relative ease to do larger bases.

Running a few examples:

$ python fourier.py 624
HC_36
HT_35
IC_34
IU_33
JG_32
K4_31
143_23
1B4_20
440_12
4444_5

$ python fourier.py 65535
1EKF_36
1IHF_35
1MNH_34
1R5U_33
1VVV_32
2661_31
2COF_30
2JQO_29
2RGF_28
38O6_27
3IOF_26
44LA_25
B44F_18
14640_15
4044120_5

$ python fourier.py 3
3 is four-prime!

A few quirks: since it prefers lower bases, bases that merely match the best fouriness seen so far will also be printed, despite having equal fouriness. I've decided to call values that have no representation containing a '4' digit "four-prime"; that's probably a rare occurrence, but the program handles it okay.

Generalization of the algorithm is certainly possible, and basically requires changing the digit test in fourier (the `(t % base) == 4` comparison) to match different digit values. For example, a hypothetical "Three"ier transform would compare against 3 rather than 4.
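A minimal sketch of that generalization (hypothetical helper names; it keeps the original's lower-base preference, but unlike the original it only counts bases where the digit actually appears, so a digit-free value returns -1):

```python
def digit_count(x, base, digit):
    """Count occurrences of `digit` among the base-`base` digits of x."""
    n = 0
    while x:
        n += (x % base) == digit
        x //= base
    return n

def niest_base(x, b, digit):
    """Search bases b down to 2 for the representation of x with the
    most copies of `digit`; return -1 if no base contains it at all."""
    best, best_base = 0, -1
    for base in range(b, 1, -1):
        count = digit_count(x, base, digit)
        if count and count >= best:  # prefer lower bases, as above
            best, best_base = count, base
    return best_base
```

With digit=4 this reproduces the fouriest result: niest_base(624, 36, 4) gives 5, matching 4444_5 above.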

There’s a rather interesting discussion of the topic over on Reddit, as well as a few other implementations. (Thanks to Merth for pointing those out to me.)

Of Cable Modems and the Dumb Solution

I was studying in Japan last semester (which explains somewhat why I haven't posted anything interesting here in a while). That's a whole different set of things to blog about, which I'll get to at some point with any luck (maybe I'll just force myself to write one post per day for a bit, even though these things tend to take at least a few hours each to write).

Background

At any rate, back in Houghton I live with a few roommates in an apartment served by Charter internet service (which I believe is currently DOCSIS 2). The performance tends to be quite good (the speeds they quote for the service seem to be guaranteed minimums, unlike most other ISPs'), but I like to have complete control over my firewall and routing.

In the past, such freedom has been achieved through my trusty WRT54GL, but the 4-megabyte Flash chip in that device makes it hard to fit in a configuration that includes IPv6 support, which is increasingly important to me. As I had an Intel Atom-based board sitting around some time ago, I decided to turn that into a full-time router/firewall running pfSense. The power available with pfSense is probably overkill for my needs, but it ensures I’ll be able to stay up to date and potentially do fancy things with my network configuration at some future date.

Returning to the matter at hand: the whole system was working just fine for a while, but then I got a report from my roommates that the internet connection had stopped working, though it came up okay with a bargain-basement consumer router (a Linksys/Cisco E900). From what information I was able to get from my roommates, it sounded like a hardware failure in the secondary network card, which is used for the WAN uplink (not exactly surprising, since it's a 100-megabit PCI Ethernet card I pulled out of something else some time ago).

Debugging

On my recent return to the apartment, one of my priorities was getting the pfSense system up and running again as the main router/firewall. While the E900 was performing fine, pfSense allows me to get a few additional things out of the connection. Most notably, Charter provide a 6rd relay for ISP-provided IPv6 (compared to something like the public IPv6 tunnel service available from Hurricane Electric), which is quite desirable to me.

After performing a basic test, the pfSense box did indeed fail to get a public IP address from Charter when put in place as the primary gateway. At that point, I decided to break out a network analyzer (Wireshark in this case) and see how the DHCP solicitations on the WAN interface differed between the E900 and my pfSense configuration. What follows is Wireshark’s dissection of a single DHCP Discover message from each system.

Linksys E900:

Ethernet II, Src: Micro-St_60:86:0c (8c:89:a5:60:86:0c), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Source: Micro-St_60:86:0c (8c:89:a5:60:86:0c)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 0.0.0.0 (0.0.0.0), Dst: 255.255.255.255 (255.255.255.255)
Version: 4
Differentiated Services Field: 0x10 (DSCP 0x04: Unknown DSCP; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
0001 00.. = Differentiated Services Codepoint: Unknown (0x04)
.... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
Total Length: 328
Identification: 0x0000 (0)
Flags: 0x00
Fragment offset: 0
Time to live: 128
Protocol: UDP (17)
Source: 0.0.0.0 (0.0.0.0)
Destination: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: bootpc (68), Dst Port: bootps (67)
Source port: bootpc (68)
Destination port: bootps (67)
Length: 308
Checksum: 0x9918 [validation disabled]
Bootstrap Protocol
Message type: Boot Request (1)
Hardware type: Ethernet
Hops: 0
Transaction ID: 0x1a5f4329
Seconds elapsed: 0
Bootp flags: 0x8000 (Broadcast)
1... .... .... .... = Broadcast flag: Broadcast
.000 0000 0000 0000 = Reserved flags: 0x0000
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Server host name not given
Boot file name not given
Option: (53) DHCP Message Type
Length: 1
DHCP: Discover (1)
Option: (12) Host Name
Length: 10
Host Name: Needlecast
Option: (55) Parameter Request List
Length: 4
Parameter Request List Item: (1) Subnet Mask
Parameter Request List Item: (3) Router
Parameter Request List Item: (15) Domain Name
Parameter Request List Item: (6) Domain Name Server
Option: (61) Client identifier
Length: 7
Hardware type: Ethernet
Option: (255) End
Option End: 255


pfSense 2.0.2:

Ethernet II, Src: 3com_8a:b9:6b (00:50:da:8a:b9:6b), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Source: 3com_8a:b9:6b (00:50:da:8a:b9:6b)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 0.0.0.0 (0.0.0.0), Dst: 255.255.255.255 (255.255.255.255)
Version: 4
Differentiated Services Field: 0x10 (DSCP 0x04: Unknown DSCP; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
0001 00.. = Differentiated Services Codepoint: Unknown (0x04)
.... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
Total Length: 328
Identification: 0x0000 (0)
Flags: 0x00
Fragment offset: 0
Time to live: 16
Protocol: UDP (17)
Source: 0.0.0.0 (0.0.0.0)
Destination: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: bootpc (68), Dst Port: bootps (67)
Source port: bootpc (68)
Destination port: bootps (67)
Length: 308
Checksum: 0x3a68 [validation disabled]
Bootstrap Protocol
Message type: Boot Request (1)
Hardware type: Ethernet
Hops: 0
Transaction ID: 0x06303c2b
Seconds elapsed: 0
Bootp flags: 0x0000 (Unicast)
0... .... .... .... = Broadcast flag: Unicast
.000 0000 0000 0000 = Reserved flags: 0x0000
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Server host name not given
Boot file name not given
Option: (53) DHCP Message Type
Length: 1
DHCP: Discover (1)
Option: (61) Client identifier
Length: 7
Hardware type: Ethernet
Option: (12) Host Name
Length: 7
Host Name: pfSense
Option: (55) Parameter Request List
Length: 8
Parameter Request List Item: (1) Subnet Mask
Parameter Request List Item: (2) Time Offset
Parameter Request List Item: (121) Classless Static Route
Parameter Request List Item: (3) Router
Parameter Request List Item: (15) Domain Name
Parameter Request List Item: (6) Domain Name Server
Parameter Request List Item: (12) Host Name
Option: (255) End
Option End: 255


(Apologies to anybody who finds the above ugly, but I only have so much patience for CSS while blogging.)

There are a few differences there, none of which seem really harmful. Given it was working without incident before, however, I guessed that maybe some upstream configuration had changed and become buggy. In particular, I thought that either the BOOTP broadcast flag (the "Bootp flags" field in each dissection) needed to be set for some reason, or the upstream DHCP server was choking on some of the parameters pfSense was requesting.

In an effort to pin down the problem, I manually made some DHCP requests with dhclient configured to match what I was seeing from the E900. The configuration I used with dhclient looked like this (where xl0 is the identifier BSD assigns to my WAN interface):

interface "xl0" {
send host-name "Needlecast";
send dhcp-client-identifier 1:8c:89:a5:60:86:0c;
}


This yielded packets that, when examined in Wireshark, only differed by some of the hardware addresses and the BOOTP broadcast flag. At that point I was rather stuck. Newer releases of dhclient support an option to force the broadcast flag in requests, but FreeBSD (which pfSense is derived from) does not provide a new enough version to have that option, and I didn’t want to try building it myself. In addition, I know that my ISP doesn’t lock connections to MAC addresses, so I shouldn’t have to spoof the MAC address of the E900 (indeed, nothing needed to be changed when switching from pfSense to the E900, so the other direction shouldn’t need anything special).
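For reference, newer ISC dhclient releases do accept a statement to force the broadcast flag; had a newer build been available on pfSense, the test configuration would have looked something like the following (the bootp-broadcast-always statement name is from ISC's dhclient.conf(5); verify it against the version actually in use):

```
interface "xl0" {
    send host-name "Needlecast";
    send dhcp-client-identifier 1:8c:89:a5:60:86:0c;
    bootp-broadcast-always;
}
```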

Since I was stuck, it was time to start doing things that seemed increasingly unlikely. One comment on the pfSense forum related to a similar issue mentioned that cable modems tend to be simple DOCSIS-to-Ethernet bridges, so there’s some sort of binding to the client MAC address in the upstream DOCSIS machinery, which rebooting the modem should reset. So I hooked everything up normally, cycled power to the modem and booted up pfSense, and…

…it worked.

I had spent a few evenings working on the problem, and the fix was that simple. I was glad it was finally working so I could reconfigure internet-y goodness (QoS, DDNS updating, 6rd tunneling, VPN) on it, but there was certainly also frustration mixed in there.

Lessons

So what’s the lesson? I suppose we might say that “you’re never too knowledgeable to try rebooting it”. It’s common advice to less savvy users to “try rebooting it”, but I think that’s an oft-neglected solution when more technically-inclined individuals are working on a problem. On the other hand, maybe I’ve just learned some details about DOCSIS systems and the solution in this case happened to be rebooting.

<witty and relevant image goes here>

Better SSL

I updated the site’s SSL certificate to no longer be self-signed. This means that if you use the site over HTTPS, you won’t need to manually accept the certificate, since it is now signed by StartSSL. If you’re interested in doing similar, Ars Technica have a decent walk through the process (though they target nginx for configuration, which may not be useful to those running other web servers).

Rationale

I was reading up on web frameworks available when programming in Haskell earlier today, and I liked the use of domain-specific languages (DSLs) within frameworks such as the routing syntax in Yesod. Compared to how routes are specified in Django (as a similar example that I’m already familiar with), the DSL is both easier to read (because it doesn’t need to be valid code in the hosting language) and faster (since it ends up getting compiled into the application as properly executable code).

A pattern I find myself using rather often in Python projects is to have a small module (usually called config) that encapsulates an INI-style configuration file. It feels like an ugly solution though, since it generally just exports a ConfigParser instance. Combined with consideration of DSLs in Haskell, that got me thinking: what if there were an easier way that made INI configuration files act like Python source such that they could just be imported and have the contents of the file exposed as simple Python types (thus hiding some unnecessary complexity)?

Implementation

I was aware of Python’s import hook mechanisms and figured they would be a good way to approach this problem; it also ended up being a good excuse to learn more about them. Thus, the following code provides a way to expose INI-style configuration as Python modules. It should be compatible with Python 3 after changing the import of ConfigParser on line 1 to configparser, but I only tested it on Python 2.7.

import ConfigParser, imp, os, sys

class INILoader(object):
    def __init__(self, prefix):
        self.prefix = prefix

    def load_module(self, name):
        if name in sys.modules:
            return sys.modules[name]

        module = imp.new_module(name)
        if name == self.prefix:
            # 'from config import foo' gets config then config.foo,
            # so we need a dummy package.
            module.__package__ = name
            module.__path__ = []
            module.__file__ = __file__
        else:
            # Try to find a .ini file
            module.__package__, _, fname = name.rpartition('.')
            fname += '.ini'
            module.__file__ = fname
            if not os.path.isfile(fname):
                raise ImportError("Could not find a .ini file matching " + name)
            else:
                load_ini(fname, module)

        sys.modules[name] = module
        return module

    def find_module(self, name, path=None):
        if name.startswith(self.prefix):
            return self
        else:
            return None

def load_ini(f, m):
    """Load ini-style file f into module m."""
    cp = ConfigParser.SafeConfigParser()
    cp.read(f)
    for section in cp.sections():
        setattr(m, section, dict(cp.items(section)))

def init(package='config'):
    """Install the ini import hook for the given virtual package name."""
    sys.meta_path.append(INILoader(package))

Most of this code should be fairly easy to follow. The magic of the import hook itself is all in the INILoader class, and exactly how that works is specified in PEP 302.

Usage

So how do you use this? Simply run init(), and any imports from the specified package (config by default) will be resolved from a .ini file rather than an actual Python module. Sections in a file are exposed as dictionaries under the module.

An example is much more informative than the preceding short description, so here’s one. I put the code on my Python path as INIImport.py and created foo.ini with the following contents:

[cat]
sound=meow
[dog]
sound=woof
[cow]
sound=moo


It has three sections, each describing an animal. Now I load up a Python console and use it:

>>> import INIImport
>>> INIImport.init()
>>> from config import foo
>>> foo.cat
{'sound': 'meow'}
>>> foo.dog['sound']
'woof'


This has the same semantics as a normal Python module, so it can be reloaded or aliased just like any other module:

>>> import config.foo
>>> foo == config.foo
True
>>> reload(config.foo)
<module 'config.foo' from 'foo.ini'>


The ability to reload this module is particularly handy, because my normal configuration module approach doesn’t provide an easy way to reload the file.

Improvements, Limitations

Some additional improvements come to mind if I were to release this experiment as production-quality code. Notably, additional path manipulations for finding .ini files would be useful, such as taking a path argument to init() supplying a set of directories to search within. Having a way to remove the import hook that it installs would also be good, and straightforward to implement. There's currently no way to get all the sections in a module, so it would also be useful to export the sections somehow, perhaps by having the module support the mapping protocol (so all the sections could be retrieved with module.items(), for example).
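The section-export idea can be sketched quite simply (Python 3 spelling here; this is a standalone sketch rather than a drop-in patch to the module above, and the sections attribute is my own invention):

```python
import configparser

def load_ini(path, module):
    """Load the ini-style file at path into module, exposing each section
    as a dict and recording the section names so callers can enumerate them."""
    cp = configparser.ConfigParser()
    cp.read(path)
    for section in cp.sections():
        setattr(module, section, dict(cp.items(section)))
    # Hypothetical addition: export the section names on the module
    module.sections = list(cp.sections())
```

A caller could then iterate over module.sections to discover what the file defines, without knowing the names in advance.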

The main limitation of this scheme is that it has no way to determine the desired type of loaded configuration values, so everything is a string. This is a typical limitation when using the ConfigParser module, but compared to a more robust configuration scheme such as defining objects in a Python file (as Django does), it might be an annoying loss of expressiveness. The values can always be coerced to the required type when retrieving them, but that's a bit of unnecessary extra code in whatever uses the configuration.
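One way to paper over that limitation is a best-effort coercion at retrieval time (a hypothetical helper, not part of the module above):

```python
def coerce_value(s):
    """Best-effort conversion of a configuration string to a richer type:
    try int, then float, then boolean keywords, falling back to the string."""
    for conv in (int, float):
        try:
            return conv(s)
        except ValueError:
            pass
    lowered = s.lower()
    if lowered in ('true', 'yes', 'on'):
        return True
    if lowered in ('false', 'no', 'off'):
        return False
    return s
```

This keeps the coercion in one place, at the cost of guessing types that a schema-based configuration system would state explicitly.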

It may also be useful to provide a way to write configuration back to a file when modifying a config module, but my simplistic implementation makes no attempt at such. Doing so would not be terribly difficult, just involving some wrapper objects to handle member assignment for sections and items, then providing a mechanism for saving the current values back to the original file.

Postlude

This made for an interesting experiment, and it should be a handy example for how to implement import hooks in Python. You may use this code freely within your own work, but I’d appreciate if you leave a note here that it was useful, and link back to this post.

I got a… "fun" e-mail from MediaFire a few weeks ago, saying that one of my files had been suspended due to suspected copyright infringement.

fb-hitler? Oh, that’s some Free Software I wrote. I disputed the claim, simply stating that fb-hitler.tar.bz2 is a piece of software that I created (and thus own the copyright to). As of tonight, I’ve heard nothing back about it, and the file is still inaccessible. Here’s the link to it, for future reference:

http://www.mediafire.com/?mhnmnjztyn3 (.tar.bz2, 477 KB)

And here’s the complete message I got. Notice it somehow got pulled in by somebody looking to get links to Dragonball Z downloads removed, and that the link to fb-hitler itself isn’t even in the (absurdly long) list of URLs given.

Dear MediaFire User:

MediaFire has received notification under the provisions of the Digital Millennium Copyright Act ("DMCA") that your usage of a file is allegedly infringing on the file creator's copyright protection.

The file named fb-hitler.tar.bz2 is identified by the key (mhnmnjztyn3).

As a result of this notice, pursuant to Section 512(c)(1)(C) of the DMCA, we have suspended access to the file.

The reason for suspension was:

Information about the party that filed the report:

Company Name: LeakID
Contact Address: 15 bis rue de chateaudun, 02250 La garenne colombes, France
Contact Name: Hervé Lemaire
Contact Phone:
Contact Email: herve.lemaire@leakid.com

If you feel this suspension was in error, please submit a counterclaim by following the process below.

Step 1. Click on the following link to open the counterclaim webpage.

Step 2. Use this PIN on the counterclaim webpage to begin the process:

[removed by Tari]

Step 3. Fill in the fields on the counterclaim form with as much detail as possible.

This is a post-only mailing. Replies to this message are not monitored or answered.

So, what of it? In theory, the DMCA is pretty reasonable (discounting the criminalization of DRM circumvention). The safe harbor provisions for hosts are worthwhile, and the takedown process (that is, sending a request to the host) is reasonable. The problem is that it's been twisted: there's supposed to be a penalty for requesting the takedown of items that the requester does not own the copyright to, in order to deter trolls. In practice there is no penalty, and rightsholders go around freely demanding the removal of just about anything, with no repercussions. The automated systems at most service providers only worsen the problem (understandably, since the hosts have little ability to fight the underlying policy that produces these demands), because rightsholders can spew all kinds of takedown demands with minimal effort.

For those subject to these takedown demands, it’s unfair because many hosts will deactivate users’ accounts when they receive too many demands for removal of items uploaded by a given user, even if the user proves they have the right to upload the content. For example, YouTube suspends accounts after three copyright-related incidents, no matter the outcome.

To my mind, this situation is unacceptable, and it reeks of the “old media” clutching at straws to prop up an outdated business model, to the detriment of everyone else. As recent history has shown, legal force has little effect on the economic problem of media piracy, and utterly fails to address the economics that lead to this phenomenon (sounds a bit like the US government’s war on drugs, actually..).

Going forward, I would support greatly reducing the length of copyright terms (to somewhere around 20 years, perhaps). While I can’t comment much on what exactly that means to rightsholders and their profits (though I have little sympathy for them, whatever the situation is, due to such things as seen in this post), it would be hugely useful to anybody concerned with preserving history (whom I count myself among), because the time required before something can legally be reproduced without the creator’s consent is greatly reduced. With shorter time spans, it is much less likely that any given piece of content will be lost forever, which is the ultimate result that should be avoided.

Enough of a rant regarding my position on copyright, though. The real point here is that I was annoyed by a spurious copyright claim on something I created, and I will be avoiding MediaFire for my future file storage needs (not that I ever used them for much).

MAX5214 Eval Board

I caught on to a promotion from AVNet last week, in which one may get a free MAX5214 eval board (available through August 31), and hopped on it because really, why would I turn down free hardware? I promptly forgot about it until today, when a box arrived from AVNet.

What’s on the board

The board features four Maxim ICs:

• MAX8510: a small low-power LDO. Not terribly interesting.
• MAXQ622: a 16-bit microcontroller with USB. I didn't even know Maxim make microcontrollers!
• MAX5214: a 14-bit SPI DAC. The most interesting part.
• MAX6133: a precision 3V LDO (provides the supply for the DAC).

The MAXQ622 micro (U2) is connected to a USB mini-B port for data, and USB also supplies power for the 5V rail.  The MAX8510 (U4) supplies power for the microcontroller and also the MAX6133 (U3).  The microcontroller acts as a USB bridge to the MAX5214 DAC (U1), which it communicates with over SPI.  The SPI signals are also broken out to a 4-pin header (J6).

Software

The software included with the board is fairly straightforward, providing a small variety of waveforms that can be generated. It’s best demonstrated photographically, as below. Those familiar with National Instruments’ LabView environment will probably recognize that this interface is actually just a LabView VI (Virtual Instrument).

Hacking

Rather more interesting than the stock software is the possibility of reprogramming the microcontroller. Looking at the board photos, we can see that there’s a header that breaks out the JTAG signals. With the right tools, it shouldn’t be very difficult to write a custom firmware to open up a communication protocol to the device (perhaps change its device class to a USB CDC for easier interfacing). Reprogramming the device requires some sort of JTAG adapter, but I can probably make a Bus Pirate do the job.

With some custom software, this could become a handy little function generator- its precision is good and it has a handy USB connection. On the downside, the slew rate on the DAC is not anything special (0.5V/µs, -3dB bandwidth is 100 kHz), and its output current rating is pretty pathetic (5 mA typical). With a unity-gain amplifier on the output though, it could easily drive decent loads and act as a handy low-cost waveform generator. Let’s get hacking?
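To make the generator idea concrete, here is the voltage-to-code math a custom firmware (or host-side tool) would need. This assumes an ideal unipolar transfer function against the MAX6133's 3V reference; the actual transfer function and code alignment should be checked against the MAX5214 datasheet.

```python
def volts_to_code(v, vref=3.0, bits=14):
    """Map a desired output voltage onto a DAC code, assuming an ideal
    unipolar transfer function: code = v / vref * (2**bits - 1)."""
    if not 0.0 <= v <= vref:
        raise ValueError("voltage out of range")
    return round(v / vref * ((1 << bits) - 1))
```

At 14 bits over 3V, one code step is about 183 µV, which is plenty of resolution for a hobby-grade waveform generator.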