Quick and dirty web image optimization

Given a large pile of images that nominally live on a web server, I want to make them smaller and more friendly to serve to clients. This is hardly novel: for example, Google offer detailed advice on reducing the size of images for the web. I have mostly JPEG and PNG images, so, jpegtran and optipng are the tools of choice for bulk lossless compression. To locate and compress images, I’ll use GNU find and parallel to invoke those tools. For JPEGs I take a simple approach, preserving comment tags and creating a progressive JPEG (which can be displayed at a reduced resolution before the entire image has been downloaded).

Read More…

sax-ng

Over on Cemetech, we’ve long had an embedded chat widget called “SAX” (“Simultaneous Asynchronous eXchange”). It behaves kind of like a traditional shoutbox, in that registered users can use the SAX widget to chat in near-real-time. There is also a bot that relays messages between the on-site widget and an IRC channel, which we call “saxjax”. The implementation of this, however, was somewhat lacking in efficiency. It was first implemented around mid-2006, and saw essentially no updates until just recently. The following is a good example of how dated the implementation was: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 // code for Mozilla, etc if (window.

Read More…

Web history archival and WARC management

I’ve been a sort of ‘rogue archivist’ along the lines of the Archive Team for some time, but generally lack the combination of motivation and free time to directly take part in their activities. That said, I do sometimes go on bursts of archival since these things do concern me; it’s just a question of when I’ll get manic enough to be useful and latch onto an archival task as the one to do. An earlier public example is when I mirrored ticalc.org. The historical record contains plenty of instances where people maintained copies of their communications or other documentation which has proven useful to study, and in the digital world the same is likely to be true.

Read More…

Claude Shannon hates this one weird trick!

There was a question posted to /r/AskComputerScience recently: “does this compression scheme look fishy to you?”. The algorithm in question, called “press” by its author, makes some.. bold claims in the README: By stringing together 1/0s, using 1’s as negative and a 0 as a positive, I’ve sublimely made a perfect compression. Through outputting into hexadecimal. This shortens <=4096 ‘bits’ into a 2-byte hexadecimal. I know enough to immediately scoff at these claims, and if somebody hadn’t specifically asked about it, I would have left it there and not thought twice. Since somebody asked, however, I thought it would be fun to give this a thorough examination.

Read More…

HodorCSE

Localization of software, while not trivial, is not a particularly novel problem. Where it gets more interesting is in resource-constrained systems, where your ability to display strings is limited by display resolution and memory limitations may make it difficult to include multiple localized copies of any given string in a single binary. All of this is then on top of the usual (admittedly slight in well-designed systems) difficulty in selecting a language at runtime and maintaining reasonably readable code. This all comes to mind following discussion of providing translations of Doors CSE, a piece of software for the TI-84+ Color Silver Edition1 that falls squarely into the “embedded software” category.

Read More…

"A Sufficiently Smart Compiler"

On a bit of a lark today, I decided to see if I could get Spasm running in a web browser via Emscripten. I was successful, but found that something seemed to be optimizing out most of main() such that I had to hack in my own main function that performed the same critical functions and (for the sake of simplicity) hard-coded the relevant command-line options. Looking into the problem a bit further, I observed that not all of main() was being removed; there was one critical line left in. The beginning of the function in source and the generated code were as follows.

Read More…

Reverse-engineering Ren'py packages

Some time ago (September 3, 2013, apparently), I had just finished reading Analogue: A Hate Story (which I highly recommend, by the way) and was particularly taken with the art. At that point it seems my engineer’s instincts kicked in and it seemed reasonable to reverse-engineer the resource archives to extract the art for my own nefarious purposes. Yeah, I really got into Analogue. That's all of the achievements. A little examination of the game files revealed a convenient truth: it was built with Ren’Py, a (open-source) visual novel engine written in Python. Python is a language I’m quite familiar with, so the actual task promised to be well within my expertise.

Read More…

GStreamer's playbin, threads and queueing

I’ve been working on a project that uses GStreamer to play back audio files in an automatically-determined order. My implementation uses a playbin, which is nice and easy to use. I had some issues getting it to continue playback on reaching the end of a file, though. According to the documentation for the about-to-finish signal, This signal is emitted when the current uri is about to finish. You can set the uri and suburi to make sure that playback continues. This signal is emitted from the context of a GStreamer streaming thread. Because I wanted to avoid blocking a streaming thread under the theory that doing so might interrupt playback (the logic in determining what to play next hits external resources so may take some time), my program simply forwarded that message out to be handled in the application’s main thread by posting a message to the pipeline’s bus.

Read More…

Newlib's git repository

Because I had quite the time finding it when I wanted to submit a patch to newlib, there’s a git mirror of the canonical CVS repository for newlib, which all the patches I saw on the mailing list were based off of. Maybe somebody else looking for it will find this note useful: git clone git://sourceware.org/git/newlib.git See also: the original mailing list announcement of the mirror’s availability.

Matrioshka brains and IPv6: a thought experiment

Nich (one of my roommates) mentioned recently that discussion in his computer networking course this semester turned to IPv6 in a recent session, and we spent a short while coming up with interesting ways to consider the size of the IPv6 address pool. Assuming 2^128 available addresses (an overestimate since some number of them are reserved for certain uses and are not publicly routable), for example, there are more IPv6 addresses than there are (estimated) grains of sand on Earth by a factor of approximately 3 × 10^14 (Wolfram|Alpha says there are between 10^20 and 10^24 grains of sand on Earth).

Read More…