OPTIMIZE ALL THE IMAGES. Losslessly optimize the JPEG encoding and make the files progressive (for nice incremental decoding). Do the same for PNGs: make them interlaced and tweak the encoding parameters for better compression.

 find -iname '*.jpg' -printf '%P,%s\n' -exec jpegtran -o -progressive -copy comments -outfile {} {} \; > delta.csv
 find -iname '*.png' -size +256k -printf '%P,%s\n' -exec parallel optipng -i 1 -o 9 ::: {} + >> delta.csv

Each find command also records the files it considers, plus their original sizes, in delta.csv so the changes can be analyzed afterwards.
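Where GNU find's -printf is not available, the recording step could be approximated in Python. This is a minimal sketch and not the command used above: record_sizes is a hypothetical helper, it does not reproduce the -size +256k filter applied to PNGs, and it writes rows with the csv module so that file names containing commas are quoted. It has to run before the optimizers touch the files.

```python
import csv
import os

def record_sizes(root, extensions, out_path):
    """Write (relative path, size in bytes) rows for matching files under root.

    Hypothetical helper; returns the number of files recorded.
    """
    exts = tuple(e.lower() for e in extensions)
    count = 0
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f)
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                # Case-insensitive extension match, similar to find's -iname.
                if not filename.lower().endswith(exts):
                    continue
                full = os.path.join(dirpath, filename)
                # Path relative to the starting point, similar to find's %P.
                writer.writerow([os.path.relpath(full, root), os.path.getsize(full)])
                count += 1
    return count
```

A file written this way should be read back with csv.reader rather than by splitting on commas.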

In [51]:
import os
deltas = []

with open('delta.csv', 'r') as f:
    for line in f:
        name, _, size = line.partition(',')
        size = int(size)
        newsize = os.stat(name).st_size
        deltas.append((name, size, newsize))

deltas

[('forum/images/smiles/dcs7_chevron.png', 922, 592),
 ...]
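One caveat with the path,size records: a file name that itself contains a comma would be split in the wrong place by partition. A small sketch of a more defensive parser, assuming the size is always the last field (parse_delta_line is a hypothetical helper, and names containing newlines would still break it):

```python
def parse_delta_line(line):
    """Split one 'path,size' record at the last comma.

    Hypothetical helper; returns (name, size_in_bytes).
    """
    name, sep, size = line.rstrip('\n').rpartition(',')
    if not sep:
        raise ValueError('no comma in record: {!r}'.format(line))
    return name, int(size)
```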

Find empty files (they screw up shrinkage computation) and filter them out. They probably shouldn't exist, but there were a few in my dataset.

In [52]:
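# Files that were empty before or after optimization, kept for inspection.
empty = [name for (name, orig, new) in deltas if orig == 0 or new == 0]
# Drop files whose original size is zero; they would cause a division by zero below.
deltas = [d for d in deltas if d[1] != 0]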
print('Recorded', len(deltas), 'non-empty files')

Recorded 3102 non-empty files


In [53]:
original_sizes = [orig for (_, orig, _) in deltas]
final_sizes = [new for (_, _, new) in deltas]
shrinkage = [orig - new for (_, orig, new) in deltas]

pct_total_change = 100 * (sum(original_sizes) - sum(final_sizes)) / sum(original_sizes)

# Per-file reduction as a fraction of the original size (scaled to percent when printed).
pct_change = [saved / orig for (saved, orig) in zip(shrinkage, original_sizes)]
avg_pct_change = 100 * sum(pct_change) / len(pct_change)

print('Total size reduction:', sum(shrinkage), 'bytes ({}%)'.format(round(pct_total_change, 2)))
avg = sum(shrinkage) / len(shrinkage)
print('Average reduction per file:', avg, 'bytes ({}%)'.format(round(avg_pct_change, 2)))

Total size reduction: 12719421 bytes (15.9%)
Average reduction per file: 4100.393617021276 bytes (22.15%)
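The two percentages differ because they answer different questions. The total figure weights every file by its original size, while the per-file average counts every file equally, so many small files that shrink a lot pull the average up. A minimal sketch with made-up sizes (the numbers are illustrative only and do not come from the dataset):

```python
# Hypothetical (original, new) sizes in bytes: one large file that barely
# shrinks and two small files that shrink by half.
sample = [(1000000, 950000), (1000, 500), (2000, 1000)]

total_orig = sum(orig for orig, _ in sample)
total_saved = sum(orig - new for orig, new in sample)

# Size-weighted: total bytes saved over total original bytes.
pct_total = 100 * total_saved / total_orig

# Unweighted: mean of the per-file percentages.
pct_per_file = [100 * (orig - new) / orig for orig, new in sample]
pct_mean = sum(pct_per_file) / len(pct_per_file)

print(round(pct_total, 2), round(pct_mean, 2))
```

Here the size-weighted reduction comes out at about 5.13% while the unweighted mean is 35%, the same kind of gap as between the 15.9% and 22.15% above.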


In [54]:
idx = shrinkage.index(max(shrinkage))
print('Best single-file reduction', shrinkage[idx], 'bytes in', deltas[idx][0])
idx = shrinkage.index(min(shrinkage))
print('Worst single-file reduction', shrinkage[idx], 'bytes in', deltas[idx][0])
idx = pct_change.index(max(pct_change))
print('Largest fractional change', round(100 * pct_change[idx], 2), 'percent in', deltas[idx][0])

Best single-file reduction 350682 bytes in img/misc/sniperpwn.png
Worst single-file reduction 0 bytes in forum/templates/Cemetech6/images/green/table_top.png
Largest fractional change 98.78 percent in play/media/img/misc/inv-bg.png
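Since the JPEGs and PNGs went through different tools, it may also be worth splitting the totals by file type. A possible sketch, assuming deltas holds (name, original size, new size) tuples as built earlier; reduction_by_extension is a hypothetical helper and was not part of the original analysis:

```python
import os
from collections import defaultdict

def reduction_by_extension(deltas):
    """Map lower-cased extension -> (file count, bytes saved, percent of original bytes)."""
    totals = defaultdict(lambda: [0, 0, 0])  # count, original bytes, new bytes
    for name, orig, new in deltas:
        ext = os.path.splitext(name)[1].lower()
        entry = totals[ext]
        entry[0] += 1
        entry[1] += orig
        entry[2] += new
    return {
        ext: (count, orig - new, 100 * (orig - new) / orig)
        for ext, (count, orig, new) in totals.items()
        if orig > 0  # skip groups with no original bytes to avoid dividing by zero
    }
```

Calling reduction_by_extension(deltas) would give one size-weighted percentage per extension, comparable to the overall total rather than to the per-file average.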
