Fuzzy Compression


I recenty programmed a very strange and almost worthless image compressor for fun.

The image on the left is 230400 bytes and when reduced to greyscale takes up 76800 bytes.  The image on the right was reconstructed from 3072 bytes… thats 4% of the original size.  The compression time was somewhere in the region of an hour running on a single core.  It is very very lossy.

The process is fairly cool however.  The compressed image is reconstructed by performing 512 random walks over the screen, each leaving a pixel trail behind.  If the start locations and random number seeds for each random walk is computed correctly then the final result can be trained to produce any desired image.  The training I used for this demo was a very basic genetic algorythm / progressive refinement technique.  I imagine if left to its own devices to learn for a few days, the output image could look much cleaner, but I dont have that kind of patience.

The concept has been done before and much better:

see here: http://www.pouet.net/prod.php?which=62917

It seems the creator of this demo managed to fit that into 256 bytes including the unpacker.  Here are some of the things he says about his demo:

There are 64 pseudo random brush strokes of cycling color. Each stroke is shorter than previous to increase details. Intro code generates set of random numbers starting from initial 32bit seed with Galois LFSRs as a PRNG. This set is used to control direction of brush movement and and starting position. Each stroke has initial 16bit value that influences the seed. Therefore data is 4 bytes (initial seed) + 64*2 bytes of stroke PRNG seeds. Picture of Mona Lisa (3072 bytes) was compressed to those random brush strokes by external optimization program (a few days of work for modern CPU/GPU combo). The process can be seen as lossy compression (23x) of picture into random brush movements.

The seeds were found with iterative brute-force approach. Each brush stroke is a layer which covers the previous one, therefore is mostly independent in calculations. For each layer there is a 16bit seed to be found that generates the best picture possible. Therefore there are 64 layers * 65536 possible values per layer and the program iterates through all of them (4 million of combinations – acceptable). This should be done for each initial 32bit seed (means *4 billion combinations) but can be performed in parallel. I’ve checked only a few thousands of randomly chosen initial seeds to find a nice looking picture. Details mask had to be used to multiply difference error on critical parts of the picture (eyes, nose, mouth). Otherwise too many details were put in the background.