Zipnet - a neural image codec entirely in your browser!

Try it out with your own image. A trained neural network will find an efficient encoding for your image, which you can then download. The encoding is lossy (similar to JPEG, but it often looks more natural to the eye). You can then re-upload the encoded image to see the reconstruction. Best of all, the algorithm runs entirely in your browser. No servers involved! You can try it out below. If you need additional instructions or want an explanation of what you see, refer to the Detailed usage section.

Encoding Image

Choose a .png image you want to encode (max size 256x256).

Decoding Image

Upload a Zipnet encoded image. The file should be called something like "zipnet-enc-<timestamp>.bin".

Additional information

Detailed usage

Encoding

The website contains two sections: one for encoding and one for decoding. Under Encoding Image, you can encode an image by selecting the Choose File button. A file dialog should pop up, where you can select your image file. We only accept .png, as this is the most widespread lossless image format. For speed reasons, we strongly recommend a maximum image size of 256x256. You can upload larger files, but proceed with caution!
We also note that Zipnet works best with images of naturally occurring scenes (landscapes, objects, faces, animals...) as opposed to artificial images (screenshots, certain kinds of drawings...). This is simply due to the data it was trained on. We encourage you to try different images to gauge the effect!

Uploading the image puts a preview of the image below the Choose File button. This is still your original image, not the encoded one. To start the encoding process, click Get Encoded.

After the encoding process is finished (which should take around a second), a file download will be prompted. The file's default name should look something like "zipnet-enc-<timestamp>.bin", but you can save it under any name you like. Additionally, some stats about the process are displayed below the Get Encoded button. You can see the original size and the size of the encoded representation (in bytes). After the encoded size, a percentage denotes how large the encoded file is relative to the original image. We also report the bits per pixel (bpp) value, which indicates how many bits the encoded representation takes up per pixel of the original image (a completely uncompressed RGB8 color image has 24 bpp).
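The two reported numbers follow directly from the file sizes and the image dimensions. The following is a minimal sketch of that arithmetic; the function name and the example sizes are made up for illustration and are not taken from the Zipnet code.

```python
def compression_stats(original_bytes, encoded_bytes, width, height):
    """Return the encoded size as a percentage of the original and the bits per pixel.

    Hypothetical helper for illustration; not part of Zipnet itself.
    """
    if original_bytes <= 0 or width <= 0 or height <= 0:
        raise ValueError("sizes and dimensions must be positive")
    # Encoded size relative to the original file, in percent.
    percent_of_original = 100.0 * encoded_bytes / original_bytes
    # 8 bits per byte, spread over every pixel of the original image.
    bpp = 8.0 * encoded_bytes / (width * height)
    return {"percent_of_original": percent_of_original, "bpp": bpp}


# Example with invented numbers: a 256x256 image stored as a 120 kB .png,
# encoded into a 4096-byte file.
stats = compression_stats(original_bytes=120_000, encoded_bytes=4_096,
                          width=256, height=256)
print(f"{stats['percent_of_original']:.2f}% of original, {stats['bpp']:.2f} bpp")
```

As a sanity check, a raw 256x256 RGB8 image takes 256 * 256 * 3 bytes, which gives the 24 bpp mentioned above.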

Decoding

With the encoded file, we can start the decoding process. For this, click the Choose File button in the decoding section. This will prompt you to upload the encoded file. After choosing the file, the decoding process starts immediately. Don't worry if it takes some time (around 10-15 seconds). After the process has finished, you will see a preview of the decoded image below the Choose File button. As we apply a pretty aggressive lossy compression scheme, the decoded image will lose some quality compared to the original. You can then download the decoded image with the Get Decoded button.

The decoding and encoding processes are of course decoupled. You can encode an image, close the website, and visit a day later to decode your image without problems. We do not guarantee that encoded images will be decodable until the end of time: we are still actively working on improving the process, and publishing a new version of the website can render older encoded images unrecoverable.
There is however an advantage to encoding and decoding the image in one sitting. As we have access to the original in this case, we can provide some additional statistics. The decoding process then shows two decoded images: our method on the left and, on the right, a JPEG image that was encoded at a very similar bitrate. Keep in mind: a higher bitrate means a larger encoded size, which typically results in a higher-quality decompressed image. Most of the time, the JPEG image looks a lot worse (despite having almost the same encoded file size!). We confirm this via some stats below the images: in addition to the bpp for both methods, we report the Peak Signal-to-Noise Ratio (PSNR). This measures how closely the decoded image matches the original; a higher PSNR means less deviation.
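For readers unfamiliar with PSNR, here is a minimal sketch of the standard definition for 8-bit pixel values. It assumes both images are given as flat, equally long sequences of values in the range 0 to 255. How Zipnet itself computes the value (for example per channel or over all channels at once) is not stated here, so treat this as an illustration of the metric rather than the site's implementation.

```python
import math


def psnr(original, reconstruction, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel sequences."""
    if len(original) != len(reconstruction) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel is off by exactly one intensity level.
original = [0, 128, 255, 64]
decoded = [1, 127, 254, 65]
print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a difference of a few dB between two codecs at the same bpp is already clearly visible.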

What is a neural image codec?

Codec is short for coder-decoder: an algorithm that encodes and decodes data. The vast majority of internet traffic consists of images and videos, which are largely compressed using classical, lossy codecs (JPEG, MPEG...). These have been hand-crafted to ensure good compression performance (the compressed data is small) and low distortion (the image still looks similar to the original after decompression). In the past years, Machine Learning has popped up in all kinds of different places, and compression algorithms are one of its more recent fields of inquiry. In contrast to hand-crafted classical codecs, a neural codec uses a neural network to learn how to best compress the data while keeping distortion low.

Why is this interesting?

As mentioned above, the majority of internet traffic is encoded image and video data. Recent research has shown that neural codecs have the potential to greatly increase compression performance while producing even more natural-looking reconstructions (no more JPEG artifacts!). Greater compression performance means reduced data size. This results in reduced network traffic and can thus save hardware costs, reduce energy usage and increase network performance.

What does Zipnet add to this?

One of the biggest drawbacks of current neural codecs is their computational cost. Machine Learning research often focuses on performance on certain benchmarks. To achieve this, the networks often get very big and require expensive hardware, like GPUs costing upwards of 1000€. Thus, most Machine Learning services that are offered through a website are powered by large servers. This is not feasible for compression algorithms: the point of compression is to transfer the small encoded file, so a server cannot do the decoding job for an end user. Zipnet aims to bridge this gap between research and application. We provide a neural codec that can run in any web browser on any reasonably modern hardware in an acceptable time.

I want to know all the details!

The following is some more technical information about the project. If you want to dig deeper, make sure to check out our GitHub repository!

Implementation

We built an architecture that closely matches "End-to-End optimized image compression", 2017, Ballé et al. published at ICLR. This architecture has been described in later papers as the (Fully) Factorized Prior Model, in contrast to the hierarchical models that have been built upon it. The hierarchical models achieve better compression performance, but the Factorized Model still outperforms JPEG and uses significantly fewer weights, which makes it faster. The model functions similarly to a Variational Auto-Encoder, with a slight twist: in parallel to optimizing for reconstruction performance, we also learn a probability distribution over the latent state, which can later be used with modern entropy coders to compress the latent state into a very small binary file. The full training objective is then a trade-off between reconstruction performance and the size of the encoded image.
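To make the trade-off concrete, here is a minimal sketch of a rate-distortion objective in its generic form, rate plus a weighted distortion. The rate is estimated as the ideal code length an entropy coder could reach, the sum of -log2(p) over the latent symbols, where p is the probability the learned distribution assigns to each symbol. The function names, the weight lam and the toy numbers are assumptions for illustration; the exact loss and scaling used to train Zipnet may differ.

```python
import math


def rate_bits(likelihoods):
    """Ideal code length in bits for symbols with the given model probabilities."""
    return sum(-math.log2(p) for p in likelihoods)


def mean_squared_error(original, reconstruction):
    """Distortion term: average squared difference per pixel value."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)


def rate_distortion_loss(likelihoods, original, reconstruction, num_pixels, lam):
    """Generic objective: bits per pixel plus lam times the distortion.

    lam is a hypothetical trade-off weight: a larger lam favours quality,
    a smaller lam favours a smaller encoded file.
    """
    bpp = rate_bits(likelihoods) / num_pixels
    distortion = mean_squared_error(original, reconstruction)
    return bpp + lam * distortion


# Toy example: a 4-pixel grayscale "image" and six latent symbols.
loss = rate_distortion_loss(
    likelihoods=[0.5, 0.25, 0.25, 0.5, 0.5, 0.5],
    original=[0.0, 1.0, 0.5, 0.5],
    reconstruction=[0.0, 0.5, 0.5, 0.5],
    num_pixels=4,
    lam=16.0,
)
print(loss)
```

The better the learned distribution predicts the latent symbols, the higher their probabilities and the fewer bits the entropy coder needs.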

We implemented the model architecture in Python as well as in Rust. In Python, we were able to use the CompressAI compression library, which we always used to train the models. In Rust, no such libraries existed, so we rewrote all the compression components from scratch. This had to be done in pure Rust, as otherwise we would lose the simple WASM compatibility. We have only written the forward pass of the networks in Rust and then imported the weights we obtained from Python. This had the nice side effect of forcing us to write an implementation that is 100% compliant with PyTorch and CompressAI, which has been confirmed through rigorous testing. Along the way, we also published a crate for convolutions in Rust: convolutions-rs.
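To give an idea of what a forward-only convolution involves, here is a naive single-channel sketch in Python. It follows the PyTorch convention of cross-correlation (the kernel is not flipped) with no padding. It is an illustration only, not the API or the implementation of convolutions-rs, and it omits channels, padding and bias.

```python
def conv2d_valid(image, kernel, stride=1):
    """Naive single-channel 2D cross-correlation without padding.

    image and kernel are lists of rows; illustrative helper only.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    image_h, image_w = len(image), len(image[0])
    out_h = (image_h - kernel_h) // stride + 1
    out_w = (image_w - kernel_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            acc = 0.0
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = conv2d_valid(image, kernel)
print(result)
```

A reference implementation this simple is mostly useful for testing: a faster implementation can be checked against it on small inputs before comparing against PyTorch on real weights.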