Parsing GIFs in Rust with nom

It all started a couple of months ago. I was out in a shopping mall with a friend, and they noticed a music box playing at a temporary pop-up stand (it was Valentine’s Day) and remarked that they don’t make them like they used to.

That gave me the idea of making a virtual music box page! At first I thought of putting a 3D model on it, then realised that making it 2D would a) be easier to code and b) allow me to play animated GIFs on it, which you can’t do in the real world! (Unless, I suppose, you have a sheet of glass on a music box with some fancy lighting tricks, but that feels kind of gimmicky.)

I also had a list of a few of my favourite songs to transcribe on Musescore, using the MusicBox “instrument”, that I thought would sound good on a music box. I did think of having the audio generated by the webpage (so you can have a virtual “keyboard” of sorts) but unfortunately I don’t have much experience with audio synthesis and also I couldn’t find any good libraries to do that anyway, so I went with exporting my tunes I transcribed to mp3s.

Because I’m a masochist, I thought I’d make this in WebGL instead of using one of the many, many JavaScript libraries that would help me do this, since I don’t like bloat and npm. This isn’t actually as bad as it sounds: I do have experience with OpenGL and figured WebGL wouldn’t be that far off.

So the game plan is simple:

  1. Have a spinning music box model
  2. Add a GIF texture on it
  3. Add a winding button to wind up the music box and start rotating it and playing the audio.

Unfortunately, even though there are a shit ton of Web APIs (Really! There’s WebGL, WebGPU, WebBluetooth, WebUSB, there’s even a Speech Synthesizer API on browsers!), there apparently isn’t one for animated GIFs. The closest I found was playing a webp video and using some JavaScript to get individual frames of the video, which I could then pass to WebGL to use as a texture. But that won’t do, will it?

Fortunately, the GIF file format is fairly simple, and I had already worked on a GIF file parser back in my university days, written in Python and C++ (I wrote the compression code in C++ for performance reasons).

GIF Specification: https://www.w3.org/Graphics/GIF/spec-gif89a.txt

This time, I decided to use Rust and compile it to WebAssembly so I could use it on webpages. Why? No reason in particular. I later found out that there were already WASM GIF parsers, written with Google’s wuffs library, which apparently is “ridiculously fast” and memory-safe.

After a bit of digging, I believe Google Chrome / Chromium also uses wuffs to parse GIFs in its skia library, so unlike libwebp, I don’t think a deep dive into the GIF file format will uncover any 0-days. Oh well, this was still a fun project at least.

I used the nom library, which is a parser combinator library.

The concept of parser combinators isn’t unique to this Rust library; there is, for example, Boost’s Spirit library for C++. But I wanted to try my hand at a simple Rust project, as I had only worked on simpler projects or followed the Rust book’s project before.

From nom’s GitHub page:

Instead of writing the grammar in a separate file and generating the corresponding code, you use very small functions with very specific purpose, like “take 5 bytes”, or “recognize the word ‘HTTP’”, and assemble them in meaningful patterns like “recognize ‘HTTP’, then a space, then a version”. The resulting code is small, and looks like the grammar you would have written with other parser approaches.

This was a really nice structure to work with, as I can, for example, write a function that parses a color table and re-use it later on when parsing other parts of the file.

I also really like it because it lets me easily separate the different parts of the file format parsing.
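To make that re-use a bit more concrete, here is a rough, dependency-free sketch. It is not the code from my crate, and it deliberately avoids nom so it compiles on its own; the only thing it borrows from nom is the convention of returning the remaining input alongside the parsed value. The names `Rgb`, `ParseResult`, `table_entries` and `parse_color_table` are made up for illustration.

```rust
/// One RGB entry of a GIF colour table (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rgb {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

/// nom-style result shape: the remaining input plus the parsed value.
pub type ParseResult<'a, T> = Result<(&'a [u8], T), &'static str>;

/// Number of table entries encoded by the low 3 bits of a packed
/// descriptor byte: 2^(n + 1).
pub fn table_entries(packed: u8) -> usize {
    1usize << ((packed & 0b0000_0111) + 1)
}

/// Parses `entries` consecutive RGB triplets. The same function works for
/// both the global table and a frame's local table; only the entry count
/// (taken from a different descriptor) changes.
pub fn parse_color_table(bytes: &[u8], entries: usize) -> ParseResult<'_, Vec<Rgb>> {
    let len = entries * 3;
    if bytes.len() < len {
        return Err("truncated colour table");
    }
    let (table, rest) = bytes.split_at(len);
    let colours = table
        .chunks_exact(3)
        .map(|c| Rgb { r: c[0], g: c[1], b: c[2] })
        .collect();
    Ok((rest, colours))
}
```

A caller would read the packed byte from the logical screen descriptor (or the image descriptor), pass it through `table_entries`, and hand the count to `parse_color_table`.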

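For context, this is what the GIF file format looks like at a very high level: a header, a logical screen descriptor, an optional global colour table, one or more frames, and finally a trailer byte.

And this is how it looks in the relevant Rust code, which is pretty similar.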

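impl GifFile {
    /// Parses a complete GIF file: header, logical screen descriptor,
    /// global color table, one or more frames, then the trailer byte.
    pub fn new(bytes: &[u8]) -> Result<GifFile, &'static str> {
        // Every GIF ends with a single 0x3B trailer byte.
        const TRAILER: &[u8] = &[0x3B];
        // Map parse failures to an error instead of panicking with unwrap().
        let (bytes, header) = parse_header(bytes).map_err(|_| "invalid header")?;
        let (bytes, logical_screen_descriptor) = parse_logical_screen_descriptor(bytes)
            .map_err(|_| "invalid logical screen descriptor")?;
        let (bytes, global_color_table) =
            parse_global_color_table(bytes, &logical_screen_descriptor)
                .map_err(|_| "invalid global color table")?;
        let (bytes, frames) = many1(parse_frame)(bytes).map_err(|_| "no valid frames")?;
        let (bytes, _) = tag::<&[u8], &[u8], nom::error::Error<&[u8]>>(TRAILER)(bytes)
            .map_err(|_| "missing trailer")?;
        // Nothing may follow the trailer.
        let (_, _) = eof::<&[u8], nom::error::Error<&[u8]>>(bytes)
            .map_err(|_| "unexpected data after trailer")?;
        Ok(GifFile {
            header,
            logical_screen_descriptor,
            global_color_table,
            frames,
        })
    }
}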

The parse_frame function follows the same pattern.

fn parse_frame(bytes: &[u8]) -> IResult<&[u8], GifFrame> {
    let (bytes, extensions) = parse_extensions(bytes)?;
    let (bytes, image_descriptor) = parse_image_descriptor(bytes)?;
    let (bytes, local_color_table) = parse_local_color_table(bytes, &image_descriptor)?;
    let (bytes, frame_indices) = parse_image_data(bytes)?;
    Ok((
        bytes,
        GifFrame {
            image_descriptor,
            local_color_table,
            frame_indices,
            extensions,
        },
    ))
}

If you’re wondering, the “lower level” parsers like parse_image_descriptor just read the bytes and use functions like le_u16, which reads a little-endian 16-bit unsigned integer, for example for the width or height of the image.
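For a feel of what such a low-level parser does, here is a minimal sketch written without nom so it stands on its own. The `le_u16` below is a hand-rolled stand-in for nom’s function of the same name, and the `ImageDescriptor` struct and its field names are assumptions for illustration, not the types from my crate. The byte layout (a 0x2C separator, four little-endian u16 values, then a packed flags byte) follows the GIF89a specification.

```rust
/// Fields of a GIF image descriptor (hypothetical struct for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub struct ImageDescriptor {
    pub left: u16,
    pub top: u16,
    pub width: u16,
    pub height: u16,
    pub has_local_color_table: bool,
    pub interlaced: bool,
}

/// nom-style result shape: the remaining input plus the parsed value.
type ParseResult<'a, T> = Result<(&'a [u8], T), &'static str>;

/// Hand-rolled stand-in for nom's `le_u16`: reads two bytes, low byte first.
fn le_u16(bytes: &[u8]) -> ParseResult<'_, u16> {
    match bytes {
        [lo, hi, rest @ ..] => Ok((rest, u16::from_le_bytes([*lo, *hi]))),
        _ => Err("expected 2 more bytes"),
    }
}

/// Parses the 10-byte image descriptor that starts every frame's image.
pub fn parse_image_descriptor(bytes: &[u8]) -> ParseResult<'_, ImageDescriptor> {
    // The descriptor always starts with the image separator byte 0x2C.
    let bytes = match bytes {
        [0x2C, rest @ ..] => rest,
        _ => return Err("expected image separator 0x2C"),
    };
    let (bytes, left) = le_u16(bytes)?;
    let (bytes, top) = le_u16(bytes)?;
    let (bytes, width) = le_u16(bytes)?;
    let (bytes, height) = le_u16(bytes)?;
    let (&packed, bytes) = bytes.split_first().ok_or("missing packed byte")?;
    Ok((
        bytes,
        ImageDescriptor {
            left,
            top,
            width,
            height,
            // Bit 7: a local colour table follows. Bit 6: rows are interlaced.
            has_local_color_table: (packed & 0b1000_0000) != 0,
            interlaced: (packed & 0b0100_0000) != 0,
        },
    ))
}
```

Each step shadows `bytes` with whatever input is left, which is the same threading pattern the nom-based code uses.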

Writing the parser was probably the easiest part. The GIF file format uses a variant of LZW to compress the pixel data, which is stored as “indices” into a colour table embedded in the file (Fun fact: A frame can have at most 256 colours in it! That’s why GIFs are so low quality, and why you shouldn’t use them.). Luckily, it is a pretty simple compression algorithm, and I had written code to compress and decompress it before in Python and C++, which I ported over to Rust.
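For reference, a compact decoder for GIF-style LZW might look roughly like the sketch below. This is not the implementation from my repository, just a minimal version of the standard algorithm; the function name `lzw_decode` is hypothetical, and it assumes the caller has already concatenated the payloads of the image data sub-blocks into `data`. It keeps the table as a `Vec<Vec<u8>>` for readability rather than speed.

```rust
/// Decodes GIF-flavoured LZW data into colour-table indices.
///
/// `min_code_size` is the byte that precedes the image data sub-blocks and
/// `data` is the payload of those sub-blocks concatenated together.
pub fn lzw_decode(min_code_size: u8, data: &[u8]) -> Result<Vec<u8>, &'static str> {
    if !(2..=8).contains(&min_code_size) {
        return Err("minimum code size must be between 2 and 8");
    }
    let clear_code = 1usize << min_code_size;
    let end_code = clear_code + 1;
    let initial_code_size = u32::from(min_code_size) + 1;

    // Rebuilds the table: one entry per literal index, plus two placeholder
    // slots so that table positions line up with the clear and end codes.
    let reset = |table: &mut Vec<Vec<u8>>| {
        table.clear();
        table.extend((0..clear_code).map(|i| vec![i as u8]));
        table.push(Vec::new()); // clear code
        table.push(Vec::new()); // end-of-information code
    };

    let mut table: Vec<Vec<u8>> = Vec::new();
    reset(&mut table);
    let mut code_size = initial_code_size;
    let mut previous: Option<usize> = None;
    let mut output = Vec::new();
    let (mut bit_buffer, mut bit_count) = (0u32, 0u32);

    for &byte in data {
        // Codes are packed least-significant-bit first.
        bit_buffer |= u32::from(byte) << bit_count;
        bit_count += 8;
        while bit_count >= code_size {
            let code = (bit_buffer & ((1u32 << code_size) - 1)) as usize;
            bit_buffer >>= code_size;
            bit_count -= code_size;

            if code == clear_code {
                reset(&mut table);
                code_size = initial_code_size;
                previous = None;
                continue;
            }
            if code == end_code {
                return Ok(output);
            }
            let entry = match previous {
                // The first code after a clear must be a plain literal.
                None if code < clear_code => table[code].clone(),
                None => return Err("first code after clear is not a literal"),
                Some(prev) => {
                    let mut new_entry = table[prev].clone();
                    let entry = if code < table.len() {
                        table[code].clone()
                    } else if code == table.len() {
                        // Code not in the table yet: previous entry plus
                        // the previous entry's own first byte.
                        let mut e = new_entry.clone();
                        e.push(new_entry[0]);
                        e
                    } else {
                        return Err("code refers past the end of the table");
                    };
                    // The table is capped at 4096 entries (12-bit codes).
                    if table.len() < 4096 {
                        new_entry.push(entry[0]);
                        table.push(new_entry);
                    }
                    entry
                }
            };
            output.extend_from_slice(&entry);
            previous = Some(code);
            // Widen the codes once the table fills the current range.
            if table.len() == (1usize << code_size) && code_size < 12 {
                code_size += 1;
            }
        }
    }
    // Tolerate a missing end code and return what was decoded so far.
    Ok(output)
}
```

The part that tends to bite is the `code == table.len()` branch: the encoder is allowed to emit a code one step before the decoder has added it to its own table.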

I still had many bugs to squash, mostly silly errors on my part that took longer to fix than they should have (to name a few: an off-by-one error that completely broke everything, a down-casting bug that made me very confused for many hours, my assumption that a global color table would always be present, and JavaScript implicitly converting an “undefined” value (that was undefined because of a typo) into a 0). These are mostly bugs that I would have caught quicker than I did if I wasn’t so “relaxed”. But I’ll focus on one that is relevant to file format parsing.

Curiously, there was a bug where the first frame would sometimes be messed up.

Subsequent frames, though, would be more or less discernible.

Why was that? It wasn’t happening with all GIFs in my testing, only some of them.

I printed out some debug info about my GIF, and found that in the very first frame, there was an “offset” in the x and y coordinates.

This was completely breaking everything because I had assumed that at the very least, the first frame would contain all the necessary pixel data from the entire image.
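One way to avoid relying on that assumption is to always draw each frame onto a canvas the size of the logical screen, at the frame’s own offset. The sketch below illustrates the idea and is not the exact fix in my code; `FrameRect`, `blit_frame` and the parameter names are hypothetical, and disposal methods are ignored for brevity.

```rust
/// Position and size of a frame within the logical screen
/// (hypothetical type for this sketch).
pub struct FrameRect {
    pub left: usize,
    pub top: usize,
    pub width: usize,
    pub height: usize,
}

/// Copies a frame's colour indices onto an RGBA canvas that covers the
/// whole logical screen, honouring the frame's x/y offset. Pixels using the
/// transparent index are skipped, so whatever was on the canvas shows through.
pub fn blit_frame(
    canvas: &mut [[u8; 4]],
    screen_width: usize,
    rect: &FrameRect,
    indices: &[u8],
    palette: &[[u8; 3]],
    transparent: Option<u8>,
) -> Result<(), &'static str> {
    if screen_width == 0 || canvas.len() % screen_width != 0 {
        return Err("canvas length is not a multiple of the screen width");
    }
    let screen_height = canvas.len() / screen_width;
    if rect.left + rect.width > screen_width || rect.top + rect.height > screen_height {
        return Err("frame does not fit inside the logical screen");
    }
    if indices.len() != rect.width * rect.height {
        return Err("index count does not match the frame size");
    }
    for (i, &index) in indices.iter().enumerate() {
        if transparent == Some(index) {
            continue;
        }
        let [r, g, b] = *palette
            .get(index as usize)
            .ok_or("colour index out of range")?;
        // Translate frame-local coordinates into canvas coordinates.
        let x = rect.left + i % rect.width;
        let y = rect.top + i / rect.width;
        canvas[y * screen_width + x] = [r, g, b, 255];
    }
    Ok(())
}
```

With this shape, a first frame that covers only part of the image simply leaves the rest of the canvas untouched instead of shifting or stretching everything.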

I think it is interesting that because of the way Rust is structured, for example with the match expression, it really forces you to think of some edge-case scenarios and catch them. For example, what if there isn’t a “colour table” specified, but there is a frame referencing it? Maybe in some other languages, it would happily do a NULL dereference and crash (assuming you’re using a pointer to the “current colour table” and, since there isn’t any, it defaults to a NULL value). But in safe Rust, since references can never be null, you’ll probably wrap it in an Option<ColourTable> and be forced to match it against Some(table) or None and think of all the possible cases.
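As a small illustration of that point, picking the table a frame should use could be sketched like this. The `ColourTable` alias and the `active_colour_table` function are made up for this example and are not taken from my crate.

```rust
/// A colour table as a list of RGB triplets (hypothetical alias).
pub type ColourTable = Vec<[u8; 3]>;

/// Picks the table a frame should use: its local table if it has one,
/// otherwise the global table. The compiler insists that the case where
/// neither exists is handled explicitly.
pub fn active_colour_table<'a>(
    local: Option<&'a ColourTable>,
    global: Option<&'a ColourTable>,
) -> Result<&'a ColourTable, &'static str> {
    match (local, global) {
        (Some(table), _) => Ok(table),
        (None, Some(table)) => Ok(table),
        (None, None) => Err("frame has no local colour table and the file has no global one"),
    }
}
```

Deleting the last arm makes the match non-exhaustive, which is a compile error rather than a crash at runtime.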

Another bug I ran into was an unusual “extension” in a GIF I received from a friend.

The file format specifies the Graphic Control Extension (in practice the only one you really have to handle, since it carries the frame delay and transparency information), a “Comment” extension for metadata that in practice isn’t used, and a “Plain Text” extension that is supposed to overlay the GIF file with some text on it (though I don’t think any GIF viewer uses this). Matthew Flickinger’s “What’s in a GIF” article, which really dives deep into the file format, also talks about a Netscape extension that you might see, but I found another one, “ICCRGBG1012”.
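For comparison with the unknown extension, the Graphic Control Extension has a small fixed layout in the GIF89a specification, so it can be matched with a single slice pattern. The sketch below is std-only and is not the parser from my crate; the `GraphicControl` struct and `parse_graphic_control` function are hypothetical names.

```rust
/// Decoded fields of a Graphic Control Extension (hypothetical struct).
#[derive(Debug, PartialEq, Eq)]
pub struct GraphicControl {
    pub disposal_method: u8,
    /// Frame delay in hundredths of a second.
    pub delay_centiseconds: u16,
    pub transparent_index: Option<u8>,
}

/// Parses one Graphic Control Extension: introducer 0x21, label 0xF9,
/// block size 4, packed flags, little-endian delay, transparent colour
/// index, and a zero terminator.
pub fn parse_graphic_control(bytes: &[u8]) -> Result<(&[u8], GraphicControl), &'static str> {
    match bytes {
        [0x21, 0xF9, 0x04, packed, delay_lo, delay_hi, index, 0x00, rest @ ..] => {
            let packed = *packed;
            // Bit 0 says whether the transparent index is meaningful.
            let has_transparency = (packed & 0b0000_0001) != 0;
            Ok((
                rest,
                GraphicControl {
                    // Bits 2 to 4 hold the disposal method.
                    disposal_method: (packed >> 2) & 0b0000_0111,
                    delay_centiseconds: u16::from_le_bytes([*delay_lo, *delay_hi]),
                    transparent_index: if has_transparency { Some(*index) } else { None },
                },
            ))
        }
        _ => Err("not a well-formed graphic control extension"),
    }
}
```

Note that the transparent index is only valid when the flag bit is set; treating the byte as meaningful unconditionally is an easy way to get wrong backgrounds.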

Some light googling led me to this page https://www.color.org/wpaper2.xalter where they say it is an extension meant for embedding an ICC colour profile in the GIF.

Interestingly, I saw an issue on the Moodle issue tracker about warnings being emitted for GIFs with ICC profiles. The team seemed to deem it a non-issue and said the warning is good because the profile supposedly bloats the file size a lot.

I downloaded the GIF through Facebook Messenger, and my friend told me she made the GIF with Procreate. So my theory? Either Procreate does some weird messing with the GIF file format, or Facebook does some weird stuff to the GIF. I’m not sure which. (Worth pointing out that the GIF has a bug in it with a flashing black background that I talk about later, but it displayed correctly on Messenger…)

I’m inclined to believe it’s Procreate that embeds the ICC Color Profile into the GIF (which is not officially part of the specification; it’s an application-specific thing, see above). I can’t imagine Facebook would want to spend processing power and waste storage space making an uploaded GIF larger than necessary when post-processing it. But an image editing/creating/sketching application?

It definitely makes sense to embed a color profile in there: it’s targeted towards artists who would use software that parses and uses color profiles.

I’m not sure if it was related, but with this GIF I kept getting a flashing black background instead of a transparent one. I manually checked the file again, and my parser was indeed reading it correctly, but the background wasn’t actually pure black. For some reason, every second frame used RGB(3,2,2) as the background colour, even though that wasn’t specified as the transparent colour. I verified this by opening the GIF in another image viewer (Gwenview), which showed the same “bug”.

I’m not sure why exactly this happened. Maybe the colour profile extension tells you to replace this specific colour; I didn’t look into the extension, so I don’t know how it works. Either way, a hard-coded patch replacing RGB(3,2,2) with a transparent colour fixed it for me.

But anyway, I just skipped over any extension I didn’t have code to parse. Later on, I found this wiki page that listed many more extensions, for example one used by ImageMagick, others used to store XMP metadata, and a couple of others. But I haven’t looked for or found any GIFs with those extensions yet.
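Skipping an unknown extension is easy because every extension uses the same framing: the 0x21 introducer, a label byte, then a chain of length-prefixed sub-blocks ending with a zero-length block. A minimal std-only sketch of that (the name `skip_extension` is hypothetical, and this is not the nom-based code from my crate) could look like:

```rust
/// Skips one extension block without interpreting it and returns the
/// remaining input together with the extension's label byte.
pub fn skip_extension(bytes: &[u8]) -> Result<(&[u8], u8), &'static str> {
    let (label, mut rest) = match bytes {
        [0x21, label, rest @ ..] => (*label, rest),
        _ => return Err("expected extension introducer 0x21"),
    };
    loop {
        // Each sub-block starts with its length; a length of 0 ends the chain.
        let (&size, after_size) = rest.split_first().ok_or("missing sub-block size")?;
        let size = usize::from(size);
        if size == 0 {
            return Ok((after_size, label));
        }
        if after_size.len() < size {
            return Err("truncated sub-block");
        }
        rest = &after_size[size..];
    }
}
```

For the “ICCRGBG1012” block, the label is 0xFF (an application extension) and the first sub-block holds those 11 identifier bytes, with the profile data in the sub-blocks that follow.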

So after I got my GIF parsing code working, I made a separate crate that uses my GIF parsing crate as a library, converts the result to a WebGL texture, and displays it on my web page! This MDN article was particularly helpful for the Rust to WASM part, as was their WebGL tutorial.

(Credit to the creator of the GIF. I downloaded this GIF a long time ago from giphy, and can’t find it again.)

This isn’t actually my music box site, as I’m keeping that private; this is just a simple rotating GIF on a WebGL texture. And I’m aware this can probably be accomplished much more easily with something like CSS’s rotate3d.

But that isn’t the point. It’s just to illustrate that the parser handled the GIF file correctly. In practice, perhaps you’ll use the GIF as a texture on one of the faces of an actual 3D model, which you can’t accomplish with some CSS.

You can find my GIF parsing code here (TIP: I made a branch for converting a GIF to a PPM file for debugging, which also might fix a few bugs for some GIF files that I forgot to merge back to master) and the Spinning GIF thing here (the crate from the previous repo should be in the parent directory of this repository; you can compile to WASM with wasm-pack build --target web).

Hope you learnt something about GIF files in the process! All I’ve learnt is that it is an archaic file format with many problems that honestly shouldn’t be used (and I think most services now use mp4s or APNGs that pretend to be GIFs, but the vernacular is still to call a video with no audio a “GIF”). But I think it’s still an important part of Internet history, as it is such a widespread file format that anyone on the Internet knows about.

Naavin Ravinthran
Computer Science Graduate

My interests include cybersecurity, osdev, and graphics programming.