Federico's Blog

  1. My gdk-pixbuf braindump

    - gdk-pixbuf, gnome

    I want to write a braindump on the stuff that I remember from gdk-pixbuf's history. There is some talk about replacing it with something newer; hopefully this history will show some things that worked, some that didn't, and why.

    The beginnings

    Gdk-pixbuf started as a replacement for Imlib, the image loading and rendering library that GNOME used in its earliest versions. Imlib came from the Enlightenment project; it provided an easy API around the idiosyncratic libungif, libjpeg, libpng, etc., and it maintained decoded images in memory with a uniform representation. Imlib also worked as an image cache for the Enlightenment window manager, which made memory management very inconvenient for GNOME.

    Imlib worked well as a "just load me an image" library. It showed that a small, uniform API to load various image formats into a common representation was desirable. And in those days, hiding all the complexities of displaying images in X was very important indeed.

    The initial API

    Gdk-pixbuf replaced Imlib, and added two important features: reference counting for image data, and support for an alpha channel.

    Gdk-pixbuf appeared with support for RGB(A) images. And although in theory it was possible to grow the API to support other representations, GdkColorspace never acquired anything other than GDK_COLORSPACE_RGB, and the bits_per_sample argument to some functions only ever supported being 8. The presence or absence of an alpha channel was done with a gboolean argument in conjunction with that single GDK_COLORSPACE_RGB value; we didn't have something like cairo_format_t which actually specifies the pixel format in single enum values.

    While all the code in gdk-pixbuf carefully checks that those conditions are met — RGBA at 8 bits per channel —, some applications inadvertently assume that that is the only possible case, and would get into trouble really fast if gdk-pixbuf ever started returning pixbufs with different color spaces or depths.

    One can still see the battle between bilevel-alpha vs. continuous-alpha in this enum:

    typedef enum
    {
            GDK_PIXBUF_ALPHA_BILEVEL,
            GDK_PIXBUF_ALPHA_FULL
    } GdkPixbufAlphaMode;
    

    Fortunately, only the "render this pixbuf with alpha to an Xlib drawable" functions take values of this type: before the Xrender days, it was a Big Deal to draw an image with alpha to an X window, and applications often opted to use a bitmask instead, even if they had jagged edges as a result.

    Pixel formats

    The only pixel format that ever got implemented was unpremultiplied RGBA on all platforms. Back then I didn't understand premultiplied alpha! Also, the GIMP followed that scheme, and copying it seemed like the easiest thing.

    After gdk-pixbuf, libart also copied that pixel format, I think.

    But later we got Cairo, Pixman, and all the Xrender stack. These prefer premultiplied ARGB. Moreover, Cairo prefers it if each pixel is actually a 32-bit value, with the ARGB values inside it in platform-endian order. So if you look at a memory dump, a Cairo pixel looks like BGRA on a little-endian box, while it looks like ARGB on a big-endian box.

    Every time we paint a GdkPixbuf to a cairo_t, there is a conversion from unpremultiplied RGBA to premultiplied, platform-endian ARGB. I talked a bit about this in Reducing the number of image copies in GNOME.

    The loading API

    The public loading API in gdk-pixbuf, and its relationship to loader plug-ins, evolved in interesting ways.

    At first the public API and loaders only implemented load_from_file: you gave the library a FILE * and it gave you back a GdkPixbuf. Back then we didn't have a robust MIME sniffing framework in the form of a library, so gdk-pixbuf got its own. This lives in the mostly-obsolete GdkPixbufFormat machinery; it even has its own little language for sniffing file headers! Nowadays we do most MIME sniffing with GIO.

    After the intial load_from_file API... I think we got progressive loading first, and animation support aftewards.

    Progressive loading

    This where the calling program feeds chunks of bytes to the library, and at the end a fully-formed GdkPixbuf comes out, instead of having a single "read a whole file" operation.

    We conflated this with a way to get updates on how the image area gets modified as the data gets parsed. I think we wanted to support the case of a web browser, which downloads images slowly over the network, and gradually displays them as they are downloaded. In 1998, images downloading slowly over the network was a real concern!

    It took a lot of very careful work to convert the image loaders, which parsed a whole file at a time, into loaders that could maintain some state between each time that they got handed an extra bit of buffer.

    It also sounded easy to implement the progressive updating API by simply emitting a signal that said, "this rectangular area got updated from the last read". It could handle the case of reading whole scanlines, or a few pixels, or even area-based updates for progressive JPEGs and PNGs.

    The internal API for the image format loaders still keeps a distinction between the "load a whole file" API and the "load an image in chunks". Not all loaders got redone to simply just use the second one: io-jpeg.c still implements loading whole files by calling the corresponding libjpeg functions. I think it could remove that code and use the progressive loading functions instead.

    Animations

    Animations: we followed the GIF model for animations, in which each frame overlays the previous one, and there's a delay set between each frame. This is not a video file; it's a hacky flipbook.

    However, animations presented the problem that the whole gdk-pixbuf API was meant for static images, and now we needed to support multi-frame images as well.

    We defined the "correct" way to use the gdk-pixbuf library as to actually try to load an animation, and then see if it is a single-frame image, in which case you can just get a GdkPixbuf for the only frame and use it.

    Or, if you got an animation, that would be a GdkPixbufAnimation object, from which you could ask for an iterator to get each frame as a separate GdkPixbuf.

    However, the progressive updating API never got extended to really support animations. So, we have awkward functions like gdk_pixbuf_animation_iter_on_currently_loading_frame() instead.

    Necessary accretion

    Gdk-pixbuf got support for saving just a few formats: JPEG, PNG, TIFF, ICO, and some of the formats that are implemented with the Windows-native loaders.

    Over time gdk-pixbuf got support for preserving some metadata-ish chunks from formats that provide it: DPI, color profiles, image comments, hotspots for cursors/icons...

    While an image is being loaded with the progressive loaders, there is a clunky way to specify that one doesn't want the actual size of the image, but another size instead. The loader can handle that situation itself, hopefully if an image format actually embeds different sizes in it. Or if not, the main loading code will rescale the full loaded image into the size specified by the application.

    Historical cruft

    GdkPixdata - a way to embed binary image data in executables, with a funky encoding. Nowadays it's just easier to directly store a PNG or JPEG or whatever in a GResource.

    contrib/gdk-pixbuf-xlib - to deal with old-style X drawables. Hopefully mostly unused now, but there's a good number of mostly old, third-party software that still uses gdk-pixbuf as an image loader and renderer to X drawables.

    gdk-pixbuf-transform.h - Gdk-pixbuf had some very high-quality scaling functions, which the original versions of EOG used for the core of the image viewer. Nowadays Cairo is the preferred way of doing this, since it not only does scaling, but general affine transformations as well. Did you know that gdk_pixbuf_composite_color takes 17 arguments, and it can composite an image with alpha on top of a checkerboard? Yes, that used to be the core of EOG.

    Debatable historical cruft

    gdk_pixbuf_get_pixels(). This lets the program look into the actual pixels of a loaded pixbuf, and modify them. Gdk-pixbuf just did not have a concept of immutability.

    Back in GNOME 1.x / 2.x, when it was fashionable to put icons beside menu items, or in toolbar buttons, applications would load their icon images, and modify them in various ways before setting them onto the corresponding widgets. Some things they did: load a colorful icon, desaturate it for "insensitive" command buttons or menu items, or simulate desaturation by compositing a 1x1-pixel checkerboard on the icon image. Or lighten the icon and set it as the "prelight" one onto widgets.

    The concept of "decode an image and just give me the pixels" is of course useful. Image viewers, image processing programs, and all those, of course need this functionality.

    However, these days GTK would prefer to have a way to decode an image, and ship it as fast as possible ot the GPU, without intermediaries. There is all sorts of awkward machinery in the GTK widgets that can consume either an icon from an icon theme, or a user-supplied image, or one of the various schemes for providing icons that GTK has acquired over the years.

    It is interesting to note that gdk_pixbuf_get_pixels() was available pretty much since the beginning, but it was only until much later that we got gdk_pixbuf_get_pixels_with_length(), the "give me the guchar * buffer and also its length" function, so that calling code has a chance of actually checking for buffer overruns. (... and it is one of the broken "give me a length" functions that returns a guint rather than a gsize. There is a better gdk_pixbuf_get_byte_length() which actually returns a gsize, though.)

    Problems with mutable pixbufs

    The main problem is that as things are right now, we have no flexibility in changing the internal representation of image data to make it better for current idioms: GPU-specific pixel formats may not be unpremultiplied RGBA data.

    We have no API to say, "this pixbuf has been modified", akin to cairo_surface_mark_dirty(): once an application calls gdk_pixbuf_get_pixels(), gdk-pixbuf or GTK have to assume that the data will be changed and they have to re-run the pipeline to send the image to the GPU (format conversions? caching? creating a texture?).

    Also, ever since the beginnings of the gdk-pixbuf API, we had a way to create pixbufs from arbitrary user-supplied RGBA buffers: the gdk_pixbuf_new_from_data functions. One problem with this scheme is that memory management of the buffer is up to the calling application, so the resulting pixbuf isn't free to handle those resources as it pleases.

    A relatively recent addition is gdk_pixbuf_new_from_bytes(), which takes a GBytes buffer instead of a random guchar *. When a pixbuf is created that way, it is assumed to be immutable, since a GBytes is basically a shared reference into a byte buffer, and it's just easier to think of it as immutable. (Nothing in C actually enforces immutability, but the API indicates that convention.)

    Internally, GdkPixbuf actually prefers to be created from a GBytes. It will downgrade itself to a guchar * buffer if something calls the old gdk_pixbuf_get_pixels(); in the best case, that will just take ownership of the internal buffer from the GBytes (if the GBytes has a single reference count); in the worst case, it will copy the buffer from the GBytes and retain ownership of that copy. In either case, when the pixbuf downgrades itself to pixels, it is assumed that the calling application will modify the pixel data.

    What would immutable pixbufs look like?

    I mentioned this a bit in "Reducing Copies". The loaders in gdk-pixbuf would create immutable pixbufs, with an internal representation that is friendly to GPUs. In the proposed scheme, that internal representation would be a Cairo image surface; it can be something else if GTK/GDK eventually prefer a different way of shipping image data into the toolkit.

    Those pixbufs would be immutable. In true C fashion we can call it undefined behavior to change the pixel data (say, an app could request gimme_the_cairo_surface and tweak it, but that would not be supported).

    I think we could also have a "just give me the pixels" API, and a "create a pixbuf from these pixels" one, but those would be one-time conversions at the edge of the API. Internally, the pixel data that actually lives inside a GdkPixbuf would remain immutable, in some preferred representation, which is not necessarily what the application sees.

    What worked well

    A small API to load multiple image formats, and paint the images easily to the screen, while handling most of the X awkwardness semi-automatically, was very useful!

    A way to get and modify pixel data: applications clearly like doing this. We can formalize it as an application-side thing only, and keep the internal representation immutable and in a format that can evolve according to the needs of the internal API.

    Pluggable loaders, up to a point. Gdk-pixbuf doesn't support all the image formats in the world out of the box, but it is relatively easy for third-parties to provide loaders that, once installed, are automatically usable for all applications.

    What didn't work well

    Having effectively two pixel formats supported, and nothing else: gdk-pixbuf does packed RGB and unpremultiplied RGBA, and that's it. This isn't completely terrible: applications which really want to know about indexed or grayscale images, or high bit-depth ones, are probably specialized enough that they can afford to have their own custom loaders with all the functionality they need.

    Pluggable loaders, up to a point. While it is relatively easy to create third-party loaders, installation is awkward from a system's perspective: one has to run the script to regenerate the loader cache, there are more shared libraries running around, and the loaders are not sandboxed by default.

    I'm not sure if it's worthwhile to let any application read "any" image format if gdk-pixbuf supports it. If your word processor lets you paste an image into the document... do you want it to use gdk-pixbuf's limited view of things and include a high bit-depth image with its probably inadequate conversions? Or would you rather do some processing by hand to ensure that the image looks as good as it can, in the format that your word processor actually supports? I don't know.

    The API for animations is very awkward. We don't even support APNG... but honestly I don't recall actually seeing one of those in the wild.

    The progressive loading API is awkward. The "feed some bytes into the loader" part is mostly okay; the "notify me about changes to the pixel data" is questionable nowadays. Web browsers don't use it; they implement their own loaders. Even EOG doesn't use it.

    I think most code that actually connects to GdkPixbufLoader's signals only uses the size-prepared signal — the one that gets emitted soon after reading the image headers, when the loader gets to know the dimensions of the image. Apps sometimes use this to say, "this image is W*H pixels in size", but don't actually decode the rest of the image.

    The gdk-pixbuf model of static images, or GIF animations, doesn't work well for multi-page TIFFs. I'm not sure if this is actualy a problem. Again, applications with actual needs for multi-page TIFFs are probably specialized enough that they will want a full-featured TIFF loader of their own.

    Awkward architectures

    Thumbnailers

    The thumbnailing system has slowly been moving towards a model where we actually have thumbnailers specific to each file format, instead of just assuming that we can dump any image into a gdk-pixbuf loader.

    If we take this all the way, we would be able to remove some weird code in, for example, the JPEG pixbuf loader. Right now it supports loading images at a size that the calling code requests, not only at the "natural" size of the JPEG. The thumbnailer can say, "I want to load this JPEG at 128x128 pixels" or whatever, and in theory the JPEG loader will do the minimal amount of work required to do that. It's not 100% clear to me if this is actually working as intended, or if we downscale the whole image anyway.

    We had a distinction between in-process and out-of-process thumbnailers, and it had to do with the way pixbuf loaders are used; I'm not sure if they are all out-of-process and sandboxed now.

    Non-raster data

    There is a gdk-pixbuf loader for SVG images which uses librsvg internally, but only in a very basic way: it simply loads the SVG at its preferred size. Librsvg jumps through some hoops to compute a "preferred size" for SVGs, as not all of them actually indicate one. The SVG model would rather have the renderer say that the SVG is to be inserted into a rectangle of certain width/height, and scaled/positioned inside the rectangle according to some other parameters (i.e. like one would put it inside an HTML document, with a preserveAspectRatio attribute and all that). GNOME applications historically operated with a different model, one of "load me an image, I'll scale it to whatever size, and paint it".

    This gdk-pixbuf loader for SVG files gets used for the SVG thumbnailer, or more accurately, the "throw random images into a gdk-pixbuf loader" thumbnailer. It may be better/cleaner to have a specific thumbnailer for SVGs instead.

    Even EOG, our by-default image viewer, doesn't use the gdk-pixbuf loader for SVGs: it actually special-cases them and uses librsvg directly, to be able to load an SVG once and re-render it at different sizes if one changes the zoom factor, for example.

    GTK reads its SVG icons... without using librsvg... by assuming that librsvg installed its gdk-pixbuf loader, so it loads them as any normal raster image. This kind of dirty, but I can't quite pinpoint why. I'm sure it would be convenient for icon themes to ship a single SVG with tons of icons, and some metadata on their ids, so that GTK could pick them out of the SVG file with rsvg_render_cairo_sub() or something. Right now icon theme authors are responsible for splitting out those huge SVGs into many little ones, one for each icon, and I don't think that's their favorite thing in the world to do :)

    Exotic raster data

    High bit-depth images... would you expect EOG to be able to load them? Certainly; maybe not with all the fancy conversions from a real RAW photo editor. But maybe this can be done as EOG-specific plugins, rather than as low in the platform as the gdk-pixbuf loaders?

    (Same thing for thumbnailing high bit-depth images: the loading code should just provide its own thumbnailer program for those.)

    Non-image metadata

    The gdk_pixbuf_set_option / gdk_pixbuf_get_option family of functions is so that pixbuf loaders can set key/value pairs of strings onto a pixbuf. Loaders use this for comment blocks, or ICC profiles for color calibration, or DPI information for images that have it, or EXIF data from photos. It is up to applications to actually use this information.

    It's a bit uncomfortable that gdk-pixbuf makes no promises about the kind of raster data it gives to the caller: right now it is raw RGB(A) data that is not gamma-corrected nor in any particular color space. It is up to the caller to see if the pixbuf has an ICC profile attached to it as an option. Effectively, this means that applications don't know if they are getting SRGB, or linear RGB, or what... unless they specifically care to look.

    The gdk-pixbuf API could probably make promises: if you call this function you will get SRGB data; if you call this other function, you'll get the raw RGBA data and we'll tell you its colorspace/gamma/etc.

    The various set_option / get_option pairs are also usable by the gdk-pixbuf saving code (up to now we have just talked about loaders). I don't know enough about how applications use the saving code in gdk-pixbuf... the thumbnailers use it to save PNGs or JPEGs, but other apps? No idea.

    What I would like to see

    Immutable pixbufs in a useful format. I've started work on this in a merge request; the internal code is now ready to take in different internal representations of pixel data. My goal is to make Cairo image surfaces the preferred, immutable, internal representation. This would give us a gdk_pixbuf_get_cairo_surface(), which pretty much everything that needs one reimplements by hand.

    Find places that assume mutable pixbufs. To gradually deprecate mutable pixbufs, I think we would need to audit applications and libraries to find places that cause GdkPixbuf structures to degrade into mutable ones: basically, find callers of gdk_pixbuf_get_pixels() and related functions, see what they do, and reimplement them differently. Maybe they don't need to tint icons by hand anymore? Maybe they don't need icons anymore, given our changing UI paradigms? Maybe they are using gdk-pixbuf as an image loader only?

    Reconsider the loading-updates API. Do we need the GdkPixbufLoader::area-updated signal at all? Does anything break if we just... not emit it, or just emit it once at the end of the loading process? (Caveat: keeping it unchanged more or less means that "immutable pixbufs" as loaded by gdk-pixbuf actually mutate while being loaded, and this mutation is exposed to applications.)

    Sandboxed loaders. While these days gdk-pixbuf loaders prefer the progressive feed-it-bytes API, sandboxed loaders would maybe prefer a read-a-whole-file approach. I don't know enough about memfd or how sandboxes pass data around to know how either would work.

    Move loaders to Rust. Yes, really. Loaders are security-sensitive, and while we do need to sandbox them, it would certainly be better to do them in a memory-safe language. There are already pure Rust-based image loaders: JPEG, PNG, TIFF, GIF, ICO. I have no idea how featureful they are. We can certainly try them with gdk-pixbuf's own suite of test images. We can modify them to add hooks for things like a size-prepared notification, if they don't already have a way to read "just the image headers".

    Rust makes it very easy to plug in micro-benchmarks, fuzz testing, and other modern amenities. These would be perfect for improving the loaders.

    I started sketching a Rust backend for gdk-pixbuf loaders some months ago, but there's nothing useful yet. One mismatch between gdk-pixbuf's model for loaders, and the existing Rust codecs, is that Rust codecs generally take something that implements the Read trait: a blocking API to read bytes from abstract sources; it's a pull API. The gdk-pixbuf model is a push API: the calling code creates a loader object, and then pushes bytes into it. The gdk-pixbuf convenience functions that take a GInputStream basically do this:

    loader = gdk_pixbuf_loader_new (...);
    
    while (more_bytes) {
        n_read = g_input_stream_read (stream, buffer, ...);
        gdk_pixbuf_loader_write(loader, buffer, n_read, ...);
    }
    
    gdk_pixbuf_loader_close (loader);
    

    However, this cannot be flipped around easily. We could probably use a second thread (easy, safe to do in Rust) to make the reader/decoder thread block while the main thread pushes bytes into it.

    Also, I don't know how the Rust bindings for GIO present things like GInputStream and friends, with our nice async cancellables and all that.

    Deprecate animations? Move that code to EOG, just so one can look at memes in it? Do any "real apps" actually use GIF animations for their UI?

    Formalize promises around returned color profiles, gamma, etc. As mentioned above: have an "easy API" that returns SRGB, and a "raw API" that returns the ARGB data from the image, plus info on its ICC profile, gamma, or any other info needed to turn this into a "good enough to be universal" representation. (I think all the Apple APIs that pass colors around do so with an ICC profile attached, which seems... pretty much necessary for correctness.)

    Remove the internal MIME-sniffing machinery. And just use GIO's.

    Deprecate the crufty/old APIs in gdk-pixbuf. Scaling/transformation, compositing, GdkPixdata, gdk-pixbuf-csource, all those. Pixel crunching can be done by Cairo; the others are better done with GResource these days.

    Figure out if we want blessed codecs; fix thumbnailers. Link those loaders statically, unconditionally. Exotic formats can go in their own custom thumbnailers. Figure out if we want sandboxed loaders for everything, or just for user-side images (not ones read from the trusted system installation).

    Have GTK4 communicate clearly about its drawing model. I think we are having a disconnect between the GUI chrome, which is CSS/GPU friendly, and graphical content generated by applications, which by default right now is done via Cairo. And having Cairo as a to-screen and to-printer API is certainly very convenient! You Wouldn't Print a GUI, but certainly you would print a displayed document.

    It would also be useful for GTK4 to actually define what its preferred image format is if it wants to ship it to the GPU with as little work as possible. Maybe it's a Cairo image surface? Maybe something else?

    Conclusion

    We seem to change imaging models every ten years or so. Xlib, then Xrender with Cairo, then GPUs and CSS-based drawing for widgets. We've gone from trusted data on your local machine, to potentially malicious data that rains from the Internet. Gdk-pixbuf has spanned all of these periods so far, and it is due for a big change.

  2. Debugging an Rc<T> reference leak in Rust

    - gnome, librsvg, rust

    The bug that caused two brown-paper-bag released in librsvg — because it was leaking all the SVG nodes — has been interesting.

    Memory leaks in Rust? Isn't it supposed to prevent that?

    Well, yeah, but the leaks were caused by the C side of things, and by unsafe code in Rust, which does not prevent leaks.

    The first part of the bug was easy: C code started calling a function implemented in Rust, which returns a newly-acquired reference to an SVG node. The old code simply got a pointer to the node, without acquiring a reference. The new code was forgetting to rsvg_node_unref(). No biggie.

    The second part of the bug was trickier to find. The C code was apparently calling all the functions to unref nodes as appropriate, and even calling the rsvg_tree_free() function in the end; this is the "free the whole SVG tree" function.

    There are these types:

    // We take a pointer to this and expose it as an opaque pointer to C
    pub enum RsvgTree {}
    
    // This is the real structure we care about
    pub struct Tree {
        // This is the Rc that was getting leaked
        pub root: Rc<Node>,
        ...
    }
    

    Tree is the real struct that holds the root of the SVG tree and some other data. Each node is an Rc<Node>; the root node was getting leaked (... and all the children, recursively) because its reference count never went down from 1.

    RsvgTree is just an empty type. The code does an unsafe cast of *const Tree as *const RsvgTree in order to expose a raw pointer to the C code.

    The rsvg_tree_free() function, callable from C, looked like this:

    #[no_mangle]
    pub extern "C" fn rsvg_tree_free(tree: *mut RsvgTree) {
        if !tree.is_null() {
            let _ = unsafe { Box::from_raw(tree) };
                             // ^ this returns a Box<RsvgTree> which is an empty type!
        }
    }
    

    When we call Box::from_raw() on a *mut RsvgTree, it gives us back a Box<RsvgTree>... which is a box of a zero-sized type. So, the program frees zero memory when the box gets dropped.

    The code was missing this cast:

        let tree = unsafe { &mut *(tree as *mut Tree) };
                                     // ^ this cast to the actual type inside the Box
        let _ = unsafe { Box::from_raw(tree) };
    

    So, tree as *mut Tree gives us a value which will cause Box::from_raw() to return a Box<Tree>, which is what we intended. Dropping the box will drop the Tree, reduce the last reference count on the root node, and free all the nodes recursively.

    Monitoring an Rc<T>'s reference count in gdb

    So, how does one set a gdb watchpoint on the reference count?

    First I set a breakpoint on a function which I knew would get passed the Rc<Node> I care about:

    (gdb) b <rsvg_internals::structure::NodeSvg as rsvg_internals::node::NodeTrait>::set_atts
    Breakpoint 3 at 0x7ffff71f3aaa: file rsvg_internals/src/structure.rs, line 131.
    
    (gdb) c
    Continuing.
    
    Thread 1 "rsvg-convert" hit Breakpoint 3, <rsvg_internals::structure::NodeSvg as rsvg_internals::node::NodeTrait>::set_atts (self=0x646c60, node=0x64c890, pbag=0x64c820) at rsvg_internals/src/structure.rs:131
    
    (gdb) p node
    $5 = (alloc::rc::Rc<rsvg_internals::node::Node> *) 0x64c890
    

    Okay, node is a reference to an Rc<Node>. What's inside?

    (gdb) p *node
    $6 = {ptr = {pointer = {__0 = 0x625800}}, phantom = {<No data fields>}}
    

    Why, a pointer to the actual contents of the Rc. Look inside again:

    (gdb) p *node.ptr.pointer.__0
    $9 = {strong = {value = {value = 3}}, weak = {value = {value = 1}},  ... and lots of extra crap ...
    

    Aha! There are the strong and weak reference counts. So, set a watchpoint on the strong reference count:

    (gdb) set $ptr = &node.ptr.pointer.__0.strong.value.value
    (gdb) watch *$ptr
    Hardware watchpoint 4: *$ptr
    

    Continue running the program until the reference count changes:

    (gdb) continue
    Thread 1 "rsvg-convert" hit Hardware watchpoint 4: *$ptr
    
    Old value = 3
    New value = 2
    

    At this point I can print a stack trace and see if it makes sense, check that the refs/unrefs are matched, etc.

    TL;DR: dig into the Rc<T> until you find the reference count, and watch it. It's wrapped in several layers of Rust-y types; NonNull pointers, an RcBox for the actual container of the refcount plus the object it's wrapping, and Cells for the refcount values. Just dig until you reach the refcount values and they are there.

    So, how did I find the missing cast?

    Using that gdb recipe, I watched the reference count of the toplevel SVG node change until the program exited. When the program terminated, the reference count was 1 — it should have dropped to 0 if there was no memory leak.

    The last place where the toplevel node loses a reference is in rsvg_tree_free(). I ran the program again and checked if that function was being called; it was being called correctly. So I knew that the problem must lie in that function. After a little head-scratching, I found the missing cast. Other functions of the form rsvg_tree_whatever() had that cast, but rsvg_tree_free() was missing it.

    I think Rust now has better facilities to tag structs that are exposed as raw pointers to extern code, to avoid this kind of perilous casting. We'll see.

    In the meantime, apologies for the buggy releases!

  3. Logging from Rust in librsvg

    - gnome, librsvg, rust

    Over in this issue we are discussing how to add debug logging for librsvg.

    A popular way to add logging to Rust code is to use the log crate. This lets you sprinkle simple messages in your code:

    error!("something bad happened: {}", foo);
    debug!("a debug message");
    

    However, the log create is just a facade, and by default the messages do not get emitted anywhere. The calling code has to set up a logger. Crates like env_logger let one set up a logger, during program initialization, that gets configured through an environment variable.

    And this is a problem for librsvg: we are not the program's initialization! Librsvg is a library; it doesn't have a main() function. And since most of the calling code is not Rust, we can't assume that they can call code that can initialize the logging framework.

    Why not use glib's logging stuff?

    Currently this is a bit clunky to use from Rust, since glib's structured logging functions are not bound yet in glib-rs. Maybe it would be good to bind them and get this over with.

    What user experience do we want?

    In the past, what has worked well for me to do logging from libraries is to allow the user to set an environment variable to control the logging, or to drop a log configuration file in their $HOME. The former works well when the user is in control of running the program that will print the logs; the latter is useful when the user is not directly in control, like for gnome-shell, which gets launched through a lot of magic during session startup.

    For librsvg, it's probably enough to just use an environment variable. Set RSVG_LOG=parse_errors, run your program, and get useful output. Push button, receive bacon.

    Other options in Rust?

    There is a slog crate which looks promising. Instead of using context-less macros which depend on a single global logger, it provides logging macros to which you pass a logger object.

    For librsvg, this means that the basic RsvgHandle could create its own logger, based on an environment variable or whatever, and pass it around to all its child functions for when they need to log something.

    Slog supports structured logging, and seems to have some fancy output modes. We'll see.

  4. Three big things happening in librsvg

    - gnome, librsvg, rust

    I am incredibly happy because of three big things that are going on in librsvg right now:

    1. Paolo Borelli finished porting all the CSS properties to Rust. What was once a gigantic RsvgState struct in C is totally gone, along with all the janky C code to parse individual properties. The process of porting RsvgState to Rust has been going on since about two months ago, and has involved many multi-commit merge requests and refactorings. This is a tremendous amount of really good work! The result is all in Rust now in a State struct, which is opaque from C's viewpoint. The only places in C that still require accessors to the State are in the filter effects code. Which brings me to...

    2. Ivan Molodetskikh, my Summer of Code student, submitted his first merge request and it's merged to master now. This ports the bookkeeping infrastructure for SVG filters to Rust, and also the feOffset filter is ported now. Right now the code doesn't do anything fancy to iterate over the pixels of Cairo image surfaces; that will come later. I am very happy that filters, which were a huge barrier, are now starting to get chipped away into nicer code.

    3. I have started to move librsvg's old representation of CSS properties into something that can really represent properties that are not specified, or explicitly set to inherit from an SVG element's parent, or set to a normal value. Librsvg never had a representation of property values that actually matched the SVG/CSS specs; it just knew whether a property was specified or not for an element. This worked fine for properties which the spec mandates that they should inherit automatically, but those that don't, were handled through special hacks. The new code makes this a lot cleaner. It should also make it easier to copy Servo's idioms for property inheritance.

  5. Reducing the number of image copies in GNOME

    - gdk-pixbuf, gnome, performance

    Our graphics stack that deals with images has evolved a lot over the years.

    In ye olden days

    In the context of GIMP/GNOME, the only thing that knew how to draw RGB images to X11 windows (doing palette mapping for 256-color graphics cards and dithering if necessary) was the GIMP. Later, when GTK+ was written, it exported a GtkPreview widget, which could take an RGB image buffer supplied by the application and render it to an X window — this was what GIMP plug-ins could use in their user interface to show, well, previews of what they were about to do with the user's images. Later we got some obscure magic in a GdkColorContext object, which helped allocate X11 colors for the X drawing primitives. In turn, GdkColorContext came from the port that Miguel and I did of XmHTML's color context object (and for those that remember, XmHTML became the first version of GtkHtml; later it was rewritten as a port of KDE's HTML widget). Thankfully all that stuff is gone now; we can now assume that video cards are 24-bit RGB or better everywhere, and there is no need to worry about limited color palettes and color allocation.

    Later, we started using the Imlib library, from the Enlightenment project, as an easy API to load images — the APIs from libungif, libjpeg, libpng, etc. were not something one really wanted to use directly — and also to keep images in memory with a uniform representation. Unfortunately, Imlib's memory management was peculiar, as it was tied to Enlightenment's model for caching and rendering loaded/scaled images.

    A bunch of people worked to write GdkPixbuf: it kept Imlib's concepts of a unified representation for image data, and an easy API to load various image formats. It added support for an alpha channel (we only had 1-bit masks before), and it put memory management in the hands of the calling application, in the form of reference counting. GdkPixbuf obtained some high-quality scaling functions, mainly for use by Eye Of Gnome (our image viewer) and by applications that just needed scaling instead of arbitrary transformations.

    Later, we got libart, the first library in GNOME to do antialiased vector rendering and affine transformations. Libart was more or less compatible with GdkPixbuf: they both had the same internal representation for pixel data, but one had to pass the pixels/width/height/rowstride around by hand.

    Mea culpa

    Back then I didn't understand premultiplied alpha, which is now ubiquitous. The GIMP made the decision to use non-premultiplied alpha when it introduced layers with transparency, probably to "avoid losing data" from transparent pixels. GdkPixbuf follows the same scheme.

    (Now that the GIMP uses GEGL for its internal representation of images... I have no idea what it does with respect to alpha.)

    Cairo and afterwards

    Some time after the libart days, we got Cairo and pixman. Cairo had a different representation of images than GdkPixbuf's, and it supported more pixel formats and color models.

    GTK2 got patched to use Cairo in the toplevel API. We still had a dichotomy between Cairo's image surfaces, which are ARGB premultiplied data in memory, and GdkPixbufs, which are RGBA non-premultiplied. There are utilities in GTK+ to do these translations, but they are inconvenient: every time a program loads an image with GdkPixbuf's easy API, a translation has to happen from non-premul RGBA to premul ARGB.

    Having two formats means that we inevitably do translations back and forth of practically the same data. For example, when one embeds a JPEG inside an SVG, librsvg will read that JPEG using GdkPixbuf, translate it to Cairo's representation, composite it with Cairo onto the final result, and finally translate the whole thing back to a GdkPixbuf... if someone uses librsvg's legacy APIs to output pixbufs instead of rendering directly to a Cairo surface.

    Who uses that legacy API? GTK+, of course! GTK+ loads scalable SVG icons with GdkPixbuf's loader API, which dynamically links librsvg at runtime: in effect, GTK+ doesn't use librsvg directly. And the SVG pixbuf loader uses the "gimme a pixbuf" API in librsvg.

    GPUs

    Then, we got GPUs everywhere. Each GPU has its own preferred pixel format. Image data has to be copied to the GPU at some point. Cairo's ARGB needs to be translated to the GPU's preferred format and alignment.

    Summary so far

    • Libraries that load images from standard formats have different output formats. Generally they can be coaxed into spitting ARGB or RGBA, but we don't expect them to support any random representation that a GPU may want.

    • GdkPixbuf uses non-premultiplied RGBA data, always in that order.

    • Cairo uses premultiplied ARGB in platform-endian 32-bit chunks: if each pixel is 0xaarrggbb, then the bytes are shuffled around depending on whether the platform is little-endian or big-endian.

    • Cairo internally uses a subset of the formats supported by pixman.

    • GPUs use whatever they damn well please.

    • Hilarity ensues.

    What would we like to do?

    We would like to reduce the number of translations between image formats along the loading-processing-display pipeline. Here is a plan:

    • Make sure Cairo/pixman support the image formats that GPUs generally prefer. Have them do the necessary conversions if the rest of the program passes an unsupported format. Ensure that a Cairo image surface can be created with the GPU's preferred format.

    • Make GdkPixbuf just be a wrapper around a Cairo image surface. GdkPixbuf is already an opaque structure, and it already knows how to copy pixel data in case the calling code requests it, or wants to turn a pixbuf from immutable to mutable.

    • Provide GdkPixbuf APIs that deal with Cairo image surfaces. For example, deprecate gdk_pixbuf_new() and gdk_pixbuf_new_from_data(), in favor of a new gdk_pixbuf_new_from_cairo_image_surface(). Instead of gdk_pixbuf_get_pixels() and related functions, have gdk_pixbuf_get_cairo_image_surface(). Mark the "give me the pixel data" functions as highly discouraged, and only for use really by applications that want to use GdkPixbuf as an image loader and little else.

    • Remove calls in GTK+ that cause image conversions; make them use Cairo image surfaces directly, from GdkTexture up.

    • Audit applications to remove calls that cause image conversions. Generally, look for where they use GdkPixbuf's deprecated APIs and update them.

    Is this really a performance problem?

    This is in the "excess work" category of performance issues. All those conversions are not really slow (they don't make up for the biggest part of profiles), but they are nevertheless things that we could avoid doing. We may get some speedups, but it's probably more interesting to look at things like power consumption.

    Right now I'm seeing this as a cool, minor optimization, but more as a way to gradually modernize our image API.

    We seem to change imaging models every N years (X11 -> libart -> Cairo -> render trees in GPUs -> ???). It is very hard to change applications to use different APIs. In the meantime, we can provide a more linear path for image data, instead of doing unnecessary conversions everywhere.

    Code

    I have a use-cairo-surface-internally branch in gdk-pixbuf, which I'll be working on this week. Meanwhile, you may be interested in the ongoing Performance Hackfest in Cambridge!

  6. Madrid GNOME+Rust Hackfest, part 3 (conclusion)

    - gnome, hackfests, rust

    The last code I wrote during the hackfest was the start of code generation for GObject interfaces. This is so that you can do

    gobject_gen! {
        interface Foo {
            virtual fn frob(&self);
        }
    }
    

    and it will generate the appropriate FooIface like one would expect with the C versions of interfaces.

    It turns out that this can share a lot of code from the existing code generator for classes: both classes and interfaces are "just virtual method tables", plus signals and properties, and classes can actually have per-instance fields and such. I started refactoring the code generator to allow this.

    I also took a second look at how to present good error messages when the syn crate encounters a parse error. I need to sit down at home and experiment with this carefully.

    Back home

    I'm back home now, jetlagged but very happy that gnome-class is in a much more advanced a state than it was before the hackfest. I'm very thankful that practically everyone worked on it!

    Also, thanks to Alberto and Natalia for hosting me at their apartment and showing me around Madrid, all while wrangling their adorable baby Mario. We had a lovely time on Saturday, and ate excellent food downtown.

    Sponsored by the GNOME Foundation

    Hosted by OpenShine

  7. Madrid GNOME+Rust Hackfest, part 2

    - gnome, hackfests, librsvg, rust

    Hacking on gnome-class continues apace!

    Philippe updated our dependencies.

    Alberto made the syntax for per-instance private structs more ergonomic, and then made that code nice and compact.

    Martin improved our conversion from CamelCase to snake_case for code generation.

    Daniel added initial support for GObject properties. This is not finished yet, but the initial parser and code generation is done.

    Guillaume turned gir, the binding generator in gtk-rs, from a binary into a library crate. This will let us have all the GObject Introspection information for parent classes at compilation time.

    Antoni has been working on a tricky problem. GTK+ structs that have bitfields do not get reconstructed correctly from the GObject Introspection information — Rust does not handle C bitfields yet. This has two implications. First, we lose some of the original struct fields in the generated bindings. Second, the sizes of the generated structs are not the same as the original C structs, so g_type_register_static() complains that one is trying to register an invalid class.

    Yesterday we got as far as reading the amd64 and ARM ABI manuals to see what the hell C compilers are supposed to do for laying out structs with bitfields. Most likely, we will have a temporary fix in gir's code generator so that it generates structs with the same layout as the C ones, with padding in place of the space for bitfields. Later we can remove this when rustc gets support for C bitfields.

    I've been working on support for GObject interfaces. The basic parsing is done; I'm about to refactor the code generation so I can reuse the parts that fill vtables from classes.

    Yesterday we went to the Madrid Rust Meetup, a regular meeting of rustaceans here. Martin talked about WebRender; I talked about refactoring C to port it to Rust, and then Alex talked about Rust's plans for 2018. Fun times.

    Sponsored by the GNOME Foundation

    Hosted by OpenShine

  8. Madrid GNOME+Rust Hackfest, part 1

    - gnome, hackfests, librsvg, rust

    I'm in Madrid since Monday, at the third GNOME+Rust hackfest! The OpenShine folks are kindly letting us use their offices, on the seventh floor of a building by the Cuatro Caminos roundabout.

    I am very, very thankful that this time everyone seems to be working on developing gnome-class. It's a difficult project for me, and more brainpower is definitely welcome — all the indirection, type conversion, GObject obscurity, and procedural macro shenanigans definitely take a toll on oneself.

    Gnome-class internals

    Gnome-class internals on the whiteboard

    I explained how gnome-class works to the rest of the hackfest attendees. I've been writing a document on gnome-class's internals, so the whiteboard was a whirlwind tour through it.

    Error messages from the compiler

    Antoni Boucher, the author of relm (a Rust crate to write GTK+ asynchronous widgets with an Elm-like model), explained to me how relm manages to present good error messages from the Rust compiler, when the user's code has mistakes. Right now this is in a very bad state in gnome-class: user errors within the invocation of the procedural macro get shown by the compiler as errors at the macro call, so you don't get line number information that is meaningful.

    For a large part of the day we tried to refactor bits of gnome-class to do something similar. It is very slightly better now, but this really requires me to sit down calmly, at home, and to fully understand how relm does it and what changes are needed in the syn parser crate to make it easy to present good errors.

    I think I'll continue this work at home, as there is a lot of source code to understand: the combinator parsers in syn, the error handling scheme in relm, and the peculiarities of gnome-class.

    Further work during the hackfest

    Other people working on gnome-class are adding support for GObject properties, inheritance from non-Rust classes, and improving the ergonomics of class-private structures.

    I think I'll stop working on error messages for now, and focus instead on either supporting GTypeInterfaces, or completing support for type conversions for methods and signals.

    Other happenings in Rust

    Paolo Borelli has been porting RsvgState to Rust in librsvg. This is the big structure that holds all the CSS state for SVG elements. This is very meticulous work, and I'm thankful that Paolo is paying good attention to it. Soon we will have all the style machinery for librsvg in Rust, which will make it easier to use the selectors crate from Servo instead of libcroco, as the latter is unmaintained.

    Food

    Food in Madrid

    Ah, Spanish food. We have been enjoying cheese, jamón, tortilla, pimientos, oxtail stews, natillas, café con leche...

    Thanks

    Thanks to OpenShine for hosting the hackfest, and to the GNOME Foundation for sponsoring my travel. And thanks for Alberto Ruiz for putting me up in his house!

    Sponsored by the GNOME Foundation

  9. Refactoring some repetitive code to a Rust macro

    - librsvg, rust

    I have started porting the code in librsvg that parses SVG's CSS properties from C to Rust. Many properties have symbolic values:

    stroke-linejoin: miter | round | bevel | inherit
    
    stroke-linecap: butt | round | square | inherit
    
    fill-rule: nonzero | evenodd | inherit
    

    StrokeLinejoin is the first property that I ported. First I had to write a little bunch of machinery to allow CSS properties to be kept in Rust-space instead of the main C structure that holds them (upcoming blog post about that). But for now, I just want to show how this boiled down to a macro after refactoring.

    First cut at the code

    The stroke-linejoin property can have the values miter, round, bevel, or inherit. Here is an enum definition for those values, and the conventional machinery which librsvg uses to parse property values:

    #[derive(Debug, Copy, Clone)]
    pub enum StrokeLinejoin {
        Miter,
        Round,
        Bevel,
        Inherit,
    }
    
    impl Parse for StrokeLinejoin {
        type Data = ();
        type Err = AttributeError;
    
        fn parse(s: &str, _: Self::Data) -> Result<StrokeLinejoin, AttributeError> {
            match s.trim() {
                "miter" => Ok(StrokeLinejoin::Miter),
                "round" => Ok(StrokeLinejoin::Round),
                "bevel" => Ok(StrokeLinejoin::Bevel),
                "inherit" => Ok(StrokeLinejoin::Inherit),
                _ => Err(AttributeError::from(ParseError::new("invalid value"))),
            }
        }
    }
    

    We match the allowed string values and map them to enum values. No big deal, right?

    Properties also have a default value. For example, the SVG spec says that if a shape doesn't have a stroke-linejoin property specified, it will use miter by default. Let's implement that:

    impl Default for StrokeLinejoin {
        fn default() -> StrokeLinejoin {
            StrokeLinejoin::Miter
        }
    }
    

    So far, we have three things:

    • An enum definition for the property's possible values.
    • impl Parse so we can parse the property from a string.
    • impl Default so the property knows its default value.

    Where things got repetitive

    The next property I ported was stroke-linecap, which can take the following values:

    #[derive(Debug, Copy, Clone)]
    pub enum StrokeLinecap {
        Butt,
        Round,
        Square,
        Inherit,
    }
    

    This is similar in shape to the StrokeLinejoin enum above; it's just different names.

    The parsing has exactly the same shape, and just different values:

    impl Parse for StrokeLinecap {
        type Data = ();
        type Err = AttributeError;
    
        fn parse(s: &str, _: Self::Data) -> Result<StrokeLinecap, AttributeError> {
            match s.trim() {
                "butt" => Ok(StrokeLinecap::Butt),
                "round" => Ok(StrokeLinecap::Round),
                "square" => Ok(StrokeLinecap::Square),
                "inherit" => Ok(StrokeLinecap::Inherit),
    
                _ => Err(AttributeError::from(ParseError::new("invalid value"))),
            }
        }
    }
    

    Same thing with the default:

    impl Default for StrokeLinecap {
        fn default() -> StrokeLinecap {
            StrokeLinecap::Butt
        }
    }
    

    Yes, the SVG spec has

    default: butt
    

    somewhere in it, much to the delight of the 12-year old in me.

    Refactoring to a macro

    Here I wanted to define a make_ident_property!() macro that would get invoked like this:

    make_ident_property!(
        StrokeLinejoin,
        default: Miter,
    
        "miter" => Miter,
        "round" => Round,
        "bevel" => Bevel,
        "inherit" => Inherit,
    );
    

    It's called make_ident_property because it makes a property definition from simple string identifiers. It has the name of the property (StrokeLinejoin), a default value, and a few repeating elements, one for each possible value.

    In Rust-speak, the macro's basic pattern is like this:

    macro_rules! make_ident_property {
        ($name: ident,
         default: $default: ident,
         $($str_prop: expr => $variant: ident,)+
        ) => {
            ... macro body will go here ...
        };
    }
    

    Let's dissect that pattern:

    macro_rules! make_ident_property {
        ($name: ident,
    //   ^^^^^^^^^^^^ will match an identifier and put it in $name
    
         default: $default: ident,
    //            ^^^^^^^^^^^^^^^ will match an identifier and put it in $default
    //   ^^^^^^^^ arbitrary text
    
         $($str_prop: expr => $variant: ident,)+
                           ^^ arbitrary text
    //   ^^ start of repetition               ^^ end of repetition, repeats one or more times
    
        ) => {
            ...
        };
    }
    

    For example, saying "$foo: ident" in a macro's pattern means that the compiler will expect an identifier, and bind it to $foo within the macro's definition.

    Similarly, an expr means that the compiler will look for an expression — in this case, we want one of the string values.

    In a macro pattern, anything that is not a binding is just arbitrary text which must appear in the macro's invocation. This is how we can create a little syntax of our own within the macro: the "default:" part, and the "=>" inside each string/symbol pair.

    Finally, macro patterns allow repetition. Anything within $(...) indicates repetition. Here, $(...)+ indicates that the compiler must match one or more of the repeating elements.

    I pasted the duplicated code, and substituted the actual symbol names for the macro's bindings:

    macro_rules! make_ident_property {
        ($name: ident,
         default: $default: ident,
         $($str_prop: expr => $variant: ident,)+
        ) => {
            #[derive(Debug, Copy, Clone)]
            pub enum $name {
                $($variant),+
    //          ^^^^^^^^^^^^^ this is how we invoke a repeated element
    
            }
    
            impl Default for $name {
                fn default() -> $name {
                    $name::$default
    //              ^^^^^^^^^^^^^^^ construct an enum::variant
    
                }
            }
    
            impl Parse for $name {
                type Data = ();
                type Err = AttributeError;
    
                fn parse(s: &str, _: Self::Data) -> Result<$name, AttributeError> {
                    match s.trim() {
                        $($str_prop => Ok($name::$variant),)+
    //                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expand repeated elements
    
                        _ => Err(AttributeError::from(ParseError::new("invalid value"))),
                    }
                }
            }
        };
    }
    

    Getting rid of duplicated code

    Now we have a macro that we can call to define new properties. Librsvg now has this, which is much more readable than all the code written by hand:

    make_ident_property!(
        StrokeLinejoin,
        default: Miter,
    
        "miter" => Miter,
        "round" => Round,
        "bevel" => Bevel,
        "inherit" => Inherit,
    );
    
    make_ident_property!(
        StrokeLinecap,
        default: Butt,   // :)
    
        "butt" => Butt,
        "round" => Round,
        "square" => Square,
        "inherit" => Inherit,
    );
    
    make_ident_property!(
        FillRule,
        default: NonZero,
    
        "nonzero" => NonZero,
        "evenodd" => EvenOdd,
        "inherit" => Inherit,
    );
    

    Etcetera. It's now easy to port similar symbol-based properties from C to Rust.

    Eventually I'll need to refactor all the crap that deals with inheritable properties, but that's for another time.

    Conclusion and references

    Rust macros are very powerful to refactor repetitive code like this.

    The Rust book has an introductory appendix to macros, and The Little Book of Rust Macros is a fantastic resource that really dives into what you can do.

  10. Making sure the repository doesn't break, automatically

    - cairo, gnome, librsvg, rust

    Gitlab has a fairly conventional Continuous Integration system: you push some commits, the CI pipelines build the code and presumably run the test suite, and later you can know if this succeeded of failed.

    But by the time something fails, the broken code is already in the public repository.

    The Rust community uses Bors, a bot that prevents this from happening:

    • You push some commits and submit a merge request.

    • A human looks at your merge request; they may tell you to make changes, or they may tell Bors that your request is approved for merging.

    • Bors looks for approved merge requests. It merges each into a temporary branch and waits for the CI pipeline to run there. If CI passes, Bors automatically merges to master. If CI fails, Bors annotates the merge request with the failure, and the main repository stays working.

    Bors also tells you if the mainline has moved forward and there's a merge conflict. In that case you need to do a rebase yourself; the repository stays working in the meantime.

    This leads to a very fair, very transparent process for contributors and for maintainers. For all the details, watch Emily Dunham's presentation on Rust's community automation (transcript).

    For a description of where Bors came from, read Graydon Hoare's blog.

    Bors evolved into Homu and it is what Rust and Servo use currently. However, Homu depends on Github.

    I just found out that there is a port of Homu for Gitlab. Would anyone care to set it up?

    Update: Two people have suggested porting Bors-ng to Gitlab instead, for scalability reasons.

Page 1 / 4 »