Federico's Blog

  1. Quick and dirty checklist to update syn 0.11.x to syn 0.12

    - gnome, rust

    Today I ported gnome-class from version 0.11 of the syn crate to version 0.12. syn is a somewhat esoteric crate that you use to parse Rust code... from a stream of tokens... from within the implementation of a procedural macro. Gnome-class implements a mini-language inside your own Rust code, and so it needs to parse Rust!

    The API of syn has changed a lot, which is kind of a pain in the ass — but the new API seems on the road to stabilization, and is nicer indeed.

    Here is a quick list of things I had to change in gnome-class to upgrade its version of syn.

    There is no extern crate synom anymore. You can use syn::synom now.

    extern crate synom;    ->   use syn::synom;
    

    SynomBuffer is now TokenBuffer:

    synom::SynomBuffer  ->  syn::buffer::TokenBuffer
    

    PResult, the result of Synom::parse(), now has the tuple's arguments reversed:

    - pub type PResult<'a, O> = Result<(Cursor<'a>, O), ParseError>;
    + pub type PResult<'a, O> = Result<(O, Cursor<'a>), ParseError>;
    
    // therefore:
    
    impl Synom for MyThing { ... }
    
    let x = MyThing::parse(...).unwrap().1;   ->  let x = MyThing::parse(...).unwrap().0;
    

    The language tokens like synom::tokens::Amp, and keywords like synom::tokens::Type, are easier to use now. There is a Token! macro which you can use in type definitions, instead of having to remember the particular name of each token type:

    synom::tokens::Amp  ->  Token!(&)
    
    synom::tokens::For  ->  Token!(for)
    

    And for the corresponding values when matching:

    syn!(tokens::Colon)  ->  punct!(:)
    
    syn!(tokens::Type)   ->  keyword!(type)
    

    And to instantiate them for quoting/spanning:

    -     tokens::Comma::default().to_tokens(tokens);
    +     Token!(,)([Span::def_site()]).to_tokens(tokens);
    

    (OK, that one wasn't nicer after all.)

    To get the string for an Ident:

    ident.sym.as_str()  ->  ident.as_ref()
    

    There is no Delimited anymore; instead there is a Punctuated struct. My diff has this:

    -  inputs: parens!(call!(Delimited::<MyThing, tokens::Comma>::parse_terminated)) >>
    +  inputs: parens!(syn!(Punctuated<MyThing, Token!(,)>)) >>
    

    There is no syn::Mutability anymore; now it's an Option<token>, so basically

    syn::Mutability  ->  Option<Token![mut]>
    

    which I guess lets you refer to the span of the original mut token if you need to.

    Some things changed names:

    TypeTup { tys, .. }  ->  TypeTuple { elems, .. }
    
    PatIdent {                          ->  PatIdent {
        mode: BindingMode(Mutability)           by_ref: Option<Token!(ref)>,
                                                mutability: Option<Token![mut]>,
        ident: Ident,                           ident: Ident,
        subpat: ...,                            subpat: Option<(Token![@], Box<Pat>)>,
        at_token: ...,                      }
    }
    
    TypeParen.ty  ->  TypeParen.elem   (and others like this, too)
    

    (I don't know everything that changed names; gnome-class doesn't use all the syn types yet; these are just the ones I've run into.)

    This new syn is much better at acknowledging the fine points of macro hygiene. The examples directory is particularly instructive; it shows how to properly span generated code vs. original code, so compiler error messages are nice. I need to write something about macro hygiene at some point.

  2. Librsvg's continuous integration pipeline

    - gnome, librsvg

    Jordan Petridis has been kicking ass by overhauling librsvg's continuous integration (CI) pipeline. Take a look at this beauty:

    [Image: librsvg's continuous integration pipeline]

    On every push, we run the Test stage. This is a quick compilation on a Fedora container that runs "make check" and ensures that the test suite passes.

    We have a Lint stage which can be run manually. This runs cargo clippy to get Rust lints (check the style of Rust idioms), and cargo fmt to check indentation and code style and such.

    We have a Distro_test stage which I think will be scheduled weekly, using Gitlab's Schedules feature, to check that the tests pass on three major Linux distros. Recently we had trouble with different rendering due to differences in Freetype versions, which broke the tests (ahem, likely because I hadn't updated my Freetype in a while and distros were already using a newer one); these distro tests are intended to catch that.

    Finally, we have a Rustc_test stage. The various crates that librsvg depends on require different minimum versions of the Rust compiler. These tests are intended to show when updating a dependency changes the minimum Rust version on which librsvg will compile. We don't have a policy yet for how far behind $newest we should remain compatible, and it would be good to get input from distros on this. I think these Rust tests will be scheduled weekly as well.

    Jordan has been experimenting with the pipeline's stages and the distro-specific idiosyncrasies for each build. This pipeline depends on some custom-built container images that already have librsvg's dependencies installed. These images are built weekly in gitlab.com, so every week gitlab.gnome.org gets fresh images for librsvg's CI pipelines. Once image registries are enabled in gitlab.gnome.org, we should be able to regenerate the container images locally without depending on an external service.

    With the pre-built images, and caching of Rust artifacts, Jordan was able to reduce the time for the "test on every commit" builds from around 20 minutes to a little under 4 minutes in the current iteration. This will get even faster if the builds start using ccache and parallel builds from GNU make.

    Currently we have a problem where tests fail on 32-bit builds, and we haven't had a chance to investigate the root cause. Hopefully we can add 32-bit jobs to the CI pipeline to catch this breakage as soon as possible.

    Having all these container images built for the CI infrastructure also means that it will be easy for people to set up a development environment for librsvg, even though we have better instructions now thanks to Jordan. I haven't investigated setting up a Flatpak-based environment; this would be nice to have as well.

  3. RFC: Integrating rsvg-rs into librsvg

    - gnome, librsvg, rust

    I have started an RFC to integrate rsvg-rs into librsvg. rsvg-rs is the Rust binding to librsvg. Like the gtk-rs bindings, it gets generated from a pre-built GIR file.

    It would be nice for librsvg to provide the Rust binding by itself, so that librsvg's own internal tools can be implemented in Rust — currently all the tests are done in C, as are the rsvg-convert(1) and rsvg-view-3(1) programs.

    There are some implications for how rsvg-rs would get built then. For librsvg's internal consumption, the binding can be built from the Rsvg-2.0.gir file that gets built out of the main librsvg.so. But for public consumption of rsvg-rs, when it is being used as a normal crate and built by Cargo, that Rsvg-2.0.gir needs to be already built and available: it wouldn't be appropriate for Cargo to build librsvg and the .gir file itself.

    If this sort of thing interests you, take a look at the RFC!

  4. Rust things I miss in C

    - rust

    Librsvg feels like it is reaching a tipping point, where suddenly it seems like it would be easier to just port some major parts from C to Rust than to keep adding accessors for them. Also, more and more of the meat of the library is in Rust now.

    I'm switching back and forth a lot between C and Rust these days, and C feels very, very primitive by comparison.

    A sort of elegy to C

    I fell in love with the C language about 24 years ago. I learned the basics of it by reading a Spanish translation of the second edition of The C Programming Language by K&R. I had been using Turbo Pascal before in a reasonably low-level fashion, with pointers and manual memory allocation, and C felt refreshing and empowering.

    K&R is a great book for its style of writing and its conciseness of programming. This little book even taught you how to implement a simple malloc()/free(), which was completely enlightening. Even low-level constructs that seemed part of the language could be implemented in the language itself!

    I got good at C over the following years. It is a small language, with a small standard library. It was probably the perfect language to implement Unix kernels in 20,000 lines of code or so.

    The GIMP and GTK+ taught me how to do fancy object orientation in C. GNOME taught me how to maintain large-scale software in C. 20,000 lines of C code started to seem like a project one could more or less fully understand in a few weeks.

    But our code bases are not that small anymore. Our software now has huge expectations on the features that are available in the language's standard library.

    Some good experiences with C

    Reading the POV-Ray source code for the first time and learning how to do object orientation and inheritance in C.

    Reading the GTK+ source code and learning a C style that was legible, maintainable, and clean.

    Reading SIOD's source code, then the early Guile sources, and seeing how a Scheme interpreter can be written in C.

    Writing the initial versions of Eye of Gnome and fine-tuning the microtile rendering.

    Some bad experiences with C

    In the Evolution team, when everything was crashing. We had to buy a Solaris machine just to be able to buy Purify; there was no Valgrind back then.

    Debugging gnome-vfs threading deadlocks.

    Debugging Mesa and getting nowhere.

    Taking over the initial versions of Nautilus-share and seeing that it never free()d anything.

    Trying to refactor code where I had no idea about the memory management strategy.

    Trying to turn code into a library when it is full of global variables and no functions are static.

    But anyway — let's get on with things in Rust I miss in C.

    Automatic resource management

    One of the first blog posts I read about Rust was "Rust means never having to close a socket". Rust borrows C++'s ideas about Resource Acquisition Is Initialization (RAII) and smart pointers, adds the single-ownership principle for values, and gives you automatic, deterministic resource management in a very neat package.

    • Automatic: you don't free() by hand. Memory gets deallocated, files get closed, mutexes get unlocked when they go out of scope. If you are wrapping an external resource, you just implement the Drop trait and that's basically it. The wrapped resource feels like part of the language since you don't have to babysit its lifetime by hand.

    • Deterministic: resources get created (memory allocated, initialized, files opened, etc.), and they get destroyed when they go out of scope. There is no garbage collection: things really get terminated when you close a brace. You start to see your program's data lifetimes as a tree of function calls.

    After forgetting to free/close/destroy C objects all the time, or worse, figuring out where code that I didn't write forgot to do those things (or did them twice, incorrectly)... I don't want to do it again.
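
    As a sketch of the Drop point above (TempFile and the path are made up for illustration, not from any real crate):

    // Hypothetical example: wrapping an external resource in a type that cleans up after itself.
    struct TempFile {
        path: std::path::PathBuf,
    }

    impl Drop for TempFile {
        fn drop(&mut self) {
            // Runs deterministically when the value goes out of scope,
            // whether through a normal return or an early exit.
            let _ = std::fs::remove_file(&self.path);
        }
    }

    fn main() {
        let _tmp = TempFile { path: std::path::PathBuf::from("/tmp/scratch.dat") };
        // ... use the file ...
    }   // drop() runs here; no explicit cleanup call needed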

    Generics

    Vec<T> really is a vector whose elements are the size of T. It's not an array of pointers to individually allocated objects. It gets compiled specifically to code that can only handle objects of type T.
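
    A minimal sketch of what that means in practice (largest is a made-up helper, not from the standard library): the generic function is monomorphized into separate machine code for each concrete T it is used with.

    fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
        // The compiler emits a specialized version of this function for every
        // concrete T it is called with; no boxing, no void pointers.
        let mut max = items[0];
        for &item in items {
            if item > max {
                max = item;
            }
        }
        max
    }

    fn main() {
        assert_eq!(largest(&[1, 5, 3]), 5);      // instantiated for i32
        assert_eq!(largest(&[1.0, 0.5]), 1.0);   // instantiated separately for f64
    }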

    After writing many janky macros in C to do similar things... I don't want to do it again.

    Traits are not just interfaces

    Rust is not a Java-like object-oriented language. Instead it has traits, which at first seem like Java interfaces — an easy way to do dynamic dispatch, so that if an object implements Drawable then you can assume it has a draw() method.
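
    For instance, dynamic dispatch through a trait object looks like this (Drawable and Circle are hypothetical types, just for illustration):

    trait Drawable {
        fn draw(&self);
    }

    struct Circle;

    impl Drawable for Circle {
        fn draw(&self) {
            println!("drawing a circle");
        }
    }

    fn render_all(items: &[Box<dyn Drawable>]) {
        for item in items {
            item.draw(); // dispatched at runtime through the trait object's vtable
        }
    }

    fn main() {
        let circle: Box<dyn Drawable> = Box::new(Circle);
        render_all(&[circle]);
    }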

    However, traits are more powerful than that.

    Associated types

    Traits can have associated types. As an example, Rust provides the Iterator trait which you can implement:

    pub trait Iterator {
        type Item;
        fn next(&mut self) -> Option<Self::Item>;
    }
    

    This means that whenever you implement Iterator for some iterable object, you also have to specify an Item type for the things that will be produced. If you call next() and there are more elements, you'll get back a Some(YourElementType). When your iterator runs out of items, it will return None.
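
    For instance, a hand-rolled counter (Counter is a made-up type for illustration) would implement the trait like this:

    struct Counter {
        count: u32,
    }

    impl Iterator for Counter {
        type Item = u32; // the associated type: what this iterator yields

        fn next(&mut self) -> Option<u32> {
            if self.count < 5 {
                self.count += 1;
                Some(self.count)
            } else {
                None // the iterator is exhausted
            }
        }
    }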

    Associated types can refer to other traits.

    For example, in Rust, you can use for loops on anything that implements the IntoIterator trait:

    pub trait IntoIterator {
        /// The type of the elements being iterated over.
        type Item;
    
        /// Which kind of iterator are we turning this into?
        type IntoIter: Iterator<Item=Self::Item>;
    
        fn into_iter(self) -> Self::IntoIter;
    }
    

    When implementing this trait, you must provide both the type of the Item which your iterator will produce, and IntoIter, the actual type that implements Iterator and that holds your iterator's state.

    This way you can build webs of types that refer to each other. You can have a trait that says, "I can do foo and bar, but only if you give me a type that can do this and that".
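
    A tiny sketch of that idea (Collection is an invented trait, just for illustration): the associated type is itself constrained to implement another trait, so implementors must supply a type that can actually iterate.

    trait Collection {
        // "only if you give me a type that can do this": Iter must be an Iterator over u32
        type Iter: Iterator<Item = u32>;

        fn iter(&self) -> Self::Iter;
    }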

    Slices

    I already posted about the lack of string slices in C and how this is a pain in the ass once you get used to having them.
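
    For reference, a string slice in Rust is just a borrowed view into existing data, with no copying:

    fn main() {
        let s = String::from("hello world");
        let hello = &s[0..5]; // a &str pointing into s; no allocation, no copy
        assert_eq!(hello, "hello");
    }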

    Modern tooling for dependency management

    Instead of

    • Having to invoke pkg-config by hand or with Autotools macros
    • Wrangling include paths for header files...
    • ... and library files.
    • And basically depending on the user to ensure that the correct versions of libraries are installed,

    You write a Cargo.toml file which lists the names and versions of your dependencies. These get downloaded from a well-known location, or from elsewhere if you specify.

    You don't have to fight dependencies. It just works when you cargo build.

    Tests

    C makes it very hard to have unit tests for several reasons:

    • Internal functions are often static. This means they can't be called outside of the source file that defined them. A test program either has to #include the source file where the static functions live, or use #ifdefs to remove the statics only during testing.

    • You have to write Makefile-related hackery to link the test program to only part of your code's dependencies, or to only part of the rest of your code.

    • You have to pick a testing framework. You have to register tests against the testing framework. You have to learn the testing framework.

    In Rust you write

    #[test]
    fn test_that_foo_works() {
        assert!(foo() == expected_result);
    }
    

    anywhere in your program or library, and when you type cargo test, IT JUST FUCKING WORKS. That code only gets linked into the test binary. You don't have to compile anything twice by hand, or write Makefile hackery, or figure out how to extract internal functions for testing.

    This is a killer feature for me.

    Documentation, with tests

    Rust generates documentation from comments in Markdown syntax. Code in the docs gets run as tests. You can illustrate how a function is used and test it at the same time:

    /// Multiplies the specified number by two
    ///
    /// ```
    /// assert_eq!(multiply_by_two(5), 10);
    /// ```
    fn multiply_by_two(x: i32) -> i32 {
        x * 2
    }
    

    Your example code gets run as tests to ensure that your documentation stays up to date with the actual code.

    Update 2018/Feb/23: QuietMisdreavus has posted how rustdoc turns doctests into runnable code internally. This is high-grade magic and thoroughly interesting.

    Hygienic macros

    Rust has hygienic macros that avoid all of C's problems with things in macros that inadvertently shadow identifiers in the code. You don't need to write macros where every symbol has to be in parentheses for max(5 + 3, 4) to work correctly.
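
    A sketch of the contrast (max! here is a made-up macro, mirroring the classic C #define MAX(a, b) pitfall): each argument is evaluated exactly once, and the macro's internal bindings cannot collide with names at the call site.

    macro_rules! max {
        ($a:expr, $b:expr) => {{
            // These bindings are hygienic: they cannot shadow or capture
            // an `a` or `b` that exists where the macro is used.
            let a = $a;
            let b = $b;
            if a > b { a } else { b }
        }};
    }

    fn main() {
        let a = 10; // unaffected by the macro's internal `a`
        assert_eq!(max!(5 + 3, 4), 8);
        assert_eq!(a, 10);
    }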

    No automatic coercions

    All the bugs in C that result from inadvertently converting an int to a short or char or whatever — Rust doesn't do them. You have to explicitly convert.
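
    A small sketch of what that looks like in practice:

    use std::convert::TryFrom;

    fn main() {
        let big: i32 = 70_000;

        // let small: i16 = big;          // does not compile: no implicit narrowing
        let small = big as i16;           // explicit cast; truncation is now your decision
        let checked = i16::try_from(big); // or get an Err instead of silent truncation

        println!("{} {:?}", small, checked);
    }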

    No integer overflow

    Enough said.
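
    (For the record: in a debug build an overflowing operation panics instead of silently wrapping, and the explicit methods let you choose a behavior.)

    fn main() {
        let x: u8 = 255;

        // x + 1 would panic in a debug build rather than wrap around silently.
        assert_eq!(x.checked_add(1), None);    // overflow reported as None
        assert_eq!(x.saturating_add(1), 255);  // clamp at the maximum
        assert_eq!(x.wrapping_add(1), 0);      // wrap, but only if you ask for it
    }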

    Generally, no undefined behavior in safe Rust

    In Rust, it is considered a bug in the language if something written in "safe Rust" (what you would be allowed to write outside unsafe {} blocks) results in undefined behavior. You can shift-right a negative integer and it will do exactly what you expect.
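
    For example:

    fn main() {
        let x: i32 = -8;
        assert_eq!(x >> 1, -4); // arithmetic shift on signed integers, well defined
    }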

    Pattern matching

    You know how gcc warns you if you switch() on an enum but don't handle all values? That's like a little baby.

    Rust has pattern matching in various places. It can do that trick for enums inside a match() expression. It can do destructuring so you can return multiple values from a function:

    impl f64 {
        pub fn sin_cos(self) -> (f64, f64);
    }
    
    let angle: f64 = 42.0;
    let (sin_angle, cos_angle) = angle.sin_cos();
    

    You can match() on strings. YOU CAN MATCH ON FUCKING STRINGS.

    let color = "green";
    
    match color {
        "red"   => println!("it's red"),
        "green" => println!("it's green"),
        _       => println!("it's something else"),
    }
    

    You know how this is illegible?

    my_func(true, false, false)
    

    How about this instead, with pattern matching on function arguments:

    pub struct Fubarize(pub bool);
    pub struct Frobnify(pub bool);
    pub struct Bazificate(pub bool);
    
    fn my_func(Fubarize(fub): Fubarize, 
               Frobnify(frob): Frobnify, 
               Bazificate(baz): Bazificate) {
        if fub {
            ...;
        }
    
        if frob && baz {
            ...;
        }
    }
    
    ...
    
    my_func(Fubarize(true), Frobnify(false), Bazificate(true));
    

    Standard, useful error handling

    I've talked at length about this. No more returning a boolean with no extra explanation for an error, no ignoring errors inadvertently, no exception handling with nonlocal jumps.

    #[derive(Debug)]

    If you write a new type (say, a struct with a ton of fields), you can #[derive(Debug)] and Rust will know how to automatically print that type's contents for debug output. You no longer have to write a special function that you must call in gdb by hand just to examine a custom type.
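
    A minimal sketch (Color is a made-up struct):

    #[derive(Debug)]
    struct Color {
        r: u8,
        g: u8,
        b: u8,
    }

    fn main() {
        let c = Color { r: 255, g: 128, b: 0 };
        println!("{:?}", c);  // Color { r: 255, g: 128, b: 0 }
        println!("{:#?}", c); // same thing, pretty-printed with one field per line
    }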

    Closures

    No more passing function pointers and a user_data by hand.
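
    For instance, a closure captures its environment directly, so there is no separate user_data pointer to thread through:

    fn main() {
        let factor = 3;

        // The closure borrows `factor` from the enclosing scope; no casting
        // a void* back to the right type by hand.
        let scaled: Vec<i32> = [1, 2, 3].iter().map(|x| x * factor).collect();

        assert_eq!(scaled, vec![3, 6, 9]);
    }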

    Conclusion

    I haven't done the "fearless concurrency" bit yet, where the compiler is able to prevent data races in threaded code. I imagine it being a game-changer for people who write concurrent code on an everyday basis.

    C is an old language with primitive constructs and primitive tooling. It was a good language for small uniprocessor Unix kernels that ran in trusted, academic environments. It's no longer a good language for the software of today.

    Rust is not easy to learn, but I think it is completely worth it. It's hard because it demands a lot from your understanding of the code you want to write. I think it's one of those languages that make you a better programmer and that let you tackle more ambitious problems.

  5. Writing a command-line program in Rust

    - gnome, librsvg, rust

    As a library writer, it feels a bit strange, but refreshing, to write a program that actually has a main() function.

    My experience with Rust so far has been threefold:

    • Porting chunks of C to Rust for librsvg - this is all work on librsvg's internals and no users are exposed to it directly.

    • Working on gnome-class, the procedural macro ("a little compiler") to generate GObject boilerplate from Rust. This feels like working on the edge of the exotic; it is something that runs in the Rust compiler and spits out code on behalf of the programmer.

    • A few patches to the gtk-rs ecosystem. Again, work on the internals, or something that feels library-like.

    But other than toy programs to test things, I hadn't written a stand-alone tool until rsvg-bench. It's quite a thrill to be able to just run the thing instead of waiting for other people to write code to use it!

    Parsing command-line arguments

    There are quite a few Rust crates ("libraries") to parse command-line arguments. I read about structopt via Robert O'Callahan's blog; structopt lets you define a struct to hold the values of your command-line options, and then you annotate the fields in that struct to indicate how they should be parsed from the command line. It works via Rust's procedural macros. Internally it generates stuff for the clap crate, a well-established mechanism for dealing with command-line options.

    And it is quite pleasant! This is basically all I needed to do:

    #[derive(StructOpt, Debug)]
    #[structopt(name = "rsvg-bench", about = "Benchmarking utility for librsvg.")]
    struct Opt {
        #[structopt(short = "s",
                    long  = "sleep",
                    help  = "Number of seconds to sleep before starting to process SVGs",
                    default_value = "0")]
        sleep_secs: usize,
    
        #[structopt(short = "p",
                    long  = "num-parse",
                    help  = "Number of times to parse each file",
                    default_value = "100")]
        num_parse: usize,
    
        #[structopt(short = "r",
                    long  = "num-render",
                    help  = "Number of times to render each file",
                    default_value = "100")]
        num_render: usize,
    
        #[structopt(long = "pixbuf",
                    help = "Render to a GdkPixbuf instead of a Cairo image surface")]
        render_to_pixbuf: bool,
    
        #[structopt(help = "Input files or directories",
                    parse(from_os_str))]
        inputs: Vec<PathBuf>
    }
    
    fn main() {
        let opt = Opt::from_args();
    
        if opt.inputs.len() == 0 {
            eprintln!("No input files or directories specified\n");
            process::exit(1);
        }
    
        ...
    }
    

    Each field in the Opt struct above corresponds to one command-line argument; each field has annotations for structopt to generate the appropriate code to parse each option. For example, the render_to_pixbuf field has a long option name called "pixbuf"; that field will be set to true if the --pixbuf option gets passed to rsvg-bench.

    Handling errors

    Command-line programs generally have the luxury of being able to just exit as soon as they encounter an error.

    In C this is a bit cumbersome since you need to deal with every place that may return an error, find out what to print, and call exit(1) by hand or something. If you miss a single place where an error is returned, your program will keep running with an inconsistent state.

    In languages with exception handling, it's a bit easier - a small script can just let exceptions be thrown wherever, and if it catches them at the toplevel, it can just print the exception and abort gracefully. However, these nonlocal jumps make me uncomfortable; I think exceptions are hard to reason about.

    Rust makes this easy: it forces you to handle every call that may return an error, but it lets you bubble errors up easily, or handle them in-place, or translate them to a higher-level error.

    In the Rust world, the failure crate is getting a lot of traction as a convenient, modern way to handle errors.

    In rsvg-bench, errors can come from several places:

    • I/O errors when reading files and directories.

    • Errors from librsvg's parsing stage; you get a GError.

    • Errors from the rendering stage. This can be a Cairo error (a cairo_status_t), or a simple "something bad happened; can't render" from librsvg's old convenience api in C. Don't you hate it when C code just gives up and returns NULL or a boolean false, without any further details on what went wrong?

    For rsvg-bench, I just needed to be able to represent Cairo errors and generic rendering errors. Everything else, like an io::Error, is automatically wrapped by the failure crate's mechanism. I just needed to do this:

    extern crate failure;
    #[macro_use]
    extern crate failure_derive;
    
    #[derive(Debug, Fail)]
    enum ProcessingError {
        #[fail(display = "Cairo error: {:?}", status)]
        CairoError {
            status: cairo::Status
        },
    
        #[fail(display = "Rendering error")]
        RenderingError
    }
    

    Whenever the code gets a Cairo error, I can translate it to a ProcessingError::CairoError and bubble it up:

    fn render_to_cairo(handle: &rsvg::Handle) -> Result<(), Error> {
        let dim = handle.get_dimensions();
        let surface = cairo::ImageSurface::create(cairo::Format::ARgb32,
                                                  dim.width,
                                                  dim.height)
            .map_err(|e| ProcessingError::CairoError { status: e })?;
    
        ...
    }
    

    And when librsvg returns a "couldn't render" error, I translate that to a ProcessingError::RenderingError:

    fn render_to_cairo(handle: &rsvg::Handle) -> Result<(), Error> {
        ...
    
        let cr = cairo::Context::new(&surface);
    
        if handle.render_cairo(&cr) {
            Ok(())
        } else {
            Err(Error::from(ProcessingError::RenderingError))
        }
    }
    

    Here, the Ok() case of the Result does not contain any value — it's just (), as the generated images are not stored anywhere: they are just rendered to get some timings, not to be saved or anything.

    Up to where do errors bubble?

    This is the "do everything" function:

    fn run(opt: &Opt) -> Result<(), Error> {
        ...
    
        for path in &opt.inputs {
            process_path(opt, &path)?;
        }
    
        Ok(())
    }
    

    For each path passed on the command line, process it. If the path corresponds to a directory, the program scans it recursively; if it is an SVG file, the program loads the file and renders it.

    Finally, main() just has this:

    fn main() {
        let opt = Opt::from_args();
    
        ...
    
        match run(&opt) {
            Ok(_) => (),
            Err(e) => {
                eprintln!("{}", e);
                process::exit(1);
            }
        }
    }
    

    I.e. process command line arguments, run the whole thing, and print an error if there was one.

    I really appreciate that most places that can return an error can just put a ? for the error to bubble up. This is much more legible than in C, where every call must have an if (something_bad_happened) { deal_with_it; } after it... and Rust won't let me get away with ignoring an error, but it makes it easy to actually deal with it properly.

    Reading an SVG file quickly

    Why, just mmap() it and feed it to librsvg, to avoid buffer copies. This is easy in Rust:

    fn process_file<P: AsRef<Path>>(opt: &Opt, path: P) -> Result<(), Error> {
        let file = File::open(path)?;
        let mmap = unsafe { MmapOptions::new().map(&file)? };
    
        let bytes = &mmap;
    
        let handle = rsvg::Handle::new_from_data(bytes)?;
        ...
    }
    

    Many things can go wrong here:

    • File::open() can return an io::Error.
    • MmapOptions::map() can return an io::Error from the mmap(2) system call, or from the fstat(2) to read the file's size to map it.
    • rsvg::Handle::new_from_data() can return a GError from parsing the file.

    The little ? characters after each call that can return an error mean, just give me back the result, or convert the error to a failure::Error that can be examined later. This is beautifully legible to me.

    Summary

    Writing command-line programs in Rust is fun! It's nice to have neurotically-safe scripts that one can trust in the future.

    Rsvg-bench is available here.

  6. rsvg-bench - a benchmark for librsvg

    - gnome, librsvg, performance, rust

    Librsvg 2.42.0 came out with a rather major performance regression compared to 2.40.20: SVGs with many transform attributes would slow it down. It was fixed in 2.42.1. We changed from using a parser that would recompile regexes each time it was called, to one that does simple string-based matching and parsing.

    When I rewrote librsvg's parser for the transform attribute from C to Rust, I was just learning about writing parsers in Rust. I chose lalrpop, an excellent, Yacc-like parser generator for Rust. It generates big, fast parsers, like what you would need for a compiler — but it compiles the tokenizer's regexes each time you call the parser. This is not a problem for a compiler, where you basically call the parser only once, but in librsvg, we may call it thousands of times for an SVG file with thousands of objects with transform attributes.

    So, for 2.42.1 I rewrote that parser using rust-cssparser. This is what Servo uses to parse CSS data; it's a simple tokenizer with an API that knows about CSS's particular constructs. This is exactly the kind of data that librsvg cares about. Today all of librsvg's internal parsers work using rust-cssparser, or they are so simple that they can be done with Rust's normal functions to split strings and such.

    Getting good timings

    Librsvg ships with rsvg-convert, a command-line utility that can render an SVG file and write the output to a PNG. While it would be possible to get timings for SVG rendering by timing how long rsvg-convert takes to run, it's a bit clunky for that. The process startup adds noise to the timings, and it only handles one file at a time.

    So, I've written rsvg-bench, a small utility to get timings out of librsvg. I wanted a tool that:

    • Is able to process many SVG images with a single command. For example, this lets us answer a question like, "how long does version N of librsvg take to render a directory full of SVG icons?" — which is important for the performance of an application chooser.

    • Is able to repeatedly process SVG files, for example, "render this SVG 1000 times in a row". This is useful to get accurate timings, as a single render may only take a few microseconds and may be hard to measure. It also helps with running profilers, as they will be able to get more useful samples if the SVG rendering process runs repeatedly for a long time.

    • Exercises librsvg's major code paths for parsing and rendering separately. For example, librsvg uses different parts of the XML parser depending on whether it is being pushed data, vs. being asked to pull data from a stream. Also, we may only want to benchmark the parser but not the renderer; or we may want to parse SVGs only once but render them many times after that.

    • Is aware of librsvg's peculiarities, such as the extra pass to convert a Cairo image surface to a GdkPixbuf when one uses the convenience function rsvg_handle_get_pixbuf().

    Currently rsvg-bench supports all of that.

    An initial benchmark

    I ran this

    /usr/bin/time rsvg-bench -p 1 -r 1 /usr/share/icons

    to cause every SVG icon in /usr/share/icons to be parsed once, and rendered once (i.e. just render every file sequentially). I did this for librsvg 2.40.20 (C only), and 2.42.{0, 1, 2} (C and Rust). There are 5522 SVG files in there. The timings look like this:

    version    time (sec)
    2.40.20         95.54
    2.42.0         209.50
    2.42.1          97.18
    2.42.2          95.89

    [Image: bar chart of the timings above]

    So, 2.42.0 was over twice as slow as the C-only version, due to the parsing problems. But now, 2.42.2 is practically just as fast as the C only version. What made this possible?

    • 2.40.20 - the old C-only version
    • 2.42.0 - C + Rust, with a lalrpop parser for the transform attribute
    • 2.42.1 - Servo's cssparser for the transform attribute
    • 2.42.2 - removed most C-to-Rust string copies during parsing

    I have started taking profiles of rsvg-bench runs with sysprof, and there are some improvements worth making. Expect news soon!

    Rsvg-bench is available in Gnome's gitlab instance.

  7. Help needed for librsvg 2.42.1

    - librsvg

    Would you like to help fix a couple of bugs in librsvg, in preparation for the 2.42.1 release?

    I have prepared a list of bugs which I'd like to be fixed in the 2.42.1 milestone. Two of them are assigned to myself, as I'm already working on them.

    There are two other bugs which I'd love someone to look at. Neither of these requires deep knowledge of librsvg, just some debugging and code-writing:

    • Bug 141 - GNOME's thumbnailing machinery creates an icon which has the wrong fill: it's an image of a builder's trowel, and the inside is filled black instead of with a nice gradient. This is the only place in librsvg where a cairo_surface_t is converted to a GdkPixbuf; this involves unpremultiplying the alpha channel. Maybe the relevant function is buggy?

    • Bug 136: The stroke-dasharray attribute in SVG elements is parsed incorrectly. It is a list of CSS length values, separated by commas or spaces. Currently librsvg uses a shitty parser based on g_strsplit() only for commas; it doesn't allow just a space-separated list. Then, it uses g_ascii_strtod() to parse plain numbers; it doesn't support CSS lengths generically. This parser needs to be rewritten in Rust; we already have machinery there to parse CSS length values properly.

    Feel free to contact me by mail, or write something in the bugs themselves, if you would like to work on them. I'll happily guide you through the code :)

  8. Librsvg gets Continuous Integration

    - gitlab, librsvg

    One nice thing about gitlab.gnome.org is that we can now have Continuous Integration (CI) enabled for projects there. After every commit, the CI machinery can build the project, run the tests, and tell you if something goes wrong.

    Carlos Soriano posted a "tips of the week" mail to desktop-devel-list, and a link to how Nautilus implements CI in Gitlab. It turns out that it's reasonably easy to set up: you just create a .gitlab-ci.yml file in the toplevel of your project, and that has the configuration for what to run on every commit.

    Of course instead of reading the manual, I copied-and-pasted the file from Nautilus and just changed some things in it. There is a .yml linter so you can at least check the syntax before pushing a full job.

    Then I read Robert Ancell's reply about how simple-scan builds its CI jobs on both Fedora and Ubuntu... and then the realization hit me:

    This lets me run CI for librsvg on multiple distros at once. I've had trouble with slight differences in fontconfig/freetype in the past, and this would let me catch them early.

    However, people on IRC advised against this, as we need more hardware to run CI on a large scale.

    Linux distros have a vested interest in getting code out of gnome.org that works well. Surely they can give us some hardware?

  9. Loving Gitlab.gnome.org, and getting notifications

    - gitlab, gnome

    I'm loving gitlab.gnome.org. It has been only a couple of weeks since librsvg moved to gitlab, and I've already received and merged two merge requests. (Isn't it a bit weird that Github uses "pull request" and Everyone(tm) knows the PR acronym, but Gitlab uses "merge request"?)

    Notifications about merge requests

    One thing to note if your GNOME project has moved to Gitlab: if you want to get notified of incoming merge requests, you need to tell Gitlab that you want to "Watch" that project, instead of using one of the default notification settings. Thanks to Carlos Soriano for making me aware of this.

    Notifications from Github's mirror

    The GitHub mirror of git.gnome.org is configured so that pull requests are automatically closed, since currently there is no way to notify the upstream maintainers when someone creates a pull request in the mirror (this is super-unfriendly, but at least submitters get notified by default that their PR will not be looked at by anyone).

    If you have a Github account, you can Watch the project in question to get notified — the bot will close the pull request, but you will get notified, and then you can check it by hand, review it as appropriate, or redirect the submitter to gitlab.gnome.org instead.

  10. Librsvg 2.40.20 is released

    - gnome, librsvg, rust

    Today I released librsvg 2.40.20. This will be the last release in the 2.40.x series, which is deprecated effective immediately.

    People and distros are strongly encouraged to switch to librsvg 2.41.x as soon as possible. This is the version that is implemented in a mixture of C and Rust. It is 100% API and ABI compatible with 2.40.x, so it is a drop-in replacement for it. If you or your distro can compile Firefox 57, you can probably build librsvg-2.41.x without problems.

    Some statistics

    Here are a few runs of loc — a tool to count lines of code — when run on librsvg. The output is trimmed by hand to only include C and Rust files.

    This is 2.40.20:
    -------------------------------------------------------
     Language      Files   Lines   Blank   Comment    Code
    -------------------------------------------------------
     C                41   20972    3438      2100   15434
     C/C++ Header     27    2377     452       625    1300
    
    This is 2.41.latest (the master branch):
    -------------------------------------------------------
     Language      Files   Lines   Blank   Comment    Code
    -------------------------------------------------------
     C                34   17253    3024      1892   12337
     C/C++ Header     23    2327     501       624    1202
     Rust             38   11254    1873       675    8706
    
    And this is 2.41.latest *without unit tests*, 
    just "real source code":
    -------------------------------------------------------
     Language      Files   Lines   Blank   Comment    Code
    -------------------------------------------------------
     C                34   17253    3024      1892   12337
     C/C++ Header     23    2327     501       624    1202
     Rust             38    9340    1513       610    7217
    

    Summary

    Not counting blank lines nor comments:

    • The C-only version has 16734 lines of C code.

    • The C-only version has no unit tests, just some integration tests.

    • The Rust-and-C version has 13539 lines of C code, 7217 lines of Rust code, and 1489 lines of unit tests in Rust.

    As for the integration tests:

    • The C-only version has 64 integration tests.

    • The Rust-and-C version has 130 integration tests.

    The Rust-and-C version supports a few more SVG features, and it is A LOT more robust and spec-compliant with the SVG features that were supported in the C-only version.

    The C sources in librsvg are shrinking steadily. It would be incredibly awesome if someone could run some git filter-branch magic with the loc tool and generate some pretty graphs of source lines vs. commits over time.
