Requirement for API Documentation
I'm delighted to announce that the Release Team has approved the proposal to require API documentation for all new public interfaces that come into the set of core platform modules.
If you maintain any of these modules, this announcement applies to you: at-spi, atk, gail, gconf, glib, gnome-mime-data, gnome-vfs, gtk+, gtk-doc, intltool, libglade, libxml2, libxslt, pango, pkgconfig.
The requirements are pretty simple:
Document any new public interfaces since the last stable version of the module (e.g. the jump from 2.12.x to 2.14.0). You can do this with gtk-doc. People on gtk-doc-list or desktop-devel-list can help you set up this infrastructure. Your docs should say the version number in which each interface was added; please see the Howto section of the proposal for the details.
Mark any newly deprecated interfaces as such. Please see the Howto section for the details.
Any new module proposed for the platform must be fully documented.
The deadline for API documentation is a week after API freeze, so this means 2006/Jan/23. Lack of compliance will be treated the same as freeze breaks.
Please read the full plan: Documentation requirements for new APIs.
My friend Joakim, one of the crazy Norwegian expats sent to invade the free software scene in Mexico, is a movie star now.
I'm pleased to bring you Joakim, Leo, and Øyvind, ready to take over the world.
Oralia sometimes has a bad hair day.
And sometimes she just looks stunning.
My mom and I in Veracruz.
Proposal to reduce the memory consumption of images in Mozilla
When you run Mozilla or Firefox and load a web page with images, it stores the uncompressed images as pixmaps in the X server. In particular, it seems to maintain live pixmaps for all the images in all the tabs that you have open; even if a tab is not visible, the images will be in your X server's memory.
When you exit Firefox, the X server is smart enough to return this memory to the kernel.
Web pages use compressed images, of course, to reduce download time. These images will likely take up a lot of space when uncompressed.
For example, I have a directory with about 4.3 MB of JPEG images:
    $ du -s jpegs
    4372    jpegs
If I uncompress those images, they balloon in size to about 67 MB:
    $ mkdir uncompressed
    $ for i in jpegs/*.jpg; do convert $i uncompressed/`basename $i .jpg`.ppm; done
    $ du -s uncompressed
    67172   uncompressed
Let's see what happens once we load those images in Firefox. I created a simple HTML page with <img> tags for each of the files in my jpegs directory, so that Firefox would load all the images, uncompress them, turn the uncompressed data into a format suitable for X pixmaps, and ship the pixmaps to the server.
To get some rough numbers on memory consumption, I used ps and the fabulous xrestop. Xrestop is like the usual top command, but it tells you how many resources each client is using in the X server. In particular, one of the columns in xrestop tells you how much memory a client is using for pixmaps.
This is the output of ps and xrestop for a newly-started Firefox:
    ps:
    USER       PID %CPU %MEM   VSZ   RSS TTY   STAT START TIME COMMAND
    federico 27833 20.7  4.6 55080 24136 pts/0 Sl   11:12 0:01 /opt/MozillaFirefox/lib/firefox-bin

    xrestop:
    res-base Wins GCs Fnts Pxms Misc Pxm mem Other  Total PID Identifier
    4400000    87  53    1   29   30    415K    4K   420K   ? Mozilla Firefox
So, our base numbers are 24 MB resident on the client, and 415 KB of pixmaps in the server. Remember, that's for a newly-started Firefox.
After loading my page full of JPEGs, we get this:
    ps:
    USER       PID %CPU %MEM   VSZ   RSS TTY   STAT START TIME COMMAND
    federico 27833  4.7  5.0 56544 26172 pts/0 Sl   11:12 0:05 /opt/MozillaFirefox/lib/firefox-bin

    xrestop:
    res-base Wins GCs Fnts Pxms Misc Pxm mem Other  Total PID Identifier
    4400000    88  71    1   73   31  95327K    5K 95332K   ? Mozilla Firefox
Woah! The Firefox process grew its resident size by only 2 MB, but it created about 95 MB of pixmaps in the server. Why is this larger than the 67 MB of uncompressed image data that we got before? Who knows — probably the pixmaps require more memory for padding than the raw RGB data.
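One plausible account of the gap, as a rough sketch: many X servers store 24-bit-deep pixmaps with 32 bits per pixel, so each pixel takes 4 bytes instead of the 3 bytes of raw RGB in a PPM file. That padding factor is an assumption here, not something measured in this test; the other numbers come from the du and xrestop output above.

```python
# Assumption: the X server pads 24-bit RGB pixels to 32 bits (4 bytes per pixel).
RAW_RGB_KB = 67172          # du -s uncompressed (PPM, 3 bytes per pixel)
BASELINE_PIXMAP_KB = 415    # xrestop "Pxm mem", newly started Firefox
LOADED_PIXMAP_KB = 95327    # xrestop "Pxm mem", after loading the page


def padded_size_kb(raw_rgb_kb, bytes_per_pixel=4):
    """Estimate pixmap size if each 3-byte RGB pixel occupies bytes_per_pixel bytes."""
    return raw_rgb_kb * bytes_per_pixel / 3


estimated_kb = padded_size_kb(RAW_RGB_KB)
observed_kb = LOADED_PIXMAP_KB - BASELINE_PIXMAP_KB
unexplained_kb = observed_kb - estimated_kb

print(f"estimated:   {estimated_kb:.0f} KB")
print(f"observed:    {observed_kb} KB")
print(f"unexplained: {unexplained_kb:.0f} KB")
```

Under that assumption, padding would account for most of the difference, leaving roughly 5 MB that this estimate does not explain.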
Remember that our compressed images occupied about 4.3 MB? The Firefox process only grew by about 2 MB: this indicates that it doesn't keep the compressed data in memory once it has shipped the images to the server. That's a good thing: keep a single copy of the image data.
Then, I opened a new empty tab in Firefox, and closed the tab with the images:
    ps:
    USER       PID %CPU %MEM   VSZ   RSS TTY   STAT START TIME COMMAND
    federico 27833  2.9  5.0 56544 26200 pts/0 Sl   11:12 0:06 /opt/MozillaFirefox/lib/firefox-bin

    xrestop:
    res-base Wins GCs Fnts Pxms Misc Pxm mem Other  Total PID Identifier
    4400000    91  83    1   46   32  24988K    5K 24994K   ? Mozilla Firefox
The Firefox process grew its resident size by a bit, but it didn't grow its total virtual size. So, no worries on that front. However, it left about 24.9 MB of pixmaps in the server. Why is that? Is it part of the browser's cache? Or is it leaking pixmaps?
Let's find out. I loaded the page with JPEGs again:
    ps:
    USER       PID %CPU %MEM   VSZ   RSS TTY   STAT START TIME COMMAND
    federico 27833  3.3  5.5 58568 28544 pts/0 Sl   11:12 0:09 /opt/MozillaFirefox/lib/firefox-bin

    xrestop:
    res-base Wins GCs Fnts Pxms Misc Pxm mem Other  Total PID Identifier
    4400000   105  89    1   78   38  95328K    6K 95334K   ? Mozilla Firefox
The process grew this time by about 2 MB in its total virtual size, and increased its resident size by about 2 MB. Note that it is slightly bigger than the first time we loaded the page with JPEGs. But that's only a little memory compared to what we have in the server, so let's not worry about it for now. Note that the server grew again to hold about 95 MB of pixmaps. This matches the value from our initial run, so we can be reasonably confident that Firefox is not leaking pixmaps in the X server. It just keeps them there as part of the browser cache.
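The leak check above boils down to comparing the "Pxm mem" column after the first and second loads of the same page. A minimal sketch of that comparison follows; the helper names and the 64 KB tolerance are made up for illustration, and the parser only handles sizes written with a K suffix, as in the output shown here.

```python
def parse_kb(text):
    """Convert a size such as '95327K' to an integer number of KB."""
    if not text.endswith("K"):
        raise ValueError(f"expected a size in K, got {text!r}")
    return int(text[:-1])


def looks_like_leak(first_load, second_load, tolerance_kb=64):
    """Return True if the second load holds noticeably more pixmap memory."""
    return parse_kb(second_load) - parse_kb(first_load) > tolerance_kb


# Values from the "Pxm mem" column above.
first = "95327K"
second = "95328K"
print(looks_like_leak(first, second))  # False
```

If Firefox leaked the pixmaps from the closed tab, the second load would be expected to land near 95 MB plus the 25 MB left over, rather than back at 95 MB.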
What have we learned so far?
Yes, uncompressed images are much larger than compressed ones.
Firefox doesn't keep more than one copy of the images around, in either representation. That's good.
Firefox keeps the images for all the tabs that are open as uncompressed pixmaps in the X server. That's not horrible, as it has to keep the images somewhere.
But can we do better? That is, can we keep only (or mostly only) the compressed images in memory, until we need to draw them on the screen?
Proof of concept
Let's summarize what we want to test:
If you must keep images in memory, keep them in compressed format ("as downloaded from the net"). What's the memory savings of doing this?
When you need to paint an image to the screen (i.e. when the user scrolls to make the image visible), uncompress it on the fly. Does this lead to bad performance?
To answer the part about memory savings, we just have to compare the size of the compressed and the uncompressed images. That's easy.
To answer the performance question, we have to run a program with that behavior and scroll around its window to get an intuitive feel of how it performs. If needed, we can add instrumentation or profile the program to get hard numbers.
The program in question is very simple: moz-images.cs. Download it and look at the comments at the top to see how to compile it. You run the program like this:
mono moz-images.exe /directory/full/of/images mode
The mode option can be --server, --client, or --compressed. These do the following:
--server: Loads all the images in advance, creates uncompressed X pixmaps for them, frees the compressed data. When it has to repaint, it just copies from the appropriate pixmap to the window.
--client: Loads all the images in advance, keeps them uncompressed in the client as GdkPixbuf objects. When it has to repaint, it sends the appropriate area from a pixbuf to the X server for painting.
--compressed: This is the interesting one. Loads the images, but keeps them in their original compressed form in memory. When it has to repaint, it uncompresses the images that would be visible on the screen, and sends them to the X server. When you scroll so that an image is no longer visible, the program frees the uncompressed data for that image; it will decompress the original data again as needed.
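To make the --compressed behavior concrete, here is a small sketch of the idea in Python. It is not the code from moz-images.cs: the class and method names are invented, and zlib stands in for the JPEG decoder so that the example needs nothing outside the standard library.

```python
import zlib


class CompressedImageCache:
    """Keep images compressed; hold decoded data only for visible images."""

    def __init__(self):
        self._compressed = {}   # name -> compressed bytes, always kept
        self._decoded = {}      # name -> decoded bytes, visible images only

    def add(self, name, compressed_bytes):
        self._compressed[name] = compressed_bytes

    def set_visible(self, visible_names):
        """Decode newly visible images and free the ones scrolled out of view."""
        visible = set(visible_names)
        for name in list(self._decoded):
            if name not in visible:
                del self._decoded[name]
        for name in visible:
            if name not in self._decoded:
                # zlib stands in for the real image decoder (an assumption).
                self._decoded[name] = zlib.decompress(self._compressed[name])

    def decoded_bytes(self):
        return sum(len(data) for data in self._decoded.values())


cache = CompressedImageCache()
for i in range(3):
    cache.add(f"img{i}", zlib.compress(bytes(30000)))

cache.set_visible(["img0", "img1"])
print(cache.decoded_bytes())   # 60000
cache.set_visible(["img1", "img2"])
print(cache.decoded_bytes())   # 60000
cache.set_visible([])
print(cache.decoded_bytes())   # 0
```

The uncompressed footprint tracks what is on screen rather than the number of images loaded, which is the property the --compressed mode is meant to test.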
I used ps and xrestop for each case to get a table of results. To ensure that all the images were paged into memory, I scrolled up and down the window a few times for each case, and then ran ps and xrestop.
| Mode | Resident size | Pixmaps in X server | Total size | Performance |
|---|---|---|---|---|
| --server | 15380 KB | 92683 KB | 108063 KB | (1) |
| --client | 85324 KB | 384 KB | 85708 KB | (2) |
| --compressed | 19204 KB | 384 KB | 19588 KB | (3) |
With --server, the program took a long time to start up as my machine started swapping to make room for the overgrown X server process. The second run was faster, because there was room in RAM already. It still took a little while to uncompress all the images in advance (between one and two seconds). Scrolling was super-fast, as expected, since all the pixmaps were already in RAM and in the server. Note that we use relatively little memory in the client, but a lot of memory in the server. The cumulative usage is 108063 KB.
With --client, the program was reasonably fast to start up, though there was room in RAM already. It still took a little while to uncompress all the images in advance. Scrolling was super-fast and I could not see a difference between this and --server. This is to be expected as the program was running on the same machine as the X server. We use very little memory in the server indeed, but a lot in the client. The cumulative usage is 85708 KB.
With --compressed, the program started up faster than in the other runs, since it doesn't have to uncompress all the images in advance. Scrolling is a bit jerky if you yank the scrollbar's thumb up and down very quickly. The jerkiness is barely noticeable if you use the scroll wheel or the scrollbar's arrows at about the same rate as you would use while reading a web page. Note that we use reasonably little memory in the client, and very little memory in the server. The cumulative usage is 19588 KB.
Also, compare the resident size of --compressed with that of --server. In the former, we use 19204 KB; in the latter, we use 15380 KB. The difference is about 4 MB. This matches our initial measurement of 4.3 MB for the compressed JPEGs.
Summary
For Firefox, keeping the compressed, as-downloaded images in the client could reduce memory consumption a lot. My program is a proof of concept and doesn't do anything fancy to reduce the little jerkiness that results from uncompressing the images on the fly. Firefox could get much fancier; it could probably uncompress the images that are adjacent to the viewable area, in advance, while you are still reading what is in the viewable area. This would be equivalent to read-ahead for hard disks.
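The read-ahead idea can be sketched as a function that, given the scroll position, picks the images to decode now and the ones to decode in advance. Everything here is hypothetical: the function name, the (name, top, height) layout tuples, and the read_ahead margin in pixels are for illustration only and say nothing about how Firefox lays out or decodes images.

```python
def images_to_decode(images, view_top, view_height, read_ahead):
    """Split images into those visible now and those worth decoding in advance.

    images: list of (name, top, height) in page coordinates.
    read_ahead: margin, in pixels, above and below the viewport.
    """
    view_bottom = view_top + view_height
    visible, prefetch = [], []
    for name, top, height in images:
        bottom = top + height
        if bottom > view_top and top < view_bottom:
            visible.append(name)
        elif bottom > view_top - read_ahead and top < view_bottom + read_ahead:
            prefetch.append(name)
    return visible, prefetch


page = [("a", 0, 400), ("b", 400, 400), ("c", 800, 400),
        ("d", 1200, 400), ("e", 2400, 400)]
visible, prefetch = images_to_decode(page, view_top=500, view_height=300,
                                     read_ahead=400)
print(visible)   # ['b']
print(prefetch)  # ['a', 'c']
```

A larger margin trades memory for smoother scrolling; a margin of zero degenerates to the plain --compressed behavior.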
For the proof-of-concept program, we reduced the cumulative memory usage from 108063 KB to 19588 KB — that is, a factor of 5.5.
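The totals and the reduction factor can be rechecked from the table of results with a few lines of Python:

```python
# (resident KB in the client, pixmap KB in the X server), from the table above
results = {
    "--server": (15380, 92683),
    "--client": (85324, 384),
    "--compressed": (19204, 384),
}

totals = {mode: client + server for mode, (client, server) in results.items()}
factor = totals["--server"] / totals["--compressed"]

for mode, total in totals.items():
    print(f"{mode}: {total} KB")
print(f"reduction factor: {factor:.1f}")
```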
Emmanuele Bassi keeps on profiling GMarkup to use it with the new bookmark parser for GTK+, and gets some very promising results. This code should land in GTK+ pretty soon; the idea is to quickly replace all uses of EggRecent with this new shared code.
Rich Burridge describes trying to use Evolution as a blind user. Pretty interesting stuff.
If you are using gcc 4.1 and libgnomecanvas makes your gdm/evolution crash, see this bug. It's about the second time that libart-related code has exposed a gcc bug. This has been a public service announcement.
Profiling the file chooser, part 9: different sizes for the gunichar -> glyph_index cache
The patches for Pango are landing now on CVS! Behdad committed an updated Fribidi, my GQuark patch, and other patches. I committed my cleaned-up patch for the gunichar -> glyph_index mapping.
After writing that patch, the first question that arose was what would happen if we made the glyph cache bigger. That should help CJK languages, since they have large numbers of glyphs, right? Currently the cache is fixed-size, with 256 entries. I modified the patch slightly, and took timings for 512 and 1024 entries. Here are the results:
Little variations are just noise. Big jumps are noise — I probably need to run more iterations (although the test does take a good hour to run as it is). It's interesting to see that bigger caches don't help CJK languages very much, if at all. So, we can use 256 entries confidently. As Behdad pointed out, taking the lower 8 bits of a word is probably faster than taking the lower 9 bits...
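For readers who have not seen the patch, here is a sketch of what a fixed-size, 256-entry cache indexed by the lower 8 bits of the character code can look like. This is a generic direct-mapped cache written in Python for illustration, not the Pango code; the class name and the slow lookup function are made up.

```python
CACHE_BITS = 8
CACHE_SIZE = 1 << CACHE_BITS      # 256 entries
CACHE_MASK = CACHE_SIZE - 1       # 0xFF


class GlyphCache:
    """Direct-mapped cache from a character code to a glyph index."""

    def __init__(self, lookup):
        self._lookup = lookup                 # slow path: char code -> glyph
        self._keys = [None] * CACHE_SIZE
        self._glyphs = [0] * CACHE_SIZE
        self.misses = 0

    def get(self, char_code):
        slot = char_code & CACHE_MASK         # lower 8 bits pick the slot
        if self._keys[slot] != char_code:
            self.misses += 1
            self._keys[slot] = char_code
            self._glyphs[slot] = self._lookup(char_code)
        return self._glyphs[slot]


# Hypothetical slow lookup, standing in for the font's character map.
cache = GlyphCache(lambda code: code + 1000)
print(cache.get(0x41), cache.misses)    # 1065 1
print(cache.get(0x41), cache.misses)    # 1065 1 (hit)
print(cache.get(0x141), cache.misses)   # 1321 2 (same slot, evicts 0x41)
print(cache.get(0x41), cache.misses)    # 1065 3
```

Character codes that share their lower 8 bits compete for the same slot, so a bigger table only helps when the text actually produces such collisions.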
The remaining interesting cases are ne and hi, which are actually harmed by the cache. Why does that happen? Do we have any speakers of those languages that can investigate?
Summary: this patch is in Pango CVS HEAD right now. Enjoy!
Federico Mena-Quintero <federico@gnome.org> Fri 2005/Nov/04 17:18:54 CST