Matthieu Delahaye has been analyzing the GTK+ icon cache in greater depth. He's looking into how many pages must be loaded from the cache file, in total, to access a particular icon. This includes the actual RGBA data, the pages that contain the hash tables to find the icons, file headers, etc.
The other day I patched GTK+ to log accesses to icon caches. Here is a time plot of where icons get accessed within $(prefix)/share/icons/Tango/icon-theme.cache on my machine, over the course of about two days:
This is using my usual set of programs: gnome-panel, Nautilus, Evolution, Epiphany, a bit of OpenOffice, EOG, etc. All of them access the Tango icon cache at various times, requesting various icons. These icons fall at particular offsets within the cache file, and the plot shows that the icons I actually use are scattered all over the file. For example, most apps use GTK_STOCK_OPEN and GTK_STOCK_SAVE, but hardly any apps use GTK_STOCK_UNINDENT.
It would make sense to put frequently-used icons together near the beginning of the cache file, and to put seldom-used ones at the end. This would minimize disk seeks and also the number of pages that need to be loaded from the file. For example, OPEN and SAVE are often right next to each other in a program's toolbar, but those icons are spread far apart in the icon cache — OPEN is at offset 44,862,448 on my machine, while SAVE is at offset 34,974,592. This is 9.4 megabytes apart! No wonder there is so much disk seeking going on.
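As a quick sanity check on that arithmetic, here is a small sketch that works out the gap between the two offsets quoted above. The 4 KiB page size is an assumption (it is common on x86 Linux, but not universal), and distance_report is just a name made up for this example.

```python
# Offsets quoted in the text above (bytes into icon-theme.cache).
OPEN_OFFSET = 44_862_448
SAVE_OFFSET = 34_974_592

# Assumption: 4 KiB pages; check your own system's page size.
PAGE_SIZE = 4096


def distance_report(a, b, page_size=PAGE_SIZE):
    """Return (bytes apart, MiB apart, pages apart) for two file offsets."""
    gap = abs(a - b)
    pages_apart = abs(a // page_size - b // page_size)
    return gap, gap / 2**20, pages_apart


if __name__ == "__main__":
    gap, mib, pages = distance_report(OPEN_OFFSET, SAVE_OFFSET)
    print(f"{gap} bytes = {mib:.1f} MiB, {pages} pages apart")
```

With these two offsets the gap is 9,887,856 bytes, so the two icons sit a couple of thousand pages away from each other under the 4 KiB assumption.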
This is another version of the same plot, but timestamps have been replaced simply with the number of icons accessed.
This second plot reveals some interesting things. After I unsuspend my laptop, gnome-main-menu requests nm-no-connection 4699 times, and then it requests nm-device-wireless 1634 times within a few seconds. From a quick look at the code, this is not gnome-main-menu's fault; it could be that NetworkManager is sending out tons of status-changed notifications while it tries to re-connect after unsuspending.
We need an algorithm that will do this:
Here are some tools:
gtk-log-icon-cache.diff: patch to GTK+ to log accesses to the icon cache (be sure to create a ~/gtk-icon-cache-logs directory first).
collate-icons.py: Collect the icon cache logs and sort them by cache file and timestamp.
icon-accesses.gnuplot: Gnuplot file to generate the second plot, after the Tango icons have been cut&pasted out of the result of collate-icons.py.
Subject: WE ARE ALL GOING TO NEED IT
We all know destiny will get us one day.
(Complete with bad spelling and all — alcanzará is missing an accent...)
Some highlights from that spam:
Mourning chapels within the cemetery and a 24-hour coffee shop [...] the most practical and elegant cemetery in Mexico City [...] unique design [...] valuable equity.
GTK+ can be made to create an icon cache out of the many icons that are installed on your system. This cache ends up being a big binary file that programs mmap() and share, so there is only one copy in memory of the RGB data for icons. The gtk-update-icon-cache program, which most distributions run when they install packages, takes care of creating these cache files.
To create the caches, gtk-update-icon-cache simply scans /usr/share/icons/themename and puts all the icons it finds into the new cache file. So, the icons get laid out in the file in the same order they appear on disk.
This ordering, however, is not optimal. It's probably alphabetical, so icons called "open" and "save" will end up far away from each other within the cache. Still, most applications which create a File menu or a toolbar will want to place the icons for "open" and "save" next to each other, so they'll load them one after the other — and this could cause a disk seek the first time those icons are read.
Also, with the most common icons being scattered all over the icon cache files, we are loading more pages from disk than we strictly need. If a commonly-used icon starts in the middle of a page which it shares with an infrequently-used one, then we'll be wasting space for that infrequently-used icon, and yet the page must be loaded from disk, since it contains data for our common icon.
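To make the page argument concrete, here is a minimal sketch that computes which pages an icon's data covers and how many of the loaded bytes belong to something else. The 4 KiB page size and the offsets in the usage note are assumptions for illustration only; the function names are made up for this example.

```python
# Assumption: 4 KiB pages; the real value depends on the system.
PAGE_SIZE = 4096


def pages_touched(offset, length, page_size=PAGE_SIZE):
    """Return the range of page indexes covered by bytes [offset, offset + length).

    length must be greater than zero.
    """
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


def bytes_loaded_but_unused(offset, length, page_size=PAGE_SIZE):
    """Bytes that get paged in alongside the icon but are not part of it."""
    return len(pages_touched(offset, length, page_size)) * page_size - length
```

For a hypothetical 1960-byte icon at offset 6000, pages_touched() covers a single page and 2136 of the 4096 loaded bytes belong to other data. Move the same icon to offset 4000 and it straddles two pages, so 6232 bytes are loaded that we did not ask for.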
Here is a patch for gtkiconcache.c which creates a log of which icons get fetched from the icon cache:
Build and install GTK+ with this patch, and create a gtk-icon-cache-logs directory in your $HOME. When you run GTK+ programs, this directory will be populated with one log file per process. The logs have information like this:
cache_mapped: 0x85d9bc8 /opt/gnome/share/icons/Tango
cache_mapped: 0x879b020 /opt/gnome/share/icons/Tango
cache_mapped: 0x87af620 /opt/gnome/share/icons/gnome
cache_mapped: 0x87c61e0 /opt/gnome/share/icons/hicolor
cache_mapped: 0x87c6660 /usr/share/icons/hicolor
get_icon: cache:0x879b020 directory:'48x48/mimetypes' icon_name:'text-x-generic' offset:58012708 length:9240
get_icon: cache:0x879b020 directory:'24x24/mimetypes' icon_name:'text-x-generic' offset:58026100 length:2328
get_icon: cache:0x85d9bc8 directory:'22x22/actions' icon_name:'gtk-go-forward-ltr' offset:12248984 length:1960
get_icon: cache:0x85d9bc8 directory:'22x22/actions' icon_name:'gtk-save' offset:34976936 length:1960
get_icon: cache:0x85d9bc8 directory:'22x22/actions' icon_name:'gtk-go-forward-ltr' offset:12248984 length:1960
get_icon: cache:0x879b020 directory:'48x48/mimetypes' icon_name:'text-x-generic' offset:58012708 length:9240
get_icon: cache:0x879b020 directory:'24x24/mimetypes' icon_name:'text-x-generic' offset:58026100 length:2328
get_icon: cache:0x85d9bc8 directory:'22x22/actions' icon_name:'gtk-save' offset:34976936 length:1960
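If you want to play with these logs, here is a rough sketch of a parser for the two kinds of entries shown above. The regular expressions are inferred only from this sample, so treat the exact format as an assumption; parse_log and SAMPLE_LOG are names invented for this example.

```python
import re

# Patterns inferred from the sample log above; the real format may have
# variations that this sketch does not handle.
CACHE_MAPPED_RE = re.compile(
    r"cache_mapped: (?P<cache>0x[0-9a-fA-F]+) (?P<path>\S+)"
)
GET_ICON_RE = re.compile(
    r"get_icon: cache:(?P<cache>0x[0-9a-fA-F]+) "
    r"directory:'(?P<directory>[^']*)' "
    r"icon_name:'(?P<icon_name>[^']*)' "
    r"offset:(?P<offset>\d+) length:(?P<length>\d+)"
)


def parse_log(text):
    """Return (caches, accesses) parsed from the text of one log file.

    caches maps a cache handle such as '0x85d9bc8' to its theme path.
    accesses is a list of dicts, one per get_icon entry, in log order.
    """
    caches = {m.group("cache"): m.group("path")
              for m in CACHE_MAPPED_RE.finditer(text)}
    accesses = [
        {
            "cache": m.group("cache"),
            "directory": m.group("directory"),
            "icon_name": m.group("icon_name"),
            "offset": int(m.group("offset")),
            "length": int(m.group("length")),
        }
        for m in GET_ICON_RE.finditer(text)
    ]
    return caches, accesses


# A few entries copied from the sample above.
SAMPLE_LOG = """\
cache_mapped: 0x85d9bc8 /opt/gnome/share/icons/Tango
cache_mapped: 0x87af620 /opt/gnome/share/icons/gnome
get_icon: cache:0x879b020 directory:'48x48/mimetypes' icon_name:'text-x-generic' offset:58012708 length:9240
get_icon: cache:0x85d9bc8 directory:'22x22/actions' icon_name:'gtk-save' offset:34976936 length:1960
"""

if __name__ == "__main__":
    caches, accesses = parse_log(SAMPLE_LOG)
    for access in accesses:
        print(access["icon_name"], access["offset"], access["length"])
```

The patterns are matched with finditer() over the whole text rather than line by line, so the parser does not care whether the entries are separated by newlines or by spaces.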
The idea is to find an arrangement that would pack the most commonly-used icons together in the icon cache, and then to modify gtk-update-icon-cache to use that arrangement when creating the cache files.
If you are looking for a nice weekend hack, you can do this:
Download gtk-icon-cache-logs.tar.bz2, which contains some data from my own GNOME session.
Write a script to sort the icons by popularity.
Make a nice plot showing which icons are the most popular.
Add an option to gtk-update-icon-cache so that you can pass it "--popularity-data=gtk-icon-cache-logs/*" and it will build the popularity table itself. Then, it should lay out the icons in the appropriate order within the cache file.
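To give an idea of the popularity part of that hack, here is a minimal sketch. It assumes the accesses have already been parsed into dicts with directory, icon_name and length keys, and that they all belong to a single cache file. The access sequence in SAMPLE_ACCESSES is made up for illustration (only the names and lengths come from the log excerpt), and the layout only models the image data, not the hash tables or headers that the real cache file also contains. This is one simple greedy ordering, not necessarily what gtk-update-icon-cache should end up doing.

```python
from collections import Counter


def popularity(accesses):
    """Return [((directory, icon_name), count), ...], most requested first."""
    counts = Counter((a["directory"], a["icon_name"]) for a in accesses)
    return counts.most_common()


def layout_by_popularity(accesses, start_offset=0):
    """Assign new offsets so the most-requested icons are packed first.

    Returns a dict mapping (directory, icon_name) to its new offset.
    """
    lengths = {(a["directory"], a["icon_name"]): a["length"] for a in accesses}
    layout = {}
    offset = start_offset
    for key, _count in popularity(accesses):
        layout[key] = offset
        offset += lengths[key]
    return layout


def _access(directory, icon_name, length):
    return {"directory": directory, "icon_name": icon_name, "length": length}


# Hypothetical access sequence for one cache file.
SAMPLE_ACCESSES = [
    _access("22x22/actions", "gtk-save", 1960),
    _access("22x22/actions", "gtk-go-forward-ltr", 1960),
    _access("22x22/actions", "gtk-save", 1960),
    _access("24x24/mimetypes", "text-x-generic", 2328),
    _access("22x22/actions", "gtk-go-forward-ltr", 1960),
    _access("22x22/actions", "gtk-save", 1960),
]

if __name__ == "__main__":
    for (directory, name), count in popularity(SAMPLE_ACCESSES):
        print(f"{count:4d}  {directory}/{name}")
    print(layout_by_popularity(SAMPLE_ACCESSES))
```

Sorting purely by request count ignores which icons tend to be requested together, so a smarter arrangement could also look at the order of accesses within each log.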
Big thanks to Emmanuele for merging Beagle support into GtkFileChooser. Just to clarify: I didn't write all of the patch, just the GUI code and the originally-hardwired glue to the Beagle API. Someone at Imendio (Anders Carlsson?) wrote the general GtkSearchEngine interface with Beagle and Tracker backends, similar to the way things are handled inside Nautilus.
Federico Mena-Quintero <email@example.com> Thu 2007/May/03 15:04:19 CDT