Go forward in time to April 2007.
Nicolas Setton mailed gtk-devel-list with an interesting problem. He created a GtkTreeView with 5000 columns and just 50 rows, using GtkListStore as the data model. He said that it is fantastically slow to operate, taking between 5 and 10 seconds to repaint when you show the last columns.
He ran a profiler on his test program, and found some interesting things.
How does GtkListStore work?
This diagram shows a GtkListStore with 8 rows and 3 columns (SVG):
GtkListStore is one of the stock data models for GtkTreeView. It has the following requirements:
To access, insert, and delete rows quickly, GtkListStore uses a GSequence internally, which is a splay tree. That's the big green area on the left of the diagram.
Each node of the splay tree stores the column data for that row in the GtkListStore. A list can have several columns, and we have a little structure called GtkTreeDataList that looks like this:
typedef struct _GtkTreeDataList GtkTreeDataList;
struct _GtkTreeDataList
{
GtkTreeDataList *next;
union {
gint v_int;
gint8 v_char;
guint8 v_uchar;
... etc ...
} data;
} GtkTreeDataList;
Each leaf node in the splay tree points to a GtkTreeDataList, which is a node in a linked list of data nodes. To access the data for a particular (row, column) in the GtkListStore, it first finds the appropriate row in the splay tree, and then it walks the list of GtkTreeDataList structures to find the appropriate column.
Disaster ensues
Of course, GtkTreeView needs to access the data in its model pretty frequently, i.e. every time it has to repaint. The repaint cycle looks like this:
foreach row in repaint_area {
foreach column in row {
cell_renderer = find the cell renderer for the column;
data = gtk_list_store_get_value (row, col);
cell_renderer.set_data (data);
cell_renderer.paint ();
}
}
So, within each row we walk a linked list many times, as many as there are columns in the tree. To access the cell highlighted in red in the diagram, you must first find its row, and walk all the cells that come before it. So, painting a row is an O(n^2) operation, where n is the number of columns. This will dominate the profile whenever there is a large number of columns, like in Nicolas's example.
The solution should be pretty simple: just replace GtkTreeDataList with an array of datums. This will give us constant-time access for any of the columns within a given row. The implementation of GtkListStore is thankfully completely private in GTK+, so we don't need to change any APIs/ABIs to do this.
Changing the implementation of rows to be an array will also save us some memory: there will be no padding overhead from GSlice (as GtkTreeDataList nodes are allocated with GSlice), and we will save one pointer per node.
President Calderón received Bill Gates in his office to give him an important medal. A medal for what!? For fucking all of Mexico in the ass whenever there are ideas about stopping the use of proprietary software? For donating computers running Windows which will need maintenance and support from Microsoft in the future? For making sure that it will be very hard for the Mexican public administration to move to free software?
Did we really vote these criminals into office?
Oh, wait, we didn't.
Projects for Summer of Code 2007
I'm willing to mentor at most two of these Summer of Code projects this year:
Integrate desktop search engines into GtkFileChooser. This is reasonably easy. You just have to go and fix the remaining problems with the existing patch. Who will love this: every GNOME user.
New geometry manager for GTK+. Currently we support a very simple system in which widgets tell GTK+ about their minimum size. In addition to that, we need widgets to be able to tell GTK+ about their natural size or preferred size. For example, the minimum width for a list widget is the sum of the minimum width of each column. Its natural size, however, may be the amount of space to see a reasonable amount of data in the list. Also, GTK+ needs to support height-for-width or width-for-height geometry: currently text labels which wrap their text have no way to tell GTK+ that they need to change their height, and people have to resort to all sorts of ugly/barely-working hacks to make this work.
Lockdown support for Nautilus and GtkFileChooser. System administrators of large installations need to be able to pre-define items that will show up on users' desktops, define shortcuts in the file manager and file chooser, and to constrain which parts of the file system are viewable by users.
You can find more details about these projects in the page about Summer of Code ideas for GNOME.
I've written a little Summer of Code Mentoring HOWTO. I hope people find it useful :)
Three weeks ago I went to the Novell offices in Provo, Utah, USA, a sprawling, sleepy town which is completely surrounded by MOUNTAINS. Who needs walled cities when you have incredibly beautiful mountains all around you. This is the company's parking lot:
I flew back from Provo and met Oralia at the Mexico City airport, and we flew to Brussels for FOSDEM. This is, by far, the best organized conference I've ever been to. It must be something they put in the beer.
Or the mussels. Diving into a gigantic pot of moules, with a side dish of frites, while drinking insanely good beer... well, even though we did it squarely in the middle of the Brussels tourist trap, it was fantastic.
During FOSDEM I gave a talk on Profiling Desktop Applications (ODP / video).
Andrew Morton gave a talk on the status of the Linux kernel. During the Q&A session I asked more or less the following:
Why do the kernel people blame us when our apps are slow? The kernel does not give us enough hooks to get useful profiling data.
Andrew answered, "um, well, uh, like what?". And Michael Meeks retorted, more or less:
The Linux kernel has completely unpredictable swap performance. It does not give us information on what processes cause disk access (especially disk seeks) to happen, either through I/O syscalls or "unconsciously" through page faults.
Andrew: "uh, oh, that's pretty interesting. Yeah, that would be nice to have".
The kernel people do not know what we need to really profile applications for I/O and memory access. We need to tell them so that they can implement this for us.
Meanwhile, Michael came up with an elegant idea to use valgrind to profile page faults.
Following the conference, we visited the lovely town of Brugge, where bonobos haunt the streets.
SISCTI 32, Tecnológico de Monterrey
After Brussels, I went to Monterrey to a very well-known private university in Mexico to give my Tutorial de liberación de oprimidos y opresores (ODP), a talk in favor of using free software for teaching Computer Science and Engineering.
The students went absolutely wild with this, and they were kind enough to give me a standing ovation. Not even Woz, who also spoke at the event, got that kind of response. Take that, proprietary software!
These are the bright students from the Linux User Group at the Tec de Monterrey. Felipe, the one on the far right with the blue T-shirt, wrote the code to support MSN messaging in Gaim.
It's good to be back home.
Go backward in time to February 2007.
Federico Mena-Quintero <federico@gnome.org> Tue 2007/Mar/06 21:10:16 CST