Stuff Michael Meeks is doing

This is my (in)activity log. You might like to visit Collabora Productivity a subsidiary of Collabora focusing on LibreOffice support and services for whom I work. Also if you have the time to read this sort of stuff you could enlighten yourself by going to Unraveling Wittgenstein's net or if you are feeling objectionable perhaps here. Failing that, there are all manner of interesting things to read on the LibreOffice Planet news feed.

Older items: 2021: ( J F M A M J J A S O N D ), 2019, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, legacy html


LibreOffice progress to 3.4.0

Today we released 3.4.0, there is a great list of new features, specific to LibreOffice available (expertly assembled by Marc Pare and others). Each should also be credited so some of the depth of the (code) developer community is apparent, this is of course in addition to our extensive credits page (kept up to date by a volunteer of course). This is the first major release containing code from many of our new volunteers which is exciting. Of course, there are some great improvements there, I like the named range / data-pilot work that makes it easy to extend the data range you're working on without manually editing perhaps ten data-pilots depending on it but there are a load more. Some of the changes are invisible, and/or behind the scenes - so I thought I'd expand on them here.

The incredible shrinking codebase

First - ridding ourself of sillies - there is lots of good work in this area, eg. big cleanups of dead, and unreachable code, dropping export support from our (deprecated for a decade) binary filters and more. I'd like to highlight one invisible area: icons. Lots of volunteers worked on this, at least: Joseph Powers, Ace Dent, Joachim Tremouroux and Matus Kukan. The problem is that previously OO.o had simply tons of duplication, of icons everywhere: it had around one hundred and fifty (duplicate) 'missing icon' icons as an example. It also duplicated each icon for a 'high contrast' version in each theme (in place of a simple, separate high contrast icon theme), and it also propagated this effective bool highcontrast all across the code bloating things. All of that nonsense is now gone, and we have a great framework for handling eg. low-contrast disabilities consistently. This also reduces run-time memory consumption (we have to cache the .zip theme's directory), and of course binary installation and download size shrinks (we ship several themes to taste) - here is a pretty picture:


The incredible shrinking footprint on Windows

When we first released LibreOffice, in place of the individual install set per language - duplicating the code, artwork, templates, etc. many tens of times on each server we switched to bundling lots of languages (and also the run-time adapted BrOffice branding) into a single package. That shrunk our mirror load from 76Gb down to 11Gb for 3.3, which is now for 3.4 down to 7.6Gb a handy win (of ~70Gb) for mirror admins, making us more agile, and appreciated by our fantastic mirror network I hope. To achieve this, in consultation with the l10n team, Kendy worked hard to split out our help to the wiki - so you can browse it on-line at http://help.libreoffice.org. That of course helps Linux users' space-constrained live-CDs more useful too.

With that work in-place, we managed to cram the top fifty languages into only 225Mb on release, which (sadly) left the remaining languages in a rather larger 265Mb download. In 3.4.0 down to improvements such as the them work above, sharing templates, dropping the BrOffice brand (at the Brazilians request) and more importantly Kami's gret msi packing improvements we've managed to pack all our languages, and many more dictionaries, and more functionality into a single download at 197Mb. That is clearly still too big, but smaller than a MS Office service pack. We are still some 40Mb larger than the original single-lang packaging (which it is our goal to match), but there is plenty more room for improvement (eg. gutting the megabytes of pointless ICU code we ship), and we'd love more help, (cf. scale offset of 150Mb):


Better translation infrastructure

I'm not an expert here, but our fantastic l10n team, swelled to 200 contributors is doing some great work on getting the latest translations into the product. They're using pootle for that, but better (thanks to Andras) we've switched from storing our translations in SDF format, instead using native gettext .po files compiled to various odd other formats during the build. Bjoern has a prototype patch coming to allow run-time .po file translation, to allow post-ship translation and better integration with Launchpad.

Heroic merging

One of the huge, invisible tasks for 3.4.0 was the merging of Oracle's code changes. Luckily, of course we use git - and several split repositories, such that merging should be easy. Then again, we have done lots of widespread changes, many hundreds of thousands, if not millions of lines of diff, stripping dead code, translating thousands of comments, touching every single file, as well as more substantial API and code changes.

One of the things that made the merge more tricky, was that (after decades of inactivity), some Oracle developers decided now (post fork) would be a great time to do some long overdue large-scale code cleanups. One of those was replacing every instance of TRUE with sal_True, same for FALSE, all instances of legacy types like BOOL to sal_Bool, for each of UINT16, ULONG, LONG etc. etc. Unfortunately, not an easy 'sed' either, as witness to some of the bogus "sal_True" type strings that popped out of the works. While a valuable, and much needed cleanup, this results in the kind of multi-million-line patch (touching all the dead code too) that has the effect of obfuscating their other valuable changes, and takes a certain determination to merge and reconcile.

The effort, spearheaded by Norbert with many straight days of mindless grunt work from myself, and able assistance from Thorsten, Kendy, Kohei, Noel and others - hopefully highlights a triumph of determination and survival against the odds, the tedium and some RSI. Unfortunately, due to some comic, transient technical hitches (that resulted in having to do much of the work twice) this merge rather delayed the code freeze and our first betas for 3.4.

Tinderboxes / rapid building & QA

With the rush to get 3.3 out, and the stabilisation releases after it, as well as the one-shot fun of migrating many Linux distributions over to use LibreOffice there was not enough spare bandwidth to get many tinderboxes or build-bots building. Unfortunately, this had quite an impact on the QA teams particularly after the merge completed, which was sad. The good news is, that we have mostly fixed this now, and have much more recent (the aim is daily) install-sets for most platforms available for testing - a great way to get involved is to help with re-testing old bugs against the latest stable releases.

A chunk of this is down to better tooling and scripts to drive tinderboxes, though we still struggle with unreliable Windows builds, cygwin's sh.exe, or perl.exe or dmake like to wedge intermittently some hours into the build. On the Mac front, Norbert had a breakthrough to shrink the build time. An all-lang Mac build used to take 15 hours (13 inside helpcontent2) - he managed to get this down to under 30 minutes using a ramdisk - substantially improving our agility and ability to turn around builds quickly: getting the latest code to QA fast. One of many great build performance improvements - with other much appreciated packaging acceleration wins in the tangle of badly written perl by guys like Jan Darmochwal, Julien Nabet and Steve Butler.

Then of course some QA stars like Rainer, Alex, Cor, Andre and many others have been working hard at finding, cleanly filing, triaging, prioritising and marking bugs, critical work.

Time based releases

One of the key features of LibreOffice, from my perspective, is its time based release process. Italo has done a great write-up with several nice pictures of this, one of these is this:


To me - having a predictable, and time-based release is such a key concept, that shipping 3.4.0 as a carefully labeled, point-zero release on time is more important than shipping an incredibly bug-free product, at a future, undefined point. This avoids a deeply political process of deciding when and if to release, and whose pet bug is worth waiting for and why, and why is his worse than mine, and oh ! I just found another one, lets delay another week. The alternatives to a time based release seem to have lead only to long (multi-month) slips.

Clearly, we have learned a lot this cycle, and improved our processes to make future releases even better. Obviously our succession of point releases: 3.4.1, 3.4.2 etc, will incrementally improve quality: indeed we are confident of that, since we already have a lot of fixes in-place for 3.4.1, however the fact remains that 3.4.0 is buggy (in fact all software is, but it is more so than usual, and we found a lot of bugs rather late). The bright side, is that our future point-zero releases, build on our improved infrastructure will be better in future. This is why we are continuing to advertise 3.3.2 (and the soon to be available 3.3.3) as the primary download build for now (thanks Christian for all your hard Javascripty work to make that fly). Please do - check-out 3.4.0 and get stuck into helping us make the Free Software Office world even more fun, fast-moving, and exciting. A great way to start as a developer is to get stuck into an Easy Hack.

Contributors

Of course, lots of people got stuck in already, and continue to do so. We like to graph statistics of commits by affiliation - charting how many people of what affiliation commit at least one patch per month, that gives us a pretty graph like this:


Which (I hope) shows the strength and diversity of the contribitor base, as well as its extraordinary growth since we launched. Sadly, some like to denigrate and despise contributors that only come up with small changes, personally I started off in Free Software as a contributor like that - and we love to help encourage, and mentor contributors from small things - to greater ones.

Other highlights

Well, of course so many others have been involved in making LibreOffice the success it is today, tirelesss work by many Steering Committee members, like Italo and Florian writing and co-ordinating multiple press briefings concurrently, and many helping translate and promote the software. The design team, being patient as we get the basics right before getting too stuck into fixing the UI (while producing some beautiful, much improved artwork), and no-doubt many others I forgot. And please take note - this is just some of the features you can't see, there are plenty that you can"


My content in this blog and associated images / data under images/ and data/ directories are (usually) created by me and (unless obviously labelled otherwise) are licensed under the public domain, and/or if that doesn't float your boat a CC0 license. I encourage linking back (of course) to help people decide for themselves, in context, in the battle for ideas, and I love fixes / improvements / corrections by private mail.

In case it's not painfully obvious: the reflections reflected here are my own; mine, all mine ! and don't reflect the views of Collabora, SUSE, Novell, The Document Foundation, Spaghetti Hurlers (International), or anyone else. It's also important to realise that I'm not in on the Swedish Conspiracy. Occasionally people ask for formal photos for conferences or fun.

Michael Meeks (michael.meeks@collabora.com)

pyblosxom