Ex Bibliotheca

The life and times of Zack Weinberg.

Monday, 26 May 2003

# 6 AM

In between talks, people were wandering around verifying each other's PGP keys. This is a process not unlike notarization. If you sign someone's PGP key it means that you met the person whose key it is, face to face, and checked that (a) they are who they claim to be, (b) that really is their key. Assuming that you've never met the person off the net before, part (a) involves looking at government-issued ID; I saw a lot of passports. Surprising to me, although it makes perfect logistic sense in retrospect, is that you do this completely offline. The actual operation of signing someone's key is of course done on a computer, but there's no need to do that immediately after one has checked it. I have a list, on a piece of paper, of people whose identities and key fingerprints I've verified, and I can sign the keys when I get home.

I attended all of the talks today, which was a mistake; several of them were boring. I'm going to summarize only the interesting ones.

Richard Henderson's keynote got the conference off to a solid start. He hadn't given any hint of what he might say beforehand, and many of us were wondering if he'd just make something up the night before. In fact, for all I know, he did, but no one noticed.

He focused on better intermediate representations for GCC and the improvements which they could make possible. It's been known for some time that the current intermediate representation, RTL, is too low-level. It throws away information which modern optimizer algorithms would like to use. There's a project underway at Red Hat to produce a new intermediate representation, called GIMPLE, which is higher-level and easier to work with. Richard pointed out that almost all the interesting optimizations can be done on GIMPLE, which makes the existing code that does them on RTL unnecessary. That's a good thing; that code is old, buggy, slow, and no one wants to work on it.

In Richard's plan, only the very lowest level optimizations would be left to RTL: instruction combination, register allocation, scheduling, basic-block reordering. These passes are either okay as is, or other projects are underway to replace them. He also outlined a clever scheme for reducing the memory consumption of RTL form. Right now RTL is an expansive representation that carries a lot of detail about each instruction:

(insn 12 5 16 0 (nil) (parallel [
            (set (reg:SI 61)
                (plus:SI (mem/f:SI (reg/f:SI 16 argp) [2 a+0 S4 A32])
                    (reg/v:SI 60)))
            (clobber (reg:CC 17 flags))
        ]) {*addsi_1})

Most of this information is redundant with the {*addsi_1} annotation at the end. It could be compressed down to

(insn 12 5 16 0 (nil) {*addsi_1} [
     (reg:SI 61)
     (mem/f:SI (reg/f:SI 16 argp) [2 a+0 S4 A32])
     (reg/v:SI 60)
     (reg:CC 17 flags)
  ])

Probably some of the numbers at the beginning would become unnecessary, too.

Reaction to this scheme was generally positive. People worried about the loop optimizer, which ideally works at a high level, but needs to know about certain low-level characteristics of the target machine. The best approach seems to be providing that information to the high-level optimizers as needed.

My talk (paper) was immediately after Richard's. I talked about the maintenance issues which hinder contributors to GCC. This was generally well received. One thing I hadn't anticipated was the depth of loathing people have for the build process. I concentrated on issues with the code itself, but lots of people wanted to talk about the Makefiles. Which are horrid, it has to be said.

After that we broke for lunch, and then there were several more talks. The most interesting of these was Jason Merrill's detailed introduction to GIMPLE. It uses the same data structures as the very high level "tree" representation that we already have, but imposes constraints which make life easier for optimizers. For instance, an arithmetic expression like y = m*x + b will be broken in half so that there are only two operands on the right hand of each subexpression: T1 = m * x; y = T1 + b.

In the evening there was a reception with dubious cheese and crackers, and decent alcoholic beverages.