[Source](http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html "Permalink to Linus Torvalds - Re: Faster compilation speed")

# Linus Torvalds - Re: Faster compilation speed

This is the mail archive of the `gcc@gcc.gnu.org` mailing list for the [GCC project][1].

* * *

|  |  |
| ----- | ----- |
| Index Nav: | [[Date Index][2]] [[Subject Index][3]] [[Author Index][4]] [[Thread Index][5]] |
| Message Nav: | [[Date Prev][6]] [[Date Next][7]] [[Thread Prev][6]] [[Thread Next][7]] |
| Other format: | [[Raw text][8]] |

# Re: Faster compilation speed

* _From_: Linus Torvalds
* _To_: kevin at atkinson dot dhs dot org, gcc at gcc dot gnu dot org
* _Cc_:
* _Date_: Fri, 9 Aug 2002 20:28:16 -0700
* _Subject_: Re: Faster compilation speed
* _Newsgroups_: linux.egcs
* _Organization_:
* _References_: <[200208100156.g7A1uwn01415@penguin.transmeta.com][9]>

* * *

In article <[Pine.LNX.4.44.0208092227500.2273-100000@kevin-pc.atkinson.dhs.org][10]> you write:

>On Fri, 9 Aug 2002, Linus Torvalds wrote:
>
>> And that, in turn, is probably impossible to fix as long as gcc uses
>> garbage collection for most of its internal memory management. There
>> just aren't all that many worse ways to f*ck up your cache behaviour
>> than by using lots of allocations and lazy GC to manage your memory.
>
>Excuse the interruption, but from what I read a good generational garbage
>collector can be just as fast as manually managing memory?

All the papers I've seen on it are total jokes. But maybe I've looked at the wrong ones.

One fundamental fact on modern hardware is that data cache locality is good, and not being in the cache sucks. This is not likely to change.

In particular, this means that if you allocate stuff, you want to re-use the stuff you just freed _as_soon_as_possible_ - preferably before the previously dirty data has ever even been evicted from the cache, so that you can re-use the thing to avoid reading it in, but also to avoid writing out stale data.

This implies that any lazy de-allocation is bad.
When a piece of memory is free, you want to de-allocate it _immediately_, so that the next allocation gets to re-use it and gets the cache footprint "for free".

Generational garbage collectors tend to never re-use hot objects, and often do the copying between generations making things even worse on the cache. Compaction helps subsequent use somewhat, but is in itself inherently costly, and the indirection (or fixup) implied by it can limit other optimization.

Sure, by being lazy you can sometimes win in icache footprint (and in instruction count - a lot of the "GC is fast" papers seem to rely on the fact that you can do other optimizations if you're lazy), but you lose big in dirty dcache footprint.

And since dcache is much more expensive than instructions, you're better off doing explicit memory management with refcounting (optionally helped by the programming language, of course. You can make exact refcounting be your "GC" with some language support).

However, there's another, more fundamental issue. It's the _mindset_.

The GC mindset tends to go hand-in-hand with pointer chasing, while people who use explicit allocators tend to be happier with doing things like "realloc()" and trying to use arrays and indexes instead of linked lists and just generally trying to avoid allocating lots of small things. Which tends to be better on the cache.

Yes, I generalize. Don't we all?

For example, if you have an _explicit_ refcounting system, then it is quite natural to have operations like "copy-on-write", where if you decide to change a tree node you do something like

    node_t *copy_on_write(node_t **np)
    {
        node_t *node = *np;

        /* shared with somebody else: take a private copy */
        if (node->count > 1) {
            node_t *newnode = copy_alloc(node);
            *np = newnode;
            node->count--;
            node = newnode;
        }
        return node;
    }

and then before you change a tree node you do

    node = copy_on_write(&tree->node);
    /* .. we now know we are the exclusive owners of "node" .. */

which tends to be very efficient - it allows sharing, even if sharing is often not the common case (and doesn't do any extra allocations for the common case of an access that was already exclusively owned).

(If you want to be thread-safe you need to be more careful yet, and have thread-safe "get_node()/put_node()" actions etc. Most applications don't need to be that careful, but you'll see a _lot_ of this inside an operating system).

In contrast, in a GC system where you do _not_ have access to the explicit refcounting, you tend to always copy the node, just because you don't know if the original node might be shared through another tree or not. Even if sharing ends up not being the most common case. So you do a lot of extra work, and you end up with even more cache pressure.

Are there GC systems that do refcounting internally _and_ expose the information upwards to the user? I bet there are. But the fact is, the rest of them (99.9%) give those few well-done GC's a bad name.

"So what about circular data structures? Refcounting doesn't work for them". Right. Don't do them. Or handle them very very carefully (ie there can be a "head" that gets special handling and keeps the others alive). Compilers certainly almost always end up working with DAG's, not cyclic structures. Make it a rule.

Does it take more effort? Yes. The advantage of GC is that it is automatic. But GC apologists should just admit that it causes bad problems and often _encourages_ people to write code that performs badly.

I really think it's the mindset that is the biggest problem.

A GC system with explicitly visible reference counts (and immediate freeing) with language support to make it easier to get the refcounts right (things like automatically incrementing the refcounts when passing the object off to others) wouldn't necessarily be painful to use, and would clearly offer all the advantages of just doing it all by hand.

That's not the world we live in, though.

Linus

* * *

* **Follow-Ups**:
    * [**Re: Faster compilation speed**][7]
        * _From:_ Daniel Berlin
    * [**Re: Faster compilation speed**][11]
        * _From:_ Robert Lipe
* **References**:
    * [**Re: Faster compilation speed**][9]
        * _From:_ Linus Torvalds
    * [**Re: Faster compilation speed**][10]
        * _From:_ Kevin Atkinson

[1]: http://gcc.gnu.org/
[2]: http://gcc.gnu.org/index.html#00552
[3]: http://gcc.gnu.org/subjects.html#00552
[4]: http://gcc.gnu.org/authors.html#00552
[5]: http://gcc.gnu.org/threads.html#00552
[6]: http://gcc.gnu.org/msg00551.html
[7]: http://gcc.gnu.org/msg00553.html
[8]: http://gcc.gnu.org/cgi-bin/get-raw-msg?listname=gcc&date=2002-08&msgid=200208100328.g7A3SGS01429@penguin.transmeta.com
[9]: http://gcc.gnu.org/msg00544.html
[10]: http://gcc.gnu.org/msg00548.html
[11]: http://gcc.gnu.org/msg00567.html
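
The `copy_on_write()` fragment in the message leaves out the node type and the helpers it relies on. One way it might be filled out is sketched below in C. This is not code from the message: the `node_t` layout (a `count` plus a placeholder `value`), `node_alloc()` and `cow_demo()` are hypothetical, and the bodies of `get_node()`, `put_node()` and `copy_alloc()` are assumptions, with only their names taken from the message. The sketch is single-threaded, so the counts are plain integers, and `put_node()` frees the node the moment the last owner drops it, which is the immediate de-allocation the message argues for.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type: the message only implies a "count" field. */
typedef struct node {
    int count;  /* number of owners currently sharing this node */
    int value;  /* placeholder payload, not from the message */
} node_t;

/* Allocate a node owned by exactly one reference (hypothetical helper). */
static node_t *node_alloc(int value)
{
    node_t *node = malloc(sizeof(*node));

    if (node) {
        node->count = 1;
        node->value = value;
    }
    return node;
}

/* Take an additional reference. Not thread-safe: no atomics are used. */
static node_t *get_node(node_t *node)
{
    node->count++;
    return node;
}

/* Drop a reference; free immediately when the last owner lets go. */
static void put_node(node_t *node)
{
    if (--node->count == 0)
        free(node);
}

/* Make a private copy of a node; the copy starts with a single owner. */
static node_t *copy_alloc(const node_t *node)
{
    return node_alloc(node->value);
}

/*
 * Return a node the caller owns exclusively, copying only if it is shared.
 * On allocation failure, returns NULL and leaves *np untouched.
 */
static node_t *copy_on_write(node_t **np)
{
    node_t *node = *np;

    if (node->count > 1) {
        node_t *newnode = copy_alloc(node);

        if (!newnode)
            return NULL;
        *np = newnode;
        node->count--;  /* this slot no longer refers to the shared node */
        node = newnode;
    }
    return node;
}

/* Exercise both paths; returns 0 on success, -1 on allocation failure. */
int cow_demo(void)
{
    node_t *a = node_alloc(1);
    node_t *b, *w;

    if (!a)
        return -1;

    /* Exclusively owned: no copy, the same pointer comes back. */
    w = copy_on_write(&a);
    assert(w == a && a->count == 1);

    /* Shared by two owners: writing through b forces a private copy. */
    b = get_node(a);
    w = copy_on_write(&b);
    if (!w) {
        put_node(b);
        put_node(a);
        return -1;
    }
    w->value = 2;
    assert(b == w && b != a);
    assert(a->count == 1 && a->value == 1);

    put_node(a);
    put_node(b);
    return 0;
}
```

Calling `cow_demo()` from a `main()` runs both cases: the exclusively owned node comes back without any allocation, and only the shared node is copied before being modified.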