[Source](http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html "Permalink to Linus Torvalds - Re: Faster compilation speed")

# Linus Torvalds - Re: Faster compilation speed

This is the mail archive of the `gcc@gcc.gnu.org` mailing list for the [GCC project][1].

* * *
# Re: Faster compilation speed

* _From_: Linus Torvalds <torvalds at transmeta dot com>
* _To_: kevin at atkinson dot dhs dot org, gcc at gcc dot gnu dot org
* _Date_: Fri, 9 Aug 2002 20:28:16 -0700
* _Subject_: Re: Faster compilation speed
* _Newsgroups_: linux.egcs
* _References_: <[200208100156.g7A1uwn01415@penguin.transmeta.com][9]>

* * *

In article <[Pine.LNX.4.44.0208092227500.2273-100000@kevin-pc.atkinson.dhs.org][10]> you write:
>On Fri, 9 Aug 2002, Linus Torvalds wrote:
>
>> And that, in turn, is probably impossible to fix as long as gcc uses
>> garbage collection for most of its internal memory management. There
>> just aren't all that many worse ways to f*ck up your cache behaviour
>> than by using lots of allocations and lazy GC to manage your memory.
>
>Excuse the interruption, but from what I read a good generational garbage
>collector can be just as fast as manually managing memory?

All the papers I've seen on it are total jokes. But maybe I've looked
at the wrong ones.

One fundamental fact on modern hardware is that data cache locality is
good, and not being in the cache sucks. This is not likely to change.
In particular, this means that if you allocate stuff, you want to re-use
the stuff you just freed _as_soon_as_possible_ - preferably before the
previously dirty data has ever even been evicted from the cache, so that
you can re-use the thing to avoid reading it in, but also to avoid
writing out stale data.

This implies that any lazy de-allocation is bad. When a piece of memory
is free, you want to de-allocate it _immediately_, so that the next
allocation gets to re-use it and gets the cache footprint "for free".
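
As a rough illustration of the immediate-reuse argument, here is a minimal sketch of a fixed-size object allocator with a LIFO free list. The names `struct obj`, `obj_alloc()` and `obj_free()` are hypothetical and do not come from gcc or from this mail; the only point is that the most recently freed object, which is the one most likely to still be in the cache, is the first one handed out again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size object; the payload size is arbitrary. */
struct obj {
	struct obj *next_free;	/* link, only meaningful while on the free list */
	char payload[56];
};

static struct obj *free_list;	/* most recently freed object first */

/* Hand out the most recently freed object if there is one. */
struct obj *obj_alloc(void)
{
	struct obj *o = free_list;

	if (o) {
		free_list = o->next_free;
		return o;
	}
	return malloc(sizeof(*o));	/* may return NULL */
}

/* Free immediately: push on the head so the next alloc re-uses it. */
void obj_free(struct obj *o)
{
	assert(o != NULL);
	o->next_free = free_list;
	free_list = o;
}
```

With this sketch, an `obj_free(a)` followed directly by `obj_alloc()` returns `a` again, so the new object starts life in whatever cache lines the old one left behind.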

Generational garbage collectors tend to never re-use hot objects, and
often do the copying between generations making things even worse on the
cache. Compaction helps subsequent use somewhat, but is in itself
inherently costly, and the indirection (or fixup) implied by it can
limit other optimization.

Sure, by being lazy you can sometimes win in icache footprint (and in
instruction count - a lot of the "GC is fast" papers seem to rely on the
fact that you can do other optimizations if you're lazy), but you lose
big in dirty dcache footprint. And since dcache is much more expensive
than instructions, you're better off doing explicit memory management
with refcounting (optionally helped by the programming language, of
course. You can make exact refcounting be your "GC" with some language
support).

However, there's another, more fundamental issue. It's the _mindset_.
The GC mindset tends to go hand-in-hand with pointer chasing, while
people who use explicit allocators tend to be happier with doing things
like "realloc()" and trying to use arrays and indexes instead of linked
lists and just generally trying to avoid allocating lots of small
things. Which tends to be better on the cache.
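
To make the "arrays and indexes" style concrete, here is a small sketch of a tree whose nodes live in one contiguous block grown with `realloc()`, and whose child links are array indexes rather than pointers. The names `struct tnode`, `struct pool`, `pool_add()` and `pool_free()` are made up for this example. Indexes stay valid when `realloc()` moves the block, which pointers into it would not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tree node: children are array indexes, not pointers. */
struct tnode {
	int value;
	int left, right;	/* index into pool->nodes, or -1 for none */
};

struct pool {
	struct tnode *nodes;	/* one contiguous block */
	size_t used, cap;
};

/* Append a node, growing the block with realloc(); returns its index or -1. */
int pool_add(struct pool *p, int value)
{
	if (p->used == p->cap) {
		size_t ncap = p->cap ? p->cap * 2 : 16;
		struct tnode *n = realloc(p->nodes, ncap * sizeof(*n));

		if (!n)
			return -1;	/* old block is still valid */
		p->nodes = n;
		p->cap = ncap;
	}
	p->nodes[p->used] = (struct tnode){ value, -1, -1 };
	return (int)p->used++;
}

/* Release every node with a single free(). */
void pool_free(struct pool *p)
{
	free(p->nodes);
	p->nodes = NULL;
	p->used = p->cap = 0;
}
```

A pool starts out as `struct pool p = { NULL, 0, 0 };`. There is one allocation per doubling instead of one per node, and neighbouring nodes share cache lines.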

Yes, I generalize. Don't we all?

For example, if you have an _explicit_ refcounting system, then it is
quite natural to have operations like "copy-on-write", where if you
decide to change a tree node you do something like

    node_t *copy_on_write(node_t **np)
    {
        node_t *node = *np;

        /* Shared with someone else: switch to a private copy. */
        if (node->count > 1) {
            node_t *newnode = copy_alloc(node);

            *np = newnode;
            node->count--;
            node = newnode;
        }
        return node;
    }

and then before you change a tree node you do

    node = copy_on_write(&tree->node);
    /* .. we now know we are the exclusive owners of "node" .. */

which tends to be very efficient - it allows sharing, even if sharing is
often not the common case (and doesn't do any extra allocations for the
common case of an access that was already exclusively owned).
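
For readers who want to compile the copy-on-write fragment, here is one possible filling-in. The layout of `node_t` and the body of `copy_alloc()` are assumptions, since the mail only names them, and the out-of-memory check is an addition that the original fragment does not have.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: the mail only implies a "count" field. */
typedef struct node {
	int count;		/* number of references to this node */
	int value;		/* stand-in payload */
} node_t;

/* Assumed helper: allocate a private copy holding a single reference. */
static node_t *copy_alloc(const node_t *old)
{
	node_t *n = malloc(sizeof(*n));

	if (n) {
		*n = *old;
		n->count = 1;
	}
	return n;
}

/* Return a node that *np owns exclusively, copying only if it is shared. */
node_t *copy_on_write(node_t **np)
{
	node_t *node = *np;

	assert(node->count >= 1);
	if (node->count > 1) {
		node_t *newnode = copy_alloc(node);

		if (!newnode)
			return NULL;	/* out of memory; *np is left untouched */
		*np = newnode;
		node->count--;
		node = newnode;
	}
	return node;
}
```

When two owners share a node with `count == 2`, the first `copy_on_write()` allocates a copy and drops the shared count to 1. The second owner's call then finds `count == 1` and returns the original node without allocating anything.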

(If you want to be thread-safe you need to be more careful yet, and have
thread-safe "get_node()/put_node()" actions etc. Most applications
don't need to be that careful, but you'll see a _lot_ of this inside an
operating system).
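
One way the thread-safe "get_node()/put_node()" pair might look, sketched with C11 atomics. The mail gives only the names, so the signatures, `struct shared_node` and `node_new()` are assumptions. The sketch covers only the counting itself; making a full copy-on-write update safe against concurrent writers needs more than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared node with an atomic reference count. */
struct shared_node {
	atomic_int count;
	int value;
};

/* Create a node holding one reference, owned by the caller. */
struct shared_node *node_new(int value)
{
	struct shared_node *n = malloc(sizeof(*n));

	if (n) {
		atomic_init(&n->count, 1);
		n->value = value;
	}
	return n;
}

/* Take an extra reference; the caller must already hold one. */
struct shared_node *get_node(struct shared_node *n)
{
	int old = atomic_fetch_add(&n->count, 1);

	assert(old >= 1);
	(void)old;		/* silence the unused warning under NDEBUG */
	return n;
}

/* Drop a reference; returns 1 if it was the last one and the node was freed. */
int put_node(struct shared_node *n)
{
	if (atomic_fetch_sub(&n->count, 1) == 1) {
		free(n);	/* freed immediately, not at some later sweep */
		return 1;
	}
	return 0;
}
```

The last `put_node()` frees the node on the spot, which keeps the immediate de-allocation property argued for earlier in the mail.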

In contrast, in a GC system where you do _not_ have access to the
explicit refcounting, you tend to always copy the node, just because you
don't know if the original node might be shared through another tree or
not. Even if sharing ends up not being the most common case. So you do
a lot of extra work, and you end up with even more cache pressure.

Are there GC systems that do refcounting internally _and_ expose the
information upwards to the user? I bet there are. But the fact is, the
rest of them (99.9%) give those few well-done GC's a bad name.

"So what about circular data structures? Refcounting doesn't work for
them". Right. Don't do them. Or handle them very very carefully (ie
there can be a "head" that gets special handling and keeps the others
alive). Compilers certainly almost always end up working with DAG's, not
cyclic structures. Make it a rule.
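
One common reading of the "head" idea, sketched here as an assumption rather than as what the mail specifically had in mind: count only the references that point away from the head, and leave back pointers uncounted, so a parent/child cycle never keeps itself alive. `struct item`, `item_new()` and `item_put()` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

struct item {
	int count;		/* counted references only */
	struct item *child;	/* counted: the parent keeps the child alive */
	struct item *parent;	/* NOT counted: plain back pointer */
};

/* Create an item; the caller gets one reference, a parent gets another. */
struct item *item_new(struct item *parent)
{
	struct item *it = calloc(1, sizeof(*it));

	if (it) {
		it->count = 1;		/* reference held by the caller */
		it->parent = parent;
		if (parent) {
			assert(parent->child == NULL);
			parent->child = it;
			it->count++;	/* reference held by parent->child */
		}
	}
	return it;
}

/* Drop one reference; returns how many items were freed as a result. */
int item_put(struct item *it)
{
	int freed = 0;

	while (it && --it->count == 0) {
		struct item *child = it->child;

		if (child)
			child->parent = NULL;	/* back pointer would dangle */
		free(it);
		freed++;
		it = child;	/* now drop the reference the parent held */
	}
	return freed;
}
```

Dropping the caller's reference to a child frees nothing while the head still points at it; dropping the last reference to the head frees both.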

Does it take more effort? Yes. The advantage of GC is that it is
automatic. But GC apologists should just admit that it causes bad
problems and often _encourages_ people to write code that performs
badly.

I really think it's the mindset that is the biggest problem. A GC
system with explicitly visible reference counts (and immediate freeing)
with language support to make it easier to get the refcounts right
(things like automatically incrementing the refcounts when passing the
object off to others) wouldn't necessarily be painful to use, and would
clearly offer all the advantages of just doing it all by hand.

That's not the world we live in, though.

Linus

* * *

* **Follow-Ups**:
    * [**Re: Faster compilation speed**][7]
        * _From:_ Daniel Berlin
    * [**Re: Faster compilation speed**][11]
        * _From:_ Robert Lipe
* **References**:
    * [**Re: Faster compilation speed**][9]
        * _From:_ Linus Torvalds
    * [**Re: Faster compilation speed**][10]
        * _From:_ Kevin Atkinson

[1]: http://gcc.gnu.org/
[2]: http://gcc.gnu.org/index.html#00552
[3]: http://gcc.gnu.org/subjects.html#00552
[4]: http://gcc.gnu.org/authors.html#00552
[5]: http://gcc.gnu.org/threads.html#00552
[6]: http://gcc.gnu.org/msg00551.html
[7]: http://gcc.gnu.org/msg00553.html
[8]: http://gcc.gnu.org/cgi-bin/get-raw-msg?listname=gcc&date=2002-08&msgid=200208100328.g7A3SGS01429@penguin.transmeta.com
[9]: http://gcc.gnu.org/msg00544.html
[10]: http://gcc.gnu.org/msg00548.html
[11]: http://gcc.gnu.org/msg00567.html