
[Source](https://gcc.gnu.org/ml/gcc/2001-07/msg02150.html "Permalink to Linus Torvalds - Re: What is acceptable for -ffast-math? (Was: associative law in combine)")

# Linus Torvalds - Re: What is acceptable for -ffast-math? (Was: associative law in combine)

This is the mail archive of the `gcc@gcc.gnu.org` mailing list for the [GCC project][1]. 

* * *


* _To_: <dewar at gnat dot com>
* _Subject_: Re: What is acceptable for -ffast-math? (Was: associative law in combine)
* _From_: Linus Torvalds <torvalds at transmeta dot com>
* _Date_: Tue, 31 Jul 2001 15:50:28 -0700 (PDT)
* _cc_: <gdr at codesourcery dot com>, <fjh at cs dot mu dot oz dot au>, <gcc at gcc dot gnu dot org>, <moshier at moshier dot ne dot mediaone dot net>, <tprince at computer dot org>
* * *
    
    
    
    On Tue, 31 Jul 2001 dewar@gnat.com wrote:
    >
    > Well it sure would be nice to hear from some of these mythical numerical
    > programmers (I don't care if they are writing games or nuclear reactor codes)
    > who would be happier, so far we haven't heard this! And in my experience,
    > even quite inexperienced floating-point numerical programmers are very
    > disturbed when optimization changes the results of their programs.
    
    I used -ffast-math myself, when I worked on the quake3 port to Linux (it's
    been five years, how time flies).
    
    It didn't make much difference at that point, because the x86 had
    hand-written assembly, and gcc for the alpha didn't do much (anything?)
    with -ffast-math.
    
    But I tried _everything_. The main FP work I did on that thing on the
    alpha improved the framerate by about 50% on alpha - FP was _that_
    critical for it.  Most of it was by looking at gcc output and trying to
    re-organize the C code to make it be better (because gcc didn't do much on
    its own).
    
    And yes, it was exactly things like multiplying by reciprocals.
    
    > > Your arguments about "numerical computation" are just silly, as you don't
    > > seem to realize that there are tons of problems where your theoretical
    > > issues are nothing more than noise.
    >
    > If you think the arguments are silly, then I really fear you lack the full
    > context for this discussion, a discussion that has, as you should know raged
    > for well over thirty years.
    
    Most of the kind of "numerical" work that you seem to be talking about has
    probably rather little to do with FP performance. Most of the traditional
    heavy FP code tends to be _much_ more about cache layout and good memory
    access patterns.
    
    I'm personally acquainted with trying to make a game engine go fast, where
    the memory effects are fewer, and the FP itself is the bottleneck.
    
    > Sure -ffast-math is precisely intended to allow transformations that would
    > not otherwise be allowed (let's not call them optimizations, that's just
    > too controversial a word in the context of this argument).
    
    Why not call them optimizations? They are. The only thing we change is the
    boundary of valid ranges.
    
    > The question is what is the boundary of allowable transformations. No one
    > agrees that there should be no boundaries (even you don't like the change
    > results to zero, though note that abandoning denormals has exactly this
    > effect, and might be considered acceptable).
    
    Oh, round-to-zero is definitely acceptable in the world of "who cares
    about IEEE, we want fast math, and we'll use fixed arithmetic if the FP
    code is too slow".
    
    In fact, it is _so_ acceptable that CPU designers design for it. Look at
    MMX2, and wonder why they have a RTZ mode? Because it makes the _hardware_
    go faster.
    
    That should tell you something. Big companies that have billion-dollar
    fabs spend time optimizing their chips that take several years to design
    for _games_. Not for IEEE traditional Fortran-kind math.
    
    But apparently some gcc developers don't think that is even a worthy
    market, because you just want to do fluid dynamics.
    
    > So, what is the boundary, can one for instance forget about denormals and
    > flush to zero to save a bit of time, can one truncate instead of round,
    > can one ignore negative zeroes, or infinity semantics, can one ignore
    > intermediate overflow (note: nearly all the discussed transformations are
    > implicitly saying yes to this last question).
    
    Do them all by default with -ffast-math.
    
    Then, you can have specific flags for people who want just _one_
    optimization. I doubt you'll find many users who do that, but maybe I'm
    wrong. Giving people the choice is always a good idea.
    
    > I have not seen anyone writing from the point of view of serious numerical
    > coding saying [ .. ]
    
    There you go again. What the hell do you call "serious numerical coding"?
    
    Take a look at the computer game market today. It's a lot more serious
    than most mathematicians puttering around in their labs, let me tell you.
    That's a BIG industry.
    
    Also note that _nobody_ in your kind of "serious numerical coding"
    community would ever worry about "-ffast-math" in the first place. Why the
    hell would they, when 99% of the time it doesn't make any difference at
    all. The people you apparently consider serious are a lot more interested
    in fast communication (so that they can solve the thing in parallel) and
    incredible memory bandwidth.
    
    I doubt you'll find many of your "serious numerical coding" people who
    would even _notice_ the raw FP throughput. Look at SpecFP - CPUs are fast
    enough, it spends most of its time waiting on memory.
    
    > Should -ffast-math allow full precision operation? I would think so,
    > since it definitely improves performance, and reduces surprises.
    
    Ehh.. gcc right now allows full precision operation BY DEFAULT. No
    -ffast-math required.
    
    Same goes for negative zero as far as I remember - on some HP-PA stuff at
    least. Simply because it was too painful to get "right" on their earlier
    hardware.
    
    In short, what you seem to argue that -ffast-math should mean are all
    things gcc _already_ does, with no
    
    > By the way, I said I would be shocked to find a Fortran compiler that did
    > associative redistribution in the absence of parens. I am somewhat surprised
    > that no one stepped forward with a counter-example, but I suspect in fact that
    > there may not be any shocking Fortran implementations around.
    
    I would suspect that very few people have Fortran compilers around and
    bother to check it.
    
    > It is an old argument, the one that says that fpt is approximate, so why bother
    > to be persnickety about it. Seymour Cray always took this viewpoint, and it
    > did not bother him that 81.0/3.0 did not give exactly 27.0 on the CDC 6000
    > class machines.
    
    ..and he was universally respected for making the fastest machines around.
    
    What you forget to mention is that these days it's so _cheap_ to get
    IEEE, that from a hardware standpoint pretty much everybody includes it
    anyway.
    
    But they then often throw it away because it ends up having expensive
    run-time issues (alpha with exception handling and proper denormals, Intel
    with special RTZ modes etc).
    
    Why? Because in many areas Seymour Cray is _still_ right. The thing that
    killed off non-IEEE was not that he was wrong, but the fact that _some_
    people do need IEEE "exact FP". Not everybody. Not even the majority. But
    because some people do need it, you need to support it. Which is why
    everybody does, today.
    
    But do you see the difference between
    
      "We have to support it because a portion of the user base has to have
       it, and if we don't have it we won't be able to sell to any of that
       user base"
    
    and
    
      "Everybody must use it, because anything else is wrong"
    
    Eh?
    
    Do you see that Seymour's approach didn't fail because he was always wrong?
    It failed because he was _sometimes_ wrong.
    
    And you know what? He was right enough of the time to have built up an
    empire for a while. That's something not everybody can say about
    themselves. And that is something that you should respect.
    
    			Linus
    
    

* * *
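
The two transformations argued over in this message, multiplying by a reciprocal instead of dividing and regrouping additions under the associative law, can be illustrated with a small C sketch. It is not taken from the thread: the function names `div_by`, `mul_by_reciprocal`, `sum_left`, `sum_right` and `report` are made up for illustration, and the behavior described assumes IEEE 754 double arithmetic evaluated in double precision (for example SSE2 on x86-64) and a build without `-ffast-math`.

```c
#include <stdio.h>

/* Straight division: the quotient is rounded once. */
double div_by(double x, double d)
{
    return x / d;
}

/* Multiply by a reciprocal: 1/d is rounded first, then the product is
 * rounded again, so the result may differ from x / d in the last bit. */
double mul_by_reciprocal(double x, double d)
{
    double r = 1.0 / d;
    return x * r;
}

/* The same three terms, grouped two different ways. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

double sum_right(double a, double b, double c)
{
    return a + (b + c);
}

/* Hypothetical helper that prints both comparisons side by side. */
void report(void)
{
    printf("5/3       = %.17g\n", div_by(5.0, 3.0));
    printf("5*(1/3)   = %.17g\n", mul_by_reciprocal(5.0, 3.0));
    printf("(a+b)+c   = %.17g\n", sum_left(1e16, -1e16, 1.0));
    printf("a+(b+c)   = %.17g\n", sum_right(1e16, -1e16, 1.0));
}
```

Under those assumptions, `div_by(5.0, 3.0)` and `mul_by_reciprocal(5.0, 3.0)` differ by one unit in the last place, while `81.0 / 3.0` still comes out as exactly `27.0`. The two groupings of `1e16`, `-1e16` and `1.0` give `1.0` and `0.0` respectively, because `-1e16 + 1.0` rounds back to `-1e16`. Whether differences of that kind matter is exactly what the thread disagrees about.

* * *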
* **References**: 
    * [**Re: What is acceptable for -ffast-math? (Was: associative law in combine)**][9]
        * _From:_ dewar


[1]: https://gcc.gnu.org/
[2]: https://gcc.gnu.org/index.html#02150
[3]: https://gcc.gnu.org/subjects.html#02150
[4]: https://gcc.gnu.org/authors.html#02150
[5]: https://gcc.gnu.org/threads.html#02150
[6]: https://gcc.gnu.org/msg02149.html
[7]: https://gcc.gnu.org/msg02151.html
[8]: https://gcc.gnu.org/msg02107.html
[9]: https://gcc.gnu.org/msg02106.html