hn-classics/_stories/2010/5506749.md

213 lines
6.3 KiB
Markdown

---
created_at: '2013-04-07T10:39:18.000Z'
title: Will It Optimize? See how well you can anticipate gcc's optimizer (2010)
url: http://ridiculousfish.com/blog/posts/will-it-optimize.html
author: alexbowe
points: 94
story_text: ''
comment_text:
num_comments: 32
story_id:
story_title:
story_url:
parent_id:
created_at_i: 1365331158
_tags:
- story
- author_alexbowe
- story_5506749
objectID: '5506749'
year: 2010
---
[Source](http://ridiculousfish.com/blog/posts/will-it-optimize.html "Permalink to Will It Optimize?")
# Will It Optimize?
![hand][1] ![fishtank][2] ![fishtank][3] ![fishtank][4] ![fishtank][5]
[recent posts][6] | [all posts][7] | [about _fish_][8]
Will It Optimize?
July 23rd, 2010
See how well you know (or can anticipate) gcc's optimizer. For each question, the left box contains some code, while the right box contains code that purports to do the same thing, but that illustrates a particular optimization. Will gcc apply that optimization? Put another way, will the code on the left be as fast as the code on the right, when compiled with an optimizing gcc?
I used a pretty ancient gcc 4.2.1 for these tests. If newer versions have different behavior, please leave a comment.
Beware: not all proposed optimizations are actually valid!
**1\. Recursion elimination**
Can GCC replace recursive functions with a loop?
int factorial(int x) {
if (x > 1) return x * factorial(x-1);
else return 1;
}
int factorial(int x) {
int result = 1;
while (x > 1) result *= x--;
return result;
}
[I think GCC does this][9] [I think GCC cannot do this][10] [I say this optimization is incorrect!][11]
* * *
**2\. Loop-invariant strlen()**
Will GCC hoist out strlen()?
unsigned sum(const unsigned char *s) {
unsigned result = 0;
for (size_t i=0; i < strlen(s); i++) {
result += s[i];
}
return result;
}
unsigned sum(const unsigned char *s) {
unsigned result = 0;
size_t length = strlen(s);
for (size_t i=0; i < length; i++) {
result += s[i];
}
return result;
}
[I think GCC does this][12] [I think GCC cannot do this][13] [I say this optimization is incorrect!][14]
* * *
**3\. Multiplication by 2 to addition - integer**
Will GCC transform an integer multiplication by 2 to addition?
int double_it(int x) {
return x * 2;
}
int double_it(int x) {
return x + x;
}
[I think GCC does this][15] [I think GCC cannot do this][16] [I say this optimization is incorrect!][17]
* * *
**4\. Multiplication by 2 to addition - floating point**
Will GCC transform a floating point multiplication by 2 to addition?
float double_it(float x) {
return x * 2.0f;
}
float double_it(float x) {
return x + x;
}
[I think GCC does this][18] [I think GCC cannot do this][19] [I say this optimization is incorrect!][20]
* * *
**5\. Division by 2 to right shift**
Will GCC transform an integer division by 2 to a right shift?
int halve_it(int x) {
return x / 2;
}
int halve_it(int x) {
return x >> 1;
}
[I think GCC does this][21] [I think GCC cannot do this][22] [I say this optimization is incorrect!][23]
* * *
**6\. If-else chains to switch statements**
Will GCC apply the same optimizations to if-else chains as it does to switch statements?
void function(int x) {
if (x == 0) f0();
else if (x == 1) f1();
else if (x == 2) f2();
else if (x == 3) f3();
else if (x == 4) f4();
else if (x == 5) f5();
}
void function(int x) {
switch (x) {
case 0: f0(); break;
case 1: f1(); break;
case 2: f2(); break;
case 3: f3(); break;
case 4: f4(); break;
case 5: f5(); break;
}
}
[I think GCC does this][24] [I think GCC cannot do this][25] [I say this optimization is incorrect!][26]
* * *
**Summing up**
It is tempting to think of compiler optimizations as reducing the constant in your program's big-O complexity, and nothing else. They aren't supposed to be able to make your program asymptotically faster, or affect its output.
However, as we saw, they really can reduce the asymptotic complexity in space (question 1) and time (question 2). They can also affect calculated results (discussion of question 4) and maybe even whether your program goes into an infinite loop ([see here][27]).
On the flip side, several "obvious" optimizations are subtly incorrect and so will not be performed by the compiler, especially when they involve floating point. If your floating point code is demonstrably a bottleneck and you don't need exact precision or care about special FP values, you may be able to realize a speedup by doing some optimizations manually. However, untying the compiler's hands through options like -ffast-math is probably a better idea, and then only for the affected files, since these flags have a global impact.
And lastly, this isn't meant to be a prescriptive post, but we all know why micro-optimizing is usually a mistake: it wastes your time, it's easy to screw up (see question 5), and it typically produces no measurable speedup.
Code smart, and be safe out there!
[1]: http://ridiculousfish.com/blog/images/layout/hand.jpg
[2]: http://ridiculousfish.com/blog/images/layout/tank_ul.png
[3]: http://ridiculousfish.com/blog/images/layout/tank_ur.jpg
[4]: http://ridiculousfish.com/blog/images/layout/tank_lr.png
[5]: http://ridiculousfish.com/blog/images/layout/towels_top_lr.jpg
[6]: http://ridiculousfish.com/blog/
[7]: http://ridiculousfish.com/blog/posts.html
[8]: http://ridiculousfish.com/blog/about.html
[9]: javascript:answered(0,0)
[10]: javascript:answered(0,1)
[11]: javascript:answered(0,2)
[12]: javascript:answered(1,0)
[13]: javascript:answered(1,1)
[14]: javascript:answered(1,2)
[15]: javascript:answered(2,0)
[16]: javascript:answered(2,1)
[17]: javascript:answered(2,2)
[18]: javascript:answered(3,0)
[19]: javascript:answered(3,1)
[20]: javascript:answered(3,2)
[21]: javascript:answered(4,0)
[22]: javascript:answered(4,1)
[23]: javascript:answered(4,2)
[24]: javascript:answered(5,0)
[25]: javascript:answered(5,1)
[26]: javascript:answered(5,2)
[27]: http://blog.regehr.org/archives/161