hn-classics/_stories/2010/9153203.md

---
created_at: '2015-03-05T19:25:56.000Z'
title: Why GNU grep is fast (2010)
url: http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html
author: bpierre
points: 151
story_text: 
comment_text: 
num_comments: 40
story_id: 
story_title: 
story_url: 
parent_id: 
created_at_i: 1425583556
_tags:
- story
- author_bpierre
- story_9153203
objectID: '9153203'

---
[Source](https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html "Permalink to  why GNU grep is fast
   ")

#  why GNU grep is fast
   

# why GNU grep is fast

**Mike Haertel** [mike at ducky.net ][1]  
_Sat Aug 21 03:00:30 UTC 2010_

* Previous message: [Latest intr problems ][2]
* Next message: [why GNU grep is fast ][3]
* **Messages sorted by:** [[ date ]][4] [[ thread ]][5] [[ subject ]][6] [[ author ]][7]
* * *
    
    
    Hi Gabor,
    
    I am the original author of GNU grep.  I am also a FreeBSD user,
    although I live on -stable (and older) and rarely pay attention
    to -current.
    
    However, while searching the -current mailing list for an unrelated
    reason, I stumbled across some flamage regarding BSD grep vs GNU grep
    performance.  You may have noticed that discussion too...
    
    Anyway, just FYI, here's a quick summary of where GNU grep gets
    its speed.  Hopefully you can carry these ideas over to BSD grep.
    
    #1 trick: GNU grep is fast because it AVOIDS LOOKING AT
    EVERY INPUT BYTE.
    
    #2 trick: GNU grep is fast because it EXECUTES VERY FEW
    INSTRUCTIONS FOR EACH BYTE that it *does* look at.
    
    GNU grep uses the well-known Boyer-Moore algorithm, which looks
    first for the final letter of the target string, and uses a lookup
    table to tell it how far ahead it can skip in the input whenever
    it finds a non-matching character.
    
    GNU grep also unrolls the inner loop of Boyer-Moore, and sets up
    the Boyer-Moore delta table entries in such a way that it doesn't
    need to do the loop exit test at every unrolled step.  The result
    of this is that, in the limit, GNU grep averages fewer than 3 x86
    instructions executed for each input byte it actually looks at
    (and it skips many bytes entirely).
    
    See "Fast String Searching", by Andrew Hume and Daniel Sunday,
    in the November 1991 issue of Software Practice & Experience, for
    a good discussion of Boyer-Moore implementation tricks.  It's
    available as a free PDF online.
    
    Once you have fast search, you'll find you also need fast input.
    
    GNU grep uses raw Unix input system calls and avoids copying data
    after reading it.
    
    Moreover, GNU grep AVOIDS BREAKING THE INPUT INTO LINES.  Looking
    for newlines would slow grep down by a factor of several times,
    because to find the newlines it would have to look at every byte!
    
    So instead of using line-oriented input, GNU grep reads raw data into
    a large buffer, searches the buffer using Boyer-Moore, and only when
    it finds a match does it go and look for the bounding newlines.
    (Certain command line options like -n disable this optimization.)
    
    Finally, when I was last the maintainer of GNU grep (15+ years ago...),
    GNU grep also tried very hard to set things up so that the *kernel*
    could ALSO avoid handling every byte of the input, by using mmap()
    instead of read() for file input.  At the time, using read() caused
    most Unix versions to do extra copying.  Since GNU grep passed out
    of my hands, it appears that use of mmap became non-default, but you
    can still get it via --mmap.  And at least in cases where the data
    is already file system buffer caches, mmap is still faster:
    
      $ time sh -c 'find . -type f -print | xargs grep -l 123456789abcdef'
      real	0m1.530s
      user	0m0.230s
      sys	0m1.357s
      $ time sh -c 'find . -type f -print | xargs grep --mmap -l 123456789abcdef'
      real	0m1.201s
      user	0m0.330s
      sys	0m0.929s
    
    [workload was a 648 megabyte MH mail folder containing 41000 messages]
    So even nowadays, using --mmap can be worth a >20% speedup.
    
    Summary:
    
    - Use Boyer-Moore (and unroll its inner loop a few times).
    
    - Roll your own unbuffered input using raw system calls.  Avoid copying
      the input bytes before searching them.  (Do, however, use buffered
      *output*.  The normal grep scenario is that the amount of output is
      small compared to the amount of input, so the overhead of output
      buffer copying is small, while savings due to avoiding many small
      unbuffered writes can be large.)
    
    - Don't look for newlines in the input until after you've found a match.
    
    - Try to set things up (page-aligned buffers, page-sized read chunks,
      optionally use mmap) so the kernel can ALSO avoid copying the bytes.
    
    The key to making programs fast is to make them do practically nothing. ;-)
    
    Regards,
    
    	Mike
    

* * *
* Previous message: [Latest intr problems ][2]
* Next message: [why GNU grep is fast ][3]
* **Messages sorted by:** [[ date ]][4] [[ thread ]][5] [[ subject ]][6] [[ author ]][7]
* * *

[More information about the freebsd-current mailing list][8]  

[1]: mailto:freebsd-current%40freebsd.org?Subject=why%20GNU%20grep%20is%20fast&In-Reply-To= "why GNU grep is fast"
[2]: https://lists.freebsd.org/019322.html
[3]: https://lists.freebsd.org/019323.html
[4]: https://lists.freebsd.org/date.html#19310
[5]: https://lists.freebsd.org/thread.html#19310
[6]: https://lists.freebsd.org/subject.html#19310
[7]: https://lists.freebsd.org/author.html#19310
[8]: http://lists.freebsd.org/mailman/listinfo/freebsd-current
Tufte CSS, upgrade jekyll, add metadata, index 2018-02-23 18:58:03 +00:00			`---`
			`created_at: '2015-03-05T19:25:56.000Z'`
			`title: Why GNU grep is fast (2010)`
			`url: http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html`
			`author: bpierre`
			`points: 151`
			`story_text:`
			`comment_text:`
			`num_comments: 40`
			`story_id:`
			`story_title:`
			`story_url:`
			`parent_id:`
			`created_at_i: 1425583556`
			`_tags:`
			`- story`
			`- author_bpierre`
			`- story_9153203`
			`objectID: '9153203'`

			`---`
Initial commit 2018-02-23 18:19:40 +00:00			`[Source](https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html "Permalink to why GNU grep is fast`
			`")`

			`# why GNU grep is fast`


			`# why GNU grep is fast`

			`Mike Haertel [mike at ducky.net ][1]`
			`_Sat Aug 21 03:00:30 UTC 2010_`

			`* Previous message: [Latest intr problems ][2]`
			`* Next message: [why GNU grep is fast ][3]`
			`* Messages sorted by: [[ date ]][4] [[ thread ]][5] [[ subject ]][6] [[ author ]][7]`
			`* * *`


			`Hi Gabor,`

			`I am the original author of GNU grep. I am also a FreeBSD user,`
			`although I live on -stable (and older) and rarely pay attention`
			`to -current.`

			`However, while searching the -current mailing list for an unrelated`
			`reason, I stumbled across some flamage regarding BSD grep vs GNU grep`
			`performance. You may have noticed that discussion too...`

			`Anyway, just FYI, here's a quick summary of where GNU grep gets`
			`its speed. Hopefully you can carry these ideas over to BSD grep.`

			`#1 trick: GNU grep is fast because it AVOIDS LOOKING AT`
			`EVERY INPUT BYTE.`

			`#2 trick: GNU grep is fast because it EXECUTES VERY FEW`
			`INSTRUCTIONS FOR EACH BYTE that it does look at.`

			`GNU grep uses the well-known Boyer-Moore algorithm, which looks`
			`first for the final letter of the target string, and uses a lookup`
			`table to tell it how far ahead it can skip in the input whenever`
			`it finds a non-matching character.`

			`GNU grep also unrolls the inner loop of Boyer-Moore, and sets up`
			`the Boyer-Moore delta table entries in such a way that it doesn't`
			`need to do the loop exit test at every unrolled step. The result`
			`of this is that, in the limit, GNU grep averages fewer than 3 x86`
			`instructions executed for each input byte it actually looks at`
			`(and it skips many bytes entirely).`

			`See "Fast String Searching", by Andrew Hume and Daniel Sunday,`
			`in the November 1991 issue of Software Practice & Experience, for`
			`a good discussion of Boyer-Moore implementation tricks. It's`
			`available as a free PDF online.`

			`Once you have fast search, you'll find you also need fast input.`

			`GNU grep uses raw Unix input system calls and avoids copying data`
			`after reading it.`

			`Moreover, GNU grep AVOIDS BREAKING THE INPUT INTO LINES. Looking`
			`for newlines would slow grep down by a factor of several times,`
			`because to find the newlines it would have to look at every byte!`

			`So instead of using line-oriented input, GNU grep reads raw data into`
			`a large buffer, searches the buffer using Boyer-Moore, and only when`
			`it finds a match does it go and look for the bounding newlines.`
			`(Certain command line options like -n disable this optimization.)`

			`Finally, when I was last the maintainer of GNU grep (15+ years ago...),`
			`GNU grep also tried very hard to set things up so that the kernel`
			`could ALSO avoid handling every byte of the input, by using mmap()`
			`instead of read() for file input. At the time, using read() caused`
			`most Unix versions to do extra copying. Since GNU grep passed out`
			`of my hands, it appears that use of mmap became non-default, but you`
			`can still get it via --mmap. And at least in cases where the data`
			`is already file system buffer caches, mmap is still faster:`

			`$ time sh -c 'find . -type f -print \| xargs grep -l 123456789abcdef'`
			`real 0m1.530s`
			`user 0m0.230s`
			`sys 0m1.357s`
			`$ time sh -c 'find . -type f -print \| xargs grep --mmap -l 123456789abcdef'`
			`real 0m1.201s`
			`user 0m0.330s`
			`sys 0m0.929s`

			`[workload was a 648 megabyte MH mail folder containing 41000 messages]`
			`So even nowadays, using --mmap can be worth a >20% speedup.`

			`Summary:`

			`- Use Boyer-Moore (and unroll its inner loop a few times).`

			`- Roll your own unbuffered input using raw system calls. Avoid copying`
			`the input bytes before searching them. (Do, however, use buffered`
			`output. The normal grep scenario is that the amount of output is`
			`small compared to the amount of input, so the overhead of output`
			`buffer copying is small, while savings due to avoiding many small`
			`unbuffered writes can be large.)`

			`- Don't look for newlines in the input until after you've found a match.`

			`- Try to set things up (page-aligned buffers, page-sized read chunks,`
			`optionally use mmap) so the kernel can ALSO avoid copying the bytes.`

			`The key to making programs fast is to make them do practically nothing. ;-)`

			`Regards,`

			`Mike`


			`* * *`
			`* Previous message: [Latest intr problems ][2]`
			`* Next message: [why GNU grep is fast ][3]`
			`* Messages sorted by: [[ date ]][4] [[ thread ]][5] [[ subject ]][6] [[ author ]][7]`
			`* * *`

			`[More information about the freebsd-current mailing list][8]`

			`[1]: mailto:freebsd-current%40freebsd.org?Subject=why%20GNU%20grep%20is%20fast&In-Reply-To= "why GNU grep is fast"`
			`[2]: https://lists.freebsd.org/019322.html`
			`[3]: https://lists.freebsd.org/019323.html`
			`[4]: https://lists.freebsd.org/date.html#19310`
			`[5]: https://lists.freebsd.org/thread.html#19310`
			`[6]: https://lists.freebsd.org/subject.html#19310`
			`[7]: https://lists.freebsd.org/author.html#19310`
			`[8]: http://lists.freebsd.org/mailman/listinfo/freebsd-current`