hn-classics/_stories/2003/10831379.md


| Field | Value |
| ----- | ----- |
| created_at | 2016-01-03T17:39:29.000Z |
| title | The Art of Unix Programming (2003) |
| url | http://www.catb.org/esr/writings/taoup/html/ |
| author | lookupmobile |
| points | 43 |
| story_text | |
| comment_text | |
| num_comments | 10 |
| story_id | |
| story_title | |
| story_url | |
| parent_id | |
| created_at_i | 1451842769 |
| _tags | story, author_lookupmobile, story_10831379 |
| objectID | 10831379 |
| year | 2003 |


The Art of Unix Programming


Eric Steven Raymond

Thyrsus Enterprises

<esr@thyrsus.com>

Copyright 2003 Eric S. Raymond

This book and its on-line version are distributed under the terms of the Creative Commons Attribution-NoDerivs 1.0 license, with the additional proviso that the right to publish it on paper for sale or other for-profit use is reserved to Pearson Education, Inc. A reference copy of this license may be found at http://creativecommons.org/licenses/by-nd/1.0/legalcode.

AIX, AS/400, DB/2, OS/2, System/360, MVS, VM/CMS, and IBM PC are trademarks of IBM. Alpha, DEC, VAX, HP-UX, PDP, TOPS-10, TOPS-20, VMS, and VT-100 are trademarks of Compaq. Amiga and AmigaOS are trademarks of Amiga, Inc. Apple, Macintosh, MacOS, Newton, OpenDoc, and OpenStep are trademarks of Apple Computers, Inc. ClearCase is a trademark of Rational Software, Inc. Ethernet is a trademark of 3COM, Inc. Excel, MS-DOS, Microsoft Windows, and PowerPoint are trademarks of Microsoft, Inc. Java, J2EE, JavaScript, NeWS, and Solaris are trademarks of Sun Microsystems. SPARC is a trademark of SPARC International. Informix is a trademark of Informix Software. Itanium is a trademark of Intel. Linux is a trademark of Linus Torvalds. Netscape is a trademark of AOL. PDF and PostScript are trademarks of Adobe, Inc. UNIX is a trademark of The Open Group.

The photograph of Ken and Dennis in Chapter 2 appears courtesy of Bell Labs/Lucent Technologies.

The epigraph on the Portability chapter is from the Bell System Technical Journal, v57 #6 part 2 (July-Aug. 1978) pp. 2021-2048 and is reproduced with the permission of Bell Labs/Lucent Technologies.

Revision History

| Revision | Date | Author | Notes |
| ----- | ----- | ----- | ----- |
| 1.0 | 19 September 2003 | esr | This is the content that went to Addison-Wesley's printers. |
| 0.4 | 5 February 2003 | esr | Release for public review. |
| 0.3 | 22 January 2003 | esr | First eighteen-chapter draft. Manuscript walkthrough at Chapter 12. Limited release for early reviewers. |
| 0.2 | 2 January 2003 | esr | First manuscript walkthrough at Chapter 7. Released to Dmitry Kirsanov at AW production. |
| 0.1 | 16 November 2002 | esr | First DocBook draft, fifteen chapters. Languages rewritten to incorporate lots of feedback. Transparency, Modularity, Multiprogramming, Configuration, Interfaces, Documentation, and Open Source chapters released. Shipped to Mark Taub at AW. |
| 0.0 | 1999 | esr | Public HTML draft, first four chapters only. |


Dedication

To Ken Thompson and Dennis Ritchie, because you inspired me.

Table of Contents

Preface

Who Should Read This Book
How to Use This Book
Related References
Conventions Used in This Book
Our Case Studies
Author's Acknowledgements

I. Context
1. Philosophy
Culture? What Culture?
The Durability of Unix
The Case against Learning Unix Culture
What Unix Gets Wrong
What Unix Gets Right

Open-Source Software
Cross-Platform Portability and Open Standards
The Internet and the World Wide Web
The Open-Source Community
Flexibility All the Way Down
Unix Is Fun to Hack
The Lessons of Unix Can Be Applied Elsewhere

Basics of the Unix Philosophy

Rule of Modularity: Write simple parts connected by clean interfaces.
Rule of Clarity: Clarity is better than cleverness.
Rule of Composition: Design programs to be connected with other programs.
Rule of Separation: Separate policy from mechanism; separate interfaces from engines.
Rule of Simplicity: Design for simplicity; add complexity only where you must.
Rule of Parsimony: Write a big program only when it is clear by demonstration that nothing else will do.
Rule of Transparency: Design for visibility to make inspection and debugging easier.
Rule of Robustness: Robustness is the child of transparency and simplicity.
Rule of Representation: Fold knowledge into data, so program logic can be stupid and robust.
Rule of Least Surprise: In interface design, always do the least surprising thing.
Rule of Silence: When a program has nothing surprising to say, it should say nothing.
Rule of Repair: Repair what you can, but when you must fail, fail noisily and as soon as possible.
Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.
Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.
Rule of Optimization: Prototype before polishing. Get it working before you optimize it.
Rule of Diversity: Distrust all claims for one true way.
Rule of Extensibility: Design for the future, because it will be here sooner than you think.

The Unix Philosophy in One Lesson
Applying the Unix Philosophy
Attitude Matters Too

2. History
Origins and History of Unix, 1969-1995

Genesis: 1969-1971
Exodus: 1971-1980
TCP/IP and the Unix Wars: 1980-1990
Blows against the Empire: 1991-1995

Origins and History of the Hackers, 1961-1995

At Play in the Groves of Academe: 1961-1980
Internet Fusion and the Free Software Movement: 1981-1991
Linux and the Pragmatist Reaction: 1991-1998

The Open-Source Movement: 1998 and Onward
The Lessons of Unix History

3. Contrasts
The Elements of Operating-System Style

What Is the Operating System's Unifying Idea?
Multitasking Capability
Cooperating Processes
Internal Boundaries
File Attributes and Record Structures
Binary File Formats
Preferred User Interface Style
Intended Audience
Entry Barriers to Development

Operating-System Comparisons

VMS
MacOS
OS/2
Windows NT
BeOS
MVS
VM/CMS
Linux

What Goes Around, Comes Around

II. Design
4. Modularity
Encapsulation and Optimal Module Size
Compactness and Orthogonality

Compactness
Orthogonality
The SPOT Rule
Compactness and the Strong Single Center
The Value of Detachment

Software Is a Many-Layered Thing

Top-Down versus Bottom-Up
Glue Layers
Case Study: C Considered as Thin Glue

Libraries

Case Study: GIMP Plugins

Unix and Object-Oriented Languages
Coding for Modularity

5. Textuality
The Importance of Being Textual

Case Study: Unix Password File Format
Case Study: .newsrc Format
Case Study: The PNG Graphics File Format

Data File Metaformats

DSV Style
RFC 822 Format
Cookie-Jar Format
Record-Jar Format
XML
Windows INI Format
Unix Textual File Format Conventions
The Pros and Cons of File Compression

Application Protocol Design

Case Study: SMTP, the Simple Mail Transfer Protocol
Case Study: POP3, the Post Office Protocol
Case Study: IMAP, the Internet Message Access Protocol

Application Protocol Metaformats

The Classical Internet Application Metaprotocol
HTTP as a Universal Application Protocol
BEEP: Blocks Extensible Exchange Protocol
XML-RPC, SOAP, and Jabber

6. Transparency
Studying Cases

Case Study: audacity
Case Study: fetchmail's -v option
Case Study: GCC
Case Study: kmail
Case Study: SNG
Case Study: The Terminfo Database
Case Study: Freeciv Data Files

Designing for Transparency and Discoverability

The Zen of Transparency
Coding for Transparency and Discoverability
Transparency and Avoiding Overprotectiveness
Transparency and Editable Representations
Transparency, Fault Diagnosis, and Fault Recovery

Designing for Maintainability

7. Multiprogramming
Separating Complexity Control from Performance Tuning
Taxonomy of Unix IPC Methods

Handing off Tasks to Specialist Programs
Pipes, Redirection, and Filters
Wrappers
Security Wrappers and Bernstein Chaining
Slave Processes
Peer-to-Peer Inter-Process Communication

Problems and Methods to Avoid

Obsolescent Unix IPC Methods
Remote Procedure Calls
Threads: Threat or Menace?

Process Partitioning at the Design Level

8. Minilanguages
Understanding the Taxonomy of Languages
Applying Minilanguages

Case Study: sng
Case Study: Regular Expressions
Case Study: Glade
Case Study: m4
Case Study: XSLT
Case Study: The Documenter's Workbench Tools
Case Study: fetchmail Run-Control Syntax
Case Study: awk
Case Study: PostScript
Case Study: bc and dc
Case Study: Emacs Lisp
Case Study: JavaScript

Designing Minilanguages

Choosing the Right Complexity Level
Extending and Embedding Languages
Writing a Custom Grammar
Macros: Beware!
Language or Application Protocol?

9. Generation
Data-Driven Programming

Case Study: ascii
Case Study: Statistical Spam Filtering
Case Study: Metaclass Hacking in fetchmailconf

Ad-hoc Code Generation

Case Study: Generating Code for the ascii Displays
Case Study: Generating HTML Code for a Tabular List

10. Configuration
What Should Be Configurable?
Where Configurations Live
Run-Control Files

Case Study: The .netrc File
Portability to Other Operating Systems

Environment Variables

System Environment Variables
User Environment Variables
When to Use Environment Variables
Portability to Other Operating Systems

Command-Line Options

The -a to -z of Command-Line Options
Portability to Other Operating Systems

How to Choose among the Methods

Case Study: fetchmail
Case Study: The XFree86 Server

On Breaking These Rules

11. Interfaces
Applying the Rule of Least Surprise
History of Interface Design on Unix
Evaluating Interface Designs
Tradeoffs between CLI and Visual Interfaces

Case Study: Two Ways to Write a Calculator Program

Transparency, Expressiveness, and Configurability
Unix Interface Design Patterns

The Filter Pattern
The Cantrip Pattern
The Source Pattern
The Sink Pattern
The Compiler Pattern
The ed pattern
The Roguelike Pattern
The Separated Engine and Interface Pattern
The CLI Server Pattern
Language-Based Interface Patterns

Applying Unix Interface-Design Patterns

The Polyvalent-Program Pattern

The Web Browser as a Universal Front End
Silence Is Golden

12. Optimization
Don't Just Do Something, Stand There!
Measure before Optimizing
Nonlocality Considered Harmful
Throughput vs. Latency

Batching Operations
Overlapping Operations
Caching Operation Results

13. Complexity
Speaking of Complexity

The Three Sources of Complexity
Tradeoffs between Interface and Implementation Complexity
Essential, Optional, and Accidental Complexity
Mapping Complexity
When Simplicity Is Not Enough

A Tale of Five Editors

ed
vi
Sam
Emacs
Wily

The Right Size for an Editor

Identifying the Complexity Problems
Compromise Doesn't Work
Is Emacs an Argument against the Unix Tradition?

The Right Size of Software

III. Implementation
14. Languages
Unix's Cornucopia of Languages
Why Not C?
Interpreted Languages and Mixed Strategies
Language Evaluations

C
C++
Shell
Perl
Tcl
Python
Java
Emacs Lisp

Trends for the Future
Choosing an X Toolkit

15. Tools
A Developer-Friendly Operating System
Choosing an Editor

Useful Things to Know about vi
Useful Things to Know about Emacs
The Antireligious Choice: Using Both

Special-Purpose Code Generators

yacc and lex
Case Study: Glade

make: Automating Your Recipes

Basic Theory of make
make in Non-C/C++ Development
Utility Productions
Generating Makefiles

Version-Control Systems

Why Version Control?
Version Control by Hand
Automated Version Control
Unix Tools for Version Control

Runtime Debugging
Profiling
Combining Tools with Emacs

Emacs and make
Emacs and Runtime Debugging
Emacs and Version Control
Emacs and Profiling
Like an IDE, Only Better

16. Reuse
The Tale of J. Random Newbie
Transparency as the Key to Reuse
From Reuse to Open Source
The Best Things in Life Are Open
Where to Look?
Issues in Using Open-Source Software
Licensing Issues

What Qualifies as Open Source
Standard Open-Source Licenses
When You Need a Lawyer

IV. Community
17. Portability
Evolution of C

Early History of C
C Standards

Unix Standards

Standards and the Unix Wars
The Ghost at the Victory Banquet
Unix Standards in the Open-Source World

IETF and the RFC Standards Process
Specifications as DNA, Code as RNA
Programming for Portability

Portability and Choice of Language
Avoiding System Dependencies
Tools for Portability

Internationalization
Portability, Open Standards, and Open Source

18. Documentation
Documentation Concepts
The Unix Style

The Large-Document Bias
Cultural Style

The Zoo of Unix Documentation Formats

troff and the Documenter's Workbench Tools
TeX
Texinfo
POD
HTML
DocBook

The Present Chaos and a Possible Way Out
DocBook

Document Type Definitions
Other DTDs
The DocBook Toolchain
Migration Tools
Editing Tools
Related Standards and Practices
SGML
XML-DocBook References

Best Practices for Writing Unix Documentation

19. Open Source
Unix and Open Source
Best Practices for Working with Open-Source Developers

Good Patching Practice
Good Project- and Archive-Naming Practice
Good Development Practice
Good Distribution-Making Practice
Good Communication Practice

The Logic of Licenses: How to Pick One
Why You Should Use a Standard License
Varieties of Open-Source Licensing

MIT or X Consortium License
BSD Classic License
Artistic License
General Public License
Mozilla Public License

20. Futures
Essence and Accident in Unix Tradition
Plan 9: The Way the Future Was
Problems in the Design of Unix

A Unix File Is Just a Big Bag of Bytes
Unix Support for GUIs Is Weak
File Deletion Is Forever
Unix Assumes a Static File System
The Design of Job Control Was Badly Botched
The Unix API Doesn't Use Exceptions
ioctl(2) and fcntl(2) Are an Embarrassment
The Unix Security Model May Be Too Primitive
Unix Has Too Many Different Kinds of Names
File Systems Might Be Considered Harmful
Towards a Global Internet Address Space

Problems in the Environment of Unix
Problems in the Culture of Unix
Reasons to Believe

A. Glossary of Abbreviations
B. References
C. Contributors
D. Rootless Root

Editor's Introduction
Master Foo and the Ten Thousand Lines
Master Foo and the Script Kiddie
Master Foo Discourses on the Two Paths
Master Foo and the Methodologist
Master Foo Discourses on the Graphical User Interface
Master Foo and the Unix Zealot
Master Foo Discourses on the Unix-Nature
Master Foo and the End User

List of Figures

2.1. The PDP-7.
3.1. Schematic history of timesharing.
4.1. Qualitative plot of defect count and density vs. module size.
4.2. Caller/callee relationships in GIMP with a plugin loaded.
6.1. Screen shot of audacity.
6.2. Screen shot of kmail.
6.3. Main window of a Freeciv game.
8.1. Taxonomy of languages.
11.1. The xcalc GUI.
11.2. Screen shot of the original Rogue game.
11.3. The Xcdroast GUI.
11.4. Caller/callee relationships in a polyvalent program.
13.1. Sources and kinds of complexity.
18.1. Processing structural documents.
18.2. Present-day XML-DocBook toolchain.
18.3. Future XML-DocBook toolchain with FOP.

List of Tables

8.1. Regular-expression examples.
8.2. Introduction to regular-expression operations.
14.1. Language choices.
14.2. Summary of X Toolkits.

List of Examples

5.1. Password file example.
5.2. A .newsrc example.
5.3. A fortune file example.
5.4. Basic data for three planets in a record-jar format.
5.5. An XML example.
5.6. A .INI file example.
5.7. An SMTP session example.
5.8. A POP3 example session.
5.9. An IMAP session example.
6.1. An example fetchmail -v transcript.
6.2. An SNG Example.
7.1. The pic2graph pipeline.
8.1. Glade Hello, World.
8.2. A sample m4 macro.
8.3. A sample XSLT program.
8.4. Taxonomy of languages: the pic source.
8.5. Synthetic example of a fetchmailrc.
8.6. RSA implementation using dc.
9.1. Example of fetchmailrc syntax.
9.2. Python structure dump of a fetchmail configuration.
9.3. copy_instance metaclass code.
9.4. Calling context for copy_instance.
9.5. ascii usage screen.
9.6. Desired output format for the star table.
9.7. Master form of the star table.
10.1. A .netrc example.
10.2. X configuration example.
18.1. groff(1) markup example.
18.2. man markup example.
19.1. tar archive maker production.

