created_at: 2016-10-11T08:43:15.000Z
title: Multics Emacs: The History, Design and Implementation (1979)
url: http://www.multicians.org/mepap.html
author: Plugawy
created_at_i: 1476175395
_tags: story, author_Plugawy, story_12682624
objectID: 12682624
year: 1979

Source

Multics Emacs History/Design/Implementation

Multics Emacs: The History, Design and Implementation

April 8, 1996

What follows is my vast, unpublished 1979 "Mother of All Multics Emacs papers" from which all of my lesser and greater Emacs papers, published, internal, and unpublished, were ultimately excerpted, including Prose and CONS: A Commercial Text-Processing System in Lisp in the 1980 Lisp Conference proceedings and the 1980 Honeywell conference paper. It's about time to expose it, and the WWW/HTML is now the ideal vehicle.

Multics is no longer produced or offered for sale; Honeywell no longer even makes computers. People edit on computers on their desktop so cheap and fast that not only do redisplay algorithms no longer matter, but the whole idea of autonomous redisplay in a display editor is no longer a given (although autonomous redisplay's illustrious child, WYSIWYG, is now the standard paradigm of the industry). There is now no other kind of editor besides what we then called the "video editor". Thus, all of the battles, acrimony, and invidious or arrogant comparisons in what follows are finished and done with, and to be viewed in the context of 1979 -- this is a historical document about Multics and the evolution of an editor. It is part of the histories of Multics, of Emacs, and of Lisp.

Many of the deficiencies of Multics described here were thereafter remedied, not only by Emacs, but by a managed-video system inspired by Emacs. Although it started out as a rebellious "hack", Multics Emacs became an integral part of the Multics product.

The term ARPANET refers to the early stages of what is now called the INTERNET. ARPA was the Advanced Research Projects Agency of the U.S. Department of Defense, who instigated and underwrote its development.

Please enjoy.

David Mery posted an interesting story about Multics Emacs on 14 Nov 2007: "Multics source reveals le jetteur des gazongues".
The source for Multics Emacs is available online at MIT.

Contents: I. Prehistory of Editing on Multics; II. History of Video-oriented Editing on ITS; III. Inception of Multics Emacs; IV. Multics Emacs: The Embryonic Phase and Basic Decisions; V. Early Enhancements; VI. Echo Negotiation; VII. The SUPDUP-OUTPUT ARPANET Protocol; VIII. The Place of Emacs in Multics; IX. Experience and Conclusions; X. Acknowledgements; Appendix A. The Redisplay; Appendix B. The Extension Language

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by the author. All persons copying this information are expected to adhere to the terms and constraints invoked by the author's copyright. In most cases, these works may not be reposted or republished without the explicit permission of the copyright holder.

Multics Emacs: The History, Design and Implementation

Bernard S. Greenberg -- 15 August 1979

(Paper on Multics Emacs, intended audience unclear. Idea is to put this stuff down on "paper" for posterity.) Brackets ([xxx]) denote references to the Bibliography at the end of the main text.

Multics Emacs is a video-oriented text preparation and editing system running on Honeywell's Multics [Multics] system, being distributed as an experimental offering in Multics Release 7.0. From the viewpoint of Multics, it represents the first video-management software to be implemented, the first time character-at-a-time interaction has been used, and a radical and complete departure from other editing and text preparation tools and techniques prevalent on Multics. From the viewpoint of similar systems running elsewhere, several features are noteworthy, including a major commitment to the programming language Lisp [Moonual] [Chineual], user-accessible extensibility through Lisp, and an internal implementation designed from the start (as is often not the case) with display-oriented editing in mind. The seemingly innate expense of video-oriented interaction has also led to the development of performance enhancement techniques applicable to any such system. The growth of Multics Emacs and its user community on MIT's Multics system has also led to the development of protocols for the ARPANET [Arpanet] designed to facilitate the use of video-oriented software in a device-independent fashion.

Multics Emacs is currently in widespread use at three Multics exposure/development sites, serving about 60 regular users. Due to the lack of previous video-oriented software on Multics, not many users have a large number of high-speed video terminals connected to their Multics system. Thus, much usage of Multics Emacs is via 300- and 1200-baud dialup lines. This fact, combined with the acknowledged expense and resource consumption of Multics Emacs, places Multics Emacs among a choice of editing tools: given current resource economies and communications devices at current sites, it is not always the tool of choice, even among its most fervid partisans.

This paper describes the background and history of Multics Emacs, the previous developments, and the climate in which it was created. As many of the salient features and design and implementation criteria and decisions as possible are stated, as well as experience with the design and implementation. Where important, complete algorithms are detailed, notably the redisplay algorithm, detailed in the Appendix.

I. Prehistory of Editing on Multics

Four editors were in common use on Multics before the introduction of Multics Emacs. They are still in widespread use. Two of them, "edm" [MPM] and "qedx" [QUG], are standard, and intended as end-user interfaces. Both of these editors are line-oriented, printing-terminal editors of a conventional mold. When in command mode, lines full of editor commands are typed to effect the positioning to, printing, deleting, or changing (by string substitution) of lines, or entry into input mode, which is the only way new text may be entered. The typing of a reserved character sequence exits input mode back to command mode. These editors maintain buffers (edm maintains one text buffer, qedx many) as character strings with a gap in the middle, which represents the current point at which text may be inserted. Neither editor can address individual characters. The edm editor is extremely limited, is not programmable, and intended for novice users. It was developed from the EDL [CTSS] editor interface on CTSS. The qedx editor was developed as a stripped-down (for runtime efficiency) version of the QED editor [CTSS] [Bell QED], also on CTSS. (Many of the original Multics developers had worked on CTSS). Both editors are oriented towards a low-speed, half-duplex, printing terminal interface, where minimal typeout is a design criterion. It was solely with these editors that the bulk of the Multics operating system and applications were entered, edited, and maintained.
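The "character string with a gap in the middle" that these editors use is what is now usually called a gap buffer. The following is a minimal Python sketch of that idea, not a reconstruction of the edm or qedx code: the class name, method names, and growth policy are assumptions made purely for illustration.

```python
class GapBuffer:
    """Text held in one array with a movable gap at the insertion point.

    Hypothetical illustration only; names and sizes are assumptions.
    """

    def __init__(self, text="", gap_size=16):
        # Layout: [text before gap][gap][text after gap]; gap starts at the end.
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)
        self.gap_end = len(self.buf)

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_gap(self, pos):
        """Move the gap so that it begins at text position pos."""
        if not 0 <= pos <= len(self):
            raise IndexError(pos)
        while self.gap_start > pos:
            # Carry one character from just before the gap to just after it.
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            # Carry one character from just after the gap to just before it.
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, s):
        """Insert s at the gap; no existing text moves unless the gap is full."""
        for ch in s:
            if self.gap_start == self.gap_end:
                self._grow()
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def _grow(self, extra=16):
        # Open up fresh empty slots at the (currently empty) gap.
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Insertion at the gap is cheap because no text has to be shifted; the cost is paid only when the gap must be moved to a new position, which is why the gap is kept at the current insertion point.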

An implementation of TECO (a version of which is the standard DEC editor) was built on Multics in 1971. This version was derived from the original TECO implementation, on the MIT AI Lab's PDP-10 [TECDOC]. Unlike the latter, Multics TECO has no display support, and does not have the complex control-structure constructs of the AI Lab editor. Multics TECO is supported by Honeywell as a tool, which is to say, not an end-user interface [Tools manual]. Multics TECO has had a handful of supporters, but has never really achieved widespread use. TECO presents as an interface a repertoire of single-character commands designed to provide substantial power as a text-processing programming language. While trying to succeed as both an editing language and a programming language, TECO falls short at both due to the necessary compromises. Nevertheless, a principal feature of TECO is the ability to construct powerful text-processing programs (macros) readily, and Multics TECO has often been used for such.

The remaining editor which is widely used on Multics is a version of qedx ("ted") developed by James A. Falksen, adding substantial power, including the ability to address characters and many commands which qedx is lacking. Many fail-soft features and help features also stand out in Falksen's editor. Due to a large variety of compatibility constraints and product schedules, no improvements to qedx have been made in the last five years, resulting in widespread dissatisfaction among the Multics user community with the inadequacies of qedx. Thus, Falksen's editor, which is not a Honeywell product, or distributed with Multics, has achieved almost exclusive use at many Multics sites. It is not conceptually different from qedx, differing principally in the vastly extended command repertoire. There is no published documentation available.

II. History of Video-oriented Editing on ITS

The impetus for Multics Emacs came from the outside. In late 1977 and early 1978, the author became acquainted with Emacs [Stallman] on the ITS [ITSDOC] system at the MIT AI Lab. ITS Emacs grew out of ITS TECO during the period 1975-1977. Some background on the history and development of ITS Emacs is now in order.

ITS TECO was designed for use in a display-oriented environment. In its normal mode of usage, before the evolution of Emacs, it split the user's screen into two regions, one in which the user typed editor commands and one in which a region of the text being edited was displayed. As the user completed a sequence of TECO commands, the editor would update the image of the text buffer being displayed on the screen. While obviating the need for a "print command" (which shows lines of a buffer on request), this technology still revolved around a user typing a line of "editor commands," and "activating" them. Included therein are requests to insert text, as normal editor commands. An appropriate section of the buffer to display would be chosen automatically.

The next advance in editor technology on ITS was the development of Control R, or Real-time edit mode in TECO. This mode (so called because the "Control R" character invoked the command which entered it) allowed a selected set of characters to be used as "real-time" commands: instead of entering a string of editor commands, a user would type one of these characters (such as "Control D" for "delete a character") and the effect would immediately be made visible in the buffer display. As each command character was typed, the text displayed would be appropriately modified. All of the command characters, a fixed, small set, were drawn from the "Control" characters of the ASCII character set, i.e., non-printing characters. The typing of printing characters caused them to be inserted directly into the buffer (and appear on the screen). TECO maintains a point, a virtual pointer, into its text buffer: in Control R mode, an identification of TECO's point with a terminal's cursor is made, so that modifications to the text being edited appear "at the cursor" in the buffer display.

The novelty of Control R mode was the freedom from editor requests: the terminal became an "editing device," upon which certain keys had predictable effects upon the text displayed thereupon, as a "machine," as opposed to an "interactive editor." In fact, what had been achieved were whole new horizons in interaction and editing! Complex request lines, having possible errors, were no longer among the user's tools. As the effect of commands became visible as each command was typed, erroneous action could be stopped at the erroneous command. This natural, simple interface was eventually the one adopted (independently) by "stand-alone word processing machines" (Wang, Xerox, Lanier, etc.). Within some time, it became apparent that Real-time edit mode was in fact a more potent and natural approach to text editing than the conventional interactive (even video-oriented) text editor.

The end result of Control R mode is the user operating as though he or she were "editing the screen" by typing keys. As a key is typed, the text on the screen changes. There are Control R mode commands to position the cursor to different character positions (and therefore lines) in the text buffer: if an attempt is made to position to some line not on the screen, TECO chooses some new portion of the buffer to display automatically. Unlike "editing terminals," the user is not in fact editing the screen, and thus need never be concerned with what particular portion of the buffer is in fact on the screen. The user need never "read in a new screen from the computer," "send the screen to the computer," or any similar implementation-derived constraint.

Control R mode depends upon the ability to interact on a character-at-a-time basis with the editor program (TECO). As each single character is typed, TECO must take action, both modifying the buffer (the text being edited) and updating the display. This interaction is innately more "expensive" (consumptive of computer resources) than the line-at-a-time request lines of conventional editors. Much of the historical interest in the development of Multics Emacs derives from the necessity for this character-at-a-time interaction and ways of ameliorating its performance impact.

The next significant advance was the introduction of the ability to associate arbitrary macros (TECO programs, possibly user-supplied) with keys in Control R mode. This ability allows arbitrarily complex actions to be performed in response to any given keystroke. TECO excels at the ability to construct subroutines, of editor commands, in its internal registers. These subroutines can call each other, recursively, and pass arguments and results around. It is quite common in TECO to construct these macros to perform arbitrarily complex text-manipulation tasks, such as dealing with sentences, paragraphs, or constructs in the source representations of specific programming languages. The ability to associate TECO macros with keystrokes allowed editor developers and users to augment Control R mode by adding sets of keys whose functions are tuned to specific editing tasks.

By 1976, several packages of TECO macros [TMACS, TECMAC] had proliferated. These packages contained macros for many common operations on text entities, and brought the power of Control R mode up to and beyond that of today's stand-alone word processors. By this time, use of "raw TECO" had almost ceased, with almost universal use of these Control R mode macro packages. TECO augmented by these packages in fact transformed the user's terminal into a word-processing, program-processing, mail-processing, or other highly specialized video device.

At this time Richard Stallman coalesced most of the ideas in these packages, and created a unified approach to command set, extensibility, documentation, and integration of these facilities, and created a large, unified, set of macros which came to be known as Emacs. The name is derived from "Editor Macros."

Since the user using Emacs (or any of the earlier packages) never deals with the command set of TECO, but only the actions specified by the Control R mode macros, Stallman rightly considers Emacs to be an editor implemented in TECO, the latter being an interpretive editor implementation language, and one suffering severe deficiencies at that. In fact, Stallman's viewpoint is largely justified, and one that we will adopt. The impact of Emacs and Emacs-style editing far outweighs that of any TECO, and the basic philosophies of TECO, as a user interface, are largely masked by Emacs. Indeed, TECO is simply the vehicle in which ITS Emacs is expressed.

It was soon found that Emacs could be taught within minutes or an hour to those with no technical experience at all. Experienced and sophisticated users found Emacs to be eminently more usable than any of the previous forms of editing, and via the construction of more macros (for processing Lisp source programs, for example), could rapidly be extended to handle any task in the same manner. Emacs rapidly became the standard editor on ITS, and has remained so to this writing.

III. Inception of Multics Emacs

When Multics was begun in the mid-1960's, the legacy of CTSS left a large variety of IBM half-duplex printing terminals as the standard interactive device. No display terminals were used on Multics (other than storage-tube graphics devices (which cannot be used for video editing)) until the mid-1970's, when the consumer display terminal first became readily available. Some Multics users purchased these terminals (notably the Delta Data 4000, which was the only kind of terminal in use at the Multics installation at the University of Southwestern Louisiana), using them either as "glass teletypes" or via "local editing." Video terminals generate less noise and less waste paper than printing terminals, and, for higher line speeds, are markedly less expensive. Users of video terminals on Multics developed methodologies of using qedx or Falksen's editor to print lines to the screen, and go into "input mode," whereupon they would edit the screen by pressing locally-provided terminal controls, and then depress the "transmit" button, sending screen contents over. So prevalent were these techniques that some users of Multics Lisp maintained no source files, using local editing to edit functions and read them back in. Such poor programming practice, and the severe limitations of the "command set" of the "built-in editors" of terminals created poor results all around.

Eugene Ciccarelli, at MIT's Laboratory for Computer Science (LCS) (now at BB&N), having been a major contributor to the pre-Emacs TECO macro packages, and then to ITS Emacs, felt the need for character-oriented interaction on Multics, and in 197?, developed a technique which relied upon the Multics ARPANET [ARPANET] implementation to effect such input. The Multics ARPANET implementation was not limited to line-at-a-time I/O. Ciccarelli constructed a video-terminal oriented line-editor, using Control R mode-like commands, on Multics. A user of his system would log into Multics via the ARPANET, and then invoke Ciccarelli's line editor. From this editor, the user would log into Multics recursively, via a looped-back ARPANET connection. Ciccarelli supported three popular terminals. His technique allowed users to edit any Multics input line, and retrieve previous lines. It acquired some limited use at LCS, but did not become popular, the use of two processes and the ARPANET being neither readily available to most users, nor in any way efficient. Some experimentation was done by Ciccarelli and Charles R. Davis with developing a video editor in this environment, but it never saw any use or left the experimental stage. Ciccarelli's efforts were the first use of character-at-a-time input in any form on Multics.

By late 1977, people at MIT, familiar with the video editing developments on ITS, began to see the state of editing on Multics as severely deficient. Although various proposals to create new qedx-like editors were being bandied about, none of them achieved sufficient consensus to be implemented. The people critical of editing on Multics were not impressed by any of these proposals, and realized, from experience on ITS, that the real-time video editor (as ITS Emacs was now recognized to be) was rapidly becoming the central user interface of the system which supported it. They began to claim that Emacs symbolized everything that was right about ITS and wrong about Multics.

The numerous and vocal partisans of Multics, which has traditionally prided itself on its user interface, were sensitive to this criticism, and a discussion rapidly grew about whether or not Multics could ever support an Emacs-like editor. (An Emacs-like editor, for the purposes of this discussion, is a real-time, character-at-a-time, interactive, display-oriented editor (control R mode-like) running in a mainframe). A large set of people contended that this was the wrong way to go, given the predicted economies of terminals, processors, and communications equipment. These parties stated terminals would become more intelligent, and would be more adept at the editing task as technology advanced. Others, notably Stallman, contended (correctly in the author's view) that only by an integrated, powerful program running in a mainframe, could an interface of sufficient usability, generality, extensibility, and power be supported.

Other discussion centered around the feasibility of character-at-a-time input on Multics. Multics had never supported such: neither do many large commercial mainframes. DEC, from their minicomputer orientation, provided this facility, and implementors of operating systems on DEC hardware seem predisposed to carry this facility through. The Multics Communications System is complicated by the presence of a Front-End Processor (known as the FNP), which performs line-control and echoing functions. The mainframe/FNP communication protocols were not optimized for small transfers, or rapid interaction, and FNP interrupts to the Multics central system are expensive. Process wakeups on Multics are expensive as well, and the working sets of Multics processes are large. The predicted expense of character-at-a-time interaction was great, yet the fundamental importance of this feature to the type of system under contemplation was acknowledged by all.

By February, 1978, the discussion had basically come down to whether or not someone in the Multics organization could be persuaded to attempt to implement such an editor. The author had used ITS Emacs during the preparation of a course that January, and had become convinced of the unquestionable superiority and desirability of the ideas contained therein. A demonstration of ITS Emacs, using the Delta Data 4000 at Honeywell's Cambridge Information Systems Lab, was scheduled for March 3, 1978, with the intent of stimulating interest within the Multics Development organization in real-time editing.

The demonstration was widely attended. However, lack of preparation, hardware difficulties, and the presence of a number of unruly outsiders led to chaos, and not very many people went away with any ideas markedly different from those which they brought to the demonstration. One person who was impressed was Larry Johnson, a Multics Communications specialist: within an hour after the end of the demonstration, he had effected a patch to the FNP to cause transmission of each character to the mainframe as it was typed. This implementation was extremely inefficient, causing interrupts and wakeups on every character typed, yet it inspired the author and the ITS people present to consider character-at-a-time I/O as no longer a possibility, but a reality. If an implementation of a real-time editing system could be constructed, it was felt that it would rapidly gain such popularity that all necessary implementation efficiencies would be necessitated by popular demand.

It was that day that the author contemplated the construction of Multics Emacs. Several lessons were to be learned from the history of ITS Emacs. The inferior nature of TECO as an implementation language, plus the "add-on" nature of the display support, led to consideration of an editor constructed with real-time display editing as its primary goal. The removal of the overhead of interpretation of an intermediate code (e.g., TECO) seemed to be one way in which efficiency could rapidly be bought back. A goal-directed implementation in PL/I, the standard Multics system and application language, suggested itself. Multics PL/I is extremely complete, well-debugged, and optimized to systems programming tasks [AG94].

However, another of the lessons learned from ITS Emacs was that of extensibility and modularity. The ability to add programs (macros, in the TECO implementation) to augment the editor was crucial. If the smaller primitives of the editor were to be called in an efficient fashion, they would have to be internal subroutines (internal procedures) of some very large program, which would presumably grow without bound as functionality was added, increasing the difficulty of maintenance, incremental growth, and debugging. Calling external programs is expensive on Multics. If this approach were taken, the ability to extend function by writing new code would require externally available interfaces to the editor primitives, which would presumably operate at reduced efficiency, never equalling the efficiency of code in the "big" procedure. This appeared to be an unreasonable restriction. Other alternatives which presented themselves, such as lexical inclusion of the code for editor primitives, seemed equally unpalatable.

The incremental overhead of creating a new PL/I source program in a large subsystem is substantial: a large amount of communications information must be lexically included in the source program for even the smallest component. Declarations are necessary for every temporary variable, every built-in function, and every other procedure name used. The substantial expense of the PL/I compiler must be invoked on every version of code, no matter how minor the change, to test it. Either source programs proliferate without bound, or routines must be packaged into larger programs, increasing the compilation overhead without bound. Any possible user augmentation, a prime feature, would have to include a large amount of declarative information giving internal structure of editor data-bases.

PL/I did not seem conducive to interactive development of such a subsystem. The choice of languages was limited: technology had long since passed the stage where assembler language was reasonable, and the idea of substantially augmenting Multics TECO seemed distasteful, and a step backwards. Among the other standard languages on Multics, neither FORTRAN, COBOL, APL, nor BASIC could be given serious consideration.

It was at this point that the idea of Lisp as an editor implementation language occurred to the author. Lisp function calls are substantially more efficient than Multics PL/I external calls (but far less efficient than PL/I internal calls). Lisp is ideally suited to building large subsystems of many functions, many of them very small. Lisp, when viewed properly [Lispbook], is a highly object-oriented language, masterfully adept at maintaining complex data structure. Programs can be added to a running Lisp environment without the need for any compilation at all (when debugging), and can be interactively debugged with the aid of the Lisp interpreter. The global namespace of Lisp allows users to reference only the variables they need to reference for a given task, without compiled-in assumptions as to their storage location in any structure. Lisp seemed to solve many of the design problems inherent with the use of PL/I.

The author also had substantial experience with the Multics implementation of MacLisp [Moonual], having implemented a large subsystem (a Multics crash-dump analyzer) in it, and becoming sufficiently dependent upon it that he had, by this time, become the maintainer of the implementation.

Lisp has acquired a bad reputation for several reasons, all false. People often point out the inefficiency of interpreted code. However, all production Lisp code in mature implementations is compiled into machine code, not any kind of pseudocode. The Multics Lisp Compiler [LCP] is well-debugged, and reasonably efficient. People speak of the innate inefficiency of the data-representation of Lisp: yet, the existence of character strings as a data type (in MacLisp), plus the judicious use of outside-of-the-language facilities where appropriate, create a reasonable efficiency of data representation. People are quick to point out the unreadability and unmaintainability of the source code of Lisp programs; yet, it is precisely through editors like Emacs that automatic technologies for editing and formatting Lisp code come about. Given proper editing tools and formatted code, Lisp is more readable (in some opinions) than many block-structured languages. Finally, many, if not most, Lisp implementations, are not fully mature, and thus not suited for large subsystem implementation. The unique nature of the Multics process environment allows subsystems in Multics MacLisp to invoke, and communicate readily with, Multics facilities outside of the Lisp environment. The set of facilities available to a program running in a Multics process is one of the major features of Multics.

The status of Multics MacLisp at this time was that of a holdover from MIT co-development days of Multics. Other than the above-mentioned dump analysis tool, it had no major uses, and its documentation [Moonual] was no longer published or available. Honeywell did not support it officially in any way. No distributed Multics program was written in Lisp.

The decision to use Multics MacLisp as an implementation language had profound consequences for its future. A large amount of interest in Lisp was eventually manifested by those desirous of knowing it solely to be able to augment Multics Emacs. Non-technical personnel have been observed to acquire enough knowledge of Lisp to extend Emacs for only this purpose. Lisp had previously been limited to undergraduate computer science courses and Artificial Intelligence laboratories. The choice of Lisp turned out to be a very wise choice, for the incremental creation of the editor, through its "extensibility," could have been done in no other language.

On the evening of March 3, 1978, Bruce Edwards and the author sat logged into MIT's Multics System and coded, and debugged, what was to become the central module of Multics Emacs. This Lisp program maintained a buffer of text as a list of Editorlines, each Editorline representing one line of text of the buffer. An Editorline is a triplet, of a Linecontent, the previous Editorline of the buffer, and the next Editorline. A Lisp variable identified the current Editorline, and another variable the current point in the current line. The Linecontents were, in this initial implementation, represented as a doubly-linked list of single characters. The current point in the current line was identified with the list node containing the character at that point. Lisp functions were provided to insert a character at the current point, delete a character, break a line into two by inserting a newline, merge two lines by deleting one, move the "current line" and "current character" pointers forward and backward over characters and lines, and print out a line.
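The structure described above can be sketched roughly as follows. This is a Python illustration, not the original Lisp: the names CharNode, Buffer, and the method names are hypothetical, and the sketch assumes that every line ends in a newline node so that point always has a node to rest on.

```python
class CharNode:
    """One character of a line, doubly linked to its neighbours."""
    def __init__(self, char):
        self.char, self.prev, self.next = char, None, None


class Editorline:
    """Triplet of line content, previous line, and next line."""
    def __init__(self):
        # Assumption: each line is terminated by its own newline node.
        self.first = CharNode("\n")
        self.prev = self.next = None

    def text(self):
        chars, node = [], self.first
        while node is not None:
            chars.append(node.char)
            node = node.next
        return "".join(chars)


class Buffer:
    def __init__(self):
        self.current_line = Editorline()
        self.point = self.current_line.first   # node holding the char at point

    def _link(self, before, after):
        """Make `after` follow `before` within the current line."""
        after.prev = before
        if before is None:
            self.current_line.first = after
        else:
            before.next = after

    def insert_char(self, ch):
        if ch == "\n":
            return self.break_line()
        node = CharNode(ch)
        self._link(self.point.prev, node)
        self._link(node, self.point)

    def break_line(self):
        old, new = self.current_line, Editorline()
        self._link(self.point.prev, new.first)  # fresh newline ends the old line
        self.point.prev = None                  # point now heads the new line
        new.first = self.point
        new.prev, new.next = old, old.next
        if old.next is not None:
            old.next.prev = new
        old.next = new
        self.current_line = new

    def delete_char(self):
        node = self.point
        if node.next is None:                   # point is on the line's newline
            return self.merge_lines()
        self._link(node.prev, node.next)
        self.point = node.next

    def merge_lines(self):
        line, nxt = self.current_line, self.current_line.next
        if nxt is None:
            return                              # final newline is never deleted
        self._link(self.point.prev, nxt.first)  # drop newline, splice next line in
        self.point = nxt.first
        line.next = nxt.next
        if nxt.next is not None:
            nxt.next.prev = line

    def backward_char(self):
        if self.point.prev is not None:
            self.point = self.point.prev
        elif self.current_line.prev is not None:
            self.current_line = self.current_line.prev
            node = self.current_line.first
            while node.next is not None:
                node = node.next
            self.point = node

    def text(self):
        line = self.current_line
        while line.prev is not None:
            line = line.prev
        parts = []
        while line is not None:
            parts.append(line.text())
            line = line.next
        return "".join(parts)
```

As in the program described, none of these primitives has any visible effect by itself; a caller has to ask for the text explicitly.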

IV. Multics Emacs: The Embryonic Phase and Basic Decisions

The program created that day did nothing but maintain a buffer of text in Lisp list structure. Primitives to operate upon the buffer and the pointers had no visible effect when invoked. Lines had to be "printed" by invoking a function provided for that purpose. This program is (much augmented since) the core of Multics Emacs. It could have been used equally well to implement a non-display editor of the conventional (qedx, or TECO-like) mold.

The basic structure of an Emacs-like display editor consists of three parts: such a program (called the basic editor), an interactive driver, and a redisplay. The basic editor maintains text and contains primitives to operate upon it. The interactive driver interprets characters read from the keyboard, and invokes primitives in the basic editor to execute these requests. The redisplay is a screen manager, and is invoked by the interactive driver between input characters, and constructs and maintains a display screen by looking at the basic editor's buffer, and effecting incremental change. Much more will be said about the interactive driver and the redisplay later.
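The division of labor among the three parts might be sketched as below. This is a deliberately simplified Python illustration under stated assumptions: the buffer is a plain list of strings rather than the list structure actually used, the class names and the two key bindings are hypothetical, and the redisplay merely reports which screen rows it would rewrite instead of driving a terminal.

```python
class BasicEditor:
    """Holds the text and the primitives that change it; knows nothing of screens."""
    def __init__(self):
        self.lines, self.row, self.col = [""], 0, 0

    def insert_char(self, ch):
        line = self.lines[self.row]
        self.lines[self.row] = line[:self.col] + ch + line[self.col:]
        self.col += 1

    def newline(self):
        line = self.lines[self.row]
        self.lines[self.row:self.row + 1] = [line[:self.col], line[self.col:]]
        self.row, self.col = self.row + 1, 0

    def delete_backward(self):
        line = self.lines[self.row]
        if self.col > 0:
            self.lines[self.row] = line[:self.col - 1] + line[self.col:]
            self.col -= 1
        elif self.row > 0:
            prev = self.lines[self.row - 1]
            self.lines[self.row - 1:self.row + 1] = [prev + line]
            self.row, self.col = self.row - 1, len(prev)


class Redisplay:
    """Remembers what the screen shows and updates only rows that differ."""
    def __init__(self):
        self.screen = []

    def update(self, editor):
        changes = [(row, text) for row, text in enumerate(editor.lines)
                   if row >= len(self.screen) or self.screen[row] != text]
        # Rows that no longer correspond to any buffer line are blanked.
        changes += [(row, "") for row in range(len(editor.lines), len(self.screen))]
        self.screen = list(editor.lines)
        return changes


class InteractiveDriver:
    """Maps each typed character to a primitive, then invokes the redisplay."""
    def __init__(self, editor, redisplay):
        self.editor, self.redisplay = editor, redisplay
        # Hypothetical bindings: carriage return and rubout.
        self.keymap = {"\r": editor.newline, "\x7f": editor.delete_backward}

    def type_key(self, ch):
        command = self.keymap.get(ch)
        if command is not None:
            command()
        else:
            self.editor.insert_char(ch)     # printing characters self-insert
        return self.redisplay.update(self.editor)
```

The point of the separation is that the primitives in the basic editor never touch the screen, so the same primitives could equally serve a non-display editor.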

The basic editor developed on March 3, 1978 had several noteworthy design points. The representation of the text buffer as a list of Editorlines was designed to optimize the redisplay of a display editor, and to optimize user-visible and internal primitives which dealt with lines. Much of editing and text-processing operation consists of operations upon lines, or iterated over lines. Having to search for end-of-line characters for these common operations seemed suboptimal. More specifically, the redisplay would have to identify lines of the buffer with lines displayed upon the screen when the screen was last updated, in order to move lines around and better know which text to compare at redisplay time. Redisplay is basically a compare-and-update process (which will be discussed further), and any way of making the comparisons and heuristics cheaper is of tremendous value.

The maintenance of the buffer as list-structure also means that text does not have to be moved around to perform insertions or deletions: lines deleted are simply unlinked from the list representing the buffer, and new lines are linked in. Lisp garbage collection ultimately reclaims the space used by deleted nodes. Yet, even in non-garbage-collected programming languages, explicit storage management of the list nodes allows this potent strategy to be utilized. The doubly-linked list has as a disadvantage that the representation of an empty buffer, i.e., one containing no lines, and no characters at all, is difficult, and this remains a problem to this day. Multics Emacs buffers are created with one line consisting of only a newline character; such a buffer is considered to be "empty," and the reading of a file into an empty buffer is special-cased to produce the desired result. Another problem with this approach is the difficulty of searching for a character string, particularly one containing imbedded newline characters. In spite of these difficulties, Daniel Weinreb subsequently adopted the doubly-linked buffer list in his editor ZWEI [DLWThesis] on MIT's Lisp Machines [Chineual]. ZWEI is also coded in Lisp.
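The line-list strategy described above can be sketched briefly. The following is a minimal illustration in Python rather than the original Lisp; the names `EditorLine`, `Buffer`, `insert_line_after`, and `delete_line` are hypothetical and do not come from the Multics Emacs sources. It shows only that deleting or inserting a line relinks neighbors without moving any text, and that an "empty" buffer is represented by a single line holding only a newline.

```python
class EditorLine:
    """One buffer line: its content plus links to the neighboring lines."""

    def __init__(self, content):
        self.content = content  # line text, including its trailing newline
        self.prev = None
        self.next = None


class Buffer:
    """A buffer kept as a doubly-linked list of EditorLine nodes."""

    def __init__(self):
        # An "empty" buffer still holds one line consisting of only a newline.
        self.first = EditorLine("\n")
        self.current = self.first

    def insert_line_after(self, line, content):
        """Link a new line in after `line`; no existing text is moved."""
        new = EditorLine(content)
        new.prev, new.next = line, line.next
        if line.next is not None:
            line.next.prev = new
        line.next = new
        return new

    def delete_line(self, line):
        """Unlink `line`; the language's garbage collector reclaims it later."""
        if line.prev is None and line.next is None:
            line.content = "\n"  # last remaining line: fall back to "empty"
            return
        if line.prev is not None:
            line.prev.next = line.next
        else:
            self.first = line.next
        if line.next is not None:
            line.next.prev = line.prev
        if self.current is line:
            self.current = line.next or line.prev

    def is_empty(self):
        return self.first.next is None and self.first.content == "\n"

    def lines(self):
        node = self.first
        while node is not None:
            yield node.content
            node = node.next
```

The special case in `delete_line` mirrors the difficulty noted above: the list cannot hold zero lines, so emptiness has to be a convention rather than a natural state of the structure.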

The decision was made from the start to represent Linecontents in two different ways, one way for all lines but the current line, and the other for the current line. The representation medium for the current line must be easily modifiable, while the other lines must be storage-efficient. MacLisp strings, which were the natural choice to represent Linecontents, are not modifiable. (Strings on the Lisp Machine [Chineual], on the other hand, are array-like objects, and are). In the initial implementation, the current line was represented by the doubly-linked list of characters described above. The Linecontents of all other lines were (and still are) Lisp strings. The current line is copied into the modifiable representation when it is first modified (this is known as opening the line), and copied back into a (new) Lisp string when the current line is left (i.e., is no longer the current line). This strategy matches fairly well the normal user pattern of moving to a line, modifying it, moving to another line, etc. Making sequential changes through a buffer, or simply typing in any amount of new text, are both special cases of this pattern.
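The opening and closing of the current line might be sketched as follows. This is an illustration in Python under stated assumptions, not the original code: `Line`, `LineEditor`, `move_to`, and `close` are hypothetical names, an immutable Python `str` stands in for the MacLisp string, and a Python list of characters stands in for the modifiable representation.

```python
class Line:
    """A line whose content is held in the compact, immutable form."""

    def __init__(self, text):
        self.text = text


class LineEditor:
    """Edits one line at a time, opening it lazily on first modification."""

    def __init__(self, line):
        self.line = line
        self.open_chars = None  # modifiable copy; None while the line is closed
        self.point = 0          # index of the current character

    def _open(self):
        # "Opening the line": copy into the modifiable form, once.
        if self.open_chars is None:
            self.open_chars = list(self.line.text)

    def insert(self, ch):
        self._open()
        self.open_chars.insert(self.point, ch)
        self.point += 1

    def delete(self):
        self._open()
        if self.point < len(self.open_chars):
            del self.open_chars[self.point]

    def close(self):
        # Copy back into a new immutable string when the line is left.
        if self.open_chars is not None:
            self.line.text = "".join(self.open_chars)
            self.open_chars = None

    def move_to(self, line, point=0):
        self.close()
        self.line = line
        self.point = point
```

A line that is only visited, never modified, is never copied at all, which is what makes the scheme cheap for the common pattern of moving through a buffer and changing a few lines.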

It was realized quite early that the doubly-threaded list of characters could not be efficient enough for a production implementation: even attempting to re-use the storage of the list nodes of the representation was deemed too inefficient. For operations upon the current line, the traditional character-string buffer had distinct advantages. Eventually (about three weeks into the development of the editor), a new type of Lisp object had to be invented to hold the modifiable representation of the current line. The rplacable string (from the Lisp terms rplaca and rplacd, the primitive pointer storage modifiers) is stored outside of the Lisp environment, in a Multics segment. Pointers to it can exist in Lisp pointer cells, and these pointers have the type bits of a character string (Multics Lisp pointers are explicitly typed). A special bit pattern in the pointer indicates that the pointer must not be chased or be subjected to data-object relocation by the MacLisp garbage collector, which, in the Multics implementation, is a recompacting-type garbage collector [GCPAPER]. Four rplacable strings are needed by the entire implementation.

The rplacable string is manipulated by two kinds of primitives: normal Lisp (and Lisp-interfacing) primitives can view it as a character string, and special primitives (in LAP, the Lisp-interfacing assembler in many implementations) are provided to modify its contents: delete characters from any point in it, and insert characters at any point in it. These primitives make use of powerful Multics hardware instructions which can perform overlapping string moves in either direction, mapping precisely the actions of inserting and deleting characters from the active current line. The LAP functions run in the Lisp environment, and are called as efficiently as one Lisp function from another.
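The effect of the insertion and deletion primitives can be approximated with a fixed-capacity byte buffer. The sketch below is in Python and is only an analogy: the class name `RplacableBuffer`, its capacity argument, and its methods are assumptions for illustration, and Python slice assignment stands in for the overlapping hardware string moves mentioned above.

```python
class RplacableBuffer:
    """A fixed-capacity, in-place modifiable string (illustrative only)."""

    def __init__(self, capacity=512):
        self.data = bytearray(capacity)  # storage is allocated once
        self.length = 0                  # number of bytes currently in use

    def insert(self, pos, text):
        chunk = text.encode("ascii")
        n = len(chunk)
        if not 0 <= pos <= self.length:
            raise IndexError("insert position out of range")
        if self.length + n > len(self.data):
            raise OverflowError("buffer capacity exceeded")
        # Overlapping move to the right: shift the tail up by n bytes.
        self.data[pos + n:self.length + n] = self.data[pos:self.length]
        self.data[pos:pos + n] = chunk
        self.length += n

    def delete(self, pos, count):
        if not 0 <= pos <= self.length:
            raise IndexError("delete position out of range")
        count = min(count, self.length - pos)
        # Overlapping move to the left: shift the tail down by count bytes.
        self.data[pos:self.length - count] = self.data[pos + count:self.length]
        self.length -= count

    def as_str(self):
        """View the live portion as an ordinary string."""
        return self.data[:self.length].decode("ascii")
```

Both operations replace a slice with one of equal size, so the underlying storage never grows or shrinks; only the tail of the line moves.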

The next step was the development of an interactive driver. The function of the interactive driver of an Emacs-like editor is basically that of TECO Control R mode: to read characters from the user's keyboard, find out what program (in TECO, a macro, in Lisp, a function) to run (the binding of the key), and execute it. After each such function is executed, the interactive driver invokes the redisplay to reflect changes to the buffer on the screen. This loop of read a character, dispatch on it to a function, redisplay, repeated indefinitely, is the basic control loop of an Emacs-like editor.

The interactive driver provided no special problems in the initial implementation: the atomic symbol [Moonual] whose name was the character which had been typed was given as a Lisp property the function to be run when that character was struck. Two-keystroke sequences were mapped into different properties of the second character. This mechanism was not conducive to switching key bindings rapidly nor easily, and prevented the latter from being implemented for the three months while it lasted. The storage inefficiency implicit in the storage of the properties was also undesirable. Nevertheless, the natural mapping of the key-bindings into the Lisp property mechanism provided an easy path to create an operative mechanism to allow the rest of the editor to grow.
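The read, dispatch, redisplay loop can be sketched in a few lines. This Python illustration substitutes dictionaries for the Lisp property lists described above, and every name in it (`run_driver`, `bindings`, `prefix_bindings`, `default`) is hypothetical rather than taken from the editor.

```python
def run_driver(keys, bindings, prefix_bindings, redisplay, default=None):
    """Read keys one at a time, run the bound command, then redisplay.

    bindings:        maps a single key to a command (a one-argument callable)
    prefix_bindings: maps a prefix key to a table of second-key commands
    default:         command used for keys with no binding (e.g. self-insert)
    """
    keys = iter(keys)
    for key in keys:
        if key in prefix_bindings:
            second = next(keys, None)
            if second is None:
                break  # input ended in the middle of a two-key sequence
            command = prefix_bindings[key].get(second, default)
            key = second
        else:
            command = bindings.get(key, default)
        if command is not None:
            command(key)
        redisplay()  # invoked after every command, whatever it did
```

Note that the commands never touch the screen themselves; the loop alone decides when the redisplay runs.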

A more significant difficulty was the availability of character-at-a-time I/O for experimentation. The growing editor was completely experimental and not part of any recognized or funded project, and no resources were immediately available among the already highly overcommitted Multics Communications support specialists. Thus, Ciccarelli's code was sought out, to find the basis of his techniques of single-character-input via the Multics ARPANET. Within a day, the interactive driver was operating in true character-at-a-time (real-time) mode for processes logged in via the Multics ARPANET. As there was no redisplay, a Lisp function which printed out the current line, with an overstruck < and > for the current character position, served in its place.

Soon thereafter, Johnson's patch to effect single-character transmission from the FNP was applied on a regular basis, on the CISL Development Multics Site (a testbed system with no real user community) as editor development progressed. As this went on, users on MIT's Multics system willing to experiment with the new editor were forced to use it via the ARPANET, for those administratively allowed to use the ARPANET, or experiment with it in non-real-time mode (typing a linefeed to activate, i.e., cause Multics to take cognizance of, previously typed input). These development paths proceeded in parallel: the application of the patches (to become known as the breakall mode patches), a slow and dangerous operation requiring highly privileged access, to the MIT service system was out of the question.

The next and final step in the birth of the editor was the design and creation of the Redisplay. A redisplay's job is to depict a section of the contents of a buffer on a video terminal screen. The redisplay is invoked by the interactive driver after each function has been called, which has performed arbitrary modifications upon the buffer. The redisplay must know precisely what it put upon the screen the previous time it was invoked, compute what must be put upon the screen this time, and determine how to most effectively and efficiently modify the screen, using the terminal facilities available, to transform the old screen content into the new screen content.

It is one of the fundamental design principles of a real-time video editor of this nature that the basic editor is aware of neither the existence nor the organization of the redisplay. The redisplay is, symmetrically, aware of neither the organization nor actions of the basic editor. Between invocations of the redisplay, it is given no hint as to how the transformations upon the buffer which it will observe were performed; it can observe only the new state of the buffer (including current line pointer, etc.). This philosophy leads to the highly desirable state of affairs, where extensions (i.e., user-supplied editor features) as well as the basic editor need not be at all concerned with display management, but only manipulation of the text in the buffer via the supplied primitives. The display is managed automatically.

The first coding of the Multics Emacs redisplay was performed on March 6, 1978. The Delta Data 4000 terminal at CISL was the only video terminal readily available, and fortunately, in spite of severe implementation bugs in the terminal, it had the features of the better class of consumer video terminals available. The ability to insert and delete characters and lines from the screen was thus designed into the redisplay from the start. The interface to the redisplay was designed as one Lisp function, redisplay, which took no explicit arguments. The current buffer, the current line pointer, the current character position pointer, etc., i.e., the current state of the basic editor, are all implicit parameters. The contract of the function redisplay is to determine what is to be displayed on the screen, how it differs from what is already on the screen, update the screen, and remember for next time what is now on the screen.

Even the appearance of simple typed input on the screen is a manifestation of the redisplay. Theoretically, a redisplay occurs between every two characters typed, and it is the redisplay which puts all characters on the screen, including typed input, one at a time, as they are typed. In fact, later optimizations (to be discussed) allow participation of the operating system to be negotiated for simple echoing, but the principle remains the same.

The redisplay is the only part of the editor which interfaces to the terminal's display. In such a system, the keyboard and display of the terminal are considered to be completely disjoint; terminals for which this cannot be said to be so are simply not usable in this environment. Device-independent terminal support was provided by supplying a separate Lisp program (known as a CTL, for "controller") for each terminal type supported. The functions defined in each CTL are the same, and provide the common functionalities of terminal displays. For example, DCTL-position-cursor, of two arguments, coordinates, positions the terminal's cursor to those coordinates. DCTL-delete-lines, of one argument, the number of lines to be deleted, deletes that many lines at the current cursor. An initialization function, DCTL-init, is provided in each terminal's CTL, to set parameters used by the redisplay which state which functions are available: all terminals subset the maximal CTL. Functions in all CTLs call a common interface to output characters to the terminal.

Via the CTL mechanism, the dynamic nature of the Lisp workspace, and of Lisp function calling in particular, is used to add a terminal-specific component to the editor at invocation time. The type of terminal being used is (usually) provided by the Multics Communications software, and thus, the loading of the CTL is automatic.
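The shape of a CTL might be sketched as below. This is a Python illustration under several assumptions: the terminal is taken to accept ANSI (ECMA-48) control sequences, which is not a terminal the paper discusses; the argument order `(x, y)` for the cursor coordinates is a guess; and the class names, the `write` callable standing in for the common output interface, and the capability flag `can_delete_lines` are all hypothetical.

```python
class AnsiCTL:
    """Controller for a hypothetical terminal that accepts ANSI sequences."""

    def __init__(self, write):
        self.write = write             # the common character-output interface
        self.can_delete_lines = False  # capability flag read by the redisplay

    def init(self):
        # Counterpart of DCTL-init: declare which operations are available.
        self.can_delete_lines = True

    def position_cursor(self, x, y):
        # Counterpart of DCTL-position-cursor; ANSI rows and columns are 1-based.
        self.write(f"\x1b[{y + 1};{x + 1}H")

    def delete_lines(self, count):
        # Counterpart of DCTL-delete-lines: delete lines at the cursor.
        self.write(f"\x1b[{count}M")


class DumbCTL:
    """Controller for a terminal that can position the cursor and nothing else."""

    def __init__(self, write):
        self.write = write
        self.can_delete_lines = False

    def init(self):
        self.can_delete_lines = False  # the redisplay must rewrite lines instead

    def position_cursor(self, x, y):
        self.write(f"\x1b[{y + 1};{x + 1}H")
```

The redisplay would consult the flag set by `init` before choosing a strategy, which is one way to read the statement that all terminals subset the maximal CTL.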

The heart of the redisplay is its screen image, a data structure by which the redisplay remembers what it left on the screen after one invocation, so that it might know what is there at the start of the next. Rather than an array of characters, the representation of the screen image is designed to take advantage of the basic editor's division of the buffer into Editorlines. The screen image is a Lisp array [Moonual] of one dimension, with one element for each physical line of the display terminal. Each element of this array (called screen) is a Displayline, which is a pair of an Editorline and a string which is the exact character sequence known to be on the physical line of the display (the Linedata). Several consecutive Displaylines may have the same Editorline, for Editorlines whose printed representation is longer than one physical terminal line.
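A deliberately naive version of the screen image and the compare-and-update step is sketched here in Python. It assumes line text made only of single-width printing characters, rewrites a whole physical line whenever it differs, and makes no attempt to use insert-line or delete-line operations; the names `EditorLine`, `DisplayLine`, `layout`, and the `write_line` callback are hypothetical.

```python
class EditorLine:
    def __init__(self, text):
        self.text = text  # line text without its newline


class DisplayLine:
    """One physical screen row: which EditorLine it shows and the exact text."""

    def __init__(self):
        self.editorline = None
        self.linedata = ""


def layout(editorlines, width, height):
    """Compute the wanted (editorline, linedata) pair for every screen row."""
    rows = []
    for eline in editorlines:
        text = eline.text
        # A long line continues onto as many physical rows as it needs.
        chunks = [text[i:i + width] for i in range(0, len(text), width)] or [""]
        rows.extend((eline, chunk) for chunk in chunks)
    rows = rows[:height]
    rows += [(None, "")] * (height - len(rows))
    return rows


def redisplay(screen, wanted, write_line):
    """Update only the rows whose remembered text differs from the wanted text."""
    for row, (eline, data) in enumerate(wanted):
        shown = screen[row]
        if shown.linedata != data:
            write_line(row, data)
        # Remember for next time what is now on the screen.
        shown.editorline = eline
        shown.linedata = data
```

Keeping the Editorline alongside the Linedata is what would let a fuller redisplay recognize that a line has merely moved to a different row, rather than comparing text alone.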

The contents of a Linedata reflect the actual printed representation of an Editorline or a part of an Editorline. Linedatas do not contain new-line characters, tabs, non-printing characters, or other format effectors. The number of characters in a Linedata is its width in columns upon the screen; all characters in it are single-width printing characters. The character in a given position on the corresponding terminal line is the character in the corresponding position of the Linedata. Non-printing characters in the Editorline (actually the Linecontent of the Editorline) are converted to their printable representation (an ASCII WRU to either "

Table of Contents

Multics Emacs: The History, Design and Implementation

Bernard S. Greenberg -- 15 August 1979

I. Prehistory of Editing on Multics

II. History of Video-oriented Editing on ITS

III. Inception of Multics Emacs

IV. Multics Emacs: The Embryonic Phase and Basic Decisions
