ACS

ACS Project

First, I have to say something about my job interview. While visiting in California in the autumn of 1968 I learned that IBM was looking for people with a compiler background to work on their Advanced Computer System (ACS) project located in Menlo Park. That sounded promising, so I sent my resume to the local personnel office, and was granted an interview. I was told later that the personnel office had said, essentially, "there's a woman here who doesn't have any of the experience you are interested in, but, just to be polite, please talk to her for 20 minutes or so." In the sense of creating low expectations, I couldn't have asked for a better introduction to Fran Allen, in future the first female winner of the Turing award. (The problem probably was that the personnel employee usually saw resumes from hardware architects, who made up the majority of the ACS project members, and thus my references to various compiler projects were not recognized.)

In the next sections I'll first provide some context in terms of a brief history of the ACS project as a whole, and a view of compiler optimization in the late 1960s. Then I'll talk about the ACS compiler and follow-on, and my role in that. And, finally, there's a long footnote about the attribution of credit for some optimizer design ideas (not mine).

ACS Project Outline

The ACS project was begun in 1963, as "Project Y", to develop one of two planned entries for IBM into the area of supercomputers. In 1965 the project was moved to the San Francisco Bay Area, to a new lab named "IBM Advanced Computing Systems". An optimizing compiler was an integral part of the project, and the initial compiler team included John Cocke (who also played a major role in the ACS project as a whole), Fran Allen, and Jim Beatty.

At a kickoff meeting in 1965, Gene Amdahl, a major figure in IBM hardware architecture, proposed a 360-compatible approach, which was rejected. Gene was thereafter ostracized for his continuing support of 360 compatibility, but maintained contact with IBM upper management. The controversy came to a head in March 1968 when a task force was organized to consider the two alternatives, and a few months later the task force decided in favor of an Amdahl proposal (made in collaboration with John Earle). The redirection had major effects on the project. Amdahl took over as overall head, and although some of the people involved left, both the hardware and software work continued until the renamed "ACS/360" was cancelled in May 1969. However, the compiler work continued under the aegis of IBM's Systems Development Division (SDD) until 1971.

Note: Please see [SM_1] and [SM_2] for accounts of the ACS project that focus on its all-important hardware aspects, which will not be discussed further here.

Compiler Optimization in the Late 1960s

The initial, pioneering IBM Fortran compilers of the late 1950s and early 1960s focused on Fortran DO loop optimization, especially (but not exclusively) on avoiding expensive multiplications required by array references within those loops. See [BACK]. Then, in the mid-to-late 1960s, more general views of optimization types, and how they might be implemented, emerged. Part of this was probably just a natural intellectual development, but part was because it was becoming of interest to build compilers for multiple programming languages with similar optimization requirements and opportunities. A few of these general types of optimizations:

Common Subexpression Elimination: If there are two identical computations at different places in the code, and it can be established that one of them always precedes the other, and the values of the component variables are not changed between the two, then the first computed value can be stored, and referenced in place of the second computation.
Code Motion: If an expression or subexpression in a loop does not change value within the loop, the computation can be moved outside the loop, with only the computed value referenced in the loop.
Strength Reduction: This is the more general name for the DO loop subscript optimization performed by the first Fortran compilers. The optimization is used when a slower operation (like multiplication) repeated in a loop can be replaced by a faster one (like addition). So in a loop where I is the "loop index" and is incremented by 1 each time around the loop, an expression like "I * 1000", used to compute the location of an element of an array, can be replaced by adding 1000 to an introduced variable each time around the loop. (The name "strength reduction" relates to the precedence level of the operations, which is sometimes called, alternatively, its "binding strength.")

Now performing these transformations depends on knowing a great deal about the structure and content of the program. So optimizing compilers performed some standard kinds of analysis, but differing in method, determining how the information was obtained and represented. The analyses:

Basic Block Identification Dividing the program code, usually transformed to a lower level "intermediate language" (IL) into "basic blocks" entered only at the first instruction and exited (possibly to alternative blocks) only after all value assignments.
Control Flow Analysis: Identifying the paths, in terms of sequences of basic blocks, that might be traversed during the execution of the program, including the location and scope of loops (often called "strongly connected regions"). That analysis or a follow-on would create a structured representation of some sort for use in further investigation.
Data Flow Analysis: This might be integrated with the optimizing transformations, and would consist of traversing the structured program representation, essentially to find relationships between definitions of variables ("x=") and their uses ("= .. x .. "), and relationships between identical uses, as bases for the various optimizations. Also, by the late 1960s, logical operations (AND and OR) on bit vectors, with the bits representing different types of references to variables, were increasingly used in this kind of analysis to compute situations at different points in the program.

There were several IBM-related efforts in the mid-to-late 1960s that designed compiler optimization techniques using these analyses, and sometimes built the associated compilers. It is important to mention this, because there are occasional suggestions that the ACS effort "invented" serious optimization. The most successful of the IBM efforts was the 360 Fortran H compiler [LM], released for public use circa 1967; it and its enhancements remained in wide use for many years.

ACS Compiler and Continuation

When I joined ACS in late 1968, Gene Amdahl was the overall manager and Fran Allen was the compiler manager. Compiler development responsibilities were divided into those for (a) the "front ends" including translators for Fortran, and later COBOL, to an "intermediate language" (IL) closer to the target machine language, (b) the "back end" optimizers, and (c) compiler-wide definitions and functions. (At some point a PL/I translator was planned, but was later dropped.)

The optimizers were further subdivided into two phases. "Machine independent optimization" (MIO) subsumed analyses and transformations that were assumed to be relevant to any target machine. "Machine dependent optimization" (MDO) was largely concerned with the critical function of "register allocation", which was dependent on the number and type of hardware registers, and with the function of "instruction scheduling", concerned with arranging instructions to maximize the effect of low level hardware parallelism (e.g., the allocation of multiple adders). Not long after joining ACS I was given responsibility for the MIO phase.

Most ACS personnel worked at an IBM building on Sand Hill Road in Menlo Park, while Fran, I, and some others involved in the compiler occupied rented space in the Addison-Wesley building several blocks away, so I didn't often see most of the ACS staff. Also at the Addison-Wesley office was Jim Beatty, who was responsible for the MDO phase, and who became a good friend and advisor (he and his wife and I and another friend would attend concerts together at a local community college). See [BEAT_1, BEAT_2, and BEAT_3] for some of his work stemming from the ACS project. And Jack Schwartz, founder and chair of the Computer Science Department of NYU's Courant Institute, who had consulted to the project earlier, in 1967, returned there on sabbatical in 1969.

As mentioned earlier, after ACS/360 was cancelled, the compiler work continued under the aegis of IBM's Systems Development Division (SDD). Fran stayed on for a few months to organize and supervise the transition. Then a new manager was appointed, and we moved first to the IBM building in Menlo Park, and then to the IBM plant site in south San Jose. In San Jose, the "front-" and "middle-" end compiler developers reported to a first-level manager whom Jim Beatty and I didn't especially get along with, so we instead reported to the relevant second-level manager.

During the project I and a talented young man named Dick Foster assigned to assist me (who possibly was present only for Menlo Park stages of the project) developed detailed algorithms and implementation designs for the machine independent optimizer, and a substantial amount of code. (The coding was intended to ensure that we worried about all the necessary details; to my recollection it was not considered production code.)

It is important to note that the algorithms we used were adapted from written material developed in the 1967-1968 timeframe by Fran, John Cocke, and Jack Schwartz, and also by Chris Earnest, Karl Balke, and others from Computer Sciences Corporation (CSC) who consulted to the project before I arrived. Versions of most of the material appeared in external publications several years later (see [ALLEN_1], [CS] (Chapter 6). [EARN_1], [EARN_2]). And the CSC material exerted possibly the strongest influence.

The entire project was cancelled some time during 1971. Towards the end of the project we collaborated to some extent with a new IBM compiler effort in NYC. I taught a short course there, and discussed and critiqued their designs. After the project was cancelled, I was given the choice of returning to New York, but decided against it. That was probably, inadvertently, a good career decision, as it led to the fascinating challenges of data base models and their relationship to programming language, and of natural language processing, concerns that occupied me for almost the next two decades. Did the decision affect what happened in IBM compiler development? Because our design was more advanced than that of the NYC project? Probably not.

Long Footnote

The project had a strange aftermath relating to invention credit (not mine!!). As noted above, when I arrived I was given considerable written material documenting the optimization algorithms that had been developed during the previous several years, including proofs of their properties. One group of algorithms related to "control flow analysis", which generally produces a representation of the program in terms of (a) the identification of loops (not all of which are identified by explicit looping constructs), their entry points, contained blocks, and exits, and (b) a useful order of processing loop content (blocks and contained loops) to limit the cost of further analyses.

Now, in the material I was given, two methods of control flow analysis were described. One, by Fran Allen and John Cocke, was called "interval analysis" [ALLEN_1]. Another, developed by the CSC consultants, was called, variously,"code straightening" or "basic block numbering" [EARN_1]. I looked at both methods, and, because the CSC method seemed simpler and the results more directly useful, I decided to use it in my further work on the project, definitely with Fran's acquiescence. (I recall she once said something to the effect that she preferred "intervals" but couldn't put her finger on exactly why.)

Just for fun, here's an informal description of the CSC algorithm. It has two parts. In the first part, an ordered list is created as follows:

the program entry block is made the first element of the list, and identified as the "current block".
if there is an as-yet-unprocessed successor of the current block, chain it immediately after the current block and make it the current block, and return to 2.
otherwise if there is a block in the list before the current one, make it current and return to 2.
otherwise exit, finished

Then the blocks are initially numbered based on their position in the list, starting with 1. Note that any block b comes after all its "back dominators", i.e., the blocks which must be reached before b on any path from the program entry. And all the loops in the program are enclosed by "back branches", i.e., by branches from a higher numbered block to a lower numbered one.

But those back branches can enclose blocks not actually part of the loop, so the second part of the method, called "loop cleansing", takes care of that. Starting with an innermost unprocessed loop, and the block (currently) immediately before the last one in the loop, look at each block going backward and, if it has no successors within the loop, move it in the list to immediately following the last block of the loop (and renumber the affected parts). Keep doing this until all unprocessed loops are dealt with. And that's all, folks :-)

(Note: when the control flow analysis is complete, the program can be represented for further processing as a layered structure in which loops are treated as collapsed within containing loops. The complexity of the "collapsed" representation of a loop depends on whether it is fully "reducible", meaning, roughly, whether it can be treated as single-entry.)

Fran left the project in 1969, but she continued to be concerned with compilers and optimization for the rest of her long career in the IBM research division. So, in 1976, she co-authored an article with John Cocke, entitled "A Program Data Analysis Procedure" [ALLEN_2], which appeared in the ACM Communications. In the paper they used "interval analysis" for the control flow analysis function, but mention that

"Clearly the rapidity with which the [N.B. data flow] information stabilizes for a given graph very much depends on the order in which the nodes of the graph are examined. In [10], Hecht and Ullman use a node ordering established by applying Tarjan's depth first spanning tree algorithm to the control flow graph."

The reference "[10]" is our [HU_1] (see ACS References). In it, to be overly brief, Hecht and Ullman discuss various methods of control flow analysis and the follow-on data flow analyses often applied in a processing order identified during control flow. Among the control flow methods discussed are "interval analysis", and Tarjan's algorithm as described in [TARJ_1]. But Hecht and Ullman also mention (without giving any detail) a method of "Earnest et al" (our [EARN_1]), which they say obtains a processing order equal to the inverse of that obtained by Tarjan, which it does. The "Earnest et al" order is actually the correct order for further processing, and the analysis method is simple, straightforward, and fast.

Further discussion of control flow analysis by Fran seems to mention only Tarjan. For example, in her (in many ways useful) paper "The History of Language Processors in IBM" [Allen_3] she says:

In 1970 two papers [70, 71] described fast, abstract methods for program control and data flow analysis. Given a directed graph representation of a program, one paper [70] showed how the graph could be partitioned into “intervals” (single entry subgraphs in which all loops contain the entry node) and, treating each interval as a node, higher order graphs were derived and then analyzed to effectively codify the nesting structure of the program.... Since that time a nearly linear algorithm has been found [72] which also gives a somewhat better codification of interesting control flow relationships. However, the most important feature of the interval analysis method was that a node ordering was established which could be used to rapidly determine other relationships in the program.

Well, [72] is another Tarjan paper (our [TARJ_3]) dealing with reducibility of program flow graphs. But, certainly, [EARN_1] obtains a node ordering from which other relationships in the program, including "reducibility", can be rapidly determined.

So the question is why the Earnest/Balke contribution to control flow analysis, which Fran was certainly aware of, was erased, along with their contributions to other aspects of optimization as described in [EARN_2]. The latter paper is essentially the same as one we received in mid-1969 and was an outgrowth of memos given to ACS by CSC in 1968. I don't know the answer.