Subject: Reverse-engineering an analog Bendix air data computer: part 4, the Mach section
In the 1950s, many fighter planes used the Bendix Central Air Data Computer (CADC) to compute airspeed, Mach number, and other "air data". The CADC is an analog computer, using tiny gears and specially-machined cams for its mathematics. In this article, part 4 of my series,1 I reverse engineer the Mach section of the CADC and explain its calculations. (In the photo below, the Mach section is the middle section of the CADC.)
Aircraft have determined airspeed from air pressure for over a century. A port in the side of the plane provides the static air pressure,2 the air pressure outside the aircraft. A pitot tube points forward and receives the "total" air pressure, a higher pressure due to the air forced into the tube by the speed of the airplane. The airspeed can be determined from the ratio of these two pressures, while the altitude can be determined from the static pressure.
But as you approach the speed of sound, the fluid dynamics of air change and the calculations become very complicated. With the development of supersonic fighter planes in the 1950s, simple mechanical instruments were no longer sufficient. Instead, an analog computer calculated the "air data" (airspeed, air density, Mach number, and so forth) from the pressure measurements. This computer then transmitted the air data electrically to the systems that needed it: instruments, weapons targeting, engine control, and so forth. Since the computer was centralized, the system was called a Central Air Data Computer or CADC, manufactured by Bendix and other companies.
Each value in the Bendix CADC is indicated by the rotational position of a shaft. Compact electric motors rotate the shafts, controlled by the pressure inputs. Gears, cams, and differentials perform computations, with the results indicated by more rotations. Devices called synchros convert the rotations to electrical outputs that are connected to other aircraft systems. The CADC is said to contain 46 synchros, 511 gears, 820 ball bearings, and a total of 2,781 major parts (but I haven't counted). These components are crammed into a compact cylinder: just 15 inches long and weighing 28.7 pounds.
The equations computed by the CADC are impressively complicated. For instance, one equation is:
\[~~~\frac{P_t}{P_s} = \frac{166.9215M^7}{( 7M^2-1)^{2.5}}\]
It seems incredible that these functions could be computed mechanically, but three techniques make this possible. The fundamental mechanism is the differential gear, which adds or subtracts values. Second, logarithms are used extensively, so multiplications and divisions are implemented by additions and subtractions performed by a differential, while square roots are calculated by gearing down by a factor of 2. Finally, specially-shaped cams implement functions: logarithm, exponential, and application-specific functions. By combining these mechanisms, complicated functions can be computed mechanically, as I will explain below.
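To make the logarithm trick concrete, here is a minimal Python sketch. It is only an idealized model, not the CADC's actual gearing: the `differential` and `gear` functions and the sample values are my own illustrative assumptions, with rotations treated as plain numbers.

```python
import math

def differential(a, b, subtract=False):
    """Idealized differential: output rotation is the sum or difference of the inputs."""
    return a - b if subtract else a + b

def gear(rotation, ratio):
    """Idealized gear pair: scales a rotation by a fixed ratio."""
    return rotation * ratio

# Hypothetical values, represented logarithmically as shaft rotations.
x, y = 6.0, 1.5
log_x, log_y = math.log(x), math.log(y)

# Adding logs multiplies, subtracting logs divides, halving a log takes a square root.
product = math.exp(differential(log_x, log_y))
quotient = math.exp(differential(log_x, log_y, subtract=True))
root = math.exp(gear(log_x, 0.5))

print(product, quotient, root)
```

The `math.exp` calls stand in for the exponential cams that convert a logarithmic value back to a linear one when that is needed.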
The differential
The differential gear assembly is the mathematical component of the CADC, as it performs addition or subtraction.3 The differential takes two input rotations and produces an output rotation that is the sum or difference of these rotations.4 Since most values in the CADC are expressed logarithmically, the differential computes multiplication and division when it adds or subtracts its inputs.
While the differential functions like the differential in a car, it is constructed differently, with a spur-gear design. This compact arrangement of gears is about 1 cm thick and 3 cm in diameter. The differential is mounted on a shaft along with three co-axial gears: two gears provide the inputs to the differential and the third provides the output. In the photo, the gears above and below the differential are the input gears. The entire differential body rotates with the sum, connected to the output gear at the top through a concentric shaft. (In practice, any of the three gears can be used as the output.) The two thick gears inside the differential body are part of the mechanism.
The cams
The CADC uses cams to implement various functions. Most importantly, cams compute logarithms and exponentials. Cams also implement complicated functions of one variable such as \( M/\sqrt{1 + .2 M^2} \). The function is encoded into the cam's shape during manufacturing, so a hard-to-compute nonlinear function isn't a problem for the CADC. The photo below shows a cam with the follower arm in front. As the cam rotates, the follower moves in and out according to the cam's radius.
However, the shape of the cam doesn't provide the function directly, as you might expect. The main problem with the straightforward approach is the discontinuity when the cam wraps around. For example, if the cam implemented an exponential directly, its radius would spiral exponentially and there would be a jump back to the starting value when it wraps around. Instead, the CADC uses a clever patented method: the cam encodes the difference between the desired function and a straight line. For example, an exponential curve is shown below (blue), with a line (red) between the endpoints. The height of the gray segment, the difference, specifies the radius of the cam (added to the cam's fixed minimum radius). The point is that this difference goes to 0 at the extremes, so the cam will no longer have a discontinuity when it wraps around. Moreover, this technique significantly reduces the size of the value (i.e. the height of the gray region is smaller than the height of the blue line), increasing the cam's accuracy.5
To make this work, the cam position must be added to the linear value to yield the result. This is implemented by combining each cam with a differential gear; watch for the paired cams and differentials below. As the diagram below shows, the input (23) drives the cam (30) and the differential (25, 37-41). The follower (32) tracks the cam and provides a second input (35) to the differential. The sum from the differential produces the desired function (26).
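The cam-plus-differential idea can be sketched numerically. This is a rough model under my own assumptions: the input range `X_MIN` to `X_MAX`, the function names, and the choice of an exponential as the target are all hypothetical, and real cam profiles also include a fixed minimum radius and scaling that I ignore here.

```python
import math

X_MIN, X_MAX = 0.0, 2.0  # hypothetical input range of the shaft

def target(x):
    """The function we want out of the mechanism (an exponential, as in the example)."""
    return math.exp(x)

def line(x):
    """Straight line through the endpoints of the target function."""
    slope = (target(X_MAX) - target(X_MIN)) / (X_MAX - X_MIN)
    return target(X_MIN) + slope * (x - X_MIN)

def cam_offset(x):
    """What the cam profile encodes: the gap between the line and the curve."""
    return line(x) - target(x)

def cam_and_differential(x):
    """The differential combines the linear input with the follower position."""
    return line(x) - cam_offset(x)

for x in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(x, round(cam_offset(x), 4), round(cam_and_differential(x), 4))
```

The offset is zero at both ends of the range, so the cam can wrap around without a jump, and its largest value is much smaller than the full range of the exponential.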
The synchro outputs
A synchro is an interesting device that can transmit a rotational position electrically over three wires. In appearance, a synchro is similar to an electric motor, but its internal construction is different, as shown below. Before digital systems, synchros were very popular for transmitting signals electrically through an aircraft. For instance, a synchro could transmit an altitude reading to a cockpit display or a targeting system. Two synchros at different locations have their stator windings connected together, while the rotor windings are driven with AC. Rotating the shaft of one synchro causes the other to rotate to the same position.6
For the CADC, most of the outputs are synchro signals, using compact synchros that are about 3 cm in length. For improved resolution, many of the CADC outputs use two synchros: a coarse synchro and a fine synchro. The two synchros are typically geared in an 11:1 ratio, so the fine synchro rotates 11 times as fast as the coarse synchro. Over the output range, the coarse synchro may turn 180°, providing the approximate output unambiguously, while the fine synchro spins multiple times to provide more accuracy.
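Here is one way a receiver could combine a coarse and a fine synchro reading, sketched in Python. I don't know how the aircraft systems actually did this; the `combine` function is my own assumption about a reasonable approach, using the 11:1 ratio from the text.

```python
RATIO = 11  # fine synchro turns per coarse synchro turn (from the text)

def synchro_readings(shaft_deg):
    """Angles the two synchros would report for a given coarse-shaft angle."""
    coarse = shaft_deg % 360.0
    fine = (shaft_deg * RATIO) % 360.0
    return coarse, fine

def combine(coarse_deg, fine_deg):
    """Use the coarse angle to pick the fine revolution, then trust the fine angle."""
    turns = round((coarse_deg * RATIO - fine_deg) / 360.0)
    return (turns * 360.0 + fine_deg) / RATIO

coarse, fine = synchro_readings(123.4)
# Even with a few degrees of error on the coarse reading, the fine reading recovers the value.
print(combine(coarse + 3.0, fine))
```

In this sketch the coarse reading only needs to be accurate enough to identify which of the 11 fine revolutions is in play; the precision comes from the fine synchro.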
Examining the Mach section of the CADC
The Bendix CADC is constructed from modular sections. In this blog post, I'm focusing on the middle section, called the "Mach section" and indicated by the arrow above. This section computes log static pressure, impact pressure, pressure ratio, and Mach number and provides these outputs electrically as synchro signals. It also provides the log pressure ratio and log static pressure to the rest of the CADC as shaft rotations. The left section of the CADC computes values related to airspeed, air density, and temperature.7 The right section has the pressure sensors (the black domes), along with the servo mechanisms that control them.
I had feared that any attempt at disassembly would result in tiny gears flying in every direction, but the CADC was designed to be taken apart for maintenance. Thus, I could remove the left section of the CADC for analysis. Unfortunately, we lost the gear alignment between the sections and don't have the calibration instructions, so the CADC no longer produces accurate results.
The diagram below shows the internal components of the Mach section after disassembly. The synchros are in pairs to generate coarse and fine outputs; the coarse synchros can be distinguished because they have spiral anti-backlash springs installed. These springs prevent wobble in the synchro and gear train as the gears change direction. The gears and differentials are not visible from this angle as they are underneath the metal plate. The Position Error Correction (PEC) subsystem has a motor to drive the shaft and a control transformer for feedback. The Mach section has two D-sub connectors. The one on the right links the Mach section and pressure section to the front section of the CADC. The PEC servo amplifier board plugs into the left connector. The static pressure and total pressure input lines have fittings so the lines can be disconnected from the lines from the front of the CADC.8
The photo below shows the left section of the CADC. This section meshes with the Mach section shown above. The two sections have parts at various heights, so they join in a complicated way. Two gears receive the pressure signals \( log ~ P_t / P_s \) and \( log ~ P_s \) from the Mach section. The third gear sends the log total temperature to the rest of the CADC. The electrical connector (a standard 37-pin D-sub) supplies 120 V 400 Hz power to the Mach section and pressure transducers and passes synchro signals to the output connectors.
The position error correction servo loop
The CADC receives two pressure inputs, and two pressure transducers convert these pressures into rotational positions, providing the indicated static pressure \( P_{si} \) and the total pressure \( P_t \) as shaft rotations to the rest of the CADC. (I explained the pressure transducers in detail in the previous article.)
There's one complication though. The static pressure \( P_s \) is the atmospheric pressure outside the aircraft. The problem is that the static pressure measurement is perturbed by the airflow around the aircraft, so the measured pressure (called the indicated static pressure \( P_{si} \)) doesn't match the real pressure. This is bad because a "static-pressure error manifests itself as errors in indicated airspeed, altitude, and Mach number to the pilot."9
The solution is a correction factor called the Position Error Correction. This factor gives the ratio between the real pressure \( P_s \) and the measured pressure \( P_{si} \). By applying this correction factor to the indicated (i.e. measured) pressure, the true pressure can be obtained. Since this correction factor depends on the shape of the aircraft, it is generated outside the CADC by a separate cylindrical unit called the Compensator, customized to the aircraft type. The position error computation depends on two parameters: the Mach number provided by the CADC and the angle of attack provided by an aircraft sensor. The compensator determines the correction factor by using a three-dimensional cam. The vintage photo below shows the components inside the compensator.
The correction factor is transmitted from the compensator to the CADC as a synchro signal over three wires. To use this value, the CADC must convert the synchro signal to a shaft rotation. The CADC uses a motorized servo loop that rotates the shaft until the shaft position matches the angle specified by the synchro input.
The key to the servo loop is a control transformer. This device looks like a synchro and has five wires like a synchro, but its function is different. Like the synchro motor, the control transformer has three stator wires that provide the angle input. Unlike the synchro, the control transformer also uses the shaft position as an input, while the rotor winding generates an output voltage indicating the error between the control transformer's shaft position and the three-wire angle input. The control transformer provides its error signal as a 400 Hz sine wave, with a larger signal indicating more error.10
The amplifier board (below) drives the motor in the appropriate direction to cancel out the error. The power transformer in the upper left is the largest component, powering the amplifier board from the CADC's 115-volt, 400 Hertz aviation power. Below it are two transformer-like components; these are the magnetic amplifiers. The relay in the lower-right corner switches the amplifier into test mode. The rest of the circuitry consists of transistors, resistors, capacitors, and diodes. The construction is completely different from modern printed circuit boards. Instead, the amplifier uses point-to-point wiring between plastic-insulated metal pegs. Both sides of the board have components, with connections between the sides through the metal pegs.
The amplifier board is implemented with a transistor amplifier driving two magnetic amplifiers, which control the motor.11 (Magnetic amplifiers are an old technology that can amplify AC signals, allowing the relatively weak transistor output to control a larger AC output.12) The motor is a "Motor / Tachometer Generator" unit that also generates a voltage based on the motor's speed. This speed signal provides negative feedback, limiting the motor speed as the error becomes smaller and ensuring that the feedback loop doesn't overshoot. The photo below shows how the amplifier board is mounted in the middle of the CADC, behind the static pressure tubing.
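To see why the tachometer feedback matters, here is a toy simulation of a servo loop in Python. This is a sketch under heavy assumptions: the motor is modeled as a pure inertia, and the gains, time step, and `run_servo` function are made-up values for illustration, not measurements of the CADC.

```python
def run_servo(target_deg, rate_gain, steps=2000, dt=0.001):
    """Simulate the shaft angle; returns (final_angle, overshoot past the target)."""
    error_gain = 400.0  # hypothetical amplifier gain
    angle, speed = 0.0, 0.0
    peak = 0.0
    for _ in range(steps):
        error = target_deg - angle                      # control transformer output
        drive = error_gain * error - rate_gain * speed  # tachometer signal opposes motion
        speed += drive * dt                             # motor modeled as a pure inertia
        angle += speed * dt
        peak = max(peak, angle)
    return angle, max(0.0, peak - target_deg)

print(run_servo(90.0, rate_gain=40.0))  # with rate feedback: settles on the target
print(run_servo(90.0, rate_gain=0.0))   # without it: swings far past the target
```

With the rate term removed, this idealized loop overshoots and keeps oscillating; with it, the shaft approaches the commanded angle smoothly.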
The equations
Although the CADC looks like an inscrutable conglomeration of tiny gears, it is possible to trace out the gearing and see exactly how it computes the air data functions. With considerable effort, I have reverse-engineered the mechanisms to create the diagram below, showing how each computation is broken down into mechanical steps. Each line indicates a particular value, specified by a shaft rotation. The ⊕ symbol indicates a differential gear, adding or subtracting its inputs to produce another value. The cam symbol indicates a cam coupled to a differential gear. Each cam computes either a specific function or an exponential, providing the value as a rotation. At the right, the outputs are either shaft rotations to the rest of the CADC or synchro outputs.
I'll go through each calculation briefly.
log static pressure
The static pressure is calculated by dividing the indicated static pressure by the pressure error correction factor. Since these values are all represented logarithmically, the division turns into a subtraction, performed by a differential gear. The output goes to two synchros, geared to provide coarse and fine outputs.13
\[log ~ P_s = log ~ P_{si} - log ~ P_{si} / P_s \]
Impact pressure
The impact pressure is the pressure due to the aircraft's speed, the difference between the total pressure and the static pressure. To compute the impact pressure, the log pressure values are first converted to linear values by exponentiation, performed by cams. The linear pressure values are then subtracted by a differential gear. Finally, the impact pressure is output through two synchros, coarse and fine in an 11:1 ratio.
\[ P_t - P_s = exp(log ~ P_t) - exp(log ~ P_s) \]
log pressure ratio
The log pressure ratio is the logarithm of \( P_t/P_s \), the ratio of total pressure to static pressure. This value is important because it is used to compute the Mach number, true airspeed, and log free air temperature. The Mach number is computed in the Mach section as described below. The true airspeed and log free air temperature are computed in the left section. The left section receives the log pressure ratio as a rotation. Since the left section and Mach section can be separated for maintenance, a direct shaft connection is not used. Instead, each section has a gear and the gears mesh when the sections are joined.
Computing the log pressure ratio is straightforward. Since the log total pressure and log static pressure are both available, subtracting the logs with a differential yields the desired value. That is,
\[log ~ P_t/P_s = log ~ P_t - log ~ P_s \]
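The three calculations so far (log static pressure, impact pressure, and log pressure ratio) can be written out as a short Python sketch. The function name and the sample pressures are hypothetical, and scaling constants are ignored, just as they are in the equations.

```python
import math

def mach_section(log_psi, log_correction, log_pt):
    """Idealized model; log_correction is log(P_si / P_s) from the compensator."""
    log_ps = log_psi - log_correction             # differential: division via logs
    impact = math.exp(log_pt) - math.exp(log_ps)  # two exponential cams, then a differential
    log_ratio = log_pt - log_ps                   # differential: division via logs
    return log_ps, impact, log_ratio

# Hypothetical pressures in inches of mercury.
p_si, p_s, p_t = 20.2, 20.0, 30.0
log_ps, impact, log_ratio = mach_section(math.log(p_si), math.log(p_si / p_s), math.log(p_t))
print(math.exp(log_ps), impact, math.exp(log_ratio))
```

Note that only the impact pressure needs the exponential cams; the other two values stay in logarithmic form from input to output.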
Mach number
The Mach number is defined in terms of \(P_t/P_s \), with separate cases for subsonic and supersonic:14
\[M<1:\]
\[~~~\frac{P_t}{P_s} = ( 1+.2M^2)^{3.5}\]
\[M > 1:\]
\[~~~\frac{P_t}{P_s} = \frac{166.9215M^7}{( 7M^2-1)^{2.5}}\]
Although these equations are very complicated, the solution is a function of one variable \(P_t/P_s\) so M can be computed with a single cam. In other words, the mathematics needed to be done when the CADC was manufactured, but once the cam exists, computing M is easy, using the log pressure ratio computed earlier:
\[ M = f(log ~ P_t / P_s) \]
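As an illustration of the function that the cam encodes, the Python sketch below inverts the two equations numerically with bisection. This is just my way of tabulating \(f\); I don't know how the cam profile was actually calculated, and the search range of Mach 0 to 5 is an arbitrary assumption.

```python
import math

def pressure_ratio(mach):
    """P_t / P_s as a function of Mach number, from the two equations above."""
    if mach <= 1.0:
        return (1 + 0.2 * mach**2) ** 3.5
    return 166.9215 * mach**7 / (7 * mach**2 - 1) ** 2.5

def mach_from_log_ratio(log_ratio, lo=0.0, hi=5.0):
    """Invert pressure_ratio by bisection; the ratio increases with Mach number."""
    target = math.exp(log_ratio)
    for _ in range(60):
        mid = (lo + hi) / 2
        if pressure_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for m in (0.5, 0.9, 1.0, 1.5, 2.0):
    print(m, round(mach_from_log_ratio(math.log(pressure_ratio(m))), 6))
```

The two cases agree at M = 1 (both give about 1.893), so the combined function is continuous and can be cut into a single cam.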
Conclusions
The CADC performs nonlinear calculations that seem way too complicated to solve with mechanical gearing. But reverse-engineering the mechanism shows how the equations are broken down into steps that can be performed with cams and differentials, using logarithms for multiplication and division. The diagram below shows the complex gearing in the Mach section. Each differential below corresponds to a differential in the earlier equation diagram.
Follow me on Twitter @kenshirriff or RSS for more reverse engineering. I'm also on Mastodon as @oldbytes.space@kenshirriff. Thanks to Joe for providing the CADC. Thanks to Nancy Chen for obtaining a hard-to-find document for me.15 Marc Verdiell and Eric Schlaepfer are working on the CADC with me. CuriousMarc's video shows the CADC in action:
Notes and references
-
My articles on the CADC are:
There is a lot of overlap between the articles, so skip over parts that seem repetitive :-) ↩
-
The static air pressure can also be provided by holes in the side of the pitot tube; this is the typical approach in fighter planes. ↩
-
Multiplying a rotation by a constant factor doesn't require a differential; it can be done simply with the ratio between two gears. (If a large gear rotates a small gear, the small gear rotates faster according to the size ratio.) Adding a constant to a rotation is even easier, just a matter of defining what shaft position indicates 0. For this reason, I will ignore constants in the equations. ↩
-
Strictly speaking, the output of the differential is the sum of the inputs divided by two. I'm ignoring the factor of 2 because the gear ratios can easily cancel it out. It's also arbitrary whether you think of the differential as adding or subtracting, since it depends on which rotation direction is defined as positive. ↩
-
The diagram below shows a typical cam function in more detail. The input is \(log~ dP/P_s\) and the output is \(log~M / \sqrt{1+.2KM^2}\). The small humped curve at the bottom is the cam correction. Although the input and output functions cover a wide range, the difference that is encoded in the cam is much smaller and drops to zero at both ends.
This diagram, from Patent 2969910, shows how a cam implements a complicated function.
-
Internally, a synchro has a moving rotor winding and three fixed stator windings. When AC is applied to the rotor, voltages are developed on the stator windings depending on the position of the rotor. These voltages produce a torque that rotates the synchros to the same position. In other words, the rotor receives power (26 V, 400 Hz in this case), while the three stator wires transmit the position. The diagram below shows how a synchro is represented schematically, with rotor and stator coils.
The schematic symbol for a synchro.
A control transformer has a similar structure, but the rotor winding provides an output, instead of being powered. ↩
-
Specifically, the left part of the CADC computes true airspeed, air density, total temperature, log true free air temperature, and air density × speed of sound. I discussed the left section in detail here. ↩
-
From the outside, the CADC is a boring black cylinder, with no hint of the complex gearing inside. The CADC is wired to the rest of the aircraft through round military connectors. The front panel interfaces these connectors to the D-sub connectors used internally. The two pressure inputs are the black cylinders at the bottom of the photo.
The exterior of the CADC. It is packaged in a rugged metal cylinder. It is sealed by a soldered metal band, so we needed a blowtorch to open it.
-
The concepts of position error correction are described here. ↩
-
The phase of the signal is 0° or 180°, depending on the direction of the error. In other words, the error signal is proportional to the driving AC signal in one direction and flipped when the error is in the other direction. This is important since it indicates which direction the motor should turn. When the error is eliminated, the signal is zero. ↩
-
I reverse-engineered the circuit board to create the schematic below for the amplifier. The idea is that one magnetic amplifier or the other is selected, depending on the phase of the error signal, causing the motor to turn counterclockwise or clockwise as needed. To implement this, the magnetic amplifier control windings are connected to opposite phases of the 400 Hz power. The transistor is connected to both magnetic amplifiers through diodes, so current will flow only if the transistor pulls the winding low during the half-cycle that the winding is powered high. Thus, depending on the phase of the transistor output, one winding or the other will be powered, allowing that magnetic amplifier to pass AC to the motor.
This reverse-engineered schematic probably has a few errors. Click the schematic for a larger version.
The CADC has four servo amplifiers: this one for pressure error correction, one for temperature, and two for pressure. The amplifiers have different types of inputs: the temperature input is the probe resistance, the pressure error correction uses an error voltage from the control transformer, and the pressure inputs are voltages from the inductive pickups in the sensor. The circuitry is roughly the same for each amplifier (a transistor amplifier driving two magnetic amplifiers), but the details are different. The largest difference is that each pressure transducer amplifier drives two motors (coarse and fine) so each has two transistor stages and four magnetic amplifiers. ↩
-
The basic idea of a magnetic amplifier is a controllable inductor. Normally, the inductor blocks alternating current. But applying a relatively small DC signal to a control winding causes the inductor to saturate, permitting the flow of AC. Since the magnetic amplifier uses a small signal to control a much larger signal, it provides amplification.
In the early 1900s, magnetic amplifiers were used in applications such as dimming lights. Germany improved the technology in World War II, using magnetic amplifiers in ships, rockets, and trains. The magnetic amplifier had a resurgence in the 1950s; the Univac Solid State computer used magnetic amplifiers (rather than vacuum tubes or transistors) as its logic elements. However, improvements in transistors made the magnetic amplifier obsolete except for specialized applications. (See my IEEE Spectrum article on magnetic amplifiers for more history of magnetic amplifiers.) ↩
-
The CADC specification defines how the parameter values correspond to rotation angles of the synchros. For instance, for the log static pressure synchros, the CADC supports the parameter range 0.8099 to 31.0185 inches of mercury. The spec defines the corresponding synchro outputs as 16,320° rotation of the fine synchro and 175.48° rotation of the coarse synchro over this range. The synchro null point corresponds to 29.92 inches of mercury (i.e. zero altitude). The fine synchro is geared to rotate 93 times as fast as the coarse synchro, so it rotates over 45 times during this range, providing higher resolution than a single synchro would provide. The other synchro pairs use a much smaller 11:1 ratio; presumably high accuracy of the static pressure was important. ↩
-
Although the CADC's equations may seem ad hoc, they can be derived from fluid dynamics principles. These equations were standardized in the 1950s by various government organizations including the National Bureau of Standards and NACA (the precursor of NASA). ↩
-
It was very difficult to find information about the CADC. The official military specification is MIL-C-25653C(USAF). After searching everywhere, I was finally able to get a copy from the Technical Reports & Standards unit of the Library of Congress. The other useful document was in an obscure conference proceedings from 1958: "Air Data Computer Mechanization" (Hazen), Symposium on the USAF Flight Control Data Integration Program, Wright Air Dev Center US Air Force, Feb 3-4, 1958, pp 171-194. ↩
Subject: Inside the mechanical Bendix Air Data Computer, part 5: motor/tachometers
The servomotors in the CADC are unlike standard motors. Their name ("Motor-Tachometer Generator" or "Motor and Rate Generator"1) indicates that each unit contains both a motor and a speed sensor. Because the motor and generator use two-phase signals, there are a total of eight colorful wires coming out, many more than a typical motor. Moreover, the direction of the motor can be controlled, unlike typical AC motors. I couldn't find a satisfactory explanation of how these units worked, so I bought one and disassembled it. This article (part 5 of my series on the CADC2) provides a complete teardown of the motor/generator and explains how it works.
The image below shows a closeup of two motors powering one of the pressure signal outputs. Note the bundles of colorful wires to each motor, entering in two locations. At the top, the motors drive complex gear trains. The high-speed motors are geared down by the gear trains to provide much slower rotations with sufficient torque to power the rest of the CADC's mechanisms.
The motor/tachometer that we disassembled is shorter than the ones in the CADC (despite having the same part number), but the principles are the same. We started by removing a small C-clip on the end of the motor and unscrewing the end plate. The unit is pretty simple mechanically. It has bearings at each end for the rotor shaft. There are four wires for the motor and four wires for the tachometer.3
The rotor (below) has two parts on the shaft. The left part is for the motor and the right drum is for the tachometer. The left part is a squirrel-cage rotor4 for the motor. It consists of conducting bars (light-colored) on an iron core. The conductors are all connected at both ends by the conductive rings at either end. The metal drum on the right is used by the tachometer. Note that there are no electrical connections between the rotor components and the rest of the motor: there are no brushes or slip rings. The interaction between the rotor and the windings in the body of the motor is purely magnetic, as will be explained.
The motor/tachometer contains two cylindrical stators that create the magnetic fields, one for the motor and one for the tachometer. The photo below shows the motor stator inside the unit after removing the tachometer stator. The stators are encased in hard green plastic and tightly pressed inside the unit. In the center, eight metal poles are visible. They direct the magnetic field onto the rotor.
The photo below shows the stator for the tachometer, similar to the stator for the motor. Note the shallow notches that look like black lines in the body on the lower left. These are probably adjustments to the tachometer during manufacturing to compensate for imperfections. The adjustments ensure that the magnetic fields are nulled out so the tachometer returns zero voltage when stationary. The metal plate on top shields the tachometer from the motor's magnetic fields.
The poles and the metal case of the stator look solid, but they are not. Instead, they are formed from a stack of thin laminations. The reason to use laminations instead of solid metal is to reduce eddy currents in the metal. Each lamination is varnished, so it is insulated from its neighbors, preventing the flow of eddy currents.
In the photo below, I removed some of the plastic to show the wire windings underneath. The wires look like bare copper, but they have a very thin layer of varnish to insulate them. There are two sets of windings (orange and blue, or red and black) around alternating metal poles. Note that the wires run along the pole, parallel to the rotor, and then wrap around the pole at the top and bottom, forming oblong coils around each pole.5 This generates a magnetic field through each pole.
The motor
The motor part of the unit is a two-phase induction motor with a squirrel-cage rotor.6 There are no brushes or electrical connections to the rotor, and there are no magnets, so it isn't obvious what makes the rotor rotate. The trick is the "squirrel-cage" rotor, shown below. It consists of metal bars that are connected at the top and bottom by rings. Assume (for now) that the fixed part of the motor, the stator, creates a rotating magnetic field. The important principle is that a changing magnetic field will produce a current in a wire loop.7 As a result, each loop in the squirrel-cage rotor will have an induced current: current will flow up9 the bars facing the north magnetic field and down the south-facing bars, with the rings on the end closing the circuits.
The next important principle is that current flowing through a wire produces a magnetic field.8 As a result, the currents in the squirrel cage rotor produce a magnetic field perpendicular to the cage. This magnetic field causes the rotor to turn in the same direction as the stator's magnetic field, driving the motor. Because the rotor is powered by the induced currents, the motor is called an induction motor. But how does the stator produce a rotating magnetic field? And how do you control the direction of rotation?
The diagram below shows how the motor is wired, with a control winding and a reference winding. Both windings are powered with AC, but the control voltage either lags the reference voltage by 90° or leads it by 90°, due to the capacitor. Suppose the current through the control winding lags by 90°. First, the reference voltage's sine wave will have a peak, producing the magnetic field's north pole at A. Next (90° later), the control voltage will peak, producing the north pole at B. The reference voltage will go negative, producing a south pole at A and thus a north pole at C. The control voltage will go negative, producing a south pole at B and a north pole at D. This cycle will repeat, with the magnetic field rotating counter-clockwise from A to D. Conversely, if the control voltage leads the reference voltage, the magnetic field will rotate clockwise. This causes the motor to spin in one direction or the other, with the direction controlled by the control voltage. (The motor has four poles for each winding, rather than the one shown below; this increases the torque and reduces the speed.)
The purpose of the capacitor is to provide the 90° phase shift so the reference voltage and the control voltage can be driven from the same single-phase AC supply (in this case, 26 volts, 400 hertz). Switching the polarity of the control voltage reverses the direction of the motor.
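The rotating field can be checked with a few lines of Python. This is a simplified model with assumptions of my own: one pole per winding, the reference winding along the x axis, the control winding along the y axis, and ideal 90° phase shifts.

```python
import math

def field_angle(phase_deg, control_lags=True):
    """Direction of the net stator field at a given point in the AC cycle.

    Assumes the reference winding lies along x and the control winding along y.
    """
    theta = math.radians(phase_deg)
    shift = -math.pi / 2 if control_lags else math.pi / 2
    ref = math.cos(theta)          # reference current peaks at 0 degrees
    ctl = math.cos(theta + shift)  # control current peaks 90 degrees later or earlier
    return math.degrees(math.atan2(ctl, ref)) % 360.0

for phase in (45, 135, 225, 315):
    print(phase, round(field_angle(phase, True), 1), round(field_angle(phase, False), 1))
```

When the control current lags, the field angle increases through the cycle (counter-clockwise); when it leads, the angle decreases (clockwise), which is how flipping the control polarity reverses the motor.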
There are a few interesting things about induction motors. You might expect that the motor would spin at the same rate as the rotating magnetic field. However, this is not the case. Remember that a changing magnetic field induces the current in the squirrel-cage rotor. If the rotor is spinning at the same rate as the magnetic field, the rotor will encounter an unchanging magnetic field and there will be no current in the bars of the rotor. As a result, the rotor will not generate a magnetic field and there will be no torque to rotate it. The consequence is that the rotor must spin somewhat slower than the magnetic field. This is called "slippage" and is typically a few percent of the full speed, with more slippage as more torque is required.
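For a rough sense of the numbers, the standard formula for the speed of the rotating field is 120 × frequency / poles, in RPM. The Python sketch below applies it; treating this unit as a four-pole machine and picking 5% slippage are my own assumptions for illustration, not measurements.

```python
def synchronous_rpm(frequency_hz, poles):
    """Speed of the rotating magnetic field in RPM."""
    return 120.0 * frequency_hz / poles

def rotor_rpm(frequency_hz, poles, slip):
    """Rotor speed when it lags the field by the given fractional slip."""
    return synchronous_rpm(frequency_hz, poles) * (1.0 - slip)

# Hypothetical example: 400 Hz supply, four poles, 5% slippage.
print(synchronous_rpm(400.0, 4))
print(rotor_rpm(400.0, 4, 0.05))
```

Whatever the exact pole count, a 400 Hz supply gives a fast-spinning field, which fits with the motors being geared down heavily before driving the CADC's mechanisms.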
Many household appliances use induction motors, but how do they generate a rotating magnetic field from a single-phase AC winding? The problem is that the magnetic field in a single AC winding will just flip back and forth, so the motor will not turn in either direction. One solution is a shaded-pole motor, which puts a copper bar around part of each pole to break the symmetry and produce a weakly rotating magnetic field. More powerful induction motors use a startup winding with a capacitor (analogous to the control winding). This winding can either be switched out of the circuit once the motor starts spinning,10 or used continuously, called a permanent-split capacitor (PSC) motor. The best solution is three-phase power (if available); a three-phase winding automatically produces a rotating magnetic field.
Tachometer/generator
The second part of the unit is the tachometer generator, sometimes called the rate unit.11 The purpose of the generator is to produce a voltage proportional to the speed of the shaft. The unusual thing about this generator is that it produces a 400-hertz output that is either in phase with the input or 180° out of phase. This is important because the phase indicates which direction the shaft is turning. Note that a "normal" generator is different: the output frequency is proportional to the speed.
The diagram below shows the principle behind the generator. It has two stator windings: the reference coil that is powered at 400 Hz, and the output coil that produces the output signal. When the rotor is stationary (A), the magnetic flux is perpendicular to the output coil, so no output voltage is produced. But when the rotor turns (B), eddy currents in the rotor distort the magnetic field. It now couples with the output coil, producing a voltage. As the rotor turns faster, the magnetic field is distorted more, increasing the coupling and thus the output voltage. If the rotor turns in the opposite direction (C), the magnetic field couples with the output coil in the opposite direction, inverting the output phase. (This diagram is more conceptual than realistic, with the coils and flux 90° from their real orientation, so don't take it too seriously. As shown earlier, the coils are perpendicular to the rotor so the real flux lines are completely different.)
But why does the rotating drum change the magnetic field? It's easier to understand by considering a tachometer that uses a squirrel-cage rotor instead of a drum. When the rotor rotates, currents will be induced in the squirrel cage, as described earlier with the motor. These currents, in turn, generate a perpendicular magnetic field, as before. This magnetic field, perpendicular to the original field, will be aligned with the output coil and will be picked up. The strength of the induced field (and thus the output voltage) is proportional to the speed, while the direction of the field depends on the direction of rotation. Because the primary coil is excited at 400 hertz, the currents in the squirrel cage and the resulting magnetic field also oscillate at 400 hertz. Thus, the output is at 400 hertz, regardless of the input speed.
Using a drum instead of a squirrel cage provides higher accuracy because there are no fluctuations due to the discrete bars. The operation is essentially the same, except that the currents pass through the metal of the drum continuously instead of through individual bars. The result is eddy currents in the drum, producing the second magnetic field. The diagram below shows the eddy currents (red lines) from a metal plate moving through a magnetic field (green), producing a second magnetic field (blue arrows). For the rotating drum, the situation is similar except the metal surface is curved, so both field arrows will have a component pointing to the left. This creates the directed magnetic field that produces the output.
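The generator's behavior (fixed 400 Hz output, amplitude proportional to speed, phase flipped by direction) can be summarized in a few lines of Python. This is an idealized sketch: the scale factor `VOLTS_PER_RPM` is invented, and the `demodulate` function is only there to show that the signed speed is recoverable from the output and the reference; it is not a claim about how the CADC processes the signal.

```python
import math

FREQ_HZ = 400.0        # excitation frequency from the article
VOLTS_PER_RPM = 0.001  # made-up scale factor, purely illustrative

def tach_output(rpm, t):
    """Instantaneous output: amplitude follows speed, sign (phase) follows direction."""
    return VOLTS_PER_RPM * rpm * math.sin(2 * math.pi * FREQ_HZ * t)

def demodulate(rpm, samples=400):
    """Recover signed speed by multiplying by the reference and averaging one cycle."""
    period = 1.0 / FREQ_HZ
    total = 0.0
    for i in range(samples):
        t = i * period / samples
        ref = math.sin(2 * math.pi * FREQ_HZ * t)
        total += tach_output(rpm, t) * ref
    # The average of sin^2 over a full cycle is 1/2, so scale by 2.
    return 2 * total / samples / VOLTS_PER_RPM

print(demodulate(3000), demodulate(-3000), demodulate(0))
```

A negative speed is the same sine wave multiplied by -1, which is exactly a 180° phase shift at the same frequency.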
The servo loop
The motor/generator is called a servomotor because it is used in a servo loop, a control system that uses feedback to obtain precise positioning. In particular, the CADC uses the rotational position of shafts to represent various values. The servo loops convert the CADC's inputs (static pressure, dynamic pressure, temperature, and pressure correction) into shaft positions. The rotations of these shafts power the gears, cams, and differentials that perform the computations.
The diagram below shows a typical servo loop in the CADC. The goal is to rotate the output shaft to a position that exactly matches the input voltage. To accomplish this, the output position is converted into a feedback voltage by a potentiometer that rotates as the output shaft rotates.12 The error amplifier compares the input voltage to the feedback voltage and generates an error signal, rotating the servomotor in the appropriate direction. Once the output shaft is in the proper position, the error signal drops to zero and the motor stops. To improve the dynamic response of the servo loop, the tachometer signal is used as a negative feedback voltage. This ensures that the motor slows as the system gets closer to the right position, so the motor doesn't overshoot the position and oscillate. (This is sort of like a PID controller.)
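The effect of the tachometer feedback can be seen in a toy simulation. The sketch below is a deliberately simplified model, not the CADC's circuit: the drive signal is treated directly as shaft acceleration, there is no friction, and the gains, time step, and the name `simulate_servo` are all arbitrary choices for illustration.

```python
def simulate_servo(target, tach_gain, gain=40.0, dt=0.001, steps=5000):
    """Step a toy position servo and return (final_position, peak_position).

    drive = gain * (target - position) - tach_gain * velocity
    The first term is the error signal; the second is the tachometer feedback.
    """
    position = 0.0
    velocity = 0.0
    peak = 0.0
    for _ in range(steps):
        error = target - position
        drive = gain * error - tach_gain * velocity
        velocity += drive * dt          # drive treated as acceleration (assumption)
        position += velocity * dt
        peak = max(peak, position)
    return position, peak

for tach_gain in (0.0, 14.0):
    final, peak = simulate_servo(1.0, tach_gain)
    print(f"tach gain {tach_gain:4.1f}: final {final:.3f}, peak {peak:.3f}")
```

With no tachometer term, this frictionless model overshoots to about twice the target and keeps oscillating. With the tachometer term, the shaft approaches the target without overshooting, which is the damping role described above.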
The error amplifier and motor drive circuit for a pressure transducer are shown below. Because of the state of electronics at the time, it took three circuit boards to implement a single servo loop. The amplifier was implemented with germanium transistors (since silicon transistors came later). The transistors weren't powerful enough to drive the motors directly. Instead, magnetic amplifiers (the yellow transformer-like modules at the front) powered the servomotors. The large rectangular capacitors on the right provided the phase shift required for the control voltage.
Conclusions
The Bendix CADC used a variety of electromechanical devices including synchros, control transformers, servo motors, and tachometer generators. These were expensive military-grade components driven by complex electronics. Nowadays, you can get a PWM servo motor for a few dollars with the gearing, feedback, and control circuitry inside the motor housing. These motors are widely used for hobbyist robotics, drones, and other applications. It's amazing that servo motors have gone from specialized avionics hardware to an easy-to-use, inexpensive commodity.
Follow me on Twitter @kenshirriff or RSS for updates. I'm also on Mastodon as @kenshirriff@oldbytes.space. Thanks to Joe for providing the CADC. Thanks to Marc Verdiell for disassembling the motor.
Notes and references
-
The two types of motors in the CADC are part number "FV-101-19-A1" and part number "FV-101-5-A1" (or FV101-5A1). They are called either a "Tachometer Rate Generator" or "Tachometer Motor Generator", with both names applied to the same part number. The "19" and "5" units look the same, with the "19" used for one pressure servo loop and the "5" used everywhere else.
The motor that I got is similar to the ones in the CADC, but shorter. The difference in size is mysterious since both have the Bendix part number FV-101-5-A1.
For reference, the motor I disassembled is labeled:
Cedar Division Control Data Corp. ST10162 Motor Tachometer F0: 26V C0: 26V TACH: 18V 400 CPS DSA-400-70C-4651 FSN6105-581-5331 US BENDIX FV-101-5-A1
I wondered why the motor listed both Control Data and Bendix. In 1952, the Cedar Engineering Company was spun off from the Minneapolis Honeywell Regulator Company (better known as Honeywell, the name it took in 1964). Cedar Engineering produced motors, servos, and aircraft actuators. In 1957, Control Data bought Cedar Engineering, which became the Cedar Division of CDC. Then, Control Data acquired Bendix's computer division in 1963. Thus, three companies were involved. ↩
-
My previous articles on the CADC are:
↩ -
From testing the motor, here is how I believe it is wired:
Motor reference (power): red and black
Motor control: blue and orange
Generator reference (power): green and brown
Generator out: white and yellow ↩ -
The bars on the squirrel-cage rotor are at a slight angle. Parallel bars would go in and out of alignment with the stator, causing fluctuations in the force, while the angled bars avoid this problem. ↩
-
This cross-section through the stator shows the windings. On the left, each winding is separated into the parts on either side of the pole. On the right, you can see how the wires loop over from one side of the pole to the other. Note the small circles in the 12 o'clock and 9 o'clock positions: cross sections of the input wires. The individual horizontal wires near the circumference connect alternating windings.
A cross-section of the stator, formed by sanding down the plastic on the end. -
It's hard to find explanations of AC servomotors since they are an old technology. One discussion is in Electromechanical components for servomechanisms (1961). This book points out some interesting things about a servomotor. The stall torque is proportional to the control voltage. Servomotors are generally high-speed, but low-torque devices, heavily geared down. Because of their high speed and their need to change direction, rotational inertia is a problem. Thus, servomotors typically have a long, narrow rotor compared with typical motors. (You can see in the teardown photo that the rotor is long and narrow.) Servomotors are typically designed with many poles (to reduce speed) and smaller air gaps to increase inductance. These small airgaps (e.g. 0.001") require careful manufacturing tolerance, making servomotors a precision part. ↩
-
The principle is Faraday's law of induction: "The electromotive force around a closed path is equal to the negative of the time rate of change of the magnetic flux enclosed by the path." ↩
-
Ampère's law states that "the integral of the magnetizing field H around any closed loop is equal to the sum of the current flowing through the loop." ↩
-
The direction of the current flow (up or down) depends on the direction of rotation. I'm not going to worry about the specific direction of current flow, magnetic flux, and so forth in this article. ↩
-
Once an induction motor is spinning, it can be powered from a single AC phase since the rotor is rotating with respect to the magnetic field. This works for the servomotor too. I noticed that once the motor is spinning, it can operate without the control voltage. This isn't the normal way of using the motor, though. ↩
-
A long discussion of tachometers is in the book Electromechanical Components for Servomechanisms (1961). The AC induction-generator tachometer is described starting on page 193.
For a mathematical analysis of the tachometer generator, see Servomechanisms, Section 2, Measurement and Signal Converters, MCP 706-137, U.S. Army. This source also discusses sources of errors in detail. Inexpensive tachometer generators may have an error of 1-2%, while precision devices can have an error of about 0.1%. Accuracy is worse for small airborne generators, though. Since the Bendix CADC uses the tachometer output for damping, not as a signal output, accuracy is less important. ↩
-
Different inputs in the CADC use different feedback mechanisms. The temperature servo uses a potentiometer for feedback. The angle of attack correction uses a synchro control transformer, which generates a voltage based on the angle error. The pressure transducers contain inductive pickups that generate a voltage based on the pressure error. For more details, see my article on the CADC's pressure transducer servo circuits. ↩
Subject: The first microcomputer: The transfluxor-powered Arma Micro Computer from 1962
Obviously, the Arma Micro Computer is not a microcomputer according to modern definitions, since its processor was made from discrete components. But it's an interesting computer in many ways. First, it is an example of the aerospace computers of the 1960s, advanced systems that are now almost entirely forgotten. People think of 1960s computers as room-filling mainframes, but there was a whole separate world of cutting-edge miniaturized aerospace computers. (Taking up just 0.4 cubic feet, the Arma Micro Computer was smaller than an Apple II.) Second, the Arma Micro Computer used strange components such as transfluxors and had an unusual 22-bit serial architecture. Finally, the Arma Micro Computer evolved into a series of computers used on Navy ships and submarines, the E-2C Hawkeye airborne early warning plane, the Concorde, and even Air Force One.
The Arma Micro Computer
The Micro Computer used 22-bit words, which may seem like a strange size from the modern perspective. But there's no inherent need for a word size to be a power of 2. In particular, the Micro Computer was designed for mathematical calculations, not dealing with 8-bit characters. The word size was selected to provide enough accuracy for its navigational tasks.
Another strange aspect of the Micro Computer is that it was a serial machine, sequentially operating on one bit of a word at a time.2 This approach was often used in early machines because it substantially reduced the amount of hardware required: it only needs a 1-bit data bus and a 1-bit ALU. The downside is that a serial machine is much slower because each 22-bit word takes 22 clock cycles (plus 5 cycles of overhead). As a result, the Micro Computer executed just 36000 operations per second, despite its 1 megahertz clock speed.
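The throughput figure follows from simple arithmetic on the numbers above. Here is a short Python sketch, assuming one operation per word time (22 data cycles plus 5 overhead cycles); the function name `ops_per_second` is just for illustration.

```python
def ops_per_second(clock_hz, word_bits=22, overhead_cycles=5):
    """Operations per second for a serial machine that spends one word time per operation."""
    cycles_per_word = word_bits + overhead_cycles   # 27 clock cycles
    return clock_hz / cycles_per_word

print(round(ops_per_second(1_000_000)))   # nominal 1 MHz clock
print(ops_per_second(972_000))            # clock rate that gives exactly 36,000
```

A nominal 1 MHz clock gives about 37,000 operations per second, in the same ballpark as the quoted 36,000. Working backward, 36,000 operations at 27 cycles each corresponds to 972,000 cycles per second.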
The Micro Computer had a small instruction set of 19 instructions.3 It included multiply, divide, and square root, instructions that weren't implemented in early microprocessors. This illustrates how early microprocessors were a significant step backward in functionality. Moreover, the multiply, divide, and square root instructions used a separate arithmetic unit, so they could execute in parallel with other arithmetic instructions. Because the Micro Computer needed to interact with spacecraft systems, it had a focus on I/O, with 120 digital inputs or outputs, configured as needed for a particular mission.
Circuits
The Micro Computer was built from silicon transistors and diodes, using diode-transistor logic. The construction technique was somewhat unusual. The basic circuits were the flip-flop, the complementary buffer (i.e. an inverter), and the diode gate. Each basic circuit was constructed on a small wafer, 0.77 inches on a side.5 The photo below shows wafers for a two-transistor flip-flop and two diode gates. Each wafer had up to 16 connection tabs on the edges. These wafers are analogous to integrated circuits, but constructed from discrete components.
The wafers were mounted on printed circuit boards, with up to 22 wafers on a board. Pairs of boards were mounted back to back with polyurethane foam between the boards to form a "sandwich", which was conformally coated. The result was a module that was protected against the harsh environment of a missile or spacecraft. The computer could handle a shock of 100 g's and temperatures of 0°C to 85°C as well as 100% humidity or a vacuum.
Because the Micro Computer was a serial machine, its bits were constantly moving. For register storage such as the accumulator, it used six magnetostrictive torsional delay lines, storing a sequence of bits as physical twists that formed pulses racing through a long coil of wire.
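The idea of a register whose bits are "constantly moving" can be modeled as a queue that feeds its output back into its input. The Python sketch below is a logical toy under that assumption; the class name `DelayLineRegister` is invented, and it ignores the physics and timing of a real magnetostrictive line.

```python
from collections import deque

class DelayLineRegister:
    """Toy recirculating register: bits emerge from one end and re-enter the other."""

    def __init__(self, bits):
        self.line = deque(bits)

    def tick(self, new_bit=None):
        """Advance one bit time; the emerging bit recirculates unless new_bit replaces it."""
        out = self.line.popleft()
        self.line.append(out if new_bit is None else new_bit)
        return out

reg = DelayLineRegister([1, 0, 1, 1])           # a 4-bit word for brevity
print([reg.tick() for _ in range(4)])           # read the word; contents are preserved
print([reg.tick(b) for b in (0, 0, 0, 1)])      # read again while writing a new word
print(list(reg.line))
```

A bit is only available at the moment it emerges from the line, which is why a serial machine processes a word one bit per clock.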
The photo below shows the Arma Micro Computer with the case removed. If you look closely, you can see the 22 small circuit wafers mounted on each printed circuit board. The memory driver boards and delay lines are towards the back, spaced more widely than the other printed circuit boards. The cable harness underneath the boards provides the connections between boards.4
Transfluxors
One of the most unusual parts of the Micro Computer was its storage. Computers at the time typically used magnetic core memory, with each bit stored in a tiny ferrite ring, magnetized either clockwise or counterclockwise to store a 0 or 1. One drawback of standard core memory was that the process of reading a core also cleared the core, requiring data to be written back after a read.
The Micro Computer used ferrite cores, but these were "two-aperture" cores, with a larger hole and a smaller hole, as shown above. Data is written to the "major aperture" and read from the "minor aperture". Although the minor aperture switches state and is erased during a read, the major aperture retains the bit, allowing the minor aperture to be switched back to its original state. Thus, unlike regular core memory, transfluxors don't lose their data when reading.
The resulting system is called non-destructive readout (NDRO), compared to the destructive readout (DRO) of regular core memory.6 The Micro Computer used non-destructive readout memory to ensure that the program memory remained uncorrupted. In contrast, if a program is stored in regular core memory, each instruction must be written back as it is executed, creating the possibility that a transient could corrupt the software. By using transfluxors, this possibility of error is eliminated. (In either case, core memory has the convenient property that data is preserved when power is removed, since data is stored magnetically. With modern semiconductor memory, you lose data when the power goes off.)
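The difference between the two readout schemes can be expressed as a small logical model. This Python sketch captures only the behavior described above (what survives a read), not the magnetics; the class names and the `write_back` flag, which stands in for an interrupted restore cycle, are assumptions for illustration.

```python
class DestructiveCore:
    """Ordinary core: sensing a bit clears it, so the reader must write it back."""

    def __init__(self, bit):
        self.bit = bit

    def read(self, write_back=True):
        value = self.bit
        self.bit = 0              # the read pulse clears the core
        if write_back:
            self.bit = value      # restore cycle; if it fails, the bit is lost
        return value

class Transfluxor:
    """Two-aperture core: the major aperture holds the bit, the minor one is sensed."""

    def __init__(self, bit):
        self.major = bit
        self.minor = bit

    def read(self):
        value = self.minor
        self.minor = 0            # sensing disturbs only the minor aperture
        self.minor = self.major   # the retained major-aperture state restores it
        return value

core = DestructiveCore(1)
print(core.read(write_back=False), core.bit)   # the bit is gone after a failed restore
cell = Transfluxor(1)
print(cell.read(), cell.read(), cell.major)    # the stored bit survives repeated reads
```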
The photo below shows a compact transfluxor-based storage module used in the Micro Computer, holding 512 words. In total, the computer could hold up to 7808 words of program memory and 256 words of data memory. It appears that transfluxors didn't live up to their promise, since most computers used regular core memory until semiconductor memory took over in the early 1970s.
Arma's history and the path to the Micro Computer
The Arma Engineering Company was founded in 1918 and built advanced military equipment.7 Its first product was a searchlight for the Navy, followed by a gyroscopic compass and analog computers for naval gun targeting. In 1939, Arma produced the Torpedo Data Computer, a remarkable electromechanical analog computer. US submarines used this computer to track target ships and automatically aim torpedoes. The Torpedo Data Computer performed complex trigonometric calculations and integration to account for the motion of the target ship and the submarine. While the Torpedo Data Computer performed well, the Navy's Mark 14 torpedo had many problems (running too deep, exploding too soon, or failing to explode), making torpedoes often ineffectual even with a perfect hit.
Arma underwent major corporate changes due to World War II. Before the war, the German-owned Bosch Company built vehicle starters and aircraft magnetos in the United States. When the US entered World War II in 1941, the government was concerned that a German-controlled company was manufacturing key military hardware, so the Office of Alien Property Custodian took over the Bosch plant. In 1948, the banking group that controlled Arma bought Bosch from the Office of Alien Property Custodian, merging them into the American Bosch Arma Corporation (AMBAC).8 (Arma had earlier received the rights to gyrocompass technology from the German Anschutz company, seized by the Navy after World War I, so Arma benefitted twice from wartime government seizures.)
In the mid-1950s, Arma moved into digital computers, building an inertial guidance computer for the Atlas nuclear missile program. America's first ICBM was the Atlas missile, which became operational in 1959. The first Atlas missiles used radio guidance from the launch site to direct the missile. Since radio signals could be jammed by the enemy, this wasn't a robust solution.
The solution to missile guidance was an inertial navigation system. By using sensitive gyroscopes and accelerometers, a missile could continuously track its position and velocity without any external input, making it unjammable. A key developer of this system was Arma's Wen Tsing Chow, one of the driving forces behind digital aviation computers. He faced extreme skepticism in the 1950s for the idea of putting a computer in a missile. One general mocked him, asking "Where are you going to put the five Harvard professors you'll need to keep it running?" But computerized navigation was successful and in 1961, the Atlas missile was updated to use the Arma inertial guidance computer. It was said to be the first production airborne digital computer.9 Wen Tsing Chow also invented the programmable read-only memory (PROM), allowing missile targeting information to be programmed into a computer outside the factory.
The photo below shows the Atlas ICBM's guidance system. The Arma W-107A computer is at the top and the gyroscopes are in the middle. This computer was an 18-bit serial machine running at 143.36 kHz. It ran a hard-wired program that integrated the accelerometer information and solved equations for the crossrange error function, range error function, and gravity, making these computations every half second.10 The computer weighed 240 pounds and consumed 1000 watts. The computer contained about 36,000 components: discrete transistors, diodes, resistors, and capacitors mounted on 9.5" × 6.5" printed-circuit boards. On the ground, the computer was air-cooled to 55 °F, but there was no cooling after launch as the computer only operated for five minutes of powered flight and wouldn't overheat during that time.
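The core of inertial navigation is integrating acceleration twice to get velocity and position. The Python sketch below shows that idea with a crude rectangular integration at the half-second update interval mentioned above. It is only an illustration of the principle: the actual Atlas computations (crossrange and range error functions, gravity) are not reproduced, and the function name and units are assumptions.

```python
def integrate_accel(samples, dt=0.5):
    """Accumulate velocity and position from acceleration samples taken every dt seconds."""
    velocity = 0.0
    position = 0.0
    for accel in samples:
        velocity += accel * dt       # first integration: acceleration to velocity
        position += velocity * dt    # second integration: velocity to position
    return velocity, position

# Two seconds of constant 10 m/s^2 acceleration, sampled every half second
print(integrate_accel([10.0, 10.0, 10.0, 10.0]))
```

With such a coarse step, the position estimate (25 m here, versus 20 m from the exact formula) shows why the integration method and update rate matter for accuracy.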
The Atlas wasn't originally designed for a computerized guidance system so there wasn't room inside the missile for the computer. To get around this, a large pod was stuck on the side of the missile to hold the computer and gyroscopes, as indicated in the photo below. This doesn't look aerodynamic, but I guess it worked.
The Atlas guidance computer (left, below) consisted of three aluminum sections called "decks". The top deck held two replaceable target constant units, each providing 54 navigation constants that specified a target. The constants were stored in a stack of printed circuit boards 16" × 8" × 1.5", covered in over a thousand diodes, Wen Tsing Chow's PROM memory. A target was programmed into the stack by a rack of equipment that would selectively burn out diodes, changing the corresponding bit to a 1. (This is why programming a PROM is referred to as "burning the PROM".11) The diode matrix was later replaced with a transfluxor memory array, which had the advantage that it could be reprogrammed as necessary. The top deck also had connectors for the accelerometer inputs, the outputs, and connections for ground support equipment. The bottom deck had power connectors for 28 volts DC and 115V 400 Hz 3-phase AC. In the bottom deck, quartz delay lines were used for storage, representing bits as acoustic waves. Twelve circuit cards, each with a faceted quartz block four inches in diameter, provided a total of 32 words of storage.
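The one-way nature of a diode-matrix PROM (a burned-out diode can't be restored) can be sketched in Python. This is a logical model only: following the description above, each bit starts at 0 and burning changes it to 1. The 54-word size matches the number of constants per target unit, but the 18-bit word width and the class name `DiodePROM` are assumptions for illustration.

```python
class DiodePROM:
    """One-time-programmable store: bits start at 0 and burning can only set them to 1."""

    def __init__(self, words, bits):
        self.bits = bits
        self.cells = [0] * words

    def burn(self, address, value):
        if value >> self.bits:
            raise ValueError("value does not fit in the word")
        if self.cells[address] & ~value:
            raise ValueError("cannot restore a diode that is already burned out")
        self.cells[address] |= value

    def read(self, address):
        return self.cells[address]

prom = DiodePROM(words=54, bits=18)   # word width is an assumption
prom.burn(0, 0b101)
prom.burn(0, 0b111)                   # burning one more diode is allowed
print(prom.read(0))
```

Trying to burn `0b001` into that word afterward raises an error, since it would require un-burning diodes. This is the limitation that a reprogrammable transfluxor array avoids.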
Arma considered the Micro Computer the third generation of its airborne computers. The first generation was the Atlas guidance computer, constructed from germanium transistors and diodes (in the pre-silicon era). The second-generation computer moved to silicon transistors and diodes. The third-generation computers still used discrete components, but mounted on the small square wafers. The third generation also had a general-purpose architecture and programmable transfluxor memory instead of a hard-wired program.
After the Micro Computer
Arma continued to develop computers, improving the Arma Micro Computer. The Micro C computer (1965) was developed for Navy ships and submarines. Much like the original Micro, the Micro C used transfluxor storage, but increased the clock frequency to 972 kHz. The computer was much larger: 3.87 cubic feet and 150 pounds. This description states that "the machine is an outgrowth of the ARMA product line of micro computers and is logically and electrically similar to micro-computers designed for missile environments."
In mid-1966, Arma introduced the Micro D computer, built from TTL integrated circuits. Like the original Micro, this computer was serial, but the Micro D had a word length of 18 bits and ran at 1.5 MHz. It weighed 5.25 pounds and was very compact, just 0.09 ft³. Instead of transfluxors, the Micro D used regular magnetic core memory, 4K to 31K words.
The widely-used Litton LTN-51 inertial navigation system was built around the Arma Micro-D computer.12 This navigation system was designed for commercial aircraft, but was also used for military applications, ships, and NASA aircraft. Aircraft from early Concordes to Air Force One used the LTN-51 for navigation. The photo below shows a navigation unit with the Arma Micro-D computer in the lower left and the gyroscope unit on the right.
In early 1968, the Arma Portable Micro D was introduced, a 14-pound battery-powered computer also called the Celestial Data Processor. This handheld computer was designed for navigation in crewed earth orbital flight, determining orbital parameters from stadimeter and sextant measurements performed by astronauts. As far as I can tell, this computer never made it beyond the prototype stage.
Conclusions
The Arma Micro Computer is just one of the dozens of compact aerospace computers of the 1960s, a category that is mostly forgotten and ignored. Another example is the Delco MAGIC I (1961), said to be the "first complete airborne computer to have its logic functions mechanized exclusively with integrated circuits". IBM's 4 Pi series started in 1966 and was used in many systems from the F-15 to the Space Shuttle. By 1968, denser MOS/LSI chips were used in general-purpose aerospace computers such as the Rockwell MOS GP and the Texas Instruments Model 2502 LSI Computer.13
Arma also illustrates that a company can be on the cutting edge of technology for decades and then suddenly go out of business and be forgotten. After some struggles, Arma was acquired by United Technologies in 1978 for $210 million and was then shut down in 1982. (The German Bosch corporation remains, now a large multinational known for products such as dishwashers, auto parts, and power tools.) Looking at a list of aerospace computers shows many innovative but vanished companies: Univac, Burroughs, Sperry (now all Unisys), AC Electronics (now part of Raytheon), Autonetics (acquired by Boeing), RCA (bought by GE), and TRW (acquired by Northrop Grumman).
Finally, the Micro Computer illustrates that terms such as "microcomputer" are not objective categories but are social constructs. At first, it seems obvious that the Arma Micro Computer is not a real microcomputer. If you consider a microcomputer to be a computer built around a microprocessor, that's true. (Although "microprocessor" is also not as clear as you might think.) But a microcomputer can also be defined as "A small computer that includes one or more input/output units and sufficient memory to execute instructions" (according to the IBM Dictionary of Computing, 1994)14 and the Arma Micro Computer meets that definition. The "microcomputer" is a shifting concept, changing from the 1960s to the 1990s to today.
For more, follow me on Twitter @kenshirriff or RSS for updates. I'm also on Mastodon as @kenshirriff@oldbytes.space. Thanks to Daniel Plotnick for providing a great deal of information and photos. Thanks to John Hartman for obtaining an obscure conference proceedings for me.
Notes and references
-
I should mention the danger of "firsts" from a historical perspective. Historian Michael Williams advised "not to use the word 'first'" and said, "If you add enough adjectives to a description you can always claim your own favorite." (See ENIAC in Action, p7.)
The first usage of "micro-computer" that I could find is from 1956. In Isaac Asimov's short story "The Dying Night", he mentions a "micro-computer" in passing: "In recent years, it [the handheld scanner] had become the hallmark of the scientist, much as the stethoscope was that of the physician and the micro-computer that of the statistician."
Another interesting example of a "micro-computer" is the Texas Instruments Semiconductor Network Computer. This palm-sized computer is often considered the first integrated-circuit computer. It was an 11-bit serial computer running at 100 kHz, built out of RS flip-flops, NOR gates, and logic drivers. The 1961 article below described this computer as a "micro-computer", although this was a one-off use of the term, not the computer's name. This brochure describes the Semiconductor Network Computer in more detail and Semiconductor Networks are described in detail in this article. Unlike modern ICs, these integrated circuits used flying wires for internal connections rather than a deposited metal layer, making their design a dead end.
The Texas Instruments Semiconductor Network Computer. From Computers and Automation, Dec. 1961. -
Most of the information on the Arma Micro Computer in this article is from "The Arma Micro Computer for Space Applications", by E. Keonjian and J. Marx, Spaceborne Computing Engineering Conference, 1962, pages 103-116. ↩
-
The Arma Micro Computer's instruction set consisted of 19 22-bit instructions, shown below.
Instruction set of the Arma Micro Computer. Figure from "The Arma Micro Computer for Space Applications". -
This block diagram shows the structure of the Micro Computer. The accumulator register (AC) is used for all data transfers as well as addition and subtraction. The multiply-divide register is used for multiplication, division, and square roots. The product register (PR), quotient register (QR), and square root register (SR) are used by the corresponding instructions. The data buffer register (S) holds data moving in or out of storage; it is shown with two 11-bit parts.
Block diagram of the Arma Micro Computer. Figure from "The Arma Micro Computer for Space Applications".
For control logic, the location counter (L) is the 13-bit program counter. For a subroutine call, the current address can be stored in the recall register (RR), which acts as a link register to hold the return address. (The RR is not shown on the diagram because it is held in memory.) Instruction decoding uses the instruction register (I), with the next instruction in the instruction buffer (B). The operand register (P) contains the 13-bit address from an instruction, while the remaining register (R) is used for I/O addressing. ↩
-
Arma's original plan was to mount circuits on ceramic wafers. Resistors would be printed onto the wafer and wiring silk-screened. (This is similar to IBM's SLT modules (1964), although IBM mounted diode and transistors as bare dies rather than components.) However, the Micro Computer ended up using epoxy-glass wafers with small, but discrete components: standard TO-46 transistors, "fly-speck" diodes, and 1/10 watt resistors. I don't see much advantage to these wafers over mounting the components directly on the printed-circuit board; maybe standardization is the benefit. ↩
-
The Micro Computer used an unusual mechanism to select a word to read or write. Most computers used a grid of selection wires; by energizing an X and a Y wire at the same time, the corresponding core was selected. The key idea of this "coincident-current" approach is that each wire has half the current necessary to flip a core, so the core with the energized X and Y wires will have enough current to flip. This puts tight constraints on the current level, since too much current will flip all the cores along the wire, but too little current will not flip the selected core. What makes this difficult is that the properties of a core change with temperature, so either the cores need to be temperature-stabilized or the current needs to be adjusted based on the temperature.
The Micro Computer instead used a separate wire for each word, so as long as the current is large enough, the cores will flip. This approach avoids the issues with temperature sensitivity, an important concern for a computer that needs to handle the large temperature swings of a spacecraft, not an air-conditioned data center. Unfortunately, it requires much more wiring. Specifically, the large advantage of the coincident-current approach is that an N×N grid of wires lets you select N² words. With the Micro Computer approach, N wires only select N words, so the scalability is much worse.
For more on Arma's memory systems, see patents: Memory Device, 3048828 and Multiaperture Core Memory Matrix, 3289181. ↩
-
The capitalization of Arma vs. ARMA is inconsistent. It often appears in all-caps, but both forms are used, sometimes in the same article. "Arma" is not an acronym; the name came from the names of its founders: Arthur Davis and David Mahood (source: Between Human and Machine, p54). I suspect a 1960s corporate branding effort was responsible for the use of all-caps. ↩
-
For more on the corporate history of Arma, see IRE Pulse, March 1958, p9-10. Details of corporate politics and what went wrong are here. More information on the financial ups and downs of Arma is in "Charles Perelle's Spacemanship", Fortune, January 1959, an article that focused on Charles Perelle, the president of American Bosch Arma. ↩
-
Wikipedia says that Arma's guidance computer was "the first production airborne digital computer". However, the Hughes Digitair (1958) has also been called "the first airborne digital computer in actual production." Another source says the Arma computer was the "first all-solid-state, high-reliability, space-borne digital computer." The TRADIC (Transistorized Airborne Digital Computer) (1954) was earlier, but was a prototype system, not a production system. In turn, the TRADIC is said by some to be the first fully transistorized computer, but that depends on exactly how you interpret "fully".
This is another example of how the "first" depends on the specific adjectives used. ↩
-
The information on the Arma W-107A computer is from "Atlas Inertial Guidance System: As I Remember It" by Principal Engineer John Heiderstadt. ↩
-
Chow Wen Tsing's PROM patent discusses the term "burning", explaining that it refers to burning out the diodes electrically. To widen the patent, he clarifies that "The term 'blowing out' or 'burning out' further includes any process which, by means less drastic than actual destruction of the non-linear elements, effects a change of the circuit impedance to a level which makes the particular circuit inoperative." This description prevented someone from trying to get around the patent by stating that nothing was really burning. ↩
-
Details on the LTN-51 navigation system and its uses are in this document. ↩
-
For more information on early aerospace computers, see State-of-the-art of Aerospace Digital Computers (1967), updated as Trends in Aerospace Digital Computer Design (1969). Also see the 1970 Survey of Military CPUs. Efficient partitioning for the batch-fabricated fourth generation computer (1968) discusses how "The computer industry is on the verge of an upheaval" from new hardware including LSI and fast ROMs, and describes various LSI aerospace computers. ↩
-
The "IBM Dictionary of Computing" (1994) has two definitions of "microcomputer": "(1) A digital computer whose processing unit consists of one or more microprocessors, and includes storage and input/output facilities. (2) A small computer that includes one or more input/output units and sufficient memory to execute instructions; for example a personal computer. The essential components of a microcomputer are often contained within a single enclosure." The latter definition was from an ISO/IEC draft standard for terminology so it is somewhat "official". ↩
Subject: The Intel 8088 processor's instruction prefetch circuitry: a look inside
In 1979, Intel introduced the 8088 microprocessor, a variant of the 16-bit 8086 processor. IBM's decision to use the 8088 processor in the IBM PC (1981) was a critical point in computer history, leading to the dominance of the x86 architecture that continues to the present.1 One way that the 8086 and 8088 increased performance was by prefetching: the processor fetches instructions from memory before they are needed, so the processor can execute them without waiting on the relatively slow memory. I've been reverse-engineering the 8088 from die photos and this blog post discusses what I've uncovered about the prefetch circuitry.
The die photo below shows the 8088 microprocessor under a microscope. The metal layer on top of the chip is visible, with the silicon and polysilicon mostly hidden underneath. Around the edges of the die, bond wires connect pads to the chip's 40 external pins. I've labeled the key functional blocks; this article focuses on the prefetch queue components highlighted in red. The components in purple also play a role, and will be discussed below. Architecturally, the chip is partitioned into a Bus Interface Unit (BIU) at the top and an Execution Unit (EU) below. The BIU handles memory accesses, while the Execution Unit (EU) executes instructions. In particular, the BIU fetches instructions, which are transferred from the prefetch queue to the Execution Unit via the queue bus.
The 8086 and 8088 processors present the same 16-bit architecture to the programmer. The key difference is that the 8088 has an 8-bit data bus for communication with memory and I/O, rather than the 16-bit bus of the 8086. The 8088's narrower bus reduced performance, since the processor only transfers one byte at a time rather than two. However, the 8-bit bus enabled cheaper computer hardware. The 8-bit bus was also a better match for hardware based on the older but popular 8-bit Intel 8080 and 8085 processors, allowing the reuse of 8-bit I/O circuitry for instance. Much of the IBM PC was based on the little-known IBM DataMaster, a computer built around the Intel 8085. Thus, selecting the 8088 processor was a natural choice for the IBM PC.
For the most part, the 8086 and 8088 are very similar internally, apart from trivial but numerous layout changes on the die. The biggest differences are in the Bus Interface Unit, the circuitry that communicates with memory and I/O devices, since this circuitry handles 16 bits in the 8086 versus 8 bits in the 8088. There are a few microcode differences between the two chips. One interesting change is that for performance reasons the 8088 has a smaller prefetch queue than the 8086 (four bytes instead of six). (I wrote about the 8086's prefetch circuitry earlier.)
Prefetching and the architecture of the 8086 and 8088
The 8086 and 8088 were introduced at an interesting point in microprocessor history, when memory was becoming slower than the CPU. For the first microprocessors, the speed of the CPU and the speed of memory were comparable.2 However, as processors became faster, the speed of memory failed to keep up. The 8086 was probably the first microprocessor to prefetch instructions to improve performance. While modern microprocessors have megabytes of fast cache3 to act as a buffer between the CPU and much slower main memory, the 8088 has just 4 bytes of prefetch queue. However, this was enough to substantially increase performance.
Prefetching had a major impact on the design of the 8086 and thus the 8088. Earlier processors such as the 6502, 8080, or Z80 were deterministic: the processor fetched an instruction, executed the instruction, and so forth. Memory accesses corresponded directly to instruction fetching and execution and instructions took a predictable number of clock cycles. This all changed with the introduction of the prefetch queue. Memory operations became unlinked from instruction execution since prefetches happen as needed and when the memory bus is available.
To handle memory operations and instruction execution independently, the implementors of the 8086 and 8088 divided the processors into two processing units: the Bus Interface Unit (BIU) that handles memory accesses, and the Execution Unit (EU) that executes instructions. The Bus Interface Unit contains the instruction prefetch queue; it supplies instructions to the Execution Unit via the Q (queue) bus. The BIU also contains an adder (Σ) for address calculation, adding the segment register base to an address offset, among other things. The Execution Unit is what comes to mind when you think of a processor: it has most of the registers, the arithmetic/logic unit (ALU), and the microcode that implements instructions. The segment registers (CS, DS, SS, ES) and the Instruction Pointer (IP) are in the Bus Interface Unit since they are directly involved in memory accesses, while the general-purpose registers are in the Execution Unit.
It may seem inefficient for the Bus Interface Unit to have its own adder instead of using the ALU, but there are reasons for the separate adder. First, every memory access uses the adder at least once to add the segment base and offset. The adder is also used to increment the PC or index registers. Since these operations are so frequent, they would create a bottleneck if they used the ALU. Second, since the Execution Unit and the Bus Interface Unit run asynchronously with respect to each other, it would be complicated to share the ALU without conflicts.
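The address adder's main job can be sketched in a few lines of Python. The article only says that the adder combines the segment base with an offset; the shift by four bits and the 20-bit wraparound below are the standard 8086/8088 real-mode rule, which I'm supplying here, and the function name is just for illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """Form a 20-bit physical address from a 16-bit segment and 16-bit offset."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # The segment register is shifted left 4 bits (multiplied by 16) to
    # form the base, then the offset is added. The result wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0xFFFF, 0x0000)))  # 0xffff0
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0 (wraps around)
```

Since this addition happens on every memory access, including prefetches, it is easy to see why routing it through the ALU would create contention.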
Prefetching had another major but little-known effect on the 8086 architecture: the designers were considering making the 8086 a two-chip microprocessor. Prefetching, however, required a one-chip design because the number of control signals required to synchronize prefetching across two chips exceeded the package pins available. This became a compelling argument for the one-chip design that was used for the 8086.4 (The unsuccessful Intel iAPX 432, which was under development at the same time, ended up being a two-chip processor: one to fetch and decode instructions, and one to execute them.)
Implementing the queue
The 8088's instruction prefetch queue is implemented with four 8-bit queue registers along with two hardware "pointers" into the queue. One two-bit counter keeps track of the current read position from 0 to 3, i.e. the queue register that will provide the next instruction byte. The second counter keeps track of the current write position, i.e. the queue register that will receive the next instruction from memory.5 As bytes are fetched from the queue, the read pointer advances. As bytes are added to the queue, the write pointer advances.
The diagram below shows an example queue configuration with two prefetched bytes. The middle two queue registers (Q1 and Q2) hold data. The read pointer indicates that the Execution Unit will get its next byte from Q1. The write pointer indicates that the next prefetched byte will go into Q3.
The diagram below shows how the queue pointers can wrap around. In this configuration, two more bytes have been written to the queue (Q3 and Q0), so the queue is full. The write pointer now points to Q1, the same as the read pointer.
There is an important ambiguity, however. Suppose that four bytes are read from the queue, so the read pointer advances four positions, wrapping around back to Q1. The queue is now empty, as shown below, but the pointers have the same position as the full case above. Thus, if the read pointer and the write pointer both point to the same position, the queue may be empty or full. To distinguish these cases, a flip-flop is set if the queue enters the empty state. This flip-flop generates a signal that Intel called MT (empty).
To determine how many bytes are in the queue, the queue circuitry uses a two-bit queue length value, along with the MT flip-flop value to distinguish the empty state. Conceptually, the queue length is generated by subtracting the read position from the write position. However, the implementation does not use a standard subtraction circuit, but instead uses hardcoded logic to determine the two bits of the length, as shown below.
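The pointer scheme can be modeled in software to see how the MT flip-flop resolves the full/empty ambiguity. This is a minimal behavioral sketch in Python, not a description of the actual circuit: the class and method names are my own, and the length is computed with ordinary modular subtraction rather than the hardcoded logic on the chip.

```python
class PrefetchQueue:
    """Behavioral model of a 4-byte queue with read/write pointers."""
    SIZE = 4

    def __init__(self):
        self.regs = [0] * self.SIZE
        self.read_pos = 0    # two-bit read pointer
        self.write_pos = 0   # two-bit write pointer
        self.empty = True    # models the MT (empty) flip-flop

    def length(self) -> int:
        diff = (self.write_pos - self.read_pos) % self.SIZE
        if diff == 0:
            # Equal pointers are ambiguous; MT tells empty from full.
            return 0 if self.empty else self.SIZE
        return diff

    def write(self, byte: int) -> None:
        if self.length() == self.SIZE:
            raise OverflowError("queue is full")
        self.regs[self.write_pos] = byte & 0xFF
        self.write_pos = (self.write_pos + 1) % self.SIZE
        self.empty = False

    def read(self) -> int:
        if self.empty:
            raise IndexError("queue is empty")
        byte = self.regs[self.read_pos]
        self.read_pos = (self.read_pos + 1) % self.SIZE
        if self.read_pos == self.write_pos:
            self.empty = True  # the last byte was consumed
        return byte

q = PrefetchQueue()
q.write(0x90); q.read()                   # both pointers now at position 1
q.write(0xA1); q.write(0xA2)              # two bytes held in Q1 and Q2
print(q.read_pos, q.write_pos, q.length())   # 1 3 2
q.write(0xA3); q.write(0xA4)              # wraps around: queue is full
print(q.read_pos, q.write_pos, q.length())   # 1 1 4
for _ in range(4):
    q.read()
print(q.read_pos, q.write_pos, q.length())   # 1 1 0
```

The last two printed states have identical pointer values; only the `empty` flag distinguishes a full queue from an empty one.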
The low bit of the length is the XOR of the low bits of the two positions. In NMOS logic (used by the 8088), an AND-NOR gate is easy to implement, while an XOR gate is difficult. Thus, XOR is implemented as shown in the top circuit. (You can verify that if one input is 1 and the other is 0, the output is 1.) The high-order bit of the length is also based on an AND-NOR gate, one with six inputs. Each input is a combination of read and write positions that yields an output bit 1; each input is computed by a NOR gate, which I haven't drawn.6 As a result, the amount of logic circuitry to compute the length is fairly large.
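As a quick check of the low-bit logic, the Python sketch below builds XOR from an AND-NOR gate and compares it with the low bit of a modular subtraction for all 16 pointer combinations. The particular AND-NOR wiring (one AND term for "both inputs 1" and one for "both inputs 0") is the standard construction and is my assumption about the circuit; the helper names are hypothetical.

```python
from itertools import product

def and_nor(*terms) -> int:
    """AND-NOR gate: AND the inputs within each term, then NOR the results."""
    return int(not any(all(term) for term in terms))

def xor_via_and_nor(a: int, b: int) -> int:
    # The output goes low if both inputs are 1 or both are 0,
    # so it is high only when the inputs differ.
    return and_nor((a, b), (1 - a, 1 - b))

# Compare against the low bit of (write - read) mod 4 for every case.
for write_pos, read_pos in product(range(4), repeat=2):
    length = (write_pos - read_pos) % 4
    assert (length & 1) == xor_via_and_nor(write_pos & 1, read_pos & 1)

print("low bit matches for all 16 pointer combinations")
```

The high-order bit doesn't reduce to a single two-input function like this, which is consistent with it needing a much larger gate.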
The diagram below zooms in on the queue control circuitry on the die, with the main flip-flops and circuitry labeled. The circuitry in the middle computes the queue length with the 6-input NOR gate stretched across the whole region. The flip-flops for the read and write positions are in the lower region. Despite the relative simplicity of the queue circuits, they take up a substantial part of the die. Compared to modern chips, the density of the 8088 is very low; you can almost see the flip-flops with the naked eye. But this isn't all the circuitry as prefetching also required queue registers and memory cycle control circuitry. Thus, prefetching was a moderately expensive feature for the 8088, as far as die area.
The loader
To decode and execute an instruction, the Execution Unit must get instruction bytes from the Bus Interface Unit, but this is not entirely straightforward. The main problem is that the queue can be empty, in which case instruction decoding must block until a byte is available from the queue. The second problem is that instruction decoding is relatively slow so it is pipelined. For maximum performance, the decoder needs a new byte before the current instruction is finished. A circuit called the "loader" solves these problems by providing synchronization between the prefetch queue and the instruction decoder. The loader uses a small state machine to efficiently fetch bytes from the queue at the right time and to provide timing signals to the decoder and microcode engine.
In more detail, as the loader requests the first two instruction bytes from the prefetch queue, it generates two timing signals that control the microcode execution. The FC (First Clock) indicates that the first instruction byte is available, while the SC (Second Clock) indicates the second instruction byte. Note that the First Clock and Second Clock are not necessarily consecutive clock cycles because the prefetch queue could be empty or contain just one byte, in which case the First Clock and/or Second Clock would be delayed. The instruction decoding circuitry and the microcode engine are controlled by the First Clock and Second Clock signals, so they remain synchronized with the bytes supplied by the prefetch queue.
At the end of a microcode sequence, the Run Next Instruction (RNI) micro-operation causes the loader to fetch the next machine instruction. However, fetching and decoding the next instruction is a bit slow so microcode execution would be blocked for a cycle. In many cases, this slowdown can be avoided: if the microcode knows that it is one micro-instruction away from finishing, it issues a Next-to-last (NXT) micro-operation so the loader can start loading the next instruction. This achieves a degree of pipelining in most cases; fetching the next instruction is overlapped with finishing the execution of the previous instruction.
The diagram above shows the state machine for the loader. I won't explain it in detail, but essentially it keeps track of whether it is waiting for a First Clock byte or a Second Clock byte, and if it is performing a fetch in advance (NXT) or at the end of an instruction (RNI). The state machine is implemented with two flip-flops to support its four states.
Microcode and the prefetch queue
The loader takes care of fetching an instruction that consists of an opcode byte and a Mod R/M (addressing mode) byte. However, many instructions have additional bytes or don't follow this format. For example, an opcode such as "ADD AX" can be followed by an 8- or 16-bit immediate value, adding that value to the AX register. Or a "move memory to AX" instruction can be followed by a 16-bit memory address. The microcode uses a separate mechanism for fetching these instruction bytes from the queue. Specifically, each micro-instruction contains a source register and a destination register that specify a data move. By specifying "Q" (the queue) as the source, a byte is fetched from the prefetch queue. If the queue is empty, microcode execution blocks until the Bus Interface Unit loads a byte into the prefetch queue. Thus, the complexity of instruction fetching and the prefetch queue is invisible to the microcode.7
A jump, subroutine call, or other control flow change causes the prefetch queue to be flushed since the queue contents are no longer useful. This is accomplished in microcode with the FLUSH micro-instruction, which resets the queue read and write pointers and sets the MT (empty) flip-flop. Note that the queue is flushed even if the target address is in the queue, for example if you jump one byte ahead.
One complication due to the prefetch queue is that the processor's Instruction Pointer points to the next instruction to be fetched, not the next instruction to be executed. This becomes a problem for a subroutine call, which needs to push the return address. It is also a problem for a relative jump, which is computed from the current instruction. The solution is the CORR micro-instruction, which corrects the Instruction Pointer by subtracting the queue length to determine the current execution position. This is implemented by the Bus Interface Unit, which holds correction constants in the Constant ROM, and subtracts them using the address adder (not the ALU).8
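The arithmetic behind the correction is simple, as this Python sketch shows. The function name and the example addresses are hypothetical; the sketch only illustrates the subtraction described above and says nothing about how the Constant ROM encodes its values.

```python
def corrected_ip(fetch_ip: int, queue_length: int) -> int:
    """Return the execution position given the prefetch position.

    fetch_ip is the Instruction Pointer as the Bus Interface Unit sees it
    (the next byte to prefetch); queue_length is the number of bytes that
    have been fetched but not yet consumed by the Execution Unit.
    """
    if not 0 <= queue_length <= 4:
        raise ValueError("the 8088 queue holds 0 to 4 bytes")
    return (fetch_ip - queue_length) & 0xFFFF  # IP is a 16-bit register

# Illustrative values: prefetching has reached 0x0105 with 3 bytes queued,
# so the next byte to execute is at 0x0102.
print(hex(corrected_ip(0x0105, 3)))  # 0x102
```

A subroutine call would push this corrected value as the return address, and a relative jump would add its displacement to it.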
The queue registers
The 8086 and 8088 partition the registers into upper registers (in the Bus Interface Unit) and lower registers (in the Execution Unit). The upper registers are the registers associated with memory accesses (e.g. Instruction Pointer, segment registers) while the lower registers are more general purpose (e.g. AX, BX, SI, SP). The upper registers are connected to two 16-bit internal buses: the B bus and the C bus.
The queue registers are physically part of the upper registers, but are wired into the buses slightly differently, as shown below. In particular, the 8088's queue registers are written 8 bits at a time from the C bus. (In contrast, the 8086's queue registers can be written 16 bits at a time to support two-byte prefetches.) When accessing the queue, the queue registers are read 16 bits at a time, but only one byte is transferred to the Q bus for instruction processing.9
The diagram below shows how the queue registers appear on the die, comparing the six-byte prefetch queue in the 8086 (top) to the four-byte 8088 queue (bottom). The 8086 prefetch registers are structured as three rows of 16-bit registers, while the 8088 prefetch registers are structured as four rows of 8-bit registers. In both cases, each bit is stored in a cross-coupled pair of inverters. The bit lines (not present) are vertical, while the control lines to select a register are horizontal. The layout is different between the processors to support 16-bit versus 8-bit writes. Note the empty space at the bottom of the 8088 registers. Because the rest of the chips are mostly the same, the 8088 couldn't be "compacted" to avoid this wasted space.
Intel used simulations to determine the best queue sizes for the 8086 and 8088, balancing the performance cost of prefetching against the benefit. (The cost is that prefetching makes the bus unavailable for other memory or I/O operations.) The prefetch queue is discarded on a jump instruction or other change of control flow, causing the prefetched bytes to be wasted. Thus, as the queue gets longer, the chance of discarding a prefetched byte becomes larger, so the potential benefit of prefetching becomes smaller. Since the 8088 prefetches one byte at a time, compared to two bytes at a time on the 8086, prefetching on the 8088 costs twice as much as on the 8086 in terms of bus cycles used per byte. This changes the tradeoffs in favor of a shorter queue.
Because of the difference in queue lengths, the queue control circuitry is different between the 8086 and 8088. In particular, the 8086 needs three-bit counters for the read and write positions, while the 8088 uses two-bit counters. Because of this, the length computation circuitry is also different between the processors.
I plan to continue reverse-engineering the 8088 die so follow me on Twitter @kenshirriff or RSS for updates. I've also started experimenting with Mastodon recently as @oldbytes.space@kenshirriff. If you're interested in the 8086, I wrote about the 8086 die, its die shrink process and the 8086 registers earlier.
Notes and references
-
Whenever I mention x86's domination of the computing market, people bring up ARM, but ARM has a lot more market share in people's minds than in actual numbers. One research firm says that ARM has 15% of the laptop market share in 2023, expected to increase to 25% by 2027. (Surprisingly, Apple only has 90% of the ARM laptop market.) In the server market, just an estimated 8% of CPU shipments in 2023 were ARM. See Arm-based PCs to Nearly Double Market Share by 2027 and Digitimes. (Of course, mobile phones are almost entirely ARM.) ↩
-
Steve Furber, co-creator of the ARM chip, mentions that "The first integrated CPUs were coincidentally quite well matched to semiconductor memory speeds, and were therefore built without caches. This can now be seen as a temporary aberration." See VLSI Risc Architecture and Organization p77. To make this concrete, the Apple II (1977) used a MOS 6502 processor running at about 1 megahertz while its 4116 DRAM chips could perform an access in 250 nanoseconds (4 times the clock speed). The 8088 processor ran at 5-10 MHz which meant that 250 ns DRAM chips were slower than the clock speed. Nowadays, processors run at 4 GHz but DRAM access speed is about 50 nanoseconds (1/200 the clock speed). ↩
-
Modern processors use caches to improve memory performance. Accessing data from a cache is faster than accessing it from main memory, but the tradeoff is that caches are much smaller than main memory. The prefetch queues in the 8086 and 8088 are similar to a cache in some ways, but there are some key differences. First, the prefetch queue is strictly sequential. If you jump ahead two bytes, even if the prefetch queue has those instruction bytes, the processor can't use them. Second, the prefetch queue can't reuse bytes. If you have a 6-byte loop, even though all the code fits in the prefetch queue, it will be reloaded every time. Third, the prefetch queue doesn't provide any consistency. If you modify an instruction in memory a couple of bytes ahead of the PC, the 8086 or 8088 will run the old instruction if it's in the queue. ↩
-
The design decisions for the 8086 prefetch cache (and many other aspects of the chip) are described in: J. McKevitt and J. Bayliss, "New options from big chips," in IEEE Spectrum, vol. 16, no. 3, pp. 28-34, March 1979, doi: 10.1109/MSPEC.1979.6367944. Prefetch provided a 50% performance benefit to the 8086. ↩
-
The queue read process doesn't use an explicit read operation. Instead, the selected queue register continuously puts its value onto the queue bus. When the Execution Unit uses this byte, it sends an increment signal to the queue to advance the read pointer. If the queue empty (MT) flip-flop is set, the Execution Unit will wait until a byte is ready. ↩
-
The NOR gates are used as AND gates, following DeMorgan's laws. For example, to produce a 1 output for write position 00 and read position 01, the logic is NOR(write bit 1, write bit 0, read bit 1, read bit 0'), where ' indicates an inverted signal. Note that the bits into the NOR gate are all inverted from the "desired" values; if they are all 0, the NOR output is 1. Thus, there are also some inverters on the inputs. ↩
-
Arbitrary memory reads and writes are performed directly on memory, bypassing the prefetch queue. The 8086/8088 do not provide consistency; if you modify an instruction byte in memory and the byte is in the queue, the processor will execute the old byte. (This type of self-modifying code can be used to determine the queue length, distinguishing the 8086 from the 8088 in software.) ↩
-
The Constant ROM is used for more than just address correction. For example, it is also used to increment the Instruction Pointer after a prefetch. Other constants are used for the 8088's string operations, which act on a block of memory. The index registers are incremented or decremented by 1 for bytes or 2 for words. When popping a value from the stack, the stack pointer is incremented using the Constant ROM. ↩
-
Are the 8088's queue registers 16 bits wide or 8 bits wide? It's ambiguous, since the registers are written 8 bits at a time, but read 16 bits at a time. This implementation was probably selected to support the 8088's 8-bit bus while reusing as much of the 8086 design as possible. In particular, the 8088 can only prefetch one byte at a time, so writes need to happen a byte at a time. Thus, there are four control lines selecting which queue byte is written. (The 8088 could write to half of a 16-bit register but that would require moving the prefetched byte to the correct half of a 16-bit bus.) On the read side, it would make sense to have four read lines, selecting one byte from the 8088's queue. However, since the 8086 already had a multiplexer to select one byte from two, the 8088 designers probably felt it was easier to keep that circuit. And with the smaller queue on the 8088, there was no need to try to save space by removing the circuit. Thus, the queue has two read-select lines and a multiplexer control line. All these lines are controlled by the write position and read position flip-flops. ↩
Subject: Inside an unusual 7400-series chip implemented with a gate array
When I look inside a chip from the popular 7400 series, I know what to expect: a fairly simple die, implemented in a straightforward, cost-effective way. However, when I looked inside a military-grade chip built by Integrated Device Technology (IDT)4 I found a very unexpected layout: over 1500 transistors in an orderly matrix. Even stranger, most of the die is wasted: less than 20% of these transistors are used, forming scattered circuits connected by thin metal wires.
In this blog post, I look at this chip in detail, describe its gates, and explain how it implements the "1-of-4" decoder function. I also discuss why it sometimes makes sense to build chips with a gate array design such as this, despite the inefficiency.
In the photo below, you can see the silicon die in more detail, with the silicon appearing pink. The main circuitry is implemented in the nine rows that form the gate array, a grid of 1584 transistors. The tiny dark rectangles are transistors of two types, NMOS and PMOS, that work together to implement CMOS logic circuits. At this scale, the metal wiring is visible as faint gray lines and smudges, but most of the transistors are unconnected. Surrounding the gate array are 22 input/output (I/O) blocks each with a square bond pad. As with the transistors, many of these I/O blocks are unused. Fourteen of these bond pads have tiny metal bond wires (the thick black lines) that connect the silicon die to the chip's external pins. Finally, the pairs of bond wires at the center left and center right provide ground and power connections for the chip.
The photo below zooms in on three rows of circuitry in the chip. The large dark rectangles are pairs of transistors, with two lines of transistors in each row of circuitry. At the top and bottom of each row, the thick horizontal white lines are metal wiring that provides power and ground. In each row, one line of transistors holds PMOS transistors, next to the power wiring, while the other line holds NMOS transistors, next to the ground wiring. (The orientation flips in each successive row, so it isn't obvious which transistors are which unless you check the power connections at the end of the row.)
The transistors are wired into gates by the metal layers, the white lines. The gates are connected by horizontal and vertical wiring using the wiring channels between the rows. This wiring style is very similar to standard-cell logic. However, unlike standard-cell logic, the underlying transistor grid is fixed, resulting in wasted transistors. In the image above, most of the transistors in the middle row are used, while the top row is unused and the bottom row is mostly unused.
The diagram below shows the structure of one of the transistor blocks, which contains two tall, thin MOS transistors. The vertical metal contacts connect to the sources and drains of the transistors, with the two transistors sharing the middle contact. (On an integrated circuit, the source and drain of a transistor are identical, so it is arbitrary which side is the source and which is the drain.) The short horizontal metal contacts at the top connect to the gates of the two transistors; the gates are made of polysilicon, which is barely visible in the die photo. The gates partition the active silicon (green), forming the transistors. The gate width is approximately 1 µm.
NAND gate
In this section, I'll explain the construction of one of the NAND gates on the die. The NAND gate below uses four transistors, two NMOS transistors on the top and two PMOS transistors on the bottom. The white lines are the metal wiring, forming two layers. Most of the wiring (including power and ground) is in the lower (M1) layer. The slightly wider and darker vertical segments are the upper (M2) layer. The circles connect the metal layers when they join, or connect the metal layer to the underlying silicon or polysilicon. With two metal layers, it's a bit tricky to see how the wiring is connected. The A and B inputs each connect to two transistor gates. The transistor group at the top is connected to ground on the right, with the output on the left. The transistor group on the bottom is connected to Vcc on the left and right, with the output in the middle. This has the effect of putting the upper transistors in series and the lower transistors in parallel.
Below, I've drawn the schematic of the NAND gate. On the left, the layout of the schematic matches the die layout above. On the right, I've redrawn the schematic with a more traditional layout. To understand its operation, note that a PMOS transistor (top on the right schematic) turns on when the input is low, while an NMOS transistor (bottom on the right) turns on when the input is high. When both inputs are high, the two NMOS transistors turn on, connecting ground to the output, pulling it low. When either input is low, one of the PMOS transistors turns on, pulling the output high. Thus, the circuit implements the NAND function. The NMOS and PMOS transistors operate in a complementary fashion, giving CMOS (Complementary MOS) its name.
NOR gate
In this section, I'll explain the layout of one of the NOR gates on the die, shown below. This gate is twice as large as the previous NAND gate so it can provide twice the output current.1 The NOR gate uses eight transistors, four PMOS transistors in the upper half and four NMOS transistors in the lower half. (Note that Vcc and ground are flipped compared to the previous gate, as are the NMOS and PMOS transistors.) The two transistors in each block are wired in parallel to produce more current for the output. (A out is the same signal as A in, exiting the block at the top to connect to other circuitry.)
The schematic below shows the wiring of the eight transistors. The schematic layout corresponds to the physical layout to make it easier to map between the image and the schematic. The upper transistor groups are wired in series, while the lower transistor groups are wired in parallel.
The schematic below has been redrawn to make the functionality clearer, and the parallel transistors have been removed. If either input is high, one of the NMOS transistors on the bottom will turn on and pull the output low. If both inputs are low, the two PMOS transistors will turn on and pull the output high. This provides the desired NOR function.
Note that the NAND and NOR gates have similar but opposite schematics. In the NAND gate, the NMOS transistors are in series while the PMOS transistors are in parallel. In the NOR gate, the roles of the transistors are swapped.
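The series/parallel duality can be checked with a simple switch-level model in Python. This is a simplified sketch of my own, treating each transistor as an ideal switch; the helper names are hypothetical and the model ignores drive strength, so the doubled-up parallel transistors in the NOR gate are omitted.

```python
def nmos_on(gate: int) -> bool:
    return gate == 1   # an NMOS transistor conducts when its gate is high

def pmos_on(gate: int) -> bool:
    return gate == 0   # a PMOS transistor conducts when its gate is low

def cmos_output(pull_up: bool, pull_down: bool) -> int:
    """Resolve the output from the pull-up and pull-down networks."""
    if pull_up == pull_down:
        raise ValueError("output would be floating or shorted")
    return 1 if pull_up else 0

def nand(a: int, b: int) -> int:
    pull_down = nmos_on(a) and nmos_on(b)   # NMOS in series to ground
    pull_up = pmos_on(a) or pmos_on(b)      # PMOS in parallel to Vcc
    return cmos_output(pull_up, pull_down)

def nor(a: int, b: int) -> int:
    pull_down = nmos_on(a) or nmos_on(b)    # NMOS in parallel to ground
    pull_up = pmos_on(a) and pmos_on(b)     # PMOS in series to Vcc
    return cmos_output(pull_up, pull_down)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand(a, b), nor(a, b))
```

For every input combination, exactly one of the two networks conducts, so `cmos_output` never raises: the output is always driven and power is never shorted to ground.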
The chip's circuit
The chip I examined is a "dual 1-of-4 decoder with enable".2 The decoding function takes a two-bit input and selects one of four output lines depending on the binary value. The enable line must be low to activate this operation; otherwise all four output lines are disabled. The chip contains two of these decoders, which is why it is called a dual decoder. In total, the chip contains 18 logic gates,3 so it is very simple, even by 1990s standards.
I reverse-engineered the chip and created the schematic below, showing one of the dual units.
Each NAND gate matches one of the four input possibilities to drive one of the four outputs.
The NOR gates support the ENABLE signal, blocking the outputs unless ENABLE is active (i.e. low).
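As a rough behavioral sketch of one decoder, the function below maps the enable and the two select bits to the four outputs. The name `decode_1_of_4` is hypothetical, and I'm assuming active-low outputs (the selected line goes low), which is what NAND-driven outputs and the usual '139 convention imply.

```python
def decode_1_of_4(enable_n: int, a1: int, a0: int) -> tuple[int, int, int, int]:
    """Return outputs (O0, O1, O2, O3); assumed active-low."""
    if enable_n != 0:
        # Enable is active-low: when it is high, no output is selected.
        return (1, 1, 1, 1)
    selected = (a1 << 1) | a0
    # NAND behavior: only the selected line goes low.
    return tuple(0 if i == selected else 1 for i in range(4))


print(decode_1_of_4(0, 1, 0))  # (1, 1, 0, 1)
print(decode_1_of_4(1, 1, 0))  # (1, 1, 1, 1)
```

The chip holds two independent copies of this function, one per decoder.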
The chip uses a general-purpose I/O block (below) for each pin that can be used as an input or an output depending on how it is wired. Each block contains two large drive transistors: an NMOS transistor to pull the output low and a PMOS transistor to pull the output high. The I/O block has separate control lines for the two output transistors. (At the bottom of the image below, two thin metal wires drive the high-side and low-side transistors.) This permits tri-state logic: if neither transistor is energized, the output is left floating. The gate array drives the output transistors with high-current inverters, each constructed from multiple transistors in parallel. (This is why the schematic shows more inverters than may seem necessary.)
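The three possible output conditions can be summarized in a tiny model. The helper name `io_pad` and its boolean arguments are my own illustration, not signals named on the die.

```python
from typing import Optional


def io_pad(pull_high: bool, pull_low: bool) -> Optional[int]:
    """Pad level: 1, 0, or None when the pad is floating."""
    if pull_high and pull_low:
        # Turning on both drive transistors would short Vcc to ground.
        raise ValueError("both output transistors enabled")
    if pull_high:
        return 1    # PMOS connects the pad to Vcc
    if pull_low:
        return 0    # NMOS connects the pad to ground
    return None     # neither transistor on: the pad is tri-stated


print(io_pad(True, False), io_pad(False, True), io_pad(False, False))
```

Having separate control lines is what makes the third (floating) case possible; a single control line could only choose between high and low.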
When used as an input, the pad is wired to the surrounding circuitry slightly differently, connecting to input protection diodes (not shown on the schematic). Thus, the functionality of the I/O blocks can be changed by modifying the metal layers, without changing the underlying silicon.
Some 7400-series history
The earliest logic integrated circuits used resistors and transistors internally, so they were called RTL (Resistor Transistor Logic), but RTL had significant performance problems. RTL was rapidly replaced by Diode Transistor Logic (DTL) and then Transistor Transistor Logic (TTL). In 1964, Texas Instruments created a line of TTL integrated circuits for military applications called the SN5400 series. This was shortly followed by the commercial-grade SN7400 series.
The 7400 series of integrated circuits was inexpensive, fast, and easy to use. The line started with simple logic circuits such as four NAND gates on a chip, and moved into more complex chips such as counters, shift registers, and ALUs. The 7400 series became very popular in the 1970s and 1980s, used by electronics hobbyists and high-performance minicomputers alike. These chips became essential building blocks and "glue" logic for microcomputers, heavily used in the Apple II for instance.
The original 7400 series branched into dozens of families with different performance characteristics but the same functionality. The 74LS (low-power Schottky) family, for instance, became very popular as it both improved speed and reduced power consumption. In the mid-1970s, 7400-series chips were introduced that used CMOS circuitry instead of TTL for dramatically lower power consumption. This CMOS family, the 74C series, was followed by numerous other CMOS families.
That brings us to the chip I examined, a member of IDT's 74FCT (Fast CMOS TTL-compatible) line of chips, introduced in the mid-1980s. (Specifically, it is in the 54FCT family because it handles a wider temperature range.) These chips used advanced CMOS technology to provide high speed, low power consumption, and as a military option, radiation tolerance.
Conclusions
Why would you make a chip in this inefficient way, using a gate array that wastes most of the die area? The motivation is that most of the design cost can be shared across many different part types. Each step of integrated circuit processing requires an expensive mask for photolithography. With a gate array, all chip types use the same underlying silicon and transistors, with custom masks just for the two metal layers. In comparison, a fully custom chip might require eight custom masks, which costs much more. The tradeoff is that gate array chips are larger so the manufacturing cost is higher per device.5 Thus, a gate array design is better when selling chips in relatively small quantities, while a custom design is cheaper when mass-producing chips.6 IDT focused on the high-performance and military market rather than the commodity chip market, so gate arrays were a good fit.
One last thing. The packaging of this chip is very interesting since it is mounted on a multi-chip module. The module also contains two Atmel EEPROMs. Presumably the decoder chip decodes address bits to select one of the EEPROMs.
Thanks to Don S. for providing the chip. Follow me on Twitter @kenshirriff or RSS for updates. I've also started experimenting with Mastodon recently as @oldbytes.space@kenshirriff.
Notes and references
-
Properly sizing the transistors in a gate is important for performance. Since the transistors in the gate array are all the same size, multiple transistors are used in parallel to get the desired current. The 1999 book Logical Effort describes a methodology for maximizing the performance of CMOS circuits by correctly sizing the transistors. ↩
-
The part number is "IDT 54FCT139ALB". "54" indicates the chip operates under an enhanced temperature range of -55°C to +125°C. The "A" indicates the chip is 35% faster than the base series (but not as fast as "C"). "L" indicates the chip is packaged in a leadless chip carrier, the square package shown at the top of the article. Finally, "B" indicates the chip was tested according to military standards: MIL-STD-883, Class B. ↩
-
The chip contains 18 logic gates according to the functional schematic in the datasheet (below). The implementation actually uses 52 logic gates by my count (2×26) because the implementation doesn't exactly match the schematic. In particular, the datasheet shows three-input NAND gates, but the chip uses a NAND gate and a NOR gate along with inverters. The chip also has additional inverters to drive the output transistors in each I/O block.
Schematic of the chip from the datasheet.
-
Integrated Device Technology was a spinoff from Hewlett Packard that started in 1980. IDT built advanced CMOS chips including fast static RAM and microprocessors (bit-slice and MIPS). It became part of Renesas in 2018. A very detailed 1986 profile of IDT is here. IDT's logo is pretty cool, combining a chip wafer and calculus.
The logo of Integrated Device Technology.
Here's how the logo looks on the die:
Closeup of the die showing the IDT logo.
The die also has the initials of the designers, along with some mysterious symbols. One looks like the Chinese character "正".
Closeups of two parts of the die.
-
Integrated circuit manufacturing is partitioned into the "front end of line", where the transistors are created on the silicon wafer, and the "back end of line", where the metal wiring is put on top to connect the transistors. With a gate array construction, the front end of line steps create generic gate array wafers. The back end of line steps then connect the transistors as desired for a particular component. The gate array wafers can be produced in large quantities and stored, and then customized for specific products in smaller quantities as needed. This reduces the time to supply a particular chip type since only the back end of line process needs to take place. ↩
-
The IDT High-Speed CMOS Logic Design Guide briefly mentions the gate array design. The FCT family was built from two sizes of gate arrays, "4004" for smaller chips and "8000" for larger chips. Later, IDT shrunk the original "Z-step" gate arrays to smaller, higher-performance "Y-step" arrays. They then customized some of the devices to create the "W-step" devices. Looking at the markings on the die, we see that this chip uses the "4004Y" gate array.
The die shows gate slice 4004Y and part 4139Y (indicating 54139 or 74139). The numbers are slightly obscured by a bond wire.
Subject: Talking to memory: Inside the Intel 8088 processor's bus interface state machine
In 1979, Intel introduced the 8088 microprocessor, a variant of the 16-bit 8086 processor. IBM's decision to use the 8088 processor in the IBM PC (1981) was a critical point in computer history, leading to the success of the x86 architecture. The designers of the IBM PC selected the 8088 for multiple reasons, but a key factor was that the 8088 processor's 8-bit bus was similar to the bus of the 8085 processor.1 The designers were familiar with the 8085 since they had selected it for the IBM System/23 Datamaster, a now-forgotten desktop computer, making the more-powerful 8088 processor an easy choice for the IBM PC.
The 8088 processor communicates over the bus with memory and I/O devices through a highly-structured sequence of steps called "T-states." A typical 8088 bus cycle consists of four T-states, with one T-state per clock cycle. Although a four-step bus cycle may sound straightforward, its implementation uses a complicated state machine making it one of the most difficult parts of the 8088 to explain. First, the 8088 has many special cases that complicate the bus cycle. Moreover, the bus cycle is really six steps, with two undocumented "extra" steps to make bus operations more efficient. Finally, the complexity of the bus cycle is largely arbitrary, a consequence of Intel's attempts to make the 8088's bus backward-compatible with the earlier 8080 and 8085 processors. However, investigating the bus cycle circuitry in detail provides insight into the timing of the processor's instructions. In addition, this circuitry illustrates the tradeoffs and implementation decisions that are necessary in a production processor. In this blog post, I look in detail at the circuitry that implements this state machine.
By examining the die of the 8088 microprocessor, I could reverse engineer the bus circuitry. The die photo below shows the 8088 microprocessor's silicon die under a microscope. Most visible in the photo is the metal layer on top of the chip, with the silicon and polysilicon mostly hidden underneath. Around the edges of the die, bond wires connect pads to the chip's 40 external pins. Architecturally, the chip is partitioned into a Bus Interface Unit (BIU) at the top and an Execution Unit (EU) below, with the two units running largely independently. The BIU handles bus communication (memory and I/O accesses), while the Execution Unit (EU) executes instructions. In the diagram, I've labeled the processor's key functional blocks. This article focuses on the bus state machine, highlighted in red, but other parts of the Bus Interface Unit will also play a role.
Although I'm focusing on the 8088 processor in this blog post, the 8086 is mostly the same. The 8086 and 8088 processors present the same 16-bit architecture to the programmer. The key difference is that the 8088 has an 8-bit data bus for communication with memory and I/O, rather than the 16-bit bus of the 8086. For the most part, the 8086 and 8088 are very similar internally, apart from trivial but numerous layout changes on the die. In this article, I'm focusing on the 8088 processor, but most of the description applies to the 8086 as well. Instead of constantly saying "8086/8088", I'll refer to the 8088 and try to point out places where the 8086 is different.
The bus cycle
In this section, I'll describe the basic four-step bus cycles that the 8088 performs.2
To start, the diagram below shows the states for a write cycle (slightly simplified3), when the 8088 writes to memory or an I/O device.
The external bus activity is organized as four "T-states", each one clock cycle long and called T1, T2, T3, and T4, with specific actions during each state.
During T1, the 8088 outputs the address on the pins. During the T2, T3, and T4 states, the 8088 outputs the data word on the same pins.
The external memory or I/O device uses the T states to know when it is receiving address information or data over the bus lines.
For a read, the bus cycle is slightly different from the write cycle, but uses the same four T-states.
During T1, the address is provided on the pins, the same as for a write.
After that, however, the processor's data pins are "tri-stated" so they float electrically, allowing the external memory to put data on the bus.
The processor reads the data at the end of the T3 state.
The purpose of the bus state machine is to move through these four T states for a read or a write. This process may seem straightforward, but (as is usually the case with the 8088) many complications make this process anything but easy. In the next sections, I'll discuss these complications. After that, I'll explain the state machine circuitry with a schematic.
Address calculation
One of the notable (if not hated) features of the 8088 processor is segmentation: the processor supports 1 megabyte of memory, but memory is partitioned into segments of 64 KB for compatibility with the earlier 8080 and 8085 processors. The 8088 calculates each 20-bit memory address by shifting the value of a segment register left by four bits and adding a 16-bit offset. This calculation is done by a dedicated address adder in the Bus Interface Unit, completely separate from the chip's ALU. (This address adder can be spotted in the upper left of the earlier die photo.)
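To make the arithmetic concrete, here is a minimal sketch of the address calculation, assuming the standard 8086/8088 rule that the segment value is multiplied by 16 (shifted left four bits) before the offset is added, with the result wrapping at 20 bits. The function name is mine.

```python
def physical_address(segment: int, offset: int) -> int:
    """20-bit physical address from a 16-bit segment and a 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left 4 bits (x16), add the offset, wrap at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xFFFF, 0xFFFF)))  # 0xffef (wraps past 1 MB)
```

Note that many different segment:offset pairs map to the same physical address, since segments overlap every 16 bytes.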
Calculating the memory address complicates the bus cycle. As the timing diagrams above show, the processor issues the memory address during state T1 of the bus cycle.
However, it takes time to perform the address calculation addition, so the address calculation must take place before T1.
To accomplish this, there are two "invisible" bus states before T1; I call these states "TS" (T-start) and "T0".
During these states, the Bus Interface Unit uses the address adder to compute the address, so the address will be available during the T1 state.
These states are invisible to the external circuitry because they don't affect the signals from the chip.
Thus, a single memory operation takes six clock cycles: two preparatory cycles to compute the address before the four visible cycles.
However, if multiple memory operations are performed, the operations are overlapped to achieve a degree of pipelining that improves performance.
Specifically, the address calculation for the next memory operation takes place during the last two clock cycles of the current memory operation, saving two clock cycles.
That is, for consecutive bus cycles, T3 and T4 of one bus cycle overlap with TS and T0 of the next cycle.
In other words, during T3 and T4 of one bus cycle, the memory address gets computed for the next bus cycle.
This pipelining significantly improves the performance of the 8088, compared to taking 6 clock cycles for each bus cycle.
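The overlap can be illustrated by listing the state for each clock of several back-to-back bus cycles. This is a simplified sketch (no wait states, no preemption), and the `bus_timeline` helper and its "T3+TS" labels are just my notation.

```python
def bus_timeline(n_cycles: int) -> list[str]:
    """Per-clock state labels for n back-to-back bus cycles, no wait states."""
    if n_cycles < 1:
        return []
    timeline = ["TS", "T0"]  # invisible address-calculation states
    for i in range(n_cycles):
        last = i == n_cycles - 1
        timeline += ["T1", "T2"]
        # T3/T4 of this cycle overlap TS/T0 of the next one, if any.
        timeline += ["T3", "T4"] if last else ["T3+TS", "T4+T0"]
    return timeline


print(bus_timeline(2))
print(len(bus_timeline(1)))  # 6
print(len(bus_timeline(3)))  # 14, versus 18 without overlap
```

In general, n consecutive cycles take 4n + 2 clocks in this model rather than 6n.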
With this timing, the address adder is free during cycles T1 and T2.
To improve performance in another way, the 8088 uses the adder during this idle time to increment or decrement memory addresses.
For instance, after popping a word from the stack, the stack pointer needs to be incremented by 2.5
Another case is block move operations (string operations), which need to increment or decrement the pointers each step.
By using the address adder, the new pointer value is calculated "for free" as part of the memory cycle, without using the processor's regular ALU.4
Address corrections
The address adder is used in one more context: correcting the Instruction Pointer value. Conceptually, the Instruction Pointer (or Program Counter) register points to the next instruction to execute. However, since the 8088 prefetches instructions, the Instruction Pointer indicates the next instruction to be fetched. Thus, the Instruction Pointer typically runs ahead of the "real" value. For the most part, this doesn't matter. This discrepancy becomes an issue, though, for a subroutine call, which needs to push the return address. It is also an issue for a relative branch, which jumps to an address relative to the current execution position.
To support instructions that need the next instruction address, the 8088 implements a micro-instruction CORR, which corrects the Instruction Pointer.
This micro-instruction subtracts the length of the prefetch queue from the Instruction Pointer to determine the "real" Instruction Pointer.
This subtraction is performed by the address adder, using correction constants that are stored in a small Constant ROM.
The tricky part is ensuring that using the address adder for correction doesn't conflict with other uses of the adder.
The solution is to run a special shortened memory cycle (just the TS and T0 states) while the CORR micro-instruction is performed.6
These states block a regular memory cycle from starting, preventing a conflict over the address adder.
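The correction itself is simple arithmetic. The sketch below shows the idea, assuming the Instruction Pointer is a 16-bit value that wraps; the function and parameter names are mine, and the real hardware adds a negative constant from the Constant ROM rather than subtracting.

```python
def corrected_ip(prefetch_ip: int, queue_length: int) -> int:
    """'Real' Instruction Pointer: the fetch pointer minus bytes still queued."""
    if queue_length < 0:
        raise ValueError("queue length cannot be negative")
    # The IP is 16 bits wide, so the subtraction wraps within the segment.
    return (prefetch_ip - queue_length) & 0xFFFF


print(hex(corrected_ip(0x0105, 3)))  # 0x102
print(hex(corrected_ip(0x0001, 2)))  # 0xffff
```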
Prefetching
The 8088 prefetches instructions before they are needed, loading instructions from memory into a 4-byte prefetch queue.
Prefetching usually improves performance, but can result in an instruction's memory access being delayed by a prefetch, hurting overall performance.
To minimize this delay, a bus request from an instruction will preempt a prefetch, even if the prefetch has gone through TS and T0.
At that point, the prefetch hasn't created any bus activity yet (which first happens in T1), so preempting the prefetch can be done cleanly.
To preempt the prefetch, the bus cycle state machine jumps back to TS, skipping over T1 through T4, and starting the desired access.
A prefetch will also be preempted by the micro-instruction that stops prefetching (SUSP) or the micro-instruction that corrects addresses (CORR). In these cases, there is no point in completing the prefetch, so the state machine cycle will end with T0.
Wait states
One problem with memory accesses is that the memory may be slower than the system's clock speed, a characteristic of less-expensive memory chips.
The solution in the 1970s was "wait states".
If the memory couldn't respond fast enough, it would tell the processor to add idle clock cycles called wait states, until the memory could respond.7
To produce a wait state, the memory (or I/O device) lowers the processor's READY pin until it is ready to proceed.
During this time, the Bus Interface Unit waits, although the Execution Unit continues operation if possible.
Although Intel's documentation gives the wait cycle a separate name (Tw), internally the wait is implemented by repeating the T3 state as long as the READY pin is not active.
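The "repeat T3 until READY" behavior can be sketched as a loop. This is an idealized model: `ready_samples` is a hypothetical list with one READY value per T3 clock, and the real chip samples READY with specific setup and hold timing that I'm ignoring.

```python
def bus_cycle_with_waits(ready_samples: list[bool]) -> list[str]:
    """Visible T-states for one bus cycle, given READY sampled each T3 clock."""
    states = ["T1", "T2"]
    for ready in ready_samples:
        states.append("T3")   # T3 repeats while READY is inactive
        if ready:
            break
    else:
        raise ValueError("READY never became active")
    states.append("T4")
    return states


print(bus_cycle_with_waits([True]))                # no wait states
print(bus_cycle_with_waits([False, False, True]))  # two wait states
```

The extra T3 entries in the second output are what Intel's documentation calls Tw.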
Halts
Another complication is that the 8088 has a HALT instruction that halts program execution until an interrupt comes in.
One consequence is that HALT stops bus operations (specifically prefetching, since stopping execution will automatically stop instruction-driven bus operations).
A complication is that the 8088 indicates the HALT state to external devices by performing a special T1 bus cycle without any following bus cycles.
But wait: there's another complication. External devices can take control of the bus through the HOLD functionality, allowing external devices to perform operations such as DMA (Direct Memory Access).
When the device ends the HOLD, the 8088 performs another special T1 bus cycle, indicating that the HALT is still in effect.
Thus, the bus state machine must generate these special T1 states based on HALT and HOLD actions.
(I discussed the HALT process in detail here.)
Putting it all together: the state diagram
The state diagram below summarizes the different types of bus cycles.
Each circle indicates a specific T-state, and the arrows indicate the transitions between states.
The green line shows the basic bus cycle or cycles, starting in TS and then going around the cycle.
From T3, a new cycle can start with T0 or the cycle will end with T4.
Thus, new cycles can start every four clocks, but a full cycle takes six states (counting the "invisible" TS and T0).
The brown line shows that the bus cycle will stay in T3 as long as there is a wait state.
The red line shows the two cycles for a CORR correction, while the purple line shows the special T1 state for a HALT instruction.
The cyan line shows that a prefetch cycle can be preempted after T0; the cycle will either restart at TS or end.
I'm showing states TS and T3 together since they overlap but aren't the same.
Likewise, I'm showing T4 and T0 together. T4 is grayed out because it doesn't exist from the state machine's perspective; the circuitry doesn't take any particular action during T4.
The schematic below shows the implementation of the state machine.
The four flip-flops represent the four states, with one flip-flop active at a time, generating states T0, T1, T2, and T3 (from top to bottom).
Each output feeds into the logic for the next state, with T3 wrapping back to the top, so the circuit moves through the states in sequence.
The flip-flops are clocked so the active state will move from one flip-flop to the next according to the system clock.
State TS doesn't have its own flip-flop, but is represented by the input to the T0 flip-flop, so it happens one clock cycle earlier.8
State T4 doesn't have a flip-flop since it isn't "real" to the bus state machine.
The logic gates handle the special cases: blocking the state transfer if necessary or starting a state.
I'll explain the logic for each state in more detail.
The circuitry for the TS state has two AND gates to generate new bus cycles starting from TS.
The first one (a) causes TS to happen with T3 if there is a pending bus request (and no HOLD). The second AND gate (b) starts a bus cycle if the bus is not currently active and there is a bus request or a CORR micro-instruction.
The flip-flop causes T0 to follow T3/TS, one clock cycle later.
The next gates (c) generate the T1 state following T0 if there is pending bus activity and the cycle isn't preempted to T3. The AND gate (d) starts the special T1 for the HALT instruction.9
The T2 state follows T1 unless T1 was generated by a HALT (e).
The T3 logic is more complicated. First, T3 will always follow T2 (f).
Next, a wait state will cause T3 to remain in T3 (g).
Finally, for a preempt, T3 will follow T0 (h) if there is a prefetch and a microcode bus operation (i.e. an instruction specified the bus operation).
Next, I'll explain BUS-ACTIVE, an important signal that indicates if the bus is active or not.
The Bus Interface Unit generates the BUS-ACTIVE signal to help control the state machine.
The BUS-ACTIVE signal is also widely used in the Bus Interface Unit, controlling many functions such as transfers to and from the address registers.
BUS-ACTIVE is generated by the complex circuit below that determines if the bus will be active, specifically in states T0 through T3.
Because of the flip-flop, the computation of BUS-ACTIVE happens in the previous clock cycle.
In more detail, the signal BUS-ACTIVE-PRE indicates if the bus cycle will continue or will end on the next clock cycle.
Delaying this signal through the flip-flop generates BUS-ACTIVE, which indicates if the bus is currently active in states T0 through T3.
The top AND gate (a) is responsible for starting a cycle or keeping a cycle going (a1).
It will allow a new cycle if there is a bus request (without HOLD) (a3).
It will also allow a new cycle if there is a CORR micro-instruction prior to the T1 state (even if there is a HOLD, since this "fake" cycle won't use the bus) (a2).
Finally, it allows a new cycle for a HALT, using T1-pre (a2).10
Next are the special cases that end a bus cycle.
The second AND gate (b) ends the bus cycle after T3 unless there is a wait state or another bus request.
(But a HOLD will block the next bus request.)
The remaining gates end the cycle after T0 to preempt a prefetch if a CORR or SUSP micro-instruction occurs (d), or end after T1 for a HALT (e).
The BUS-ACTIVE circuit above uses a complex gate, a 5-input NOR gate fed by 5 AND gates with two attached OR gates. Surprisingly, this is implemented in the processor as a single gate with 14 inputs.
Due to how gates are implemented with NMOS transistors, it is straightforward to implement this as a single gate.
The inverter and NOR gate on the left, however, needed to be implemented separately, as they involve inversion; an NMOS gate must have a single inversion.
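To see why such a complex gate still counts as a single gate, here is a generic sketch of an AND-OR-INVERT structure. In NMOS, series transistors form the AND terms and parallel paths form the OR, with the one inversion coming from the pull-down network as a whole. The function is my own illustration and doesn't use the actual inputs of the BUS-ACTIVE gate.

```python
def and_or_invert(*and_groups: tuple) -> bool:
    """NOR of several AND terms, with one inversion at the very end."""
    # Each group models transistors in series; the groups sit in parallel.
    # If any series path fully conducts, the output is pulled low.
    pulled_low = any(all(group) for group in and_groups)
    return not pulled_low


print(and_or_invert((1, 1), (0, 1)))        # False: first path conducts
print(and_or_invert((1, 0), (0, 1)))        # True: no path conducts
print(and_or_invert((1, 0 or 1), (0, 0)))   # False: an OR term inside an AND
```

Anything that needs a second inversion, such as an inverted input, can't be folded into the same pull-down network and requires a separate gate.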
The diagram above shows the layout of the bus state machine circuitry on the die, zooming in on the top region of the die. The metal layer has been removed to expose the underlying silicon and polysilicon. The layout of each flip-flop is completely different, since the layout of each transistor is optimized to its surroundings. (This is in contrast to later processors such as the 386, which used standard-cell layout.) Even though the state machine consists of just a handful of flip-flops and gates, it takes a noticeable area on the die due to the large 3.2 µm feature size of the 8088. (Modern processors have features measured in nanometers, not micrometers.)
Conclusions
The bus state machine is an example of how the 8088's design consists of complications on top of complications.
While the four-state bus cycle seems straightforward at first, it gets more complicated due to prefetching, wait states, the HALT instruction, and the bus hold feature, not to mention the interactions between these features.
While there were good motivations behind these features, they made the processor considerably more complicated.
Looking at the internals of the 8088 gives me a better understanding of why simple RISC processors became popular.
The bus state machine is a key part of the read and write circuitry, moving the bus operation through the necessary T-states. However, the state machine is not the only component in this process; a higher-level circuit decides when to perform a read, write, or prefetch, as well as breaking a 16-bit operation into two 8-bit operations.11 These circuits work together with the higher-level circuit telling the state machine when to go through the states.
In my next blog post, I'll describe the higher-level memory circuit so follow me on Twitter @kenshirriff or RSS for updates. I'm also on Mastodon as oldbytes.space@kenshirriff. If you're interested in the 8086, I wrote about the 8086 die, its die shrink process, and the 8086 registers earlier.
Notes and references
-
The 8085 and 8088 processors both use a 4-step bus cycle for instruction fetching. For other reads and writes, the 8085's bus cycle has three steps compared to four for the 8088. Thus, the 8085 and 8088 bus cycles are similar but not an exact match. ↩
-
The 8088 has separate instructions to read or write an I/O device. From the bus perspective, there's no difference between an I/O operation and a memory operation except that a pin on the chip indicates if the operation is for memory or I/O.
The 8088 supports I/O operations for historical reasons, going back through the 8086, 8080, 8008, and the Datapoint 2200 system. In contrast, many other contemporary processors such as the 6502 used memory-mapped I/O, using standard memory accesses for I/O devices.
The 8086 has a pin M/IO that is high for a memory access and low for an I/O access. External hardware uses this pin to determine how to handle the request. Confusingly, the pin's function is inverted on the 8088, providing IO/M. One motivation behind the 8088's 8-bit bus was to allow reuse of peripherals from the earlier 8-bit 8085 processor. Thus, the pin's function was inverted so it matched the 8085. (The pin is only available when the 8086/8088 is used in "minimum mode"; "maximum mode" remaps some of the pins, making the system more complicated but providing more control.) ↩
-
I've made the timing diagram somewhat idealized so actions line up with the clock. In the real datasheet, all the signals are skewed by various amounts so the timing is more complicated. See the datasheet for pages of timing constraints on exactly when signals can change. ↩
-
For more information on the implementation of the address adder, see my previous blog post. ↩
-
The POP operation is an example of how the address adder updates a memory pointer. In this case, the stack address is moved from the Stack Pointer to the IND register in order to perform the memory read. As part of the read operation, the IND register is incremented by 2. The address is then moved from the IND register to the Stack Pointer. Thus, the address adder not only performs the segment arithmetic, but also computes the new value for the SP register.
Note that the increment/decrement of the IND register happens after the memory operation. For stack operations, the SP must be decremented before a PUSH and incremented after a POP. The adder cannot perform a predecrement, so the PUSH instruction uses the ALU (Arithmetic/Logic Unit) to perform the decrement. ↩
-
During the CORR micro-instruction, the Bus Interface Unit performs special TS and T0 states. Note that these states don't have any external effect, so they are invisible outside the processor. ↩
-
The tradeoff with memory boards was that slower RAM chips were cheaper. The better RAM boards advertised "no wait states", but cheaper boards would add one or more wait states to every access, reducing performance. ↩
-
Only the second half of the TS state has an effect on the Bus Interface Unit, so TS is not a full state like the other states. Specifically, a delayed TS signal is taken from the first half of the T0 flip-flop, and this signal is used to control various actions in the Bus Interface Unit. (Alternatively, you could think of this as an early T0 state.) This is why there isn't a separate flip-flop for the TS state. I suspect this is due to timing issues; by the time the TS state is generated by the logic, there isn't enough time to do anything with the state in that half clock cycle, due to propagation delays. ↩
-
There is a bit more circuitry for the T1 state for a HALT. Specifically, there is a flip-flop that is set on this signal. On the next cycle, this flip-flop both blocks the generation of another T1 state and blocks the previous T1 state from progressing to T2. In other words, this flip-flop makes sure the special T1 lasts for one cycle. However, a HOLD state resets this flip-flop. That allows another special T1 to be generated when the HOLD ends. ↩
-
The trickiest part of this circuit is using T1-pre to start a (short) cycle for HALT. The way it works is that the T1-pre signal only makes a difference if there isn't a bus cycle already active. The only way to get an "unexpected" T1-pre signal is if the state machine generates it for the first cycle of a HALT. Thus, the HALT triggers T1-pre and thus the bus-active signal. You might wonder why the bus-active uses this roundabout technique rather than getting triggered directly by HALT. The motivation is that the special T1 state for HALT requires the AND of three signals to ensure that the state is generated once for the HALT rather than continuously, but happens again after a HOLD, and waits until the current bus cycle is done. Instead of duplicating that AND gate, the circuit uses T1-pre which incorporates that logic. (This took me a long time to figure out.) ↩
-
The 8086 has a 16-bit bus, compared to the 8088's 8-bit bus. Thus, a 16-bit bus operation on the 8088 will always require two 8-bit operations, while the 8086 can usually perform this operation in a single step. However, a 16-bit bus operation on the 8086 will still need to be broken into two 8-bit operations if the address is unaligned (i.e. odd). ↩
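The rule in this note can be written as a short function. This is just a sketch of the counting rule for an 8-bit bus (8088) versus a 16-bit bus (8086); the function name and the `bus_width_bits` parameter are my own.

```python
def bus_operations_for_word(address: int, bus_width_bits: int) -> int:
    """Number of bus operations needed to transfer one 16-bit word."""
    if bus_width_bits == 8:
        # 8-bit bus (8088): always two byte-wide transfers.
        return 2
    if bus_width_bits == 16:
        # 16-bit bus (8086): one transfer unless the address is odd.
        return 1 if address % 2 == 0 else 2
    raise ValueError("bus width must be 8 or 16")


print(bus_operations_for_word(0x1000, 8))   # 2
print(bus_operations_for_word(0x1000, 16))  # 1
print(bus_operations_for_word(0x1001, 16))  # 2
```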
Subject: Inside a vintage aerospace navigation computer of uncertain purpose
I recently obtained an aerospace computer from the early 1970s, apparently part of a navigation system. Aerospace computers are an interesting but mostly neglected area of computer hardware, so I'm always delighted to examine one up close. In an era when most computers were large mainframes, aerospace computers packed dense electronics into a small package, using technologies such as surface-mounted components and multi-layer printed circuit boards, technologies that wouldn't reach the mainstream for another decade. This blog post examines the circuitry and components inside this computer, including an unusual electromechanical display. Although I was unable to determine who manufactured this system or even its exact function, this system illustrates how hundreds of integrated circuits and a core memory stack can be crammed into a compact package.
The keyboard
The device has a simple numeric keyboard with a few unexpected features. The numeric keypad can also be used for direction entry, as four of the keys have N, S, E, and W on them. The keys are large, roughly the size of the Apollo spacecraft's DSKY buttons. My theory is that these buttons are designed for operation with gloves, perhaps in a fighter plane where the pilot wears a pressure suit. The buttons are hinged at the top, so they don't push straight in, but pivot when pressed.
Numeric keypads typically use one of two layouts: a telephone-style keypad has the digits 123 at the top, while a calculator-style keypad has the digits 789 at the top. Interestingly, this device uses a calculator layout, while most aviation devices have a telephone layout. The Apollo DSKY also used a calculator layout, which could be a hint at a NASA connection for this device.
Above the keyboard are four codes for self-test: N4576, E9384, S9021, and W4830. Entering these codes on the keyboard presumably triggered the appropriate test of the system when the switch is in test mode.
The display
The computer's display is simple, showing a latitude and longitude. Each value has one decimal position, providing 0.1° of accuracy. The latitude and longitude are prefixed with a compass direction: North/South for latitude and East/West for longitude.
The display is constructed from an unusual type of electromechanical indicator, with an indicator module for each digit. Each digit position has a rotating wheel with 11 positions (ten digits and a blank). When the indicator module for a position is energized, the wheel spins to the specified position, showing the selected digit. The two leftmost indicators are slightly different as they show a compass direction instead of a digit: N, S, E, or W. Moreover, the direction indicators can also show the compass direction with a diagonal slash through it, as seen above. Perhaps the slashed direction indicates a problem with the value.
The diagram below shows how a digit indicator operates. Each digit position has an electromagnet with a wire to energize it. The dial wheel has an attached permanent magnet (indicated by N and S). Energizing one of the electromagnets causes the dial to spin to that position, aligning the permanent magnet on the dial with the electromagnet. This mechanism forms a reliable indicator with just one moving part. The displayed digit is clearer than a seven-segment display since the digit uses a real font rather than being created from segments.
Looking at the back of the keyboard/display unit shows the wiring of the display indicators. Each indicator has a common connection and ten wires to energize one of the electromagnets.1 The electromagnets are connected in a matrix, with all the "1" wires connected, the "2" wires connected, and so forth. To rotate an indicator to a particular digit, a common wire and an electromagnet wire are energized. For instance, powering the common wire of the second indicator and the "5" electromagnet wire causes the second indicator to rotate to the "5" position. The wiring has a three-dimensional structure with ten bare wires running between the boards, one for each digit value. A yellow wire hangs off each bare wire, linking it to the connector on the left. Each indicator has ten diodes on a circuit board to block "sneak" paths that would energize unselected electromagnets.
This matrix circuit reduces the amount of wiring required: although there are 100 electromagnets in total, just 20 wires are sufficient to control them. The driver circuitry, however, is a bit more complex as it must scan through the ten digit positions, activating the right pair of driver wires at the right time. Some of the logic circuitry described below must implement this scanning, as well as the driver circuitry to energize the indicators.
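To make the matrix addressing concrete, here is a minimal sketch in Python. The wire names and the scan function are purely illustrative (I don't know how the real driver circuitry sequences the indicators), and it treats every indicator as a ten-position wheel.

```python
# Hypothetical model of the 10 x 10 indicator matrix; all names are illustrative.
NUM_INDICATORS = 10   # one "common" wire per indicator
NUM_VALUES = 10       # one shared wire per digit value ("0" through "9")

def wires_for(indicator: int, digit: int) -> tuple[str, str]:
    """Return the pair of wires to energize to set one indicator to one digit.

    Indicators are numbered from 1, matching "the second indicator" in the text.
    """
    if not 1 <= indicator <= NUM_INDICATORS:
        raise ValueError("no such indicator")
    if not 0 <= digit < NUM_VALUES:
        raise ValueError("no such digit")
    return (f"common-{indicator}", f"digit-{digit}")

def scan(display: list[int]) -> list[tuple[str, str]]:
    """Step through the indicators one at a time, as a scanning driver might."""
    return [wires_for(pos, digit) for pos, digit in enumerate(display, start=1)]

total_electromagnets = NUM_INDICATORS * NUM_VALUES   # 100 electromagnets
total_wires = NUM_INDICATORS + NUM_VALUES            # but only 20 wires

# The example from the text: second indicator, digit 5.
print(wires_for(2, 5))
```

The saving comes from addressing each electromagnet by a (row, column) pair rather than giving it a dedicated wire: the wire count grows as the sum of the two dimensions rather than their product.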
The display and keyboard have many similarities to the Delco Carousel Inertial Navigation System (INS) shown below. (The Delco Carousel was used in many military and civilian aircraft, from the C-141 cargo plane to the Boeing 747 passenger plane.) Both devices have two digital displays, one for latitude North/South and one for longitude East/West. Also note the numeric keypads with four keys assigned to the four compass directions. The controls of the Carousel INS system are considerably more complicated, though. The Carousel has a knob position "TK/GS" (track/ground speed), which may correspond to the "T/G" position on my device.
Note that the display on my unit has just four digits of accuracy, with one digit after the decimal point. A tenth of a degree would provide an accuracy of about ±7 miles, which is low for a navigation device. In comparison, the Delco Carousel has six digits of accuracy (± 100 feet perhaps). This suggests that the device does not provide INS navigation, but some other guidance with lower accuracy.
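The seven-mile figure can be checked with a quick calculation. This sketch assumes a spherical Earth with a mean radius of 6371 km (my assumption, not a number from the device) and converts a 0.1° step in latitude to statute miles.

```python
import math

# Assumed mean Earth radius; a spherical Earth is close enough for this estimate.
EARTH_RADIUS_KM = 6371.0
KM_PER_STATUTE_MILE = 1.609344

def latitude_step_miles(step_degrees: float) -> float:
    """North-south distance covered by a change of step_degrees in latitude."""
    km = math.radians(step_degrees) * EARTH_RADIUS_KM
    return km / KM_PER_STATUTE_MILE

print(f"{latitude_step_miles(0.1):.1f} miles per 0.1 degree of latitude")
```

This comes out to just under 7 miles. A 0.1° step in longitude covers the same distance only at the equator; it shrinks with the cosine of the latitude.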
Packaging the electronics
The unit contains 14 circuit boards, crammed with TTL integrated circuits, along with a core memory stack. The photo below shows how circuit boards surround the core memory stack. The mechanical design of the unit is advanced, allowing the boards to be opened up like a book. This provides compact packaging while allowing access to the boards.
The circuit boards are four-layer printed circuit boards, more advanced than the common two-layer boards of the time. The boards use a mixture of surface-mounted and through-hole components. The flat-pack ICs and the tiny round transistors are surface mounted, which was rare at the time. On the other hand, the resistors, capacitors, diodes, and larger transistors use standard through-hole components. At the time, most electronics used through-hole components, although aerospace systems often used surface-mounted components for higher density. It wasn't until the late 1980s that surface-mount technology became commonplace.
The boards are mounted in solid metal frames, providing both structural integrity and heat conduction for cooling. Most of the frames hold two boards, mounted back-to-back for higher density.
The logic boards
Four of the circuit boards are logic boards, packed with flat-pack integrated circuits. The board below holds 55 integrated circuits, showing the high density that is possible with flat packs.
The logic ICs are Signetics 400-series chips, an early type of TTL (Transistor-Transistor Logic) chip. Just three types of these ICs are used: SE440J "Dual exclusive OR" (really AND-OR-INVERT but XOR if provided with particular inputs), SE455J "Dual 4-input buffer/driver" (4-input NAND or NOR gates depending on polarity), and SE480J "Quad 2-input NAND/NOR". These integrated circuits cost $15.45 each in 1966 (about $150 each in current dollars).2
The schematic below shows the circuit that implements AND-OR-INVERT (or exclusive or) in the SE440J. The multiple-emitter transistors on the inputs may appear unusual, but this is the standard way to implement TTL gates. It is important to note that this chip only contains 12 transistors, so the density is low. (Since the chip contains two of these gates, this circuit is duplicated.) In the mid-1960s, integrated circuits only contained a few transistors—the Apollo Guidance Computer's ICs had just 6 transistors—but by the time this unit was built in the early 1970s, some chips had thousands of transistors, tracking Moore's Law. Thus, this unit both illustrates how aviation computers could be built from simple integrated circuits and how the dramatic improvements in IC technology rapidly obsoleted these computers.
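To see why an AND-OR-INVERT gate can act as an exclusive-or, here is a short Python check. It models the gate as NOT((a AND b) OR (c AND d)) and assumes the second AND term is fed the complements of the first term's inputs; whether the SE440J is wired exactly this way in this unit is my assumption.

```python
from itertools import product

def and_or_invert(a: int, b: int, c: int, d: int) -> int:
    """AND-OR-INVERT on 0/1 values: NOT((a AND b) OR (c AND d))."""
    return 1 - ((a & b) | (c & d))

def xor_from_aoi(x: int, y: int) -> int:
    """Feed the second AND term the complements of x and y to get XOR."""
    return and_or_invert(x, y, 1 - x, 1 - y)

# Check all four input combinations against Python's built-in XOR.
for x, y in product((0, 1), repeat=2):
    assert xor_from_aoi(x, y) == x ^ y
    print(x, y, xor_from_aoi(x, y))
```

The output is low when both inputs are high (first AND term) or both are low (second AND term), and high otherwise, which is exactly exclusive-or.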
The Signetics 400-series seems to have been obscure and short-lived, probably killed off by the wild success of 7400-series TTL chips. I was able to find only a few announcements and datasheets for these chips. The only users of these chips that I could find were NASA projects from the late 1960s.3 Signetics 400-series chips were used in the Mariner Mars and Venus probes, in the Data Automation Subsystem (DAS) (link, link). The Voyager Mars probes also used them. The SE455J gates were also used to interface the Apollo Guidance Computer to a core-rope simulator. JPL used the SE455J in a core memory system. NASA used the SE455J, SE480J, and other Signetics chips in its design for the MICROMIN computer. None of these systems appear to be related to the navigation system, but they illustrate that NASA was using these specific Signetics chips at the time in multiple designs.
The chips are labeled "CDC", raising the possibility that these chips were built by Control Data Corporation (CDC) under license from Signetics. The Aerospace Division of CDC was active at the time, building various compact computer systems. For instance, the CDC 480 computer (1976) was a 16-bit computer based on the Am2900 bit-slice chip. Also known as the AN/AYK-14, this system was used on numerous aircraft including the F-18. An earlier CDC aerospace computer is the AN/AWG-9 Airborne Missile Control System (1965), a 24-bit computer in a compact 1.1 cubic foot package. Used on the F-14 fighter plane, this computer guided the Phoenix air-to-air missile. Based on CDC's activity in aerospace computers at the time, the mystery computer could be a CDC system, although this hypothesis is based solely on integrated circuits labeled "CDC".
The photo below shows another logic board. This one has numerous red and white wires attached, linking it to the rest of the system. Curiously, this board has a single transistor, with two associated resistors, in the middle of the board.
Analog boards
The computer contains not only logic boards but also boards full of analog circuitry to interface with the core memory, keyboard, and display. The board below contains 17 of the logic ICs seen earlier. However, it also uses many resistors, capacitors (red cylinders), transistors (white circles), inductors (white banded cylinders), and glass diodes. The board also has some analog integrated circuits. In particular, it has three TI SN52709 op-amps, the smaller 10-pin packages. The board also contains some integrated circuits that I couldn't identify: UT1000, UT1027, UD4001, and D245F. The SM 60 ICs in white packages have a logo that I don't recognize. The op-amps could function as sense amplifiers for the core memory, or this board could provide other analog interfacing.
The board has multiple gray four-pin packages labeled "926D". Based on the + and - markings, these packages are probably bridge rectifiers, maybe providing power for the circuits. Many of the other boards have these rectifiers. The analog boards also contain a few Halex flat-pack devices labeled "HALEX 101205 727". Halex manufactured thin-film resistors in flat packs, so these are probably resistor networks. NASA used Halex resistor networks in some devices (link).4
The analog board shown below sits next to the core memory stack. It uses a different set of flat-pack components: Signetics C8930G and PL 98321. Unfortunately, I could not identify these ICs. This board, unlike the previous boards, has a copper ground plane in the second layer of the circuit board; this layer is visible in the photo as the copper-colored background occupying most of the board.
Core memory
The unit is built around a core memory stack, as was common in the era before semiconductor memory took over. Magnetic core memory consists of a grid of tiny ferrite cores with wires threaded through them, forming a core plane. Typically, a core memory unit consists of multiple planes, one for each bit in the word, stacked to form a three-dimensional block of memory.
The photo below shows a closeup of the stack. It appears to have 20 planes, suggesting a 20-bit processor. Soldered wires connect the planes together to provide continuous wiring through the stack. The soldering on these wires looks somewhat haphazard, suggesting that this was not a production unit.
The photo below shows the other side of the core memory stack, with similar wiring between the planes. At the right are a few layers of a different type, connected with 26 wires. The tape measure shows that the core memory stack is compact, about 6 cm on a side (2¼").
Some of the boards are drivers for the core memory stack. The board below has 48 small round transistors, colored either blue or red. Note the green, white, and yellow wires in the lower right, mostly hidden under the brown ground ribbon. These wires are connected to the core memory stack.
The board below also has numerous wires to the core stack, underneath the brown ground ribbon, so it is presumably another driver board. This board has some round driver transistors with yellow dots. Curiously, in the upper left there are a few circuit board pads where transistors could be mounted but are missing. Perhaps with the additional components the board would support a system with more of something: a larger keyboard? more memory?
Looking at the back of the unit, you can see the display indicator wiring at the top and a circuit board at the bottom. This board contains 20 transistors in metal cans, specifically Motorola 2N3736 NPN transistors. The core memory stack has 20 planes, matching the 20 transistors on this board, so the board probably implements the core memory "inhibit drivers", controlling the bit written to each plane. The board also has numerous tiny surface-mount transistors in white, red, and black packages. Close examination shows a few thin green "bodge" wires on this board, indicating that rework was performed on the board to fix a circuit problem, another piece of evidence that this unit is a prototype.
The core memory stack is enclosed by two sheet metal boxes, which I removed for the photos. The stack also has two flexible ground planes attached to it. The designers clearly wanted to ensure that the memory was well shielded, to a degree that I haven't seen in other systems.
Conclusions
Despite my research, this aerospace computer remains a mystery. I was unable to identify who manufactured it or even its exact function. One hypothesis is a NASA connection since NASA was extensively using these Signetics chips at the time. Moreover, this computer was obtained in the Houston area. Another hypothesis, based on the "CDC" label on the chips, is that this computer was built by Control Data's Aerospace Division. If you have any leads on this mysterious aviation computer, please contact me.
This system may have been a prototype. It has no part numbers, manufacturer name, or identifying plate.5 Moreover, the soldering on the core memory stack doesn't seem to be flight quality. Finally, the boards don't have conformal coating, which is typically used for spaceflight systems. However, the mechanical design looks advanced for a prototype, with dense boards that fold together like a book.
This unit clearly has a navigation role, but seems to be too inaccurate for an inertial navigation system (INS). It contains many integrated circuits, but not enough to form a full computer. I hypothesize that this unit contains the circuitry to drive the core memory and the display, and handle keyboard input. Looking at the underside of the unit (below), there are three connectors. I suspect these connectors were plugged into a larger box that held the computer itself.
The date codes on the integrated circuits range from 1966 to 1973, so the computer was probably manufactured in 1973. The seven-year range for date codes is a bit surprising, since integrated circuit technology changed a lot during these years. I suspect that the Signetics 400-series ICs had older date codes because this line didn't catch on so there was a lot of old stock rather than newly-manufactured parts. I also suspect that this system was designed around 1969, based on the multiple NASA systems using these chips then, suggesting that the design and manufacturing of this unit was a multi-year project.
Despite the lingering mysteries of this device, it provides an interesting example of aerospace computers at the beginning of the 1970s. Even though integrated circuits were primitive at the time, with just a few transistors per chip, aerospace computers used these chips and high-density packaging to build computers that were compact, reliable, and low power. These miniature computers controlled aircraft, missiles, and spacecraft, worlds away from the room-filling mainframes that attracted most of the attention.
Thanks to Usagi Electric for providing the aerospace computer. Eric Schlaepfer and Marc Verdiell helped with the analysis. Thanks to Don Straney for his research and comments. Various commenters on Reddit and Twitter provided suggestions. Follow me on Twitter @kenshirriff or RSS for updates. I'm also on Mastodon as oldbytes.space@kenshirriff.
Notes and references
-
The indicators have a blank position, so there are 11 electromagnets. However, only the ten electromagnets associated with digits are used in the device. The N/S/E/W indicators have a square box in one of the positions, which probably is not used. ↩
-
Signetics had multiple temperature ranges for the 400-series low-power ICs. The RE prefix indicated ultra high reliability aerospace components rated for a temperature range of -55°C to +125°C. The SE prefix on the chips in this unit indicated military airborne chips with the same temperature range. A NE or ST prefix indicated military prototype or industrial chips with a smaller temperature range (0°C to +70°C). A SP prefix indicated the commercial temperature rating, from +15°C to +55°C. A J suffix indicated a flat pack and an A suffix indicated a dual in-line pack (DIP). ↩
-
NASA computers are the only documented systems that I could find that used these Signetics chips. One possible conclusion is that NASA was the only organization to use these chips. However, it is likely that other companies used these chips but didn't document them as thoroughly as NASA. That is, detailed circuitry for military aerospace computers is unlikely to be on the Internet. ↩
-
Halex also made hybrid microcircuits, such as flip-flops, so these packages could be more complex than resistor networks. However, I think a resistor network is more likely. ↩
-
One of the circuit boards had the number "45333000" on it, along with a symbol like "+I-", as shown below.
Closeup of a circuit board showing a number, maybe identifying the board.
One board also had a mysterious symbol that resembles "mw". I couldn't match these symbols to any manufacturers, and it is unclear if they are logos, fiducials, or other symbols.
Closeup of a circuit board showing the "mw" mark.
Subject: Inside the tiny chip that powers Montreal subway tickets
To use the Montreal subway (the Métro), you tap a paper ticket against the turnstile and it opens. The ticket works through a system called NFC, but what's happening internally? How does the ticket work without a battery? How does it communicate with the turnstile? And how can it be so cheap that you can throw the ticket away after one use? To answer these questions, I opened up a ticket and examined the tiny chip inside.
The image below shows the chip inside the ticket, highly magnified. The four golden squares in the corner are the connections to the antenna. The tan-colored lines are the metal wiring layer on top of the chip; the thickest lines wire the antenna to other parts of the chip. The darker region that takes up the majority of the chip is the chip's digital logic. To the left is the analog circuitry that handles the signal from the antenna.
The chip uses NFC (Near-Field Communication). The idea behind NFC is that a reader (i.e. the turnstile) and an NFC tag (i.e. the ticket) communicate over a short distance through magnetic fields, allowing them to exchange data. The reader generates a magnetic field that both powers the tag and sends data to the tag. Both the reader and the tag have coil-like antennas so the reader's magnetic field can be picked up by the tag.1 When you tap your ticket on the turnstile, the NFC communication happens in 35 milliseconds, faster than an eyeblink. The data provided by the NFC tag shows that you have a valid ticket and then you can enter the subway.
The photo below shows the subway ticket, made of printed paper.2 At the right, the ticket appears to have golden smart-card contacts, like a credit card with an EMV chip. However, those contacts are completely fake, just printed onto the card with ink, and there is no chip there. Presumably, the makers thought that making the card look like a smart card would help people understand it. The card actually uses an entirely different technology.
Although the subway card is paper on the outside, its core is a thin plastic sheet, shown below. The sheet has a coiled antenna made from a layer of metal foil. If you look closely, you can see the tiny NFC chip in the lower left, a black speck connected to two sides of the antenna wire.3 The diagonal metal stripe in the upper left makes the antenna into a loop; topologically, a spiral antenna won't work on a 2-D sheet, so the diagonal bridge completes the circuit.
I want to emphasize the absurdly small size of the chip: 570 µm × 485 µm. The photo below shows that it is about the size of a grain of salt. The chip is also extremely thin—75 µm or 120 µm—so you can't even feel the chip inside the ticket.
Functions of the chip
There are many different types of NFC chips with varying levels of functionality.4 This one is called the MIFARE Ultralight EV1,5 a low-cost chip designed for one-time ticketing applications. The basic function of the Ultralight chip is simple: providing a block of data to the reader. The chip holds its data in a small EEPROM; this chip has 48 bytes of user memory, while another variant has 108 bytes of user memory.
The Ultralight chip lacks the cryptography support found in more advanced chips. The Ultralight isn't much more secure than a printed ticket with a QR code or barcode, like you'd download for a show. It's up to the reader to validate the data and make sure the same ticket isn't being used multiple times.6
The Ultralight chip has a few features beyond a printed ticket, though. The chips are manufactured with a unique 7-byte identification code (UID). Moreover, the UID is signed, ensuring that fake UIDs cannot be generated.7 The chip also supports password-protected memory access and locking of memory pages to prevent modification. Since the password is transmitted without encryption, the security is weak, but better than nothing.8
Another interesting feature of the chip is the one-way counter. The chip has three 24-bit counters that can be incremented but not decremented. The counters can be used to allow the ticket to be used a particular number of times, for instance.9
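The behavior of a one-way counter can be sketched in a few lines of Python. This is a hypothetical model, not the chip's actual logic: the class name, the error handling, and the idea of a ride limit enforced by the reader are all my assumptions.

```python
class OneWayCounter:
    """Hypothetical model of a 24-bit counter that can only count up."""

    MAX_VALUE = (1 << 24) - 1   # largest value a 24-bit counter can hold

    def __init__(self) -> None:
        self._value = 0

    def read(self) -> int:
        return self._value

    def increment(self, amount: int) -> None:
        """Add a non-negative amount; refuse a decrement or an overflow."""
        if amount < 0:
            raise ValueError("counter cannot be decremented")
        if self._value + amount > self.MAX_VALUE:
            raise OverflowError("counter would exceed 24 bits")
        self._value += amount


def use_ride(counter: OneWayCounter, rides_purchased: int) -> bool:
    """Assumed reader-side policy: allow entry until the counter hits the limit."""
    if counter.read() >= rides_purchased:
        return False
    counter.increment(1)
    return True


ticket = OneWayCounter()
print([use_ride(ticket, 2) for _ in range(3)])
```

Because there is no operation that lowers the value, a used-up ticket can't be "refilled" by rewriting the counter; the tag itself enforces the one-way property, while the policy of what the count means is left to the reader.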
Photographing the chip
To photograph the chip, I went through several steps to remove the chip from the ticket and then strip the chip down to the bare silicon. First, to extract the plastic sheet with the chip and the antenna from the paper ticket, I simply soaked the ticket in water. This turned the paper into mush, which could be scraped off to reveal the plastic core. Next, I cut out a small square of plastic that included the chip and put it in boiling sulfuric acid for about 30 seconds. This removed the plastic and adhesive, leaving the silicon die. (I try to avoid boiling acids, but processing a tiny chip like this only required a few drops of sulfuric acid, minimizing the risk.)
The die was covered with a passivation layer to protect its surface, a sandwich of silicon nitride and PSG (phosphosilicate glass) 1.1 µm thick according to the datasheet. The chip's underlying circuitry was visible, but slightly hazy due to this layer. I removed the passivation layer by boiling the chip in phosphoric acid for a few minutes. The image below shows the chip after this step. The top metal layer is much more visible, although some of the metal was dissolved by the acid. The thick metal lines connect the four bond pads to various parts of the analog circuitry, while many thin vertical metal lines provide interconnections of the logic circuitry.
Next, I put the die through several cycles of Armour Etch to dissolve the oxide layers and hydrochloric acid to dissolve the metal. I think the chip had three layers of metal wiring on top of the silicon. Unfortunately, my process doesn't remove the metal layers cleanly, but causes them to come off in chaotic tangles. Since I wasn't interested in tracing the circuitry layer-by-layer, this wasn't a significant problem.
With the metal layers and polysilicon removed, I was left with the bare silicon. At this point, the underlying structure of the chip is visible. The doped silicon regions show the transistors, although they are extremely small at this scale. The white rectangles are capacitors. The chip has capacitors for many reasons: producing the right resonant frequency with the antenna, filtering the power, and boosting the voltage with charge pumps.
My biggest concern while processing this chip was to avoid losing it. With a chip this small, bumping the chip or even breathing on it can send the chip flying, perhaps never to be seen again. Even trying to pick up the chip with tweezers is risky, since it can easily pop out and disappear. It's no fun examining the floor, inch by inch, trying to figure out if a speck is the lost chip or a bit of dirt. I found that the best way to move the chip between processing and a microscope slide was to put the chip in a few drops of water and move it with a pipette. Even so, there were a couple of times that I lost track of the chip and had to check some specks under the microscope to determine which was the chip and which were dirt.
Overview of the chip
The block diagram below shows the high-level structure of the chip. At the left, the antenna is connected to the RF interface, the analog circuitry that converts the high-frequency signals into digital data. This circuitry also extracts power from the antenna's signal to power the chip.
The majority of the chip contains digital logic to process the 18 different commands that it can receive from the reader. Some commands, such as Wake-up or Halt, control the chip's state. Other commands, such as Read or Write, provide access to the EEPROM storage. The specialized Read_Cnt and Incr_Cnt commands access the chip's counters.
The chip has an "intelligent anticollision function" that allows multiple cards to be read without conflict if they are presented to the reader simultaneously. If a conflict is detected, the reader uses a standard NFC algorithm to select the cards one at a time, based on their identification numbers. The anticollision algorithm uses four of the chip's commands.
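The general idea of selecting cards one at a time by their identification numbers can be illustrated with a simplified tree search. This Python sketch is only in the spirit of bit-collision anticollision schemes; it does not model the real command frames, and the function name and the use of bit strings for UIDs are my own choices for illustration.

```python
def select_tags(uids: list[str]) -> list[str]:
    """Resolve a set of distinct, equal-length bit-string UIDs one at a time.

    Simplified model: the reader broadcasts a known prefix, every tag whose
    UID starts with that prefix answers with its whole UID, and the reader
    sees a collision at the first bit position where the answers differ.
    """
    selected = []
    remaining = set(uids)
    while remaining:
        prefix = ""
        while True:
            answers = [u for u in remaining if u.startswith(prefix)]
            if len(answers) == 1:
                break   # only one tag answered, so it can be selected
            # First bit position where the answering tags disagree.
            pos = next(i for i in range(len(prefix), len(answers[0]))
                       if len({u[i] for u in answers}) > 1)
            # Keep the bits everyone agreed on, then follow the '0' branch.
            prefix = answers[0][:pos] + "0"
        selected.append(answers[0])
        remaining.remove(answers[0])   # that tag is done; repeat for the rest
    return selected


print(select_tags(["1010", "1001", "0111"]))
```

Each collision narrows the set of answering tags by at least one, so the search always ends with a single tag, which is then handled and set aside before the reader starts over for the remaining tags.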
Finally, the chip has an EEPROM to store its data. Unlike RAM, the EEPROM holds data even when unpowered; it is designed to hold data for 10 years. To store data in the EEPROM, it must be written with a higher voltage than the rest of the chip uses. The EEPROM interface circuit produces the necessary signals.
The diagram shows the chip with its functional blocks labeled. The majority of the die is occupied with digital logic; I'll explain below how it is implemented with standard-cell logic. At the top is the EEPROM, a square of storage cells. To the right of the EEPROM is a charge pump, a circuit to boost the voltage through switched capacitors. The EEPROM interface circuitry is between the EEPROM and the digital logic.
The remainder of the chip contains analog circuitry that is harder to interpret, so my labels are somewhat speculative. The four bond pads are where the antenna is connected to the chip. There are four pads to support two parallel antennas if desired. The first die photo shows the metal wiring between the bond pads and the structures that I've labeled as RF transistors and RF diodes. The "RF transistors" in the upper left are large, oval-shaped structures. These may be the transistors that send data back to the reader by modifying the load. Alternatively, they could be Zener diodes to regulate the voltage powering the chip, since Zener diodes often have an oval shape. The "RF diodes" at the bottom may rectify the signal from the antenna, producing the power for the chip. The rectified signal is also demodulated and processed by the analog logic to extract the digital data sent from the reader.
Sending data from the tag to the reader: load modulation
You might expect the tag to send data back to the reader by transmitting a signal through the antenna. However, transmitting a signal takes power and the tag doesn't have much power available, just the power that it extracts from the reader's signal. Instead, the tag uses a clever technique called load modulation to send data to the reader. The idea is that if the tag changes the load across the antenna, it will absorb more or less energy from the reader. The reader can detect this change as a small variation in voltage across its transmitting antenna. Thus, the tag can dynamically change its load to send data back to the reader. Even though the signal produced by load modulation is extremely weak (80 dB less than the transmitted signal), the reader can detect it and extract the data.
In more detail, the reader transmits at a carrier frequency of 13.56 MHz.10 To send data back, the tag switches its load on and off at 848 kHz (1/16 of the carrier frequency), producing a subcarrier on top of the reader's signal. To transmit bits, this load modulation is switched on or off to transmit 106 kilobits per second (1/8 of the modulation frequency). The reader, in turn, extracts the subcarrier with a filter to receive the data bits from the tag.
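The relationship between these frequencies is simple division, as this short Python calculation shows. The variable names are just for illustration.

```python
CARRIER_HZ = 13_560_000            # reader's carrier frequency

subcarrier_hz = CARRIER_HZ / 16    # the load is switched on and off at this rate
bit_rate = subcarrier_hz / 8       # eight subcarrier cycles per bit

carrier_cycles_per_bit = CARRIER_HZ / bit_rate

print(f"subcarrier: {subcarrier_hz / 1e3:.1f} kHz")
print(f"bit rate:   {bit_rate / 1e3:.1f} kbit/s")
print(f"carrier cycles per bit: {carrier_cycles_per_bit:.0f}")
```

The exact values are 847.5 kHz and roughly 105.9 kbit/s, which are normally rounded to 848 kHz and 106 kbit/s. Each bit spans 128 cycles of the carrier, so all the timing can be derived from the reader's signal by dividing down by powers of two.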
An NFC tag can apply a load that is either a resistor or a capacitor; a resistor absorbs the signal directly, while a capacitor changes the antenna's resonant frequency and thus the amount of signal transferred to the tag. The die contains many capacitors, but I didn't see any significant resistors, so I suspect that this chip uses a capacitor for the load.
The chip's manufacturing process
The image below shows an extreme closeup of the die. The red box surrounds a region of doped silicon, forming five MOS transistors in series. Each dark vertical line corresponds to the gate of one transistor so the width of this line corresponds to the feature size. I estimate that the chip's feature size is 180 nm. In comparison, the wavelength of visible light is 400-700 nm. Since the features are smaller than the wavelength of light, it's not surprising that the image appears blurry.
The 180 nm process was popular in the late 1990s. These features are very large, however, compared to recent chips with features that are a few nanometers across. At the time the MIFARE Ultralight EV1 chip was released (October 2012), the newest semiconductor manufacturing process was 22 nm, so the 180 nm process they used was old even then.
However, it makes sense that the chip would be manufactured with an older process for several reasons. First, much of the chip's area is occupied by analog circuitry and the four bond pads, so shrinking the digital logic won't reduce the overall size much. Moreover, a significantly smaller chip would be impractical to attach to the antenna; I expect even the current chip is a pain to mount. Finally, this chip is designed for the extremely low-cost (i.e. disposable) market, so the chip is manufactured as inexpensively as possible. With a more modern process, more chips would fit on a wafer, dropping the price, but manufacturing each wafer would be more expensive, so there is a tradeoff.
Standard-cell logic
The chip's digital circuitry is implemented with standard-cell logic, a common way of implementing digital logic. The idea behind standard-cell logic is to use automated tools to create the chip layout from a description of the desired logic. The process starts with a library of standard cells. Each cell is a standardized implementation of a simple circuit such as a NAND gate or a flip-flop. The cells are designed so they have a fixed height and can be arranged in rows. The cells are then connected by metal wiring on top of the cells to produce the desired circuitry. Although the resulting circuitry isn't as dense and efficient as a fully customized and optimized layout, standard cell logic is much faster (and thus cheaper) to design than a hand-tuned layout. Thus, standard-cell logic has been heavily used for integrated circuit design since the 1980s.
The photo below shows four rows of gates implemented with standard cell logic. The chip (like most modern chips) uses CMOS logic, with each logic gate built from two types of transistors: NMOS and PMOS. To simplify manufacturing, the NMOS and PMOS transistors are arranged in separate rows. Thus, each row of logic consists of a row of PMOS transistors on top and a row of NMOS transistors below, or vice versa. Due to the physics of semiconductors, the PMOS transistors are larger, which allows the transistor types to be distinguished in the image.
Looking at some of the cells and extrapolating, I estimate about 8000 gates in the logic section with about 45,000 transistors. One question is if the chip is implemented as a hardcoded state machine, or if it contains a processor (microcontroller). The transistor count is barely large enough to implement a simple microcontroller such as an 8051, but that wouldn't leave many transistors left over for other necessary circuitry. If a microcontroller were present, it would need software stored somewhere. Given the simplicity of the protocol and the relatively small number of transistors, my guess is that the chip is implemented in hardware (state machines and counters) rather than through a microcontroller.
The diagram below shows how a standard cell implements a 2-input NAND. (This cell is from the Intel 386, not the NFC chip, but the structures are similar.) The cell contains four transistors. The yellow region is the P-type silicon that forms two PMOS transistors; the transistor gates are where the polysilicon (red) crosses the yellow region. (The middle yellow region is the drain for both transistors; there is no discrete boundary between the transistors.) Likewise, the two NMOS transistors are at the bottom, where the polysilicon (red) crosses the active silicon (green). The blue lines indicate the metal wiring for the cell. The black circles are contacts, connections between the metal and the silicon or polysilicon. Finally, the well taps are the opposite type of silicon, connected to the underlying silicon well or substrate to keep it at the proper voltage.
EEPROM
The chip stores its data in an EEPROM, similar to flash memory. The chip provides 640 or 1312 bits of EEPROM, based on the part number; I believe both versions use the same EEPROM implementation, but the cheaper version limits the amount that can be used. I think the EEPROM is the matrix shown below, with row and column drive circuitry to the right and below. (The diagonal lines are accidental scratches while I was processing the chip.)
In the photo, the EEPROM appears to be a 64×64 grid, 4K bits of storage rather than the advertised 1312 bits. There are several possible explanations. First, I could be miscounting the capacity (it is easy to be off by a factor of 2, depending on the cell structure). Second, the chip stores data that isn't reflected in the EEPROM memory map; for instance, the one-way counters and the UID signature are not included in the EEPROM storage count. Another possibility is that the extra EEPROM space holds code for a microcontroller (if the chip has one).
An EEPROM requires a relatively high voltage (10-20V) to force electrons into the storage cell for a bit. This voltage is generated by a charge pump circuit that switches capacitors at high frequency to boost the voltage. To the right of the EEPROM is a circuit with several large capacitors, presumably the charge pump.
Conclusions
It's remarkable that these NFC chips can be manufactured so cheaply that they are disposable. To keep the price down, the chips are sold by the wafer and then mounted in the tickets.11 You can buy an eight-inch silicon wafer with the chips for $9000 from Digikey. This may seem expensive until you realize that a single wafer provides an astonishing 100,587 chips, yielding a per-chip price of nine cents. According to the datasheet, a wafer has 103,682 potential good dies per wafer (PGDW). Some dies will be faulty, of course, so the wafer comes with a file telling you which dies are the good ones, 97% of them. (During the manufacturing of a typical chip, the faulty ones are marked with a spot of ink. But that won't work in this case since each die is much smaller than an ink spot.) If you need more chips, you can buy a 12" wafer for $19,000, providing 215,712 chips. A ticket manufacturer mounts each chip on an antenna sheet and then prints the ticket, adding a few cents to the cost of the ticket. The result is an inexpensive ticket that can be used once and discarded.
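The per-chip arithmetic above can be checked with a few lines of Python. This is only a sketch: the prices and die counts are the figures quoted in the text, and the function names are made up for illustration.

```python
# Figures quoted in the text; everything else is derived from them.
WAFER_8IN_PRICE_USD = 9_000
WAFER_8IN_GOOD_DIES = 100_587
WAFER_8IN_POTENTIAL_DIES = 103_682  # potential good dies per wafer (PGDW)
WAFER_12IN_PRICE_USD = 19_000
WAFER_12IN_GOOD_DIES = 215_712


def cents_per_chip(price_usd, good_dies):
    """Wafer price spread evenly over the good dies, in cents."""
    return 100 * price_usd / good_dies


def yield_fraction(good_dies, potential_dies):
    """Fraction of the potential dies that are good."""
    return good_dies / potential_dies


print(f"8-inch wafer:  {cents_per_chip(WAFER_8IN_PRICE_USD, WAFER_8IN_GOOD_DIES):.2f} cents per chip")
print(f"12-inch wafer: {cents_per_chip(WAFER_12IN_PRICE_USD, WAFER_12IN_GOOD_DIES):.2f} cents per chip")
print(f"8-inch yield:  {yield_fraction(WAFER_8IN_GOOD_DIES, WAFER_8IN_POTENTIAL_DIES):.1%}")
```

Both wafer sizes come out at a bit under nine cents per chip, so the larger wafer mostly buys volume rather than a lower unit price.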
I'll leave you with one last die photo. In my first attempt at processing the chip, I treated it with Armour Etch. Although this failed to remove the passivation layer, it thinned it slightly, enough to generate some wild colors due to thin-film interference. I call this the "tie die".
Follow me on Twitter @kenshirriff or RSS for more. I'm also on Mastodon as @kenshirriff@oldbytes.space. If you're interested in this type of chip, a few years ago, I looked at two RFID race timing chips, the Monza R4 and Monza R6.
Notes and references
-
Because the card and the reader are positioned close together, the two antennas use "inductive coupling", coupled by magnetic fields rather than radio waves. That is, the two antennas act like transformer windings, transmitting the signal from the reader to the card. ↩
-
The Montreal subway uses multiple types of cards. In this blog post, I examine the Occasional card (L'Occasionnelle). This is a non-rechargeable card that works for a single trip or up to three days, and then is discarded. For long-term usage, Montreal uses the Opus card, which provides more security and implements the Calypso standard. An Opus card is plastic rather than paper, giving it a longer life. The Calypso standard is much more secure, using cryptography such as AES, DES, and ECC (spec) and provides much larger EEPROM storage. Thus, the transit system uses the Occasional card for cheap, disposable tickets and the Opus card for a long-term ticket, where spending a dollar or two on the physical card isn't an issue.
I haven't examined an Opus card, so I don't know what type of chip it uses or even who manufactures the chip. Many companies produce Calypso cards, for instance, the STMicroelectronics CD21 Calypso chip is based on an Arm core. ↩
-
If you look closely at the lower right corner of the NFC card, it has three positions that can hold a chip, with the chip in position #3. Presumably, this allows three different NFC chips to be mounted in one card, so one card could have three functions. The NFC protocol is designed to avoid collisions if multiple chips respond, so the three chips won't interfere with each other. ↩
-
You can easily examine NFC cards like this using your phone, with an app such as NFC Tools or NXP's Taginfo. Tapping a card will display the type of the card and allow the memory to be read (subject to security restrictions). It's entertaining to tap various NFC cards and see what type of chip they use; I found that hotels typically use the MIFARE Classic chip, more advanced than the MIFARE Ultralight chip in the subway ticket.
The NFC Tools app shows that this card is a MIFARE Ultralight EV1.
-
The part number, as provided by the chip, is MF0UL1101DUx. "MF0UL" indicates the MIFARE Ultralight EV1, a chip in the Ultralight family manufactured by NXP. An "H" if present indicates 50 pF input capacitance, rather than 17 pF in the chip I examined, allowing a different antenna. Next, "1" indicates a chip with 384 bits of user memory, while "2" would indicate 1024 bits. This is followed by "101D", and then a code indicating the specific package: "U" indicates a wafer, while "A" indicates a plastic leadless module carrier (LCC). Other characters specify the wafer diameter and thickness. ↩
-
It is instructive to think about the security of a printed ticket for a concert with a barcode. You could print out a hundred copies of the ticket, but it will only get you into the concert once. (This assumes that the venue has a centralized database so they can keep track of which tickets have been scanned.) Most of the security is implemented in the backend system, not the ticket itself. The ticket numbers need to be unforgeable, either by generating random numbers or using cryptography. (If the tickets just have QR codes with the numbers 1 to 100, for instance, it would be trivial to make fake tickets.) Moreover, there is nothing to ensure that the person scanning the ticket is legitimate; someone malicious could scan your ticket in line, print out a copy, and get into the concert instead of you. The MIFARE Ultralight chip is similar to a paper ticket in many ways with only slightly more security. ↩
-
The UID signing is done with an ECC (elliptic-curve cryptography) algorithm. Note that the chip doesn't need any cryptographic support for this; the chip just holds the signature that was programmed during manufacturing. As far as the chip is concerned, it is just providing some stored bytes. ↩
-
The MIFARE Ultralight has enough security to work as a limited-use ticket, but more advanced applications such as reloadable stored-value cards require a chip that supports encryption such as the DESFire. This allows the market to be partitioned, with the inexpensive Ultralight supporting the low-end market, while the more costly DESFire is required for more advanced applications.
There are many types of MIFARE cards and it's hard to keep them straight, but the diagram below from NXP may help. The different families are arranged left to right: Ultralight, Classic, Plus, DESFire, and SmartMX. The Y dimension indicates the official security certification level. The Z dimension (front to back) shows the evolution within a family over time. I've added a red arrow to indicate the "Ultralight EV1" chip, the focus of this blog post. (Personally, I think that if you need a three-dimensional diagram to explain your product line, the product line may be excessively complicated.)
The various MIFARE NFC types. Diagram from MIFARE Plus Product Family.
-
In more detail, a 3-byte counter can be incremented by a specified value until it reaches the all-1's state (0xFFFFFF), at which point it stops. If you wanted to allow, say, 5 uses of a ticket, you could initialize the counter to all-1's minus 5. Then the counter could be incremented 5 times before reaching the limit.
One complication is that the counters have an "anti-tearing" feature for additional security. The problem is that if you tear the card away from the reader in the middle of an update, there is a possibility for counters to be partially updated, yielding a bad result. The anti-tearing feature ensures that a counter will be atomically updated, avoiding a partial update. ↩
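The counting scheme described in this note can be sketched as a small model. This is an illustration only, not the chip's implementation: the class and method names are hypothetical, anti-tearing is not modeled, and rejecting an increment that would overshoot the limit is an assumption rather than something stated in the text.

```python
COUNTER_MAX = 0xFFFFFF  # all-1's state of a 3-byte counter


class OneWayCounter:
    """Toy model of a counter that can only count up to a fixed limit."""

    def __init__(self, initial=0):
        if not 0 <= initial <= COUNTER_MAX:
            raise ValueError("initial value must fit in 3 bytes")
        self.value = initial

    def increment(self, amount=1):
        """Add amount; return False (leaving the counter unchanged) if the
        result would pass the limit. This overflow policy is an assumption."""
        if amount < 0 or self.value + amount > COUNTER_MAX:
            return False
        self.value += amount
        return True


# A ticket good for 5 uses: start 5 below the limit.
ticket = OneWayCounter(COUNTER_MAX - 5)
uses = [ticket.increment() for _ in range(6)]
print(uses)  # five successful uses, then a refusal
```

Because the counter never goes down, a ticket initialized this way cannot be "refilled" by writing a smaller value.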
-
There are multiple NFC standards with differences in speed, protocol, and range, including NFC-A, NFC-B, NFC-C, NFC-F, and NFC-V. The MIFARE Ultralight cards use NFC-A, which is defined by the standard "ISO/IEC 14443 Type A". Annoyingly, each part of the standard costs $70. The NFC Forum Analog Technical Specification provides a lot of detail, though. ↩
-
Instead of a wafer, you can buy the chips on tape but it costs more than twice as much. ↩
Subject: Standard cells: Looking at individual gates in the Pentium processor
Intel released the powerful Pentium processor in 1993, a chip to "separate the really power-hungry folks from ordinary mortals." The original Pentium was followed by the Pentium Pro, the Pentium II, and others, spawning a long-running brand of high-performance processors, Intel's flagship line until the Core processors took over in 2006. The Pentium eventually became virtually synonymous with "PC" and even made it into pop culture.
Even though the Pentium is a complex chip with 3.3 million transistors, its transistors are visible under a microscope, unlike modern chips. By examining the chip, we can see the interesting circuits used for gates, flip-flops, and other circuits, including the use of an unusual technology called BiCMOS. In this article, I take a close look at the original Pentium chip1, showing how much of its circuitry was built out of structured rows of tiny transistors, a technique known as standard-cell design.
The die photo below shows the Pentium's fingernail-sized silicon die under a microscope. I removed the chip's four metal layers to show the underlying silicon, revealing the individual transistors, which are obscured in most die photos by the layers of metal. Standard-cell circuitry, indicated by red boxes, is recognizable because the circuitry is arranged in uniform columns of cells, giving it a characteristic striped appearance. In contrast, the chip's manually-optimized functional blocks are denser and more structured, giving them a darker appearance. Examples are the caches on the left, the datapaths in the middle, and the microcode ROMs on the right.
Standard-cell design
Early processors in the 1970s were usually designed by manually laying out every transistor individually, fitting transistors together like puzzle pieces to optimize their layout. While this was tedious, it resulted in a highly dense layout. Federico Faggin, designer of the popular Z80 processor, was almost done when he ran into a problem. The last few transistors wouldn't fit, so he had to erase three weeks of work and start over. The closeup of the resulting Z80 layout below shows that each transistor has a different, complex shape, optimized to pack the transistors as tightly as possible.2
Because manual layout is slow, difficult, and error-prone, people developed automated approaches such as standard-cell.3 The idea behind standard-cell is to create a standard library of blocks (cells) to implement each type of gate, flip-flop, and other low-level component. To use a particular circuit, instead of arranging each transistor, you use the standard design from the library. Each cell has a fixed height but the width varies as needed, so the standard cells can be arranged in rows. The Pentium die photo below shows seven cells in a row. (The rectangular blobs are doped silicon while the long, thin vertical lines are polysilicon.) Compare the orderly arrangement of these transistors with the Z80 transistors above.
The photo below zooms out to show five rows of standard cells (the dark bands) and the wiring in between. Because CMOS circuitry uses two types of transistors (NMOS and PMOS), each standard-cell row appears as two closely-spaced bands: one of NMOS transistors and one of PMOS transistors. The space between rows is used as a "wiring channel" that holds the wiring between the cells. Power and ground for the circuitry run along the top and bottom of each row.
The fixed structure of standard cell design makes it suitable for automation, with the layout generated by "automatic place and route" software. The first step, placement, consists of determining an arrangement of cells that minimizes the distance between connected cells. Running long wires between cells wastes space on the die, since you end up with a lot of unnecessary metal wiring. But more importantly, long paths have higher resistance, slowing down the signals. Once the cells are placed in their positions, the "routing" step generates the wiring to connect the cells. Placement and routing are both difficult optimization problems that are NP-complete.
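To make the placement problem concrete, here is a toy sketch that places four cells in a single row by trying every ordering and keeping the one with the least total wire length. The cell names, the nets, and the cost function are all invented for illustration; this is not how any real placement tool works. Real tools rely on heuristics because exhaustive search grows factorially with the number of cells.

```python
from itertools import permutations


def wire_length(order, nets):
    """Total horizontal span of all nets for cells placed in this order."""
    position = {cell: slot for slot, cell in enumerate(order)}
    return sum(
        max(position[cell] for cell in net) - min(position[cell] for cell in net)
        for net in nets
    )


def best_placement(cells, nets):
    """Brute-force search: only feasible for a handful of cells."""
    return min(permutations(cells), key=lambda order: wire_length(order, nets))


cells = ["a", "b", "c", "d"]
nets = [("a", "c"), ("c", "d"), ("b", "d")]  # each net lists the cells it connects

print(wire_length(cells, nets))   # cost of the naive a, b, c, d ordering
best = best_placement(cells, nets)
print(best, wire_length(best, nets))
```

With four cells there are only 24 orderings to try, but a block with thousands of cells rules this approach out entirely.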
Intel started using automated place and route techniques for the 386 processor, since it was much faster than manual layout and dramatically reduced the number of errors. Placement was done with a program called Timberwolf, developed by a Berkeley grad student. As one member of the 386 team said, "If management had known that we were using a tool by some grad student as a key part of the methodology, they would never have let us use it." Intel developed custom software for routing, using an iterative heuristic approach. Standard-cell design is still used in current processors, but the software is much more advanced.
A brief overview of CMOS
Before looking at the standard cell circuits in detail, I'll give a quick overview of how CMOS circuits are implemented. Modern processors are built from CMOS circuitry, which uses two types of transistors: NMOS and PMOS. The diagram below shows how an NMOS transistor is constructed. The transistor can be considered a switch between the source and drain, controlled by the gate. The source and drain regions (green) consist of silicon doped with impurities to change its semiconductor properties, forming N+ silicon. The gate consists of a layer of polysilicon (red), separated from the silicon by a very thin insulating oxide layer. Whenever polysilicon crosses active silicon, a transistor is formed.
The NMOS and PMOS transistors are opposite in their construction and operation. A PMOS transistor swaps the N-type and P-type silicon, so it consists of P+ regions in a substrate of N silicon. In operation, an NMOS transistor turns on when the gate is high, while a PMOS transistor turns on when the gate is low.4 An NMOS transistor is best at pulling its output low, while a PMOS transistor is best at pulling its output high. In a CMOS circuit, the transistors work as a team, pulling the output high or low as needed; the "C" in CMOS indicates this "Complementary" approach. NMOS and PMOS transistors are not entirely symmetrical, however, due to the underlying semiconductor physics. Instead, PMOS transistors need to be larger than NMOS transistors, which helps to distinguish PMOS transistors from NMOS transistors on the die.
The layers of circuitry in the Pentium
The construction of the Pentium is more complicated than the diagram above, with four layers of metal wiring that connect the transistors.5 Starting at the surface of the silicon die, the Pentium's transistors are similar to the diagram, with regions of silicon doped to change their semiconductor properties. Polysilicon wiring is created on top of the silicon. The most important role of the polysilicon is that when it crosses doped silicon, a transistor is formed, with the polysilicon as the gate. However, polysilicon is also used as wiring over short distances.
Above the silicon, four layers of metal connect the components: multiple metal layers allow signals to crisscross the chip without running into each other. The metal layers are numbered M1 through M4, with M1 on the bottom. A few rules control the wiring: a metal layer can connect with the layer above or below through a tungsten plug called a "via". Only the bottom metal, M1, can connect to the silicon or polysilicon, through a "contact". The layers usually alternate between horizontal wiring and vertical wiring (at least locally). Thus, a signal from a transistor may travel through M1, bounce up to M2 and M3 to cross other signals, and then go back down to M1 to connect to another transistor. As you can see, automated place and route software has a complicated task, producing millions of complicated wiring paths as densely as possible.
The diagram below shows how the layers appear on the chip. (This photo shows one of the rare spots on the chip where all the layers are visible.) The M4 metal layer on top of the chip is the thickest, so it is mostly used for power, ground, and clock signals rather than data. An M4 ground wire covers the top of this photo. The next layer down is M3. In this part of the chip, M3 lines run vertically. (Due to optical effects, the vertical M3 lines may look like they are on top of M4, but they are below.) The horizontal M2 metal lines are lower and appear brown rather than golden, due to the oxide layers that cover them. The bottom metal layer is M1. The vertical M1 lines are thick in this part of the chip because they provide power to the circuitry.
The silicon and polysilicon are mostly obscured in the above photo. By removing all the metal layers, I obtained the image below. This image shows the same region as the image above, but it is hard to see the correlation because the metal layers almost completely obscure the silicon. The orderly columns of transistors reveal the standard-cell design. The irregular dark regions are doped silicon, which forms the chip's transistors. The dark or shiny horizontal bands are polysilicon. I will explain below how these regions form gates and other circuits.
Inverter
The fundamental CMOS gate is an inverter, shown in the schematic below. The inverter is built from one PMOS transistor (top) and one NMOS transistor (bottom). If the gate input is a "1", the bottom transistor turns on, pulling the output to ground (0). A "0" input turns on the top transistor, pulling the output high (1). Thus, this two-transistor circuit implements an inverter.10
The diagram below shows two views of how a standard-cell inverter appears on the Pentium die, with and without metal. The inverter consists of two transistors, just like the schematic above. The input is connected to the two polysilicon gates of the transistors. The metal output wire is connected to the two transistors (the left sides, specifically).
In more detail, the image on the left includes the bottom (M1) metal layer, but I removed the other metal layers. Two thick metal lines at the top and bottom provide power and ground to the standard cells. The multiple dark circles are vias between the M1 metal layer and the metal layer on top (M2), providing a path for power and ground that eventually reaches the top (M4) metal layer and then the chip's pins. (The power and ground wires are thick to provide sufficient current to the circuitry while minimizing voltage drops and noise.) The small, lighter circles are contacts that connect the M1 metal layer to the underlying silicon or polysilicon. The input to the gate is provided from the M2 metal, which connects to the M1 layer at the indicated via. The smaller black dots at the top and bottom of this metal strip are contacts, connections to the underlying silicon.
For the image on the right, I removed all four metal layers, revealing the polysilicon and doped silicon. Recall that a transistor is constructed from regions of doped silicon with a stripe of polysilicon between the regions, forming the transistor's gate. The diagram shows the two transistors that form the inverter. When combined with the metal wiring, they form the inverter schematic shown earlier. The final feature is the "well tap". The PMOS transistors are constructed in a "well" of N-doped silicon. The well must be kept at a positive voltage, so periodic "taps" connect the well to the +3.3V supply. As mentioned earlier, the PMOS transistor is larger than the NMOS transistor, which allowed me to figure out the transistor types in the photo.
By the way, the chip is built with a 600 nm process, so the width of the polysilicon lines is approximately 600 nm. For comparison, the wavelength of visible light is 400 to 700 nm, with 600 nm corresponding to orange light. This explains why the microscope photos are somewhat fuzzy; the features are the size of the wavelength of light.6
NAND gate
Another common gate in the Pentium is the NAND gate. The schematic below shows a NAND gate with two PMOS transistors above and two NMOS transistors below. If both inputs are high, the two NMOS transistors turn on, pulling the output low. If either input is low, a PMOS transistor turns on, pulling the output high. (Recall that NMOS and PMOS are opposites: a high voltage turns an NMOS transistor on while a low voltage turns a PMOS transistor on.) Thus, the CMOS circuit below produces the desired output for the NAND function.
The implementation of the gate as a standard cell, below, follows the schematic. The left photo shows the circuit with one layer of metal (M1). A thick metal line provides 3.3 volts to the gate; it has two contacts that provide power to the two PMOS transistors. The metal line for ground is similar, except only one NMOS transistor is grounded. The thinner metal in the middle has two contacts to get the transistor outputs and a via to connect the output to the M2 metal layer on top. Finally, two tiny bits of M1 metal connect the inputs from the M2 layer to the underlying polysilicon.
The right photo shows the circuit with all metal removed, showing the polysilicon and silicon. Since a transistor is formed where a polysilicon line crosses doped silicon, the two polysilicon lines create four transistors. Polysilicon functions both as local wiring and as the transistor gates. In particular, the inputs can be connected at the top or bottom of the circuit (or both), depending on what works best for wiring the circuitry. Note that the transistors are squashed together so the silicon in the middle is part of two transistors. An important asymmetry is that the output is taken from the middle of the PMOS transistors, wiring them in parallel, while the output is taken from the right side of the NMOS transistors, wiring them in series.
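The parallel and series wiring described above can be checked with a switch-level model. This is a simplified sketch: each transistor is treated as an ideal switch, and the function name is arbitrary.

```python
from itertools import product


def nand(a, b):
    """Switch-level model of a 2-input CMOS NAND gate."""
    # PMOS transistors conduct on a low input and are wired in parallel,
    # so either one can pull the output high.
    pull_up = (a == 0) or (b == 0)
    # NMOS transistors conduct on a high input and are wired in series,
    # so both must conduct to pull the output low.
    pull_down = (a == 1) and (b == 1)
    # Exactly one network conducts for any input combination.
    assert pull_up != pull_down
    return 1 if pull_up else 0


for a, b in product((0, 1), repeat=2):
    print(a, b, nand(a, b))
```

The internal check shows why the two networks never fight each other: for every input, one side is on and the other is off.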
Zooming out a bit, the photo below shows three NAND gates. Although the underlying standard cell is the same for each one, there are differences between the gates. At the top, horizontal wiring links the inputs to M2 through vias. The length of each polysilicon line depends on the position of the metal. Moreover, in the middle of each gate, the metal connection to the output is positioned differently. Finally, note that the power wiring shifts upward in the upper right corner; this is to make room for a larger cell to the right. The point is that the standard cells aren't simply copies of each other, but are adjusted in each case to put the inputs, outputs, and power in the right location. Also note that these standard cells are not isolated, but are squeezed together so the PMOS transistors are touching. This optimization slightly increases the density.
OR-NAND gate
The standard cell library includes some complex gates. For instance, the gate below is a 5-input OR-NAND gate, computing ~((A+B+C+D)⋅E). In the NMOS circuit, transistors A through D are paralleled while E is in series. The PMOS circuit is the opposite, with A through D in series and E in parallel. To provide sufficient current, the PMOS circuit has two sets of transistors for A through D, so the PMOS block is much larger than the NMOS block.
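As a sanity check on the series/parallel description above, the sketch below models the two networks as ideal switches and confirms over all 32 input combinations that they are complementary and compute ~((A+B+C+D)⋅E). The doubled PMOS transistors only affect drive current, so they are not modeled; the function name is arbitrary.

```python
from itertools import product


def or_nand(a, b, c, d, e):
    """Switch-level model of the 5-input OR-NAND gate."""
    # NMOS: A through D in parallel, with E in series below them.
    pull_down = bool((a or b or c or d) and e)
    # PMOS: A through D in series, with E in parallel with that chain.
    pull_up = bool((not a and not b and not c and not d) or not e)
    # The networks are duals, so exactly one conducts.
    assert pull_up != pull_down
    return int(pull_up)


for inputs in product((0, 1), repeat=5):
    a, b, c, d, e = inputs
    expected = 1 - ((a | b | c | d) & e)
    assert or_nand(*inputs) == expected

print("all 32 input combinations match")
```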
Latch
One of the key building blocks of the Pentium's circuitry is the latch. The idea of the latch is to hold one bit, controlled by the clock signal. A latch is "transparent": the latch's input immediately appears on the output while the clock is high. But when the clock is low, the latch holds its previous value. The latch is implemented with a feedback loop that passes the latch's output back into the latch. The heart of this latch circuit is the multiplexer (mux), which selects either the previous output (when the clock is low) or the new input (when the clock is high). The inverters amplify the feedback signal so it doesn't decay in the loop. An inverter also amplifies the output so it can drive other circuitry.
The circuit for a multiplexer is interesting since it uses "pass transistors". That is, the transistors simply pass their input through to the output, rather than pulling a signal to power or ground as in a typical logic gate. The schematic shows how this works. First, suppose that the select line is low. This will turn on the two transistors connected to the first input, allowing its level to flow to the output. Meanwhile, both transistors connected to the second input will be turned off, blocking that signal. But if the select line is high, everything switches. Now, the two transistors connected to the second input turn on, passing its level to the output. Thus, the multiplexer selects the first input if the control signal is low, and the second input if the control signal is high.
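The pass-transistor behavior described above can be sketched with a simple model in which each NMOS/PMOS pair either passes its input or blocks it. This is an idealized illustration; the function names are invented, and `None` stands for a blocked (undriven) path.

```python
from itertools import product


def pass_pair(signal, nmos_gate, pmos_gate):
    """An NMOS/PMOS pass pair: conducts when the NMOS gate is high and the
    PMOS gate is low. Returns None when the pair is off."""
    conducting = nmos_gate == 1 and pmos_gate == 0
    return signal if conducting else None


def mux(first, second, select):
    """Two pass pairs driven with opposite polarities of the select line."""
    select_inverted = 1 - select  # supplied by an inverter in the real circuit
    from_first = pass_pair(first, nmos_gate=select_inverted, pmos_gate=select)
    from_second = pass_pair(second, nmos_gate=select, pmos_gate=select_inverted)
    driven = [v for v in (from_first, from_second) if v is not None]
    assert len(driven) == 1  # exactly one input reaches the output
    return driven[0]


for first, second, select in product((0, 1), repeat=3):
    print(first, second, select, mux(first, second, select))
```

Note that the output is never computed from the select line itself; the select line only decides which input is allowed through.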
The diagram below shows a multiplexer, part of a latch. On the left, an inverter feeds into one input of the multiplexer.7 On the right is the other input to the multiplexer. The output is taken from the middle, between the pairs of the transistors.
Note that the multiplexer's circuit is opposite, in a way, to a logic gate. In a logic gate, you want either the NMOS transistor on or the PMOS transistor on, so the output is pulled low or high respectively. This is accomplished by giving the signals on the transistor gates the same polarity, so the same polysilicon line runs through both transistors. In a multiplexer, however, you want the corresponding PMOS and NMOS transistors to turn on at the same time, so they can pass the signal. This requires the signals on the transistor gates to have opposite polarity. One polysilicon line runs through the right PMOS transistor and the left NMOS transistor. The other polysilicon line runs through the left PMOS transistor and the right NMOS transistor, connected by metal wiring (not shown). The multiplexer includes an inverter to provide the necessary signal, but I cropped it out of the diagram below.
The flip-flop
The Pentium makes extensive use of flip-flops. A flip-flop is similar to a latch, except its clock input is edge-sensitive instead of level-sensitive. That is, the flip-flop "remembers" its input at the moment the clock goes from low to high, and provides that value as its output. This difference may seem unimportant, but it turns out to make the flip-flop more useful in counters, state machines, and other clocked circuits.
In the Pentium, a flip-flop is constructed from two latches: a primary latch and a secondary latch. The primary latch passes its value through while the clock is low and holds its value when the clock is high. The output of the primary latch is fed into the secondary latch, which has the opposite clock behavior. The result is that when the clock switches from low to high, the primary latch stops updating its output at the same time that the secondary starts passing this value through, providing the desired flip-flop behavior.
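The two-latch arrangement can be sketched as a behavioral model. This is a simplification that ignores timing and the mux-and-inverter internals of each latch; the class names are made up for illustration.

```python
class Latch:
    """Level-sensitive latch: passes its input while the clock equals
    transparent_when, and holds its last value otherwise."""

    def __init__(self, transparent_when):
        self.transparent_when = transparent_when
        self.q = 0

    def update(self, d, clock):
        if clock == self.transparent_when:
            self.q = d
        return self.q


class FlipFlop:
    """Primary latch (transparent while the clock is low) feeding a
    secondary latch (transparent while the clock is high)."""

    def __init__(self):
        self.primary = Latch(transparent_when=0)
        self.secondary = Latch(transparent_when=1)

    def update(self, d, clock):
        return self.secondary.update(self.primary.update(d, clock), clock)


ff = FlipFlop()
# (input, clock) pairs: the input changes while the clock is high,
# but the output only changes on the next low-to-high transition.
steps = [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]
outputs = [ff.update(d, clock) for d, clock in steps]
print(outputs)
```

In the third step the input drops to 0 while the clock is still high, yet the output stays at 1: the primary latch is holding, so the change can't get through until the clock goes low and then high again.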
The photo above shows a standard-cell flip-flop, with an intricate pattern of metal wiring connecting the various sub-components. There are a few variants; with minor logic changes, the flip-flop can have "set" or "reset" inputs, bypassing the clock to force the output to the desired state. (Set and reset functions are useful for initializing flip-flops to a desired value, for example when the processor starts up.)
The BiCMOS buffer
Although I've been discussing CMOS circuits so far, the Pentium was built with BiCMOS, a process that allows circuits to use bipolar transistors in addition to CMOS. By adding a few extra processing steps to the regular CMOS manufacturing process, bipolar (NPN and PNP) transistors can be created. The Pentium made extensive use of BiCMOS circuits since they reduced signal delays by up to 35%. Intel also used BiCMOS for the Pentium Pro, Pentium II, Pentium III, and Xeon processors (but not the Pentium MMX). However, as chip voltages dropped, the benefit from bipolar transistors dropped too and BiCMOS was eventually abandoned.
The schematic below shows a standard-cell BiCMOS buffer in the Pentium chip.8 This circuit is more complex than a CMOS buffer: it uses two inverters, an NPN pull-up transistor, an NMOS pull-down transistor, and a PMOS pull-up transistor.9
In the die images below, note the circular structure of the NPN transistor, very different from the linear structure of the NMOS and PMOS transistors and considerably larger. A sign of the buffer's high-current drive capacity is the output's thick metal wiring, much thicker than the typical signal wiring.
Conclusions
Standard-cell layout is extensively used in modern chips. Modern processors, with their nanometer-scale transistors, are much too small to study under a microscope. The Pentium, on the other hand, has features large enough that its circuits can be observed and reverse engineered. Of course, with 3.3 million transistors, the Pentium is too much for me to reverse engineer in depth, but I still find it interesting to study small-scale circuits and see how they were implemented. This post presented a small sample of the standard cells in the Pentium. The full standard-cell library is much larger, with dozens, if not hundreds, of different cells: many types of logic gates in a variety of sizes and drive strengths. But the fundamental design and layout principles are the same as the cells described here.
One unusual feature of the Pentium is its use of BiCMOS circuitry, which had a peak of popularity in the 1990s, right around the era of the Pentium. Although changing tradeoffs made BiCMOS impractical for digital circuitry, BiCMOS still has an important role in analog ICs, especially high-frequency applications. The Pentium in a sense is a time capsule with its use of BiCMOS.
I hope that you have enjoyed this look at some of the Pentium's circuits. I find it reassuring to see that even complex processors are made up of simple transistor circuits and you can observe and understand these circuits if you look closely.
For more on standard-cell circuits, I wrote about standard cells in an IBM chip and standard cells in the 386 (the 386 article has a lot of overlap with this one). Follow me on Twitter @kenshirriff or RSS for updates. I'm also on Mastodon occasionally as @kenshirriff@oldbytes.space.
Notes and references
-
In this blog post, I'm focusing on the "P54C" version of the original Pentium processor. Intel produced many different versions of the Pentium, and it can be hard to keep them straight. Part of the problem is that "Pentium" is a brand name, with multiple microarchitectures, lines, and products. At the high level, the Pentium (1993) was followed by the Pentium Pro (1995), Pentium II (1997), Pentium III (1999), Pentium 4 (2000), and so on. The original Pentium used the P5 microarchitecture, a superscalar microarchitecture that was advanced but still executed instructions in order like traditional microprocessors. The Pentium Pro was a major jump, implementing a microarchitecture called P6 that broke instructions into micro-operations and executed them out of order using dataflow techniques. The next microarchitecture version was NetBurst, first used with the Pentium 4. NetBurst provided a deep pipeline and introduced hyper-threading, but it was disappointingly slow and was replaced by the Core microarchitecture. The Core microarchitecture is based on the P6 and is Intel's current microarchitecture.
I'll focus now on the original Pentium, which went through several substantial revisions. The first Pentium product was the 80501 (codenamed P5), running at 60 or 66 MHz and using 5 volts. These chips were built with an 800 nm process and contained 3.1 million transistors.
The power consumption of these chips was disappointing, so Intel improved the chip, producing the 80502. These chips, codenamed P54C, used 3.3 volts and ran at 75-120 MHz. The chip's architecture remained essentially the same but support was added for multiprocessing, boosting the transistor count to 3.3 million. The P54C had a much more advanced clock circuit, allowing the external bus speed to stay low (50-66 MHz) while the internal clock speed (and thus performance) climbed to 100 MHz. The chips were built with a smaller 600 nm process with four layers of metal, compared to the previous three. Visually, the die of the P54C is almost the same as the P5, with the additional multiprocessing logic at the bottom and the clock circuitry at the top. For this article, I examined the P54C, but the standard cells should be similar in other versions.
Next, Intel moved to the 350 nm process, producing a smaller, faster Pentium chip, codenamed the P54CS; the die looks almost identical to the P54C (but smaller), with subtle changes to the bond pads. Another variant was designed for mobile use: the Pentium processor with "Voltage Reduction Technology" reduced power consumption by using a 2.9- or 3.1-volt supply for the core and a 3.3-volt supply to drive the I/O pins. These were built first with the 600 nm process (75-100 MHz) and then the 350 nm process (100-150 MHz).
The biggest change to the original Pentium was the Pentium MMX, with part number 80503 and codename P55C. This chip extended the x86 instruction set with 57 new instructions for vector processing. It was built on a 350 nm process before moving to 280 nm, and had 4.5 million transistors. More obscure variants of the original Pentium include the P54CQS, P54CS, P54LM, P24T, and Tillamook, but I won't get into them. ↩
-
Circuits that had a high degree of regularity, such as the arithmetic/logic unit (ALU) or register storage were typically constructed by manually laying out a block to implement the circuitry for one bit and then repeating the block as needed. Because a circuit was repeated 32 times for the 32-bit processor, the additional effort was worthwhile. ↩
-
An alternative layout technique is the gate array, which doesn't provide as much flexibility as a standard cell approach. In a gate array (sometimes called a master slice), the chip had a fixed array of transistors (and often resistors). The chip could be customized for a particular application by designing the metal layer to connect the transistors as needed. The density of the chip was usually poor, but gate arrays were much faster to design, so they were advantageous for applications that didn't need high density or produced a relatively small volume of chips. Moreover, manufacturing was much faster because the silicon wafers could be constructed in advance with the transistor array and warehoused. Putting the metal layer on top for a particular application could then be quick. Similar gate arrays used a fixed arrangement of logic gates or flip-flops, rather than transistors. Gate arrays date back to 1967. ↩
-
The behavior of MOS transistors is complicated, so the description above is simplified, just enough to understand digital circuits. In particular, MOS transistors don't simply switch between "on" and "off" but have states in between. This allows MOS transistors to be used in a wide variety of analog circuits. ↩
-
The earliest Pentiums had three layers of metal wiring, but Intel moved to a four-layer process with the P54C die, the version that I'm examining. ↩
-
To get this level of magnification with my microscope, I had to use an oil immersion lens. Instead of looking at the chip in air, as with a normal lens, I had to put a drop of special microscope oil on the chip. I carefully lowered the lens until it dipped into the oil (making sure I didn't crash the lens into the chip). The purpose of the oil is that its index of refraction is almost the same as glass, much higher than air. This gives the lens a higher "numerical aperture", allowing the lens to resolve smaller details. ↩
-
For completeness, I'll mention that the inverter feeding the multiplexer inverter isn't exactly an inverter. Specifically, the inverter's two transistors are not tied together to produce an output. Instead, the inverter's NMOS transistor provides an input to the multiplexer's NMOS transistor and likewise, the PMOS transistor provides an input to the PMOS transistor. The omission of this connection does not affect the circuit's behavior, but it makes calling the circuit an inverter and a multiplexer a bit of an abstraction. ↩
-
Intel called this gate "BiNMOS" rather than "BiCMOS" because it uses a bipolar transistor and an NMOS transistor to drive the output, rather than two bipolar transistors. The Pentium's BiCMOS circuitry is described in a conference paper, showing a second NPN transistor to protect the first one. I don't see the second transistor on the die so the two transistors may be implemented in one silicon structure. Reference: R. F. Krick et al., “A 150 MHz 0.6 µm BiCMOS superscalar microprocessor,” IEEE Journal of Solid-State Circuits, vol. 29, no. 12, Dec. 1994, doi:10.1109/4.340418. ↩
-
The Pentium contains multiple types of BiCMOS standard cells, which I'll show in this footnote. The cell below is an inverter. It is similar to the BiCMOS buffer described earlier, except it lacks the first inverter in the circuit. To make room for the NPN transistor on the left, the PMOS transistors are shifted to the right. As a result, they don't line up with the PMOS transistors in other cells. This is a break from the traditional orderliness of standard cells.
A BiCMOS inverter with PMOS on the left and NMOS on the right. The input is at the bottom and the output is in the middle.
The BiCMOS inverter below is similar, except it uses two NPN transistors, providing more output drive. I removed the M1 metal layer to provide a better view of the transistors.
A BiCMOS inverter with two NPN transistors. The PMOS transistors are in the lower left and the NMOS transistors are in the lower right.
Another interesting BiCMOS circuit is the D flip-flop with enable and BiCMOS output, shown below. This is similar to the earlier flip-flop except it has an enable input, allowing it to either load a new value triggered by the clock, or to hold its earlier value. This allows the flip-flop to remember a value for more than one clock cycle. The additional functionality is implemented by another multiplexer, selecting either the old value or the new value. (This multiplexer is, in a way, one level higher than the multiplexer in each latch.) The transistor for the BiCMOS output is in the upper right, poking out from under the metal. (This circuit might be implemented as two independent cells, one for the flip-flop and one for the driver; I'm not sure.)
A D flip-flop in the Pentium.
-
One puzzling inverter variant is used in a gate I'll call the "slow buffer". This buffer consists of two inverters, so it passes its input through to the output, buffered. The strange part is that the first inverter uses transistors with long gates, which makes these transistors much weaker than regular transistors. As a result, the first inverter will be slow to switch states. My guess is that this circuit is used to delay signals, for example, to keep a signal aligned with another signal that is delayed by multiple logic gates.
The buffer consists of two inverters. The first inverter uses wide, weak transistors.
You might expect that larger transistors would be stronger, not weaker. The problem is that these transistors are larger in the wrong dimension. If you make the gate wider, the effect is similar to multiple transistors in parallel, providing more current. But if you make the gate longer (as in this case), the effect is similar to multiple transistors in series, so the resistances add and the total current is reduced. In most cases, transistors are constructed with the smallest gate length possible, which is determined by the manufacturing process, so the transistors here are unusual. This chip was manufactured with a 600 nm process, so the smallest gate length is approximately 600 nm. The gate width (the normal direction for variation) varies dramatically depending on the circuit, optimized to provide maximum performance. ↩
Subject: Inside an IBM/Motorola mainframe controller chip from 1981
In this article, I look inside a chip in the IBM 3274 Control Unit.1 But before I discuss the chip, I need to give some background on mainframes. (I didn't completely analyze the chip, so don't expect a nice narrative or solid conclusions.)
IBM's vintage mainframes were extremely underpowered compared to modern computers; a System/370 mainframe ran well under 1 million instructions per second, while a modern laptop executes billions of instructions per second. But these mainframes could support rooms full of users, while my 2017 laptop can barely handle one person.2 Mainframes achieved their high capacity by offloading much of the data entry overhead so the mainframe could focus on the "important" work. The mainframe received data directly into memory in bulk over high-speed I/O channels, without needing to handle character-by-character editing. For instance, a typical data entry terminal (a "3270") let the user update fields on the screen without involving the computer. When the user had filled out the screen, pressing the "Enter" key sent the entire data record to the mainframe at once. Thus, the mainframe didn't need to process every keystroke; it only dealt with complete records. (This is also why many modern keyboards have an "Enter" key.)
But that was just the beginning of the hierarchy of offloaded processing in a mainframe system. Terminals weren't attached directly to the mainframe. You could wire 16 terminals to a terminal multiplexer (such as the 3299). This would in turn be connected to a 3274 Control Unit that merged the terminal data and handled the network protocols. The Control Unit was connected to the mainframe's channel processor which handled I/O by moving data between memory and peripherals without slowing down the CPU. All these layers allowed the mainframe to focus on the important data processing while the layers underneath dealt with the details.3
The 3274 Control Unit (highlighted above) is the source of the chip I examined. The purpose of the Control Unit "is to take care of all communication between the host system and your organization's display stations and printers". The diagram above shows how terminals were connected to a mainframe, with the 3274 Control Unit (indicated by arrows) in the middle. The 3274 was an all-purpose box, handling terminals, printers, modems, and encryption (if needed). It could communicate with the mainframe at up to 650,000 characters per second. The control unit (above) is a boring beige box. The control panel is minimal since people normally didn't interact with the unit. On the back are coaxial connectors for the lines to the terminals, as well as connectors to interface with the computer and other peripherals.
The Keystone II board
In 1983, IBM announced new Control Unit models with twice the speed: these were the Model 41 and Model 61. These units were built around a board called Keystone II, shown below. The board is constructed with IBM's peculiar PCB style. The board is arranged as a grid of squares with the PCB traces too small to see unless you zoom in. Most of the decoupling capacitors are in IBM's thin, rectangular packages, although I see a few capacitors in more standard blue packages. IBM is almost a parallel universe with its unusual packaging for ICs and capacitors as well as the strange circuit board appearance.
Most of the chips on the board are IBM chips packaged in square aluminum cans, known as MST (Monolithic System Technology). The first line on each package is the IBM part number, which is usually undocumented. The empty socket can hold a ROS chip; ROS is Read-Only Store, known as ROM to people outside IBM. The Texas Instruments ICs in the upper right are easier to identify; the 74LS641 chips are octal bus transceivers, presumably connecting this board to the rest of the system. Similarly, the 561 5843 is a 74S240 octal bus driver while the 561 6647 chips are 74LS245 octal bus transceivers.
The memory chips on the left side of this board are interesting: each one consists of two "piggybacked" 16-kilobit DRAM chips. IBM's part number 8279251 corresponds to the Intel 4116 chip, originally made by Mostek. With 18 piggybacked chips, the board holds 64 kilobytes of parity-protected memory.
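The capacity figure can be checked with a few lines of Python. The chip count and chip size come from the paragraph above; the layout of 9 stored bits per byte (8 data bits plus 1 parity bit) is an assumption about how the parity protection is organized, and the names are only for illustration.

```python
# Figures from the text: 18 piggybacked packages, each holding two 16-kilobit DRAMs.
PACKAGES = 18
CHIPS_PER_PACKAGE = 2
BITS_PER_CHIP = 16 * 1024

# Assumption: each byte is stored as 8 data bits plus 1 parity bit.
BITS_PER_STORED_BYTE = 9

total_bits = PACKAGES * CHIPS_PER_PACKAGE * BITS_PER_CHIP
data_bytes = total_bits // BITS_PER_STORED_BYTE

print(total_bits)          # raw storage in bits
print(data_bytes // 1024)  # usable kilobytes under the 9-bit assumption
```

Under that assumption, the 36 chips hold 589,824 bits, which works out to exactly 64 kilobytes of data plus one parity bit per byte.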
The photo below shows the Keystone II board mounted in the 3274 Control Unit. The board is in slot E towards the left and the purple Motorola IC is visible.
The Motorola/IBM chip
The board has a Motorola chip in a purple ceramic package; this is the chip that I examined. Popping off the golden lid reveals the silicon die underneath. The package has the part number "SC81150R", indicating a Motorola Special/Custom chip. This part number is also visible on the die, as shown below.
While the outside of the IC is labeled "Motorola", there are no signs of Motorola internally. Instead, the die is marked "IBM" with the eight-striped logo. My guess is that IBM designed the chip and Motorola manufactured it.
The diagram below shows the chip with some of the functional blocks identified. Around the outside are the bond pads and the bond wires that are connected to the chip's grid of pins. At the right is the 16×16 block of memory, along with its associated control, byte swap, and output circuitry. The yellowish-white lines are the metal layer on top of the chip that provides the chip's wiring. The thick metal lines distribute power and ground throughout the chip. Unlike modern chips, this chip only has a single metal layer, so power and ground distribution tends to get in the way of useful circuitry.
The chip is centered around a 16-bit bus (yellow line) that connects many parts of the chip. To write to the bus, a circuit pulls bus lines low. The bus lines are kept high by default by 16 pull-up transistors. This approach was fairly common in the NMOS era. However, performance is limited by the relatively weak pull-up current, making bus lines slow to go high due to R-C delays. For higher performance, some chips would precharge the bus high during one clock cycle and then pull lines low during the next cycle.
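The logical behavior of a pulled-up bus like this can be sketched in Python. This is only a behavioral model with hypothetical names, not something traced from the die: each line reads high unless some circuit pulls it low, so the bus value is effectively the AND of everything driving it. The model ignores timing, including the R-C delays mentioned above.

```python
BUS_WIDTH = 16
ALL_HIGH = (1 << BUS_WIDTH) - 1  # the pull-up transistors hold every line at 1


def bus_value(driver_outputs):
    """Combine the circuits driving a pulled-up bus.

    Each entry is a 16-bit pattern: a 0 bit means "this circuit pulls that
    line low" and a 1 bit means "this circuit leaves that line alone".
    """
    value = ALL_HIGH
    for pattern in driver_outputs:
        value &= pattern & ALL_HIGH  # any low-puller wins over the pull-up
    return value


print(hex(bus_value([])))                # nothing driving: all lines high
print(hex(bus_value([0x1234])))          # a single writer
print(hex(bus_value([0xFF00, 0x0FF0])))  # two writers at once
```

The last case shows why only one circuit should write at a time: with two writers, the bus carries the AND of their values rather than either value.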
The two groups of I/O pins at the bottom are connected to the input buffer on the left and the output buffer on the right. The input buffer includes XOR circuits to compute the parity of each byte. Curiously, only 6 bits of the inputs are connected to the main bus, although other circuits use all 8 bits. The buffer also has a circuit to test for a zero value, but only using 5 of the bits.
I've put red boxes around the numerous PLAs, which can be identified by their grids of transistors. This chip has an unusually large number of PLAs. Eric Schlaepfer hypothesizes that the chip was designed on a prototype circuit board using commercial PAL chips for flexibility, and then they transferred the prototype to silicon, preserving the PLA structure. I didn't see any obvious structure to the PLAs; they all seemed to have wires going all over.
The miscellaneous logic scattered around the chip includes many latches and bus drivers; the latch circuit is similar to the memory cells. I didn't fully reverse-engineer this circuitry but I didn't see anything that looked particularly interesting, such as an ALU or counter. The circuitry near the PLAs could be latches as part of state machines, but I didn't investigate further.
I was hoping to find a recognizable processor inside the package, maybe a Motorola 6809 or 68000 processor. Instead, I found a complicated chip that doesn't appear to be a processor. It has a 16×16 memory block along with about 20 PLAs (Programmable Logic Arrays), a curiously large number. PLAs are commonly used in processors for decoding instructions, since they can match bit patterns. I couldn't find a datapath in the chip; I expected to see the ALU and registers organized in a large but regular 8-bit or 16-bit block of circuitry. The chip doesn't have any ROM4 so there's no microcode on the chip. For these reasons, I think the chip is not a processor or microcontroller, but a specialized data-handling chip, maybe using the PLAs to interpret bits of a protocol.
The chip is built with NMOS technology, the same as the 6502 and 8086 for instance, rather than CMOS technology that is used in modern chips. I measured the transistor features and the chip appears to be built with a 3.5 µm process (not nm!), which Motorola also used for the 68000 processor (1979).
The memory buffer
The chip has a 16×16 memory buffer, which could be a register file or a FIFO buffer. One interesting feature is that the buffer is triple-ported, so it can handle two reads and one write at the same time. The buffer is implemented as a grid of cells, each storing one bit. Each row corresponds to a 16-bit word, while each column corresponds to one bit in a word. Horizontal control lines (made of polysilicon) select which word gets written or read, while vertical bit lines of metal transmit each bit of the word as it is written or read.
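As a behavioral illustration, the buffer can be modeled in Python as 16 words of 16 bits with two read ports and one write port. The class and method names are hypothetical, and the ordering within a cycle (reads return the value stored before that cycle's write) is an assumption, since the text does not say how the real chip handles a simultaneous read and write of the same word.

```python
class TriplePortBuffer:
    """Behavioral model: 16 words x 16 bits, two read ports, one write port."""

    WORDS = 16
    WORD_MASK = 0xFFFF

    def __init__(self):
        self.rows = [0] * self.WORDS  # one entry per row (word) of cells

    def cycle(self, read1_row, read2_row, write_row=None, write_data=0):
        """Perform two reads and an optional write in the same cycle."""
        # Assumption: both reads see the contents from before the write.
        out1 = self.rows[read1_row]
        out2 = self.rows[read2_row]
        if write_row is not None:
            self.rows[write_row] = write_data & self.WORD_MASK
        return out1, out2


buf = TriplePortBuffer()
buf.cycle(0, 1, write_row=3, write_data=0xABCD)  # write word 3 while reading 0 and 1
print([hex(v) for v in buf.cycle(3, 3)])          # both ports can read the same word
```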
The microscope photo below shows two memory cells. These cells are repeated to create the entire memory buffer. The white vertical lines are metal wiring. The short segments are connections within a cell. The thicker vertical lines are power and ground. The thinner lines are the read and write bit lines. The silicon die itself is underneath the metal. The pinkish regions are active silicon, doped to make it conductive. The speckled golden lines are polysilicon wires between the silicon and the metal. Polysilicon has two roles: most importantly, when polysilicon crosses active silicon, it forms the gate of a transistor. But polysilicon is also used as wiring, important since this chip only has one layer of metal. The large, dark circles are contacts, connections between the metal layer and the silicon. Smaller square regions are contacts between silicon and polysilicon.
It was too difficult to interpret the circuits when they were obscured by the metal layer so I dissolved the metal layer and oxide with hydrochloric acid and Armour Etch respectively. The photo below shows the die with the metal removed; the greenish areas are remnants in areas where the metal was thick, mostly power and ground supplies. The dark regions in this image are regions of doped silicon. These are the active areas of the chip, showing the blocks of circuitry. There are also some thin lines of polysilicon wiring. The memory buffer is the large block on the right, just below the center.
Like most implementations of static RAM, each storage cell of the buffer is implemented with cross-coupled inverters, with the output of one inverter feeding into the input of the other. To write a new value to the cell, the new value simply overpowers the inverter output, forcing the cell to the new state. To support this, one of the inverters is designed to be weak, generating a smaller signal than a regular inverter. Most circuits that I've examined create the inverter by using a weak transistor, one with a longer gate. This chip, however, uses a circuit that I haven't seen before: an additional transistor, configured to limit the current from the inverter.
The schematic below shows one cell. Each cell uses ten transistors, so it is a "10T" cell. To support multiple reads and writes, each row of cells has three horizontal control signals: one to write to the word, and two to read. Each bit position has one vertical bit line to provide the write data and two vertical bit lines for the data that is read. Pass transistors connect the bit lines to the selected cells to perform a read or a write, allowing the data to flow in or out of the cell. The symbol that looks like an op-amp is a two-transistor NMOS buffer to amplify the signal when reading the cell.
With the metal layer removed, it is easier to see the underlying silicon circuitry and reverse-engineer it. The diagram below shows the silicon and polysilicon for one storage cell, corresponding to the schematic above. (Imagine vertical metal lines for power, ground, and the three bitlines.)
The output from the memory unit contains a byte swapper. A 16-bit word is generated with the left half from the read 1 output and the right half from the read 2 output, but the bytes can be swapped. This was probably used to produce an aligned 16-bit word from data that was unaligned in memory.
Parity circuits
In the lower right part of the chip are two parity circuits, each computing the parity of an 8-bit input. The parity of an input is computed by XORing the bits together through a tree of 2-input XOR gates. First, four gates process pairs of input bits. Next, two XOR gates combine the outputs of the first gates. Finally, an XOR gate combines the two previous outputs to generate the final parity.
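The three-level XOR tree described above can be sketched in Python as follows. The function name and the pairing of adjacent bits are assumptions for illustration; the pairing doesn't affect the result since XOR is associative and commutative.

```python
def parity8(byte):
    """Parity of an 8-bit value using a tree of 2-input XORs (7 gates total)."""
    bits = [(byte >> i) & 1 for i in range(8)]

    # Level 1: four XOR gates, each combining a pair of input bits.
    level1 = [bits[i] ^ bits[i + 1] for i in range(0, 8, 2)]

    # Level 2: two XOR gates combine the level-1 outputs.
    level2 = [level1[0] ^ level1[1], level1[2] ^ level1[3]]

    # Level 3: one XOR gate produces the final parity.
    return level2[0] ^ level2[1]


print(parity8(0b00000000))  # even number of 1 bits
print(parity8(0b00000111))  # odd number of 1 bits
```

The tree arrangement means a signal passes through only three gates, compared to seven if the XORs were chained one after another.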
The schematic below shows how an XOR gate is built from a NOR gate and an AND-NOR gate. If both inputs are 0, the first NOR gate forces the output to 0. If both inputs are 1, the AND gate forces the output to 0. Thus, the circuit computes XOR. Each labeled block above implements the XOR circuit below.
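The gate-level construction can be checked with a small truth-table sketch in Python. The helper names are hypothetical; the structure follows the description above, with the NOR output feeding the AND-NOR gate.

```python
def nor(a, b):
    """2-input NOR gate."""
    return 1 - (a | b)


def and_nor(a, b, c):
    """AND-NOR gate: the NOR of (a AND b) with c."""
    return 1 - ((a & b) | c)


def xor_gate(a, b):
    # If both inputs are 0, nor(a, b) is 1 and forces the output to 0.
    # If both inputs are 1, (a AND b) is 1 and forces the output to 0.
    # Otherwise neither term is active, so the output is 1.
    return and_nor(a, b, nor(a, b))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_gate(a, b))
```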
Conclusion
My conclusion is that the processor for the Keystone II board is probably one of the other chips, one of the IBM metal-can MST packages, and this chip helps with data movement in some way. It would be possible to trace out the complete circuitry of the chip and determine exactly how it functions, but that is too time-consuming a project for this relatively obscure chip.
Follow me on Twitter @kenshirriff or RSS for more chip posts. I'm also on Mastodon occasionally as @kenshirriff@oldbytes.space. Thanks to Al Kossow for providing the chip and Dag Spicer for providing photos. Thanks to Eric Schlaepfer for discussion.
Notes and references
-
The 3274 Control Unit was replaced by the 3174 Establishment Controller, introduced in 1986. An "Establishment Controller" managed a cluster of peripherals or PCs connected to a host mainframe, essentially a box that provided a "kitchen-sink" of functionality including terminal support, local disk storage, Ethernet or token-ring networking, ASCII terminal support, encryption/decryption, and modem support. These units ranged from PC-sized boxes to mini-fridge-sized boxes, depending on how much functionality was required. ↩
-
I'm serious that my laptop can barely handle one person; my 2017 MacBook Air starts dropping characters if it has even a moderate load, and I have to start one-finger typing. You would think that a 1.8 GHz dual-core i5 processor could handle more than 2 characters per second. I don't know if there's something wrong with it, or if modern software just has too much overhead. Don't worry, I upgraded and do most of my work on a faster, more recent laptop. ↩
-
The IBM hardware model had the CPU focusing on the big picture, while the hierarchy of boxes underneath processed data, performed storage, handled printing, and so forth. In a sense, this paralleled the structure of offices in that era, where executives had assistants and secretaries to do the tedious work for them: typing, filing, and so forth. Nowadays, the computer hierarchy and the office hierarchy are both considerably flatter. Maybe there's a connection? ↩
-
A ROM and a PLA are similar in many ways. The general distinction is that a ROM activates one word (row) at a time, while a PLA can activate multiple rows at a time and combine the values, giving more flexibility. A ROM generally has a binary decoder to select the row. This decoder can be recognized by its binary structure: transistors alternating by 1's, by 2's, by 4's, and so forth. ↩
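The ROM/PLA distinction in this footnote can be illustrated with a rough Python sketch. The contents below are made up purely for illustration and have nothing to do with the chip's actual PLAs; the function names and the (mask, match, output) encoding of a PLA term are likewise assumptions.

```python
def rom_read(rows, address):
    """ROM: the decoder selects exactly one row, and that row is the output."""
    return rows[address]


def pla_read(terms, inputs):
    """PLA: every term whose pattern matches is active; the outputs are ORed.

    Each term is (care_mask, required_bits, output_bits). Input bits outside
    care_mask are "don't care" for that term.
    """
    output = 0
    for care_mask, required_bits, output_bits in terms:
        if (inputs & care_mask) == required_bits:
            output |= output_bits
    return output


# Hypothetical contents, purely for illustration.
rom = [0b0001, 0b0010, 0b0100, 0b1000]
pla = [
    (0b11, 0b01, 0b0001),  # matches inputs ending in 01
    (0b01, 0b01, 0b0100),  # matches any input with bit 0 set
    (0b10, 0b10, 0b1000),  # matches any input with bit 1 set
]

print(bin(rom_read(rom, 0b01)))  # one row selected
print(bin(pla_read(pla, 0b01)))  # two terms match and are combined
```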
Subject: Reverse engineering the 59-pound printer onboard the Space Shuttle
The Space Shuttle contained a bulky printer so the astronauts could receive procedures, mission plans, weather reports, crew activity plans, and other documents. Needed for the first Shuttle launch in 1981, this printer was designed in just 7 months, built around an Army communications terminal. Unlike modern printers, the Shuttle's printer contains a spinning metal drum with raised characters, allowing it to rapidly print a line at a time.
This printer is known as the Space Shuttle Interim Teleprinter System.1 As the name "Interim" suggests, this printer was intended as a stop-gap measure, operating for a few flights until a better printer was operational. However, the teleprinter proved to be more reliable than its replacement, so it remained in use as a backup for over 50 flights, often printing thousands of lines per flight. This didn't come cheap: with a Shuttle flight costing $27,000 per pound, putting the 59-pound teleprinter in space cost over $1.5 million per flight.
We obtained access to a Shuttle teleprinter (probably a development system that remained on the ground) and wanted to put it into operation. I had to reverse engineer three of the boards inside the printer to determine the data format the printer accepted: serial data encoded into audio. But after analyzing the printer and performing a lot of maintenance, we succeeded in getting the printer to print. In this article, I'll describe the Shuttle's Interim Teleprinter, explain its circuitry and drum-based printing mechanism, and show it in operation.
History of the Shuttle's Interim Teleprinter
The motivation for the teleprinter goes back to the Apollo program. During Apollo missions, the only way to send information to the astronauts was by talking to them over the radio and having the astronauts write down the data. NASA decided that the Space Shuttle should include a mechanism to send text and images to the astronauts, a 78-pound, high-tech fax machine called the Uplink Text & Graphics System (TAGS). A high-resolution grayscale image was sent to the Shuttle as a digital data stream. Onboard the Shuttle, a squat CRT displayed the image one line at a time and a fiber-optic faceplate transferred each line to light-sensitive silver emulsion paper. The paper was developed by passing it over a hot roller at 260°F for 25 seconds, creating a permanent image.
The one flaw in this plan was that sending the digital image to the Shuttle required the Tracking and Data Relay Satellite System (TDRS), which due to delays wouldn't be ready until the sixth Shuttle flight. (The TDRS was a space-based replacement for the worldwide network of ground stations that was used during Apollo.) As a result, NASA decided just seven months before the first Shuttle launch that they needed an interim system "for transmission of real-time, flight-plan changes and other operational data to the crew."2
The Shuttle teleprinter is the result of this rushed effort to create a printer that could work over the existing audio channel rather than the digital TDRS satellite. Due to the time pressure, the Shuttle teleprinter needed to be based on an off-the-shelf printer. Thermal and electrostatic printers were rejected due to toxicity and flammability problems. (The Shuttle teleprinter used a roll of yellowish paper, which required a NASA waiver due to its flammability, a concern ever since the Apollo-1 disaster.)
The decision was made to use a military communications terminal, the AN/UCG-743 "Tactical Teletype". The terminal's interfacing was very flexible, supporting serial data in either ASCII or Baudot format, with multiple configurations and baud rates (up to 1200 baud), using either a current-loop or voltage signals. The military terminal supported two-way communication, so it had a keyboard. Remarkably, the terminal also implemented a word processor, controlled by a Motorola 6800 microprocessor (ancestor of the famous MOS 6502). The word processor allowed messages to be composed offline, minimizing the radio transmission time, which was important in a hostile environment. As will be seen, this 100-pound military system required many large changes to be usable on the Space Shuttle, most visibly removing the keyboard.
The printing mechanism
The teleprinter uses a spinning drum with raised characters, shown below.4 To print a character, the printer fires a hammer, forcing the inked ribbon and paper against the raised character on the drum. The drum is 80 characters wide, matching the line length, and there are 80 corresponding hammers, one for each print position. The drum has 64 printable characters, wrapped around each position of the drum.
The printer prints a line at a time, not instantaneously, but during each revolution of the drum. When the drum makes one complete revolution, each of the 64 characters passes by each print position once. Printing requires precise timing of the hammers to strike the right character on the drum as it whizzes by. The printer control circuitry triggers each hammer at the proper time, when the desired character on the drum is lined up with the hammer, producing the desired text.5
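The timing idea can be sketched as a small Python simulation of one drum revolution. This is a simplified model under stated assumptions, not the printer's actual control logic: the order of characters around the drum (`DRUM_CHARS`) is hypothetical, and the model assumes the same character is lined up at every column at the same moment, which may not match the real drum.

```python
# Assumption: a hypothetical drum order of 64 characters. The real ordering
# around the drum is not given in the text.
DRUM_CHARS = "◊" + "".join(chr(code) for code in range(0x21, 0x60))


def print_line(text, drum_chars=DRUM_CHARS):
    """Simulate one revolution; return the printed line and the hammer firings."""
    paper = [" "] * len(text)
    firings = []
    # As the drum turns, each character passes under every hammer once.
    for step, drum_char in enumerate(drum_chars):
        # Fire the hammer at every column that wants this character.
        columns = [col for col, wanted in enumerate(text) if wanted == drum_char]
        for col in columns:
            paper[col] = drum_char
        if columns:
            firings.append((step, drum_char, columns))
    return "".join(paper), firings


line, firings = print_line("HELLO WORLD")
print(line)
for step, char, columns in firings:
    print(step, char, columns)
```

In this model, a column that wants a space never fires its hammer, and all hammers for the same letter fire at the same step, so a whole line is finished within a single revolution.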
The character set is slightly different between the military printer and the Shuttle printer. The military drum had 64 ASCII characters (upper-case letters only, numbers, and special characters). The drum doesn't contain an explicit space character, since nothing is printed for a space. In its place, the drum has a diamond "◊", used as a special character to indicate a parity error or other error. The drum for the Shuttle teleprinter replaces 10 ASCII special characters with symbols that are more useful to the Shuttle, such as Greek letters for angles. Specifically, the characters ;@[\]^!"#$ are replaced by θ✓‾↑↓~αβΔϕ.
The video below shows a closeup of the hammers as they strike the paper to print text. The text is the teleprinter's built-in test message: "THE LAZY YELLOW DOG WAS CAUGHT BY THE SLOW RED FOX AS HE LAY SLEEPING IN THE SUN". This test message is based on the traditional quick brown fox..., which is a pangram, containing all 26 letters, but the teleprinter's test sentence is missing J, K, M, Q, and V. However, the test message is exactly 80 characters long and replaces spaces with the diamond "◊", so it is effective for verifying that all 80 columns work.
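The claims about the test message are easy to check. The short Python sketch below is just a sanity check on the sentence as quoted above (it has nothing to do with the teleprinter's own firmware): it counts the characters and lists the letters that never appear.

```python
import string

TEST_MESSAGE = ("THE LAZY YELLOW DOG WAS CAUGHT BY THE SLOW RED FOX "
                "AS HE LAY SLEEPING IN THE SUN")

# Total length, counting the spaces (which the printer shows as diamonds).
length = len(TEST_MESSAGE)

# Letters of the alphabet that the message never uses.
missing = sorted(set(string.ascii_uppercase) - set(TEST_MESSAGE))

print(length)    # 80
print(missing)   # ['J', 'K', 'M', 'Q', 'V']
```

So the message fills all 80 columns exactly, at the cost of not being a true pangram.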
The electronics
The photo below shows the circuitry inside the teleprinter, looking down from above. At the left are the three interface boards, custom boards that demodulate the incoming audio signal. In front of the interface boards are large inductors to filter the incoming power. Hidden beneath them, a solid-state relay controls the power to the rest of the printer, implementing the low-power standby mode. In the middle, the blue board is the surprisingly complex switching power supply, mounted on a thick metal plate for cooling. Normally, the large roll of paper is mounted above the power supply board. At the right, four large circuit boards implement the main logic of the printer: a printer driver board, a communications board, a memory board, and the processor board. The rotating drum is protected by the perforated black metal grill at the front.
The demodulator boards
The original military teleprinter received data as a serial bitstream. However, on the Space Shuttle, data was encoded as frequencies on the audio link. Three custom boards were constructed to demodulate the audio data so the rest of the printer could handle it. These boards also performed Shuttle-specific tasks such as powering up the printer when a message comes in, and then returning the printer to standby mode. I reverse-engineered these boards to determine how they work and to determine the data encoding. (Schematics are in the footnotes.7) In this section, I'll discuss these three boards, which are on the left side of the printer.
To summarize, the serial bitstream is encoded with Frequency Shift Keying, with a 0 represented by 3600 Hz and a 1 represented by 7200 Hz.6 The serial data is transmitted at 600 baud, even parity, one stop bit. The demodulation process first converts the input audio to a digital signal by thresholding it. (That is, the input sine wave is converted to a square wave.) The digital signal is autocorrelated to distinguish the 3600 Hz and 7200 Hz signals, recovering the underlying serial data. This signal is passed to the printer's logic boards (part of the original military teleprinter), which convert the serial signal to ASCII bytes and print them.
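To make the encoding concrete, here is a minimal Python sketch of how one character could be framed and mapped to tones. The tone frequencies, baud rate, even parity, and single stop bit come from the description above. The 7 data bits, the LSB-first bit order, and the start bit of 0 are assumptions based on ordinary asynchronous serial conventions, not something I confirmed on the hardware.

```python
FREQ_HZ = {0: 3600, 1: 7200}   # FSK tones for a 0 and a 1
BAUD = 600                     # bits per second

def frame_bits(ch, data_bits=7):
    """Return start bit, data bits (LSB first), even parity bit, stop bit.

    data_bits=7 and LSB-first ordering are assumptions.
    """
    code = ord(ch)
    data = [(code >> i) & 1 for i in range(data_bits)]
    parity = sum(data) % 2     # chosen so the total number of 1 bits is even
    return [0] + data + [parity] + [1]

def tones(ch):
    """Tone frequency (Hz) sent during each bit period of one character."""
    return [FREQ_HZ[b] for b in frame_bits(ch)]

bit_time_ms = 1000 / BAUD      # duration of one bit, about 1.67 ms

print(frame_bits("A"))   # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(tones("A"))
```

At 600 baud, each bit period holds exactly 6 cycles of the 3600 Hz tone or 12 cycles of the 7200 Hz tone.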
Signal processing starts with the "FSK input" board, shown below. First, it amplifies the input audio signal. (The two large resistors provide a 600 Ω load for the audio input.) Next, a 900 Hz high-pass filter eliminates low-frequency noise. (The filter is implemented by a two-stage Sallen-Key topology.)
The signal bounces from board to board, going to the "output FSK demod" board next. This board has a carrier-detect circuit that turns on the rest of the printer if it detects an input signal. This allows the printer to sit idle until it receives a signal from Earth. This board also applies the threshold to the signal to turn it into a digital waveform, which goes to the "control" board.
The output board also holds the 5-volt and 12-volt linear regulators that power the three boards; these are the metal-can ICs at the bottom of the board. To reduce the load on the regulators, two large resistors drop the input voltage (28 volts) to a lower level before it is regulated.
The control board holds the FSK decoder, an interesting circuit that converts the two FSK frequencies to binary by implementing a digital auto-correlator. It uses a 64-bit shift register to delay the digital input by 139 µs. The input and the delayed input are XOR'd together, generating a result that depends on the frequency. A 7200 Hz signal repeats every 139 µs, so the input and the delayed input match, yielding 0 from the XOR. However, a 3600 Hz square wave switches state every 139 µs, so the two XOR inputs will always differ, resulting in a 1 output. Thus, the circuit cleanly distinguishes between a 3600 Hz input and a 7200 Hz input. (The XOR output is opposite from the final value since it gets inverted later.)
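The delay-and-XOR idea can be simulated in a few lines of Python. This is a sketch under one assumption: the shift register is clocked at 64 × 7200 Hz = 460.8 kHz, which is the rate that makes 64 stages equal one 7200 Hz period (about 139 µs). The real clock source may differ. The function names are mine.

```python
SAMPLE_RATE = 64 * 7200   # 460,800 Hz: assumed shift-register clock rate
DELAY = 64                # shift register length, in samples

def square_wave(freq_hz, n_samples):
    """Thresholded input: 1 for the first half of each cycle, 0 for the second."""
    period = SAMPLE_RATE // freq_hz          # samples per cycle
    return [1 if (n % period) < period // 2 else 0 for n in range(n_samples)]

def xor_autocorrelate(samples):
    """XOR each sample with the sample from DELAY clocks earlier."""
    return [samples[n] ^ samples[n - DELAY] for n in range(DELAY, len(samples))]

def demodulate(samples):
    """Invert the XOR output so 3600 Hz gives 0 and 7200 Hz gives 1."""
    return [1 - x for x in xor_autocorrelate(samples)]

high = xor_autocorrelate(square_wave(7200, 512))
low = xor_autocorrelate(square_wave(3600, 512))
print(set(high), set(low))   # {0} {1}
```

With exact input frequencies the XOR output is constant. If the frequencies drift, the delayed and undelayed waveforms no longer line up perfectly and short glitches appear near the edges.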
The digital demodulator avoids some of the problems of an analog FSK demodulator. It is not sensitive to signal levels, since the signal is converted to digital. The digital demodulator is also not sensitive to harmonics, which can cause problems with analog demodulators. Finally, it doesn't require the carefully-tuned filters of an analog circuit.
The demodulated signal passes from the control board back to the output board. This board applies a 400 Hz low-pass filter and then a threshold to convert the signal back to binary. If the input frequencies are not exact, the demodulator will produce the correct 0 or 1 value over most of the waveform, but there will be glitches at the edges. The low-pass filter removes these glitches. (You might be concerned that a 600-baud signal would be wiped out by a 400 Hz low-pass filter. However, the worst case signal (alternating 0's and 1's) would be 300 Hz because it takes two bits to make one cycle, so the filter has plenty of margin.) Next, the board blocks the signal unless a carrier is detected. This ensures that random noise isn't demodulated and printed. Finally, the serial binary signal leaves the custom Shuttle boards and goes to the teleprinter's communication board, part of the standard teleprinter.
I noticed two unusual things about these boards. First, they have some modifications: "bodge" wires and added components. Second, the boards are not conformal coated, which is unusual for aerospace boards. (The four logic cards, in comparison, are protected with conformal coating.) My hypothesis is that these boards were development boards, early in the design process of the Shuttle teleprinter, so they were modified as the design changed. The teleprinter is also marked "Not for flight", which supports this theory.
The logic cards
The military teleprinter contained four logic circuit cards: a CPU card, a memory card, a communications card, and a print control card, mounted at the right rear of the teleprinter. These cards are used unchanged in the Shuttle teleprinter.
The circuitry is more complex than you might expect, with four large cards full of ICs. There are several reasons for this. First, the cards use 1970s microprocessor technology, so it takes a lot of circuitry to do anything. In particular, many simple 7400-series logic chips perform "glue" functions: decoding addresses, buffering data, latching signals, and so forth. Second, a drum printer is inherently complicated, since 80 hammers must be driven at the right time based on the desired characters. Third, the teleprinter is very flexible, supporting multiple signal levels and two character formats (ASCII and Baudot). Most surprisingly, the teleprinter implements a word processor, allowing messages to be composed and edited offline. Of course, since the Shuttle's teleprinter is only used to receive data, and doesn't even have a keyboard, the word processor feature is entirely useless.
The CPU card
The CPU card holds the microprocessor that controls the teleprinter. Its most important function is to convert a line of ASCII characters into print drum codes. These codes are stored in memory for use by the print control card. The CPU also implements configuration and self-test functions.
The diagram below shows some of the main components. The CPU card contains a Motorola 6800 CPU, 4 kilobytes of memory, and a ROM that holds its program code.8 Inconveniently, all the IC part numbers are military numbers so it takes some investigation to determine what a part really is. The MC6822 is a Peripheral Interface Adapter, a Motorola chip that provides two parallel I/O ports. This chip is used on three of the cards to support a variety of I/O tasks. On the CPU card, the I/O ports drive eight status lamps (most of which were removed for the Shuttle teleprinter) as well as internal status signals such as "paper low" or "keyboard present" and the baud rate setting input.
The print control card
In a sense, the print control card is the heart of the printer, since it causes characters to be printed by firing hammers against the rotating drum. As the drum goes through one revolution, all 64 characters will spin past each of the 80 print positions. By firing hammers at exactly the right time, the card prints a line of text.9 In more detail, for each row on the drum, the printer card scans through the 80-character memory buffer using Direct Memory Access (DMA). If the value in memory matches the current drum row number, the hammer is fired. Note that the hammers don't fire simultaneously, but in sequence as memory is scanned.
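The scan-and-compare loop can be sketched in Python as follows. This is an idealized model, not the card's actual logic: it assumes the drum rows follow ASCII order from 0x20 to 0x5F (the real drum order is not something I verified), it prints a whole line in a single revolution, and it ignores hammer travel time. The function names are hypothetical.

```python
ROWS, COLUMNS = 64, 80
FIRST_CODE = 0x20   # assumed: drum rows follow ASCII order, 0x20 through 0x5F

def to_row_buffer(text):
    """CPU card's job: turn a line of text into one drum row number per column."""
    line = text.ljust(COLUMNS)[:COLUMNS]
    # A space is stored as None, so no hammer ever fires in that column.
    return [None if c == " " else ord(c) - FIRST_CODE for c in line]

def print_line(text):
    """Print control card's job: fire hammers as each drum row passes."""
    buffer = to_row_buffer(text)
    paper = [" "] * COLUMNS
    for row in range(ROWS):             # one drum revolution, row by row
        for col in range(COLUMNS):      # scan the whole buffer for this row
            if buffer[col] == row:      # match: fire this column's hammer
                paper[col] = chr(row + FIRST_CODE)
    return "".join(paper)

print(print_line("HELLO WORLD 123").rstrip())   # HELLO WORLD 123
```

Note that the letters are not printed left to right: every "L" in the line is struck during the same drum row, and the line is only complete once all 64 rows have passed.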
The diagram above shows the interaction between the drum, the print control card, and the 80 hammers. The hammers are implemented on 20 print hammer cards, each with 4 hammers. Electrically, the hammers are arranged in a matrix. One wire out of 20 (S1-S20) selects the hammer board, the group of four. Another wire selects one of four hammers (Col 1-4). This approach simplifies the electronics, since 20 + 4 driver circuits and wires are used, rather than 80 (one for each column). The print control card is synchronized to the drum by two photo-transistor sensors that detect the drum's position. One sensor is triggered on each row, while the other sensor triggers once per revolution.
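The matrix addressing amounts to splitting the column number into a card number and a hammer number. The sketch below assumes that consecutive columns are grouped in fours on each card, which is the natural layout but not something stated in the manual excerpts I looked at. `hammer_select` is a hypothetical name.

```python
HAMMERS_PER_CARD = 4
CARDS = 20

def hammer_select(column):
    """Map a print column (0-79) to (card select line 1-20, hammer line 1-4).

    Assumes columns are assigned to cards in consecutive groups of four.
    """
    if not 0 <= column < CARDS * HAMMERS_PER_CARD:
        raise ValueError("column out of range")
    card, hammer = divmod(column, HAMMERS_PER_CARD)
    return card + 1, hammer + 1

drivers_matrix = CARDS + HAMMERS_PER_CARD   # 24 drivers with the matrix
drivers_direct = CARDS * HAMMERS_PER_CARD   # 80 drivers with one per column

print(hammer_select(0), hammer_select(79))  # (1, 1) (20, 4)
```

The tradeoff is that a matrix can only drive one hammer at a time, which fits with the hammers firing in sequence as memory is scanned.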
The print control card is shown below, with the main functional blocks labeled. The large purple-and-gold chip is the PIA, the same I/O chip that appeared on the CPU card. It handles a variety of signals such as the self-test request, paper out, and the drum stop signal. The mode control logic generates timing signals depending on the printer's mode. The data compare logic increments the row counter on each drum pulse, and compares the row counter to the value read from memory.10 The hammer driver circuitry on the left selects one of the 20 hammer cards, while the hammer driver circuitry on the right selects one of four hammers. The ribbon circuitry raises and lowers the ribbon so the ribbon doesn't block the text when the printer is idle. The line feed circuitry advances the paper for a line feed operation.
The photo below shows one of the hammer cards, with four hammers. Each hammer has an electromagnet that pulls a lever, rotating the hammer wheel, and causing the hammer to strike the paper. (The hammers themselves are in the upper right of the photo.) A screw adjustment controls the distance between each hammer and the paper, allowing precise adjustment of the timing. (Marc had to carefully adjust all the hammers to make the print quality readable.)
The communication card
The communication card handles the teleprinter's serial data input. The key chip is the 8251A, a USART (Universal Synchronous/Asynchronous Receiver/Transmitter). This complex chip performs the conversion between the serial data stream and the bytes that the processor uses. (Note that the military teleprinter both sent and received serial data, while the Shuttle teleprinter only receives data.) The chip has a few support chips, labeled "UART" in the diagram below. The board has another Peripheral Interface Adapter chip, providing two I/O ports. These ports have functions such as reading the serial line settings (ASCII vs. Baudot, odd or even parity, number of stop bits, and current loop levels).
The board also has circuitry to generate the clock pulses for the selected baud rate. The mode circuitry handles various phases of transmit/receive. The filter/demod circuitry handles different input types, digitally filtering and demodulating as necessary.11
The memory card
The memory card supports the word-processing feature. It provides additional RAM to hold the text buffer as well as the ROM holding the software for editing. The 16 DRAM chips on the left (MK4027) provide 8 KB of RAM while the two ROM chips on the right provide 8K of ROM. The chips in the middle to the right of the resistors split the 12 address bits into row and column addresses as required by the RAM chips. The address signals go through the numerous 24 Ω resistors in the middle; I don't know why. According to the manual, the printer operates fine without this card, except without the word processor. Since the word processor was irrelevant to the Shuttle, I wonder why this card wasn't removed to reduce weight.
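The RAM arithmetic can be checked with a few lines of Python. The MK4027 is a 4096×1-bit DRAM, so 16 chips give 8 KB, presumably arranged as two banks of eight chips (the bank arrangement is my assumption). The sketch also shows the row/column split of the 12-bit address; which half goes to the row and which to the column is likewise an assumption.

```python
CHIPS = 16
BITS_PER_CHIP = 4096            # MK4027: 4096 words of 1 bit
ADDRESS_BITS = 12               # 2**12 = 4096 locations per chip
HALF = ADDRESS_BITS // 2        # 6 row bits and 6 column bits

total_bytes = CHIPS * BITS_PER_CHIP // 8

def split_address(addr):
    """Split a 12-bit address into (row, column) for the multiplexed pins.

    Using the low 6 bits as the row is an assumption.
    """
    mask = (1 << HALF) - 1
    row = addr & mask
    col = (addr >> HALF) & mask
    return row, col

print(total_bytes)              # 8192
print(split_address(0xFFF))     # (63, 63)
```

Multiplexing the address this way lets the chip use 6 address pins instead of 12, which is why the card needs the extra multiplexer chips.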
The power supply
The power supply board (shown earlier) implements separate power supplies for different parts of the printer.12 The supplies are implemented as switching power supplies, which were not as common at the time as now. The microprocessor supply provides +5V, +12V, and -5V, voltages required by memory chips in the 1970s. A separate switching power supply provides +5V, -8.6V, and +8.6V for the keyboard, dustcover, and interface module, components that were removed for the Shuttle teleprinter. Another supply powers the printer's status lamps.
The drum motor supply is important because its voltage is regulated to control the rotational speed of the drum. A sensor on the drum provides a feedback pulse for each row on the drum. (I think the drum speed is 868 RPM.) These pulses control the drum motor's switching supply. If the drum spins too slowly, the voltage is increased; if it spins too fast, the voltage is decreased.
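For a sense of the timing involved, the sketch below works out the pulse rate from the drum speed, taking the 868 RPM estimate at face value. It also includes a toy proportional correction to illustrate the direction of the feedback. The `adjust_voltage` function, its gain, and its units are purely hypothetical and are not taken from the actual supply circuit.

```python
DRUM_RPM = 868     # the estimate from the text, not a confirmed figure
ROWS = 64          # character rows around the drum

rev_ms = 60_000 / DRUM_RPM      # time for one revolution, in ms
row_ms = rev_ms / ROWS          # time between row pulses, in ms
pulse_hz = 1000 / row_ms        # row pulse frequency

def adjust_voltage(voltage, measured_row_ms, gain=1.0):
    """Toy proportional feedback (hypothetical gain and units).

    A longer-than-nominal row interval means the drum is slow, so the
    voltage goes up; a shorter interval means it is fast, so it goes down.
    """
    error = measured_row_ms - row_ms
    return voltage + gain * error

print(round(rev_ms, 2), round(row_ms, 3), round(pulse_hz, 1))
```

If the estimate is right, a revolution takes about 69 ms and each character row passes a hammer in roughly 1.08 ms, which gives an idea of how tight the hammer timing must be.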
The hammers have an unusual constant-current power supply. When the printer is active, this power supply generates +18 V. However, the power supply is designed to use a constant current of 600 mA regardless of the hammer activity. A capacitor provides a reservoir of power that is filled by the constant current. If the hammers are using less current, the excess current is bled off through a resistor. The purpose of this is "to mask printing intelligence during periods of message traffic." In other words, if you used a teleprinter in the embassy in Moscow, for instance, spies could monitor power transients to see when hammers are firing, and perhaps figure out what is being printed. By keeping the current constant, this source of intelligence is blocked. Of course, this feature is useless on the Space Shuttle and only wastes power.
The military teleprinter accepted multiple input voltages: 22-30 VDC, 115 VAC, or 230 VAC, along with a 12 VDC battery backup. The transformers and diodes to support these voltages were part of the interface module that was removed for the Shuttle teleprinter. Instead, the Shuttle teleprinter is powered by 28 VDC.
Mechanical changes
The military teleprinter underwent significant mechanical changes to make it suitable for the Shuttle. These changes reduced its weight from 100 pounds to 59 pounds. The most visible change to the printer is the removal of the keyboard. The entire front section of the printer was replaced, removing the controls that were not needed in the Shuttle.13 The rugged frame of the original printer was replaced with a lighter-weight (but still substantial) frame. Horizontal rails were added to the frame to support the printer in the Shuttle locker.
The photo below shows the front of the Shuttle teleprinter. While the military teleprinter had numerous lights and switches on the front, the Shuttle teleprinter has just two lights and four switches.
NASA was concerned that the temperature of the teleprinter could become hazardous to the astronauts. To mitigate this danger, the teleprinter had a large heat-sensitive warning sticker. The yellow sticker on the left of the teleprinter changes color and displays an image if it heats up: it shows a bandaged hand and the word "HOT". Above it is an "Omegalabel" temperature monitoring sticker that shows the highest temperature the device reached. There are more of these stickers inside the teleprinter on various motors.
The Interim Teleprinter inside the Space Shuttle
The teleprinter was too large to be mounted on the flight deck, so it was mounted in a storage locker on the middeck, one level lower. The photo below shows the location of the locker that held the teleprinter (although the teleprinter was not present in this photo), looking backward (aft) toward the airlock. The locker is denoted MA9F, indicating Mid-deck Aft, position 9F, in the back on the right side of the Shuttle.
The teleprinter was noisy because of its impact printing; even with it in a locker, the sound outside was 69.5 dB. The solution was to soundproof the locker with acoustic insulation. Various insulating materials were tested until one was found that passed the toxicity requirements. Another flammability waiver was required for the insulation.
Putting the teleprinter in an insulated locker without cooling caused another problem: overheating. The military teleprinter used 34 watts even while idle, which would cause the printer to become dangerously hot after just 6 orbits. The printer was redesigned to support a standby mode that used just 1 watt. When a signal from Earth was detected, the printer would power up while in use, and then return to standby mode. A circuit was added to send a tone back to Earth when the printer was activated, reassuring Mission Control that the printer had switched out of standby mode. These circuits were on the three custom Shuttle boards described earlier.
Putting the teleprinter in a locker made cabling difficult. The solution was a panel on the locker door with connectors for power and audio. The panel has a power switch and light as well as a light to indicate that a message has been received.
The photo below shows the teleprinter locker with the connection panel on the far left. Note the cables attached to the connectors. These cables went across the back of the Shuttle to the left side, where they went up to the flight deck; the cable routing was performed before launch.14 For this flight, the neighboring locker MA16F held 3300 honeybees for a student experiment.
The teleprinter cables connect to the shuttle at panel A15 on the aft bulkhead of the flight deck on the left side of the Shuttle. In other words, if you sat in the Shuttle Commander's seat in the cockpit and turned around, this is what you would see.
The audio cable from the teleprinter went to the Payload Specialist communication connection on panel A15, while the power cable went to the DC power connection right below. During launch, this audio connection was needed for crew communication, so the teleprinter was plugged in after launch and the audio settings were reconfigured on panel L9. A cue card was placed above panel L9 with instructions on the teleprinter.
The teleprinter's replacements
The Shuttle teleprinter was supposed to be used for a short time until the Uplink Text and Graphics System (TAGS) entered service, but things didn't work out that way. TAGS, described earlier, was the fax-like system that could receive grayscale images, but it depended on the TDRS satellites with their support for digital data. The first TDRS satellite was launched by the sixth shuttle flight, STS-6 (1983). This allowed the use of TAGS on STS-7, but the printer promptly jammed.15 TAGS had constant problems with jamming; on STS-35, the printer jammed and then the unjamming tool broke. Due to the unreliability of the TAGS, the Interim Teleprinter was kept in service as a backup device. TAGS was mounted on a dual cold plate in avionics bay 3 of the crew compartment middeck, on the other side of the airlock from the teleprinter.
After a decade, another printer, the Thermal Impulse Printer System (TIPS), was put into service, probably on flight STS-56 in 1993. Once TIPS proved its reliability, it replaced both the teleprinter and the Text and Graphics System (TAGS). The TIPS printer was installed in mid-deck locker MF28E; the F indicates the locker was on the forward wall, not the aft wall that held the Interim Teleprinter. As a backup for the TIPS, the Shuttle flew with a second TIPS.
One motivation behind the TIPS thermal printer was NASA's desire to use more commercial-off-the-shelf (COTS) equipment instead of expensive custom equipment. The TIPS printer is the Raytheon TDU-850 printer (below), a commercial product that sold for $4950. A custom communication interface board inside the printer provided the interface between the printer and the Shuttle's S-Band and Ku-Band communications systems. This interface also allowed astronauts to use the TIPS as a printer for an onboard personal computer.
The photo below shows the TIPS printer in use, printing a long stream of output that Eileen Collins is reading. Collins was the first woman to pilot the Space Shuttle; she flew on the Shuttle four times, twice as pilot and twice as commander.
The teleprinter, operational
We succeeded in making the Shuttle teleprinter operational. The printer had many mechanical problems, mainly because the rubber rollers had turned to liquid and gummed up the mechanism. Marc disassembled the printer, carefully cleaned the mechanism, and realigned everything. I won't discuss the restoration process here since there will be a video on CuriousMarc's channel. We were able to send FSK-modulated data to the printer and it was printed successfully, as shown below.
Conclusions
At first, I thought that the Shuttle's Interim Teleprinter was a terrible design. It's absurdly heavy and was in danger of overheating. Although the design started with an existing product, much of it required redesign: the front section, the new drum, the interface, and even the frame. The design inherited features it couldn't use, such as the built-in word processor. And the constant-current feature was pointless for the Shuttle and just wasted power.
When I learned that the design had to be completed in just seven months, my opinion of the teleprinter improved. Moreover, the design had many constraints, such as toxicity and flammability restrictions, that limited the potential approaches.
In the end, the teleprinter was used on over 50 flights, acting as a reliable backup to the somewhat flaky Text and Graphics System (TAGS).16 Despite its name, the Interim Teleprinter turned out to be a long-lasting solution, not interim at all. So I have to conclude that the teleprinter was a good design, working much better and much longer than intended.17
In any case, the Interim Teleprinter is an interesting piece of hardware and I hope you enjoyed this article. Follow me on Mastodon as @kenshirriff@oldbytes.space or RSS. Thanks to Marcel for providing the printer. Restoration performed with CuriousMarc, Eric Schlaepfer, and Mike Stewart.
Notes and references
-
References for the teleprinter:
The Interim Teleprinter and its development is described in detail in: M.D. Schuette, “Space Shuttle Interim Teleprinter System,” in Conference record: NTC ’82, Systems for the Eighties, IEEE. (I'll call this the "teleprinter paper" for short.)
The Shuttle Crew Operations Manual has extensive information on the shuttle and some information on the teleprinter.
The teleprinter is briefly discussed here.
Some teleprinter information is in the "Crew Systems Equipment Workbook" via RR Auction.
The layouts of the Shuttle panels are in Orbiter OV-102 Display and Control Panel Configuration.
The lockers are described in Orbiter middeck/payload standard interfaces control document.
The manuals for the AN/UGC-74 are at RadioNerds.
An enormous collection of Shuttle documents is at gandalfddi. ↩ -
The teleprinter paper mentions that Shuttle had one other option for receiving hardcopy data: the Text Uplink to Mass Memory System (TUMMS). This allowed text to be displayed on a CRT and the crew could take a Polaroid photo. This was obviously an impractical solution. I couldn't find any other references to TUMMS, so TUMMS may be a proposal that wasn't implemented. ↩
-
Specifically, the Shuttle teleprinter was based on the Honeywell Model AN/UGC-74A9(V)3 Communications Terminal. ↩
-
The mechanism of a drum printer is similar to a chain printer such as the IBM 1403 line printer: each print position has a hammer that fires when the correct character is in that position. However, chain printers have better print quality than drum printers, due to the effect of timing errors. In a drum printer, a small timing error on a hammer will cause the character to be printed too high or too low. In a chain printer, however, a timing error will cause the character to be shifted to the left or right. Vertical mispositioning is obvious and looks terrible. Horizontal mispositioning is much less noticeable since character spacing is normally slightly variable. ↩
-
To be precise, the hammer is fired 1.5 characters early due to its travel time. By the time the hammer hits the drum, the drum has rotated enough to put the desired character in place. Each hammer has a screw to adjust its distance to the drum, necessary to get the timing exact. It's amazing that this system works and doesn't produce a smudged mess. ↩
-
After reverse-engineering the boards, I found a paper on the Shuttle teleprinter that specified the FSK frequencies as 1600 Hz for a 0 and 2057 Hz for a 1, different from what we used. Perhaps the frequencies were changed during development. ↩
-
I created schematics of the three Shuttle-specific boards. Click an image for a larger (readable) version.
Schematic of the input board. Schematic of the control board. Schematic of the output board. -
The block diagram below shows the main functional blocks of the CPU card.
CPU block diagram. From Maintenance Manual, TM 11-5815-602-24, p3-6 -
I expected that a line would be printed during one drum revolution but looking at the print pattern, it appears to take multiple revolutions per line. Perhaps the printer is avoiding hammers firing too close together to minimize current spikes. Moreover, the published print speed of 60 characters per second is considerably slower than one line per revolution. Or perhaps the hammer pattern is randomized so spies can't listen in and determine what is being printed. I'm still investigating. ↩
-
Looking at the circuitry, I think the memory buffer holds the drum row number for each position, and the print control card fires the hammer if the value matches the current row number. In contrast, the "obvious" approach would put the character values in the memory buffer and the print control card would match against the current drum character. The implemented solution puts less work on the print control card, which only needs to update the target comparison value once per line, rather than every character. However, it requires the CPU card to transform the input characters into row values. ↩
-
The teleprinter accepts two types of inputs: NRZ and D10. NRZ (Non-Return to Zero) is the straightforward encoding of the serial signal as 0's or 1's. The manual doesn't define D10, but I think it is Manchester encoding, using a 01 sequence for a 0 and a 10 sequence for a 1 (or inverted). The D10 signal is self-clocking, since each bit contains a transition. The demodulation circuit converts the D10 signal into a straight bit sequence. An NRZ signal can either use an external clock or an internal clock from the baud rate generator. With the internal clock, the input is sampled four times and digitally filtered since the input may not exactly line up with the internal clock. ↩
-
The power supply is explained in the Maintenance Manual. The fold-out power supply schematics in that manual were not scanned for some reason but can be found in the B&C Maintenance Manual. ↩
-
The military teleprinter contained a large interface module at the back, providing the signal and power connections to the terminal. The serial-line signals could be a 20-milliamp current loop, a 60-milliamp current loop, or MIL-STD-188-114 (similar to RS-422). The interface module converts these signals to the TTL signals used internally. The interface module also contains a power supply for the interface circuitry. Since this interfacing was not required for the Shuttle, the interface module was discarded and replaced with the Shuttle's custom FSK interface cards. The AC power supply and filtering were also removed. ↩
-
I was a bit surprised that the teleprinter cables would run for a long distance through the Shuttle. But the Shuttle is full of wires and cables running in all directions, as shown in the photo below. This photo is from the same angle as the earlier diagram showing where the teleprinter is connected. This flight was after the teleprinter was retired, but the teleprinter would have been plugged in behind the exercise equipment.
The aft flight deck of Discovery during STS-116. From National Archives. -
One source says that the inaugural flight of TAGS was STS-29 (March 1989). Another source says that testing of the "new" TAGS system continued on STS-29. Contradicting this, TAGS was used on STS-7 (June 1983), jamming after the first page. TAGS was also used on STS-8 (August 1983) but failed after five pages. The TAGS unit was not flown on STS-41B (Feb 1984, the next Challenger flight after STS-8). (Note that STS-41B was the tenth flight, considerably before STS-29, the 28th flight. The Space Shuttle mission numbers are a mess.) It's hard to reconcile these statements. Probably, TAGS was still in the testing stage as late as STS-29 due to reliability problems. ↩
-
The teleprinter had a few problems during use. On flight STS-6, the teleprinter got stuck in high power mode. On flight STS-30, messages were illegible. ↩
-
The teleprinter shows the risk of building an interim solution that turns out to last much longer than expected. This also happened with the Interim Upper Stage (IUS), a launch system to boost Shuttle payloads to a higher orbit. The Interim Upper Stage was designed as a temporary solution until a space tug became available. Eventually, NASA realized that nothing was replacing the IUS, so it was renamed to "Inertial Upper Stage", preserving the acronym.
I'll mention that this also happened with the 8086 processor. It was intended as an interim processor until the iAPX 432 "micro-mainframe" processor was ready. The iAPX 432 turned out to be a disaster, while the "stopgap" 8086 is still with us as the x86 architecture. ↩
Subject: Inside the guidance system and computer of the Minuteman III nuclear missile
The Minuteman missile was introduced in 1962 as a key part of America's nuclear deterrent. The Minuteman III missile is currently the only US land-based intercontinental ballistic missile (ICBM), with 400 missiles ready for launch, spread across five central states.1 The missile contains a precision guidance system, capable of delivering a warhead to a target 13,000 km away (8000 miles) with an accuracy of 200 meters (660 feet).
The diagram below shows the guidance system of the Minuteman III missile (1970). This guidance system contains over 17,000 electronic and mechanical parts, costing $510,000 (about $4.5 million in current dollars). The heart of the guidance system is the gyro stabilized platform, which uses gyroscopes and accelerometers to measure the missile's orientation and acceleration. The computer uses the measurements from the platform to determine the missile's position and guide the missile on its trajectory to the target. Other key components are the missile guidance set controller, which contains electronics to support the gyro stabilized platform, and the amplifier, which interfaces the computer with the rest of the missile. In this blog post, I take a close look at the components of the guidance system that was used until the early 2000s.2
Fundamentally, the guidance computer constantly compares the missile position to the desired trajectory and generates the appropriate steering commands to keep the missile on track.3 The diagram below shows how directing the engine nozzles causes the missile to rotate around its three axes: roll, pitch, and yaw.4 In the silo, the roll angle (the azimuth) is aligned with the direction to the target. The missile takes off vertically and then gradually rotates along the pitch axis to tilt over toward the target. During flight, adjustments along all three axes keep the missile on target. The Minuteman III has three rocket stages so the guidance computer jettisons each rocket stage and ignites the next stage in sequence.
The guidance platform
The idea behind inertial navigation is to keep track of the missile's position by constantly measuring its acceleration. By integrating the acceleration, you get the velocity. And by integrating the velocity, you get the position. Inertial navigation is self-contained, a big advantage for a missile since the enemy can't jam your navigation. The hard part is measuring the acceleration and angles with extreme accuracy, since even tiny errors are multiplied as the missile travels.
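To make the double integration concrete, here is a minimal sketch in Python. It is not the Minuteman algorithm: it assumes a single axis, simple Euler steps, and made-up numbers (a constant 30 m/s² for 60 seconds and a hypothetical accelerometer bias of 0.001 m/s²). It only illustrates how a tiny, constant measurement error grows with the square of the flight time.

```python
def integrate_position(accels, dt, v0=0.0, x0=0.0):
    """Integrate acceleration samples twice (Euler steps) to get velocity and position."""
    v, x = v0, x0
    for a in accels:
        v += a * dt  # integrate acceleration to get velocity
        x += v * dt  # integrate velocity to get position
    return v, x


DT = 0.01        # sample interval in seconds (assumed)
STEPS = 6000     # 60 seconds of powered flight (assumed)

# Ideal accelerometer versus one with a tiny constant bias (both hypothetical).
_, true_x = integrate_position([30.0] * STEPS, DT)
_, biased_x = integrate_position([30.001] * STEPS, DT)

# The position error grows roughly as 0.5 * bias * t^2.
print(f"position error after 60 s: {biased_x - true_x:.3f} m")
```

Because the error scales with the square of time, even a small bias matters by the end of the boost phase, and the resulting velocity error continues to build position error during the long coast that follows.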
In more detail, the Minuteman's inertial guidance is built around a gyroscopically stabilized platform, which is kept in a fixed orientation. The platform is mounted on two beryllium gimbals. Feedback from gyroscopes drives three torque motors to rotate the gimbals to keep the stable platform in exactly the same orientation no matter how the missile twists and turns.
The diagram below shows the components of the stable platform, in approximately the same orientation as the photo above. Three accelerometers are mounted on the stable platform to measure acceleration. The accelerometers are oriented along three perpendicular axes so each one measures acceleration along one axis. (The accelerometer axes are not aligned with the platform axes; this distributes the acceleration (mostly "up") across the accelerometers, increasing accuracy.) The two alignment mirrors allow the stable platform to be aligned with a precise device called an autocollimator, as will be described below. The gyrocompass uses the Earth's rotation to precisely determine North, providing a backup alignment technique. Both the alignment mirrors and the gyrocompass can be rotated to a precise angle, reported by the resolver.
To target a Minuteman I missile, the missile had to be physically rotated in the silo to be aligned with the target, an angle called the launch azimuth. This angle had to be extremely precise, since even a tiny angle error will be greatly magnified over the missile's journey. Aligning the missile was a tedious process that used the North Star to determine North. Since the star was not visible from inside the silo, a complex surveying technique was used, using a surveyor's theodolite to measure the angles between the North Star and three concrete monuments outside the silo. Inside the silo, the closest monument was visible through a sighting tube, allowing the precise angle measurement to be transferred to the silo. After many more measurements inside the silo, a special device called an autocollimator was positioned precisely 90° from the desired launch azimuth. The autocollimator shot a beam of light through a window in the side of the missile, where it bounced off a mirror on the stable platform and returned to the autocollimator. If the returning beam wasn't exactly parallel, the autocollimator sent a signal to the missile, causing the stable platform to rotate as needed. The result of this process was that the stable platform was exactly aligned with the desired angle to the target.5
The guidance platform was completely redesigned for Minuteman II and III, eliminating the time-consuming alignment that Minuteman I required. The new platform had an alignment block with rotating mirrors. Instead of rotating the missile, the autocollimator remained fixed in the East position and the mirror (and thus the stable platform) was rotated to the desired launch azimuth. The new guidance platform also added a gyrocompass under the alignment block, a special compass that could precisely align itself to North by precessing against the Earth's rotation. At first, the gyrocompass was used as a backup check against the autocollimator, but eventually the gyrocompass became the primary alignment. For calibration, the alignment block also includes electrolytic bubble levels to position the stable platform in known orientations with respect to local gravity.6
The photo above shows the alignment block on top of the gyrocompass. The front and back of the block are the precision mirrors that reflect the light beam from the autocollimator. The circles on top of the block and at the right are two level detectors, with set screws for exact adjustment. The platform has four level detectors, allowing it to be aligned against gravity in multiple positions. Like the gimbals, the gyrocompass assembly is made of beryllium due to its rigidity and light weight; it has a warning sticker because beryllium is highly toxic.
The diagram below shows how the axes align with the gimbals of the stable platform.7 Note the window at the top of the photo. Light from the autocollimator shines in through the window, reflects off the mirror on the alignment block, and returns through the window to the autocollimator. The autocollimator detects any error in alignment and signals the guidance system to correct its position accordingly.
The stable platform uses gyroscopes to maintain its fixed orientation as the missile turns. The idea behind a gyroscope is that a spinning disk will tend to maintain its spin axis. The problem is that any friction, even from precision ball bearings, will reduce the accuracy. The solution in the Minuteman is a "gas bearing", where the gyroscope rotor is supported by an extremely thin layer of hydrogen. As shown below, the gyroscope is built around a stationary marble-sized ball (blue), fastened to the gyroscope frame at the top and bottom. The rotor (pink) is clamped around the equator of the ball and spins at high speed, powered by an induction motor (windings green, rotor yellow). If the gyroscope frame is tilted, the rotor will stay in its orientation. The resulting change in angle between the frame and the rotor is detected by sensitive capacitive pickups (purple). The gyroscope is sensitive to tilt in two axes: left-right, and front-back. Since nothing touches the rotor except the thin layer of gas around the ball, the influence of friction is minimal.
A gas-bearing gyroscope has the problem that when it starts or stops, the gas layer dissipates, allowing the rotor and the bearing to rub. The Minuteman missile's guidance system was kept continuously running, so starts and stops were infrequent. Moreover, when the gyroscope did need to be started, the electronics gave it a 40-volt jolt to get it up to speed quickly. Because the Minuteman's guidance system was always running—and its solid-fuel engines didn't require fueling—the missile could be launched in under a minute.
To summarize the guidance trajectory, a Minuteman flight is typically about 35 minutes,8 but only the first few minutes are powered by the rockets; the warheads coast most of the way on a ballistic trajectory. The first three rocket stages are active for just 180 seconds; this completed the boost phase for Minuteman I and II. However, the innovation of Minuteman III was that it held three warheads, a system called MIRV (Multiple Independently-targeted Reentry Vehicles). To direct these warheads to their targets, Minuteman III has a fourth stage, called PSRE (Propulsion System Rocket Engine), mounted just below the guidance system. The PSRE was active for 440 seconds, directing each warhead on its specific path. (Meanwhile, a retro-rocket sent the third stage in a random direction. Otherwise, it would tag along with the warheads, acting as a giant radar beacon for enemy anti-ballistic-missile systems.) The warheads travel very high, typically over 800 nautical miles (1500 km), more than three times the altitude of the International Space Station. As for the multiple-warhead MIRV, the Minuteman III missiles were converted back to single warheads as part of the New START arms reduction treaty, with the last MIRV removed in June 2014.
The Minuteman D-17B computer
The guidance computer has a key role in the Minuteman missile, determining the missile's position from the stable platform data, executing a guidance algorithm, and steering the missile on the desired trajectory. Before explaining the D-37 computer used in Minuteman II and III, I'll start by discussing the D-17B computer used in the first Minuteman, since its characteristics strongly influenced the later computers. The Minuteman I computer was very primitive by modern standards. Although it was a 24-bit machine, it was a serial computer, operating on one bit at a time. The big advantage of serial processing is that it dramatically reduces the hardware requirements. Since the computer only processes one bit at a time, it uses a one-bit ALU. Moreover, the buses and datapaths are one bit wide rather than 24 bits. The disadvantage, of course, is that a serial computer is slow; the D-17B took 27 clock cycles (24 bits and three overhead) to perform any operation. At best, the computer could perform 12,800 additions per second.
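As a rough illustration of serial arithmetic, the sketch below adds two 24-bit words one bit at a time using a one-bit full adder with a carry flip-flop. This is not the D-17B's actual logic: the function name, the unsigned wraparound behavior, and the least-significant-bit-first ordering are assumptions for the example (a serial adder needs the low bits first so the carry can propagate). The last lines work out the clock rate implied by the figures above.

```python
WORD_BITS = 24  # D-17B word length


def serial_add(a, b, bits=WORD_BITS):
    """Add two words bit by bit, LSB first, using a one-bit full adder."""
    carry = 0
    result = 0
    for i in range(bits):
        x = (a >> i) & 1  # next bit of the first operand
        y = (b >> i) & 1  # next bit of the second operand
        s = x ^ y ^ carry                     # one-bit sum
        carry = (x & y) | (carry & (x ^ y))   # carry saved for the next bit time
        result |= s << i                      # write the result bit back
    return result  # the final carry is dropped, so the sum wraps at 24 bits


# Timing implied by the article's numbers: 27 clock cycles per operation
# and 12,800 additions per second.
CYCLES_PER_OP = 27
ADDS_PER_SECOND = 12_800
implied_clock_hz = CYCLES_PER_OP * ADDS_PER_SECOND

print(serial_add(5, 7))
print(implied_clock_hz)
```

The loop body is all the arithmetic hardware a serial machine needs: a single full adder and one bit of carry storage, reused 24 times per word.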
The computer has an unusual cylindrical structure, 29 inches (74 cm) in diameter, designed to fit the diameter of the Minuteman missile. The computer itself is the bottom half of the cylindrical shell. The top half is the electronic equipment chassis, holding the power supplies for the computer and the stable platform, as well as servo control amplifiers, oscillators, and converters.
The computer doesn't have any RAM. Instead, all instructions, data, and registers are stored on a hard disk, but not like a modern hard disk. The disk has separate, fixed heads for each track so it can access tracks without seeking. (This approach is similar to a computer built around drum memory, except the drum is flattened.) In total, the disk held just 2727 24-bit words (approximately 8 Kbytes). The computer's serial processing and its disk-based storage worked well together. The disk provided data one bit at a time, which the computer would process serially. The results were written back to the disk, one bit at a time as calculation proceeded. The write head was positioned just behind the read head so a value could be overwritten as it was computed.
The photo below shows the numerous read and write heads for the D-17B's hard disk. Note that the heads are fixed (unlike modern hard drives), and the heads are widely distributed across the surface. (There is no need for different tracks to be aligned.) I believe that the green and white heads in pairs are for the "regular" tracks, while the heads with other spacings implement registers and short-term storage called loops.9
The D-17B computer was transistorized. The photo below shows one of its circuit boards, crammed with transistors (the black cylinders), resistors, diodes, and other components. (This board is a read amplifier, amplifying the signals from the hard disk.) The computer used diode-resistor logic and diode-transistor logic to minimize the number of transistors; as a result, it used 6282 diodes and 5094 resistors compared to 1521 silicon and germanium transistors (source).
The computer supported 39 instructions. Many of the instructions are straightforward: add, subtract, multiply (but no divide), complement, magnitude, AND, left shift, and right shift. The computer handled 24-bit words as well as 11-bit split words, so many of these instructions had "split" versions to operate on a shorter value. One unusual instruction was "split compare and limit", which replaced the accumulator value with a limit value from memory, if the accumulator value exceeded the limit.
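The behavior described for "split compare and limit" amounts to a one-sided clamp. The sketch below is only my reading of that one-sentence description; the function name is hypothetical, and details such as how the real instruction treated sign or magnitude in an 11-bit split word are not covered here.

```python
def compare_and_limit(accumulator, limit):
    """Return the limit if the accumulator exceeds it; otherwise keep the accumulator."""
    if accumulator > limit:
        return limit  # accumulator replaced by the limit value from memory
    return accumulator  # accumulator left unchanged


# Hypothetical values, for illustration only.
print(compare_and_limit(900, 500))
print(compare_and_limit(120, 500))
```

An operation like this would be handy for keeping a computed command within a safe range without needing a conditional branch.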
The focus of the computer was I/O with 48 digital inputs, 26 incremental inputs, 28 digital outputs, 12 analog voltage outputs, and 3 pulse outputs for gyro control. The computer had special instructions to support the various inputs and outputs.10 For example, to integrate pulse signals from the stable platform, the computer had instructions to enter and exit "Fine Countdown" mode, which caused two special registers to operate as digital integrators, in parallel with regular computation (details).
The D-37 computer
For the Minuteman II missile, Autonetics built the D-37 computer, one of the earliest integrated circuit computers. By using integrated circuits, the guidance computer was dramatically shrunk, increasing range, functionality, and accuracy. The photo below compares the size of the older D-17B computer (half-cylinder) with the D-37B (held by the engineer).
Although the main task of the computer is guidance, with the increased capacity of the D-37, the computer took over many of the tasks formerly performed by ground support equipment. The D-37 managed "ground control and checkout, monitoring, communication coding and decoding, as well as the airborne tasks of navigation, guidance, steering, and control" (link).
The D-37 had several models. The D-37A was the prototype system, while the D-37B was deployed in the first 60 Minuteman II missiles. The Air Force soon realized that nuclear radiation posed a threat to the computer, so they developed the radiation-hardened D-37C.11 The Minuteman III used the D-37D, an improved and slightly larger version. Even with additional disk space, program memory was so tight that software features were dropped to save just 47 words.
In architecture and performance, the D-37 computer is almost the same as the D-17B, but extended. Most importantly, the D-37 kept the serial architecture of the D-17B, so it had the same slow instruction speed. The D-37 kept the instruction set of the D-17B, with additional instructions such as division, logical OR, bit rotates, and more I/O, giving it 58 instructions versus 39 in the older computer. It expanded the hard disk storage, using a double-sided disk that provided 7222 words of storage in the D-37C.12 The D-37 included division implemented in hardware (which the D-17B didn't have), along with a faster hardware implementation of multiplication, improving the speed of those instructions.13 The D-37C added more I/O lines, as well as radio input and 32 analog voltage inputs.
The diagram below shows the D-37C computer, used in the Minuteman II. At the left is the hard disk that provides the computer's memory. Most of the computer is occupied by complex circuit boards covered with flat-pack integrated circuits. At the right is the advanced switching power supply, generating numerous voltages for the computer (±3, 6, 9, 12, 18, and 24 volts). The connectors at the top provide the interface between the computer and the rest of the system. Because the computer has so many digital (discrete) and analog signals, it uses multiple 61-pin connectors (details).
The D-37C computer was built from 22 different integrated circuits, custom-built by Texas Instruments for the Minuteman project. These chips ranged from digital functions such as NAND gates and a flip-flop to linear amplifiers to specialized functions such as a demodulator/chopper. Texas Instruments sold the Minuteman series integrated circuits on the open market, but the chips were spectacularly expensive ($55 for a flip-flop, over $500 in current dollars) and not as popular as TI's general-purpose integrated circuits.14 The circuit boards were very complex for the time, with 10 interconnected layers. Each board was about 4 × 5½ inches and held about 150 flatpack integrated circuits, with components on both sides.
The growth of the integrated circuit industry owes a lot to the Minuteman computer and the Apollo Guidance Computer, both developed during the early days of the integrated circuit. These projects bought integrated circuits by the hundreds of thousands, helping the IC industry move from low-volume prototypes to mass-produced commodities, both by providing demand and by motivating companies to fix yield problems. Moreover, both computers required high-reliability integrated circuits, forcing the industry to improve its manufacturing processes. Finally, Minuteman and Apollo gave integrated circuits credibility, showing that ICs were a practical design choice.
The Minuteman III used the D-37D computer, which had about twice the disk capacity, 14,137 words. The layout is similar to the D-37C above, with the disk drive on the left and the power supply on the right. Since the computer is mounted "upside down", the boards are not visible inside, blocked by the interconnect board.15 Note the use of flexible PCBs, advanced technology for the time, soldered with low-melting-point indium/tin solder.
By 1970, the D-37 computer had made the cylindrical D-17B obsolete. The government gave away surplus D-17B computers to universities and other organizations for use as general-purpose microcomputers. Dozens of organizations, from Harvard to the Center for Disease Control to Tektronix, jumped at the chance to obtain a free computer, even if it was slow and difficult to use, and formed a large users group to share programming tips.
The P92 amplifier
The amplifier provides the interface between the computer and the rest of the missile. The amplifier sends control signals to the missile's four stages, controlling the engines and steering. (The electronic circuitry from the Minuteman I's nozzle control units was moved to the amplifier, simplifying maintenance.) Moreover, the Minuteman has explosive ordnance in many places, ranging from small squibs that activate valves to explosives that separate the missile stages. The amplifier sends the high-current (30 amp) signals to detonate the ordnance, while monitoring the current to detect faults.16 The amplifier acts as a safety device for the ordnance, blocking signals unless the amplifier has been armed with the proper code. The amplifier sends control signals to the reentry system (i.e. the warheads) as well as the chaff dispenser, which emits clouds of wires to jam enemy radar. The amplifier also sends and receives signals through the umbilical cable from the ground equipment.
The photo above shows the amplifier with its cover removed. The amplifier is constructed as two stacks of six circuit boards, on top of a double-width power supply board. At the top and bottom of each board, connectors with thick cables connect the boards to the rest of the system. Each board is a multi-layer printed-circuit board built on a thick magnesium frame for cooling. The amplifier has five power switching boards, a valve driver board, three servo amplifier boards, and an ACTR control board (whatever that is). The system board is visible on the left, with large capacitors and precision 0.01% resistors. To its right is the decoder board, presumably decoding computer commands to select a particular I/O device. Note the extensive use of Texas Instruments flat-pack integrated circuits on this board, the tiny white rectangles.
Missile Guidance Set Control
The Missile Guidance Set Control (MGSC) contains the electronics to power and run the inertial measurement unit (IMU), providing the interface to the computer. The MGSC handles the platform servo loop, accelerometer servo loops, gyroscope torquing, gyrocompass torquing and slew, and accelerometer temperature control.17 One unexpected function of the MGSC is powering the computer's hard disk, supplying 400 Hz, 3-phase power at 27.25 volts (source).
The MGSC is constructed from hinged metal modules, each with a particular function, shown above. The modules are constructed around printed circuit boards. Two large connectors at the right of the MGSC provide electrical connectivity with the IMU and computer. At the top and bottom of the MGSC are connections for coolant. The MGSC is roughly equivalent to the top half of the Minuteman I's cylindrical guidance system, opposite the computer half. The MGSC is unchanged between the Minuteman II and Minuteman III. The MGSC is normally covered with a metal cover that provides radiation protection, but the cover is missing in the photo above.
Battery
The battery in the Minuteman Guidance System is very unusual, since it is a "reserve battery", completely inert until activated. It is a silver/zinc battery with the electrolyte stored separately, giving the battery an essentially infinite shelf life. To power up the battery during a launch, a gas generator inside the battery is ignited by a squib. The gas pressure forces the potassium hydroxide electrolyte out of a tank and into the battery, energizing the battery in under a second. The battery can only be used once, of course, and you can't test it. The battery was built by Delco-Remy (a division of General Motors) (details). It provides 28 volts at 14.5 Amp-hours, powering the guidance system and most of the missile; a separate battery powers the first-stage rocket.
The photo above shows the battery mounted inside the guidance system. Note the two thin wires attached to the posts on the left front of the battery to enable the battery, and the thick power wires bolted to the posts on the right. Above these posts is an "electrolyte vent port"; I'm not sure what prevents caustic electrolyte from spraying out under high pressure.
The photo below shows the construction of a Minuteman I battery, similar but with two independent battery blocks. The two round gas generators on the front of the electrolyte tube force the electrolyte into the battery sections.
Squib-activated switch
Another unusual component is the squib-activated switch. This switch is activated by a small explosive squib; when fired, the squib forces the switch to change positions. This switch may seem excessively dramatic, but it has a few advantages over, say, an electromagnetic relay. The squib-activated switch will switch solidly, while the contacts on a relay may "chatter" or bounce before settling into their new positions. An electromagnetic relay may require more current to switch, especially if it has large contacts or many contacts. However, like the battery, the squib-activated switch can only be used once.
The purpose of the switch is to disconnect important signals, known as critical leads, during launch. The Minuteman missile has an umbilical connection that provides power, cooling, and signals while the missile is in the silo. Just before the umbilical cable is disconnected, the switch severs the connections for the master reset signal along with an enable and disable signal. Presumably, these control signals are cleanly disconnected to avoid stray signals or electrical noise that could cause problems when the umbilical connection is pulled off.
The photo below shows the umbilical cable connected to a Minuteman II missile in its silo. Also note the window in the side of the missile to allow the light beam from the autocollimator to reflect off the guidance platform for alignment.
Cooling
The guidance system is water-cooled while in the silo, using a solution of sodium chromate to inhibit corrosion. After launch, the guidance system operated for just a few minutes before releasing the warheads, so it operated without water cooling. (The stable platform has a fan and heat exchanger to keep it cool during flight.) The diagram below highlights the cooling lines. Coolant is provided from the ground support equipment through the umbilical connector in the upper right. It flows through the computer, diode assembly, MGSC, and stable platform. Finally, the coolant exits through the umbilical connector.
Diode assembly
In the middle of the guidance system, the diode assembly consists of seven power diodes. These diodes control the power flow when switching from ground power to battery power. The photo below shows the diode assembly, with coolant connections at the top and bottom. The thick gray wire in the center of the diode assembly receives power from the battery just to the left.
Permutation plug
The Permutation plug (or P-plug) was the key cryptographic element of the guidance system, defining the launch codes for a particular missile. The P-plug looked similar to a hockey puck and plugged into a 55-pin socket attached to the amplifier. The retaining bar held the P-plug in place.
Because the security of the missile hinged on the P-plug, the P-plug was handled in a highly ritualized way, transported by a two-person team, an airman and an officer, both armed (source). After the guidance system underwent maintenance, the P-plug team would ensure that the plug was properly installed, just before the missile was bolted back together. There was also a lot of ritual around the disk memory, since it held security codes and targeting information.18 Before anyone could work on the computer, a special team would come to the silo and erase the memory. Afterward, another team would load up the computer from a magnetic tape (in the case of Minuteman III) or punched tape (earlier).19
The missile launch codes are said to be split between the hard disk and the permutation plug. In particular, the missile software holds a two-word code for each of the five launch control facilities.22 The launch code in an Execute Launch Command (ELC) must match the combination of the P-plug value and the site-specific value on disk.23 Thus, the launch code is unique to each launch control site and each missile.24 As another security feature, a launch requires messages from two launch control sites, unless only one is available.25
Transient current detector
A nuclear blast has many bad effects on semiconductors and can cause transient errors. A rather brute-force approach was used to minimize this risk in the D-37C and D-37D computers: if a nuclear blast is detected, the computer stops writing to disk until the burst of radiation passes by. When the radiation level drops, the computer carries on from where it left off, extrapolating to make up for the lost time26 to minimize the error. Since all data is stored on the hard disk, the system doesn't need to worry about memory corruption as could happen with semiconductor RAM.
The Minuteman documents euphemistically refer to "operating in a hostile environment" for the ability to handle large pulses of radiation from a nearby nuclear explosion. Another euphemism is "seismic environment", when a nuclear blast near a silo could disturb the missile's targeting alignment. To get an idea of the expected forces, note that the launch officers were strapped into their seats with four-point harnesses to protect against the seismic environment.27
The "transient current detector" above detects dangerous levels of radiation. I couldn't find any details, but I suspect that it contains a semiconductor and detects transient current through the semiconductor induced by radiation. It would make sense to use a semiconductor similar to the ones in the computer so the detector's response matches the response of the computer, perhaps a matching Texas Instruments IC.
The Minuteman III also has two "field detectors" mounted on the outside of the guidance ring. These presumably detect large fluctuations in the electromagnetic field, indicating an electromagnetic pulse (EMP), different from the ionizing radiation picked up by the Transient Current Detector.
Conclusions
The Minuteman guidance system is full of innovative technologies. Among other things, Minuteman I used an early transistorized computer, and Minuteman II used one of the first integrated circuit computers. The Minuteman missile isn't just something from the past, though. There are currently 400 Minuteman missiles in the United States, ready to launch at a moment's notice and create global devastation. Thus, its technical achievements can't be glorified without reflecting on the negativity of its underlying purpose. On the other hand, Minuteman has succeeded (so far) in its purpose of deterrence, so it can also be viewed in a positive, peacekeeping role. In any case, the Minuteman technology is morally ambiguous, compared to, say, the Apollo Guidance Computer.
I plan to write more about the role of Minuteman and Apollo in the IC industry, so follow me on Mastodon as @kenshirriff@oldbytes.space or RSS for updates. Probably the best overview of Minuteman is Minuteman weapon system history and description. The book Minuteman: A technical history has thorough information. For information on the missile targeting and alignment process, see Association of Air Force Missileers Newsletter, December 2006. The Minuteman guidance system is described in detail in The evolution of Minuteman guidance and control. Much of the imagery in this article is from the National Air and Space Museum. Thanks to Martin Miller for providing a detailed D-37C photo. He has taken amazing photos of nuclear equipment, published in his book Weapons of Mass Destruction: Specters of the Nuclear Age, so check it out.
Notes and references
-
The Minuteman missile was introduced in 1962, followed by the improved Minuteman II in 1965 and the Minuteman III in 1970. From 1966 to 1985, the US had 1000 Minuteman missiles fielded, but the number has been reduced since then due to various arms control agreements. At present, there are 400 active Minuteman III missiles spread among 450 launch sites. The Minuteman guidance system was updated in the early 2000s to a platform called the NS-50, using a computer based on a MIL-STD-1750A microprocessor. I'm not discussing that system in this post for reasons of space.
Although the Minuteman has undergone modernization projects, it is reaching the end of its life and is scheduled to be replaced by the Sentinel missile. The Sentinel program is encountering delays and is over budget by 80%, raising the risk of cancellation, but it is proceeding as of July 2024. ↩
-
Disclaimer: This information is all from published sources. There's nothing secret, and it's mostly obsolete from 60 years ago. I don't have access to a Minuteman system (unlike the Titan), so this post is based on publications and photos, rather than hands-on experience. I've tried to be accurate, but I'm sure there are errors. ↩
-
Different guidance algorithms can be used, such as Q-guidance, delta guidance, explicit guidance, and numerical integration; the more advanced algorithms require better computers but provide easier targeting, better accuracy, and more ability to correct for course deviations (see Present and Advanced Guidance Techniques). Q-guidance uses a precomputed "Q matrix" to constantly determine the direction in which velocity needs to be gained, while delta guidance attempts to keep the missile along a precomputed trajectory by using polynomials. In explicit guidance, the equations of motion are solved to determine the steering direction. Minuteman used delta guidance at first, but moved to "hybrid explicit" guidance when the computer became more advanced. See Minuteman: A technical history, page 234 for more on targeting algorithms. ↩
-
On Minuteman I, the three stages were steered by changing the direction of the rocket nozzles. Minuteman II, however, used a single fixed nozzle on the second stage but injected fluid into the exhaust to steer the missile, a technique called liquid injection thrust vector control. The Minuteman III used this technique on the third stage as well, injecting a strontium perchlorate solution. (Small nozzles powered by a gas generator are used for roll control, since directing the exhaust won't produce roll motion.) The thrust control liquid was Freon 114B2, which turned out to be harmful to the ozone layer, so it was replaced in the 1990s with perfluorohexane. ↩
-
Strictly speaking, the launch azimuth wasn't aimed at the target. Because the Earth rotated during the missile's flight, the launch azimuth was aimed at where the target would be when the warhead landed. Another factor was the Minuteman I had a limited ability to steer off the launch azimuth, about 10°, allowing the missile to switch between two targets at launch time. ↩
-
The Minuteman guidance system is designed to achieve as much accuracy as possible. One problem is that the gyroscopes and accelerometers aren't perfect, but have small errors due to friction and other factors. Moreover, the construction of the stable platform isn't exact; components that should be parallel or perpendicular will have tiny angle errors. To deal with these problems, the missile performs periodic calibrations ranging from some every 15 minutes to some every few months.
To assist with calibration, the guidance platform contains electrolytic bubble levels, similar to an ordinary carpentry level, but extremely sensitive. Each bubble level contains wires positioned partially in the bubble and partially in the conductive electrolyte fluid. As the bubble shifts, the amount of wire in the fluid changes, changing the measured resistance. The levels allow the stable platform to be rotated to known positions relative to gravity for calibration.
The top of the gyrocompass has two mirrors for calibration, allowing the missile platform to rotate exactly 180° relative to the autocollimator. Every 15 minutes, the platform would flip over to measure the gyroscope and accelerometer signals in the opposite orientation. This allowed much better calibration, canceling out errors and improving the missile accuracy. Other calibrations were performed less frequently, such as checking each accelerometer in the up and down positions. Every 90 days, a calibration called PSAT (Perturbation Self-Alignment Technique) pitched the platform by 90° and then slowly rotated the gyrocompass around the vertical to simulate the Earth's rotation (details).
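The benefit of the 180° flip can be illustrated with a minimal sketch. It assumes a simple model that is not taken from the Minuteman documentation: each reading is the true signal plus a constant bias, and flipping the platform reverses the sign of the signal but not the bias. The function name and the numbers are hypothetical.

```python
def split_signal_and_bias(reading_normal, reading_flipped):
    """Separate the true signal from a constant bias using two opposite orientations.

    Assumed model (for illustration only):
        reading_normal  = +signal + bias
        reading_flipped = -signal + bias
    """
    signal = (reading_normal - reading_flipped) / 2.0
    bias = (reading_normal + reading_flipped) / 2.0
    return signal, bias


# Hypothetical accelerometer sensing gravity with a small bias error.
true_signal = 9.80665
true_bias = 0.003
signal, bias = split_signal_and_bias(true_signal + true_bias,
                                     -true_signal + true_bias)
print(signal, bias)
```

Under this model, the sum of the two readings isolates the bias and the difference isolates the signal, which is why measuring in both orientations cancels constant errors.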
Another alignment measurement checks the angle between the two mirrors. The two mirrors on the alignment block are supposed to be parallel, but they won't be exactly parallel. The guidance platform periodically rotates the mirror assembly to check one mirror and the other against the autocollimator to compute the angle between the mirrors, called zeta. (See Software Validation Study, page A-94.)
These calibrations permitted the measurement of small biases and imperfections in the gyroscopes and accelerometers; this data was fed into the guidance calculations to squeeze out as much accuracy as possible. These measurements also provided statistical tracking of the devices so they could be replaced if their performance started to deteriorate. ↩
-
Inconveniently, I found contradictory sources about the Minuteman coordinate system. Most sources specify Z as the roll axis, but one detailed paper swaps the X and Z axes, maybe to match simulation software. Examining Figure 5 closely shows that the new axis names were drawn in by hand. ↩
-
The flight time of Minuteman depended on the distance and trajectory. The Minuteman's range is said to be 13,000 km. For a closer target, there are two possible trajectories: a high path and a low path. Being direct, the low path could take about 25 minutes, while the high path would reach an altitude of over 1500 nautical miles (almost 3000 km, seven times the altitude of the ISS) and take 45 minutes. See A simulation of Minuteman Trajectories. ↩
-
The disk holds a timing track, which provides the timing for the computer, giving it a 345.6 kHz clock speed. Note that all operations in the computer are synchronized to the disk, rather than a clock inside the computer. One consequence of this is that the processor speed depends on the disk speed, so it isn't as precise as most computers, which generate the clock from a quartz crystal. The processor timing is very important for a guidance computer, since its calculations of positions depend on the time step. If the processor is running fast or slow, the position will be correspondingly wrong. The solution is that the computer calculates a parameter "tau", the ratio between processor time and wall clock time. The computer receives an interrupt exactly once per second; by counting the number of instructions executed between interrupts, the computer can compute tau and ensure that the calculations are accurate. ↩
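The tau calculation can be sketched as follows. This is only an illustration of the idea, not the flight software: the nominal count of 12,800 per second is an assumed value, and the function names are hypothetical.

```python
# Assumed nominal number of instructions per second (illustrative, not from the source).
NOMINAL_COUNT_PER_SECOND = 12_800


def compute_tau(instructions_counted, nominal=NOMINAL_COUNT_PER_SECOND):
    """Ratio of processor time to wall-clock time over one interrupt interval.

    instructions_counted is the number of instructions executed between two
    once-per-second interrupts.
    """
    return instructions_counted / nominal


def wall_clock_step(nominal_step_seconds, tau):
    """Real duration of a step that the processor believes lasts nominal_step_seconds."""
    return nominal_step_seconds / tau


# Hypothetical case: the disk is spinning 1% fast, so more instructions fit in a second.
tau = compute_tau(12_928)
print(tau, wall_clock_step(1.0, tau))
```

With tau in hand, any calculation that depends on the time step can be scaled so that a fast or slow disk does not translate into a position error.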
-
The computer has 8-bit analog-to-digital converters. The D-37C supports 32 analog inputs with a range of +/- 10 volts (source). It also has four digital-to-analog outputs with 8-bit accuracy, also +/- 10 volts.
In the D-17B, nine analog outputs control the rocket steering, providing roll, pitch, and yaw to the three stages, while three analog outputs go to the stable platform, probably positioning the gimbals. ↩
-
The housing for the stable platform provides radiation shielding; it is one of the few parts of the guidance system that is officially secret, but is said to be tantalum sheeting (see Minuteman: A technical history page 224). Although the computer is also said to have radiation shielding, it is curiously not on the secret list. ↩
-
Sources give different memory capacities. The reason is that in addition to the regular memory, part of the disk is used for special purposes including registers and rapid access loops. The problem with the regular memory is that the processor may need to wait for an entire disk revolution to access a particular word. The solution is rapid access loops: by putting the write head just upstream of the read head, the data can be accessed more rapidly. For instance, if the write head is positioned one word length upstream, the word can be read (and rewritten) every cycle, providing immediate access to a single word. Putting the write head further upstream allows storage of longer values, with a corresponding longer wait. The D-37C has ten rapid-access channels of one to 16 words (source). The regular memory in the D-37C consists of 56 channels (i.e. tracks) of 128 words, totaling 7168 words. Counting the loops and registers yields the higher memory capacity of 7222 words. ↩
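The behavior of a rapid access loop can be modeled with a short simulation. This is a minimal sketch that treats the track as a circular queue advancing one word per word time; the class and function names are hypothetical, and the real hardware details are not modeled.

```python
from collections import deque


class RecirculatingLoop:
    """Toy model of a write head placed `length` words upstream of a read head."""

    def __init__(self, length):
        self.words = deque([0] * length)

    def tick(self, write=None):
        """Advance one word time.

        The word under the read head is returned and rewritten upstream,
        unless a new value is supplied to replace it.
        """
        word = self.words.popleft()
        self.words.append(word if write is None else write)
        return word


def ticks_until_readback(length, marker=0o777):
    """Count the word times from writing a marker until it is read back."""
    loop = RecirculatingLoop(length)
    loop.tick(write=marker)
    ticks = 1
    while loop.tick() != marker:
        ticks += 1
    return ticks


# Regular memory size from the text: 56 channels of 128 words.
REGULAR_WORDS = 56 * 128

for length in (1, 16, 128):
    print(length, ticks_until_readback(length))
```

In this model the wait grows in proportion to the loop length, which matches the trade-off described above: a one-word loop is available every cycle, while a full 128-word channel can require a whole revolution.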
-
The differences between the D-17B and D-37C instruction sets are described here. ↩
-
The schematic for the Minuteman's flip-flop IC is shown below. This is a complex circuit for the time, with six transistors along with numerous resistors, diodes, and capacitors.
Flip-flop schematic. From Integrated circuits go operational, Electronics, Feb 15, 1963.
-
The diagram below shows an exploded view of the D-37D computer (rotated 180° from the earlier photo).
Exploded view of the D-37D computer. Modified and fixed from Minuteman weapon system history and description.
-
The danger of these explosives is illustrated by a bizarre accident summarized by "The warhead is no longer on top of the missile." At 3:00 pm on December 5, 1964, two airmen were in the missile silo, troubleshooting a fault in the security system. One airman removed a fuse, triggering a loud explosion and the nuclear warhead fell off the missile, falling 75 feet to the floor of the silo. Nobody was injured and the warhead was hoisted out a few days later without incident.
The problem was that the airmen used an "unauthorized tool" (a screwdriver) to remove the fuse, briefly shorting power to ground. This caused a current on a ground line connected to the missile through an umbilical cable. Inside the missile, the retrorocket for the warhead had an igniter, but a short on its connector caused another connection to ground. This ground went out through a second umbilical, closing the circuit. (Apparently, the resistance between the two grounds was high enough that the path through the two shorts had enough current to ignite the igniter.) The force of the retrorocket flung the warhead off the rocket.
More details are in this report and this report. (This incident is not the 1980 Damascus Titan incident, where a dropped 8-pound wrench socket led to the explosion of the missile, killing one person and injuring 21 others, while flinging the warhead out of the silo. The very interesting book Command and Control discusses the Damascus incident and other mishaps with nuclear weapons.) ↩
-
The functional diagram below shows the interactions between the stable platform and the guidance set. Shaded circuits are mounted on the stable platform, while others are in the control set. This diagram is for the later NS-50 platform, but it should be mostly relevant to the NS-20 used earlier in Minuteman III. At the top are the feedback loops for the PIGA accelerometers. The torque motors (TM) in the middle provide feedback through the gimbals for the gyroscopes. Below that, the gyrocompass has a feedback loop with its internal torquer. The torque motor at the bottom rotates the gyrocompass and mirrors with feedback through the optical resolver.
Platform Control Functional Diagram. From Technical Reference Handbook, SELECT WS133A, D2-27524-5, Fig. 3-12, page 3-68.
-
The Air Force was especially concerned with keeping the targeting information secret; the people launching the missiles had no idea what the targets were. It occurs to me, though, that since the Minuteman I missile had to be physically rotated in its silo to exactly line up with the target, one presumably could draw an azimuth line on the map and know the target was along the line. ↩
-
The Minuteman computer has a conditional fill mode, where the computer can't be loaded with a new program unless the first four words match the first four words in memory channel 12. This ensures that the computer can't be loaded with unauthorized software. This four-word code must be different from the P-plug value for two reasons. First, the P-plug value is not allowed to be stored in memory. Second, the filling code is four words, while the P-plug value is two words.
The P-plug held two hardwired code words that could be read by the processor.20 For security, the two words were not allowed to be in memory (i.e. the hard drive) at the same time. I assume it is called a Permutation Plug for historical reasons; the Saturn V booster used in Apollo used a security plug that provided a permutation of the 21-character code.21 (That is, it mapped 21 inputs to 21 outputs as a permutation.) ↩
-
The processor read the P-plug code words by first triggering the discrete output #25 with the DOB 25 instruction (Discrete Output B) and then reading the value (twice for reliability). The process was repeated with output #6. Finally, the discretes were cleared with DOB 0 (reference). ↩
-
The Apollo flights used "code plugs" to protect the Range Safety system from unauthorized access, since this system was capable of blowing up the Saturn V rockets (details). Signals were transmitted in a 21-symbol "alphabet" (encoded by 2 tones out of 7). The code plug permuted the 21 symbols in an arbitrary way. This wasn't a lot of security, just a simple substitution cipher, but it was sufficient for its role. A command consisted of 11 characters (9 for the address and 2 for the command), so the odds were low of hitting a valid message by chance. ↩
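The code plug scheme can be sketched in a few lines. The only details taken from the text are the 21-symbol alphabet (2 tones out of 7), the permutation, and the 11-character command; the particular permutation, the seed, and the sample message below are hypothetical.

```python
import itertools
import random

# Each symbol is a pair of distinct tones chosen from 7, giving 21 symbols.
TONES = range(7)
SYMBOLS = list(itertools.combinations(TONES, 2))


def make_code_plug(seed):
    """Build an arbitrary permutation of the 21 symbols (a stand-in for the plug wiring)."""
    shuffled = SYMBOLS[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(SYMBOLS, shuffled))


def invert(plug):
    """Return the inverse permutation, as the matching plug at the other end would need."""
    return {out: inp for inp, out in plug.items()}


def apply_plug(plug, message):
    """Substitute each symbol of the message through the plug."""
    return [plug[symbol] for symbol in message]


plug = make_code_plug(seed=1969)  # hypothetical seed
# A made-up 11-character command: 9 address characters and 2 command characters.
message = [SYMBOLS[i] for i in (0, 5, 20, 3, 3, 7, 11, 2, 19, 4, 8)]
encoded = apply_plug(plug, message)
decoded = apply_plug(invert(plug), encoded)

# Number of distinct 11-character sequences over the 21-symbol alphabet.
possible_messages = len(SYMBOLS) ** 11
print(len(SYMBOLS), decoded == message, possible_messages)
```

Because the plug is a fixed substitution, it is indeed just a simple cipher, but the size of the 11-character message space illustrates why a random transmission was unlikely to form a valid command.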
-
One feature of the Minuteman missile is that the missile sites themselves are uncrewed; the missile officers who launch the missiles work remotely, handling multiple missiles to reduce the personnel required. Specifically, each group of 10 missiles (called a "flight") is controlled by an underground launch control center. A squadron consists of 50 missiles. A "wing" is the largest grouping, handling 150 to 200 missiles, and attached to a particular Air Force base. At its peak, Minuteman had 1000 missiles divided among six wings in Missouri, Montana, North Dakota, South Dakota, and Wyoming, with missiles spilling across the Wyoming border into Colorado and Nebraska. ↩
-
Information on the launch code mechanism is from Technical Reference Handbook D2-27524-5, "System Engineering Level Evaluation Correction Team, WS133A", chapter 2. ↩
-
The Command Signals Decoder provides another layer of security. It is an electromechanical stepping decoder that blocks the first-stage rocket from igniting unless it receives the proper 27-bit code as part of an Enable command. (The Enable command (ENC) happens before the Execute Launch command (ELC); see the state diagram below.) Its operation is murky; my hypothesis is that the decoder acts much like a combination lock, with the 27 code posts raised or lowered by the input bits. If all the posts are in the proper position, the inner wheel is released, allowing it to rotate to the armed position and close the electrical firing circuit for the motor igniters. Specifically, the 27 posts have a high notch on one side and a low notch on the other, so the device is programmed by rotating each pin so the desired notch faces inward. When the device receives code bits, the wheel rotates one position for each bit and a solenoid raises or lowers the pin, depending on if it is a zero or one. If all pins are in the correct positions, the inner wheel can rotate through the notches, but if any pins are incorrect, the inner wheel will bind on that pin. The 27 bits are the "CSD(M) secure code", probably consisting of 24 code bits and three padding bits. Another Command Signals Decoder on the ground "CSD(G)" provides an interlock for ground ordnance.
The Command Signals Decoder, from Evolution of ordnance subsystems and components design in Air Force strategic missile systems.
I think there are two motivations behind this complicated device. First, they want an interlock that is mechanical rather than electronic, since an electronic device can be affected unpredictably by radiation, power surges, component failure, programming errors, etc. Second, they want an interlock that physically disconnects the firing circuit so there is no path that can be triggered by stray current, lightning, EMP, etc.
The Minuteman's P92 amplifier assembly also blocks ordnance unless armed with a code. It's unclear if this is the same enable code as the Command Signals Decoder or a different code.
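The combination-lock hypothesis for the Command Signals Decoder can be expressed as a toy model. This sketch follows the hypothesis in the text, not any documented design: the class, its methods, and the sample code are all hypothetical.

```python
CODE_LENGTH = 27  # number of code posts, per the text


class CommandSignalsDecoderModel:
    """Toy model: the wheel steps once per received bit and sets one pin each time."""

    def __init__(self, programmed_code):
        if len(programmed_code) != CODE_LENGTH:
            raise ValueError("the programmed code must have 27 bits")
        # Which notch (0 or 1) faces inward on each post.
        self.programmed = list(programmed_code)
        self.pins = []

    def receive_bit(self, bit):
        """Raise or lower the pin at the current wheel position, then step the wheel."""
        if len(self.pins) < CODE_LENGTH:
            self.pins.append(bit)

    def armed(self):
        """The inner wheel turns freely only if every pin lines up with its notch."""
        return self.pins == self.programmed


# Hypothetical 27-bit code for illustration.
code = [1, 0, 1] * 9
good = CommandSignalsDecoderModel(code)
for bit in code:
    good.receive_bit(bit)

bad = CommandSignalsDecoderModel(code)
wrong = code[:]
wrong[13] ^= 1  # a single incorrect bit
for bit in wrong:
    bad.receive_bit(bit)

print(good.armed(), bad.armed())
```

The point of the model is that a single wrong pin is enough to bind the wheel, so the firing circuit stays physically open unless all 27 bits match.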
The earlier Titan missile also had a code mechanism to prevent an unauthorized launch by blocking the engine. The Titan had a butterfly valve in the fuel line with a 6-digit code. If you don't enter the right code, the fuel line stays shut and the missile simply can't take off (video). ↩
-
A missile launch normally requires an Execute Launch Command (ELC) from two launch control sites, moving the missile to the "Launch in Process" mode. However, that raises the concern that only one site might survive. The solution is that after receiving a single launch command, the missile starts a timer. If the "one-vote launch time" passes uneventfully, the missile is launched. However, another site can cancel a rogue launch during that time by sending an Inhibit Command (INC) message. The sites have a complex system to detect which sites are active and to determine the primary and secondary sites controlling each missile. (This is reminiscent of the Byzantine generals problem.)
The state machine for Minuteman missile status. From Technical Reference Handbook D2-27524-5, page 2-25.
-
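The two-vote and one-vote logic can be summarized as a small state machine. This is a simplified sketch of the behavior described in this footnote only; the state names other than "Launch in Process", the timer length, and the site names are hypothetical, and the real system has many more states.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"                          # hypothetical name
    ONE_VOTE_PENDING = "one vote pending"  # hypothetical name
    LAUNCH_IN_PROCESS = "launch in process"


class LaunchVoteModel:
    def __init__(self, one_vote_delay_ticks):
        self.one_vote_delay_ticks = one_vote_delay_ticks
        self.state = State.IDLE
        self.voters = set()
        self.remaining = None

    def execute_launch_command(self, site):
        """Record an ELC; two distinct sites move straight to Launch in Process."""
        if self.state is State.LAUNCH_IN_PROCESS:
            return
        self.voters.add(site)
        if len(self.voters) >= 2:
            self.state = State.LAUNCH_IN_PROCESS
            self.remaining = None
        elif self.state is State.IDLE:
            self.state = State.ONE_VOTE_PENDING
            self.remaining = self.one_vote_delay_ticks

    def inhibit_command(self):
        """An INC during the one-vote timer cancels the pending vote."""
        if self.state is State.ONE_VOTE_PENDING:
            self.state = State.IDLE
            self.voters.clear()
            self.remaining = None

    def tick(self):
        """Advance the one-vote timer by one time unit."""
        if self.state is State.ONE_VOTE_PENDING:
            self.remaining -= 1
            if self.remaining <= 0:
                self.state = State.LAUNCH_IN_PROCESS


model = LaunchVoteModel(one_vote_delay_ticks=3)
model.execute_launch_command("site A")
model.inhibit_command()  # another site cancels the single vote
print(model.state)
```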
After detecting a nuclear blast, the Minuteman computer shuts down for an integral number of disk revolutions. When it comes back up, it double-counts the accelerometer pulses for the same number of disk revolutions to make up for the missed time (see Minuteman: A technical history pages 220 and 223). As long as not much changed during the lost time, the accuracy loss is small. Of course, this counter would need to be outside the part of the computer that gets shut down. ↩
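The double-counting recovery can be illustrated with a short simulation. It is a minimal sketch under assumptions of my own: pulses are grouped per disk revolution, and the only state kept during the outage is the number of revolutions missed. The function name and the numbers are hypothetical.

```python
def accumulate_pulses(pulses_per_revolution, outage_start, outage_revolutions):
    """Sum accelerometer pulses, double-counting after an outage to cover the gap.

    pulses_per_revolution: pulses arriving during each disk revolution.
    outage_start: index of the first revolution during which the computer is down.
    outage_revolutions: integral number of revolutions the computer stays down.
    """
    total = 0
    owed = 0  # revolutions still to be double-counted
    for rev, pulses in enumerate(pulses_per_revolution):
        if outage_start <= rev < outage_start + outage_revolutions:
            owed += 1  # pulses are lost; only the outage length is remembered
            continue
        if owed > 0:
            total += 2 * pulses  # count twice to stand in for a missed revolution
            owed -= 1
        else:
            total += pulses
    return total


# Constant acceleration: 5 pulses per revolution for 10 revolutions, 3 of them lost.
steady = [5] * 10
recovered = accumulate_pulses(steady, outage_start=2, outage_revolutions=3)
print(recovered, sum(steady))
```

When the acceleration is steady, the recovered total matches the true total; if the acceleration changes during the outage, the doubled revolutions are only an approximation, which is the small accuracy loss mentioned above.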
-
Missiles were aligned to such accuracy that even running a diesel generator nearby could shift the silo enough to cause alignment problems, as happened with a Titan site. (See Association of Air Force Missileers Newsletter, March 2007, page 6.) A "seismic event" could also be an earthquake; the enormous 1964 Alaska earthquake—9.2 on the Richter scale—caused Minuteman guidance systems to lose alignment with the autocollimator (See Minuteman: A technical history page 221). ↩
Hurrying through the National Gallery of Art five minutes before closing, I passed a Navajo weaving with a complex abstract pattern. Suddenly, I realized the pattern was strangely familiar, so I stopped and looked closely. The design turned out to be an image of Intel's Pentium chip, the start of the long-lived Pentium family.1 The weaver, Marilou Schultz, created the artwork in 1994 using traditional materials and techniques. The rug was commissioned by Intel as a gift to AISES (American Indian Science & Engineering Society) and is currently part of an art exhibition—Woven Histories: Textiles and Modern Abstraction—focusing on the intersection between abstract art and woven textiles.
I talked with Marilou Schultz, a Navajo/Diné weaver and math teacher, to learn more about the artwork. Schultz learned weaving as a child—part of four generations of weavers—carding the wool, spinning it into yarn, and then weaving it. For the Intel project, she worked from a photograph of the die, marking it into 64 sections along each side so the die pattern could be accurately transferred to the weaving. Schultz used the "raised outline" technique, which gives a three-dimensional effect along borders. One of the interesting characteristics of the Pentium from the weaving perspective is its lack of symmetry, unlike traditional rugs. The Pentium weaving was colored with traditional plant dyes; the cream regions are the natural color of the wool from the long-horned Navajo-Churro sheep.2 The yarn in the weaving is a bit finer than the yarn typically used for knitting. Weaving was a slow process, with a day's work extending the rug by 1" to 1.5".
The Pentium die photo below shows the patterns and structures on the surface of the fingernail-sized silicon die, over three million tiny transistors. The weaving is a remarkably accurate representation of the die, reproducing the processor's complex designs. However, I noticed that the weaving was a mirror image of the physical Pentium die; I had to flip the rug image below to make them match. I asked Ms. Schultz if this was an artistic decision and she explained that she wove the rug to match the photograph. There is no specific front or back to a Navajo weaving because the design is similar on both sides,3 so the gallery picked an arbitrary side to display. Unfortunately, they picked the wrong side, resulting in a backward die image. This probably bothers nobody but me, but I hope the gallery will correct this in future exhibits. For the remainder of this article, I will mirror the rug to match the physical die.
The rug is accurate enough that each region can be marked with its corresponding function in the real chip, as shown below. Starting in the center, the section labeled "integer execution units" is the heart of the processor, performing arithmetic operations and other functions on integer numbers. The Pentium is a 32-bit processor, so the integer execution unit is a vertical rectangle, 32 bits wide. The horizontal lines correspond to different types of circuitry such as adders, multipliers, shifters, and registers. To the right, the "floating point unit" performs more complex arithmetic operations on floating-point numbers, numbers with a fractional part that are used in applications such as spreadsheets and CAD drawings. Like the integer execution unit, the floating point unit has horizontal stripes corresponding to different functions. Floating-point numbers are represented with more bits, so the stripes are wider.
At the top, the "instruction fetch" section fetches the machine instructions that make up the software. The "instruction decode" section analyzes each instruction to determine what operations to perform. Simple operations, such as addition, are performed directly by the integer execution unit. Complicated instructions (a hallmark of Intel's processors) are broken down into smaller steps by the "complex instruction support" circuitry, with the steps held in the "microcode ROM". The "branch prediction logic" improves performance when the processor must make a decision for a branch instruction.
The code and data caches provide a substantial performance boost. The problem is that the processor is considerably faster than the computer's RAM memory, so the processor can end up sitting idle until program code or data is provided by memory. The solution is the cache, a small, fast memory that holds bytes that the processor is likely to need. The Pentium processor had a small cache by modern standards, holding 8 kilobytes of code and 8 kilobytes of data. (In comparison, modern processors have multiple caches, with hundreds of kilobytes in the fastest cache and megabytes in a slower cache.) Cache memories are built from an array of memory storage elements in a structured grid, visible in the rug as uniform pink rectangles. The TLB (Translation Lookaside Buffer) assists the cache. Finally, the "bus interface logic" connects the processor to the computer's bus, providing access to memory and peripheral devices. Around the edges of the physical chip, tiny bond pads provide the connections between the silicon chip and the integrated circuit package. In the weaving, these tiny pads have been abstracted into small black rectangles.
The weaving is accurate enough to determine that it represents a specific Pentium variant, called P54C. The motivation for the P54C was that the original Pentium chips (called P5) were not as fast as hoped and ran hot. Intel fixed this by using a more advanced manufacturing process, reducing the feature size from 800 to 600 nanometers and running the chip at 3.3 volts instead of 5 volts. Intel also modified the chip so that when parts of the chip were idle, the clock signal could be stopped to save power. (This is the "clock driver" circuitry at the top of the weaving.) Finally, Intel added multiprocessor logic (adding 200,000 more transistors), allowing two processors to work together more easily. The improved Pentium chip was smaller, faster, and used less power. This variant was called the P54C (for reasons I haven't been able to determine). The "multiprocessor logic" is visible in the Pentium rug, showing that it is the P54C Pentium (right) and not the P5 Pentium (left).
Intel's connection with New Mexico started in 1980 when Intel opened a chip fabrication plant (fab) in Rio Rancho, a suburb north of Albuquerque. At the time, this plant, Fab 7, was Intel's largest and produced 70% of Intel's profits. Intel steadily grew the New Mexico facility, adding Fab 9 and then Fab 11, which opened in September 1995, building Pentium and Pentium Pro chips in a 140-step manufacturing process. Intel's investment in Rio Rancho has continued with a $4 billion project underway for Fab 9 and Fab 11x. Intel has been criticized for environmental issues in New Mexico, detailed in the book Intel inside New Mexico: A case study of environmental and economic injustice. Intel, however, claims a sustainable future in New Mexico, restoring watersheds, using 100% renewable electricity, and recycling construction waste.
Fairchild and Shiprock
Marilou Schultz is currently creating another weaving based on an integrated circuit, shown below. Although this chip, the Fairchild 9040, is much more obscure than the Pentium, it has important historical symbolism, as it was built by Navajo workers at a plant on Navajo land.
In 1965, Fairchild started producing semiconductors in Shiprock, New Mexico, about 200 miles northwest of Intel's future facility. Fairchild produced a brochure in 1969 to commemorate the opening of a new plant. Two of the photos in that brochure compared a traditional Navajo weaving to the pattern of a chip, which happened to be the 9040. Although Fairchild's Shiprock project started optimistically, it was suddenly shut down a decade later after an armed takeover. I'll discuss the complicated history of Fairchild in Shiprock and then describe the 9040 chip in more detail.
The story of Fairchild starts with William Shockley, who invented the junction transistor at Bell Labs, won the Nobel prize, and founded Shockley Semiconductor Laboratory in 1957 to build transistors. Unfortunately, although Shockley was brilliant, he was said to be the worst manager in the history of electronics, not to mention a notorious eugenicist and racist later in life. Eight of his top employees—called the "traitorous eight"—left Shockley's company in 1957 to found Fairchild Semiconductor. (The traitorous eight included Gordon Moore and Robert Noyce who ended up founding Intel in 1968). Noyce (co-)invented the integrated circuit in 1959 and Fairchild soon became a top semiconductor manufacturer, famous for its foundational role in Silicon Valley.
The Shiprock project was part of an attempt in the 1960s to improve the economic situation of the Navajo through industrial development. The Navajo had suffered a century of oppression including forced deportation from their land through the Long Walk (1864-1866). The Navajo were suffering from 65% unemployment, a per-capita income of $300, and a lack of basics such as roads, electricity, running water, and health care. The Bureau of Indian Affairs was now trying to encourage economic self-sufficiency by funding industrial projects on Indian land.4 Navajo Tribal Chairman Raymond Nakai viewed industrialization as the only answer. Called "the first modern Navajo political leader", Nakai stated, "There are some would-be leaders of the tribe calling for the banishment of industry from the reservation and a return to the life of a century ago! But, it would not solve the problems. There is not sufficient grazing land on the reservation to support the population so industry must be brought in." Finally, Fairchild was trying to escape the high cost of Silicon Valley labor by opening plants in low-cost locations such as Maine, Australia, and Hong Kong.
These factors led Fairchild to open a manufacturing facility on Navajo land in Shiprock, New Mexico. The project started in 1965 with 50 Navajo workers in the Shiprock Community Center manufacturing transistors, rapidly increasing to 366 Navajo workers.
By 1967, Robert Noyce, group vice-president of Fairchild, regarded the Shiprock plant as successful. He explained that Fairchild was motivated both by low labor costs and by social benefits, saying, "Probably nobody would ever admit it, but I feel sure the Indians are the most underprivileged ethnic group in the United States." Two years later, Lester Hogan, Fairchild's president, stated, "I thought the Shiprock plant was one of Bob Noyce's philanthropies until I went there," but he was so impressed that he decided to expand the plant. Hogan also directed Fairchild to help build hundreds of houses for workers; since a traditional Navajo dwelling is called a hogan, the houses were dubbed Hogan's hogans.
In 1969, Fairchild opened its new facility at Shiprock and produced the commemorative brochure mentioned earlier. As well as showing the striking visual similarity between the designs of traditional Navajo weavings and modern integrated circuits, it stated that "Weaving, like all Navajo arts, is done with unique imagination and craftsmanship" and described the "blending of innate Navajo skill and [Fairchild] Semiconductor's precision assembly techniques." Fairchild later said that "rug weaving, for instance, provides an inherent ability to recognize complex patterns, a skill which makes memorizing integrated circuit patterns a minimal problem."7
However, in Indigenous Circuits: Navajo Women and the Racialization of Early Electronic Manufacture, digital media theorist Lisa Nakamura critiques this language as a process by which "electronics assembly work became both gendered and identified with specific racialized qualities".5 Nakamura points out how "Navajo women’s affinity for electronics manufacture [was described] as both reflecting and satisfying an intrinsic gendered and racialized drive toward intricacy, detail, and quality."
At Shiprock, Fairchild employed 1200 workers,6 and all but 24 were Navajo, making Fairchild the nation's largest non-government employer of American Indians. Of the 33 production supervisors, 30 were Navajo. This project had extensive government involvement from the Bureau of Indian Affairs and the U.S. Public Health Service, while the Economic Development Administration made business loans to Fairchild, the Labor Department had job training programs, and Housing and Urban Development built housing at Shiprock7.
The Shiprock plant was considered a major success story at a meeting of the National Council on Indian Opportunity in 1971.7 US Vice President Agnew called the economic deprivation and 40-80% unemployment on Indian reservations "a problem of staggering magnitude" and encouraged more industrial development. Fairchild President Hogan stated that "Fairchild's program at Shiprock has been one of the most rewarding in the history of our company, from the standpoint of a sound business as well as social responsibility." He said that at first the plant was considered the "Shiprock experiment", but the plant was "now among the most productive and efficient of any Fairchild operation in the world." Peter MacDonald, Chairman of the Navajo Tribal Council and a World War II Navajo code talker, discussed the extreme poverty and unemployment on the Navajo reservation, along with "inadequate housing, inadequate health care and the lack of viable economic activities." He referred to Fairchild as "one of the best arrangements we have ever had" providing not only employment but also supporting housing through a non-profit.
In December 1972, National Geographic highlighted the Shiprock plant as "weaving for the Space Age", stating that the Fairchild plant was the tribe's most successful economic project with Shiprock booming due to the 4.5-million-dollar annual payroll. The article states: "Though the plant runs happily today, it was at first a battleground of warring cultures." A new manager, Paul Driscoll, realized that strict "white man's rules" were counterproductive. For instance, many employees couldn't phone in if they would be absent, as they didn't have telephones. Another issue was the language barrier since many workers spoke only Navajo, not English. So when technical words didn't exist in Navajo, substitutes were found: "aluminum" became "shiny metal". Driscoll also realized that Fairchild needed to adapt to traditional nine-day religious ceremonies. Soon the monthly turnover rate dropped from 12% to under 1%, better than Fairchild's other plants.
Unfortunately, the Fairchild-Navajo manufacturing partnership soon met a dramatic end. In 1975, the semiconductor industry was suffering from the ongoing US recession. Fairchild was especially hard hit, losing money on its integrated circuits, and it shed over 8000 employees between 1973 and 1975.8 At the Shiprock plant, Fairchild laid off9 140 Navajo employees in February 1975, angering the community. A group of 20 Indians armed with high-power rifles took over the plant, demanding that Fairchild rehire the employees. Fairchild portrayed the occupiers, part of the AIM (American Indian Movement), as an "outside group—representing neither employees, tribal authorities nor the community." Peter MacDonald, chairman of the Navajo Nation, agreed with the AIM on many points but viewed the AIM occupiers as "foolish" with "little sense of Navajo history" and "no sense of the need for an Indian nation to grow" (source). MacDonald negotiated with the occupiers and the occupation ended peacefully a week later, with unconditional amnesty granted to the occupiers.10 However, concerned about future disruptions, Fairchild permanently closed the Shiprock plant and transferred production to Southeast Asia.
For the most part, the Fairchild plant was viewed as a success prior to its occupation and closure. Navajo leader MacDonald looked back on the Fairchild plant as "a cooperative effort that was succeeding for everyone" (link). Alice Funston, a Navajo forewoman at Shiprock said, "Fairchild has not only helped women get ahead, it has been good for the entire Indian community in Shiprock."11 On the other hand, Fairchild general manager Charles Sporck had a negative view looking back: "It [Shiprock] never worked out. We were really screwing up the whole societal structure of the Indian tribe. You know, the women were making money and the guys were drinking it up. We had a very major negative impact upon the Navajo tribe."12
Despite the stereotypes in Sporck's comments, he touches on important gender issues, both at Fairchild and in the electronics industry as a whole. Fairchild had long recognized the lack of jobs for men at Shiprock, despite attempts to create roles for men. In 1971, Fairchild President Hogan stated that since "semiconductor assembly operation require a great amount of detail work with tiny components, [it] lends itself to female workers. As a result, there are nearly three times as many Navajo women employed by Fairchild as men."7
The role of women in fabricating and assembling electronics is often not recognized. A 1963 report on electronics manufacturing estimated that women workers made up 41 percent of total employment in electronics manufacturing, largely in gendered roles. The report suggested that microminiaturization of semiconductors gave women an advantage over men in assembly and production-line work; women made up over 70% of semiconductor production-line workers, with 90-99% of inspecting and testing jobs and 90-100% of assembler jobs. Women were largely locked out of non-production jobs; although women held a few technician and drafting roles, the percentage of woman engineers was too low to measure.
The defense contractor General Dynamics also had Navajo plants, but with more success than Fairchild. General Dynamics opened a Navajo Nation plant in Fort Defiance, Arizona in 1967 to make missiles for the Navy. At the plant's opening, Navajo Tribal Chairman Raymond Nakai pushed for industrialization, stating that it was in "industrialization and the money and the jobs engendered thereby that the future of the Navajo people will lie." The plant started with 30 employees, growing to 224 by the end of 1969, but then dropping to 99 in 1971 due to a slowdown in the electronics industry. General Dynamics opened another Navajo plant near Farmington NM in 1988. Due to the end of the Cold War, General Dynamics' missile business was acquired in 1991 by Hughes, which had itself been acquired by General Motors in 1985 and was sold to Raytheon in 1997. The Fort Defiance facility was closed in 2002 when its parent company, Delphi Automotive Systems, moved out of the military wiring business. The Farmington plant remains open, now Raytheon Diné, building components for Tomahawk, Javelin, and AMRAAM missiles.
Inside the Fairchild 9040 integrated circuit
The integrated circuit die image in Fairchild's commemorative brochure has an exceptionally striking design and color scheme. It's clear why this chip brings weaving to mind. Studying the die photo of the 9040 carefully reveals some interesting characteristics of integrated circuit design, so I will go into some detail.
The chip was fabricated from a tiny square of silicon, which appears purple in the photograph. Different regions of the silicon die were treated (doped) with impurities to change the properties of the silicon and thus create electronic devices. These doped regions appear as green or blue lines. The white lines are the metal layer on top of the silicon, connecting the components. The 13 metal rectangles around the border are the bond pads. The chip was packaged in an unusual 13-pin flat-pack, as shown below. Each of the 13 bond pads above was connected by a tiny wire to one of the 13 external pins.
The Fairchild 9040 was introduced in the mid-1960s as part of Fairchild's Micrologic family, a set of high-performance integrated circuits that were designed to work together.13 The 9040 chip was a "flip-flop", a circuit capable of storing a single bit, a 0 or 1. Flip-flops can be combined to form counters, counting the number of pulses, for instance.
The most dramatic patterns on the chip are the intricate serpentine blue lines. Each line forms a resistor, controlling the flow of electricity by impeding its path. The lines must be long to provide the desired resistance, so they wind back and forth to fit into the available space. Each end of a resistor is connected to the metal layer, wiring it to another part of the circuit. Most of the die is occupied by resistors, which is a disadvantage of this type of circuit. Modern integrated circuits use a different type of circuitry (CMOS), which is much more compact, partly because it doesn't need bulky resistors.
Transistors are the main component of an integrated circuit. These tiny devices act as switches, turning signals on and off. The photo below shows one of the transistors in the 9040. It consists of three layers of silicon, with metal wiring connected to each layer. Note the blue region in the middle, surrounded by a slightly darker purple region; these color changes indicate that the silicon has been doped to change its properties. The green region surrounding the transistor provides isolation between this transistor and the other circuitry, so the transistors don't interfere with each other. The chip also has many diodes, which look similar to transistors except a diode has two connections.
These transistors with their three layers of silicon are a type known as bipolar. Modern computers use a different type of transistor, metal-oxide-semiconductor (MOS), which is much more compact and efficient. One of Fairchild's major failures was staying with bipolar transistors too long, rather than moving to MOS.14 In a sense, the photo of the 9040 die shows the seeds of Fairchild's failure.
The 9040 chip was constructed on a completely different scale from the Pentium, showing the rapid progress of the IC industry. The 9040 contains just 16 transistors, while the Pentium contains 3.3 million transistors. Thus, individual transistors can be seen in the 9040 image, while only large-scale functional blocks are visible in the Pentium. This increasing transistor count illustrates the exponential growth in integrated circuit capacity between the 9040 in the mid-1960s and the Pentium in 1993. This growth pattern, with the number of transistors doubling about every two years, is known as Moore's law, since it was first observed in 1965 by Gordon Moore (one of Fairchild's "traitorous eight", who later started Intel).
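As a rough check on the growth rate, the two transistor counts above can be turned into a doubling period. This is only a back-of-the-envelope sketch: the year 1965 for the 9040 is my assumption (the chip dates from the "mid-1960s"), and comparing a small flip-flop to a flagship processor is not a like-for-like comparison.

```python
import math

# Transistor counts from the text.
TRANSISTORS_9040 = 16
TRANSISTORS_PENTIUM = 3_300_000

# Dates: 1993 is from the text; 1965 is an assumed stand-in for "mid-1960s".
YEAR_9040 = 1965
YEAR_PENTIUM = 1993


def doublings(start: float, end: float) -> float:
    """How many times `start` must double to reach `end`."""
    return math.log2(end / start)


def years_per_doubling(start: float, end: float,
                       start_year: int, end_year: int) -> float:
    """Average time between doublings over the interval."""
    return (end_year - start_year) / doublings(start, end)


n = doublings(TRANSISTORS_9040, TRANSISTORS_PENTIUM)
period = years_per_doubling(TRANSISTORS_9040, TRANSISTORS_PENTIUM,
                            YEAR_9040, YEAR_PENTIUM)
print(f"{n:.1f} doublings, one every {period:.1f} years")
```

Under these assumptions, the count doubled between 17 and 18 times, a bit faster than once every two years, which is in the same ballpark as the Moore's law figure.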
The schematic below shows the circuitry inside the 9040 chip, with its 16 transistors, 16 diodes, and 22 resistors. The symmetry of the 9040 die photo makes it appealing, and that symmetry is reflected in the circuit below, with the left side and the right side mirror images. The idea behind a flip-flop is that it can hold either a 0 or a 1. In the chip, this is implemented by turning on the right side of the chip to hold a 0, or the left side to hold a 1. If one side of the chip is on, it forces the other side off, accomplished by the X-like crossings of signals in the center.15 Thus, the symmetry is not arbitrary, but is critical to the operation of the circuit.
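The "one side on forces the other side off" idea can be illustrated with a generic cross-coupled latch. Note that this sketch is not the 9040's circuit: it uses two idealized NOR gates as a stand-in, and the names `nor` and `settle` are mine. It only shows how cross-coupling gives two stable states.

```python
def nor(a: bool, b: bool) -> bool:
    """Idealized NOR gate: output is on only if both inputs are off."""
    return not (a or b)


def settle(q: bool, q_bar: bool, set_: bool, reset: bool) -> tuple[bool, bool]:
    """Evaluate two cross-coupled NOR gates until the outputs stop changing.

    Each gate's output feeds the other gate's input, so whichever side
    turns on forces the opposite side off.
    """
    for _ in range(10):  # a few passes are plenty for this tiny circuit
        new_q = nor(reset, q_bar)
        new_q_bar = nor(set_, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


state = (False, True)                                  # start out holding a 0
state = settle(*state, set_=True, reset=False)         # pulse "set"
print("after set:  ", state)
state = settle(*state, set_=False, reset=False)        # inputs released: value is held
print("after hold: ", state)
state = settle(*state, set_=False, reset=True)         # pulse "reset"
print("after reset:", state)
```

The two outputs always end up opposite to each other, and the latch keeps its value once the set and reset inputs are released, which is what makes it a one-bit memory.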
Despite the obscurity of the 9040, multiple 9040 chips are currently on the Moon. The chip was used in the Apollo Lunar Surface Experiments Package (ALSEP),16 in particular, the Active Seismic Experiment on Apollo 14 and 16. This experiment detonated small explosives on the Moon and measured the resulting seismic waves. The photo below is a detail from a blueprint17 that shows three of the nineteen 9040 flip-flops (labeled "FF") as well as two 9041 logic gates, a chip in the same family as the 9040.
Conclusions
The similarities between Navajo weavings and the patterns in integrated circuits have been described since the 1960s. Marilou Schultz's weavings of integrated circuits make these visual metaphors into concrete works of art. Although the Woven Histories exhibit at the National Gallery of Art is no longer on display, the exhibit will be at the National Gallery of Canada (Ottawa) starting November 8, 2024, and the Museum of Modern Art (New York) starting April 20, 2025 (full dates here). If you're in the area, I recommend viewing the exhibit, but don't make my mistake: leave more than five minutes to see it!
Many thanks to Marilou Schultz for discussing her art with me. For more on her art, see A Conversation with Marilou Schultz on YouTube.18 Follow me on Mastodon as @kenshirriff@oldbytes.space or RSS for updates.
Notes and references
-
The original Pentium was followed by the Pentium Pro, the Pentium II, and others, forming a long-running brand of high-performance processors. Pentium was Intel's flagship line until the Core processors took over in 2006. ↩
-
Sheep hold a key role in Navajo culture and economy, which I'll briefly summarize here. Domestic sheep were brought to the Americas during the Spanish colonization, reaching the Navajo in the late 1500s. Since sheep were able to graze on semi-arid land unsuitable for crops, sheep became very important to the Navajo. Although the Navajo had used cotton for weaving in the past, the availability of wool made weaving a fundamental industry; the production and trading of woven Navajo blankets became an important economic factor in New Mexico by the 1750s (details).
Navajo leader Peter MacDonald described the role of sheep: "Sheep were like money in the bank: the more you had, the better your life, your future, and your family's future." The number of sheep grew exponentially in the early 1900s, resulting in overgrazing of the land. The drought and Dust Bowl of the 1930s led the government to restrict the number of sheep on Navajo land, imposing the Navajo Livestock Reduction. This heavy-handed program purchased and slaughtered over half the livestock, which was catastrophic to the Navajo, both economically and culturally, destroying the Navajo's wealth and self-sufficiency.
The Navajo-Churro sheep is a breed that the Navajo developed from the Churra sheep brought from Spain during the Spanish colonization of the Americas. These sheep have a long, lustrous fleece that is excellent for weaving. The Navajo-Churro is also called the Navajo Four-Horned Sheep as some rams have four horns, a rare trait. The Navajo-Churro breed was severely depleted when American troops killed livestock during the Navajo Wars (1863) and then brought close to extinction by the Livestock Reduction of the 1930s to 1950s. In the 1970s, the Navajo Sheep Project started efforts to preserve and revitalize the Navajo-Churro. The breed is still rare, but currently numbers in the thousands. Now, climate change and water shortages are putting more pressure on sheep grazing. ↩
-
A photo of the rug was published in American Indian Science & Engineering Society 1994 Annual Report. This photo shows the "physically accurate" side of the rug, not the side that is currently on display.
A photo of the rug from 1994.
Which side of a die image is the top is mostly arbitrary. Intel usually presents die photos with the tiny text on the die right side up, so I will use that convention. For the Pentium die, this text is in the lower right corner and says "80P54C (m) (c) intel '92,'93". Of course, this text is much too small to be part of the woven rug. ↩
-
Strengthening the Indian Economy (Indian Affairs, 1966) discusses various industrial development projects, of which Fairchild was the largest. Other projects included a plant at Rolla, ND to produce sapphire and ruby bearings, a Seminole project with Amphenol to produce electronic connectors, and a Hopi project with BVD to produce garments. Other economic development projects included timber and mining; extractive industries provided over half of Navajo income. ↩
-
Racialization is defined by Nakamura as "the understanding of a specific population as possessing traits and behaviors that belong to a race, not an individual." ↩
-
Many photos of workers at the Shiprock plant are in Fairchild VIEWS, March 1969. Fairchild deserves credit for referring to the workers by name rather than viewing them as anonymous props for photos. Fairchild followed the same practice in its annual reports. ↩
-
NCIO (National Council on Indian Opportunity) News, Oct/Nov 1971 described a high-level meeting with industry to discuss "new development on Indian reservations". US Vice President Spiro Agnew ran the meeting, with Attorney General John Mitchell a speaker along with Navajo Tribal Council chairman Peter MacDonald. Bizarrely, all three ended up convicted of felonies for different reasons. Within a few years, Mitchell was imprisoned for Watergate crimes and Agnew pled guilty to federal tax evasion. In 1990, MacDonald was convicted of fraud, riot, extortion, racketeering, and conspiracy by a Navajo tribal judge and then a federal judge, spending eight years in prison until pardoned by Bill Clinton (details). The story of Peter MacDonald is complex and many view his prosecution as politically motivated; MacDonald's memoir provides his perspective. ↩↩↩↩
-
Although Fairchild was highly successful at first, it suffered from chaotic management and economic decline. Fairchild steadily lost key employees, many of whom started competing companies. Most important was Intel, started in 1968 by Moore and Noyce, two of the "Traitorous Eight". Eventually, hundreds of companies (called the Fairchildren) could be traced back to Fairchild. Economic factors also battered Fairchild; the semiconductor industry had barely recovered from the 1970-1971 recession when it was hit by the severe 1975 recession. As a result, Fairchild had large layoffs, of which the Shiprock layoffs were a small part. Fairchild's business continued to decline; it was purchased by Schlumberger in 1979 and went through various acquisitions, mergers, and spinoffs until it finally ended in 2016, acquired by ON Semiconductor. ↩
-
Were the employees "laid off" or "layed off"? Curiously, the New York Times article said "layed off" but sources uniformly state that "layed off" is grammatically wrong. The New York Times has extensively used "layed off" so this isn't a one-time typo. I hypothesized that usage had changed since the 1970s but Google Ngram Viewer shows laid off as the consistent and overwhelming winner. Maybe "layed off" was a stylistic quirk of the New York Times? ↩
-
Looking back, MacDonald questioned his decision to let the occupation of Fairchild's plant continue rather than ordering the tribal police to forcibly remove the occupiers from the plant. In his view, his decision to let the occupation continue led to the closing of the plant and the loss of 1200 jobs. On the other hand, forcibly removing the occupiers risked violence and loss of life: "I would have become the chairman who killed his own people instead of the chairman who allowed Navajo to lose their jobs."
The risk of bloodshed was not theoretical. In 1989, a riot between MacDonald's supporters and the police resulted in two Navajos being shot and killed by the police. MacDonald pressed for a federal investigation into police brutality, but instead MacDonald and Benally (a council delegate) received long prison sentences for inciting the riot even though they were not present at the time. ↩
-
Alice Funston was Forewoman for the Reliability and Quality Assurance Section at Shiprock. In a Fairchild employee newsletter, she said, "Fairchild has not only helped women get ahead, it has been good for the entire Indian community in Shiprock. Before the plant was built here, there weren't many jobs available. You could work for the Bureau of Indian Affairs, the Navajo Tribe or other government agencies, but there just weren't enough jobs to go around. I started in assembly in 1965 and was recently promoted to Production Supervisor in R & Q.A. Since the beginning of the year, a number of women have been promoted into supervisory positions. When I joined Fairchild, most of the members of management were non-Indian. Today, almost all of our supervisors and managers are Indian."
I quote this at length, since it was the only example I could find of an employee discussing Shiprock in their own words. It must be recognized, of course, that this is a company publication, so the comments may not be completely candid. See "Affirmative Action: A growing consciousness of the needs of the individual" in Fairchild HORIZONS, May-June, 1973. ↩
-
See Interview with Charlie Sporck, 2000 February 21, timestamp 0:27. From "Silicon Genesis: oral history interviews of Silicon Valley scientists, 1995-2024," Stanford Digital Repository.
I view Sporck's comments on the failure of Shiprock as highly questionable. First, Sporck left Fairchild in 1967, so he was not present for most of the Shiprock project. Moreover, he implies that Fairchild's closing of Shiprock was in the best interest of the Navajo, which is a morally convenient justification for Fairchild's decision, but contradicted by most other sources. ↩
-
Fairchild's 9040 logic family was called LPDTμL for "low-power diode-transistor Micrologic". Some sources label this family as TTL (Transistor-Transistor Logic), probably confusing it with the 9000-family, which was TTL. ↩
-
Fairchild's failure to recognize the importance of MOS transistors and transition from bipolar transistors is described in History of Semiconductor Engineering, page 170. ↩
-
I'll provide more details of the 9040 schematic in this footnote. The 9040 is a flexible flip-flop. It can be wired as an R-S (reset-set) flip-flop, set to 1 or reset to 0 as needed. It can also be wired as a J-K flip-flop, a flexible circuit that can store a value, hold a value, or toggle, based on the settings of the J and K inputs.
The 9040 is a "dual-rank" flip-flop, meaning it holds its value in two latches: a primary latch and a secondary latch. (This type of flip flop was generally called "master-slave", a name that is now controversial). Looking at the schematic, the primary latch at the bottom of the schematic passes its value to the secondary latch at the top under the control of the clock. This structure makes the flip-flop "edge-triggered", changing its value at the moment when the clock signal changes.
This circuit uses diode-transistor logic. Diodes perform most of the logic operations by combining input signals, while the transistors provide amplification. Diodes play a different role in the "push-pull" output circuit, raising the level of the high-side transistor. Because the output circuit has a transistor, diode, and transistor stacked vertically, it is often called a totem pole output, a name that seems questionable in this context.
One curious feature of the 9040 is that it contains two pull-up resistors that are not assigned any role. The user of the chip can attach them to unused inputs to keep the input at the desired value.
The schematic shows 13 pins, corresponding to the 13 pins of the flat-pack integrated circuit. All but three of these pins are symmetrical; power (Vcc), ground, and the clock (CP) have single connections. The ground pad is in the bottom-center of the die, which maintains symmetry. The clock and power pads are side-by-side in the top-center of the die. If you study the die photograph closely, you will see that they subtly break the chip's symmetry as the clock signal runs down the center of the die while the power connection runs down both sides. There are a few other subtle violations of symmetry when signals cross from one side of the chip to the other, as well as the obviously asymmetrical text. ↩
-
I haven't been able to prove that the Apollo program used chips from the Shiprock plant rather than a different facility. Fairchild President Hogan stated that workers at Shiprock assembled guidance, communications, and gyro systems that were used on Apollo rockets. ↩
-
The ALSEP schematic is from Miller, K. Logic Schematic Type B Board No.4 ASE, A4, technical drawing, January 27, 1967, University of North Texas Libraries, The Portal to Texas History; crediting Lunar Planetary Institute Library. ↩
-
Marilou Schultz had another chip weaving on display at the National Gallery of Art. It is labeled "Untitled (Unknown Chip), 2008", but Antoine Bercovici identified it for me as the AMD K6 III processor, released in 1999 and comparable to the Pentium III.
A weaving created by Marilou Schultz, "Untitled (Unknown Chip)".
If you're interested in computer-related weaving, the exhibition also had "Copper Tapestry (Riva 128 Graphics Card, Nvidia, 1997)" by Argentinian artist Analia Saban, created on a computer-automated Jacquard loom. This weaving represents a PC graphics card, specifically, the STB Velocity 128, which uses the Nvidia Riva 128 GPU chip. This chip was released in 1997, at a point when Nvidia was in a dire financial position, thirty days from going out of business. The Riva 128 saved Nvidia and now Nvidia is the world's third most valuable company.
A tapestry created by Analia Saban, "Copper Tapestry (Riva 128 Graphics Card, Nvidia, 1997)".
Ferroelectric memory (FRAM) is an interesting storage technique that stores bits in a special "ferroelectric" material. Ferroelectric memory is nonvolatile like flash memory, able to hold its data for decades. But, unlike flash, ferroelectric memory can write data rapidly. Moreover, FRAM is much more durable than flash and can be written trillions of times. With these advantages, you might wonder why FRAM isn't more popular. The problem is that FRAM is much more expensive than flash, so it is only used in niche applications.
This post takes a look inside an FRAM chip from 1999, designed by a company called Ramtron. The die photo above shows this 64-kilobit chip under a microscope; the four large dark stripes are the memory cells, containing tiny cubes of ferroelectric material. The horizontal greenish bands are the drivers to select a column of memory, while the vertical greenish band at the right holds the sense amplifiers that amplify the tiny signals from the memory cells. The eight whitish squares around the border of the die are the bond pads, which are connected to the chip's eight pins.1 The logic circuitry at the left and right of the die implements the serial (I2C) interface for communication with the chip.2
The history of ferroelectric memory dates back to the early 1950s.3 Many companies worked on FRAM from the 1950s to the 1970s, including Bell Labs, IBM, RCA, and Ford. The 1955 photo below shows a 256-bit ferroelectric memory built by Bell Labs. Unfortunately, ferroelectric memory had many problems,4 limiting it to specialized applications, and development was mostly abandoned by the 1970s.
Ferroelectric memory had a second chance, though. A major proponent of ferroelectric memory was George Rohrer, who started working on ferroelectric memory in 1968. He formed a memory company, Technovation, which was unsuccessful, and then cofounded Ramtron in 1984.5 Ramtron produced a tiny 256-bit memory chip in 1988, followed by much larger memories in the 1990s.
How FRAM works
Ferroelectric memory uses a special material with the property of ferroelectricity. In a normal capacitor, applying an electric field causes the positive and negative charges to separate in the dielectric material, making it polarized. However, ferroelectric materials are special because they will retain this polarization even when the electric field is removed. By polarizing a ferroelectric material positively or negatively, a bit of data can be stored. (The name "ferroelectric" is in analogy to "ferromagnetic", even though ferroelectric materials are not ferrous.)
This FRAM chip uses a ferroelectric material called lead zirconate titanate or PZT, containing lead, zirconium, titanium, and oxygen. The diagram below shows how an applied electric field causes the zirconium or titanium atom to physically move inside the crystal lattice, causing the ferroelectric effect. (Red atoms are lead, purple are oxygen, and yellow are zirconium or titanium.) Because the atoms physically change position, the polarization is stable for decades; in contrast, the capacitors in a DRAM chip lose their data in milliseconds unless refreshed. FRAM memory will eventually wear out, but it can be written trillions of times, much more than flash or EEPROM memory.
To store data, FRAM uses ferroelectric capacitors, capacitors with a ferroelectric material as the dielectric between the plates. Applying a voltage to the capacitor will create an electric field, polarizing the ferroelectric material. A positive voltage will store a 1, and a negative voltage will store a 0.
Reading a bit from memory is a bit tricky. A positive voltage is applied, forcing the material into the 1 state. If the material was already in the 1 state, minimal current will flow. But if the material was in the 0 state, more current will flow as the capacitor changes state. This allows the 0 and 1 states to be distinguished.
Note that reading the bit destroys the stored value. Thus, after a read, the 0 or 1 value must be written back to the capacitor to restore its previous state. (This is very similar to the magnetic core memory that was used in the 1960s.)6
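The read-then-restore sequence can be sketched in a few lines of code. This is a toy model, not Ramtron's circuitry: the class and function names are mine, and the charge values and threshold are arbitrary units chosen only so that a switching capacitor produces a clearly larger signal than a non-switching one.

```python
# Arbitrary illustrative values (assumptions, not measurements).
SWITCHING_CHARGE = 1.0      # charge that flows when the polarization flips
NONSWITCHING_CHARGE = 0.1   # charge that flows when it was already in the 1 state
THRESHOLD = 0.5             # decision level between the two


class FerroCapacitor:
    """Toy ferroelectric capacitor: remembers its polarization (+1 or -1)."""

    def __init__(self, bit: int = 0):
        self.write(bit)

    def write(self, bit: int) -> None:
        # A positive voltage stores a 1, a negative voltage stores a 0.
        self.polarization = +1 if bit else -1

    def apply_positive_pulse(self) -> float:
        """Force the capacitor into the 1 state; return the charge that flowed."""
        flipped = self.polarization < 0
        self.polarization = +1
        return SWITCHING_CHARGE if flipped else NONSWITCHING_CHARGE


def read_bit(cap: FerroCapacitor) -> int:
    """Destructive read followed by the write-back that restores the value."""
    charge = cap.apply_positive_pulse()       # the capacitor now holds 1 regardless
    bit = 0 if charge > THRESHOLD else 1      # a large charge means it was a 0
    cap.write(bit)                            # restore the original state
    return bit


cap = FerroCapacitor(0)
print(read_bit(cap), read_bit(cap))  # the value survives repeated reads
```

Without the `cap.write(bit)` step, every capacitor would be left in the 1 state after its first read.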
The FRAM chip that I examined uses two capacitors per bit, storing opposite values. This approach makes it easier to distinguish a 1 from a 0: a sense amplifier compares the two tiny signals and generates a 1 or a 0 depending on which is larger. The downside of this approach is that using two capacitors per bit reduces the memory capacity. Later FRAMs increased the density by using one capacitor per bit, along with reference cells for comparison.7
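Here is a minimal sketch of why two capacitors per bit make the decision easier. It is a toy model under my own assumptions: the signal values are arbitrary units, `offset` stands in for any disturbance that affects both capacitors equally, and the class name is hypothetical.

```python
# Arbitrary illustrative signal levels (assumptions, not measurements).
SWITCHING = 1.0       # signal from a capacitor that flips from 0 to 1
NON_SWITCHING = 0.1   # signal from a capacitor that was already 1


def pulse(stored_bit: int, offset: float) -> float:
    """Signal produced when a positive pulse is applied to one capacitor."""
    return (NON_SWITCHING if stored_bit else SWITCHING) + offset


class TwoCapacitorCell:
    """One bit stored as a pair of capacitors holding opposite values."""

    def __init__(self, bit: int):
        self.write(bit)

    def write(self, bit: int) -> None:
        self.cap_a = 1 if bit else 0   # holds the bit itself
        self.cap_b = 1 - self.cap_a    # holds the complement

    def read(self, offset: float = 0.0) -> int:
        signal_a = pulse(self.cap_a, offset)
        signal_b = pulse(self.cap_b, offset)
        self.cap_a = self.cap_b = 1    # the pulse forces both capacitors to 1
        # Compare the two signals against each other, not against a fixed level.
        bit = 1 if signal_b > signal_a else 0
        self.write(bit)                # write-back restores both capacitors
        return bit


cell = TwoCapacitorCell(1)
print(cell.read(), cell.read(offset=5.0))  # a common offset doesn't change the result
```

Because the sense amplifier only needs to know which signal is larger, anything that shifts both signals by the same amount cancels out. A single-capacitor design needs a separate reference to compare against.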
A closer look at the die
The diagram below shows the main functional blocks of the chip.8 The memory itself is partitioned into four blocks. The word line decoders select the appropriate column for the address and the drivers generate the pulses on the word and plate lines. The signals from that column go to the sense amplifiers on the right, where the signals are converted to bits and written back to memory. On the left, the precharge circuitry charges the bit lines to a fixed voltage at the start of the memory cycle, while the decoders select the desired byte from the bit lines.
The diagram below shows a closeup of the memory. I removed the top metal layer and many of the memory cells to reveal the underlying structure. The structure is very three-dimensional compared to regular chips; the gray squares in the image are cubes of PZT, sitting on top of the plate lines. The brown rectangles labeled "top plate connection" are also three-dimensional; they are S-shaped brackets with the low end attached to the silicon and the high end contacting the top of the PZT cube. Thus, each PZT cube forms a capacitor with the plate line forming the bottom plate of the capacitor, the bracket forming the top plate connection, and the PZT cube sandwiched in between, providing the ferroelectric dielectric. (Some cubes have been knocked loose in this photo and are sitting at an angle; the cubes form a regular grid in the original chip.)
The physical design of the chip is complicated and quite different from a typical planar integrated circuit. Each capacitor requires a cube of PZT sandwiched between platinum electrodes, with the three-dimensional contact from the top of the capacitor to the silicon. Creating these structures requires numerous steps that aren't used in normal integrated circuit fabrication. (See the footnote9 for details.) Moreover, the metal ions in the PZT material can contaminate the silicon production facility unless great care is taken, such as using a separate facility to apply the ferroelectric layer and all subsequent steps.10 The additional fabrication steps and unusual materials significantly increase the cost of manufacturing FRAM.
Each top plate connection has an associated transistor, gated by a vertical word line.11 The transistors are connected to horizontal bit lines, metal lines that were removed for this photo. A memory cell, containing two capacitors, measures about 4.2 µm × 6.5 µm. The PZT cubes are spaced about 2.1 µm apart. The transistor gate length is roughly 700 nm. The 700 nm node was introduced in 1993, while the die contains a 1999 copyright date, so the chip appears to be a few years behind the cutting edge as far as node.
The memory is organized as 256 capacitors horizontally by 512 capacitors vertically, for a total of 64 kilobits (since each bit requires two capacitors). The memory is accessed as 8192 bytes. Curiously, the columns are numbered on the die, as shown below.
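The capacity arithmetic is easy to verify. The sketch below uses only the figures given above; the area estimate at the end is my own rough addition, since it multiplies the cell size by the number of cells and ignores all the decoders, drivers, and other overhead.

```python
# Array dimensions, in capacitors, as given in the text.
CAPACITORS_ACROSS = 256
CAPACITORS_DOWN = 512
CAPACITORS_PER_BIT = 2

capacitors = CAPACITORS_ACROSS * CAPACITORS_DOWN
bits = capacitors // CAPACITORS_PER_BIT
kilobits = bits // 1024
n_bytes = bits // 8

print(f"{capacitors} capacitors -> {bits} bits = {kilobits} kilobits = {n_bytes} bytes")

# Rough area of the memory cells alone, from the measured cell size
# (one cell = two capacitors = one bit). Overhead circuitry is ignored.
CELL_WIDTH_UM = 4.2
CELL_HEIGHT_UM = 6.5
array_area_mm2 = bits * CELL_WIDTH_UM * CELL_HEIGHT_UM / 1e6
print(f"memory cells occupy about {array_area_mm2:.1f} mm^2")
```

The numbers are consistent: 131,072 capacitors give 65,536 bits, which is 64 kilobits or 8192 bytes.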
The photo below shows the sense amplifiers to the right of the memory, with some large transistors to boost the signal. Each sense amplifier receives two signals from the pair of capacitors holding a bit. The sense amplifier determines which signal is larger, deciding if the bit is a 0 or 1. Because the signals are very small, the sense amplifier must be very sensitive. The amplifier has two cross-connected transistors with each transistor trying to pull the other signal low. The signal that starts off larger will "win", creating a solid 0 or 1 signal. This value is rewritten to memory to restore the value, since reading the value erases the cells. In the photo, a few of the ferroelectric capacitors are visible at the far left. Part of the lower metal layer has come loose, causing the randomly strewn brown rectangles.
The photo below shows eight of the plate drivers, below the memory cells. This circuit generates the pulse on the selected plate line. The plate lines are the thick white lines at the top of the image; they are platinum so they appear brighter in the photo than the other metal lines. Most of the capacitors are still present on the plate lines, but some capacitors have come loose and are scattered on the rest of the circuitry. Each plate line is connected to a metal line (brown), which connects the plate line to the drive transistors in the middle and bottom of the image. These transistors pull the appropriate plate line high or low as necessary. The columns of small black circles are connections between the metal line and the silicon of the transistor underneath.
Finally, here's the part number and Ramtron logo on the die.
Conclusions
Ferroelectric RAM is an example of a technology with many advantages that never achieved the hoped-for success. Many companies worked on FRAM from the 1950s to the 1970s but gave up on it. Ramtron tried again and produced products but they were not profitable. Ramtron had hoped that the density and cost of FRAM would be competitive with DRAM, but unfortunately that didn't pan out. Ramtron was acquired by Cypress Semiconductor in 2012 and then Cypress was acquired by Infineon in 2019. Infineon still sells FRAM, but it is a niche product, used for instance in satellites that need radiation hardness. Currently, FRAM costs roughly $3/megabit, roughly 200 times more expensive than flash memory, which is about $15/gigabit. Nonetheless, FRAM is a fascinating technology and the structures inside the chip are very interesting.
For more, follow me on Mastodon as @kenshirriff@oldbytes.space or RSS. (I've given up on Twitter.) Thanks to CuriousMarc for providing the chip, which was used in a digital readout (DRO) for his CNC machine.
Notes and references
-
The photo below shows the chip's 8-pin package.
The chip is packaged in an 8-pin DIP. "RIC" stands for Ramtron International Corporation.
-
The block diagram shows the structure of the chip, which is significantly different from a standard DRAM chip. The chip has logic to handle the I2C protocol, a serial protocol that uses a clock and a data line. (Note that the address lines A0-A2 are the address of the chip, not the memory address.) The WP (Write Protect) pin protects one quarter of the chip from being modified. The chip allows an arbitrary number of bytes to be read or written sequentially in one operation. This is implemented by the counter and address latch.
Block diagram of the FRAM chip. From the datasheet.
-
An early description of ferroelectric memory is in the October 1953 Proceedings of the IRE. This issue focused on computers and had an article on computer memory systems by J. P. Eckert of ENIAC fame. In 1953, computer memory systems were primitive: mercury delay lines, electrostatic CRTs (Williams tubes), or rotating drums. The article describes experimental memory technologies including ferroelectric memory, magnetic core memory, neon-capacitor memory, phosphor drums, temperature-sensitive pigments, corona discharge, and electrolytic diodes. Within a couple of years, magnetic core memory became successful, dominating storage until semiconductor memory took over in the 1970s, and most of the other technologies were forgotten. ↩
-
A 1969 article in Electronics discussed ferroelectric memories. At the time, ferroelectric memories were used for a few specialized applications. However, ferroelectric memories had many issues: slow write speed, high voltages (75 to 150 volts), and expensive logic to decode addresses. The article stated: "These considerations make the future of ferroelectric memories in computers rather bleak." ↩
-
Interestingly, the "Ram" in Ramtron comes from the initials of the cofounders: Rohrer, Araujo, and McMillan. Rohrer originally focused on potassium nitrate as the ferroelectric material, as described in his patent. (I find it surprising that potassium nitrate is ferroelectric since it seems like such a simple, non-exotic chemical.) An extensive history of Ramtron is here. A Popular Science article also provides information. ↩
-
Like core memory, ferroelectric memory is based on a hysteresis loop. Because of the hysteresis loop, the material has two stable states, storing a 0 or 1. The difference is that core memory has hysteresis of the magnetization with respect to the applied magnetic field, while ferroelectric memory has hysteresis of the polarization with respect to the applied electric field. ↩
-
The reference cell approach is described in Ramtron patent 6028783A. The idea is to have a row of reference capacitors, but the reference capacitors are sized to generate a current midway between the 0 current and the 1 current. The reference capacitors provide the second input to the sense amplifiers, allowing the 0 and 1 bits to be distinguished. ↩
-
Ramtron's 1987 patent describes the approximate structure of the memory. ↩
-
The diagram below shows the complex process that Ramtron used to create an FRAM chip. (These steps are from a 2003 patent, so they may differ from the steps for the chip I examined.)
Ramtron's process flow to create an FRAM die. From Patent 6613586. Abbreviations: BPSG is borophosphosilicate glass. UTEOS is undoped tetraethylorthosilicate, a liquid used to deposit silicon dioxide on the surface. RTA is rapid thermal anneal. PTEOS is phosphorus-doped tetraethylorthosilicate, used to create a phosphorus-doped silicon dioxide layer. CMP is chemical mechanical planarization, polishing the die surface to be flat. TEC is the top electrode contact. ILD is interlevel dielectric, the insulating layer between conducting layers. ↩
-
See the detailed article Ferroelectric Memories, Science, 1989, by Scott and Araujo (who is the "A" in "Ramtron"). ↩
-
Early FRAM memories used an X-Y grid of wires without transistors. Although much simpler, this approach had the problem that current could flow through unwanted capacitors via "sneak" paths, causing noise in the signals and potentially corrupting data. High-density integrated circuits, however, made it practical to associate a transistor with each cell in modern FRAM chips. ↩
Subject: Reverse-engineering a three-axis attitude indicator from the F-4 fighter plane
We recently received an attitude indicator for the F-4 fighter plane, an instrument that uses a rotating ball to show the aircraft's orientation and direction. In a normal aircraft, the artificial horizon shows the orientation in two axes (pitch and roll), but the F-4 indicator uses a rotating ball to show the orientation in three axes, adding azimuth (yaw).1 It wasn't obvious to me how the ball could rotate in three axes: how could it turn in every direction and still remain attached to the instrument?
We disassembled the indicator, reverse-engineered its 1960s-era circuitry, fixed some problems,2 and got it spinning. The video clip below shows the indicator rotating around three axes. In this blog post, I discuss the mechanical and electrical construction of this indicator. (The quick explanation is that the ball is really two hollow half-shells attached to the internal mechanism at the "poles"; the shells rotate while the "equator" remains stationary.)
The F-4 aircraft
The indicator was used in the F-4 Phantom II3 so the pilot could keep track of the aircraft's orientation during high-speed maneuvers. The F-4 was a supersonic fighter manufactured from 1958 to 1981. Over 5000 were produced, making it the most-produced American supersonic aircraft ever. It was the main US fighter jet in the Vietnam War, operating from aircraft carriers. The F-4 was still used in the 1990s during the Gulf War, suppressing air defenses in the "Wild Weasel" role. The F-4 was capable of carrying nuclear bombs.4
The F-4 was a two-seat aircraft, with the radar intercept officer controlling radar and weapons from a seat behind the pilot. Both cockpits had a panel crammed with instruments, with additional instruments and controls on the sides. As shown below, the pilot's panel had the three-axis attitude indicator in the central position, just below the reddish radar scope, reflecting its importance.5 (The rear cockpit had a simpler two-axis attitude indicator.)
The attitude indicator mechanism
The ball inside the indicator shows the aircraft's position in three axes. The roll axis indicates the aircraft's angle if it rolls side-to-side along its axis of flight. The pitch axis indicates the aircraft's angle if it pitches up or down. Finally, the azimuth axis indicates the compass direction that the aircraft is heading, changed by the aircraft's turning left or right (yaw). The indicator also has moving needles and status flags, but in this post I'm focusing on the rotating ball.6
The indicator uses three motors to move the ball. The roll motor (below) is attached to the frame of the indicator, while the pitch and azimuth motors are inside the ball. The ball is held in place by the roll gimbal, which is attached to the ball mechanism at the top and bottom pivot points. The roll motor turns the roll gimbal and thus the ball, providing a clockwise/counterclockwise movement. The roll control transformer provides position feedback. Note the numerous wires on the roll gimbal, connected to the mechanism inside the ball.
The diagram below shows the mechanism inside the ball, after removing the hemispherical shells of the ball. When the roll gimbal is rotated, this mechanism rotates with it. The pitch motor causes the entire mechanism to rotate around the pitch axis (horizontal here), which is attached along the "equator". The azimuth motor and control transformer are behind the pitch components, not visible in this photo. The azimuth motor turns the vertical shaft. The two hollow hemispheres of the ball attach to the top and bottom of the shaft. Thus, the azimuth motor rotates the ball shells around the azimuth axis, while the mechanism itself remains stationary.
Why doesn't the wiring get tangled up as the ball rotates? The solution is two sets of slip rings to implement the electrical connections. The photo below shows the first slip ring assembly, which handles rotation around the roll axis. These slip rings connect the stationary part of the instrument to the rotating roll gimbal. The black base and the vertical wires are attached to the instrument, while the striped shaft in the middle rotates with the ball assembly housing. Inside the shaft, wires go from the circular metal contacts to the roll gimbal.
Inside the ball, a second set of slip rings provides the electrical connection between the wiring on the roll gimbal and the ball mechanism. The photo below shows the connections to these slip rings, handling rotation around the pitch axis (horizontal in this photo). (The slip rings themselves are inside and are not visible.) The shaft sticking out of the assembly rotates around the azimuth (yaw) axis. The ball hemisphere is attached to the metal disk. The azimuth axis does not require slip rings since only the ball shells rotate; the electronics remain stationary.
The servo loop
In this section, I'll explain how the motors are controlled by servo loops. The attitude indicator is driven by an external gyroscope, receiving electrical signals indicating the roll, pitch, and azimuth positions. As was common in 1960s avionics, the signals are transmitted from synchros, which use three wires to indicate an angle. The motors inside the attitude indicator rotate until the indicator's angles for the three axes match the input angles.
Each motor is controlled by a servo loop, shown below. The goal is to rotate the output shaft to an angle that exactly matches the input angle, specified by the three synchro wires. The key is a device called a control transformer, which takes the three-wire input angle and a physical shaft rotation, and generates an error signal indicating the difference between the desired angle and the physical angle. The amplifier drives the motor in the appropriate direction until the error signal drops to zero. To improve the dynamic response of the servo loop, the tachometer signal is used as a negative feedback voltage. This ensures that the motor slows as the system gets closer to the right position, so the motor doesn't overshoot the position and oscillate. (This is sort of like a PID controller.)
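To make the effect of the tachometer feedback concrete, here is a minimal numerical sketch of such a loop in Python. It is not derived from the indicator's actual circuit: the gains, the friction term, the time step, and the simple model (motor acceleration proportional to the amplifier output) are all assumptions chosen for illustration.

```python
def simulate_servo(target_deg, tach_gain, error_gain=40.0, friction=0.5,
                   dt=0.001, steps=5000):
    """Simulate a shaft chasing target_deg; return (final_angle, peak_angle).

    Assumed model (not taken from the real hardware): the amplifier output is
    error_gain * error - tach_gain * speed, and the motor's acceleration is
    that output minus a small friction term.
    """
    angle = 0.0   # shaft angle, degrees
    speed = 0.0   # shaft speed, degrees per second
    peak = 0.0
    for _ in range(steps):
        error = target_deg - angle                       # control transformer output
        drive = error_gain * error - tach_gain * speed   # tachometer as negative feedback
        accel = drive - friction * speed
        speed += accel * dt
        angle += speed * dt
        peak = max(peak, angle)
    return angle, peak

# Same 30-degree step, without and with tachometer feedback.
final_plain, peak_plain = simulate_servo(30.0, tach_gain=0.0)
final_tach, peak_tach = simulate_servo(30.0, tach_gain=14.0)
```

With these assumed values, the run without tachometer feedback swings well past the target and keeps ringing, while the run with feedback settles on the target without overshooting. The velocity term plays roughly the role of the derivative term in a PID controller.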
In more detail, the external gyroscope unit contains synchro transmitters, small devices that convert the angular position of a shaft into AC signals on three wires. The photo below shows a typical synchro, with the input shaft on the top and five wires at the bottom: two for power and three for the output.
Internally, the synchro has a rotating winding called the rotor that is driven with 400 Hz AC. Three fixed stator windings provide the three AC output signals. As the shaft rotates, the phase and voltage of the output signals change, indicating the angle. (Synchros may seem bizarre, but they were extensively used in the 1950s and 1960s to transmit angular information in ships and aircraft.)
The attitude indicator uses control transformers to process these input signals. A control transformer is similar to a synchro in appearance and construction, but it is wired differently. The three stator windings receive the inputs and the rotor winding provides the error output. If the rotor angles of the synchro transmitter and control transformer are the same, the signals cancel out and there is no error output. But as the difference between the two shaft angles increases, the rotor winding produces an error signal. The phase of the error signal indicates the direction of error.
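The relationship between the synchro transmitter and the control transformer can be sketched with an idealized textbook model. The Python below assumes unit amplitudes, stator windings exactly 120° apart, and a rotor mounted so its output is zero when the angles match; real devices differ in scaling, zero reference, and sign convention. A negative value stands for a 400 Hz signal that is 180° out of phase.

```python
import math

def synchro_transmit(angle_deg):
    """Signed amplitudes of the three stator signals for a shaft angle.

    Idealized assumption: each amplitude varies as the cosine of the angle
    between the rotor and that stator winding.
    """
    a = math.radians(angle_deg)
    return tuple(math.cos(a - math.radians(k * 120)) for k in range(3))

def control_transformer_error(stator, shaft_deg):
    """Signed error amplitude picked up by the control transformer's rotor.

    The three stator signals recreate the transmitter's field direction; the
    rotor picks up the component of that field at right angles to its null
    position, which works out to sin(input angle - shaft angle).
    """
    s = math.radians(shaft_deg)
    return -(2.0 / 3.0) * sum(
        v * math.sin(s - math.radians(k * 120)) for k, v in enumerate(stator)
    )

signals = synchro_transmit(40.0)
aligned = control_transformer_error(signals, 40.0)   # shaft matches input
behind = control_transformer_error(signals, 30.0)    # shaft 10 degrees behind
ahead = control_transformer_error(signals, 50.0)     # shaft 10 degrees ahead
```

In this model the error is zero when the shaft matches the input and flips sign (that is, flips phase) depending on which side of the target the shaft is on, which is what lets the amplifier pick the motor direction.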
The next component is the motor/tachometer, a special motor that was often used in avionics servo loops. This motor is more complicated than a regular electric motor. The motor is powered by 115 volts AC, 400-Hertz, but this isn't sufficient to get the motor spinning. The motor also has two low-voltage AC control windings. Energizing a control winding will cause the motor to spin in one direction or the other.
The motor/tachometer unit also contains a tachometer to measure its rotational speed, for use in a feedback loop. The tachometer is driven by another 115-volt AC winding and generates a low-voltage AC signal proportional to the rotational speed of the motor.
The photo above shows a motor/tachometer with the rotor removed. The unit has many wires because of its multiple windings. The rotor has two drums. The drum on the left, with the spiral stripes, is for the motor. This drum is a "squirrel-cage rotor", which spins due to induced currents. (There are no electrical connections to the rotor; the drums interact with the windings through magnetic fields.) The drum on the right is the tachometer rotor; it induces a signal in the output winding proportional to the speed due to eddy currents. The tachometer signal is at 400 Hz like the driving signal, either in phase or 180° out of phase, depending on the direction of rotation. For more information on how a motor/generator works, see my teardown.
The amplifier
The motors are powered by an amplifier assembly that contains three separate error amplifiers, one for each axis. I had to reverse engineer the amplifier assembly in order to get the indicator working. The assembly mounts on the back of the attitude indicator and connects to one of the indicator's round connectors. Note the cutout in the lower left of the amplifier assembly to provide access to the second connector on the back of the indicator. The aircraft connects to the indicator through the second connector and the indicator passes the input signals to the amplifier through the connector shown above.
The amplifier assembly contains three amplifier boards (for roll, pitch, and azimuth), a DC power supply board, an AC transformer, and a trim potentiometer.7 The photo below shows the amplifier assembly mounted on the back of the instrument. At the left, the AC transformer produces the motor control voltage and powers the power supply board, mounted vertically on the right. The assembly has three identical amplifier boards; the middle board has been unmounted to show the components. The amplifier connects to the instrument through a round connector below the transformer. The round connector at the upper left is on the instrument case (not the amplifier) and provides the connection between the aircraft and the instrument.8
The photo below shows one of the three amplifier boards. The construction is unusual, with some components stacked on top of other components to save space. Some of the component leads are long and protected with clear plastic sleeves. The board is connected to the rest of the amplifier assembly through a bundle of point-to-point wires, visible on the left. The round pulse transformer in the middle has five colorful wires coming out of it. At the right are the two transistors that drive the motor's control windings, with two capacitors between them. The transistors are mounted on a heat sink that is screwed down to the case of the amplifier assembly for cooling. The board is covered with a conformal coating to protect it from moisture or contaminants.
The function of each amplifier board is to generate the two control signals so the motor rotates in the appropriate direction based on the error signal fed into the amplifier. The amplifier also uses the tachometer output from the motor unit to slow the motor as the error signal decreases, preventing overshoot. The inputs to the amplifier are 400 hertz AC signals, with the phase indicating positive or negative error. The outputs drive the two control windings of the motor, determining which direction the motor rotates.
The schematic for the amplifier board is below. The two transistors on the left amplify the error and tachometer signals, driving the pulse transformer. The outputs of the pulse transformer will have opposite phase, driving the output transistors for opposite halves of the 400 Hz cycle. One of the transistors will be in the right phase to turn on and pull the motor control AC to ground, while the other transistor will be in the wrong phase. Thus, the appropriate control winding will be activated (for half the cycle), causing the motor to spin in the desired direction.
It turns out that there are two versions of the attitude indicator that use incompatible amplifiers. I think that the motors for the newer indicators have a single control winding rather than two. Fortunately, the connectors are keyed differently so you can't attach the wrong amplifier. The second amplifier (below) looks slightly more modern (1980s) with a double-sided circuit board and more components in place of the pulse transformer.
The pitch trim circuit
The attitude indicator has a pitch trim knob in the lower right, although the knob was missing from ours. The pitch trim adjustment turns out to be rather complicated. In level flight, an aircraft may have its nose angled up or down slightly to achieve the desired angle of attack. The pilot wants the attitude indicator to show level flight, even though the aircraft is slightly angled, so the indicator can be adjusted with the pitch trim knob. However, the problem is that a fighter plane may, for instance, do a vertical 90° climb. In this case, the attitude indicator should show the actual attitude and ignore the pitch trim adjustment.
I found a 1957 patent that explained how this is implemented. The solution is to "fade out" the trim adjustment when the aircraft moves away from horizontal flight. This is implemented with a special multi-zone potentiometer that is controlled by the pitch angle.
The schematic below shows how the pitch trim signal is generated from the special pitch angle potentiometer and the pilot's pitch trim adjustment. Like most signals in the attitude indicator, the pitch trim is a 400 Hz AC signal, with the phase indicating positive or negative. Ignoring the pitch angle for a moment, the drive signal into the transformer will be AC. The split windings of the transformer will generate a positive phase and a negative phase signal. Adjusting the pitch trim potentiometer lets the pilot vary the trim signal from positive to zero to negative, applying the desired correction to the indicator.
Now, look at the complex pitch angle potentiometer. It has alternating resistive and conducting segments, with AC fed into opposite sides. (Note that +AC and -AC refer to the phase, not the voltage.) Because the resistances are equal, the AC signals will cancel out at the top and the bottom, yielding 0 volts on those segments. If the aircraft is roughly horizontal, the potentiometer wiper will pick up the positive-phase AC and feed it into the transformer, providing the desired trim adjustment as described previously. However, if the aircraft is climbing nearly vertically, the wiper will pick up the 0-volt signal, so there will be no pitch trim adjustment. For an angle range in between, the resistance of the potentiometer will cause the pitch trim signal to smoothly fade out. Likewise, if the aircraft is steeply diving, the wiper will pick up the 0 signal at the bottom, removing the pitch trim. And if the aircraft is inverted, the wiper will pick up the negative AC phase, causing the pitch trim adjustment to be applied in the opposite direction.
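The fade-out behavior can be summarized as a piecewise-linear function of the pitch angle. The Python sketch below is only an illustration: the segment sizes are not given here, so the 30° half-width of each conducting segment (the hypothetical `flat` parameter) is an assumption, as is treating the resistive segments as perfectly linear.

```python
def trim_scale(pitch_deg, flat=30.0):
    """Fraction of the pilot's trim that reaches the indicator (-1 to +1).

    pitch_deg is the wiper position around the full circle: 0 = level,
    90 = vertical climb, 180 = inverted, 270 (or -90) = vertical dive.
    `flat` is an assumed half-width (under 45 degrees) for each conducting
    segment; the real potentiometer's dimensions may differ.
    """
    a = pitch_deg % 360.0
    d = min(a, 360.0 - a)        # angular distance from level, 0 to 180
    fade = 90.0 - 2.0 * flat     # width of each resistive (fading) segment
    if d <= flat:
        return 1.0               # +AC segment: full trim
    if d < 90.0 - flat:
        return 1.0 - (d - flat) / fade     # fading out toward vertical
    if d <= 90.0 + flat:
        return 0.0               # 0-volt segment: trim ignored
    if d < 180.0 - flat:
        return -(d - (90.0 + flat)) / fade  # fading in with opposite phase
    return -1.0                  # -AC segment: trim reversed when inverted

def trim_signal(pitch_deg, knob):
    """Signed trim amplitude; knob runs from -1 to +1 (the pilot's setting)."""
    return trim_scale(pitch_deg) * knob
```

For example, with these assumed segment sizes, `trim_signal(0, 0.5)` passes the pilot's setting through unchanged, `trim_signal(90, 0.5)` is zero, and `trim_signal(180, 0.5)` applies the same correction with the opposite sign.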
Conclusions
The attitude indicator is a key instrument in any aircraft, especially important when flying in low visibility. The F-4's attitude indicator goes beyond the artificial horizon indicator in a typical aircraft, adding a third axis to show the aircraft's heading. Supporting a third axis makes the instrument much more complicated, though. Looking inside the indicator reveals how the ball rotates in three axes while still remaining firmly attached.
Modern fighter planes avoid complex electromechanical instruments. Instead, they provide a "glass cockpit" with most data provided digitally on screens. For instance, the F-35's console replaces all the instruments with a wide panoramic touchscreen displaying the desired information in color. Nonetheless, mechanical instruments have a special charm, despite their impracticality.
For more, follow me on Mastodon as @kenshirriff@oldbytes.space or RSS. (I've given up on Twitter.) I worked on this project with CuriousMarc and Eric Schlapfer, so expect a video at some point. Thanks to John Pumpkinhead and another collector for supplying the indicators and amplifiers.
Notes and references
Specifications9
-
This three-axis attitude indicator is similar in many ways to the FDAI (Flight Director Attitude Indicator) that was used in the Apollo space flights, although the FDAI has more indicators and needles. It is more complex than the Soyuz Globus, used for navigation (teardown), which rotates in two axes. Maybe someone will loan us an FDAI to examine...
↩ -
Our indicator has been used as a parts source, as it has cut wires inside and is missing the pitch trim knob, several needles, and internal adjustment potentiometers. We had to replace two failed capacitors in the power supply. There is still a short somewhere that we are tracking down; at one point it caused the bond wire inside a transistor to melt(!). ↩
-
The aircraft is the "Phantom II" because the original Phantom was a World War II fighter aircraft, the McDonnell FH Phantom. McDonnell Douglas reused the Phantom name for the F-4. (McDonnell became McDonnell Douglas in 1967 after merging with Douglas Aircraft. McDonnell Douglas merged into Boeing in 1997. Many people blame Boeing's current problems on this merger.) ↩
-
The F-4 could carry a variety of nuclear bombs such as the B28EX, B61, B43 and B57, referred to as "special weapons". The photo below shows the nuclear store consent switch, which armed a nuclear bomb for release. (Somehow I expected a more elaborate mechanism for nuclear bombs.) The switch labels are in the shadows, but say "REL/ARM", "SAFE", and "REL". The F-4 Weapons Delivery Manual discusses this switch briefly.
The nuclear store consent switch, to the right of the Weapons System Officer in the rear cockpit. Photo from National Museum of the USAF. -
The photo below is a closeup of the attitude indicator in the F-4 cockpit. Note the Primary/Standby toggle switch in the upper-left. Curiously, this switch is just screwed onto the console, with exposed wires. Based on other sources, this appears to be the standard mounting. This switch is the "reference system selector switch" that selects the data source for the indicator. In the primary setting, the gyroscopically-stabilized inertial navigation system (INS) provides the information. The INS normally gets azimuth information from the magnetic compass, but can use a directional gyro if the Earth's magnetic field is distorted, such as in polar regions. See the F-4E Flight Manual for details.
A closeup of the indicator in the cockpit of the F-4 Phantom II. Photo from National Museum of the USAF. The standby switch setting uses the bombing computer (the AN/AJB-7 Attitude-Reference Bombing Computer Set) as the information source; it has two independent gyroscopes. If the main attitude indicator fails entirely, the backup is the "emergency attitude reference system", a self-contained gyroscope and indicator below and to the right of the main attitude indicator; see the earlier cockpit photo. ↩
-
The diagram below shows the features of the indicator.
The features of the Attitude Director Indicator (ADI). From F-4E Flight Manual TO 1F-4E-1. The pitch steering bar is used for an instrument (ILS) landing. The bank steering bar provides steering information from the navigation system for the desired course. ↩
-
The roll, pitch, and azimuth inputs require different resistances, for instance, to handle the pitch trim input. These resistors are on the power supply board rather than an amplifier board. This allows the three amplifier boards to be identical, rather than having slightly different amplifier boards for each axis. ↩
-
The attitude indicator assembly has a round mil-spec connector and the case has a pass-through connector. That is, the aircraft wiring plugs into the outside of the case and the indicator internals plug into the inside of the case. The pin numbers on the outside of the case don't match the pin numbers on the internal connector, which is very annoying when reverse-engineering the system. ↩
-
In this footnote, I'll link to some of the relevant military specifications.
The attitude indicator is specified in military spec MIL-I-27619, which covers three similar indicators, called ARU-11/A, ARU-21/A, and ARU-31/A. The three indicators are almost identical except that the ARU-21/A has the horizontal pointer alarm flag and the ARU-31/A has a bank angle command pointer and a bank scale at the bottom of the indicator, along with a bank angle command pointer adjustment knob in the lower left. The ARU-11/A was used in the F-111A. (The ID-1144/AJB-7 indicator is probably the same as the ARU-11/A.) The ARU-21/A was used in the A-7D Corsair. The ARU-31/A was used in the RF-4C Phantom II, the reconnaissance version of the F-4. The photo below shows the cockpit of the RF-4C; note that the attitude indicator in the center of the panel has two knobs.
Cockpit panel of the RF-4C. Photo from National Museum of the USAF. The indicator was part of the AN/ASN-55 Attitude Heading Reference Set, specified in MIL-A-38329. I think that the indicator originally received its information from an MD-1 gyroscope (MIL-G-25597) and an ML-1 flux valve compass, but I haven't tracked down all the revisions and variants.
Spec MIL-I-23524 describes an indicator that is almost identical to the ARU-21/A but with white flags. This indicator was also used with the AJB-3A Bomb Release Computing Set, part of the A-4 Skyhawk. This indicator was used with the integrated flight information system MIL-S-23535 which contained the flight director computer MIL-S-23367.
My indicator has no identifying markings, so I can't be sure of its exact model. Moreover, it has missing components, so it is hard to match up the features. Since my indicator has white flags it might be the ID-1329/A.
Forbes recently published the Forbes 400 List for 2024, listing the 400 richest people in the United States. This inspired me to make a histogram to show the distribution of wealth in the United States. It turns out that if you put Elon Musk on the graph, almost the entire US population is crammed into a vertical bar, one pixel wide. Each pixel is $500 million wide, illustrating that $500 million essentially rounds to zero from the perspective of the wealthiest Americans.
The histogram above shows the wealth distribution in red. Note that the visible red line is one pixel wide at the left and disappears everywhere else—this is the important point: essentially the entire US population is in that first bar. The graph is drawn with the scale of 1 pixel = $500 million in the X axis, and 1 pixel = 1 million people in the Y axis. Away from the origin, the red line is invisible—a tiny fraction of a pixel tall since so few people have more than 500 million dollars.
Since the median US household wealth is about $190,000, half the population would be crammed into a microscopic red line 1/2500 of a pixel wide using the scale above. (The line would be much narrower than the wavelength of light so it would be literally invisible.) The very rich are so rich that you could take someone with a thousand times the median amount of money, and they would still have almost nothing compared to the richest Americans. If you increased their money by a factor of a thousand yet again, you'd be at Bezos' level, but still well short of Elon Musk.
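The pixel arithmetic is easy to check. The short Python sketch below uses only the two figures stated above (the $500 million per pixel scale and the roughly $190,000 median); the variable and function names are just for illustration.

```python
PIXEL_DOLLARS = 500e6      # graph scale: 1 pixel = $500 million
MEDIAN_WEALTH = 190_000    # approximate median US household wealth

def pixels(dollars):
    """Horizontal position on the graph, in pixels, for a given wealth."""
    return dollars / PIXEL_DOLLARS

median_px = pixels(MEDIAN_WEALTH)                  # a tiny fraction of a pixel
thousand_x = pixels(MEDIAN_WEALTH * 1_000)         # $190 million
million_x = pixels(MEDIAN_WEALTH * 1_000_000)      # $190 billion

print(f"median: 1/{1 / median_px:.0f} of a pixel")
print(f"1000x median: {thousand_x:.2f} pixels")
print(f"1,000,000x median: {million_x:.0f} pixels")
```

The median works out to roughly 1/2600 of a pixel, in line with the rounded 1/2500 figure. A thousand times the median is still under half a pixel, and a thousand times that is 380 pixels across the graph.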
Another way to visualize the extreme distribution of wealth in the US is to imagine everyone in the US standing up while someone counts off millions of dollars, once per second. When your net worth is reached, you sit down. At the first count of $1 million, most people sit down, with 22 million people left standing. As the count continues—$2 million, $3 million, $4 million—more people sit down. After 6 seconds, everyone except the "1%" has taken their seat. As the counting approaches the 17-minute mark, only billionaires are left standing, but there are still days of counting ahead. Bill Gates sits down after a bit over one day, leaving 8 people, but the process is nowhere near the end. After about two days and 20 hours of counting, Elon Musk finally sits down.
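The timings in this thought experiment follow directly from the counting rate of $1 million per second. As a quick check, the Python sketch below converts between net worth and counting time; the function names are hypothetical, and the only inputs are the rate and the elapsed times quoted above.

```python
DOLLARS_PER_SECOND = 1_000_000   # one count of $1 million per second

def seconds_until_seated(net_worth):
    """How long someone with this net worth stays standing, in seconds."""
    return net_worth / DOLLARS_PER_SECOND

def wealth_at(days=0, hours=0, minutes=0):
    """The dollar amount that has been counted off after the given time."""
    seconds = (days * 24 + hours) * 3600 + minutes * 60
    return seconds * DOLLARS_PER_SECOND

billionaire_minutes = seconds_until_seated(1_000_000_000) / 60   # about 16.7
one_day = wealth_at(days=1)                # $86.4 billion
final_count = wealth_at(days=2, hours=20)  # $244.8 billion
```

A billionaire sits down at 1,000 seconds, just short of the 17-minute mark. One full day of counting reaches $86.4 billion, and two days and 20 hours corresponds to about $245 billion.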
Sources
The main source of data is the Forbes 400 List for 2024. Forbes claims there are 813 billionaires in the US here. Median wealth data is from the Federal Reserve; note that it is from 2022 and household rather than personal. The current US population estimate is from Worldometer. I estimated wealth above $500 million, extrapolating from 2019 data.
I made a similar graph in 2013; you can see my post here for comparison.
Disclaimers: Wealth data has a lot of sources of error including people vs households, what gets counted, and changing time periods, but I've tried to make this graph as accurate as possible. I'm not making any prescriptive judgements here, just presenting the data. Obviously, if you want to see the details of the curve, a logarithmic scale makes more sense, but I want to show the "true" shape of the curve. I should also mention that wealth and income are very different things; this post looks strictly at wealth.
I was studying the silicon die of the Pentium processor and noticed some puzzling structures where signal lines were connected to the silicon substrate for no apparent reason. Two examples are in the photo below, where the metal wiring (orange) connects to small square regions of doped silicon (gray), isolated from the rest of the circuitry. I did some investigation and learned that these structures are "antenna diodes," special diodes that protect the circuitry from damage during manufacturing. In this blog post, I discuss the construction of the Pentium and explain how these antenna diodes work.
Intel released the Pentium processor in 1993, starting a long-running brand of high-performance processors: the Pentium Pro, Pentium II, and so on. In this post, I'm studying the original Pentium, which has 3.1 million transistors.1 The die photo below shows the Pentium's fingernail-sized silicon die under a microscope. The chip has three layers of metal wiring on top of the silicon so the underlying silicon is almost entirely obscured.
Modern processors are built from CMOS circuitry, which uses two types of transistors: NMOS and PMOS. The diagram below shows how an NMOS transistor is constructed. A transistor can be considered a switch between the source and drain, controlled by the gate. The source and drain regions (green) consist of silicon doped with impurities to change its semiconductor properties, forming N+ silicon. The gate consists of a layer of polysilicon (red), separated from the silicon by an absurdly thin insulating oxide layer. Since the oxide layer is just a few hundred atoms thick,2 it is very fragile and easily damaged by excess voltage. (This is why CMOS chips are sensitive to static electricity.) As we will see, the oxide layer can also be damaged by voltage during manufacturing.
The Pentium processor is constructed from multiple layers. Starting at the bottom, the Pentium has millions of transistors similar to the diagram above. Polysilicon wiring on top of the silicon not only forms the transistor gates but also provides short-range wiring. Above that, three layers of metal wiring connect the parts of the chip. Roughly speaking, the bottom layer of metal connects to the silicon and polysilicon to construct logic gates from the transistors, while the upper layers of wiring travel longer distances, with one layer for signals traveling horizontally and the other layer for signals traveling vertically. Tiny tungsten plugs called vias provide connections between the different layers of wiring. A key challenge of chip design is routing, directing signals through the multiple layers of wiring while packing the circuitry as densely as possible.
The photo below shows a small region of the Pentium die with the three metal layers visible. The golden vertical lines are the top metal layer, formed from aluminum and copper. Underneath, you can see the horizontal wiring of the middle metal layer. The more complex wiring of the bottom metal layer can be seen, along with the silicon and polysilicon that form transistors. The small black dots are the tungsten vias that connect metal layers, while the larger dark circles are contacts with the underlying silicon or polysilicon. Near the bottom of the photo, the vertical gray bands are polysilicon lines, forming transistor gates. Although the chip appears flat, it has a three-dimensional structure with multiple layers of metal separated by insulating layers of silicon dioxide. This three-dimensional structure will be important in the discussion below. (The metal wiring is much denser over most of the chip; this region is one of the rare spots where all the layers are visible.)
The manufacturing process for an integrated circuit is extraordinarily complicated but I'll skip over most of the details and focus on how each metal layer is constructed, layer by layer. First, a uniform metal layer is constructed over the silicon wafer. Next, the desired pattern is produced on the surface using a process called photolithography: a light-sensitive chemical called "resist" is applied to the wafer and exposed to light through a patterned mask. The light hardens the resist, creating a protective coating with the pattern of the desired wiring. Finally, the unprotected metal is etched away, leaving the wiring.
In the early days of integrated circuits, the metal was removed with liquid acids, a process called wet etching. Inconveniently, wet etching tended to eat away metal underneath the edges of the mask, which became a problem as integrated circuits became denser and the wires needed to be thinner. The solution was dry etching, using a plasma to remove the metal. By applying a large voltage to plates above and below the chip, a gas such as HCl is ionized into a highly reactive plasma. This plasma attacks the surface (unless it is protected by the resist), removing the unwanted metal. The advantage of dry etching is that it can act vertically (anisotropically), providing more control over the line width.
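To get a feel for why undercutting matters, here is a rough sketch using the textbook "degree of anisotropy" A = 1 - (lateral etch rate / vertical etch rate). The function name and the dimensions are my own illustrative assumptions, not measurements from the Pentium; the model also ignores over-etch and sloped sidewalls.

```python
def remaining_top_width(mask_width_um, film_thickness_um, anisotropy):
    """Estimate the width left at the top of a metal line after etching.

    anisotropy = 1 - lateral_rate / vertical_rate:
      0.0 -> fully isotropic (an idealized wet etch)
      1.0 -> perfectly vertical (an idealized plasma etch)
    Simplified model: the etch stops as soon as it clears the film,
    so each edge is undercut by thickness * (1 - anisotropy).
    """
    undercut_per_side = film_thickness_um * (1.0 - anisotropy)
    return max(0.0, mask_width_um - 2.0 * undercut_per_side)


# Hypothetical numbers: a 1 um thick film under a mask line 3 um wide.
print(remaining_top_width(3.0, 1.0, 0.0))  # isotropic: much narrower line
print(remaining_top_width(3.0, 1.0, 1.0))  # anisotropic: full mask width
```

In this simplified model, the undercut depends on the film thickness rather than the line width, so it eats a larger fraction of the wire as the lines shrink.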
Although plasma etching improved the etching process, it caused another problem: plasma-induced oxide damage, also called the "antenna effect."3 The problem is that long metal wires on the chip could pick up an electrical charge from the plasma, producing a large voltage. As described earlier, the thin oxide layer under a transistor's gate is sensitive to voltage damage. The voltage induced by the plasma can destroy the transistor by blowing a hole through the gate oxide or it can degrade the transistor's performance by embedding charges inside the oxide layer.4
Several factors affect the risk of damage from the antenna effect. First, only the transistor's gate is sensitive to the induced voltage, due to the oxide layer. If the wire is also connected to a transistor's source or drain, the wire is "safe" since the source and drain provide connections to the chip's substrate, allowing the charge to dissipate harmlessly. Note that when the chip is completed, every transistor gate is connected to another transistor's source or drain (which provides the signal to the gate), so there is no risk of damage. Thus, the problem can only occur during manufacturing, with a metal line that is connected to a gate on one end but isn't connected on the other end. Moreover, the highest layer of metal is "safe" since everything is connected at that point. Another factor is that the induced voltage is proportional to the length of the metal wire, so short wires don't pose a risk. Finally, only the metal layer currently being etched poses a risk; since the lower layers are insulated by the thick oxide between layers, they won't pick up charge.
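The factors above can be turned into a small checking routine. The following is a minimal sketch under my own assumptions: a net is modeled as a list of wire segments between named nodes, vias are ignored, and the length threshold is a made-up number. It is not how any real design tool works, but it shows why a gate can be at risk at one stage of manufacturing and safe once the next layer is in place.

```python
from collections import defaultdict


def antenna_risks(edges, gates, diffusions, max_len, top_layer=3):
    """Flag gates that are exposed while a given metal layer is etched.

    edges:      list of (node_a, node_b, metal_layer, length)
    gates:      set of nodes that are transistor gates
    diffusions: set of nodes that are source/drain connections
    max_len:    hypothetical limit on exposed wire length
    Returns a list of (layer, gate, exposed_length) tuples.
    """
    risks = []
    for layer in range(1, top_layer + 1):
        parent = {}

        def find(x):
            # Union-find lookup with path halving.
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        # Only wiring on this layer and below exists at this stage.
        for a, b, seg_layer, _ in edges:
            if seg_layer <= layer:
                parent[find(a)] = find(b)

        # Only the layer being etched collects charge.
        exposed = defaultdict(float)
        for a, _, seg_layer, length in edges:
            if seg_layer == layer:
                exposed[find(a)] += length

        # A source/drain connection drains the charge to the substrate.
        safe_roots = {find(d) for d in diffusions}
        for gate in sorted(gates):
            root = find(gate)
            if root not in safe_roots and exposed[root] > max_len:
                risks.append((layer, gate, exposed[root]))
    return risks


# Hypothetical net: gate G reaches its driver D only through an M3 link.
edges = [
    ("G", "A", 1, 2.0),    # short M1 stub at the gate
    ("A", "B", 2, 500.0),  # long M2 run
    ("B", "C", 3, 5.0),    # M3 link that completes the net
    ("C", "D", 1, 2.0),    # short M1 stub at the driver's drain
]
print(antenna_risks(edges, {"G"}, {"D"}, max_len=100.0))
# [(2, 'G', 500.0)]
```

In this example, the long M2 run is attached to the gate but not yet to the driver when M2 is etched, so it is flagged. Once M3 is in place, the net is complete and nothing is flagged, matching the observation that the highest layer is "safe".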
These factors motivate several ways to prevent antenna problems.5 First, a long wire can be broken into shorter segments, connected by jumpers on a higher layer. Second, moving long wires to the top metal layer eliminates problems.6 Third, diodes can be added to drain the charge from the wire; these are called "antenna diodes". When the chip is in use, the antenna diodes are reverse-biased so they have no electrical effect. But during manufacturing, the antenna diodes let charge flow to the substrate before it causes problems.
The third solution, the antenna diodes, explains the mysterious connections that I saw in the Pentium. In the diagram below, these diodes are visible on the die as square regions of doped silicon. The larger regions of doped silicon form PMOS transistors (upper) and NMOS transistors (lower). The polysilicon lines are faintly visible; they form transistor gates where they cross the doped silicon. (For this photo, I removed all the metal wiring.)
Confusingly, the antenna diodes look almost identical to "well taps", connections from the substrate to the chip's positive voltage supply, but have a completely different purpose. In the Pentium, the PMOS transistors are constructed in "wells" of N-type silicon. These wells must be raised to the chip's positive voltage, so there are numerous well tap connections from the positive supply to the wells. The well taps consist of squares of N+ doped silicon in the N-type silicon well, providing an electrical connection. On the other hand, the antenna diodes also consist of N+ doped silicon, but embedded in P-type silicon. This forms a P-N junction that creates the diode.
In the Pentium, antenna diodes are used for only a small fraction of the wiring. The diodes require extra area on the die, so they are used only when necessary. Most of the antenna problems on the Pentium were apparently resolved through routing. Although the antenna diodes are relatively rare, they are still frequent enough that they caught my attention.
Antenna effects are still an issue in modern integrated circuits. Integrated circuit fabricators provide rules on the maximum allowable size of antenna wires for a particular manufacturing process.7 Software checks the design to ensure that the antenna rules are not violated, modifying the routing and inserting diodes as necessary. Violating the antenna rules can result in damaged chips and a very low yield, so it's more than just a theoretical issue.
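As a rough illustration of what such a check looks like, here is a minimal sketch. Foundry antenna rules are commonly expressed as a maximum ratio of exposed conductor area to gate area, which is the form I assume here. The class, the function, and the limit of 400 are all hypothetical; real limits come from the foundry's rules, which typically also bound nets that have diodes rather than exempting them as this sketch does.

```python
from dataclasses import dataclass


@dataclass
class PartialNet:
    """A net as it exists while one metal layer is being etched."""
    name: str
    metal_area_um2: float     # exposed metal on the layer being etched
    gate_area_um2: float      # total gate area attached to that metal
    has_discharge_path: bool  # source/drain or antenna diode attached


MAX_RATIO = 400.0  # hypothetical limit, not taken from any real process


def check_antenna(nets, max_ratio=MAX_RATIO):
    """Return (name, ratio) for each net that exceeds the antenna ratio."""
    violations = []
    for net in nets:
        if net.has_discharge_path:
            continue  # simplification: treat any discharge path as safe
        ratio = net.metal_area_um2 / net.gate_area_um2
        if ratio > max_ratio:
            violations.append((net.name, ratio))
    return violations


nets = [
    PartialNet("long_unprotected", 1000.0, 1.0, False),
    PartialNet("long_with_diode", 1000.0, 1.0, True),
    PartialNet("short", 100.0, 1.0, False),
]
print(check_antenna(nets))
# [('long_unprotected', 1000.0)]
```

A real tool would then fix each violation, either by rerouting the wire or by adding a diode, and run the check again.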
Thanks to /r/chipdesign and Discord for discussion. If you're interested in the Pentium, I've written about standard cells in the Pentium, and the Pentium as a Navajo rug. Follow me on Mastodon (@kenshirriff@oldbytes.space) or Bluesky (@righto.com) or RSS for updates.
Notes and references
-
In this post, I'm looking at the Pentium model 80501 (codenamed P5). This model was soon replaced with a faster, lower-power version called the 80502 (P54C). Both are considered original Pentiums. ↩
-
The article "IC manufacturing drives CPU performance" states that gate oxide thickness was 100 to 300 angstroms in 1993. ↩
-
The wires act as antennas only metaphorically: they collect charge rather than picking up radio waves.
Plasma-induced oxide damage gave rise to research and conferences in the 1990s to address this problem. The International Symposium on Plasma- and Process-Induced Damage started in 1996 and continued until 2003. Numerous researchers from semiconductor companies and academia studied the causes and effects of plasma damage. ↩
-
The damage is caused by "Fowler-Nordheim tunneling", where electrons tunnel through the oxide and cause damage. Flash memory uses this tunneling to erase the memory; the cumulative damage is why flash memory can only be written a limited number of times. ↩
-
Some relevant papers: Magnetron etching of polysilicon: Electrical damage (1991), Thin-oxide damage from gate charging during plasma processing (1992), Antenna protection strategy for ultra-thin gate MOSFETs (1998), Fixing antenna problem by dynamic diode dropping and jumper insertion (2000). The Pentium uses the "dynamic diode dropping" approach, adding antenna diodes only as needed, rather than putting them in every circuit. I noticed that the Pentium uses extension wires to put the diode in a more distant site if there is no room for the diode under the existing wiring. As an aside, the third paper uses the curious length unit of kµm; by calling 1000 µm a kµm, you can think in micrometers, even though this unit is normally called a mm. ↩
-
Sources say that routing signals on the top metal prevents antenna violations. However, I see several antenna diodes in the Pentium that are connected directly from the bottom metal (M1) through M2 to long lines on M3. These diodes seem redundant since the source/drain connections are in place by that time. So there are still a few mysteries... ↩
-
Foundries have antenna rules provided as part of the Process Design Kit (PDK). Here are the rules for MOSIS and SkyWater. I've focused on antenna effects from the metal wiring, but polysilicon and vias can also cause antenna damage. Thus, there are rules for these layers too. Polysilicon wiring is less likely to cause antenna problems, though, as it is usually restricted to short distances due to its higher resistance. ↩