Talk:CDC 6600
This article is rated C-class on Wikipedia's content assessment scale.
incomplete! working on it.
Machine Description
In the article, the two paragraphs beginning "The machine as a whole" are very confused and probably inaccurate. Thornton's book should be consulted for an accurate description of the all-important barrel architecture. The sentence beginning "Programs were written, with some difficulty" is just false.
How systems programming was affected by 6600 Architecture
Programmers tended to treat PPU programming as if each PPU was a completely independent computer, and as if the other PPUs did not exist. That was a direct result of the barrel architecture.
As for interaction with the CPU, this did not exist; central memory was read or written from the PPU as if it were some kind of simple device without an explicit channel number. The execution of the CPU was not affected by PPU programs, with the solitary exception of the PPU monitor.
Although the article correctly observes that I/O was the province of the PPUs, they were remarkably ill-suited to the task. For example, they possessed no interrupt architecture. Therefore, PPU programmers had to code delay loops which repeatedly checked device statuses. The resultant code was very painful to write due to the numerous loops and exit conditions. To write I/O interfaces casually ignoring device status was to invite PPU hangs which would gradually degrade the entire operating system.
- I find this description to be quite inaccurate. Interaction between PP and CP of course did exist -- via shared memory data structures. Indeed, the circular buffer approach used is very efficient and handles concurrency correctly without any need for interlocks. As for "remarkably ill-suited to the task" -- not at all. With enough PPUs that each I/O device had its own, lack of interrupts is not a real issue. The only thing you might argue is that this approach was unfamiliar to programmers. Perhaps less so early on, because interrupts were not universally used in the 1960s (consider the 1620 -- synchronous I/O on a single processor machine). The proof is in the pudding. PLATO was a 1000 terminal, 600 user timesharing system, highly interactive, very high I/O rates, running on just two Cyber 73 class machines (combined power about the same as a single 6600). I submit that such a project would have failed miserably if it were even close to true that PPs are ill-suited to doing I/O. In fact, they did very well, provided you have good programmers working on the job. Finally, re "each PPU an independent computer" -- naturally, since that is in fact what they are, except for the shared access to the I/O channels. The OS would manage that access. Also, there were a few cases where multiple PP programs would cooperate on I/O -- high speed disk I/O for example (at PLATO) or long-block tape I/O (standard OS feature). Paul Koning (talk) 16:01, 9 November 2009 (UTC)
- the contributor above confuses (as many do) the meaning of "CP" -- Central Processor. The average PP program did not interact with the central processor, because that was the province of the monitor. Instead, the PPs interacted with "CM" -- Central Memory, and it is this to which the contributor refers. —Preceding unsigned comment added by Dmausner (talk • contribs) 21:29, 27 March 2010 (UTC)
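For readers unfamiliar with the circular buffer technique mentioned above, here is a minimal Python sketch of why no interlock is needed when there is exactly one producer and one consumer. It is an illustration only, not the actual CDC data layout: the class name, the `in_ptr`/`out_ptr` fields, and the "one slot left empty" convention are assumptions chosen for the example. The key property is that each pointer has a single writer.

```python
class CircularBuffer:
    """Single-producer, single-consumer ring buffer (illustrative sketch).

    Only the producer ever writes in_ptr; only the consumer ever writes
    out_ptr. Each side merely reads the other's pointer, so neither can
    corrupt the other's state and no lock is required.
    """

    def __init__(self, size):
        self.words = [0] * size
        self.size = size
        self.in_ptr = 0    # next slot the producer will fill
        self.out_ptr = 0   # next slot the consumer will empty

    def put(self, word):
        """Producer side: returns False if the buffer is full."""
        nxt = (self.in_ptr + 1) % self.size
        if nxt == self.out_ptr:
            return False   # full (one slot stays empty so full != empty)
        self.words[self.in_ptr] = word
        self.in_ptr = nxt  # publish the pointer only after the data is stored
        return True

    def get(self):
        """Consumer side: returns None if the buffer is empty."""
        if self.out_ptr == self.in_ptr:
            return None    # empty
        word = self.words[self.out_ptr]
        self.out_ptr = (self.out_ptr + 1) % self.size
        return word
```

In this sketch a CP program would play one role (say, consumer of input data) and a PP program the other, each polling the other's pointer in central memory.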
How PPU were actually used
I removed the sentence beginning with "For instance, a program might use..." because the suggested interaction of CPU and PPU programming never occurred in actual practice.
The PPU were reserved for exclusive use by the operating system, primarily for I/O operations, inasmuch as the PPU could access the data channels and the CPU could not. Because of this, PPU were not employed to assist in CPU computations or logic, even though the "virtual" concurrency of execution in CPU and PPU might offer the possibility.
PPU programming language was easily acquired because the same assembler could interpret both CPU and PPU mnemonics; however, PPU programs could access the entire machine's memory and the data channels. Hence amateurs were forbidden to load code into PPU by a number of preventative measures, not the least of which was the absence of an operating system command or process for seducing the system to do that from a user's job stream.
The easiest way to load a custom PPU program was to go to the bootstrap panel and tell it to load from tape to PP0 on reset. Push the reset button and your code loaded. Other than that it was almost impossible to get a PP program loaded. The monitor was in charge of telling each PPU what to load next.
At Michigan State University (MSU) SCOPE was "enhanced" into "Scope/Hustler" as a pun on the way that user jobs were "hustled" in and out of control points (jobs ready to run). In order to improve performance, parts of the PPU Monitor function were moved to the CPU. The monitor then forced a context jump on the tick, at which point the CPU would run through the control points to find the highest-priority job that was ready for the CPU and then context jump into that user job.
The other major improvement (not sure if this was in SCOPE, as I was writing for SCOPE/HUSTLER) was that the RA+1 calls (system calls) were improved by having the user programs execute an intentional illegal operation. When this happened the user job did a context switch to the OS, which processed the RA+1 call and then dropped into the scheduling process.
This meant that the CPU was only interrupted when the user job wanted a function from the OS OR when the PPU Monitor wanted to make sure that the CPU got some cycles. — Preceding unsigned comment added by 24.62.5.242 (talk) 19:39, 1 August 2012 (UTC)
Some universities possessed PPU emulators which either ran in the CPU or in a PPU (in place of the actual PPU program). These programs could in principle make PPU software development easier and safer since they protected the operating system from channel hangs and memory over-writes.
6600 and Cyber
I used a CDC 6400 and a Cyber 70/74. From what I knew at the time, the Cyber 70 was basically the same as the 6600. If I'm correct, the 6600 was one of the last second generation computers and the Cyber 70 was 3rd generation. Are these things correct?
Also, in the second paragraph under Central Processor, "CU" is used one time. Is that supposed to be CP? I assume it is, since someone changed it in other places, so I'm going to take the liberty to change the remaining CU -> CP. Someone correct this if I'm wrong.
-Bubba73 02:56, 6 Jun 2005 (UTC)
Yes, it should be CP, which is how it was referred to in the 6600 documentation. Geoff97 07:45, 6 Jun 2005 (UTC)
'Population Count' ("Count Ones") Instruction
[CONTROL DATA 6400/6500/6600 COMPUTER SYSTEMS Reference Manual]:
47 CXi Xk Count the number of "1's" in Xk to Xi (15 Bits)
"This instruction counts the number of "1's" in operand register Xk and stores the count in the lower order 6 bits of operand register Xi. Bits 6 through 59 are cleared to zero."
Grishman (1974) gave a silly reason for this opcode, as if storage were so tight that packing individual BITS might be good practice. I have never been able to come up with any commercial application that would find this useful. Where it would be useful is in ciphers and cryptology at the NSA level.
As the article notes, "Most of these [50 CDC-6600 ever built] went to various nuclear bomb-related labs." It seems to me that this opcode wouldn't have found its way into the germanium, I mean silicon, without some Federal involvement in original design. Comments? Tribune 16:30, 12 July 2005 (UTC)
- What is the source for "50 CDC 6600s.. built"? I believe the number is closer to 200 (and almost as many 6400s). Also, although some certainly went to bomb-makers, far more ultimately went to universities and engineering shops. Capek 01:35, 18 August 2006 (UTC)
[COMPASS]: CXi Xk
- Xi <- 60 bit value equal to number of ON bits in Xk
Grishman notes [p. 115] that the instruction is "rarely used." Then he says, "This instruction 'is of use' when binary data, such as yes-no responses to questionnaires, are stored one datum per bit rather than one datum per word (as they would be in a FORTRAN type LOGICAL array) so that sixty times as much information can be stored in a given block of memory. A count ones instruction may then be used to determine the total number of yes responses (1 bits) to a word." I find this unconvincing: one using any CDC architecture to pack/unpack 6-bit characters -- let alone individual bits -- has code so long and slow that an efficient way of counting 'on' bits will be the least of one's problems. Tribune 06:18, 16 July 2005 (UTC)
- The PLATO system used the population count instruction for fuzzy matching. Answers would be run through a transformation that used various bits to code various aspects of the words, and the expected answers were coded the same way. The coding was done in such a way that a near match would have most of the bits in the encoded form equal. So fuzzy matching requires simply popcount(a XOR b) < n -- a total of 4 instructions. Paul Koning (talk) 15:55, 9 November 2009 (UTC)
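The fuzzy-matching test described above, popcount(a XOR b) < n, can be sketched in a few lines of Python. This is only an illustration of the comparison step: how PLATO actually encoded words into bits is not given here, so the inputs are treated as opaque 60-bit codes, and the names `popcount60` and `fuzzy_match` are made up for the example.

```python
MASK60 = (1 << 60) - 1  # the 6600 word is 60 bits wide


def popcount60(x):
    """Number of 1 bits in a 60-bit word (what the CX instruction computes)."""
    return bin(x & MASK60).count("1")


def fuzzy_match(a, b, n):
    """True if the encoded words a and b differ in fewer than n bit positions."""
    return popcount60(a ^ b) < n
```

For instance, `fuzzy_match(0b1011, 0b1001, 2)` is true because the two codes differ in a single bit position, while `fuzzy_match(0b1111, 0b0000, 4)` is false because they differ in four.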
Comment on Pop Count
In the early 1970's I was told that pop count was added to the instruction set at the specific request of one particular US government customer who had been given a demonstration of the earliest 6600. Once the custom circuit modules were designed, it made no sense to omit them from successor machines.
The circuits for pop count were trivial.... literally a bunch of adders in a tree. But there was indeed a desire on the part of NSA to have the instruction for doing set intersection and counting the size of the result. The instruction has other uses as well, including looking for a physical disk position that has at least n blocks available. The instruction was NOT, however, particularly slow, despite being implemented in the divide unit: it was 8 minor cycles, slightly more than a conditional branch (nominally 6, if in stack) and less than a multiply (10). More philosophically, there's also the observation that it's a function which is easy and cheap to do in hardware, and very slow to do in software. Since it's useful, put it in. Capek 01:35, 18 August 2006 (UTC)
The pop count was implemented in the 6600 floating divide unit, which was notoriously the slowest of the functional units. Pop count was the slowest single instruction. This was unrelated to the iterative divide strategy, however. Thornton's book shows that the count was produced by summing the bit count through several static logic adders. I think he wrote that they summed 4- (or 5-) bit chunks of the 60-bit word, in parallel.
The MACE operating system at one time contained an idle loop, used when the CPU was not needed, comprising four pop-count instructions in a row followed by a branch back to them. I have no idea why the author saw any benefit to this; the SCOPE operating system used to hang the CPU on a program stop when it had nothing to do. M
An additional use for pop count was in setting a register to zero in parallel with other functional units to gain overall speed. I believe that this was used in some nuclear physics programs. Geoff97 19:02, 19 August 2006 (UTC)
A use for the Count! In '67 I used the Count instruction on the CDC6600 at NYU's Courant Institute (Atomic Energy Commission was the sponsor of this installation). It was a frivolous use, but incredibly efficient: I used it to run a computer-dating service for my 11th grade prom.
There were several hundred boys and girls participating, and I wanted to do a round-robin match, comparing every possible heterosexual pair. Participants filled out a form with 60 multiple-choice questions, each with four possible answers. Thus the answer to question n could be coded in only two bits, which I spanned across the n-th position in two words--two words which therefore contained the answers to all 60 questions. The Boolean instructions in the 6600 operated on entire words, but calculated each bit independently (costing no time, because the CPU ran the 60 bits in parallel). Thus I could count the matching answers between a boy and a girl by using the function: Count((B1 XOR G1)NOR(B2 XOR G2)). I recall that several (socially) successful matches resulted. In the forty years since, I don't think I've written a program that comes close to that one in efficiency. Of course, I haven't written in machine language since then either.
Thanks for the opportunity to brag about this; at the time, obviously, I couldn't talk about what I was doing. My access to the machine was to work on a new OS called SOAPSUDS (Schwartz's Own Athene Processor Serial Uniprocessor Debugging Simulator), which allowed the CDC6600 to emulate a similar machine with 20 parallel CPUs. Our expectation at the time was that parallel processing would be the basis of the next leap in hardware performance, and we wanted to get a head start on programming for such an architecture. Instead, of course, Moore's law asserted itself, and hardware went in another direction entirely.
--Brian67 —Preceding unsigned comment added by 134.67.6.14 (talk) 22:29, 8 July 2008 (UTC)
- While population count is in the divide unit, it isn't even close to accurate that it is "the slowest instruction". Actually, it takes only 4 cycles, if I remember right, which ties it with floating multiply and makes it only one cycle slower than the fastest instructions. The other instructions in the divide unit (the actual divide instructions) are indeed slowest by far, but that doesn't affect CX. Paul Koning (talk) 01:58, 11 December 2013 (UTC)
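The prom-matching function described in this thread, Count((B1 XOR G1) NOR (B2 XOR G2)), can be reproduced as a short Python sketch. The encoding follows the description (each of 60 answers takes two bits, spread across the same bit position in two words); which word holds the low bit and which the high bit is an assumption, as are the function names.

```python
MASK60 = (1 << 60) - 1  # 60-bit word


def encode(answers):
    """Pack 60 answers (each 0..3) into two 60-bit words.

    Bit n of the first word holds the low bit of answer n and bit n of the
    second word holds the high bit (this assignment is an assumption).
    """
    w1 = w2 = 0
    for n, a in enumerate(answers):
        w1 |= (a & 1) << n
        w2 |= ((a >> 1) & 1) << n
    return w1, w2


def matching_answers(boy, girl):
    """Count((B1 XOR G1) NOR (B2 XOR G2)): positions where both bits agree."""
    b1, b2 = boy
    g1, g2 = girl
    differ = (b1 ^ g1) | (b2 ^ g2)   # 1 wherever the two answers differ
    same = ~differ & MASK60          # NOR, truncated to 60 bits
    return bin(same).count("1")
```

Note that with fewer than 60 questions the unused bit positions would be zero in both words and would count as matches, so a real program would have to mask them off or subtract them.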
Comment on Operating Systems
It could not be said fairly that NOS showed performance many times better than SCOPE. For one thing, both systems were so spare that even on a bad day 99% of the available CPU time was consumed by user programs. The two systems had radically different disk file systems, each with disadvantages. In practice they behaved in a pretty similar manner, and to prefer one over the other was a matter of style over substance, if a customer were even able to appreciate such a difference!
(The following discussion will be moved into an article on the CDC operating systems after a reasonable period of time.)
The file systems differed in the representation of disk extent allocation in memory. The original COS/SCOPE 1 method, preserved in MACE, KRONOS, and NOS, divided a disk surface into conceptual tracks. (On the very first disk and drum devices, these corresponded to physical tracks.) The track reservations were stored as 12-bit bytes in central memory words (the TRT) for ease of use by the 12-bit PPU. A nonzero reservation contained a pointer to the next allocated track in a file.
This design produced a limit of 4095 tracks per device. If the physical device had more physical tracks, you had a choice of allocating more sectors per virtual track, or splitting the physical device into more logical devices with their own reservation tables. If you didn't mind creating huge sector allocations, you could also combine multiple physical devices into one logical device.
This very simple (with limitations) design was also predicated on the idea that each PPU desiring to perform disk I/O contained all of the channel and track changing logic in the PPU-resident code (the address region between 100 and 1077 octal). This made disk error detection and correction extremely difficult as drives became more complex in the 1980s. It also meant that a busy CPU could generate many competing requests from many PPUs for the same disk channels. In such cases PPUs had to wait for their turn, as mediated by the system monitor program.
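To make the TRT idea concrete, here is a hedged Python sketch of a track reservation table stored as 12-bit bytes packed five to a 60-bit word, with each nonzero byte pointing to the next track of the file. The details not stated above are assumptions made for the illustration: the byte order within a word, and the convention that the last track of a file points to itself. The real COS/KRONOS end-of-chain encoding is not described in this discussion.

```python
BYTES_PER_WORD = 5      # five 12-bit bytes fit in one 60-bit word
BYTE_BITS = 12
BYTE_MASK = 0o7777      # 12 bits, so at most 4095 nonzero values


def _shift(pos):
    # Assumption: byte 0 is the leftmost (most significant) byte of the word.
    return (BYTES_PER_WORD - 1 - pos) * BYTE_BITS


def trt_get(trt, track):
    """Read the 12-bit reservation byte for a track."""
    word, pos = divmod(track, BYTES_PER_WORD)
    return (trt[word] >> _shift(pos)) & BYTE_MASK


def trt_set(trt, track, value):
    """Write the 12-bit reservation byte for a track."""
    word, pos = divmod(track, BYTES_PER_WORD)
    s = _shift(pos)
    trt[word] = (trt[word] & ~(BYTE_MASK << s)) | ((value & BYTE_MASK) << s)


def file_tracks(trt, first):
    """Follow the chain of track links starting at the file's first track."""
    chain = [first]
    track = first
    while len(chain) <= BYTE_MASK:        # guard against a corrupt, looping chain
        nxt = trt_get(trt, track)
        if nxt == 0:
            raise ValueError("chain runs into an unreserved track")
        if nxt == track:                  # hypothetical end-of-chain marker
            return chain
        chain.append(nxt)
        track = nxt
    raise ValueError("chain is longer than the table allows")
```

Because a link is a 12-bit value and zero means "not reserved", only 4095 distinct tracks can be named, which is the limit discussed above.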
The later SCOPE file system method, in SCOPE 2 & 3 and NOS/BE, also divided the disk surface into allocation zones with some number of sectors per reservation block. The reservations were stored as a bit vector in central memory words (the RBR), the offset of the bit indicating which block was reserved. In order to represent the sequence of blocks in a file, a separate table was created in a variable-sized region in upper central memory (the RBT). This table contained 12-bit bytes in a pair of CPU words. Each byte contained the bit offset number of the allocated block. One byte in each pair of words contained a pointer to the next pair of words, forming a linked list.
Calculating the bit offsets required some arithmetic best performed in the CPU. As such, SCOPE designers centralized disk access to one PPU program which remained active when any disk requests were present. This program exclusively communicated with the CPU disk allocation code via system requests. This placed all the error detection and diagnostic code in the full memory of one PPU. It also could evaluate all disk requests to optimize head movement.
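The RBR and RBT structures described above can be sketched as follows. This is a simplified model, not the SCOPE layout: the bit numbering within an RBR word is an assumption, and the RBT is modelled as a Python dict mapping an entry number to a (link, block list) pair rather than as packed 12-bit bytes in word pairs. Link value 0 as the end-of-list marker is also an assumption.

```python
WORD_BITS = 60


def rbr_reserve(rbr, block):
    """Mark a reservation block as allocated in the bit vector."""
    word, bit = divmod(block, WORD_BITS)   # the arithmetic "best done in the CPU"
    rbr[word] |= 1 << bit


def rbr_is_reserved(rbr, block):
    """Test whether a reservation block is allocated."""
    word, bit = divmod(block, WORD_BITS)
    return bool((rbr[word] >> bit) & 1)


def file_blocks(rbt, first_entry):
    """Walk the linked list of RBT entries and collect a file's blocks in order.

    rbt maps entry number -> (link, blocks); link 0 ends the list (assumption).
    """
    blocks = []
    entry = first_entry
    while entry != 0:
        link, entry_blocks = rbt[entry]
        blocks.extend(entry_blocks)
        entry = link
    return blocks
```

The divide-and-remainder step in `rbr_reserve` is the kind of calculation that is awkward on a 12-bit PPU, which is consistent with the point above about centralizing allocation in CPU code.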
A SCOPE PPU program wishing to perform disk I/O entered system requests instead of performing its own channel operations. The complex communication process included routing the request from the monitor to the unified disk processor, which performed the channel operations, then sent the data to the PPU over the same channel, or to central memory. Clearly the SCOPE design had many moving parts.
In comparing the COS and SCOPE file systems, these points may be made. The SCOPE design required significantly less storage, at a time when storage was the greatest cost of any computer. The SCOPE RBR was about 1/12 the size of the COS TRT. The SCOPE design cleverly realized that not every RBR sequence is in use; the RBT could be created in memory when the file was opened. Moreover, the RBT design permitted a file to flow from one device to another. This was the basis of the SCOPE permanent files design. Finally, an RBR could represent significantly more than 4095 tracks.
The SCOPE storage advantage was offset by the system complexity required to maintain the variable field length of the RBT. It should have been represented as a control point like any other process, but this would have removed some of the storage efficiency. Furthermore, the unified disk processor PPU program was never very effective at head motion optimization, as the system loads evolved toward time sharing.
The COS file system suffered from too much simplicity. In addition to the storage disadvantage above, its design did not permit ad-hoc device overflow to other physical or logical devices. The TRT design induced extremely wasteful sector allocations on large-capacity drives. As a defence, KRONOS introduced the twin concepts of direct and indirect permanent files, in which the user was expected to know whether his file was large (therefore, direct—stored in place) or small (therefore, indirect—copied to a small-allocation zone at some performance cost).
In the end, comparing system performance with an apples-to-apples batch job stream, without add-on subsystems running, KRONOS and SCOPE naked operating system performance was roughly equal.
However, "without add-on subsystems" is a major qualification, since those subsystems were always running in practice. That's grist for another mill.
Google News Archive Search on 6600
Someone with the right enthusiasm should include info from these few news articles from the 1960s, found with the new Google News Archive Search. I'd just add these to the external links but I don't think they'd get the attention they deserve. The Google News Archive is an amazing insight into the past. :) The first article, from 1961, describes a "huge, advanced-design 6600 computer" to be under development and one, from 1968, how "Four years ago, Minneapolis-based Control Data Corp. brought out its model 6600 computer, the largest machine of its type in the world." I really hope articles about old events, items and people will start getting external links to actual news stories now that we have this new Google service! (Oh, and I'm not paid for endorsing Google like this, I personally just think it's a Good Thing(tm) ;) --ZeroOne 23:59, 14 September 2006 (UTC)
Branch timings
In the section on the CP, at the point where it says the stack is flushed by an unconditional branch, it then says unconditional branches are faster than conditional branches. As I recall, this was the reverse. Many times used =X0 X0 for "unconditional" branches just to avoid the stack flush. Should this not read "it was sometimes faster (and would never be slower) to use a conditional jump," rather than "it was sometimes faster (and would never be slower) than a conditional jump"?
Poochner 18:21, 22 March 2007 (UTC)
The unconditional branch instruction, JP, was always 'out of stack'. In other words, the instruction stack was always invalidated. The conditional branch instructions only caused stack invalidation when their destination was not in the stack. (One can think of the 6600 instruction stack as an instruction cache with a single 'line'. So any out-of-stack branch would invalidate the cache.) So yes, a conditional branch was generally preferred - even for unconditional branches. The EQ instruction was almost always used - comparing B0 to B0.
Note that on the CDC 7600, words in the stack did not need to represent contiguous memory locations. IIRC, the JP instruction did not invalidate the stack on the 7600, but Return Jump still did. Last resort, a monitor eXchange Jump would certainly do it...
--Wws 04:29, 25 March 2007 (UTC)
- Coding standard was that unconditional branches were always "EQ label" (which is short for "EQ B0, B0, label" -- i.e., branch if B0 is equal to B0, which is of course always true). The "JP" instruction was used only if EQ was the wrong thing to use, usually because the branch target was a computed value or a table entry (in which case the JP instruction with a register for destination address would be used) or, rarely, if the action of invalidating the instruction stack was actually required (self-modifying code, such as overlay loading). Paul Koning (talk) 16:06, 9 November 2009 (UTC)
Scoreboarding
Since the CDC 6600 was the first computer with a scoreboard for dynamic rescheduling, this should definitely be included in the article. It's mentioned in the CDC 6000 series article, but not here. -- PyroPi (talk) 02:54, 11 May 2010 (UTC)
Register naming
The article calls the B-registers "scratchpad" registers. As far as I can remember, on the CDC 6400 they were called "index" registers. Is the naming different on the CDC 6600? Thanks for the clarification. CeeGee 07:02, 9 July 2013 (UTC)
- "Index" register is correct. For example, they are described as such in the Compass 3 manual (document no. 60492600), section 2.5, page 2-8. Paul Koning (talk) 02:03, 11 December 2013 (UTC)
- Actually, there is conflict in the manuals. The Compass manual calls them index registers, the 6600 reference manual calls them increment registers. The latter term seems better, after all they go with the "increment" unit, and they are used for things other than indexing. Paul Koning (talk) 16:57, 11 December 2013 (UTC)
- Yes, 'incremental' is a more accurate term for these registers. But it's twice as long to say as 'index', so everybody just called them index registers. T bonham (talk) 02:38, 12 June 2019 (UTC)
Delivery of first 6600
I have seen various versions of the claim that the first 6600 was delivered to CERN a year before the delivery to Lawrence Livermore Lab in California. This is simply not true! The first 6600 was delivered to Livermore Lab. One year before that we were still laying out the artwork for the printed circuit boards and there was no 6600. As I remember it the system we shipped to CERN was serial 4 or 5.
--Oldgoat35 (talk) 17:45, 4 October 2013 (UTC)
I can confirm that Livermore got the first external 6600. I traveled to Chippewa two times anticipating that first delivery. NormHardy (talk) 19:46, 9 February 2016 (UTC)
I agree. The serial #s 1,2,3 went to Livermore and I believe the NSA or Los Alamos. Serial 4 went to the Courant Institute Computer Center at NYU (where I worked 1965-68). Serial 5 would have gone to CERN. I believe serial 7 went to LBL (Berkeley). rchrd (talk) 02:25, 20 June 2017 (UTC)
Overly long intro?
Compared to what I'm used to, the intro section (before the first numbered section) seems very long and dense. Paul Koning (talk) 00:28, 6 January 2015 (UTC)
- You are right. I think the paragraph on the 7600 obviously should not be in the intro, so I moved it. I think the second paragraph needs to be moved out of the intro too. Bubba73 You talkin' to me? 00:42, 6 January 2015 (UTC)
- I moved the second para into a new "Models" subsect of the "Description" sect. I also changed the intro text from referring to "a 6600" to "the first 6600" (because it sounded awkward), but someone needs to verify that the CERN model was indeed the first CDC 6600. — Loadmaster (talk) 17:51, 6 January 2015 (UTC)
PP and memory timing
The discussion of the barrel says that it was partly for cost (sounds right -- not to mention space) and partly because of the CP memory timing which is 10 PP cycles. That doesn't sound right. All memory has the same timing (it's after all the same modules), 1000 ns full cycle. So a CM cycle is one major cycle. But in any case, the CM to PP interaction is isolated in the "pyramid" so the two timings aren't directly tied together. The relevant memory timing may instead be that of PP memory. Given the logic used, execution steps would run at small multiples of 100 ns, but with memory running at 1000 ns that speed would be wasted. 1 microsecond cycle time is fast enough for the intended uses of the PPs, so by having 10 PPs in a barrel, the effective PP execution cycle becomes 1000 ns which perfectly matches the PP memory cycle time. The key question (by Wikipedia rules) is not really whether this makes sense, but whether we can find a source to cite for it. Paul Koning (talk) 15:01, 7 January 2015 (UTC)
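The timing argument above can be checked with a small Python sketch of the barrel as a round-robin schedule. It only models the arithmetic in the comment (10 slots of 100 ns each); the function names are invented for the illustration and nothing here is taken from a CDC manual.

```python
MINOR_CYCLE_NS = 100   # one barrel slot
BARREL_SLOTS = 10      # ten PPs share the execution hardware


def slot_owner(t_ns):
    """Which PP holds the execution slot at time t_ns."""
    return (t_ns // MINOR_CYCLE_NS) % BARREL_SLOTS


def service_times(pp, count):
    """Start times (ns) of the first `count` slots given to one PP."""
    return [(pp + k * BARREL_SLOTS) * MINOR_CYCLE_NS for k in range(count)]
```

Successive entries of `service_times(3, 3)` are 1000 ns apart, so each PP gets one execution step per 1000 ns, the same as the memory cycle time quoted above.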
PP or PPU
Weren't the Peripheral Processors actually called Peripheral Processing Units (PPU)? Bubba73 You talkin' to me? 04:03, 20 January 2015 (UTC)
- The documentation for the 6600 calls them CP and PP; the documentation for some successor products calls them CPU and PPU. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:45, 6 October 2019 (UTC)
Euler's Conjecture
The CDC 6600 is credited in the Bulletin of the American Mathematical Society, volume 72, number 6 (1966), in the paper "Counterexample to Euler's conjecture on sums of like powers" by L. J. Lander and T. R. Parkin, with finding 27⁵ + 84⁵ + 110⁵ + 133⁵ = 144⁵, disproving Euler's conjecture, a generalization of Fermat's Last Theorem. https://projecteuclid.org/euclid.bams/1183528522 --Conspiration 06:46, 25 March 2018 (UTC)
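The Lander and Parkin counterexample is easy to verify today with arbitrary-precision integers. The snippet below only checks the identity; it is not a reconstruction of the search the authors ran on the 6600.

```python
# Four fifth powers that sum to a fifth power (Lander and Parkin, 1966).
terms = [27, 84, 110, 133]
left = sum(t ** 5 for t in terms)
right = 144 ** 5
print(left == right)  # True
```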
Number of machines delivered
At the end of the second paragraph it is stated "Approximately 50 were delivered in total.", quoting a later source, whereas at the beginning of the last-but-one paragraph of the History section it says "More than 100 CDC 6600s were sold over the machine's lifetime.", without a source. Any idea how to resolve this discrepancy? --HReuter (talk) 21:24, 5 October 2019 (UTC)
- Also, do either or both of those numbers include Cyber 70 model 74? Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:43, 6 October 2019 (UTC)
Monitor Exchange Jump?
Should the article discuss Monitor Exchange Jump, or would that be TMI? Shmuel (Seymour J.) Metz Username:Chatul (talk) 16:45, 31 May 2020 (UTC)
Dimensions
The dimensions given in the data box are apparently taken from this source:
Systems Hardware Handbook, Aug 75
but, having given it a look over, I can't see dimensions listed. Have I missed something?
Michael F 1967 (talk) 02:34, 27 February 2021 (UTC)
- Hi Michael! I am the one who created the CGI views and filled in the dimensions. Unfortunately, the sources I had were wrong, and some months ago I found the correct dimensions (but for lack of time, I have not yet corrected the graphics and data). In fact, there are no truly exact dimensions, because the elements were assembled (and frequently disassembled) without special care.
- I will upload the corrected CGI and data today.
- One of my sources with the correct dimensions: 60142400B_6000_Series_Site_Prep_Sep65.pdf
- --FlyAkwa (talk) 12:36, 1 March 2021 (UTC)
Annotate register table?
Should the register table include an annotation that A1-A5 are used for reading into X1-X5 and A6-A7 are used for writing from X6-X7? Shmuel (Seymour J.) Metz Username:Chatul (talk) 08:20, 20 May 2021 (UTC)
SIPROS
I was going to add SIPROS to Timeline of operating systems, but realized that I don't know the date. If anybody knows when and to whom SIPROS shipped, or when CDC dropped it, please update the article. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:42, 6 November 2022 (UTC)
128 kw address space
The current text says that the 6600 has a 128 kw address space, which is correct. It then claims that this is because the address registers are signed, which is incorrect. I suppose you could call A registers signed in that they use the standard one's complement arithmetic, but programmers certainly didn't think of them that way and the notion of "signed" doesn't appear anywhere that I know of.
It then goes on to say that later machines allowed more memory (yes, the 170 series) but individual programs were still limited to 128 kw. Not so; the 170 manual clearly states the FL register in the exchange package is 18 bits, not 17.
In fact, the limitation of the 6600 is related to the physical size of memory of the time: it uses 4k x 12 memory modules, so a 4kw memory bank requires 5 of those, which is two whole rows in a chassis. A fully loaded 6600 has 32 banks (so 32-way interleaving, yielding more than enough bandwidth to match the 100 ns minor cycle time). 4 banks would go in a chassis, with another 5.5 rows of logic, leaving 3.5 rows unused. That means a full memory 6600 used 8 memory chassis, i.e., two of the four wings of the + shaped system cabinet. A half-memory 6600 would have one arm of the + omitted. A hypothetical 256 kw 6600, or other 6000 series machine with that much memory, would require 16 memory chassis and an entirely different packaging strategy.
In the 170 series the memory technology had changed and density increased so at that point the bigger memory became feasible. Paul Koning (talk) 17:16, 28 September 2023 (UTC)
- Also, the 18-bit field length applies only to central memory, not to ECS. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:35, 28 September 2023 (UTC)
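The packaging arithmetic in this thread can be worked through in a few lines of Python. All the input figures come from the comments above (4k x 12 modules, 60-bit words, 32 banks, 4 banks per chassis); the variable and function names are just labels for the illustration.

```python
WORD_BITS = 60
MODULE_BITS = 12          # each memory module is 4k x 12
MODULE_WORDS = 4096
BANKS = 32                # fully loaded 6600
BANKS_PER_CHASSIS = 4

modules_per_bank = WORD_BITS // MODULE_BITS        # 5 modules make one 4 kw bank
total_words = BANKS * MODULE_WORDS                 # 131072 words = 128 kw
address_bits = (total_words - 1).bit_length()      # 17 bits address 128 kw


def chassis_needed(words):
    """Memory chassis required for a given central memory size."""
    banks = words // MODULE_WORDS
    return -(-banks // BANKS_PER_CHASSIS)          # ceiling division


print(modules_per_bank, total_words, address_bits)
print(chassis_needed(total_words), chassis_needed(2 * total_words))
```

The last line gives 8 chassis for the full 128 kw machine and 16 for a hypothetical 256 kw one, matching the figures in the comment, and an 18-bit field length (as in the 170 series) would cover that doubled size.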
Organization
The article currently has material[a] under CDC 6600#Instruction-set architecture that has nothing to do with the ISA. That material is certainly important, but it belongs elsewhere. Splitting this section might also be a good opportunity to write a few words about the dead-start panel[1] that replaced the elaborate consoles of older machines.
Notes
- ^ E.g., models, physical design.
References
- ^ "Figure 6-1. Dead Start Panel" (PDF). Control Data - 6000 Series - Computer Systems (PDF). p. 6-3. Retrieved October 6, 2023.
Instruction-set architecture
@R. S. Shaw: The edit summary in edit Special:PermanentLink/1185120697 asked "explain on talk what you think dubious". CDC 6600#Instruction-set architecture refers to RISC, but in the 6600 CP there are two instruction sizes and not all instructions take the same amount of time, ranging from 3 to 29 minor cycles. I saw nothing else controversial in that section. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:00, 15 November 2023 (UTC)
- My guess as to what might have been the dubious statement being flagged was the preceding sentence: "This simplification also forced programmers to be very aware of their memory accesses, and therefore code deliberately to reduce them as much as possible" (and it didn't seem dubious to me). A wrong guess, but that's unsurprising since there was no talk section (nor even a Reason= description).
- I don't like the phrasing about RISC: it doesn't quite say it is a RISC machine, but almost does. It seems to me the 6600 has a number of RISC attributes, particularly compared to other machines of the time, but probably not all attributes that are now usually considered RISC. The two instruction sizes don't bother me, though, in part because all memory accesses are 60-bit transfers and there are buffer registers for 8 instructions which are apparently high-speed filled from separate memory banks. The 2 sizes are easily handled by the queue it has of (semi-?) decoded instructions, whether 15-bit or 30-bit (neither of which is 60-bit, which would be unreasonably large). The varying instruction times I think are okay for the state of the art then.
- However, this RISC thing could be just an editor's opinion, so I think it's important to have the "citation needed" filled with a reliable source. Conceivably all mention of RISC should be immediately pulled until a good source is available.
- --R. S. Shaw (talk) 07:40, 16 November 2023 (UTC)
- RISC ISAs having a single instruction length can be seen as (except for Aarch64) a short-lived compromise made at a particular point in technology. Among early RISC-style machines, the CDC 6600, the Cray-1, the first (24-bit) version of the IBM 801, and Berkeley RISC-II all had two instruction lengths. Then between 1985 and 1992 ISAs such as MIPS, SPARC, ARM, HP-PA, POWER, and Alpha had a single instruction length and rather bulky programs. But after that Hitachi SuperH, Arm Thumb2, and now RISC-V all have dual 16- and 32-bit instruction lengths.
- Unlike the wildly variable lengths of the VAX and x86, two lengths is very easy to deal with, even on a wide OoO machine, especially when the length can be determined by examining just a small number of bits in the instruction, before doing the main instruction decoding.
- In short, in the 60 year history of RISC principles (not of course given that name until 1981), having two instruction lengths has been common -- even the norm -- in all except one 7 year period. Brucehoult (talk) 01:59, 19 September 2024 (UTC)
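- As an illustration of the point above that the length can be read off a few bits before full decoding, here is a minimal sketch using the RISC-V rule, in which the low two bits of the first 16-bit parcel distinguish 16-bit from 32-bit instructions. The function name is made up for this example, and the sketch deliberately rejects the longer encodings that RISC-V reserves; it says nothing about how the 6600 itself distinguishes its 15-bit and 30-bit instructions.

```python
def rv_instruction_length(first_parcel: int) -> int:
    """Return the instruction length in bytes, given the first 16-bit parcel.

    Illustrative helper (not from any library), following the RISC-V
    length encoding for the 16-bit and 32-bit cases only.
    """
    if not 0 <= first_parcel <= 0xFFFF:
        raise ValueError("parcel must be a 16-bit value")
    # Low two bits other than 11 mark a 16-bit (compressed) instruction.
    if first_parcel & 0b11 != 0b11:
        return 2
    # Low bits 11 with bits [4:2] other than 111 mark a 32-bit instruction.
    if (first_parcel >> 2) & 0b111 != 0b111:
        return 4
    # Anything else is a longer encoding, which this sketch does not handle.
    raise ValueError("encoding longer than 32 bits is not handled here")


if __name__ == "__main__":
    # 0x0001 is the compressed no-op; 0x0013 is the low half of the 32-bit no-op.
    for parcel in (0x0001, 0x0013):
        print(hex(parcel), "->", rv_instruction_length(parcel), "bytes")
```

Only five bits are examined in the worst case, which is why a wide front end can find instruction boundaries cheaply compared with byte-by-byte length decoding on VAX or x86.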
Non-addressable registers in infobox?
@RastaKins: Edit permalink/1188760393 added the program counter to registers infobox. Do nonaddressable registers such as P belong there? If so, should other registers be listed? -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 12:44, 8 December 2023 (UTC)
- The P register appears to be the only user-accessible, non-IO register other than Xn, An, and Bn. (The "Control Data 6000 Series Hardware Reference Manual" referenced in the article documents the P register on page 3-8.) Other Wikipedia articles such as Intel 8086 and Motorola 68000 document user-accessible program counter registers even though they are not registers that can be accessed with a register number. Should non-user-accessible registers be listed? Probably not if we draw the line at userland programs. RastaKins (talk) 19:37, 11 December 2023 (UTC)
- ITYM p. 3-8; there is no p. 2-7 in that edition.
- The P, FL, FL_ECS, RA and RA_ECS registers are not user-accessible, unless you consider a return jump to be a store instruction, and are definitely not addressable. Of course, all of these are loaded and stored by an exchange jump. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 20:25, 11 December 2023 (UTC)
- The 6600 user-level program can write P with any number of instructions: RJ, unconditional jump, and conditional branches. The user did not typically have access to the exchange exit to alter the other registers but retain control. To answer your original question: Since P is often altered by user programs, it should be shown in the register infobox. RastaKins (talk) 17:03, 12 December 2023 (UTC)
- Reference 27 is http://bitsavers.trailing-edge.com/pdf/cdc/cyber/cyber_70/60100000AL_6000_Series_Computer_Systems_HW_Reference_Aug78.pdf, the AL revision of the manual. Physical page 32, numbered p. 3-8, contains the Program Address description at the end of the Operating Registers (pp. 3-6 - 3-8) section. Is that the text that you're referring to?
- So are you saying that any register that the program can change and that can affect subsequent program behavior should be listed even if it doesn't have an address? -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:07, 13 December 2023 (UTC)
- That is the correct reference. Maybe I am not understanding a subtlety of your argument. Sure, CDC typically did not show P in its register illustrations, but that does not mean that P is not a register. P is important and the programmer must load it with valid parameters for all but the most trivial programs. In fact, P is the only register that is used by every fourth 15-bit instruction! Take a look at the register infoboxes for many common microprocessors. Registers are documented even though they often don't have a register number. Most computers before the CDC 6600 did not have numbered registers because they were mostly accumulator-based. Ultimately, drawing the line at application-visible registers is arbitrary. RastaKins (talk) 16:13, 14 December 2023 (UTC)
- I wasn't taking a position, just attempting to clarify what the policy was.
- BTW, by the time the last instruction in a word is dispatched, the P register is not involved in executing the next instruction; the heavy lifting is done earlier. There is a considerable amount of parallelism. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 20:12, 14 December 2023 (UTC)
CDC 6600 ISA is ambiguous
The CDC 6600 has a CP ISA and a PP ISA; they are very different. All statements about "the ISA" should be put in context, either explicitly or by putting the two ISAs in separate subsections. I'm not sure which approach is best. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 16:37, 18 December 2023 (UTC)
- Probably separate sections. Peter Flass (talk) 06:02, 19 December 2023 (UTC)
Parity?
Should the article mention that the 6600 did not have parity bits, and, if so, where? Also, is there an RS for the statement "Parity is for farmers."?
As a side note, the SUNY/AB 6400 sometimes had problems with dropped bits. It turned out that the best diagnostic for suspected memory failures was the Baseball game played at the console. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:49, 27 February 2024 (UTC)