Architecture and Compilers
About the course
This is the homepage for
CS 353: Architecture and Compilers,
a course offered by
Fritz Ruehr at the
Computer Science Department of
Learning to program in Java can be fun and empowering, but it may still leave you mystified about
how computers really work: what are the components of a physical computer? How does the computer
"know" what to do when it runs a Java program? What connects abstract ideas like objects and
methods to the silicon hardware?
This hybrid course in computer architecture and high-level language processing (compilers) will
provide you with the answers to these questions. The course is a "soup-to-nuts" overview of the
various levels of structure embodied in a physical computer, as well as the various stages by which
high-level languages are translated into a form which the computer can directly execute.
In traditional curricula, these two topics are often covered in separate courses—we have
combined them into a single comprehensive overview in order to expedite Willamette's pedagogical goals.
We will start our journey with basic facts about binary encodings, how they are used to represent
data and how logical operations can be used to transform these data. We will then see how basic
electrical components called gates can be used to realize these operations on digital signals,
and how they can be combined into circuits which will implement more complex processes. Finally,
we will see how these circuits are used to build higher-level components which constitute a typical
computer, and how digital data stored inside the computer can be used to control its operation.
In the second phase of our studies, we will start with the strings of characters which comprise a
program file, and see how it can be analyzed in successive stages until the internal structure of
the program is exposed. We will then see how this internal structure can be translated into the
low-level codes which a computer can execute, in a way which respects the meanings of the original
program. Finally, as time permits, we will survey a few important concepts which inform the design
of high-level languages both like and unlike Java.
Welcome to the Fall 2014 class!
The “news” items below (as well as assignments, etc.) are mostly from older
versions of the class.
I will be leaving them intact for now as convenient references. I'll try to add updates
and move others into focus as they become especially relevant.
2014 midtem review notes are now available.
- Here's a real-world example of
practical regular expression use (using the
sed unix tool)
and a sample from my text editor (for pulling out student rosters).
- Various missing things: here are those
graphics of lights I mentioned; feel free to use them in your GUIs.
And here is the advice on structuring the circuits lab.
- Here's a Sun's Java regular expression tutorial.
- Here's Wikipedia on the gated D latch.
- At this link you can find
a possible alternative
for use in the circuits lab (The Logic Lab from Neuro Productions).
- Here's the the course syllabus as a PDF file.
- Here's an
on-line , interactive regex tester for Java
from the people at ocpsoft.org
(part of their Guide to Regular Expressions in Java.
And here are
10 regular expression examples in Java
from M Kyong.
(You can find many other examples and tutorials
by using Google.)
- Here are links to
Nick Parlante’s Essential C at Stanford
my Lightbot project in C.
- Here you can read a story (or is it a poem?) about
Mel, a Real Programmer.
- Here's a link to that news report video I mentioned about
Paul Allen’s computer museum.
- Check out the hints on programming the circuit simulator.
- Philosoraptor goes meta on programming.
- Forget coconuts and palm trunk troughs, here's how you can make logic gates out of DNA plasmids!
- And here's that
gated D latch I was looking for BROKEN LINK.
- News from slashdot re hacking PROMs with microscopes.
- Eoin says: “
306 9ea c20 200 211 1e0 601 9e0 d10 91e 428 e24 000”!
- Eoin Sinclair found
a possible alternative for use in the circuits lab (The Logic Lab
from Neuro Productions).
Another possibility (commercial, but with a free trial period) is
magnetic switches aren‘t such an old-fashioned idea after all.
- My wife spotted this on-line via Facebook,
where user Eliyahu wrote:
“...did you see the bottom line? I programmed my first computers in
binary, using toggle switches.
I always think of how my code turns into electrical signals, even when it's "written"
by connecting pictures in XCode that are compiled into Obj-C that is compiled into C
that is compiled into LLVM that is compiled into Assembler that is assembled into
object code that is linked into binaries that becomes executable code that is
dispatched through the pipelines and routed through the logical arrays of gates ...
and then back out into representations that eventually have a meaning assigned by
(and here you thought I was just making this stuff up as we went along ...)
- Fritz wouldn't do this, would he?
- At least this isn't me!
- This diagram details building gates from switches.
- Happy Programmers’ Day!
- The controversy continues (!) in Yacc is Not Dead
- IBM is apparently making
great strides in light-based (optical) computing
(I have to say that “silicon nanophotonics” sounds like something out of Star Trek).
- Straight from the Unicode horse's mouth (is there a character for that?) comes
this FAQ about UTF encodings.
- Just in time for our discussion in class
here's a blog post about relocatable code.
- Some say that Yacc is dead (but not everyone agrees).
Also, check out
this paper, "Pure and Declarative Syntax Definition: Paradise Lost and Regained"
with a distinctly biblical introduction (!).
- Here are some links about more realistic architecture issues:
- ...and here are some links about issues of evaluation order in various languages:
- Here are some links about register allocation, graph coloring and related issues:
- Here are some links relevant to the code generation part of the compiler:
- Warning! The relative operator precedence of bitwise masks and shifts in Java means that you will likely need to parenthesize your masks when doing both.
... ( inst & 0x0F0 ) >>> 4
... inst & 0x0F0 >>> 4
- You can read about Chuck Peddle and the 6502 processor in
this article which questions who really
started the personal computer revolution
(of course, the story isn't quite that simple, but still, have you ever
heard of Chuck Peddle? (I mean, anyone other than Collin?)
- Study for the midterm (Wed 20 Oct 2010) using these notes
And here is a sample exam (from CS 231, Intro Programming)
to give you a sense of the format (of course, the content will be different!).
- Here's an explanation of the use of transistors to build
NAND and NOR gates using serial and parallel connections.
- Oh, and the version of multiply I showed in class the other day was
- Read about the latest in Unicode, including controversial Japanese emoji characters,
- By request, we have here a
a few sample problems to practice for the exam.
These will be self-graded; I'll post a separate solution sheet soon.
- The swapping multiply program we developed in class on Friday is
Thanks to Ben for the transcription!
- Here's a scanning electron microscope picture of the surface of a silicon microchip
(I assume this has been artifically colorized);
you can see other items close up in this article from the Daily Mail.
- Learn about null-terminated strings so that you don't become the butt of Java jokes!
... and look at some old computers.
- For those who read Reddit, there is now a
subreddit for CPU design.
- As we move into discussion of computer organization and architecture, the following links
may prove useful:
- Here are some links about the women who pioneered programming on the ENIAC computer,
as we discussed the other day in class:
- Check out this description of one programmer (Jeff Minter)'s first brush with assembler
and it's speed compared to higher-level langauges (I daresay the comparison with Java or Python would
be even more remarkable ... not that you really want to write in assembler, though).
- Some fine people at
Visual6502.org have written
runnable on the web (also available in Python). We'll have to try this out when we get
to that point in class! (They have more info, slides, a research paper, etc., on
their project homepage.)
- Here's an on-line truth table generator,
kindly provided by Professor Lawrence Turner.
(Here's another one,
by Brian Scott Borowski, and a bit more elaborate, although I had trouble getting it to work.)
- Is C really just a portable assembly language? Apparently
opinions on this matter differ.
(Context: this was the subject of some remarks in lab today, Wed 15 Sep 2010.)
- an ancient lecture on data representation
- the Tengwar ("elvish") Unicode proposal,
and the Klingon one
- Read what every programmer should know about Unicode
(from Joel Spolsky: don't let him put you in a submarine, peeling onions).
(And you might as well read about HTML 5 while your at it, here from Mark Pilgrim.)
- a visual key to bits, ints and chars
- Here are Sam's
diagrams (that's two links there) showing the "two's complement clock";
or, if you prefer, here's my own version
- A binary-decimal converter
- A binary counter graphic
- A Virginia Tech tutorial (look at I, II.A and II.B)
- Here's a wood and marbles adding machine.
- another kind of very little machine
- some real-world timing data, albeit circa 2001
- here are some nice, thorough
course notes on computer organization by Grant Braught at Dickinson College
- Here are some advanced tips on "real-life" regular expressions.
- Here's are some old notes on about converting DFAs/NFAs to regular expressions.
- So, as you can read here,
the Java programming language spec does guarantee that operator arguments are evaluated from
left to right ... but despite this guarantee, they recommend against any dependence on order of evaluation (!) .
- Check out Oliver Steele's
regular expression visualizer (edit the pattern itself on the upper left,
input on the upper right)
- I made a small (but important) change to the description of the shunting-yard approach
to parsing in the description of lab 9. More specifically, Sam noticed that I had
assumed there that we were doing operators of equal precedence right-associating,
whereas this semester we are using left-association. This means that
operators of strictly greater precedence would cause a shift instead of a reduction.
See the lab 9 write-up for more details (the change I made is in bullet point 3 of the
"Parsing Strategy" section.)
- Check out this graphical overview of the Java Virtual Machine.
- Here's a
better (i.e., actually works) link to the Java Virtual Machine spec
(the one on the class handout apparently suffers from bit-rot).
Labs (in chron. order) NB: due dates may be updated as we go!
(I have added “abstract due dates” below for each of the labs.
What does this mean?
Well, first of all, they are stated in terms of the weeks of the semester rather than
the days and months of any specific year (this mean less bother about updating for me,
but requires you to be at least somewhat aware of how far into the semester
we are “currently” if you wish to gague your own progress).
But the due dates are also abstract in that they are a little vague and soft:
you are not required to finish the lab by the expected date, but should
expect to do so, or perhaps strive to do so. As stated in lecture, if you
let these things go too long, you will have a Bad Time™
at the end of the semester.)
- Lab 1: Numerals, numbers and strings
(due 2nd week or so)
A simple lab to get you started thinking about strings and numbers as
they are represented inside the computer in binary.
- Lab 2: Binary addition with two's complement
(due 3rd week or so)
Buliding on the last lab, you will now implement some simple operations
on binary "words" (i.e., sequences of bits).
Special note: make sure your program can be reconfigured for different
word sizes (8, 16, 32). (You're allowed to re-compile if you wish.)
- Lab 3: Building circuits in a GUI
(due 4th week or so)
Use the graphical circuit simulator to build some simple circuits that
could be used to implement a computer. Use the
Analytical engine circuit simulator.
- Lab 4: Hidden circuit simulator
(due 5th week or so)
Implement a circuit simulator like the one you used last
week, but without showing the circuit graphically.
(Your input will be textual: you will "build" a circuit internally in Java
and run it to produce boolean outputs from boolean inputs.)
- Lab 5: Writing PC-231 Programs
(due 6th week or so)
For this lab you will write several programs in for the PC-231 computer,
in PC-231 assembly code (in the next few labs you will implement the
tools you used for this one).
- Lab 6: PC-231 Simulator
(due 8th week or so)
Write a simulator for the PC-231 computer: it should read in hex codes
and execute the program directly, reading, storing, printing, etc. as it goes.
- Lab 7: PC-231 Assembler
(due 10th week or so)
Write a PC-231 assembler: it should behave like the version supplied by me,
i.e., it should read assembly codes and generate hex codes.
- Lab 8: An Evaluator for Extended Expressions
(due 11th week or so)
Write an evaluator which will interpret (or execute "on the fly")
programs in the extended expression language, given as abstract syntax
trees (already formed, i.e., built with constructors in your Java code).
- Lab 9: A Parser for Extended Expressions
(due 13th week or so)
Write a parser, using standard techniques or "canned" parser generation software,
which will read in programs in the extended expression language and produce
abstract syntax trees.
- Lab 10: A Compiler for Extended Expressions
(due end of class (or thereabouts ☺))
Write a code generator which will take an abstract syntax tree in the extended expression
language and generate PC-231 machine code from it (of course, the code should run and
produce the same results that the high-level program would have). You should hook up this
code generator with your parser so as to make a complete compiler.
Some useful resources, etc.
Some information about C and other miscellanea
- You can get the C compiler (gcc) on our local (hydra) system as
/opt/sfw/bin/gcc. To make this easier, you might
want to set up an alias in your
just add a line like this:
alias gcc /opt/sfw/bin/gcc
- You can get the also fix up your backspace key by adding this
(that last character is a backspace key: if your backspace is currently
set to erase, this is hard to enter, but in the
stty erase ^H
vi editor you
can use a Control-V Control-H sequence (the Control-V escapes the next character
to be entered in as a literal character)
- You can find the sample C programs we looked at in class here:
- You can find the "hypertrm" program for terminal emulation
(the one Duke mentioned in class) on WIndows machines
C:\Program Files\Windows NT\hypertrm.exe
(note: no vowel in "trm"). You can also just enter
"hypertrm" in the "Run ..." command line feature of the Start menu.
Topical references (in chron. order)
We won't be using a formal textbook this semester, so I will be trying to gather
links to relevant on-line material. A lot of good-quality articles are available
on Wikipedia, the free on-line "open source"
encyclopedia (you might want to consider donating some portion of your textbook
savings to their efforts).
- Computing hardware in general
- Data representation
- Boolean logic and digital circuitry
- Computer architecture and organization
- Assembly languages and instruction sets
- NEW! C programming language tutorials, etc.
- High-level language implementation (overviews)
- Lexical analysis, tokens and regular expressions
- Abstract syntax trees, parsing and context-free grammars
- Compilers and code generation
- Java and the Java Virtual Machine