Architecture and Compilers
About the course
This is the homepage for
CS 353: Architecture and Compilers,
a course offered by
Fritz Ruehr at the
Computer Science Department of
Willamette University.
Learning to program in Java can be fun and empowering, but it may still leave you mystified about
how computers really work: what are the components of a physical computer? How does the computer
"know" what to do when it runs a Java program? What connects abstract ideas like objects and
methods to the silicon hardware?
This hybrid course in computer architecture and high-level language processing (compilers) will
provide you with the answers to these questions. The course is a "soup-to-nuts" overview of the
various levels of structure embodied in a physical computer, as well as the various stages by which
high-level languages are translated into a form which the computer can directly execute.
In traditional curricula, these two topics are often covered in separate courses—we have
combined them into a single comprehensive overview in order to expedite Willamette's pedagogical goals.
We will start our journey with basic facts about binary encodings, how they are used to represent
data and how logical operations can be used to transform these data. We will then see how basic
electrical components called gates can be used to realize these operations on digital signals,
and how they can be combined into circuits which will implement more complex processes. Finally,
we will see how these circuits are used to build higher-level components which constitute a typical
computer, and how digital data stored inside the computer can be used to control its operation.
In the second phase of our studies, we will start with the strings of characters which comprise a
program file, and see how it can be analyzed in successive stages until the internal structure of
the program is exposed. We will then see how this internal structure can be translated into the
low-level codes which a computer can execute, in a way which respects the meanings of the original
program. Finally, as time permits, we will survey a few important concepts which inform the design
of high-level languages both like and unlike Java.
News
- Here's are some old notes on about converting DFAs/NFAs to regular expressions.
- Here's a Sun's Java regular expression tutorial.
- Here's a wood and marbles adding machine.
- Read what every programmer should know about Unicode
(from Joel Spolsky: don't let him put you in a submarine, peeling onions).
(And you might as well read about HTML 5 while your at it, here from Mark Pilgrim.)
- Study for the midterm (Wed 28 Oct 2009) using these notes
(same link as below, just repeated here for convenience)
- another kind of very little machine
- Welcome to the Fall 2009 class!
The "news" below (as well as assignments, etc.) are from the 2008 version of the class;
I will be leaving them intact for now as convenient references. I'll try to add updates
above the horizontal rule below, and to move others above as they become especially relevant.
- a visual key to bits, ints and chars
- an ancient lecture on data representation
- A binary counter graphic
- A binary-decimal converter
- A Virginia Tech tutorial (look at I, II.A and II.B)
- Here's a
better (i.e., actually works) link to the Java Virtual Machine spec
(the one on the class handout apparently suffers from bit-rot).
- I made a small (but important) change to the description of the shunting-yard approach
to parsing in the description of lab 9. More specifically, Sam noticed that I had
assumed there that we were doing operators of equal precedence right-associating,
whereas this semester we are using left-association. This means that
operators of strictly greater precedence would cause a shift instead of a reduction.
See the lab 9 write-up for more details (the change I made is in bullet point 3 of the
"Parsing Strategy" section.)
Thanks, Sam!
- While studying for the final, you'll want to check out Sam's
diagrams (that's two links there) showing the "two's complement clock".
- Here is a version of multiplication for non-negative integers: mult.asm.
... and here's the output of the compiler: longexpr.asm
(but note that there is some premature (i.e., wrong) optimization in the multiplier here!)
- Study for the final exam (Fri 19 Dec 2008, from 2-5 pm) using these notes.
NOTE: the date for the final is now updated for the 2008 class, but the notes for studying
are not (i.e.,, they might chnage in some minor ways).
- Check out this graphical overview of the Java Virtual Machine.
- Study for the midterm (Wed 28 Oct 2009) using these notes
(same link as above, just repeated above for convenience)
- Check out Oliver Steele's
regular expression visualizer (edit the pattern itself on the upper left,
input on the upper right)
- here are some nice, thorough
course notes on computer organization by Grant Braught at Dickinson College
- the Tengwar ("elvish") Unicode proposal,
and the Klingon one
- Here is the (informal, ungraded) homework for Tuesday 4 Sep
- I have added the due date for the first lab: see below!
Labs (in chron. order) NB: I am updating the due dates for this year as we go!
- Lab 1: Numerals, numbers and strings
(due Fri 11 Sep 2009)
A simple lab to get you started thinking about strings and numbers as
they are represented inside the computer in binary.
- Lab 2: Binary addition with two's complement
(due Fri 18 Sep 2009)
Buliding on the last lab, you will now implement some simple operations
on binary "words" (i.e., sequences of bits).
Special note: make sure your program can be reconfigured for different
word sizes (8, 16, 32). (You're allowed to re-compile if you wish.)
- Lab 3: Building circuits in a GUI
(due Fri 25 Sep 2009)
Use the graphical circuit simulator to build some simple circuits that
could be used to implement a computer. Use the
Analytical engine circuit simulator.
- Lab 4: Hidden circuit simulator
(due Mon 5 Oct 2009)
Implement a circuit simulator like the one you used last
week, but without showing the circuit graphically.
(Your input will be textual: you will "build" a circuit internally in Java
and run it to produce boolean outputs from boolean inputs.)
- Lab 5: Writing PC-231 Programs
(due Wed 14 Oct 2009)
For this lab you will write several programs in for the PC-231 computer,
in PC-231 assembly code (in the next few labs you will implement the
tools you used for this one).
- Lab 6: PC-231 Simulator
(due
Fri 24 Oct 2008)
Write a simulator for the PC-231 computer: it should read in hex codes
and execute the program directly, reading, storing, printing, etc. as it goes.
- Lab 7: PC-231 Assembler
(due
Wed 5 Nov 2008)
Write a PC-231 assembler: it should behave like the version supplied by me,
i.e., it should read assembly codes and generate hex codes.
- Lab 8: An Evaluator for Extended Expressions
(due
Fri 14 Nov 2008)
Write an evaluator which will interpret (or execute "on the fly")
programs in the extended expression language, given as abstract syntax
trees (already formed, i.e., built with constructors in your Java code).
- Lab 9: A Parser for Extended Expressions
(due
Wed 3 Dec 2008)
Write a parser, using standard techniques or "canned" parser generation software,
which will read in programs in the extended expression language and produce
abstract syntax trees.
- Lab 10: A Compiler for Extended Expressions
(due end of class)
Write a code generator which will take an abstract syntax tree in the extended expression
language and generate PC-231 machine code from it (of course, the code should run and
produce the same results that the high-level program would have). You should hook up this
code generator with your parser so as to make a complete compiler.
Some useful resources, etc.
Some information about C and other miscellanea
- You can get the C compiler (gcc) on our local (hydra) system as
/opt/sfw/bin/gcc. To make this easier, you might
want to set up an alias in your ~/.cshrc file;
just add a line like this:
alias gcc /opt/sfw/bin/gcc
- You can get the also fix up your backspace key by adding this
to your
.login file:
stty erase ^H
(that last character is a backspace key: if your backspace is currently
set to erase, this is hard to enter, but in the vi editor you
can use a Control-V Control-H sequence (the Control-V escapes the next character
to be entered in as a literal character)
- You can find the sample C programs we looked at in class here:
ccode directory
- You can find the "hypertrm" program for terminal emulation
(the one Duke mentioned in class) on WIndows machines
under
C:\Program Files\Windows NT\hypertrm.exe
(note: no vowel in "trm"). You can also just enter
"hypertrm" in the "Run ..." command line feature of the Start menu.
Topical references (in chron. order)
We won't be using a formal textbook this semester, so I will be trying to gather
links to relevant on-line material. A lot of good-quality articles are available
on Wikipedia, the free on-line "open source"
encyclopedia (you might want to consider donating some portion of your textbook
savings to their efforts).
- Computing hardware in general
- Data representation
- Boolean logic and digital circuitry
- Computer architecture and organization
- Assembly languages and instruction sets
- NEW! C programming language tutorials, etc.
- High-level language implementation (overviews)
- Lexical analysis, tokens and regular expressions
- Abstract syntax trees, parsing and context-free grammars
- Compilers and code generation
- Java and the Java Virtual Machine