V
CS 241: Data Structures—Spring 2019 Midterm Exam Review Notes
V
Important note!
*
click on the triangles to the left to expand or contract sections!
*
use the text size feature in your browser to make this text larger and more readable!
V
About the exam
*
the midterm exam will take place on Friday, March 22nd, at the usual lecture time and place
*
the exam will be designed so that you should be able to complete it in less than an hour
*
you will not have access to books, notes, computers, networks or neighbors during the exam
(you may use scratch paper and it will be supplied if needed; you are encouraged to ask your instructor for clarifications during the exam if necessary)
V
In preparing for the exam, you should consult:
*
your textbook (see especially chapters and sections mentioned below)
*
your notes from lecture and lab—these are likely to be especially helpful
*
the home page for the course and various links, examples, etc., provided there
*
… and also especially the lab assignments linked from the home page, as well as your programs
V
Format: the exam will consist of several kinds of questions; typically:
*
10-12 True-or-False questions at 1-2 points each
*
8-10 short answer questions at 4-6 points each
*
1-2 longer answer questions at 8-15 points each
*
These review notes are not exhaustive, but are meant as a guide to coverage—see other sources for details!
V
General background to Data Structures and modularity
*
Chapter 1, section 1.1; also Chapter 3, section 3.1
*
large software projects are often written and maintained by groups of people
*
programming language features can allow us to manage complexity by circumscribing various tasks
*
different classes which implement the same interface can be "plugged" and "unplugged" (replaced) at will
*
ideally, we unplug a slower implementation and plug in a faster one, each implementing the same interface
*
a common language of specification, implementation and analysis (O notation) supports communication
*
the use of standard techniques is valued so that people can understand and modify each other's code
*
standard solutions to standard problems often involve "tricks" (innovations) that would be hard to come up with on the spot
V
Math review and recursion
*
Chapter 1, sections 1.2-1.3
V
Functions and algorithms
V
a (mathematical) function is a set of input-output pairs, without regard to how they are computed
*
the inputs and outputs may be numbers, but also data structures, etc.
*
an algorithm is a technique for computing a function, step by step method, but still somewhat abstract
*
a program is a very specific realization of an algorithm in a particular language, with choices of variable names, etc.
*
these three ideas form a kind of hierarchy, with functions the most abstract and programs the most concrete
*
the relationships are many-to-one between algorithms and functions, and between programs and algorithms
*
algorithms have (abstract, O-notation style) running times; functions do not (and programs have real-world running times)
V
Exponents and logarithms
*
the logarithm is a very slowly growing function of its argument
*
technically, the log is the power to which you must bring the base to equal the argument number
V
roughly, a log in "base b" is the number of digits required to write the argument in base b
*
so, in base 10, the log of one thousand is 3, and of one million is 6
*
at the whole power of the base, there is actually one more digit, as the "odometer" turns over
V
logarithms obey certain mathematical laws similar to those for exponents
*
example: log (a·b) = log(a) + log(b)
*
example: log (ak) = k·log(a)
*
log functions sit below linear functions, and (n log n) sits between linear and squaring in the hierarchy of O-notation (see below)
V
O notation eliminates the need for specifying the bas of the logarithm, since it ignores constant factors
*
(but many logs in computing are really base 2)
V
Basics of recursion
*
an algorithm is recursive if it refers to itself (calls itself) in its own definition
V
recursion is often an alternative to iteration (loops)
*
for non-linear structures (e.g., trees), we may need an auxiliary stack to hold pending nodes while we iterate
V
recursion is implemented using a stack, one item for each method call
*
(actually, all method calls are implemented this way, partly out of anticipation of recursion)
V
Java review and new features
*
Chapter 1, sections 1.4-1.5
V
Review—see external notes!
V
Interfaces (and abstract classes)
V
signatures include the name of a method, the type of the result and the number and type of the arguments
*
(argument names don't matter for matching a signature)
*
(signatures with different result types, but the same name and argument types, conflict)
*
interfaces are collections of method names and signatures: no method bodies!
*
an interface allows us to specify a set of capabilities without (yet) specifying how they will be implemented
V
interfaces support a notion of contractual agreement about how a class can be accesses and used, how it will behave
*
but Java does not allow behavioral specification beyond signatures
*
one interface may be declared as extending another, so as to provide some (hierarchical) structure
V
a class can be declared as implementing an interface: class A implements I { ... }
*
but note that only the signature, not the behavior, is specified in Java or checked by the compiler!
V
abstract classes are "halfway between" a class and an interface
*
they may have some signatures and some full definitions
*
this allows a form of "default implementation" when a full method is defined
*
classes may extend abstract classes, forming a hierarchy
*
you may not instantiate (with new) an abstract class!
V
Wrapper classes
*
for each primitive type (int, boolean, char) there is a wrapper class (Integer, Boolean, etc.)
V
the wrapper class provides a kind of a "box" which holds values of the associated type
*
the value can be changed, but the box object remains the same
*
the wrapper class also supplies a "home" for static methods related to primitive types (e.g., Integer.parseInt(…))
V
Generics
*
in order to avoid the problem of Object casting, we can specify an element type for a structure
*
the <T> type variable is affixed to the end of the class name in declarations of the class and of variable types
*
once in scope, the T type variable can be used as the type of variables, etc.
*
see Generics page at Oracle for a tutorial or section 1.5 in the textbook
V
Algorithm analysis and asymptotics (O notation)
*
Chapter 2, sections 2.1-2.4.5
V
Context and goals
V
we want a way to describe and compare the running times of algorithms, but:
*
it should be insensitive to particular hardware/machines
*
it should emphasize the behavior as problems (or data sets) grow large
V
we abstract away the detail of a function (e.g., a polynomial) representing the running time
*
we eliminate low order terms: just keep the highest one
*
we eliminate constant factors (this has a downside, though)
V
technically, we say that f is O(g) when there is a constant factor c and an initial point n0
*
f(n) is less than or equal to c·g(n) for all n > n0
*
idea: we can pick a constant that, when multiplied by the high-order term, will dominate all low order terms
V
the hierarchy of O-notation functions typically seen is:
*
log(n) < n < n log(n) < n2 < n3 < ... < cn
V
how do we measure times for O-notation?
*
count individual operations as steps
*
ignore constant amounts of steps: look for loops, traversals that depend on n somehow
V
Watch out for this!
*
when f is O(g), it may be that f is less than g
V
Arrays, ArrayLists and linked lists
*
Chapter 3, sections 3.2-3.5
V
Arrays
*
see your introductory Java text for review and examples
V
ArrayLists
*
ArrayLists provide a number of useful methods for getting, adding and removing items from the collection
*
however, indiscriminate use of these features without understanding how they are implemented can lead to bad performance
*
in general, recall that some operations may require up to O(n) time in order to move or copy other elements
*
specifically, adding and removing items from the end of the ArrayList is cheap (O(1)), but adding and removing from the front is expensive (O(n))
V
Linked Lists
*
basic tasks: insertion, deletion, search, indexing, containment, size, check for emptiness
*
singly-linked lists: links run only in one direction (can't easily go backwards)
*
doubly-linked lists: links run in both directions (takes more space, more complex operations)
*
circularly-linked lists: single or double links, no first node
*
be careful of the order when re-linking list nodes on insertion and deletion!
V
Stacks and Queues
*
Chapter 3, sections 3.6-3.7
V
Stacks
*
a stack supports insertion (push) and deletion (pop), but with a specific "discipline"
V
push and pop behave in a LIFO way: last in, first out
*
(think of a stack of dinner plates on a spring at Goudy)
V
applications of stacks:
*
keeping track of pending method calls
*
matching parentheses (and brackets and braces) in program syntax
*
traversing the nodes of a tree in depth-first order (e.g., pre-order, in-order or post-order)
V
implementing a stack with an array
*
keep an index of the top item (or the next top: this is different!)
*
increment on add, decrement on remove
*
remember to check for empty (or full!)
V
implementing a stack with a list
*
keep a reference to (the node for) the top item
*
add a new node at the front to push a new item (set the next reference before changing the top reference)
*
remove a node from the front top pop an item (hold onto the item while re-setting the top reference)
V
Queues
*
a queue supports insertion (add) and deletion (remove), but with a specific "discipline"
V
add and remove behave in a FIFO way: first in, first out
*
(think of a line of people waiting at the bank or grocery store)
V
applications of queues:
*
round robin allocation of resources (fair scheduling)
*
holding a buffer of items that cannot be treated (and released) as quickly as they arrive
*
traversing the nodes of a tree in breadth-first order (i.e., by levels)
V
implementing a queue with an array
*
keep indices for the front and the back: increment the appropriate index upon add or remove
*
(but do the indices "point" to current or next items?)
*
wrap around to the front when you reach the end (use modulus operator % after addition)
*
note that the full part of the queue can "wrap around" from the end to the beginning of the array
*
keep a separate size variable to make it easier to distinguish empty from full
V
implementing a queue with a list
*
keep pointer to first and last nodes
*
add at the last end; remove from the first (the other way won't work as well)
*
the empty queue is a special case
V
Infix and Postfix expressions
*
Chapter 3, sections 3.6.3
V
Expression trees
*
since arithmetic is built mostly out of binary operators, we can represent arithmetic expressions as
binary trees, where the internal nodes hold operators and the leaves are numeric constants (literals)
V
expressions can be written in various systematic styles: infix with parentheses (the usual), postfix or prefix
*
postfix puts the arguments first, then the operator afterwards (post-position); prefix is the opposite
*
postfix and prefix forms never need parentheses
*
you should be able to convert between different forms of expressions for the exam!!
V
various tree traversals (see below) yield similar representations of expressions
*
except for the parentheses of inorder/infix, which must be generated separately, before and after the left and right subtrees
V
Infix form
*
algebraic expressions are usually written with binary operators (like +, - and *), constants (numerals) and variables
*
when we write expressions, we can use parentheses to show structure
V
we leave excess parentheses out according to the usual "order of operations"
*
multiplication and division have higher precedence than addition and subtraction
*
when it matters, we associate to the left: a+b+c really means (a+b)+c, even though the value is unaffected by the order
*
… but sometimes we need to use parentheses, when the order does not match our intent
*
we can always put extra parentheses in for emphasis
*
this usual way of writing expressions is called infix form
V
Postfix form
*
an alternative style is to put the operators after their arguments: 2 3 +
*
in this case, called postfix form, we never need parentheses
*
postfix form is based on the ideas of Polish logicians and is sometimes called Polish notation
V
Evaluating postfix
V
we can use a stack to evaluate (variable-free) postfix forms:
*
proceed through the expression from left to right
*
for a constant, push it on the stack
*
for an operator, pop off two arguments and apply the operator, then push the result back on
*
be careful of the argument order: first one popped is the right argument!
*
when done, we should always have exactly one number on the stack (and have never failed to pop 2)