|
 |
CS 241: Data Structures—Midterm Exam Review
|
|
|
 |
Important note!
|
|
|
 |
click on the triangles to the left to expand sections!
|
|
|
 |
use the text size feature in your browser to make this text larger and more readable!
|
|
|
 |
About the exam
|
|
|
 |
The exam will be held on Wednesday 18 March at the usual lecture time (1:50-2:50) in the usual lecture room (Collins 408).
|
|
|
 |
The exam will consist of several kinds of questions; typically:
|
|
|
 |
10-15 True-or-False questions at 1-2 points each
|
|
|
 |
8-10 short answer questions at 4-6 points each
|
|
|
 |
2-4 longer answer questions at 8-15 points each
|
|
|
 |
remember to review your lecture notes, the textbook, the course home page and the labs!
|
|
|
 |
Some Java review
|
|
|
 |
data structures
|
|
|
 |
variables and declarations
|
|
|
 |
objects, fields and methods
|
|
|
 |
references and aliasing (two references to the same object may interfere)
|
|
|
 |
arrays (note especially issues of array declaration and initialization)
|
|
|
 |
int [] a;
|
|
|
 |
int [] a = new int[10];
|
|
|
 |
int [] a = {2,3,5,7,11,13,17,19,23,29};
|
|
|
 |
classes and sub-classes (inheritance)
|
|
|
 |
type casts allow us to verify the expected type of (e.g.) an Object extracted from a structure
|
|
|
 |
(we can also check for a variables class using the instanceof feature)
|
|
|
 |
statement structures
|
|
|
 |
for and while loops
|
|
|
 |
switch statements (only for certain primitive types: not strings!)
|
|
|
 |
"for each" loops
|
|
|
 |
for (int i : a) { ... }
|
|
|
 |
works for arbitrary classes implementing the iterator interface
|
|
|
 |
New Java features: exceptions
|
|
|
 |
rationale for exceptions: allow for unexpected circumstances without littering your code
|
|
|
 |
the try { ... } catch (...) { ... } statement wraps around code, minimizing the cognitive interference of error-checking
|
|
|
 |
methods must be "branded" with the "throws Exception" decoration if they might throw an uncaught exception
|
|
|
 |
some exceptions are exempt: standard runtime exceptions such as array bounds and null references
|
|
|
 |
exceptions themselves are declared like other classes, but they extend Exception
|
|
|
 |
exceptions are created and thrown as in: throw new Exception(...)
|
|
|
 |
New Java features: interfaces
|
|
|
 |
signatures include the name of a method, the type of the result and the number and type of the arguments
|
|
|
 |
(argument names don't matter for matching a signature)
|
|
|
 |
interfaces are collections of method names and signatures: no method bodies!
|
|
|
 |
an interface allows us to specify a set of capabilities without (yet) specifying how they will be implemented
|
|
|
 |
one interface may be declared as extending another, so as to provide some (hierarchical) structure
|
|
|
 |
a class can be declared as implementing an interface: class A implements I { ... }
|
|
|
 |
but note that only the signature, not the behavior, is specified in Java or checked by the compiler!
|
|
|
 |
abstract classes are "halfway between" a class and an interface
|
|
|
 |
they may have some signatures and some full definitions
|
|
|
 |
this allows a form of "default implementation" when a full method is defined
|
|
|
 |
classes may extend abstract classes, forming a hierarchy
|
|
|
 |
you may not instantiate (with new) an abstract class!
|
|
|
 |
New Java features: miscellaneous
|
|
|
 |
enumerated types
|
|
|
 |
declare a new type with fixed elements, like a primitive type
|
|
|
 |
elements have names (convention is to use capital letters)
|
|
|
 |
an alternative to using (e.g.) numbers as codes: enums offer more security through typing
|
|
|
 |
example: public enum Day { SUN, MON, TUE, WED, THU, FRI, SAT };
|
|
|
 |
can be used as the basis for switch statements
|
|
|
 |
generics: <T>
|
|
|
 |
in order to avoid the problem of Object casting, we can specify an element type for a structure
|
|
|
 |
the <T> type variable is affixed to the end of the class name in declarations of the class and of variable types
|
|
|
 |
once in scope, the T type variable can be used as the type of variables, etc.
|
|
|
 |
Software engineering issues (modularity)
|
|
|
 |
large software projects are often written and maintained by groups of people
|
|
|
 |
programming language features can allow us to manage complexity by circumscribing various tasks
|
|
|
 |
interfaces support a notion of contractual agreement about how a class can be accesses and used, how it will behave
|
|
|
 |
(but Java does not allow behavioral specification beyond signatures)
|
|
|
 |
different classes which implement the same interface can be "plugged" and "unplugged" (replaced) at will
|
|
|
 |
the use of standard techniques is valued so that people can understand and modify each other's code
|
|
|
 |
a common language of specification, implementation and analysis (O notation) supports communication
|
|
|
 |
standard solutions to standard problems often involve "tricks" (innovations) that would be hard to come up with on the spot
|
|
|
 |
Programming techniques: binary search
|
|
|
 |
to find a "random number" within limits, guess at the middle
|
|
|
 |
if we get answers back in the form "higher; lower; or found", we can split the search in half each time
|
|
|
 |
the "time" to find an element is log base 2 guesses; thus O(log n) [see below]
|
|
|
 |
in an actual program, "edge cases" can be tricky (finding first or last item)
|
|
|
 |
be careful about integer division by 2 (which rounds down)
|
|
|
 |
Mathematical analysis: logarithms
|
|
|
 |
the logarithm is a very slowly growing function of its argument
|
|
|
 |
technically, the log is the power to which you must bring the base to equal the argument number
|
|
|
 |
roughly, a log in "base b" is the number of digits required to write the argument in base b
|
|
|
 |
so, in base 10, the log of one thousand is 3, and of one million is 6
|
|
|
 |
logarithms obey certain mathematical laws similar to those for exponents
|
|
|
 |
example: log (a·b) = log(a) + log(b)
|
|
|
 |
example: log (ak) = k·log(a)
|
|
|
 |
log functions sit below linear functions, and (n log n) sits between linear and squaring in the hierarchy of O-notation (see below)
|
|
|
 |
O notation eliminates the need for specifying the bas of the logarithm, since it ignores constant factors
|
|
|
 |
(but many logs in computing are really base 2)
|
|
|
 |
Mathematical analysis: O notation
|
|
|
 |
we want a way to describe and compare the running times of algorithms, but:
|
|
|
 |
it should be insensitive to particular hardware/machines
|
|
|
 |
it should emphasize the behavior as problems (or data sets) grow large
|
|
|
 |
we abstract away the detail of a function (e.g., a polynomial) representing the running time
|
|
|
 |
we eliminate low order terms: just keep the highest one
|
|
|
 |
we eliminate constant factors (this has a downside, though)
|
|
|
 |
technically, we say that f is O(g) when there is a constant factor c and an initial point n0
|
|
|
 |
f(n) is less than or equal to c·g(n) for all n > n0
|
|
|
 |
idea: we can pick a constant that, when multiplied by the high-order term, will dominate all low order terms
|
|
|
 |
the hierarchy of O-notation functions typically seen is:
|
|
|
 |
log(n) < n < n log(n) < n2 < n3 < ... < cn
|
|
|
 |
how do we measure times for O-notation?
|
|
|
 |
count individual operations as steps
|
|
|
 |
ignore constant amounts of steps: look for loops, traversals that depend on n somehow
|
|
|
 |
Programming techniques: recursion
|
|
|
 |
an algorithm is recursive if it refers to itself (calls itself) in its own definition
|
|
|
 |
recursion is often an alternative to iteration (loops)
|
|
|
 |
for non-linear structures (e.g., trees), we may need an auxiliary stack to hold pending nodes while we iterate
|
|
|
 |
recursion is implemented using a stack, one item for each method call
|
|
|
 |
(actually, all method calls are implemented this way, partly out of anticipation of recursion)
|
|
|
 |
Linear structures: arrays as lists
|
|
|
 |
arrays can serve as linear collections (lists), but ...
|
|
|
 |
we need to keep track of the current size (versus total capacity)
|
|
|
 |
we must choose between giving an error on reaching capacity, or re-allocating a new array
|
|
|
 |
(re-allocation carries the cost of copying old elements into the new array)
|
|
|
 |
we can hide the re-allocation in a "wrapper class" which holds an array (or a new one) for us
|
|
|
 |
insertion and deletion require shifting (potentially large numbers of) elements
|
|
|
 |
(we could mark deleted elements, but this will get very messy)
|
|
|
 |
Linear structures: linked lists
|
|
|
 |
basic tasks: insertion, deletion, search, indexing, containment, size, emptiness
|
|
|
 |
header nodes vs. special cases
|
|
|
 |
empty lists can be handled as a special case, or we can use a technique of an extra node at the front of a list (header node)
|
|
|
 |
a wrapper class can also be used: not strictly a header node, since not a node
|
|
|
 |
... but allows methods to be called on it (whereas they cannot be called on a null reference)
|
|
|
 |
singly-linked lists: links run only in one direction (can't easily go backwards)
|
|
|
 |
doubly-linked lists: links run in both directions (takes more space, more complex operations)
|
|
|
 |
circularly-linked lists: single or double links, no first node
|
|
|
 |
be careful of the order when re-linking list nodes on insertion and deletion!
|
|
|
 |
Linear structures: stacks
|
|
|
 |
a stack supports insertion (push) and deletion (pop), but with a specific "discipline"
|
|
|
 |
push and pop behave in a LIFO way: last in, first out
|
|
|
 |
(think of a stack of dinner plates on a spring at Goudy)
|
|
|
 |
applications of stacks:
|
|
|
 |
keeping track of pending method calls
|
|
|
 |
matching parentheses (and brackets and braces) in program syntax
|
|
|
 |
traversing the nodes of a tree in depth-first order (e.g., pre-order, in-order or post-order)
|
|
|
 |
implementing a stack with an array
|
|
|
 |
keep an index of the top item (or the next top: this is different!)
|
|
|
 |
increment on add, decrement on remove
|
|
|
 |
remember to check for empty (or full!)
|
|
|
 |
implementing a stack with a list
|
|
|
 |
keep a reference to (the node for) the top item
|
|
|
 |
add a new node at the front to push a new item (set the next reference before changing the top reference)
|
|
|
 |
remove a node from the front top pop an item (hold onto the item while re-setting the top reference)
|
|
|
 |
other fun stack facts
|
|
|
 |
with two stacks we can make a list: they meet at the middle and we can move left and right in O(1) time
|
|
|
 |
we can reverse a stack by pushing its items onto another stack (think slinky)
|
|
|
 |
two stacks can make a queue: push onto one to add, pop from the other to remove, do the slinky when removal stack is empty
|
|
|
 |
(the operations are O(1) amortized over the life of the queue)
|
|
|
 |
Linear structures: queues
|
|
|
 |
a queue supports insertion (add) and deletion (remove), but with a specific "discipline"
|
|
|
 |
add and remove behave in a FIFO way: first in, first out
|
|
|
 |
(think of a line of people waiting at the bank or grocery store)
|
|
|
 |
applications of queues:
|
|
|
 |
round robin allocation of resources (fair scheduling)
|
|
|
 |
holding a buffer of items that cannot be treated (and released) as quickly as they arrive
|
|
|
 |
traversing the nodes of a tree in breadth-first order (i.e., by levels)
|
|
|
 |
implementing a queue with an array
|
|
|
 |
keep indices for the front and the back: increment the appropriate index upon add or remove
|
|
|
 |
(but do the indices "point" to current or next items?)
|
|
|
 |
wrap around to the front when you reach the end (use modulus operator % after addition)
|
|
|
 |
note that the full part of the queue can "wrap around" from the end to the beginning of the array
|
|
|
 |
keep a separate size variable to make it easier to distinguish empty from full
|
|
|
 |
implementing a queue with a list
|
|
|
 |
keep pointer to first and last nodes
|
|
|
 |
add at the last end; remove from the first (the other way won't work as well)
|
|
|
 |
the empty queue is a special case
|
|
|
 |
Linear structures: deques
|
|
|
 |
a deque is a "double-ended queue": we can add or remove from either end
|
|
|
 |
[we need more detail here!]
|
|
|
 |
Branching structures: general trees
|
|
|
 |
a tree is a structure built from nodes, but with multiple nodes "descending" from a given one
|
|
|
 |
(unlike (singly) linked lists, where only a single node follows any given one)
|
|
|
 |
we consider only such structures where:
|
|
|
 |
there is a well-defined starting node, called the root
|
|
|
 |
there is a direction to the links from a parent node to its children
|
|
|
 |
every node has a unique parent (no "sharing")
|
|
|
 |
links do not go around in a cycle
|
|
|
 |
other terminology derives from the idea of a family tree (ancestors, descendants, siblings, etc.)
|
|
|
 |
.. or from botanical trees: branch, leaves, height
|
|
|
 |
or is specific to computing: path (consecutive edges), depth (length of path from root), internal (non-leaf) and external (leaf) nodes
|
|
|
 |
the children may be ordered (e.g., left to right); see also binary trees below
|
|
|
 |
traversals are specific ways (orders) in which to visit the nodes
|
|
|
 |
pre-order: visit root, then traverse children in turn
|
|
|
 |
in-order: [makes sense only for binary trees] traverse left sub-tree, then visit root, then traverse right
|
|
|
 |
post-order: traverse children in turn, then visit root
|
|
|
 |
level-order: visit the roots of the children of each level in order (see below)
|
|
|
 |
pre-order and post-order (and in-order) traversals are called depth-first; level order is called breadth-first
|
|
|
 |
depth-first traversals can be done recursively, or iteratively with the help of a stack
|
|
|
 |
breadth-first traversals can be done iteratively using a queue
|
|
|
 |
with appropriate adapter classes, we can generalize the algorithm
|
|
|
 |
(but for one of the two, the sense of left/right order will be reversed)
|
|
|
 |
Branching structures: binary trees and especially binary search trees
|
|
|
 |
in a binary tree, each node must have no more than two children (and they have a specific left/right sense)
|
|
|
 |
these can be used to represent arithmetic expressions (operators in internal nodes, numbers in leaves)
|
|
|
 |
a binary search tree reifies binary search into a tree data structure
|
|
|
 |
there must be an ordering defined on the items held in the nodes
|
|
|
 |
all items stored to the left of the root are less than the item in the root
|
|
|
 |
all items stored to the right of the root are greater than (or equal to) the item in the root
|
|
|
 |
an in-order traversal of the tree will visit the items in order from smallest to largest
|
|
|
 |
the algorithms for finding and inserting items start the same: go left or right based on comparisons
|
|
|
 |
finding ends with success or (if a null/leaf node is reached) failure
|
|
|
 |
insertion ends with creation of a new node where a null once was
|
|
|
 |
finding an searching take time O(log n), the length of a path from root to leaf ...
|
|
|
 |
... but only if the tree is sufficiently bushy
|
|
|
 |
deletion is based on three cases:
|
|
|
 |
deleting leaves is easy (replace reference to leaf with a null)
|
|
|
 |
deleting nodes with one child involves "pointing around" the removed node
|
|
|
 |
deleting a node with two children involves replacing its item with one of the "inside corners (and deleting that node)
|
|
|
 |
other good review questions:
|
|
|
 |
how many nodes are in a complete tree?
|
|
|
 |
how many nodes are at level k in a complete tree?
|
|
|
 |
what is the ratio of nodes at level k to level k+1 (in a complete tree)
|
|
|