I. Algorithms and analysis

   A. What is an algorithm?

      1. Algorithms as "the essence of a solution"
         a. in using the term algorithm, we attempt to capture the essence of some approach to solving a particular problem (e.g., sorting lists of numbers, searching for words in a dictionary, etc.)

      2. Algorithms as rote procedure
         a. an algorithm should be completely automatable, i.e., given an algorithm, a machine should be able to perform the appropriate computation on some data, without any "creative interpretation"

      3. A "written" representation
         a. algorithms should describe a specific sequence of instructions that can be performed in order to compute the solution to a problem; we should be able to record these instructions using some physical representation

      4. Algorithms distinguished from programs
         a. although it should be possible to write down an algorithm, it is more abstract than a program: many different programs (e.g., in different languages) might implement the same algorithm

      5. Algorithms distinguished from (mathematical) functions
         a. in mathematics, functions are sets of input/output pairs; this is too abstract for our purposes, since it does not reflect any notion of the passage of time, or other use of resources

      6. An unattainable ideal
         a. in summary, an algorithm is an idealized abstraction: less specific and arbitrary than a program, but more realistic and concrete than a mathematical function
         b. \fig{algorithms.JPG}

   B. Analyzing algorithms

      1. Criteria for selection
         a. given a modular approach to programming, we are interested in finding the "best" algorithm to solve a particular problem; so we must be able to compare algorithms

      2. Criteria for algorithm selection
         a. there are a number of criteria which we might use to choose one algorithm over another; some of these are extrinsic and/or subjective, but still important (e.g., clarity, ease of implementation)

      3. Time, space and other resources
         a. the more intrinsic and objective criteria involve the algorithm's usage of computer resources, including time, space (of several kinds), and communication bandwidth

      4. Resource usage in context (I)
         a. the same program, when run on different machines, may display different resource usage

      5. Resource usage in context (II)
         a. the same algorithm, when implemented in different languages, may display different resource usage

      6. Resource usage in context (III)
         a. the same program, when run at different times on the same data, may display different resource usage

      7. Detailed versus broad analyses

   C. Intuitions and desiderata

      1. Dependence on data
         a. any algorithm's use of resources will clearly depend on the input given to it (at least under normal circumstances)
         b. we use the variable n to represent some reasonable notion of input size

      2. How do we compare functions?
         a. the graphs of two functions may cross (even several times) over a range of input values; how do we compare them to see which is smaller (i.e., better)? (see the sketch following section C)

      3. Asymptotic behavior
         a. although it is not an ideal choice, we can choose to compare the asymptotic behavior of functions (i.e., their behavior "in the limit")

      4. Simplifying assumption
         a. assume for the moment that an algorithm's usage of resources can be characterized by a polynomial in terms of the size of its input

      5. Dominant terms
         a. the asymptotic behavior of a polynomial is determined by its highest-order term; in other words, this term dominates the results of the function

      6. Constant factors
         a. although constant factors (e.g., the coefficient on the highest-order term) clearly affect the results, their presence is too closely tied to situation-specific circumstances
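   (A small illustration of C.2 and C.5. The two running-time functions below are made-up examples, not taken from the notes: one is linear with a large constant factor, the other quadratic with a small one. Their graphs cross, but the highest-order term eventually decides which is smaller.)

      # Minimal sketch: two hypothetical cost functions whose graphs cross.
      # For small n the quadratic t2 is smaller; past the crossover point
      # (n = 100 here) the linear t1 is smaller and stays smaller, because
      # the n^2 term dominates t2's growth.

      def t1(n):
          return 100 * n      # linear, large constant factor

      def t2(n):
          return n * n        # quadratic, small constant factor

      for n in (10, 50, 100, 200, 1000):
          smaller = "t1" if t1(n) < t2(n) else "t2"
          print(f"n = {n:5d}   t1(n) = {t1(n):8d}   t2(n) = {t2(n):8d}   smaller: {smaller}")

   This is also why constant factors are set aside in C.6: a faster machine or a tighter inner loop changes the coefficient 100, and with it the crossover point, but not the eventual outcome.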
   D. A hierarchy of complexity

      1. Review of logarithms
         a. what is a logarithm? What is a base? By how much do logarithms in different bases differ? (see the note following this outline)

      2. Why base 2?
         a. base 2 is a minimal reasonable base; it corresponds to binary representation and reflects divide-and-conquer approaches (which split problems in half)

      3. The hierarchy
         a. lg(n) < n < n lg(n) < n^2 < n^3 < ... < 2^n
         b. \fig{graph.JPG}

      4. Intractability of exponential algorithms
         a. in the case of exponential resource usage (such as 2^n), it is difficult to see how we can claim an algorithm is reasonable, since small increases in input size yield such huge increases in resource usage

      5. Limits of current knowledge
         a. currently, we can characterize many problems as impossible to solve and many others as "seemingly intractable", but we lack a definitive result on intractability (the P = NP question)

   E. Formal versions of the notations

      1. The basic O-notation ("Big O")
         a. we say that T(n) is O(f(n)) if there exist constants c, n0 > 0 such that T(n) <= c f(n) for all n > n0 (see the worked check following this outline)

      2. Problems with the notation

      3. Variations I: big Omega

      4. Variations II: big Theta
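   (A note on the question posed in D.1. The sample values of n below are my own; the identity used is the standard change-of-base rule, so logarithms in different bases differ only by a constant factor, which is why the hierarchy in D.3 is unaffected by the choice of base.)

      # Minimal sketch: log_b(n) = lg(n) / lg(b) (change of base), so switching
      # bases multiplies the curve by a constant and does not change which
      # function in the hierarchy grows faster.

      import math

      for n in (8, 1024, 10**6):
          ratio = math.log2(n) / math.log10(n)
          print(f"n = {n:8d}   lg(n)/log10(n) = {ratio:.6f}")   # always lg(10), about 3.321928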
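   (A worked check of the definition in E.1. The function T and the witnesses below are my own example, not from the notes: T(n) = 3n^2 + 10n + 50 is O(n^2) with witnesses c = 4 and n0 = 13, because for every input size n > 13 we have n^2 >= 10n + 50, and therefore 3n^2 + 10n + 50 <= 4n^2.)

      # Minimal sketch: testing the Big-O inequality T(n) <= c*f(n) for n > n0
      # over a finite range. The finite check is only a sanity test; the
      # "for all n > n0" claim rests on the algebra above, not on sampled values.

      def T(n):
          return 3 * n * n + 10 * n + 50

      def f(n):
          return n * n

      c, n0 = 4, 13

      assert all(T(n) <= c * f(n) for n in range(n0 + 1, 100_000))
      print(f"T(n) <= {c}*f(n) held for every n from {n0 + 1} to 99,999")

   The variations in E.3 and E.4 are the standard ones: T(n) is Omega(f(n)) when the inequality is reversed (T(n) >= c f(n) for all n > n0), and T(n) is Theta(f(n)) when both the O and Omega bounds hold.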