SICP Notes

Posted on March 27, 2015 by Andrew Michael tags: compsci, math, notes

Table of Contents

  1. Recursion and Iteration
  2. Ackermann’s function
  3. Tree Recursion
  4. Pascal’s Triangle
  5. The Binet formula
  6. Orders of Growth

Notes on the classic (infamous?) computer science book Structure and Interpretation of Computer Programs by Hal Abelson and Gerald Sussman.

About

Code is written in Guile Scheme, using Emacs Org-mode’s Babel feature for literate programming. This allows you to weave blocks of code from any number of source languages into organized, foldable notes. The code can be evaluated, fed input, etc. Sufficiently detailing the power of Org-mode itself would warrant a separate full-length article.

I’ve read a similar text called Concrete Abstractions. CA covers a lot of the same material as SICP’s first few sections in great detail, so my notes over those sections will be limited to the odd exercise or so.

If you’re interested in reading SICP, you can’t go wrong with the original text. There’s also a lovely modernization effort available in EPUB and PDF format at https://github.com/sarabander/sicp.


Building Abstractions with Procedures

1.1: Elements of Programming

Exercise 1.8 - Approximating cube roots

\[\frac{\frac{x}{y^{2}}+2y}{3}\]

  (define (cbrt x)
    (cbrt-iter 1.0 x))
  (define (cbrt-iter guess x)
    (if (good-enough? guess x)
        guess
        (cbrt-iter (improve guess x) x)))
  (define (improve guess x)
    (/ (+ (/ x (* guess guess)) (* 2 guess)) 3))
  (define (good-enough? guess x)
    (< (abs (- (expt guess 3) x)) 0.001))

  (cbrt 39837)

Lexical Scoping - Internal Definitions

The above exercise, but organized under one definition:

  (define (cbrt x)  
    (define (cbrt-iter guess x)
      (let ((improve (lambda (guess x)
                       (/ (+ (/ x (expt guess 2)) (* 2 guess)) 3)))
            (good-enough? (lambda (guess x)
                            (< (abs (- (expt guess 3) x)) 0.001))))
        (if (good-enough? guess x)
            guess
            (cbrt-iter (improve guess x) x))))
    (cbrt-iter 1.0 x))

    (cbrt 39837)

1.2: Procedures and Processes

Procedural abstraction is the first essential CS idea introduced by this book. When people say that “computer science isn’t necessarily about computers or science”, this is perhaps what they mean. Computer science is about computation, which can occur in any substrate (material, biological or otherwise) deemed sufficient to carry out procedural manipulation of information.

Recursion and Iteration

This chapter examines linear recursion and iteration, and the distinction between a procedure and the process it generates. Specifically, recursive processes and iterative processes have different “shapes”: they relate to the elements of computation differently in how the process unfolds through time and (memory) space.

Recursive processes build up a chain of deferred operations and then rapidly contract. A procedure that calls itself as an argument to another operation (leaving that operation pending) generates a recursive process.

Example: for computing a factorial (\(n!\)), the number of steps would grow linearly with \(n\).

An iterative process does not “contract” in this manner– it simply keeps track of a fixed number of state variables and re-iterates, generating its steps in constant memory as the variables are updated according to a fixed rule. An iterative procedure may still call itself, but only to produce the next state.

Recursive Process vs Recursive Procedure

A function defined in terms of itself is not necessarily recursive. It could be iterative as well. The difference is whether the function calls itself as an argument inside another expression (recursive) or calls itself directly with adjusted variable values (iterative). This is all illustrated in the code blocks below. Ultimately, a recursive process has to negotiate an increasing amount of information (steps and memory grow in proportion to the input), whereas an iterative process captures its whole state at each step and starts fresh. The information kept around remains fixed; only the number of steps remains proportional to the input.

In other words, there is a difference between a recursive procedure and a recursive process (SICP, p.45). A procedure being recursive merely refers to the fact of its syntactical arrangement– it is defined in terms of itself. Whereas a process being recursive refers to its behavior as it evolves.

An example:

   ;; a recursive factorial function
    (define (fact-r n)
      (if (= n 0)
          1
          (* n (fact-r (- n 1)))))

(fact-r 5)
(* 5 (fact-r 4))
(* 5 (* 4 (fact-r 3)))
(* 5 (* 4 (* 3 (fact-r 2))))
(* 5 (* 4 (* 3 (* 2 (fact-r 1)))))
(* 5 (* 4 (* 3 (* 2 (* 1 (fact-r 0))))))
(* 5 (* 4 (* 3 (* 2 (* 1 1)))))
(* 5 (* 4 (* 3 (* 2 1))))
(* 5 (* 4 (* 3 2)))
(* 5 (* 4 6))
(* 5 24) ;; -> 120

    ;; an iterative factorial function
    (define (fact-i n)
      (letrec ((iter
             (lambda (n product)
               (if (= 0 n)
                   product
                   (iter (- n 1) (* product n))))))
        (iter n 1)))

(fact-i 5) ;; (iter 5 1)
   (iter 5 1)
   (iter 4 5)
   (iter 3 20)
   (iter 2 60)
   (iter 1 120)
   (iter 0 120) ;; -> 120

letrec was not introduced at this point in the book. I used letrec in place of a “helper” iteration function.

An aside on binding constructs:
binding
the relationship between the name of something and its location in memory.
let
a derived form of lambda. let directly binds an identifier to the result of an expression, whereas a lambda’s parameters are bound to the arguments it is applied to. These are equivalent:
  ((lambda (param1 param2 ... ) body) var1 var2 ... )
  (let ((param1 var1) (param2 var2) ... ) body)
let*
evaluates its bindings sequentially, that is, each one with respect to those declared before it. Plain let binds its identifiers in parallel (see the sketch below).
letrec
allows an identifier to be bound to a recursive function; the bound expression may refer to the identifier itself. See: http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-14.html
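
A minimal sketch of the let vs let* difference (the top-level x is added purely for illustration):

  (define x 1)

  ;; let binds in parallel: the x inside y's binding refers to the outer x
  (let ((x 5)
        (y (+ x 1)))
    (list x y))   ;; => (5 2)

  ;; let* binds sequentially: y sees the newly bound x
  (let* ((x 5)
         (y (+ x 1)))
    (list x y))   ;; => (5 6)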

As an aside, proper Scheme implementations are tail-recursive– calls in tail position are optimized so that they are computationally cheap and run in constant space. This is different from many imperative programming languages, in which iteration is almost always preferred to recursion.

Scheme also features a useful form called “named let” that mirrors particular uses of letrec, as sketched below.
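
For comparison, here is a sketch of the iterative factorial from above rewritten with a named let in place of letrec:

  ;; the name 'iter' is bound to the loop itself, so the body can call it
  (define (fact-i n)
    (let iter ((n n) (product 1))
      (if (= n 0)
          product
          (iter (- n 1) (* product n)))))

  (fact-i 5) ;; => 120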

The Ackermann function

This exercise happens to be the one that discourages a lot of people from continuing on. It can be maddening if you attempt working it out on paper for too long. The Ackermann function is defined as such:

  (define (A x y)
    (cond ((= y 0) 0)
          ((= x 0) (* 2 y))
          ((= y 1) 2)
          (else (A (- x 1) (A x (- y 1))))))
(A 2 4)

Mathematically it can be represented as

\[ A(x,y) = \begin{cases} 0 &\mbox{if } y = 0 \\ 2y & \mbox{if } x = 0 \\ 2 & \mbox{if } y = 1 \\ A(x-1,\, A(x,y-1)) & \mbox{otherwise} \end{cases}\]

And as with any problem, first imagine the simplest cases, taking 0 or 1 for the variables. You see that it ends without recursing if both of the variables are less than 2.

Now let’s take 1 for x and 2 for y. Here’s an execution tree:

  (A 1 2)
  (A 0 (A 1 1))
  (A 0 2)
  (* 2 2)

So \(A(1,2)=4\).

Let’s try 2 and 2.

  (A 2 2)
  (A 1 (A 2 1))
  (A 1 2)
  ...
  (* 2 2)

Also \(4\).

Let’s try 2 and 3.

  (A 2 3)
  (A 1 (A 2 2))
  (A 1 (A 1 (A 2 1)))
  (A 1 (A 1 2))
  (A 1 4) ;; we know (A 1 2) -> 4
  (A 0 (A 1 3)) ;; here's where it gets interesting
  (A 0 (A 0 (A 1 2)))
  (A 0 (A 0 4))
  (A 0 8)
  (* 2 8) ;; -> 16

\(A(2,3) = 16\)

So the process appears to be exponential. That’s not yet enough to scare us: 16 is a small number.

But that’s where many an unwitting soul has been lost– try to evaluate (A 2 4) on paper with pencil.

It turns out to evaluate to 65536.

 (A 2 4)
 (A 1 (A 2 3))
 (A 1 (A 1 (A 2 2)))
 (A 1 (A 1 (A 1 (A 2 1))))
 (A 1 (A 1 (A 1 2)))
 (A 1 (A 1 (A 0 (A 1 1))))
 (A 1 (A 1 (A 0 2)))
 (A 1 (A 1 4))
 (A 1 (A 0 (A 1 3)))
 (A 1 (A 0 (A 0 (A 1 2))))
 (A 1 (A 0 (A 0 (A 0 (A 1 1)))))
 (A 1 (A 0 (A 0 (A 0 2))))
 (A 1 (A 0 (A 0 4)))
 (A 1 (A 0 8))
 (A 1 16)
 (A 0 (A 1 15))
 (A 0 (A 0 (A 1 14)))
 (A 0 (A 0 (A 0 (A 1 13))))
 (A 0 (A 0 (A 0 (A 0 (A 1 12)))))
  ;; ... we eventually obtain 1024 at the far end
  ;; ... and then 'pop' the zeroes which multiply it by 2
  ;; ...
 (A 0 (A 0 (A 0 (A 0 4096))))
 (A 0 (A 0 (A 0 8192)))
 (A 0 (A 0 16384))
 (A 0 32768) ;; -> 65536

Try entering larger arguments and your Scheme interpreter will likely hang. This function produces enormous output from minimal input.

The one nice thing, and you’ll have figured this out if you tried it on paper, is that you can repeatedly perform substitutions, treating expressions as their eventual results. Of course, you can do this with most any recursively-defined structure.

It might help to imagine this process visually, or kinetically. Think of the process whittling down the x values, each x having a ‘petite-recursion’ whittling down y and ultimately obtaining an evaluation of 2 at the deepest depth of its recursion, yielding (A 0 2) as it begins to wind out of that depth. Now the function rapidly contracts, ‘popping’ the zeroes, doubling this large number for each accumulated (A 0 ... ). Our final evaluation will obviously be related to the number 2, because of the evaluation to 2 once \(y=1\).

And because of the repeated doubling, we can reasonably guess that the final result will be related to powers of 2 in some way.

The exercise asks us to consider the following procedures:

  (define (f n) (A 0 n))
  (define (g n) (A 1 n))
  (define (h n) (A 2 n))

How would we give a mathematical definition for these functions?

Well, we know that \(f(n)\) will always yield \(2n\). It simply refers to the base case, and no further proof is needed.

Now, \(g(n) = A(1,n)\). For \(n=1\), \(g(1) = A(1,1) = 2\). Now we solve for any \(n\). (A 1 n) will evaluate the else branch of the conditional in A, so we have (A (- 1 1) (A 1 (- n 1))).

  (A (- 1 1) (A 1 (- n 1)))
  (A 0 (A 1 (- n 1)))
  (A 0 (A 0 (A 1 (- n 2))))
  (A 0 (A 0 (A 0 (A 1 (- n 3)))))
  ;; this continues until n reaches 1, and evaluates the expression to 2
  ;; thus, 2 is multiplied by 2 n times

So following the evaluation above, \(g(n) = 2^{n}\).
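
A quick check at the REPL (assuming the definitions of A and g above are loaded):

  (map g '(1 2 3 4 5)) ;; => (2 4 8 16 32)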

Now we examine \(h(n) = A(2,n)\). For \(n=1\), \(h(1) = A(2,1) = 4\), as derived earlier. Now we solve for any \(n\). Because \(h(n)\) and \(g(n)\) are both derived from \(A(x,y)\), it’s likely we can use substitution to find out how they are related.

  (h n) ;; for n > 1
  (A 2 n)
  (A 1 (A 2 (- n 1)))
  ;; which is equivalent to
  (g (h (- n 1)))
  ;; so the evaluation would be 2 to the (h (- n 1))
  ;; we examine (h (- n 1))
  (g (g (h (- n 2))))
  (g (g (g (h (- n 3)))))

So we see clearly that since \(h(n) = g(h(n-1))\), the ultimate expression evaluates as \(g\) nested \(n-1\) times, with the innermost argument being 2: \(g(g(g(\dots g(2))))\). So 2 to the 2 to the 2 to the 2… This is called tetration, or iterated exponentiation. It is an operation with a faster rate of growth than exponentiation. It can be represented symbolically as such:

\[Tet(a, n) = {}^{n}a = \underbrace{a^{a^{a^{\dots^{a}}}}}_{n\ \text{times}}\]
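
As a small sanity check, here is a sketch comparing h with iterated exponentiation of 2 (tet2 is a hypothetical helper; it assumes A and h from above are loaded):

  ;; tetration of 2: tet2(1) = 2, tet2(n) = 2^tet2(n-1)
  (define (tet2 n)
    (if (= n 1)
        2
        (expt 2 (tet2 (- n 1)))))

  (h 3)    ;; => 16
  (tet2 3) ;; => 16
  (h 4)    ;; => 65536
  (tet2 4) ;; => 65536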

Similarly, (A 3 n) will be an operation with a greater rate of growth than tetration. And (A 4 n) even greater than that, and so on. Incomprehensibly fast rates of growth. For a bit of math humor involving very large numbers, check out Steinhaus-Moser notation.

At this point you might realize, as I did, that SICP is partly a math book in disguise. Indeed, the first part of the book features heaps of problems focused on basic numerical analysis, as we’ll see soon with Fibonacci numbers and exponentiation. This is likely because it was expected that MIT students at the time, taking the original SICP course, would be both familiar with and interested in these sorts of problems. Unfortunately, it can turn off certain contemporary readers who might be more interested in “practical” problems and software development. SICP’s focus on numerical analysis, “domain knowledge”, and the abstract nature of computation was partly the impetus for Matthias Felleisen’s article The Structure and Interpretation of the Computer Science Curriculum and his book How to Design Programs.

Essentially, programs are proofs, and indeed, introductory books like SICP and Concrete Abstractions, which are grounded in building up your intuition for abstraction, ask you to do a bit of proof-writing to establish correctness. So it pays to have a bit of math know-how in your pocket before approaching this material, especially proof by induction. CA will teach you much of this from first principles– SICP mostly assumes that you are already familiar with it. This is one reason I’d always recommend CA over SICP for anyone unconfident in their mathematical sophistication. And of course, if the question on your mind is “when am I ever going to use this?”, from what I’ve heard HtDP is a great alternative to both.

Fibonacci Numbers and Tree Recursion

Recall the definition of the Fibonacci sequence:

\[F(n) = \begin{cases} 0 &\text{if } n = 0 \\ 1 & \text{if } n = 1 \\ F(n-1) + F(n-2) &\text{otherwise } \end{cases}\] So: \(0,1,1,2,3,5,8,13,21,34,55,89,\dots\)

A recursive process to find the \(n\)th Fibonacci number:

  (define (fibonacci n)
    (cond ((= 0 n) 0)
          ((= 1 n) 1)
          (else
            (+ (fibonacci (- n 1))
               (fibonacci (- n 2))))))
  (fibonacci 8)

And a linear iterative process (using a ‘named let’ construct):

  (define (fibonacci n)
    (let loop ((a 1)(b 0)(count n))
      (if (= 0 count)
          b
          (loop (+ a b) a (- count 1)))))
  (fibonacci 8)

The recursive process is redundant, with a lot of needless computation, growing exponentially in size with \(n\).

We recall that successive Fibonacci numbers approximate \(\varphi\). In particular, \(F(n)\) is the closest integer to \(\varphi^{n}/\sqrt{5}\),[1] and \(\varphi\) satisfies the equation \(\varphi^{2} = \varphi + 1\). This tells us right away that the Fibonacci numbers themselves grow exponentially, so a process that computes them by naive recursive reference to their definition will take an exponential number of steps and perform a lot of needless computation.
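
A quick numerical check of this claim (a sketch; it assumes one of the fibonacci definitions above is loaded):

  (define phi (/ (+ 1 (sqrt 5)) 2))

  ;; closest integer to phi^n / sqrt(5), as an inexact number
  (define (fib-approx n)
    (round (/ (expt phi n) (sqrt 5))))

  (map fibonacci '(1 2 3 4 5 6 7 8 9 10))
  ;; => (1 1 2 3 5 8 13 21 34 55)
  (map fib-approx '(1 2 3 4 5 6 7 8 9 10))
  ;; => (1.0 1.0 2.0 3.0 5.0 8.0 13.0 21.0 34.0 55.0)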

Tree-recursive processes are useful for hierarchical data, but not for numbers. Here, the iterative process introduces three state variables and resolves in far fewer steps.

Counting Change

Here’s a simple program for computing the number of ways you can make change from a given amount of money. This program contains one procedure called count-combos that can be used to define other procedures. The $-combos function computes count-combos for a fixed units list, the currency denominations under a dollar.

  (define (count-combos units amount)
    (cond ((< amount 0) 0)
          ((= amount 0) 1)
          ((null? units) 0)
          (else (+ (count-combos units (- amount (car units)))
                   (count-combos (cdr units) amount)))))

  (define ($-combos amount)
   (count-combos '(50 25 10 5 1) amount))

  ($-combos 100)
292

As they mention in the book, you solve this problem by recognizing that the number of ways of changing some amount \(a\) using \(n\) kinds of coins is equal to the sum of:

  - the number of ways to change \(a\) using all but the first kind of coin, and
  - the number of ways to change \(a - c\) using all \(n\) kinds of coins, where \(c\) is the denomination of the first kind of coin.

This is mirrored in the program above. The main two recursive branches investigate these very two things, and each new branch generates two more branches. Particular branches that “don’t work” will yield negative amounts or will run out the coin list before the starting amount is exhausted (so they aren’t counted, and evaluate to 0). The combinations that do work will exhaust the amount to 0, and will evaluate to 1. All the branches explored contract under addition operations (which rest at the root of each pair of branches), adding up all the 1’s and 0’s to the final result.

Exercise 1.12

We are asked to write a procedure to compute elements of Pascal’s triangle by means of a recursive process. If you study any amount of integer-heavy math, you’ll be good friends with Pascal’s triangle.

Explicitly, these elements (binomial coefficients) are given by \[\frac{n!}{k!(n-k)!}\] Translating that directly into a procedure would not be ideal, though. Instead, recall that any element of the triangle is given by the sum of the two elements above it. A brief sketch:

  (define (PT-elem row col)
    (if (or (= 0 col) (= col row))
        1
        (+ (PT-elem (- row 1) col)
           (PT-elem (- row 1) (- col 1)))))

Mathematically, the above procedure makes use of the fact that \[{n \choose k} = {n-1 \choose k} + {n-1 \choose k-1}\]

Another method we can use involves the following equality: \[{n \choose k} = \frac{n}{k}{n-1 \choose k-1}\]

With this method, we can generate a linear recursive process. It will create a chain of multiplications of fractions with a length of \(k\)– one base case will be \(k=1\) which will give \(n\), the other base case will be \(n=k\) which will give \(1\). Additionally, we can further optimize the program by checking to see if we can reduce the problem using the equality \[{n \choose k} = {n \choose n-k}\]

We use a ‘named let’ construct for iterating, and give the program parameter names closer to our definitions to be consistent.

  (define (choose n k)
    (cond ((> k n) 0)   ;; invalid input: only needs to be checked once, here
          ((= k 0) 1)   ;; n choose 0 is 1; not covered by the loop's base cases
          (else
           (let loop ((n n) (k k) (product 1))
             (cond ((= n k) product)
                   ((> k (/ n 2)) (loop n (- n k) product))
                   ((= k 1) (* n product))
                   (else (loop (- n 1)
                               (- k 1)
                               (* (/ n k) product))))))))
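
A few quick checks at the REPL:

  (choose 5 2)  ;; => 10
  (choose 6 4)  ;; => 15
  (choose 10 3) ;; => 120
  (choose 3 5)  ;; => 0  (invalid input)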

While this procedure requires more code, it’s much more efficient and also validates its input. Anticipating the next section, we compare the rates of growth of these procedures. I’ll use \(x\) instead of the standard \(n\) to avoid confusion in these notations. The first procedure, for large input, has a space complexity of \(\mathcal{O}(x)\) (the depth of the recursion) but a time complexity of roughly \(\mathcal{O}(x^{k})\), which quickly gets out of hand– drawing a tree of the process will quickly show that there’s much redundant computation. But our new procedure has a time complexity of \(\mathcal{O}(x)\) (the worst case runs about \(k/2\) iterations, with \(k \leq x\) and \(\mathcal{O}(c\cdot x) = \mathcal{O}(x)\)) and a space complexity of \(\mathcal{O}(1)\). Much better. We’ll discuss what these symbols mean soon.

Exercise 1.13

We prove that the \(n\)th Fibonacci number is the closest integer to \(\varphi^{n}/\sqrt{5}\). The text gives us a hint: recall that \(\varphi = (1+\sqrt{5})/2\), let \(\psi = (1 - \sqrt{5})/2\), and use induction to prove that \(Fib(n) = (\varphi^{n}-\psi^{n})/\sqrt{5}\).

Essentially, the book is using \(\psi\) to represent \(-1/\varphi\) (see my article on phi for derivations: http://armichael.github.io/posts/2015-03-12-Phi.html#fn1).

We are set with the task of proving what is called the Binet formula: \[F(n) = \frac{1}{\sqrt{5}}\left[\varphi^{n}-\left(-\frac{1}{\varphi}\right)^{n }\right]\]

Using the above definitions for \(\varphi\) and \(\psi\), we note that both are roots of \(x^{2} = x + 1\), i.e. \(\varphi^{2} = \varphi + 1\) and \(\psi^{2} = \psi + 1\); the inductive step will rest on this relationship.

We recall from the definition of Fibonacci numbers that \(F(0)=0\) and \(F(1)=1\). We check that our formula holds for these base cases:

\[F(0) = \frac{1-1}{\sqrt{5}} = 0\] \[F(1) = \frac{1+\sqrt{5}-(1-\sqrt{5})}{2\sqrt{5}} = 1\]

Inductive hypothesis: we assume that the formula holds for \(F(n-1)\) and \(F(n-2)\).

So because \(F(n)=F(n-1)+F(n-2)\), we must show that where the Binet formula is denoted as \(B\), \[B(n) = \frac{\varphi^{n}-\psi^{n}}{\sqrt{5}} = B(n-1) + B(n-2)\]

Following from the above, it will suffice to show that \(\varphi^{n} = \varphi^{n-1} + \varphi^{n-2}\), and likewise for \(\psi\). The rest of the proof is straightforward algebra and I won’t include it here; you’ll find that the formula holds for all \(n\). Finally, since \(|\psi| < 1\), we have \(\left|\psi^{n}/\sqrt{5}\right| < 1/2\) for every \(n \geq 0\), so \(F(n)\) is indeed the closest integer to \(\varphi^{n}/\sqrt{5}\).
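
For reference, the key step of the algebra is one line, using \(\varphi^{2} = \varphi + 1\) (and identically for \(\psi\)):

\[\varphi^{n-1} + \varphi^{n-2} = \varphi^{n-2}(\varphi + 1) = \varphi^{n-2}\varphi^{2} = \varphi^{n}\]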

Orders of Growth

An order of growth is just a crude measure of how the resources required by a procedure will scale with input. “Resources” here can be construed in a few different ways, namely, the time that the procedure takes to execute (the number of mechanical operations), or the memory that the procedure consumes (occupation of storage registers).

The book illustrates a formal definition in the following way: for one of these possible input parameters \(n\), we say \(R(n)\) is a measure of \(n\)’s resource consumption, for some resource (time, space, etc.). Now, \(R(n)\) has order of growth \(\Theta (f(n ))\) if there exist some positive constants \(c_{1}\) and \(c_{2}\) such that for large values of \(n\), \(c_{1}f(n) \leq R(n) \leq c_{2}f(n)\).

So “Big Theta” of some function \(f(n)\) is just a way of saying that input \(n\) for a particular process scales to the resource consumption in such a way that the resource consumption will end up between two bounds– some constant multiples of \(f(n)\). Resources are considered to be time and space, and you’ll hear talk of the time complexity and space complexity of algorithms.

As an aside from the text: what is the difference between \(\Theta\) (Big Theta) and \(\mathcal{O}\) (Big O) notation? Why is the latter seen more often informally?

The answer is that big-theta is more specific: it gives both an upper and a lower bound. Big-O just provides an upper bound, often used to answer the question “how bad can it get?”. That is, big-O denotes a bound for the worst case– for large input \(n\), the resource consumption won’t ever grow worse than some constant multiple of \(f(n)\). Confusingly, sometimes \(\mathcal{O}\) is used for both.

And just as big-O denotes an asymptotic upper bound, \(\Omega\) or “Big Omega” denotes an asymptotic lower bound– a lower bound on a procedure’s resource consumption in the “ideal” case (again, given sufficiently large input).

Here are some more examples to tease out the thinking behind the notations, taking into account time complexity:

\(\mathcal{O}(1)\)
an algorithm with a constant rate of growth. It will take the same amount of time to execute, regardless of the input. An example would be a procedure to access an array, or to check if a list is empty. This is a standard function in Scheme.
  (null? lst)
\(\mathcal{O}(n)\)
an algorithm with a linear rate of growth. The time it takes to execute will grow in proportion to the input. An example would be a procedure that checks if an element is in a list.
  (define (elem-in? elem lst)
    (cond ((null? lst) #f)
          ((equal? elem (car lst)) #t)
          (else (elem-in? elem (cdr lst)))))

This procedure, in the worst case, cdr’s down the whole list (to find that the element is not present).

\(\mathcal{O}(n^{2})\)
an algorithm running in quadratic (polynomial) time, a faster rate of growth than \(\mathcal{O}(n)\). This order is commonly found when procedures involve nested recursions or iterations. A good example is a bubble sort, implemented below with vectors:
    ;; vector-swap! comes from SRFI-43 rather than core Scheme,
    ;; so define it here to keep the example self-contained
    (define (vector-swap! vec i j)
      (let ((tmp (vector-ref vec i)))
        (vector-set! vec i (vector-ref vec j))
        (vector-set! vec j tmp)))

    (define (bubble-sort vec)
      (do ((passes (- (vector-length vec) 1)
                   (- passes 1)))
          ((= passes 0))
        (do ((index 0 (+ index 1)))
            ((= index (- (vector-length vec) 1)))
          (if (> (vector-ref vec index)
                 (vector-ref vec (+ index 1)))
              (vector-swap! vec index (+ index 1)))))
      vec)

Typically, a greater power of \(n\) suggests a greater number of nested iterations.

\(\mathcal{O}(2^{n})\)
an algorithm with an exponential rate of growth. It grows even faster than the above orders. The \(2\) given here can be any constant \(c > 1\). The naive recursive generation of Fibonacci numbers is a good example of a procedure with this rate of growth, for \(c=\varphi\); that is, the procedure has a time complexity of \(\mathcal{O}(\varphi^{n})\) (it is, in fact, also \(\Theta(\varphi^{n})\)). This can be intuited from examination of the Binet formula.

\(\mathcal{O}(n\log n)\)
an algorithm with time complexity of this order scales with its input only slightly worse than linearly (by a logarithmic factor). Such a procedure is said to run in linearithmic time; if there is some exponent \(k\) on the logarithm, it is said to run in quasilinear time. An example of this would be a merge sort (which also happens to be \(\Theta(n\log n)\)).
  (define (merge-sort lst)
    (cond ((null? lst) lst) 
          ((null? (cdr lst)) lst) 
          (else
           (merge (merge-sort (odd-part lst))
                  (merge-sort (even-part lst))))))

  (define (merge lst1 lst2)
    (cond ((null? lst1) lst2) 
          ((null? lst2) lst1) 
          ((> (car lst1) (car lst2))
           (cons (car lst2) (merge lst1 (cdr lst2)) ))
          (else (cons (car lst1) (merge (cdr lst1) lst2)))))

  ;; mutually recursive functions to divide up lists
  (define (odd-part lst)
    (if (null?  lst)
        '()
        (cons (car lst) (even-part (cdr lst)))))
  (define (even-part lst)
    (if (null? lst)
        '()
        (odd-part (cdr lst))))
\(\mathcal{O}(\log n)\)
an algorithm of this order has a logarithmic rate of growth. So, procedures of this order are faster in the worst case than procedures of \(\mathcal{O}(n)\) (linear). An example of this order would be a procedure for insertion into a binary search tree (assuming the tree stays reasonably balanced).
    (define (insert tree x)
      (if (empty-tree? tree)
          (make-tree x '() '())
          (cond ((> x (root tree))
                 (make-tree (root tree)
                            (left-subtree tree)
                            (insert (right-subtree tree) x)))
                ((< x (root tree))
                 (make-tree (root tree)
                            (insert (left-subtree tree) x)
                            (right-subtree tree)))
                (else tree))))  ;; x is already present; return the tree unchanged
  ;; where
  (define make-tree
    (lambda (root l-subtree r-subtree)
      (list root l-subtree r-subtree)))
  (define empty-tree? null?)
  (define root car)
  (define left-subtree cadr)
  (define right-subtree caddr)

Because they are simply asymptotic bounds, orders of growth only give a rough approximation of how we can expect a procedure to behave.

Exercise 1.14

We are asked to recall the change-counting function from earlier, and to determine its space and time complexity.

    (define (count-combos units amount)
      (cond ((< amount 0) 0)
            ((= amount 0) 1)
            ((null? units) 0)
            (else (+ (count-combos units (- amount (car units)))
                     (count-combos (cdr units) amount)))))

    (define ($-combos amount)
     (count-combos '(50 25 10 5 1) amount))

    ($-combos 100)

This procedure’s evaluation resembles a tree-structure (with the + operator as a root node and the two recursive function calls as children of the node). The memory required at any moment is proportional to the depth of the tree, since that is the longest chain of deferred operations the interpreter must keep track of. The deepest chain repeatedly subtracts the smallest denomination (1) from the amount, so the depth grows linearly with \(n\), and the space complexity of the procedure is \(\mathcal{O}(n)\).

Now we find the time complexity, i.e. the number of steps, which is proportional to the total number of calls to count-combos. For a list of 0 denominations, and an amount \(n\), the program ends without recursion, so it takes \(\mathcal{O}(1)\) steps. For a list of 1 denomination, and an amount \(n\), we have two recursions. One branch ends immediately (let this be the left branch), the other recurs again with \(n\) minus the denomination (let this be the right branch). This continues, forming a tree of nodes with terminated left branches and a chain of branches on the right that continues so long as the repeated subtractions of the one denomination do not exhaust the amount. Assuming this single denomination is 1, the smallest possible amount, we have \(n\) branchings, so the number of steps is \(\mathcal{O}(n)\). For denominations larger than 1, we have fewer than \(n\) branchings, so \(\mathcal{O}(n)\) still holds.

For a list of 2 denominations, the right branch is a tree that ignores the first denomination, and so takes \(\mathcal{O}(n)\) steps just as before, but we must also take into account the steps performed by the left branch. The left branch will, far on one side, continually subtract the first denomination only until the amount is exhausted, which is \(\mathcal{O}(n)\). So far, we have \(\mathcal{O}(2n)\), which is the same as \(\mathcal{O}(n)\). However, each node along that chain will have a right branch that runs down the second denomination, and all of those take \(\mathcal{O}(n)\) by the same reasoning as above. So then we have \(\mathcal{O}(n^{2})\) steps.

If we extend this reasoning, we come to see that the number of steps taken by this procedure grows as \(\mathcal{O}(n^c)\), where \(c\) is the number of denominations to run through– a polynomial time complexity. In other words, it’s highly inefficient (even though, as we saw, its space requirement only grows linearly with the amount).
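
As a rough empirical check, here is a sketch that instruments the procedure with a call counter (call-count, counted-combos, and steps-for are hypothetical helpers, not from the book):

  (define call-count 0)

  ;; same logic as count-combos, but increments a counter on every call
  (define (counted-combos units amount)
    (set! call-count (+ call-count 1))
    (cond ((< amount 0) 0)
          ((= amount 0) 1)
          ((null? units) 0)
          (else (+ (counted-combos units (- amount (car units)))
                   (counted-combos (cdr units) amount)))))

  (define (steps-for amount)
    (set! call-count 0)
    (counted-combos '(50 25 10 5 1) amount)
    call-count)

  (steps-for 100) ;; thousands of calls
  (steps-for 200) ;; grows much faster than linearly with the amount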

Exercise 1.15

This exercise concerns a procedure for numerical approximation. The sine of an angle can be approximated using the fact that \(\sin x \approx x\) for sufficiently small \(x\) (less than or equal to 0.1 radians), and the trigonometric identity \[\sin x = 3 \sin\frac{x}{3}-4\sin^{3}\frac{x}{3}\]

The book presents the following procedure:

  (define (cube x) (* x x x))
  (define (p x) (- (* 3 x) (* 4 (cube x))))
  (define (sine angle)
    (if (not (> (abs angle) 0.1))
        angle
        (p (sine (/ angle 3.0)))))

a. How many times is the procedure p applied when (sine 12.15) is evaluated?

  (sine 12.15)
  (p (sine 4.05))
  (p (p (sine 1.35)))
  (p (p (p (sine 0.45))))
  (p (p (p (p (sine 0.15)))))
  (p (p (p (p (p (sine 0.05))))))
  (p (p (p (p (p 0.05)))))
  ;; ...

p is applied 5 times.

b. What is the order of growth in space and number of steps (as a function of a) used by the process generated by the sine procedure when (sine a) is evaluated?

This procedure is linear recursive, so the space and the number of steps required are both proportional to the number of calls to sine. This is directly dependent on how many times the input a can be divided by 3 before it falls to 0.1 or below– the count increases by one each time a triples in value. This indicates that our order of growth in both space and steps is \(\Theta(\log_{3}a)\), which is the same as \(\Theta(\log a)\).
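
A quick empirical check (a sketch; it assumes cube and sine from the block above are loaded, and redefines p so it counts its own applications; p-count and p-applications are hypothetical helpers):

  (define p-count 0)
  (define (p x)
    (set! p-count (+ p-count 1))
    (- (* 3 x) (* 4 (cube x))))

  (define (p-applications angle)
    (set! p-count 0)
    (sine angle)
    p-count)

  (p-applications 12.15)       ;; => 5
  (p-applications (* 3 12.15)) ;; => 6, tripling the angle adds one application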

Fast Exponentiation

A naive recursive approach to exponentiation, such as

  (define (expt b n)
    (if (= n 0)
        1
        (* b (expt b (- n 1)))))

would, as a linear recursive process, require both \(\Theta(n)\) steps and \(\Theta(n)\) space.

An iterative version, such as

  (define (expt b n)
    (expt-iter b n 1))
  (define (expt-iter b counter product)
    (if (= counter 0)
        product
        (expt-iter b
                   (- counter 1)
                   (* b product))))

would require \(\Theta(n)\) steps but only \(\Theta(1)\) space.

But we can make a procedure even more efficient by making use of the equalities \[\begin{align}b^{n} &= (b^{n/2})^{2} &\text{if $n$ is even} \\ b^{n} &= b\cdot b^{n-1} &\text{if $n$ is odd} \end{align}\]

As we saw earlier with the approximation of \(\sin\), operating on a procedure’s input using such equalities (here, halving the exponent) can dramatically reduce the procedure’s order of growth, as sketched below.
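
A sketch of the resulting successive-squaring procedure (along the lines of SICP’s fast-expt):

  (define (square x) (* x x))

  (define (fast-expt b n)
    (cond ((= n 0) 1)
          ((even? n) (square (fast-expt b (/ n 2)))) ;; b^n = (b^(n/2))^2
          (else (* b (fast-expt b (- n 1))))))       ;; b^n = b * b^(n-1)

  (fast-expt 2 10) ;; => 1024

This requires only \(\Theta(\log n)\) steps and \(\Theta(\log n)\) space.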

GCD

Primality

1.3 Higher-Order Procedures

Procedures as Arguments

Lambda

General Methods

Returned Values

Data Abstraction

Modularity, Objects, and State

Metalinguistic Abstraction

Register Machines


  1. SICP 2nd ed., page 49