Wednesday 26 November 2014

Week 11: Harder Time Complexity Problems

           There is no class this week, so for fun we go over a few harder time complexity problems.  Consider the sequences $f(n) = n^4 - 165n^3 - 240n^2 - 265n - 373$ and  $g(n) = 157n^3 + 257n^2 + 337n + 457$.

Problem 1: Show that $f \notin O(g)$ without using any limit definitions.

Problem 2:  Show that $f \in \Omega (g)$ without using any limit definitions.

Problem 1 Tips:


            The definition that we have for $f \in O(g)$ is:

$\exists c \in R^+, \exists B \in N, ( \forall n \in N, n \geq B \implies f(n) \leq cg(n))$.

Thus, its negation is:

$\forall c \in R^+, \forall B \in N, ( \exists n \in N, n \geq B \wedge f(n) > cg(n))$.

We must choose $n$.  Using the recommended strategies of my instructors, we get:

$n^4 - 165n^3 - 240n^2 - 265n - 373 \geq$

$n^4 - 165n^3 - 240n^3 - 265n^3 - 373n^3 = n^4 - 1043n^3$ (for $n \geq 1$)

$???$

$c(1208n^3) = c(157n^3 + 257n^3 + 337n^3 + 457n^3) \geq$

$c(157n^3 + 257n^2 + 337n + 457)$ (for $n \geq 1$).

The $???$ part marks the step that the strategies of my instructors do not cover.  So I think hard here (like, really hard) and I come up with this: all we must do is find an $n$ that satisfies the inequality $n^4 - 1043n^3 > c(1208n^3)$.  We have:

$n^4 - 1043n^3 > c(1208n^3)$

Move $-1043n^3$ from the left side to the right and factor out $n^3$ from the resulting right side:

$n^4 > n^3(1208c + 1043)$

It looks like we will choose $n > 0$.  That way, we can divide both sides by $n^3$:
$n > 1208c + 1043$.

Our choice of $n$ is then $\max(\lceil 1208c + 1043 \rceil + 1, B)$, which also guarantees $n \geq B$.
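
Just to convince myself, here is a quick Python spot check of this choice of $n$ for a few sample values of $c$ (my own addition, not part of the proof):

    # Spot-check the choice n = max(ceil(1208*c + 1043) + 1, B) for a few c.
    import math

    def f(n):
        return n**4 - 165*n**3 - 240*n**2 - 265*n - 373

    def g(n):
        return 157*n**3 + 257*n**2 + 337*n + 457

    B = 0  # the argument works for any B, thanks to the max()
    for c in [0.5, 1, 10, 100]:
        n = max(math.ceil(1208 * c + 1043) + 1, B)
        assert f(n) > c * g(n)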

Problem 2 Tips:


            Likewise, we show that $f \in \Omega (g)$, which is defined as:

$\exists c \in R^+, \exists B \in N, ( \forall n \in N, n \geq B \implies f(n) \geq cg(n))$.

We must choose $c$ and $B$.  Reusing the bounds from problem 1 ($f(n) \geq n^4 - 1043n^3$ and $1208n^3 \geq g(n)$ for $n \geq 1$), we get an inequality similar to the one in problem 1:

$n^4 - 1043n^3 \geq 1208n^3$

Again, we choose $B > 0$ so that we can divide both sides by $n^3$.  Then add $1043$ to both sides to yield:

$n \geq 2251$.

Of course, the choice $B = 2251$ goes together with $c = 1$: a different $c$ would scale the right-hand side of the first inequality and change the bound on $n$.
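
As one more spot check (again my own addition), a few lines of Python confirming the inequality on an initial stretch of values starting at $B = 2251$:

    # Spot-check c = 1, B = 2251 on an initial range of n.
    def f(n):
        return n**4 - 165*n**3 - 240*n**2 - 265*n - 373

    def g(n):
        return 157*n**3 + 257*n**2 + 337*n + 457

    assert all(f(n) >= g(n) for n in range(2251, 5000))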

Tuesday 18 November 2014

Week 10: Computability and Confusion

            Computability is the first topic in CSC165 that is new to me.  It is both interesting and somewhat confusing.  Unfortunately, any confusion is undesirable at this point in the course.  After thinking about my confusion, I realize that the halting problem is the culprit.  Why is this?

Elementary Definitions and Theorems:


            First, the stuff that makes sense.  Consider a set of inputs $I$ and a set of outputs $O$.  Computability begins with three relatively simple definitions and a single theorem.

Definition 1:  Let $f: I \to O$ be a function.  Then $f$ is well-defined $\iff$ for each $x \in I$, there is a unique $y \in O$ such that $f(x) = y$.


For remaining definitions and theorems, let $g, h: I \to O$ be well-defined functions.

Definition 2:  Then $g$ is computable $\iff$ there is a program $P$ such that $\forall x \in I, g(x) = P(x)$.

Definition 3:  Then $g$ reduces to $h \iff g$ can be implemented with $h$.  Below is what $g, h$ look like in Python code if $g$ reduces to $h$.  Let $P$ be the implementation of $g$ and $Q$ be the implementation of $h$.

    def P(x):
        """An implementation of the function g."""
        # Use Q, an implementation of h, as a subroutine.
        return Q(x)

Theorem 1:  Let $g$ reduce to $h$.  Then $h$ is computable $\implies g$ is computable.
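
As a toy illustration of Definition 3 and Theorem 1 (my own example, not from lecture): doubling reduces to addition, so the computability of addition gives the computability of doubling.

    # double (g(x) = 2x) reduces to addition (h(x, y) = x + y), so an
    # implementation of addition yields an implementation of doubling.
    def add_impl(x, y):
        return x + y

    def double(x):
        return add_impl(x, x)

    assert double(21) == 42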

The Confusing Part:


            Now, the stuff that is not so clear.  The function halt(f, i) tells us whether a function $f$ halts on an input $i$.  The halting function is not computable.  The proof of this statement starts simply enough: assume to the contrary that there is an implementation of halt(f, i) that works for all functions and inputs.  Then consider the function what_is_love(f):

    def what_is_love(f):
        if halt(f, f):
            while True:
                print("Look to the skies.")
        else:
            print("Nobody knows.")

The “while True” statement is the incomprehensible part of the proof.  I ask in lecture why it is there.  The instructor lets a classmate several rows ahead of me answer my question.  The explanation takes nearly half a minute.  I do not get any of it.  Apparently, neither the instructor nor my other classmates have any problem with the explanation.

The Comprehending Part:


            It takes several days for me to understand why the “while True” statement is there.  Eventually, I look at what we are proving.  We assume that there is an implementation of halt(f, i) that works for all functions and inputs.  To disprove this, we only have to exhibit one function and one input on which halt(f, i) gives the wrong answer.  This is a classic existential proof.  We are free to construct the function what_is_love(f) however we want, so long as it disproves the computability of halt(f, i).

            The “while True” statement guarantees that what_is_love(what_is_love) goes into an infinite loop if halt(what_is_love, what_is_love) returns True.  Likewise, what_is_love(what_is_love) halts if halt(what_is_love, what_is_love) returns False.  This is the contradiction and it disproves the computability of halt(f, i).  I wonder why that classmate who answered my question did not explain it like this....

Wednesday 12 November 2014

Week 9: Problem Solving Episode: The Stockbroker's Algorithm

Context


            So you graduate from UofT and land your dream job.  You work a few years and realize there is this thing called inflation.  So you get a brokerage firm to invest your savings.

            To guide its investment decisions, the firm uses the Stockbroker’s algorithm, which inputs a list of real numbers and outputs the max sum over a slice of the list. 

            Unfortunately, the firm uses the following implementation of the algorithm, which is called John Smith’s Sum as nobody knows the name of the employee who wrote it:

    def john_smith_sum(L: list) -> int:
        """
        Return the maximum sum over a slice of L in cubic time.
        """
        max = front_index = 0
        n = len(L)

        # Try every slice L[front_index:end_index] and recompute its sum
        # from scratch.
        while front_index < n:
            end_index = front_index

            while end_index <= n:
                middle_index = front_index
                sum = 0

                while middle_index < end_index:
                    sum += L[middle_index]
                    middle_index += 1

                if sum > max:
                    max = sum

                end_index += 1

            front_index += 1

        return max

The runtime of john_smith_sum is cubic.  This means that it is agonizingly slow for large inputs and that you will miss out on great investments.  So you decide to do the following:

Main Problem:  Come up with an implementation of the Stockbroker’s algorithm with a linear runtime.

You will then give it to the firm because you are a very generous person.  First things first; how will you come up with an implementation?

Step 1:  Understand the Problem


            Polya’s How to Solve It recommends understanding the problem first.  We have to meet two goals: our implementation should be correct and linear in its runtime.  Let us answer the following to see if we know what these goals mean:

            (1) What does an iterative algorithm with a linear runtime look like?
            (2) What is the correct output of the Stockbroker’s algorithm for different inputs?

            For (1), a single loop and a monotone loop index should suffice.  For (2), work out examples:  consider the list $[3, 1, 4]$.  The max sum is $8$, so that is what our implementation should return.  How about $[1, -2, 3, -4, 5, -6]$?  Each negative number drags the running sum below zero, so the max sum is just $5$ here.  What about $[3, -1, 4, -1, 5]$?  We may be tempted to look only at positive numbers, but adding $-1$ after $3$ and after $4$ still keeps the running sum positive.  Hence, the max sum is $10$.
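
As a quick check on these hand computations, here is a brute-force snippet of my own that enumerates every slice (it is not the algorithm we are about to design):

    # Brute force: try every slice L[i:j] and take the largest sum.
    def max_slice_sum(L):
        return max(sum(L[i:j]) for i in range(len(L) + 1)
                   for j in range(i, len(L) + 1))

    assert max_slice_sum([3, 1, 4]) == 8
    assert max_slice_sum([1, -2, 3, -4, 5, -6]) == 5
    assert max_slice_sum([3, -1, 4, -1, 5]) == 10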

Step 2: Devise a Plan


 “No plan survives contact with the enemy” – The Desert Fox

            There are a couple of approaches to this problem.  One is to take an arbitrary list of real numbers as the input and write a single implementation for it directly.  However, this is a little hard, as we have to juggle negative and non-negative numbers simultaneously.

            A second approach is to implement by cases, one for each kind of list worked out in (2), and then combine them into a single implementation.  However, we are lazy and we do not want to write multiple unrelated implementations.

            A third approach: write a single implementation that works for a simple input and then extend it to all inputs.  This may be the best approach, as there appear to be only a few cases.  Hopefully, this assumption is correct.  This approach, then, is our plan.

Step 3: Carry out the Plan


            This is the part where we execute the plan, so try not to lose your head.  Of the three lists in (2), the easiest one to write an implementation for is $[3, 1, 4]$, as all the elements are non-negative.  By dedicating a variable to store the max sum, we get version one of our implementation:

My Sum Version 1:

    def my_sum(L: list) -> int:
        """
        Return the maximum sum over a slice of L in linear time.
        """
        n = len(L)
        sum = max = index = 0

        while index < n:
            element = L[index]
            sum += element

            if max < sum:
                max = sum

            index += 1

        return max

            This version features a single loop with a monotonically increasing loop index, so we satisfy the linear runtime requirement.  Unfortunately, this version only works if all the elements are non-negative.  It falters for lists like $[1, -2, 3, -4, 5, -6]$.  So we break each iteration into two cases: one for negative elements and one for non-negative elements.  This leads to version two of our implementation:
                                                                                     
My Sum Version 2:

    def my_sum(L: list) -> int:
        """
        Return the maximum sum over a slice of L in linear time.
        """
        n = len(L)
        sum = max = index = 0

        while index < n:
            element = L[index]

            if element >= 0:
                sum += element
            else:
                sum = 0

            if max < sum:
                max = sum

            index += 1

        return max

            This version resets the sum whenever it iterates over a negative number.  The linear runtime is still preserved.  However, consider the following list: $[3, -1, 4, -1, 5]$.  The element $-1$ is not enough to drag the sum below zero, so we should reset the sum only when it drops below zero.  This also lets us eliminate some code: the negative and non-negative checks, and the temporary variable element that held L[index]:

My Sum Version 3:

    def my_sum(L: list) -> int:
        """
        Return the maximum sum over a slice of L in linear time.
        """
        n = len(L)
        sum = max = index = 0

        while index < n:
            sum += L[index]

            if sum < 0:
                sum = 0

            if max < sum:
                max = sum

            index += 1

        return max

Step 4: Look Back


            Version three is our final version, but our obsessive tendencies force us to double-check its correctness and runtime.  Consider the test code below.  It checks that my_sum and john_smith_sum have matching outputs for each input.

            Verify this by running the code from the link in Python.  It contains the code for john_smith_sum, the final version of my_sum, and the test cases:


Test Cases:

    if __name__ == "__main__":
        L = [[3, 1, 4], [3, -1, 4, 1], [3, -1, 4, -1, 5], [3, 1, -4, 1, -5, -9],
             [-3, -1, -4], [1, -2, 3, -4, 5, -6], [3, -5, 7, 1, -2, 0, 3, -2, 1]]

        for x in L:
            assert john_smith_sum(x) == my_sum(x)

            As for the runtime, the loop executes $7$ steps per iteration in the worst case, which is $7n$ steps in total.  Add the steps outside the loop plus the final loop-guard check and we have $7n + 4$ as the runtime, which is linear.  Hence, we have a credible implementation of the Stockbroker's algorithm.
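
As an extra sanity check of my own (not part of the course material), here is a rough timing comparison, with john_smith_sum and my_sum from above in scope; the exact numbers vary by machine, but the gap between cubic and linear is already visible at a few hundred elements:

    # Rough empirical comparison of john_smith_sum (cubic) and my_sum (linear).
    import random
    import time

    data = [random.randint(-10, 10) for _ in range(300)]

    start = time.perf_counter()
    slow_answer = john_smith_sum(data)
    slow_time = time.perf_counter() - start

    start = time.perf_counter()
    fast_answer = my_sum(data)
    fast_time = time.perf_counter() - start

    assert slow_answer == fast_answer
    print("cubic: {:.3f}s, linear: {:.6f}s".format(slow_time, fast_time))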

Tuesday 4 November 2014

Week 8: Time Complexity

            The post in Week 7 talks about algorithms being linear, quadratic or cubic.  Such descriptions are neither precise nor fancy enough to keep computer scientists satisfied.  So we have this thing called the time complexity of an algorithm, which describes how the runtime grows as the input size grows.

Here Comes the Notation!


            Time complexity is described in big-O notation.  It looks something like this.  Let $f: N \to R^{\geq 0}$ be a sequence that describes the runtime of an algorithm $P$ and $g: N \to R^{\geq 0}$ be a sequence that describes an arbitrary runtime (e.g. linear, quadratic, cubic).  If we say that $f \in O(g)$, then we mean that:

\[\exists c \in R^+, \exists B \in N, (\forall n \in N, n \geq B \implies f(n) \leq cg(n))\]

Intuitively, what these little symbols say is that once the input is large enough, the runtime of $P$ grows no faster than $g$, up to a constant factor.  Likewise, if we say that $f \in \Omega (g)$, then we mean that:

\[\exists c \in R^+, \exists B \in N, (\forall n \in N, n \geq B \implies f(n) \geq cg(n))\]

Again, this is just a fancy way of saying that once the input is large enough, the runtime of $P$ grows at least as fast as $g$, up to a constant factor.
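
For a concrete instance (my own example, not from lecture): take $f(n) = 3n + 2$ and $g(n) = n$.  Then $f \in O(g)$ with the choices $c = 4$ and $B = 2$, since for every $n \geq 2$ we have

\[3n + 2 \leq 3n + n = 4n = cg(n).\]

Likewise, $f \in \Omega(g)$ with $c = 1$ and $B = 0$, since $3n + 2 \geq n = cg(n)$ for every $n$.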

A Fun Problem


            Let $f: N \to R^{\geq 0}$ be a polynomial sequence of degree $m$ and $g: N \to R^{\geq 0}$ be a polynomial sequence of degree $n$ with $m, n \in R$.  Find all $m, n$ such that $f \in O(g)$ and $(f \times g) \in \Omega(f)$.