Algorithms and Complexity Analysis

E.g.: the problem of sorting

Input: sequence (a1, a2, …, an) of numbers.

Output: permutation (a’1, a’2, …, a’n) such
that a’1 <= a’2 <= … <= a’n.

///ONE SOLUTION: INSERTION SORT///

#include <vector>

void insertion_sort(std::vector<int> &A)
{
    for (int i = 1; i < (int)A.size(); i++)
    {
        int value = A[i];               // element being inserted
        int j = i - 1;
        // Shift elements larger than value one position to the right.
        while (j >= 0 && A[j] > value)
        {
            A[j+1] = A[j];
            j--;
        }
        A[j+1] = value;                 // drop value into the gap
    }
}
  • Running Time

The running time depends on the input: an already sorted sequence is easier to sort.

Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.

Generally, we seek upper bounds on the running time, because everybody likes a guarantee.

  • Kinds of Analysis

Worst-case: (usually)

T(n) = maximum time of algorithm on any input of size n.

Average-case: (sometimes)

T(n) = expected time of algorithm over all inputs of size n.

Need assumption of statistical distribution of inputs.

Best-case: (bogus)

Cheat with a slow algorithm that works fast on some input.

  • Machine-Independent Time

What is insertion sort’s worst-case time?

It depends on the speed of our computer:

relative speed (on the same machine),

absolute speed (on different machines).

BIG IDEA:

Ignore machine-dependent constants.

Look at growth of T(n) as n → ∞ .

  • Θ-notation

MATH: Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 <= c1·g(n) <= f(n) <= c2·g(n) for all n >= n0 }

ENGINEERING: Drop low-order terms; ignore leading constants. Example: 3n^3 + 90n^2 - 5n + 6046 = Θ(n^3)

Back to the analysis of insertion sort:
• Worst case: input reverse sorted.
T(n) = Σ(j=2…n) Θ(j) = Θ(n^2)

Compare with merge sort:

MERGE-SORT A[1 . . n]

  1. If n = 1, done.

  2. Recursively sort A[ 1, …, ⌊n/2⌋ ] and A[ ⌊n/2⌋+1, …, n ].

  3. “Merge” the 2 sorted lists.

#include <vector>

// Merge two sorted vectors into one sorted vector.
std::vector<int> merge(const std::vector<int> &A1, const std::vector<int> &A2)
{
    std::vector<int> out;
    std::size_t i = 0, j = 0;
    while (i < A1.size() && j < A2.size())
        out.push_back(A1[i] <= A2[j] ? A1[i++] : A2[j++]);
    while (i < A1.size()) out.push_back(A1[i++]);
    while (j < A2.size()) out.push_back(A2[j++]);
    return out;
}

void merge_sort(std::vector<int> &A)
{
    if (A.size() <= 1) return;                            // base case
    std::vector<int> A1(A.begin(), A.begin() + A.size()/2);
    std::vector<int> A2(A.begin() + A.size()/2, A.end());
    merge_sort(A1);                                       // sort each half
    merge_sort(A2);
    A = merge(A1, A2);                                    // combine
}

The complexity of merge sort is

if n = 1: T(n) = Θ(1)

if n > 1: T(n) = 2T(n/2) + Θ(n)

The recursion tree has lg n levels, and the total merging cost at each level is Θ(n), so the total cost is Θ(n lg n).

  • O-Notation (Upper Bounds)

We write f(n) = O(g(n)) if there exist constants c > 0, n0 > 0 such that 0 <= f(n) <= cg(n) for all n >= n0.

O(g(n)) = { f(n) : there exist constants c > 0, n0 > 0 such that 0 <= f(n) <= cg(n) for all n >= n0 }

[O(g(n)) denotes a whole class of functions; writing f(n) = O(g(n)) says that f belongs to that class, i.e. its growth is bounded above by a constant multiple of g(n).]

  • Macro Substitution

Convention: A set in a formula represents an anonymous function in the set.

    EXAMPLE 1:
        f(n) = n^3 + O(n^2)
        MEANS
        f(n) = n^3 + h(n) for some h(n) ∈ O(n^2)
    EXAMPLE 2:
        n^2 + O(n) = O(n^2)
        MEANS
        for any f(n) ∈ O(n),
            n^2 + f(n) = h(n) for some h(n) ∈ O(n^2)
  • Ω-Notation (Lower Bounds)

Ω(g(n)) = { f(n) : there exist constants c > 0, n0 > 0 such that 0 <= c·g(n) <= f(n) for all n >= n0 }

  • Boundary Conditions

  • Common Cases