Algorithm Analysis
Algorithm Analysis measure the speed of the task algorithm suppose to accomplished. Here the necessary maths back ground to analyse the algorithmic complexity.
CONTENTS
- 12 -fold way
- Order of the Growth
- Algorithm Analysis of Sorting
- Algorithm Analysis of Searching
- Appendix
There are two basic rules in the Algorithm analysis:
Rule of sum: if an action is performed making A choices or B choices, then it can be performed \(A+B\) ways.
Rule of product: If an action can be performed by making A choices followed by B choices, then it can be performed \(A\times B\) ways.
Permutation: an ordered list where every object appears exactly once
\[
_{n}P_{r} = \frac {n!} {(n - r)!} \text{ where } !0 = 1
\]
combinations: When order is not important and the repetition is not allowed, the number of ways to choose k from the distinct n is as for \(n \ge k \ge 0\):
\[
{n \choose k }= \frac {n!}{ k!(n-k)! }
\]
in general, choosing k
out of n
is same as not choose n-k
out of n
.
\[
{n \choose k} = {n \choose n - k}
\]
According to the Pascal's Triangle:
\[
{n \choose k} = {n -1 \choose k} + {n -1 \choose k-1}
\]
Again from the pascal triangle:
\[
{n \choose 0}+{n \choose 1}+ {n \choose 2}+ ... + {n \choose n} = 2^{n}\\
\sum^n_{k=0}{n \choose k} = 2^{n}
\]
ordered | repeating | ||
---|---|---|---|
sequence | \(n^k\) | yes | yes |
Permutation | \(_{n}P_{r}\) | yes | no |
Multisubset | \(\left(\!\!{n\choose k}\!\!\right)\) | no | yes |
combination | \(_{n}C_{k}\) or \(n \choose k\) | no | no |
It is sufficent to implement \(n!\).: the programming construct is recursion because \( n !=n(n-1)!\).
12 -fold way
This is the mathematical basis for the Algorithm Analysis.
Here D - distinct and I - identical. In the following table, if \(x \gt b\) then all the values of the at most are zeros. if \(b \gt x\), then all the values of the at least are zeros.
A | B | any | at most | at least |
---|---|---|---|---|
D | D | \(b^{a}\) | \(_{b}P_{a}\) | \(S(a,b)!b\) |
I | D | \(\binom {a+b-1} {a}\) | \(_{b}C_{a}\) | \(x - 1 \choose b-1\) |
D | I | \(\sum ^{b}_{k=1} S(a,k)\) | 1 if \(a \le b\) else 0 | \(S(a,b)\) |
I | I | \(\sum ^{b}_{k=1} p_{k}(a)\) | 1 if \(a \le b\) else 0 | \(p_{b}(a)\) |
In the above table, Stirling numbers of the second kind \(S(a,b)\) count the number of ways to partition a set of a
elements into b
nonempty subsets. Stirling numbers of the first kind \(s(a,b)\) count permutations of a
objects with exactly b cycles.
In the above table, \(p_{b}(a)\) is the number of ways to partition the number a
as the sum of b
positive numbers since the order don't matter. For \(1 \lt k \lt n\):
\[
p_{k}(n) =p_{k-1}(n-1) + p_{k}(n-k)
\]
Multi-choosing \(\left(\!\!{n\choose k}\!\!\right)\) is the number of ways to choose k
objects from a set of n
objects where order is not important but repetition is allowed.
\[
\left(\!\!{n\choose k}\!\!\right)={{n+k-1}\choose{k}}
\]
Simplification,
\[
\sum^m_{k=0}{\left(\!\!{n\choose k}\!\!\right)} = {\left(\!\!{n+1\choose m}\!\!\right)}
\]
Multinomial theorem where \(a, b,..,c > 0 \text{ and } a+b+ ... +c = n\)
\[
(x+y+...+z)^n = \sum {n \choose a,b,..c}x^ay^b...z^c
\]
Principle of Inclusion-Exclusion
\[
\left\vert{A\cup B \cup C}\right\vert = \left\vert{A}\right\vert + \left\vert{B}\right\vert + \left\vert{C}\right\vert - \left\vert{A \cap B}\right\vert - \left\vert{A \cap C}\right\vert - \left\vert{B \cap C}\right\vert \\+ \left\vert{A \cap B \cap C}\right\vert
\]
Formula for the sterling number:
\[
S(a,b) = \frac {1}{b!} \sum ^{b}_{j=0} (-1) ^j{y\choose j} (4-j)^a
\]
Java
There are two choices for the intermediate represetiation
- portable machine language
- graph based
Initially Java was created only with byte code interpreter. But current Java has Just In Time (JIT) compiler.
There are two popular representations:
- Stack based - 0 operand
- Register based - 3 operand
Java uses Stack based representation.
If you consider the for single loop in Java
for (int i =0; i < N; i++){
...
}
The running time cost of the above loop is \(N\) means linear. The running time cost of the double loop is quadratic. Triple loop running time is cubic.The worst running time can be something like \(2^{N}\) exponential.
Order of the Growth
In the Algorithm Analysis, fortunately there are limited models to consider. Here the growth from best running time to worst:
- \(\log N\)
- \(N\) (Knuth shuffle)
- \(N \log N\) (Mergesort, QuickSort)
- \(N^{2}\) (selection =\(N^{2}/2\), insertion = 1/4 \(N^{2}\) )
- \(N^{3}\)
- \(2^{N}\)
Each instance of the java.util.ArrayList has the capacity, when reach to the capacity, the array need to be increased. Assume, each time ArrayList is double when it reach to the capacity then the equation is
\[
2+4+8+16 +...+N\sim 3N
\]
This is because ArrayList is using java array of Object as a implementation. Advantages are that every operation takes constant time of Amortised time( Average running time per operation over the worst case sequence of operations) and less wasted space compared to linked list implementation because linked list is based on object.
Algorithm Analysis of Sorting
There java.lang.Comparable<T>
and java.util.Comparator<T>
is based on the total order which is a binary relation that satisfy:
- Antisymmetry: if \(v \leq w\) and \(w \leq v\) , then \(v = w\)
- Transitivity: if \(v \leq w\) and \(w \leq x\) , then \(v \leq x\)
- Totality: either \(v \leq w\) or \(w \leq v\) or both
Quicksort is little bit faster than Mergesort because Quicksort doesn't exchange the elements always. However, quick sort wort case running time is quadratic (\(N^{2}\)) and average case is \(1.39 N \log N\). Quciksort random shuffle is the probabilistic guarantee to avoid worst case.
The java.utils.Arrays sort() method is using Quicksort for the primitives and Mergesort for the objects.
Dijkstart 3-way partitioning is the way to compromise with the duplicate keys because lower bound is reduced linearithmic to linear for most of the applications as follows:
\[
\lg\left( \dfrac {N!}{x_{1}!x_{2}!\ldots x_{n}!}\right) \sim \sum ^{n}_{i=1}x_{i}\lg\dfrac {x_{i}}{N}
\]
Algorithm | Worst | Average | Best | Remarks |
---|---|---|---|---|
selection | \(N^{2}/2\) | \(N^{2}/2\) | \(N^{2}/2\) | N exchanges |
insertions | \(N^{2}/2\) | \(N^{2}/4\) | \(N\) | small N or partially |
shell | ? | ? | N | tight code |
merge | \(N \lg N\) | \(N \lg N\) | \(N \lg N\) | \(N \lg N\) guaranteed |
quick | \(N^{2}/2\) | \(2N \lg N\) | \(N \lg N\) | \(N \lg N\) probabilistic guarantee |
3-way quick | \(N^{2}/2\) | \(2N \lg N\) | \(N\) | support duplicate keys |
Heapsort | \(2N \lg N\) | \(2N \lg N\) | \(N \lg N\) | In-place algo. Inner loop is longer than quicksort. Poor use of cache memory. Not stable. |
Algorithm Analysis of Searching
Binary Search Tree (BST) is a binary tree(BT) in symmetric order. BT can be either empty or two disjoint binary trees at left and right(left/right nodes can be null).
In the BST every node has a key:
- Larger than all keys in its left subtree
- smaller than all keys in its right subtree
For the N number of distinct values in random order, the number of comparisons are \(2 \ln N\).
Red-Black trees
The java.util.TreeMap
is based on the Red-black tree. Here some of the characteristics of the Red-Black tree:
- Represents 2-3 tree as
- use internal left-leaning link to glue three nodes which is red link
- no node has two red links
- every path from root to null link there are same number of black links.
Appendix
quadratic
\[
1+2+\ldots +N\Rightarrow \sum ^{n}_{i=1}i\sim \int ^{n}_{x=1}xdx\sim \dfrac {1}{2}N^{2}
\]
logarithmic ( \(N \ln N\) is called linearithmic)
\[
1+1/2+1/3+\ldots +1/N\Rightarrow \sum ^{.N}_{i=1}\dfrac {1}{i}\sim \int ^{N}_{x=1}\dfrac {1}{x}dx=\ln N
\]
cubic ( \(N^{2}\) is called quadratic)
\[
\sum ^{N}_{i=1}\sum ^{N}_{j=i}\sum ^{N}_{k=j}1\sim \int ^{N}_{x=1}\int ^{N}_{y=x}\int ^{N}_{z=y}dzdydx\sim
\dfrac {1}{6}N^{3}
\]
Comments
Post a Comment
commented your blog