Algorithms: Recurrence Relations, Solving Recurrences, Order of Growth, Hashing Notes
Disclaimer: Most of the content below is cited from other sources. Any uncited material is included for knowledge purposes only; I do not earn any money from this article.
Related Story: Sorting Algorithms- Properties/Pros/Cons/Comparisons
Growth Relation
[ <<< symbol means "grows much, much slower than" ]
Constant <<< Poly-logarithmic <<< Polynomial <<< Exponential (c^n) <<< Factorial (n!) <<< n^n
Example:
1 < log(n) < n^(1/2) < n (i.e., n¹) < n·log(n) ~ log(n!) < n² < n³ <<< 2^n < 3^n <<< n! <<< n^n
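A quick, non-rigorous sanity check of this ordering in Python (my own snippet, not from any cited source; it evaluates each function at a single point n = 20, and keep in mind that the ordering is only guaranteed asymptotically):

```python
import math

# Evaluate each growth function at n = 20 and print them sorted by value.
n = 20
growth = {
    "1":       1,
    "log n":   math.log2(n),
    "sqrt n":  math.sqrt(n),
    "n":       n,
    "log(n!)": math.log2(math.factorial(n)),
    "n log n": n * math.log2(n),
    "n^2":     n**2,
    "n^3":     n**3,
    "2^n":     2**n,
    "3^n":     3**n,
    "n!":      math.factorial(n),
    "n^n":     n**n,
}
for name, value in sorted(growth.items(), key=lambda kv: kv[1]):
    print(f"{name:>8} = {value:.3e}")
```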
Recurrence Equation/ Recurrence/ Recurrence Relation
- A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.
- When an algorithm contains a recursive call to itself, we can often describe its running time by a recurrence equation, which describes the overall running time on a problem of size n in terms of the running time on smaller inputs.
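For example, merge sort splits a size-n problem into two size-n/2 subproblems and merges the results in linear time, which gives the classic recurrence
T(n) = 2·T(n/2) + Θ(n), with T(1) = Θ(1), whose solution is T(n) = Θ(n log n).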
Solving Recurrences:
Recursion Tree Method
- The recursion-tree method converts the recurrence into a tree whose nodes represent the costs incurred at various levels of the recursion.
- We use techniques for bounding summations to solve the recurrence.
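Example (recursion tree for T(n) = 2·T(n/2) + c·n): the root costs c·n; level 1 has 2 nodes each costing c·n/2, again c·n in total; in fact every level costs c·n, and there are about log₂(n) + 1 levels, so the total cost is c·n·log₂(n) + c·n = Θ(n log n).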
Master Method/Theorem
The master theorem for DIVIDE-AND-CONQUER RECURRENCES provides an asymptotic bound for any recurrence that falls under one of its three cases.
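In the standard (CLRS-style) statement, for T(n) = a·T(n/b) + f(n) with a ≥ 1 and b > 1, compare f(n) against n^(log_b a):
- Case 1: if f(n) = O(n^(log_b a − ε)) for some ε > 0, then T(n) = Θ(n^(log_b a))
- Case 2: if f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n)
- Case 3: if f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and a·f(n/b) ≤ c·f(n) for some c < 1 and all sufficiently large n, then T(n) = Θ(f(n))
Merge sort, with a = 2, b = 2, f(n) = Θ(n), has log_b a = 1, so Case 2 gives Θ(n log n).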
Substitution Method
- It is used to establish either upper or lower bounds on a recurrence
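Worked example (the standard textbook one): to show T(n) = 2·T(⌊n/2⌋) + n is O(n log n), guess T(n) ≤ c·n·log₂(n) and substitute the guess into the recurrence:
T(n) ≤ 2·(c·(n/2)·log₂(n/2)) + n = c·n·log₂(n) − c·n + n ≤ c·n·log₂(n), which holds for any c ≥ 1, so induction confirms the guess.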
Substitution Method vs Master Method vs Recurrence equation
Similarity
- All 3 are used for analyzing divide-and-conquer algorithms
Differences
- Recurrence equation: It describes the overall running time on a problem of size n in terms of the running time on smaller inputs.
- In the substitution method, we guess a bound and then use mathematical induction to prove our guess correct.
- The master method provides bounds for recurrences of the form
T(n) = a·T(n/b) + f(n)
Why choose Iteration over Recursion?
• Recursive calls can cause a stack overflow, a problem iteration avoids
• Each recursive call consumes stack space for a new stack frame, unlike iteration
• Each recursive call incurs function-call overhead (setting up and tearing down the frame), which costs extra CPU time (see the sketch after this list)
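A minimal Python illustration of the stack-depth problem (function names are my own; CPython's default recursion limit is roughly 1000 frames):

```python
def sum_recursive(n):
    """Sum 1..n recursively: every call adds a new stack frame."""
    if n == 0:
        return 0
    return n + sum_recursive(n - 1)

def sum_iterative(n):
    """Sum 1..n iteratively: constant stack space."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

print(sum_iterative(100_000))      # fine: 5000050000
try:
    print(sum_recursive(100_000))  # exceeds the default recursion limit
except RecursionError:
    print("RecursionError: too many stack frames")
```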
Order of Growth / Rate of Growth
We usually consider one algorithm to be more efficient than another if its worst-case running time has a lower order of growth
Example
A Θ(n^1.8) algorithm [lower order of growth] is more efficient than a Θ(n²) algorithm
- The same amount of work takes only about n^1.8 steps for algorithm 1, whereas algorithm 2 requires n² steps
- For large enough inputs, the Θ(n^1.8) algorithm will run more quickly in the worst case than the Θ(n²) algorithm; at n = 1,000,000, for instance, n^1.8 ≈ 6.3 × 10¹⁰ versus n² = 10¹², roughly 16× more work
Lower order of growth means faster for large inputs
Analyzing algorithms
Asymptotic Notation:
Input sizes are large enough to make the order of growth of the running time relevant
Θ Notation (Asymptotically Tight Bound) [often associated with average-case analysis]
For a given function g(n), we denote by Θ(g(n)) the set of functions
Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂, and n₀ such that 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀ }
The “average case” is often roughly as bad as the worst case.
f(n) ∈ Θ(g(n)) means that g(n) is an asymptotically tight bound for f(n)
An asymptotically positive function is one that is positive for all sufficiently large n (likewise, asymptotically nonnegative: nonnegative for all sufficiently large n)
Best Case / Lower Bound [ Ω (≥) (Asymptotic Lower Bound) ; ω (>) ]
f(n) lies on or above c·g(n) on the graph for all n ≥ n₀
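Mirroring the Θ definition above, the standard set definition is:
Ω(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀ }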
Worst Case [ O (≤) (Asymptotic Upper Bound) ; o (<) ]
- The worst-case running time of an algorithm gives us an upper bound on the running time for any input
f(n) lies on or below c·g(n) on the graph for all n ≥ n₀
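The corresponding set definition for the upper bound:
O(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀ }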
Average-case vs Best-case vs Worst-case [these are properties of the input, not of the notation]
- Worst-case run time means that you are feeding the worst possible input (of that size) into your algorithm
- Best-case run time means that you are feeding the best possible input (of that size) into your algorithm
Big O vs Big Omega vs Big Theta
These refer to ways of bounding complicated functions by simpler ones
Properties of Asymptotic Notations
- Reflexivity: f(n) = O(f(n)); likewise f(n) = Ω(f(n)) and f(n) = Θ(f(n))
- Symmetry: f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))
- Transitivity: f(n) = O(g(n)) and g(n) = O(h(n)) ⇒ f(n) = O(h(n))
- Transpose symmetry: f(n) = O(g(n)) if and only if g(n) = Ω(f(n))
General Recurrence Relation for Divide and Conquer based Problems
T(n) = a·T(n/b) + f(n) ⇒ used in the Master Method/Theorem (a = number of subproblems, n/b = subproblem size, f(n) = cost of dividing and combining)
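As a concrete divide-and-conquer instance of this recurrence (a sketch of textbook merge sort, with a = 2, b = 2, f(n) = Θ(n)):

```python
def merge_sort(a):
    """Divide and conquer: T(n) = 2*T(n/2) + Theta(n) => Theta(n log n)."""
    if len(a) <= 1:              # base case: already sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])   # a = 2 recursive calls ...
    right = merge_sort(a[mid:])  # ... on subproblems of size n/b = n/2
    # f(n) = Theta(n): merge the two sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```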
Dynamic Programming
- A recursive implementation of a dynamic-programming algorithm without memoization recomputes the same subproblems over and over, so in real-world applications an iterative (bottom-up) implementation, or a recursive one with memoization, is generally the better choice; tail-call optimization can reduce recursion overhead, but many runtimes (e.g., CPython) do not perform it. See the Fibonacci sketch below.
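A minimal sketch contrasting the three styles on Fibonacci (a standard illustration of my own, not from the cited sources):

```python
from functools import lru_cache

def fib_naive(n):
    """Plain recursion, no memoization: recomputes subproblems, exponential time."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Top-down DP (memoized recursion): each subproblem solved once, O(n) time."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_iter(n):
    """Bottom-up DP: O(n) time, O(1) space, no recursion at all."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_memo(90), fib_iter(90))  # both instant; fib_naive(90) would be infeasible
```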
Longest Common Sub-sequence
Sub-sequence
A sub-sequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements
For example, the length of the longest common subsequence (LCS) of “ADBCD” and “DAXBC” is 3 (e.g., “ABC” or “DBC”).
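The standard O(m·n) bottom-up dynamic-programming solution (a sketch; the function name is my own):

```python
def lcs_length(x, y):
    """Classic DP table: dp[i][j] = LCS length of prefixes x[:i] and y[:j]."""
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # extend a common subsequence
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])  # drop one character
    return dp[m][n]

print(lcs_length("ADBCD", "DAXBC"))  # 3
```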
Hashing
Load factor α = n / m, where n is the number of stored keys and m is the number of slots in the table
Hash Tables/ Functions/ Hashing
Hash Function
- Given two different hash functions h1 and h2, it would be better to use h1 (rather than h2) in practice if h1 leads to fewer collisions than h2 for every universe of keys, since searching is faster with fewer collisions
- Suppose we use some hash function h and chaining to implement a hash table, and we INSERT two items (x first, followed by y) into an empty table where h(x.key) = h(y.key). After we insert y, the table still holds a reference to x: with chaining, y is simply added to the linked list in that slot rather than overwriting it (see the sketch after this list)
- For any deterministic algorithm that takes a permutation as an input, let T (n) be its worst-case running time (on input size n) and S(n) be its average-case running time (assuming that each of the n! permutations is equally likely). Then, for large enough n, we always have S(n) = O(T (n)) but S(n) = o(T(n)) may or may not hold.
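A minimal chaining sketch (class and method names are my own, not a production design) showing that both colliding items remain reachable:

```python
class ChainingHashTable:
    """Collision resolution by chaining: each slot holds a list of (key, value) pairs."""
    def __init__(self, m=8):
        self.slots = [[] for _ in range(m)]

    def insert(self, key, value):
        chain = self.slots[hash(key) % len(self.slots)]
        chain.append((key, value))   # appended to the chain; nothing is overwritten

    def search(self, key):
        chain = self.slots[hash(key) % len(self.slots)]
        for k, v in chain:
            if k == key:
                return v
        return None

t = ChainingHashTable(m=1)   # m = 1 forces every key to collide
t.insert("x", 1)
t.insert("y", 2)
print(t.search("x"), t.search("y"))  # 1 2 -- x is still reachable after inserting y
```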
Good Hash Functions:
- Easy to compute: the hash function should not be a heavyweight algorithm in its own right, or its overhead will not be worth it
- Even distribution: keys should be spread uniformly across the table's slots
- Few collisions
Hashing by division and hashing by multiplication VS Universal Hashing
- Hashing by division and hashing by multiplication are heuristic in nature, whereas universal hashing uses randomization to provide provably good performance.
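Sketches of the two heuristic methods (my own snippet; the constant A = (√5 − 1)/2 for the multiplication method follows Knuth's common textbook suggestion):

```python
import math

def hash_division(k, m):
    """Division method: h(k) = k mod m.
    m is best chosen as a prime not too close to an exact power of 2."""
    return k % m

def hash_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    """Multiplication method: h(k) = floor(m * (k*A mod 1)),
    i.e., keep the fractional part of k*A and scale by m."""
    return int(m * ((k * A) % 1))

m = 13
for k in (123456, 123457, 123458):
    print(k, hash_division(k, m), hash_multiplication(k, m))
```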