algorithms

bounds

  • upper bound: f(n) = O(g(n)) if f(n) grows at most as fast as g(n)
  • lower bound: f(n) = Ω(g(n)) if f(n) grows at least as fast as g(n)
  • tight bound: f(n) = Θ(g(n)) if f(n) grows exactly as fast as g(n)
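a quick worked example of a tight bound (the constants 3 and 8 are just illustrative):

```latex
f(n) = 3n^2 + 5n = \Theta(n^2), \text{ since for all } n \ge 1:\quad
3n^2 \;\le\; 3n^2 + 5n \;\le\; 3n^2 + 5n^2 = 8n^2,
```

so f(n) = O(n^2) and f(n) = Ω(n^2) simultaneously.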

sum of theta

  • if f(n) = Θ(g(n)) and h(n) = Θ(g(n)), then f(n) + h(n) = Θ(max(f(n), h(n))) = Θ(g(n))

product of theta

  • if f(n) = Θ(g1(n)) and h(n) = Θ(g2(n)), then f(n) · h(n) = Θ(g1(n) · g2(n))

constants in time complexity

  • if we have T1(n) = 1000000n = Θ(n) and T2(n) = 0.0001n^2 = Θ(n^2), then T1(n) is worse for all but really large n
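this can be checked directly; the crossover point is where 1000000n = 0.0001n^2 (the function names below are just for illustration):

```python
# hypothetical cost functions from the note above
def T1(n):
    return 1_000_000 * n

def T2(n):
    return 0.0001 * n ** 2

# T1 = T2 when 1_000_000 * n = 0.0001 * n**2, i.e. n = 1_000_000 / 0.0001
crossover = 1_000_000 / 0.0001  # 1e10: below this T1 is worse, above it T2 is worse
```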

arrays (python)

  • accessing an element given its address: O(1)

  • to resize array, new chunk of memory needs to be found; old values copied over

  • np.append takes O(n), since the whole array is copied (better to pre-allocate)
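a pure-Python sketch of why copy-per-append is slow (numpy itself not required; these helper names are made up for illustration):

```python
def append_by_copy(arr, value):
    # mimics np.append: allocate a new array one element larger and
    # copy everything over -- O(n) per call
    new = [None] * (len(arr) + 1)
    new[:len(arr)] = arr
    new[-1] = value
    return new

def fill_by_copy(n):
    # n appends, each copying the whole array: O(n^2) total work
    arr = []
    for i in range(n):
        arr = append_by_copy(arr, i)
    return arr

def fill_preallocated(n):
    # allocate once up front, then each write is O(1): O(n) total work
    arr = [None] * n
    for i in range(n):
        arr[i] = i
    return arr
```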

linked list

  • retrieve k-th elem: O(k) if you don't know its address, O(1) if you do
  • append/pop: O(1)
  • insert/remove k-th elem: O(k) if you don't know its address, O(1) if you do
  • no pre-allocation or resizing needed
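a minimal singly linked list sketch showing the O(1) append (via a tail pointer) and the O(k) walk to the k-th element (class and method names are assumptions):

```python
class _Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        # O(1): we keep the tail's address, so no traversal is needed
        node = _Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node

    def get(self, k):
        # O(k): we don't know the k-th node's address, so walk k links
        node = self.head
        for _ in range(k):
            node = node.next
        return node.value
```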

dynamic array

  • allocate memory for underlying array
    • there is physical size and logical size
    • when the logical size reaches the physical size, allocate a new underlying array whose physical size is γ (growth factor) times the previous
    • normally γ = 2 (doubling)
    • the append that finds the physical size full takes O(n); every other append takes O(1)
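a minimal sketch of the scheme above (β and γ defaults are illustrative):

```python
class DynamicArray:
    # `physical` is the allocated capacity, `logical` the number of
    # elements actually in use
    def __init__(self, beta=4, gamma=2):
        self.gamma = gamma
        self.physical = beta
        self.logical = 0
        self.data = [None] * beta

    def append(self, value):
        if self.logical == self.physical:
            # full: allocate a gamma-times-larger array, copy old values -- O(n)
            self.physical *= self.gamma
            new_data = [None] * self.physical
            new_data[: self.logical] = self.data
            self.data = new_data
        # usual case: write into the next free slot -- O(1)
        self.data[self.logical] = value
        self.logical += 1
```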

amortized analysis

  • spreading cost of something over time

  • in a sequence of n appends, how many caused physical size to grow?

    • for simplicity: assume n is such that the n-th append causes the physical size to grow
    • let β be the initial physical size and x the number of resizes
    • n = βγ^x
    • x = log_γ(n/β)
    • number of allocations is log_γ(n/β) + 1 (the resizes plus the initial allocation)
  • time for the j-th resize: O(cβγ^(j-1)), where c is a constant that depends on γ

  • total time: cβ ∑_{j=1}^{x} γ^(j-1) = Θ(cβ(γ^x − 1)/(γ − 1)) = Θ(c(n − β)/(γ − 1)) = Θ(n), since c, β, and γ are constants

  • amortized time per append: Θ(n)/n = Θ(1)
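the analysis can be checked empirically by counting copies over a sequence of appends (a simulation sketch, not real allocation):

```python
def total_copy_cost(n, beta=1, gamma=2):
    # simulate n appends and count how many elements get copied during
    # resizes; beta is the initial physical size, gamma the growth factor
    physical, logical, copies = beta, 0, 0
    for _ in range(n):
        if logical == physical:
            copies += logical        # old values copied to the new array
            physical *= gamma
        logical += 1
    return copies
```

with γ = 2 the total copy work stays below 2n, so the amortized cost per append is constant.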

binary heaps

  • binary tree, normally used to implement priority queues
  • full if every node has either 0 or 2 children
  • perfect if all internal nodes have 2 children and all leaves are at the same level
  • complete if all levels are filled except possibly the last, which is filled from left to right
  • height of a node is the largest number of edges from that node down to a leaf
  • height of a tree is the height of its root
  • height of a complete binary tree with n nodes is Θ(log n)

binary max heap (resp. min heap) is a binary tree such that:

  • each node has a key

  • shape: tree is complete

  • max heap: key of node is greater than or equal to keys of children

  • min heap: key of node is less than or equal to keys of children

  • .max takes O(1)

  • .insert takes O(log n)

  • .delete takes O(log n)

  • .pop_max_key takes O(log n)

def _push_down(self, i):
    # restore the max-heap property by sinking the key at index i:
    # repeatedly swap it with its largest child
    left = left_child(i)
    right = right_child(i)
    if (
        left < len(self.keys)
        and
        self.keys[left] > self.keys[i]
    ):
        largest = left
    else:
        largest = i

    if (
        right < len(self.keys)
        and
        self.keys[right] > self.keys[largest]
    ):
        largest = right

    if largest != i:
        # a child is larger: swap and keep sinking
        self._swap(i, largest)
        self._push_down(largest)
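a minimal array-based max heap built around the same _push_down, plus the complementary _push_up used by .insert (the helper names mirror the snippet above but are otherwise assumptions):

```python
def left_child(i):
    return 2 * i + 1

def right_child(i):
    return 2 * i + 2

def parent(i):
    return (i - 1) // 2

class MaxHeap:
    def __init__(self):
        self.keys = []

    def _swap(self, i, j):
        self.keys[i], self.keys[j] = self.keys[j], self.keys[i]

    def _push_up(self, i):
        # float the key at i up while it exceeds its parent -- O(log n)
        while i > 0 and self.keys[i] > self.keys[parent(i)]:
            self._swap(i, parent(i))
            i = parent(i)

    def _push_down(self, i):
        # sink the key at i below its larger child -- O(log n)
        largest = i
        left, right = left_child(i), right_child(i)
        if left < len(self.keys) and self.keys[left] > self.keys[largest]:
            largest = left
        if right < len(self.keys) and self.keys[right] > self.keys[largest]:
            largest = right
        if largest != i:
            self._swap(i, largest)
            self._push_down(largest)

    def max(self):
        return self.keys[0]               # O(1)

    def insert(self, key):
        self.keys.append(key)             # O(log n)
        self._push_up(len(self.keys) - 1)

    def pop_max_key(self):
        # move the last key to the root, shrink, restore the heap -- O(log n)
        self._swap(0, len(self.keys) - 1)
        top = self.keys.pop()
        if self.keys:
            self._push_down(0)
        return top
```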

online median

  • given a stream of numbers one at a time, find the median of all numbers seen so far
  • keep a max heap of the first half and a min heap of the second half
  • the two heaps may become unbalanced in size, so we need to rebalance after each insert
def _rebalance(self):
    # keep the halves within one element of each other: max_heap may
    # hold at most one more element than min_heap; moving a root across
    # is one pop plus one insert, both O(log n)
    if len(self.max_heap) > len(self.min_heap) + 1:
        self.min_heap.insert(self.max_heap.pop_max_key())
    elif len(self.min_heap) > len(self.max_heap):
        self.max_heap.insert(self.min_heap.pop_min_key())
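a runnable sketch of the whole scheme using Python's heapq (heapq only provides min heaps, so the max heap stores negated values -- a standard trick, not the course's code):

```python
import heapq

class OnlineMedian:
    def __init__(self):
        self.max_heap = []   # smaller half, stored as negated values
        self.min_heap = []   # larger half

    def insert(self, x):
        # send x to the half it belongs to, then fix the sizes
        if not self.max_heap or x <= -self.max_heap[0]:
            heapq.heappush(self.max_heap, -x)
        else:
            heapq.heappush(self.min_heap, x)
        self._rebalance()

    def _rebalance(self):
        # max_heap may hold at most one more element than min_heap
        if len(self.max_heap) > len(self.min_heap) + 1:
            heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))
        elif len(self.min_heap) > len(self.max_heap):
            heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))

    def median(self):
        if len(self.max_heap) > len(self.min_heap):
            return -self.max_heap[0]
        return (-self.max_heap[0] + self.min_heap[0]) / 2
```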

dynamic set

  • similar to a dictionary
  • store n elements in a hash table
  • initial cost: O(n)
  • insert: O(1) expected
  • search: O(1) expected

hashing

  • colliding keys go in the same bin, stored there in a linked list (separate chaining)
  • a good hash function should utilize all bins evenly
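a separate-chaining sketch (Python lists stand in for the per-bin linked lists; the class and its fixed bin count are assumptions):

```python
class ChainedHashSet:
    def __init__(self, n_bins=8):
        self.bins = [[] for _ in range(n_bins)]

    def _bin(self, key):
        # a good hash function spreads keys evenly across the bins
        return hash(key) % len(self.bins)

    def insert(self, key):
        chain = self.bins[self._bin(key)]
        if key not in chain:          # scan only this bin's chain
            chain.append(key)

    def contains(self, key):
        # only one bin's chain is scanned: O(1) expected while chains
        # stay short
        return key in self.bins[self._bin(key)]
```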

binary search tree

  • an alternative way to implement dynamic sets

  • slower insertion/deletion and queries than a hash table

  • preserves data in sorted order

  • BST is binary tree that satisfies following for any node x:

    • left subtree of x contains only keys less than x
    • right subtree of x contains only keys greater than x
    • left and right subtrees must also be binary search trees
  • Each element stored as a node at an arbitrary address in memory
  • Each node has a key and pointers to left child, right child, and parent nodes (if they exist).

(diagram 1: tree representation)

      12
     /  \
    6    33

(diagram 2: memory representation with pointers)

An array-like structure representing memory slots.

[ , l, 12, r, p, , l, 33, r, p, , , l, 6, r, p, , ]

  • pointers are shown with arrows:

    • the 'l' (left child) pointer for node 12 points to node 6.
    • the 'r' (right child) pointer for node 12 points to node 33.
    • the 'p' (parent) pointer for node 6 points to node 12.
    • the 'p' (parent) pointer for node 33 points to node 12.
    • other pointers (like children of 6 and 33, and parent of 12) might be null or point elsewhere depending on the full structure.
  • in-order traversal: O(n) - visits left subtree, root, right subtree; prints keys in ascending order

  • pre-order traversal: O(n) - visits root, left subtree, right subtree

  • post-order traversal: O(n) - visits left subtree, right subtree, then root

  • deleting a node with no children (a leaf) is easy; a node with one child is also easy

  • otherwise, rotate the node down toward the bottom (each rotation preserves the BST property); once it is a leaf or has one child, delete it

  • right rotation: (figure)

  • left rotation: (figure)

  • a rotation takes O(1)
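a minimal sketch tying the pieces above together: BST insertion, in-order traversal, and a right rotation (all names are illustrative, not the course's code):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(root, key):
    # walk down as in a search and attach the new node where it ends
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def in_order(root):
    # left subtree, node, right subtree -- keys come out ascending, O(n)
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)

def rotate_right(y):
    # y's left child x becomes the subtree root; x's right subtree moves
    # under y -- O(1) pointer updates, and the BST order is preserved
    x = y.left
    y.left = x.right
    x.right = y
    return x
```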

range queries in BST

  • ceiling of x is the smallest key greater than or equal to x

  • successor of node u is the smallest key greater than u's key

  • strategy: find the ceiling of a (where [a, b] is the range), then repeatedly take successors until the key exceeds b
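a sketch of this strategy; it assumes integer keys so that the successor of k can be found as ceiling(k + 1) (for general keys a proper successor routine would be needed):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def ceiling(root, x):
    # smallest key >= x, or None if every key is < x -- O(h)
    best = None
    node = root
    while node is not None:
        if node.key == x:
            return x
        if node.key < x:
            node = node.right
        else:
            best = node.key      # candidate; try to find a smaller one
            node = node.left
    return best

def range_query(root, a, b):
    # find ceiling(a), then hop successor to successor until > b
    out = []
    key = ceiling(root, a)
    while key is not None and key <= b:
        out.append(key)
        key = ceiling(root, key + 1)   # successor, for integer keys
    return out
```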

BST : in general vs balanced vs unbalanced

  • query, insertion/deletion: O(h) in general, O(log n) if balanced, O(n) if unbalanced

balanced BSTs

  • AVL trees
  • red-black trees
    • BST whose nodes are colored red and black
    • leaf nodes are "nil"
    • must satisfy:
      • the root node is black
      • every leaf ("nil") node is black
      • if a node is red, both of its children are black
      • for any node, all paths from that node down to a leaf contain the same number of black nodes
    • with n internal (non-nil) nodes, the height is at most 2 log(n + 1)
    • non-modifying operations: O(log n)
    • expected height of a BST built by inserting n random keys: O(log n)
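a small validator for the coloring properties above (it checks colors and black-heights only, not key order; the tuple encoding is an assumption):

```python
# nodes are (color, key, left, right) tuples; None stands for a black
# "nil" leaf
def black_height(node):
    # returns the black-height if the subtree satisfies the red-black
    # properties, otherwise raises ValueError
    if node is None:
        return 1                     # nil leaves count as black
    color, _key, left, right = node
    if color == "red":
        for child in (left, right):
            if child is not None and child[0] == "red":
                raise ValueError("red node with a red child")
    hl, hr = black_height(left), black_height(right)
    if hl != hr:
        raise ValueError("paths with different black counts")
    return hl + (1 if color == "black" else 0)

def is_red_black(root):
    if root is not None and root[0] != "black":
        return False                 # the root must be black
    try:
        black_height(root)
        return True
    except ValueError:
        return False
```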

treaps

  • binary tree in which each node has both key and priority

    • max heap with respect to priority
    • BST with respect to key
  • insertion:

    • using the key, find the place to insert the node as if in a plain BST
    • while the node's priority is greater than its parent's, rotate it up: a right rotation if it is a left child, a left rotation if it is a right child
    • repeat until the heap property is satisfied
  • deletion:

    • while the node is not a leaf, rotate it with its child of highest priority
    • once it is a leaf, delete it
  • if all keys and priorities are unique, the treap is unique.

randomized binary search trees

  • when inserting a node into the treap, generate a random priority for it
  • runtimes are O(log n) in expectation but can be O(n) in the worst case (very rare)
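a sketch of treap insertion with random priorities (names and structure are assumptions; rotations happen as the recursion unwinds, bubbling the new node up while the heap property is violated):

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()   # generated at insertion time
        self.left = None
        self.right = None

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def insert(root, key):
    # BST insert by key, then rotate up while the max-heap property on
    # priorities is violated
    if root is None:
        return TreapNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
        if root.left.priority > root.priority:
            root = rotate_right(root)
    elif key > root.key:
        root.right = insert(root.right, key)
        if root.right.priority > root.priority:
            root = rotate_left(root)
    return root

def in_order(root):
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)

def is_heap(root):
    # every node's priority must be >= its children's priorities
    if root is None:
        return True
    for child in (root.left, root.right):
        if child is not None and child.priority > root.priority:
            return False
    return is_heap(root.left) and is_heap(root.right)
```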

citation: dsc190 advanced algos @ UCSD