Amortized Analysis
Amortized Algorithms
Bart Massey
Amortized Analysis
Many algorithms just map a single input to a single output
- This makes it easy to compute the average cost of n invocations
- Usually get tight bounds on worst-case performance
For algorithms that involve a sequence of operations on a data structure, things are more complicated
- Cost per operation may vary
- Assuming worst-case for each operation can be loose
Solution: compute the total cost of a worst-case sequence of operations and divide by the length of the sequence to get an "amortized cost"
- Not "average case performance": "average performance in worst case"
Running Textbook Example: Stack
Stack with unit-cost operations "push" and "pop", and "multi-pop" with cost min(n,k)
multi-pop(stack s, k elements):
    r <- empty list
    n <- size of stack s
    while k > 0 and n > 0
        r .= pop(s)
        k <- k - 1
        n <- n - 1
    return r
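The pseudocode above can be sketched in Python roughly as follows (class and method names are my own, not from the slides):

```python
class Stack:
    """Stack with unit-cost push/pop and a multi-pop of actual cost min(n, k)."""

    def __init__(self):
        self.items = []

    def push(self, v):
        self.items.append(v)

    def pop(self):
        return self.items.pop()

    def size(self):
        return len(self.items)

    def multi_pop(self, k):
        """Pop up to k elements; does min(size, k) unit-cost pops."""
        r = []
        while k > 0 and self.items:
            r.append(self.pop())
            k -= 1
        return r
```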
Running Textbook Example: Counter
Binary counter with variable-cost "increment" and "decrement"
increment(counter c):
    i <- 0
    while bit i of c is 1
        set bit i of c to 0
        i <- i + 1
    set bit i of c to 1

decrement(counter c):
    i <- 0
    while bit i of c is 0
        set bit i of c to 1
        i <- i + 1
    set bit i of c to 0

  100111      101000
 +     1     -     1
  ------      ------
  101000      100111
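A minimal Python model of the counter, with bit 0 as the least-significant bit (the representation as a list of bits is my own choice; a fixed width is assumed, so incrementing an all-ones counter would overflow):

```python
class Counter:
    """Binary counter stored as a list of bits, index 0 least significant."""

    def __init__(self, nbits):
        self.bits = [0] * nbits

    def increment(self):
        i = 0
        while self.bits[i] == 1:   # flip trailing 1s to 0 (the variable cost)
            self.bits[i] = 0
            i += 1
        self.bits[i] = 1

    def decrement(self):
        i = 0
        while self.bits[i] == 0:   # flip trailing 0s to 1 (the variable cost)
            self.bits[i] = 1
            i += 1
        self.bits[i] = 0

    def value(self):
        return sum(b << i for i, b in enumerate(self.bits))
```

For example, incrementing 100111 (39) flips three trailing 1s and sets the next bit, giving 101000 (40), matching the worked addition above.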
Three Approaches
Simple averaging: total up the sequence and divide.
Debits and credits: charge some operations more than their cost, and some less, but with the total provably the same as for averaging.
Potential function: charge the data structure itself, and compute the average cost of change in data structure plus op.
Stack: Aggregate Method
Note that a worst-case sequence of ops can never be worse than loading and then popping off the entire stack.
Assume k pushes followed by multi-pop: 2k total cost, k + 1 ops.
Amortized cost/op = 2k/(k+1) = O(1)
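The aggregate bound can be checked by direct simulation, counting each unit-cost action over the worst-case sequence (the helper below is my own illustration):

```python
def worst_case_amortized(k):
    """Simulate k pushes followed by one multi-pop of everything;
    return total actual cost divided by the number of operations."""
    stack, cost = [], 0
    for i in range(k):
        stack.append(i)
        cost += 1              # each push: unit cost
    while stack:
        stack.pop()
        cost += 1              # each element popped by multi-pop: unit cost
    return cost / (k + 1)      # k pushes + 1 multi-pop = k + 1 operations
```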
Stack: Debits and Credits
For every push: charge 2. For every pop: charge 0. For every multi-pop: charge 0.
Since the actual cost of a push is 1, 1 credit is built into every push. Since the actual cost of a pop is 1, 1 debit is accrued for a pop: this is taken from the credit accrued by the corresponding push.
Similarly, since the actual cost of a multi-pop of min(n, k) elements is min(n, k), this much debit is accrued, and taken from the min(n, k) pushes used to create this portion of the stack.
Since all charges are O(1), and the math balances (credits never go negative), the sequence must cost O(1) per op.
Stack: Potential Function
Let P(stack of size n) = n
As before, push costs 1 and increases P by 1: c + dP = 2
Pop costs 1 and decreases P by 1: c + dP = 0
Multi-pop k costs min(k, n) and decreases P by min(k, n): c + dP = 0
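The three potential-function equations above can be verified mechanically; the following sketch (my own) computes c + dP for each operation with P(stack) = size of stack:

```python
def check_potential():
    """Check amortized cost c + dP for push, pop, and multi-pop,
    where the potential P of a stack of size n is n."""
    stack = []
    phi = lambda: len(stack)

    # Push: actual cost 1, P increases by 1 -> amortized cost 2.
    before = phi()
    stack.append(42)
    assert 1 + (phi() - before) == 2

    # Pop: actual cost 1, P decreases by 1 -> amortized cost 0.
    before = phi()
    stack.pop()
    assert 1 + (phi() - before) == 0

    # Multi-pop k: actual cost min(k, n), P decreases by min(k, n)
    # -> amortized cost 0.
    stack.extend(range(5))
    k, before, cost = 3, phi(), 0
    while k > 0 and stack:
        stack.pop()
        cost += 1
        k -= 1
    assert cost + (phi() - before) == 0
    return True
```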
Eager and Lazy Algorithms
Most algorithms are eager: each operation does all the work it can to maintain invariants and return the result.
There is sometimes benefit to lazy algorithms, which do the minimum needed to obtain the result and delay the rest of the work until later.
For example: Heapsort has both eager and lazy elements. Heapsort eagerly builds a heap at cost O(n) and then does lazy extraction later at cost O(n lg n).
Lazy algorithms are often suited for online use: process things as they come in. For example: heap as priority queue for processing stream of tasks.
Linked-List Queues
Consider implementing a queue (FIFO) using linked lists:
- Operations are enqueue, dequeue, isEmpty
- Advantages: Minimal memory overhead, easy to grow arbitrarily
Two bad-looking implementation choices:
- Singly-linked list: dequeue on a queue of depth n requires O(n) ops
- Doubly-linked list: complex, requires extra memory
But there's a better amortized plan ...
Lazy Linked-List Queues
Data structure: two lists, which we will call q.en and q.de
Enqueue operation puts its operand on the front of q.en
enqueue(q, v):
    insert-first(q.en, v)
Dequeue operation gets its result from the front of q.de but, if empty, reverses q.en onto q.de first
dequeue(q):
    if q.de is empty
        while q.en is not empty
            v <- extract-first(q.en)
            insert-first(q.de, v)
    if q.de is not empty
        return extract-first(q.de)
    fail
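A Python sketch of the two-list queue (here plain Python lists stand in for the linked lists, with the end of each Python list playing the role of the list front, so append/pop model insert-first/extract-first):

```python
class Queue:
    """FIFO queue built from two stacks: q.en receives new elements,
    q.de serves dequeues; q.en is reversed onto q.de only when needed."""

    def __init__(self):
        self.en = []
        self.de = []

    def enqueue(self, v):
        self.en.append(v)          # insert-first(q.en, v)

    def dequeue(self):
        if not self.de:
            while self.en:         # reverse q.en onto q.de
                self.de.append(self.en.pop())
        if self.de:
            return self.de.pop()   # extract-first(q.de)
        raise IndexError("dequeue from empty queue")

    def is_empty(self):
        return not self.en and not self.de
```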
Amortized Cost of Lazy Linked-List Queue
Method of credits: enqueue cost = 3, dequeue cost = 1
- Each trip through the dequeue while loop must have been paid for by the enqueue that put the element there
Method of potentials: potential = 2 * size(q.en)
- The cost of the dequeue while loop is paid by decrease in potential
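The credit accounting can be checked by running an operation sequence and comparing actual unit costs against the charges (3 per enqueue, 1 per dequeue); the instrumented function below is my own illustration:

```python
def amortized_check(ops):
    """Run a sequence of 'e' (enqueue) / 'd' (dequeue) ops, counting every
    list insert or extract as one unit of actual cost.  Returns
    (actual_cost, total_charged); the credit argument says actual <= charged."""
    en, de = [], []
    cost = charged = item = 0
    for op in ops:
        if op == 'e':
            en.append(item)
            item += 1
            cost += 1           # one insert-first
            charged += 3        # enqueue is charged 3
        else:
            if not de:
                while en:
                    de.append(en.pop())
                    cost += 2   # one extract plus one insert per moved element
            de.pop()
            cost += 1           # one extract-first
            charged += 1        # dequeue is charged 1
    return cost, charged
```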
O(1) Array Initialization (see HW)
Allocate a large, zero-filled array. Clearly O(n), since writing all those zeros.
- If you only end up reading and writing a few elements (e.g. sparse hash table) O(n) initialization dominates.
Better idea: lazy initialization
- Keep a stack of referenced values
- Make the array be an array of uninitialized pointers
- When a value is referenced
- push onto the stack
- set the array pointer to point at the value
- set a pointer from the value back into the array (to check for false references)
Reference and store ops are still constant time, so O(1) "initialization"
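A Python model of the scheme (my own sketch; in Python the backing arrays are of course really initialized, so this only models the pointer-validation idea, whose payoff comes in a language like C where allocated memory starts as garbage):

```python
class LazyArray:
    """Array with O(1) 'initialization': a cell is trusted only if it
    points into the stack of touched entries and that entry points back."""

    def __init__(self, n, default=0):
        self.ptr = [0] * n     # stands in for uninitialized garbage
        self.stack = []        # (index, value) pairs for touched cells
        self.default = default

    def _touched(self, i):
        # Valid only if ptr[i] lands in the stack AND the back-pointer
        # matches; a garbage ptr[i] fails one of these checks.
        p = self.ptr[i]
        return 0 <= p < len(self.stack) and self.stack[p][0] == i

    def read(self, i):
        if self._touched(i):
            return self.stack[self.ptr[i]][1]
        return self.default    # never written: report the default value

    def write(self, i, v):
        if self._touched(i):
            self.stack[self.ptr[i]] = (i, v)
        else:
            self.ptr[i] = len(self.stack)
            self.stack.append((i, v))
```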