# Amortized Algorithms

**Bart Massey**

### Amortized Analysis

Many algorithms just map a single input to a single output

- This makes it easy to compute the average cost of *n* invocations
- Usually get tight bounds on worst-case performance

For algorithms that involve a sequence of operations on a data structure, things are more complicated

- Cost per operation may vary
- Assuming worst-case for each operation can be loose

Solution: compute the total cost of a worst-case sequence of operations and divide by the length of the sequence to get an "amortized cost"

*Not* "average-case performance": "average performance in the worst case"

### Running Textbook Example: Stack

Stack with unit-cost operations "push" and "pop", and "multi-pop" with cost *min(n, k)*:

```
multi-pop(stack s, k elements):
    r <- empty list
    n <- size of stack s
    while k > 0 and n > 0
        r .= pop(s)
        k <- k - 1
        n <- n - 1
    return r
```
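The pseudocode above can be sketched in Python; the `Stack` class and its `cost` counter are our own names, added to make the unit-cost model explicit:

```python
# A minimal sketch of a stack with multi-pop. The cost field counts
# unit-cost push/pop operations, matching the slides' cost model.

class Stack:
    def __init__(self):
        self.items = []
        self.cost = 0          # running count of unit-cost operations

    def push(self, v):
        self.items.append(v)
        self.cost += 1

    def pop(self):
        self.cost += 1
        return self.items.pop()

    def multipop(self, k):
        # Pops min(k, n) elements, paying one unit of cost for each.
        r = []
        while k > 0 and self.items:
            r.append(self.pop())
            k -= 1
        return r
```

Note that `multipop(k)` on a stack of depth *n* stops after *min(n, k)* pops, so its cost is exactly *min(n, k)*.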

### Running Textbook Example: Counter

Binary counter with variable-cost "increment" and "decrement"

```
increment(counter c):
    i <- 0
    while bit i of c is 1
        set bit i of c to 0
        i <- i + 1
    set bit i of c to 1

decrement(counter c):
    i <- 0
    while bit i of c is 0
        set bit i of c to 1
        i <- i + 1
    set bit i of c to 0
```

```
  100111      101000
+      1    -      1
--------    --------
  101000      100111
```
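A direct Python sketch of the counter, representing it as a list of bits with the least-significant bit first (the representation is our choice, not the slides'):

```python
# Binary counter as a list of bits, least-significant bit first.

def increment(c):
    """Flip trailing 1s to 0, then set the first 0 bit to 1."""
    i = 0
    while i < len(c) and c[i] == 1:
        c[i] = 0
        i += 1
    if i == len(c):
        c.append(1)    # counter grew by one bit
    else:
        c[i] = 1

def decrement(c):
    """Flip trailing 0s to 1, then clear the first 1 bit to 0."""
    i = 0
    while i < len(c) and c[i] == 0:
        c[i] = 1
        i += 1
    if i < len(c):
        c[i] = 0
```

The worked example from the slide: incrementing 100111 flips three trailing 1s and one 0, giving 101000; decrementing undoes exactly that work.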

### Three Approaches

Simple averaging: total up the sequence and divide.

Debits and credits: charge some operations more than their cost, and some less, but with the total provably the same as for averaging.

Potential function: charge the data structure itself, and compute the amortized cost of an operation as its actual cost plus the change in the data structure's potential.

### Stack: Aggregate Method

Note that a worst-case sequence of ops can never be worse than loading and then popping off the entire stack.

Assume *k* pushes followed by one multi-pop: total cost *2k*, over *k + 1* ops.

Amortized cost/op = *2k / (k + 1)* = *O(1)*
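The arithmetic above is easy to check mechanically; this small script (our own, under the slides' cost model) totals the worst-case sequence and confirms the amortized cost stays below 2:

```python
# Aggregate method check: k unit-cost pushes followed by one multi-pop
# of the whole stack. Cost model follows the slides.

def sequence_cost(k):
    pushes = k                # k pushes at cost 1 each
    multipop = min(k, k)      # multi-pop of k on a depth-k stack: min(n, k) = k
    return pushes + multipop  # total cost 2k over k + 1 operations

for k in (1, 10, 1000):
    total = sequence_cost(k)
    ops = k + 1
    assert total == 2 * k
    assert total / ops < 2    # amortized cost 2k/(k+1) < 2, i.e. O(1)
```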

### Stack: Debits and Credits

For every push: charge 2. For every pop: charge 0. For every multi-pop: charge 0.

Since the actual cost of a push is 1, 1 credit is built into every push. Since the actual cost of a pop is 1, 1 debit is accrued for a pop: this is taken from the credit accrued by the corresponding push.

Similarly, since the actual cost of a multi-pop of *min(n, k)* elements is *min(n, k)*, this much debit is accrued, and taken from the *min(n, k)* pushes used to create this portion of the stack.

Since all operations are charged *O(1)*, and the math balances, the sequence must be *O(1)* per op.
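The key invariant of the credit argument is that the bank of stored credit never goes negative. A small simulation (our own) of random pushes and multi-pops under the slides' charges checks this:

```python
# Accounting-method check: charge 2 per push, 0 per pop/multi-pop.
# The bank of stored credit must never go negative, and the total
# charged must cover the total actual cost.
import random

bank = 0      # stored credit (one credit per element on the stack)
actual = 0    # actual cost incurred so far
charged = 0   # amortized cost charged so far
size = 0      # current stack depth

random.seed(1)
for _ in range(1000):
    if random.random() < 0.5:
        size += 1; actual += 1; charged += 2; bank += 1  # push leaves 1 credit
    else:
        k = random.randint(1, 5)
        take = min(k, size)              # multi-pop costs min(k, n)
        size -= take; actual += take; bank -= take       # paid from credit
    assert bank >= 0                     # credits always cover debits
assert charged == actual + bank          # the books balance
```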

### Stack: Potential Function

Let *P*(stack of size *n*) = *n*

As before, push costs 1 and increases *P* by 1: *c + dP = 2*

Pop costs 1 and decreases *P* by 1: *c + dP = 0*

Multi-pop of *k* costs *min(k, n)* and decreases *P* by *min(k, n)*: *c + dP = 0*
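The three cases above can be verified with a one-line helper (our own) computing *c + dP* directly:

```python
# Potential-method check: with P(stack) = size, each operation's
# actual cost plus change in potential is a constant.

def amortized(cost, size_before, size_after):
    return cost + (size_after - size_before)   # c + dP

assert amortized(1, 5, 6) == 2       # push: c = 1, dP = +1
assert amortized(1, 6, 5) == 0       # pop: c = 1, dP = -1
k, n = 10, 4
c = min(k, n)                        # multi-pop: c = min(k, n)
assert amortized(c, n, n - c) == 0   # dP = -min(k, n)
```

Since every operation's *c + dP* is at most 2, any sequence of ops is *O(1)* amortized per op, as the debits-and-credits argument also showed.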

### Eager and Lazy Algorithms

Most algorithms are *eager*: each operation does all the work it can to maintain invariants and return the result.

There is sometimes benefit to *lazy* algorithms, which do the minimum needed to obtain the result and delay the rest of the work until later.

For example: Heapsort has both eager and lazy elements. Heapsort eagerly builds a heap at cost *O(n)* and then does lazy extraction later at cost *O(n lg n)*.

Lazy algorithms are often suited for *online* use: process things as they come in. For example: a heap as priority queue for processing a stream of tasks.
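To illustrate the online use case, here is a small sketch using Python's `heapq`; the task names and arrival pattern are invented for the example:

```python
# Online priority queue: tasks arrive as a stream; every so often we
# serve the most urgent task seen so far. Each push/pop is O(lg n).
import heapq

tasks = []                                 # heap of (priority, name) pairs
arrivals = [(3, "log"), (1, "page-fault"), (4, "backup"), (2, "disk")]

processed = []
for i, task in enumerate(arrivals):
    heapq.heappush(tasks, task)            # O(lg n) per arrival
    if i % 2 == 1:                         # periodically serve the most
        processed.append(heapq.heappop(tasks)[1])  # urgent task so far

while tasks:                               # drain the rest at the end
    processed.append(heapq.heappop(tasks)[1])
```

The heap never does more work than needed to surface the current minimum; the full sorted order is only paid for if the whole stream is drained.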

### Linked-List Queues

Consider implementing a queue (FIFO) using linked lists:

- Operations are enqueue, dequeue, isEmpty
- Advantages: Minimal memory overhead, easy to grow arbitrarily

Two bad-looking implementation choices:

- Singly-linked list: dequeue on a queue of depth *n* requires *O(n)* ops
- Doubly-linked list: complex, requires extra memory

But there's a better amortized plan ...

### Lazy Linked-List Queues

Data structure: *two* lists, which we will call q.en and q.de

Enqueue operation puts its operand on the front of q.en:

```
enqueue(q, v):
    insert-first(q.en, v)
```

Dequeue operation gets its result from the front of q.de, *but* if q.de is empty, reverses q.en onto q.de first:

```
dequeue(q):
    if q.de is empty
        while q.en is not empty
            v <- extract-first(q.en)
            insert-first(q.de, v)
    if q.de is not empty
        return extract-first(q.de)
    fail
```
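The two-list queue translates directly into Python; here `append`/`pop` at the end of a Python list stand in for insert-first/extract-first (the class name is ours):

```python
# Two-list lazy queue: enqueue onto one list, dequeue from the other,
# reversing the first onto the second only when the second runs dry.

class LazyQueue:
    def __init__(self):
        self.en = []   # q.en: newly enqueued elements, newest last
        self.de = []   # q.de: elements ready to dequeue, oldest last

    def enqueue(self, v):
        self.en.append(v)

    def dequeue(self):
        if not self.de:
            # Reverse q.en onto q.de; each element moves exactly once
            # between lists over its whole lifetime.
            while self.en:
                self.de.append(self.en.pop())
        if self.de:
            return self.de.pop()
        raise IndexError("dequeue from empty queue")
```

An individual dequeue may cost *O(n)*, but only after *n* cheap enqueues, which is exactly the situation amortized analysis handles.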

### Amortized Cost of Lazy Linked-List Queue

Method of credits: enqueue cost = 3, dequeue cost = 1

- Each trip through the dequeue while loop must have been paid for by the enqueue that put the element there

Method of potentials: potential = 2 * size(q.en)

- The cost of the dequeue while loop is paid by decrease in potential
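Both accountings above can be checked numerically; this sketch (our own, counting one unit per insert-first/extract-first as in the slides) verifies the potential version:

```python
# Potential check for the lazy queue: with P = 2 * len(q.en),
# c + dP is constant for every operation.

en = 0                       # size of q.en
for _ in range(4):
    c, dP = 1, 2             # enqueue: actual cost 1, P rises by 2
    en += 1
    assert c + dP == 3       # matches the credit charge of 3

# A dequeue that reverses all of q.en does 2 * en unit ops in the
# while loop (one extract, one insert per element), plus 1 final pop.
c = 2 * en + 1
dP = -2 * en                 # q.en empties, so P drops by 2 * en
assert c + dP == 1           # matches the credit charge of 1
```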

### O(1) Array Initialization (see HW)

Allocate a large, zero-filled array. Clearly *O(n)*, since writing all those zeros.

- If you only end up reading and writing a few elements (e.g. a sparse hash table), the *O(n)* initialization dominates.
Better idea: lazy initialization

- Keep a stack of referenced values
- Make the array be an array of uninitialized pointers
- When a value is referenced:
  - push it onto the stack
  - set the array pointer to point at the value
  - *set a pointer from the value back into the array (to check for false references)*

Reference and store ops are still constant time, so *O(1)* "initialization"
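A common index-based variant of this trick can be sketched in Python (all names here are ours; in Python the arrays are necessarily pre-allocated, so this only models the cross-check logic, not the real uninitialized-memory setting):

```python
# Lazy "initialization": back[] and the stack cross-check each other,
# so garbage in the untouched arrays is never mistaken for real data.

N = 1000
value = [None] * N   # values, valid only for initialized slots
back = [0] * N       # back[i]: claimed position of i on the stack (may be garbage)
stack = []           # indices initialized so far, in order

def is_init(i):
    # i is initialized only if its back-pointer lands on a stack slot
    # that really points back at i (rejects false references).
    return back[i] < len(stack) and stack[back[i]] == i

def store(i, v):
    if not is_init(i):
        back[i] = len(stack)
        stack.append(i)
    value[i] = v

def load(i, default=0):
    return value[i] if is_init(i) else default
```

Each `store` and `load` does a constant amount of work, and only slots actually touched ever get written, so no *O(n)* zeroing pass is needed.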