Amortized Analysis

Amortized Algorithms

Bart Massey


Amortized Analysis

  • Many algorithms just map a single input to a single output

    • This makes it easy to compute the average cost of n invocations
    • Usually get tight bounds on worst-case performance
  • For algorithms that involve a sequence of operations on a data structure, things are more complicated

    • Cost per operation may vary
    • Assuming worst-case for each operation can be loose
  • Solution: compute the total cost of a worst-case sequence of operations and divide by the length of the sequence to get an "amortized cost"

    • Not "average case performance": "average performance in worst case"

Running Textbook Example: Stack

  • Stack with unit-cost operations "push" and "pop", and "multi-pop" with cost min(n,k)

     multi-pop(stack s, k elements):
         r <- empty list
         n <- size of stack s
         while k > 0 and n > 0
             append pop(s) to r
             k <- k - 1
             n <- n - 1
         return r
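The pseudocode above can be sketched in Python, using a list as the stack (the name `multi_pop` is just illustrative):

```python
def multi_pop(s, k):
    """Pop up to k elements from stack s (a Python list); cost is min(n, k)."""
    r = []
    while k > 0 and s:
        r.append(s.pop())  # each pop costs one unit
        k -= 1
    return r

s = [1, 2, 3, 4, 5]        # 5 is the top of the stack
print(multi_pop(s, 3))     # -> [5, 4, 3]
print(multi_pop(s, 10))    # only 2 elements remain -> [2, 1]
```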
    

Running Textbook Example: Counter

  • Binary counter with variable-cost "increment" and "decrement"

     increment(counter c):
         i <- 0
         while bit i of c is 1
             set bit i of c to 0
             i <- i + 1
         set bit i of c to 1
    
     decrement(counter c):
         i <- 0
         while bit i of c is 0
             set bit i of c to 1
             i <- i + 1
         set bit i of c to 0
    
     100111  101000
     +    1  -    1
     ------  ------
     101000  100111
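The same counter in illustrative Python, with bits stored least-significant first. Unlike the pseudocode, this sketch also grows the counter when an increment carries off the end:

```python
def increment(c):
    """Increment a binary counter stored as a list of bits, LSB first."""
    i = 0
    while i < len(c) and c[i] == 1:
        c[i] = 0           # flip trailing 1s to 0
        i += 1
    if i == len(c):
        c.append(1)        # grow the counter if it was all 1s
    else:
        c[i] = 1

def decrement(c):
    """Decrement a nonzero counter: flip trailing 0s to 1, then one 1 to 0."""
    i = 0
    while c[i] == 0:
        c[i] = 1
        i += 1
    c[i] = 0

c = [1, 1, 1, 0, 0, 1]     # LSB first: 100111 = 39
increment(c)
print(c)                   # [0, 0, 0, 1, 0, 1], i.e. 101000 = 40
decrement(c)
print(c)                   # back to [1, 1, 1, 0, 0, 1]
```

The cost of each operation is the number of bits flipped, which varies from 1 to the full width of the counter.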
    

Three Approaches

  • Simple averaging: total up the sequence and divide.

  • Debits and credits: charge some operations more than their cost, and some less, but with the total provably the same as for averaging.

  • Potential function: charge the data structure itself, and compute the average cost of change in data structure plus op.

Stack: Aggregate Method

  • Note that a worst-case sequence of ops can never be worse than loading and then popping off the entire stack.

  • Assume k pushes followed by multi-pop: 2k total cost, k + 1 ops.

  • Amortized cost/op = 2k/(k+1) = O(1)
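A quick mechanical check of the aggregate bound, counting unit costs directly for k pushes followed by one multi-pop of everything:

```python
k = 1000
stack, cost = [], 0
for i in range(k):
    stack.append(i)     # each push costs 1
    cost += 1
while stack:            # one multi-pop of the whole stack
    stack.pop()         # each element popped costs 1
    cost += 1
print(cost, cost / (k + 1))  # 2000, just under 2 per operation
```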

Stack: Debits and Credits

  • For every push: charge 2. For every pop: charge 0. For every multi-pop: charge 0.

  • Since the actual cost of a push is 1, 1 credit is built into every push. Since the actual cost of a pop is 1, 1 debit is accrued for a pop: this is taken from the credit accrued by the corresponding push.

  • Similarly, since the actual cost of a multi-pop of min(n, k) elements is min(n, k), this much debit is accrued, and taken from the min(n, k) pushes used to create this portion of the stack.

  • Since each charge is O(1) and the credit balance never goes negative, the total cost of any sequence is O(1) per op.

Stack: Potential Function

  • Let P(stack of size n) = n

  • As before, push costs 1 and increases P by 1: c + dP = 2

  • Pop costs 1 and decreases P by 1: c + dP = 0

  • Multi-pop k costs min(k, n) and decreases P by min(k, n): c + dP = 0
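The potential argument can be checked mechanically: charge each operation its actual cost plus the change in P. In this illustrative Python sketch (P = stack size), the total of c + dP over a random sequence comes out to exactly 2 per push and 0 per pop:

```python
import random

stack, pushes, total_amortized = [], 0, 0
for _ in range(10000):
    before = len(stack)                     # P(stack) = its size
    if random.random() < 0.5:
        stack.append(0)                     # push: actual cost 1
        actual = 1
        pushes += 1
    else:
        k = random.randrange(1, 5)
        actual = min(k, len(stack))         # multi-pop costs min(k, n)
        del stack[len(stack) - actual:]
    total_amortized += actual + (len(stack) - before)  # c + dP

print(total_amortized == 2 * pushes)        # True
```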

Eager and Lazy Algorithms

  • Most algorithms are eager: each operation does all the work it can to maintain invariants and return the result.

  • There is sometimes benefit to lazy algorithms, which do the minimum needed to obtain the result and delay the rest of the work until later.

  • For example: Heapsort has both eager and lazy elements. Heapsort eagerly builds a heap at cost O(n) and then does lazy extraction later at cost O(n lg n).

  • Lazy algorithms are often suited for online use: process things as they come in. For example: heap as priority queue for processing stream of tasks.

Linked-List Queues

  • Consider implementing a queue (FIFO) using linked lists:

    • Operations are enqueue, dequeue, isEmpty
    • Advantages: Minimal memory overhead, easy to grow arbitrarily
  • Two bad-looking implementation choices:

    • Singly-linked list: dequeue on a queue of depth n requires O(n) ops
    • Doubly-linked list: complex, requires extra memory
  • But there's a better amortized plan ...

Lazy Linked-List Queues

  • Data structure: two lists, which we will call q.en and q.de

  • Enqueue operation puts its operand on the front of q.en

     enqueue(q, v):
         insert-first(q.en, v)
    
  • Dequeue operation gets its result from the front of q.de but, if empty, reverses q.en onto q.de first

     dequeue(q):
         if q.de is empty
             while q.en is not empty
                 v <- extract-first(q.en)
                 insert-first(q.de, v)
         if q.de is not empty
             return extract-first(q.de)
         fail
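A sketch of the two-list queue in Python. The slides use singly-linked lists; here each list is a Python list used as a stack, so "insert-first" and "extract-first" become append and pop at the end:

```python
class LazyQueue:
    """FIFO queue built from two stacks."""
    def __init__(self):
        self.en = []   # newly enqueued elements
        self.de = []   # elements ready to dequeue, next-out last

    def enqueue(self, v):
        self.en.append(v)

    def is_empty(self):
        return not self.en and not self.de

    def dequeue(self):
        if not self.de:
            # Reverse en onto de; each element moves this way at most once.
            while self.en:
                self.de.append(self.en.pop())
        if not self.de:
            raise IndexError("dequeue from empty queue")
        return self.de.pop()

q = LazyQueue()
for v in [1, 2, 3]:
    q.enqueue(v)
print(q.dequeue(), q.dequeue())  # 1 2
q.enqueue(4)
print(q.dequeue(), q.dequeue())  # 3 4
```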
    

Amortized Cost of Lazy Linked-List Queue

  • Method of credits: charge each enqueue 3 and each dequeue 1

    • Each trip through the dequeue while loop must have been paid for by the enqueue that put the element there
  • Method of potentials: potential = 2 * size(q.en)

    • The cost of the dequeue while loop is paid by decrease in potential
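The credit scheme can be sanity-checked by simulation: charge 3 per enqueue and 1 per dequeue, pay actual unit costs (two per element reversed: one extract-first plus one insert-first), and confirm the banked credit never goes negative. An illustrative sketch:

```python
from random import random

en, de = [], []
balance, ok = 0, True
for _ in range(10000):
    if random() < 0.5:                          # enqueue: charged 3
        en.append(0)
        balance += 3 - 1                        # actual cost 1, bank 2 credits
    else:                                       # dequeue: charged 1
        actual = 2 * len(en) if not de else 0   # reversal: 2 units per element
        if not de:
            de, en = en[::-1], []
        if de:
            de.pop()
            actual += 1                         # the final extract-first
        balance += 1 - actual
        ok = ok and balance >= 0
print(ok)   # True: the banked credits always cover the reversal
```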

O(1) Array Initialization (see HW)

  • Allocating a large, zero-filled array is clearly O(n), since all those zeros must be written.

    • If you only end up reading and writing a few elements (e.g. a sparse hash table), the O(n) initialization dominates.
  • Better idea: lazy initialization

    • Keep a stack of referenced values
    • Make the array be an array of uninitialized pointers
    • When a value is referenced
      • push onto the stack
      • set the array pointer to point at the value
      • set a pointer from the value back into the array (to check for false references)
  • Reference and store ops are still constant time, so O(1) "initialization"
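A structural sketch of the trick in Python. Python cannot actually leave memory uninitialized, so the O(1) claim is about the technique, not this code; the back-pointer check is what guards against garbage that happens to look like a valid reference:

```python
class LazyArray:
    """Lazily 'initialized' array: only slots actually touched are written."""
    def __init__(self, n, default=0):
        self.default = default
        self.ptr = [None] * n      # stand-in for uninitialized pointers
        self.stack = []            # entries are (index, value) pairs

    def _valid(self, i):
        p = self.ptr[i]
        # A slot is valid iff its pointer lands in the stack and the
        # stack entry points back at the slot (the false-reference check).
        return isinstance(p, int) and 0 <= p < len(self.stack) \
            and self.stack[p][0] == i

    def read(self, i):
        return self.stack[self.ptr[i]][1] if self._valid(i) else self.default

    def write(self, i, v):
        if self._valid(i):
            self.stack[self.ptr[i]] = (i, v)
        else:
            self.ptr[i] = len(self.stack)   # point the slot at a new entry
            self.stack.append((i, v))       # entry points back at slot i

a = LazyArray(10**6)
a.write(123456, 7)
print(a.read(123456), a.read(0))  # 7 0
```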

Last modified: Tuesday, 29 April 2014, 9:56 AM