Digging Around In Arrays

Digging Around In Arrays

Copyright © 2015 Bart Massey

  1. Consider the following problem:


    INSTANCE: A one-based array a of n elements, with the elements drawn from the set {0, 1}. An integer lower bound m.

    SOLUTION: The index of the first position in a such that there is a string of m 1s starting at that position. That is: a position i in a such that a[i..i+m-1] are all 1s and there is no position j < i in a with this property. If there is no string of m 1s in a, fail.

    1.1. Give an algorithm that solves this problem. As always, argue its correctness.

    1.2. What is the worst-case O(n,m) running time of your algorithm? Why?

    1.3. What is the best-case Ω(n,m) running time of your algorithm? Why?

    1.4. Non-required BONUS: Find an algorithm with Θ(n/m) average-case complexity, assuming a uniform distribution of inputs. What is the worst-case complexity of this algorithm? What is a worst-case instance? A best-case instance?

    Note: This problem came up for me recently; I was trying to build tests for a hardware random number generator.

  2. Consider the following problem:


    INSTANCE: One-based arrays a and b of positive integers. Denote the size of a by |a| and similarly for b.

    SOLUTION: A one-based array c of positive integers, such that c contains exactly one copy of each integer that appears somewhere in both a and b. Thus, |c| ≤ min(|a|, |b|).

    2.1. Give a worst-case O(|a| lg |a| + |b| lg |b|) algorithm solving this problem.

    2.2. Generalize the problem description to allow for an arbitrary number of input arrays. (This involves choosing an appropriate representation for a collection of arrays.)

    2.3. Give an algorithm for the generalized problem.

    2.4. Implement the two-array algorithm in your favorite programming language.

    2.5. Devise tests for the two-array algorithm. Use these tests both to check correctness and to measure the running time on various size inputs. How does the running time grow in practice?

    Note: The generalized version of this problem came up just now for me; I was working on a video control application. I implemented an O(n3) algorithm, though, since the array sizes were quite small.