PhD Thesis, Department of Computer Science, Aarhus University, Denmark, x+121 pages, January 1997. BRICS-DS-97-1.

We study the design of efficient data structures. In particular we
focus on the design of data structures where each operation has a
worst case efficient implementation. The concrete problems we
consider are *partial persistence*, implementation of *priority
queues*, and implementation of *dictionaries*.

The first problem we consider is how to make bounded in-degree and
out-degree data structures partially persistent, *i.e.*, how to remember
old versions of a data structure for later access.
A *node copying* technique of Driscoll *et al.* supports
update steps in amortized constant time and access steps in worst case
constant time. The worst case time for an update step can, however, be
linear in the size of the structure. We show how to extend the technique of
Driscoll *et al.* such that update steps can be performed in
worst case constant time on the pointer machine model.
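To make the partial-persistence semantics concrete, here is a minimal Python sketch of the classical *fat node* alternative, in which every field keeps a version-stamped history so that any old version can still be read. This is not the node copying technique of the thesis: fat-node reads are not constant time, which is precisely the deficiency node copying addresses. All names are illustrative.

```python
class FatNode:
    """A field whose values are version-stamped, so every past
    version of the structure remains readable (partial persistence)."""

    def __init__(self, value, version):
        # History of (version, value) pairs, newest last.
        self.values = [(version, value)]

    def write(self, value, version):
        # Updates append to the history instead of overwriting.
        self.values.append((version, value))

    def read(self, version):
        # Return the latest value written at or before `version`.
        # A linear scan here; not the constant time access of node copying.
        result = None
        for v, val in self.values:
            if v <= version:
                result = val
        return result
```

Reading version 2 below returns the value written at version 0, since no later write precedes version 2.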

We present two new comparison based priority queue implementations,
with the following properties. The first implementation supports the
operations **FindMin**, **Insert** and **Meld** in worst case constant
time and **Delete** and **DeleteMin** in worst case time *O*(log *n*).
The priority queues can be implemented on the pointer machine and
require linear space. The second implementation achieves the same
worst case performance, but furthermore supports **DecreaseKey** in
worst case constant time. The space requirement is again linear, but
the implementation requires auxiliary arrays of size *O*(log *n*). Our
bounds match the best known amortized bounds (achieved by binomial
queues and Fibonacci heaps, respectively). The data structures presented
are the first achieving these worst case bounds, in particular
supporting **Meld** in worst case constant time. We show that these
time bounds are optimal for all implementations supporting **Meld** in
worst case time *o*(*n*). We also present a tradeoff between the update
time and the query time of comparison based priority queue
implementations. Finally we show that any randomized implementation
with expected amortized cost *t* comparisons per **Insert** and
**Delete** operation has expected cost at least *n*/2^{*O*(*t*)}
comparisons for **FindMin**.
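The operation set above can be made concrete with a textbook meldable heap. The sketch below is a *skew heap*, which supports **Meld** only in amortized *O*(log *n*) time and therefore does not achieve the worst case constant time **Meld** of the thesis; it is included purely to illustrate the interface.

```python
class SkewHeap:
    """Meldable comparison-based min-priority queue.
    Nodes are (key, left, right) triples; meld is the core primitive."""

    def __init__(self):
        self.root = None

    @staticmethod
    def _meld(a, b):
        if a is None:
            return b
        if b is None:
            return a
        if b[0] < a[0]:
            a, b = b, a  # keep the smaller root on top
        # Meld b into a's right subtree, then swap a's children.
        return (a[0], SkewHeap._meld(a[2], b), a[1])

    def insert(self, key):
        self.root = self._meld(self.root, (key, None, None))

    def find_min(self):
        return self.root[0]

    def delete_min(self):
        key = self.root[0]
        self.root = self._meld(self.root[1], self.root[2])
        return key

    def meld(self, other):
        # Destructively merges `other` into this queue.
        self.root = self._meld(self.root, other.root)
        other.root = None
```

Here **Insert** is a one-element **Meld**, and **DeleteMin** melds the two subtrees of the root; the thesis's structures support the same interface with worst case rather than amortized bounds.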

Next we consider how to implement priority queues on parallel
(comparison based) models. We present time and work optimal priority
queues for the CREW PRAM, supporting **FindMin**, **Insert**, **Meld**,
**DeleteMin**, **Delete** and **DecreaseKey** in constant time with
*O*(log *n*) processors. Our implementation is the first supporting
all of the listed operations in constant time. To be able to speed up
Dijkstra's algorithm for the single-source shortest path problem we
present a different parallel priority data structure. With this
specialized data structure we give a parallel implementation of
Dijkstra's algorithm which runs in *O*(*n*) time and performs *O*(*m* log
*n*) work on a CREW PRAM. This represents a logarithmic factor
improvement for the running time compared with previous approaches.
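For context, here is the standard sequential Dijkstra loop in Python, using the library heap `heapq` (which lacks **DecreaseKey**, so stale queue entries are skipped instead). This sketch only shows where the priority queue sits in the algorithm; the thesis's speedup comes from replacing this sequential queue with a specialized parallel structure on the CREW PRAM.

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths on a graph with non-negative
    edge weights. adj maps each vertex to a list of (neighbor, weight)
    pairs; returns {vertex: distance} for reachable vertices. The
    priority queue operations dominate the running time, which is why
    a faster priority queue speeds up the whole algorithm."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale entry; heapq has no DecreaseKey
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```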

We also consider priority queues on a RAM model which is stronger than
the comparison model. The specific problem is the maintenance of a
set of *n* integers in the range 0..2^{w}-1 under the operations
**Insert**, **Delete**, **FindMin**, **FindMax** and **Pred** (predecessor
query) on a unit cost RAM with word size *w* bits. The RAM operations
used are addition, left and right bit shifts, and bit-wise boolean
operations. For any function *f*(*n*) satisfying loglog *n* ≤
*f*(*n*) ≤ \sqrt{log *n*}, we present a data structure supporting
**FindMin** and **FindMax** in worst case constant time, **Insert** and
**Delete** in worst case *O*(*f*(*n*)) time, and **Pred** in worst case
*O*((log *n*)/*f*(*n*)) time. This represents the first priority queue
implementation for a RAM which supports **Insert**, **Delete** and
**FindMin** in worst case time *O*(loglog *n*); previous bounds were
only amortized. The data structure is also the first dictionary
implementation for a RAM which supports **Pred** in worst case
*O*((log *n*)/loglog *n*) time while having worst case *O*(loglog *n*) update
time. Previous sublogarithmic dictionary implementations do not
provide for updates that are significantly faster than queries. The
best solutions known support both updates and queries in worst case
time *O*(\sqrt{log *n*}).
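To make the word-RAM setting concrete, the following toy sketch keeps a set of small integers as a single bitmask and uses only the word operations listed above (shifts and bit-wise boolean operations; the negation trick below is expressible as complement plus addition). It is far from the thesis's bounds, as its scans cost a bit step per word position rather than constant time, but it shows the flavor of word-level priority queues. All names are illustrative.

```python
class BitmapPQ:
    """Priority queue over a tiny universe of integers 0..u-1,
    stored as one bitmask: bit x is set iff x is in the set."""

    def __init__(self):
        self.bits = 0

    def insert(self, x):
        self.bits |= 1 << x

    def delete(self, x):
        self.bits &= ~(1 << x)

    def find_min(self):
        if self.bits == 0:
            return None
        # Isolate the lowest set bit (x & -x, i.e. x & (~x + 1)),
        # then locate it by shifting: O(w) bit steps, not O(1).
        low = self.bits & -self.bits
        pos = 0
        while low > 1:
            low >>= 1
            pos += 1
        return pos

    def find_max(self):
        if self.bits == 0:
            return None
        b, pos = self.bits, -1
        while b:  # position of the highest set bit
            b >>= 1
            pos += 1
        return pos
```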

The last problem we consider is the following dictionary problem over
binary strings. Given a set of *n* binary strings of length *m* each,
we want to answer *d*-queries, *i.e.*, given a binary query
string of length *m*, report whether there exists a string in the set
within Hamming distance *d* of the query string. We present a data
structure of size *O*(*nm*) supporting 1-queries in time *O*(*m*) and the
reporting of all strings within Hamming distance 1 of the query string
in time *O*(*m*). The data structure can be constructed in time
*O*(*nm*). The implementation presented is the first achieving these
optimal time bounds for the preprocessing of the dictionary and for
1-queries. The data structure can be extended to support the
insertion of new strings in amortized time *O*(*m*).
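As a baseline for comparison, a 1-query can be answered brute-force with a hash set by testing the query itself and all *m* single-bit flips. Each flip builds an *O*(*m*) string, so a query costs *O*(*m*^2) rather than the optimal *O*(*m*) achieved by the thesis. Function names here are illustrative.

```python
def build_dictionary(strings):
    """Naive O(nm) preprocessing: store the strings in a hash set."""
    return set(strings)

def one_query(dictionary, q):
    """Report whether some dictionary string is within Hamming
    distance 1 of query q (equal-length strings over {'0','1'}).
    Brute force: m candidate flips, each an O(m) string build,
    giving O(m^2) per query versus the thesis's O(m)."""
    if q in dictionary:  # distance 0
        return True
    for i in range(len(q)):
        flipped = q[:i] + ('1' if q[i] == '0' else '0') + q[i + 1:]
        if flipped in dictionary:  # distance exactly 1
            return True
    return False
```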

Copyright © 1997, BRICS, Department of Computer Science, Aarhus University. All rights reserved.

**Online version**

brics-ds-97-1.pdf (944 Kb)

**Slides**

thesis.pdf (206 Kb)

**BibT_{E}X entry**

@phdthesis{ThesisBrodal,
  author = "Gerth St{\o}lting Brodal",
  issn   = "1396-7002",
  month  = "January",
  number = "BRICS-DS-97-1",
  pages  = "x+121",
  school = "Department of Computer Science, Aarhus University, Denmark",
  title  = "Worst Case Efficient Data Structures",
  year   = "1997"
}