TPIE

2362a60
tpie::serialization_bits::file_handler< T > Class Template Reference

File handling for merge sort. More...

#include <tpie/serialization_sorter.h>

Public Member Functions

void set_temp_dir (const std::string &tempDir)
 
void open_new_writer ()
 
void write (const T &v)
 
void close_writer ()
 
size_t remaining_runs ()
 
size_t next_level_runs ()
 
bool readers_open ()
 
void open_readers (size_t fanout)
 
bool can_read (size_t idx)
 
read (size_t idx)
 
void close_readers_and_delete ()
 
void move_last_reader_to_next_level ()
 
void reset ()
 

Detailed Description

template<typename T>
class tpie::serialization_bits::file_handler< T >

File handling for merge sort.

This class abstracts away the details of numbering run files; tracking the number of runs in each merge level; informing the TPIE stats framework of the temporary size; deleting run files after use.

The important part of the state is the tuple consisting of (a, b, c) := (fileOffset, nextLevelFileOffset, nextFileOffset). a is the first file in the level currently being merged; b is the first file in the level being merged into; c is the next file to write output to.

Transition system

We let remainingRuns := b - a, and nextLevelRuns := c - b.

The tuple (remainingRuns, nextLevelRuns) has the following transitions: On open_new_writer(): (x, y) -> (x, 1+y), On open_readers(fanout): (fanout+x, y) -> (fanout+x, y), On open_readers(fanout): (0, fanout+y) -> (fanout+y, 0), On close_readers_and_delete(): (fanout+x, y) -> (x, y).

Merge sorter usage

During run formation (the first phase of merge sort), we repeatedly call open_new_writer() and close_writer() to write out runs to the disk.

After run formation, we call open_readers(fanout) to advance into the first level of the merge heap (so one can think of run formation as a "zeroth level" in the merge heap).

As a slight optimization, when remaining_runs() == 1, one may call move_last_reader_to_next_level() to move the remaining run into the next merge level without scanning through and copying the single remaining run.

See serialization_sorter::merge_runs() for the logic involving next_level_runs() and remaining_runs() in a loop.

Definition at line 265 of file serialization_sorter.h.


The documentation for this class was generated from the following file: