# Cache Oblivious Algorithms for Computing the Triplet Distance Between Trees

## Gerth Stølting Brodal and Konstantinos Mampentzidis

Technical Report, 1706.10284, arXiv.org, 16 pages, June 2017.

## Abstract

We study the problem of computing the triplet distance between two
rooted unordered trees with *n* labeled leafs. Introduced by Dobson
1975, the triplet distance is the number of leaf triples that induce
different topologies in the two trees. The current theoretically best
algorithm is an O(*n* log *n*) time algorithm by Brodal
*et al.* (SODA 2013). Recently Jansson *et al.* proposed
a new algorithm that, while slower in theory, requiring O(*n*
log^{3} *n*) time, in practice it outperforms the theoretically faster
O(*n* log *n*) algorithm. Both algorithms do not scale to
external memory.

We present two cache oblivious algorithms that combine the best of
both worlds. The first algorithm is for the case when the two input
trees are binary trees and the second a generalized algorithm for two
input trees of arbitrary degree. Analyzed in the RAM model, both
algorithms require O(*n* log *n*) time, and in the cache
oblivious model O(*n*/*B* log_{2} (*N*/*M*))
I/Os. Their relative simplicity and the fact that they scale to
external memory makes them achieve the best practical performance. We
note that these are the first algorithms that scale to external
memory, both in theory and practice, for this problem.

**Links**

**BIBT**_{E}X entry

@techreport{arxiv1706.10284,
author = "Gerth St{\o}lting Brodal and Konstantinos Mampentzidis",
institution = "arXiv.org",
month = "June",
number = "1706.10284",
pages = "16",
title = "Cache Oblivious Algorithms for Computing the Triplet Distance Between Trees",
year = "2017"
}