We present the Cuckoo Trie, a fast, memory-efficient ordered index structure. The Cuckoo Trie is designed to have memory-level parallelism -- which a modern out-of-order processor can exploit to execute DRAM accesses in parallel -- without sacrificing memory efficiency. The Cuckoo Trie thus breaks a fundamental performance barrier faced by current indexes, whose bottleneck is a series of dependent pointer-chasing DRAM accesses -- e.g., traversing a search tree path -- which the processor cannot parallelize. Our evaluation shows that the Cuckoo Trie outperforms state-of-the-art-indexes by up to 20%--360% on a variety of datasets and workloads, typically with a smaller or comparable memory footprint.
View on arXiv