Choosing Your Key-Value Store: A Decision Framework

In Which We Synthesize Our Options Into Actionable Wisdom

Having examined three different approaches to the “I have keys, I have values, now what?” problem, let’s construct a decision tree:

For Small Datasets (< 75 pairs)

Use orddict if:

  • You want transparency
  • You’re debugging
  • Human readability matters
  • You need ordered output
  • Simplicity trumps everything

Use maps if:

  • You want modern syntax
  • Pattern matching would be useful
  • You’re writing new code
  • Integration with other modern code matters
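
To make the contrast concrete, here is a minimal sketch in LFE. The module name small-kv and the keys (port, host, name) are hypothetical, chosen only for illustration. It shows the two selling points listed above: an orddict is an ordinary sorted list you can read at a glance, while a map can be destructured directly in a function head.

```lfe
(defmodule small-kv
  (export (orddict-demo 0) (greeting 1)))

;; An orddict is a key-sorted list of #(key value) tuples, so the
;; returned value can be inspected like any other list.
(defun orddict-demo ()
  (let* ((d0 (orddict:new))
         (d1 (orddict:store 'port 8080 d0))
         (d2 (orddict:store 'host "localhost" d1)))
    d2))

;; A map can be pattern matched in the function head. The second
;; clause catches any argument without a name key.
(defun greeting
  (((map 'name name)) (lists:append "Hello, " name))
  ((_) "Hello, stranger"))
```

Calling (small-kv:orddict-demo) returns (#(host "localhost") #(port 8080)), with host sorted ahead of port even though it was stored second. Calling (small-kv:greeting (map 'name "Ada")) returns "Hello, Ada".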

For Medium Datasets (75-10,000 pairs)

Use maps unless:

  • You’re maintaining legacy code (then use what’s there)
  • You specifically need gb_trees features

Use gb_trees if:

  • You need min/max operations
  • Ordered iteration is central to your algorithm
  • You’re implementing a tree-based algorithm

Avoid:

  • orddict (lookups are linear in the number of pairs, which is too slow at this size)
  • dict (superseded by maps)

For Large Datasets (10,000+ pairs)

Use maps unless:

  • You have specific needs only gb_trees fulfill
  • You’re maintaining legacy dict code

Use gb_trees if:

  • Min/max operations are critical
  • Your algorithm fundamentally requires tree operations
  • You’ve benchmarked and gb_trees is measurably better

Never use:

  • orddict (will ruin your performance)
  • dict (unless forced to by legacy constraints)

Special Considerations

Use dict only if:

  • You’re maintaining pre-R17 code
  • You must support Erlang versions without maps
  • You’re gradually migrating to maps and need interim compatibility
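
For the gradual-migration case, one possible approach is to convert at the boundary between old and new code. The sketch below assumes a hypothetical module named kv-migrate. It relies only on the to_list and from_list functions that both dict and maps provide.

```lfe
(defmodule kv-migrate
  (export (dict->map 1) (map->dict 1)))

;; Convert a legacy dict into a map by way of a list of pairs.
(defun dict->map (d)
  (maps:from_list (dict:to_list d)))

;; Convert back for callers that still expect a dict.
(defun map->dict (m)
  (dict:from_list (maps:to_list m)))
```

Each conversion copies every pair, so it is best kept to module boundaries rather than inner loops.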

Prefer gb_trees over maps when:

  • Finding minimum/maximum keys is a frequent operation
  • You need ordered traversal without separate sorting
  • Your algorithm is naturally tree-based
  • You’ve measured and gb_trees performs better for your specific use case
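
As an illustration of the min/max and ordered-traversal points, here is a small sketch. The module name tree-kv and its two helper functions are hypothetical. The gb_trees calls it uses (empty/0, enter/3, smallest/1, largest/1) are part of the standard library.

```lfe
(defmodule tree-kv
  (export (from-pairs 1) (bounds 1)))

;; Build a gb_tree from a list of #(key value) tuples.
;; gb_trees:enter/3 inserts or overwrites, so duplicate keys are fine.
(defun from-pairs (pairs)
  (lists:foldl
    (match-lambda
      (((tuple k v) tree) (gb_trees:enter k v tree)))
    (gb_trees:empty)
    pairs))

;; Return the smallest and largest entries as a pair of
;; #(key value) tuples. Both calls assume a non-empty tree.
(defun bounds (tree)
  (tuple (gb_trees:smallest tree) (gb_trees:largest tree)))
```

Given a tree built with (tree-kv:from-pairs '(#(3 c) #(1 a) #(2 b))), bounds returns #(#(1 a) #(3 c)), and gb_trees:to_list returns the pairs in key order with no separate sort.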

A Note On Proplists

Proplists deserve a brief note here. A proplist is simply a list of tuples: '(#(key1 val1) #(key2 val2)). A bare atom may also appear as an entry, as shorthand for pairing that atom with true. They’re not a formal data structure so much as a common pattern, particularly for passing options to functions.

Proplists are appropriate for:

  • Function arguments with optional parameters
  • Configuration that humans will write
  • Very small datasets where the loose structure is acceptable
  • Compatibility with code expecting this pattern
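
The optional-parameters case might look like the following sketch. The module opts-demo, the function connect-opts, and the option names (host, port, ssl) are all hypothetical. The proplists functions it calls, get_value/3 and get_bool/2, are standard.

```lfe
(defmodule opts-demo
  (export (connect-opts 1)))

;; Read optional settings from a proplist, falling back to defaults
;; for any key the caller left out. get_bool/2 returns true for a
;; bare ssl atom or #(ssl true), and false otherwise.
(defun connect-opts (opts)
  (let ((host (proplists:get_value 'host opts "localhost"))
        (port (proplists:get_value 'port opts 5432))
        (ssl? (proplists:get_bool 'ssl opts)))
    (tuple host port ssl?)))
```

Calling (opts-demo:connect-opts '(#(port 6000) ssl)) returns #("localhost" 6000 true): the port comes from the caller, the host from the default, and the bare ssl atom is read as true.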

For anything more structured, use one of the modules discussed above.

Performance Isn’t Everything

A final philosophical point: the fastest data structure is not always the best choice. Consider:

  • Maintainability: Code using orddict is often clearer than code using gb_trees
  • Debuggability: Transparent structures are easier to inspect
  • Integration: Matching your team’s conventions matters
  • Future-proofing: Maps are the present and future; choose them when practical

Premature optimization remains, as ever, a source of evil. Choose the simplest structure that meets your needs, measure if performance matters, and optimize only when measurement proves it necessary.
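
If measurement does become necessary, a rough first pass can be done with timer:tc/1 from the standard library. The sketch below is only a starting point, under the assumption that lookups of integer keys are the operation you care about. The module kv-bench and the function time-lookups are hypothetical names.

```lfe
(defmodule kv-bench
  (export (time-lookups 1)))

;; Time n successful lookups against a map and an orddict that hold
;; the same n integer keys. timer:tc/1 reports elapsed microseconds.
(defun time-lookups (n)
  (let* ((keys (lists:seq 1 n))
         (pairs (lists:map (lambda (k) (tuple k k)) keys))
         (m (maps:from_list pairs))
         (o (orddict:from_list pairs))
         ((tuple map-us _)
          (timer:tc
            (lambda ()
              (lists:foreach (lambda (k) (maps:get k m)) keys))))
         ((tuple ord-us _)
          (timer:tc
            (lambda ()
              (lists:foreach (lambda (k) (orddict:fetch k o)) keys)))))
    (list (tuple 'maps map-us) (tuple 'orddict ord-us))))
```

Results vary from run to run and machine to machine, so repeat the call several times and at several sizes, and use your real keys and access pattern before drawing conclusions.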