Why Implement New Data Structures?

We’ve all been through the same processes for interviewing, learning new languages, or improving problem-solving. I’m talking about writing the mundane data structures (linked list, hash table, queue, stack, etc.) over and over again. But how do you really refine your algorithmic thinking? Why not implement new and unique data structures? That’s why I have been practicing my C programming whilst writing under-utilized data structures. Alongside this, I’ve been practicing my vim skills (Gary Bernhardt would be proud!). Killing three birds with one stone! Check out the repository here.

Why C, You Ask?

Over the Winter Break, I took some time to reflect (previous blog post) and wanted to brush up on my weaker programming languages, specifically, C. I like programming in C for many reasons:

  • C is extremely fast.
  • Debugging C forces you to reason directly about memory addresses and layout.
  • C is powerful while remaining a fairly low-level language.
  • And more.

Which kinds of data structures?

My goal is to check off every item on this list. So far, I’ve implemented:

  1. Trie
  2. Bloom Filter
  3. Two-Four Tree (or Two-Three-Four Tree)
  4. Queap

And in my immediate future, I plan on completing:

In the GitHub repository, each data structure lives in its own directory with its header and source files, a driver, and a Makefile. I designed it this way so you can build just the structure you need or want to improve, rather than the entire repository. Some structures require supplementary ones; for example, a queap contains a doubly linked list and a two-four tree.

The Road Ahead

Unfortunately, these structures aren’t usually taught in university data structures and algorithms courses. Yet they help tenfold with interview questions: implementing them shows not only your programming skill but also your curiosity about the field. For example, I recently got this question from a well-known tech giant:

Implement an efficient algorithm to determine if an element is contained within an array.

Obviously, most candidates will immediately blurt out the naive O(n) solution: iterate through the array and compare each element. But what about impressing them with a Bloom filter’s O(k) lookup (where k is the number of hash functions)? And then taking it one step further by using MurmurHash (with unique seeds) instead of SHA-1 to make it even faster?

Implementing these data structures is 80% of the battle! The other 20% comes from a deep understanding of each algorithm’s theory and motivation. It’s really interesting to see why these structures were invented and the purpose they serve in the computer science universe.

Feel free to create an issue or pull request if you want to help!