Skip to content

Skip Lists (a.k.a. Balanced Trees are overrated)

Programming computers is fascinating.

A couple of days ago I was introduced to a new data structure I did not know before. It’s called a skip list. It’s a sorted linked list, where the nodes are linked at different levels, forming increasingly sparse chains, each of which skips more nodes, and therefore is faster to traverse, allowing search algorithms to hop over parts of the list when searching. That’s why they’re called skip lists.

The concept was shown to me during a job interview. They briefly introduced me to the concept of skip lists using an image. It is said that an image is worth a thousand words, and in this case the assertion holds:

After I finished the interview I went on to find out more about skip lists, and I was amazed on how I did not stumbled upon them before. Immediately my instinct was to try out the algorithm I came up with during my interview, and when I saw it worked, I ended up writing a whole new SortedMap subclass in Java, implemented with skip lists.

It is surprisingly amazing how easy it is to write the routines for all the basic operations on skip lists, like adding, updating, removing and querying on them, specially when you see it described as a probabilistic data structure, which you have to admit that looks scary at first glance.

Performance

Operation Expected time complexity
Insert O(log n)
Delete O(log n)
Search O(log n)
Item at index O(log n) with some tweaking
Enumerate O(n)

Performance-wise they can behave similar to, or perhaps even better than, balanced binary trees, according to the assertions made by the inventor of this concept in his original paper about them. They’re also easier to implement and comprehend, and the algorithms have a very little constant factor compared to the often recursive and complex algorithms on red-black, 2-3 trees, and other types of balanced trees.

Indexing

But the true killer feature of skip lists lie in its indexing capabilities. It’s possible, with some minor tweaks, to find the ith element in the list with an expected time complexity of O(log n). With this you can have a data structure that can act as a sorted dictionary, but as an array or vector at the same time. That’s not something you can do with self-balanced search trees, at least not that I’m aware of. I’m not quite sure right now in what type of application could this be useful, but it is certainly quite amazing, and worth having in mind if one ever finds such a situation.

Were you familiar with skip lists already? Do you know about any other cool and useful data structure that you would like to share?

Categories: Programming, Whatever.

Tags: , , , ,

Comment Feed

4 Responses

  1. Hey, I did not know about skip lists either. They sound like a very original idea.

    I wonder if they really are comparable to AVL trees. I’m not questioning the asymptotic time complexity analysis, but what about the constant overhead. Have you made an implementation and some benchmarking? I’d love to know your results.

    George2011/06/07 @ 11:47 amReply
  2. Hi George, sorry for the delayed response.

    I did develop a Java version of skip lists and I made some benchmarking against Java’s built-in TreeMap, which is implemented using a Red-Black tree. The tree map was faster in most operations, but not too overwhelmingly better.

    Plus the skip-list included the indexing operation, i.e. obtaining the element at a given index i between o and n-1. That is something the tree does not provide. So if you ever have an algorithm requiring a set or map data structure that also allows indexing, then skip-lists are better.

    Again, sorry for answering so late after your comment. I’ve been very busy these days and did not notice it until now.

  3. I don’t know what you guys keep insisting that you cannot do O(log n) random access on a balanced tree. Of course you can, just store at each vertex the size of the subtree rooted at that vertex. (This is easy to keep track of during insertions and rebalancings.) Maybe you mean that this operation is not provided by the standard library implementations of sets and maps?

    david2012/04/05 @ 9:54 amReply
  4. Yes, david, you’re right and I was wrong. I have known this for a while since I wrote this post and that comment above, but I had not come here to ammend my mistake until now. So thanks a lot for the clarification.

    It turns out that the real main advantage of skip lists over balanced trees is that they’re more suitable for concurrent access and modification, if properly implemented. Java has a ConcurrentSkipListMap that is an actual example of this.



Some HTML is OK

or, reply to this post via trackback.