Skip to main content

Computer Science: Offset-based Linked Lists

Note, I'm speaking a little loosely, and I'm not a C expert, but I think this post is still interesting nonetheless.

There are many ways to "architect" and implement lists.

These days, it's the norm in languages like Java, Python, etc. for the language or language library itself to provide a list implementation. You simply create a list and then start shoving objects into it. Python, Ruby, Perl and PHP all provide native syntax for lists. The algorithm used is more sophisticated than a singly or doubly-linked list because lists must double as arrays in those languages. Java and C++ have native arrays and array syntax, but they also provide a variety of list implementations in their libraries. Both can use generics to constrain the type of objects you put in those lists. One thing to note, however, is that there's a distinction between the list itself and the items in the list.

Since C's standard library doesn't provide a list implementation, a variety of approaches are used. It's not uncommon to create structs that have prev and next pointers in them and to manage the list manually. In general, it seems common in C to create data structures manually and to simply manage the data structure as part of the overall programming task. This is in stark contrast to, say, Python where the list implementation is in C and the code using the list is in Python.

My buddy Kelly Yancey once showed me that FreeBSD had macros so that programmers wouldn't have to keep reimplementing linked lists all the time in the kernel. I think macros were used instead of functions so that the code behaved as if it were actually written inline, thus avoiding the function call overhead--but I could be wrong.

At Metaweb, I had a buddy named Scott Meyer who use to work at Oracle. He showed me a pretty interesting trick for creating linked lists. As before, he had structs which contained application data as well as next and prev pointers for managing the structs in a linked list. However, rather than manipulate those linked lists manually, he wrote separate, generic linked list functions. The question is: how can a generic linked list function operate on random structs (i.e. void *'s) that just happen to have prev and next fields somewhere within them?

For each type of struct, he would create a "descriptor" struct that contained the offset of the prev and next fields. He would pass the descriptor to the function anytime he needed to manipulate the linked list. Loosely speaking, it's like saying, "Hey, I got this linked list, and I want you to insert a new member into it. You might not know anything about the type of structs in the linked list, but I can tell you that the prev pointer is 16 bytes from the beginning of the struct, and the next pointer is 20 bytes from the beginning."

I think the lesson is deep. If you want to write a function that operates generically on structs that you know have certain fields, you just have to tell that function the offset of those fields. Naturally, you don't want to just pass the offsets. Rather, you pack them up into a struct. Creating such a struct is like declaring that you implement some interface. I had seen how structs full of function pointers doubled as interfaces in Apache, but structs full of offsets was something entirely new for me.



Ian Bicking said…
Alan Kay described what he considers the first object programming similarly. I can't remember enough keywords to look it up. But basically it was in filesystems -- people wrote different filesystems for tapes, but this caused problems as new filesystems appeared or old filesystems were forgotten about. So some unnamed programmer made a system where the tape contained the code to access the records on the tape at known offsets. So, say, the get-record routine was at offset 0 (or maybe a pointer? Or maybe enough room to just jump to where the routine was held), set-record at offset 10, etc.

Similarly the linked list routines are just one step away from being a dynamically typed system -- if you put the struct describing offsets in the original struct itself then you'd have a type.
Joseph said…
This is off topic, but that wouldn't happen to be the Scott Meyers of Effective C++ fame would it?
mike barton said…
Huh... I've done similar things in the past, but my generic list functions mostly take a next pointer offset, which is computed with a macro. Something like:

#define NEXTOFFSET(list) ((unsigned long)((list)->next) - (unsigned long)(list))

list = list_sort(list, NEXTOFFSET(list), comparison_function);

I guess building a framework to organize that isn't too bad an idea.
Dave Kirby said…
The entire Amiga operating system was built around a linked list structure - every object started with the linked list struct, and the OS provided functions for navigating round them, even from assembler. Everything else was referenced by offsets to one of these structs - by convention data was at positive offsets and pointers to functions were at negative offsets. This enabled consistent object oriented programming, even from assembler.

In many ways the Amiga was way ahead of its time.
Kelly Yancey said…
Hey JJ, just wanted to add that I've seen versions of the BSD queue.h macros that used offsets from a known base rather than pointers for the next/previous links. The advantage is that you can then put the linked objects in a shared memory segment (the segment may have a different base address in each process).

There was a proposal to add them to FreeBSD back in 2002 which includes the source code.
jjinux said…
Ian, thanks for your comment.
jjinux said…
> This is off topic, but that wouldn't happen to be the Scott Meyers of Effective C++ fame would it?

Embarrassingly, I don't know. I don't know him all that well. I do know he's a brilliant hacker, and he's the manager of the graph database team at Metaweb.
jjinux said…
Wow, thanks for the great comments, everyone! I was taking a risk posting something outside my realm of experience, but, man, what interesting responses!
Anonymous said…
Yet another case of "why I didn't I think of it before?". A very simple approach to take advantage of obvious commonalities in existing code. .

I thnk that the Scott Meyers of Effective C++ is into training and consulting
jjinux said…
> I thnk that the Scott Meyers of Effective C++ is into training and consulting

Ah, not the same guy.
Anonymous said…
Interesting that you mention macros, and passing offsets... However, if you consistently name the next/prev pointers, you can combine the two into a macro that calls the underlying function with the offsets for the specific struct. For example:

#define LINKSHOW(object) linkshow(((void *) - ((void *) object), \
((void *) object.prev) - ((void *) object), object)

so you can just call "LINKSHOW(object)" which then calls linkshow(nextOffset, prevOffset, object)

The magic of macros effectively implement C++-like code templates.

jjinux said…
> #define LINKSHOW(object) linkshow(((void *) - ((void *) object), \
((void *) object.prev) - ((void *) object), object)

Very nice ;)
George Reilly said…
Windows, internally, has used the same idea for many years. Look at CONTAINING_RECORD and LIST_ENTRY, which are defined in ntdef.h. LIST_ENTRYs are used heavily in the Windows source code, though it's only exposed to general developers in driver code.

Popular posts from this blog

Drawing Sierpinski's Triangle in Minecraft Using Python

In his keynote at PyCon, Eben Upton, the Executive Director of the Rasberry Pi Foundation, mentioned that not only has Minecraft been ported to the Rasberry Pi, but you can even control it with Python . Since four of my kids are avid Minecraft fans, I figured this might be a good time to teach them to program using Python. So I started yesterday with the goal of programming something cool for Minecraft and then showing it off at the San Francisco Python Meetup in the evening. The first problem that I faced was that I didn't have a Rasberry Pi. You can't hack Minecraft by just installing the Minecraft client. Speaking of which, I didn't have the Minecraft client installed either ;) My kids always play it on their Nexus 7s. I found an open source Minecraft server called Bukkit that "provides the means to extend the popular Minecraft multiplayer server." Then I found a plugin called RaspberryJuice that implements a subset of the Minecraft Pi modding API for B

Ubuntu 20.04 on a 2015 15" MacBook Pro

I decided to give Ubuntu 20.04 a try on my 2015 15" MacBook Pro. I didn't actually install it; I just live booted from a USB thumb drive which was enough to try out everything I wanted. In summary, it's not perfect, and issues with my camera would prevent me from switching, but given the right hardware, I think it's a really viable option. The first thing I wanted to try was what would happen if I plugged in a non-HiDPI screen given that my laptop has a HiDPI screen. Without sub-pixel scaling, whatever scale rate I picked for one screen would apply to the other. However, once I turned on sub-pixel scaling, I was able to pick different scale rates for the internal and external displays. That looked ok. I tried plugging in and unplugging multiple times, and it didn't crash. I doubt it'd work with my Thunderbolt display at work, but it worked fine for my HDMI displays at home. I even plugged it into my TV, and it stuck to the 100% scaling I picked for the othe

Creating Windows 10 Boot Media for a Lenovo Thinkpad T410 Using Only a Mac and a Linux Machine

TL;DR: Giovanni and I struggled trying to get Windows 10 installed on the Lenovo Thinkpad T410. We struggled a lot trying to create the installation media because we only had a Mac and a Linux machine to work with. Everytime we tried to boot the USB thumb drive, it just showed us a blinking cursor. At the end, we finally realized that Windows 10 wasn't supported on this laptop :-/ I've heard that it took Thomas Edison 100 tries to figure out the right material to use as a lightbulb filament. Well, I'm no Thomas Edison, but I thought it might be noteworthy to document our attempts at getting it to boot off a USB thumb drive: Download the ISO. Attempt 1: Use Etcher. Etcher says it doesn't work for Windows. Attempt 2: Use Boot Camp Assistant. It doesn't have that feature anymore. Attempt 3: Use Disk Utility on a Mac. Erase a USB thumb drive: Format: ExFAT Scheme: GUID Partition Map Mount the ISO. Copy everything from