Note, I'm speaking a little loosely, and I'm not a C expert, but I think this post is still interesting nonetheless.
There are many ways to "architect" and implement lists.
These days, it's the norm in languages like Java, Python, etc. for the language or language library itself to provide a list implementation. You simply create a list and then start shoving objects into it. Python, Ruby, Perl and PHP all provide native syntax for lists. The algorithm used is more sophisticated than a singly or doubly-linked list because lists must double as arrays in those languages. Java and C++ have native arrays and array syntax, but they also provide a variety of list implementations in their libraries. Both can use generics to constrain the type of objects you put in those lists. One thing to note, however, is that there's a distinction between the list itself and the items in the list.
Since C's standard library doesn't provide a list implementation, a variety of approaches are used. It's not uncommon to create structs that have prev and next pointers in them and to manage the list manually. In general, it seems common in C to create data structures manually and to simply manage the data structure as part of the overall programming task. This is in stark contrast to, say, Python where the list implementation is in C and the code using the list is in Python.
My buddy Kelly Yancey once showed me that FreeBSD had macros so that programmers wouldn't have to keep reimplementing linked lists all the time in the kernel. I think macros were used instead of functions so that the code behaved as if it were actually written inline, thus avoiding the function call overhead--but I could be wrong.
At Metaweb, I had a buddy named Scott Meyer who use to work at Oracle. He showed me a pretty interesting trick for creating linked lists. As before, he had structs which contained application data as well as next and prev pointers for managing the structs in a linked list. However, rather than manipulate those linked lists manually, he wrote separate, generic linked list functions. The question is: how can a generic linked list function operate on random structs (i.e. void *'s) that just happen to have prev and next fields somewhere within them?
For each type of struct, he would create a "descriptor" struct that contained the offset of the prev and next fields. He would pass the descriptor to the function anytime he needed to manipulate the linked list. Loosely speaking, it's like saying, "Hey, I got this linked list, and I want you to insert a new member into it. You might not know anything about the type of structs in the linked list, but I can tell you that the prev pointer is 16 bytes from the beginning of the struct, and the next pointer is 20 bytes from the beginning."
I think the lesson is deep. If you want to write a function that operates generically on structs that you know have certain fields, you just have to tell that function the offset of those fields. Naturally, you don't want to just pass the offsets. Rather, you pack them up into a struct. Creating such a struct is like declaring that you implement some interface. I had seen how structs full of function pointers doubled as interfaces in Apache, but structs full of offsets was something entirely new for me.
Neat!
There are many ways to "architect" and implement lists.
These days, it's the norm in languages like Java, Python, etc. for the language or language library itself to provide a list implementation. You simply create a list and then start shoving objects into it. Python, Ruby, Perl and PHP all provide native syntax for lists. The algorithm used is more sophisticated than a singly or doubly-linked list because lists must double as arrays in those languages. Java and C++ have native arrays and array syntax, but they also provide a variety of list implementations in their libraries. Both can use generics to constrain the type of objects you put in those lists. One thing to note, however, is that there's a distinction between the list itself and the items in the list.
Since C's standard library doesn't provide a list implementation, a variety of approaches are used. It's not uncommon to create structs that have prev and next pointers in them and to manage the list manually. In general, it seems common in C to create data structures manually and to simply manage the data structure as part of the overall programming task. This is in stark contrast to, say, Python where the list implementation is in C and the code using the list is in Python.
My buddy Kelly Yancey once showed me that FreeBSD had macros so that programmers wouldn't have to keep reimplementing linked lists all the time in the kernel. I think macros were used instead of functions so that the code behaved as if it were actually written inline, thus avoiding the function call overhead--but I could be wrong.
At Metaweb, I had a buddy named Scott Meyer who use to work at Oracle. He showed me a pretty interesting trick for creating linked lists. As before, he had structs which contained application data as well as next and prev pointers for managing the structs in a linked list. However, rather than manipulate those linked lists manually, he wrote separate, generic linked list functions. The question is: how can a generic linked list function operate on random structs (i.e. void *'s) that just happen to have prev and next fields somewhere within them?
For each type of struct, he would create a "descriptor" struct that contained the offset of the prev and next fields. He would pass the descriptor to the function anytime he needed to manipulate the linked list. Loosely speaking, it's like saying, "Hey, I got this linked list, and I want you to insert a new member into it. You might not know anything about the type of structs in the linked list, but I can tell you that the prev pointer is 16 bytes from the beginning of the struct, and the next pointer is 20 bytes from the beginning."
I think the lesson is deep. If you want to write a function that operates generically on structs that you know have certain fields, you just have to tell that function the offset of those fields. Naturally, you don't want to just pass the offsets. Rather, you pack them up into a struct. Creating such a struct is like declaring that you implement some interface. I had seen how structs full of function pointers doubled as interfaces in Apache, but structs full of offsets was something entirely new for me.
Neat!
Comments
Similarly the linked list routines are just one step away from being a dynamically typed system -- if you put the struct describing offsets in the original struct itself then you'd have a type.
#define NEXTOFFSET(list) ((unsigned long)((list)->next) - (unsigned long)(list))
Then,
list = list_sort(list, NEXTOFFSET(list), comparison_function);
I guess building a framework to organize that isn't too bad an idea.
In many ways the Amiga was way ahead of its time.
There was a proposal to add them to FreeBSD back in 2002 which includes the source code.
Embarrassingly, I don't know. I don't know him all that well. I do know he's a brilliant hacker, and he's the manager of the graph database team at Metaweb.
I thnk that the Scott Meyers of Effective C++ is into training and consulting http://www.aristeia.com/
Ah, not the same guy.
#define LINKSHOW(object) linkshow(((void *) object.next) - ((void *) object), \
((void *) object.prev) - ((void *) object), object)
so you can just call "LINKSHOW(object)" which then calls linkshow(nextOffset, prevOffset, object)
The magic of macros effectively implement C++-like code templates.
Sean
((void *) object.prev) - ((void *) object), object)
Very nice ;)