Designing A Linked List Class

Copyright Donovan Rebbechi 2001. This tutorial may be distributed freely if this copyright notice is preserved. Changes require authors permission. The code is no-strings attached free software, it may be redistributed, modified, and used.

linked list

A topic that is often addressed in introductory computer science courses is the design and implementation of a linked list. This seemingly simple task turns out to be surprisingly difficult. Unfortunately, most textbooks make a terrible mess of it. We examine some of the design issues that surface in the process of designing and implementing such a class, and offer some implementations.

Introduction to Linked Lists

A linked list is a data structure that consists of a sequence of nodes, x1, ..., xn where each node xi contains the following information:

A C++ implementation of a node is quite simple:

template <class T>
struct Node
{
	Node (const T& x, Node* y):data(x),next(y) {}
	T data;
	Node* next;
};

The main desirable property of the linked list is that the ordering is a property of the Node* data members of the list. This means that a lot of algorithms only require pointer manipulations. For example, to delete a node x2 from a linked list, the following code suffices:

x1->next = x2->next;
delete x2;

Notice that this operation did not require a change in the memory address of other nodes in the list, and it did not require us to move data around (deleting the second element of an array on the other hand would require this.) In fact, even operations such as sorting a linked list do not invalidate pointers to elements in the list, because the ordering is a property of the values of the member variable next in each node. This is often a very useful property. For example, it is possible to maintain a linked list in a sorted state, and hold pointers to elements in the list, while adding data to the list. The pointers do not need to be updated when new elements are added.

On the other hand, linked lists do have some disadvantages. For example, they support sequential access but not random access. To locate the nth element, one must dereference n pointers. This is in contrast to the constant time element access in an array.

Design Considerations

Some good advice found in one textbook (which unfortunately does not appear to heed its own advice) is to pay careful attention to how an interface will be actually used, when designing it. It's instructive to write some code that uses your class before you implement it. This may seem perverse to the uninitiated, but it is in fact very useful. First, it has the benefit of making sure that the class is usable, and secondly, it serves as a kind of instant test suite.

The most difficult aspect of designing a list class is element access. While this seems very simple, it's surprisingly subtle, and textbooks do a surprisingly good job of botching it. The following are key questions:

  1. How do we iterate over the elements of a list ?
  2. How do we represent a position in the list ? For example, operations on list nodes such as find, insert and erase require a means of representing position.

Addressing the former issue immediately addresses the latter, because if we have a way to represent a position in the list, then we can use an operation that moves from one position to the next, to iterate.

First, we look at the first consideration.

Iterating over lists

Accessing list elements

Clearly, the best way to flexibly address the former concern would be to provide a way to address the latter. There are a few ways to access elements in a list:

A Bad Approach to Element Access

The first approach is not acceptable. It suffers from the substantial problem that simply accessing data modifies the state of the list. Here is an example of code that will break if we use the bad approach:

	find (mylist,3); // sets current element
	print (mylist); // print() modifies current element
	cout << mylist.current();

The problem with the above code is that the print() statement needs to modify the "current" element. This has a few consequences: it means that we cannot pass a list by const reference to any useful function (since most useful functions will modify the "current" element), and the deeper problem is that seemingly innocent statements, even debugging statements, modify the state of the list. The prospect of debugging statements changing program behaviour should send shivers down the spine of any programmer. Such behaviour results in infuriating Heisenbugs -- bugs whose visibility depends on whether or not debugging is enabled.

This raises an issue -- change in the state of the "current" element should not be considered a change in the list. In other words, there is a need for a representation of "current element" in the list that is not part of the list itself. Having found our first approach unworkable, we move on to the other two alternatives.

A Naive (and bad) Approach to Element Access

An approach that appeals to beginners, and even some misguided textbook authors, is to use array notation to access elements of linked lists. This is tempting, because programmers become acquainted with arrays at an early level, and from this stems a natural desire to treat any sequence container as an array. Resist this temptation at all costs!!! There are several problems with this, but they all boil down to a very simple fact:

a linked list is not an array, and is not a random access data structure
And hence attempts to treat it as one inevitably lead to disaster. Consider the following code:
	for (int i = 0; i < mylist.size(); ++i )
		cout << mylist[i] << endl;

Accessing the nth node takes n operations where n is the size of the list, so the loop takes 1 + 2 + 3 + ... + n = n(n+1)/2 operations. This is disasterous for large values of n, considering that we can do the same in linear time if we use the pointer based or iterator based approach.

Another fundamental problem with integer offsets is that they are invalidated by insertions and deletions. For example, consider this code:

  int pos = mylist.find ( "Thomas" );
  mylist.insert ( "William" );
  cout << mylist[pos]; // pos no longer refers to "Thomas"
This is a weakness that we are stuck with when we work with arrays, but there is no reason why we should suffer the same for lists, because inserting in a list does not change the mewmory address of the nodes.

Suffice it to say that we will not seriously consider implementing a linked list that uses this misguided approach as the primary means of element access.

The Pointer Approach to Element Access

The second approach is the one that is usually used in C implementations. In a C++ implementation, there are two different sub-categories of this approach.

These approaches may seem identical on the face of it, but there are some key differences. For example, inserting an element before the head node is problematic in the former case. These are examples of ways one could write a function to insert at the front, using the linkedlist-is-a-node approach:
void insert_front ( Node*&  head, const element_type & x)
{
	Node* tmp = new Node(head, x);
	head = tmp;
}
// example client code
insert_front(mylist, x);

Node* insert_front ( Node*  head, const element_type & x)
{
	Node* tmp = new Node(head, x);
	return tmp;
}

// example client code
head = insert_front(mylist, x);
The linked-list-is-a-node approach is problematic in that calling them on a Node that is not at the head of the list will result in disaster (it will "unlink" the list, making access to elements beyond the operand inaccesible, and it will leave the pointer to the operand pointing at garbage)

The linked-list-is-not-a-node approach has the advantage that encapsulation makes it possible to ensure that client code does not corrupt the list. An example of code using this approach:

void insert_front ( const element_type & x)
{
	Node* tmp = new Node(head, x);
	head = tmp;
}
// example client code
mylist.insert_front(x);
Notice that client code does not have the opportunity to corrupt the list, because the client code does not have to know which Node is at the front of the list.

The Iterator Based Approach To Element Access

This approach uses a special iterator class that encapsulates a node pointer. One of the key advantages of this is that we can use this to provide consistent semantics to different container types. We defer further discussion of iterators.

Basic Operations

Recall our previous Node implementation:

	struct Node
	{
		Node(const T& x, Node* y):data(x),next(y) {}
		T data;
		Node* next;
	};

Notice the constructor. This makes it simpler to write code that creates list nodes. Even if we were implementing the list in C, it would be desirable to write a function that creates a node. Notice also the fact that the node is a struct. This is intentional -- the node is a simple data aggregate.

Now we need to design the interface for the list. The first thing to consider is what operations should a linked list support. Here is a start:

In addition, we also need to be able to create lists. This gives rise to the following creational operations:

There are other operations that may be required, and have consequences for our design. For example, fast removal at the back of the list, and concatenation of two lists. These require us to keep track of the last node of the list. Depending on how important these operations are, it may be useful for the list class to hold a pointer to the last element of the list.

Another possible design consideration is using a sentinel node at the beginning of the list. This has the drawback of adding overhead, but is advantageous in that it makes it possible to avoid "special-casing".

In our examples, we will decline both the "sentinel" and the "tail pointer" design enhancements, while observing that these are actually quite useful in certain situations, and could be introduced to our class via a wrapper class.

So we make a first attempt at writing an interface. For now, we omit the destructor. Strictly speaking, it isn't part of the interface, it's an implementation detail specific to the C++ language.


template <class T>
class List
{

	Node* head;
public:
	struct Node
	{
		Node(const T& x, Node* y):data(x),next(y) {}
		T data;
		Node* next;
	};	
	
	void push_front(const T& x)
	{
		Node* tmp = new Node(x, head);
		head = tmp;
	}
	List(); 
	List(const List&); 
	List& operator=(const List&); 
	bool empty();
	void pop_front();
	T& front();

};

A Design Crisis

We get stuck on the last two items. To "insert anywhere in the list", we need a way to represent a place in the list. Fortunately, our previous discussion has prepared us for this.

Another problem with singly linked lists is that given a node in the list, we can not erase that node in constant time, and we can't insert a node before it in constant time. This is because these operations require manipulation of the preceding node. This is a problem with singly linked lists that detracts from their usability, and the simplest solution is to use a doubly linked list instead if you want a more flexible, general purpose class (one could write iterators that contain ordered pairs of succesive node pointers, but given this overhead, one might as well use a doubly linked list)

We address this problem by using erase_after and insert_after methods, that operate on the successor to the function argument. The ugliness of this is noted.

A Simple Implementation

We first look into implementing a list, using the pointer based approach, with the linked-list-is-not-a-node model. We previously raised the importance of considering how the class is actually going to be used prior to implementing it. In this example, we could iterate over the list by using node pointers. For example, we could write a loop, and an insertion like this:

for ( Node* i = L.begin(); i!=L.end(); i=i-> next )
	cout << i->data << endl;
Node* x = L.find (3);
L.insert_after(x,4);

The notation is not that elegant, but it works.

Some Implementation Issues

There are a few issues that surface during implementation. For example, it proves useful to implement a swap() function that swaps the data of two lists. This function makes the assignment operator easier to implement (it can reuse the copy constructor) and it also offers a cheap destructive copy operation.

Another implementation issue is that copying a singly linked list requires a function to reverse the list, because the obvious strategy of iterating over the nodes and inserting them gives us a list in the wrong order. So a method that reverses the order of the list by pointer manipulation proves desirable.

Another subtle issue is const correctness. To declare a method that gives out pointers to data in the list const is dangerous, because it would allow functions that accepted a const reference to a list to use these pointers to modify the list. So we provide const versions of the methods begin() and front() that return a const pointer/reference.

Here is what the code would look like:, and here is a simple driver program


template <class T> class List
{
public:
	struct Node 
	{
		Node(const T& data, Node* next=0):data(data),next(next) {}
		Node* next;
		T data;
	};

	List() : head(0) {}

	List(const List& L) : head(0)
	{
		// copy in reverse order
		for ( const Node* i = L.begin(); i!= L.end(); i=i->next )
			push_front(i->data);
		reverse();
	}

	void reverse()
	{
		Node* p = 0; Node* i = begin(); Node* n;
		while (i)
		{
			n = i->next;
			i->next = p;
			p = i; i = n;
		}
		head = p;
	}

	void swap(List& x)
	{
		Node* tmp = head; head = x.head; x.head = tmp;
	}

	List& operator=(const List& x)
	{
		List tmp(x);
		swap(tmp);
		return *this;
	}

	~List() { clear(); }
	void clear() { while (!empty()) pop_front(); }

	bool empty() { return ! head; }

	void push_front(const T& x) { 
		Node* tmp = new Node(x,head); head = tmp; 
	}

	void pop_front() { 
		if (head) { Node* tmp = head; head=head->next; delete tmp; }
	}

	void insert_after(Node* x, const T& data)
	{
		Node* tmp = new Node(data, x->next);
		x->next = tmp;
	}

	void erase_after(Node* x)
	{
		Node* tmp = x->next;
		if (tmp)
		{
			x->next = x->next->next;
			delete tmp;
		}
	}


	T& front() { return head->data; }
	const T& front() const { return head->data; }

	Node* begin() { return head; }
	Node* end() { return 0; }

	const Node* begin() const { return head; }
	const Node* end() const { return 0; }

private:
	Node* head;
};





#include "simple_list.h"
#include <iostream>
int main()
{
	List<int> X;
	X.push_front(3);
	X.push_front(2);
	X.push_front(1);

	for (List<int>::Node* it = X.begin(); it; it = it->next )
		cout << it->data << endl;

	X.reverse();

	for (List<int>::Node* it = X.begin(); it; it = it->next )
		cout << it->data << endl;

}

Commentary on the Code

This list is still somewhat incomplete. It would be nice to have methods that perform splice and merge operations, as well as sorting and searching. But the code we have is a good start.

A Critique of the Simple Implementation

The simple implementation has some positive traits that set it apart from many textbook examples. It has the following properties:

However, there is one key drawback:

Introducing Iterators

An iterator is an object that represents a position. It makes it possible to move around in a container, without exposing the structure of that container. All container iterators support a "fetch" operation that retrieves the data at a given position, as well as comparison and assignment operations. But different iterators support different operations. For example, an iterator for a singly linked list supports "forward" movement, but not "backward" movement. An iterator for an array class supports movement in both directions, and supports constant time addition (random access). There are different types of iterators, with different capabilities:

General operations for iterators:

*it; // fetch data referenced by an iterator;
++it; it++;  // move a forward iterator forward. 
--it; it--; // move bidirectional iterator backward.
it += n; it -=n; //  increment/decrement random access iterator

i1 == i2; j1 == j2;  // comparison
i1 = x.begin();   // assignment

Iterators make for containers that are very consistent and easy to use. To get some idea of the power of iterators, here is some example code that uses iterators for vastly different STL containers:

// print out the elements of an STL set (implemented as a binary
// search tree)
for (set<double>::iterator it = x.begin(); it != x.end(); ++it )
	cout << *it << endl;

// print out the elements of an STL list  (doubly linked list)
for (list<double>::iterator it = y.begin(); it != y.end(); ++it )
	cout << *it << endl;

// print out the elements of an STL vector (dynamic array)
for (vector<double>::iterator it = z.begin(); it != z.end(); ++it )
	cout << *it << endl;

As you can see, the iterator lets the user get at the data, and keeps the details of the underlying data structure out of the way. While it's useful to know that the set is implemented as a binary tree, and has those performance characteristics (namely, it's maintained in sorted order, and searching takes O(log(n)) ), the complexity of the underlying data structure does not interfere with accessing data in the container. There is a good reason that the STL set is called set and not binary_search_tree -- the STL has diffused the complexity to the point that it really does behave like a "set", and iterators deserve a substantial amount of the credit for the success of this abstraction.

Iterators also have the advantage that one can use them to write generic functions that operate on pairs of iterators (ranges), and are entirely container independent (though they may be dependent on capaibilities of the iterator). For example, we could use the copy function to write some code that prints the contents of the above containers as follows:

// print out the elements of an STL set (implemented as a binary
// search tree)
copy (x.begin(),x.end(),ostream_iterator<double>(cout,"\n"));

// print out the elements of an STL list  (doubly linked list)
copy (y.begin(),y.end(),ostream_iterator<double>(cout,"\n"));

// print out the elements of an STL vector (dynamic array)
copy (z.begin(),z.end(),ostream_iterator<double>(cout,"\n"));

Implementing Iterators

An iterator that refers to a place in the list should support the following operations:

Some things we can't do --

so we have:

 T& operator*();

 iterator& operator++();   // ++it   
 iterator operator++(int); // it++
 bool operator==(const iterator&);  // i == j
 bool operator!=(const iterator&);  // i != j

 iterator(); // iterator i;
 iterator(Node*); 
 iterator(const iterator&); // iterator j(i)
 iterator& operator=(const iterator&); // iterator j; j = i;

Now we can add operations that depend on position to the list:

   iterator begin(); // return iterator at front of list
   iterator end(); //  return iterator at end of list
   insert_after(iterator,const T&); //  insert after iterator 
   erase_after(iterator); //  erase after iterator

We need to insert after the given iterator, because there is no way to backtrack in a singly linked list. We have the same issue with erase -- to erase a given node, we need to modify the node that precedes it.

const Correctness

You may have observed that the methods begin() and end() were not declared const. There is a reason for this -- suppose we pass our list to a function by const reference. If we declared begin() const, then this code would be legal:

void do_something(const list<int> & a )
{
	*a.begin() = 3; // modify list	
}

This is obviously a problem -- the function prototype claims to guarantee that the state of the list will remain unchanged, but does not honor this contract. To deal with this, we need a separate type for constant iterators, const_iterator. This supports the same operations as iterator, with the following differences:

Since there is an implicit conversion from iterator to const_iterator, there is no need to provide more comparison operators to handle mixed comparisons.

So this results in the addition of the following methods:

const_iterator begin() const;
const_iterator end() const;
const T& front() const;

One Last Thing: STL Compatibility

The standard template library (STL) offers a number of algorithms that we can use on our list for free, provided that we make our iterators STL compatible. The main thing this requires is several typedefs. in particular, the following information is expected:

typedef T value_type; // type of element
typedef T& reference; // return type of operator*()
typedef T* pointer; // return type of operator->()
difference_type; // type used to measure distance between iterators
typedef std::forward_iterator_tag iterator_category; 

The simplest way to do this is derive iterator from std::iterator. The standard iterator class is a template that takes two parameters -- an iterator tag, and a type. For example, for a list iterator, we want to derive from

std::iterator<std::forward_iterator_tag, T >
and for a const_iterator, we derive from
std::iterator<std::forward_iterator_tag, const T >

The reason we use const T is that this makes the pointer and reference types const *T and const T&.

It should be noted that there is no requirement that iterators be derived from std::iterator. However, it is a useful technique.

The iterator category tells algorithms about the capabilities of the iterator. For example, this tells the compiler that our iterators cannot be used with an algorithm that requires bidierctional or random access iteration. The typedefs for const_iterator are similar, just replace references and pointers with const versions.

An STL Compatible Singly Linked List

Finally, we have our list class and driver program


#include <iterator>
template <class T>
class List
{
	struct Node
	{
		Node(const T& x,Node* y = 0):m_data(x),m_next(y){}
		T m_data;
		Node* m_next;
	};

	Node* m_head;
public:

	class iterator 
		: public std::iterator<std::forward_iterator_tag, T>
	{
		Node* m_rep;
	public:
		friend class const_iterator;
		friend class List;

		inline iterator(Node* x=0):m_rep(x){}
		inline iterator(const iterator& x):m_rep(x.m_rep) {}
		inline iterator& operator=(const iterator& x)
		{ 
			m_rep=x.m_rep; return *this; 
		}
		inline iterator& operator++()
		{ 
			m_rep = m_rep->m_next; return *this; 
		}
		inline iterator operator++(int)
		{ 
			iterator tmp(*this); m_rep = m_rep->m_next; return tmp; 
		}
		inline reference operator*() const { return m_rep->m_data; }
		inline pointer operator->() const { return m_rep; }
		inline bool operator==(const iterator& x) const
		{
			return m_rep == x.m_rep; 
		}	
		inline bool operator!=(const iterator& x) const
		{
			return m_rep != x.m_rep; 
		}	

	};

	class const_iterator 
		: public std::iterator<std::forward_iterator_tag, const T> 
	{
		const Node* m_rep;
	public:
		friend class iterator;
		friend class List;

		inline const_iterator(const Node* x=0):m_rep(x){}
		inline const_iterator(const const_iterator& x):m_rep(x.m_rep) {}
		inline const_iterator(const iterator& x):m_rep(x.m_rep){}
		inline const_iterator& operator=(const const_iterator& x)
		{ 
			m_rep=x.m_rep; return *this; 
		}
		inline const_iterator& operator=(const iterator& x)
		{ 
			m_rep=x.m_rep; return *this; 
		}		
		inline const_iterator& operator++()
		{ 
			m_rep = m_rep->m_next; return *this; 
		}
		inline const_iterator operator++(int)
		{ 
			const_iterator tmp(*this); m_rep = m_rep->m_next; return tmp; 
		}
		inline reference operator*() const { return m_rep->m_data; }
		inline pointer operator->() const { return m_rep; }
		inline bool operator==(const const_iterator& x) const
		{
			return m_rep == x.m_rep; 
		}
		inline bool operator!=(const const_iterator& x) const
		{
			return m_rep != x.m_rep; 
		}



	};


	List() : m_head(0) {}

	List(const List& L) : m_head(0)
	{
		for ( const_iterator i = L.begin(); i!=L.end(); ++i )
			push_front(*i);
		reverse();
	}

	void reverse()
	{
		Node* p = 0; Node* i = m_head; Node* n;
		while (i)
		{
			n = i->m_next;
			i->m_next = p;
			p = i; i = n;
		}
		m_head = p;
	}

	void swap(List& x)
	{
		Node* tmp = m_head; m_head = x.m_head; x.m_head = tmp;
	}

	List& operator=(const List& x)
	{
		List tmp(x);
		swap(tmp);
		return *this;
	}

	~List() { clear(); }
	void clear() { while (!empty()) pop_front(); }



	inline void push_front(const T&x)
	{
		Node* tmp = new Node(x);
		tmp->m_next = m_head;
		m_head = tmp;
	}
	inline void pop_front()
	{
		if (m_head)
		{
			Node* newhead = m_head->m_next;
			delete m_head;
			m_head = newhead;
		}
	}
	inline bool empty() { return m_head; }

	inline T& front() { return *begin(); }
	inline const T& front() const { return *begin(); }

	inline iterator begin() { return iterator(m_head); }
	inline iterator end() { return iterator(); }
	inline const_iterator begin() const { return m_head; }
	inline const_iterator end() const { return const_iterator(); }

	void erase_after (iterator& x)
	{
		Node* tmp = x.m_rep->m_next;
		if (x.m_rep->m_next) 
			x.m_rep->m_next = x.m_rep->m_next->m_next;
		delete tmp;
	}

	void insert_after (iterator& x, const T& y)
	{
		Node* tmp = new Node(y,x.m_rep->m_next);
		x.m_rep->m_next = tmp;
	}
};




// linked list driver program

#include "linked_list.h"
#include <algorithm>
#include <numeric>
#include <iostream>

template <class T>
void print(const List<T> & x)
{
	std::copy(x.begin(), x.end(), std::ostream_iterator<T>(std::cout,"\n"));
}

int main()
{
	List<int> x;
	x.push_front(3);
	x.push_front(4);
	x.push_front(9);
	print(x);
	List<int>::iterator it = std::find(x.begin(),x.end(),9);
	if (it != x.end())
		std::cout << *it << std::endl;

	x.insert_after (it,11);
	std::cout << "The list is" << std::endl;
	print (x);

	int sum = std::accumulate ( x.begin(),x.end(),0);
	int sumsq = std::inner_product( x.begin(),x.end(),x.begin(),0);

	std::cout << "sum " << sum << std::endl;
	std::cout << "sum sq " << sumsq << std::endl;



	std::cout << "Copying ... " << std::endl;
	List<int> y(x);

	print (y);
	std::cout << "Done with copy" << std::endl;

	List<int>::const_iterator cit;
	std::cout << "** ";
	for (cit = y.begin(); cit != y.end(); ++cit) 
		std::cout << *cit << ' '; std::cout << std::endl;

}

Variations on the Theme

There are several other possible variations on the linked list theme. We discuss some of them:

Implementing A Circular Linked List with a Sentinel

We implement a circular linked list, with the following properties:

The bottom line is that we trade a slight amount of overhead (a sentinel node) for a fair degree of flexibility. At a first glance, it doesn't appear obvious that this design helps us write code that inserts before a given iterator. The trick here is to use "off-by-one" iterators. In particular, dereferencing an iterator produces the element at the next node:

		inline T& operator*() { return m_rep->m_next->m_data; }
The iterator has the effect of abstracting an "always look ahead by one position" strategy. Instead of providing messy operations such as find the iterator before the element whose value is 9, or erase the iterator after this one, which amounts to forcing the client code to look ahead, the iterators automatically do this work. This is possible because of the fact that we have a sentinel node, hence every node in the list has a predecessor. In fact, even the sentinel node has a predecessor, because the list is circular.

The main change in this list with respect to the non-circular list is the fact that we don't keep track of the head node, we keep track of a pointer m_node that points to the last element in the list. Hence m_node->next points to the dummy sentinel node, and m_node->next->next points to the first node in the list.

We dicuss the important changes:

Without further ado, we present the full listing. The source can be downloaded here, as can the driver program


#include <iterator>
template <class T>
class List
{
	struct Node
	{
		Node(const T& x,Node* y = 0):m_data(x),m_next(y){}
		T m_data;
		Node* m_next;
	};

	Node* m_node;

public:

	class iterator
		: public std::iterator<std::forward_iterator_tag, T>
	{
		Node* m_rep;
	public:
		friend class List;
		friend class const_iterator;
		typedef T value_type;
		typedef T& reference;
		typedef T* pointer;
		typedef int difference_type;
		typedef std::forward_iterator_tag iterator_category;

		inline iterator(Node* x=0):m_rep(x){}
		inline iterator(const iterator& x):m_rep(x.m_rep) {}
		inline iterator& operator=(const iterator& x)
		{ 
			m_rep=x.m_rep; return *this; 
		}
		inline iterator& operator++()
		{ 
			m_rep = m_rep->m_next; return *this; 
		}
		inline iterator operator++(int)
		{ 
			iterator tmp(*this); m_rep = m_rep->m_next; return tmp; 
		}
		inline reference operator*() const { return m_rep->m_next->m_data; }
		inline pointer operator->() const { return m_rep->m_next; }
		inline bool operator==(const iterator& x) const
		{
			return m_rep == x.m_rep; 
		}	
		inline bool operator!=(const iterator& x) const
		{
			return m_rep != x.m_rep; 
		}	

	};

	class const_iterator
		: public std::iterator<std::forward_iterator_tag, const T>
	{
		const Node* m_rep;
	public:
		friend class List;
		friend class iterator;
		inline const_iterator(const Node* x=0):m_rep(x){}
		inline const_iterator(const const_iterator& x):m_rep(x.m_rep) {}
		inline const_iterator(const iterator& x):m_rep(x.m_rep){}
		inline const_iterator& operator=(const const_iterator& x)
		{ 
			m_rep=x.m_rep; return *this; 
		}
		inline const_iterator& operator=(const iterator& x)
		{ 
			m_rep=x.m_rep; return *this; 
		}		
		inline const_iterator& operator++()
		{ 
			m_rep = m_rep->m_next; return *this; 
		}
		inline const_iterator operator++(int)
		{ 
			const_iterator tmp(*this); m_rep = m_rep->m_next; return tmp; 
		}
		inline reference operator*() const { return m_rep->m_next->m_data; }
		inline pointer operator->() const { return m_rep->m_next; }
		inline bool operator==(const const_iterator& x) const
		{
			return m_rep == x.m_rep; 
		}
		inline bool operator!=(const const_iterator& x) const
		{
			return m_rep != x.m_rep; 
		}



	};


	List() : m_node(new Node(T())) { m_node->m_next = m_node; }

	List(const List& L) : m_node(new Node(T()))
	{
		m_node->m_next=m_node;
		for ( const_iterator i = L.begin(); i!= L.end(); ++i )
			push_front(*i);
		reverse();
	}

	void reverse()
	{
		if (empty())
			return;
		Node* new_m_node = m_node->m_next->m_next;
		Node* p = m_node; Node* i = m_node->m_next; Node* n;
		do  
		{
			n = i->m_next;
			i->m_next = p;
			p = i; i = n;
		}
		while (p!=m_node);
		m_node = new_m_node;
	}

	void swap(List& x)
	{
		Node* tmp = m_node; m_node = x.m_node; x.m_node = tmp;
	}

	List& operator=(const List& x)
	{
		List tmp(x);
		swap(tmp);
		return *this;
	}

	~List() { clear(); delete m_node; }
	void clear() { while (!empty()) pop_front(); }



	inline void push_front(const T&x)
	{
		insert (begin(),x);
	}
	inline void push_back(const T& x)
	{
		insert (end(),x);
	}
	inline void pop_front()
	{
		erase(begin());
	}
	inline bool empty() { return m_node == m_node->m_next; }

	inline T& front() { return *begin(); }
	inline const T& front() const { return *begin(); }
	inline T& back() { return m_node->data; }
	inline const T& back() const { return m_node->data; }

	inline iterator begin() { return iterator(m_node->m_next); }
	inline iterator end() { return iterator(m_node); }
	inline const_iterator begin() const 
	{ 
		return const_iterator(m_node->m_next); 
	}
	inline const_iterator end() const 
	{ 
		return const_iterator(m_node); 
	}

	void erase (iterator x)
	{
		if (x==end())
			return;
		if (x.m_rep->m_next == m_node)
			m_node = x.m_rep;

		Node* tmp = x.m_rep->m_next;
		x.m_rep->m_next = x.m_rep->m_next->m_next;
		delete tmp;
	}

	void insert (iterator x, const T& y)
	{
		Node* tmp = new Node(y,x.m_rep->m_next);
		if (x.m_rep == m_node)
			m_node = tmp;
		x.m_rep->m_next = tmp;
	}

	// rotate x to beginning
	void rotate (iterator x)
	{
		if (x == end())
			return;
		Node* sentinel = m_node->m_next;
		m_node->m_next = m_node->m_next->m_next;
		sentinel->m_next = x.m_rep->m_next;
		x.m_rep->m_next = sentinel;
		m_node = x.m_rep; 
	}
};



Implementing A Circular Linked List without a Sentinel

We present example code for a circular list without a sentinel and a driver program. The main changes are that we need a lot of special-case code in methods such as push_front(), push_back(),pop_front(), and some extra baggage for the iterators. The up() method is provided to offer a means of handing the entire list as a range to an STL function. For example, the range it, it.up() will include the entire list, while it, it would obviously be an empty list. On the other hand, the rotate() function is somewhat simpler in this version. The driver code for this list can transparently handle ranges that straddle the end. The main penalty is not so much in the increased iterator sizes (we don't need to construct iterators very often anyway), but in the increased bookkeeping overhead. The relative slowness of comparing and incrementing iterators is a liability in cases where a lot of simple operations need to be performed on large lists. On the other hand, if iteration speed is not a potential performance bottleneck, these lists could prove useful. They have the versatility of the sentinel based linked list, and lessmemory overhead.

Comparison Between Different Lists

Here, we compare three list implementations -- the circular, and basic lists implemented above, and the SGI implementation of the STL list class. The SGI list is a doubly linked list. This list has a characteristic that sets it apart from singly linked lists -- it supports bidirectional iteration. Support for bidirectional iteration is an STL requirement, so a linked list that meets the requirements for the STL list must be doubly linked. As we will see, there is a pattern emerging in the comparison -- there is an escalating tradeoff between minimising overhead and maximising flexibility. The STL class aims to offer as much flexibility as possible.

Operation/Feature Simple list Our Circular list SGI STL list
CircularNoYesYes
Doubly linkedNoNoYes
SentinelNoYesYes
Empty list overhead One pointer One pointer and one node One pointer and one node
Overhead per node One pointer and data One pointer and data Two pointers and data
Constant time insert at frontYes Yes Yes
Constant time append at backNo Yes Yes
Constant time remove at front Yes Yes Yes
Constant time remove at back No No Yes
Access first elementConstant time Constant time Constant time
Access last elementLinear time Constant time Constant time
insert() insert_after only Yes Yes
erase() erase_after only Yes Yes
forward iteration YesYes Yes
reverse iteration NoNoYes
Constant time rotateNoYesYes (using splice())

Other Implementation Issues

While we've covered a lot of issues, there are more issues that the reader should be aware of. Think of this section as an invitation to further exploration.

Exception Safety

is an important consideration in library design, especially where templates are involved. There are three possible guarantees a class can make for a given method:

The most authoritative and detailed treatment of the topic of exception safety is given by Herb Sutter in his
Guru of the week column. The information is also to be found in his book Exceptional C++. This article by David Abrahams is also interesting reading.

Exception safety is one of the most essential topics for library designers, especially when template code is involved. Further reading is highly recommended.

Template Bloat

is a classical problem with implementing template classes. The problem is that templates generate a certain amount of object code per template instance. So for example, if you create a list<int>, list<float>, list<string>, and a list<double>, then the amount of object code generated is approximately 4 times as much as what you'd have if you only created one of the types of lists. The solution to this problem is to pull as much code as possible out of the templates. It is possible to do this for containers of pointers, by using specialisations:

template <class T> class List<T*> : private List<void*>
{
public:
	typedef List<void*> super;

	inline void push_back(const T* & x) { super::push_back(x); } 
	inline T* front() { return (T*) super::front(); }
	...


};

This way, the overhead per class amounts to a small type-safety wrapper, but the bulk of the implementation need only be compiled once. The approach I would recommend to this is to use an STL list as a basis, and use private inheritance. One of the nice things about this approach is that you can choose to only re-implement the member functions that you need, to minimise the overhead. In this example, I only implement push_front, pop_front and non-const iterators (conventional, as well as back/front inserters). Most of the member functions are very simple to implement, one simply forwards the call to the base class. This section is not intended to be a tutorial so much as it is an introduction to this template bloat issue (even so, it's as detailed as anything you're going to find in a book). However, in this instance, an example is in itself very instructive. Here is some example code (download) that illustrates the basic approach. One can test the effectiveness of the example, by comparing the size of object files generated by creating several instances of PtrList parametrised by different types, and do the same for the STL list class. One can explicitly request instances with the following syntax: template class list<int> The only compiler I know of that uses the derive-from-void* idiom presented here is Metrowerks Code Warrior.


template <class T> 
class PtrList : 
private std::list<void*>
{
	typedef std::list<void*> base_class;
public:
	inline void push_back (T* x) { base_class::push_back(x); }
	inline void push_front (T* x) { base_class::push_front(x); }

	friend class back_insert_iterator<PtrList>;
	friend class front_insert_iterator<PtrList>;

	class iterator : private base_class::iterator
	{
	public:
		friend class PtrList;
		inline iterator& operator++()
		{ 
			base_class::iterator::operator++(); return *this;
		}
		inline iterator operator++(int x)
		{ 
			return base_class::iterator::operator++(x);
		}
		inline iterator& operator--()
		{ 
			base_class::iterator::operator--(); return *this;
		}
		inline iterator operator--(int x)
		{ 
			return base_class::iterator::operator--(x);
		}

		inline bool operator==(const iterator& right) const
		{
			return base_class::iterator::operator==(right);
		}

		inline T* operator*() { return (T*)base_class::iterator::operator*(); }

	private:
		inline iterator ( const base_class::iterator & x)
			: base_class::iterator(x)
			{}


	};


	iterator begin() { return iterator(base_class::begin()); }
	iterator end() { return iterator(base_class::end()); }

};

template<class T>
class back_insert_iterator<PtrList<T> > : 
private back_insert_iterator<std::list<void*> >
{
	typedef back_insert_iterator<std::list<void*> > back_inserter_base;
public: 
	back_insert_iterator(PtrList<T> & x)
		: back_inserter_base(x) {}
	back_insert_iterator& operator=(T*  x)
	{
		back_inserter_base::operator=(x);
		return *this;
	}
	back_insert_iterator& operator*(){ return *this; }
	back_insert_iterator& operator++(){ return *this; }
	back_insert_iterator operator++(int){ return *this; }
};

template<class T>
class front_insert_iterator<PtrList<T> > : 
private front_insert_iterator<std::list<void*> >
{
	typedef front_insert_iterator<std::list<void*> > front_inserter_base;
public: 
	front_insert_iterator(PtrList<T> & x)
		: front_inserter_base(x) {}
	front_insert_iterator& operator=(T*  x)
	{
		front_inserter_base::operator=(x);
		return *this;
	}
	front_insert_iterator& operator*(){ return *this; }
	front_insert_iterator& operator++(){ return *this; }
	front_insert_iterator operator++(int){ return *this; }
};



Appendix: A Quick Lit Review

There are many textbooks that cover linked lists, and they reflect different approaches. Unfortunately, the vast majority of them do a fairly poor job at it. Of these titles, the only one that provides much insight to beginners on the topic of implementing data structures is the Weiss book. The Koenig/Moo book is also good, but it addresses a more advanced audience.

Of these books, I'd enthusiastically recommend Cormen et al as a pure theory book for Data Structures. I believe theory should be studied separately from implementation, it is complex and rich enough in its own right to deserve special attention. On the other hand, sticky implementation details are important, and this is where the books by Weiss prove valuable. The Koenig and Moo book also does a good job at addressing many library implementation and generic programming issues, but it is not a data structures book (the main theme is the related topic of library design)