Rust std study series: LinkedList

Continuing from Rust standard library study series, it’s time for LinkedList<T>. Note that implementation are taken fromRust stable v1.33.0.
Adoubly-linkedlist withowned nodes.
TheLinkedListallowspushing andpopping elements at either end inconstant time.
Almost always it is better to useVecorVecDequeinstead ofLinkedList. In general, array-based containers arefaster, more memory efficientand make better use of CPUcache**.Rust std doc
Note that unlike Vec<T>
- accessing an element through index is \(O(n)\) i.e. needs to iterate linearly over the list.
appendis \(O(1).\)- Interesting how
linked_list::Iteris different fromlinked_list::IterMut. Invariant ofIterMutis enforced with&mutandPhantomData<&'a Node<T>>ensures soundness (more onPhantomData, dropck later).
There’s an entire book (which I highly recommend to go through details if you haven’t already) convincing the reader why it’s tricky to implement evensingly-linked list and most probably not a good idea for new Rust users!
Because of Rust’s affine type system / ownership, it’s actually tricky to implementdoubly-linked list. The main reason is it seems a node needs to havetwo ownersfrom adjacent nodes. However, that’s possible with NonNull<T> which we talked about in Vec<T> study.
Here’s the stripped down definition
// Note: NonNull<T> does NOT own the referent
#[repr(transparent)] // <-- enforces the same type representation as *const T
struct NonNull<T: ?Sized> {
pointer: *const T,
}
struct Node<T> {
next: Option<NonNull<Node<T>>>, // Not Option<Box<Node<T>>>
prev: Option<NonNull<Node<T>>>,
element: T,
}
struct LinkedList<T> {
head: Option<NonNull<Node<T>>>,
tail: Option<NonNull<Node<T>>>,
len: usize,
marker: PhantomData<Box<Node<T>>>, // <-- sound dropck
}Why not Option<Box<Node<T>>> ?
It’s probably a good idea to see what would be the difference if we use Box<Node<T>> instead. We discussed what Unique<T> is and how it’s different from NonNull<T> previously, but as a quick reminder, Unique<T> owns the referent whereas NonNull<T> does not and in fact Box<T> (apointer typeforheap allocation) just wraps Unique<T> and provides new interface for interacting with Unique<T>.
Let’s consider Node<T> with Box as follows (playpen link)
struct Node<T> {
prev: Option<Box<Node<T>>>,
next: Option<Box<Node<T>>>,
element: T,
}
fn main() {
let mut head_node = Node {
prev: None,
next: None,
element: 1,
};
let next_node = Node {
prev: Some(Box::new(head_node)), // <-- head_node is moved here
next: None,
element: 2,
};
head_node.next = Some(Box::new(next_node)); // Not good!
}This begs forUse-After-Free(UAF) soUndefined Behavior(UB) which we know we shouldn’t push further. However, using a non-owning NonNull<T> can solve the problem as follows (playpen link)
use std::ptr::NonNull;
struct Node<T> {
prev: Option<NonNull<Node<T>>>,
next: Option<NonNull<Node<T>>>,
element: T,
}
fn main() {
let mut head_node = Node {
prev: None,
next: None,
element: 1,
};
let next_node = Node {
prev: NonNull::new(&mut head_node as *mut _),
next: None,
element: 2,
};
head_node.next = NonNull::new(&next_node as *const _ as *mut _);
}But how can we make sure this is sound esp. when using it in LinkedList<T>. More precisely
What’s the relation between PhantomData and dropck ?
I’ve been trying to understand the deeper relation between PhantomData and what makes dropck sound (so does LinkedList<T> sound), but couldn’t find any clear explanations so I asked it in the user channel and got an amazing thorough answer which can be generalized to Vec, LinkedList etc.