# Rust std study series: Vec

The upcoming series of blog posts will contain my study of Rust standard library. I’ve partially written some parts of the series in scattered places and want to gather them in one place for better and easier access. I intend to update whenever I discover something interesting/important to remember.

I’m referring to implementations in Rust stable v1.33.0.

This post covers some details of std::vec that I found important.

## Vec<T>

`Vec<T>` is a dynamic array which only grows and never shrinks automatically.

A contiguous growable array type with heap-allocated contents.

Rust std doc

Notice the difference between its counterpart stack-allocated fixed-size array `[T; N]` (where at this time `N` needs to be a specified non-negative integer. Constant generic hopefully will come soon).

Ok! let’s dig in to it. `Vec<T>` contains (pointer, capacity, length).

### Vec pointer

1. The pointer will never be null, so it enjoys null-pointer-optimization.
2. The pointer may not actually point to allocated memory for example in`Vec::new(), vec![]` or `Vec::with_capacity(0)`.

### Vec capacity and length

1. The capacity of a vector is the amount of space allocated for any future elements that will be added onto the vector.
2. The length is the number of actual elements pushed/inserted in the vector.
3. `Vec` allocates memory iff `mem::size_of::<T>() * capacity() > 0`. So it does not allocate for a Zero-Sized-Type (ZST) even with positive capacity.
4. When length matches the capacity, `Vec` will (re)allocate by a certain growth factor. This makes insertion amortized $O(1)$. Right now the growth factor is 2. However, comparing to other languages such as C++, Java, etc. it doesn’t seem to be optimal given any global first-fit allocator. Heuristically, 1.5 or a number a bit less than golden ratio is considered optimal. Here’s the related issue that’s currently open. I found it interesting to dig in!
5. How about shrink factor? for example, if we `pop` half of the elements out, would a quarter of the memory be freed? No, actually! That’s if `pop`ing all the elements the capacity won’t change leaving a hole on the heap. Therefore, `pop` (from back) is $O(1)$ and not amortized $O(1).$ If you need to free up some memory, use `shrink_to_fit`.
6. If need to use `Vec` for FFI or as a memory-backed collection, be sure to deallocate memory with `from_raw_parts` and then drop explicitly.
7. If used in FFI and need to pass as pointer, for safety remember to call `shrink_to_fit` or `truncate` by the length prior to passing the (`as_mut_ptr()` or `as_ptr()`) to not pass uninitialized memory buffer.
8. The order of elements is always guaranteed to be the same if coerced into slice.

Here’s the stripped down definition

```struct Vec<T> {
buf: RawVec<T>,
len: usize,
}

// Default Global allocator
struct RawVec<T, A: Alloc = Global> {
ptr: Unique<T>,
cap: usize,
a: A,
}

#[repr(transparent)]
struct Unique<T: ?Sized> {
pointer: *const T,
_marker: PhantomData<T>,
}```

### Unique<T>

1. #[repr(transparent)] enforces that `Unique<T>`type representation is the same as `*const T`.
2. `Unique<T>` is the covariant version of `*mut T` (wrt `T`) and has stronger semantic than `NonNull<T>`.
3. Unlike `*mut T`, the pointer must always be non-null.
4. In fact, `Box<T>` wraps `Unique<T>` i.e. `struct Box<T: ?Sized>(Unique<T>)`.
5. It can be accessed in nightly through `#![feature(ptr_internals)]` and `core::ptr::Unique`.
6. If `T`is `Send/Sync` then `Unique<T>` is `Send/Sync`.
7. The presence of `PhantomData<T>` marker is only important for rustc dropck to understand that we logically own a `T` causing the main difference between `Unique<T>` and `NonNull<T>` where it’s defined as
```// NonNull<T> doesn't own the referent whereas Unique<T> does
#[repr(transparent)]
struct NonNull<T: ?Sized> {
pointer: *const T,
}```

This site uses Akismet to reduce spam. Learn how your comment data is processed.