Rust std study series: Vec

The upcoming series of blog posts will contain my study of Rust standard library. I’ve partially written some parts of the series in scattered places and want to gather them in one place for better and easier access. I intend to update whenever I discover something interesting/important to remember.

I’m referring to implementations in Rust stable v1.33.0.

This post covers some details of std::vec that I found important.

Vec<T>

Vec<T> is a dynamic array which only grows and never shrinks automatically.

A contiguous growable array type with heap-allocated contents.

Rust std doc

Notice the difference between its counterpart stack-allocated fixed-size array [T; N] (where at this time N needs to be a specified non-negative integer. Constant generic hopefully will come soon).

Ok! let’s dig in to it. Vec<T> contains (pointer, capacity, length).

Vec pointer

  1. The pointer will never be null, so it enjoys null-pointer-optimization.
  2. The pointer may not actually point to allocated memory for example inVec::new(), vec![] or Vec::with_capacity(0).

Vec capacity and length

  1. The capacity of a vector is the amount of space allocated for any future elements that will be added onto the vector.
  2. The length is the number of actual elements pushed/inserted in the vector.
  3. Vec allocates memory iff mem::size_of::<T>() * capacity() > 0. So it does not allocate for a Zero-Sized-Type (ZST) even with positive capacity.
  4. When length matches the capacity, Vec will (re)allocate by a certain growth factor. This makes insertion amortized O(1). Right now the growth factor is 2. However, comparing to other languages such as C++, Java, etc. it doesn’t seem to be optimal given any global first-fit allocator. Heuristically, 1.5 or a number a bit less than golden ratio is considered optimal. Here’s the related issue that’s currently open. I found it interesting to dig in!
  5. How about shrink factor? for example, if we pop half of the elements out, would a quarter of the memory be freed? No, actually! That’s if poping all the elements the capacity won’t change leaving a hole on the heap. Therefore, pop (from back) is O(1) and not amortized O(1). If you need to free up some memory, use shrink_to_fit.
  6. If need to use Vec for FFI or as a memory-backed collection, be sure to deallocate memory with from_raw_parts and then drop explicitly.
  7. If used in FFI and need to pass as pointer, for safety remember to call shrink_to_fit or truncate by the length prior to passing the (as_mut_ptr() or as_ptr()) to not pass uninitialized memory buffer.
  8. The order of elements is always guaranteed to be the same if coerced into slice.

Here’s the stripped down definition

struct Vec<T> {
    buf: RawVec<T>,
    len: usize,
}

// Default Global allocator
struct RawVec<T, A: Alloc = Global> {
    ptr: Unique<T>,
    cap: usize,
    a: A,
}

#[repr(transparent)]
struct Unique<T: ?Sized> {
    pointer: *const T,
    _marker: PhantomData<T>,
}

Unique<T>

  1. #[repr(transparent)] enforces that Unique<T>type representation is the same as *const T.
  2. Unique<T> is the covariant version of *mut T (wrt T) and has stronger semantic than NonNull<T>.
  3. Unlike *mut T, the pointer must always be non-null.
  4. In fact, Box<T> wraps Unique<T> i.e. struct Box<T: ?Sized>(Unique<T>).
  5. It can be accessed in nightly through #![feature(ptr_internals)] and core::ptr::Unique.
  6. If Tis Send/Sync then Unique<T> is Send/Sync.
  7. The presence of PhantomData<T> marker is only important for rustc dropck to understand that we logically own a T causing the main difference between Unique<T> and NonNull<T> where it’s defined as
// NonNull<T> doesn't own the referent whereas Unique<T> does
#[repr(transparent)]
struct NonNull<T: ?Sized> {
    pointer: *const T,
}




Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

About Me

I am a Machine Learning since 2013, transforming complex data into manageable solutions. Previously grounded in the discipline of pure Mathematics, I transitioned from beauty of Math to innovative code. I blog to manage and structure information. If you want to know more about me and my work, please check out my GitHub and Linkedin.

Newsletter

%d bloggers like this: