When Ferrous Metals Corrode, pt. VIII

Intro

This post corresponds to Chapter 9, "Structs", in the "Programming Rust" book.

A Rust struct is a collection of data: its fields can be named, tuple-like (positional), or absent altogether. Like in Go, you can bind methods to structs (there is no separate "class" concept).

Named-Field Structs

Named-field structs already showed up in previous chapters. Here's a declaration of a pub type with one public field and one private one:

pub struct GrayscaleMap {
    pub pixels: Vec<u8>,
    size: (usize, usize)
}

With the below we'd create a value of this type (struct expression):

let image = GrayscaleMap {
    pixels: vec![0; 100 * 100],
    size: (100, 100)
};  

A shorthand if you already have vars with the same names as the fields:

fn new_map(size: (usize, usize), pixels: Vec<u8>) -> GrayscaleMap {
    assert_eq!(pixels.len(), size.0 * size.1);
    GrayscaleMap { pixels, size }
}  

Some standard types like String or Vec only have private fields, so you can't build them with struct expressions outside their defining module. Instead, they provide type-associated constructor functions such as Vec::new(), which of course have access to the private fields.

Both a struct and each of its individual fields can be private (the default) or public.

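When constructing a value you can take fields from another struct of the same type: a trailing .. broom1 fills in every field you haven't listed explicitly. E.g. broom2 takes all fields except name from broom1. As String is not Copy we have to .clone() the name, otherwise it would be moved out of broom1: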

let broom1 = Broom { name: "foo".to_string(), height: 100 };
let broom2 = Broom { name: broom1.name.clone(), .. broom1 };  
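
To see that broom1 really stays usable after the update, here is a minimal sketch. The Broom declaration and the duplicate() helper are assumptions for illustration; the post doesn't show how Broom is defined.

```rust
// Hypothetical declaration; the post only shows Broom being constructed.
struct Broom {
    name: String,
    height: u32,
}

/// Hypothetical helper: builds a second broom from the first and returns both.
fn duplicate(broom1: Broom) -> (Broom, Broom) {
    // `name` is cloned explicitly; `..broom1` supplies the remaining field.
    // `height` is Copy, so nothing is moved out of `broom1`.
    let broom2 = Broom { name: broom1.name.clone(), ..broom1 };
    (broom1, broom2)
}
```

Had we written Broom { ..broom1 } without the clone, name would have been moved and broom1 could no longer be returned whole.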

Tuple-Like Structs

Fields of tuple-like structs don't have names but an index:

struct Bounds(usize, usize);
let image_bounds = Bounds(1024, 768);
assert_eq!(image_bounds.0 * image_bounds.1, 786432);

Tuple-like structs work nicely for pattern matching, but also as newtypes – types with a single component that provide better type checking. For example this might be helpful for ASCII-only text:

struct Ascii(Vec<u8>);  
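
A rough sketch of how such a newtype could earn its keep. The from_bytes() and len() functions are made up for this example, and the guarantee only holds if the field stays private to the defining module.

```rust
/// Newtype around a byte vector that is meant to hold ASCII only.
struct Ascii(Vec<u8>);

impl Ascii {
    /// Hypothetical constructor: accept the bytes only if all are ASCII.
    fn from_bytes(bytes: Vec<u8>) -> Option<Ascii> {
        if bytes.is_ascii() {
            Some(Ascii(bytes))
        } else {
            None
        }
    }

    /// Number of bytes, read through the positional field `.0`.
    fn len(&self) -> usize {
        self.0.len()
    }
}
```

A function that takes an Ascii instead of a plain Vec<u8> can then rely on the check having happened already.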

Unit-Like Structs

These are structs with no fields:

struct Onesuch;  

There's only ever one value of such a type, and it takes up no memory. They're useful for working with traits.
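
A small sketch of both claims. The Greeter trait is invented here purely to give the unit-like struct something to implement.

```rust
use std::mem::size_of;

struct Onesuch;

/// Hypothetical trait, only here to give the unit-like struct a job.
trait Greeter {
    fn greet(&self) -> String;
}

impl Greeter for Onesuch {
    fn greet(&self) -> String {
        "hello".to_string()
    }
}

/// Size in bytes of a `Onesuch` value.
fn onesuch_size() -> usize {
    size_of::<Onesuch>()
}
```

The type carries behaviour via the trait but no data, so onesuch_size() comes out as zero.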

Struct Layout

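Unlike C, Rust makes no promises about how struct fields are ordered or laid out in memory, except that field values are stored directly in the struct's own block of memory (unlike e.g. Python, which would store pointers to heap-allocated values). Of course a field that owns heap data, e.g. a Vec, keeps only its small pointer/capacity/length header in the struct; the elements themselves live on the heap. If you need a C-compatible layout you can ask for it with #[repr(C)].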
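
One way to poke at this is std::mem::size_of. The sketch below reuses GrayscaleMap from above; the sizes() helper is an assumption for illustration, and the exact numbers depend on the target.

```rust
use std::mem::size_of;

pub struct GrayscaleMap {
    pub pixels: Vec<u8>,
    size: (usize, usize),
}

/// Returns the sizes of the struct, its Vec field and its tuple field, in bytes.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<GrayscaleMap>(),
        size_of::<Vec<u8>>(),
        size_of::<(usize, usize)>(),
    )
}
```

The struct is at least as big as its two fields together, but its size doesn't grow with the number of pixels: only the Vec header sits inside the struct.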

Defining Methods with impl

Like Go, you can define methods on structs. This is done with an impl block:

/// A first-in, first-out queue of characters.
pub struct Queue {
    older: Vec<char>,   // older elements, eldest last.
    younger: Vec<char>  // younger elements, youngest last.
}

impl Queue {
    /// Push a character onto the back of a queue.
    pub fn push(&mut self, c: char) {
        self.younger.push(c);
    }

    /// Pop a character off the front of a queue. Return `Some(c)` if there
    /// was a character to pop, or `None` if the queue was empty.
    pub fn pop(&mut self) -> Option<char> {
        // ...
    }
}  

Functions defined in an impl block are called associated functions, since they're associated with a specific type; methods are the ones that take self as their first argument. Declare self as a shared ref (&self) or a mutable ref (&mut self) as required, or take self by value if the method should take ownership, e.g. a split() method on our Queue, which needs to move the component vectors out of self:

impl Queue {
    pub fn split(self) -> (Vec<char>, Vec<char>) {
        (self.older, self.younger)
    }
}  
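
Putting the pieces together, here is a runnable sketch of the whole Queue. The pop() body is one possible implementation of the part elided above, not necessarily the book's, and new() is a plain constructor function added so the example can build a queue.

```rust
/// A first-in, first-out queue of characters.
pub struct Queue {
    older: Vec<char>,   // older elements, eldest last
    younger: Vec<char>, // younger elements, youngest last
}

impl Queue {
    /// Constructor, added here so the sketch is self-contained.
    pub fn new() -> Queue {
        Queue { older: Vec::new(), younger: Vec::new() }
    }

    /// Needs `&mut self` because it modifies the queue.
    pub fn push(&mut self, c: char) {
        self.younger.push(c);
    }

    /// One possible `pop`: when `older` runs dry, move `younger` over
    /// and reverse it so the eldest element ends up last.
    pub fn pop(&mut self) -> Option<char> {
        if self.older.is_empty() {
            if self.younger.is_empty() {
                return None;
            }
            std::mem::swap(&mut self.older, &mut self.younger);
            self.older.reverse();
        }
        self.older.pop()
    }

    /// Takes `self` by value: the queue is consumed and its parts moved out.
    pub fn split(self) -> (Vec<char>, Vec<char>) {
        (self.older, self.younger)
    }
}
```

After q.split() the queue q is gone; the compiler rejects any further use of it.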

Passing Self as a Box, Rc, or Arc

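It's possible to pass self as a Box<Self>, Rc<Self> or Arc<Self> value. Such a method can only be called on a value of that pointer type, and it takes ownership of the pointer: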

impl Node {
    fn append_to(self: Rc<Self>, parent: &mut Node) {
        parent.children.push(self);
    }
}
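
A self-contained sketch of how this might be used. The Node fields, Node::new() and the build() helper are assumptions, since the post doesn't show Node's declaration.

```rust
use std::rc::Rc;

// Hypothetical declaration; only `children` is implied by the snippet above.
struct Node {
    tag: String,
    children: Vec<Rc<Node>>,
}

impl Node {
    fn new(tag: &str) -> Node {
        Node { tag: tag.to_string(), children: Vec::new() }
    }

    /// Takes ownership of one `Rc` pointer to this node.
    fn append_to(self: Rc<Self>, parent: &mut Node) {
        parent.children.push(self);
    }
}

/// Builds a parent with one child and also returns our own handle to the child.
fn build() -> (Node, Rc<Node>) {
    let shared_child = Rc::new(Node::new("child"));
    let mut parent = Node::new("parent");
    // Cloning the Rc bumps the reference count; the clone is what gets moved.
    shared_child.clone().append_to(&mut parent);
    (parent, shared_child)
}
```

Calling shared_child.append_to(...) without the clone would work too, but would give up our own pointer to the child.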

Type-Associated Functions

These are functions associated with a type rather than with a value (like class or static methods elsewhere); they don't take a self arg:

impl Queue {
    pub fn new() -> Queue {
        Queue { older: Vec::new(), younger: Vec::new() }
    }
}

// usage
let mut q = Queue::new();

Constructor functions are by convention called new(), but the name isn't special to the language.

It's possible to have many impl blocks for a single type, however they must all be in the same crate that defines the type. There's a way to attach methods to other types too (traits), handled in chap. 11.

Associated Consts

These are consts that are part of a type. For example:

pub struct Vector2 {
    x: f32,
    y: f32,
}

impl Vector2 {
    const NAME: &'static str = "Vector2";
    const UNIT: Vector2 = Vector2 { x: 1.0, y: 0.0 };
}
// usage
let scaled = Vector2::UNIT.scaled_by(2.0);

Note, these consts are attached to the type itself, not stored in each value: you access them as Vector2::UNIT, and they don't add to the size of a Vector2.
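
The usage line above calls scaled_by() without showing it, so here is a sketch that fills the gap. The scaled_by() body and the derives are assumptions made to get a runnable example.

```rust
#[derive(Debug, PartialEq)]
pub struct Vector2 {
    x: f32,
    y: f32,
}

impl Vector2 {
    const NAME: &'static str = "Vector2";
    const UNIT: Vector2 = Vector2 { x: 1.0, y: 0.0 };

    /// Hypothetical helper: multiply both components by `factor`.
    fn scaled_by(&self, factor: f32) -> Vector2 {
        Vector2 { x: self.x * factor, y: self.y * factor }
    }
}
```

Vector2::UNIT.scaled_by(2.0) then yields a vector with x = 2.0 and y = 0.0, and a Vector2 value still consists of just its two f32 fields.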

Generic Structs

Structs can be parametrized with types as well. To expand on the Queue earlier, making it more generic:

pub struct Queue<T> {
    older: Vec<T>,
    younger: Vec<T>
}
impl<T> Queue<T> {
    pub fn new() -> Queue<T> {
        Queue { older: Vec::new(), younger: Vec::new() }
    }

    pub fn push(&mut self, t: T) {
        self.younger.push(t);
    }

    ...
}

So the struct as well as the impl are parametrized.

One could for instance define a Queue impl for a specific type like this:

impl Queue<f64> {
    fn sum(&self) -> f64 {
        ...
    }
}  
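
A compact sketch of both kinds of impl side by side. The sum() body is one possible way to fill in the elided part, not something taken from the book.

```rust
pub struct Queue<T> {
    older: Vec<T>,
    younger: Vec<T>,
}

// Available for every element type T.
impl<T> Queue<T> {
    pub fn new() -> Queue<T> {
        Queue { older: Vec::new(), younger: Vec::new() }
    }

    pub fn push(&mut self, t: T) {
        self.younger.push(t);
    }
}

// Only available when the elements are f64.
impl Queue<f64> {
    /// One possible `sum`: add up the elements of both halves.
    fn sum(&self) -> f64 {
        self.older.iter().chain(self.younger.iter()).sum()
    }
}
```

A Queue<&str> gets new() and push() but no sum(); calling it there is a compile error.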

As a shorthand, inside the impl block one can write Self instead of Queue<T>:

pub fn new() -> Self {
    Queue { older: Vec::new(), younger: Vec::new() }
} 

Generic Structs with Lifetime Parameters

We already saw structs which had lifetimes specified – this is needed for structs that hold refs:

struct Extrema<'elt> {
    greatest: &'elt i32,
    least: &'elt i32
}

This can be read as: for any given lifetime 'elt you can make a corresponding Extrema<'elt>
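
As a sketch of where such a struct might come from, here is a function that scans a slice and hands back references into it. find_extrema() is an illustration under the assumption that the slice is non-empty.

```rust
struct Extrema<'elt> {
    greatest: &'elt i32,
    least: &'elt i32,
}

/// Returns references to the largest and smallest elements of `slice`.
/// Assumes a non-empty slice; `slice[0]` panics otherwise.
fn find_extrema<'s>(slice: &'s [i32]) -> Extrema<'s> {
    let mut greatest = &slice[0];
    let mut least = &slice[0];

    for item in &slice[1..] {
        if *item < *least {
            least = item;
        }
        if *item > *greatest {
            greatest = item;
        }
    }

    Extrema { greatest, least }
}
```

The shared lifetime 's ties the result to the slice: the Extrema can't outlive the data it points into.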

Generic Structs with Constant Parameters

Another way to parametrize structs is with constant vals. Here's a Polynomial with a fixed number of coefficients and some methods:

/// A polynomial of degree N - 1.
struct Polynomial<const N: usize> {
    /// The coefficients of the polynomial.
    ///
    /// For a polynomial a + bx + cx² + ... + zxⁿ⁻¹,
    /// the `i`'th element is the coefficient of xⁱ.
    coefficients: [f64; N]
}
impl<const N: usize> Polynomial<N> {
    fn new(coefficients: [f64; N]) -> Polynomial<N> {
        Polynomial { coefficients }
    }

    /// Evaluate the polynomial at `x`.
    fn eval(&self, x: f64) -> f64 {
        // Horner's method is numerically stable, efficient, and simple:
        // c₀ + x(c₁ + x(c₂ + x(c₃ + ... x(c[N-2] + x c[N-1]))))
        let mut sum = 0.0;
        for i in (0..N).rev() {
            sum = self.coefficients[i] + x * sum;
        }

        sum
    }
}

Note the Polynomial's coefficients aren't stored on the heap here, but directly in the value as an array. The methods can refer to the N const as well.
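
A short runnable sketch to try it out. The coefficient_count() method is an addition of mine, only there to show a method reading N.

```rust
/// A polynomial of degree N - 1.
struct Polynomial<const N: usize> {
    coefficients: [f64; N],
}

impl<const N: usize> Polynomial<N> {
    fn new(coefficients: [f64; N]) -> Polynomial<N> {
        Polynomial { coefficients }
    }

    /// Evaluate the polynomial at `x` using Horner's method.
    fn eval(&self, x: f64) -> f64 {
        let mut sum = 0.0;
        for i in (0..N).rev() {
            sum = self.coefficients[i] + x * sum;
        }
        sum
    }

    /// Hypothetical helper: methods can use the const parameter directly.
    fn coefficient_count(&self) -> usize {
        N
    }
}
```

Polynomial::new([1.0, 2.0, 3.0]) is inferred to be a Polynomial<3>, standing for 1 + 2x + 3x²; evaluating it at x = 2 gives 1 + 4 + 12 = 17.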

Const generic parameters can be of an integer type, char or bool.

If you need lifetimes, types and consts as parameters at the same time, list them in that order. For a struct with a lifetime 'a, generic over T and a const N:

struct LumpOfReferences<'a, T, const N: usize> {
    the_lump: [&'a T; N]
}

Deriving Common Traits for Struct Types

When defining structs it's usually helpful to supply a few capabilities, aka traits, from the start. The #[derive] attribute can do this, generating prefab implementations:

#[derive(Copy, Clone, Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64
} 

With those traits Points can be copied, cloned, printed with the "{:?}" format, and compared for equality with == and !=. Deriving PartialOrd as well would give the struct the comparison ops (<, <=, >, >=). Derive only works if every field implements the trait too.
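
A quick sketch exercising each derived trait, with PartialOrd added to the list. The point_demo() function is just scaffolding for the example.

```rust
#[derive(Copy, Clone, Debug, PartialEq, PartialOrd)]
struct Point {
    x: f64,
    y: f64,
}

/// Returns (equality result, debug text, ordering result).
fn point_demo() -> (bool, String, bool) {
    let a = Point { x: 1.0, y: 2.0 };
    let b = a; // Copy: `a` stays usable after this
    let equal = a == b; // PartialEq
    let text = format!("{:?}", a); // Debug
    // PartialOrd: fields are compared in declaration order, x first, then y.
    let less = a < Point { x: 1.0, y: 3.0 };
    (equal, text, less)
}
```

The debug text comes out as Point { x: 1.0, y: 2.0 }.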

Interior Mutability

Often you'll have a struct that is mostly immutable apart from one small piece of data. To update that piece you'd normally have to pass the whole struct around as mut, even though only a small portion ever changes. This calls for interior mutability: a bit of mutable data inside an otherwise immutable value.

Two types that can help are Cell<T> and RefCell<T> from the std::cell module.

A Cell<T> is a struct with a single private value of type T. It has methods for getting and setting that value even if you only have a shared (non-mut) reference to the Cell:

Cell::new(value); // Creates a new Cell, moving the given value into it.

cell.get(); // Returns a copy of the value in the cell.

// Stores the given value in the cell, dropping the previously stored value.
cell.set(value);  

// This method takes self as a non-mut reference:
fn set(&self, value: T)    // note: not `&mut self`!!!
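
A minimal sketch of a Cell in a struct. The SpiderRobot shown here is a cut-down stand-in with a made-up error counter; the field and method names are assumptions.

```rust
use std::cell::Cell;

/// Hypothetical, stripped-down robot; only the counter matters here.
struct SpiderRobot {
    hardware_error_count: Cell<u32>,
}

impl SpiderRobot {
    fn new() -> SpiderRobot {
        SpiderRobot { hardware_error_count: Cell::new(0) }
    }

    /// Increments the counter through a shared reference.
    fn add_hardware_error(&self) {
        let n = self.hardware_error_count.get();
        self.hardware_error_count.set(n + 1);
    }

    /// True if any hardware errors have been reported.
    fn has_hardware_errors(&self) -> bool {
        self.hardware_error_count.get() > 0
    }
}
```

Note that robot doesn't have to be declared mut for add_hardware_error() to work.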

However, a Cell doesn't let you call mut methods on the value it holds; .get() only returns a copy, which is why it requires T to be Copy.

For these cases use RefCell<T>:

ref_cell.borrow();
// Returns a Ref<T>, which is essentially just a shared reference to the value stored in ref_cell.
// This method panics if the value is already mutably borrowed; see details to follow.

ref_cell.borrow_mut();
// Returns a RefMut<T>, essentially a mutable reference to the value in ref_cell.
// This method panics if the value is already borrowed; see details to follow.

ref_cell.try_borrow(), ref_cell.try_borrow_mut()
// Work just like borrow() and borrow_mut(), but return a Result.
// Instead of panicking if the value is already mutably borrowed, they return an Err value.

In effect, a RefCell enforces the usual borrowing rules (i.e. mut access must be exclusive) at run time, where Rust normally checks them at compile time.
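
The run-time check can be observed without panicking by using the try_ variants. A small sketch; the borrow_rules() function is only scaffolding.

```rust
use std::cell::RefCell;

/// Returns (mutable borrow refused while shared borrow is live,
///          mutable borrow allowed afterwards,
///          final contents).
fn borrow_rules() -> (bool, bool, String) {
    let cell = RefCell::new(String::from("hello"));

    let blocked = {
        let _reader = cell.borrow(); // shared borrow is live in this block
        let refused = cell.try_borrow_mut().is_err();
        refused
    }; // `_reader` is dropped here, releasing the shared borrow

    let allowed = cell.try_borrow_mut().is_ok();

    // No other borrows are outstanding, so this does not panic.
    cell.borrow_mut().push_str(" world");

    (blocked, allowed, cell.into_inner())
}
```

Swapping try_borrow_mut() for borrow_mut() inside the block would panic instead of returning an Err.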

Example usage – a SpiderRobot struct that needs mut access to a log file:

pub struct SpiderRobot {
    ...
    log_file: RefCell<File>,
    ...
}

impl SpiderRobot {
    /// Write a line to the log file.
    pub fn log(&self, message: &str) {
        let mut file = self.log_file.borrow_mut();
        // `writeln!` is like `println!`, but sends
        // output to the given file.
        writeln!(file, "{}", message).unwrap();
    }
}  

Note the .log() method takes a shared self ref, yet gets mutable access to the log file.
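
To try this without touching the file system, here is a sketch that swaps the File for an in-memory Vec<u8>, which also implements std::io::Write. The new() and log_contents() functions are assumptions added for the example.

```rust
use std::cell::RefCell;
use std::io::Write;

pub struct SpiderRobot {
    // Stand-in for the `RefCell<File>` above: an in-memory byte buffer.
    log_file: RefCell<Vec<u8>>,
}

impl SpiderRobot {
    pub fn new() -> SpiderRobot {
        SpiderRobot { log_file: RefCell::new(Vec::new()) }
    }

    /// Write a line to the log; note the shared `&self`.
    pub fn log(&self, message: &str) {
        let mut file = self.log_file.borrow_mut();
        writeln!(file, "{}", message).unwrap();
    }

    /// Hypothetical helper: everything logged so far, as text.
    pub fn log_contents(&self) -> String {
        String::from_utf8_lossy(&self.log_file.borrow()).into_owned()
    }
}
```

Each call to log() takes the mutable borrow, writes, and releases it again when file goes out of scope at the end of the method.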

Due to their rule-bending nature, Cell and RefCell aren't thread-safe; Rust will not allow multiple threads to share them. There are thread-safe variants of interior mutability too though, such as Mutex and the atomic types.

Coda

That's it for structs and methods, providing a solid and flexible foundation for OOP features (sans inheritance, I'm 100% ok with that). Next up, Enums and Patterns.