When Ferrous Metals Corrode, pt. X

Intro

This post corresponds to Chapter 11. Traits and Generics.

Traits are like interfaces or abstract base classes – contracts that describe what a type can do by describing its methods. Optionally, they can also define default implementations of the methods they prescribe.

An example trait:

trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;

    fn write_all(&mut self, buf: &[u8]) -> Result<()> {
        // default implementation ...
    }
    ...
}

This trait offers a number of writing methods. There are several stdlib types that implement it, e.g. File and TcpStream.

Usage might be something like:

use std::io::Write;

fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

The say_hello fn takes a mutable reference to some value that implements Write. The dyn signifies dynamic dispatch.
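To see the dynamic dispatch in action, here's a small sketch that hands say_hello an in-memory buffer instead of a file. The hello_bytes helper is my own name for illustration; the only assumption is that Vec<u8> implements Write, which it does in the standard library.

```rust
use std::io::Write;

// Same function as above: `out` is a trait object, so the call to
// write_all is resolved at runtime.
fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

// Hypothetical helper: writes the greeting into an in-memory buffer.
// `&mut buf` (a &mut Vec<u8>) is converted to `&mut dyn Write` at the call.
fn hello_bytes() -> std::io::Result<Vec<u8>> {
    let mut buf: Vec<u8> = Vec::new();
    say_hello(&mut buf)?;
    Ok(buf)
}
```

The same say_hello would accept &mut std::io::stdout() or a &mut File without any change.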

We have used generics already; they also provide polymorphism. Generics go hand in hand with traits, as traits can be used to constrain which generic type can be used.

Using Traits

To use a trait's methods, the trait itself must be in scope, i.e. to call the Write methods above you need to import the trait with use std::io::Write; this keeps it unambiguous which method is meant. The Clone and Iterator traits, by the way, are part of the prelude and therefore always available.

Trait Objects

The dyn keyword above marks a trait object. As noted above these allow for virtual method calls – it's only known at runtime which actual method will be run.

A dyn Write value can't be stored in a variable directly because its size is unknown at compile time. A reference to one is fine, though:

let mut buf: Vec<u8> = vec![];
let writer: dyn Write = buf;  // error: `Write` does not have a constant size

let mut buf: Vec<u8> = vec![];
let writer: &mut dyn Write = &mut buf;  // ok

Trait object layout

A trait object is a fat pointer: a pointer to some value (e.g. the buf above) plus a pointer to a virtual method table, which is used to look up the actual methods to call.

Rust automatically converts ordinary references (e.g. a &mut File) into trait objects when needed, and does the same for Box and Rc values.
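A quick way to convince yourself of the fat pointer layout is to compare pointer sizes. This is only a sketch: the function names are mine, and the "two words" result reflects how current compilers lay out trait object references rather than a documented guarantee I'd rely on.

```rust
use std::io::Write;
use std::mem::size_of;

// A plain reference is one machine word; a reference to a trait object
// carries a second word for the vtable pointer.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&mut Vec<u8>>(), size_of::<&mut dyn Write>())
}

// A Box<Vec<u8>> is converted to a Box<dyn Write> the same way
// references are: the conversion happens at the return site.
fn boxed_writer() -> Box<dyn Write> {
    let buf: Vec<u8> = Vec::new();
    Box::new(buf)
}
```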

Generic Functions and Type Parameters

Traits can also be used as a bound on which kind of type can be used as a generic. A rewrite of the say_hello fn from before:

fn say_hello<W: Write>(out: &mut W) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

This tells rustc we want a generic type W but it has to implement the Write trait.

If the out param is a File, Rust will call say_hello::<File>(), and likewise it generates a separate version of the fn for each Write-implementing type it's used with.

Sometimes it's necessary to spell out the type parameter explicitly because Rust doesn't have enough clues to infer it:

// calling a generic method collect<C>() that takes no arguments
let v1 = (0 .. 1000).collect();  // error: can't infer type
let v2 = (0 .. 1000).collect::<Vec<i32>>(); // ok
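Putting both ideas together, here's a sketch that spells out the type parameter of the generic say_hello and shows the other common way to help collect() along, an annotation on the binding. The demo function is just a hypothetical wrapper for illustration.

```rust
use std::io::Write;

fn say_hello<W: Write>(out: &mut W) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

// Hypothetical wrapper so the calls have somewhere to live.
fn demo() -> (Vec<u8>, Vec<i32>) {
    let mut buf = Vec::new();
    // Type parameter spelled out; plain say_hello(&mut buf) would also
    // work whenever the type of buf is known from elsewhere.
    say_hello::<Vec<u8>>(&mut buf).unwrap();

    // Annotating the binding is an alternative to collect::<Vec<i32>>().
    let squares: Vec<i32> = (1..4).map(|n| n * n).collect();
    (buf, squares)
}
```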

To use several traits as a bound use +, e.g.:

use std::hash::Hash;
use std::fmt::Debug;

fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>) { ... }

This says the fn takes a generic Vec<T> but that T needs to implement the Debug, Hash and Eq traits.
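Since the body of top_ten is elided, here's a smaller stand-in to show why one might want exactly these three bounds. The most_common function is hypothetical (not from the book): Hash and Eq are what HashMap needs from its keys, and Debug is used for the formatting.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;

// Hypothetical function: returns the most frequent value, formatted with
// {:?}, together with its count. Returns None for an empty slice.
fn most_common<T: Debug + Hash + Eq>(values: &[T]) -> Option<String> {
    let mut counts: HashMap<&T, usize> = HashMap::new();
    for v in values {
        *counts.entry(v).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .max_by_key(|&(_, n)| n)
        .map(|(v, n)| format!("{:?} x{}", v, n))
}
```

Dropping any one of the bounds makes the body fail to compile, which is a handy way to find out which bound a given line actually depends on.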

As it's entirely possible to have several generic types in a signature, each constrained by several traits, there's an alternate syntax that keeps this readable by moving the bounds into a 'where' clause:

fn run_query<M, R>(data: &DataSet, map: M, reduce: R) -> Results
    where M: Mapper + Serialize,
          R: Reducer + Serialize
{ ... }

The 'where' clause is allowed everywhere bounds are allowed.

When a generic fn has both lifetime parameters and type parameters, the lifetime parameters come first.

Similar to the struct we saw in chapter 9, generic functions can also take const parameters:

fn dot_product<const N: usize>(a: [f64; N], b: [f64; N]) -> f64 {
    let mut sum = 0.;
    for i in 0..N {
        sum += a[i] * b[i];
    }
    sum
}

This example illustrates how the N const param can then be used in the fn signature and the body – with an ordinary fn arg you couldn't use it in the type signature.
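A short usage sketch for dot_product, to show how N gets its value. The demo wrapper is a hypothetical name; everything else is the function from above.

```rust
fn dot_product<const N: usize>(a: [f64; N], b: [f64; N]) -> f64 {
    let mut sum = 0.;
    for i in 0..N {
        sum += a[i] * b[i];
    }
    sum
}

// Hypothetical wrapper around two example calls.
fn demo() -> (f64, f64) {
    // N is inferred as 3 from the array arguments.
    let inferred = dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]);
    // N can also be given explicitly.
    let explicit = dot_product::<2>([0.5, 0.5], [2.0, 4.0]);
    // dot_product([1.0, 2.0], [1.0, 2.0, 3.0]) would not compile:
    // both arrays have to share the same N.
    (inferred, explicit)
}
```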

It's not just functions: types, methods, type aliases and traits can be generic too.

Which to Use

Use trait objects with collections of mixed types, e.g. mixing different types of veggies:

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>
}

Note this needs a Box to get a constant size (a dyn Vegetable could be any size).
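To make the salad concrete, here's a sketch with two made-up veggies. The Vegetable trait, its name method, Carrot, Leek and the helper fns are all hypothetical; the post doesn't define any of them.

```rust
// Hypothetical trait and types for illustration only.
trait Vegetable {
    fn name(&self) -> String;
}

struct Carrot;
struct Leek {
    stalks: u32,
}

impl Vegetable for Carrot {
    fn name(&self) -> String {
        "carrot".to_string()
    }
}

impl Vegetable for Leek {
    fn name(&self) -> String {
        format!("{} leek stalks", self.stalks)
    }
}

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>,
}

// Two differently sized types end up in the same Vec, each behind a Box.
fn mixed_salad() -> Salad {
    Salad {
        veggies: vec![Box::new(Carrot), Box::new(Leek { stalks: 2 })],
    }
}

// Each name() call goes through the vtable of the boxed value.
fn ingredients(salad: &Salad) -> Vec<String> {
    salad.veggies.iter().map(|v| v.name()).collect()
}
```

With a generic Salad<V: Vegetable> instead, every element of the Vec would have to be the same V.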

Trait objects also avoid the code bloat generics can suffer from: with generics the compiler generates a separate copy of the code for each (possibly many) type it's used with.

Using generics has other advantages though:

  • Speed, as there is no dynamic dispatch

  • Some traits can't be used with trait objects (e.g. those with type-associated functions)

  • It's possible to bound a generic type by several traits, which trait objects can't

Defining and Implementing Traits

Use trait FooTrait {...} and list method signatures to define a trait.

To implement a trait use something like impl FooTrait for MyType {...} where MyType is the implementing type (a struct, say). The block may only contain items that the trait specifies. Use a separate impl block to define additional methods.

Default Methods

Here's a Sink type that just discards everything. We make it conform to the Write trait.

pub struct Sink;  // no fields needed

use std::io::{Write, Result};

impl Write for Sink {
    fn write(&mut self, buf: &[u8]) -> Result<usize> {
        // Claim to have successfully written the whole buffer.
        Ok(buf.len())
    }

    fn flush(&mut self) -> Result<()> {
        Ok(())
    }
}

We implement the Write interface with no-ops here.

Our implementation does not contain a write_all() method which Write declares. It's not needed here because the Write trait has a default implementation for it which gets used if the impl doesn't override it.
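A small sketch of the default method at work. The discard fn is a hypothetical helper; the Sink impl is the one from above.

```rust
use std::io::{Result, Write};

pub struct Sink;

impl Write for Sink {
    fn write(&mut self, buf: &[u8]) -> Result<usize> {
        Ok(buf.len())
    }

    fn flush(&mut self) -> Result<()> {
        Ok(())
    }
}

// Hypothetical helper: throws the data away via the Sink.
fn discard(data: &[u8]) -> Result<()> {
    let mut sink = Sink;
    // write_all is not defined in the impl above; the default from the
    // Write trait is used, and it is built on top of our write().
    sink.write_all(data)?;
    sink.flush()
}
```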

Traits and Other People’s Types

It's possible to implement your own traits on other people's types too:

trait IsEmoji {
    fn is_emoji(&self) -> bool;
}

/// Implement IsEmoji for the built-in character type.
impl IsEmoji for char {
    fn is_emoji(&self) -> bool {
        ...
    }
}
assert_

This is called an extension trait.
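Because the is_emoji body is elided above, here's a complete extension trait along the same lines. The trait name, the method and the vowel rule are all made up for illustration.

```rust
// Hypothetical extension trait for the built-in char type.
trait IsAsciiVowel {
    fn is_ascii_vowel(&self) -> bool;
}

impl IsAsciiVowel for char {
    fn is_ascii_vowel(&self) -> bool {
        matches!(self.to_ascii_lowercase(), 'a' | 'e' | 'i' | 'o' | 'u')
    }
}

// With the trait in scope the new method reads like a built-in one.
fn count_vowels(text: &str) -> usize {
    text.chars().filter(|c| c.is_ascii_vowel()).count()
}
```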

A generic impl block can also add an extension trait to a whole family of types at once. The below adds a write_html() fn to all std::io::Write types:

use std::io::{self, Write};

/// Trait for values to which you can send HTML.
trait WriteHtml {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()>;
}
/// You can write HTML to any std::io writer.
impl<W: Write> WriteHtml for W {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> {
        ...
    }
}

This says "for every type W that implements Write, here's an implementation of WriteHtml for W."
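Here's a runnable miniature of the same pattern. WriteLine and write_line are hypothetical stand-ins for WriteHtml (whose HtmlDocument type isn't shown in this post), and two_lines is just a helper to exercise it.

```rust
use std::io::{self, Write};

// Hypothetical extension trait, standing in for WriteHtml.
trait WriteLine {
    fn write_line(&mut self, text: &str) -> io::Result<()>;
}

// Blanket impl: every type W that implements Write gets write_line.
impl<W: Write> WriteLine for W {
    fn write_line(&mut self, text: &str) -> io::Result<()> {
        self.write_all(text.as_bytes())?;
        self.write_all(b"\n")
    }
}

// Vec<u8> implements Write, so it picks up write_line for free.
fn two_lines() -> io::Result<Vec<u8>> {
    let mut buf: Vec<u8> = Vec::new();
    buf.write_line("first")?;
    buf.write_line("second")?;
    Ok(buf)
}
```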

The serde library uses this to add serialization / deserialization methods to standard data types.

To keep trait implementations unique Rust enforces the "orphan rule" – when implementing a trait, either the trait or the type must be new in the current crate.

Self in Traits

A trait can say "use this type, whatever it might be", by using Self (note uppercase S).

For example, the Clone trait says its clone method will return an object of its implementor's type:

pub trait Clone {
    fn clone(&self) -> Self;
    ...
}

Each implementing type will substitute its own type here.

Using Self in a method signature like this makes the trait incompatible with trait objects – because then Rust doesn't know at compile time which concrete type Self stands for.

Subtraits

Traits can extend other traits:

trait Creature: Visible {
    fn position(&self) -> (i32, i32);
    fn facing(&self) -> Direction;
    ...
}

impl Visible for Broom {
    ...
}

impl Creature for Broom {
    ...
}

So, if you want to implement Creature you'll also need to implement Visible. Here we have Brooms that are Visible as well as Creatures.

Note that each trait still needs to be in scope if you want to use its methods.
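A filled-in sketch of the broom. The method bodies, the describe method on Visible and the report default method are my own simplifications for illustration; the book's versions of these traits look different.

```rust
// Simplified, hypothetical versions of the two traits.
trait Visible {
    fn describe(&self) -> String;
}

trait Creature: Visible {
    fn position(&self) -> (i32, i32);

    // A default method in the subtrait may call supertrait methods,
    // because every Creature is guaranteed to be Visible.
    fn report(&self) -> String {
        let (x, y) = self.position();
        format!("{} at ({}, {})", self.describe(), x, y)
    }
}

struct Broom {
    x: i32,
    y: i32,
}

impl Visible for Broom {
    fn describe(&self) -> String {
        "broom".to_string()
    }
}

impl Creature for Broom {
    fn position(&self) -> (i32, i32) {
        (self.x, self.y)
    }
}
```

Removing the impl Visible for Broom block makes the impl Creature for Broom block fail to compile.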

Type-Associated Functions

Traits also can specify type-associated functions (aka static methods). Example with two constructors, note the missing self arg:

trait StringSet {
    /// Return a new empty set.
    fn new() -> Self;

    /// Return a set that contains all the strings in `strings`.
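    fn from_slice(strings: &[&str]) -> Self;

    /// Find out if this set contains a particular `string`.
    fn contains(&self, string: &str) -> bool;

    /// Add a string to this set.
    fn add(&mut self, string: &str);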
}

// usage with generics
fn unknown_words<S: StringSet>(document: &[String], wordlist: &S) -> S {
    let mut unknowns = S::new();
    for word in document {
        if !wordlist.contains(word) {
            unknowns.add(word);
        }
    }
    unknowns
}

Caveat: trait objects don't support type-associated functions.
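To try unknown_words out it needs an implementor. The sketch below assumes the trait also declares the contains and add methods that the function body calls; HashedStrings is a hypothetical type of mine backed by a HashSet.

```rust
use std::collections::HashSet;

trait StringSet {
    fn new() -> Self;
    fn from_slice(strings: &[&str]) -> Self;
    // Assumed methods: unknown_words() needs both of them.
    fn contains(&self, string: &str) -> bool;
    fn add(&mut self, string: &str);
}

// Hypothetical implementor backed by a HashSet.
struct HashedStrings(HashSet<String>);

impl StringSet for HashedStrings {
    fn new() -> Self {
        HashedStrings(HashSet::new())
    }
    fn from_slice(strings: &[&str]) -> Self {
        HashedStrings(strings.iter().map(|s| s.to_string()).collect())
    }
    fn contains(&self, string: &str) -> bool {
        self.0.contains(string)
    }
    fn add(&mut self, string: &str) {
        self.0.insert(string.to_string());
    }
}

fn unknown_words<S: StringSet>(document: &[String], wordlist: &S) -> S {
    // S::new() is the type-associated function: no self involved.
    let mut unknowns = S::new();
    for word in document {
        if !wordlist.contains(word) {
            unknowns.add(word);
        }
    }
    unknowns
}
```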

Fully Qualified Method Calls

These are equivalent:

"hello".to_string(); // method call
str::to_string("hello"); // qualified call: as an assoc. function
ToString::to_string("hello"); // qualified call: via the trait
<str as ToString>::to_string("hello"); // fully qualified method call

The to_string method is specified on the ToString trait, and implemented (among others) for the str type.

The value.method() form lets Rust figure out which method to actually run. If that ever happens to be ambiguous, use one of the (fully) qualified method calls.

This can happen with name clashes between traits:

outlaw.draw();  // error: draw on screen or draw pistol?

Visible::draw(&outlaw);  // ok: draw on screen
HasPistol::draw(&outlaw);  // ok: corral

Or if the type of the self argument can't be inferred (e.g. for the number 0).

Qualified method calls are also handy when using functions as first-class values, like with a map():

let words: Vec<String> =
    line.split_whitespace()  // iterator produces &str values
        .map(ToString::to_string)  // ok
        .collect();

Fully qualified calls also work with associated functions.
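The outlaw example can be made runnable with a few assumptions. The trait bodies and return values below are invented for illustration; only the trait and method names come from the snippet above.

```rust
// Two traits that both declare a draw method (bodies are hypothetical).
trait Visible {
    fn draw(&self) -> String;
}

trait HasPistol {
    fn draw(&self) -> String;
}

struct Outlaw;

impl Visible for Outlaw {
    fn draw(&self) -> String {
        "drawn on screen".to_string()
    }
}

impl HasPistol for Outlaw {
    fn draw(&self) -> String {
        "pistol drawn".to_string()
    }
}

fn both_draws(outlaw: &Outlaw) -> (String, String) {
    // outlaw.draw() would be rejected as ambiguous here, so each call
    // names its trait: once qualified, once fully qualified.
    (Visible::draw(outlaw), <Outlaw as HasPistol>::draw(outlaw))
}
```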

Traits That Define Relationships Between Types

Traits can also describe relationships between types, for instance Iterators have an iterator type and produced values (with a type).

Associated Types (or How Iterators Work)

Traits can have an associated type – in the Iterator trait below this is Item, the type the iterator will produce:

pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;
    ...
}

The .next() method will produce values of type Item (or None). Note it needs to be Self::Item because this can vary with each type of iterator.

An Iterator impl could look like the below – this impl can produce a sequence of Strings (or None):

impl Iterator for Args {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        ...
    }
    ...
}

Example with a generic type:

/// Loop over an iterator, storing the values in a new vector.
fn collect_into_vector<I: Iterator>(iter: I) -> Vec<I::Item> {
    let mut results = Vec::new();
    for value in iter {
        results.push(value);
    }
    results
}

This will return a vector of Items.

We can also define bounds on those items, for instance require they implement the Debug trait:

use std::fmt::Debug;

fn dump<I>(iter: I)
    where I: Iterator, I::Item: Debug
{
    ...
}

Or outright specify the Items must be String:

fn dump<I>(iter: I)
    where I: Iterator<Item=String>
{
    ...
}

The cool thing is that Iterator<Item=String> is itself a trait: Iterators that produce Strings. This can be used anywhere where a trait name would be used.

Use associated types where each implementation has one specific related type.
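As a complete example, here's a tiny iterator of my own (Countdown is a hypothetical type) plus a dump variant that returns the formatted items instead of printing them, which is my assumption about what the elided body might do.

```rust
use std::fmt::Debug;

// Hypothetical iterator that counts down to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The one related type this implementation produces.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

// Works for any iterator whose items can be formatted with {:?}.
fn dump<I>(iter: I) -> Vec<String>
    where I: Iterator, I::Item: Debug
{
    iter.map(|item| format!("{:?}", item)).collect()
}
```

Implementing next() is enough to get all the other Iterator methods (map, sum, and so on) as default methods.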

Generic Traits (or How Operator Overloading Works)

Example of a generic trait:

/// std::ops::Mul, the trait for types that support `*`.
pub trait Mul<RHS> {
    /// The resulting type after applying the `*` operator
    type Output;

    /// The method for the `*` operator
    fn mul(self, rhs: RHS) -> Self::Output;
}

This says there's a family of Mul traits which vary by what they take on the right-hand side (Mul<f64>, Mul<i32>, …).

Generic traits get some leeway with the orphan rule: you can implement an external trait for an external type, as long as one of the trait's type parameters is a type defined in the current crate.

The actual Mul trait has this:

pub trait Mul<RHS=Self> {
    ...
}

This means if no generic type is specified, the parameter defaults to the impl type (Self).

E.g., impl Mul for Complex then really says impl Mul<Complex> for Complex.
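Here's a sketch of both flavours on a minimal complex number type. The Complex struct is my own simplified version (f64 fields only), not the generic one from the book.

```rust
use std::ops::Mul;

// Hypothetical, non-generic complex number type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

// `impl Mul for Complex` is shorthand for `impl Mul<Complex> for Complex`.
impl Mul for Complex {
    type Output = Complex;

    fn mul(self, rhs: Complex) -> Complex {
        Complex {
            re: self.re * rhs.re - self.im * rhs.im,
            im: self.re * rhs.im + self.im * rhs.re,
        }
    }
}

// A second member of the Mul family: scaling by a plain f64.
impl Mul<f64> for Complex {
    type Output = Complex;

    fn mul(self, rhs: f64) -> Complex {
        Complex { re: self.re * rhs, im: self.im * rhs }
    }
}
```

With both impls in place the * operator picks the right one based on the type of the right-hand side.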

impl Trait

Instead of a named type parameter with a bound we can also use the somewhat lighter "impl Trait" syntax. These two both specify we want val to implement the Display trait:

use std::fmt::Display;

fn print<T: Display>(val: T) {
    println!("{}", val);
}

fn print(val: impl Display) {
    println!("{}", val);
}

One difference though is that with the generics syntax we can specify the type explicitly: print::<i32>(42). This isn't possible with impl Trait.
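A side-by-side sketch of that difference. The fn names are hypothetical, and they return a String instead of printing so the result is easy to check.

```rust
use std::fmt::Display;

// Named type parameter: callers may spell out T.
fn show_generic<T: Display>(val: T) -> String {
    format!("<{}>", val)
}

// impl Trait: the type is anonymous and always inferred.
fn show_impl(val: impl Display) -> String {
    format!("<{}>", val)
}

// Hypothetical wrapper around two example calls.
fn demo() -> (String, String) {
    let a = show_generic::<i32>(42);
    // show_impl::<i32>(42) is rejected by the compiler.
    let b = show_impl(42);
    (a, b)
}
```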

Associated Consts

Traits can have associated consts like structs and enums:

trait Greet {
    const GREETING: &'static str = "Hello";
}

If you don't give the const a value in the trait, each implementation has to provide it. Example usage:

use std::ops::Add;

trait Float {
    const ONE: Self;  // implementors type
}

impl Float for f32 {
    const ONE: f32 = 0.999; // being mean
}

fn add_one<T: Float + Add<Output=T>>(value: T) -> T {
    value + T::ONE
}

Reverse-Engineering Bounds

This section is a bit of a case study into how to genericize a function to make it work across a range of types.

Key to this is applying bounds from built-in traits to describe exactly what the function needs from a numeric type: Mul and Add for the math operations, each with Output=N so the results have the same type as the inputs; Default, which describes types that have a default value (used as the starting total); and finally Copy, so values can be read out of the slices.

The final version looks like this:

use std::ops::{Add, Mul};

fn dot<N>(v1: &[N], v2: &[N]) -> N
    where N: Add<Output=N> + Mul<Output=N> + Default + Copy
{
    let mut total = N::default();
    for i in 0 .. v1.len() {
        total = total + v1[i] * v2[i];
    }
    total
}

#[test]
fn test_dot() {
    assert_eq!(dot(&[1, 2, 3, 4], &[1, 1, 1, 1]), 10);
    assert_eq!(dot(&[53.0, 7.0], &[1.0, 5.0]), 88.0);
}

The chapter also mentions the num crate, whose Num trait describes common behaviour of numeric types.

Comparing this with a naive Python implementation Rust is quite a bit more verbose:

def dot(v1, v2):
    return sum(a*b for a,b in zip(v1, v2))  

But in exchange you get precise control over types, memory and the safety of compile-time checks (though the check for the vectors being of equal dimension is missing, I guess one could add asserts for this case?).
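One way to cover the length question, as a sketch rather than anything from the book: return an Option and refuse mismatched slices up front. The checked_dot name and the choice of Option over an assert are my own.

```rust
use std::ops::{Add, Mul};

// Hypothetical variant of dot() that returns None when the two slices
// differ in length, instead of indexing out of bounds or ignoring
// surplus elements.
fn checked_dot<N>(v1: &[N], v2: &[N]) -> Option<N>
    where N: Add<Output=N> + Mul<Output=N> + Default + Copy
{
    if v1.len() != v2.len() {
        return None;
    }
    let mut total = N::default();
    // zip pairs the elements up, so no manual indexing is needed.
    for (a, b) in v1.iter().zip(v2) {
        total = total + *a * *b;
    }
    Some(total)
}
```

An assert_eq!(v1.len(), v2.len()) at the top of the original dot would work too, at the price of a panic instead of a value the caller can handle.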

Coda

The topics here – traits and generics – seem pretty key to Rust. Organizing code around an interface makes a lot of sense to me, and together with static type checking this gives plenty of power but also safety. Good stuff!