When Ferrous Metals Corrode, pt. VII

Intro

This post corresponds to "Chapter 8. Crates and Modules" in the book.

Crates

Crates are the package format for Rust; they're made up of source code plus tests, data, etc. Managing those packages is the job of cargo, which I came across previously. Building a crate is triggered by running cargo build, which downloads the (possibly transitive) deps mentioned in Cargo.toml and runs rustc on them. Rustc takes --crate-type lib or --crate-type bin options for building libraries or programs. For a library it will create an .rlib file; otherwise it will look for a main() func and spit out a program. Code from .rlib files will be statically linked into the program. With cargo build --release I can get an optimized build.

Editions

Rust's compat promises are built around "editions", e.g. the 2018 edition added the async/await keywords. The desired edition is specified in the [package] section of Cargo.toml. The promise is that rustc will always support all editions, and crates can mix editions, i.e. depend on crates that were written for a different edition. Crates only need to upgrade to a newer edition to use new lang features.

Build Profiles

Builds are configured by build profiles. The book recommends enabling debug symbols for the release profile for running the binary under a profiler (so I'd get both optimizations and debug syms).
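
As a note to self, this is roughly what that recommendation looks like in Cargo.toml (a config fragment only; everything else in the file stays as it is):

```toml
[profile.release]
debug = true  # keep debug symbols in the optimized release build
```

With this in place, cargo build --release should still optimize, but the binary keeps the symbols a profiler needs.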

Modules

Modules are Rust's namespaces. Symbols marked 'pub' are exported, otherwise they're private to the module; an item marked 'pub(crate)' is visible anywhere in the crate but not exported from it. There are some more variants for controlling visibility: 'pub(super)' makes an item visible to the parent module only, and 'pub(in <path>)' makes it visible in a specific subtree.
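
To convince myself of the visibility rules, here's a small sketch. The garden/beds modules and all function names are made up by me for illustration, they're not from the book:

```rust
mod garden {
    // Private: usable only inside `garden` and its child modules.
    fn water_level() -> u32 {
        3
    }

    // Public: exported from `garden`.
    pub fn describe() -> String {
        format!("water level {}, {} beds", water_level(), beds::count_for_parent())
    }

    // Visible anywhere in this crate, but not to other crates.
    pub(crate) fn crate_only() -> &'static str {
        "crate-wide"
    }

    pub mod beds {
        // Visible to the parent module `garden` only.
        pub(super) fn count_for_parent() -> u32 {
            2
        }

        // Visible within the `crate::garden` subtree.
        pub(in crate::garden) fn subtree_only() -> u32 {
            7
        }
    }

    pub fn beds_total() -> u32 {
        beds::count_for_parent() + beds::subtree_only()
    }
}

fn main() {
    println!("{}", garden::describe());
    println!("{}", garden::crate_only());
    println!("{}", garden::beds_total());
    // garden::water_level();            // would not compile: private function
    // garden::beds::count_for_parent(); // would not compile: visible in `garden` only
}
```

The two commented-out calls are the interesting part: uncommenting either one should make the compiler complain about privacy.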

Nesting and Code organization

Modules can be nested. Modules can be defined inline, alongside other code in a single file, but also (and imho more practical) a module body can live in a separate file or a separate directory too.

Example: module foo with code in a separate file:

  • main.rs, contains declaration pub mod foo;

  • foo.rs, with code for module foo

Example: module bar and quux with code in a separate directory:

  • main.rs, contains declaration pub mod bar;

  • bar/mod.rs, contains declaration pub mod quux;

  • bar/quux.rs, with code for module quux

When it sees pub mod bar; the compiler will look for either a bar.rs or a bar/mod.rs.

It's also possible to have a directory 'foo' plus a 'foo.rs' with declarations for nested modules:

  • foo.rs, contains declaration pub mod bar;

  • foo/bar.rs with code body for the bar module
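
For comparison, here's the bar/quux layout from above written inline in a single file. This is just my sketch (the name() functions are invented); the point is that the paths used by callers are the same whether a module body lives inline or in its own file:

```rust
// Inline equivalent of:
//   main.rs      with `pub mod bar;`
//   bar/mod.rs   with `pub mod quux;`
//   bar/quux.rs  with the body of `quux`
pub mod bar {
    pub mod quux {
        // Hypothetical function, only here to have something to call.
        pub fn name() -> &'static str {
            "bar::quux"
        }
    }

    pub fn name() -> &'static str {
        "bar"
    }
}

fn main() {
    // Callers use the same paths regardless of how the files are laid out.
    println!("{}", bar::name());
    println!("{}", bar::quux::name());
}
```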

Paths and Imports

The :: operator accesses symbols of a module. For instance, to use the swap function in the std::mem module, use: std::mem::swap(...)

To shorten this, import the module:

use std::mem;
mem::swap(...)

It'd be possible to directly import swap() as well – but importing types, traits and modules, and then accessing other items from there is considered better style.

This imports several names and renames imports respectively:

use std::fs::{self, File}; // import both `std::fs` and `std::fs::File`.  
use std::io::Result as IOResult;

Importing from the parent must be done explicitly with the super keyword: use super::AminoAcid;

Referring to the top level with the crate keyword: use crate::proteins::AminoAcid;

To refer to an external crate (in case of naming conflicts), use the absolute path starting with colons: use ::image::Pixels;
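
Putting the import forms together in one file, as a sketch: the proteins module layout is my own guess at how the book's AminoAcid example might be arranged, and swapped() is a helper I made up.

```rust
use std::collections::HashMap as Map; // rename on import
use std::mem; // import the module, then call `mem::swap`

mod proteins {
    #[derive(Debug, PartialEq, Clone, Copy)]
    pub enum AminoAcid {
        Glycine,
        Alanine,
    }

    pub mod synthesis {
        // `super` refers to the parent module, `proteins`.
        use super::AminoAcid;

        pub fn first() -> AminoAcid {
            AminoAcid::Glycine
        }
    }

    pub mod analysis {
        // `crate` refers to the crate root, wherever this module sits.
        use crate::proteins::AminoAcid;

        pub fn is_glycine(acid: AminoAcid) -> bool {
            acid == AminoAcid::Glycine
        }
    }
}

// Hypothetical helper: swaps two values via the imported module path.
fn swapped(mut a: i32, mut b: i32) -> (i32, i32) {
    mem::swap(&mut a, &mut b);
    (a, b)
}

fn main() {
    let mut counts: Map<&str, u32> = Map::new();
    counts.insert("glycine", 1);

    let acid = proteins::synthesis::first();
    println!("{:?} glycine? {}", acid, proteins::analysis::is_glycine(acid));
    println!("{:?}", proteins::AminoAcid::Alanine);
    println!("{:?} {:?}", swapped(1, 2), counts.get("glycine"));
}
```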

The Standard Prelude

There are two ways in which the standard lib is automatically available. Firstly, the std lib is always linked.

Secondly, some items are always auto-imported, as if every module started with:

use std::prelude::v1::*;

This makes some things always available, like Vec or Result

Some external crates contain modules named prelude; that's a convention to signal that the module is meant to be imported with *

Making use Declarations pub

A use declaration can be made public, which re-exports the imported names. This will make Leaf and Root available at the top:

// mod.rs
pub use self::leaves::Leaf;
pub use self::roots::Root;  
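
Here's a self-contained version of that idea with everything inline. The plant_structures module name and the struct fields are assumptions of mine, just to have something that compiles:

```rust
mod plant_structures {
    // Re-export, so callers can write `plant_structures::Leaf`
    // instead of `plant_structures::leaves::Leaf`.
    pub use self::leaves::Leaf;
    pub use self::roots::Root;

    pub mod leaves {
        #[derive(Debug, Default, PartialEq)]
        pub struct Leaf {
            pub veins: u32,
        }
    }

    pub mod roots {
        #[derive(Debug, Default, PartialEq)]
        pub struct Root {
            pub depth_cm: u32,
        }
    }
}

fn main() {
    // The short path and the long path name the same type.
    let short = plant_structures::Leaf { veins: 5 };
    let long = plant_structures::leaves::Leaf { veins: 5 };
    println!("{:?} == {:?}: {}", short, long, short == long);
    println!("{:?}", plant_structures::Root::default());
}
```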

Making Struct Fields pub

Visibility can also be controlled for individual fields:

pub struct Fern {
    pub roots: RootSet,
    pub stems: StemSet
}  
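
To see what a non-pub field buys me, a sketch with one public and one private field. This Fern has different fields than the book's; name, height_cm and the methods are invented for the example:

```rust
mod plants {
    pub struct Fern {
        pub name: String, // readable and writable from outside the module
        height_cm: u32,   // private: only code inside `plants` can touch it
    }

    impl Fern {
        // With a private field, outside code needs a constructor like this.
        pub fn new(name: &str) -> Fern {
            Fern {
                name: name.to_string(),
                height_cm: 1,
            }
        }

        pub fn grow(&mut self, cm: u32) {
            self.height_cm += cm;
        }

        pub fn height_cm(&self) -> u32 {
            self.height_cm
        }
    }
}

fn main() {
    let mut fern = plants::Fern::new("Bracken");
    fern.name.push_str(" fern"); // fine: `name` is pub
    fern.grow(4);
    println!("{} is {} cm tall", fern.name, fern.height_cm());
    // fern.height_cm = 100; // would not compile: the field is private
}
```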

Statics and Constants

pub const ROOM_TEMPERATURE: f64 = 20.0;  // degrees Celsius

pub static STATIC_ROOM_TEMPERATURE: f64 = 68.0;  // degrees Fahrenheit

Constants are similar to #define, while statics are variables that live through the program lifetime. Use statics for larger amounts of data, or if you need a shared ref. Statics can be mut but those can't be accessed in safe code; they're of course non-thread-safe.
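
A sketch of the difference, with names I made up. Note that for the mutable counter I'm using an atomic rather than a static mut; that's my substitution (it keeps the example in safe code), not something from the text above:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// A constant: the value is inlined wherever it is used.
pub const ROOM_TEMPERATURE_C: f64 = 20.0;

// A static: one copy that lives for the whole program, so shared
// references to it can have the 'static lifetime.
pub static PRIMES: [u32; 5] = [2, 3, 5, 7, 11];

// Mutable global state without `static mut`: atomics can be
// modified through a shared reference, from safe code.
static VISITS: AtomicU32 = AtomicU32::new(0);

fn first_primes() -> &'static [u32] {
    &PRIMES
}

// Increments the counter and returns the new value.
fn record_visit() -> u32 {
    VISITS.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    println!("room temperature: {} C", ROOM_TEMPERATURE_C);
    println!("primes: {:?}", first_primes());
    println!("visits so far: {}", record_visit());
}
```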

Turning a Program into a Library

By default, cargo looks for a src/lib.rs and if it finds one, builds a library

The src/bin directory

Cargo treats src/bin/*.rs as programs to build, as well as src/bin/*/*.rs files

Attributes

Things like controlling compiler warnings or conditional compilation are done via attributes.

Example: quell warnings about naming conventions:

#[allow(non_camel_case_types)]
pub struct git_revspec {
    ...
}  

Conditional compilation:

// Only include this module in the project if we're building for Android.
#[cfg(target_os = "android")]
mod mobile;  
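
The same mechanism works on functions, and there's a cfg! macro for use inside expressions. A sketch (platform_name and build_kind are names I invented):

```rust
// Exactly one of these two definitions is compiled, depending on the target.
#[cfg(target_os = "android")]
fn platform_name() -> &'static str {
    "android"
}

#[cfg(not(target_os = "android"))]
fn platform_name() -> &'static str {
    "not android"
}

// `cfg!` evaluates to a compile-time boolean; both branches must still
// type-check, unlike with the `#[cfg]` attribute.
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

fn main() {
    println!("{} / {}", platform_name(), build_kind());
}
```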

Suggest / force inlining:

//#[inline]
#[inline(always)]
fn do_osmosis(c1: &mut Cell, c2: &mut Cell) {
    ...
} 

The #[foo] attributes apply to a single item.

In order to attach an attribute to a whole crate, use #![foo] at the top of main.rs or lib.rs.

For example, the #![feature] attribute is used to turn on unstable features of the Rust language and libraries:

#![feature(trace_macros)]

fn main() {
    // I wonder what actual Rust code this use of assert_eq!
    // gets replaced with!
    trace_macros!(true);
    assert_eq!(10*10*10 + 9*9*9, 12*12*12 + 1*1*1);
    trace_macros!(false);
}

Tests and Documentation

I already saw the test attribute for marking functions as tests:

#[test]
fn math_works() {
    let x: i32 = 1;
    assert!(x.is_positive());
    assert_eq!(x + 1, 2);
}  

These can then be run globally or individually with cargo test or cargo test math. The latter would run all fns whose names contain "math"

Use something like this to test for a panicking fn:

#[test]
#[allow(unconditional_panic, unused_must_use)]
#[should_panic(expected="divide by zero")]
fn test_divide_by_zero_error() {
    1 / 0;  // should panic!
}  

The allow attr is for convincing the compiler to let us do foolish things and not optimize them away.

It's useful to put tests into a separate module and have it only compiled for testing. The #[cfg(test)] attr checks for that:

#[cfg(test)]   // include this module only when testing
mod tests {
    fn roughly_equal(a: f64, b: f64) -> bool {
        (a - b).abs() < 1e-6
    }

    #[test]
    fn trig_works() {
        use std::f64::consts::PI;
        assert!(roughly_equal(PI.sin(), 0.0));
    }
}  
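
Combining the pieces so far (a test-only module plus should_panic) for a function of my own, as a sketch. The ratio function and its panic message are made up for this example:

```rust
// Hypothetical function under test: panics with a recognizable
// message instead of relying on the built-in division panic.
fn ratio(numerator: i32, denominator: i32) -> i32 {
    if denominator == 0 {
        panic!("divide by zero: {} / 0", numerator);
    }
    numerator / denominator
}

#[cfg(test)] // compiled only for `cargo test`
mod ratio_tests {
    use super::ratio;

    #[test]
    fn ratio_divides() {
        assert_eq!(ratio(10, 2), 5);
    }

    #[test]
    #[should_panic(expected = "divide by zero")]
    fn ratio_panics_on_zero() {
        ratio(1, 0); // the test passes only if this panics with that text
    }
}

fn main() {
    println!("{}", ratio(10, 2));
}
```

Since the panic comes from an explicit panic! rather than a constant 1 / 0, no allow attributes are needed here.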

By default, cargo test will run tests multithreaded. To have tests run singlethreaded use cargo test -- --test-threads 1

Integration Tests

Integration tests live in tests/*.rs files alongside the src directory. When you run cargo test, Cargo compiles each integration test as a separate, standalone crate, linked with your library and the Rust test harness. The integration tests use the SUT as an external dep.

Documentation

To generate docs run cargo doc. To generate docs for our project only and open them in a browser use cargo doc --no-deps --open

Doc comments for an item start with ///

Comments starting with //! are treated as #![doc] attributes and are attached to the enclosing feature, e.g. a module or crate.

Doc comments can be Markdown-formatted. Markdown links can use Rust item paths, like leaves::Leaf to point to specific items.
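
A sketch of the three things together: a //! comment on a module, /// comments on items, and links that use item paths. The leaves module contents are invented by me:

```rust
pub mod leaves {
    //! Types for modelling leaves.
    //!
    //! This `//!` comment documents the enclosing module, `leaves`.

    /// A single leaf of a plant.
    #[derive(Debug, PartialEq)]
    pub struct Leaf {
        /// Number of veins in the leaf.
        pub veins: u32,
    }

    impl Leaf {
        /// Returns `true` for a leaf with a *single* vein.
        pub fn is_simple(&self) -> bool {
            self.veins == 1
        }
    }

    /// Creates a [`Leaf`] with the given number of veins.
    ///
    /// The link above is written as an item path and resolved by rustdoc;
    /// methods work too, e.g. [`Leaf::is_simple`].
    pub fn leaf(veins: u32) -> Leaf {
        Leaf { veins }
    }
}

fn main() {
    let l = leaves::leaf(1);
    println!("{:?} simple? {}", l, l.is_simple());
}
```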

It's possible to include longer doc items:

#![doc = include_str!("../README.md")]

Doctests

Things that look like MD code blocks in docs will get compiled and run, for instance:

use std::ops::Range;

/// Return true if two ranges overlap.
///
///     assert_eq!(ranges::overlap(0..7, 3..10), true);
///     assert_eq!(ranges::overlap(1..5, 101..105), false);
///
/// If either range is empty, they don't count as overlapping.
///
///     assert_eq!(ranges::overlap(0..0, 0..10), false);
///
pub fn overlap(r1: Range<usize>, r2: Range<usize>) -> bool {
    r1.start < r1.end && r2.start < r2.end &&
        r1.start < r2.end && r2.start < r1.end
}  

This generates two tests, one for each code block. Inside doctest code blocks, prefix a line with "# " (as in # my_test_setup()) to have setup code hidden: the setup code will be run for doctests but not displayed in the docs.

Doctests code blocks fenced with ```no_run will be compiled but not actually run
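
A sketch of the hidden setup lines. I'm assuming this sits in src/lib.rs of a library crate called my_crate (a made-up name), since doctests import the crate like an outside user would; the total function is invented too. I've used an indented code block in the doc comment, like the example above:

```rust
/// Returns the sum of all values in `readings`.
///
///     # // Lines starting with "# " are compiled and run, but not rendered.
///     # use my_crate::total;
///     # let readings = vec![1.5, 2.5];
///     assert_eq!(total(&readings), 4.0);
///
pub fn total(readings: &[f64]) -> f64 {
    readings.iter().sum()
}
```

In the rendered docs only the assert_eq! line should show up, while cargo test compiles and runs all four lines.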

Specifying Dependencies

Ok, I already saw how to specify dependencies in Cargo.toml for crates hosted on crates.io, it's the default:

image = "0.6.1"

For a crate hosted on github:

image = { git = "https://github.com/Piston/image.git", rev = "528f19c" }

A local crate:

image = { path = "vendor/image" }

Versions

Interestingly, the default version spec gives cargo some leeway in choosing crate versions, along the lines of semantic versioning: it chooses the latest version that should be compatible with the specified version. How well this works depends on how well the semantic versioning is aligned to reality I guess.
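
To pin down what "compatible" means here, I wrote a simplified model of the default rule as I understand it from the Cargo docs: the leftmost non-zero component of the requested version must not change. This is only a sketch, not Cargo's actual resolver; it ignores pre-release versions, and the function names are mine:

```rust
/// A (major, minor, patch) triple; pre-release tags are ignored in this sketch.
type Version = (u64, u64, u64);

/// True if `candidate` satisfies a default requirement on `required`.
fn caret_compatible(required: Version, candidate: Version) -> bool {
    // Never go below the requested version (tuples compare lexicographically).
    if candidate < required {
        return false;
    }
    let (major, minor, _patch) = required;
    if major > 0 {
        candidate.0 == major // 1.2.3 allows anything below 2.0.0
    } else if minor > 0 {
        candidate.0 == 0 && candidate.1 == minor // 0.6.1 allows anything below 0.7.0
    } else {
        candidate == required // 0.0.3 allows only 0.0.3
    }
}

/// Picks the highest compatible version from `available`, if any.
fn pick_latest(required: Version, available: &[Version]) -> Option<Version> {
    available
        .iter()
        .copied()
        .filter(|&v| caret_compatible(required, v))
        .max()
}

fn main() {
    let available = [(0, 6, 1), (0, 6, 4), (0, 7, 0), (1, 0, 0)];
    // For a spec like image = "0.6.1" this model picks 0.6.4.
    println!("{:?}", pick_latest((0, 6, 1), &available));
}
```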

The version spec allows for more control too though, for example to pin to an exact version: image = "=0.10.0"

Cargo.lock

However, if present, cargo will consult a Cargo.lock file with the exact crate versions to use as dependencies. This file is written the first time the project is built and updated when running cargo update. For executables it's useful to commit Cargo.lock to source control for repeatable builds. Static libraries on the other hand shouldn't bother, as their clients will have Cargo.lock files of their own (shared libs wouldn't though, so commit Cargo.lock for those).

Publishing Crates to crates.io

As has become standard in the open source universe, Rust has built-in support for publishing crates to a central location, here crates.io (presumably with the same issues around trustworthiness of 3rd party packages/crates).

This is managed by cargo as well, specifically cargo package to bundle up sources. It uses data from the [package] section of Cargo.toml to populate metadata like version info, license, etc. Sensibly, if Cargo.toml specifies deps by path, that conf is ignored: for a package published to crates.io, its deps should come from there as well. It's also possible to specify dependencies by path and have crates.io as a fallback.

The cargo login command can get you an API key, and cargo publish can be used to push the crate up.

Workspaces

To share dependencies it's possible to define a shared workspace for related projects via a Cargo.toml in a root dir with a [workspace] section that lists the participating subdirs (each containing a crate). The crates then share a single Cargo.lock and a common build directory, so deps that are common to those crates get downloaded and built only once.

Also, the cargo build/test/doc commands accept a --workspace flag; when specified this will make the command act on all crates in the workspace.
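
For reference, a minimal root Cargo.toml for such a workspace might look like this. The three member directory names are placeholders I picked; each would contain its own crate with its own Cargo.toml:

```toml
# Cargo.toml in the workspace root dir
[workspace]
members = ["fern_sim", "fern_img", "fern_video"]
```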

More Nice Things

  • Publishing on crates.io pushes the crate's docs to docs.rs

  • Travis CI has support for Rust, and it looks like there are GitHub Actions as well

  • Generate a README.md file from top-level crate comment with the cargo-readme plugin

That's it for now!