Some generic functions need to verify properties of the types they accept that are not easy to express using traits, such as that the type’s size or layout satisfies a condition. These constraints come from unsafe or FFI code and are generally hard to avoid. Here is a silly function that has two requirements of the value: that it’s Pod, and that it’s non-zero-sized:
/// Returns the first byte of the in-memory representation of `value`.
/// Requires value to have non-zero size.
fn first_byte_of<T: bytemuck::Pod>(value: T) -> u8 {
    assert!(std::mem::size_of::<T>() != 0);
    let addr = (&value) as *const T as *const u8;
    unsafe { *addr }
}
The Pod requirement is expressed with a trait bound (provided by an external crate), but there is no trait bound that rules out zero-sized types, so the function asserts that at run-time. The assertion is what makes the function’s use of unsafe sound: not only does first_byte_of() not make sense with zero-sized types, but attempting to call it with one would cause undefined behavior if it weren’t for the check. Usage looks like you’d expect:
fn main() {
    // 258u16 is [2, 1] in memory (on a little-endian target)
    assert_eq!(first_byte_of(258u16), 2);
    // 3.14f64 is [31, 133, 235, 81, 184, 30, 9, 64] in IEEE 754-ese (little-endian)
    assert_eq!(first_byte_of(3.14f64), 31);
    //first_byte_of(()); // panics at run-time
}
While the above works, it does introduce the possibility of a run-time panic. The obvious fix would be to change the return type to Option<u8>, returning None for zero-sized T. That would shift the burden of panicking to the caller, which would very likely immediately .unwrap() the returned value, at least in cases where it believes the type to be non-zero-sized and needs the value unconditionally. Keep in mind that the caller of first_byte_of() might itself be generic, so changing a type very far away from the call to first_byte_of() could introduce the panic as long as the check is performed at run-time, and the panic might go unnoticed until production.
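For comparison, the Option-returning variant might look like the sketch below. To stay free of external crates it uses a made-up PlainBytes marker trait as a stand-in for bytemuck::Pod; try_first_byte_of and tag_of are likewise hypothetical names. The generic caller shows how the panic simply moves to the .unwrap():

```rust
/// Hypothetical stand-in for `bytemuck::Pod`, so the sketch needs no external crate.
/// Safety: implementors must have no padding bytes and accept any bit pattern.
unsafe trait PlainBytes: Copy + 'static {}
unsafe impl PlainBytes for u16 {}
unsafe impl PlainBytes for f64 {}
unsafe impl PlainBytes for () {}

/// Returns the first byte of `value`, or `None` if `T` is zero-sized.
fn try_first_byte_of<T: PlainBytes>(value: T) -> Option<u8> {
    if std::mem::size_of::<T>() == 0 {
        return None;
    }
    let addr = (&value) as *const T as *const u8;
    // SAFETY: T is not zero-sized and has no padding, so its first byte is initialized.
    Some(unsafe { *addr })
}

/// A generic caller that needs the byte unconditionally.
/// It still panics at run-time if someone instantiates it with a zero-sized type.
fn tag_of<T: PlainBytes>(value: T) -> u8 {
    try_first_byte_of(value).unwrap()
}
```

Nothing stops tag_of(()) from compiling; the failure is only discovered when that call actually runs.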
Thanks to monomorphization, the compiler knows the size of T when generating the code of first_byte_of<T>(), so it should in principle be possible to abort compilation when T is something like (). And indeed, beginning with Rust 1.57, the compiler supports compile-time assertions:
pub const FOO: usize = 42;
pub const BAR: usize = 42;
const _: () = assert!(FOO == BAR); // compiles only if FOO == BAR
The const _: () = assert!(...) syntax looks a bit weird, but it sort of makes sense – assignment to a constant makes sure that the assertion is executed at compile-time, and assert!() does technically return (), since it operates by side effect. The assertion which would normally panic at run-time now becomes a compilation failure, which is just what we need.
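Outside of generic code, this form is handy for pinning down layout assumptions. The following is a small sketch with a made-up Header type (not something from this article) whose size is checked when the crate is compiled:

```rust
/// Hypothetical wire-format header, used only to illustrate the technique.
#[allow(dead_code)]
#[repr(C)]
struct Header {
    tag: u32,
    len: u32,
}

// Evaluated at compile time: if the layout ever changes, the build fails
// with this message instead of the program misbehaving later.
const _: () = assert!(
    std::mem::size_of::<Header>() == 8,
    "Header must be exactly 8 bytes"
);
```

Adding a third u32 field to Header would turn the assertion into a compilation error.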
Applied to first_byte_of(), the check would look like this:
fn first_byte_of<T: bytemuck::Pod>(value: T) -> u8 {
    // size_of() is const fn and can be invoked in const contexts
    const _: () = assert!(std::mem::size_of::<T>() != 0);
    let addr = (&value) as *const T as *const u8;
    unsafe { *addr }
}
But… this doesn’t compile! The message is “error[E0401]: can’t use generic parameters from outer function”, and the explanation doesn’t really help with our use case. Simple attempts to work around the error, such as moving the assertion to a separate const fn, don’t work either.
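One way to read the const fn remark, as a sketch with made-up helper names: a const fn called from ordinary code is executed at run time like any other function, so the check below compiles for every T and only fires when the call is reached. Calling the helper from a const _ item inside the generic function would bring back the E0401 error, because that item still cannot mention T.

```rust
/// Hypothetical helper: it may run in a const context, but nothing forces it to.
const fn assert_non_zero_sized<T>() {
    assert!(std::mem::size_of::<T>() != 0);
}

fn size_checked_at_run_time<T>() -> usize {
    // An ordinary call: evaluated when this function runs, not when it is compiled.
    assert_non_zero_sized::<T>();
    std::mem::size_of::<T>()
}
```

Here size_checked_at_run_time::<()>() builds without complaint and panics only when executed, which is the situation we started with.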
Some older discussions about this topic go even so far as to claim that rustc is actually justified in preventing post-monomorphization errors. They argue that it would be wrong for a generic function to compile with one type but not with another, at least in cases where both types satisfy the trait/lifetime bounds of the function. Fortunately this view was not shared by the compiler team, and Rust does allow you to verify properties of generics at compile-time. It just requires a bit of ceremony:
fn first_byte_of<T: bytemuck::Pod>(value: T) -> u8 {
    struct Check<T>(T);
    impl<T> Check<T> {
        const NON_ZERO_SIZE: () = assert!(std::mem::size_of::<T>() != 0);
    }
    let _ = Check::<T>::NON_ZERO_SIZE;
    let addr = (&value) as *const T as *const u8;
    unsafe { *addr }
}
Before explaining what’s going on, let’s see how well this works. The main() function from above compiles as before, but uncommenting the first_byte_of(()) invocation results in this beautiful compile-time error:
error[E0080]: evaluation of `first_byte_of::Check::<()>::NON_ZERO_SIZE` failed
 --> src/main.rs:4:35
  |
4 |         const NON_ZERO_SIZE: () = assert!(std::mem::size_of::<T>() != 0);
  |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the evaluated program panicked at 'assertion failed: std::mem::size_of::<T>() != 0', src/main.rs:4:35
  |
  = note: this error originates in the macro `assert` (in Nightly builds, run with -Z macro-backtrace for more info)

note: the above error was encountered while instantiating `fn first_byte_of::<()>`
  --> src/main.rs:16:5
   |
16 |     first_byte_of(());
   |     ^^^^^^^^^^^^^^^^^
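Note that the error is shown only if you do cargo build (or cargo run etc.), but not with cargo check, where compilation appears to succeed. This is presumably because cargo check stops before code generation, so first_byte_of::<()> is never actually instantiated.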
The “can’t use generic parameters from outer function” error means that a const inside the function must not be initialized by an expression involving the function’s generic types. This limitation may be lifted in the future, but for now constants in functions must be fully resolved prior to monomorphization. That means that the const _: () = ... trick doesn’t work and we need to find another way to force the assertion to be evaluated at compile-time.
This is where associated constants come into play – the initialization expression of constants attached to generic types isn’t subject to the same restrictions as the initialization expression of constants defined inside generic functions. We introduce a Check type which is generic over T, and contains a T just to satisfy the compiler (we could use “phantom data” but we don’t bother because we never actually construct a Check value). As before, the NON_ZERO_SIZE constant serves only to execute the assert; its value is never really used and remains (), as that’s what assert!() returns. But we do need to trigger its evaluation from first_byte_of(), which is accomplished with let _ = Check::<T>::NON_ZERO_SIZE;. The dummy let binding prevents the “path statement with no effect” warning we’d get if we just wrote Check::<T>::NON_ZERO_SIZE; at function top-level.
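The “phantom data” alternative mentioned in passing can be sketched like this. The names NonZeroSized and byte_len_of are invented for the example, and the check type lives at module level so that several functions could share it:

```rust
use std::marker::PhantomData;

/// Hypothetical reusable check type; PhantomData "uses" T without storing one.
struct NonZeroSized<T>(PhantomData<T>);

impl<T> NonZeroSized<T> {
    /// Evaluated once per concrete T that a compiled function refers to.
    const CHECK: () = assert!(
        std::mem::size_of::<T>() != 0,
        "T must not be zero-sized"
    );
}

/// Returns the size of T in bytes; meant to refuse to compile for zero-sized T.
fn byte_len_of<T>(_value: &T) -> usize {
    // Referring to the constant is what triggers the assertion.
    let _ = NonZeroSized::<T>::CHECK;
    std::mem::size_of::<T>()
}
```

With this in place, byte_len_of(&0u32) returns 4, while byte_len_of(&()) should be rejected by cargo build in the same way as first_byte_of(()).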
Finally, this pattern for enforcing compile-time assertions using associated constants can be extracted into a macro:
macro_rules! gen_assert {
    ($t:ident, $c:expr) => {{
        struct Check<$t>($t);
        impl<$t> Check<$t> {
            const CHECK: () = assert!($c);
        }
        let _ = Check::<$t>::CHECK;
    }}
}
With the boilerplate taken care of by the macro, first_byte_of() becomes simple again:
fn first_byte_of<T: bytemuck::Pod>(value: T) -> u8 {
    gen_assert!(T, std::mem::size_of::<T>() != 0);
    let addr = (&value) as *const T as *const u8;
    unsafe { *addr }
}
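The macro is not tied to first_byte_of(). As a further sketch that needs no external crate, here it guards a division by the size of T; PAGE_SIZE and elements_per_page are names made up for this example:

```rust
macro_rules! gen_assert {
    ($t:ident, $c:expr) => {{
        struct Check<$t>($t);
        impl<$t> Check<$t> {
            const CHECK: () = assert!($c);
        }
        let _ = Check::<$t>::CHECK;
    }};
}

/// Hypothetical page size, chosen only for illustration.
const PAGE_SIZE: usize = 4096;

/// How many values of type T fit into one page.
fn elements_per_page<T>() -> usize {
    // Rules out zero-sized T (division by zero) and T larger than a page.
    gen_assert!(
        T,
        std::mem::size_of::<T>() != 0 && std::mem::size_of::<T>() <= PAGE_SIZE
    );
    PAGE_SIZE / std::mem::size_of::<T>()
}
```

For example, elements_per_page::<u64>() evaluates to 512, whereas an instantiation with () is expected to fail the build rather than divide by zero at run-time.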
Great article! I had been waiting for this to be possible for a while without realizing that it can already be done today. Thank you for teaching me.
I had to extend the proposed macro a bit to address the following points:
– support for arbitrary number of generic arguments
– support for optional trait bounds (this is necessary if the assertions rely on associated trait items of generic arguments)
– support for const generics
– suppressing some false-positive compiler and clippy lints
In case anyone else has similar requirements, here’s what I ended up with for now:
/// Helper macro to express assertions that are tested at compile time
/// despite using properties of generic parameters of an outer function.
///
/// See discussion at <https://morestina.net/blog/1940>.
macro_rules! generic_asserts {
    (($($l:lifetime,)* $($($t:ident$(: $bound:path)?),+)? $(; $(const $c:ident:$ct:ty),+)?); $($label:ident: $test:expr);+$(;)?) => {
        #[allow(path_statements, clippy::no_effect)]
        {
            struct Check<$($l,)* $($($t,)+)? $($(const $c:$ct,)+)?>($($($t,)+)?);
            impl<$($l,)* $($($t$(:$bound)?,)+)? $($(const $c:$ct,)+)?> Check<$($l,)* $($($t,)+)? $($($c,)+)?> {
                $(
                    const $label: () = assert!($test);
                )+
            }
            generic_asserts!{@nested Check::<$($l,)* $($($t,)+)? $($($c,)+)?>, $($label: $test;)+}
        }
    };
    (@nested $t:ty, $($label:ident: $test:expr;)+) => {
        $(
            <$t>::$label;
        )+
    }
}
The macro can be used in very general ways:
generic_asserts!(
    (Type1: Trait1, Type2: Trait2; const C1: usize, const C2: u32);
    TYPE1_MUST_BE_COMPATIBLE_WITH_C1: Type1::ASSOCIATED_CONST <= C1;
    TYPE2_MUST_BE_COMPATIBLE_WITH_C2: Type2::OTHER_ASSOCIATED_CONST <= C2;
    BOTH_TYPES_MUST_SATISFY_SOMETHING_JOINTLY: Type1::ASSOCIATED_CONST + Type2::ASSOCIATED_CONST <= C1 * C2 as usize
);
Here, the first line denotes the generic types and consts (consts are optional; note the semicolon between types and consts). Each one of the subsequent lines defines an assertion that can reference any subset of generic types and consts. Each assertion is prefixed with a label, which appears in the compiler error message if the assertion is violated.
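To make the invocation concrete, here is a self-contained sketch that repeats the macro and applies it to one generic type with a trait bound plus one const generic. The Packet trait, the Ping type, and buffer_for are invented for this example and are not part of the original comment:

```rust
macro_rules! generic_asserts {
    (($($l:lifetime,)* $($($t:ident$(: $bound:path)?),+)? $(; $(const $c:ident:$ct:ty),+)?); $($label:ident: $test:expr);+$(;)?) => {
        #[allow(path_statements, clippy::no_effect)]
        {
            struct Check<$($l,)* $($($t,)+)? $($(const $c:$ct,)+)?>($($($t,)+)?);
            impl<$($l,)* $($($t$(:$bound)?,)+)? $($(const $c:$ct,)+)?> Check<$($l,)* $($($t,)+)? $($($c,)+)?> {
                $(
                    const $label: () = assert!($test);
                )+
            }
            generic_asserts!{@nested Check::<$($l,)* $($($t,)+)? $($($c,)+)?>, $($label: $test;)+}
        }
    };
    (@nested $t:ty, $($label:ident: $test:expr;)+) => {
        $(
            <$t>::$label;
        )+
    }
}

/// Hypothetical trait with an associated constant to check against.
trait Packet {
    const HEADER_LEN: usize;
}

/// Hypothetical packet type with an 8-byte header.
struct Ping;
impl Packet for Ping {
    const HEADER_LEN: usize = 8;
}

/// Allocates a zeroed buffer of CAP bytes for packets of type P.
fn buffer_for<P: Packet, const CAP: usize>() -> [u8; CAP] {
    generic_asserts!(
        (P: Packet; const CAP: usize);
        BUFFER_MUST_NOT_BE_EMPTY: CAP > 0;
        HEADER_MUST_FIT_IN_BUFFER: P::HEADER_LEN <= CAP
    );
    [0u8; CAP]
}
```

buffer_for::<Ping, 16>() builds and returns a 16-byte array; buffer_for::<Ping, 4>() is expected to be rejected at build time with an error that names HEADER_MUST_FIT_IN_BUFFER.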