2026-04-03
If I had to give a tagline for Rust procedural macros, it'd be "generic trait implementations for thrill seekers!" (Call me, Rust maintainers.) Proc macros, as they're called, power Rust's ability to #[derive()] implementations on your structs. While generics can fill in the implementation, they're limited by what you can fit into the generics system. Proc macros are much more flexible, but that also makes them much more complex!
For a practical intro, let's implement a silly trait that couldn't be implemented with generics.
The taxicab distance is a simple metric on multidimensional spaces, where the distance is just the sum of distances in each dimension. The name comes from a taxi driving on a city grid, where it has to follow the roads and not travel as the crow flies.
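To make the definition concrete before any traits show up, here's a minimal sketch that computes the taxicab distance between two points given as coordinate slices. The helper name `taxicab_between` is made up for this illustration and isn't used anywhere else in the post.

```rust
/// Taxicab distance between two points given as coordinate slices.
/// `taxicab_between` is a hypothetical helper, only for this sketch.
fn taxicab_between(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "points must have the same dimension");
    // Sum the absolute difference along each dimension.
    a.iter().zip(b).map(|(p, q)| (p - q).abs()).sum()
}
```

For the points (0, 0) and (3, 4), this gives 3 + 4 = 7, whereas the straight-line distance would be 5.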
Let's start with a trait that has a taxicab method.
pub trait Taxicab {
    fn taxicab(&self, other: &Self) -> f64;
}
Ideally, we could have this for all sorts of structs that only contain numbers.
#[derive(Debug)]
struct Point2 {
    x: f64,
    y: f64,
}
impl Taxicab for Point2 {
    fn taxicab(&self, other: &Self) -> f64 {
        (self.x - other.x).abs() + (self.y - other.y).abs()
    }
}
#[derive(Debug)]
struct Point3 {
    x: f64,
    y: f64,
    z: f64,
}
impl Taxicab for Point3 {
    fn taxicab(&self, other: &Self) -> f64 {
        (self.x - other.x).abs() + (self.y - other.y).abs() + (self.z - other.z).abs()
    }
}
Okay, but this is a pain if we have to implement it for every single struct! It's all the same basic logic, so can't we take a shortcut?
You've surely used #[derive()] for a struct; the examples above both derived Debug! Some other common targets of derive are Clone, Copy, and PartialOrd.
Fortunately, we're not restricted to only deriving the standard traits, so we can use #[derive(Taxicab)] to automatically generate implementations of our own traits.
Unfortunately, you can't have the proc macro that derives Taxicab in the same crate where it's used, although the Rust team hopes to drop this requirement in the future. Instead, you need to carefully set up your workspace to include both the regular crate defining Taxicab and a special crate just containing the proc macros. Essentially, the proc macros are run during compilation, so they need to be built before any crate that would use them.
To use multiple crates, you'll want to use a Cargo workspace. For some example workspace setups in projects that need proc macros, check out serde or polars.
First, create one binary crate, taxicab, that will contain the trait definition and the main function. Then, create a second crate, the library taxicab-derive, that has the proc macro we need to use #[derive(Taxicab)].
cargo new taxicab
cd taxicab
cargo new --lib taxicab-derive
In the taxicab Cargo.toml, set up the workspace.
[workspace]
members = [".", "taxicab-derive"]
[dependencies]
taxicab-derive = { path = "taxicab-derive" }
We also have to tell Rust that taxicab-derive is going to give us proc macros, with a line in taxicab-derive's Cargo.toml. Make sure to put it in the right configuration, since we've got a lot of Cargo.tomls floating around now.
[lib]
proc-macro = true
The barebones proc macro looks pretty short: you just take in a TokenStream, do something to it, and return another TokenStream.
use proc_macro::TokenStream;
#[proc_macro_derive(Taxicab)]
pub fn taxicab_derive(input: TokenStream) -> TokenStream {
    todo!()
}
Filling it in is the hard part!
Despite proc macros being a core part of Rust, we pretty much always want two external crates to help fill out the derive function. The first is syn, which parses the token stream input into a syntax tree that we can traverse and process.
let ast = syn::parse_macro_input!(input as syn::DeriveInput);
The second external crate is quote, which lets us generate token streams by writing Rust-like code with its quote! macro. It has a minimal pattern interpolation that uses the # symbol to identify variables from outside of the scope.
Here's an example of how you'd use it, with format_ident! to get y as an identifier rather than a string.
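use proc_macro::TokenStream;
use quote::{format_ident, quote};
fn example() -> TokenStream {
    // Build the identifier `y` (not the string "y")
    let x = format_ident!("y");
    // #x is replaced by the identifier stored in x
    let rust = quote!( let #x = "x"; );
    rust.into()
}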
The example gives you a token stream corresponding to the line below.
let y = "x";
Putting it all together, here's a quick implementation that derives the taxicab method!
use proc_macro::TokenStream;
use quote::quote;
#[proc_macro_derive(Taxicab)]
pub fn taxicab_derive(input: TokenStream) -> TokenStream {
    let ast = syn::parse_macro_input!(input as syn::DeriveInput);
    // The struct's name: impl Taxicab for ???
    let name = &ast.ident;
    // We need to go through and get all the struct's fields
    let mut sum = vec![];
    if let syn::Data::Struct(data) = &ast.data {
        if let syn::Fields::Named(fields) = &data.fields {
            for field in &fields.named {
                let field_name = field.ident.as_ref().unwrap();
                // A field named x becomes self.x.taxicab(&other.x)
                sum.push(quote! { self.#field_name.taxicab(&other.#field_name) });
            }
        }
    }
    // Putting it all together for the trait implementation
    let implementation = quote! {
        impl Taxicab for #name {
            fn taxicab(&self, other: &Self) -> f64 {
                0.0 #(+ #sum)*
            }
        }
    };
    implementation.into()
}
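The listing above only builds if syn and quote are declared as dependencies of the derive crate. Here's a sketch of the relevant part of taxicab-derive's Cargo.toml, combined with the proc-macro flag from earlier. The major versions are an assumption on my part (the post doesn't pin any), and the [package] section that cargo new generates is left out.

```toml
# taxicab-derive/Cargo.toml (fragment; versions are assumed, not from the post)
[lib]
proc-macro = true

[dependencies]
syn = "2"
quote = "1"
```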
You can derive this on the Point2 struct above.
#[derive(Debug, Taxicab)]
struct Point2 {
    x: f64,
    y: f64,
}
When you compile, the proc macro generates the following implementation.
impl Taxicab for Point2 {
    fn taxicab(&self, other: &Self) -> f64 {
        0.0 + self.x.taxicab(&other.x) + self.y.taxicab(&other.y)
    }
}
(We haven't implemented Taxicab for f64 yet, but we'll get to it later!)
So most of this is a fairly straightforward traversal of a syntax tree, but I want to talk through how the quoting works. When I go through the struct fields, I collect the field method calls. In the Point2 case, the sum ends up with two quotes like the vec below.
vec![quote!( self.x.taxicab(&other.x) ), quote!( self.y.taxicab(&other.y) )]
We now have to join this into one line, which is what the 0.0 #(+ #sum)* does. The #()* lets you expand a list.
(#(),* would use a comma between items.)
We don't need to use "+" as a separator here, since we've got the leading 0.0. That zero will usually compile away, except in a single degenerate case without any fields.
#[derive(Taxicab)]
struct Point0;
Calling the taxicab method will always return zero.
We need to implement the taxicab distance for something to be able to use the derived implementation, since it uses the taxicab method for the struct's fields. We could just write it for f64, but why not do something more general? Ideally we'd cover f32, i32, and the like. This problem is a great use case for generics!
All we need to do for the primitive fields is subtract two values, take the absolute value, and somehow get it to an f64. We can do this with standard library traits! Specifically, std::ops::Sub covers subtraction and we can use Into<f64> to convert to our final type. We don't need a trait for the absolute value if we convert the difference to a float first.
use std::ops::Sub;
impl<T> Taxicab for T
where
    for<'a> &'a T: Sub<&'a T>,
    for<'a> <&'a T as Sub<&'a T>>::Output: Into<f64>,
{
    fn taxicab(&self, other: &Self) -> f64 {
        Into::<f64>::into(self - other).abs()
    }
}
The one tricky part is that Sub can use all sorts of types! You implement it for the left-hand side type, and the right-hand side is the generic parameter: Sub<Rhs>. The output type is an associated type of the trait, Sub::Output. We'll need to require Into<f64> for that output type, not the base generic itself.
Also, we have to use a lot of lifetimes here because we're operating on references. We wouldn't want to consume the values when evaluating a taxicab distance!
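To see which primitives the blanket implementation actually picks up, here's a self-contained sketch that repeats the trait and the generic impl from above so it can be compiled on its own. Nothing here is new API; it only exercises the impl.

```rust
use std::ops::Sub;

pub trait Taxicab {
    fn taxicab(&self, other: &Self) -> f64;
}

// The blanket implementation from the text, repeated so this compiles alone.
impl<T> Taxicab for T
where
    for<'a> &'a T: Sub<&'a T>,
    for<'a> <&'a T as Sub<&'a T>>::Output: Into<f64>,
{
    fn taxicab(&self, other: &Self) -> f64 {
        // Subtract by reference, widen to f64, then take the absolute value.
        Into::<f64>::into(self - other).abs()
    }
}
```

With this in place, 3.0_f64.taxicab(&1.0) is 2.0 and 1_i32.taxicab(&4) is 3.0. Two caveats follow from the bounds: i64 isn't covered, because the standard library has no lossless From<i64> for f64, and unsigned types like u8 are covered but the subtraction happens before the conversion, so it overflows (and panics in debug builds) when other is the larger value.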
With the trait implemented for a lot of numeric primitives, we can start using the proc macro on our structs.
#[derive(Taxicab, Debug)]
struct Point {
    x: f64,
    y: f64,
}
#[derive(Taxicab, Debug)]
struct PointMixed {
    x: f64,
    y: i32,
}
#[derive(Taxicab, Debug)]
struct PointParent {
    x: f32,
    y: PointMixed,
}
In the last case, PointParent, our derived taxicab method is itself using a derived taxicab method on PointMixed.
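If you'd like to check the nesting without building the derive crate, here's a sketch where the two impl blocks are hand-written stand-ins for what #[derive(Taxicab)] would generate for PointMixed and PointParent. The trait and blanket impl are repeated from earlier so the example stands alone.

```rust
use std::ops::Sub;

pub trait Taxicab {
    fn taxicab(&self, other: &Self) -> f64;
}

// Blanket impl for primitives, as in the text.
impl<T> Taxicab for T
where
    for<'a> &'a T: Sub<&'a T>,
    for<'a> <&'a T as Sub<&'a T>>::Output: Into<f64>,
{
    fn taxicab(&self, other: &Self) -> f64 {
        Into::<f64>::into(self - other).abs()
    }
}

struct PointMixed {
    x: f64,
    y: i32,
}

struct PointParent {
    x: f32,
    y: PointMixed,
}

// Hand-written stand-in for the derived impl on PointMixed.
impl Taxicab for PointMixed {
    fn taxicab(&self, other: &Self) -> f64 {
        0.0 + self.x.taxicab(&other.x) + self.y.taxicab(&other.y)
    }
}

// Hand-written stand-in for the derived impl on PointParent.
// The `y` term calls PointMixed's taxicab method, not the primitive one.
impl Taxicab for PointParent {
    fn taxicab(&self, other: &Self) -> f64 {
        0.0 + self.x.taxicab(&other.x) + self.y.taxicab(&other.y)
    }
}
```

For a parent at x = 1.0 holding (0.0, 2) and another at x = 4.0 holding (1.5, -1), the distance works out to 3 + 1.5 + 3 = 7.5.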
There's an important concept in macros called "hygiene" that basically refers to whether symbols in the caller's scope can bleed into the macro-generated code. Do the identifiers from where the macro was called affect the execution of a macro? For example, this C macro is unhygienic and pulls in y from the containing scope, giving different results for the same input.
#define F(x) x + y
int function1() {
    int y = 1;
    return F(1);
}
int function2() {
    int y = 2;
    return F(1);
}
In contrast, functions are hygienic, since you don't need to care about what's in scope when the function is called.
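For comparison with the C macro, Rust's declarative macro_rules! macros are hygienic when it comes to local variables. Here's a small sketch; the names `add_ten` and `caller` are made up for the illustration.

```rust
// A declarative macro that introduces its own local `y`.
macro_rules! add_ten {
    ($x:expr) => {{
        let y = 10; // belongs to the macro, invisible to the caller's tokens
        $x + y
    }};
}

fn caller() -> i32 {
    let y = 1;
    // The caller's `y` is passed in as `$x`; the macro's `y` does not shadow it.
    add_ten!(y)
}
```

Here caller() returns 11. With C-style textual substitution, the inner y would shadow the outer one and the result would be 20.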
The key thing to know is that proc macros are unhygienic! We're just outputting a token stream that's interpreted in the local context.
We should particularly worry about traits. In the Rust code block above, we generated an impl Taxicab for ..., but never specified what Taxicab meant!
What if some other trait were in scope as Taxicab? For instance, you could (but never would) pull in another trait with that name.
use std::fmt::Display as Taxicab;
Then, you'd get a compilation error that your derive proc macro implements the wrong method!
error[E0407]: method `taxicab` is not a member of trait `Taxicab`
 --> src/main.rs:5:10
  |
5 | #[derive(Taxicab, Debug)]
  |          ^^^^^^^ not a member of trait `Taxicab`
  |
  = note: this error originates in the derive macro `Taxicab` (in Nightly builds, run with -Z macro-backtrace for more info)
Another, probably more realistic, scenario is that you don't have the Taxicab trait in scope when you derive it. You wouldn't run into this problem for a small program where you define structs and use the taxicab distance in the same file, since you have to have Taxicab in scope whenever you use the taxicab method. However, you might have a larger project where all the structs that derive the taxicab distance are in a file that doesn't have Taxicab in scope. In that setup, you'll get an error that you're trying to implement something that isn't a trait.
The way to get around the hygiene problem is to use the trait's full, legal name: crate::Taxicab (for a binary).
use proc_macro::TokenStream;
use quote::quote;
#[proc_macro_derive(Taxicab)]
pub fn taxicab_derive(input: TokenStream) -> TokenStream {
    let ast = syn::parse_macro_input!(input as syn::DeriveInput);
    let name = &ast.ident;
    let mut sum = vec![];
    if let syn::Data::Struct(data) = &ast.data {
        if let syn::Fields::Named(fields) = &data.fields {
            for field in &fields.named {
                let field_name = field.ident.as_ref().unwrap();
                sum.push(quote! { self.#field_name.taxicab(&other.#field_name) });
            }
        }
    }
    // Using the full path crate::Taxicab here
    let implementation = quote! {
        impl crate::Taxicab for #name {
            fn taxicab(&self, other: &Self) -> f64 {
                0.0 #(+ #sum)*
            }
        }
    };
    implementation.into()
}
If you put a trait in a library, then crate is going to be whoever is calling your library, not the library itself! That's bad hygiene.
Instead, you can qualify the trait with the library's name: ::taxicab::Taxicab.
To use the proc macro in your own library, you need to tell Rust that the library name refers to your crate.
extern crate self as taxicab;
In the Taxicab case, hygiene is a bit more complicated since we actually need Taxicab in scope for the derived implementation! This requirement is not a general need for proc macros, but special to the recursive nature where each struct's taxicab method calls the taxicab method of its fields.
Again, we should worry about Taxicab being in scope when we call the method. We can be safe by writing the method call with its fully qualified path.
sum.push(quote! { ::taxicab::Taxicab::taxicab(&self.#field_name, &other.#field_name) });
I'll switch to the ::taxicab::Taxicab version from here out.
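The difference between the two call styles is easy to try out without a proc macro. In this sketch, the modules `metric` and `shapes` are hypothetical stand-ins for the taxicab crate and for a file that never imports the trait, and `super::metric::Taxicab` plays the role of `::taxicab::Taxicab`.

```rust
mod metric {
    pub trait Taxicab {
        fn taxicab(&self, other: &Self) -> f64;
    }

    impl Taxicab for f64 {
        fn taxicab(&self, other: &Self) -> f64 {
            (self - other).abs()
        }
    }
}

mod shapes {
    // Note: there is no `use` of the trait anywhere in this module.
    pub struct Point2 {
        pub x: f64,
        pub y: f64,
    }

    impl super::metric::Taxicab for Point2 {
        fn taxicab(&self, other: &Self) -> f64 {
            // `self.x.taxicab(&other.x)` would not compile here, because
            // method syntax needs the trait in scope. The full path does not.
            0.0 + super::metric::Taxicab::taxicab(&self.x, &other.x)
                + super::metric::Taxicab::taxicab(&self.y, &other.y)
        }
    }
}
```

A caller can use the same path-based form, for example metric::Taxicab::taxicab(&a, &b), without importing the trait either.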
I've only shown derive for normal structs with named fields, but you can throw a #[derive(Taxicab)] on a bunch of other things! If you don't take care, like in the code above, you'll end up with weird implementations for them.
First, your struct might not have any named fields.
#[derive(Taxicab, Debug)]
struct PointUnnamed(f64, f64);
While this type should behave the same as Point2 above, the taxicab method will always return zero, even when the two values are different! We're silently skipping past an if let statement and not handling the other cases.
To handle these structs, let's replace the if let with a match:
match &data.fields {
    syn::Fields::Named(fields) => { // Same as before
        for field in &fields.named {
            let field_name = field.ident.as_ref().unwrap();
            sum.push(quote! { ::taxicab::Taxicab::taxicab(&self.#field_name, &other.#field_name) });
        }
    }
    syn::Fields::Unnamed(fields) => { // New to handle tuple structs
        for index in 0..fields.unnamed.len() {
            let index = syn::Index::from(index);
            sum.push(quote! { ::taxicab::Taxicab::taxicab(&self.#index, &other.#index) });
        }
    }
    syn::Fields::Unit => {} // Nothing to do!
}
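Now, you should get proper taxicab distances on your tuple structs. The syn::Index conversion matters here: interpolating a plain usize with quote! emits a suffixed literal like 0usize, and self.0usize isn't valid field access.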
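For reference, here's roughly what the tuple-struct arm should expand to for PointUnnamed, written out by hand so it can be run without the derive crate. The f64 impl is a simplified stand-in for the generic one, and the bare Taxicab:: path stands in for ::taxicab::Taxicab.

```rust
pub trait Taxicab {
    fn taxicab(&self, other: &Self) -> f64;
}

// Simplified stand-in for the blanket impl over primitives.
impl Taxicab for f64 {
    fn taxicab(&self, other: &Self) -> f64 {
        (self - other).abs()
    }
}

struct PointUnnamed(f64, f64);

// Hand-written stand-in for the derive output: fields are accessed by index.
impl Taxicab for PointUnnamed {
    fn taxicab(&self, other: &Self) -> f64 {
        0.0 + Taxicab::taxicab(&self.0, &other.0) + Taxicab::taxicab(&self.1, &other.1)
    }
}
```

With this, PointUnnamed(1.0, 2.0) and PointUnnamed(4.0, 0.0) are 3 + 2 = 5 apart, instead of the silent zero from the if let version.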
You might note that there's also this case for syn::Fields::Unit, which corresponds to the simple structs with no fields.
#[derive(Taxicab)]
struct Point0;
The original implementation worked fine here as well, since it defaults to zero.
The last thing to keep in mind is enums and unions. We probably don't want to handle them in the taxicab metric, so we need some way to generate an error. We could emit a panic! in the generated code, but that would only fire when the program is run, not when it's compiled!
Instead, we want the compile_error! macro, which is like panic! except that it fails the build with our message instead of crashing at run time.
if let syn::Data::Struct(data) = &ast.data {
    // Skipped for brevity
} else {
    return quote! {compile_error!("Can only derive Taxicab on structs.");}.into();
}
The trick here is that you need the compile_error! inside a quote! or you won't be able to build the derive crate! Remember that we're actually compiling twice: once for the crate that gives us the proc macro, and once for the program itself. We need the compile_error! to appear in the program, which is the second time we compile, not in the proc macro crate.
You could alternatively panic! in the proc macro directly, but that crash is a bit awkward for users to parse.
Putting all those changes together would give you something like the following proc macro.
use proc_macro::TokenStream;
use quote::quote;
#[proc_macro_derive(Taxicab)]
pub fn taxicab_derive(input: TokenStream) -> TokenStream {
    let ast = syn::parse_macro_input!(input as syn::DeriveInput);
    let name = &ast.ident;
    let mut sum = vec![];
    if let syn::Data::Struct(data) = &ast.data {
        match &data.fields {
            syn::Fields::Named(fields) => {
                for field in &fields.named {
                    let field_name = field.ident.as_ref().unwrap();
                    sum.push(quote! { ::taxicab::Taxicab::taxicab(&self.#field_name, &other.#field_name) });
                }
            }
            syn::Fields::Unnamed(fields) => {
                for index in 0..fields.unnamed.len() {
                    let index = syn::Index::from(index);
                    sum.push(quote! { ::taxicab::Taxicab::taxicab(&self.#index, &other.#index) });
                }
            }
            syn::Fields::Unit => {}
        }
    } else {
        return quote! {compile_error!("Can only derive Taxicab on structs.");}.into();
    }
    let implementation = quote! {
        impl ::taxicab::Taxicab for #name {
            fn taxicab(&self, other: &Self) -> f64 {
                0.0 #(+ #sum)*
            }
        }
    };
    implementation.into()
}