In November 2021 I wrote a blog post that examined Rust’s curious relationship with global variables. It aimed to explain why this ubiquitous language feature required external crates, and ended with personal recommendations on the use of globals in new code. Two years have passed, and Rust has changed enough that it’s time to take a fresh look. The rest of this text assumes you’ve read the previous article or are familiar with the subject.
Const Mutex and RwLock constructors
The first change is that Mutex::new() is const as of Rust 1.63, so this example from the previous post now compiles and works as expected:
// didn't compile two years ago, compiles now
static LOG_FILE: Mutex<String> = Mutex::new(String::new());
The foundation for this improvement was laid in 1.62, which replaced Mutex, RwLock, and Condvar with lightweight, non-allocating implementations on Linux. Rust 1.63 then built on that work to provide const construction of those types on all platforms. The result is that for simple types, mutex-protected globals “just work” without doing anything special.
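As a minimal sketch of what "just work" means in practice, the following assumes Rust 1.63 or later and uses only the standard library. The SEARCH_PATHS global and the helper function names are hypothetical, added here for illustration only:

```rust
use std::sync::{Mutex, RwLock};

// Both statics are built at compile time by const constructors;
// no OnceCell, lazy_static, or other wrapper is involved.
static LOG_FILE: Mutex<String> = Mutex::new(String::new());
// Hypothetical second global, showing that RwLock works the same way.
static SEARCH_PATHS: RwLock<Vec<String>> = RwLock::new(Vec::new());

/// Replaces the current log file name.
pub fn set_log_file(name: &str) {
    let mut guard = LOG_FILE.lock().unwrap();
    guard.clear();
    guard.push_str(name);
}

/// Returns a copy of the current log file name.
pub fn log_file() -> String {
    LOG_FILE.lock().unwrap().clone()
}

/// Appends a path under the write lock.
pub fn add_search_path(path: &str) {
    SEARCH_PATHS.write().unwrap().push(path.to_string());
}

/// Counts the paths under a shared read lock.
pub fn search_path_count() -> usize {
    SEARCH_PATHS.read().unwrap().len()
}
```

Every access still takes the lock, which is fine for globals that are genuinely mutated over the lifetime of the program.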
Although we no longer have to encase every static Mutex in a OnceCell or equivalent, we still need a cell-like wrapper for scenarios where locked writing is only done on first use to initialize the value. In that case subsequent accesses to the global are read-only and shouldn’t require locking, only an atomic check. This is a very common use of global variables, a good example being a global holding a lazily compiled regex.
This brings us to the next and more important news.
Once cell is now in std
As of Rust 1.70, once_cell::sync::OnceCell from the once_cell crate is part of the standard library under the name std::sync::OnceLock. For the first time in Rust’s existence, you don’t need to write unsafe code, or bring in external crates that encapsulate it, to create a global/static variable initialized on first use. Usage is essentially the same as with once_cell:
use std::sync::OnceLock;
use regex::Regex;
pub fn log_file_regex() -> &'static Regex {
    static LOG_FILE_REGEX: OnceLock<Regex> = OnceLock::new();
    LOG_FILE_REGEX.get_or_init(|| Regex::new(r#"^\d+-[[:xdigit:]]{8}$"#).unwrap())
}
// use log_file_regex().is_match(some_name) anywhere in your program
This addition might not seem like a big deal at first, given that once_cell has provided the same functionality for years. However, having it in the standard library benefits the language in several ways. First, initialize-on-first-use globals are very widely used by both applications and libraries, and both can now phase out crates like once_cell and lazy_static from their dependencies. Second, global variables can now be created by macro-generated code without awkward reexports of once_cell and other logistical issues. Third, it makes the language easier to teach: teaching materials no longer need to decide whether to cover once_cell or lazy_static, nor explain why external crates are needed for global variables to begin with. This excruciatingly long StackOverflow answer is a good example of the quagmire, as is my previous blog post on this topic. The whole stdlib/unsafe section of the latter is now obsolete, as the same can be achieved safely with OnceLock at no loss of performance.
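To make the point about macro-generated code concrete, here is a rough sketch of a macro that defines a lazily initialized global using only fully qualified std paths. The macro name lazy_global and the known_extensions function are hypothetical, invented for this illustration. The key property is that the expansion refers to nothing outside the standard library, so a crate exporting such a macro does not need to reexport once_cell:

```rust
/// Hypothetical helper macro: defines a function `$name()` that returns a
/// `'static` reference to a value of type `$ty`, computed by `$init` on the
/// first call and reused afterwards.
macro_rules! lazy_global {
    ($vis:vis fn $name:ident() -> $ty:ty { $init:expr }) => {
        $vis fn $name() -> &'static $ty {
            // Fully qualified std paths: the expansion resolves in any crate
            // without extra dependencies or reexports.
            static CELL: ::std::sync::OnceLock<$ty> = ::std::sync::OnceLock::new();
            CELL.get_or_init(|| $init)
        }
    };
}

// Example invocation with made-up data.
lazy_global! {
    pub fn known_extensions() -> Vec<&'static str> { vec!["log", "txt", "gz"] }
}
```

Every call to known_extensions() returns a reference to the same Vec, which is built on the first call.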
The work is not yet complete, however. Note how the static variable is placed inside the function that contains the sole call to OnceLock::get_or_init(). This pattern ensures that every access to the static OnceLock goes through one place which also initializes it. once_cell makes this less verbose through once_cell::sync::Lazy, but the equivalent stdlib type is not yet stable, being stuck on some technical issues. The workaround of placing the global into a function isn’t a significant obstacle, but it’s worth mentioning. It’s particularly relevant when comparing the ease of use of OnceLock with that of lazy_static::lazy_static! or once_cell::sync::Lazy, both of which offer the convenience of initializing in a single location without additional effort.
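The function-encapsulation pattern can be exercised with a small sketch like the one below, which also checks that the initializer runs only once when several threads race to the first access. The squares table, the INIT_CALLS counter, and the function names are assumptions made up for this example:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

// Counts how many times the initializer actually runs (illustration only).
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical expensive-to-build global: a lookup table of squares.
/// The static lives inside the function, so this is the only way to reach it
/// and every access goes through get_or_init().
pub fn squares() -> &'static [u64] {
    static SQUARES: OnceLock<Vec<u64>> = OnceLock::new();
    SQUARES.get_or_init(|| {
        INIT_CALLS.fetch_add(1, Ordering::Relaxed);
        (0..16u64).map(|n| n * n).collect()
    })
}

/// Calls `squares()` from several threads and returns how many times the
/// initializer ran in total.
pub fn init_calls_after_concurrent_use() -> usize {
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| squares()[3]))
        .collect();
    for handle in handles {
        // Each thread reads the same table: 3 * 3 == 9.
        assert_eq!(handle.join().unwrap(), 9);
    }
    INIT_CALLS.load(Ordering::Relaxed)
}
```

Because SQUARES is not visible outside squares(), no caller can observe the cell in its uninitialized state.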
What to use in 2024
Two years ago the TL;DR of my recommendation was to “use once_cell or lazy_static, depending on which syntax you prefer”. Now it shifts to: use standard-library facilities like OnceLock or atomics in almost all situations, and once_cell when you require convenience not yet covered by std.
In particular:
- As before, when the type you want to use in static supports thread-safe interior mutability and has a const constructor, you can declare it as static directly. (The compiler will check all that for you; just see if it compiles.) This used to include only atomics, but now also includes mutexes and rwlocks. So if something like static CURRENT_CONFIG: Mutex<Option<Config>> = Mutex::new(None) or static SHOULD_LOG: AtomicBool = AtomicBool::new(true) works for you, go for it.
- When this doesn’t work, or you need to initialize on first use, use std::sync::OnceLock, preferably encapsulated in a function as shown above.
- If you create a large number of globals and want to avoid the boilerplate of encapsulating each in a function, use once_cell::sync::Lazy. That type is likely to be stabilized in some form, which makes it preferable over lazy_static. There are no good reasons to use lazy_static in new code.
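The first two recommendations can be put side by side in a short sketch. The Config struct, the level_names table, and the accessor functions are hypothetical and exist only for this illustration; the two static declarations are the ones named in the list:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Mutex, OnceLock};

/// Hypothetical configuration type, used only for this illustration.
#[derive(Clone, Debug, PartialEq)]
pub struct Config {
    pub verbose: bool,
}

// Case 1: const constructor plus thread-safe interior mutability,
// so a plain static is enough.
static SHOULD_LOG: AtomicBool = AtomicBool::new(true);
static CURRENT_CONFIG: Mutex<Option<Config>> = Mutex::new(None);

pub fn set_should_log(on: bool) {
    SHOULD_LOG.store(on, Ordering::Relaxed);
}

pub fn should_log() -> bool {
    SHOULD_LOG.load(Ordering::Relaxed)
}

pub fn set_config(config: Config) {
    *CURRENT_CONFIG.lock().unwrap() = Some(config);
}

pub fn current_config() -> Option<Config> {
    CURRENT_CONFIG.lock().unwrap().clone()
}

// Case 2: a populated HashMap cannot be built in a const context and is
// read-only after creation, so it goes into a OnceLock inside a function.
pub fn level_names() -> &'static HashMap<u8, &'static str> {
    static LEVEL_NAMES: OnceLock<HashMap<u8, &'static str>> = OnceLock::new();
    LEVEL_NAMES.get_or_init(|| HashMap::from([(0, "error"), (1, "warn"), (2, "info")]))
}
```

Reads of level_names() after the first call need no lock, while CURRENT_CONFIG pays for a lock on every access in exchange for being mutable at any time.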
Note that existing code that uses once_cell or lazy_static doesn’t require immediate attention. Those crates will remain available indefinitely, and they generate nearly identical assembly to that of the standard library’s OnceLock. The above recommendations are meant to guide your decisions regarding new code, or regarding code you’re refactoring anyway.