[tor-commits] [tor/master] docs: More Rust coding standards, based on without boats' comments.
nickm at torproject.org
nickm at torproject.org
Mon Sep 4 15:45:01 UTC 2017
commit 12cf04646c571646b726e697d66ecafad7886cf2
Author: Isis Lovecruft <isis at torproject.org>
Date: Thu Aug 31 00:41:47 2017 +0000
docs: More Rust coding standards, based on without boats' comments.
---
doc/HACKING/CodingStandardsRust.md | 116 +++++++++++++++++++++++++++++++------
1 file changed, 99 insertions(+), 17 deletions(-)
diff --git a/doc/HACKING/CodingStandardsRust.md b/doc/HACKING/CodingStandardsRust.md
index fc4977ed9..1cce0d3d2 100644
--- a/doc/HACKING/CodingStandardsRust.md
+++ b/doc/HACKING/CodingStandardsRust.md
@@ -78,6 +78,12 @@ types/constants/objects/functions/methods, you SHOULD also include an
You MUST document your module with _module docstring_ comments,
i.e. `//!` at the beginning of each line.
+ Style
+-------
+
+You SHOULD consider breaking up large literal numbers with `_` when it makes it
+more human readable to do so, e.g. `let x: u64 = 100_000_000_000`.
+
Testing
---------
@@ -147,6 +153,14 @@ If you wish to fuzz parts of your code, please see the
[`cargo fuzz`](https://github.com/rust-fuzz/cargo-fuzz) crate, which uses
[libfuzzer-sys](https://github.com/rust-fuzz/libfuzzer-sys).
+ Whitespace & Formatting
+-------------------------
+
+You MUST run `rustfmt` (https://github.com/rust-lang-nursery/rustfmt)
+on your code before your code will be merged. You can install rustfmt
+by doing `cargo install rustfmt-nightly` and then run it with `cargo
+fmt`.
+
Safety
--------
@@ -162,6 +176,44 @@ Here are some additional bits of advice and rules:
an inline comment stating how the unwrap will either 1) never fail,
or 2) should fail (i.e. in a unittest).
+ You SHOULD NOT use `unwrap()` anywhere in which it is possible to handle the
+ potential error with either `expect()` or the eel operator, `?`.
+ For example, consider a function which parses a string into an integer:
+
+ fn parse_port_number(config_string: &str) -> u16 {
+ u16::from_str_radix(config_string, 10).unwrap()
+ }
+
+ There are numerous ways this can fail, and the `unwrap()` will cause the
+ whole program to byte the dust! Instead, either you SHOULD use `expect()`
+ (or another equivalent function which will return an `Option` or a `Result`)
+ and change the return type to be compatible:
+
+ fn parse_port_number(config_string: &str) -> Option<u16> {
+ u16::from_str_radix(config_string, 10).expect("Couldn't parse port into a u16")
+ }
+
+ or you SHOULD use `or()` (or another similar method):
+
+ fn parse_port_number(config_string: &str) -> Option<u16> {
+ u16::from_str_radix(config_string, 10).or(Err("Couldn't parse port into a u16")
+ }
+
+ Using methods like `or()` can be particularly handy when you must do
+ something afterwards with the data, for example, if we wanted to guarantee
+ that the port is high. Combining these methods with the eel operator (`?`)
+ makes this even easier:
+
+ fn parse_port_number(config_string: &str) -> Result<u16, Err> {
+ let port = u16::from_str_radix(config_string, 10).or(Err("Couldn't parse port into a u16"))?;
+
+ if port > 1024 {
+ return Ok(port);
+ } else {
+ return Err("Low ports not allowed");
+ }
+ }
+
2. `unsafe`
If you use `unsafe`, you MUST describe a contract in your
@@ -175,8 +227,8 @@ Here are some additional bits of advice and rules:
#[no_mangle]
pub unsafe extern "C" fn increment_and_combine_numbers(mut numbers: [u8; 4]) -> u32 {
- for index in 0..numbers.len() {
- numbers[index] += 1;
+ for number in &mut numbers {
+ *number += 1;
}
std::mem::transmute::<[u8; 4], u32>(numbers)
}
@@ -218,15 +270,14 @@ Here are some additional bits of advice and rules:
`from_raw()` and then deallocate in Rust can result in a
[memory leak](https://doc.rust-lang.org/std/ffi/struct.CString.html#method.into_raw).
- [It was determined](https://github.com/rust-lang/rust/pull/41074) that this
- is safe to do if you use the same allocator in C and Rust and also specify
- the memory alignment for CString (except that there is no way to specify
- the alignment for CString). It is believed that the alignment is always 1,
- which would mean it's safe to dealloc the resulting `*mut c_char` in Tor's
- C code. However, the Rust developers are not willing to guarantee the
- stability of, or a contract for, this behaviour, citing concerns that this
- is potentially extremely and subtly unsafe.
-
+ [It was determined](https://github.com/rust-lang/rust/pull/41074) that this
+ is safe to do if you use the same allocator in C and Rust and also specify
+ the memory alignment for CString (except that there is no way to specify
+ the alignment for CString). It is believed that the alignment is always 1,
+ which would mean it's safe to dealloc the resulting `*mut c_char` in Tor's
+ C code. However, the Rust developers are not willing to guarantee the
+ stability of, or a contract for, this behaviour, citing concerns that this
+ is potentially extremely and subtly unsafe.
4. Perform an allocation on the other side of the boundary
@@ -338,11 +389,42 @@ Here are some additional bits of advice and rules:
In fact, using `std::mem::transmute` for *any* reason is a code smell and as
such SHOULD be avoided.
+9. Casting integers with `as`
- Whitespace & Formatting
--------------------------
+ This is generally fine to do, but it has some behaviours which you should be
+ aware of. Casting down chops off the high bits, e.g.:
-You MUST run `rustfmt` (https://github.com/rust-lang-nursery/rustfmt)
-on your code before your code will be merged. You can install rustfmt
-by doing `cargo install rustfmt-nightly` and then run it with `cargo
-fmt`.
+ let x: u32 = 4294967295;
+ println!("{}", x as u16); // prints 65535
+
+ Some cases which you MUST NOT do include:
+
+ * Casting an `u128` down to an `f32` or vice versa (e.g.
+ `u128::MAX as f32` but this isn't only a problem with overflowing
+ as it is also undefined behaviour for `42.0f32 as u128`),
+
+ * Casting between integers and floats when the thing being cast
+ cannot fit into the type it is being casted into, e.g.:
+
+ println!("{}", 42949.0f32 as u8); // prints 197 in debug mode and 0 in release
+ println!("{}", 1.04E+17 as u8); // prints 0 in both modes
+ println!("{}", (0.0/0.0) as i64); // prints whatever the heck LLVM wants
+
+ Because this behaviour is undefined, it can even produce segfaults in
+ safe Rust code. For example, the following program built in release
+ mode segfaults:
+
+ #[inline(never)]
+ pub fn trigger_ub(sl: &[u8; 666]) -> &[u8] {
+ // Note that the float is out of the range of `usize`, invoking UB when casting.
+ let idx = 1e99999f64 as usize;
+ &sl[idx..] // The bound check is elided due to `idx` being of an undefined value.
+ }
+
+ fn main() {
+ println!("{}", trigger_ub(&[1; 666])[999999]); // ~ out of bound
+ }
+
+ And in debug mode panics with:
+
+ thread 'main' panicked at 'slice index starts at 140721821254240 but ends at 666', /checkout/src/libcore/slice/mod.rs:754:4
More information about the tor-commits
mailing list