Josh Matthews, @lastontheboat
If you're uncertain about writing Rust code, consider pairing with someone more experienced!
lib.h | lib.rs |
---|---|
|
|
Unsafe, low-level bindings | Safe, high-level bindings |
---|---|
|
|
The goal: encapsulate unsafety behind a safe API barrier.
Use the libc crate for maximum portability:
use libc::{c_void, c_char, c_short, c_uint, c_float};
Most primitive C types only require casting the equivalent Rust value.
lib.h
|
lib.rs
|
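As a minimal sketch of that casting, assume a hypothetical C function `int add_offset(int base, unsigned int offset);`. It is stood in here by a Rust function with the C ABI so the example runs without linking anything, and the C type aliases come from `std::os::raw` rather than the `libc` crate to keep the block self-contained.

```rust
use std::os::raw::{c_int, c_uint};

mod ffi {
    use std::os::raw::{c_int, c_uint};

    // Hypothetical stand-in for a C function declared as
    // `int add_offset(int base, unsigned int offset);`.
    pub unsafe extern "C" fn add_offset(base: c_int, offset: c_uint) -> c_int {
        base.wrapping_add(offset as c_int)
    }
}

/// Safe wrapper: casts the Rust integers to the C types and back.
pub fn add_offset(base: i32, offset: u32) -> i32 {
    unsafe { ffi::add_offset(base as c_int, offset as c_uint) as i32 }
}
```

Callers only ever see `i32` and `u32`; the C aliases stay inside the binding layer.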
Strings in C are null-terminated:
const char* s = "Rust Belt Rust";
Strings in Rust are more complicated:
let s = &"Rust Belt Rust";
0x8000 = 0x9000 (pointer to the bytes)
0x8004 = 14 (length)
0x9000 = "Rust"
0x9004 = " Bel"
0x9008 = "t Ru"
0x900C = "st" (string ends here)
No null terminator in memory.
// Panic if string contains a null byte
let s = CString::new("Rust Belt Rust").unwrap();
Allocates enough memory for string contents + null terminator.
Use s.as_ptr() to pass a pointer to C functions.
let s = CString::new("Rust Belt Rust").unwrap();
let s_ptr = s.as_ptr();
unsafe {
ffi::puts(s_ptr);
}
let s = CString::new("Rust Belt Rust").unwrap();
let s_ptr = s.as_ptr();
drop(s);
unsafe {
ffi::puts(s_ptr);
}
Warning: pointer is invalid after destruction.
This is safe:
unsafe {
ffi::puts(CString::new("Rust!").unwrap().as_ptr());
}
But this is not:
unsafe {
let s = CString::new("Rust!").unwrap().as_ptr();
ffi::puts(s);
}
Temporary values only last as long as the expression context. As function arguments, that is the duration of the call.
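One way to stay clear of the dangling-pointer case is to bind the `CString` to a named variable that outlives the call. The sketch below assumes a hypothetical C function `c_strlen` that reads a null-terminated string; it is written in Rust with the C ABI purely so the example is runnable.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

mod ffi {
    use std::os::raw::c_char;

    // Hypothetical stand-in for a C function that walks a
    // null-terminated string, like `size_t strlen(const char*)`.
    pub unsafe extern "C" fn c_strlen(s: *const c_char) -> usize {
        let mut len = 0;
        while unsafe { *s.add(len) } != 0 {
            len += 1;
        }
        len
    }
}

/// Returns the length C would see, or None if `s` contains a null byte.
pub fn c_length(s: &str) -> Option<usize> {
    // Bind the CString to a variable so it outlives the raw pointer.
    let owned = CString::new(s).ok()?;
    let ptr: *const c_char = owned.as_ptr();
    let len = unsafe { ffi::c_strlen(ptr) };
    // `owned` is dropped here, after the call has finished.
    Some(len)
}
```

Returning `None` instead of unwrapping is a design choice for this sketch; the slides panic on interior null bytes.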
Sometimes you need static strings, or worry-free literals.
unsafe {
ffi::puts(b"Rust Belt Rust\0".as_ptr() as *const c_char);
}
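The type of a byte string literal is &'static [u8; N], which coerces to &[u8]; as_ptr() plus a cast yields the *const c_char that C expects.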
Warning: it's easy to forget the null-terminator.
CString is for Rust->C. For C->Rust, we need CStr.
unsafe {
let s_ptr: *const c_char = ffi::getenv(b"PATH\0".as_ptr() as *const c_char);
let s: &CStr = CStr::from_ptr(s_ptr);
}
Warning: pointer has no meaningful lifetime.
Either copy the value or use it with caution.
Copying the value from a CStr into a Rust string is straightforward:
unsafe {
let s_ptr: *const c_char = ffi::getenv(b"PATH\0".as_ptr() as *const c_char);
let s: &CStr = CStr::from_ptr(s_ptr);
let s: &str = s.to_str().unwrap();
let s: String = s.to_owned();
}
CStr::to_string_lossy() can avoid the intermediate Result.
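A sketch of the C->Rust direction that also guards against a null return, which `CStr::from_ptr` must never be given. The `lookup` function is a hypothetical stand-in for a C API like `getenv`, written in Rust so the block runs on its own.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

mod ffi {
    use std::os::raw::c_char;
    use std::ptr;

    // Hypothetical stand-in for a C lookup function such as `getenv`:
    // returns a pointer to a static string, or null when nothing is found.
    pub unsafe extern "C" fn lookup(known: bool) -> *const c_char {
        if known {
            b"/usr/bin\0".as_ptr() as *const c_char
        } else {
            ptr::null()
        }
    }
}

/// Copies the C string into an owned Rust String, or returns None on null.
pub fn lookup(known: bool) -> Option<String> {
    let ptr = unsafe { ffi::lookup(known) };
    if ptr.is_null() {
        return None;
    }
    // The borrowed CStr is only used inside this function; the copy
    // made by `into_owned` is what escapes.
    let borrowed = unsafe { CStr::from_ptr(ptr) };
    Some(borrowed.to_string_lossy().into_owned())
}
```

Copying immediately means the caller never depends on how long the C side keeps the memory alive.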
C APIs often deal with pointers and explicit length values.
lib.h
|
lib.rs
|
Extracting these from Rust slices is straightforward:
let ints = &mut [0, 0, 0, 0];
unsafe {
ffi::fill_int_buffer(ints.as_mut_ptr(),
ints.len() as c_uint);
}
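The same call can be hidden behind a safe function that takes a slice. This sketch assumes the C signature is `void fill_int_buffer(int* buf, unsigned int len);` and that it writes 0, 1, 2, ... into the buffer; both the signature and the behaviour are assumptions, and the C function is stood in by Rust code.

```rust
use std::convert::TryFrom;
use std::os::raw::c_uint;

mod ffi {
    use std::os::raw::{c_int, c_uint};

    // Hypothetical stand-in for
    // `void fill_int_buffer(int* buf, unsigned int len);`
    // which writes 0, 1, 2, ... into the buffer.
    pub unsafe extern "C" fn fill_int_buffer(buf: *mut c_int, len: c_uint) {
        for i in 0..len as usize {
            unsafe { *buf.add(i) = i as c_int };
        }
    }
}

/// Safe wrapper: the slice supplies both the pointer and the length,
/// so the two can never disagree.
pub fn fill(buf: &mut [i32]) {
    // Refuse lengths that do not fit in the C length type instead of truncating.
    let len = c_uint::try_from(buf.len()).expect("buffer too long for the C API");
    unsafe { ffi::fill_int_buffer(buf.as_mut_ptr(), len) }
}
```

Using `try_from` rather than `as` avoids silently truncating the length on platforms where `usize` is wider than `unsigned int`.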
We can transform Vec<T>/&[T] into *const T with as_ptr().
Vec<Vec<T>>/&[&[T]] requires allocating an intermediate vector.
C APIs that accept pointers expect a pointer to contiguous memory.
There is no contiguous memory containing inner pointers: Vec<Vec<T>> is different from Vec<*const T>.
let mut outer = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
let mut outer_c: Vec<*mut c_float> =
outer.iter_mut()
.map(|inner| inner.as_mut_ptr())
.collect();
unsafe {
scale_2x2_matrix(outer_c.as_mut_ptr(), 2.0);
}
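A runnable version of the intermediate-vector idea, assuming the C signature is `void scale_2x2_matrix(float** rows, float factor);`. The signature is a guess from the call above, and the C function is replaced by a Rust stand-in.

```rust
use std::os::raw::c_float;

mod ffi {
    use std::os::raw::c_float;

    // Hypothetical stand-in for
    // `void scale_2x2_matrix(float** rows, float factor);`.
    pub unsafe extern "C" fn scale_2x2_matrix(rows: *mut *mut c_float, factor: c_float) {
        for r in 0..2 {
            let row = unsafe { *rows.add(r) };
            for c in 0..2 {
                unsafe { *row.add(c) *= factor };
            }
        }
    }
}

/// Safe wrapper: checks the shape, then builds the contiguous array of
/// row pointers that the C side expects.
pub fn scale(matrix: &mut [Vec<f32>], factor: f32) {
    assert!(matrix.len() == 2 && matrix.iter().all(|row| row.len() == 2));
    let mut row_ptrs: Vec<*mut c_float> =
        matrix.iter_mut().map(|row| row.as_mut_ptr()).collect();
    // `row_ptrs` and `matrix` both stay alive until the call returns.
    unsafe { ffi::scale_2x2_matrix(row_ptrs.as_mut_ptr(), factor) }
}
```

The shape assertion matters: the C side has no way to know how long each row is.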
cd exercise1; cargo build && cargo test
Using src/library.h, fill in the ffi module in src/lib.rs.
Write safe bindings in src/lib.rs that wrap the FFI functions.
The safe bindings should accept Rust types as arguments and convert them as appropriate in order to invoke FFI functions.
See hints in comments; see working code in src/solution.rs.
What should get_library_version safely return?
Next on the agenda - complex types!
C structures require very little modification for use with FFI.
lib.h | lib.rs |
---|---|
|
|
Sometimes C headers only provide forward declarations as part of APIs.
struct image_t;
struct image_t* image_create();
unsigned int image_get_width(struct image_t*);
unsigned int image_get_height(struct image_t*);
This does not pose a problem for Rust bindings:
struct image_t(c_void);
extern "C" {
fn image_create() -> *mut image_t;
fn image_get_width(image: *const image_t) -> c_uint;
fn image_get_height(image: *const image_t) -> c_uint;
}
As for users of the C library, Rust code cannot construct these values directly.
If a C API provides construction and deletion functions, this maps nicely to Rust's unique ownership model.
struct image_t;
struct image_t* image_create(unsigned int w,
unsigned int h);
void image_free(struct image_t* image);
The low level types are unsurprising:
struct image_t(c_void);
extern "C" {
fn image_create(w: c_uint, h: c_uint) -> *mut image_t;
fn image_free(image: *mut image_t);
}
We can encapsulate low-level types in high-level wrappers:
struct Image {
ptr: *mut image_t,
}
There is no need to duplicate any state from the low-level type.
The purpose of the wrapper is to mediate access to the unsafe pointer.
We can follow Rust construction idioms:
impl Image {
pub fn new(w: usize, h: usize) -> Image {
Image {
ptr: unsafe { ffi::image_create(w as c_uint, h as c_uint) },
}
}
}
Let's ensure that the memory does not leak or get used unsafely:
impl Drop for Image {
fn drop(&mut self) {
unsafe {
ffi::image_free(self.ptr);
}
}
}
We also need to wrap any related APIs that require raw pointers:
impl Image {
pub fn width(&self) -> usize {
unsafe { ffi::image_get_width(self.ptr) as usize }
}
pub fn height(&self) -> usize {
unsafe { ffi::image_get_height(self.ptr) as usize }
}
}
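Putting the pieces together, here is a self-contained sketch of the unique-ownership wrapper. The `ffi` module is a stand-in written in Rust (a real binding would declare the C functions instead), its struct fields are illustrative only, and the constructor returns `Option<Image>` so that a null result from `image_create` never becomes a usable `Image`.

```rust
use std::os::raw::c_uint;

#[allow(non_camel_case_types)]
mod ffi {
    use std::os::raw::c_uint;
    use std::ptr;

    // Stand-in for the opaque C type; real bindings would not see fields.
    #[repr(C)]
    pub struct image_t { w: c_uint, h: c_uint }

    // Hypothetical stand-ins for the C functions, written in Rust.
    pub unsafe extern "C" fn image_create(w: c_uint, h: c_uint) -> *mut image_t {
        if w == 0 || h == 0 {
            return ptr::null_mut(); // simulate an allocation failure
        }
        Box::into_raw(Box::new(image_t { w, h }))
    }
    pub unsafe extern "C" fn image_free(image: *mut image_t) {
        drop(unsafe { Box::from_raw(image) });
    }
    pub unsafe extern "C" fn image_get_width(image: *const image_t) -> c_uint {
        unsafe { (*image).w }
    }
    pub unsafe extern "C" fn image_get_height(image: *const image_t) -> c_uint {
        unsafe { (*image).h }
    }
}

pub struct Image {
    ptr: *mut ffi::image_t,
}

impl Image {
    /// Returns None when the underlying constructor yields a null pointer.
    pub fn new(w: u32, h: u32) -> Option<Image> {
        let ptr = unsafe { ffi::image_create(w as c_uint, h as c_uint) };
        if ptr.is_null() { None } else { Some(Image { ptr }) }
    }
    pub fn width(&self) -> usize {
        unsafe { ffi::image_get_width(self.ptr) as usize }
    }
    pub fn height(&self) -> usize {
        unsafe { ffi::image_get_height(self.ptr) as usize }
    }
}

impl Drop for Image {
    fn drop(&mut self) {
        // Runs exactly once per Image, so the C object is freed exactly once.
        unsafe { ffi::image_free(self.ptr) }
    }
}
```

Because `ptr` is private and only set from a non-null result, every method can assume the pointer is valid.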
Beware of null pointers!
What if image_create can't allocate? Either:
return Option<Image> from Image::new, or
Some APIs define types that have shared instead of unique ownership:
struct image_t;
struct image_t* image_create(unsigned int w,
unsigned int h);
void image_addref(struct image_t* image);
void image_release(struct image_t* image);
The FFI declarations remain unsurprising:
struct image_t(c_void);
extern "C" {
fn image_create(w: c_uint, h: c_uint) -> *mut image_t;
fn image_addref(image: *mut image_t);
fn image_release(image: *mut image_t);
}
Our constructor needs some care:
impl Image {
pub fn new(w: usize, h: usize) -> Image {
let ptr = unsafe { ffi::image_create(w as c_uint, h as c_uint) };
unsafe { ffi::image_addref(ptr) };
Image {
ptr: ptr,
}
}
}
Our destructor looks very similar:
impl Drop for Image {
fn drop(&mut self) {
unsafe {
ffi::image_release(self.ptr);
}
}
}
And now something new - enabling shared ownership:
impl Clone for Image {
fn clone(&self) -> Image {
unsafe { ffi::image_addref(self.ptr); }
Image {
ptr: self.ptr
}
}
}
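The shared-ownership wrapper can be exercised end to end with a stand-in library. This sketch assumes, as the constructor above implies, that `image_create` returns an object with no references yet; `image_refcount` is a hypothetical helper that exists only so the example can observe the count.

```rust
use std::os::raw::c_uint;

#[allow(non_camel_case_types)]
mod ffi {
    use std::os::raw::c_uint;

    // Stand-in for the opaque C type; the counter is illustrative only.
    #[repr(C)]
    pub struct image_t { refs: c_uint }

    // Assumption: a freshly created image holds zero references.
    pub unsafe extern "C" fn image_create(_w: c_uint, _h: c_uint) -> *mut image_t {
        Box::into_raw(Box::new(image_t { refs: 0 }))
    }
    pub unsafe extern "C" fn image_addref(image: *mut image_t) {
        unsafe { (*image).refs += 1 };
    }
    pub unsafe extern "C" fn image_release(image: *mut image_t) {
        unsafe {
            (*image).refs -= 1;
            if (*image).refs == 0 {
                drop(Box::from_raw(image));
            }
        }
    }
    // Hypothetical helper, not part of the API shown in the slides.
    pub unsafe extern "C" fn image_refcount(image: *const image_t) -> c_uint {
        unsafe { (*image).refs }
    }
}

pub struct Image {
    ptr: *mut ffi::image_t,
}

impl Image {
    pub fn new(w: u32, h: u32) -> Image {
        let ptr = unsafe { ffi::image_create(w as c_uint, h as c_uint) };
        unsafe { ffi::image_addref(ptr) };
        Image { ptr }
    }
    pub fn ref_count(&self) -> u32 {
        unsafe { ffi::image_refcount(self.ptr) as u32 }
    }
}

impl Clone for Image {
    fn clone(&self) -> Image {
        // Each Rust handle owns exactly one reference.
        unsafe { ffi::image_addref(self.ptr) };
        Image { ptr: self.ptr }
    }
}

impl Drop for Image {
    fn drop(&mut self) {
        unsafe { ffi::image_release(self.ptr) }
    }
}
```

If the real library instead returns an object that already holds one reference, the extra `image_addref` in `new` would leak, so the library documentation decides which form is right.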
Who is responsible for creating and destroying values?
Can those responsibilities be associated with Rust's lifetimes?
Keep unsafe pointers hidden behind safe interfaces.
cd exercise2; cargo build && cargo test
Using the ffi module in src/lib.rs, write safe wrappers for the string_t and slice_t types.
Write a safe wrapper for the substring function. Provide a more meaningful return type than int.
See hints in comments; see working code in src/solution.rs.
Is your safe substring more ergonomic than the C one?
Next up - allocation and iteration!
Don't let C code free memory allocated by Rust.
Be careful when freeing memory allocated by C.
Allocator mismatch errors are frustrating.
So far we've only lent memory to C within a single stack frame.
Longer loans need well-defined endpoints for reclaiming the memory.
For strings: CString::into_raw pairs with CString::from_raw.
For arbitrary values: Box::into_raw matches Box::from_raw.
These will leak memory if used unevenly!
// Set the prefix that will be added to any subsequent
// prints. The string will be copied for internal use.
void set_print_prefix(const char* prefix);
versus
// Set the prefix that will be added to any subsequent
// prints. String must remain valid during all prints.
void set_print_prefix(const char* prefix);
This implementation for set_print_prefix only works if the string is copied. Otherwise, the pointer is invalid after returning.
pub fn set_print_prefix(prefix: &str) {
let s = CString::new(prefix).unwrap();
unsafe {
ffi::set_print_prefix(s.as_ptr());
}
}
Solution: transfer ownership using into_raw and from_raw
let s = CString::new("Rust Belt Rust").unwrap();
unsafe {
ffi::set_print_prefix(s.into_raw());
}
unsafe {
let _s = CString::from_raw(ffi::get_print_prefix());
}
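A sketch of a balanced `into_raw`/`from_raw` pair wrapped in safe functions. The stand-in library keeps the pointer without copying (the second contract above); `take_print_prefix` is a hypothetical name for the reclaiming side, and the static storage exists only to imitate C library state.

```rust
use std::ffi::CString;

mod ffi {
    use std::os::raw::c_char;
    use std::ptr;
    use std::sync::atomic::{AtomicPtr, Ordering};

    // Stand-in for C library state that stores the pointer without copying.
    static PREFIX: AtomicPtr<c_char> = AtomicPtr::new(ptr::null_mut());

    pub unsafe extern "C" fn set_print_prefix(prefix: *const c_char) {
        PREFIX.store(prefix as *mut c_char, Ordering::SeqCst);
    }
    pub unsafe extern "C" fn get_print_prefix() -> *mut c_char {
        PREFIX.load(Ordering::SeqCst)
    }
}

/// Lends the string to the library, reclaiming any previously lent prefix.
pub fn set_print_prefix(prefix: &str) {
    let new = CString::new(prefix).expect("prefix contains a null byte");
    unsafe {
        let old = ffi::get_print_prefix();
        // Ownership moves to the library here...
        ffi::set_print_prefix(new.into_raw());
        if !old.is_null() {
            // ...and the previous loan ends here.
            drop(CString::from_raw(old));
        }
    }
}

/// Ends the loan and returns the string to Rust ownership.
pub fn take_print_prefix() -> Option<String> {
    unsafe {
        let ptr = ffi::get_print_prefix();
        if ptr.is_null() {
            return None;
        }
        ffi::set_print_prefix(std::ptr::null());
        CString::from_raw(ptr).into_string().ok()
    }
}
```

Every `into_raw` has exactly one matching `from_raw`, and the pointer passed to `from_raw` is always one that `into_raw` produced.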
At its core, Rust's support for iteration rests on two things.
The Iterator trait:
trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
}
This defines a protocol which declares the type of item returned, and how to obtain the next item.
The other important ingredient is that iterator types typically borrow the value being iterated, to prevent it from being modified or destroyed.
struct MatrixIterator<'a> {
matrix: &'a Matrix,
index: usize,
}
Iterators retain all state necessary to obtain the next item when requested.
An iterator type implements the Iterator trait:
impl<'a> Iterator for MatrixIterator<'a> {
type Item = f32;
fn next(&mut self) -> Option<f32> {
self.index += 1;
match self.index - 1 {
// Item is f32, so return the values by copy rather than by reference
0 => Some(self.matrix.m0),
1 => Some(self.matrix.m1),
_ => None
}
}
}
Finally some other code constructs an iterator value:
impl Matrix {
pub fn iter(&self) -> MatrixIterator {
MatrixIterator {
matrix: self,
index: 0,
}
}
}
Suddenly it Just Works:
for val in matrix.iter().map(|v| v * 2.0) {
println!("{}", val);
}
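Assembled into one compilable unit, the iterator looks roughly like this. The `Matrix` definition with two `f32` fields is an assumption inferred from the `m0` and `m1` accesses above, and `doubled` is a hypothetical helper added to show the adaptor chain.

```rust
// Assumed shape of Matrix, inferred from the fields used by the iterator.
pub struct Matrix {
    pub m0: f32,
    pub m1: f32,
}

pub struct MatrixIterator<'a> {
    matrix: &'a Matrix,
    index: usize,
}

impl<'a> Iterator for MatrixIterator<'a> {
    type Item = f32;

    fn next(&mut self) -> Option<f32> {
        let item = match self.index {
            0 => Some(self.matrix.m0),
            1 => Some(self.matrix.m1),
            _ => None,
        };
        self.index += 1;
        item
    }
}

impl Matrix {
    // The returned iterator borrows `self`, so the matrix cannot be
    // modified or dropped while iteration is in progress.
    pub fn iter(&self) -> MatrixIterator<'_> {
        MatrixIterator { matrix: self, index: 0 }
    }
}

/// Hypothetical helper: every standard adaptor works once `next` exists.
pub fn doubled(matrix: &Matrix) -> Vec<f32> {
    matrix.iter().map(|v| v * 2.0).collect()
}
```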
cd exercise3; cargo build && cargo test
Using the ffi module in src/lib.rs, write a safe wrapper for the query_result_t type.
query_result_t type and back.
See hints in comments; see working code in src/solution.rs.
get_nth_result to be called without performing bounds checks?
Next up - calling from Rust -> C -> Rust!
Function pointers in Rust look similar to function declarations.
fn div2(v: u32) -> f32 { v as f32 / 2.0 }
let f: fn(u32) -> f32;
f = div2;
assert!(f(5) == div2(5));
C callbacks are Rust function pointers with more extern and unsafe.
lib.h
|
lib.rs
|
unsafe extern "C" fn null_allocator(_: c_uint) -> *mut c_void {
ptr::null_mut()
}
// Pass a pointer to a Rust function,
ffi::init(Some(null_allocator));
// or a null pointer...
ffi::init(None);
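A runnable sketch of a nullable C callback. The original declarations are not shown here, so this assumes `typedef void* (*allocator_t)(unsigned int);` and a hypothetical `int init(allocator_t);` whose return codes (0, -1, 1) are invented for the example; the C side is a Rust stand-in.

```rust
use std::os::raw::{c_int, c_uint, c_void};
use std::ptr;

#[allow(non_camel_case_types)]
mod ffi {
    use std::os::raw::{c_int, c_uint, c_void};

    // A nullable C function pointer maps to Option<extern "C" fn>.
    pub type allocator_t = Option<unsafe extern "C" fn(c_uint) -> *mut c_void>;

    // Hypothetical stand-in for `int init(allocator_t allocator);`.
    // Returns 0 with no allocator, -1 if the allocator returned null,
    // and 1 if it returned memory.
    pub unsafe extern "C" fn init(allocator: allocator_t) -> c_int {
        match allocator {
            None => 0,
            Some(alloc) => {
                let memory = unsafe { alloc(16) };
                if memory.is_null() { -1 } else { 1 }
            }
        }
    }
}

/// A callback with the C ABI that always fails to allocate.
pub unsafe extern "C" fn null_allocator(_: c_uint) -> *mut c_void {
    ptr::null_mut()
}

/// Safe wrapper around the hypothetical `init`.
pub fn init(allocator: ffi::allocator_t) -> c_int {
    unsafe { ffi::init(allocator) }
}
```

`Option<unsafe extern "C" fn(..)>` has the same size as a plain function pointer, with `None` represented as null, which is why it can cross the boundary directly.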
Callbacks from C APIs will receive low-level arguments. Our bindings need a translation layer.
There are two common mechanisms for this:
C API for per-context callback:
void* context_get_private(struct context_t*);
void context_set_private(struct context_t*, void*);
typedef void (*operation_callback_t)(struct context_t*);
void context_set_callback(struct context_t*,
operation_callback_t);
Rust FFI declarations:
fn context_get_private(cx: *mut context_t) -> *mut c_void;
fn context_set_private(cx: *mut context_t, data: *mut c_void);
type operation_callback_t =
Option<unsafe extern "C" fn(*mut context_t)>;
fn context_set_callback(cx: *mut context_t,
cb: operation_callback_t);
Associate high-level state with low-level data in the constructor:
struct ContextPrivate(u32);
let cx = Context { ptr: unsafe { ffi::context_create() } };
let prv = Box::new(ContextPrivate(0));
unsafe {
ffi::context_set_private(cx.ptr,
Box::into_raw(prv) as *mut c_void)
}
Invoke high-level callbacks with high-level state:
unsafe extern "C" fn callback(cx: *mut context_t) {
let p = ffi::context_get_private(cx) as *mut ContextPrivate;
rust_callback(&mut *p);
}
unsafe {
ffi::context_set_callback(cx.ptr, Some(callback))
}
High level callback (no hint of unsafety):
struct ContextPrivate(u32);
fn rust_callback(data: &mut ContextPrivate) {
// Count the number of times invoked
data.0 += 1;
}
Clean up in the associated destructor:
impl Drop for Context {
fn drop(&mut self) {
let p = unsafe { ffi::context_get_private(self.ptr) };
let _ = unsafe { Box::from_raw(p as *mut ContextPrivate) };
...
}
}
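The whole private-data pattern, from constructor to destructor, can be sketched in one runnable piece. The `ffi` module is a Rust stand-in for the C library; `context_create`, `context_destroy` and `context_run` (which triggers the callback) are assumed names, and the struct fields are illustrative only.

```rust
use std::os::raw::c_void;

#[allow(non_camel_case_types)]
mod ffi {
    use std::os::raw::c_void;
    use std::ptr;

    pub type operation_callback_t = Option<unsafe extern "C" fn(*mut context_t)>;

    // Stand-in for the opaque C context.
    #[repr(C)]
    pub struct context_t {
        private: *mut c_void,
        callback: operation_callback_t,
    }

    pub unsafe extern "C" fn context_create() -> *mut context_t {
        Box::into_raw(Box::new(context_t { private: ptr::null_mut(), callback: None }))
    }
    pub unsafe extern "C" fn context_destroy(cx: *mut context_t) {
        drop(unsafe { Box::from_raw(cx) });
    }
    pub unsafe extern "C" fn context_get_private(cx: *mut context_t) -> *mut c_void {
        unsafe { (*cx).private }
    }
    pub unsafe extern "C" fn context_set_private(cx: *mut context_t, data: *mut c_void) {
        unsafe { (*cx).private = data };
    }
    pub unsafe extern "C" fn context_set_callback(cx: *mut context_t, cb: operation_callback_t) {
        unsafe { (*cx).callback = cb };
    }
    // Hypothetical operation that invokes the registered callback.
    pub unsafe extern "C" fn context_run(cx: *mut context_t) {
        if let Some(cb) = unsafe { (*cx).callback } {
            unsafe { cb(cx) };
        }
    }
}

struct ContextPrivate(u32);

// High-level callback: no raw pointers in sight.
fn rust_callback(data: &mut ContextPrivate) {
    data.0 += 1;
}

// Translation layer: recovers the high-level state from the C context.
unsafe extern "C" fn callback(cx: *mut ffi::context_t) {
    let p = unsafe { ffi::context_get_private(cx) } as *mut ContextPrivate;
    rust_callback(unsafe { &mut *p });
}

pub struct Context {
    ptr: *mut ffi::context_t,
}

impl Context {
    pub fn new() -> Context {
        let ptr = unsafe { ffi::context_create() };
        let prv = Box::new(ContextPrivate(0));
        unsafe {
            ffi::context_set_private(ptr, Box::into_raw(prv) as *mut c_void);
            ffi::context_set_callback(ptr, Some(callback));
        }
        Context { ptr }
    }
    pub fn run(&mut self) {
        unsafe { ffi::context_run(self.ptr) }
    }
    pub fn calls(&self) -> u32 {
        let p = unsafe { ffi::context_get_private(self.ptr) } as *const ContextPrivate;
        unsafe { (*p).0 }
    }
}

impl Drop for Context {
    fn drop(&mut self) {
        unsafe {
            // Reclaim the Box that `new` handed to the C side, then the context.
            let p = ffi::context_get_private(self.ptr) as *mut ContextPrivate;
            drop(Box::from_raw(p));
            ffi::context_destroy(self.ptr);
        }
    }
}
```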
Sometimes more dynamic function pointers are desirable.
Closures in Rust pair code with a captured environment (a boxed closure trait object is two pointers), so we cannot just use them in place of C function pointers.
We can build on the previous pattern to enable high-level APIs that run closures when C callbacks are invoked.
Storing closure trait objects in the private data allows subsequent execution:
type OperationCallback = Box<dyn FnMut(&mut ContextState)>;
struct ContextState(u32);
struct ContextPrivate {
operation: OperationCallback,
state: ContextState,
}
Some careful partitioning of the members of the private state may be necessary to satisfy the borrow checker:
unsafe extern "C" fn callback(cx: *mut context_t) {
let p = ffi::context_get_private(cx) as *mut ContextPrivate;
let p = &mut *p;
(p.operation)(&mut p.state);
}
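The closure variant, sketched against the same kind of Rust stand-in for the C library. `context_create`, `context_destroy` and `context_run` are assumed names; the point is that the boxed closure lives in the private data and a single `extern "C"` trampoline dispatches to it.

```rust
use std::os::raw::c_void;

#[allow(non_camel_case_types)]
mod ffi {
    use std::os::raw::c_void;
    use std::ptr;

    pub type operation_callback_t = Option<unsafe extern "C" fn(*mut context_t)>;

    // Stand-in for the opaque C context.
    #[repr(C)]
    pub struct context_t { private: *mut c_void, callback: operation_callback_t }

    pub unsafe extern "C" fn context_create() -> *mut context_t {
        Box::into_raw(Box::new(context_t { private: ptr::null_mut(), callback: None }))
    }
    pub unsafe extern "C" fn context_destroy(cx: *mut context_t) {
        drop(unsafe { Box::from_raw(cx) });
    }
    pub unsafe extern "C" fn context_get_private(cx: *mut context_t) -> *mut c_void {
        unsafe { (*cx).private }
    }
    pub unsafe extern "C" fn context_set_private(cx: *mut context_t, data: *mut c_void) {
        unsafe { (*cx).private = data };
    }
    pub unsafe extern "C" fn context_set_callback(cx: *mut context_t, cb: operation_callback_t) {
        unsafe { (*cx).callback = cb };
    }
    // Hypothetical operation that invokes the registered callback.
    pub unsafe extern "C" fn context_run(cx: *mut context_t) {
        if let Some(cb) = unsafe { (*cx).callback } {
            unsafe { cb(cx) };
        }
    }
}

pub struct ContextState(pub u32);
type OperationCallback = Box<dyn FnMut(&mut ContextState)>;

struct ContextPrivate {
    operation: OperationCallback,
    state: ContextState,
}

// Trampoline: one plain C function pointer serves every closure.
unsafe extern "C" fn callback(cx: *mut ffi::context_t) {
    let p = unsafe { &mut *(ffi::context_get_private(cx) as *mut ContextPrivate) };
    // `operation` and `state` are disjoint fields, so both borrows are allowed.
    (p.operation)(&mut p.state);
}

pub struct Context { ptr: *mut ffi::context_t }

impl Context {
    pub fn new<F>(operation: F) -> Context
    where
        F: FnMut(&mut ContextState) + 'static,
    {
        let ptr = unsafe { ffi::context_create() };
        let prv = Box::new(ContextPrivate {
            operation: Box::new(operation),
            state: ContextState(0),
        });
        unsafe {
            ffi::context_set_private(ptr, Box::into_raw(prv) as *mut c_void);
            ffi::context_set_callback(ptr, Some(callback));
        }
        Context { ptr }
    }
    pub fn run(&mut self) {
        unsafe { ffi::context_run(self.ptr) }
    }
    pub fn state(&self) -> u32 {
        unsafe { (*(ffi::context_get_private(self.ptr) as *const ContextPrivate)).state.0 }
    }
}

impl Drop for Context {
    fn drop(&mut self) {
        unsafe {
            drop(Box::from_raw(ffi::context_get_private(self.ptr) as *mut ContextPrivate));
            ffi::context_destroy(self.ptr);
        }
    }
}
```

The `'static` bound keeps the closure from borrowing data that could disappear while the C side still holds the pointer.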
One issue that crops up in callbacks is how to deal with passing arguments down to high-level callbacks.
unsafe extern "C" fn callback(cx: *mut context_t)
type OperationCallback = Box<dyn FnMut(&mut Context)>;
If Context is a Rust type with unique ownership semantics, it is not safe to synthesize an instance for the duration of the callback.
What we really need is to borrow the high-level type for the callback, but it may not be reachable from the callback's code.
Let's split the Context type into OwnedContext and BorrowedContext:
// Responsible for APIs that cannot destroy this context.
struct BorrowedContext { ptr: *mut ffi::context_t }
// Responsible only for APIs that destroy this context.
struct OwnedContext { cx: BorrowedContext }
There is a clear similarity to Vec<T>/[T], String/str, T/&T...
To minimize friction, we can delegate OwnedContext to BorrowedContext as appropriate:
impl Deref for OwnedContext {
type Target = BorrowedContext;
fn deref(&self) -> &BorrowedContext {
&self.cx
}
}
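A small end-to-end sketch of the split. The stand-in `ffi` functions (`context_create`, `context_destroy`, `context_get_id`) and the `id` accessor are hypothetical, chosen only to give the borrowed type one non-destroying API to expose.

```rust
use std::ops::Deref;

#[allow(non_camel_case_types)]
mod ffi {
    use std::os::raw::c_uint;

    // Stand-in for the opaque C context.
    #[repr(C)]
    pub struct context_t { id: c_uint }

    pub unsafe extern "C" fn context_create(id: c_uint) -> *mut context_t {
        Box::into_raw(Box::new(context_t { id }))
    }
    pub unsafe extern "C" fn context_destroy(cx: *mut context_t) {
        drop(unsafe { Box::from_raw(cx) });
    }
    pub unsafe extern "C" fn context_get_id(cx: *mut context_t) -> c_uint {
        unsafe { (*cx).id }
    }
}

// Responsible for APIs that cannot destroy this context.
// It has no Drop impl, so creating one never frees anything.
pub struct BorrowedContext {
    ptr: *mut ffi::context_t,
}

impl BorrowedContext {
    pub fn id(&self) -> u32 {
        unsafe { ffi::context_get_id(self.ptr) as u32 }
    }
}

// Responsible only for APIs that destroy this context.
pub struct OwnedContext {
    cx: BorrowedContext,
}

impl OwnedContext {
    pub fn new(id: u32) -> OwnedContext {
        let ptr = unsafe { ffi::context_create(id) };
        OwnedContext { cx: BorrowedContext { ptr } }
    }
}

impl Deref for OwnedContext {
    type Target = BorrowedContext;
    fn deref(&self) -> &BorrowedContext {
        &self.cx
    }
}

impl Drop for OwnedContext {
    fn drop(&mut self) {
        unsafe { ffi::context_destroy(self.cx.ptr) }
    }
}

/// A high-level callback only needs the borrowed view.
pub fn describe(cx: &BorrowedContext) -> String {
    format!("context #{}", cx.id())
}
```

Thanks to `Deref`, `&OwnedContext` coerces to `&BorrowedContext`, so owners can call `id()` and `describe()` without extra ceremony.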
cd exercise4; cargo build && cargo test
Using the ffi module in src/lib.rs, write a safe wrapper for logger_t.
See hints in comments; see working code in src/solution.rs.
Next up - the end!