As a library writer, it feels a bit strange, but refreshing, to write a program that actually has a main() function.
My experience with Rust so far has been threefold:
- Porting chunks of C to Rust for librsvg - this is all work on librsvg's internals and no users are exposed to it directly.
- Working on gnome-class, the procedural macro ("a little compiler") to generate GObject boilerplate from Rust. This feels like working on the edge of the exotic; it is something that runs in the Rust compiler and spits code on behalf of the programmer.
- A few patches to the gtk-rs ecosystem. Again, work on the internals, or something that feels library-like.
But other than toy programs to test things, I haven't written a stand-alone tool until rsvg-bench. It's quite a thrill to be able to just run the thing instead of waiting for other people to write code to use it!
Parsing command-line arguments
There are quite a few Rust crates ("libraries") to parse command-line arguments. I read about structopt via Robert O'Callahan's blog; structopt lets you define a struct to hold the values of your command-line options, and then you annotate the fields in that struct to indicate how they should be parsed from the command line.
It works via Rust's procedural macros. Internally it generates stuff
for the clap crate, a well-established mechanism for dealing with
command-line options.
And it is quite pleasant! This is basically all I needed to do:
use std::path::PathBuf;
use std::process;

use structopt::StructOpt;

#[derive(StructOpt, Debug)]
#[structopt(name = "rsvg-bench", about = "Benchmarking utility for librsvg.")]
struct Opt {
    #[structopt(short = "s",
                long = "sleep",
                help = "Number of seconds to sleep before starting to process SVGs",
                default_value = "0")]
    sleep_secs: usize,

    #[structopt(short = "p",
                long = "num-parse",
                help = "Number of times to parse each file",
                default_value = "100")]
    num_parse: usize,

    #[structopt(short = "r",
                long = "num-render",
                help = "Number of times to render each file",
                default_value = "100")]
    num_render: usize,

    #[structopt(long = "pixbuf",
                help = "Render to a GdkPixbuf instead of a Cairo image surface")]
    render_to_pixbuf: bool,

    #[structopt(help = "Input files or directories",
                parse(from_os_str))]
    inputs: Vec<PathBuf>
}

fn main() {
    let opt = Opt::from_args();

    // Bail out early if there is nothing to process.
    if opt.inputs.is_empty() {
        eprintln!("No input files or directories specified\n");
        process::exit(1);
    }

    ...
}
Each field in the Opt struct above corresponds to one command-line argument; each field has annotations for structopt to generate the appropriate code to parse each option. For example, the render_to_pixbuf field has a long option name called "pixbuf"; that field will be set to true if the --pixbuf option gets passed to rsvg-bench.
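To get a feel for what those annotations save you from writing, here is a minimal hand-rolled sketch that parses a subset of the same options with only the standard library. It is not what structopt or clap actually generate; the parse_args and parse_number helpers and the error strings are made up for illustration, and it expects the arguments without the program name.

```rust
use std::path::PathBuf;

// Hypothetical std-only stand-in for the derived parser; this is NOT
// structopt's real output, just the same idea written by hand.
#[derive(Debug, PartialEq)]
struct Opt {
    sleep_secs: usize,
    num_parse: usize,
    render_to_pixbuf: bool,
    inputs: Vec<PathBuf>,
}

// Hypothetical helper: fetch and parse the value that follows a flag.
fn parse_number(flag: &str, value: Option<String>) -> Result<usize, String> {
    let value = value.ok_or_else(|| format!("{} needs a value", flag))?;
    value
        .parse::<usize>()
        .map_err(|_| format!("{}: invalid number {:?}", flag, value))
}

// Parses the arguments that come after the program name.
fn parse_args<I>(args: I) -> Result<Opt, String>
where
    I: IntoIterator<Item = String>,
{
    // Same defaults as the `default_value` annotations in the struct above.
    let mut opt = Opt {
        sleep_secs: 0,
        num_parse: 100,
        render_to_pixbuf: false,
        inputs: Vec::new(),
    };

    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "-s" | "--sleep" => opt.sleep_secs = parse_number(&arg, args.next())?,
            "-p" | "--num-parse" => opt.num_parse = parse_number(&arg, args.next())?,
            // A boolean flag is simply "present" or "absent".
            "--pixbuf" => opt.render_to_pixbuf = true,
            other if other.starts_with('-') => {
                return Err(format!("unknown option: {}", other));
            }
            // Anything else is treated as an input path.
            other => opt.inputs.push(PathBuf::from(other)),
        }
    }
    Ok(opt)
}
```

Calling parse_args(std::env::args().skip(1)) would wire this up to the real command line. Even for four options the manual version is longer than the annotated struct, and it still lacks help output.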
Handling errors
Command-line programs generally have the luxury of being able to just exit as soon as they encounter an error.
In C this is a bit cumbersome since you need to deal with every place that may return an error, find out what to print, and call exit(1) by hand or something. If you miss a single place where an error is returned, your program will keep running with an inconsistent state.
In languages with exception handling, it's a bit easier - a small script can just let exceptions be thrown wherever, and if it catches them at the toplevel, it can just print the exception and abort gracefully. However, these nonlocal jumps make me uncomfortable; I think exceptions are hard to reason about.
Rust makes this easy: it forces you to handle every call that may return an error, but it lets you bubble errors up easily, or handle them in-place, or translate them to a higher-level error.
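The three options can be sketched with nothing but the standard library. The function names and the ConfigError type below are hypothetical, chosen only to show each style side by side; they are not from rsvg-bench.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical higher-level error for this sketch.
#[derive(Debug, PartialEq)]
struct ConfigError(String);

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "bad configuration: {}", self.0)
    }
}

// 1. Bubble the error up unchanged with `?`.
fn parse_count(text: &str) -> Result<usize, ParseIntError> {
    let n = text.trim().parse::<usize>()?;
    Ok(n)
}

// 2. Handle the error in place, here by falling back to a default.
fn parse_count_or(text: &str, default: usize) -> usize {
    match parse_count(text) {
        Ok(n) => n,
        Err(_) => default,
    }
}

// 3. Translate the low-level error into a higher-level one.
fn parse_config(text: &str) -> Result<usize, ConfigError> {
    parse_count(text).map_err(|e| ConfigError(format!("{:?}: {}", text, e)))
}
```

In all three cases the compiler knows the call can fail, since the failure is part of the return type; the only choice left is which of the three to do with it.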
In the Rust world the failure crate is getting a lot of traction as a convenient, modern way to handle errors.
In rsvg-bench, errors can come from several places:
- I/O errors when reading files and directories.
- Errors from librsvg's parsing stage; you get a GError.
- Errors from the rendering stage. This can be a Cairo error (a cairo_status_t), or a simple "something bad happened; can't render" from librsvg's old convenience API in C. Don't you hate it when C code just gives up and returns NULL or a boolean false, without any further details on what went wrong?
For rsvg-bench, I just needed to be able to represent Cairo errors and generic rendering errors. Everything else, like an io::Error, is automatically wrapped by the failure crate's mechanism. I just needed to do this:
extern crate failure;
#[macro_use]
extern crate failure_derive;

#[derive(Debug, Fail)]
enum ProcessingError {
    #[fail(display = "Cairo error: {:?}", status)]
    CairoError {
        status: cairo::Status
    },

    #[fail(display = "Rendering error")]
    RenderingError
}
Whenever the code gets a Cairo error, I can translate it to a ProcessingError::CairoError and bubble it up:
fn render_to_cairo(handle: &rsvg::Handle) -> Result<(), Error> {
    let dim = handle.get_dimensions();
    let surface = cairo::ImageSurface::create(cairo::Format::ARgb32,
                                              dim.width,
                                              dim.height)
        .map_err(|e| ProcessingError::CairoError { status: e })?;
    ...
}
And when librsvg returns a "couldn't render" error, I translate that to a ProcessingError::RenderingError:
fn render_to_cairo(handle: &rsvg::Handle) -> Result<(), Error> {
    ...
    let cr = cairo::Context::new(&surface);

    if handle.render_cairo(&cr) {
        Ok(())
    } else {
        Err(Error::from(ProcessingError::RenderingError))
    }
}
Here, the Ok() case of the Result does not contain any value; it's just (), as the generated images are not stored anywhere: they are just rendered to get some timings, not to be saved or anything.
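For comparison, here is roughly what the same error type looks like written by hand against the standard library, without the failure crate. This is a sketch under assumptions: Status is a made-up stand-in for cairo::Status, create_surface and render are hypothetical functions that only mimic the shape of the calls above, and Box<dyn Error> plays the role of failure's Error.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for cairo::Status, so the sketch builds without cairo-rs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    InvalidSize,
}

#[derive(Debug)]
enum ProcessingError {
    CairoError { status: Status },
    RenderingError,
}

// Roughly the boilerplate that #[fail(display = "...")] writes for you.
impl fmt::Display for ProcessingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProcessingError::CairoError { status } => write!(f, "Cairo error: {:?}", status),
            ProcessingError::RenderingError => write!(f, "Rendering error"),
        }
    }
}

impl Error for ProcessingError {}

// Hypothetical surface constructor that fails the way a Cairo call might.
fn create_surface(width: i32, height: i32) -> Result<(i32, i32), Status> {
    if width <= 0 || height <= 0 {
        Err(Status::InvalidSize)
    } else {
        Ok((width, height))
    }
}

// Hypothetical render step; `render_ok` simulates the boolean that the
// C convenience API returns.
fn render(width: i32, height: i32, render_ok: bool) -> Result<(), Box<dyn Error>> {
    // Translate the low-level status, then let `?` box it and bubble it up.
    let _surface = create_surface(width, height)
        .map_err(|e| ProcessingError::CairoError { status: e })?;

    if render_ok {
        Ok(())
    } else {
        Err(Box::new(ProcessingError::RenderingError))
    }
}
```

The enum is the same as before; the derive only removes the Display and Error implementations that sit between it and the functions that use it.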
Up to where do errors bubble?
This is the "do everything" function:
fn run(opt: &Opt) -> Result<(), Error> {
    ...

    for path in &opt.inputs {
        process_path(opt, &path)?;
    }

    Ok(())
}
For each path passed on the command line, process it. If the path corresponds to a directory, the program scans it recursively; if the path is an SVG file, the program loads the file and renders it.
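The recursive scan can be sketched with std::fs alone. The helper below is hypothetical and only collects paths instead of rendering them; recognizing SVG files by their .svg extension is an assumption made for the example, not necessarily how rsvg-bench decides.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical helper: collect every `.svg` file at or below `path`.
// The extension check is an assumption made for this sketch.
fn collect_svgs(path: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    if path.is_dir() {
        for entry in fs::read_dir(path)? {
            // Reading each directory entry can itself fail, hence `entry?`.
            collect_svgs(&entry?.path(), found)?;
        }
    } else if path.extension().map_or(false, |ext| ext == "svg") {
        found.push(path.to_path_buf());
    }
    Ok(())
}
```

Every fallible step ends in a ?, so an unreadable directory anywhere in the tree stops the walk and surfaces as a single io::Error in the caller.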
Finally, main() just has this:
fn main() {
    let opt = Opt::from_args();

    ...

    match run(&opt) {
        Ok(_) => (),
        Err(e) => {
            eprintln!("{}", e);
            process::exit(1);
        }
    }
}
I.e. process command line arguments, run the whole thing, and print an error if there was one.
I really appreciate that most places that can return an error can just put a ? for the error to bubble up. This is much more legible than in C, where every call must have an if (something_bad_happened) { deal_with_it; } after it... and Rust won't let me get away with ignoring an error, but it makes it easy to actually deal with it properly.
Reading an SVG file quickly
Why, just mmap() it and feed it to librsvg, to avoid buffer copies. This is easy in Rust:
fn process_file<P: AsRef<Path>>(opt: &Opt, path: P) -> Result<(), Error> {
    let file = File::open(path)?;
    let mmap = unsafe { MmapOptions::new().map(&file)? };
    let bytes = &mmap;
    let handle = rsvg::Handle::new_from_data(bytes)?;
    ...
}
Many things can go wrong here:
- File::open() can return an io::Error.
- MmapOptions::map() can return an io::Error from the mmap(2) system call, or from the fstat(2) to read the file's size to map it.
- rsvg::Handle::new_from_data() can return a GError from parsing the file.
The little ? characters after each call that can return an error mean, just give me back the result, or convert the error to a failure::Error that can be examined later. This is beautifully legible to me.
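To make that conversion visible, here is a small std-only sketch of what ? roughly expands to. AnyError is a hypothetical catch-all type standing in for failure::Error, and the file is read into a buffer with fs::read instead of being mmap()ed, just to keep the example free of external crates.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical catch-all error, standing in for failure::Error.
#[derive(Debug)]
struct AnyError(String);

// `?` looks for a `From` implementation to convert the error it finds.
impl From<io::Error> for AnyError {
    fn from(e: io::Error) -> Self {
        AnyError(format!("I/O error: {}", e))
    }
}

// With `?`: the conversion from io::Error to AnyError is implicit.
fn file_len(path: &Path) -> Result<usize, AnyError> {
    let bytes = fs::read(path)?;
    Ok(bytes.len())
}

// Roughly the same function with the `?` written out by hand.
fn file_len_explicit(path: &Path) -> Result<usize, AnyError> {
    let bytes = match fs::read(path) {
        Ok(bytes) => bytes,
        Err(e) => return Err(AnyError::from(e)),
    };
    Ok(bytes.len())
}
```

Three fallible calls in a row would need three of those match blocks; with ? they stay on one line each.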
Summary
Writing command-line programs in Rust is fun! It's nice to have neurotically-safe scripts that one can trust in the future.