Reducing memory consumption in librsvg, part 3: slack space in Bézier paths

Translations: es - Tags: gnome, librsvg, performance, rust

We got a bug with a gigantic SVG of a map extracted from OpenStreetMap, and it has about 600,000 elements. Most of them are <path>, that is, specifications for Bézier paths.

A <path> can look like this:

<path d="m 2239.05,1890.28 5.3,-1.81"/>

The d attribute contains a list of commands to create a Bézier path, very similar to PostScript's operators. Librsvg has the following to represent those commands:

pub enum PathCommand {
    MoveTo(f64, f64),
    LineTo(f64, f64),

Those commands get stored in an array, a Vec inside a PathBuilder:

pub struct PathBuilder {
    path_commands: Vec<PathCommand>,

Librsvg translates each of the commands inside a <path d="..."/> into a PathCommand and pushes it into the Vec in the PathBuilder. When it is done parsing the attribute, the PathBuilder remains as the final version of the path.

To let a Vec grow efficiently as items are pushed into it, Rust makes the Vec grow by powers of 2. When we add an item, if the capacity of the Vec is full, its buffer gets realloc()ed to twice its capacity. That way there are only O(log₂n) calls to realloc(), where n is the total number of items in the array.

However, this means that once we are done adding items to the Vec, there may still be some free space in it: the capacity exceeds the length of the array. The invariant is that vec.capacity() >= vec.len().

First I wanted to shrink the PathBuilders so that they have no extra capacity in the end.

First step: convert to Box<[T]>

A "boxed slice" is a contiguous array in the heap, that cannot grow or shrink. That is, it has no extra capacity, only a length.

Vec has a method into_boxed_slice which does eactly that: it consumes the vector and converts it into a boxed slice without extra capacity. In its innards, it does a realloc() on the Vec's buffer to match its length.

Let's see the numbers that Massif reports:

  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
 23 22,751,613,855    1,560,916,408    1,493,746,540    67,169,868            0

 30 22,796,106,012    1,553,581,072    1,329,943,324   223,637,748            0

That is, we went from using 1,493,746,540 bytes on the heap to using 1,329,943,324 bytes. Simply removing extra capacity from the path commands saves about 159 MB for this particular file.

Second step: make the allocator do less work

However, the extra-heap column in that table has a number I don't like: there are 223,637,748 bytes in malloc() metadata and unused space in the heap.

I suppose that so many calls to realloc() make the heap a bit fragmented.

It would be good to be able to read most of the <path d="..."/> to temporary buffers that don't need so many calls to realloc(), and that in the end get copied to exact-sized buffers, without extra capacity.

We can do just that with the smallvec crate. A SmallVec has the same API as Vec, but it can store small arrays directly in the stack, without an extra heap allocation. Once the capacity is full, the stack buffer "spills" into a heap buffer automatically.

Most of the d attributes in the huge file in the bug have fewer than 32 commands. That is, if we use the following:

pub struct PathBuilder {
    path_commands: SmallVec<[PathCommand; 32]>,

We are saying that there can be up to 32 items in the SmallVec without causing a heap allocation; once that is exceeded, it will work like a normal Vec.

At the end we still do into_boxed_slice to turn it into an independent heap allocation with an exact size.

This reduces the extra-heap quite a bit:

  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
 33 24,139,598,653    1,416,831,176    1,329,943,212    86,887,964            0

Also, the total bytes shrink from 1,553,581,072 to 1,416,831,176 — we have a smaller heap because there is not so much work for the allocator, and there are a lot fewer temporary blocks when parsing the d attributes.

Making the code prettier

I put in the following:

/// This one is mutable
pub struct PathBuilder {
    path_commands: SmallVec<[PathCommand; 32]>,

/// This one is immutable
pub struct Path {
    path_commands: Box<[PathCommand]>,

impl PathBuilder {
    /// Consumes the PathBuilder and converts it into an immutable Path
    pub fn into_path(self) -> Path {
        Path {
            path_commands: self.path_commands.into_boxed_slice(),

With that, PathBuilder is just a temporary struct that turns into an immutable Path once we are done feeding it. Path contains a boxed slice of the exact size, without any extra capacity.

Next steps

All the coordinates in librsvg are stored as f64, double-precision floating point numbers. The SVG/CSS spec says that single-precision floats are enough, and that 64-bit floats should be used only for geometric transformations.

I'm a bit scared to make that change; I'll have to look closely at the results of the test suite to see if rendered files change very much. I suppose even big maps require only as much precision as f32 — after all, that is what OpenStreetMap uses.