Boring news about Federico - October 2016

Go forward in time to November 2016.

Fri 2016/Oct/28

Last time I showed you my beginnings of porting parts of Librsvg to Rust. In this post I'll do an annotated porting of a few functions.

Disclaimers: I'm learning Rust as I go. I don't know all the borrowing/lending rules; "Rust means never having to close a socket" is a very enlightening article, although it doesn't tell the whole story. I don't know Rust idioms that would make my code prettier. I am trying to refactor things to be prettier after a the initial pass of C-to-Rust. If you know an idiom that would be useful, please mail me!

So, let's continue with the code to render SVG markers, as before. I'll start with this function:

/* In C */

static gboolean
points_equal (double x1, double y1, double x2, double y2)
{
    return DOUBLE_EQUALS (x1, x2) && DOUBLE_EQUALS (y1, y2);
}

I know that Rust supports tuples, and pretty structs, and everything. But so far, the refactoring I've done hasn't led me to really want to use them for this particular part of the library. Maybe later! Anyway, this translates easily to Rust; I already had a function called double_equals() from the last time. The result is as follows:

/* In Rust */

fn points_equal (x1: f64, y1: f64, x2: f64, y2: f64) -> bool {
    double_equals (x1, x2) && double_equals (y1, y2)
}

Pro-tip: text editor macros work very well for shuffling around the "double x1" into "x1: f64" :)

Remove the return and the semicolon at the end of the line so that the function returns the value of the && expression. I could leave the return in there, but not having it is more Rusty, perhaps. (Rust also has a return keyword, which I think they keep around to allow early exits from functions.)

This function doesn't get used yet, so the existing tests don't catch it. The first time I ran the Rust compiler on it, it complained of a type mismatch: I had put f64 instead of bool for the return type, which is of course wrong. Oops. Fix it, test again that it builds, done.

Okay, next!

But first, a note about how the original Segment struct in C evolved after refactoring in Rust.

Original in C	Straight port to Rust	After refactoring
typedef struct { /* If is_degenerate is true, * only (p1x, p1y) are valid. * If false, all are valid. */ gboolean is_degenerate; double p1x, p1y; double p2x, p2y; double p3x, p3y; double p4x, p4y; } Segment;	struct Segment { /* If is_degenerate is true, * only (p1x, p1y) are valid. * If false, all are valid. */ is_degenerate: bool, p1x: f64, p1y: f64, p2x: f64, p2y: f64, p3x: f64, p3y: f64, p4x: f64, p4y: f64 }	pub enum Segment { Degenerate { // A single lone point x: f64, y: f64 }, LineOrCurve { x1: f64, y1: f64, x2: f64, y2: f64, x3: f64, y3: f64, x4: f64, y4: f64 }, }

Original in C

Straight port to Rust

After refactoring

typedef struct {
    /* If is_degenerate is true,
     * only (p1x, p1y) are valid.
     * If false, all are valid.
     */
    gboolean is_degenerate;
    double p1x, p1y;
    double p2x, p2y;
    double p3x, p3y;
    double p4x, p4y;
} Segment;

struct Segment {
    /* If is_degenerate is true,
     * only (p1x, p1y) are valid.
     * If false, all are valid.
     */
    is_degenerate: bool,
    p1x: f64, p1y: f64,
    p2x: f64, p2y: f64,
    p3x: f64, p3y: f64,
    p4x: f64, p4y: f64
}

pub enum Segment {
    Degenerate { // A single lone point
        x: f64,
        y: f64
    },

    LineOrCurve {
        x1: f64, y1: f64,
        x2: f64, y2: f64,
        x3: f64, y3: f64,
        x4: f64, y4: f64
    },
}

In the C version, and in the original Rust version, I had to be careful to only access the x1/y1 fields if is_degenerate==true. Rust has a very convenient "enum" type, which can work pretty much as a normal C enum, or as a tagged union, as shown here. Rust will not let you access fields that don't correspond to the current tag value of the enum. (I'm not sure if "tag value" is the right way to call it — in any case, if a segment is Segment::Degenerate, the compiler only lets you access the x/y fields; if it is Segment::LineOrCurve, it only lets you access x1/y1/x2/y2/etc.) We'll see the match statement below, which is how enum access is done.

Next!

Original in C	Straight port to Rust
/* A segment is zero length if it is degenerate, or if all four control points * coincide (the first and last control points may coincide, but the others may * define a loop - thus nonzero length) / static gboolean is_zero_length_segment (Segment segment) { double p1x, p1y; double p2x, p2y; double p3x, p3y; double p4x, p4y; if (segment->is_degenerate) return TRUE; p1x = segment->p1x; p1y = segment->p1y; p2x = segment->p2x; p2y = segment->p2y; p3x = segment->p3x; p3y = segment->p3y; p4x = segment->p4x; p4y = segment->p4y; return (points_equal (p1x, p1y, p2x, p2y) && points_equal (p1x, p1y, p3x, p3y) && points_equal (p1x, p1y, p4x, p4y)); }	/* A segment is zero length if it is degenerate, or if all four control points * coincide (the first and last control points may coincide, but the others may * define a loop - thus nonzero length) */ fn is_zero_length_segment (segment: Segment) -> bool { match segment { Segment::Degenerate { .. } => { true }, Segment::LineOrCurve { x1, y1, x2, y2, x3, y3, x4, y4 } => { (points_equal (x1, y1, x2, y2) && points_equal (x1, y1, x3, y3) && points_equal (x1, y1, x4, y4)) } } }

Original in C

Straight port to Rust

/* A segment is zero length if it is degenerate, or if all four control points
 * coincide (the first and last control points may coincide, but the others may
 * define a loop - thus nonzero length)
 */
static gboolean
is_zero_length_segment (Segment *segment)
{
    double p1x, p1y;
    double p2x, p2y;
    double p3x, p3y;
    double p4x, p4y;

    if (segment->is_degenerate)
        return TRUE;

    p1x = segment->p1x;
    p1y = segment->p1y;

    p2x = segment->p2x;
    p2y = segment->p2y;

    p3x = segment->p3x;
    p3y = segment->p3y;

    p4x = segment->p4x;
    p4y = segment->p4y;

    return (points_equal (p1x, p1y, p2x, p2y)
            && points_equal (p1x, p1y, p3x, p3y)
            && points_equal (p1x, p1y, p4x, p4y));
}

/* A segment is zero length if it is degenerate, or if all four control points
 * coincide (the first and last control points may coincide, but the others may
 * define a loop - thus nonzero length)
 */
fn is_zero_length_segment (segment: Segment) -> bool {
    match segment {
        Segment::Degenerate { .. } => { true },

        Segment::LineOrCurve { x1, y1, x2, y2, x3, y3, x4, y4 } => {
            (points_equal (x1, y1, x2, y2)
             && points_equal (x1, y1, x3, y3)
             && points_equal (x1, y1, x4, y4))
        }
    }
}

To avoid a lot of "segment->this, segment->that, segment->somethingelse", the C version copies the fields from the struct into temporary variables and calls points_equal() with them. The Rust version doesn't need to do this, since we have a very convenient match statement.

Rust really wants you to handle all the cases that your enum may be in. You cannot do something like "if segment == Segment::Degenerate", because you may forget an "else if" for some case. Instead, the match statement is much more powerful. It is really a pattern-matching engine, and for enums it lets you consider each case separately. The fields inside each case get unpacked like in "Segment::LineOrCurve { x1, y1, ... }" so you can use them easily, and only within that case. In the Degenerate case, I don't use the x/y fields, so I write "Segment::Degenerate { .. }" to avoid having unused variables.

I'm sure I'll need to change something in the prototype of this function. The plain "segment: Segment" argument in Rust means that the is_zero_length_segment() function will take ownership of the segment. I'll be passing it from an array, but I don't know what shape that code will take yet, so I'll leave it like this for now and change it later.

This function could use a little test, couldn't it? Just to guard from messing up the coordinate names later if I decide to refactor it with tuples for points, or something. Fortunately, the tests are really easy to set up in Rust:

    #[test]
    fn degenerate_segment_is_zero_length () {
        assert! (super::is_zero_length_segment (degenerate (1.0, 2.0)));
    }

    #[test]
    fn line_segment_is_nonzero_length () {
        assert! (!super::is_zero_length_segment (line (1.0, 2.0, 3.0, 4.0)));
    }

    #[test]
    fn line_segment_with_coincident_ends_is_zero_length () {
        assert! (super::is_zero_length_segment (line (1.0, 2.0, 1.0, 2.0)));
    }

    #[test]
    fn curves_with_loops_and_coincident_ends_are_nonzero_length () {
        assert! (!super::is_zero_length_segment (curve (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 1.0, 2.0)));
        assert! (!super::is_zero_length_segment (curve (1.0, 2.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0)));
        assert! (!super::is_zero_length_segment (curve (1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 1.0, 2.0)));
    }

    #[test]
    fn curve_with_coincident_control_points_is_zero_length () {
        assert! (super::is_zero_length_segment (curve (1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0)));
    }

The degenerate(), line(), and curve() utility functions are just to create the appropriate Segment::Degenerate { x, y } without so much typing, and to make the tests more legible.

After running cargo test, all the tests pass. Yay! And we didn't have to fuck around with relinking a version specifically for testing, or messing with making static functions available to tests, like we would have had to do in C. Double yay!

Next!

Original in C	Straight port to Rust
static gboolean find_incoming_directionality_backwards (Segment segments, int num_segments, int start_index, double vx, double vy) { int j; gboolean found; / "go backwards ... within the current subpath until ... segment which has directionality at its end point" / found = FALSE; for (j = start_index; j >= 0; j--) { / 1 / if (segments[j].is_degenerate) break; / reached the beginning of the subpath as we ran into a standalone point / else { / 2 / if (is_zero_length_segment (&segments[j])) / 3 / continue; else { found = TRUE; break; } } } if (found) { / 4 / g_assert (j >= 0); vx = segments[j].p4x - segments[j].p3x; vy = segments[j].p4y - segments[j].p3y; return TRUE; } else { vx = 0.0; *vy = 0.0; return FALSE; } }	fn find_incoming_directionality_backwards (segments: Vec, start_index: usize) -> (bool, f64, f64) { let mut found: bool; let mut vx: f64; let mut vy: f64; /* "go backwards ... within the current subpath until ... segment which has directionality at its end point" / found = false; vx = 0.0; vy = 0.0; for j in (0 .. start_index + 1).rev () { / 1 / match segments[j] { Segment::Degenerate { .. } => { break; / reached the beginning of the subpath as we ran into a standalone point / }, Segment::LineOrCurve { x3, y3, x4, y4, .. } => { / 2 / if is_zero_length_segment (&segments[j]) { / 3 / continue; } else { vx = x4 - x3; vy = y4 - y3; found = true; break; } } } } if found { / 4 */ (true, vx, vy) } else { (false, 0.0, 0.0) } }

Original in C

Straight port to Rust

static gboolean
find_incoming_directionality_backwards (Segment *segments, int num_segments, int start_index, double *vx, double *vy)
{
    int j;
    gboolean found;

    /* "go backwards ... within the current subpath until ... segment which has directionality at its end point" */

    found = FALSE;

    for (j = start_index; j >= 0; j--) {                                                                 /* 1 */
        if (segments[j].is_degenerate)
            break; /* reached the beginning of the subpath as we ran into a standalone point */
        else {                                                                                           /* 2 */
            if (is_zero_length_segment (&segments[j]))                                                   /* 3 */
                continue;
            else {
                found = TRUE;
                break;
            }
        }
    }

    if (found) {                                                                                         /* 4 */
        g_assert (j >= 0);
        *vx = segments[j].p4x - segments[j].p3x;
        *vy = segments[j].p4y - segments[j].p3y;
        return TRUE;
    } else {
        *vx = 0.0;
        *vy = 0.0;
        return FALSE;
    }
}

fn find_incoming_directionality_backwards (segments: Vec, start_index: usize) -> (bool, f64, f64)
{
    let mut found: bool;
    let mut vx: f64;
    let mut vy: f64;

    /* "go backwards ... within the current subpath until ... segment which has directionality at its end point" */

    found = false;
    vx = 0.0;
    vy = 0.0;

    for j in (0 .. start_index + 1).rev () {                                                            /* 1 */
        match segments[j] {
            Segment::Degenerate { .. } => {
                break; /* reached the beginning of the subpath as we ran into a standalone point */
            },

            Segment::LineOrCurve { x3, y3, x4, y4, .. } => {                                            /* 2 */
                if is_zero_length_segment (&segments[j]) {                                              /* 3 */
                    continue;
                } else {
                    vx = x4 - x3;
                    vy = y4 - y3;
                    found = true;
                    break;
                }
            }
        }
    }

    if found {                                                                                           /* 4 */
        (true, vx, vy)
    } else {
        (false, 0.0, 0.0)
    }
}

In reality this function returns three values: whether a directionality was found, and if so, the vx/vy components of the direction vector. In C the prototype is like "bool myfunc (..., out vx, out vy)": return the boolean conventionally, and get a reference to the place where we should store the other return values. In Rust, it is simple to just return a 3-tuple.

(Keen-eyed rustaceans will detect a code smell in the bool-plus-extra-crap return value, and tell me that I could use an Option instead. We'll see what the code wants to look like during the final refactoring!)

With this code, I need temporary variables vx/vy to store the result. I'll refactor it to return immediately without needing temporaries or a found variable.

1. We are looking backwards in the array of segments, starting at a specific element, until we find one that satisfies a certain condition. Looping backwards in C in the way done here has the peril that your loop variable needs to be signed, even though array indexes are unsigned: j will go from start_index down to -1, but the loop only runs while j >= 0.

Rust provides a somewhat strange idiom for backwards numeric ranges. A normal range looks like "0 .. n" and that means the half-open range [0, n). So if we want to count from start_index down to 0, inclusive, we need to rev()erse the half-open range [0, start_index + 1), and that whole thing is "(0 .. start_index + 1).rev ()".

2. Handling the degenerate case is trivial. Handling the other case is a bit more involved in Rust. We compute the vx/vy values here, instead of after the loop has exited, as at that time the j loop counter will be out of scope. This ugliness will go away during refactoring.

However, note the match pattern "Segment::LineOrCurve { x3, y3, x4, y4, .. }". This means, "I am only interested in the x3/y3/x4/y4 fields of the enum"; the .. indicates to ignore the others.

3. Note the ampersand in "is_zero_length_segment (&segments[j])". When I first wrote this, I didn't include the & sign, and Rust complained that it couldn't pass segments[j] to the function because the function would take ownership of that value, while in fact the value is owned by the array. I need to declare the function as taking a reference to a segment ("a pointer"), and I need to call the function by actually passing a reference to the segment, with & to take the "address" of the segment like in C. And if you look at the C version, it also says "&segments[j]"! So, the function now looks like this:

fn is_zero_length_segment (segment: &Segment) -> bool {
    match *segment {
        ...

Which means, the function takes a reference to a Segment, and when we want to use it, we de-reference it as *segment.

While my C-oriented brain interprets this as references and dereferencing pointers, Rust wants me to think in the higher-level terms. A function will take ownership of an argument if it is declared like fn foo(x: Bar), and the caller will lose ownership of what it passed in x. If I want the caller to keep owning the value, I can "lend" it to the function by passing a reference to it, not the actual value. And I can make the function "borrow" the value without taking ownership, because references are not owned; they are just pointers to values.

It turns out that the three chapters of the Rust book that deal with this are very clear and understandable, and I was irrationally scared of reading them. Go through them in order: Ownership, References and borrowing, Lifetimes. I haven't used the lifetime syntax yet, but it lets you solve the problem of dangling pointers inside live structs.

4. At the end of the function, we build our 3-tuple result and return it.

And what if we remove the ugliness from a straight C-to-Rust port? It starts looking like this:

fn find_incoming_directionality_backwards (segments: Vec, start_index: usize) -> (bool, f64, f64)
{
    /* "go backwards ... within the current subpath until ... segment which has directionality at its end point" */

    for j in (0 .. start_index + 1).rev () {
        match segments[j] {
            Segment::Degenerate { .. } => {
                return (false, 0.0, 0.0); /* reached the beginning of the subpath as we ran into a standalone point */
            },

            Segment::LineOrCurve { x3, y3, x4, y4, .. } => {
                if is_zero_length_segment (&segments[j]) {
                    continue;
                } else {
                    return (true, x4 - x3, y4 - y3);
                }
            }
        }
    }

    (false, 0.0, 0.0)
}

We removed the auxiliary variables by returning early from within the loop. I could remove the continue by negating the result of is_zero_length_segment() and returning the sought value in that case, but in my brain it is easier to read, "is this a zero length segment, i.e. that segment has no directionality? If yes, continue to the previous one, otherwise return the segment's outgoing tangent vector".

But what if is_zero_length_segment() is the wrong concept? My calling function is called find_incoming_directionality_backwards(): it looks for segments in the array until it finds one with directionality. It happens to know that a zero-length segment has no directionality, but it doesn't really care about the length of segments. What if we called the helper function get_segment_directionality() and it returned false when the segment has none, and a vector otherwise?

Rust provides the Option pattern just for this. And I'm itching to show you some diagrams of Bézier segments, their convex hulls, and what the goddamn tangents and directionalities actually mean graphically.

But I have to evaluate Outreachy proposals, and if I keep reading Rust docs and refactoring merrily, I'll never get that done.

Sorry to leave you in a cliffhanger! More to come soon!

Tue 2016/Oct/25

Librsvg gets Rusty

I've been wanting to learn Rust for some time. It has frustrated me for a number of years that it is quite possible to write GNOME applications in high-level languages, but for the libraries that everything else uses ("the GNOME platform"), we are pretty much stuck with C. Vala is a very nice effort, but to me it never seemed to catch much momentum outside of GNOME.

After reading this presentation called "Rust out your C", I got excited. It *is* possible to port C code to Rust, small bits at a time! You rewrite some functions in Rust, make them linkable to the C code, and keep calling them from C as usual. The contortions you need to do to make C types accessible from Rust are no worse than for any other language.

I'm going to use librsvg as a testbed for this.

Librsvg is an old library. It started as an experiment to write a SAX-based parser for SVG ("don't load the whole DOM into memory; instead, stream in the XML and parse it as we go"), and a renderer with the old libart (what we used in GNOME for 2D vector rendering before Cairo came along). Later it got ported to Cairo, and that's the version that we use now.

Outside of GNOME, librsvg gets used at Wikimedia to render the SVGs all over Wikipedia. We have gotten excellent bug reports from them!

Librsvg has a bunch of little parsers for the mini-languages inside SVG's XML attributes. For example, within a vector path definition, "M10,50 h20 V10 Z" means, "move to the coordinate (10, 50), draw a horizontal line 20 pixels to the right, then a vertical line to absolute coordinate 10, then close the path with another line". There are state machines, like the one that transforms that path definition into three line segments instead of the PostScript-like instructions that Cairo understands. There are some pixel-crunching functions, like Gaussian blurs and convolutions for SVG filters.

It should be quite possible to port those parts of librsvg to Rust, and to preserve the C API for general consumption.

Every once in a while someone discovers a bug in librsvg that makes it all the way to a CVE security advisory, and it's all due to using C. We've gotten double free()s, wrong casts, and out-of-bounds memory accesses. Recently someone did fuzz-testing with some really pathological SVGs, and found interesting explosions in the library. That's the kind of 1970s bullshit that Rust prevents.

I also hope that this will make it easier to actually write unit tests for librsvg. Currently we have some pretty nifty black-box tests for the whole library, which essentially take in complete SVG files, render them, and compare the results to a reference image. These are great for smoke testing and guarding against regressions. However, all the fine-grained machinery in librsvg has zero tests. It is always a pain in the ass to make static C functions testable "from the outside", or to make mock objects to provide them with the kind of environment they expect.

So, on to Rustification!

I've started with a bit of the code from librsvg that is fresh in my head: the state machine that renders SVG markers.

SVG markers

This image with markers comes from the official SVG test suite:

SVG markers let you put symbols along the nodes of a path. You can use them to draw arrows (arrowhead as an end marker on a line), points in a chart, and other visual effects.

In the example image above, this is what is happening. The SVG defines four marker types:
- A purple square that always stays upright.
- A green circle.
- A blue triangle that always stays upright.
- A blue triangle whose orientation depends on the node where it sits.
The top row, with the purple squares, is a path (the black line) that says, "put the purple-square marker on all my nodes".

The middle row is a similar path, but it says, "put the purple-square marker on my first node, the green-circle marker on my middle nodes, and the blue-upright-triangle marker on my end node".

The bottom row has the blue-orientable-triangle marker on all the nodes. The triangle is defined to point to the right (look at the bottommost triangles!). It gets rotated 45 degrees at the middle node, and 90 degrees so it points up at the top-left node.

This was all fine and dandy, until one day we got a bug about incorrect rendering when there are funny paths paths. What makes a path funny?

For the code that renders markers, a path is not in the "easy" case when it is not obvious how to compute the orientation of nodes. A node's orientation, when it is well-behaved, is just the average angle of the node's incoming and outgoing lines (or curves). But if a path has contiguous coincident vertices, or stray points that don't have incoming/outgoing lines (imagine a sequence of moveto commands), or curveto commands with Bézier control points that are coincident with the nodes... well, in those cases, librsvg has to follow the spec to the letter, for it says how to handle those things.

In short, one has to walk the segments away from the node in question, until one finds a segment whose "directionality" can be computed: a segment that is an actual line or curve, not a coincident vertex nor a stray point.

Librsvg's algorithm has two parts to it. The first part takes the linear sequence of PostScript-like commands (moveto, lineto, curveto, closepath) and turns them into a sequence of segments. Each segment has two endpoints and two tangent directions at those endpoints; if the segment is a line, the tangents point in the same direction as the line. Or, the segment can be degenerate and it is just a single point.

The second part of the algorithm takes that list of segments for each node, and it does the walking-back-and-forth as described in the SVG spec. Basically, it finds the first non-degenerate segment on each side of a node, and uses the tangents of those segments to find the average orientation of the node.

The path-to-segments code

In the C code I had this:
```
typedef struct {
    gboolean is_degenerate; /* If true, only (p1x, p1y) are valid.  If false, all are valid */
    double p1x, p1y;
    double p2x, p2y;
    double p3x, p3y;
    double p4x, p4y;
} Segment;
```
P1 and P4 are the endpoints of each Segment; P2 and P3 are, like in a Bézier curve, the control points from which the tangents can be computed.

This translates readily to Rust:
```
struct Segment {
    is_degenerate: bool, /* If true, only (p1x, p1y) are valid.  If false, all are valid */
    p1x: f64, p1y: f64,
    p2x: f64, p2y: f64,
    p3x: f64, p3y: f64,
    p4x: f64, p4y: f64
}
```
Then a little utility function:
```
/* In C */
	    
#define EPSILON 1e-10
#define DOUBLE_EQUALS(a, b) (fabs ((a) - (b)) < EPSILON)


/* In Rust */
	    
const EPSILON: f64 = 1e-10;

fn double_equals (a: f64, b: f64) -> bool {
    (a - b).abs () < EPSILON
}
```
And now, the actual code that transforms a cairo_path_t (a list of moveto/lineto/curveto commands) into a list of segments. I'll interleave C and Rust code with commentary.
```
/* In C */

typedef enum {
    SEGMENT_START,
    SEGMENT_END,
} SegmentState;

static void
path_to_segments (const cairo_path_t *path,
                  Segment **out_segments,
                  int *num_segments)
{


/* In Rust */

enum SegmentState {
    Start,
    End
}

fn path_to_segments (path: cairo::Path) -> Vec<Segment> {
```
The enum is pretty much the same; Rust prefers CamelCase for enums instead of CAPITALIZED_SNAKE_CASE. The function prototype is much nicer in Rust. The cairo::Path is courtesy of gtk-rs, the budding Rust bindings for GTK+ and Cairo and all that goodness.

The C version allocates the return value as an array of Segment structs, and returns it in the out_segments argument (... and the length of the array in num_segments). The Rust version returns a mentally easier vector of Segment structs.

Now, the variable declarations at the beginning of the function:
```
/* In C */

{
    int i;
    double last_x, last_y;
    double cur_x, cur_y;
    double subpath_start_x, subpath_start_y;
    int max_segments;
    int segment_num;
    Segment *segments;
    SegmentState state;


/* In Rust */

{
    let mut last_x: f64;
    let mut last_y: f64;
    let mut cur_x: f64;
    let mut cur_y: f64;
    let mut subpath_start_x: f64;
    let mut subpath_start_y: f64;
    let mut has_first_segment : bool;
    let mut segment_num : usize;
    let mut segments: Vec<Segment>;
    let mut state: SegmentState;
```
In addition to having different type names (double becomes f64), Rust wants you to say when a variable will be mutable, i.e. when it is allowed to change value after its initialization.

Also, note that in C there's an "i" variable, which is used as a counter. There isn't a similar variable in the Rust version; there, we will use an iterator. Also, in the Rust version we have a new "has_first_segment" variable; read on to see its purpose.
```
    /* In C */

    max_segments = path->num_data; /* We'll generate maximum this many segments */
    segments = g_new (Segment, max_segments);
    *out_segments = segments;

    last_x = last_y = cur_x = cur_y = subpath_start_x = subpath_start_y = 0.0;

    segment_num = -1;
    state = SEGMENT_END;


    /* In Rust */
	      
    cur_x = 0.0;
    cur_y = 0.0;
    subpath_start_x = 0.0;
    subpath_start_y = 0.0;

    has_first_segment = false;
    segment_num = 0;
    segments = Vec::new ();
    state = SegmentState::End;
```
No problems here, just initializations. Note that in C we pre-allocate the segments array with a certain size. This is not the actual minimum size that the array will need; it is just an upper bound that comes from the way Cairo represents paths internally (it is not possible to compute the minimum size of the array without walking it first, so we use a good-enough value here that doesn't require walking). In the Rust version, we just create an empty vector and let it grow as needed.

Note also that the C version initializes segment_num to -1, while the Rust version sets has_first_segment to false and segment_num to 0. Read on!
```
    /* In C */

    for (i = 0; i < path->num_data; i += path->data[i].header.length) {
        last_x = cur_x;
        last_y = cur_y;


    /* In Rust */

    for cairo_segment in path.iter () {
        last_x = cur_x;
        last_y = cur_y;
```
We start iterating over the path's elements. Cairo, which is written in C, has a peculiar way of representing paths. path->num_data is the length of the path->data array. That array has elements in path->data[] that can be either commands, or point coordinates. Each command then specifies how many elements you need to "eat" to take in all its coordinates. Thus the "i" counter gets incremented on each iteration by path->data[i].header.length; this is the "how many to eat" magic value.

The Rust version is more civilized. Get a path.iter() which feeds you Cairo path segments, and boom, you are done. That civilization is courtesy of the gtk-rs bindings. Onwards!
```
    /* In C */

        switch (path->data[i].header.type) {
        case CAIRO_PATH_MOVE_TO:
            segment_num++;
            g_assert (segment_num < max_segments);



    /* In Rust */

        match cairo_segment {
            cairo::PathSegment::MoveTo ((x, y)) => {
                if has_first_segment {
                    segment_num += 1;
                } else {
                    has_first_segment = true;
                }
```
The C version switch()es on the type of the path segment. It increments segment_num, our counter-of-segments, and checks that it doesn't overflow the space we allocated for the results array.

The Rust version match()es on the cairo_segment, which is a Rust enum (think of it as a tagged union of structs). The first match case conveniently destructures the (x, y) coordinates; we will use them below.

If you recall from the above, the C version initialized segment_num to -1. This code for MOVE_TO is the first case in the code that we will hit, and that "segment_num++" causes the value to become 0, which is exactly the index in the results array where we want to place the first segment. Rust *really* wants you to use an usize value to index arrays ("unsigned size"). I could have used a signed size value starting at -1 and then incremented it to zero, but then I would have to cast it to unsigned — which is slightly ugly. So I introduce a boolean variable, has_first_segment, and use that instead. I think I could refactor this to have another state in SegmentState and remove the boolean variable.
```
        /* In C */

            g_assert (i + 1 < path->num_data);
            cur_x = path->data[i + 1].point.x;
            cur_y = path->data[i + 1].point.y;

            subpath_start_x = cur_x;
            subpath_start_y = cur_y;


         /* In Rust */

                cur_x = x;
                cur_y = y;

                subpath_start_x = cur_x;
                subpath_start_y = cur_y;
```
In the C version, I assign (cur_x, cur_y) from the path->data[], but first ensure that the index doesn't overflow. In the Rust version, the (x, y) values come from the destructuring described above.
```
        /* In C */

            segments[segment_num].is_degenerate = TRUE;

            segments[segment_num].p1x = cur_x;
            segments[segment_num].p1y = cur_y;

            state = SEGMENT_START;

            break;


         /* In Rust */

                let seg = Segment {
                    is_degenerate: true,
                    p1x: cur_x,
                    p1y: cur_y,
                    p2x: 0.0, p2y: 0.0, p3x: 0.0, p3y: 0.0, p4x: 0.0, p4y: 0.0 // these are set in the next iteration
                };

                segments.push (seg);

                state = SegmentState::Start;
            },
```
This is where my lack of Rust idiomatic skills really starts to show. In C I put (cur_x, cur_y) in the (p1x, p1y) fields of the current segment, and since it is_degenerate, I'll know that the other p2/p3/p4 fields are not valid — and like any C programmer who wears sandals instead of steel-toed boots, I leave their memory uninitialized. Rust doesn't want me to have uninitialized values EVER, so I must fill a Segment structure and then push() it into our segments vector.

So, the C version really wants to have a segment_num counter where I can keep track of which index I'm filling. Why is there a similar counter in the Rust version? We will see why in the next case.
```
        /* In C */

        case CAIRO_PATH_LINE_TO:
            g_assert (i + 1 < path->num_data);
            cur_x = path->data[i + 1].point.x;
            cur_y = path->data[i + 1].point.y;

            if (state == SEGMENT_START) {
                segments[segment_num].is_degenerate = FALSE;
                state = SEGMENT_END;
            } else /* SEGMENT_END */ {
                segment_num++;
                g_assert (segment_num < max_segments);

                segments[segment_num].is_degenerate = FALSE;

                segments[segment_num].p1x = last_x;
                segments[segment_num].p1y = last_y;
            }

            segments[segment_num].p2x = cur_x;
            segments[segment_num].p2y = cur_y;

            segments[segment_num].p3x = last_x;
            segments[segment_num].p3y = last_y;

            segments[segment_num].p4x = cur_x;
            segments[segment_num].p4y = cur_y;

            break;


         /* In Rust */

            cairo::PathSegment::LineTo ((x, y)) => {
                cur_x = x;
                cur_y = y;

                match state {
                    SegmentState::Start => {
                        segments[segment_num].is_degenerate = false;
                        state = SegmentState::End;
                    },

                    SegmentState::End => {
                        segment_num += 1;

                        let seg = Segment {
                            is_degenerate: false,
                            p1x: last_x,
                            p1y: last_y,
                            p2x: 0.0, p2y: 0.0, p3x: 0.0, p3y: 0.0, p4x: 0.0, p4y: 0.0  // these are set below
                        };

                        segments.push (seg);
                    }
                }

                segments[segment_num].p2x = cur_x;
                segments[segment_num].p2y = cur_y;

                segments[segment_num].p3x = last_x;
                segments[segment_num].p3y = last_y;

                segments[segment_num].p4x = cur_x;
                segments[segment_num].p4y = cur_y;
            },
```
Whoa! Buts let's piece it apart bit by bit.

First we set cur_x and cur_y from the path data, as usual.

Then we roll the state machine. Remember we got a LINE_TO. If we are in the state START ("just have a single point, possibly a degenerate one"), then we turn the old segment into a non-degenerate, complete line segment. If we are in the state END ("we were already drawing non-degenerate lines"), we create a new segment and fill it in. I'll probably change the names of those states to make it more obvious what they mean.

In C we had a preallocated array for "segments", so the idiom to create a new segment is simply "segment_num++". In Rust we grow the segments array as we go, hence the "segments.push (seg)".

I will probably refactor this code. I don't like it that it looks like
```
    case move_to:
        start possibly-degenerate segment

    case line_to:
        are we in a possibly-degenerate segment?
            yes: make it non-degenerate and remain in that segment...

            no: create a new segment, switch to it, and fill its first fields...

	... for both cases, fill in the last fields of the segment
```
That is, the "yes" case fills in fields from the segment we were handling in the *previous* iteration, while the "no" case fills in fields from a *new* segment that we created in the present iteration. That asymmetry bothers me. Maybe we should build up the next-segment's fields in auxiliary variables, and only put them in a complete Segment structure once we really know that we are done with that segment? I don't know; we'll see what is more legible in the end.

The other two cases, for CURVE_TO and CLOSE_PATH, are analogous, except that CURVE_TO handles a bunch more coordinates for the control points, and CLOSE_PATH goes back to the coordinates from the last point that was a MOVE_TO.

And those tests you were talking about?

Well, I haven't written them yet! This is my very first Rust code, after reading a pile of getting-started documents.

Already in the case for CLOSE_PATH I think I've found a bug. It doesn't really create a segment for multi-line paths when the path is being closed. The reftests didn't catch this because none of the reference images with SVG markers uses a CLOSE_PATH command! The unit tests for this path_to_segments() machinery should be able to find this easily, and closer to the root cause of the bug.

What's next?

Learning how to link and call that Rust code from the C library for librsvg. Then I'll be able to remove the corresponding C code.

Feeling safer already?

Go backward in time to July 2016.

Boring news about Federico - October 2016

SVG markers

The path-to-segments code

And those tests you were talking about?

What's next?