CSS length values have a number and a unit, e.g. 5cm
or 6px
. Sometimes the unit is a percentage, like 50%
, and SVG
says that lengths with percentage units should be resolved with
respect to a certain rectangle. For example, consider this circle
element:
<circle cx="50%" cy="75%" r="4px" fill="black"/>
This means, draw a solid black circle whose center is at 50% of the width and 75% of the height of the current viewport. The circle should have a 4-pixel radius.
The process of converting that kind of units into absolute pixels for the final drawing is called normalization. In SVG, percentage units sometimes need to be normalized with respect to the current viewport (a local coordinate system), or with respect to the size of another object (e.g. when a clipping path is used to cut the current shape in half).
One detail about normalization is that it can be with respect to the horizontal dimension of the current viewport, the vertical dimension, or both. Keep this in mind: at normalization time, we need to be able to distinguish between those three modes.
The original C version
I have talked about the original C code for lengths before; the following is a small summary.
The original C code had this struct to represent lengths:
typedef struct {
double length;
char factor;
} RsvgLength;
The parsing code would set the factor
field to a
character depending on the length's unit: 'p'
for percentages, 'i'
for inches, etc., and '\0'
for the default unit, which is pixels.
Along with that, the normalization code needed to know the direction (horizontal, vertical, both) to which the length in question refers. It did this by taking another character as an argument to the normalization function:
double
_rsvg_css_normalize_length (const RsvgLength * in, RsvgDrawingCtx * ctx, char dir)
{
if (in->factor == '\0') /* pixels, no need to normalize */
return in->length;
else if (in->factor == 'p') { /* percentages; need to consider direction */
if (dir == 'h') /* horizontal */
return in->length * ctx->vb.rect.width;
if (dir == 'v') /* vertical */
return in->length * ctx->vb.rect.height;
if (dir == 'o') /* both */
return in->length * rsvg_viewport_percentage (ctx->vb.rect.width,
ctx->vb.rect.height);
} else { ... }
}
The original post talks about how I found a couple of bugs with
how the directions are identified at normalization time. The function
above expects one of 'h'/'v'/'o'
for horizontal/vertical/both, and one or
two places in the code passed the wrong character.
Making the C version cleaner
Before converting that code to Rust, I removed the pesky characters and made the code use proper enums to identify a length's units.
+typedef enum {
+ LENGTH_UNIT_DEFAULT,
+ LENGTH_UNIT_PERCENT,
+ LENGTH_UNIT_FONT_EM,
+ LENGTH_UNIT_FONT_EX,
+ LENGTH_UNIT_INCH,
+ LENGTH_UNIT_RELATIVE_LARGER,
+ LENGTH_UNIT_RELATIVE_SMALLER
+} LengthUnit;
+
typedef struct {
double length;
- char factor;
+ LengthUnit unit;
} RsvgLength;
Then, do the same for the normalization function, so it will get the direction in which to normalize as an enum instead of a char.
+typedef enum {
+ LENGTH_DIR_HORIZONTAL,
+ LENGTH_DIR_VERTICAL,
+ LENGTH_DIR_BOTH
+} LengthDir;
double
-_rsvg_css_normalize_length (const RsvgLength * in, RsvgDrawingCtx * ctx, char dir)
+_rsvg_css_normalize_length (const RsvgLength * in, RsvgDrawingCtx * ctx, LengthDir dir)
Making the C version easier to get right
While doing the last change above, I found a place in the code that used the wrong direction by mistake, probably due to a cut&paste error. Part of the problem here is that the code was specifying the direction at normalization time.
I decided to change it so that each direction value carried its own
direction since initialization, so that subsequent
code wouldn't have to worry about that. Hopefully, initializing a
width
field should make it obvious that it needed
LENGTH_DIR_HORIZONTAL
.
typedef struct {
double length;
LengthUnit unit;
+ LengthDir dir;
} RsvgLength;
That is, so that instead of
/* at initialization time */
foo.width = _rsvg_css_parse_length (str);
...
/* at rendering time */
double final_width = _rsvg_css_normalize_length (&foo.width, ctx, LENGTH_DIR_HORIZONTAL);
we would instead do this:
/* at initialization time */
foo.width = _rsvg_css_parse_length (str, LENGTH_DIR_HORIZONTAL);
...
/* at rendering time */
double final_width = _rsvg_css_normalize_length (&foo.width, ctx);
This made the drawing code, which deals with a lot of coordinates at the same time, a lot less noisy.
Initial port to Rust
To recap, this was the state of the structs after the initial refactoring in C:
typedef enum {
LENGTH_UNIT_DEFAULT,
LENGTH_UNIT_PERCENT,
LENGTH_UNIT_FONT_EM,
LENGTH_UNIT_FONT_EX,
LENGTH_UNIT_INCH,
LENGTH_UNIT_RELATIVE_LARGER,
LENGTH_UNIT_RELATIVE_SMALLER
} LengthUnit;
typedef enum {
LENGTH_DIR_HORIZONTAL,
LENGTH_DIR_VERTICAL,
LENGTH_DIR_BOTH
} LengthDir;
typedef struct {
double length;
LengthUnit unit;
LengthDir dir;
} RsvgLength;
This ported to Rust in a straightforward fashion:
pub enum LengthUnit {
Default,
Percent,
FontEm,
FontEx,
Inch,
RelativeLarger,
RelativeSmaller
}
pub enum LengthDir {
Horizontal,
Vertical,
Both
}
pub struct RsvgLength {
length: f64,
unit: LengthUnit,
dir: LengthDir
}
It got a similar constructor that took the
direction and produced an RsvgLength
:
impl RsvgLength {
pub fn parse (string: &str, dir: LengthDir) -> RsvgLength { ... }
}
(This was before using Result
; remember that the original
C code did very little error checking!)
The initial Parse trait
It was at that point that it seemed convenient to introduce a Parse
trait, which all CSS value types would implement to
parse themselves from string.
However, parsing an RsvgLength
also needed an extra piece of data,
the LengthDir
. My initial version of the Parse
trait had an
associated called Data
, through which one could pass an extra piece
of data during parsing/initialization:
pub trait Parse: Sized {
type Data;
type Err;
fn parse (s: &str, data: Self::Data) -> Result<Self, Self::Err>;
}
This was explicitly to be able to pass a LengthDir
to the parser for
RsvgLength
:
impl Parse for RsvgLength {
type Data = LengthDir;
type Err = AttributeError;
fn parse (string: &str, dir: LengthDir) -> Result <RsvgLength, AttributeError> { ... }
}
This was okay for lengths, but very noisy for everything else that
didn't require an extra bit of data. In the rest of the code, the
helper type was Data = ()
and there was a pair of extra parentheses ()
in every place that parse()
was called.
Removing the helper Data type
Introducing one type per direction
Over a year later, that ()
bit of data everywhere was driving me
nuts. I started refactoring the Length
module to remove it.
First, I introduced three newtypes to wrap Length
, and indicate
their direction at the same time:
pub struct LengthHorizontal(Length);
pub struct LengthVertical(Length);
pub struct LengthBoth(Length);
This was done with a macro because now each wrapper type
needed to know the relevant LengthDir
.
Now, for example, the declaration for the <circle>
element looked
like this:
pub struct NodeCircle {
cx: Cell<LengthHorizontal>,
cy: Cell<LengthVertical>,
r: Cell<LengthBoth>,
}
(Ignore the Cell
everywhere; we got rid of that later.)
Removing the dir
field
Since now the information about the length's direction is embodied in
the LengthHorizontal/LengthVertical/LengthBoth
types, this made it
possible to remove the dir
field from the inner
Length
struct.
pub struct RsvgLength {
length: f64,
unit: LengthUnit,
- dir: LengthDir
}
+pub struct LengthHorizontal(Length);
+pub struct LengthVertical(Length);
+pub struct LengthBoth(Length);
+
+define_length_type!(LengthHorizontal, LengthDir::Horizontal);
+define_length_type!(LengthVertical, LengthDir::Vertical);
+define_length_type!(LengthBoth, LengthDir::Both);
Note the use of a define_length_type!
macro to generate code for
those three newtypes.
Removing the Data
associated type
And finally, this made it possible to remove the Data
associated
type from the Parse
trait.
pub trait Parse: Sized {
- type Data;
type Err;
- fn parse(parser: &mut Parser<'_, '_>, data: Self::Data) -> Result<Self, Self::Err>;
+ fn parse(parser: &mut Parser<'_, '_>) -> Result<Self, Self::Err>;
}
The resulting mega-commit removed a bunch of stray parentheses ()
from all calls to parse()
, and the code ended up a lot easier to
read.
Removing the newtypes
This was fine for a while. Recently, however, I figured out that it would be possible to embody the information for a length's direction in a different way.
But to get there, I first needed a temporary refactor.
Replacing the macro with a trait with a default implementation
Deep in the guts of length.rs
, the key function that does something
different based on LengthDir
is its scaling_factor
method:
enum LengthDir {
Horizontal,
Vertical,
Both,
}
impl LengthDir {
fn scaling_factor(self, x: f64, y: f64) -> f64 {
match self {
LengthDir::Horizontal => x,
LengthDir::Vertical => y,
LengthDir::Both => viewport_percentage(x, y),
}
}
}
That method gets passed, for example, the width/height
of the
current viewport for the x/y
arguments. The method decides whether
to use the width, height, or a combination of both.
And of course, the interesting part of the define_length_type!
macro
was to generate code for calling LengthDir::Horizontal::scaling_factor()
/etc. as
appropriate depending on the LengthDir
in question.
First I made a trait called Orientation
with a scaling_factor
method, and three zero-sized types that implement that trait. Note
how each of these three implementations corresponds to one of the
match
arms above:
pub trait Orientation {
fn scaling_factor(x: f64, y: f64) -> f64;
}
pub struct Horizontal;
pub struct Vertical;
pub struct Both;
impl Orientation for Horizontal {
fn scaling_factor(x: f64, _y: f64) -> f64 {
x
}
}
impl Orientation for Vertical {
fn scaling_factor(_x: f64, y: f64) -> f64 {
y
}
}
impl Orientation for Both {
fn scaling_factor(x: f64, y: f64) -> f64 {
viewport_percentage(x, y)
}
}
Now most of the contents of the define_length_type!
macro can go in
the default implementation of a new trait
LengthTrait
. Crucially, this trait has an
Orientation
associated type, which it uses to call into the
Orientation trait:
pub trait LengthTrait: Sized {
type Orientation: Orientation;
...
fn normalize(&self, values: &ComputedValues, params: &ViewParams) -> f64 {
match self.unit() {
LengthUnit::Px => self.length(),
LengthUnit::Percent => {
self.length() *
<Self::Orientation>::scaling_factor(params.view_box_width, params.view_box_height)
}
...
}
}
Note that the incantation is
<Self::Orientation>::scaling_factor(...)
to call that method on the
associated type.
Now the define_length_type!
macro is shrunk a lot, with the
interesting part being just this:
macro_rules! define_length_type {
{$name:ident, $orient:ty} => {
pub struct $name(Length);
impl LengthTrait for $name {
type Orientation = $orient;
}
}
}
define_length_type! { LengthHorizontal, Horizontal }
define_length_type! { LengthVertical, Vertical }
define_length_type! { LengthBoth, Both }
We moved from having three newtypes of length-with-LengthDir to three newtypes with dir-as-associated-type.
Removing the newtypes and the macro
After that temporary refactoring, we had the Orientation
trait and
the three zero-sized types Horizontal
, Vertical
, Both
.
I figured out that one can use PhantomData
as a way to carry
around the type that Length
needs to normalize itself, instead of
using an associated type in an extra LengthTrait
. Behold!
pub struct Length<O: Orientation> {
pub length: f64,
pub unit: LengthUnit,
orientation: PhantomData<O>,
}
impl<O: Orientation> Length<O> {
pub fn normalize(&self, values: &ComputedValues, params: &ViewParams) -> f64 {
match self.unit {
LengthUnit::Px => self.length,
LengthUnit::Percent => {
self.length
* <O as Orientation>::scaling_factor(params.view_box_width, params.view_box_height)
}
...
}
}
}
Now the incantation is <O as Orientation>::scaling_factor()
to call
the method on the generic type; it is no longer an associated type in
a trait.
With that, users of lengths look like this; here, our <circle>
element from before:
pub struct Circle {
cx: Length<Horizontal>,
cy: Length<Vertical>,
r: Length<Both>,
}
I'm very happy with the readability of all the code now. I used to
think of PhantomData
as a way to deal with wrapping pointers from
C, but it turns out that it is also useful to keep a generic
type around should one need it.
The final Length
struct is this:
pub struct Length<O: Orientation> {
pub length: f64,
pub unit: LengthUnit,
orientation: PhantomData<O>,
}
And it only takes up as much space as its length
and unit
fields;
PhantomData
is zero-sized after all.
(Later, we renamed Orientation
to Normalize
, but the code's
structure remained the same.)
Summary
Over a couple of years, librsvg's type that represents CSS lengths went from a C representation along the lines of "all data in the world is an int", to a Rust representation that uses some interesting type trickery:
-
C struct with
char
for units. -
C struct with a
LengthUnits
enum. -
C struct without an embodied direction; each place that needs to normalize needs to get the orientation right.
-
C struct with a built-in direction as an extra field, done at initialization time.
-
Same struct but in Rust.
-
An ugly but workable
Parse
trait so that the direction can be set at parse/initialization time. -
Three newtypes
LengthHorizontal
,LengthVertical
,LengthBoth
with a common core. A cleaned-upParse
trait. A macro to generate those newtypes. -
Replace the
LengthDir
enum with anOrientation
trait, and three zero-sized typesHorizontal/Vertical/Both
that implement the trait. -
Replace most of the macro with a helper trait
LengthTrait
that has anOrientation
associated type. -
Replace the helper trait with a single
Length<T: Orientation>
type, which puts the orientation as a generic parameter. The macro disappears and there is a single implementation for everything.
Refactoring never ends!