The Magic of GObject Introspection

- Tags: gnome, gobject-introspection, rust

Before continuing with the glib-rs architecture, let's take a detour and look at GObject Introspection. Although it can seem like an obscure part of the GNOME platform, it is an absolutely vital part of it: it is what lets people write GNOME applications in any language.

Let's start with a bit of history.

Brief history of language bindings in GNOME

When we started GNOME in 1997, we didn't want to write all of it in C. We had some inspiration from elsewhere.

Prehistory: GIMP and the Procedural Database

There was already good precedent for software written in a combination of programming languages. Emacs, the flagship text editor of the GNU project, was written with a relatively small core in C, and the majority of the program in Emacs Lisp.

In similar fashion, we were very influenced by the design of the GIMP, which was very innovative at that time. The GIMP has a large core written in C. However, it supports plug-ins or scripts written in a variety of languages. Initially the only scripting language available for the GIMP was Scheme.

The GIMP's plug-ins and scripts run as separate processes, so they don't have immediate access to the data of the image being edited, or to the core functions of the program like "paint with a brush at this location". To let plug-ins and scripts access these data and these functions, the GIMP has what it calls a Procedural Database (PDB). This is a list of functions that the core program or plug-ins wish to export. For example, there are functions like gimp-scale-image and gimp-move-layer. Once these functions are registered in the PDB, any part of the program or plug-ins can call them. Scripts are often written to automate common tasks — for example, when one wants to adjust the contrast of photos and scale them in bulk. Scripts can call functions in the PDB easily, irrespective of the programming language they are written in.

We wanted to write GNOME's core libraries in C, and write a similar Procedural Database to allow those libraries to be called from any programming language. Eventually it turned out that a PDB was not necessary, and there were better ways to go about enabling different programming languages.

Enabling sane memory management

GTK+ started out with a very simple scheme for memory management: a container owned its child widgets, and so on recursively. When you freed a container, it would be responsible for freeing its children.

However, consider what happens when a widget needs to hold a reference to another widget that is not one of its children. For example, a GtkLabel with an underlined mnemonic ("_N_ame:") needs to have a reference to the GtkEntry that should be focused when you press Alt-N. In the very earliest versions of GTK+, how to do this was undefined: C programmers were already used to having shared pointers everywhere, and they were used to being responsible for managing their memory.

Of course, this was prone to bugs. If you have something like

typedef struct {
    GtkWidget parent;

    char *label_string;
    GtkWidget *widget_to_focus;
} GtkLabel;

then if you are writing the destructor, you may simply want to

static void
gtk_label_free (GtkLabel *label)
{
    g_free (label->label_string);
    gtk_widget_free (label->widget_to_focus);   /* oops, we don't own this */

    free_parent_instance (&label->parent);
}

Say you have a GtkBox with the label and its associated GtkEntry. Then, freeing the GtkBox would recursively free the label with that gtk_label_free(), and then the entry with its own function. But by the time the entry gets freed, the gtk_widget_free() call inside gtk_label_free() has already freed the entry, and we get a double-free bug!

Madness!

That is, we had no idea what we were doing. Or rather, our understanding of widgets had not evolved to the point of acknowledging that a widget tree is not simply a tree, but rather a directed graph of container-child relationships, plus random-widget-to-random-widget relationships. And of course, other parts of the program which are not even widget implementations may need to keep references to widgets and free them or not as appropriate.

I think Marius Vollmer was the first person to start formalizing this. He came from the world of GNU Guile, a Scheme interpreter, and so he already knew how garbage collection and seas of shared references ought to work.

Marius implemented reference-counting for GTK+ — that's where gtk_object_ref() and gtk_object_unref() come from; they eventually got moved to the base GObject class, so we now have g_object_ref() and g_object_unref() and a host of functions to have weak references, notification of destruction, and all the things required to keep garbage collectors happy.
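To make the idea concrete, here is a toy model of reference counting, written in Python rather than C. It is only a sketch: the class and method names are made up for illustration, it ignores details such as floating references, and it is not how GTK+ or GObject are actually implemented. The point is that the label takes its own reference to the entry and releases only that reference, so the entry is freed exactly once.

```python
class Widget:
    """Toy stand-in for a reference-counted object (not real GTK+ code)."""

    def __init__(self, name):
        self.name = name
        self.ref_count = 1        # the creator holds the first reference
        self.finalized = False

    def ref(self):
        self.ref_count += 1
        return self

    def unref(self):
        assert self.ref_count > 0, f"{self.name}: no references left to drop"
        self.ref_count -= 1
        if self.ref_count == 0:
            self.finalize()

    def finalize(self):
        assert not self.finalized, f"{self.name} freed twice"
        self.finalized = True


class Label(Widget):
    def __init__(self, name, widget_to_focus):
        super().__init__(name)
        # Take our own reference instead of assuming we own the widget.
        self.widget_to_focus = widget_to_focus.ref()

    def finalize(self):
        # Release only the reference we took; others may still hold theirs.
        self.widget_to_focus.unref()
        super().finalize()


class Box(Widget):
    def __init__(self, name):
        super().__init__(name)
        self.children = []

    def add(self, child):
        self.children.append(child.ref())

    def finalize(self):
        for child in self.children:
            child.unref()
        super().finalize()


entry = Widget("entry")
label = Label("label", entry)
box = Box("box")
box.add(label)
box.add(entry)

# The creator drops its initial references; the box and label keep things alive.
entry.unref()
label.unref()

# Frees the label (which releases its entry reference), then the entry.
box.unref()
```

With this scheme, the order in which the box releases its children no longer matters.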

The first language bindings

The very first language bindings were written by hand. The GTK+ API was small, and it seemed feasible to take

void gtk_widget_show (GtkWidget *widget);
void gtk_widget_hide (GtkWidget *widget);

void gtk_container_add (GtkContainer *container, GtkWidget *child);
void gtk_container_remove (GtkContainer *container, GtkWidget *child);

and just wrap those functions in various languages, by hand, on an as-needed basis.

Of course, there is a lot of duplication when doing things that way. As the C API grows, one needs to do more and more manual work to keep up with it.

Also, C structs with public fields are problematic. If we had

typedef struct {
    guchar r;
    guchar g;
    guchar b;
} GdkColor;

and we expect program code to fill in a GdkColor by hand and pass it to a drawing function like

void gdk_set_foreground_color (GdkDrawingContext *gc, GdkColor *color);

then it is no problem to do that in C:

GdkColor magenta = { 255, 0, 255 };

gdk_set_foreground_color (gc, &magenta);

But to do that in a high-level language? You don't have access to C struct fields! And back then, libffi wasn't generally available.

Authors of language bindings had to write some glue code, in C, by hand, to let people access a C struct and then pass it on to GTK+. For example, for Python, they would need to write something like

PyObject *
make_wrapped_gdk_color (PyObject *args, PyObject *kwargs)
{
    GdkColor *g_color;
    PyObject *py_color;

    g_color = g_new (GdkColor, 1);
    /* ... fill in g_color->r, g, b from the Python args */

    py_color = wrap_g_color (g_color);
    return py_color;
}

Writing that by hand is an incredible amount of drudgery.

What language bindings needed was a description of the API in a machine-readable format, so that the glue code could be written by a code generator.

The first API descriptions

I don't remember if it was the GNU Guile people, or the PyGTK people, who started to write descriptions of the GNOME API by hand. For ease of parsing, it was done in a Scheme-like dialect. A description may look like

(class GtkWidget
       ;;; void gtk_widget_show (GtkWidget *widget);
       (method show
               (args nil)
               (retval nil))

       ;;; void gtk_widget_hide (GtkWidget *widget);
       (method hide
               (args nil)
               (retval nil)))

(class GtkContainer
       ;;; void gtk_container_add (GtkContainer *container, GtkWidget *child);
       (method add
               (args GtkWidget)
               (retval nil)))

(struct GdkColor
        (field r (type 'guchar))
        (field g (type 'guchar))
        (field b (type 'guchar))) 

Again, writing those descriptions by hand (and keeping up with the C API) was a lot of work, but the glue code to implement the binding could be done mostly automatically. The generated code may need subsequent tweaks by hand to deal with details that the Scheme-like descriptions didn't contemplate, but it was better than writing everything by hand.
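As a rough illustration of why a machine-readable description helps, here is a small Python sketch that goes from a description back to the C prototype it stands for, which is the first thing a glue-code generator needs to know. The DESCRIPTIONS list is a hypothetical in-memory form of the Scheme-like files, and the name conversion is deliberately naive; a real generator had to handle return values, special cases in naming, and much more.

```python
import re

# Hypothetical in-memory form of the hand-written descriptions:
# (class name, method name, types of the extra arguments).
DESCRIPTIONS = [
    ("GtkWidget", "show", []),
    ("GtkWidget", "hide", []),
    ("GtkContainer", "add", ["GtkWidget"]),
]


def c_prefix(class_name):
    """Naive CamelCase to snake_case conversion: GtkWidget -> gtk_widget."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


def c_prototype(class_name, method, arg_types):
    """Rebuild the C prototype for a method that returns nothing."""
    params = [f"{class_name} *self"]
    params += [f"{t} *arg{i}" for i, t in enumerate(arg_types)]
    return f"void {c_prefix(class_name)}_{method} ({', '.join(params)});"


prototypes = [c_prototype(*description) for description in DESCRIPTIONS]
for line in prototypes:
    print(line)
```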

GLib gets a real type system

Tim Janik took over the parts of GLib that implement objects/signals/types, and added a lot of things to create a good type system for C. This is where things like GType, GValue, GParamSpec, and fundamental types come from.

For example, a GType is an identifier for a type, and a GValue is a type plus, well, a value of that type. You can ask a GValue, "are you an int? are you a GObject?".

You can register new types: for example, there would be code in Gdk that registers a new GType for GdkColor, so you can ask a value, "are you a color?".

Registering a type involves telling the GObject system things like how to copy values of that type, and how to free them. For GdkColor this may be just g_new() / g_free(); for reference-counted objects it may be g_object_ref() / g_object_unref().

Objects can be queried about some of their properties

A widget can tell you when you press a mouse button on it: it will emit the button-press-event signal. When GtkWidget's implementation registers this signal, it calls something like

    g_signal_new ("button-press-event",
        gtk_widget_get_type(), /* type of object for which this signal is being created */
        ...
        G_TYPE_BOOLEAN,  /* type of return value */
        1,               /* number of arguments */
        GDK_TYPE_EVENT); /* type of first and only argument */

This tells GObject that GtkWidget will have a signal called button-press-event, with a return type of G_TYPE_BOOLEAN, and with a single argument of type GDK_TYPE_EVENT. This lets GObject do the appropriate marshalling of arguments when the signal is emitted.

But also! You can query the signal for its argument types! You can run g_signal_query(), which will then tell you all the details of the signal: its name, return type, argument types, etc. A language binding could run g_signal_query() and automatically generate a description of the signal in the Scheme-like description language. And then generate the binding from that.

Not all of an object's properties can be queried

Unfortunately, although GObject signals and properties can be queried, methods can't be. C doesn't have classes with methods, and GObject does not really have any provisions to implement them.

Conventionally, for a static method one would just do

void
gtk_widget_set_flags (GtkWidget *widget, GtkWidgetFlags flags)
{
    /* modify a struct field within "widget" or whatever */
    /* repaint or something */
}

And for a virtual method one would put a function pointer in the class structure, and provide a convenient way to call it:

typedef struct {
    GtkObjectClass parent_class;

    void (* draw) (GtkWidget *widget, cairo_t *cr);
} GtkWidgetClass;

void
gtk_widget_draw (GtkWidget *widget, cairo_t *cr)
{
    GtkWidgetClass *klass = find_widget_class (widget);

    (* klass->draw) (widget, cr);
}

And GObject has no idea about this method — there is no way to query it; it just exists in C-space.

Now, historically, GTK+'s header files have been written in a very consistent style. It is quite possible to write a tool that will take a header file like

/* gtkwidget.h */
typedef struct {
    GtkObjectClass parent_class;

    void (* draw) (GtkWidget *widget, cairo_t *cr);
} GtkWidgetClass;

void gtk_widget_set_flags (GtkWidget *widget, GtkWidgetFlags flags);
void gtk_widget_draw (GtkWidget *widget, cairo_t *cr);

and parse it, even with a simple parser that does not completely understand the C language, using heuristics like

  • Is there a class_name_foo() function prototype with no corresponding foo field in the Class structure? It's probably a static method.

  • Is there a class_name_bar() function with a bar field in the Class structure? It's probably a virtual method.

  • Etc.

And in fact, that's what we had. C header files would get parsed with those heuristics, and the Scheme-like description files would get generated.
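Here is a minimal Python sketch of those two heuristics, using regular expressions instead of a real C parser. It is not the tool that was actually used; the function name classify_methods and the exact patterns are assumptions made for illustration, and they only cope with headers as regular as the one above.

```python
import re

HEADER = """
typedef struct {
    GtkObjectClass parent_class;

    void (* draw) (GtkWidget *widget, cairo_t *cr);
} GtkWidgetClass;

void gtk_widget_set_flags (GtkWidget *widget, GtkWidgetFlags flags);
void gtk_widget_draw (GtkWidget *widget, cairo_t *cr);
"""


def classify_methods(header, prefix):
    """Guess which prefix_foo() prototypes are static or virtual methods."""
    # Function-pointer fields such as "void (* draw) (...)" in the class struct.
    vfuncs = set(re.findall(r"\(\s*\*\s*(\w+)\s*\)\s*\(", header))
    # Prototypes such as "void gtk_widget_draw (...);" that start a line.
    protos = re.findall(
        rf"^\w[\w\s\*]*?\b{prefix}_(\w+)\s*\(", header, re.MULTILINE
    )
    return {name: ("virtual" if name in vfuncs else "static") for name in protos}


methods = classify_methods(HEADER, "gtk_widget")
print(methods)
```

A heuristic like this works only because the headers are so consistent; anything unusual in the C code would need a special case.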

Scheme-like descriptions get reused, kind of

Language binding authors started reusing the Scheme-like descriptions. Sometimes they would cannibalize the descriptions from PyGTK, or Guile (again, I don't remember where the canonical version was maintained) and use them as they were.

Other times they would copy the files, modify them by hand some more, and then use them to generate their language binding.

C being hostile

From just reading/parsing a C function prototype, you cannot know certain things. If one function argument is of type Foo *, does it mean:

  • the function gets a pointer to something which it should not modify ("in" parameter)

  • the function gets a pointer to uninitialized data which it will set ("out" parameter)

  • the function gets a pointer to initialized data which it will use and modify ("inout" parameter)

  • the function will copy that pointer and hold a reference to the pointed data, and not free it when it's done

  • the function will take over the ownership of the pointed data, and free it when it's done

  • etc.

Sometimes people would include these annotations in the Scheme-like description language. But wouldn't it be better if those annotations came from the C code itself?

GObject Introspection appears

For GNOME 3, we wanted a unified solution for language bindings:

  • Have a single way to extract the machine-readable descriptions of the C API.

  • Have every language binding be automatically generated from those descriptions.

  • In the descriptions, have all the information necessary to generate a correct language binding...

  • ... including documentation.

We had to do a lot of work to accomplish this. For example:

  • Remove C-isms from the public API. Varargs functions, those that have foo (int x, ...), can't be easily described and called from other languages. Instead, have something like foov (int x, int num_args, GValue *args_array) that can be easily consumed by other languages.

  • Add annotations throughout the code so that the ad-hoc C parser can know about in/out/inout arguments, and whether pointer arguments are borrowed references or a full transfer of ownership.

  • Take the in-line documentation comments and store them as part of the machine-readable description of the API.

  • When compiling a library, automatically do all the things like g_signal_query() and spit out machine-readable descriptions of those parts of the API.

So, GObject Introspection is all of those things.

Annotations

If you have looked at the C code for a GNOME library, you may have seen something like this:

/**
 * gtk_widget_get_parent:
 * @widget: a #GtkWidget
 *
 * Returns the parent container of @widget.
 *
 * Returns: (transfer none) (nullable): the parent container of @widget, or %NULL
 **/
GtkWidget *
gtk_widget_get_parent (GtkWidget *widget)
{
    ...
}

See that "(transfer none) (nullable)" in the documentation comments? The (transfer none) means that the return value is a pointer whose ownership does not get transferred to the caller, i.e. the widget retains ownership. The (nullable) indicates that the function can return NULL, which it does when the widget has no parent.

A language binding will then use this information as follows:

  • It will not unref() the parent widget when it is done with it.

  • It will deal with a NULL pointer in a special way, instead of assuming that references are not null.

Every now and then someone discovers a public function which is lacking an annotation of that sort — for GNOME's purposes this is a bug; fortunately, it is easy to add that annotation to the C sources and regenerate the machine-readable descriptions.
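To give a feel for how little machinery is needed to pick up those annotations, here is a small Python sketch that pulls them out of the "Returns:" line of a documentation comment. This is not the real GObject Introspection scanner, which is far more thorough; the function parse_return_annotations and its regular expressions are assumptions made for this example.

```python
import re

DOC_COMMENT = """
/**
 * gtk_widget_get_parent:
 * @widget: a #GtkWidget
 *
 * Returns the parent container of @widget.
 *
 * Returns: (transfer none) (nullable): the parent container of @widget, or %NULL
 **/
"""


def parse_return_annotations(comment):
    """Extract the parenthesized annotations on the "Returns:" line."""
    match = re.search(
        r"^\s*\*\s*Returns:\s*((?:\([^)]*\)\s*)+):", comment, re.MULTILINE
    )
    if match is None:
        return {}
    annotations = {}
    for item in re.findall(r"\(([^)]*)\)", match.group(1)):
        key, _, value = item.partition(" ")
        annotations[key] = value or True   # flags like (nullable) have no value
    return annotations


annotations = parse_return_annotations(DOC_COMMENT)

# What a binding would conclude from the annotations:
caller_owns_result = annotations.get("transfer") == "full"
result_may_be_null = bool(annotations.get("nullable"))
```

Here caller_owns_result tells the binding whether it must unref() the returned object, and result_may_be_null tells it whether to map the result to an optional type.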

Machine-readable descriptions, or repository files

So, what do those machine-readable descriptions actually look like? They moved away from a Scheme-like language and got turned into XML, because early XXIst century.

The machine-readable descriptions are called GObject Introspection Repository files, or GIR for short.

Let's look at some parts of Gtk-3.0.gir, which your distro may put in /usr/share/gir-1.0/Gtk-3.0.gir.

<repository version="1.2" ...>

  <namespace name="Gtk"
             version="3.0"
             shared-library="libgtk-3.so.0,libgdk-3.so.0"
             c:identifier-prefixes="Gtk"
             c:symbol-prefixes="gtk">

This is the toplevel "Gtk" namespace, along with the names of the shared libraries (the .so files) that implement it. All identifiers have "Gtk" or "gtk" prefixes.

A class with methods and a signal

Let's look at the description for GtkEntry...

    <class name="Entry"
           c:symbol-prefix="entry"
           c:type="GtkEntry"
           parent="Widget"
           glib:type-name="GtkEntry"
           glib:get-type="gtk_entry_get_type"
           glib:type-struct="EntryClass">

      <doc xml:space="preserve">The #GtkEntry widget is a single line text entry
widget. A fairly large set of key bindings are supported
by default. If the entered text is longer than the allocation
...
       </doc>

This is the start of the description for GtkEntry. We already know that everything is prefixed with "Gtk", so the name is just given as "Entry". Its parent class is Widget and the function which registers it against the GObject type system is gtk_entry_get_type.

Also, there are the toplevel documentation comments for the Entry class.

Onwards!

      <implements name="Atk.ImplementorIface"/>
      <implements name="Buildable"/>
      <implements name="CellEditable"/>
      <implements name="Editable"/>

GObject classes can implement various interfaces; this is the list that GtkEntry supports.

Next, let's look at a single method:

      <method name="get_text" c:identifier="gtk_entry_get_text">
        <doc xml:space="preserve">Retrieves the contents of the entry widget. ... </doc>

        <return-value transfer-ownership="none">
          <type name="utf8" c:type="const gchar*"/>
        </return-value>

        <parameters>
          <instance-parameter name="entry" transfer-ownership="none">
            <type name="Entry" c:type="GtkEntry*"/>
          </instance-parameter>
        </parameters>
      </method>

This describes the method get_text and its corresponding C symbol, gtk_entry_get_text. Its return value is a UTF-8 encoded string, and ownership of the memory for that string is not transferred to the caller.

The method takes a single parameter which is the entry instance itself.

Now, let's look at a signal:

      <glib:signal name="activate" when="last" action="1">
        <doc xml:space="preserve">The ::activate signal is emitted when the user hits
the Enter key. ...</doc>

        <return-value transfer-ownership="none">
          <type name="none" c:type="void"/>
        </return-value>
      </glib:signal>

    </class>

The "activate" signal takes no arguments, and has a return value of type void, i.e. no return value.
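Since GIR files are plain XML, any XML parser can read them. The following Python sketch uses the standard library's ElementTree on a cut-down version of the GtkEntry description to list each method's C symbol, return type, and ownership transfer, plus the signal names. The embedded XML is abbreviated by hand, and describe_class is a hypothetical helper for this example, not part of any binding.

```python
import xml.etree.ElementTree as ET

NS = {
    "core": "http://www.gtk.org/introspection/core/1.0",
    "c": "http://www.gtk.org/introspection/c/1.0",
    "glib": "http://www.gtk.org/introspection/glib/1.0",
}

# Hand-abbreviated excerpt in the shape of Gtk-3.0.gir.
GIR = """
<repository version="1.2"
            xmlns="http://www.gtk.org/introspection/core/1.0"
            xmlns:c="http://www.gtk.org/introspection/c/1.0"
            xmlns:glib="http://www.gtk.org/introspection/glib/1.0">
  <namespace name="Gtk" version="3.0">
    <class name="Entry" c:type="GtkEntry" parent="Widget">
      <method name="get_text" c:identifier="gtk_entry_get_text">
        <return-value transfer-ownership="none">
          <type name="utf8" c:type="const gchar*"/>
        </return-value>
      </method>
      <glib:signal name="activate">
        <return-value transfer-ownership="none">
          <type name="none" c:type="void"/>
        </return-value>
      </glib:signal>
    </class>
  </namespace>
</repository>
"""

# Namespaced attributes are addressed as "{namespace-uri}name".
C_IDENTIFIER = "{%s}identifier" % NS["c"]


def describe_class(root, class_name):
    """Collect the methods and signal names declared for one class."""
    cls = root.find(f"core:namespace/core:class[@name='{class_name}']", NS)
    methods = {}
    for method in cls.findall("core:method", NS):
        ret = method.find("core:return-value", NS)
        methods[method.get("name")] = {
            "c_symbol": method.get(C_IDENTIFIER),
            "returns": ret.find("core:type", NS).get("name"),
            "transfer": ret.get("transfer-ownership"),
        }
    signals = [s.get("name") for s in cls.findall("glib:signal", NS)]
    return methods, signals


root = ET.fromstring(GIR.strip())
methods, signals = describe_class(root, "Entry")
print(methods)
print(signals)
```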

A struct with public fields

The following comes from Gdk-3.0.gir; it's the description for GdkRectangle.

    <record name="Rectangle"
            c:type="GdkRectangle"
            glib:type-name="GdkRectangle"
            glib:get-type="gdk_rectangle_get_type"
            c:symbol-prefix="rectangle">

      <field name="x" writable="1">
        <type name="gint" c:type="int"/>
      </field>
      <field name="y" writable="1">
        <type name="gint" c:type="int"/>
      </field>
      <field name="width" writable="1">
        <type name="gint" c:type="int"/>
      </field>
      <field name="height" writable="1">
        <type name="gint" c:type="int"/>
      </field>

    </record>

So that's the x/y/width/height fields in the struct, in the same order as they are defined in the C code.

And so on. The idea is for the whole API exported by a GObject library to be describable by that format. If something can't be described, it's a bug in the library, or a bug in the format.
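Remember the old problem of high-level languages not being able to reach C struct fields? With a record description like the one above, a binding has enough information to lay out the struct itself. Here is a sketch in Python that builds a ctypes structure from a trimmed copy of the GdkRectangle record. The TYPE_MAP table and the struct_from_record helper are assumptions for this example; real bindings such as pygobject go through libgirepository rather than doing this.

```python
import ctypes
import xml.etree.ElementTree as ET

CORE = "{http://www.gtk.org/introspection/core/1.0}"

# Trimmed copy of the GdkRectangle record, with the core namespace declared.
RECORD = """
<record name="Rectangle" xmlns="http://www.gtk.org/introspection/core/1.0">
  <field name="x" writable="1"><type name="gint"/></field>
  <field name="y" writable="1"><type name="gint"/></field>
  <field name="width" writable="1"><type name="gint"/></field>
  <field name="height" writable="1"><type name="gint"/></field>
</record>
"""

# Assumed mapping from GIR type names to ctypes; only what this example needs.
TYPE_MAP = {"gint": ctypes.c_int, "guchar": ctypes.c_ubyte}


def struct_from_record(xml_text):
    """Build a ctypes.Structure subclass with the record's fields, in order."""
    record = ET.fromstring(xml_text.strip())
    fields = [
        (field.get("name"), TYPE_MAP[field.find(CORE + "type").get("name")])
        for field in record.findall(CORE + "field")
    ]
    return type(record.get("name"), (ctypes.Structure,), {"_fields_": fields})


Rectangle = struct_from_record(RECORD)
rect = Rectangle(x=10, y=20, width=640, height=480)
print(rect.width, rect.height)
```

Because the fields are listed in the same order as in the C code, the resulting structure has the same memory layout that a C function taking a GdkRectangle pointer would expect.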

Making language bindings start up quickly: typelib files

As we saw, the GIR files are the XML descriptions of GObject APIs. Dynamic languages like Python would prefer to generate the language binding on the fly, as needed, instead of pre-generating a huge binding.

However, GTK+ is a big API: Gtk-3.0.gir is 7 MB of XML. Parsing all of that just to be able to generate gtk_widget_show() on the fly would be too slow. Also, there are GTK+'s dependencies: Atk, Gdk, Cairo, etc. You don't want to parse everything just to start up!

So, we have an extra step that compiles the GIR files down to binary .typelib files. For example, /usr/lib64/girepository-1.0/Gtk-3.0.typelib is about 600 KB on my machine. Those files get mmap()ed for fast access, and can be shared between processes.

How dynamic language bindings use typelib files

GObject Introspection comes with a library that language binding implementors can use to consume those .typelib files. The libgirepository library has functions like "list all the classes available in this namespace", or "call this function with these values for arguments, and give me back the return value here".

Internally, libgirepository uses libffi to actually call the C functions in the dynamically-linked libraries.

So, when you write foo.py and do

import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk
win = Gtk.Window()

what happens is that pygobject calls libgirepository to mmap() the .typelib, and sees that the constructor for Gtk.Window is a C function called gtk_window_new(). After seeing how that function wants to be called, it calls the function using libffi, wraps the result with a PyObject, and that's what you get on the Python side.
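The "look up a description, then call through libffi" step can be imitated in a few lines with Python's ctypes module, which is itself built on libffi. In this sketch the DESCRIPTION dictionary is a hand-written stand-in for what a typelib would provide, and the C library's abs() stands in for a GTK+ function so that the example runs without GTK+ installed. It assumes a POSIX system where the C library can be loaded this way, and it is not how pygobject is implemented.

```python
import ctypes
import ctypes.util

# Hypothetical stand-in for the information a .typelib would give us.
DESCRIPTION = {
    "symbol": "abs",
    "args": [ctypes.c_int],
    "returns": ctypes.c_int,
}


def call_described(library, description, *values):
    """Look up a C symbol and call it according to its description."""
    func = getattr(library, description["symbol"])
    func.argtypes = description["args"]
    func.restype = description["returns"]
    return func(*values)


# find_library() may return None, in which case CDLL() falls back to the
# symbols already loaded into the running process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))
result = call_described(libc, DESCRIPTION, -42)
print(result)
```

The binding never needed a header file or a compiler: the description alone told it how to marshal the argument and the return value.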

Static languages

A static language like Rust prefers to have the whole language binding pre-generated. This is what the various crates in gtk-rs do.

The gir crate takes a .gir file (i.e. the XML descriptions) and does two things:

  • Reconstructs the C function prototypes and C struct declarations, but in a way Rust can understand them. This gets output to the sys crate.

  • Creates idiomatic Rust code for the language binding. This gets output to the various crates; for example, the gtk one.

When reconstructing the C structs and prototypes, we get stuff like

#[repr(C)]
pub struct GtkWidget {
    pub parent_instance: gobject::GInitiallyUnowned,
    pub priv_: *mut GtkWidgetPrivate,
}

extern "C" {
    pub fn gtk_entry_new() -> *mut GtkWidget;
}
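To show the flavor of that first step, here is a Python sketch that turns a GIR record into the text of a #[repr(C)] Rust struct. The real gir tool is written in Rust and handles far more cases; the RUST_TYPES table and the rust_struct function here are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

CORE = "{http://www.gtk.org/introspection/core/1.0}"
C_TYPE = "{http://www.gtk.org/introspection/c/1.0}type"

# Assumed mapping from GIR type names to Rust FFI types.
RUST_TYPES = {"gint": "c_int", "guchar": "u8"}

# Trimmed copy of the GdkRectangle record from Gdk-3.0.gir.
RECORD = """
<record name="Rectangle" c:type="GdkRectangle"
        xmlns="http://www.gtk.org/introspection/core/1.0"
        xmlns:c="http://www.gtk.org/introspection/c/1.0">
  <field name="x" writable="1"><type name="gint" c:type="int"/></field>
  <field name="y" writable="1"><type name="gint" c:type="int"/></field>
  <field name="width" writable="1"><type name="gint" c:type="int"/></field>
  <field name="height" writable="1"><type name="gint" c:type="int"/></field>
</record>
"""


def rust_struct(xml_text):
    """Emit Rust source for a C-compatible struct described by a GIR record."""
    record = ET.fromstring(xml_text.strip())
    lines = ["#[repr(C)]", f"pub struct {record.get(C_TYPE)} {{"]
    for field in record.findall(CORE + "field"):
        gir_type = field.find(CORE + "type").get("name")
        lines.append(f"    pub {field.get('name')}: {RUST_TYPES[gir_type]},")
    lines.append("}")
    return "\n".join(lines)


rust_source = rust_struct(RECORD)
print(rust_source)
```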

And the idiomatic bindings? Stay tuned!