These articles are written by Codalogic empowerees as a way of sharing knowledge with the programming community. They do not necessarily reflect the opinions of Codalogic.
QEMU is an emulator that allows programs written for one microprocessor to be emulated and run on a host with a different microprocessor.
Central to the working of QEMU is the QEMU Object Model or QOM. Every device is represented in the QOM in an object-oriented way.
Because C doesn't naturally represent object-oriented systems, there are a lot of moving parts in the QEMU C implementation to make this work. As a result it can get a bit confusing.
Mapping the QOM to a regular C++ object model may help.
In C++ we may have an object model structure like the following (I've used struct instead of class because C doesn't have the concept of public and private):
struct Base
{
static int type;
int id;
Base() { /* construct me */ }
};
struct Derived : Base
{
static int v;
int x;
Derived() { /* construct me */ }
virtual void do_operation_1() {}
void do_operation_2() {}
};
To represent this in C, QOM uses two parallel class hierarchies that split the features.
The first has the Object struct (See here on Gitlab) as the base of all its derivatives. Types will derive from this struct. There might be multiple instances of a given type derived from the Object struct.
struct Object
{
ObjectClass *class;
...
};
The second has the ObjectClass struct (See here on Gitlab) as the base of all its derivatives. Types will also derive from this, but there will only ever be one instance of each struct derived from it.
struct ObjectClass
{
Type type;
...
};
Objects deriving from Object will contain per-instance information and objects deriving from ObjectClass will have per-class information.
Notice that the Object struct has a pointer to its corresponding ObjectClass struct.
For example, the SysBusDevice parallel hierarchies are below.
The structs deriving from ObjectClass are DeviceClass (On Gitlab here) and then SysBusDeviceClass (On Gitlab here), which have the form:
struct DeviceClass {
ObjectClass parent_class;
...
};
struct SysBusDeviceClass {
DeviceClass parent_class;
...
};
And the structs deriving from Object are DeviceState (On Gitlab here) and then SysBusDevice (On Gitlab here), which have the form:
struct DeviceState {
Object parent_obj;
...
};
struct SysBusDevice {
DeviceState parent_obj;
...
};
Referring to our C++ class above, per-class information equivalent to Base::type and Derived::v would end up in the hierarchy derived from ObjectClass. Per-instance data equivalent to Base::id and Derived::x would be in the Object struct hierarchy.
Additionally the methods Derived::do_operation_1() and Derived::do_operation_2() would be represented as pointers to functions in the ObjectClass struct hierarchy, because only one set of pointers is required per type irrespective of how many instances of the type there are.
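As a rough illustration of that split, here is a minimal sketch of how the Base/Derived example might look in plain C. All the names (BaseClass, DerivedObj, derived_init and so on) are made up for this example and are not QEMU's; the point is only that the per-class data and the function pointer live in one shared struct while each instance carries its own data plus a pointer to that struct.

```c
#include <assert.h>

/* Per-class data: one copy per type (hypothetical names, not QEMU's). */
typedef struct BaseClass {
    int type;                            /* plays the role of Base::type */
} BaseClass;

typedef struct DerivedClass {
    BaseClass parent_class;              /* first member, so pointers can be cast */
    int v;                               /* plays the role of Derived::v */
    void (*do_operation_1)(void *self);  /* slot for the "virtual" method */
} DerivedClass;

/* Per-instance data: one copy per object. */
typedef struct BaseObj {
    BaseClass *klass;                    /* link to the shared class struct */
    int id;                              /* plays the role of Base::id */
} BaseObj;

typedef struct DerivedObj {
    BaseObj parent_obj;                  /* first member, so pointers can be cast */
    int x;                               /* plays the role of Derived::x */
} DerivedObj;

static void derived_do_operation_1(void *self)
{
    DerivedObj *d = self;
    d->x += 1;
}

/* The single class struct shared by every DerivedObj. */
static DerivedClass derived_class = {
    .parent_class = { .type = 2 },
    .v = 10,
    .do_operation_1 = derived_do_operation_1,
};

/* Stand-in for a constructor. */
static void derived_init(DerivedObj *d, int id)
{
    d->parent_obj.klass = &derived_class.parent_class;
    d->parent_obj.id = id;
    d->x = 0;
}

/* Call the method through the class struct rather than the instance. */
static void call_operation_1(DerivedObj *d)
{
    DerivedClass *dc = (DerivedClass *)d->parent_obj.klass;
    assert(dc->do_operation_1);
    dc->do_operation_1(d);
}
```

Two DerivedObj instances initialised with derived_init() end up with the same klass pointer but independent id and x values.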
A big difference between C and C++ is that in C++, if we have a pointer to a derived type, say d, then to access an entity in the base class we can simply do d->id. This doesn't work in C. If we have a pointer to a SysBusDeviceClass struct called s and we want to access the type variable in ObjectClass, we would have to do s->parent_class.parent_class.type. This isn't particularly appealing, especially as the naming of the base struct member isn't always consistent.
Instead, if a function (which is typically called with a pointer to the base Object or ObjectClass) needs to access data in multiple structs within the hierarchy, it will create multiple pointers and cast each to be a pointer to the respective type.
So if a function wants to manipulate data in both DeviceClass and SysBusDeviceClass it will create two pointers like DeviceClass *dc; and SysBusDeviceClass *sbdc;. Both pointers will have the same value but they will have different types associated with them and so they will be able to access the relevant members of the different structs.
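The following is a small sketch of that idea using cut-down stand-ins for the real structs. The Mini* names and the desc and irq_count fields are invented for the example; only the nesting pattern (parent struct as the first member) mirrors what the article describes.

```c
#include <assert.h>

/* Cut-down stand-ins for the class hierarchy (fields are made up). */
typedef struct MiniObjectClass {
    int type;
} MiniObjectClass;

typedef struct MiniDeviceClass {
    MiniObjectClass parent_class;   /* base struct embedded as first member */
    const char *desc;
} MiniDeviceClass;

typedef struct MiniSysBusDeviceClass {
    MiniDeviceClass parent_class;   /* base struct embedded as first member */
    int irq_count;
} MiniSysBusDeviceClass;

/* Receives the base pointer and views it at two levels of the hierarchy. */
static int describe(MiniObjectClass *klass)
{
    MiniDeviceClass *dc = (MiniDeviceClass *)klass;
    MiniSysBusDeviceClass *sbdc = (MiniSysBusDeviceClass *)klass;

    /* Same address, different static types. */
    assert((void *)dc == (void *)sbdc);

    dc->desc = "demo";              /* member of the middle struct */
    sbdc->irq_count = 4;            /* member of the most-derived struct */
    return klass->type + sbdc->irq_count;
}
```

Note that these unchecked casts are only valid because the caller really passes a pointer into a MiniSysBusDeviceClass; nothing in the code verifies that.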
However, casting pointers from ObjectClass directly to DeviceClass and SysBusDeviceClass is obviously dangerous and error-prone.
Instead, when creating the parallel struct hierarchy, helper macros are created that can cast from, say, Object to SysBusDevice and from ObjectClass to SysBusDeviceClass. For SysBusDevice the macros you would create would be SYS_BUS_DEVICE, SYS_BUS_DEVICE_GET_CLASS and SYS_BUS_DEVICE_CLASS.
Hence in a function you could see code like:
void instance_handler(Object *obj)
{
DeviceState *ds = DEVICE(obj); // Derive instance from base object
SysBusDevice *sbd = SYS_BUS_DEVICE(obj); // Derive instance from base object
SysBusDeviceClass *sbdc = SYS_BUS_DEVICE_GET_CLASS(obj); // Class from base object
}
void class_handler(ObjectClass *klass)
{
DeviceClass *dc = DEVICE_CLASS(klass);
SysBusDeviceClass *sbdc = SYS_BUS_DEVICE_CLASS(klass);
}
Fortunately, QEMU provides macros to make creating these helper macros easier.
To create the set of macros for DeviceState and DeviceClass (See here) you would do:
#define TYPE_DEVICE "device"
OBJECT_DECLARE_TYPE(DeviceState, DeviceClass, DEVICE)
And for the SysBusDevice and SysBusDeviceClass set (See here) you would do:
#define TYPE_SYS_BUS_DEVICE "sys-bus-device"
OBJECT_DECLARE_TYPE(SysBusDevice, SysBusDeviceClass, SYS_BUS_DEVICE)
The OBJECT_DECLARE_TYPE macro (See here) is defined as follows:
#define OBJECT_DECLARE_TYPE(InstanceType, ClassType, MODULE_OBJ_NAME) \
typedef struct InstanceType InstanceType; \
typedef struct ClassType ClassType; \
\
G_DEFINE_AUTOPTR_CLEANUP_FUNC(InstanceType, object_unref) \
\
DECLARE_OBJ_CHECKERS(InstanceType, ClassType, \
MODULE_OBJ_NAME, TYPE_##MODULE_OBJ_NAME)
QEMU has layers and layers of macros. If I re-write OBJECT_DECLARE_TYPE to expand the child macros and simplify the code a bit, the result is as follows:
#define OBJECT_DECLARE_TYPE(InstanceType, ClassType, MODULE_OBJ_NAME) \
typedef struct InstanceType InstanceType; \
typedef struct ClassType ClassType; \
\
G_DEFINE_AUTOPTR_CLEANUP_FUNC(InstanceType, object_unref) \
\
InstanceType * MODULE_OBJ_NAME(const void *obj) \
{ return (InstanceType *)object_dynamic_cast_assert((Object *)(obj), TYPE_##MODULE_OBJ_NAME, __FILE__, __LINE__, __func__); } \
\
ClassType * MODULE_OBJ_NAME##_GET_CLASS(const void *obj) \
{ return (ClassType *)object_class_dynamic_cast_assert(object_get_class((Object *)(obj)), TYPE_##MODULE_OBJ_NAME, __FILE__, __LINE__, __func__); } \
\
ClassType * MODULE_OBJ_NAME##_CLASS(const void *klass) \
{ return (ClassType *)object_class_dynamic_cast_assert((ObjectClass *)(klass), TYPE_##MODULE_OBJ_NAME, __FILE__, __LINE__, __func__); }
The typedefs are a convenience that saves you having to write struct InstanceType etc.
The object_dynamic_cast_assert() and object_class_dynamic_cast_assert() functions will look through the stored object hierarchy bookkeeping information and assess whether the pointed-to object is of the right type. If it is, the functions will return and the return value is cast to the correct struct type. If not, an assert() will be invoked.
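To make the idea of a checked cast concrete, here is a simplified sketch. It is not QEMU's implementation: the Mini* types, mini_is_a() and mini_cast_assert() are hypothetical, and the bookkeeping is reduced to a name plus a parent pointer per class. The check walks from the object's class up through its parents looking for the requested type name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-class bookkeeping: a name and a link to the parent. */
typedef struct MiniClass MiniClass;
struct MiniClass {
    const char *name;
    const MiniClass *parent;        /* NULL for the root type */
};

typedef struct MiniObject {
    const MiniClass *klass;
} MiniObject;

typedef struct MiniDevice {
    MiniObject parent_obj;
    int realized;
} MiniDevice;

static const MiniClass object_class = { "object", NULL };
static const MiniClass device_class = { "device", &object_class };
static const MiniClass sysbus_class = { "sys-bus-device", &device_class };

/* Returns 1 if obj's type, or any of its ancestors, is called type_name. */
static int mini_is_a(const MiniObject *obj, const char *type_name)
{
    for (const MiniClass *c = obj->klass; c != NULL; c = c->parent) {
        if (strcmp(c->name, type_name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Checked cast: asserts if the object is not of the requested type. */
static void *mini_cast_assert(MiniObject *obj, const char *type_name)
{
    assert(mini_is_a(obj, type_name));
    return obj;
}

/* Helper in the style of DEVICE(obj). */
#define MINI_DEVICE(obj) \
    ((MiniDevice *)mini_cast_assert((MiniObject *)(obj), "device"))
```

With this in place, MINI_DEVICE() applied to an object whose class is sysbus_class succeeds because "device" is one of its ancestors, whereas applying it to a plain "object" instance would trip the assert.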
We've now seen how the parallel struct hierarchy stores the per-instance and per-class information that can be represented in a C++ class. We've also seen how to move up and down the class hierarchy in a safe way.
The missing piece that C++ offers but we haven't discussed yet is how the equivalent of C++ constructors is implemented.
To tell QOM how to create objects we need to create a static instance of TypeInfo for the type and then register it.
TypeInfo (See here) looks like this:
struct TypeInfo
{
const char *name;
const char *parent;
size_t instance_size;
size_t instance_align;
void (*instance_init)(Object *obj);
void (*instance_post_init)(Object *obj);
void (*instance_finalize)(Object *obj);
bool abstract;
size_t class_size;
void (*class_init)(ObjectClass *klass, void *data);
void (*class_base_init)(ObjectClass *klass, void *data);
void *class_data;
InterfaceInfo *interfaces;
};
The fields in this are well documented in the source code, but you can readily see that, in addition to specifying the name and parent name, there are entries to specify the size of a per-instance object and the class object. There are also pointers to functions to initialise the two different kinds of object.
Not all of these fields have to be set when creating an instance of TypeInfo for your type. The code relies heavily on C setting values that are not explicitly initialised to 0. A 0 value indicates a default. If, say, instance_size is not initialised then the size of the parent's instance_size will be used to allocate memory for the object.
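The zero-means-default convention can be sketched as follows. This is an illustration only: MiniTypeInfo and mini_instance_size() are hypothetical, and the parent is linked by pointer here for brevity, whereas the real TypeInfo names its parent with a string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down type description. */
typedef struct MiniTypeInfo MiniTypeInfo;
struct MiniTypeInfo {
    const char *name;
    const MiniTypeInfo *parent;         /* simplification: pointer, not a name */
    size_t instance_size;               /* 0 means "inherit from parent" */
    void (*instance_init)(void *obj);   /* NULL means "nothing to do" */
};

/* Walk up the parents until a type with an explicit size is found. */
static size_t mini_instance_size(const MiniTypeInfo *ti)
{
    while (ti != NULL) {
        if (ti->instance_size != 0) {
            return ti->instance_size;
        }
        ti = ti->parent;
    }
    return 0;
}

/* Fields omitted from a designated initialiser are set to 0 / NULL by C. */
static const MiniTypeInfo base_info = {
    .name = "base",
    .instance_size = 32,
};

static const MiniTypeInfo child_info = {
    .name = "child",
    .parent = &base_info,
    /* instance_size and instance_init deliberately left out */
};
```

Here mini_instance_size(&child_info) falls back to the 32 bytes declared by base_info because child_info.instance_size was implicitly zeroed.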
The TypeInfo instance for SysBusDevice (See here) is:
static const TypeInfo sysbus_device_type_info = {
.name = TYPE_SYS_BUS_DEVICE,
.parent = TYPE_DEVICE,
.instance_size = sizeof(SysBusDevice),
.abstract = true,
.class_size = sizeof(SysBusDeviceClass),
.class_init = sysbus_device_class_init,
};
As you can see, not all the fields are initialised.
To register this type with QOM the following code (See here) is run:
static void sysbus_register_types(void)
{
...
type_register_static(&sysbus_device_type_info);
}
type_init(sysbus_register_types)
type_init (See here) is a macro that expands as follows:
#define type_init(function) module_init(function, MODULE_INIT_QOM)
#define module_init(function, type) \
static void __attribute__((constructor)) do_qemu_init_ ## function(void) \
{ \
register_module_init(function, type); \
}
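The __attribute__((constructor)) attribute on the generated function tells GCC to call the function before main() is called. The generated function passes sysbus_register_types to register_module_init(), which queues it to be run when QEMU later initialises its MODULE_INIT_QOM modules. When it runs, sysbus_register_types() registers the instance of TypeInfo using the QOM function type_register_static.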
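The pattern of a constructor function that hands a callback to a registration routine can be tried out in isolation. The sketch below is not QEMU code: register_init(), run_pending_inits() and the counters are hypothetical names, and it assumes a compiler that supports __attribute__((constructor)), such as GCC or Clang. The constructor only queues the callback before main() starts; the callback itself runs when the queue is processed.

```c
#include <assert.h>

#define MAX_INITS 8

typedef void (*init_fn)(void);

/* Hypothetical queue of callbacks waiting to be run. */
static init_fn pending[MAX_INITS];
static int pending_count;

/* Counts how many times the demo registration callback has run. */
static int types_registered;

/* Stand-in for a registration routine: just remembers the callback. */
static void register_init(init_fn fn)
{
    assert(pending_count < MAX_INITS);
    pending[pending_count++] = fn;
}

/* Runs and clears the queue; the program calls this when it is ready. */
static void run_pending_inits(void)
{
    for (int i = 0; i < pending_count; i++) {
        pending[i]();
    }
    pending_count = 0;
}

/* The callback a module wants to have run during start-up. */
static void my_register_types(void)
{
    types_registered++;
}

/* Runs automatically before main() because of the attribute. */
static void __attribute__((constructor)) do_init_my_register_types(void)
{
    register_init(my_register_types);
}
```

By the time main() starts, pending_count is already 1 while types_registered is still 0; calling run_pending_inits() then invokes my_register_types(). Splitting the work this way means the constructor itself does very little, which avoids depending on the order in which constructors in different files run.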
Instances of our defined types are then created using object_new (See here) or some more specialised functions that ultimately call object_new. SysBusDevice has such a specialised function for creating instances of it, so you would not use object_new to create instances of it in this case. However, if we had created a SysBusDevice-like object we could create instances of it using something like:
MySysBusDevice *sbd = MY_SYS_BUS_DEVICE(object_new(TYPE_MY_SYS_BUS_DEVICE));
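Conceptually, creating an object by type name means looking the name up, allocating the recorded instance size and running the initialiser. The sketch below shows that flow under heavy simplification. It is not how QEMU implements object_new: MiniType, the fixed registry array and mini_object_new() are all hypothetical, and parent types, class structs and reference counting are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct MiniType MiniType;

typedef struct MiniObject {
    const MiniType *type;               /* which registered type this is */
} MiniObject;

/* Hypothetical, cut-down registration record. */
struct MiniType {
    const char *name;
    size_t instance_size;
    void (*instance_init)(MiniObject *obj);
};

typedef struct MyDevice {
    MiniObject parent_obj;              /* base struct as first member */
    int reg;
} MyDevice;

static void my_device_init(MiniObject *obj)
{
    ((MyDevice *)obj)->reg = 0x42;
}

/* A fixed table standing in for the set of registered types. */
static const MiniType registry[] = {
    { "my-device", sizeof(MyDevice), my_device_init },
};

/* Look up the type by name, allocate zeroed memory and initialise it. */
static MiniObject *mini_object_new(const char *name)
{
    for (size_t i = 0; i < sizeof(registry) / sizeof(registry[0]); i++) {
        if (strcmp(registry[i].name, name) == 0) {
            MiniObject *obj = calloc(1, registry[i].instance_size);
            if (obj == NULL) {
                return NULL;
            }
            obj->type = &registry[i];
            if (registry[i].instance_init) {
                registry[i].instance_init(obj);
            }
            return obj;
        }
    }
    return NULL;                        /* unknown type name */
}
```

A caller would write something like MyDevice *d = (MyDevice *)mini_object_new("my-device"); and later release it with free(d). Unlike this sketch, real code would use a checked cast rather than a plain C cast.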
That's a long introduction to the QEMU Object Model. Sadly I think the detail makes the model look more complex and scary than it is.
We have two parallel object hierarchies corresponding to the per-instance and per-class elements that you would find in a typical C++ class. You generate helper macros to help move up and down the two object hierarchies in type-safe ways (e.g. SYS_BUS_DEVICE, SYS_BUS_DEVICE_GET_CLASS and SYS_BUS_DEVICE_CLASS) using the OBJECT_DECLARE_TYPE macro.
You create a static instance of TypeInfo and then register it to tell QOM how to create instances of the objects. Finally you use object_new or similar to create instances of the objects, giving it the name of the object you want to create.
As an exercise for the reader, you can see the QOM initialisation for the Arm AArch64 processor here.