4.6 Model
A SerdModel is an indexed set of statements.
A model can be used to store any data set,
from a few statements (for example, a protocol message),
to an entire document,
to a database with millions of statements.
A new model can be created with serd_model_new():
SerdModel* model = serd_model_new(world, SERD_ORDER_SPO, 0u);
The information to store for each statement can be controlled by passing flags.
Additional indices can also be enabled with serd_model_add_index().
For example, to be able to quickly search by predicate,
and store a caret (the statement's origin in a document) for each statement,
the model can be constructed with the SERD_STORE_CARETS flag,
and an additional SERD_ORDER_PSO index can be added like so:
SerdModel* fancy_model =
  serd_model_new(world, SERD_ORDER_SPO, SERD_STORE_CARETS);
serd_model_add_index(fancy_model, SERD_ORDER_PSO);
4.6.1 Accessors
The flags set for a model can be accessed with serd_model_flags().
The number of statements can be accessed with serd_model_size(),
and serd_model_empty() checks whether the model has no statements at all:
if (serd_model_empty(model)) {
  printf("Model is empty\n");
} else if (serd_model_size(model) > 1000) {
  printf("Model has over 1000 statements\n");
}
4.6.2 Adding Statements
Statements can be added to a model with serd_model_add():
SerdNodes* nodes = serd_nodes_new(NULL);
serd_model_add(
  model,
  serd_nodes_uri(nodes, SERD_STRING("http://example.org/thing")), // S
  serd_nodes_uri(nodes, SERD_STRING("http://example.org/name")),  // P
  serd_nodes_string(nodes, SERD_STRING("Thing")),                 // O
  NULL);                                                          // G
Alternatively, serd_model_insert() can be used if you already have a statement.
For example, the first statement in one model could be added to another like so:
SerdCursor* cursor = serd_model_begin(other_model);
serd_model_insert(model, serd_cursor_get(cursor));
serd_cursor_free(cursor);
An entire range of statements can be inserted at once with serd_model_insert_statements().
For example, all statements in one model could be copied into another like so:
SerdCursor* other_range = serd_model_begin(other_model);
serd_model_insert_statements(model, other_range);
serd_cursor_free(other_range);
4.6.3 Iteration
A cursor is a reference to a particular statement in a model.
serd_model_begin() returns a cursor at the first statement in the model,
and serd_model_end() returns a sentinel that is one past the last statement in the model:
SerdCursor* i = serd_model_begin(model);
if (serd_cursor_equals(i, serd_model_end(model))) {
  printf("Model is empty\n");
} else {
  const SerdStatement* s = serd_cursor_get(i);
  printf("First statement subject: %s\n",
         serd_node_string(serd_statement_subject(s)));
}
A cursor can be advanced to the next statement with serd_cursor_advance(),
which returns SERD_FAILURE if the cursor reached the end:
if (!serd_cursor_advance(i)) { // Zero (SERD_SUCCESS): the cursor moved
  const SerdStatement* s = serd_cursor_get(i);
  printf("Second statement subject: %s\n",
         serd_node_string(serd_statement_subject(s)));
}
Cursors are dynamically allocated,
and must eventually be destroyed with serd_cursor_free():
serd_cursor_free(i);
4.6.4 Pattern Matching
There are several functions that can be used to quickly find statements in the model that match a pattern.
The simplest is serd_model_ask(), which checks if there is any matching statement:
const SerdNode* rdf_type = serd_nodes_uri(
  nodes, SERD_STRING("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"));
if (serd_model_ask(model, NULL, rdf_type, NULL, NULL)) {
  printf("Model contains a type statement\n");
}
To access the unknown fields,
a cursor at the first matching statement can be found with serd_model_find() instead:
SerdCursor* it = serd_model_find(model, NULL, rdf_type, NULL, NULL);
const SerdStatement* statement = serd_cursor_get(it);
const SerdNode* instance =
  statement ? serd_statement_subject(statement) : NULL;
To iterate over the matching statements,
the cursor returned by serd_model_find() can be advanced.
It will reach its end once it has passed the last matching statement:
SerdCursor* range = serd_model_find(model,
                                    instance, // Subject = instance
                                    rdf_type, // Predicate = rdf:type
                                    NULL,     // Object = anything
                                    NULL);    // Graph = anything
for (; !serd_cursor_is_end(range); serd_cursor_advance(range)) {
  const SerdStatement* s = serd_cursor_get(range);
  printf("Instance has type %s\n",
         serd_node_string(serd_statement_object(s)));
}
serd_cursor_free(range);
Similar to serd_model_ask(),
serd_model_count() can be used to count the number of matching statements:
size_t n = serd_model_count(model, instance, rdf_type, NULL, NULL);
printf("Instance has %zu types\n", n);
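The wildcard semantics shared by these functions can be illustrated without serd at all. The following is a minimal, self-contained sketch and not serd's implementation: ToyStatement, toy_matches(), toy_ask(), and toy_count() are hypothetical names, strings stand in for nodes, and a linear scan stands in for the model's indices. As in the calls above, a NULL field in the pattern matches anything:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a statement: three strings instead of nodes. */
typedef struct {
  const char* subject;
  const char* predicate;
  const char* object;
} ToyStatement;

/* A NULL pattern field is a wildcard, otherwise the strings must be equal. */
int
toy_field_matches(const char* pattern, const char* value)
{
  return !pattern || !strcmp(pattern, value);
}

/* A statement matches if every field matches the pattern. */
int
toy_matches(const ToyStatement* pattern, const ToyStatement* statement)
{
  return toy_field_matches(pattern->subject, statement->subject) &&
         toy_field_matches(pattern->predicate, statement->predicate) &&
         toy_field_matches(pattern->object, statement->object);
}

/* Analogue of "count": the number of statements that match the pattern. */
size_t
toy_count(const ToyStatement* statements,
          size_t              n,
          const ToyStatement* pattern)
{
  size_t count = 0;
  for (size_t i = 0; i < n; ++i) {
    if (toy_matches(pattern, &statements[i])) {
      ++count;
    }
  }
  return count;
}

/* Analogue of "ask": non-zero if at least one statement matches. */
int
toy_ask(const ToyStatement* statements, size_t n, const ToyStatement* pattern)
{
  for (size_t i = 0; i < n; ++i) {
    if (toy_matches(pattern, &statements[i])) {
      return 1;
    }
  }
  return 0;
}
```

In this toy version, "ask" can stop at the first match while "count" must visit every candidate, which is why asking is the cheaper question when only existence matters.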
4.6.5 Indexing
A model can contain several indices that use different orderings to support different kinds of queries. For good performance, there should be an index where the least significant fields in the ordering correspond to wildcards in the pattern (or, in other words, one where the most significant fields in the ordering correspond to nodes given in the pattern). The table below lists the indices that best support each kind of pattern, where a “?” represents a wildcard in the pattern.
| Pattern | Good Indices |
|---|---|
| s p o | Any |
| s p ? | SPO, PSO |
| s ? o | SOP, OSP |
| s ? ? | SPO, SOP |
| ? p o | POS, OPS |
| ? p ? | POS, PSO |
| ? ? o | OSP, OPS |
| ? ? ? | Any |
If graphs are enabled, then statements are indexed both with and without the graph fields, so queries with and without a graph wildcard will have similar performance.
Since indices take up space and slow down insertion, it is best to enable the fewest indices possible that cover the queries that will be performed. For example, an application might enable just the SPO and OPS orders, because it always searches for specific subjects or objects, but never for just a predicate without specifying any other field.
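The rule behind the table can be checked mechanically: an ordering suits a pattern when all of the pattern's given fields come first in the ordering. The sketch below is only an illustration of that rule under stated assumptions, not a description of how serd selects an index internally. The functions bound_prefix_length(), is_good_index(), and best_index() are hypothetical, orderings are written as strings such as "SPO", and a pattern is written as the string of its given fields (for example "SP" for "s p ?"):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Count the leading fields of `order` (like "SPO") that are given in the
   pattern, where `bound` lists the given fields (like "SP" for "s p ?"). */
size_t
bound_prefix_length(const char* order, const char* bound)
{
  size_t n = 0;
  while (order[n] && strchr(bound, order[n])) {
    ++n;
  }
  return n;
}

/* An ordering is a good fit if every given field is in its leading prefix.
   Assumes `bound` has no repeated letters, so it has at most three. */
bool
is_good_index(const char* order, const char* bound)
{
  assert(strlen(bound) <= 3);
  return bound_prefix_length(order, bound) == strlen(bound);
}

/* Among the enabled orderings, return the one with the longest given prefix
   (the first such ordering on ties), or NULL if none are enabled. */
const char*
best_index(const char* const* orders, size_t n_orders, const char* bound)
{
  const char* best     = NULL;
  size_t      best_len = 0;
  for (size_t i = 0; i < n_orders; ++i) {
    const size_t len = bound_prefix_length(orders[i], bound);
    if (!best || len > best_len) {
      best     = orders[i];
      best_len = len;
    }
  }
  return best;
}
```

With only SPO and OPS enabled, as in the example above, best_index() picks SPO for a subject-only pattern and OPS for an object-only pattern. For a predicate-only pattern, both orderings have a prefix length of zero, so neither narrows the search under this rule.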
4.6.6 Getting Values
Sometimes you are only interested in a single node,
and it is cumbersome to first search for a statement and then get the node from it.
A more convenient way is to use serd_model_get().
To get a value, specify a triple pattern where exactly one of the subject, predicate, and object is a wildcard.
If a statement matches, then the node that “fills” the wildcard will be returned:
const SerdNode* t = serd_model_get(model,
                                   instance, // Subject
                                   rdf_type, // Predicate
                                   NULL,     // Object
                                   NULL);    // Graph
if (t) {
  printf("Instance has type %s\n", serd_node_string(t));
}
If multiple statements match the pattern, then the matching node from an arbitrary statement is returned. It is an error to specify more than one wildcard, excluding the graph.
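Because a pattern with the wrong number of wildcards is an error, a caller that builds patterns dynamically may want to check them first. The helper below is a hypothetical guard, not part of serd, that counts NULL fields among the subject, predicate, and object and ignores the graph, as the rule above does. It takes untyped pointers so that the sketch compiles on its own; in real code the parameters would be node pointers:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guard (not a serd function): true if exactly one of the
   subject, predicate, and object is a wildcard (NULL). The graph is not
   considered, since it does not count towards the limit. */
bool
has_single_wildcard(const void* subject,
                    const void* predicate,
                    const void* object)
{
  const int n_wildcards =
    (subject == NULL) + (predicate == NULL) + (object == NULL);

  return n_wildcards == 1;
}
```

A caller could test the pattern with this guard before calling serd_model_get(), and fall back to a cursor-based search when more than one field is unknown.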
The similar serd_model_get_statement() instead returns the matching statement:
const SerdStatement* ts =
  serd_model_get_statement(model, instance, rdf_type, NULL, NULL);
if (ts) {
  printf("Instance %s has type %s\n",
         serd_node_string(serd_statement_subject(ts)),
         serd_node_string(serd_statement_object(ts)));
}
4.6.7 Erasing Statements
Individual statements can be erased with serd_model_erase(),
which takes a cursor:
SerdCursor* some_type = serd_model_find(model, NULL, rdf_type, NULL, NULL);
serd_model_erase(model, some_type);
serd_cursor_free(some_type);
The similar serd_model_erase_statements() will erase all statements in the cursor’s range:
SerdCursor* all_types = serd_model_find(model, NULL, rdf_type, NULL, NULL);
serd_model_erase_statements(model, all_types);
serd_cursor_free(all_types);
4.6.8 Lifetime
Models are value-like and can be copied with serd_model_copy() and compared with serd_model_equals():
SerdModel* copy = serd_model_copy(NULL, model);
assert(serd_model_equals(copy, model));
When a model is no longer needed, it can be destroyed with serd_model_free():
serd_model_free(copy);
Destroying a model invalidates all nodes and statements within that model, so care should be taken to ensure that no dangling pointers are created.