Serd
Serd is a lightweight C library and set of command-line utilities for working with RDF data in Turtle, NTriples, NQuads, and TriG formats.
1 Getting Started
1.1 Downloading
Serd is distributed in several ways. There are no “official” binaries, only source code releases which must be compiled. However, many operating system distributions do package binaries. Check whether your package manager has a reasonably recent package; if so, that is the easiest and most reliable installation method.
Release announcements with links to source code archives can be found at https://drobilla.net/category/serd/. All release archives and their signatures are available in the directory http://download.drobilla.net/.
The code can also be checked out of git:
git clone https://gitlab.com/drobilla/serd.git
1.2 Compiling
Serd uses the meson build system. From within an extracted release archive or repository checkout, the library can be built and tested with default options like so:
meson setup build
cd build
ninja test
There are many configuration options, which can be displayed by running meson configure.
See the meson documentation for more details on using meson.
1.3 Installing
If the library compiled successfully, then ninja install can be used to install it.
Note that you may need superuser privileges, for example:
sudo ninja install
The installation prefix can be changed by setting the prefix option, for example:
meson configure -Dprefix=/opt/serd
If you do not want to install anything, you can also “vendor” the code in your project (provided, of course, that you adhere to the terms of the license). If you are using meson, then it should simply work as a subproject without modification. Otherwise, you will need to set up the build yourself.
1.4 Including
Serd installs a pkg-config file, which can be used to set the appropriate compiler and linker flags for projects to use it. If installed to a standard prefix, then it should show up in pkg-config automatically:
pkg-config --list-all | grep serd
If not, you may need to adjust the PKG_CONFIG_PATH environment variable to include the installation prefix, for example:
export PKG_CONFIG_PATH=/opt/serd/lib/pkgconfig
pkg-config --list-all | grep serd
Most popular build systems natively support pkg-config. For example, in meson:
serd_dep = dependency('serd-1')
On systems where pkg-config is not available, you will need to set up compiler and linker flags manually, by adding something like -I/opt/serd/include/serd-1 and -lserd-1, respectively.
Once things are set up, you should be able to include the API header and start using Serd in your code.
2 Data Model
2.1 Structure
Serd is based on RDF, a model for Linked Data. A deep understanding of what this means isn’t necessary, but it is important to have a basic understanding of how this data is structured.
The basic building block of data is the node, which is essentially a string with some extra type information. A statement is a tuple of 3 or 4 nodes. All information is represented by a set of statements, which makes this model structurally very simple: any document or database is essentially a single table with 3 or 4 columns. This is easiest to see in NTriples or NQuads documents, which are simple flat files with a single statement per line.
There are, however, some restrictions. Each node in a statement has a specific role: subject, predicate, object, and (optionally) graph, in that order. A statement declares that a subject has some property. The predicate identifies the property, and the object is its value.
A statement is a bit like a very simple machine-readable sentence. The “subject” and “object” are as in natural language, and the predicate is something like a verb (but much more general). For example, we could make a statement in English about your intrepid author:
drobilla has the first name David
We can break this statement into 3 pieces like so:
Subject | Predicate | Object |
---|---|---|
drobilla | has the first name | David |
The subject and predicate must be resources with an identifier, so we will need to define some URIs to represent this statement. Conventionally, predicate names do not start with “has” or similar words, since that would be redundant in this context. So, we assume that http://example.org/drobilla is the URI for drobilla, and that http://example.org/firstName has been defined as the appropriate property (“has the first name”), and can represent the statement in a machine-readable way:
Subject | Predicate | Object |
---|---|---|
http://example.org/drobilla | http://example.org/firstName | David |
Which can be written in NTriples like so:
<http://example.org/drobilla> <http://example.org/firstName> "David" .
2.2 Working with Data
The power of this data model lies in its uniform “physical” structure, and the use of URIs as a decentralized namespace mechanism. In particular, it makes filtering, merging, and otherwise “mixing” data from various sources easy.
For example, we could add some statements to the above example to better describe the same subject:
<http://example.org/drobilla> <http://example.org/firstName> "David" .
<http://example.org/drobilla> <http://example.org/lastName> "Robillard" .
We could also add information about other subjects:
<http://drobilla.net/sw/serd> <http://example.org/programmingLanguage> "C" .
Including statements that relate them to each other:
<http://example.org/drobilla> <http://example.org/wrote> <http://drobilla.net/sw/serd> .
Note that there is no “physical” tree structure here, which is an important distinction from structured document formats like XML or JSON. Since all information is just a set of statements, the information in two documents, for example, can be combined by simply concatenating the documents. Similarly, any arbitrary subset of statements in a document can be separated into a new document. The use of URIs enables such things even with data from many independent sources, without any need to agree on a common schema.
In practice, sharing URI “vocabulary” is encouraged since this is how different parties can have a shared understanding of what data means. That, however, is a higher-level application concern. Only the “physical” structure of data described here is important for understanding how Serd works, and what its tools and APIs can do.
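To make the “single table” structure described in this section concrete, here is a minimal C sketch that models a statement as one row of three strings and writes it as an NTriples line. The Triple type and write_ntriples_line() function are hypothetical names invented for this illustration and are not part of the Serd API. The sketch also assumes that the object is a plain string literal that needs no escaping.

```c
#include <stdio.h>

/* Hypothetical stand-in for a statement: one row of the 3-column table. */
typedef struct {
  const char* subject;   /* URI of the subject */
  const char* predicate; /* URI of the property */
  const char* object;    /* String literal (assumed to need no escaping) */
} Triple;

/* Write one statement as an NTriples line into buf.
   Returns the line length, or a negative value on error (as snprintf). */
static int
write_ntriples_line(char* buf, size_t size, const Triple* t)
{
  return snprintf(buf, size, "<%s> <%s> \"%s\" .\n",
                  t->subject, t->predicate, t->object);
}
```

Because every line is an independent statement, writing the rows of two such tables one after another gives a valid combined document, which is the concatenation property described above.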
3 Command-Line Tools
Serd includes several tools that can be used to process data on the command-line. Each is documented by its own man page:
- serd-pipe is a streaming tool for reading and writing documents.
- serd-sort is similar to serd-pipe, but loads data into an in-memory model instead of streaming.
- serd-filter is a grep-like statement filtering tool.
- serd-validate validates data and prints warnings where data is invalid according to the schemas it uses.
4 Using Serd
4.1 Overview
The serd API is declared in serd.h:
#include <serd/serd.h>
An instance of serd is represented by a World, which manages “global” facilities like memory allocation and logging. The rest of the API can be broadly grouped into four categories:
- Data
A Node is the basic building block of data; 3 or 4 nodes together make a Statement. All data is expressed in statements.
- Streams
Components communicate by sending and receiving streams of data. Data is streamed via Sink, which is an abstract interface that receives Events. The fundamental event is a statement event, but there are a few additional event types that describe context which is useful for things like pretty-printing.
Some components both send and receive data, which allows them to be inserted in a pipeline to process the data as it streams through. For example, a Canon converts literals to canonical form, and a Filter filters statements that match (or do not match) some pattern.
An event stream describes changes to data and its context, but does not store the context. For that, an associated Environment is maintained. This stores the active base URI and namespace prefixes which can, for example, be used to write output with the same abbreviations used in the source.
- Reading and Writing
Reading and writing data is performed using a Reader, which reads text and emits data to a sink, and a Writer, which is a sink that writes the incoming data as text. Both work in a streaming fashion so that large documents can be pretty-printed, translated, or otherwise processed quickly using only a small amount of memory.
- Storage
A set of statements can be stored in memory as a Model. This supports quickly searching and scanning statements, provided an appropriate index is enabled.
Data can be loaded into a model via an Inserter, which is a sink that inserts incoming statements into a model. Data in a model can be written out by calling serd_describe_range() on the desired range of statements.
The sink interface acts as a generic connection which can be used to build custom data processing pipelines. For example, a simple pipeline to read a document, filter out some statements, and write the result to a new file, would look something like:
Here, dotted arrows represent event streams, and solid arrows represent explicit use of a component. In other words, dotted arrows represent connections via the abstract Sink interface. In this case both reader and writer are using the same environment, so the output document will have the same abbreviations as the input. It is also possible to use different environments, for example to set additional namespace prefixes to further abbreviate the document.
Similarly, a document could be loaded into a model with canonical literals using a pipeline like:
Many other useful pipelines can be built using the components in serd, and applications can implement custom ones to add additional functionality.
The following documentation gives a more detailed bottom-up introduction to the API, with links to the complete reference where further detail can be found.
4.2 String Views
For performance reasons, most functions in serd that take a string take a SerdStringView, rather than a bare pointer. This forces code to be explicit about string measurement, which discourages common patterns of repeated measurement of the same string.
For convenience, several macros are provided for constructing string views:
- SERD_EMPTY_STRING()
Constructs a view of an empty string, for example:
SerdStringView empty = SERD_EMPTY_STRING();
- SERD_STRING()
Constructs a view of an entire string or string literal, for example:
SerdStringView hello = SERD_STRING("hello");
or:
SerdStringView view = SERD_STRING(string_pointer);
This macro calls strlen to measure the string. Modern compilers will optimise this away if the parameter is a string literal.
- SERD_SUBSTRING()
Constructs a view of a slice of a string with an explicit length, for example:
SerdStringView slice = SERD_SUBSTRING(string_pointer, 4);
This macro can also be used to create a view of a pre-measured string. If the length of a dynamic string is already known, it is faster to use this than SERD_STRING.
These macros can be used inline when passing parameters, but if the same dynamic string is used several times, it is better to make a string view variable to avoid redundant measurement.
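The benefit of carrying a length with the pointer can be seen in a small self-contained sketch. The StrView type and the view_of(), view_of_prefix(), and views_equal() helpers below are hypothetical stand-ins written for this illustration. They only mimic the pointer-plus-length shape of a string view, and are not the definitions from serd.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with the shape of a string view:
   a pointer to the first byte and a length in bytes. */
typedef struct {
  const char* buf;
  size_t      len;
} StrView;

/* Measure a null-terminated string once, and keep the result. */
static StrView
view_of(const char* str)
{
  const StrView view = {str, strlen(str)};
  return view;
}

/* View the first len bytes of a string, without measuring anything. */
static StrView
view_of_prefix(const char* str, size_t len)
{
  const StrView view = {str, len};
  return view;
}

/* Compare two views using the stored lengths, so neither string is
   scanned for a terminator here. */
static int
views_equal(StrView a, StrView b)
{
  return a.len == b.len && memcmp(a.buf, b.buf, a.len) == 0;
}
```

A view made once with view_of() can be passed to any number of functions like views_equal() without the string being measured again, which is the pattern that a view variable encourages.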
4.3 Nodes
Nodes are the basic building blocks of data. Nodes are essentially strings, but also have a type, and optionally either a datatype or a language.
In the abstract, a node is either a literal, a URI, or blank. Literals are essentially strings, but may have a datatype or a language tag. URIs are used to identify resources, as are blank nodes, except blank nodes only have labels with a limited scope and may be written anonymously.
Serd also has a type for variable nodes, which are used for some features but not present in RDF data.
4.3.1 Fundamental Constructors
To allow the application to manage node memory, node constructors are provided that construct nodes in existing memory buffers. The universal constructor serd_node_construct() can construct any type of node, but is somewhat verbose and tricky to use. Several constructors for more specific types of node are also available.
If explicit memory management is not required, high-level constructors that allocate nodes on the heap can be used instead.
4.3.2 Accessors
The basic attributes of a node can be accessed with serd_node_type(), serd_node_string(), and serd_node_length().
A measured view of the string can be accessed with serd_node_string_view(). This can be passed to functions that take a string view, to avoid redundant measurement of the node string.
The datatype or language can be retrieved with serd_node_datatype() or serd_node_language(), respectively. Note that only literals can have a datatype or language, but never both at once.
4.4 Statements
A SerdStatement is a tuple of either 3 or 4 nodes: the subject, predicate, object, and optional graph. Statements declare that a subject has some property. The predicate identifies the property, and the object is its value.
A statement can be thought of as a very simple machine-readable sentence. The subject and object are as in natural language, and the predicate is something like a verb, but more general. For example, we could make a statement in English about your intrepid author:
drobilla has the first name “David”
We can break this statement into 3 pieces like so:
Subject | Predicate | Object |
---|---|---|
drobilla | has the first name | “David” |
To make a SerdStatement out of this, we need to define some URIs. In RDF, the subject and predicate must be resources with an identifier (for example, neither can be a string). Conventionally, predicate names do not start with “has” or similar words, since that would be redundant in this context.
So, we assume that http://example.org/drobilla is the URI for drobilla, and that http://example.org/firstName has been defined somewhere to be a property with the appropriate meaning, and can make an equivalent SerdStatement:
SerdStatement* statement = serd_statement_new(
  NULL,
  serd_nodes_uri(nodes, SERD_STRING("http://example.org/drobilla")),
  serd_nodes_uri(nodes, SERD_STRING("http://example.org/firstName")),
  serd_nodes_string(nodes, SERD_STRING("David")),
  NULL,
  NULL);
The last two fields are the graph and the caret. The graph is another node that can be used to group statements, for example by the URI of the document they were loaded from. The caret represents the location in a document where the statement was loaded from, if applicable.
4.4.1 Accessing Fields
Statement fields can be accessed with serd_statement_node(), for example:
const SerdNode* s = serd_statement_node(statement, SERD_SUBJECT);
Alternatively, an accessor function is provided for each field:
const SerdNode* p = serd_statement_predicate(statement);
const SerdNode* o = serd_statement_object(statement);
const SerdNode* g = serd_statement_graph(statement);
Every statement has a subject, predicate, and object, but the graph may be null. The caret may also be null (as it would be in this case), but if available it can be accessed with serd_statement_caret():
const SerdCaret* c = serd_statement_caret(statement);
4.4.2 Comparison
Two statements can be compared with serd_statement_equals():
if (serd_statement_equals(statement1, statement2)) {
  printf("Match\n");
}
Statements are equal if all four corresponding pairs of nodes are equal. The caret is considered metadata, and is ignored for comparison.
It is also possible to match statements against a pattern using NULL as a wildcard, with serd_statement_matches():
SerdNode* eg_name =
  serd_new_uri(NULL, SERD_STRING("http://example.org/name"));
if (serd_statement_matches(statement, NULL, eg_name, NULL, NULL)) {
  printf("%s has name %s\n",
         serd_node_string(serd_statement_subject(statement)),
         serd_node_string(serd_statement_object(statement)));
}
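The wildcard rule itself is simple, and can be sketched without the library. The following is a simplified illustration that assumes each field is a plain C string. Real nodes also carry a type and possibly a datatype or language, which this sketch ignores. The Quad type and both functions are hypothetical names, not part of the Serd API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a statement: four strings, graph may be NULL. */
typedef struct {
  const char* fields[4]; /* Subject, predicate, object, graph */
} Quad;

/* A pattern field matches if it is NULL (a wildcard), or if it is equal
   to the corresponding statement field. */
static bool
field_matches(const char* field, const char* pattern)
{
  if (!pattern) {
    return true; /* Wildcard matches anything, including a missing graph */
  }
  return field && strcmp(field, pattern) == 0;
}

/* Return true if every field of the statement matches the pattern. */
static bool
quad_matches(const Quad* q,
             const char* s,
             const char* p,
             const char* o,
             const char* g)
{
  const char* const pattern[4] = {s, p, o, g};
  for (size_t i = 0; i < 4; ++i) {
    if (!field_matches(q->fields[i], pattern[i])) {
      return false;
    }
  }
  return true;
}
```

In this sketch, a pattern with only the predicate given matches any statement with that predicate, regardless of its subject, object, or graph.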
4.4.3 Lifetime
A statement only contains const references to nodes; it does not own nodes or manage their lifetimes internally. The caret, however, is owned by the statement.
A statement can be copied with serd_statement_copy():
SerdStatement* copy = serd_statement_copy(NULL, statement);
The copied statement will refer to exactly the same nodes, though the caret will be deep copied.
In most cases, statements come from a reader or model which manages them internally, but a statement owned by the application must be freed with serd_statement_free():
serd_statement_free(NULL, copy);
4.5 World
So far, we have only used nodes and statements, which are simple independent objects. Higher-level facilities in Serd require a SerdWorld, which represents the global library state.
A program typically uses just one world, which can be constructed using serd_world_new():
SerdWorld* world = serd_world_new(NULL);
All “global” library state is handled explicitly via the world. Serd does not contain any static mutable data, allowing it to be used concurrently in several parts of a program, for example in plugins.
If multiple worlds are used in a single program, they must never be mixed: objects “inside” one world cannot be used with objects inside another.
Note that the world is not a database; it only manages a small amount of library state for things like configuration and logging.
4.5.1 Generating Blanks
Blank nodes, or simply “blanks”, are used for resources that do not have URIs. Unlike URIs, they are not global identifiers, and only have meaning within their local context (for example, a document). The world provides a method for automatically generating unique blank identifiers:
const SerdNode* world_blank = serd_world_get_blank(world);
SerdNode* my_blank = serd_node_copy(NULL, world_blank);
Note that the returned pointer is to a node that will be updated on the next call to serd_world_get_blank(), so it is usually best to copy the node, like in the example above.
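The reason for copying can be illustrated with a small sketch of a generator that reuses a single internal buffer. This is only an assumption about the general pattern, not a description of how Serd is implemented. The BlankGenerator type, the "b%u" label format, and both functions are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generator that reuses one internal buffer for every label. */
typedef struct {
  unsigned next_id;
  char     label[16];
} BlankGenerator;

/* Return a pointer to the generator's own buffer, which is overwritten
   by the next call. */
static const char*
next_blank(BlankGenerator* gen)
{
  snprintf(gen->label, sizeof(gen->label), "b%u", ++gen->next_id);
  return gen->label;
}

/* Make a heap copy that stays valid after later calls (caller frees). */
static char*
copy_label(const char* label)
{
  const size_t size = strlen(label) + 1;
  char* const  copy = (char*)malloc(size);
  if (copy) {
    memcpy(copy, label, size);
  }
  return copy;
}
```

Here, a pointer returned by next_blank() and kept across a second call would silently change to the second label, while a copy made with copy_label() keeps the first one.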
4.6 Model
A SerdModel is an indexed set of statements. A model can be used to store any data set, from a few statements (for example, a protocol message), to an entire document, to a database with millions of statements.
A new model can be created with serd_model_new():
SerdModel* model = serd_model_new(world, SERD_ORDER_SPO, 0u);
The information to store for each statement can be controlled by passing flags. Additional indices can also be enabled with serd_model_add_index(). For example, to be able to quickly search by predicate, and store a caret for each statement, the model can be constructed with the SERD_STORE_CARETS flag, and an additional SERD_ORDER_PSO index can be added like so:
SerdModel* fancy_model =
  serd_model_new(world, SERD_ORDER_SPO, SERD_STORE_CARETS);
serd_model_add_index(fancy_model, SERD_ORDER_PSO);
4.6.1 Accessors
The flags set for a model can be accessed with serd_model_flags().
The number of statements can be accessed with serd_model_size() and serd_model_empty():
if (serd_model_empty(model)) {
  printf("Model is empty\n");
} else if (serd_model_size(model) > 1000) {
  printf("Model has over 1000 statements\n");
}
4.6.2 Adding Statements
Statements can be added to a model with serd_model_add():
SerdNodes* nodes = serd_nodes_new(NULL);
serd_model_add(
  model,
  serd_nodes_uri(nodes, SERD_STRING("http://example.org/thing")), // S
  serd_nodes_uri(nodes, SERD_STRING("http://example.org/name")),  // P
  serd_nodes_string(nodes, SERD_STRING("Thing")),                 // O
  NULL);                                                          // G
Alternatively, serd_model_insert() can be used if you already have a statement. For example, the first statement in one model could be added to another like so:
SerdCursor* cursor = serd_model_begin(other_model);
serd_model_insert(model, serd_cursor_get(cursor));
serd_cursor_free(cursor);
An entire range of statements can be inserted at once with serd_model_insert_statements().
For example, all statements in one model could be copied into another like so:
SerdCursor* other_range = serd_model_begin(other_model);
serd_model_insert_statements(model, other_range);
serd_cursor_free(other_range);
4.6.3 Iteration
An iterator, represented by a SerdCursor, is a reference to a particular statement in a model. serd_model_begin() returns an iterator to the first statement in the model, and serd_model_end() returns a sentinel that is one past the last statement in the model:
SerdCursor* i = serd_model_begin(model);
if (serd_cursor_equals(i, serd_model_end(model))) {
  printf("Model is empty\n");
} else {
  const SerdStatement* s = serd_cursor_get(i);
  printf("First statement subject: %s\n",
         serd_node_string(serd_statement_subject(s)));
}
An iterator can be advanced to the next statement with serd_cursor_advance(), which returns SERD_FAILURE if the iterator reached the end:
if (!serd_cursor_advance(i)) {
  const SerdStatement* s = serd_cursor_get(i);
  printf("Second statement subject: %s\n",
         serd_node_string(serd_statement_subject(s)));
}
Iterators are dynamically allocated, and must eventually be destroyed with serd_cursor_free():
serd_cursor_free(i);
4.6.4 Pattern Matching
There are several functions that can be used to quickly find statements in the model that match a pattern. The simplest is serd_model_ask(), which checks if there is any matching statement:
const SerdNode* rdf_type = serd_nodes_uri(
  nodes, SERD_STRING("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"));
if (serd_model_ask(model, NULL, rdf_type, NULL, NULL)) {
  printf("Model contains a type statement\n");
}
To access the unknown fields, an iterator to the matching statement can be found with serd_model_find() instead:
SerdCursor* it = serd_model_find(model, NULL, rdf_type, NULL, NULL);
const SerdStatement* statement = serd_cursor_get(it);
const SerdNode* instance =
  statement ? serd_statement_subject(statement) : NULL;
To iterate over the matching statements, the iterator returned by serd_model_find() can be advanced. It will reach its end when it reaches the last matching statement:
SerdCursor* range = serd_model_find(model,
                                    instance, // Subject = instance
                                    rdf_type, // Predicate = rdf:type
                                    NULL,     // Object = anything
                                    NULL);    // Graph = anything
for (; !serd_cursor_is_end(range); serd_cursor_advance(range)) {
  const SerdStatement* s = serd_cursor_get(range);
  printf("Instance has type %s\n",
         serd_node_string(serd_statement_object(s)));
}
serd_cursor_free(range);
Similar to serd_model_ask(), serd_model_count() can be used to count the number of matching statements:
size_t n = serd_model_count(model, instance, rdf_type, NULL, NULL);
printf("Instance has %zu types\n", n);
4.6.5 Indexing
A model can contain several indices that use different orderings to support different kinds of queries. For good performance, there should be an index where the least significant fields in the ordering correspond to wildcards in the pattern (or, in other words, one where the most significant fields in the ordering correspond to nodes given in the pattern). The table below lists the indices that best support a kind of pattern, where a “?” represents a wildcard in the pattern.
Pattern | Good Indices |
---|---|
s p o | Any |
s p ? | SPO, PSO |
s ? o | SOP, OSP |
s ? ? | SPO, SOP |
? p o | POS, OPS |
? p ? | POS, PSO |
? ? o | OSP, OPS |
? ? ? | Any |
If graphs are enabled, then statements are indexed both with and without the graph fields, so queries with and without a graph wildcard will have similar performance.
Since indices take up space and slow down insertion, it is best to enable the fewest indices possible that cover the queries that will be performed. For example, an application might enable just SPO and OPS order, because it always searches for specific subjects or objects, but never for just a predicate without specifying any other field.
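The rule behind the table of good indices can be written down as a small check: an ordering suits a pattern when the given fields form a prefix of the ordering. The function below is a hypothetical helper written only to illustrate that rule, using strings like "SPO" and "SP" rather than any Serd types. It assumes both strings contain only the letters S, P, and O, without repeats.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if an index with the given ordering (for example "SPO")
   suits a pattern where the fields listed in `bound` (for example "SP")
   are given and all other fields are wildcards.

   The ordering is suitable when every bound field comes before every
   wildcard field, that is, when the bound fields form a prefix of it. */
static bool
index_suits_pattern(const char* ordering, const char* bound)
{
  const size_t n_bound = strlen(bound);
  for (size_t i = 0; i < n_bound && ordering[i]; ++i) {
    if (!strchr(bound, ordering[i])) {
      return false; /* A wildcard is more significant than a bound field */
    }
  }
  return true;
}
```

For the pattern “s p ?”, this accepts SPO and PSO but rejects SOP, in agreement with the table.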
4.6.6 Getting Values
Sometimes you are only interested in a single node, and it is cumbersome to first search for a statement and then get the node from it. A more convenient way is to use serd_model_get().
To get a value, specify a triple pattern where exactly one of the subject, predicate, and object is a wildcard.
If a statement matches, then the node that “fills” the wildcard will be returned:
const SerdNode* t = serd_model_get(model,
                                   instance, // Subject
                                   rdf_type, // Predicate
                                   NULL,     // Object
                                   NULL);    // Graph
if (t) {
  printf("Instance has type %s\n", serd_node_string(t));
}
If multiple statements match the pattern, then the matching node from an arbitrary statement is returned. It is an error to specify more than one wildcard, excluding the graph.
The similar serd_model_get_statement() instead returns the matching statement:
const SerdStatement* ts =
  serd_model_get_statement(model, instance, rdf_type, NULL, NULL);
if (ts) {
  printf("Instance %s has type %s\n",
         serd_node_string(serd_statement_subject(ts)),
         serd_node_string(serd_statement_object(ts)));
}
4.6.7 Erasing Statements
Individual statements can be erased with serd_model_erase(), which takes a cursor:
SerdCursor* some_type = serd_model_find(model, NULL, rdf_type, NULL, NULL);
serd_model_erase(model, some_type);
serd_cursor_free(some_type);
The similar serd_model_erase_statements() will erase all statements in the cursor’s range:
SerdCursor* all_types = serd_model_find(model, NULL, rdf_type, NULL, NULL);
serd_model_erase_statements(model, all_types);
serd_cursor_free(all_types);
4.6.8 Lifetime
Models are value-like and can be copied with serd_model_copy() and compared with serd_model_equals():
SerdModel* copy = serd_model_copy(NULL, model);
assert(serd_model_equals(copy, model));
When a model is no longer needed, it can be destroyed with serd_model_free():
serd_model_free(copy);
Destroying a model invalidates all nodes and statements within that model, so care should be taken to ensure that no dangling pointers are created.
4.7 Reading and Writing
Reading and writing documents in a textual syntax is handled by the SerdReader and SerdWriter, respectively. Serd is designed around a concept of event streams, so the reader or writer can be at the beginning or end of a “pipeline” of stream processors. This allows large documents to be processed quickly in an “online” fashion, while requiring only a small constant amount of memory. If you are familiar with XML, this is roughly analogous to SAX.
A common setup is to connect a reader directly to a writer. This can be used for things like pretty-printing, or converting a document from one syntax to another. This can be done by passing the sink returned by serd_writer_sink() to the reader constructor, serd_reader_new().
First, in order to write a document, an environment needs to be created. This defines the base URI and any namespace prefixes, which are used to resolve any relative URIs or prefixed names, and may be used to abbreviate the output. In most cases, the base URI should simply be the URI of the file being written. For example:
SerdStringView host = SERD_EMPTY_STRING();
SerdStringView path = SERD_STRING("/tmp/eg.ttl");
SerdNode* base = serd_new_file_uri(NULL, path, host);
SerdEnv* env = serd_env_new(world, serd_node_string_view(base));
Namespace prefixes can also be defined for any vocabularies used:
serd_env_set_prefix(
env,
SERD_STRING("rdf"),
SERD_STRING("http://www.w3.org/1999/02/22-rdf-syntax-ns#"));
We now have an environment set up for our document, but still need to specify where to write it. This is done by creating a SerdOutputStream, which is a generic interface that can be set up to write to a file, a buffer in memory, or a custom function that can be used to write output anywhere. In this case, we will write to the file we set up as the base URI:
SerdOutputStream out = serd_open_output_file("/tmp/eg.ttl");
With an environment and output stream ready, the writer can now be created. The last argument is the block size in bytes, so I/O will be performed in chunks for better performance. The value used here, 4096, is a typical filesystem block size that should perform well on most machines:
SerdWriter* writer = serd_writer_new(
  world,       // World
  SERD_TURTLE, // Syntax
  0,           // Writer flags
  env,         // Environment
  &out,        // Output stream
  4096);       // Block size
Output is written by feeding statements and other events to the sink returned by serd_writer_sink(). SerdSink is the generic interface for anything that can consume data streams. Many objects provide the same interface to do various things with the data, but in this case we will send data directly to the writer:
SerdReader* reader = serd_reader_new(
  world,                    // World
  SERD_TURTLE,              // Syntax
  0,                        // Reader flags
  env,                      // Environment
  serd_writer_sink(writer), // Target sink
  4096);                    // Block size
The third argument of serd_reader_new() takes a bitwise OR of SerdReaderFlag flags that can be used to configure the reader. In this case no flags are given (0), but, for example, SERD_READ_LAX tolerates some invalid input without halting on an error. Several flags can be combined: passing SERD_READ_LAX | SERD_READ_RELATIVE would enable lax mode and preserve relative URIs in the input.
Now that we have a reader that is set up to directly push its output to a writer, we can finally process the document:
SerdStatus st = serd_reader_read_document(reader);
if (st) {
  printf("Error reading document: %s\n", serd_strerror(st));
}
Alternatively, one “chunk” of input can be read at a time with serd_reader_read_chunk(). A “chunk” is generally one top-level description of a resource, including any anonymous blank nodes in its description, but this depends on the syntax and the structure of the document being read.
The reader pushes events to its sink as input is read, so in this scenario the data should now have been re-written by the writer (assuming no error occurred). To finish and ensure that a complete document has been read and written, serd_reader_finish() can be called followed by serd_writer_finish(). However, these will be automatically called on destruction if necessary, so if the reader and writer are no longer required they can simply be destroyed:
serd_reader_free(reader);
serd_writer_free(writer);
Note that it is important to free the reader first in this case, since finishing the read may push events to the writer. Finally, closing the output with serd_close_output() will flush and close the output file, so it is ready to be read again later.
serd_close_output(&out);
4.7.1 Reading into a Model
A document can be loaded into a model by setting up a reader that pushes data to a model “inserter” rather than a writer:
SerdModel* model = serd_model_new(world, SERD_ORDER_SPO, 0u);
SerdSink* inserter = serd_inserter_new(model, NULL);
The process of reading the document is the same as above; only the sink is different:
SerdReader* const model_reader =
  serd_reader_new(world, SERD_TURTLE, 0, env, inserter, 4096);
st = serd_reader_read_document(model_reader);
if (st) {
  printf("Error loading model: %s\n", serd_strerror(st));
}
4.7.2 Writing a Model
A model, or parts of a model, can be written by writing the desired range with serd_describe_range():
serd_describe_range(serd_model_begin(model), serd_writer_sink(writer), 0);
By default, this writes the range in chunks suited to pretty-printing with anonymous blank nodes (like “[ … ]” in Turtle or TriG). Any rdf:type properties (written “a” in Turtle or TriG) will be written before any other properties of their subject. This can be disabled by passing the flag SERD_NO_TYPE_FIRST.
4.8 Stream Processing
The above examples show how a document can be either written to a file or loaded into a model, simply by changing the sink that the data is written to. There are also sinks that filter or transform the data before passing it on to another sink, which can be used to build more advanced pipelines with several processing stages.
4.8.1 Canonical Literals
A canon is a stream processor that converts literals with supported XSD datatypes into canonical form. For example, this will rewrite an xsd:decimal literal like “.10” as “0.1”. A canon is created with serd_canon_new(), which needs to be passed the “target” sink that the transformed statements should be written to, for example:
SerdSink* canon = serd_canon_new(world, inserter, 0);
The last argument is a bitwise OR of SerdCanonFlag flags. For example, SERD_CANON_LAX will tolerate and pass through invalid literals, which can be useful for cleaning up questionable data as much as possible without losing any information.
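To give a feel for what converting to canonical form involves, here is a minimal sketch for decimal strings only. It is not Serd's implementation. The canonical_decimal() function is hypothetical, and it assumes that the canonical form has no “+” sign, no redundant zeros, and at least one digit on each side of the point. It also assumes valid input, so it does not show how invalid literals would be detected.

```c
#include <stddef.h>
#include <string.h>

/* Write a canonical form of a decimal string like ".10" to `out`.

   Assumes `in` is a valid decimal (optional sign, digits, optional point),
   and that `out` has room for strlen(in) + 3 bytes. */
static void
canonical_decimal(const char* in, char* out)
{
  const int negative = (*in == '-');
  if (*in == '-' || *in == '+') {
    ++in;
  }

  /* Split into integer and fraction digits */
  const char* const point    = strchr(in, '.');
  const char*       int_end  = point ? point : in + strlen(in);
  const char*       frac     = point ? point + 1 : int_end;
  const char*       frac_end = frac + strlen(frac);

  /* Trim leading integer zeros and trailing fraction zeros */
  while (in < int_end && *in == '0') {
    ++in;
  }
  while (frac_end > frac && frac_end[-1] == '0') {
    --frac_end;
  }

  const size_t n_int  = (size_t)(int_end - in);
  const size_t n_frac = (size_t)(frac_end - frac);

  /* Zero is written without a sign in this sketch */
  if (negative && (n_int || n_frac)) {
    *out++ = '-';
  }

  /* Always write at least one digit on each side of the point */
  if (n_int) {
    memcpy(out, in, n_int);
    out += n_int;
  } else {
    *out++ = '0';
  }
  *out++ = '.';
  if (n_frac) {
    memcpy(out, frac, n_frac);
    out += n_frac;
  } else {
    *out++ = '0';
  }
  *out = '\0';
}
```

Given “.10”, this sketch writes “0.1”, matching the example above.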
4.8.2 Filtering Statements¶
A filter is a stream processor that filters statements based on a pattern.
It can be configured in either inclusive or exclusive mode, which passes through only statements that match or don’t match the pattern, respectively.
A filter is created with serd_filter_new(), which takes a target, pattern, and inclusive flag.
For example, a filter that passes through only statements with predicate rdf:type could be used when loading a model:
SerdSink* filter = serd_filter_new(world,    // World
                                   inserter, // Target
                                   NULL,     // Subject
                                   rdf_type, // Predicate
                                   NULL,     // Object
                                   NULL,     // Graph
                                   true);    // Inclusive
If false is passed for the last parameter instead, then the filter operates in exclusive mode and will instead insert only statements whose predicate is not rdf:type.
5 Serd C API¶
5.1 Version¶
Serd uses a single semantic version number which reflects changes to the C library ABI.
-
SERD_MAJOR_VERSION¶
The major version number of the serd library.
Semver: Increments when incompatible API changes are made.
-
SERD_MINOR_VERSION¶
The minor version number of the serd library.
Semver: Increments when functionality is added in a backwards compatible manner.
-
SERD_MICRO_VERSION¶
The micro version number of the serd library.
Semver: Increments when changes are made that do not affect the API, such as performance improvements or bug fixes.
5.2 String View¶
-
struct SerdStringView¶
An immutable slice of a string.
This type is used for many string parameters, to allow referring to slices of strings in-place and to avoid redundant string measurement.
-
const char *buf¶
Start of string.
-
size_t len¶
Length of string in bytes.
-
SERD_EMPTY_STRING()¶
Return a view of an empty string.
-
SERD_STRING(str)¶
Return a view of an entire string by measuring it.
This makes a view of the given string by measuring it with strlen.
- Parameters
str – Non-null pointer to the start of a null-terminated C string.
-
SERD_OPTIONAL_STRING(str)¶
Return a view of an entire string by measuring it, or the empty string.
This is the same as SERD_STRING, but tolerates null, in which case an empty string view is returned.
- Parameters
str – Pointer to the start of a null-terminated C string, or null.
-
SERD_SUBSTRING(str, len)¶
Return a view of a substring, or a premeasured string.
This makes either a view of a slice of a string (which may not be null terminated), or a view of a string that has already been measured. This is faster than SERD_STRING for dynamic strings since it does not call strlen, so should be used when the length of the string is already known.
- Parameters
str – Pointer to the start of the substring.
len – Length of the substring in bytes, not including the trailing null terminator if present.
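The three view constructors above differ only in how the length is obtained. The following is a minimal sketch of that idea in plain C. ExampleStringView and the example_ functions are hypothetical stand-ins written for illustration, not Serd's own definitions, which may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for SerdStringView: a pointer and a byte length */
typedef struct {
  const char* buf;
  size_t      len;
} ExampleStringView;

/* Like SERD_STRING: measure a non-null, null-terminated string */
static ExampleStringView
example_string(const char* str)
{
  assert(str);
  const ExampleStringView view = {str, strlen(str)};
  return view;
}

/* Like SERD_OPTIONAL_STRING: tolerate null by returning an empty view */
static ExampleStringView
example_optional_string(const char* str)
{
  const ExampleStringView empty = {"", 0u};
  return str ? example_string(str) : empty;
}

/* Like SERD_SUBSTRING: trust the caller's length, without calling strlen */
static ExampleStringView
example_substring(const char* str, size_t len)
{
  const ExampleStringView view = {str, len};
  return view;
}
```

For example, example_substring("http://example.org/base", 4) views only the scheme “http” without copying it. Such a view is not null-terminated, so code that consumes it must respect len.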
5.3 Memory Management¶
-
struct SerdAllocatorImpl¶
Definition of SerdAllocator.
-
typedef struct SerdAllocatorImpl SerdAllocator¶
A memory allocator.
This object-like structure provides an interface like the standard C functions malloc(), calloc(), realloc(), free(), and aligned_alloc(). It contains function pointers that differ from their standard counterparts by taking a context parameter (a pointer to this struct), which allows the user to implement custom stateful allocators.
-
typedef void *(*SerdAllocatorMallocFunc)(SerdAllocator *allocator, size_t size)¶
General malloc-like memory allocation function.
This works like the standard C malloc(), except has an additional handle parameter for implementing stateful allocators without static data.
-
typedef void *(*SerdAllocatorCallocFunc)(SerdAllocator *allocator, size_t nmemb, size_t size)¶
General calloc-like memory allocation function.
This works like the standard C calloc(), except has an additional handle parameter for implementing stateful allocators without static data.
-
typedef void *(*SerdAllocatorReallocFunc)(SerdAllocator *allocator, void *ptr, size_t size)¶
General realloc-like memory reallocation function.
This works like the standard C realloc(), except has an additional handle parameter for implementing stateful allocators without static data.
-
typedef void (*SerdAllocatorFreeFunc)(SerdAllocator *allocator, void *ptr)¶
General free-like memory deallocation function.
This works like the standard C free(), except has an additional handle parameter for implementing stateful allocators without static data.
-
typedef void *(*SerdAllocatorAlignedAllocFunc)(SerdAllocator *allocator, size_t alignment, size_t size)¶
General aligned_alloc-like memory allocation function.
This works like the standard C aligned_alloc(), except has an additional handle parameter for implementing stateful allocators without static data.
-
typedef void (*SerdAllocatorAlignedFreeFunc)(SerdAllocator *allocator, void *ptr)¶
General aligned memory deallocation function.
This works like the standard C free(), but must be used to free memory allocated with the aligned_alloc() method of the allocator. This allows portability to systems (like Windows) that can not use the same free function in these cases.
-
SerdAllocator *serd_default_allocator(void)¶
Return the default allocator which simply uses the system allocator.
-
void serd_free(SerdAllocator *allocator, void *ptr)¶
Free memory allocated by Serd.
This function exists because some systems require memory allocated by a library to be freed by code in the same library. It is otherwise equivalent to the standard C free() function.
This may be used to free memory allocated using serd_default_allocator().
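The context parameter is what makes stateful allocators possible. The following is a minimal sketch of the pattern in plain C, assuming a cut-down hypothetical interface with only malloc-like and free-like functions. ExampleAllocator, CountingAllocator, and the member names are invented for illustration and are not the layout of SerdAllocatorImpl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator interface in the style described above:
   every function takes the allocator itself as a context parameter */
typedef struct ExampleAllocatorImpl ExampleAllocator;

struct ExampleAllocatorImpl {
  void* (*malloc_func)(ExampleAllocator* allocator, size_t size);
  void (*free_func)(ExampleAllocator* allocator, void* ptr);
};

/* A stateful allocator: the interface comes first, so a pointer to the
   whole struct can be converted to and from a pointer to the base */
typedef struct {
  ExampleAllocator base;
  size_t           n_allocations; /* Successful allocations so far */
  size_t           n_frees;       /* Non-null frees so far */
} CountingAllocator;

static void*
counting_malloc(ExampleAllocator* allocator, size_t size)
{
  assert(allocator);
  CountingAllocator* const self = (CountingAllocator*)allocator;
  void* const              ptr  = malloc(size);
  if (ptr) {
    ++self->n_allocations;
  }
  return ptr;
}

static void
counting_free(ExampleAllocator* allocator, void* ptr)
{
  assert(allocator);
  CountingAllocator* const self = (CountingAllocator*)allocator;
  if (ptr) {
    ++self->n_frees;
  }
  free(ptr);
}

/* Return a counting allocator with both counters at zero */
static CountingAllocator
counting_allocator(void)
{
  const CountingAllocator allocator = {{counting_malloc, counting_free}, 0u, 0u};
  return allocator;
}
```

Code that only knows about the base interface calls a->malloc_func(a, size) and a->free_func(a, ptr), while the counters are updated behind the scenes without any static data. Comparing the two counters at exit is a simple way to detect leaks in tests.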
5.4 Status Codes¶
-
struct SerdWriteResult¶
A status code with an associated byte count.
This is returned by functions which write to a buffer to inform the caller about the size written, or in case of overflow, size required.
-
SerdStatus status¶
Status code.
This reports the status of the operation as usual, and also dictates the meaning of count.
-
size_t count¶
Number of bytes written or required.
On success, this is the total number of bytes written. On SerdStatus.SERD_OVERFLOW, this is the number of bytes of output space that are required for success.
-
enum SerdStatus¶
Return status code.
-
enumerator SERD_SUCCESS¶
Success.
-
enumerator SERD_FAILURE¶
Non-fatal failure.
-
enumerator SERD_UNKNOWN_ERROR¶
Unknown error.
-
enumerator SERD_NO_DATA¶
Missing input.
-
enumerator SERD_OVERFLOW¶
Insufficient space.
-
enumerator SERD_BAD_ALLOC¶
Memory allocation failed.
-
enumerator SERD_BAD_ARG¶
Invalid argument.
-
enumerator SERD_BAD_CALL¶
Invalid call.
-
enumerator SERD_BAD_CURIE¶
Invalid CURIE or unknown namespace prefix.
-
enumerator SERD_BAD_CURSOR¶
Use of invalidated cursor.
-
enumerator SERD_BAD_EVENT¶
Invalid event in stream.
-
enumerator SERD_BAD_INDEX¶
No optimal model index available.
-
enumerator SERD_BAD_LABEL¶
Encountered clashing blank node label.
-
enumerator SERD_BAD_LITERAL¶
Invalid literal.
-
enumerator SERD_BAD_PATTERN¶
Invalid statement pattern.
-
enumerator SERD_BAD_READ¶
Error reading from file.
-
enumerator SERD_BAD_STACK¶
Stack overflow.
-
enumerator SERD_BAD_SYNTAX¶
Invalid syntax.
-
enumerator SERD_BAD_TEXT¶
Invalid text encoding.
-
enumerator SERD_BAD_URI¶
Invalid or unresolved URI.
-
enumerator SERD_BAD_WRITE¶
Error writing to file.
-
enumerator SERD_BAD_DATA¶
Invalid data.
-
const char *serd_strerror(SerdStatus status)¶
Return a string describing a status code.
5.5 String Utilities¶
-
char *serd_canonical_path(SerdAllocator *allocator, const char *path)¶
Return path as a canonical absolute path.
This expands all symbolic links, relative references, and removes extra directory separators. Null is returned on error, including if the path does not exist.
- Returns
A newly allocated string that must be freed with serd_free() using the world allocator, or null.
-
int serd_strncasecmp(const char *s1, const char *s2, size_t n)¶
Compare two strings ignoring case.
- Returns
Less than, equal to, or greater than zero if s1 is less than, equal to, or greater than s2, respectively.
5.6 I/O Function Types¶
These function types define the low-level interface that serd uses to read and write input.
They are deliberately compatible with the standard C functions for reading and writing from files.
-
typedef size_t (*SerdReadFunc)(void *buf, size_t size, size_t nmemb, void *stream)¶
Function for reading input bytes from a stream.
This has identical semantics to fread, but may set errno for more informative error reporting than supported by SerdErrorFunc.
- Parameters
buf – Output buffer.
size – Size of a single element of data in bytes (always 1).
nmemb – Number of elements to read.
stream – Stream to read from (FILE* for fread).
- Returns
Number of elements (bytes) read, which is short on error.
-
typedef size_t (*SerdWriteFunc)(const void *buf, size_t size, size_t nmemb, void *stream)¶
Function for writing output bytes to a stream.
This has identical semantics to fwrite, but may set errno for more informative error reporting than supported by SerdErrorFunc.
- Parameters
buf – Input buffer.
size – Size of a single element of data in bytes (always 1).
nmemb – Number of elements to write.
stream – Stream to write to (FILE* for fwrite).
- Returns
Number of elements (bytes) written, which is short on error.
-
typedef int (*SerdErrorFunc)(void *stream)¶
Function for detecting I/O stream errors.
This has identical semantics to ferror.
- Returns
Non-zero if stream has encountered an error.
-
typedef int (*SerdCloseFunc)(void *stream)¶
Function for closing an I/O stream.
This has identical semantics to fclose. Note that when writing, this may flush the stream which can cause errors, including errors caused by previous writes that appeared successful at the time. Therefore it is necessary to check the return value of this function to properly detect write errors.
- Returns
Non-zero if stream has encountered an error.
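Because the read function type mirrors fread, input does not need to come from a FILE at all. The following is a minimal sketch of a function with the same shape as SerdReadFunc that reads from a block of memory. MemoryStream, memory_read(), and the local ExampleReadFunc typedef are hypothetical names used so that the example stands alone without the Serd headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local typedef with the same shape as the SerdReadFunc signature above */
typedef size_t (*ExampleReadFunc)(void*  buf,
                                  size_t size,
                                  size_t nmemb,
                                  void*  stream);

/* Hypothetical in-memory stream: a byte range and a read position */
typedef struct {
  const char* data;
  size_t      len;
  size_t      pos;
} MemoryStream;

/* Like fread: returns the number of complete elements read, which is
   short when the end of the data is reached */
static size_t
memory_read(void* buf, size_t size, size_t nmemb, void* stream)
{
  MemoryStream* const in = (MemoryStream*)stream;
  assert(buf && in);
  if (size == 0u || nmemb == 0u) {
    return 0u;
  }

  const size_t available = (in->len - in->pos) / size;
  const size_t n_read    = nmemb < available ? nmemb : available;

  memcpy(buf, in->data + in->pos, n_read * size);
  in->pos += n_read * size;
  return n_read;
}
```

A function like this can be passed wherever a read function is expected, with a pointer to the MemoryStream as the opaque stream argument. A matching error function would simply return zero, since reading from memory cannot fail in this sketch.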
5.7 Syntax Utilities¶
-
enum SerdSyntax¶
Syntax supported by serd.
-
enumerator SERD_SYNTAX_EMPTY¶
Empty syntax.
-
enumerator SERD_TURTLE¶
Terse triples http://www.w3.org/TR/turtle.
-
enumerator SERD_NTRIPLES¶
Flat triples http://www.w3.org/TR/n-triples/.
-
enumerator SERD_NQUADS¶
Flat quads http://www.w3.org/TR/n-quads/.
-
enumerator SERD_TRIG¶
Terse quads http://www.w3.org/TR/trig/.
-
SerdSyntax serd_syntax_by_name(const char *name)¶
Get a syntax by name.
Case-insensitive, supports “Turtle”, “NTriples”, “NQuads”, and “TriG”.
- Returns
The syntax with the given name, or the empty syntax if the name is unknown.
-
SerdSyntax serd_guess_syntax(const char *filename)¶
Guess a syntax from a filename.
This uses the file extension to guess the syntax of a file, for example a filename that ends with “.ttl” will be considered Turtle.
- Returns
The likely syntax of the given file, or the empty syntax if the extension is unknown.
-
bool serd_syntax_has_graphs(SerdSyntax syntax)¶
Return whether a syntax can represent multiple graphs in one document.
- Returns
True for SerdSyntax.SERD_NQUADS and SerdSyntax.SERD_TRIG, false otherwise.
5.8 Data¶
5.8.1 URI¶
-
struct SerdURIView¶
A parsed view of a URI.
This representation is designed for fast streaming. It makes it possible to create relative URI references or resolve them into absolute URIs in-place without any string allocation.
Each component refers to slices in other strings, so a URI view must not outlive any strings it was parsed from. Note that the components are not necessarily null-terminated.
The scheme, authority, path, query, and fragment simply point to the string value of those components, not including any delimiters. The path_prefix is a special component for storing relative or resolved paths. If it points to a string (usually a base URI the URI was resolved against), then this string is prepended to the path. Otherwise, the length is interpreted as the number of up-references (“../”) that must be prepended to the path.
-
SerdStringView scheme¶
Scheme.
-
SerdStringView authority¶
Authority.
-
SerdStringView path_prefix¶
Path prefix for relative/resolved paths.
-
SerdStringView path¶
Path suffix.
-
SerdStringView query¶
Query.
-
SerdStringView fragment¶
Fragment.
-
bool serd_uri_string_has_scheme(const char *string)¶
Return true iff string starts with a valid URI scheme.
-
SerdURIView serd_parse_uri(const char *string)¶
Parse string and return a URI view that points into it.
-
char *serd_parse_file_uri(SerdAllocator *allocator, const char *uri, char **hostname)¶
Get the unescaped path and hostname from a file URI.
The returned path and *hostname must be freed with serd_free().
- Parameters
allocator – Allocator for the returned string.
uri – A file URI.
hostname – If non-NULL, set to the hostname, if present.
- Returns
A newly allocated path string that must be freed with serd_free().
-
SerdURIView serd_resolve_uri(SerdURIView r, SerdURIView base)¶
Return reference r resolved against base.
This will make r an absolute URI if possible.
- Parameters
r – URI reference to make absolute, for example “child/path”.
base – Base URI, for example “http://example.org/base/”.
- Returns
An absolute URI, for example “http://example.org/base/child/path”, or r if it is not a URI reference that can be resolved against base.
-
SerdURIView serd_relative_uri(SerdURIView r, SerdURIView base)¶
Return r as a reference relative to base if possible.
- Parameters
r – URI to make relative, for example “http://example.org/base/child/path”.
base – Base URI, for example “http://example.org/base/”.
- Returns
A relative URI reference, for example “child/path”, r if it can not be made relative to base, or a null URI if r could be made relative to base, but the path prefix is already being used (most likely because r was previously a relative URI reference that was resolved against some base).
-
bool serd_uri_is_within(SerdURIView r, SerdURIView base)¶
Return whether r can be written as a reference relative to base.
For example, with base “http://example.org/base/”, this returns true if r is also “http://example.org/base/”, or something like “http://example.org/base/child” (“child”), “http://example.org/base/child/grandchild#fragment” (“child/grandchild#fragment”), “http://example.org/base/child/grandchild?query” (“child/grandchild?query”), and so on.
- Returns
True if r and base are equal or if r is a child of base.
-
size_t serd_uri_string_length(SerdURIView uri)¶
Return the length of uri as a string.
This can be used to get the expected number of bytes that will be written by serd_write_uri().
- Returns
A string length in bytes, not including the null terminator.
-
size_t serd_write_uri(SerdURIView uri, SerdWriteFunc sink, void *stream)¶
Write uri as a string to sink.
This will call sink several times to emit the URI.
- Parameters
uri – URI to write as a string.
sink – Sink to write string output to.
stream – Opaque user argument to pass to sink.
- Returns
The length of the written URI string (not including a null terminator), which may be less than serd_uri_string_length(uri) on error.
-
size_t serd_write_file_uri(SerdStringView path, SerdStringView hostname, SerdWriteFunc sink, void *stream)¶
Write a file URI to sink from a path and optional hostname.
Backslashes in Windows paths will be converted, and other characters will be percent encoded as necessary.
If path is relative, hostname is ignored.
- Parameters
path – File system path.
hostname – Optional hostname.
sink – Sink to write string output to.
stream – Opaque user argument to pass to sink.
- Returns
The length of the written URI string (not including a null terminator).
5.8.2 Node¶
5.8.2.1 Construction¶
This is the low-level node construction API, which can be used to construct nodes into existing buffers.
Advanced applications can use this to specially manage node memory, for example by allocating nodes on the stack, or with a special allocator.
Note that nodes are “plain old data”, so there is no need to destroy a constructed node, and nodes may be trivially copied, for example with memcpy().
-
SerdWriteResult serd_node_construct(size_t buf_size, void *buf, SerdNodeType type, SerdStringView string, SerdNodeFlags flags, SerdStringView meta)¶
Construct a node into an existing buffer.
This is the universal node constructor which can construct any node. An error will be returned if the parameters do not make sense. In particular, SerdNodeFlag.SERD_HAS_DATATYPE or SerdNodeFlag.SERD_HAS_LANGUAGE (but not both) may only be given if type is SerdNodeType.SERD_LITERAL, and meta must be syntactically valid based on that flag.
This function may also be used to determine the size of buffer required by passing a null buffer with zero size.
- Parameters
buf_size – The size of buf in bytes, or zero to only measure.
buf – Buffer where the node will be written, or null to only measure.
type – The type of the node to construct.
string – The string body of the node.
flags – Flags that describe the details of the node.
meta – The string value of the literal’s metadata. If SerdNodeFlag.SERD_HAS_DATATYPE is set, then this must be an absolute datatype URI. If SerdNodeFlag.SERD_HAS_LANGUAGE is set, then this must be a language tag like “en-ca”. Otherwise, it is ignored.
- Returns
A result with a status and a count of bytes written. If the buffer is too small for the node, then status will be SerdStatus.SERD_OVERFLOW, and count will be set to the number of bytes required to successfully construct the node.
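The “measure, then write” convention described above can be sketched in plain C as follows. Everything here is hypothetical: ExampleWriteResult mimics the shape of SerdWriteResult, and example_write() stands in for a constructor that writes into a caller-provided buffer. In particular, this sketch assumes that a measuring call reports overflow along with the required size, which is an illustration of the convention rather than a statement about Serd's exact behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status and result types in the style of SerdWriteResult */
typedef enum { EXAMPLE_SUCCESS, EXAMPLE_OVERFLOW } ExampleStatus;

typedef struct {
  ExampleStatus status;
  size_t        count;
} ExampleWriteResult;

/* Hypothetical writer: copy `text` (with its terminator) into `buf` */
static ExampleWriteResult
example_write(size_t buf_size, void* buf, const char* text)
{
  assert(text);
  const size_t       required = strlen(text) + 1u;
  ExampleWriteResult result   = {EXAMPLE_SUCCESS, required};
  if (!buf || buf_size < required) {
    result.status = EXAMPLE_OVERFLOW; /* count is the size required */
    return result;
  }

  memcpy(buf, text, required);
  return result; /* count is the number of bytes written */
}

/* Measure first with a null buffer, then allocate and write for real */
static char*
example_write_alloc(const char* text)
{
  const ExampleWriteResult measured = example_write(0u, NULL, text);
  char* const              buf      = (char*)malloc(measured.count);
  if (buf &&
      example_write(measured.count, buf, text).status != EXAMPLE_SUCCESS) {
    free(buf);
    return NULL;
  }

  return buf; /* Null if allocation failed */
}
```

The same two-step shape applies to stack buffers: try a fixed-size array first, and fall back to a larger allocation only if the result reports overflow.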
-
SerdWriteResult serd_node_construct_token(size_t buf_size, void *buf, SerdNodeType type, SerdStringView string)¶
Construct a simple “token” node.
“Token” is just a shorthand used in this API to refer to a node that is not a typed or tagged literal, that is, a node that is just one string. This can be used to create URIs, blank nodes, variables, and simple string literals.
Note that string literals constructed with this function will have no flags set, and so will be written as “short” literals (not triple-quoted). To construct long literals, use the more advanced serd_node_construct_literal() with the SerdNodeFlag.SERD_IS_LONG flag.
See the serd_node_construct() documentation for details on buffer usage and the return value.
-
SerdWriteResult serd_node_construct_uri(size_t buf_size, void *buf, SerdURIView uri)¶
Construct a URI node from a parsed URI.
This is similar to serd_node_construct_token(), but will serialise a parsed URI into the new node. This can be used to resolve a relative URI reference or expand a CURIE directly into a node without needing to allocate the URI string separately.
-
SerdWriteResult serd_node_construct_file_uri(size_t buf_size, void *buf, SerdStringView path, SerdStringView hostname)¶
Construct a file URI node from a path and optional hostname.
This is similar to serd_node_construct_token(), but will create a new file URI from a file path and optional hostname, performing any necessary escaping.
-
SerdWriteResult serd_node_construct_literal(size_t buf_size, void *buf, SerdStringView string, SerdNodeFlags flags, SerdStringView meta)¶
Construct a literal node with an optional datatype or language.
Either a datatype (which must be an absolute URI) or a language (which must be an RFC5646 language tag) may be given, but not both.
This is the most general literal constructor, which can be used to construct any literal node. This works like serd_node_construct(); see its documentation for details.
-
SerdWriteResult serd_node_construct_boolean(size_t buf_size, void *buf, bool value)¶
Construct a canonical xsd:boolean literal.
The constructed node will be either “true” or “false”, with datatype xsd:boolean.
This is a convenience wrapper for serd_node_construct_literal() that constructs a node directly from a bool.
-
SerdWriteResult serd_node_construct_decimal(size_t buf_size, void *buf, double value)¶
Construct a canonical xsd:decimal literal.
The constructed node will be an xsd:decimal literal, like “12.34”, with datatype xsd:decimal.
The node will always contain a ‘.’, start with a digit, and end with a digit (a leading and/or trailing ‘0’ will be added if necessary), for example, “1.0”. It will never be in scientific notation.
This is a convenience wrapper for serd_node_construct_literal() that constructs a node directly from a double.
-
SerdWriteResult serd_node_construct_double(size_t buf_size, void *buf, double value)¶
Construct a canonical xsd:double literal.
The constructed node will be an xsd:double literal, like “1.23E45”, with datatype xsd:double. A canonical xsd:double is always in scientific notation.
This is a convenience wrapper for serd_node_construct_literal() that constructs a node directly from a double.
-
SerdWriteResult serd_node_construct_float(size_t buf_size, void *buf, float value)¶
Construct a canonical xsd:float literal.
The constructed node will be an xsd:float literal, like “1.23E45”, with datatype xsd:float. A canonical xsd:float is always in scientific notation.
Uses identical formatting to serd_node_construct_double(), except with at most 9 significant digits (under 14 characters total).
This is a convenience wrapper for serd_node_construct_literal() that constructs a node directly from a float.
-
SerdWriteResult serd_node_construct_integer(size_t buf_size, void *buf, int64_t value, SerdStringView datatype)¶
Construct a canonical xsd:integer literal.
The constructed node will be an xsd:integer literal like “1234”, with the given datatype, or datatype xsd:integer if none is given. It is the caller’s responsibility to ensure that the value is within the range of the given datatype.
-
SerdWriteResult serd_node_construct_base64(size_t buf_size, void *buf, size_t value_size, const void *value, SerdStringView datatype)¶
Construct a canonical xsd:base64Binary literal.
The constructed node will be an xsd:base64Binary literal like “Zm9vYmFy”, with the given datatype, or datatype xsd:base64Binary if none is given.
5.8.2.2 Dynamic Allocation¶
This is a convenient higher-level node construction API which allocates nodes with an allocator.
The returned nodes must be freed with serd_node_free() using the same allocator.
Note that in most cases it is better to use a SerdNodes instead of managing individual node allocations.
-
SerdNode *serd_node_new(SerdAllocator *allocator, SerdNodeType type, SerdStringView string, SerdNodeFlags flags, SerdStringView meta)¶
Create a new node of any type.
This is a wrapper for serd_node_construct() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_token(SerdAllocator *allocator, SerdNodeType type, SerdStringView string)¶
Create a new simple “token” node.
This is a wrapper for serd_node_construct_token() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_string(SerdAllocator *allocator, SerdStringView string)¶
Create a new string literal node.
This is a trivial wrapper for serd_new_token() that passes SERD_LITERAL for the type.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_uri(SerdAllocator *allocator, SerdStringView string)¶
Create a new URI node from a string.
This is a wrapper for serd_node_construct_uri() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_parsed_uri(SerdAllocator *allocator, SerdURIView uri)¶
Create a new URI node from a parsed URI.
This is a wrapper for serd_node_construct_uri() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_file_uri(SerdAllocator *allocator, SerdStringView path, SerdStringView hostname)¶
Create a new file URI node from a path and optional hostname.
This is a wrapper for serd_node_construct_file_uri() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_literal(SerdAllocator *allocator, SerdStringView string, SerdNodeFlags flags, SerdStringView meta)¶
Create a new literal node.
This is a wrapper for serd_node_construct_literal() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_boolean(SerdAllocator *allocator, bool b)¶
Create a new canonical xsd:boolean node.
This is a wrapper for serd_node_construct_boolean() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_decimal(SerdAllocator *allocator, double d)¶
Create a new canonical xsd:decimal literal.
This is a wrapper for serd_node_construct_decimal() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_double(SerdAllocator *allocator, double d)¶
Create a new canonical xsd:double literal.
This is a wrapper for serd_node_construct_double() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_float(SerdAllocator *allocator, float f)¶
Create a new canonical xsd:float literal.
This is a wrapper for serd_node_construct_float() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_integer(SerdAllocator *allocator, int64_t i, SerdStringView datatype)¶
Create a new canonical xsd:integer literal.
This is a wrapper for serd_node_construct_integer() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
SerdNode *serd_new_base64(SerdAllocator *allocator, const void *buf, size_t size, SerdStringView datatype)¶
Create a new canonical xsd:base64Binary literal.
This is a wrapper for serd_node_construct_base64() that allocates a new node on the heap.
- Returns
A newly allocated node that must be freed with serd_node_free(), or null.
-
enum SerdNodeType¶
Type of a node.
An RDF node, in the abstract sense, can be either a resource, literal, or a blank. This type is more precise, because syntactically there are two ways to refer to a resource (by URI or CURIE). Serd also has support for variable nodes to support some features, which are not RDF nodes.
There are also two ways to refer to a blank node in syntax (by ID or anonymously), but this is handled by statement flags rather than distinct node types.
-
enumerator SERD_LITERAL¶
Literal value.
A literal optionally has either a language, or a datatype (not both).
-
enumerator SERD_URI¶
URI (absolute or relative).
Value is an unquoted URI string, which is either a relative reference with respect to the current base URI (e.g. “foo/bar”), or an absolute URI (e.g. “http://example.org/foo”). RFC3986
-
enumerator SERD_BLANK¶
A blank node.
Value is a blank node ID without any syntactic prefix, like “id3”, which is meaningful only within this serialisation. RDF 1.1 Turtle
-
enumerator SERD_VARIABLE¶
A variable node.
Value is a variable name without any syntactic prefix, like “name”, which is meaningful only within this serialisation. SPARQL 1.1 Query Language
-
enum SerdNodeFlag¶
Flags that describe the details of a node.
-
enumerator SERD_IS_LONG¶
Literal node should be triple-quoted.
-
enumerator SERD_HAS_DATATYPE¶
Literal node has datatype.
-
enumerator SERD_HAS_LANGUAGE¶
Literal node has language.
-
typedef struct SerdNodeImpl SerdNode¶
An RDF node.
-
typedef uint32_t SerdNodeFlags¶
Bitwise OR of SerdNodeFlag values.
-
bool serd_get_boolean(const SerdNode *node)¶
Return the value of node as a boolean.
This will work for booleans, and numbers of any datatype if they are 0 or 1.
- Returns
The value of node as a bool, or false on error.
-
double serd_get_double(const SerdNode *node)¶
Return the value of node as a double.
This will coerce numbers of any datatype to double, if the value fits.
- Returns
The value of node as a double, or NaN on error.
-
float serd_get_float(const SerdNode *node)¶
Return the value of node as a float.
This will coerce numbers of any datatype to float, if the value fits.
- Returns
The value of node as a float, or NaN on error.
-
int64_t serd_get_integer(const SerdNode *node)¶
Return the value of node as a long (signed 64-bit integer).
This will coerce numbers of any datatype to long, if the value fits.
- Returns
The value of node as an int64_t, or 0 on error.
-
size_t serd_get_base64_size(const SerdNode *node)¶
Return the maximum size of a decoded base64 node in bytes.
This returns an upper bound on the number of bytes that would be decoded by serd_get_base64(). This is calculated as a simple constant-time arithmetic expression based on the length of the encoded string, so may be larger than the actual size of the data due to things like additional whitespace.
-
SerdWriteResult serd_get_base64(const SerdNode *node, size_t buf_size, void *buf)¶
Decode a base64 node.
This function can be used to decode a node created with serd_new_base64().
- Parameters
node – A literal node which is an encoded base64 string.
buf_size – The size of buf in bytes.
buf – Buffer where decoded data will be written.
- Returns
On success, SerdStatus.SERD_SUCCESS is returned along with the number of bytes written. If the output buffer is too small, then SerdStatus.SERD_OVERFLOW is returned along with the number of bytes required for successful decoding.
-
SerdNode *serd_node_copy(SerdAllocator *allocator, const SerdNode *node)¶
Return a deep copy of node.
-
void serd_node_free(SerdAllocator *allocator, SerdNode *node)¶
Free any data owned by node.
-
SerdNodeType serd_node_type(const SerdNode *node)¶
Return the type of a node (SERD_LITERAL, SERD_URI, SERD_BLANK, or SERD_VARIABLE).
-
size_t serd_node_length(const SerdNode *node)¶
Return the length of the node’s string in bytes (excluding terminator)
-
SerdStringView serd_node_string_view(const SerdNode *node)¶
Return a view of the string in a node.
This is a convenience wrapper for serd_node_string() and serd_node_length() that can be used to get both in a single call.
-
SerdURIView serd_node_uri_view(const SerdNode *node)¶
Return a parsed view of the URI in a node.
It is best to check the node type before calling this function, though it is safe to call on non-URI nodes. In that case, it will return a null view with all fields zero.
Note that this parses the URI string contained in the node, so it is a good idea to keep the value if you will be using it several times in the same scope.
-
SerdNodeFlags serd_node_flags(const SerdNode *node)¶
Return the flags (string properties) of a node.
-
const SerdNode *serd_node_datatype(const SerdNode *node)¶
Return the datatype of the literal node, if present.
-
const SerdNode *serd_node_language(const SerdNode *node)¶
Return the language tag of the literal node, if present.
-
int serd_node_compare(const SerdNode *a, const SerdNode *b)¶
Compare two nodes.
Returns less than, equal to, or greater than zero if a is less than, equal to, or greater than b, respectively. NULL is treated as less than any other node.
Nodes are ordered first by type, then by string value, then by language or datatype, if present.
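The ordering described above can be sketched as a three-stage comparison. The following plain C example uses a hypothetical flattened ExampleNode rather than the opaque SerdNode. The numeric order of the types, and the choice to sort a node without a datatype or language before one that has it, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node types, in the order the enumerators are listed above */
typedef enum {
  EXAMPLE_LITERAL,
  EXAMPLE_URI,
  EXAMPLE_BLANK,
  EXAMPLE_VARIABLE
} ExampleNodeType;

/* Hypothetical flattened node: `meta` is a datatype or language, or null */
typedef struct {
  ExampleNodeType type;
  const char*     string;
  const char*     meta;
} ExampleNode;

static int
example_node_compare(const ExampleNode* a, const ExampleNode* b)
{
  /* Null sorts before any node, and is equal only to null */
  if (!a || !b) {
    return (a != NULL) - (b != NULL);
  }

  assert(a->string && b->string);

  /* Order first by type... */
  if (a->type != b->type) {
    return a->type < b->type ? -1 : 1;
  }

  /* ...then by string value... */
  const int cmp = strcmp(a->string, b->string);
  if (cmp) {
    return cmp;
  }

  /* ...then by datatype or language (assumption: a missing one sorts first) */
  if (!a->meta || !b->meta) {
    return (a->meta != NULL) - (b->meta != NULL);
  }

  return strcmp(a->meta, b->meta);
}
```

A comparison with this shape gives a total order, so it can be used to sort nodes or as the key comparison in an ordered container.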
5.8.3 Nodes¶
-
typedef struct SerdNodesImpl SerdNodes¶
Hashing node container for interning and simplified memory management.
-
SerdNodes *serd_nodes_new(SerdAllocator *allocator)¶
Create a new node set.
-
void serd_nodes_free(SerdNodes *nodes)¶
Free nodes and all nodes that are stored in it.
Note that this invalidates any node pointers previously returned from nodes.
-
const SerdNode *serd_nodes_get(const SerdNodes *nodes, const SerdNode *node)¶
Return the existing interned copy of a node if it exists.
This either returns an equivalent to the given node, or null if this node has not been interned.
-
const SerdNode *serd_nodes_intern(SerdNodes *nodes, const SerdNode *node)¶
Intern node.
Multiple calls with equivalent nodes will return the same pointer.
- Returns
A node that is different than, but equivalent to, node.
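The get/intern behaviour described above can be sketched without serd. In the following, `DemoInterner` and the `demo_*` functions are hypothetical stand-ins that intern plain strings with a linear search; the real container hashes nodes, so this only illustrates the pointer-identity guarantee, not the implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_STRINGS 64

/* Hypothetical stand-in for a node set: a fixed-capacity table of owned strings */
typedef struct {
  char*  strings[DEMO_MAX_STRINGS];
  size_t count;
} DemoInterner;

/* Return the stored copy of an equivalent string, or NULL if there is none */
static const char*
demo_get(const DemoInterner* set, const char* string)
{
  for (size_t i = 0; i < set->count; ++i) {
    if (!strcmp(set->strings[i], string)) {
      return set->strings[i];
    }
  }

  return NULL;
}

/* Return the stored copy of `string`, adding a new copy if necessary */
static const char*
demo_intern(DemoInterner* set, const char* string)
{
  assert(set && string);

  const char* const existing = demo_get(set, string);
  if (existing || set->count == DEMO_MAX_STRINGS) {
    return existing; /* Found, or NULL because the table is full */
  }

  const size_t len  = strlen(string);
  char* const  copy = (char*)malloc(len + 1);
  if (copy) {
    memcpy(copy, string, len + 1);
    set->strings[set->count++] = copy;
  }

  return copy;
}

/* Free every stored string, invalidating all previously returned pointers */
static void
demo_free(DemoInterner* set)
{
  for (size_t i = 0; i < set->count; ++i) {
    free(set->strings[i]);
  }

  set->count = 0;
}
```

Since equivalent values always yield the same pointer, callers of such a container can compare interned values by pointer instead of by content.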
-
const SerdNode *serd_nodes_string(SerdNodes *nodes, SerdStringView string)¶
Make a string node.
A new node will be added if an equivalent node is not already in the set.
-
const SerdNode *serd_nodes_literal(SerdNodes *nodes, SerdStringView string, SerdNodeFlags flags, SerdStringView meta)¶
Make a literal node with optional datatype or language.
This can create complex literals with an associated datatype URI or language tag, and control whether a literal should be written as a short or long (triple-quoted) string.
- Parameters
nodes – The node set to get this literal from.
string – The string value of the literal.
flags – Flags to describe the literal and its metadata. Note that at most one of SerdNodeFlag.SERD_HAS_DATATYPE and SerdNodeFlag.SERD_HAS_LANGUAGE may be set.
meta – The string value of the literal’s metadata. If SerdNodeFlag.SERD_HAS_DATATYPE is set, then this must be an absolute datatype URI. If SerdNodeFlag.SERD_HAS_LANGUAGE is set, then this must be an RFC 5646 language tag like “en-ca”. Otherwise, it is ignored.
-
const SerdNode *serd_nodes_boolean(SerdNodes *nodes, bool value)¶
Make a canonical xsd:boolean node.
A new node will be constructed with serd_node_construct_boolean() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_decimal(SerdNodes *nodes, double value)¶
Make a canonical xsd:decimal node.
A new node will be constructed with serd_node_construct_decimal() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_double(SerdNodes *nodes, double value)¶
Make a canonical xsd:double node.
A new node will be constructed with serd_node_construct_double() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_float(SerdNodes *nodes, float value)¶
Make a canonical xsd:float node.
A new node will be constructed with serd_node_construct_float() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_integer(SerdNodes *nodes, int64_t value, SerdStringView datatype)¶
Make a canonical xsd:integer node.
A new node will be constructed with serd_node_construct_integer() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_base64(SerdNodes *nodes, const void *value, size_t value_size, SerdStringView datatype)¶
Make a canonical xsd:base64Binary node.
A new node will be constructed with serd_node_construct_base64() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_uri(SerdNodes *nodes, SerdStringView string)¶
Make a URI node from a string.
A new node will be constructed with serd_node_construct_token() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_parsed_uri(SerdNodes *nodes, SerdURIView uri)¶
Make a URI node from a parsed URI.
A new node will be constructed with serd_node_construct_uri() if an equivalent one is not already in the set.
-
const SerdNode *serd_nodes_blank(SerdNodes *nodes, SerdStringView string)¶
Make a blank node.
A new node will be constructed with serd_node_construct_token() if an equivalent one is not already in the set.
5.8.4 Caret¶
-
typedef struct SerdCaretImpl SerdCaret¶
The origin of a statement in a text document.
-
SerdCaret *serd_caret_new(SerdAllocator *allocator, const SerdNode *name, unsigned line, unsigned col)¶
Create a new caret.
Note that, to minimise model overhead, the caret does not own the name node, so name must have a longer lifetime than the caret for it to be valid. That is, serd_caret_name() will return exactly the pointer name, not a copy.
- Parameters
allocator – Allocator to use for caret memory.
name – The name of the document or stream (usually a file URI)
line – The line number in the document (1-based)
col – The column number in the document (1-based)
- Returns
A new caret that must be freed with serd_caret_free().
-
SerdCaret *serd_caret_copy(SerdAllocator *allocator, const SerdCaret *caret)¶
Return a copy of caret.
-
void serd_caret_free(SerdAllocator *allocator, SerdCaret *caret)¶
Free caret.
-
bool serd_caret_equals(const SerdCaret *lhs, const SerdCaret *rhs)¶
Return true iff lhs is equal to rhs.
-
const SerdNode *serd_caret_name(const SerdCaret *caret)¶
Return the document name.
This is typically a file URI, but may be a descriptive string node for statements that originate from streams.
5.8.5 Statement¶
-
enum SerdField¶
Index of a node in a statement.
-
enumerator SERD_SUBJECT¶
Subject.
-
enumerator SERD_PREDICATE¶
Predicate (“key”)
-
enumerator SERD_OBJECT¶
Object (“value”)
-
enumerator SERD_GRAPH¶
Graph (“context”)
-
typedef struct SerdStatementImpl SerdStatement¶
A subject, predicate, and object, with optional graph context.
-
SerdStatement *serd_statement_new(SerdAllocator *allocator, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g, const SerdCaret *caret)¶
Create a new statement.
Note that, to minimise model overhead, statements do not own their nodes, so they must have a longer lifetime than the statement for it to be valid. For statements in models, this is the lifetime of the model. For user-created statements, the simplest way to handle this is to use SerdNodes.
- Parameters
allocator – Allocator to use for statement memory.
s – The subject
p – The predicate (“key”)
o – The object (“value”)
g – The graph (“context”)
caret – Optional caret at the origin of this statement
- Returns
A new statement that must be freed with serd_statement_free().
-
SerdStatement *serd_statement_copy(SerdAllocator *allocator, const SerdStatement *statement)¶
Return a copy of statement.
-
void serd_statement_free(SerdAllocator *allocator, SerdStatement *statement)¶
Free statement.
-
const SerdNode *serd_statement_node(const SerdStatement *statement, SerdField field)¶
Return the given node of the statement.
-
const SerdNode *serd_statement_subject(const SerdStatement *statement)¶
Return the subject of the statement.
-
const SerdNode *serd_statement_predicate(const SerdStatement *statement)¶
Return the predicate of the statement.
-
const SerdNode *serd_statement_object(const SerdStatement *statement)¶
Return the object of the statement.
-
const SerdNode *serd_statement_graph(const SerdStatement *statement)¶
Return the graph of the statement.
-
const SerdCaret *serd_statement_caret(const SerdStatement *statement)¶
Return the source location where the statement originated, or NULL.
-
bool serd_statement_equals(const SerdStatement *a, const SerdStatement *b)¶
Return true iff a is equal to b, ignoring statement caret metadata.
This only returns true if the nodes are equivalent; it does not perform wildcard matching.
-
bool serd_statement_matches(const SerdStatement *statement, const SerdNode *subject, const SerdNode *predicate, const SerdNode *object, const SerdNode *graph)¶
Return true iff the statement matches the given pattern.
Nodes match if they are equivalent, or if one of them is NULL. The statement matches if every node matches.
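The matching rule above, where NULL acts as a wildcard on either side, can be sketched as follows. This is a minimal illustration that does not use serd: `DemoStatement` and the `demo_*` functions are hypothetical, and nodes are reduced to plain strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: nodes are plain strings, and NULL means "no node" */
typedef struct {
  const char* nodes[4]; /* Subject, predicate, object, graph */
} DemoStatement;

/* Two nodes match if either is NULL, or if they are equal strings */
static bool
demo_node_matches(const char* a, const char* b)
{
  return !a || !b || !strcmp(a, b);
}

/* A statement matches a pattern if every one of its four nodes matches */
static bool
demo_statement_matches(const DemoStatement* statement,
                       const char*          subject,
                       const char*          predicate,
                       const char*          object,
                       const char*          graph)
{
  assert(statement);

  const char* const pattern[4] = {subject, predicate, object, graph};

  for (size_t i = 0; i < 4; ++i) {
    if (!demo_node_matches(statement->nodes[i], pattern[i])) {
      return false;
    }
  }

  return true;
}
```

Note that, under this rule, a statement without a graph matches a pattern with any graph, because one of the two graph nodes is NULL.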
5.9 World¶
5.9.1 Logging¶
-
struct SerdLogField¶
A structured log field.
This can be used to pass additional information along with log messages. Syslog-compatible keys should be used where possible; otherwise, keys should be namespaced to prevent clashes.
Serd itself uses the following keys:
ERRNO
SERD_COL
SERD_FILE
SERD_LINE
SERD_STATUS
-
const char *key¶
Field name.
-
const char *value¶
Field value.
-
struct SerdLogEntry¶
A log entry (message).
This is the description of a log entry which is passed to log functions. It is only valid in the stack frame it appears in, and may not be copied.
An entry is a single self-contained message, so the string should not include a trailing newline.
-
const SerdLogField *fields¶
Extra log fields.
-
const char *fmt¶
Printf-style format string.
-
va_list *args¶
Arguments for fmt.
-
SerdLogLevel level¶
Log level.
-
size_t n_fields¶
Number of fields.
-
enum SerdLogLevel¶
Log message level, compatible with syslog.
-
enumerator SERD_LOG_LEVEL_EMERGENCY¶
Emergency, system is unusable.
-
enumerator SERD_LOG_LEVEL_ALERT¶
Action must be taken immediately.
-
enumerator SERD_LOG_LEVEL_CRITICAL¶
Critical condition.
-
enumerator SERD_LOG_LEVEL_ERROR¶
Error.
-
enumerator SERD_LOG_LEVEL_WARNING¶
Warning.
-
enumerator SERD_LOG_LEVEL_NOTICE¶
Normal but significant condition.
-
enumerator SERD_LOG_LEVEL_INFO¶
Informational message.
-
enumerator SERD_LOG_LEVEL_DEBUG¶
Debug message.
-
typedef SerdStatus (*SerdLogFunc)(void *handle, const SerdLogEntry *entry)¶
Sink function for log messages.
- Parameters
handle – Handle for user data.
entry – Pointer to log entry description.
-
SerdStatus serd_quiet_error_func(void *handle, const SerdLogEntry *entry)¶
A SerdLogFunc that does nothing, for suppressing log output.
-
const char *serd_log_entry_get_field(const SerdLogEntry *entry, const char *key)¶
Return the value of the log field named key, or NULL if none exists.
-
void serd_world_set_log_func(SerdWorld *world, SerdLogFunc log_func, void *handle)¶
Set a function to be called with log messages (typically errors).
If no custom logging function is set, then messages are printed to stderr.
- Parameters
world – World that will send log entries to the given function.
log_func – Log function to call for every log message. Each call to this function represents a complete log message with an implicit trailing newline.
handle – Opaque handle that will be passed to every invocation of log_func.
-
SerdStatus serd_world_logf(const SerdWorld *world, SerdLogLevel level, size_t n_fields, const SerdLogField *fields, const char *fmt, ...)¶
Write a message to the log.
This writes a single complete entry to the log, and so may not be used to print parts of a line like a more general printf-like function. There should be no trailing newline in fmt. Arguments following fmt should correspond to conversion specifiers in the format string as in printf from the standard C library.
- Parameters
world – World to log to.
level – Log level.
n_fields – Number of entries in fields.
fields – An array of n_fields extra log fields.
fmt – Format string.
-
SerdStatus serd_world_vlogf(const SerdWorld *world, SerdLogLevel level, size_t n_fields, const SerdLogField *fields, const char *fmt, va_list args)¶
Write a message to the log with a va_list.
This is the same as serd_world_logf() except it takes format arguments as a va_list for composability.
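The relationship between a variadic logging function and its va_list counterpart can be sketched in standard C. Nothing below is serd API: `DemoLogger`, `demo_vlogf()`, `demo_logf()`, and `demo_capture()` are hypothetical names, and the fixed 256-byte message buffer is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical log sink: receives one complete message with no trailing newline */
typedef void (*DemoLogFunc)(void* handle, const char* message);

typedef struct {
  DemoLogFunc func;   /* Function called for every message */
  void*       handle; /* Opaque user data passed to func */
} DemoLogger;

/* va_list variant: formats the message and sends it to the sink */
static int
demo_vlogf(const DemoLogger* logger, const char* fmt, va_list args)
{
  assert(logger && logger->func && fmt);

  char      message[256];
  const int n = vsnprintf(message, sizeof(message), fmt, args);
  if (n < 0) {
    return -1;
  }

  logger->func(logger->handle, message); /* Long messages are truncated */
  return 0;
}

/* Variadic variant: a thin wrapper that forwards to the va_list variant */
static int
demo_logf(const DemoLogger* logger, const char* fmt, ...)
{
  va_list args;
  va_start(args, fmt);
  const int st = demo_vlogf(logger, fmt, args);
  va_end(args);
  return st;
}

/* Example sink: copies the message into a caller-provided 256-byte buffer */
static void
demo_capture(void* handle, const char* message)
{
  char* const out = (char*)handle;
  strncpy(out, message, 255);
  out[255] = '\0';
}
```

Any application-level wrapper with its own variadic signature can be built the same way, by collecting its arguments with va_start() and passing the va_list down.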
-
typedef struct SerdWorldImpl SerdWorld¶
Global library state.
-
SerdWorld *serd_world_new(SerdAllocator *allocator)¶
Create a new Serd World.
It is safe to use multiple worlds in one process, though no objects can be shared between worlds.
-
SerdNodes *serd_world_nodes(SerdWorld *world)¶
Return the nodes cache in world.
The returned cache is owned by the world and contains various nodes used frequently by the implementation. For convenience, it may be used to store additional nodes which will be freed when the world is freed.
-
const SerdNode *serd_world_get_blank(SerdWorld *world)¶
Return a unique blank node.
The returned node is valid only until the next time serd_world_get_blank() is called or the world is destroyed.
5.10 Data Streaming¶
5.10.1 Events¶
-
struct SerdBaseEvent¶
Event for base URI changes.
Emitted whenever the base URI changes.
-
struct SerdPrefixEvent¶
Event for namespace definitions.
Emitted whenever a prefix is defined.
-
struct SerdStatementEvent¶
Event for statements.
Emitted for every statement.
-
SerdStatementFlags flags¶
Flags for pretty-printing.
-
const SerdStatement *statement¶
Statement.
-
struct SerdEndEvent¶
Event for the end of anonymous node descriptions.
This is emitted to indicate that the given anonymous node will no longer be described. This is used by the writer which may, for example, need to write a delimiter.
-
union SerdEvent¶
An event in a data stream.
Streams of data are represented as a series of events. Events represent everything that can occur in an RDF document, and are used to plumb together different components. For example, when parsing a document, a reader emits a stream of events which can be sent to a writer to rewrite a document, or to an inserter to build a model in memory.
-
SerdEventType type¶
Event type (always set)
-
SerdBaseEvent base¶
Base URI changed.
-
SerdPrefixEvent prefix¶
New namespace prefix.
-
SerdStatementEvent statement¶
Statement.
-
SerdEndEvent end¶
End of anonymous node.
-
enum SerdEventType¶
Type of a SerdEvent.
-
enumerator SERD_BASE¶
Base URI changed.
-
enumerator SERD_PREFIX¶
New URI prefix.
-
enumerator SERD_STATEMENT¶
Statement.
-
enumerator SERD_END¶
End of anonymous node.
-
enum SerdStatementFlag¶
Flags indicating inline abbreviation information for a statement.
-
enumerator SERD_EMPTY_S¶
Empty blank node subject.
-
enumerator SERD_EMPTY_G¶
Empty blank node graph.
-
enumerator SERD_ANON_S¶
Start of anonymous subject.
-
enumerator SERD_ANON_O¶
Start of anonymous object.
-
enumerator SERD_LIST_S¶
Start of list subject.
-
enumerator SERD_LIST_O¶
Start of list object.
-
enumerator SERD_TERSE_S¶
Start of terse subject.
-
enumerator SERD_TERSE_O¶
Start of terse object.
-
typedef uint32_t SerdStatementFlags¶
Bitwise OR of SerdStatementFlag values.
-
typedef SerdStatus (*SerdEventFunc)(void *handle, const SerdEvent *event)¶
Function for handling events.
5.10.2 Sink¶
-
typedef struct SerdSinkImpl SerdSink¶
An interface that receives a stream of RDF data.
-
typedef void (*SerdFreeFunc)(void *ptr)¶
Function to free an opaque handle.
-
SerdSink *serd_sink_new(const SerdWorld *world, void *handle, SerdEventFunc event_func, SerdFreeFunc free_handle)¶
Create a new sink.
- Parameters
world – The world the new sink will be a part of.
handle – Opaque handle that will be passed to sink functions.
event_func – Function that will be called for every event.
free_handle – Free function to call on handle in serd_sink_free().
-
SerdStatus serd_sink_write_event(const SerdSink *sink, const SerdEvent *event)¶
Send an event to the sink.
-
SerdStatus serd_sink_write_base(const SerdSink *sink, const SerdNode *uri)¶
Set the base URI.
-
SerdStatus serd_sink_write_prefix(const SerdSink *sink, const SerdNode *name, const SerdNode *uri)¶
Set a namespace prefix.
-
SerdStatus serd_sink_write_statement(const SerdSink *sink, SerdStatementFlags flags, const SerdStatement *statement)¶
Write a statement.
-
SerdStatus serd_sink_write(const SerdSink *sink, SerdStatementFlags flags, const SerdNode *subject, const SerdNode *predicate, const SerdNode *object, const SerdNode *graph)¶
Write a statement from individual nodes.
-
SerdStatus serd_sink_write_end(const SerdSink *sink, const SerdNode *node)¶
Mark the end of an anonymous node.
5.10.3 Canon¶
-
enum SerdCanonFlag¶
Flags that control canonical node transformation.
-
enumerator SERD_CANON_LAX¶
Tolerate and pass through invalid input.
-
typedef uint32_t SerdCanonFlags¶
Bitwise OR of SerdCanonFlag values.
-
SerdSink *serd_canon_new(const SerdWorld *world, const SerdSink *target, SerdCanonFlags flags)¶
Return a new sink that transforms literals to canonical form where possible.
The returned sink acts like target in all respects, except literal nodes in statements may be modified from the original.
5.10.4 Filter¶
-
SerdSink *serd_filter_new(const SerdWorld *world, const SerdSink *target, const SerdNode *subject, const SerdNode *predicate, const SerdNode *object, const SerdNode *graph, bool inclusive)¶
Return a new sink that filters out statements that do not match a pattern.
The returned sink acts like target in all respects, except that some statements may be dropped.
- Parameters
world – The world the new sink will be a part of.
target – The target sink to pass the filtered data to.
subject – The optional subject of the filter pattern.
predicate – The optional predicate of the filter pattern.
object – The optional object of the filter pattern.
graph – The optional graph of the filter pattern.
inclusive – If true, then only statements that match the pattern are passed through. Otherwise, only statements that do not match the pattern are passed through.
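The effect of the inclusive parameter can be sketched with a stand-in sink chain. This is not serd code: `DemoTriple`, `DemoFilter`, and the `demo_*` functions are hypothetical, graphs are omitted, and nodes are plain strings with NULL as a wildcard in the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a statement of three strings, and a sink callback */
typedef struct {
  const char* s;
  const char* p;
  const char* o;
} DemoTriple;

typedef int (*DemoSinkFunc)(void* handle, const DemoTriple* triple);

typedef struct {
  DemoSinkFunc target;        /* Sink that receives statements that pass */
  void*        target_handle; /* Handle passed to the target sink */
  DemoTriple   pattern;       /* NULL fields are wildcards */
  bool         inclusive;     /* Pass matching (true) or non-matching (false) */
} DemoFilter;

static bool
demo_field_matches(const char* pattern, const char* value)
{
  return !pattern || !strcmp(pattern, value);
}

/* Sink function for the filter: forward the statement or silently drop it */
static int
demo_filter_write(void* handle, const DemoTriple* triple)
{
  const DemoFilter* const filter = (const DemoFilter*)handle;
  assert(filter && filter->target && triple);

  const bool matches = demo_field_matches(filter->pattern.s, triple->s) &&
                       demo_field_matches(filter->pattern.p, triple->p) &&
                       demo_field_matches(filter->pattern.o, triple->o);

  return (matches == filter->inclusive)
           ? filter->target(filter->target_handle, triple)
           : 0; /* Dropped statements are not an error */
}

/* Example target sink: counts the statements it receives */
static int
demo_count(void* handle, const DemoTriple* triple)
{
  (void)triple;
  ++*(size_t*)handle;
  return 0;
}
```

With inclusive set to true the filter acts as a selector, and with false it acts as a remover, while everything that passes reaches the target unchanged.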
5.11 Environment¶
-
typedef struct SerdEnvImpl SerdEnv¶
Lexical environment for relative URIs or CURIEs (base URI and namespaces)
-
SerdEnv *serd_env_new(const SerdWorld *world, SerdStringView base_uri)¶
Create a new environment.
-
SerdEnv *serd_env_copy(SerdAllocator *allocator, const SerdEnv *env)¶
Copy an environment.
-
SerdStatus serd_env_set_base_uri(SerdEnv *env, SerdStringView uri)¶
Set the current base URI.
-
SerdStatus serd_env_set_prefix(SerdEnv *env, SerdStringView name, SerdStringView uri)¶
Set a namespace prefix.
A namespace prefix is used to expand CURIE nodes, for example, with the prefix “xsd” set to “http://www.w3.org/2001/XMLSchema#”, “xsd:decimal” will expand to “http://www.w3.org/2001/XMLSchema#decimal”.
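The expansion described above amounts to replacing the prefix name and colon with the prefix URI. A minimal sketch, assuming a hypothetical `DemoPrefix` table and `demo_expand_curie()` function that are not part of serd:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one namespace prefix in an environment */
typedef struct {
  const char* name; /* Prefix name, like "xsd" */
  const char* uri;  /* Namespace URI the prefix expands to */
} DemoPrefix;

/* Expand "name:suffix" into `out` using the first prefix named `name`.
   Returns 0 on success, or -1 if there is no colon, the prefix is not
   defined, or `out` is too small. */
static int
demo_expand_curie(const DemoPrefix* prefixes,
                  size_t            n_prefixes,
                  const char*       curie,
                  char*             out,
                  size_t            out_size)
{
  assert(prefixes && curie && out);

  const char* const colon = strchr(curie, ':');
  if (!colon) {
    return -1;
  }

  const size_t name_len = (size_t)(colon - curie);
  for (size_t i = 0; i < n_prefixes; ++i) {
    if (strlen(prefixes[i].name) == name_len &&
        !strncmp(prefixes[i].name, curie, name_len)) {
      const int n =
        snprintf(out, out_size, "%s%s", prefixes[i].uri, colon + 1);

      return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
    }
  }

  return -1;
}
```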
-
SerdNode *serd_env_expand(const SerdEnv *env, const SerdNode *node)¶
Expand node, which must be a CURIE or URI, to a full URI.
Returns null if node can not be expanded.
-
void serd_env_write_prefixes(const SerdEnv *env, const SerdSink *sink)¶
Write all prefixes in env to sink.
-
SerdNode *serd_node_from_syntax(SerdWorld *world, const char *str, SerdSyntax syntax, SerdEnv *env)¶
Create a node from a string representation in syntax.
The string should be a node as if written as an object in the given syntax, without any extra quoting or punctuation, which is the format returned by serd_node_to_syntax(). These two functions, when used with SerdSyntax.SERD_TURTLE, can be used to round-trip any node to a string and back.
- Parameters
world – The world.
str – String representation of a node.
syntax – Syntax to use. Should be either SERD_TURTLE or SERD_NTRIPLES (the others are redundant). Note that namespaced (CURIE) nodes and relative URIs can not be expressed in NTriples.
env – Environment of str. This must define any abbreviations needed to parse the string.
- Returns
A newly allocated node that must be freed with serd_node_free() using the world allocator.
-
char *serd_node_to_syntax(SerdWorld *world, const SerdNode *node, SerdSyntax syntax, const SerdEnv *env)¶
Return a string representation of node in syntax.
The returned string represents that node as if written as an object in the given syntax, without any extra quoting or punctuation.
- Parameters
world – World used to allocate internal components and the returned string.
node – Node to write as a string.
syntax – Syntax to use. Should be either SERD_TURTLE or SERD_NTRIPLES (the others are redundant). Note that namespaced (CURIE) nodes and relative URIs can not be expressed in NTriples.
env – Environment for the output string. This can be used to abbreviate things nicely by setting namespace prefixes.
- Returns
A newly allocated string that must be freed with serd_free() using the world allocator.
5.12 Reading and Writing¶
5.12.1 Input Streams¶
An input stream is used for reading input as a raw stream of bytes.
It is compatible with standard C FILE streams, but allows different functions to be provided for things like reading from a buffer or a socket.
-
struct SerdInputStream¶
An input stream that produces bytes.
-
void *stream¶
Opaque parameter for functions.
-
SerdReadFunc read¶
Read bytes from input.
-
SerdErrorFunc error¶
Stream error accessor.
-
SerdCloseFunc close¶
Close input.
-
SerdInputStream serd_open_input_stream(SerdReadFunc read_func, SerdErrorFunc error_func, SerdCloseFunc close_func, void *stream)¶
Open a stream that reads from a provided function.
- Parameters
read_func – Function to read input.
error_func – Function used to detect errors.
close_func – Function to close the stream after reading is done.
stream – Opaque stream parameter for functions.
- Returns
An opened input stream, or all zeros on error.
-
SerdInputStream serd_open_input_string(const char **position)¶
Open a stream that reads from a string.
The string pointer that position points to must remain valid until the stream is closed. This pointer serves as the internal stream state and will be mutated as the stream is used.
- Parameters
position – Pointer to a valid string pointer for use as stream state.
- Returns
An opened input stream, or all zeros on error.
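To illustrate how a string pointer can serve as mutable stream state, here is a sketch of a read function with an fread-like signature whose stream argument is a `const char**`. The function name `demo_string_read()` is hypothetical, the fread-like signature is an assumption based on the stated FILE compatibility, and this is not serd's own implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Read up to size * nmemb bytes from the string that `stream` points to,
   advancing the string pointer past everything that was consumed.
   Returns the number of complete elements read (intended for size == 1). */
static size_t
demo_string_read(void* buf, size_t size, size_t nmemb, void* stream)
{
  assert(buf && stream);

  const char** const position = (const char**)stream;
  char* const        out      = (char*)buf;
  const size_t       wanted   = size * nmemb;

  size_t n = 0;
  while (n < wanted && **position) {
    out[n++] = *(*position)++; /* Copy one byte and advance the state */
  }

  return size ? n / size : 0; /* Zero indicates the end of the string */
}
```

Because the caller's pointer is advanced in place, the caller can observe how much input has been consumed, which is why the pointer must stay valid for the life of the stream.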
-
SerdInputStream serd_open_input_file(const char *path)¶
Open a stream that reads from a file.
An arbitrary FILE* can be used with serd_open_input_stream() as well; this convenience function opens the file properly for reading with serd, and sets flags for optimized I/O if possible.
- Parameters
path – Path of file to open and read from.
-
SerdStatus serd_close_input(SerdInputStream *input)¶
Close an input stream.
This will call the close function, and reset the stream internally so that no further reads can be made. For convenience, this is safe to call on NULL, and safe to call several times on the same input.
5.12.2 Reader¶
-
enum SerdReaderFlag¶
Reader options.
-
enumerator SERD_READ_LAX¶
Tolerate invalid input where possible.
This will attempt to ignore invalid input and continue reading. Invalid Unicode characters will be replaced with the replacement character, and various other syntactic problems will be ignored. If there are more severe problems, the reader will try to skip the statement and continue parsing. This should work reasonably well for line-based syntaxes like NTriples and NQuads, but abbreviated Turtle or TriG may not recover.
Note that this flag should be used carefully, since it can result in data loss.
-
enumerator SERD_READ_VARIABLES¶
Support reading variable nodes.
As an extension, serd supports reading variable nodes with SPARQL-like syntax, for example “?foo” or “$bar”. This can be used for storing graph patterns and templates.
-
enumerator SERD_READ_RELATIVE¶
Read relative URI references exactly without resolving them.
Normally, the reader expands all relative URIs against the base URI. This flag disables that, so that URI references are passed to the sink exactly as they are in the input.
-
enumerator SERD_READ_GLOBAL¶
Read blank node labels without adding a prefix unique to the document.
Normally, the reader adds a prefix like “f1”, “f2”, and so on, to blank node labels, to separate the namespaces from separate input documents. This flag disables that, so that blank node labels will be read without any prefix added.
Note that this flag should be used carefully, since it can result in data corruption. Specifically, if data from separate documents parsed with this flag is combined, the IDs from each document may clash.
-
enumerator SERD_READ_GENERATED¶
Read generated blank node labels exactly without adjusting them.
Normally, the reader will adapt blank node labels in the input that clash with its scheme for generating new ones, for example mapping “_:b123” to “_:B123”. This flag disables that, so that blank node labels are passed to the sink exactly as they are in the input.
Note that this flag should be used carefully, since it can result in data corruption. Specifically, if the input is a syntax like Turtle with anonymous nodes, the generated IDs for those nodes may clash with IDs from the input document.
-
typedef struct SerdReaderImpl SerdReader¶
Streaming parser that reads a text stream and writes to a statement sink.
-
typedef uint32_t SerdReaderFlags¶
Bitwise OR of SerdReaderFlag values.
-
SerdReader *serd_reader_new(SerdWorld *world, SerdSyntax syntax, SerdReaderFlags flags, SerdEnv *env, const SerdSink *sink, size_t stack_size)¶
Create a new RDF reader.
-
SerdStatus serd_reader_start(SerdReader *reader, SerdInputStream *input, const SerdNode *input_name, size_t block_size)¶
Prepare to read some input.
This sets up the reader to read from the given input, but will not read any bytes from it. This should be followed by serd_reader_read_chunk() or serd_reader_read_document() to actually read the input.
- Parameters
reader – The reader.
input – An opened input stream to read from.
input_name – The name of the input stream for error messages.
block_size – The number of bytes to read from the stream at once.
-
SerdStatus serd_reader_read_chunk(SerdReader *reader)¶
Read a single “chunk” of data during an incremental read.
This function will read a single top level description, and return. This may be a directive, statement, or several statements; essentially it reads until a ‘.’ is encountered. This is particularly useful for reading directly from a pipe or socket.
-
SerdStatus serd_reader_read_document(SerdReader *reader)¶
Read a complete document from the source.
This function will continue pulling from the source until a complete document has been read. Note that this may block when used with streams; for incremental reading, use serd_reader_read_chunk().
-
SerdStatus serd_reader_finish(SerdReader *reader)¶
Finish reading from the source.
This should be called before starting to read from another source.
-
void serd_reader_free(SerdReader *reader)¶
Free reader.
The reader will be finished via serd_reader_finish() if necessary.
5.12.3 Buffer¶
The SerdBuffer type represents a writable area of memory with a known size.
Implementations of SerdWriteFunc, SerdErrorFunc, and SerdCloseFunc are provided, which allow output to be written to a buffer in memory instead of to a file as with fwrite, ferror, and fclose.
-
struct SerdBuffer¶
A mutable buffer in memory.
-
void *buf¶
Buffer.
-
size_t len¶
Size of buffer in bytes.
-
struct SerdDynamicBuffer¶
A dynamically resizable mutable buffer in memory.
-
SerdAllocator *allocator¶
Allocator for buf.
-
void *buf¶
Buffer.
-
size_t len¶
Size of buffer in bytes.
-
size_t serd_buffer_write(const void *buf, size_t size, size_t nmemb, void *stream)¶
A function for writing to a buffer, resizing it if necessary.
This function can be used as a SerdWriteFunc to write to a SerdDynamicBuffer which is reallocated as necessary. The stream parameter must point to an initialized SerdDynamicBuffer.
Note that when writing a string, the string in the buffer will not be null-terminated until serd_buffer_close() is called.
-
int serd_buffer_close(void *stream)¶
Close the buffer for writing.
This writes a terminating null byte, so the contents of the buffer are safe to read as a string after this call.
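The write-then-close pattern described above can be sketched with standard C. `DemoBuffer`, `demo_buffer_write()`, and `demo_buffer_close()` are hypothetical stand-ins that use realloc() directly instead of an allocator, so this shows the general technique and not serd's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dynamic buffer (no custom allocator) */
typedef struct {
  void*  buf; /* Buffer, or NULL if nothing has been written */
  size_t len; /* Number of bytes written so far */
} DemoBuffer;

/* fwrite-like function: append size * nmemb bytes, growing the buffer.
   Returns the number of elements written, or 0 on allocation failure. */
static size_t
demo_buffer_write(const void* buf, size_t size, size_t nmemb, void* stream)
{
  assert(buf && stream);

  DemoBuffer* const buffer  = (DemoBuffer*)stream;
  const size_t      n_bytes = size * nmemb;
  if (!n_bytes) {
    return 0;
  }

  void* const new_buf = realloc(buffer->buf, buffer->len + n_bytes);
  if (!new_buf) {
    return 0; /* The old buffer is left intact */
  }

  memcpy((char*)new_buf + buffer->len, buf, n_bytes);
  buffer->buf = new_buf;
  buffer->len += n_bytes;
  return nmemb;
}

/* fclose-like function: append a terminating null byte.
   In this sketch, the terminator is counted in len afterwards. */
static int
demo_buffer_close(void* stream)
{
  return demo_buffer_write("", 1, 1, stream) == 1 ? 0 : 1;
}
```

Until the close function runs, the bytes in the buffer are not a valid C string, which is the reason for the null-termination caveat above.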
5.12.4 Output Streams¶
An output stream is used for writing output as a raw stream of bytes.
It is compatible with standard C FILE streams, but allows different functions to be provided for things like writing to a buffer or a socket.
-
struct SerdOutputStream¶
An output stream that receives bytes.
-
void *stream¶
Opaque parameter for functions.
-
SerdWriteFunc write¶
Write bytes to output.
-
SerdErrorFunc error¶
Stream error accessor.
-
SerdCloseFunc close¶
Close output.
-
SerdOutputStream serd_open_output_stream(SerdWriteFunc write_func, SerdErrorFunc error_func, SerdCloseFunc close_func, void *stream)¶
Open a stream that writes to a provided function.
- Parameters
write_func – Function to write output.
error_func – Function used to detect errors.
close_func – Function to close the stream after writing is done.
stream – Opaque stream parameter for write_func and close_func.
- Returns
An opened output stream, or all zeros on error.
-
SerdOutputStream serd_open_output_buffer(SerdDynamicBuffer *buffer)¶
Open a stream that writes to a buffer.
The buffer is owned by the caller, but will be reallocated using the buffer’s allocator as necessary. Note that the string in the buffer will not be null terminated until the stream is closed.
- Parameters
buffer – Buffer to write output to.
- Returns
An opened output stream, or all zeros on error.
-
SerdOutputStream serd_open_output_file(const char *path)¶
Open a stream that writes to a file.
An arbitrary FILE* can be used with serd_open_output_stream() as well; this convenience function opens the file properly for writing with serd, and sets flags for optimized I/O if possible.
- Parameters
path – Path of file to open and write to.
-
SerdStatus serd_close_output(SerdOutputStream *output)¶
Close an output stream.
This will call the close function, and reset the stream internally so that no further writes can be made. For convenience, this is safe to call on NULL, and safe to call several times on the same output. Failure is returned in both of those cases.
5.12.5 Writer¶
-
enum SerdWriterFlag¶
Writer style options.
These flags allow more precise control of writer output style. Note that some options are only supported for some syntaxes, for example, NTriples does not support abbreviation and is always ASCII.
-
enumerator SERD_WRITE_ASCII¶
Escape all non-ASCII characters.
Although all the supported syntaxes are UTF-8 by definition, this can be used to escape all non-ASCII characters so that data will survive transmission through ASCII-only channels.
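As an illustration of what escaping non-ASCII characters involves, the sketch below decodes UTF-8 and writes each non-ASCII character as a `\uXXXX` or `\UXXXXXXXX` escape, the numeric escape forms used by Turtle and NTriples. The `demo_*` functions are hypothetical, the input is assumed to be valid UTF-8, and the writer's actual output (for example its hex digit case) may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode one UTF-8 character at `s` (assumed valid), storing its byte length */
static uint32_t
demo_decode_utf8(const uint8_t* s, size_t* size)
{
  if (s[0] < 0x80u) {
    *size = 1;
    return s[0];
  }

  if ((s[0] & 0xE0u) == 0xC0u) {
    *size = 2;
    return ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
  }

  if ((s[0] & 0xF0u) == 0xE0u) {
    *size = 3;
    return ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
  }

  *size = 4;
  return ((s[0] & 0x07u) << 18) | ((s[1] & 0x3Fu) << 12) |
         ((s[2] & 0x3Fu) << 6) | (s[3] & 0x3Fu);
}

/* Copy `in` to `out`, escaping every non-ASCII character.
   Returns 0 on success, or -1 if `out` is too small. */
static int
demo_escape_ascii(const char* in, char* out, size_t out_size)
{
  assert(in && out);
  if (!out_size) {
    return -1;
  }

  const uint8_t* s = (const uint8_t*)in;
  size_t         o = 0;
  while (*s) {
    size_t         size = 0;
    const uint32_t c    = demo_decode_utf8(s, &size);

    char      esc[11]; /* Large enough for "\UXXXXXXXX" and a terminator */
    const int n = (c < 0x80u)      ? snprintf(esc, sizeof(esc), "%c", (int)c)
                  : (c <= 0xFFFFu) ? snprintf(esc, sizeof(esc), "\\u%04X", (unsigned)c)
                                   : snprintf(esc, sizeof(esc), "\\U%08X", (unsigned)c);

    if (n < 0 || o + (size_t)n >= out_size) {
      return -1;
    }

    memcpy(out + o, esc, (size_t)n);
    o += (size_t)n;
    s += size;
  }

  out[o] = '\0';
  return 0;
}
```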
-
enumerator SERD_WRITE_EXPANDED¶
Write expanded URIs instead of prefixed names.
This will avoid shortening URIs into CURIEs entirely, even if the output syntax supports prefixed names. This can be useful for making chunks of syntax context-free.
-
enumerator SERD_WRITE_VERBATIM¶
Write URI references exactly as they are received.
Normally, the writer resolves URIs against the base URI, so it can potentially write them as relative URI references. This flag disables that, so URI nodes are written exactly as they are received.
When fed by a reader with SerdReaderFlag.SERD_READ_RELATIVE enabled, this will write URI references exactly as they are in the input.
-
enumerator SERD_WRITE_TERSE¶
Write terser output without newlines.
For Turtle and TriG, this enables a terser form of output which only has newlines at the top level. This can result in very long lines, but is more compact and useful for making these abbreviated syntaxes line-based.
-
enumerator SERD_WRITE_LAX¶
Tolerate lossy output.
This will tolerate input that can not be written without loss, in particular invalid UTF-8 text. Note that this flag should be used carefully, since it can result in data loss.
-
enumerator SERD_WRITE_RDF_TYPE¶
Write rdf:type as a normal predicate.
This disables the special “a” syntax in Turtle and TriG.
-
enumerator SERD_WRITE_CONTEXTUAL¶
Suppress writing directives that describe the context.
This writes data as usual, but suppresses writing prefix directives in Turtle and TriG. The resulting output is a fragment of a document with implicit context, so it will only be readable in a suitable environment.
-
typedef struct SerdWriterImpl SerdWriter¶
Streaming writer that writes a text stream as it receives events.
-
typedef uint32_t SerdWriterFlags¶
Bitwise OR of SerdWriterFlag values.
-
SerdWriter *serd_writer_new(SerdWorld *world, SerdSyntax syntax, SerdWriterFlags flags, const SerdEnv *env, SerdOutputStream *output, size_t block_size)¶
Create a new RDF writer.
-
void serd_writer_free(SerdWriter *writer)¶
Free writer.
-
const SerdSink *serd_writer_sink(SerdWriter *writer)¶
Return a sink interface that emits statements via writer.
-
SerdStatus serd_writer_set_base_uri(SerdWriter *writer, const SerdNode *uri)¶
Set the current output base URI, and emit a directive if applicable.
Note that this function can be safely cast to SerdBaseSink.
-
SerdStatus serd_writer_set_root_uri(SerdWriter *writer, SerdStringView uri)¶
Set the current root URI.
The root URI should be a prefix of the base URI. The path of the root URI is the highest path any relative up-reference can refer to. For example, with root file:///foo/root and base file:///foo/root/base, file:///foo/root will be written as <../>, but file:///foo will be written non-relatively as file:///foo. If the root is not explicitly set, it defaults to the base URI, so no up-references will be created at all.
-
SerdStatus serd_writer_finish(SerdWriter *writer)¶
Finish a write.
This flushes any pending output, for example terminating punctuation, so that the output is a complete document.
5.13 Storage¶
5.13.1 Cursor¶
-
typedef struct SerdCursorImpl SerdCursor¶
A cursor that iterates over statements in a model.
A cursor is a smart iterator that visits all statements that match a pattern.
-
SerdCursor *serd_cursor_copy(SerdAllocator *allocator, const SerdCursor *cursor)¶
Return a new copy of cursor
-
const SerdStatement *serd_cursor_get(const SerdCursor *cursor)¶
Return the statement pointed to by cursor
-
SerdStatus serd_cursor_advance(SerdCursor *cursor)¶
Increment cursor to point to the next statement.
- Returns
Failure if cursor was already at the end.
-
bool serd_cursor_is_end(const SerdCursor *cursor)¶
Return true if the cursor has reached its end.
-
bool serd_cursor_equals(const SerdCursor *lhs, const SerdCursor *rhs)¶
Return true iff lhs equals rhs.
Two cursors are equivalent if they point to the same statement in the same index in the same model, or are both the end of the same model. Note that two cursors can point to the same statement but not be equivalent, since they may have reached the statement via different indices.
-
void serd_cursor_free(SerdCursor *cursor)¶
Free cursor
5.13.2 Range¶
-
enum SerdDescribeFlag¶
Flags that control the style of a model serialisation.
-
enumerator SERD_NO_TYPE_FIRST¶
Disable writing rdf:type (“a”) first.
-
typedef uint32_t SerdDescribeFlags¶
Bitwise OR of SerdDescribeFlag values.
-
SerdStatus serd_describe_range(const SerdCursor *range, const SerdSink *sink, SerdDescribeFlags flags)¶
Describe a range of statements by writing to a sink.
This will consume the given cursor, and emit at least every statement it visits. More statements from the model may be written in order to describe anonymous blank nodes that are associated with a subject in the range.
The default is to write statements in an order suited for pretty-printing with Turtle or TriG with as many anonymous nodes as possible. If SERD_NO_INLINE_OBJECTS is given, a simple sorted stream is written instead, which is faster since no searching is required, but can result in ugly output for Turtle or TriG.
5.13.3 Model¶
-
enum SerdStatementOrder¶
Statement ordering.
Statements themselves always have the same fields in the same order (subject, predicate, object, graph), but a model can keep indices for different orderings to provide good performance for different kinds of queries.
-
enumerator SERD_ORDER_SPO¶
Subject, Predicate, Object.
-
enumerator SERD_ORDER_SOP¶
Subject, Object, Predicate.
-
enumerator SERD_ORDER_OPS¶
Object, Predicate, Subject.
-
enumerator SERD_ORDER_OSP¶
Object, Subject, Predicate.
-
enumerator SERD_ORDER_PSO¶
Predicate, Subject, Object.
-
enumerator SERD_ORDER_POS¶
Predicate, Object, Subject.
-
enumerator SERD_ORDER_GSPO¶
Graph, Subject, Predicate, Object.
-
enumerator SERD_ORDER_GSOP¶
Graph, Subject, Object, Predicate.
-
enumerator SERD_ORDER_GOPS¶
Graph, Object, Predicate, Subject.
-
enumerator SERD_ORDER_GOSP¶
Graph, Object, Subject, Predicate.
-
enumerator SERD_ORDER_GPSO¶
Graph, Predicate, Subject, Object.
-
enumerator SERD_ORDER_GPOS¶
Graph, Predicate, Object, Subject.
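To illustrate what an ordering means in practice, the sketch below compares two statements field by field in a given order. The types and names here (DemoStatement, demo_compare, and so on) are hypothetical and are not part of Serd; the point is only that the same pair of statements can sort differently under different orderings, which is why separate indices help different kinds of queries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Field positions within a statement (always stored in this order) */
typedef enum { DEMO_S, DEMO_P, DEMO_O, DEMO_G } DemoField;

/* Hypothetical statement with string fields, for illustration only */
typedef struct {
  const char* fields[4];
} DemoStatement;

/* Two example orderings, listing fields from most to least significant */
static const DemoField demo_order_spo[3] = {DEMO_S, DEMO_P, DEMO_O};
static const DemoField demo_order_pso[3] = {DEMO_P, DEMO_S, DEMO_O};

/* Compare two statements lexicographically in the given field order */
static int
demo_compare(const DemoStatement* a,
             const DemoStatement* b,
             const DemoField*     order,
             size_t               n_fields)
{
  assert(a && b && order);

  for (size_t i = 0; i < n_fields; ++i) {
    const int cmp = strcmp(a->fields[order[i]], b->fields[order[i]]);
    if (cmp != 0) {
      return cmp;
    }
  }

  return 0;
}
```

An index sorted with demo_order_pso groups statements by predicate first, so a query with a known predicate but unknown subject can scan one contiguous run rather than the whole set.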
-
enum SerdModelFlag¶
Flags that control model storage and indexing.
-
enumerator SERD_STORE_GRAPHS¶
Store and index the graph of statements.
-
enumerator SERD_STORE_CARETS¶
Store original caret of statements.
-
typedef struct SerdModelImpl SerdModel¶
An indexed set of statements.
-
typedef uint32_t SerdModelFlags¶
Bitwise OR of SerdModelFlag values.
-
SerdModel *serd_model_new(SerdWorld *world, SerdStatementOrder default_order, SerdModelFlags flags)¶
Create a new model.
- Parameters
world – The world in which to make this model.
default_order – The order for the default index, which is always present and responsible for owning all the statements in the model. This should almost always be SerdStatementOrder.SERD_ORDER_SPO or SerdStatementOrder.SERD_ORDER_GSPO (which support writing pretty documents), but advanced applications that do not want either of these indices can use a different order. Additional indices can be added with serd_model_add_index().
flags – Options that control what data is stored in the model.
-
SerdModel *serd_model_copy(SerdAllocator *allocator, const SerdModel *model)¶
Return a deep copy of model
-
bool serd_model_equals(const SerdModel *a, const SerdModel *b)¶
Return true iff a is equal to b, ignoring statement cursor metadata.
-
SerdStatus serd_model_add_index(SerdModel *model, SerdStatementOrder order)¶
Add an index for a particular statement order to the model.
- Returns
Failure if this index already exists.
-
SerdStatus serd_model_drop_index(SerdModel *model, SerdStatementOrder order)¶
Remove an index for a particular statement order from the model.
- Returns
Failure if this index does not exist.
-
SerdStatementOrder serd_model_default_order(const SerdModel *model)¶
Get the default statement order of model
-
SerdModelFlags serd_model_flags(const SerdModel *model)¶
Get the flags enabled on model
-
bool serd_model_empty(const SerdModel *model)¶
Return true iff there are no statements stored in model
-
SerdCursor *serd_model_begin(const SerdModel *model)¶
Return a cursor at the first statement in the model.
The returned cursor will advance over every statement in the model’s default order.
-
const SerdCursor *serd_model_end(const SerdModel *model)¶
Return a cursor past the end of the model.
This returns the “universal” end cursor, which is equivalent to any cursor for this model that has reached its end.
-
SerdCursor *serd_model_begin_ordered(const SerdModel *model, SerdStatementOrder order)¶
Return a cursor over all statements in the model in a specific order.
-
SerdCursor *serd_model_find(const SerdModel *model, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g)¶
Search for statements that match a pattern.
- Returns
A cursor at the first match, or NULL if no matches are found.
-
const SerdNode *serd_model_get(const SerdModel *model, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g)¶
Search for a single node that matches a pattern.
Exactly one of s, p, o must be NULL. This function is mainly useful for predicates that only have one value.
- Returns
The first matching node, or NULL if no matches are found.
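The pattern semantics used by these query functions can be sketched with a small stand-alone example. It assumes that a NULL field in a pattern acts as a wildcard that matches any node, and it uses hypothetical names (DemoTriple, demo_find, demo_get) and plain strings instead of Serd nodes and models, so it illustrates the idea rather than the library's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical triple with string fields, for illustration only */
typedef struct {
  const char* s;
  const char* p;
  const char* o;
} DemoTriple;

/* Return true iff `field` matches `pattern`, where NULL matches anything */
static bool
demo_field_matches(const char* pattern, const char* field)
{
  return !pattern || strcmp(pattern, field) == 0;
}

/* Return the first triple matching the pattern, or NULL if none match */
static const DemoTriple*
demo_find(const DemoTriple* triples,
          size_t            n,
          const char*       s,
          const char*       p,
          const char*       o)
{
  assert(triples || n == 0);

  for (size_t i = 0; i < n; ++i) {
    if (demo_field_matches(s, triples[i].s) &&
        demo_field_matches(p, triples[i].p) &&
        demo_field_matches(o, triples[i].o)) {
      return &triples[i];
    }
  }

  return NULL;
}

/* Return the node in the single wildcard position of the first match */
static const char*
demo_get(const DemoTriple* triples,
         size_t            n,
         const char*       s,
         const char*       p,
         const char*       o)
{
  assert((!s + !p + !o) == 1); /* Exactly one field must be a wildcard */

  const DemoTriple* const match = demo_find(triples, n, s, p, o);
  if (!match) {
    return NULL;
  }

  return !s ? match->s : !p ? match->p : match->o;
}
```

Because exactly one field is left open, the result is unambiguous: the returned node is whichever field of the first match sits in the wildcard position.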
-
const SerdStatement *serd_model_get_statement(const SerdModel *model, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g)¶
Search for a single statement that matches a pattern.
This function is mainly useful for predicates that only have one value.
- Returns
The first matching statement, or NULL if none are found.
-
bool serd_model_ask(const SerdModel *model, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g)¶
Return true iff a statement exists.
-
size_t serd_model_count(const SerdModel *model, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g)¶
Return the number of matching statements.
-
SerdStatus serd_model_add(SerdModel *model, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g)¶
Add a statement to a model from nodes.
This function fails if there are any active iterators on model.
-
SerdStatus serd_model_add_with_caret(SerdModel *model, const SerdNode *s, const SerdNode *p, const SerdNode *o, const SerdNode *g, const SerdCaret *caret)¶
Add a statement to a model from nodes with a caret.
This function fails if there are any active iterators on model.
-
SerdStatus serd_model_insert(SerdModel *model, const SerdStatement *statement)¶
Add a statement to a model.
This function fails if there are any active iterators on model. If statement is null, then SERD_FAILURE is returned.
-
SerdStatus serd_model_insert_statements(SerdModel *model, SerdCursor *range)¶
Add a range of statements to a model.
This function fails if there are any active iterators on model.
-
SerdStatus serd_model_erase(SerdModel *model, SerdCursor *cursor)¶
Remove a statement from a model via an iterator.
Calling this function invalidates all other iterators on this model.
- Parameters
model – The model which cursor points to.
cursor – Cursor pointing to the element to erase. This cursor is advanced to the next statement on return.
-
SerdStatus serd_model_erase_statements(SerdModel *model, SerdCursor *range)¶
Remove a range of statements from a model.
This can be used with serd_model_find() to erase all statements in a model that match a pattern.
Calling this function invalidates all iterators on model.
- Parameters
model – The model which range points to.
range – Range to erase, which will be empty on return.
-
SerdStatus serd_model_clear(SerdModel *model)¶
Remove everything from a model.
Calling this function invalidates all iterators on model.
- Parameters
model – The model to clear.
5.13.4 Inserter¶
-
SerdSink *serd_inserter_new(SerdModel *model, const SerdNode *default_graph)¶
Create an inserter for writing statements to a model.
Once created, an inserter is just a sink with no additional interface.
- Parameters
model – The model to insert received statements into.
default_graph – Optional default graph, which will be set on received statements that have no graph. This allows, for example, loading a Turtle document into an isolated graph in the model.
- Returns
A newly allocated sink which must be freed with serd_sink_free().
5.13.5 Validator¶
-
enum SerdValidatorCheck¶
A check that a validator can perform against a model.
-
enumerator SERD_CHECK_NOTHING¶
Checks nothing and always succeeds (for use as a sentinel).
-
enumerator SERD_CHECK_ALL_VALUES_FROM¶
Checks that all properties with owl:allValuesFrom restrictions have valid value types.
-
enumerator SERD_CHECK_ANY_URI¶
Checks that the value of any property with range xsd:anyURI is a URI.
-
enumerator SERD_CHECK_CARDINALITY_EQUAL¶
Checks that any instance of a class with an owl:cardinality property restriction has exactly that many values of that property.
-
enumerator SERD_CHECK_CARDINALITY_MAX¶
Checks that any instance of a class with an owl:maxCardinality property restriction has no more than that many values of that property.
-
enumerator SERD_CHECK_CARDINALITY_MIN¶
Checks that any instance of a class with an owl:minCardinality property restriction has at least that many values of that property.
-
enumerator SERD_CHECK_CLASS_CYCLE¶
Checks that no class is a sub-class of itself, recursively.
This ensures that the graph is acyclic with respect to rdfs:subClassOf.
-
enumerator SERD_CHECK_CLASS_LABEL¶
Checks that every rdfs:Class has an rdfs:label.
-
enumerator SERD_CHECK_DATATYPE_CYCLE¶
Checks that no datatype is a refinement of itself, recursively.
This ensures that the graph is acyclic with respect to owl:onDatatype.
-
enumerator SERD_CHECK_DATATYPE_PROPERTY¶
Checks that datatype properties have literal (not instance) values.
-
enumerator SERD_CHECK_DATATYPE_TYPE¶
Checks that every datatype is defined as an rdfs:Datatype.
-
enumerator SERD_CHECK_DEPRECATED_CLASS¶
Checks that there are no instances of deprecated classes.
-
enumerator SERD_CHECK_DEPRECATED_PROPERTY¶
Checks that there are no uses of deprecated properties.
-
enumerator SERD_CHECK_FUNCTIONAL_PROPERTY¶
Checks that no instance has several values of a functional property.
-
enumerator SERD_CHECK_INSTANCE_LITERAL¶
Checks that there are no instances where a literal is expected.
-
enumerator SERD_CHECK_INSTANCE_TYPE¶
Checks that every instance with an explicit type matches that type.
This is a broad check that triggers other type-related checks, but mainly it will check that every instance of a class conforms to any restrictions on that class.
-
enumerator SERD_CHECK_INVERSE_FUNCTIONAL_PROPERTY¶
Checks that at most one instance has a given value of an inverse functional property.
-
enumerator SERD_CHECK_LITERAL_INSTANCE¶
Checks that there are no literals where an instance is expected.
-
enumerator SERD_CHECK_LITERAL_MAX_EXCLUSIVE¶
Checks that literal values are not greater than or equal to any applicable xsd:maxExclusive datatype restrictions.
-
enumerator SERD_CHECK_LITERAL_MAX_INCLUSIVE¶
Checks that literal values are not greater than any applicable xsd:maxInclusive datatype restrictions.
-
enumerator SERD_CHECK_LITERAL_MIN_EXCLUSIVE¶
Checks that literal values are not less than or equal to any applicable xsd:minExclusive datatype restrictions.
-
enumerator SERD_CHECK_LITERAL_MIN_INCLUSIVE¶
Checks that literal values are not less than any applicable xsd:minInclusive datatype restrictions.
-
enumerator SERD_CHECK_LITERAL_PATTERN¶
Checks that literals with xsd:pattern restrictions match the regular expression pattern for their datatype.
-
enumerator SERD_CHECK_LITERAL_RESTRICTION¶
Checks that literals with supported restrictions conform to those restrictions.
This is a high-level check that triggers the more specific individual literal restriction checks.
-
enumerator SERD_CHECK_LITERAL_VALUE¶
Checks that literals with supported XSD datatypes are valid.
The set of supported types is the same as when writing canonical forms.
-
enumerator SERD_CHECK_OBJECT_PROPERTY¶
Checks that object properties have instance (not literal) values.
-
enumerator SERD_CHECK_PLAIN_LITERAL_DATATYPE¶
Checks that there are no typed literals where a plain literal is expected.
A plain literal may have an optional language tag, but not a datatype.
-
enumerator SERD_CHECK_PREDICATE_TYPE¶
Checks that every predicate is defined as an rdf:Property.
-
enumerator SERD_CHECK_PROPERTY_CYCLE¶
Checks that no property is a sub-property of itself, recursively.
This ensures that the graph is acyclic with respect to rdfs:subPropertyOf.
-
enumerator SERD_CHECK_PROPERTY_DOMAIN¶
Checks that any instance with a property with an rdfs:domain is in that domain.
-
enumerator SERD_CHECK_PROPERTY_LABEL¶
Checks that every rdf:Property has an rdfs:label.
-
enumerator SERD_CHECK_PROPERTY_RANGE¶
Checks that the value for any property with an rdfs:range is in that range.
-
enumerator SERD_CHECK_SOME_VALUES_FROM¶
Checks that instances of classes with owl:someValuesFrom property restrictions have at least one matching property value.
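The difference between the inclusive and exclusive bounds checked by the SERD_CHECK_LITERAL_MIN_* and SERD_CHECK_LITERAL_MAX_* checks can be sketched as follows. This is a minimal illustration using plain doubles and hypothetical names (DemoBound, demo_in_range); it makes no claim about how Serd parses or compares literal values internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical numeric bound, for illustration only */
typedef struct {
  double value;     /* The limit itself */
  bool   exclusive; /* True if the limit is not an allowed value */
} DemoBound;

/* Return true iff `value` satisfies the optional `min` and `max` bounds */
static bool
demo_in_range(const double value, const DemoBound* min, const DemoBound* max)
{
  assert(!min || !max || min->value <= max->value);

  /* An exclusive minimum rejects the limit itself, an inclusive one allows it */
  if (min && (min->exclusive ? value <= min->value : value < min->value)) {
    return false;
  }

  /* Likewise for the maximum */
  if (max && (max->exclusive ? value >= max->value : value > max->value)) {
    return false;
  }

  return true;
}
```

With an inclusive minimum of 0 and an exclusive maximum of 10, the value 0 is accepted but 10 is rejected.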
-
typedef struct SerdValidatorImpl SerdValidator¶
Model validator.
-
SerdValidator *serd_validator_new(SerdWorld *world)¶
Create a new validator.
- Returns
A newly-allocated validator with no checks enabled which must be freed with serd_validator_free().
-
void serd_validator_free(SerdValidator *validator)¶
Free validator
-
SerdStatus serd_validator_enable_check(SerdValidator *validator, SerdValidatorCheck check)¶
Enable a validator check.
-
SerdStatus serd_validator_disable_check(SerdValidator *validator, SerdValidatorCheck check)¶
Disable a validator check.
-
SerdStatus serd_validator_enable_checks(SerdValidator *validator, const char *regex)¶
Enable all validator checks with names that match the given pattern.
-
SerdStatus serd_validator_disable_checks(SerdValidator *validator, const char *regex)¶
Disable all validator checks with names that match the given pattern.
-
SerdStatus serd_validate_model(SerdValidator *const validator, const SerdModel *model, const SerdNode *graph)¶
Validate a model.
This performs validation based on the XSD, RDF, RDFS, and OWL vocabularies. All necessary data, including those vocabularies and any property/class definitions that use them, are assumed to be in the model.
Validation errors are reported to the world’s error sink.
- Parameters
validator – Validator configured to run the desired checks.
model – The model to validate.
graph – Optional graph to check. If this is given, then top-level checks will be initiated only for statements in the given graph. The entire model is still searched while running a check so that, for example, schemas that define classes and properties can be stored in separate graphs.
- Returns
SerdStatus.SERD_SUCCESS if no errors are found, or SerdStatus.SERD_BAD_DATA if validation checks failed.