Posts

Serd 0.18.2

Serd 0.18.2 has been released. Serd is a lightweight C library for RDF syntax which supports reading and writing Turtle, TriG, NTriples, and NQuads. Serd is suitable for performance-critical or resource-limited applications, such as serialising very large data sets or running on embedded systems.

Changes:

  • Disable timestamps in HTML documentation for reproducible builds
  • Fix bug that caused the "a" abbreviation to be written in non-predicate position
  • Fix clashing symbol "error" in amalgamation build
  • Fix crash when resolving against non-standard base URIs
  • Fix crash when serd_node_new_decimal is called with infinity or NaN
  • Update to waf 1.7.8 and autowaf r90 (install docs to versioned directory)

Sord 0.10.4

Sord 0.10.4 has been released. Sord is a lightweight C library for storing RDF statements in memory. For more information, see http://drobilla.net/software/sord.

Changes:

  • Disable timestamps in HTML documentation for reproducible builds
  • Fix memory leaks in sord_validate
  • Improve datatype validation in sord_validate to conform to the XSD and OWL specifications
  • Install sord_validate man page

How to define a datatype in RDF

I had to do some digging around to figure out how to define a new datatype with restrictions in RDF, so I thought it might make a useful post to save someone else the trouble in the future.

RDF datatypes are based on XSD datatypes, which are often used directly. Unfortunately, most implementations simply have the XSD types baked in and do not support or validate new datatype descriptions (though at least sord_validate can). Regardless, it is sometimes necessary to define a datatype with a specific restriction so it can be machine-validated. It's a bit tricky to figure out how to do this, since everything is buried in specifications that aren't as triple-oriented as they should be. So, here is an example of defining a datatype restricted by a regular expression in Turtle, derived from the OWL documentation:

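@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/CSymbol>
    a rdfs:Datatype ;
    rdfs:comment "A symbol in the C programming language" ;
    owl:onDatatype xsd:string ;
    owl:withRestrictions (
        [
            xsd:pattern "[_a-zA-Z][_a-zA-Z0-9]*"
        ]
    ) .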

The XSD specification defines several “constraining facets” you can use in this way. The most obvious and useful for RDF are xsd:length, xsd:minLength, xsd:maxLength, xsd:pattern, xsd:maxInclusive, xsd:maxExclusive, xsd:minInclusive, and xsd:minExclusive; see the specification for the rest. For example, you can define a numeric type with a restricted range like so:

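@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/AnswerishInteger>
    a rdfs:Datatype ;
    rdfs:comment "An integer between 24 and 42 inclusive" ;
    owl:onDatatype xsd:integer ;
    owl:withRestrictions (
        [
            xsd:minInclusive 24
        ] [
            xsd:maxInclusive 42
        ]
    ) .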

Defining datatypes in this way and using them as the rdfs:range for properties is a good idea because it describes which values are valid in a machine-readable way. This makes it possible for simple generic tools to validate data, ensuring that all literals are valid values for the property they describe.
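
To make the rdfs:range idea concrete, here is a minimal sketch of a property that uses the AnswerishInteger datatype, with one literal inside the range and one outside it. The eg:answer property and the eg:goodGuess and eg:badGuess subjects are hypothetical names made up for this illustration, and whether a given validator actually reports the second statement depends on the tool and how it is run.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix eg:   <http://example.org/> .

# Hypothetical property whose values must be AnswerishInteger literals
eg:answer
    a owl:DatatypeProperty ;
    rdfs:label "answer" ;
    rdfs:range eg:AnswerishInteger .

# Within 24..42, so this literal satisfies the datatype restrictions
eg:goodGuess
    eg:answer "42"^^eg:AnswerishInteger .

# Below xsd:minInclusive, so this literal violates the restrictions
eg:badGuess
    eg:answer "7"^^eg:AnswerishInteger .
```

The literals are typed explicitly here so that the intended datatype is unambiguous, rather than relying on a tool to infer it from the property's range.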

Sratom 0.4.0

Sratom 0.4.0 has been released. Sratom is a small library for serialising LV2 atoms to and from RDF, either to convert between binary and text or to store atoms in a model. For more information, see http://drobilla.net/software/sratom.

Changes:

  • Correctly read objects with several rdf:type properties
  • Fix various hyper-strict warnings
  • Support writing Object Atoms as top level descriptions if subject and predicate are not given
  • Upgrade to waf 1.7.2

Sord 0.10.0

Sord 0.10.0 has been released. Sord is a lightweight C library for storing RDF statements in memory. For more information, see http://drobilla.net/software/sord.

Changes:

  • Add error callback to world for custom error reporting
  • Add option to build utilities as static binaries
  • Do not require a C++ compiler to build
  • Fix various hyper-strict warnings
  • Make all 'zix' symbols private to avoid symbol clashes in static builds
  • Performance and space (per node) improvements
  • Remove problematic "Loaded n statements" output from serdi
  • SSE4.2 accelerated hashing for node interning, where available
  • Strip down API documentation to a single clean page
  • Upgrade to waf 1.7.2
  • sordmm.hpp: Add indices and graphs parameters to Model constructor
  • sordmm.hpp: Correctly handle Sord::Node self-assignment
  • sordmm.hpp: Remove overzealous URI scheme assertion

Serd 0.18.0

Serd 0.18.0 has been released. Serd is a lightweight C library for RDF syntax which supports reading and writing Turtle, TriG, NTriples, and NQuads. Serd is suitable for performance-critical or resource-limited applications, such as serialising very large data sets or running on embedded systems.

Changes:

  • Add -e option to serdi to use incremental reading
  • Add -q option to serdi to suppress all non-data output, e.g. errors
  • Add error callback to reader and writer for custom error reporting
  • Add incremental read interface suitable for reading from infinite streams
  • Add option to build utilities as static binaries
  • Do not require a C++ compiler to build
  • Fix various hyper-strict warnings
  • Report write size correctly when invalid UTF-8 is encountered and a replacement character is written
  • Reset indent when finishing a write
  • Strip down API documentation to a single clean page
  • Support digits at start of local names as per new Turtle grammar
  • Upgrade to waf 1.7.2

Sratom 0.2.0

Sratom 0.2.0 has been released. Sratom is a small library for serialising LV2 atoms to and from RDF, either to convert between binary and text or to store atoms in a model. For more information, see http://drobilla.net/software/sratom.

Sord 0.8.0

Sord 0.8.0 has been released. Sord is a lightweight C library for storing RDF statements in memory. For more information, see http://drobilla.net/software/sord.

Changes:

  • Add SordInserter for writing to a model via Serd sink functions
  • Add convenient sord_search(), sord_ask(), and sord_count()
  • Add sord_iter_get_node()
  • Add sord_new_relative_uri()
  • Add sord_validate tool for validating data against RDF/OWL schemas
  • Fix comparison of typed literals
  • Install man page to DATADIR (e.g. PREFIX/share/man, not PREFIX/man)
  • Refuse to intern relative URIs in sord_new_uri*()
  • Support compilation as C++ under MSVC++
  • Take advantage of interning in sord_node_equals()
  • Tolerate serd passing NULL nodes to reader callback (serd 0.6.0)
  • Use path variables in pkgconfig files

Serd 0.14.0

Serd 0.14.0 has been released. Serd is a lightweight C library for RDF syntax which supports reading and writing Turtle, TriG, NTriples, and NQuads. Serd is suitable for performance-critical or resource-limited applications, such as serialising very large data sets or running on embedded systems.

Changes:

  • Add SerdBulkSink for writing bulk output and corresponding serdi -B option
  • Add serd_chunk_sink for easy writing to a string
  • Add serd_file_sink for easy writing to a FILE* stream
  • Add serd_node_new_blob and serd_base64_decode for handling arbitrary binary data via base64 encoding
  • Add serd_node_new_file_uri() and serd_file_uri_parse() and implement proper URI to/from path hex escaping, etc.
  • Add serd_reader_set_default_graph() for reading a file as a named graph
  • Add serd_strtod(), serd_node_new_decimal(), and serd_node_new_integer() for locale-independent numeric node parsing/serialising
  • Add serd_uri_serialise_relative() for making URIs relative to a base where possible (by chopping a common prefix and adding dot segments)
  • Add serd_writer_get_env()
  • Add serd_writer_set_root_uri() and corresponding -r option to serdi to enable writing URIs with up references (../)
  • Add serdi -f option to prevent URI qualification
  • Escape ASCII control characters in output (e.g. fix problems with string literals that start with a backspace)
  • Handle a quote as the last character of a long string literal in the writer (by escaping it) rather than the reader, to avoid writing Turtle other tools fail to parse
  • Handle files and strings that start with a UTF-8 Byte Order Mark
  • Implement pretty-printing for collections
  • Improve URI resolution to cover most of the abnormal cases from RFC3986
  • Improve write performance by doing bulk writes for unescaped substrings
  • Install man page to DATADIR (e.g. PREFIX/share/man, not PREFIX/man)
  • Make URIs serialised by the writer properly escape characters
  • Parse collections iteratively in O(1) space
  • Remove use of multi-byte peek (readahead) and use exactly 1 page for read buffer (instead of 2)
  • Report read error if both "genid" and "docid" IDs are found in the same document, to prevent silent merging of distinct blank nodes
  • Report reason for failure to open file in serdi
  • Resolve dot segments in serd_uri_resolve() instead of at write time
  • Support Windows file://c:/foo URIs in serd_uri_to_path() on all platforms
  • Support compilation as C++ under MSVC++
  • Support file://localhost/foo URIs in serd_uri_to_path()
  • Tolerate invalid characters in string literals by replacing with the Unicode replacement character
  • Use path variables in pkgconfig files

Serd 0.5.0

Serd 0.5.0 has been released. Serd is a lightweight C library for RDF syntax which supports reading and writing Turtle, TriG, NTriples, and NQuads. Serd is suitable for performance-critical or resource-limited applications, such as serialising very large data sets or running on embedded systems.

Changes:

  • Add ability to build static library
  • Add serd_env_set_prefix_from_strings for convenience
  • Add serd_strerror
  • Avoid writing illegal Turtle names as a result of URI qualifying
  • Fix erroneously equal SERD_ERR_BAD_SYNTAX and SERD_ERR_BAD_ARG
  • Fix pretty printing of successive blank descriptions, i.e. "] , ["
  • Gracefully handle NULL reader sinks
