Serd is not intended to be a Swiss Army knife of RDF syntax; rather, it is suited to resource-limited or performance-critical applications (e.g. converting many gigabytes of NTriples to Turtle), or situations where a simple reader/writer with minimal dependencies is ideal (e.g. in LV2 implementations or embedded applications).
Small: Serd is implemented in around 3000 lines [1] of standard C code. On a typical machine it compiles to about 90 KiB, but can be as small as 29 KiB when optimized for size. For comparison, on the same system raptor is 417 KiB and libxml2 is 2.1 MiB (not including dependencies), making Serd roughly 5 and 25 times smaller, respectively.
Portable and Dependency Free: Serd uses only the C standard library, and has no external dependencies. It is known to compile with GCC, LLVM/Clang, and MSVC (as C++), and is tested on GNU/Linux, OpenBSD, Mac OS X, and Windows.
Fast and Lightweight: Serd (and the included serdi tool) can be used to stream abbreviated Turtle, unlike many tools which must build an internal model to abbreviate. In other words, Serd can serialise an unbounded amount of abbreviated Turtle using a fixed amount of memory, and it does so very quickly: to the author’s knowledge, Serd is the fastest Turtle reader/writer by a wide margin (see Performance below).
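To illustrate why abbreviation does not require an in-memory model, here is a minimal sketch in C. It is not Serd's API or implementation: the names AbbrevWriter, abbrev_init, abbrev_write, abbrev_finish and the MAX_TERM limit are hypothetical, and terms are assumed to arrive already serialised as strings. The writer remembers only the previous subject and predicate, so its memory use stays fixed no matter how many statements pass through it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only: NOT Serd's API or implementation. */

#define MAX_TERM 256 /* assumed upper bound on term length for this sketch */

typedef struct {
    FILE* out;                 /* destination stream */
    char  subject[MAX_TERM];   /* subject of the previous statement */
    char  predicate[MAX_TERM]; /* predicate of the previous statement */
    int   open;                /* nonzero while a statement awaits its " ." */
} AbbrevWriter;

static void abbrev_init(AbbrevWriter* w, FILE* out)
{
    assert(w && out);
    w->out          = out;
    w->subject[0]   = '\0';
    w->predicate[0] = '\0';
    w->open         = 0;
}

/* Write one statement, abbreviating against the previous one only.
 * Returns 0 on success, -1 if a term exceeds the fixed buffers. */
static int abbrev_write(AbbrevWriter* w,
                        const char* s, const char* p, const char* o)
{
    if (strlen(s) >= MAX_TERM || strlen(p) >= MAX_TERM) {
        return -1;
    }
    if (w->open && strcmp(s, w->subject) == 0) {
        if (strcmp(p, w->predicate) == 0) {
            fprintf(w->out, " , %s", o); /* same subject and predicate */
        } else {
            fprintf(w->out, " ;\n\t%s %s", p, o); /* same subject only */
            strcpy(w->predicate, p);
        }
    } else {
        if (w->open) {
            fputs(" .\n", w->out); /* terminate the previous subject */
        }
        fprintf(w->out, "%s %s %s", s, p, o);
        strcpy(w->subject, s);
        strcpy(w->predicate, p);
        w->open = 1;
    }
    return 0;
}

/* Terminate the final statement, if any. */
static void abbrev_finish(AbbrevWriter* w)
{
    if (w->open) {
        fputs(" .\n", w->out);
        w->open = 0;
    }
}
```

Calling abbrev_write for each statement as it is read, then abbrev_finish at the end of input, groups consecutive statements that share a subject with ";" and those that also share a predicate with ",". Statements about the same subject that are not adjacent in the input are simply written separately, which is the trade-off that keeps the state constant.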
Conformant and Well-Tested: Serd is written to the Turtle, NTriples and URI specifications, and includes a comprehensive test suite covering all the tests from the Turtle specification, all the “normal” examples from the URI specification, and several additional tests added specifically for Serd. The test suite has 100% code coverage (by line), and runs with zero memory errors or leaks [2].
| Command | Memory | HDD Time | HDD Throughput | SSD Time | SSD Throughput |
| | 3.22 MiB | 0:35 | 64.2 MiB/s | 0:27 | 83.2 MiB/s |
| | 3.6 MiB | 0:35 | 64.2 MiB/s | 0:21 | 107.0 MiB/s |
| | 3.7 MiB | 0:37 | 60.7 MiB/s | 0:22 | 102.1 MiB/s |
| | 11124.2 MiB | 3:02 | 12.3 MiB/s | 3:03 | 12.3 MiB/s |
| | 10.6 MiB | 1:26 | 26.1 MiB/s | 1:08 | 33.0 MiB/s |
Input is mappingbased_properties_en.nt from DBpedia fetched on 2011-12-12, ~17.5M triples, 2247 MiB uncompressed. System is a Debian GNU/Linux machine with Linux 3.1.1 on an Intel Core i7-2620M. “Memory” is maximum resident set size (the maximum total memory use). “Time” is wall clock time. For reliable benchmarking, the file system cache was flushed before each run with echo 3 > /proc/sys/vm/drop_caches. The variance between identical runs was less than 2 MiB/s; the best of 3 is shown. Output was redirected to a file on the same disk. Measurements were taken with /usr/bin/time -v. Note that the last raptor entry performs a simpler (non-abbreviating) task, and is included here for comparison.
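The throughput figures follow directly from the input size and the wall clock time. As a quick sanity check, the arithmetic can be sketched in C; the function name throughput_mib_s is hypothetical and not part of Serd.

```c
#include <assert.h>

/* Hypothetical helper (not part of Serd): average throughput in MiB/s
 * for size_mib MiB processed in minutes:seconds of wall clock time. */
static double throughput_mib_s(double size_mib, int minutes, int seconds)
{
    const int total_seconds = minutes * 60 + seconds;
    assert(size_mib >= 0.0 && total_seconds > 0);
    return size_mib / total_seconds;
}
```

For the 2247 MiB input, throughput_mib_s(2247.0, 0, 35) gives about 64.2 MiB/s, which matches the rows with a time of 0:35.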
These results show that serdi is capable of converting NTriples to abbreviated Turtle using a small constant amount of memory. Serd is fast enough that the process is entirely I/O bound when reading from the hard disk. On the solid state drive, Serd can’t quite maintain maximum throughput, so grep and sed are faster, as expected, because these tools do much less processing.
The latest version of Serd is 0.18.2, released on December 22, 2012.
Man pages and HTML documentation are built and installed by the source distribution when configured with
Serd is developed and given away freely for the benefit of all. However, donations in recognition of the considerable time and effort spent are appreciated.
1. Stripped of comments etc., as calculated by David A. Wheeler’s SLOCCount
2. According to valgrind