Posts

Sphinxygen 1.0.4

Sphinxygen 1.0.4 has been released. Sphinxygen is a Python module/script that generates Sphinx markup to describe a C API, from an XML description extracted by Doxygen.

Changes:

  • Handle anonymous structs without the "@" placeholder

Sphinxygen 1.0.2

Sphinxygen 1.0.2 has been released. Sphinxygen is a Python module/script that generates Sphinx markup to describe a C API, from an XML description extracted by Doxygen.

Changes:

  • Add command line flags to print version
  • Fix links to documentation groups

In Search of the Ultimate Compile-Time Configuration System

One of the many programming side quests I embark on from time to time is finding the best way to do compile-time configuration in C and C++ code. This is one of those characteristically C things that most projects need to do, but that has no well-established best practice. What you can find is all over the place, and often pretty half-baked just to suit the particularities of the "official" build. Let's try to come up with something better.

Ideal requirements:

  • Ability to enable or disable any features from the command line by defining symbols, including the ability to override or completely disable any automatic checks implemented in the code.

  • Good integration with, but no hard dependency on, any build system.

  • The code should build with reasonable defaults when simply thrown at a compiler "as-is".

  • Mistakes, such as forgetting to include the configuration header or using misspelled symbols, are caught by tooling (preferably compiler warnings).

  • It's never necessary to modify the code to achieve a particular build.

Here's a skeleton of the best I've managed to come up with so far, for a made-up "mylib" project and a few POSIX functions. It has a bit of boilerplate, but there are good reasons for everything, which I'll get to. This configuration header is written manually (not generated) and included (privately) in the source code:

#ifndef MYLIB_CONFIG_H
#define MYLIB_CONFIG_H

#if !defined(MYLIB_NO_DEFAULT_CONFIG)

// Derive default configuration from the build environment

// We need unistd.h to check _POSIX_VERSION
#  ifdef __has_include
#    if __has_include(<unistd.h>)
#      include <unistd.h>
#    endif
#  elif defined(__unix__)
#    include <unistd.h>
#  endif

// Define MYLIB_POSIX_VERSION unconditionally for convenience below
#  if defined(_POSIX_VERSION)
#    define MYLIB_POSIX_VERSION _POSIX_VERSION
#  else
#    define MYLIB_POSIX_VERSION 0
#  endif

// POSIX.1-2001, Windows: fileno()
#  if !defined(HAVE_FILENO)
#    if MYLIB_POSIX_VERSION >= 200112L || defined(_WIN32)
#      define HAVE_FILENO 1
#    endif
#  endif

// POSIX.1-2001: posix_fadvise()
#  if !defined(HAVE_POSIX_FADVISE)
#    if MYLIB_POSIX_VERSION >= 200112L && !defined(__APPLE__)
#      define HAVE_POSIX_FADVISE 1
#    endif
#  endif

#endif // !defined(MYLIB_NO_DEFAULT_CONFIG)

// Define USE variables for use in the code

#if defined(HAVE_FILENO) && HAVE_FILENO
#  define USE_FILENO 1
#else
#  define USE_FILENO 0
#endif

#if defined(HAVE_POSIX_FADVISE) && HAVE_POSIX_FADVISE
#  define USE_POSIX_FADVISE 1
#else
#  define USE_POSIX_FADVISE 0
#endif

#endif // MYLIB_CONFIG_H

User interface:

  • By default, features are enabled if they can be detected or assumed to be available from the build environment, unless MYLIB_NO_DEFAULT_CONFIG is defined, which disables everything by default to allow complete control.

  • If a symbol like HAVE_SOMETHING is defined to non-zero, then the "something" feature is assumed to be available. If it is zero, then the feature is disabled.

Usage in code:

  • To check for a feature, the configuration header must be included, and a symbol like USE_SOMETHING (not HAVE_SOMETHING) used as a boolean in an #if expression, like:

    #include "mylib_config.h"
    
    // [snip]
    
    #if USE_FILENO
        int fd = fileno(file);
    #endif
    
  • None of the other configuration symbols described here may be used directly. In particular, the configuration header should be the only place in the code that touches HAVE symbols.

The main "trick" here which allows for all of the different configuration "modes" is the use of two "kinds" of symbol: HAVE symbols and USE symbols. HAVE symbols are exclusively the interface for the user or build system, and USE symbols are the opposite: exclusively for use in the code and never by the user or build system. This way, use of the configuration header is mandatory for any code that needs configuration.

The USE symbols are defined to 0 or 1 unconditionally, and code must check them with #if, not with #ifdef. This prevents mistakes, since both forgetting to include the configuration header and misspelling a symbol leave an undefined symbol in an #if expression, which compiler warnings (such as -Wundef in GCC and Clang) will catch. Tools like include-what-you-use can also enforce direct inclusion more strictly.
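
As a minimal sketch of why this works (not part of the header above), the following single file stands in for both the tail of the configuration header and a source file that uses it. The mylib_fd() helper is a made-up name for illustration, and the warning flag assumed here is -Wundef, which GCC and Clang support:

```c
#include <stdio.h>

// Stand-in for the tail end of "mylib_config.h": HAVE_FILENO is the
// user-facing input, USE_FILENO is always defined to 0 or 1 for the code.
// (This stand-in has no automatic detection, so the feature is off unless
// HAVE_FILENO is defined to non-zero on the command line.)
#if defined(HAVE_FILENO) && HAVE_FILENO
#  define USE_FILENO 1
#else
#  define USE_FILENO 0
#endif

// Hypothetical helper: return the descriptor for a stream, or -1 if
// fileno() is not enabled in this build.
static int
mylib_fd(FILE* const file)
{
#if USE_FILENO
  return fileno(file);
#else
  (void)file;
  return -1;
#endif
}
```

If the USE_FILENO definition is removed (simulating a forgotten include), or the check is misspelled as, say, #if USE_FILENNO, the preprocessor quietly evaluates the unknown symbol as 0, but -Wundef reports it. An #ifdef check would have stayed silent in both cases.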

From the command line, basic usage is typical: define symbols like HAVE_SOMETHING to enable features. For complete control over the configuration, define MYLIB_NO_DEFAULT_CONFIG, in which case all features must be explicitly enabled. This is mainly useful for build systems, so that all features can be checked for and only those that are found used in the code. It's also useful for avoiding issues with strange compilers or platforms that aren't supported by the checks.

I think this design covers all of the above requirements, and while the header itself can get a bit verbose, it's relatively straightforward and, more importantly, usage of it is simple and resilient to mistakes.

There is one thing here that isn't caught by tooling though: misspelling a HAVE variable will silently not work. This is a concession to the simple case of just defining a few relevant HAVE symbols on the command line, and to keep command lines from the build system as terse as possible. It is however possible to modify this pattern a bit to catch this potential mistake as well: require all known HAVE variables to be defined to 1 or 0, and check those with #if as well in the configuration header itself. This adds a couple of lines per check to the boilerplate, for example:

// POSIX.1-2001, Windows: fileno()
#  ifndef HAVE_FILENO
#    if defined(_WIN32) || (defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L)
#      define HAVE_FILENO 1
#    else
#      define HAVE_FILENO 0
#    endif
#  endif

// [snip]

#if HAVE_FILENO
#  define USE_FILENO 1
#else
#  define USE_FILENO 0
#endif

This way, compiler warnings will catch any mistakes in the build system (because, for example, HAVE_FILENO isn't defined), ensuring that everything is explicitly either enabled or disabled. I'm not sure which style to use. Potential silent errors in the build system are pretty bad, but at the same time, I don't want to sacrifice the ability of the code to be easily compiled "manually". It's probably possible to have both, but I'm not sure how painful the boilerplate cost would be. I did have the stricter version for a while, but the extremely verbose compiler command lines were pretty annoying, so I removed it. Now, as I write this, I'm second-guessing myself, but so it goes.
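
One way to get both behaviours, sketched here as an assumption rather than something the header above does, is an opt-in strict mode. MYLIB_STRICT_CONFIG is a made-up symbol: a build system would define it along with every HAVE symbol, while manual builds leave it undefined and keep the automatic defaults.

```c
// We need unistd.h to check _POSIX_VERSION
#ifdef __has_include
#  if __has_include(<unistd.h>)
#    include <unistd.h>
#  endif
#elif defined(__unix__)
#  include <unistd.h>
#endif

// POSIX.1-2001, Windows: fileno()
#ifdef MYLIB_STRICT_CONFIG
// Hypothetical strict mode: nothing is guessed, so a missing or misspelled
// HAVE symbol stops the build instead of silently disabling the feature.
#  ifndef HAVE_FILENO
#    error "MYLIB_STRICT_CONFIG requires HAVE_FILENO to be defined to 0 or 1"
#  endif
#elif !defined(HAVE_FILENO)
// Default mode: derive the value from the build environment
#  if defined(_WIN32) || (defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L)
#    define HAVE_FILENO 1
#  else
#    define HAVE_FILENO 0
#  endif
#endif

// HAVE_FILENO is always defined by this point, so plain #if is safe
#if HAVE_FILENO
#  define USE_FILENO 1
#else
#  define USE_FILENO 0
#endif
```

The cost is an extra branch per check. In default mode, a misspelled HAVE symbol on a hand-written command line is still silently ignored, so this only protects builds that opt in.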

Questions for another day, I suppose. One of the things about programming side quests is that they usually never end.

C++ Bindings

For some C libraries, I'd like to include "official" C++ bindings to make life easier for people using them from C++ (which, in the audio world, is most people). However, that's not something I know a good pattern for, in terms of project organization, installation, versioning, and so on. Figuring one out is a trickier problem than it may seem at first.

In the - in this case literally - "C and C++" world, there is a notorious lack of consistent conventions and best practices in some areas, and this seems to be one of them. So, I suppose I will have to suss out the "best" (and least weird) way myself. The "best" way should:

  • Provide "official" C++ bindings which are developed, maintained, and shipped with the underlying C library.

  • Avoid having the C++ wrapper be locked to the same version as the C library (which is a strict semver reflection of the ABI).

    Rationale: It must be possible to develop the C++ bindings, including making breaking changes, while the C library version (and therefore the ABI) stays the same. Otherwise, it would be nearly impossible to change them, because that'd require changing the version... but the underlying C API version needs to break as infrequently as possible.

  • Isolate the bindings (and "C++ stuff") from the underlying C library as much as possible. Ensure that builds on systems without a C++ toolchain work (this isn't uncommon on minimal or embedded systems, which much of this software is well suited to).

  • Avoid making a completely separate new project (repository, test and releasing infrastructure, and so on) if at all possible. The maintenance burden would be far too high, and the bindings would be prone to rot.

  • Use a simple and predictable naming scheme that works with any "main" project name.

Poking around repositories and tinkering a little bit, the best practices I can come up with (for the sort of libraries I'm thinking about anyway) are:

  • Develop and release bindings as a sub-project within the "main" project.

    This is only a "project" in the build system sense. The bindings are maintained in the same git repository, and released in the same archive, as the C library.

  • Name the bindings sub-project by appending a cpp suffix, for example, mylib-cpp. This scheme is... well, not uncommon (for example, in the Debian repositories), and can easily be applied to any name, including libraries that already have multi-word names.

    Following meson requirements, this means the sub-project lives at a path like subprojects/mylib-cpp in the repository.

  • Install a separate "package" (for example via pkg-config) for the bindings, which depends on the one for the underlying C library. The major version is appended to both, for example, mylib-cpp-1 might depend on mylib-1.

  • Keep the C++ bindings themselves as light as possible, and header-only. This avoids link-time issues, making C++ API compatibility a compile-time issue only.

  • Give the bindings package a separate version number and let it increase as necessary. This version is not aligned with that of the underlying C library in any meaningful way. Technically, a given version of the bindings depends on some version of the C library, but in practice, this is always simply the version it's shipped with.

    A strange consequence of this scheme is that the version of the C++ bindings can only drift ever further away, so in the future even major versions may not correspond at all. This is a bit weird, but is the only way to make everything work and be properly versioned. Effectively, the version of the bindings is just an implementation detail, something developers deal with in configuration scripts. From the perspective of packagers or users, there is just one version of the library, the version of the underlying C library - the C++ bindings just may break sometimes, even within a major version of the project as a whole.

    I can't think of any concrete reason why this could be a problem: the urge to have shiny "4.0.0" type version bumps across everything at the same time smells like... marketing, frankly, not engineering. It does make parallel installation of different major versions more difficult, though. Packagers can split up the installation and make separate packages if they really want to. "Upstream" (me) officially doesn't care about parallel installation of different major versions of the C++ bindings.

    All that said, ideally they happen to stay relatively aligned anyway.

  • Make sure there is a simple and obvious option to disable C++ entirely, leaving a C library package with the broadest compatibility possible.

The short, vibes-based description of all that is something like: there is a stable and strictly versioned C library with every effort put into long-term source and binary compatibility, as always... and then there's a C++ bindings sub-project that tags along with it but is otherwise independent. The bindings are more volatile, but it's C++, so they're going to be volatile no matter what you do anyway. The bindings project is universally named by tacking a -cpp or _cpp on the end as appropriate in every context: include directories, package names, and so on.

So, an installation might look something like this:

include/dostuff-1/dostuff.h
include/dostuff-cpp-4/dostuff.hpp
lib/libdostuff-1.so
lib/libdostuff-1.so.1
lib/libdostuff-1.so.1.2.4
lib/pkgconfig/dostuff-1.pc
lib/pkgconfig/dostuff-cpp-4.pc

In the source code, the bindings and any supporting C++ code are entirely contained within the subproject, except for a minimal skeleton to handle compile-time options and so on. This can be more work than a single heterogeneous project in some ways, less work in others, but overall I think it has more maintenance benefits. Importantly, it keeps any new issues or volatility as far away from the C library as possible, making it easy to see if a change could possibly break the ABI or the C library at all, for example.

This scheme may be extended to other languages if that's appropriate. The naming scheme for Python is like python-dostuff. It probably makes more sense to maintain Cython wrappers as separate projects maintained in the Python way (sigh...), but the whole point of a naming scheme is to have space for things in case you need them. In reality, language bindings are usually done independently by other people in separate projects (Rust folks will use Cargo in a separate repository, and so on).

All of this is, obviously, a massively over-thought bikeshed, but adding multiple programming languages and multiple versioning and compatibility schemes/philosophies to a project is a bit tricky. I can't just copy from an existing best-practice pattern I've been honing for years like I can with straight C libraries. This approach seems like it shouldn't cause too much trouble, though.

That said, I'm just making this up as I go along and have no experience maintaining anything quite like this (only more or less homogeneous C or C++ libraries), so feedback is, as always, welcome. I may revise this post if anything turns out to be a mistake, so it can ultimately serve as a reference for the next person trying to figure out how to do "C family" source code releases right.

Sphinxygen 1.0.0

Sphinxygen 1.0.0 has been released. Sphinxygen is a Python module/script that generates Sphinx markup to describe a C API, from an XML description extracted by Doxygen.

Down to the Last Man, Doc!

While working through some remaining things in the build/documentation/packaging/infrastructure cleanup phase I've been going through lately, I noticed that some of the man pages hosted at this domain have become stale (there was a helpful ticket about one of them, but more or less all of them were out of date).

I'm still not entirely sure what to do about the general duplication between README files and the neglected home pages I still (barely) maintain here, but this part struck me as obviously needing to be automated.

Of course, now that my attention was drawn to them, I noticed that many of the man pages themselves were in a terrible state. In recent years I've become more enthusiastic about man pages, particularly after moving to mandoc. As a language, mdoc is dramatically better than the "traditional" man macros (the ones most often used on Linux), and includes meaningful semantic markup for more or less everything you'll need. It's still line noise to write, but you actually get something for it. Since it's semantic markup, it feels like a better investment to me: superficially ugly, perhaps, but with high fidelity as a portable source material. Converting from mandoc to just about anything, I imagine, could be done automatically easily enough, if it wasn't already supported by the tool [1].

In practice, I find that the language encourages/enables writing more consistent pages with better formatting, and mandoc, unlike groff, emits clean and simple HTML pages that can be easily styled with CSS. It's relatively pleasant to use in general, and emits sensible error messages that point directly to the problem when you screw something up. I've also come to appreciate the BSD style of man page more: the summary is actually a summary, it's less YELLEY, and there is a standard for sections and their order. I may spend most of my time in a GNU/Linux universe, but BSD got man pages right. That's not terribly surprising: BSD culture in general actually likes and cares about man pages. GNU treated them as an unfortunate legacy to prop up a failed successor (info). Of course the BSD tooling around man pages is much better. I tend towards BSD style for command line interfaces anyway, but you can document GNU-style long options in mdoc as well.

So! If I'm going to be putting energy into man pages, I might as well convert them to mdoc. I think good tooling is very important, and nice command-line tools with clear and articulate man pages are a part of that. Unfortunately, neither the LV2 tools themselves nor their documentation have frankly ever been of very high quality. Addressing this is one of the next themes of upcoming work, but for now, simply converting the existing content (with a few touch-ups along the way) is an easy step forward. It meshes nicely with the "bring all projects in line with a consistent set of 'modern' quality standards and automate everything" arc I've been on lately anyway.

The process of going through all of these has made me realize that I should prioritize the tool work, because the state of some of these tools is frankly embarrassing, but, uh, pay no attention to the man behind the curtain! It's a formats and infrastructure kind of day!

Having manually converted all the pages and checked them in the terminal, I needed a place to have them automatically updated, converted to HTML, and uploaded. The (not-really-existing) lv2kit project is the (not-really) parent of LV2 itself and all the "official" libraries, so I put it in CI there for the time being. I don't know what the best organization is for man pages under the LV2 umbrella in the grand scheme of things, but for now, this gets things automatically updated and online.

It's possible to build the documentation into a broader HTML page to integrate with toolbars and the like, but for now these are just complete pages straight out of mandoc. I did however write a simple stylesheet to adopt the same colours, fonts, spacing, typography-inspired minimalism, and infamous green links that I use everywhere, so things at least look reasonably coherent. I think it looks Good Enough™, anyway, but more relevantly, if you poke around in the document structure you'll see that the output is, for the most part, sensible semantic markup as well: lists are lists, sections are sections, different "sorts" of things are given classes, tables aren't abused, and so on. Everything here, from input to output, is reasonably clear and feels like it has obviously had some care put into it. The output looks like something a reasonable human might write themselves (if you run it through a formatter, anyway). I certainly can't say any of that about the equivalent groff-based monstrosity. In short, mandoc is good and you should use it.

As for these pages, the future of lv2kit as an "actual" released project is still unclear, but it is regularly updated (since the libraries rely on it for CI), so the man pages should never be stale again. I certainly wouldn't call these good yet, but not rotten is a start!


  1. mandoc currently supports emitting ascii, html, man, markdown, pdf, ps, tree, and utf8

On the LV2 Stack, Dependencies, and Incrementalism

For quite a while now, I've been working on new major versions of the LV2 host stack (serd, sord, sratom, suil, and lilv) while maintaining the current stable versions. Mainly this is a task of trying to push as many changes back as I can, without actually breaking any APIs. Aside from the usual bug fixes and code improvements, this includes a lot of build system, tooling, and general quality control improvements. This keeps the divergence low, so it will be easier to go back and do maintenance on the old versions where necessary.

At this point, this process is almost done: everything uses meson now, has a consistent QC regime (high-coverage testing, multi-platform CI, strict compiler warnings, machine formatting, clean clang-tidy configurations, and so on), most long-standing issues are fixed, and I'm finally starting to run out of changes that I can make without breaking the APIs and releasing new major versions.

It recently occurred to me, though, that there's one more incremental thing I can do here before "ripping the band-aid off", so to speak. In typical low-level C hacker fashion, I have some things that I just copy around piecemeal from project to project as necessary. One of those is zix, a simple library of basic data structures and portability wrappers. This has existed for over a decade, but I have never released it as a "proper" library. More dependencies, more problems, but copy-paste code reuse certainly isn't without its own problems. In this case, I have three projects that use it: sord, lilv, and jalv (and some other developers have discovered this and taken bits for their own purposes as well). This has been troublesome from time to time when bugs need fixing, and depending on how pedantic you are, might technically be considered a violation of some distributions' policies (notably Debian's).

Meanwhile, the lack of a common lowest-level dependency for this kind of very boring and generic stuff (something like the role that glib plays in that ecosystem) has proven increasingly problematic. There is some sketchy code around that exists solely to work around this "missing" dependency, and still some questionable use of things to skirt around missing facilities (for example, jalv has no mutexes). With the new versions I have in the pipeline, this problem becomes more acute for several reasons, so I've decided to release zix and depend on it as a proper released and versioned library, like all the others.

This is something that can be done before the new versions though, and, like all of the above tooling/etc improvements, I think it's best to do as much as possible on the current versions before replacing them, because that process will be painful enough as it is. My thinking originally was to not add any more libraries until I can take advantage of the solid "subproject" abilities of meson and wrap everything up into a single "lv2kit" to decrease the maintenance and packaging burden, but now I'm realizing that this isn't very realistic. For one, it's weird, and weird is bad. For another, it wouldn't really reduce packaging overhead, since distributions would surely want to package the individual libraries separately anyway. The lv2kit idea is still a goal, but pragmatically, I think it's best to just continue on in the good old "shotgun blast of little libraries" way and deal with that later.

Concretely, that will look like this:

serd <--\--- sratom
         --- sord --- lilv --- jalv
zix  <--/------------/--------/

That is, zix will become a thing, and sord, lilv, and jalv will gain it as a dependency. This will let any issues with packaging or subprojects or whatever get ironed out, without changing any of the APIs that LV2 hosts use directly whatsoever.

... and that, I think, is the last non-trivial thing I can do without rocking the boat, before the fun part where I finally get to change whatever I want in these APIs I hastily banged out over a decade ago now, and never intended to live this long in the first place. The dependency tree may get a bit more complicated, but a whole bunch of other things are going to get wildly simplified in exchange, which seems like a pretty good deal to me.

I Was Told The Crowd Would Have Funding

I am, like many an obsessive nerd before me, terrible at self-promotion and living a healthy and sustainable life in general. However, as I sit here sick, burnt-out, and slowly going broke while I desperately dump what little energy I have into boring software that mostly benefits other people, it occurs to me that it might be smart to set up the usual channels for people to support this work, and mention them here (despite how embarrassing it is to do so).

It's pretty well-known these days how brutal and thankless it can be to maintain Open Source infrastructure. While I enjoy what I do around here in a way, truth be told, I'm not sure I can do it much longer. In an ideal sense, my grand plan (as described in some earlier posts) is to get the foundation solid and increase the LV2 "bus factor" so I can return to building user-facing things and tackle some new problems, which tends to be more rewarding. Realistically, though, I may just be cleaning things up and minimizing the maintenance overhead so that I can hand them off or simply abandon them without feeling too bad about it. Time will tell, but if any of this helped at all with surviving the ever-skyrocketing cost of living... well, it certainly wouldn't hurt my motivation.

So! In addition to the old direct donation and subscription links on my donation page, I've set up GitHub Sponsors and Patreon accounts. If you're reading this, presumably you know about the various projects I author, maintain, or contribute to. If you use or benefit from any of that and can spare a few bucks, I'd appreciate any support. Every little bit really does help.

Patchage 1.0.10

Patchage 1.0.10 has been released. Patchage is a modular patch bay for Jack and ALSA based audio/MIDI systems.

Changes:

  • Add German translation
  • Add Korean translation from Junghee Lee
  • Add i18n support
  • Replace boost with standard C++17 facilities
  • Upgrade to fmt 9.0.0

Jalv 1.6.8

Jalv 1.6.8 has been released. Jalv (JAck LV2) is a simple host for LV2 plugins. It runs a plugin, and exposes the plugin ports to the system, essentially making the plugin an application. For more information, see http://drobilla.net/software/jalv.

Changes:

  • Add Gtk plugin selector UI and desktop file
  • Add missing option in console help output
  • Add version option to console executable
  • Build Qt interface as C++14
  • Change no-menu short option to m to avoid clash with jack-name
  • Clean up and modernize code
  • Fix "preset" console command when "presets" hasn't been called
  • Fix MSVC build
  • Fix atom buffer alignment
  • Fix crash when running jalv without arguments
  • Fix man page headers
  • Fix memory leaks
  • Fix outdated man pages
  • Fix spurious transport messages
  • Fix thread-safety of plugin/UI communication rings
  • Flush stdout after printing control values in console interface
  • Print status information consistently to stdout
  • Propagate worker errors to the scheduler when possible
  • Remove Gtkmm interface
  • Remove Qt4 support
  • Support both rdfs:label and lv2:name for port group labels
  • Switch to meson build system
