Like many, I've long suffered under the antiquated and inflexible HTML
documentation generated by Doxygen. Having recently worked on some Python
documentation using Sphinx, though, I found it powerful and pleasant enough to
use. It also has a way of encouraging actually writing documentation, rather
than just generating a dump of glorified comments, which is a good thing.
Though I'm not at all a fan of reStructuredText syntax (which at times seems
like it's trying to be cryptic on purpose), Sphinx is undeniably powerful, and
I like the "assemble a bunch of plainish text files" approach in general. The
support for multiple languages is also very appealing, though not without its
problems, as we'll get to.
So, is it possible to use Sphinx to generate documentation for C and C++
libraries? Yes! As explained somewhat recently in a post by Sy Brand,
there is a project called Breathe that integrates Doxygen (for extracting
documentation) with Sphinx (for generating output). That sounded promising, so
I attempted to migrate a library to using Breathe instead of Doxygen's HTML
support. Unfortunately, though, I encountered quite a few roadblocks where I
couldn't quite get output that I was happy with. Worse, the project itself is
very complicated, and as I poked around in swaths of originally generated but
manually modified code, I decided that Breathe was not for me. That would feel
like just exchanging one inflexible and unhackable system for another.
What, then, to do? Though I realize that deep integration via modules like
Breathe is usually the way things are done with Sphinx, I am a KISS sort of
person, so I like to think of it as something more like a Static Site
Generator: it reads a bunch of plainish text input files, and outputs HTML (or
whatever other presentation format). How do we describe C and C++ things in
Sphinx? It turns out that recent versions have built-in support for these
"domains" now, which define markup for describing everything in these
languages. This means that everything to do with nicely formatting and
cross-referencing C and C++ is already dealt with out of the box. Excellent.
So, taking a step back and assessing the situation: we have some XML files
(Doxygen's XML output) that describe the documentation, and we have a tool that
reads text files and produces nice documentation. This strikes me as a
relatively straightforward task for a nice and simple "files in, files out"
script, not one that requires a Goldbergian contraption that mashes Doxygen
into Sphinx. So, after
investigating any other promising options (no such luck), I resigned myself to
trying to write such a thing, at the very least to see if it's feasible. I
certainly have no time or interest in writing and maintaining a Documentation
System, but a self-contained script to convert one thing to another seems
reasonable enough.
As it turns out, I wouldn't call it trivial, but it's certainly feasible. I
ended up with a ~700 line Python script that does everything I need
(though this is of course not the same as everything possible). It's a bit
"gluey" and makes some assumptions about the structure and so on, but it does
the job and is something I feel I can maintain as necessary. I won't be
publishing or supporting this as an independent project any time soon, and make
no claims about it being general purpose, but feel free to steal it if any of
this sounds appealing.
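
To give a rough idea of the shape of such a "files in, files out" conversion, here is a minimal sketch. It is not the actual script: the sample XML is hand-written in the style of Doxygen's XML output, the function `mylib_open` and the helper names are hypothetical, and a real converter would have to handle far more than functions with a brief description.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample in the style of Doxygen's XML output.
# Real output is split across many files and contains many more elements.
SAMPLE_XML = """\
<doxygen>
  <compounddef kind="file">
    <sectiondef kind="func">
      <memberdef kind="function" id="mylib_8h_1a0">
        <name>mylib_open</name>
        <definition>MylibThing* mylib_open</definition>
        <argsstring>(const char *path)</argsstring>
        <briefdescription>
          <para>Open a thing at the given path.</para>
        </briefdescription>
      </memberdef>
    </sectiondef>
  </compounddef>
</doxygen>
"""


def text_of(element):
    """Return all text inside an element, with whitespace collapsed."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def functions_to_rst(xml_string):
    """Emit a Sphinx c:function directive for each function memberdef."""
    root = ET.fromstring(xml_string)
    lines = []
    for member in root.iter("memberdef"):
        if member.get("kind") != "function":
            continue

        # Doxygen stores the return type and name separately from the arguments
        signature = text_of(member.find("definition")) + text_of(
            member.find("argsstring")
        )
        lines.append(f".. c:function:: {signature}")
        lines.append("")

        brief = text_of(member.find("briefdescription"))
        if brief:
            lines.append(f"   {brief}")
            lines.append("")

    return "\n".join(lines)


# Print the generated ReST, which could be written to a file for Sphinx to read
print(functions_to_rst(SAMPLE_XML))
```

The output is ordinary ReST text, so it can be inspected or tweaked by hand before running Sphinx on it.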
With this, I was able to get around some long-standing gripes I have with
Doxygen, and easily make whatever I wanted to happen a reality, so I'm pretty
happy with this approach. Everything is nicely decoupled, so I don't feel
over-invested in any of the tools involved. If, for example, someone finally
writes a good clang-based extractor that gains traction (JSON please, I did not
enjoy this revisitation of the horrors of XML at all), I should be able to
switch to using that easily enough. I've actually found this somewhat crude
and UNIXey approach quite convenient: you can simply look at the ReST files to
understand what is happening, or tweak them a bit and run Sphinx to test what
you're aiming for, and so on. Text files are good.
So, after however many years, I think I've found an approach to documentation
I'm actually quite happy with, that can support all of the languages that I
use, and in general doesn't seem to get in my way. Hooray. For starters, I
converted the documentation for my window system portability layer, Pugl. The
generated documentation
for the C API can be seen at https://lv2.gitlab.io/pugl/c/singlehtml/, and
the C++ at https://lv2.gitlab.io/pugl/cpp/singlehtml/. This is more or less
the standard Alabaster theme with a few tweaks, which I'm not sure feels
appropriate for API documentation (and is much more bloated with a bunch of
JavaScript than I'd like), but it's pretty enough, at least. I'll tinker with
themes later when I feel like jumping down that rabbit hole.
The slightly cumbersome links are an artifact of the one problem I encountered
using Sphinx domains: you can't really document C and C++ APIs nicely in the
same documentation set. If you use the `cpp` domain everywhere, you get name
mangling in links even for C symbols, which is really unfortunate, and you
can't really mix them. To take a contrived example, if you have a
`struct MylibThing` in C, then a type alias in C++ like
`using Thing = MylibThing`, Sphinx isn't clever enough to figure out that
`MylibThing` is from C, and will
generate warnings and not link correctly. Perhaps someday it will, which would
be nice, but for now I opted to simply generate completely separate
documentation sets. This means the C documentation is duplicated in the C++
documentation so that things can be hyperlinked, which isn't ideal, but I can
live with it. A certain amount of redundancy is inherent in multi-language
documentation anyway.
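
As an illustration of what generating separate documentation sets might look like, here is a small sketch in which the domain is just a parameter of the generator. This is an assumption about one way to do it, not a description of the actual script: the helper names `struct_rst` and `alias_rst` are hypothetical, and the description text is made up. The idea is that the C struct is emitted once in the `c` domain for the C set, and again in the `cpp` domain for the C++ set, so that the alias there has something in the same domain to link to.

```python
def struct_rst(name, brief, domain):
    """Return a struct directive for the given Sphinx domain ("c" or "cpp")."""
    if domain not in ("c", "cpp"):
        raise ValueError(f"unsupported domain: {domain}")
    return f".. {domain}:struct:: {name}\n\n   {brief}\n"


def alias_rst(alias, target):
    """Return a C++ domain directive describing a type alias."""
    return f".. cpp:type:: {alias} = {target}\n"


# C documentation set: the struct lives in the C domain
c_set = struct_rst("MylibThing", "An opaque thing.", "c")

# C++ documentation set: the same struct is duplicated in the C++ domain,
# followed by the alias that refers to it
cpp_set = (
    struct_rst("MylibThing", "An opaque thing.", "cpp")
    + "\n"
    + alias_rst("Thing", "MylibThing")
)

print(c_set)
print(cpp_set)
```

The duplication is visible here as two calls with the same name and description, which is the redundancy accepted above.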
As I add Python bindings to most libraries, having a unified documentation
system for all of these languages will be very nice. There is one additional
thing I'll need at some point for the LV2 documentation in particular: a domain
for RDF properties and classes. The LV2 documentation really suffers from an
unnatural code (via Doxygen) and data (via lv2specgen) documentation split, and
my hope is that Sphinx can provide a nice environment for writing documentation
that refers to both worlds freely. That, unfortunately, will be much more
work, but hopefully writing a custom Sphinx domain isn't too hard...