RDF

DOAP replacing AUTHORS/MAINTAINERS/etc

There’s been a bit of talk in the GNOME camp lately about using DOAP instead of the unstructured text files that are the current norm for source packages. On the one hand, people want the benefits of having machine-readable data in projects; on the other, RDF/XML is a nightmare (“I’ll never maintain such bloat!” - “That is one hell of an ugly file.”).

This is how RDF/XML hurts RDF. The original loathed file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl"?>
<rdf:RDF xml:lang="en"
         xmlns="http://usefulinc.com/ns/doap#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:asfext="http://projects.apache.org/ns/asfext#">
  <Project rdf:about="http://ant.apache.org/">
    <created>2006-02-17</created>
    <license rdf:resource="http://usefulinc.com/doap/licenses/asl20" />
    <name>Apache Ant</name>
    <homepage rdf:resource="http://ant.apache.org" />
    <asfext:pmc rdf:resource="http://ant.apache.org" />
    <shortdesc>Java-based build tool</shortdesc>
    <description>Apache Ant is a Java-based build tool. In theory, it is kind of like Make, but without Make's wrinkles.</description>
    <bug-database rdf:resource="http://issues.apache.org/bugzilla/buglist.cgi?product=Ant" />
    <mailing-list rdf:resource="http://ant.apache.org/mail.html" />
    <download-page rdf:resource="http://ant.apache.org/bindownload.cgi" />
    <programming-language>Java</programming-language>
    <category rdf:resource="http://projects.apache.org/category/build-management" />
    <release>
      <Version>
        <name>Apache Ant 1.7.0</name>
        <created>2006-12-13</created>
        <revision>1.7.0</revision>
      </Version>
    </release>
    <repository>
      <SVNRepository>
        <location rdf:resource="http://svn.apache.org/repos/asf/ant"/>
        <browse rdf:resource="http://svn.apache.org/viewcvs.cgi/ant"/>
      </SVNRepository>
    </repository>
  </Project>
</rdf:RDF>

and the equivalent in Turtle (a subset of N3) (automatically generated with rapper -o turtle doap_Ant.rdf):

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://usefulinc.com/ns/doap#> .
@prefix asfext: <http://projects.apache.org/ns/asfext#> .


<http://ant.apache.org/>
    a :Project;
    :created "2006-02-17"@en;
    :license <http://usefulinc.com/doap/licenses/asl20>;
    :name "Apache Ant"@en;
    :homepage <http://ant.apache.org>;
    asfext:pmc <http://ant.apache.org>;
    :shortdesc "Java-based build tool"@en;
    :description "Apache Ant is a Java-based build tool. In theory, it is kind of like Make, but without Make's wrinkles."@en;
    :bug-database <http://issues.apache.org/bugzilla/buglist.cgi?product=Ant>;
    :mailing-list <http://ant.apache.org/mail.html>;
    :download-page <http://ant.apache.org/bindownload.cgi>;
    :programming-language "Java"@en;
    :category <http://projects.apache.org/category/build-management>;
    :release [
        a :Version;
        :name "Apache Ant 1.7.0"@en;
        :created "2006-12-13"@en;
        :revision "1.7.0"@en
    ];
    :repository [
        a :SVNRepository;
        :location <http://svn.apache.org/repos/asf/ant>;
        :browse <http://svn.apache.org/viewcvs.cgi/ant>
    ] .

I wouldn’t want to hand-maintain the RDF/XML version either, but the Turtle version? Sure. It’s the exact same information, far more human-readable, and about as terse as it could be while representing the same information.
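
To make the "same information" point a bit more concrete, here is a rough Python sketch (standard library only) that reads a trimmed copy of the DOAP file above and prints Turtle-ish output. It is not a real RDF/XML parser: it assumes a single flat Project whose properties are all in the DOAP namespace, skips nested nodes like release and repository, ignores xml:lang, and does no literal escaping. The function names are made up for this example; rapper is still the right tool for the real job.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"

# Trimmed copy of the Ant DOAP file (flat properties only).
DOAP_XML = """<rdf:RDF xmlns="http://usefulinc.com/ns/doap#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Project rdf:about="http://ant.apache.org/">
    <name>Apache Ant</name>
    <homepage rdf:resource="http://ant.apache.org" />
    <programming-language>Java</programming-language>
  </Project>
</rdf:RDF>
"""


def flat_properties(xml_text):
    """Return (subject, [(property, value, is_uri), ...]) for one Project.

    Assumption: every property is in the DOAP namespace and is either a
    plain literal or an rdf:resource reference. Nested nodes are skipped.
    """
    root = ET.fromstring(xml_text)
    project = root.find(f"{{{DOAP}}}Project")
    subject = project.get(f"{{{RDF}}}about")
    props = []
    for child in project:
        name = child.tag.split("}", 1)[1]  # strip the "{namespace}" part
        resource = child.get(f"{{{RDF}}}resource")
        if resource is not None:
            props.append((name, resource, True))
        elif len(child) == 0 and child.text:
            props.append((name, child.text.strip(), False))
    return subject, props


def to_turtle(subject, props):
    """Format the properties as Turtle-like text (no escaping, no @en tags)."""
    pairs = ["a :Project"]
    for name, value, is_uri in props:
        obj = f"<{value}>" if is_uri else f'"{value}"'
        pairs.append(f":{name} {obj}")
    return f"<{subject}>\n    " + ";\n    ".join(pairs) + " ."


subject, props = flat_properties(DOAP_XML)
turtle = to_turtle(subject, props)
print(turtle)
```

The output has the same shape as the start of the Turtle listing above, minus the prefix declarations and language tags.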

The best thing about a syntax-independent model like RDF is... well, it’s syntax-independent. Choose one that doesn’t suck :)

Hacking
RDF

LV2 Classification

… done and done. It is so much nicer to right click a canvas and hop through a few class (“category”) menus to find the plugin you want than to search a big flat list in a dialog. Short of a couple of semantic wars over classes (I already had to change the schema a bit to make the menu less crap), that’s done. As usual, I need to go over the API and make sure there are no potential future binary incompatibility issues, but I think it’s pretty good. SLV2 is quite a bit more of an abstraction layer than I initially planned, but the API is more solid and flexible, as far as being able to change things underneath without breaking things.

Look at me, finally learning that compatibility is important.

So what’s left now is the hard part: exposing full(ish) RDF querying. I don’t want to expose Redland (gone this far without) at least for normal uses, so I guess I will have to make a query results ‘class’ etc.

First is the bigger issue though: at the moment there are no typed literals in SLV2 at all. This is not good…

Hacking
LV2
RDF

Rewrite the hounds

Well, I did the Redland modularization thing to tear those dependencies out. Luckily for me, Dave Beckett wrote the storage module stuff in anticipation of modularization, so it was actually really easy (what’s with people named Dave and being absolutely brilliant, eh?). So, pending whatever autohell stuff needs doing to make it possible/easy to package into separate bits, that problem’s solved.

Armed with my new anti-whining-user ammo of future dependency disappearance, I pretty much rewrote all of SLV2 to use one big model of all LV2 related RDF on the system. I figured with an in-memory model, querying the whole thing should be quite fast (it’s not that big), and there’s some fun to be had with a big single web of data containing everything plugin/extension/whatever related. Happily, going through and parsing all the files is nearly instantaneous (Raptor has to be the fastest RDF parser in the west by a long shot)…

… and non-trivial (but necessary) queries take so long and chew 100% CPU that they’re unusable.

Shit.

Long story short, debugging and/or rewriting half of Rasqal isn’t exactly high on my Things I Feel Like Doing Right Now list, so I rewrote SLV2 again. Now each plugin has its own in-memory model, which works quite well. Lost that big web of data, which I thought was a pretty cool idea, but hopefully whatever’s wrong with Redland can get sorted out by the time anyone has a real use for that anyway. The plugin model can still load plugin extension bundles etc. into the same model, and that’s the important part anyway (which makes LV2 plugins extendable without having to modify the plugin’s bundle itself). Not really sure how the main ontology and LV2 extensions like categories and other system stuff fit into this just yet, but that’s a train of thought for another day.
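
As a toy illustration of the per-plugin model idea (in Python, not SLV2’s actual C code or API), each plugin gets its own small set of triples, and an extension bundle is simply loaded into that same set. The Model class, the match method and every "ex:" name here are invented for the example.

```python
class Model:
    """A tiny in-memory triple store for a single plugin (illustrative only)."""

    def __init__(self):
        self.triples = set()

    def load(self, triples):
        """Merge triples from any bundle (plugin or extension) into this model."""
        self.triples.update(triples)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (s, p, o)
        return sorted(
            t for t in self.triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )


# One model per plugin, keyed by plugin URI (hypothetical URIs).
world = {}
amp = "ex:plug/amp"
world[amp] = Model()

# Data from the plugin's own bundle.
world[amp].load([(amp, "ex:port", "ex:plug/amp#gain")])

# Data from a separate extension bundle, loaded into the same model,
# so the plugin's bundle never has to be modified.
world[amp].load([(amp, "ex:category", "ex:Amplifier")])

categories = world[amp].match(s=amp, p="ex:category")
everything = world[amp].match()
```

Queries only ever touch one small model this way, which is the whole point; the price is that nothing can be asked across plugins without looping over every model.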

The performance problems Zynjacku was having are much better now. The loading gauge no longer pauses on plugins with large numbers of ports; it zips along at a constant speed and is quite a bit faster overall.

It could be faster still if I added an API to get plugins with a custom query, but that will probably have to wait until the Redland query lock-up thing is solved. Anyway, it’s much, much better now, and the API has been redone in a better more abstractioney way that should allow any kind of redesign behind the scenes (as far as caching models and whatnot) without breaking the API. Also the API is now reentrant (well, if Redland is anyway?) which I figure is probably going to be an issue at some point once some plugin and its host both use SLV2.

Now to just get the thing fully working again…

Hacking
LV2
RDF

Release the hounds, part 1

Blissfully ignoring the fact that I have an exam next Wednesday that’s probably going to be pretty hard and involves mathematics and whatnot… free time!

So, now is the time to get some damn releases out and off my back, starting with stuff that other people depend on (though my toys are frankly much more fun to work on. You all owe me a beer). Particularly before the Summer of Code starts, which (27 hour a day jokes aside) is going to chew time like crazy. Hopefully I manage to get outside and see the sun once or twice this summer, but that’s another topic.

Step 1 is SLV2. LV2 really needs to get out there, yesterday. The main problem right now is exposing categories and performance issues with querying lots of plugins with large numbers of ports.

Categories is a bit tough to do right. The easy way out is to just assume the categories are as in the ontology, but that’s crap and very un-LV2ey. The way bundles work you can just drop in extensions with more categories and they should be able to magically show up in all the user’s apps without the apps having to change at all. The hard part is dealing with weird category hierarchies and how to expose it all in an API. The weird hierarchies are solved easily enough by making a few assumptions. Exposing via an API I don’t know. I suppose having separate methods to get the category hierarchy (as URIs and names, probably), then just allow getting the (many? a?) category for a plugin and the host can do whatever it wants from there. That’s easy enough, seems like it will work. How to return the hierarchy is a question; I suppose the structure in memory can be a tree that directly maps to the hierarchy somehow. This kind of thing is much more annoying in C (especially for a little OO weenie yungun’ like myself), but it can work. One thing that I should probably keep in mind is that clamping things to a strict category hierarchy is not a good idea (kids these days and their “tags”), even though virtually all apps will use categories as an (intrinsically hierarchical) menu. Both need to work.
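
Here is roughly the shape of that idea, sketched in Python rather than C for brevity. None of it is SLV2 API: the "ex:" URIs, the table layout and the function names are all invented for illustration. The hierarchy comes back as a parent-to-children map (a tree a host can walk to build menus), while each plugin can carry several categories, so the tag-style use still works.

```python
from collections import defaultdict

# Hypothetical (class URI, label, parent URI) rows, as they might be
# gathered from the ontology plus any dropped-in extension bundles.
CLASSES = [
    ("ex:Plugin", "Plugin", None),
    ("ex:Delay", "Delay", "ex:Plugin"),
    ("ex:Reverb", "Reverb", "ex:Delay"),
    ("ex:Filter", "Filter", "ex:Plugin"),
]

# A plugin may belong to many classes (the "tags" case).
PLUGIN_CLASSES = {
    "ex:plug/hall": ["ex:Reverb"],
    "ex:plug/combo": ["ex:Reverb", "ex:Filter"],
}


def build_children(classes):
    """Map each parent URI (None for the root level) to its child class URIs."""
    children = defaultdict(list)
    for uri, _label, parent in classes:
        children[parent].append(uri)
    return children


def menu_paths(plugin, classes, plugin_classes):
    """Return one root-to-leaf list of labels per class the plugin is in."""
    labels = {uri: label for uri, label, _parent in classes}
    parents = {uri: parent for uri, _label, parent in classes}
    paths = []
    for cls in plugin_classes.get(plugin, []):
        path, seen = [], set()
        # Walk up to the root; 'seen' guards against cyclic hierarchies.
        while cls is not None and cls not in seen:
            seen.add(cls)
            path.append(labels.get(cls, cls))  # unknown class: show its URI
            cls = parents.get(cls)
        paths.append(list(reversed(path)))
    return paths


children = build_children(CLASSES)
combo_paths = menu_paths("ex:plug/combo", CLASSES, PLUGIN_CLASSES)
```

A menu-driven host would put "ex:plug/combo" under both Plugin > Delay > Reverb and Plugin > Filter; a tag-driven host would just read the plugin’s class list directly.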

Performance is a bit more ominous. I think maybe depending on full Redland might end up being necessary. I have avoided that in all my software so far because of dependencies (mostly MySQL) which people would complain about. Having MySQL as a dependency to use LV2 plugins is definitely not acceptable, but not using Redland (i.e. just using Raptor and Rasqal) is really, really, really limiting, and slow, and annoying, and just not good. They simply weren’t designed for the kind of higher level stuff I’m doing, that’s what Redland is for.

So, fiddling with Redland to split the stores etc. into dynamically loaded modules so they can be packaged up separately is The Solution here. Unfortunately adding yet another project to the stack is exactly the opposite of what I need right now… it would be nice to be able to spend a bit more time on Redland though. My little vendetta getting RDF/Turtle on the desktop depends on having an RDF toolkit that’s lightweight/appropriate enough for installing on user machines, and Redland is clearly the one for the job. I know this is good, powerful technology, and the things it will allow us to do are concrete, tangible, awesomeness; despite what any (usually just plain ignorant) naysayers may think. Having a proper in-memory RDF store (Redland) is necessary for most of these things, and the API is much nicer regardless (plus there are Python bindings which are, IMNSHO, the single greatest thing ever).

On a related note, I am so incredibly sick of school interfering with my time and ability to work on this stuff it’s not funny. You know your life is fucked up when you do more ‘work’ in your free time than most people do at work. My brilliant solution to this problem? Subject myself to even more of this utter waste of time and go for a Comp Sci Master’s! GOOD PLAN. I’ll be exactly where I am right now, except 2 years more bitter, stressed to death, and old (which at this particular point in life is a really, really relevant factor. Do I really want to waste what tiny shred of youth I have left slaving away at pointless work I despise? No, no I do not.)

I’m not a masochist; just weak. Weak and stupid.

Hacking
LV2
RDF

subdomains may look pretty…

Well, it turns out that making a new subdomain for practically everything is a bit of a stupid idea. Damn you, DreamHost webpanel, for making it entirely too easy and fancy looking to create new subdomains.

Oh well. Live and learn. The annoying thing about “live and learn” with webby things is that the “learn” part tends to break the hell out of things that other people notice. Then again, you would think that changing an SVN URL and having a post on the top-level page of the (clearly personal) domain about it would mean people would figure it out eventually. You would of course be wrong, having committed the common mistake of assuming people aren’t idiots. They’ll just keep trying and trying and trying, filling up the logs with the same crap every single day, like some day it will magically revert to the old location. I had faith in mankind before the Internet.

Anyway, I hereby pledge to make no more Meta entries unless they actually, you know, have some content worth reading. The “making my site!” blog entry is definitely the big yellow UNDER CONSTRUCTION banner of the 2000s.

Now I just need to figure out a stable URI scheme for RDF ontologies, etc. That one’s actually important, but I haven’t been able to find any decent best practice documents for it via Google. I define “truth” in terms of Google findability, so I guess I just need to figure this one out. The W3C date-in-the-url thing seems a bit stupid to me. That works for blog posts and other timey things, but specifications and namespaces have a clear versioning that makes much more sense to use. Plus dates are large and ugly. Given that the W3C definitely has the most referenced namespaces of anyone, why they can’t have decent looking URIs for the damn things is beyond me.

The same sort of problem exists for LV2 plugins (and Ingen patches, which are slowly becoming LV2 plugins). I guess it’s a general problem; assigning URIs to changing (versioned) things.

As for versioning, there are compatible changes and incompatible changes. The problem is… what does “compatible” mean? For RDF, the simple definition is “any existing triples have not changed” but that’s a bit low-level. Adding triples can seriously change the semantics of an ontology (less so for plain old RDF data). With plugins, the basic compatibility unit is the “port signature”, the ports of the plugin that are exposed (and basically define the interface).

Major/minor version number isn’t really sufficient for that though, because adding ports but leaving the existing ones unchanged is... sort of compatible, but not really. It does change things, but hosts can deal with it as long as the ports have sensible default values. So, enter 3 digit version numbers. That’s probably a bit much for namespaces and ontologies though. Or maybe not, it depends.
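
To pin down what the three digits could mean for port signatures, here is one possible reading as a Python sketch. The mapping is my assumption for the example, not a settled rule: removing or flipping an existing port bumps the first digit, adding ports that hosts can cope with (outputs, or inputs with a default) bumps the second, and anything that leaves the signature alone bumps the third. Port and classify_change are made-up names.

```python
from typing import NamedTuple, Optional


class Port(NamedTuple):
    symbol: str
    direction: str                   # "in" or "out"
    default: Optional[float] = None  # None means no default value


def classify_change(old, new):
    """Return which version digit a port signature change should bump."""
    old_by_symbol = {p.symbol: p for p in old}
    new_by_symbol = {p.symbol: p for p in new}

    # Any existing port that vanished or changed direction breaks hosts.
    for symbol, port in old_by_symbol.items():
        kept = new_by_symbol.get(symbol)
        if kept is None or kept.direction != port.direction:
            return "major"

    added = [p for s, p in new_by_symbol.items() if s not in old_by_symbol]
    if not added:
        return "micro"  # signature untouched (changed defaults ignored here)

    # Added ports are tolerable if a host can leave them alone:
    # outputs can be ignored, inputs need a sensible default.
    if all(p.direction == "out" or p.default is not None for p in added):
        return "minor"
    return "major"


old = [Port("in", "in"), Port("out", "out")]
with_gain = old + [Port("gain", "in", 1.0)]
with_mode = old + [Port("mode", "in")]

results = (
    classify_change(old, old),
    classify_change(old, with_gain),
    classify_change(old, with_mode),
    classify_change(old, old[:1]),
)
```

Whether an added input without a default should really count as a major change is exactly the sort of domain-specific call the next paragraph grumbles about.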

Blatantly Obvious Conclusion: Versioning is domain-specific. Shame, that.

Still… things need URIs.

One thing I need to do is be able to update an ontology without breaking software that refers to it, so symlinks (à la the typical foo-LATEST.tar.gz practice) come in handy. Something like:

http://drobilla.net/ns/ingen/1.1
http://drobilla.net/ns/ingen/1 -> http://drobilla.net/ns/ingen/1.1

or maybe http://drobilla.net/ns/ingen/1.latest ?

I think maybe the best idea is to not define any sort of rigid versioning scheme (like major.minor.revision as in software) at all, but have a sort of hierarchical thing that may have as many version numbers as necessary (which could increase in the future). Symlinks (aka aliases) can make everything pretty.

http://drobilla.net/ns/ingen/1.0

http://drobilla.net/ns/ingen/1.1

http://drobilla.net/ns/ingen/1.1.1

http://drobilla.net/ns/ingen/1.2

http://drobilla.net/ns/ingen/2.0

http://drobilla.net/ns/ingen/2.0.1

http://drobilla.net/ns/ingen/1.1 -> http://drobilla.net/ns/ingen/1.1.1

http://drobilla.net/ns/ingen/1 -> http://drobilla.net/ns/ingen/1.2

http://drobilla.net/ns/ingen/2 -> http://drobilla.net/ns/ingen/2.0.1

etc. Basically any given URL with a version points to the latest version with that version as a prefix. Seems nice and open-ended yet stable to me… maybe?
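
The prefix rule is simple enough to sketch in a few lines of Python. This is only an illustration of the lookup, assuming version components are plain integers; the resolve function is made up, and the published list is just the versions from the example above.

```python
BASE = "http://drobilla.net/ns/ingen/"

# The concrete versions from the example above.
PUBLISHED = ["1.0", "1.1", "1.1.1", "1.2", "2.0", "2.0.1"]


def parse(version):
    """Turn "1.1.1" into (1, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(alias, published):
    """Return the latest published version that has 'alias' as a prefix.

    Returns None when nothing published starts with that prefix.
    """
    want = parse(alias)
    candidates = [
        parse(v) for v in published if parse(v)[:len(want)] == want
    ]
    if not candidates:
        return None
    # Tuple comparison orders (1, 1) before (1, 1, 1) before (1, 2).
    return ".".join(str(n) for n in max(candidates))


for alias in ("1.1", "1", "2"):
    print(BASE + alias, "->", BASE + resolve(alias, PUBLISHED))
```

This prints the same three alias lines as the list above. Comparing tuples of integers rather than strings matters once a component reaches two digits, since "1.10" should sort after "1.9".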

Compatibility sucks. Worst part of software anything, ever.

Meta
RDF
