Thoughts filed in: Nutty

20
May

Database-free content tagging with files and glob

Tagging data (e.g. in a blog) is many-to-many data. Each content item can have multiple tags. And each tag can be assigned to multiple content items. Many-to-many data needs to be stored in a database. Preferably a relational database (e.g. MySQL, PostgreSQL), otherwise an alternative data store (e.g. something document-oriented like MongoDB / CouchDB). Right?

If you're not insane, then yes, that's right! However, for a recent little personal project of mine, I decided to go nuts and experiment. Check it out, this is my "mapping data" store:

Just a list of files in a directory.

Just a list of files in a directory.

And check it out, this is me querying the data store:

Show me all posts with the tag 'fun-stuff'.

Show me all posts with the tag 'fun-stuff'.

And again:

Show me all tags for the post 'rant-about-seashells'.

Show me all tags for the post 'rant-about-seashells'.

And that's all there is to it. Many-to-many tagging data stored in a list of files, with content item identifiers and tag identifiers embedded in each filename. Querying is by simple directory listing shell commands with wildcards (also known as "globbing").

Is it user-friendly to add new content? No! Does it allow the rich querying of SQL and friends? No! Is it scalable? No!

But… Is the basic querying it allows enough for my needs? Yes! Is it fast (for a store of up to several thousand records)? Yes! And do I have the luxury of not caring about user-friendliness or scalability in this instance? Yes!