Basic breadcrumbs and taxonomy


Part 1 of "making taxonomy work my way".

Update (6th Apr 2005): parts of this tutorial no longer need to be followed. Please see this comment before implementing anything shown here.

As any visitor to this site will soon realise, I love Drupal, the free and open source Content Management System (CMS) without which GreenAsh would be utterly defunct. However, even I must admit that Drupal is far from perfect. Many aspects of it - in particular, some of its modules - leave much to be desired. The taxonomy module is one such little culprit.

If you browse through some of the forum topics at drupal.org, you'll see that Drupal's taxonomy system is an extremely common cause of frustration and confusion, for beginners and veterans alike. Many people don't really know what the word 'taxonomy' means, or don't see why Drupal uses this word (instead of just calling it 'categories'). Some have difficulty grasping the concept of a many-to-many relationship, which taxonomy embraces with its open and flexible classification options. And quite a few people find it frustrating that taxonomy has so much potential, but that very little of it has actually been implemented. And then there are the bugs.

In this series, I show you how to patch up some of taxonomy's bugs; how to combine it with other Drupal modules to make it more effective; and also how to extend it (by writing custom code) so that it does things that it could never do before, but that it should have been able to do right from the start. In sharing all these new ideas and techniques, I hope to make life easier for those of you that use and depend on taxonomy; to give hope to those of you that have given up altogether on taxonomy; to open up new possibilities for the future of the official taxonomy module (and for the core Drupal platform); and to kindle discussion and criticism on the material that I present.

The primary audience of this series is fellow web developers that are a part of the Drupal community. In order to appreciate the ideas presented here, and to implement the examples given, it is recommended that you have at the very least used and administered a Drupal site before. Knowledge of PHP programming and of MySQL / PostgreSQL (or even other SQL) queries would be good. You do not need to be a hardcore Drupal developer to understand this series - I personally do not consider myself to be one (yet :-)) - but it would be good if you've tinkered with Drupal's code and have at least some familiarity with it, as I do. If you're not part of this audience, then by all means read on, but don't be surprised if very soon (if not already!) you have no idea what I'm talking about.

I thought part 1 was about breadcrumbs...??

And it is - you're quite right! So, now that I've got all that introductory stuff out of the way, let's get down to the guts of this post, which is - as the title suggests - basic breadcrumbs and taxonomy (for those of you that don't see any bread, be it white, multi-grain, or wholemeal, check out this definition of a breadcrumb).

Because let's face it, that's what breadcrumbs are: basic. It's one of those fundamental things that you'd expect would work 100% right out of the box: you make a new site, you post content to it, you assign the content a category, and you take it for granted that a breadcrumb will appear, showing you where that post belongs in your site's category tree. At least, I thought it was basic when I started out on this side of town. Jakob Nielsen (the web's foremost expert on usability) thinks so too, as this article on deep linking shows. But apparently, Drupal thinks differently.

It's the whole many-to-many relationship business that makes things complicated in Drupal. With a CMS that supports only one-to-many relationships (that is, each piece of content has only one parent category - but the parent category can have many children), making breadcrumbs is simple: you just trace a line from a piece of content, to its parent, to its parent's parent, and so on. But with Drupal's taxonomy, one piece of content might have 20 parents, and each of them might have another 10 each. Try tracing a line through that jungle! It makes no difference that you don't have to use many-to-many relationships just because you can: taxonomy was designed to support complex relationships, and if it is to do that properly, it has to sacrifice breadcrumbs. And that's the way it works in Drupal: the taxonomy system seldom displays breadcrumbs for terms, and never displays them for nodes.
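To make the contrast concrete, here is a minimal sketch in plain PHP (no Drupal calls). The arrays and the function names `trace_breadcrumb()` and `count_trails()` are made up for illustration and are not part of Drupal or taxonomy_context. With one parent per category there is exactly one trail to trace; with several parents the number of possible trails multiplies.

```php
<?php
// Hypothetical one-to-many hierarchy: child => parent (NULL marks a top-level category).
$parent_of = array(
  'news'  => 'posts',
  'posts' => NULL,
);

/**
 * Walk from a category up to the root and return the trail, root first.
 */
function trace_breadcrumb($category, $parent_of) {
  $trail = array();
  while ($category !== NULL) {
    array_unshift($trail, $category);
    $category = isset($parent_of[$category]) ? $parent_of[$category] : NULL;
  }
  return $trail;
}

// Hypothetical many-to-many hierarchy: child => list of parents.
$parents_of = array(
  'item' => array('a', 'b'),
  'a'    => array('x', 'y'),
  'b'    => array('x'),
  'x'    => array(),
  'y'    => array(),
);

/**
 * Count how many distinct trails lead from a category up to any root.
 */
function count_trails($category, $parents_of) {
  if (empty($parents_of[$category])) {
    return 1;
  }
  $total = 0;
  foreach ($parents_of[$category] as $parent) {
    $total += count_trails($parent, $parents_of);
  }
  return $total;
}

echo implode(' -> ', trace_breadcrumb('news', $parent_of)), "\n"; // posts -> news
echo count_trails('item', $parents_of), "\n";                      // 3
```

Even this tiny many-to-many example has three candidate trails for a single item, which is why a breadcrumb needs some rule for picking one of them.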

Well, I have some slightly different ideas to Drupal's taxonomy developers, when it comes to breadcrumbs. Firstly, I believe that an entire site should fall under a single 'master' category hierarchy, and that breadcrumbs should be displayed on every single page of the site without fail, reflecting a page's position in this hierarchy. I also believe that this master hierarchy system can co-exist with the power and flexibility that is inherent to Drupal's taxonomy system, but that additional categories should be considered 'secondary' to the master one.

Look at the top of this page. Check out those neat breadcrumbs. That's what this entire site looks like (check for yourself if you don't believe me). By the end of this first part of the series, you will be able to make your site's breadcrumbs as good as that. You'll also have put in place the foundations for yet more cool stuff, that can be done by extending the power of taxonomy.

Get your environment ready

In order to develop and document the techniques shown here, I have used a test environment, i.e. a clean copy of Drupal, installed on my local machine (which is Apache / PHP / MySQL enabled). If you want to try this stuff out for yourself, then I suggest you do the same. Here's my advice for setting up an environment in which you can fiddle around:

  1. Grab a clean copy of Drupal (i.e. one that you've just downloaded fresh from drupal.org, and that you haven't yet hacked to death). I'm not stopping you from using a hacked version, but don't look at me when none of my tricks work on your installation. I've used Drupal 4.5.2 to do everything that you'll see here (latest stable release as at time of writing), so although I encourage you to use the newest version - if a newer stable release is out as at the time of you reading this (it's always good to keep up with the official releases) - naturally I make no guarantee that these tricks will work on anything other than vanilla 4.5.2.
  2. Install your copy of Drupal. I'm assuming that you know how to do this, so I'll be brief: unzip the files; set up your database (I use MySQL, and make no guarantee that my stuff will work with PostgreSQL); tinker with conf.php; and I think that's about it.
  3. Configure your newly-installed Drupal site: create the initial account; configure the basic settings (e.g. site name, mission / footer, time zone, cache off, clean URLs on); and enable a few core modules (the path module in particular; forum and menu would be good too, and you can choose some others if you want). The taxonomy module should be enabled by default, but just check that it is.
  4. Download and install taxonomy_context.module (24 Sep 2004 4.5.x version used in my environment). I consider this module to be essential for anyone who wants to do anything half-decent using taxonomy: most of its features - such as basic breadcrumbing capabilities and term descriptions - are things that really should be part of the core taxonomy module. You will need taxonomy_context for virtually everything that I will be showing you in this series. Note: make sure you move the file taxonomy_context.module from your /modules/taxonomy_context folder, to your /modules folder, otherwise certain things will not work.
  5. Once you've done all that, you have set yourself up with a base system that you can use to implement all my tricks! Your Drupal site should now look something like this:

Very basic drupal site (screenshot)

Add some taxonomy terms and some content

I have written some simple instructions (below) for adding the dummy taxonomy that I used in my test environment. Your taxonomy does not have to be exactly the same as mine, although the structure that I use should be followed, as it is important in achieving the right breadcrumb effect:

  1. In the navigation menu for your new site, go to administer -> categories, then click the add vocabulary tab.
  2. Add a new vocabulary called 'Sections'. Make it required for all node types except forum topics, and give it a single or multiple hierarchy. Also give it a light weight (e.g. -8).
  3. Add a term called 'posts' to your new 'Sections' vocab.
  4. Add another term called 'news', as a child of 'posts'.
  5. Add another vocab called 'News by priority'. Make it apply only to stories, give it a single or multiple hierarchy, don't make it required, and give it a heavier weight than the 'Sections' vocab.
    Note: it was a deliberate choice to give this vocab a name that suggests it is a child of the term 'news' in the 'Sections' vocab. This was done in preparation for part 2 of this series, setting up a cross-vocabulary hierarchy. If you have no interest in part 2, then you can call this vocab whatever you want (but you still need a second vocab).
  6. Add a term 'browse by priority' to the 'News by priority' vocab. Note: the use of a name that is almost identical to the vocab's name is deliberate - the reason for this is explained in part 2 of the series.
  7. Add another term 'important' as a child of 'browse by priority'.
  8. Well done - you've just set up a reasonably complex taxonomy structure. Your 'categories' page should now look something like this:

Test site with categories (screenshot)

Now that you have some categories in place, it's time to create a node and assign it some terms. So in the navigation menu, go to create content -> story; enter a title and an alias for your node; make it part of the 'news' section, and the 'important' priority; enter some body text; and then submit it. Your node should look similar to this:

Node with categories assigned (screenshot)

First bug: nodes have no breadcrumbs

OK, so now that you've created a node, and you've assigned some categories to it, let's examine the state of those breadcrumbs. If you go to a taxonomy page, such as the page for the term 'news', you'll see that breadcrumbs are being displayed very nicely, and that they reflect our 'sections' hierarchy (e.g. home -> posts -> news). But if you go to a node page (of which there is only one, at the moment - unless you've created more), a huge problem is glaring (or failing to glare, in this case) right at you: there are no breadcrumbs!

But don't panic - the solution is right here. First, you must bring up the Drupal directory on your filesystem, and open the file /modules/taxonomy_context.module. Find the following code in taxonomy_context (Note: the taxonomy_context module is updated regularly, so this code and other code in the tutorials may not exactly match the code that you have):

<?php
/**
 * Implementation of hook_init
 * Set breadcrumb, and show some infos about terms, subterms
 */
function taxonomy_context_init() {
  $mode = arg(0);
  $paged = !empty($_GET["from"]);

  if (variable_get("taxonomy_context_use_style", 1)) {
    drupal_set_html_head('<style type="text/css" media="all">@import "modules/taxonomy_context/taxonomy_context.css";
</style>');
  }

  if (($mode == "node") && (arg(1)>0)) {
    $node = node_load(array("nid" => arg(1)));
    $node_type = $node->type;
  }
  // Commented out in response to issue http://drupal.org/node/11407
//  if (($mode == "taxonomy") || ($node_type == "story") || ($node_type == "page")) {
//    drupal_set_breadcrumb( taxonomy_context_get_breadcrumb($context->tid));
//  }
}
?>

And replace it with this code:

<?php
/**
 * Implementation of hook_init
 * Set breadcrumb, and show some infos about terms, subterms
 * Patched to make breadcrumbs on nodes work, by using taxonomy_context_get_context() call
 * Patch done by x on xxxx-xx-xx
 */
function taxonomy_context_init() {
  $mode = arg(0);
  $paged = !empty($_GET["from"]);

  // Another little patch to make the CSS link only get inserted once
  static $taxonomy_context_css_inserted = FALSE;
  if (variable_get("taxonomy_context_use_style", 1) && !$taxonomy_context_css_inserted) {
    drupal_set_html_head('<style type="text/css" media="all">@import "modules/taxonomy_context/taxonomy_context.css";
</style>');
    $taxonomy_context_css_inserted = TRUE;
  }

  if (($mode == "node") && (arg(1)>0)) {
    $node = node_load(array("nid" => arg(1)));
    $node_type = $node->type;
  }
  // Commented out in response to issue http://drupal.org/node/11407

  // Un-commented for breadcrumb patch
  // NOTE: you don't have to have all the node types below - only story and page are essential
  $context = taxonomy_context_get_context();
  $context_types = array(
    "story",
    "page",
    "image",
    "weblink",
    "webform",
    "poll"
  );
  if ( ($mode == "taxonomy") || (is_numeric(array_search($node_type, $context_types))) ) {
    drupal_set_breadcrumb( taxonomy_context_get_breadcrumb($context->tid, $mode));
  }
}
?>

Note: when copying any of the code examples here, you should replace the lines that say "Patch done by x on xxxx-xx-xx" with your name, and the date that you copied the code. This makes it easier to keep track of any deviations that you make from the official Drupal code base, and means that upgrading to a new version of Drupal is only 'very difficult', instead of 'impossible' ;-).

This patch makes breadcrumbs appear for any node type that is included in the $context_types array (which you should edit to suit your needs), based on the site's taxonomy hierarchy. After implementing this patch, you should see something like this when you view a node:

Basic node with breadcrumbs (screenshot)
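The node-type test in the patch can look odd at first glance, so here is a small standalone sketch of it. The helper name `wants_breadcrumb()` is hypothetical and exists only for this illustration; the list of types mirrors the one in the patch. The point is that array_search() returns the matching key (which may be 0) or FALSE, so wrapping it in is_numeric() tells "found at position 0" apart from "not found".

```php
<?php
// Node types that should get taxonomy breadcrumbs (same list as in the patch).
$context_types = array('story', 'page', 'image', 'weblink', 'webform', 'poll');

/**
 * Hypothetical helper: TRUE if a breadcrumb should be set for this request.
 * $node_type is NULL when the current page is not a node.
 */
function wants_breadcrumb($mode, $node_type, $context_types) {
  // array_search() gives the key of the match, or FALSE if there is none.
  $found = is_numeric(array_search($node_type, $context_types));
  return ($mode == 'taxonomy') || $found;
}

var_dump(wants_breadcrumb('node', 'story', $context_types));  // bool(true)
var_dump(wants_breadcrumb('node', 'forum', $context_types));  // bool(false)
var_dump(wants_breadcrumb('taxonomy', NULL, $context_types)); // bool(true)
```

A strict `in_array($node_type, $context_types, TRUE)` gives the same answer for these inputs, and some readers may find it easier to follow.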

Second bug: node breadcrumbs based on the wrong vocab

We've made a good start: previously, nodes had no breadcrumbs at all, and now they do have breadcrumbs (and they're based on taxonomy). But they don't reflect the right vocab! Remember what I said earlier about a single 'master' taxonomy hierarchy for your site, and about other taxonomies being 'secondary'? In our site, the master vocab is 'Sections'. However, the breadcrumbs for our node are reflecting 'News by priority', which is a secondary vocab. We need to find a way of telling Drupal on which vocab to base its breadcrumbs for nodes.

Once again, bring up the Drupal directory on your filesystem, and this time open the file /modules/taxonomy.module. Find the following code in taxonomy:

<?php
/**
 * Find all terms associated to the given node.
 */
function taxonomy_node_get_terms($nid, $key = 'tid') {
  static $terms;

  if (!isset($terms[$nid])) {
    $result = db_query('SELECT t.* FROM {term_data} t, {term_node} r WHERE r.tid = t.tid AND r.nid = %d ORDER BY weight, name', $nid);
    $terms[$nid] = array();
    while ($term = db_fetch_object($result)) {
      $terms[$nid][$term->$key] = $term;
    }
  }
  return $terms[$nid];
}
?>

And replace it with this code:

<?php
/**
 * Find all terms associated to the given node.
 * SQL patch made by x on xxxx-xx-xx, to sort taxonomies by vocab weight rather than by term weight
 */
function taxonomy_node_get_terms($nid, $key = 'tid') {
  static $terms;

  if (!isset($terms[$nid])) {
    $result = db_query('SELECT t.* FROM {term_data} t, {term_node} r, {vocabulary} v '.
    'WHERE r.tid = t.tid AND t.vid = v.vid AND r.nid = %d ORDER BY v.weight, v.name', $nid);
    $terms[$nid] = array();
    while ($term = db_fetch_object($result)) {
      $terms[$nid][$term->$key] = $term;
    }
  }
  return $terms[$nid];
}
?>

Drupal doesn't realise this, but it already knows which vocab is the master vocab. We specified it to be 'Sections' when we gave it a lighter weight than 'News by priority'. In my system, the rule is that the vocab with the lightest weight becomes the master one (and if two vocabs have the same weight, the one whose name comes first alphabetically wins). So all we had to do in this patch, was to tell Drupal how to find the master vocab, based on this rule.

This was done by changing the SQL, so that when Drupal looks for all terms associated to a particular node, it sorts those terms by putting the ones with a vocab of the lightest weight first. Previously, it sorted terms according to the weight of the actual term. The original version makes sense for nodes that have several terms in one vocabulary, and also for terms that have more than one parent; but it doesn't make sense for nodes that have terms in more than one vocabulary, and this is a key feature of Drupal that many sites utilise.

After you implement this patch (assuming that you followed the earlier instruction about making the 'sections' vocab of a lighter weight than the 'news by priority' vocab), you can rest assured that the breadcrumb trail will always be for the 'sections' vocab, with any node that is so classified. Your node should now look something like this:

Node with correct breadcrumbs (screenshot)

Note that this patch also changes the order in which a node's terms are printed out (sorted by vocab weight also).
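The effect of the changed ORDER BY clause can be sketched without a database. The rows below are made-up sample data shaped roughly like a join of term_data and vocabulary for our test node (the weight of 0 for 'News by priority' is an assumption), and the comparison functions `by_term()` and `by_vocab()` are hypothetical stand-ins for the two SQL orderings. The sketch also assumes, as the behaviour described above suggests, that the breadcrumb is built from whichever term comes back first.

```php
<?php
// Hypothetical rows for one node: each term plus the weight and name of its vocab.
$terms = array(
  array('name' => 'important', 'term_weight' => 0, 'vocab' => 'News by priority', 'vocab_weight' => 0),
  array('name' => 'news',      'term_weight' => 0, 'vocab' => 'Sections',         'vocab_weight' => -8),
);

// Stand-in for the original "ORDER BY weight, name" (term weight, then term name).
function by_term($a, $b) {
  if ($a['term_weight'] != $b['term_weight']) {
    return $a['term_weight'] < $b['term_weight'] ? -1 : 1;
  }
  return strcmp($a['name'], $b['name']);
}

// Stand-in for the patched "ORDER BY v.weight, v.name" (vocab weight, then vocab name).
function by_vocab($a, $b) {
  if ($a['vocab_weight'] != $b['vocab_weight']) {
    return $a['vocab_weight'] < $b['vocab_weight'] ? -1 : 1;
  }
  return strcmp($a['vocab'], $b['vocab']);
}

$original = $terms;
usort($original, 'by_term');
$patched = $terms;
usort($patched, 'by_vocab');

echo $original[0]['name'], "\n"; // important
echo $patched[0]['name'], "\n";  // news
```

With equal term weights, the original ordering falls back to the term name, so 'important' beats 'news' alphabetically and the secondary vocab wins. Sorting on the vocab weight puts the 'Sections' term first regardless of what the terms are called.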

Third bug: taxonomy breadcrumbs include the current term

While the previous bug only affected the breadcrumbs on node pages, this one only affects breadcrumbs on taxonomy term pages. Try viewing a node: you will see that the breadcrumb trail includes the parent terms of that page, but that the current page itself is not included. This is how it should be: you don't want the current page at the end of the breadcrumb, because you can determine the current page by looking at the title! And also, each part of the breadcrumb trail is a link, so if the current page is part of the trail, then every page on your site has a link to itself (very unprofessional).

If you view a taxonomy term, you will see that the term you are looking at is part of the breadcrumb trail for that page. To fix this final bug (for part 1 of this series), bring up your Drupal directory again, open /modules/taxonomy_context.module, and find the following code in taxonomy_context:

<?php
/**
 * Return the breadcrumb of taxonomy terms ending with $tid
 */
function taxonomy_context_get_breadcrumb($tid) {
  $breadcrumb[] = l(t("Home"), "");

  if (module_exist("vocabulary_list")) {
    $vid = taxonomy_context_get_term_vocab($tid);
    $vocab = taxonomy_get_vocabulary($vid);
    $breadcrumb[] = l($vocab->name, "taxonomy/page/vocab/$vid");
  }

  if ($tid) {
    $parents = taxonomy_get_parents_all($tid);
    if ($parents) {
      $parents = array_reverse($parents);
      foreach ($parents as $p) {
        $breadcrumb[] = l($p->name, "taxonomy/term/$p->tid");
      }
    }
  }
  return $breadcrumb;
}
?>

Now replace it with this code:

<?php
/**
 * Return the breadcrumb of taxonomy terms ending with $tid
 * Patched to display the current term only for nodes, not for terms
 * Patch done by x on xxxx-xx-xx
 */
function taxonomy_context_get_breadcrumb($tid, $mode) {
  $breadcrumb[] = l(t("Home"), "");

  if (module_exist("vocabulary_list")) {
    $vid = taxonomy_context_get_term_vocab($tid);
    $vocab = taxonomy_get_vocabulary($vid);
    $breadcrumb[] = l($vocab->name, "taxonomy/page/vocab/$vid");
  }

  if ($tid) {
    $parents = taxonomy_get_parents_all($tid);
    if ($parents) {
      $parents = array_reverse($parents);
      foreach ($parents as $p) {
        // The line below implements the breadcrumb patch
        if ($mode != "taxonomy" || $p->tid != $tid)
          $breadcrumb[] = l($p->name, "taxonomy/term/$p->tid");
      }
    }
  }
  return $breadcrumb;
}
?>

The logic in the if statement that we've added does two things to fix up this bug: if we're not looking at a taxonomy term (and are therefore looking at a node), then always display the current term in the breadcrumb (thus leaving the already perfect breadcrumb system for nodes untouched); and if we're looking at a taxonomy term, and the breadcrumb we're about to print is a link to the current term, then don't print it. Note that this patch will only work if you've moved your taxonomy_context.module file, as explained earlier (it's really weird, I know, but if you leave the file in its subfolder, then this patch has no effect whatsoever - and I have no idea why).
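If the condition is hard to picture, this standalone sketch runs the same test on sample data. The function `sketch_breadcrumb()` is a hypothetical stand-in for the loop in taxonomy_context_get_breadcrumb(): it takes the term's ancestry as a plain array (root first, ending with the term itself) and returns names rather than links.

```php
<?php
/**
 * Hypothetical stand-in for the patched loop.
 * $parents: ancestry of term $tid, root first, ending with the term itself.
 * $mode: first URL argument, e.g. "node" or "taxonomy".
 */
function sketch_breadcrumb($parents, $tid, $mode) {
  $breadcrumb = array('Home');
  foreach ($parents as $p) {
    // Skip the current term only when we are on a taxonomy page.
    if ($mode != 'taxonomy' || $p['tid'] != $tid) {
      $breadcrumb[] = $p['name'];
    }
  }
  return $breadcrumb;
}

// Sample ancestry for the term 'news' (the tids 1 and 2 are made up).
$parents = array(
  array('tid' => 1, 'name' => 'posts'),
  array('tid' => 2, 'name' => 'news'),
);

echo implode(' -> ', sketch_breadcrumb($parents, 2, 'node')), "\n";     // Home -> posts -> news
echo implode(' -> ', sketch_breadcrumb($parents, 2, 'taxonomy')), "\n"; // Home -> posts
```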

After implementing this last patch, your taxonomy pages should now look something like this:

Taxonomy page with correct breadcrumb (screenshot)

That's all (for now)

Congratulations! If you've implemented everything in this tutorial, then you've now created a Drupal-powered web site that produces super-cool breadcrumbs based on a taxonomy hierarchy. Next time you're at a party, and are making endeavours with someone of the opposite gender, try that line on them (and let me know just how badly it went down). If you haven't implemented anything, then I can't help but call you a tad bit lazy: but hey, at least you read it all!

If you've been wondering where you can get a proper patch file with which to try this stuff out, you'll find two of them at the bottom of the page. See the Drupal handbook entry on using patch files if you've never used Drupal patches before. The patch code is identical to the code cited in this text: as with the cited code, the diff was performed against a vanilla 4.5.2 source tree. Also at the bottom of the page, you can download the entire code for taxonomy.module and taxonomy_context.module: you can put these files straight in your Drupal /modules folder, and all you have to do then is rename them.

Armed with the knowledge that you now have, you can hopefully utilise the power of taxonomy quite a bit better than you could before. But this is only the beginning.

Continue on to part 2, where we get our hands (and our Drupal code base) really dirty by implementing a cross-vocabulary hierarchy system, allowing one taxonomy term to be a child of another term in a different vocabulary, and hence producing (among other things) even sweeter breadcrumbs!

25 comments

moshe weitzman

fantastic writeup. i agree that we ought to show breadcrumbs on nodes. as a way to get there, i suggest we fix up taxonomy_context so it does what we want without patching. once we agree on that, then we push a patch into core.

of course, i have a few comments on the text

Jaza

I agree that taxonomy_context needs to be fixed up so that it implements this patch. I just wasn't sure if this was a patch that everyone wanted, or if it was considered more of a fork. After all, I'm sure many users don't want breadcrumbs showing up on their nodes, because they implement a more complex taxonomy structure, e.g. where a term has more than one parent term. In such cases, breadcrumbs are not only inappropriate but also impossible to implement.

If a patch does happen, I think that ultimately it should happen on the core taxonomy module itself - the breadcrumb functionality should be moved from taxonomy_context to the core module. But I know that Drupal core changes are not the easiest things in the world to get through, so I'm not hanging on my seat for that to happen.

Didn't realise that the _init() hook was part of the bootstrap mechanism - this is indeed quite inefficient. But hey, that's how I got it! I just took it from there.

And last but not least... thanks for the praise! I'm flattered.

Nedjo

I've applied the patch to taxonomy_context.module, thanks for the improvements.

Jaza

Attention all people who are doing this tutorial:

Now that taxonomy_context.module has been patched to implement the stuff in this article (see previous comment), you no longer need to do steps 1 and 3. Step 2 is still useful, if you want to specify which of your vocabs is used to define your site's master hierarchy (step 2 patches taxonomy.module, which has not changed for 4.5.x since time of writing).

Hope that this avoids any confusion in reading this article, and in trying to follow the instructions that it gives. The content of this article will not be changed, as it is a static document and should be kept for historical purposes. Also, although some of the code samples are now obsolete, the text itself is still both interesting and relevant for Drupal users who want to understand the taxonomy system better.

Ramdak

Great development! I am happy that your hard work has become useful.

Where can I get the updated taxonomy_context module? I didn't find it on the Drupal downloads page.

Just want to clarify a couple of things:

  1. Is it correct that hierarchical aliases are still available only through your patch?
  2. And, index pages for vocabularies still need to be done?

Ramdak

Nate

You'll have to get the module from CVS; it's not the "official" version yet.

Ramdak

Many thanks.

Ramdak

With the new patch, does the taxonomy_context module still have to be moved outside the /modules/taxonomy_context folder or can it stay there?

Anonymous

Same question... should tax_context be in the primary modules directory, rather than in its own folder?

Marco Stolpe

Hi, I'm new to Drupal (using 4.6 already) and so far it's the system which comes very close to a system I had in mind to implement myself (big project, I know). Maybe I can spare my time and just use Drupal applying some fixes. I think its taxonomy system is really great and I have to agree entirely with you that the breadcrumbs shown aren't very nice - nevertheless I have a slightly different approach to so called "taxonomies of terms".

Though not a native English speaker, but a German, what I recall from a course about information systems at my university is that there are some important differences between the terms "category", "classification", "taxonomy", "vocabulary" and "term". I used to confuse them too, but as far as I understand them today I'd say that "taxonomy" simply means hierarchical classification. It has become a real buzzword, but in fact it has a very broad meaning and can be applied to almost all sorts of hierarchical classification systems. So let's note that the word "taxonomy" has such a broad meaning that it won't lead to any insights.

What is much more important though is what you're trying to classify and by what attributes. If one wants to classify animals, for example, one may classify them according to their color, the number of their legs, their size, and so on. If one wants to classify buildings one may classify them according to their size, the time they were built, their style, and so on. Now the different adjectives corresponding with a certain attribute define - in a way - a vocabulary, consisting of proper terms to describe things.

Once you have found the attributes (vocabularies) with which you can describe your objects in a good way, you can go and tag them with terms of that vocabulary.

So far, we're still in good old "Excel" or relational database tables world: one row in a database table most often describes a certain entity according to its attributes. The possible values of such attributes are - if you want to - terms.

What relational databases don't recognize is that there exist certain relationships between terms. Relational databases allow for relationships between entities, but not between the terms (values) describing them. In fact, sometimes it would be desirable to create taxonomies of terms - hierarchical classifications - to search not only for entities having exactly that attribute value, but - for example - a parent term. What I mean is: the values in relational databases are flat. For example, having tagged your content by type, it is difficult to formulate queries like "give me all entities which are of type document", if the content is tagged with more specific terms like "book", "article" or "master thesis". The database won't recognize the parent-child relationship between them, that book, article and master thesis are documents. So taxonomies of terms allow for more sophisticated queries.
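The "give me all entities which are of type document" query can be sketched in a few lines of plain PHP. Everything here is illustrative: the term hierarchy, the tagged titles, and the function names `is_a_kind_of()` and `find_by_term()` are assumptions made up for this example, not anything Drupal provides.

```php
<?php
// Hypothetical term hierarchy: term => parent term (NULL for a top-level term).
$parent_of = array(
  'book'          => 'document',
  'article'       => 'document',
  'master thesis' => 'document',
  'document'      => NULL,
  'photo'         => NULL,
);

// Hypothetical content, each item tagged with one specific term.
$tagged = array(
  'Drupal handbook' => 'book',
  'Release notes'   => 'article',
  'Holiday snap'    => 'photo',
);

/**
 * TRUE if $term is $ancestor itself or sits somewhere below it in the hierarchy.
 */
function is_a_kind_of($term, $ancestor, $parent_of) {
  while ($term !== NULL) {
    if ($term === $ancestor) {
      return TRUE;
    }
    $term = isset($parent_of[$term]) ? $parent_of[$term] : NULL;
  }
  return FALSE;
}

/**
 * Return the titles whose term is $ancestor or one of its descendants.
 */
function find_by_term($ancestor, $tagged, $parent_of) {
  $matches = array();
  foreach ($tagged as $title => $term) {
    if (is_a_kind_of($term, $ancestor, $parent_of)) {
      $matches[] = $title;
    }
  }
  return $matches;
}

print_r(find_by_term('document', $tagged, $parent_of)); // Drupal handbook, Release notes
```

A flat lookup on the value "document" would find nothing here, because no item carries that exact tag; the hierarchy is what makes the broader query answerable.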

What is most important now (and the main point of my comment - ups, I've written too much, I fear) is that terms aren't categories. Terms are just that: terms, words, values, at most phrases. Foremost terms are used to describe things in different ways, but alone they're not of much use. For example, searching for all articles written in English might return a whole bunch of irrelevant content. Using a categorization according to type alone might return your whole website once the user searches for a term high above in the hierarchy like "object" or "content". Therefore, a single master vocabulary doesn't make much sense. Once the user clicks on "topics", he should get back all nodes which are a topic. At least according to my interpretation of vocabularies. Vocabularies all have the same importance.

In the same way, categories aren't terms. As I said, the term "flower" means the class of all flowers (on earth) and you won't be able to return all flowers on earth on your web site. The only way to cope with a whole bunch of content nodes is to categorize them (according to your or your users' liking), that's true, but there's a difference between categorizing terms and categorizing entities/content. When creating categories of content, the terms you have defined in your vocabularies can be used, but they are not categories themselves. Instead they describe classes of content != categories, which are rather sets of objects described by predicates, attributes - well - terms of vocabularies.

What one needs to create categories are mathematical set operators like union, intersection and difference. Drupal provides union (+) and intersection (,) and with them one can already build some nice categories.

That terms are not categories is also easy to see when you take a look at the glossary module for Drupal. The glossary module treats terms exactly the way I described them: as words. IMHO, that's the correct way to treat terms: the description of a vocabulary term should contain a definition for that term, like terms organized in a thesaurus. If you define the term "animal", the description would probably read like a scientific definition in an encyclopedia. That's normally not what users want to see on your website during normal browsing activities, and in that sense, the taxonomy_context module is somewhat stupid. Users want to see such things only when they're clicking on the term or browsing the glossary for your website, but a scientific definition doesn't belong in the header of a category page.

In comparison, the description of a category might rather read like "An overview of all the bands in the late nineties which were in the top ten" or something like that. Now, go and find a single term for that, a term you'd normally find in a thesaurus or glossary. It's impossible!

I'd use the taxonomy system in a different way, therefore. In Drupal, you can define one or more hierarchical menu structures. Use those as your master categorization(s), linking to a taxonomy term url which best describes the content in that category according to the terms you have defined before. Think of terms like keywords.

For example, when you're categorizing your content according to topic, time and geographical location and you want to create a category of all articles about Drupal published during the last month when you were in New Zealand, you can link to taxonomy/term/1,47,3 from your menu and give your menu the corresponding hierarchical structure. (If the term "Drupal" is term 1, "March" term 47 and New Zealand term 3.)
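As a rough sketch of what such a term path selects, the following plain PHP treats "," as intersection and "+" as union, as described above. The node data and the function names `parse_term_spec()` and `select_nodes()` are invented for this illustration; this is not how Drupal's taxonomy module is written, and mixing both operators in one path is not handled.

```php
<?php
// Hypothetical map of node title => term IDs (1 = Drupal, 47 = March, 3 = New Zealand).
$node_terms = array(
  'Drupal in NZ' => array(1, 47, 3),
  'Drupal tips'  => array(1),
  'March hike'   => array(47, 3),
);

/**
 * Split a term spec such as "1,47,3" or "1+47" into an operator and a list of IDs.
 */
function parse_term_spec($spec) {
  if (strpos($spec, '+') !== FALSE) {
    return array('op' => 'or', 'tids' => array_map('intval', explode('+', $spec)));
  }
  return array('op' => 'and', 'tids' => array_map('intval', explode(',', $spec)));
}

/**
 * Return the titles of nodes matching the spec: all terms for ",", any term for "+".
 */
function select_nodes($spec, $node_terms) {
  $query = parse_term_spec($spec);
  $matches = array();
  foreach ($node_terms as $title => $tids) {
    $shared = count(array_intersect($query['tids'], $tids));
    $wanted = ($query['op'] == 'and') ? count($query['tids']) : 1;
    if ($shared >= $wanted) {
      $matches[] = $title;
    }
  }
  return $matches;
}

print_r(select_nodes('1,47,3', $node_terms)); // Drupal in NZ
print_r(select_nodes('1+47', $node_terms));   // Drupal in NZ, Drupal tips, March hike
```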

Although this might seem cumbersome at first, over the long run it's a much more flexible system.

Interestingly enough, by using a menu for navigation instead of taxonomy_menu all breadcrumbs are suddenly shown correctly, at least on taxonomy pages.

(Only the titles are ugly and the breadcrumbs on node pages aren't shown correctly: they should both use the current menu as title/breadcrumb, in my opinion. I'll try to fix that in the next few days.)

I don't know what the developers had originally intended. I guess it all depends on how much content one has to manage. As long as it all fits into an easy scheme, your system allows for more automatic and less cumbersome categorization than mine. But otherwise, the set operators + and , are more flexible IMHO and must be good for something at least, mustn't they?

Luckily, Drupal can be easily extended and changed to fit the needs of many people and that is what looks most promising to me. (Nevertheless, I'm somewhat wondering about having a version 4.5, which to some degree looks buggier than some 0.3 version of other software I've seen, but that's another topic.)

If you want to know more about information management, categories, taxonomies, thesauri, etc., there's a book I liked a lot: "Information Architecture" by Louis Rosenfeld & Peter Morville, O'Reilly, 2nd ed. 2002

Jaza

Marco, I can see that you have extensive knowledge of taxonomy as a mathematical and information-architecture concept. Looks like you know far more about this than me, or than most other Drupal webmasters! You're right, classifying content with terms is far more than simply putting content into 'flat' categories. As you said, only when you use set-theory operators such as union, intersect, and difference - to aggregate your terms into specific combinations - are you really managing your information in the best possible way.

Drupal's taxonomy system has been designed, from the ground up, to support information management in the most powerful way that it can. Using Drupal, you can classify your content according to multiple adjectives (terms) of multiple attributes (vocabularies), as you explained. And you can use the '+' and ',' operators to find content that matches a particular combination of these adjectives. This is far better than many content management systems, which let you do little more than put your content in 'folders', meaning that all you have is a flat hierarchy, rather than a complex classification defined by many terms and vocabularies.

However, surely you can see that all of this is just too complex for the average user? If all you're trying to do is make a simple website such as a corporate e-brochure, or even an average blog, then it's most likely that all your site needs is a basic hierarchical navigation system. And this is where the taxonomy system disappoints me: it may be perfect for advanced information architects such as yourself; but it has proven to be too daunting for users that simply want to make a basic hierarchy using a bunch of terms.

So what I'm trying to do - with the 'master hierarchy' concept that I introduce in this article - is to combine the simplicity of a single site hierarchy, with the power and complexity of taxonomic classification. I have made the utmost effort to preserve the potential of the taxonomy system in my approach, because I believe that it's one of the key strengths of Drupal. I hope that by imposing a 'master hierarchy' on a taxonomy, I'm not 'dumbing down' the whole system.

Basically, my theory is that sites should employ multiple vocabularies to classify their content; but that there should be one vocabulary that is more important than all the others. One vocab to rule them all, I guess you could say :-). This vocab is then used to define the breadcrumbs, menus, and other navigational elements that are used in the site. This makes life simpler for the majority of visitors to the site, since most people are accustomed to everything on a site falling under a single hierarchical tree.

My approach would probably not be so suitable for large sites with thousands of individual nodes of content. In such a situation, navigation based on union / intersect / difference operations would probably be more appropriate than hierarchical 'browsing'. But for sites such as this one, with a relatively small volume of content, I think that something simpler is needed.

I hope I have covered most of what you wrote, Marco: your comment was quite long! But don't feel bad about it, there's no word limit here at GreenAsh... you just use as many words as you need to get your opinion across.

Marco Stolpe

Hi, thank you very much for your answer.

Jaza: However, surely you can see that all of this is just too complex for the average user? [...] And this is where the taxonomy system disappoints me

Yes, I have to agree with you. Thesauri and vocabularies confuse me anyhow, because I'm not a librarian myself, and they just are that complex. That's because human language is complex and ambiguous, and so are the terms we use to categorize and classify content. The authors of the book I've mentioned even have the idea that in the information age, every webmaster has to become, to some extent, a librarian - it's an art. By the way, I wouldn't consider myself to be a good information architect; I've not managed a larger web site so far and only have some theoretical background knowledge.

But exactly that led me to stumble across some phrases you used in your article, like "I show you how to patch up some of taxonomy's bugs" or - in your recent reply - "the taxonomy system disappoints me". From a theoretical viewpoint, I'm not sure if the combination of menus and the taxonomy system as I've described it is buggy at all. Maybe it's just what the developers had intended and it's a feature that allows for a much more flexible way to organize content than most other CMS I've seen so far. It almost reminds me of some technologies invented in the Semantic Web effort, like topic maps and RDF, the resource description framework. Maybe it's just a new way of doing things and one has to shift one's perspective a bit.

When I read your article, I got the impression that you wanted to point out that the developers did something wrong. They probably did not; it's rather that you're trying to use taxonomies of terms in a way they were not intended for (from a theoretical viewpoint). In that sense, by bending the taxonomy system to fit your own needs better, you didn't correct a bug, but wrote a hack. ;-)

That's okay, of course. No one needs to use taxonomy_context if he doesn't want to. I just wanted to point out that there exists another way of doing things - well, maybe a better way - when one has to manage a large amount of content.

Jaza: I hope that by imposing a 'master hierarchy' on a taxonomy, I'm not 'dumbing down' the whole system.

To be honest, that was the first impression I got. :-)

But of course, I don't know what the developers exactly had in mind when they invented the system. The problem I see with Drupal is that some parts of it seem counterintuitive to me regarding taxonomies and that they're not well integrated at all. I think the lousiest thing in Drupal is its search engine. Why can't I use boolean operators like in almost all other search engines in use today? (Especially since MySQL already has that feature.) Such searches are not too complicated even for the average user, as Google shows. For professional users, why can't they do a power search by restricting search results based on taxonomy terms? As the authors of "Information Architecture" explain, taxonomies of terms are mostly used to guide searches anyhow. For example, when someone searches for "cars", the search engine can guide him and offer to search for broader and narrower terms, similar terms, and it can recognize synonyms. That would allow users to find their way through the jungle of information on a site much more easily.

Thinking about that, now I could even fully agree with you, and say that it would make more sense to organize everything under a single hierarchy of categories, but use the vocabulary system in the background to guide users' searches.

In this way, one could then combine the simplicity of a single hierarchy with the power of a highly sophisticated search engine, allowing users to list content in all the possible ways which can be imagined if it's really needed.

Ramdak

And, to think that both editions of Lou Rosenfeld and Peter Morville's classic were right in front of me all these days!

Kenny

Very interesting discussion (if somewhat over my head!). I am just looking at Drupal once again after becoming aware of it a year ago. At the time I left it because I was just overwhelmed. I would like to thank Jaza for the well thought out and written documentation and the examples that he has contributed, which have convinced me to give it another go.

I am very interested in the flexibility of the taxonomy approach to create a site where articles can be tagged by activity type (trekking, skiing..), country (england, italy..) and category (shopping, lifestyle..)

This is how I think my taxonomy should be..

-Activity
--Trekking
--Skiing

-Country
--England
--Italy

-Category
--Shopping
--Lifestyle

But I am also wondering if it would be better to use subcategories such as..

-Category
--Shopping
---Trekking
---Skiing
--Lifestyle
---Trekking
---Skiing

Ideally I would like to create main sections in my site (think guardian.co.uk) which would group together all relevant stories - which are themselves tagged by the specific activity they refer to. Drupal seems to be the best tool to approach this design with.

Martin

First of all, I enjoyed reading the thread, and even though I am working with the current Drupal 4.6.2, I got something out of it. To understand how taxonomy works a bit better, I decided to go through the exercise, realizing I do not need to patch. Patch 2 is certainly included in my distro. However, the output was not as I expected, or rather as it must (?) have been before the patch. Please excuse me if this is the wrong location for this post:

My setup (on a fresh drupal install database and all just for the purpose of this test)

Categories:

Sections Vocab (Type Page, Story, Single, Weight -8)
  Posts (weight 0, parent root)
  -- News (weight 0, parent posts)

News by prio vocab (Type: Page, Single, Weight 1)
  Browse by Prio (parent root, weight 0)
  -- Important (parent browse by prio, weight 0)

Content:

"First News", (type story)
Sections: -News
News by Priority: -Important

Browsing that page results in a breadcrumb

Home > Browse by Prio > Important

Clicking 'Browse by Prio' tells me correctly "there are currently no posts in this category"

I would have expected Home > News > Posts OR Home > Posts.

Again, I verified taxonomy.module contains the select...where statement.

What surprised me is that the statement does not reference the vocabulary table - after all, that is where the prime vocab should be selected from.

Flabbergasted.

Martin

Brendan K

Hello,

First, thank you for this article. I am already feeling a lot more friendly towards breadcrumbs. And your walk-through above was very helpful. Much appreciated!

I have a question:

Does anyone know how to make it so that "Home" is just a static page? Not a super category for your entire site?

for instance, I would like to have the breadcrumbs for my news section appear as:

News >> Cool Article

as opposed to

Home >> News >> Cool Article

I feel that having everything as a sub of "home" is just not good design.

Any suggestions on how to remedy this would be great. And, thanks to everyone who has contributed to this thread and discussions like this as it is very helpful for newbies like myself.

-Brendan

Jaza

Hi Martin,

The only thing I can think of is that you haven't set the weights correctly for your vocabs. Check that your 'sections' vocab has a lighter weight than your 'news by priority' vocab. Having the weights set differently to this will result in exactly the problem you're having.
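
The weight rule can be modelled roughly as follows. This is a Python sketch of the behaviour described above, not the actual code or SQL of the patch: it assumes the breadcrumb is taken from the node's term in the lightest-weighted vocabulary and then built by walking up that term's parents. The vocabulary and term names are the ones from Martin's test setup; the function name `breadcrumb` is made up.

```python
# Vocabulary weights from Martin's setup: the lighter vocab should win.
vocab_weight = {"Sections": -8, "News by prio": 1}

# term -> (vocabulary, parent term or None for a root term)
terms = {
    "Posts": ("Sections", None),
    "News": ("Sections", "Posts"),
    "Browse by Prio": ("News by prio", None),
    "Important": ("News by prio", "Browse by Prio"),
}

def breadcrumb(node_terms):
    """Return the trail for the node's term in the lightest vocabulary."""
    # Pick the term whose vocabulary has the lowest weight.
    primary = min(node_terms, key=lambda t: vocab_weight[terms[t][0]])
    trail = []
    current = primary
    while current is not None:  # walk up to the root term
        trail.append(current)
        current = terms[current][1]
    return ["Home"] + trail[::-1]

print(" > ".join(breadcrumb(["News", "Important"])))  # Home > Posts > News
```

If the two weights were swapped, the same function would return the trail through "Browse by Prio" instead, which is the symptom described in the question.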

Martin

Hello Jeremy,

OK, the problem is solved. There was something I misread.

The story: The weights had been set correctly. After reversing the weights of the vocabs (-8/1 to 1/4) nothing changed. Going through the modules again, I realized the second step of patching taxonomy.module was not part of the current 4.6.2 Drupal. Applying that patch, I am now able to choose which vocab is chosen in the breadcrumb by modifying their weight. Therefore, and that was not clear for me, it is crucial to apply the patch in step2 of this docu to get the functionality.

For the record here are the file versions used:

taxonomy_context.module,v 1.36.2.1 2005/06/16 02:48:42
taxonomy.module,v 1.192.2.5 2005/05/31 21:13:40

Thanks!

Martin

JohnG

I'm going to see if I can get my hands on that book next ...

I spent a lot of time trying to make the undoubtedly powerful taxonomy.module actually useful in helping users navigate reams of information. It really is an intriguing module and I think that Jaza's hack is a brilliant way to rein in some of that power.

I also came to the same conclusion as Marco about the search engine - which can be so much more subtle and powerful than even the most complex taxonomy. I was very surprised that the search.module has no truck with taxonomy.module.

Thankfully I recently found nedjo's SQL search (trip_search) module which apparently has been lurking around the contributed projects for several iterations. It uses the (boolean & phrase) search features native to MySQL and then adds filtering by taxonomy terms and node types. Why it has not been included as the drupal core search module mystifies me!

The biggest problem now is getting writers and word-searchers to use the same vocabulary ... :)

One last point: I read somewhere that taxonomy.module evolved from a keywords.module (many moons ago). IMO it still has the feel of an uncomfortable fusion of keywords and a file structure. An interesting experiment, but I can't figure out how to actually make it useful! Jaza's patch (erm, hack) focuses it toward the file structure because this is badly needed when using Drupal as a CMS.

If anyone could point out some successful alternative uses of the taxonomy system I would be very interested.

JohnG

Superficially - which is about the best I can do! - nodes with multiple parents could display multiple breadcrumbs. E.g. amazon.com - at the bottom of the page ... yes, I know it looks messy, but I do find it quite helpful :)
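
A minimal sketch of that idea, in Python and with invented term names: if a term may have several parents, every distinct path from a top-level term down to it can be listed as its own breadcrumb trail. The `parents` mapping and the `trails` function are hypothetical, and the sketch assumes the hierarchy contains no cycles.

```python
# Hypothetical data: term -> list of parent terms (empty for a top-level term).
parents = {
    "Books": [],
    "Music": [],
    "Jazz": ["Music"],
    "Biographies": ["Books"],
    "Jazz Biographies": ["Jazz", "Biographies"],
}

def trails(term):
    """Return every root-to-term path as a list of term names."""
    if not parents[term]:
        return [[term]]
    result = []
    for parent in parents[term]:
        # Extend each trail that leads to the parent with the current term.
        for trail in trails(parent):
            result.append(trail + [term])
    return result

for trail in trails("Jazz Biographies"):
    print(" > ".join(["Home"] + trail))
```

A term with two parents yields two trails here, one through "Music" and one through "Books", which could then be rendered one under the other.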

dtabach

Therefore, and that was not clear for me, it is crucial to apply the patch in step2 of this docu to get the functionality

I applied the second patch to taxonomy.module 4.6.3, and get the following error message:

Parse error: parse error, unexpected T_VARIABLE in /home/dtabach/public_html/modules/taxonomy.module on line 432

I'm using the latest taxonomy_context.module from drupal's downloads, not modified.

Ken Collins

One of the things I was hoping to learn, and may have missed, is to find out how to make any of the three top-level terms in my blog category show all the posts when clicked in the breadcrumb. For instance, making taxonomy/term/2 really work like taxonomy/term/2/all. Any ideas? TIA
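
The difference between the two URLs can be pictured with a small Python sketch, assuming that the `/all` variant means "this term plus every term beneath it". The term ids, node ids and function names below are invented for illustration and are not Drupal's API.

```python
# Hypothetical data: term id -> child term ids, and term id -> node ids
# tagged directly with that term. Term 2 is a top-level term with no posts
# of its own.
children = {2: [5, 6], 5: [9], 6: [], 9: []}
nodes_by_term = {2: [], 5: [101], 6: [102, 103], 9: [104]}

def descendants(tid):
    """Return tid plus every term below it in the hierarchy."""
    found = [tid]
    for child in children.get(tid, []):
        found.extend(descendants(child))
    return found

def nodes_for_term(tid, include_descendants=True):
    """List node ids for a term, optionally including all of its sub-terms."""
    tids = descendants(tid) if include_descendants else [tid]
    return sorted({nid for t in tids for nid in nodes_by_term.get(t, [])})
```

Here `nodes_for_term(2, include_descendants=False)` is empty because nothing is tagged with the top-level term itself, while `nodes_for_term(2)` gathers the posts from all of its sub-terms.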

akeimou

in our case, there are at least three types of drupal users:

  1. the people who set up and configure the system (administrator, power user)
  2. the people who use the configured system to provide content
  3. the people who consume the content

our type2 users are such that with a properly configured taxonomy, they will find it easier to select attribute values while creating a page than what they are doing right now because they are not able to relate well to concepts of child and parent pages. for this aspect alone i find that drupal taxonomy will be simpler and, therefore, useful.

the more complicated part of taxonomies, well, that will have to be something that type1 user (e.g., me) should be able to deal with. i'm a drupal newbie and wouldn't mind going advanced. and modeling and organizing info i've always found challenging and yet exhilarating. so thanks to Jaza's tutorials and others, i think this is even going to be fun.

and then, yes, there's the SQL Search (the new Trip Search) which will allow type3 users to search not only using the usual expressions (cumbersome for most), but also by taxonomy terms (might be easier this way), and with MySQL full text search at that. cool!

regarding relationships between terms. we're looking at multiple hierarchies for our application. say (very hypothetical), we had people categorized by the music they listen to and the alcohol brand they drink. now, it would be nice if there's a mapping module for taxonomy that will show the music distribution for Remy Martin drinkers. and i wonder which music genre takes the biggest piece.
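
such a mapping boils down to a cross-tabulation between two vocabularies. a minimal python sketch, with made-up people and terms (no such drupal module is implied), could look like this:

```python
from collections import Counter

# Hypothetical records: each person carries one term from each vocabulary.
people = [
    {"name": "p1", "music": "jazz", "drink": "Remy Martin"},
    {"name": "p2", "music": "jazz", "drink": "Remy Martin"},
    {"name": "p3", "music": "blues", "drink": "Remy Martin"},
    {"name": "p4", "music": "rock", "drink": "other brand"},
]

def music_distribution(records, drink):
    """Count music terms among the people tagged with the given drink term."""
    counts = Counter(r["music"] for r in records if r["drink"] == drink)
    # most_common() sorts by count, largest first.
    return counts.most_common()

print(music_distribution(people, "Remy Martin"))  # [('jazz', 2), ('blues', 1)]
```

the first entry of the result is the genre that "takes the biggest piece" for that brand.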

even if i'll need a week to learn taxonomy, i'll take it. that's nothing compared to creating that kind of web-based database app from scratch.

Anonymous

And if so, which versions of it?

Manuel

For this purpose I use the taxonomy_force_all module