Tagging

In Hugo, content can be tagged with keywords so that users can find content on specific topics. Hugo calls the underlying mechanism taxonomies. They allow for systematic keywording that is more expressive than simple hashtags.

To do this, a classification system must first be defined. Content can then be classified by listing class names (keywords) in the front matter of the corresponding markup files. Below, we first work through an example and then look at how the classified content can be used.

Example: Classification of Movies

The best way to understand how such a classification system is structured is to look at a specific example. Let’s assume we are dealing with a website that is structured similarly to IMDb.

Information on movies is presented on individual pages. These movies can be grouped according to various criteria, such as genre, actors, year of release or rating. Such criteria are mapped as keys in the classification system.

The system is defined in the main configuration and may look like this:

taxonomies:
  genre: 'genres'
  actor: 'actors'
  year: 'years'
  rating: 'ratings'

The definition follows a convention: keys are given in the singular, values in the plural. The plural form is the one that matters in practice: it is used as the key in the front matter of the markup files and appears in the URLs of the overview pages.

Content can now be classified accordingly. For example, the individual page for the movie The Terminator (1984) could contain the following front matter:

---
genres:
  - action
  - science-fiction
actors:
  - 'Arnold Schwarzenegger'
  - 'Linda Hamilton'
years:
  - 1984
ratings:
  - 5
---

The entries listed are called terms of the respective taxonomy. The individual pages to which a term applies are regarded as its values. For example, the Terminator page is a value of the term Arnold Schwarzenegger in the taxonomy actors.
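
To make the relation between pages and terms concrete, here is a minimal sketch of how a single page could link to its own terms. It assumes that the terms were assigned under the plural taxonomy name genres and that the excerpt lives in a single-page layout; the file name is only an illustration.

```go-html-template
{{/* Excerpt for a single-page layout, e.g. layouts/_default/single.html (assumed path) */}}
{{/* .GetTerms returns the term pages of the given taxonomy for the current page */}}
{{ with .GetTerms "genres" }}
  <ul>
    {{ range . }}
      <li><a href="{{ .RelPermalink }}">{{ .LinkTitle }}</a></li>
    {{ end }}
  </ul>
{{ end }}
```

Each link points to the term page of the respective genre, which in turn lists all values of that term.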

Overview Pages

Content can therefore be classified according to several criteria or on several axes, which enables two types of overview pages:

  • Taxonomy Pages list all terms of a specific taxonomy used on the website. They have the page kind taxonomy and are available at domain.com/<TAXONOMY>/.
  • Term Pages list all content that is linked to a specific term of a taxonomy. They have the page kind term and are available at domain.com/<TAXONOMY>/<TERM>/.

As with sections, these pages are generated automatically. Since the plural form is used in the URL, our classification system yields, for example, domain.com/genres/ and domain.com/actors/ as taxonomy pages, and domain.com/genres/action/ and domain.com/actors/arnold-schwarzenegger/ as term pages.

To design the overview pages, you define corresponding layouts. These layouts can access metadata that is specific to a taxonomy or term. The metadata is stored in the front matter of index pages, which are created under /content/<TAXONOMY>/_index.md and /content/<TAXONOMY>/<TERM>/_index.md.
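
A layout for a taxonomy page might look like the following sketch. It assumes a base template that defines a block named main, and the file path is an assumption that depends on the Hugo version and the template lookup order in use.

```go-html-template
{{/* Assumed path: layouts/_default/taxonomy.html */}}
{{ define "main" }}
  <h1>{{ .Title }}</h1>
  {{/* Content of the taxonomy's _index.md, if one exists */}}
  {{ .Content }}
  <ul>
    {{/* On a taxonomy page, .Pages holds the term pages */}}
    {{ range .Pages }}
      <li><a href="{{ .RelPermalink }}">{{ .LinkTitle }}</a></li>
    {{ end }}
  </ul>
{{ end }}
```

The same body could serve as a term layout: on a term page, .Pages holds the content pages classified with that term.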

The official documentation gives an example of how to use this feature. On the index page for specific actors we could add the URL to their Wikipedia article next to their name:

---
title: Bruce Willis
wikipedia: https://en.wikipedia.org/wiki/Bruce_Willis
---

This would allow the actor overview page to display a list of actors and actresses that also provides links to the corresponding Wikipedia articles. The use of such metadata remains flexible and can be adapted to individual use cases.
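
As a sketch, the list of actors with optional Wikipedia links could be rendered as follows. The excerpt is assumed to sit in the layout of the actors taxonomy page; the parameter name wikipedia is taken from the front matter shown above.

```go-html-template
{{/* Excerpt for the taxonomy layout of actors: .Pages holds the term pages */}}
<ul>
  {{ range .Pages }}
    <li>
      <a href="{{ .RelPermalink }}">{{ .LinkTitle }}</a>
      {{/* Only rendered if the term's _index.md sets the wikipedia parameter */}}
      {{ with .Params.wikipedia }}
        (<a href="{{ . }}">Wikipedia</a>)
      {{ end }}
    </li>
  {{ end }}
</ul>
```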

Default Taxonomies

The classification system does not necessarily have to be defined. If the entry is missing in the main configuration, two taxonomies are created by default, namely tags and categories. The corresponding taxonomy overview pages are also automatically made available.

This default definition is overridden as soon as a custom classification system is defined, even if you only want to use one of the default taxonomies. The following configuration therefore activates only the tags taxonomy; the categories taxonomy is not created:

taxonomies:
  tag: 'tags'

It’s also possible to deactivate the taxonomy functionality altogether by disabling the corresponding page kinds with disableKinds:

disableKinds: ['taxonomy', 'term']

Degrees of Complexity

The previous section mentioned the simplest case: no taxonomies are used. The next step up is one-dimensional keywording, which should be sufficient for many websites. Instead of distinguishing semantically meaningful classes, this approach defines only one taxonomy, often with a generic name such as keywords or tags.

In principle, this procedure corresponds to the practice of hashtags, even if the syntax remains as it was described above. This article, for example, could be classified as follows:

tags:
  - hugo
  - web-development

Complexity in a different sense arises on multilingual sites, as class names, terms and URLs generally need to be translated. For this purpose, taxonomies are defined per language:

languages:
  en:
    taxonomies:
      tag: 'tags'
  de:
    taxonomies:
      schlagwort: 'schlagwoerter'

Keywords: An Alternative for a One-Dimensional Classification?

The previous section addressed classification systems with only one taxonomy. With keywords, Hugo offers a predefined front matter field that at first glance appears to serve precisely this use case.

We can use this field for a one-dimensional classification system. However, for this to work, a matching taxonomy must still be defined in the main configuration, exactly as described above. If this is not done, domain.com/keywords/ will only return a 404 error page. So what is the purpose of the keywords field?

As far as I can tell, the intended purpose of the field is not explained further in the official documentation. In layouts, we can retrieve the keywords of a page with the PAGE.Keywords method. So we could, for example, use it in the base template to inject the keywords into the head of the page:

<meta name="keywords" content="keyword-1, keyword-2" />
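
A template that produces such a tag could look like this sketch. It assumes that the excerpt is placed in the head of the base template and that the keywords should be joined with a comma and a space.

```go-html-template
{{/* Excerpt for the <head> of the base template, e.g. layouts/_default/baseof.html (assumed path) */}}
{{/* .Keywords returns the keywords from the front matter; the tag is skipped if there are none */}}
{{ with .Keywords }}
  <meta name="keywords" content="{{ delimit . ", " }}">
{{ end }}
```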

Since Google no longer takes this information into account, it’s probably hardly worth the effort to add keywords this way. And if you still want to record the information in the head, you can just as well use “regular” taxonomies.

Keywords are deliberately hidden from visitors to the site and could therefore be useful for program-controlled features such as a dedicated search function. For most users, however, they are likely to be of little interest.

Self-imposed Restrictions

The definition of the classification system in the main configuration leaves many details unspecified and offers no way to restrict which terms are permitted. Users are therefore responsible for ensuring the consistency of their classifications.

The taxonomy for ratings is a case in point. It is in the nature of things that at most one rating can be assigned to a movie. However, since Hugo does not recognize semantic requirements of this kind, the build engine cannot enforce such rules. Technically, it would therefore be possible to give Terminator a 5 and a 2 rating at the same time.

Possible terms or formats cannot be specified in the definition, either. If you only want to allow integers or strings like x/5, you must ensure that you adhere to your own rules when classifying content.
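
One way to catch violations of such self-imposed rules is a check in a layout. The following sketch is not a built-in Hugo feature for taxonomies but a workaround: it assumes a taxonomy addressed by the plural name ratings and an excerpt placed in the single-page layout, and it aborts the build with an error if a page carries more than one rating.

```go-html-template
{{/* Excerpt for a single-page layout (assumed placement) */}}
{{/* Collect the rating terms assigned to the current page */}}
{{ $ratings := .GetTerms "ratings" }}
{{ if gt (len $ratings) 1 }}
  {{/* errorf logs an error, which causes the build to fail */}}
  {{ errorf "%s: expected at most one rating, found %d" .Path (len $ratings) }}
{{ end }}
```

The check only runs for pages rendered with this layout, so it complements rather than replaces discipline when classifying content.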

Notes on Terminology

The terminology of the official documentation can be misleading. Usually, it’s the overall system for classifying content or things that is called a taxonomy (singular). Alternative taxonomies (plural) would then be different ways of classifying the entities in a given subject area. In the past, the classes of a taxonomy were called taxa.

However, when the documentation refers to taxonomies, it is referring to classes or categories of a classification system. No technical term is introduced for the classification system itself.

The convention mentioned above should also not be read too literally. The singular key in the definition does not mean that only one term can be assigned to each piece of content. A single term is natural for some taxonomies, such as rating or year. For others, such as actor, we would typically want to assign more than one.

Moreover, the word value is used rather unconventionally in the context of taxonomies. The definitions in the front matter are typical key-value pairs. However, these are not the values in question; the things on the right-hand side of the front matter definitions are called terms (of a particular taxonomy). Values in the given sense, that is, values of a term, are the pieces of content classified with it, for example the page for the movie The Terminator.

Article from September 27, 2024.