Metadata in Hugo

In Hugo, metadata is defined in the so-called front matter. Some of the predefined fields only become effective when we manually include them in templates, while other fields are automatically taken into account during the website’s build process.

This article gives an overview of how such fields are defined and offers some considerations on which information should be stored in the front matter and served as hidden secondary content on the website.

Data Formats

Hugo supports various formats for the front matter: YAML, TOML, and JSON. In practice, there are hardly any decisive advantages or disadvantages; you can simply choose the format whose syntax suits you best. Personally, I prefer YAML.

JSON necessarily groups data with brackets, which seems unnecessary for simply structured data. To my mind, the somewhat cumbersome handling of hierarchical data speaks against TOML. Lists are represented in array syntax: [a, "b", 2]. For nested structures, the table headers must be repeated: [servers], then [servers.alpha], then [servers.alpha.a].

In YAML, too, we can write data structures with the same bracket literals. But we can also simply list the elements of a structure one per line, as we would with unordered lists in Markdown. In nested structures, the levels are expressed through syntactically meaningful indentation (as opposed to repeated table headers).
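As a small sketch of these notations, the list and the nested servers structure from the TOML discussion could look like this in YAML. The key names tags and keywords and the value under a are made up for illustration only.

```yaml
# Flow style: the same bracket literal as in TOML or JSON
tags: [a, "b", 2]

# Block style: one element per line, similar to a Markdown list
keywords:
  - a
  - b
  - 2

# Nesting through indentation instead of repeated table headers
servers:
  alpha:
    a: some value
```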

In YAML, string values can be written without quotation marks, even if they contain spaces. Quotation marks are only required if the string contains a reserved character sequence, such as a colon followed by a space. Literals such as 12 or 2014-12-31 are automatically recognized as numbers or dates (as opposed to strings). Only in exceptional cases must they be turned into strings by putting quotation marks around the literal ("12").
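The following sketch illustrates these quoting rules. The fields subtitle and version are hypothetical custom fields and only serve as examples.

```yaml
title: Metadata in Hugo            # plain string, spaces are fine
subtitle: "Front matter: a tour"   # quoted because of the colon followed by a space
weight: 12                         # read as a number
version: "12"                      # forced to be a string
date: 2014-12-31                   # read as a date
```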

The documentation recommends defining at least title and date in the front matter of each page. These fields only have an effect if they are used in layouts.

For most users, it makes sense to integrate this information into the HTML header. Standards such as schema.org or Dublin Core define additional requirements on how metadata should be structured and made available in machine-readable form. If this matters to you, you can use the metadata of the front matter to implement the recommendations of these guidelines.

Information about Content

The title should always be included in the head of a page. It not only appears in the title bar or in the browser tab, but also plays an important role for search engine optimization. In the base layout, this could be implemented as follows:

{{ if .Page.Title }}<title>{{ .Page.Title }}</title>{{ end }}

Two fields allow the main content to be displayed in summary or excerpt form. The description field adds a general description to the title. It is provided in the head of the page by a corresponding meta tag, which can be created using the PAGE.Description method:

<meta name="description" content="{{ with .Description }}{{ . }}{{ end }}" />

This description may be used by Google for the snippets in search results.

The summary field serves a different purpose. It’s used for content excerpts that are openly visible to the visitor, for example in excerpts of individual blog posts. These summaries can be inserted into layouts using the PAGE.Summary method.
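In the front matter, the two fields might be set side by side as in the following sketch. The wording of both values is invented for this example.

```yaml
---
title: Metadata in Hugo
# Intended for the meta tag in the head of the page
description: How front matter fields are defined and used in Hugo.
# Intended for visible excerpts, for example in a list of blog posts
summary: An overview of front matter fields and where they end up on the website.
---
```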

The date is used to document when a page was originally published. With archetypes, the current date can be inserted automatically into the front matter when a markup file is created. For ease of use, a simple date format such as YYYY-MM-DD is recommended.

While this date may be of interest to the author, it is primarily used to present the publication date to visitors. For this, the date can be integrated into layouts via the PAGE.Date method. The output is not limited to the date format used in the front matter; it can be customized with the time.Format function. For multilingual pages, words such as month names are automatically translated.

Additionally, lastmod can be used to record when content was last edited. This is particularly important for topics that are subject to rapid change. In such cases, visitors should be informed whether the content in question is reasonably up to date. The PAGE.Lastmod method is used to insert this date in layouts.
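A front matter that records both dates could look roughly like this. The lastmod value is an arbitrary example date.

```yaml
---
title: Metadata in Hugo
date: 2024-10-03      # original publication date
lastmod: 2024-11-15   # date of the last edit (example value)
---
```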

Several fields relate to the publication status of the page in question. The Boolean draft field specifies whether the page is still a draft. If it is set to true, the page does not appear on the published website. The publishDate field sets the date from which a page will be accessible, while expiryDate sets the date from which the page should no longer be displayed.
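The three status fields can be combined in one front matter, as in this sketch with example dates:

```yaml
---
title: Metadata in Hugo
draft: false              # true would keep the page out of the published website
publishDate: 2024-10-03   # the page is accessible from this date on
expiryDate: 2025-10-03    # the page is no longer displayed from this date on
---
```

For local previews, the hugo command offers the flags --buildDrafts, --buildFuture, and --buildExpired to include such pages anyway.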

Classifying Content

The type of a page can be explicitly defined in the front matter. The definition in the type field takes priority over the type derived from section membership. A layout with the same name can be created, which gives you more control over what other content is to be inserted on individual pages.
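As a sketch, a page could be assigned a type that differs from its section like this. The type name legal is hypothetical, and where exactly Hugo looks for the matching layout (for instance a layouts/legal/ directory) depends on the Hugo version in use.

```yaml
---
title: Imprint
# Overrides the type that would otherwise be derived from the section
type: legal
---
```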

So-called taxonomies can be used to classify content in the front matter with keywords. This topic is discussed in detail in its own part of the series on content management.

A Customized URL Structure

By default, URLs in Hugo follow the directory structure under content/. However, developers may define their own URL scheme. The more you deviate from the standard, the greater the additional effort will be.

Custom URLs for Single Pages

If you only want to change the last component of the URL of a single page, you can use the slug field in the front matter of the corresponding markup file. As a result, the individual page is accessed via the value for this key. For instance, content/blog/post-1.md with slug: my-first-post would be available via domain.com/blog/my-first-post/ as opposed to domain.com/blog/post-1/.
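For the example just described, the front matter of content/blog/post-1.md might look like this (the title is an assumption):

```yaml
---
title: My First Post
# Replaces only the last URL component: /blog/my-first-post/ instead of /blog/post-1/
slug: my-first-post
---
```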

The url field can be used to influence the entire URL or, more precisely, the part of the URL that comes after the base URL defined in the main configuration. Let’s take the example from the official documentation, which assumes https://domain.com/ as base URL. If we create a file under content/posts/post-1.md and set url = '/articles/my-first-article.html' in its front matter, then the corresponding page is accessible under https://domain.com/articles/my-first-article.html.

It’s important to note that for a multilingual website a leading / in the url field makes a difference. Only without a leading / is the language prefix (de, for instance) inserted between the base URL and the url value. For a monolingual website it makes no difference, but even then the variant without / is technically the more accurate option, since the base URL already ends with a / and we wouldn’t want two of them in the URL.
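Written in YAML and without the leading /, the url example could look as follows. This is only a sketch: whether a language prefix actually shows up also depends on the language settings of the site.

```yaml
---
title: My First Article
# No leading slash: on a multilingual site the language prefix (for instance de)
# is placed between the base URL and this value
url: articles/my-first-article.html
---
```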

Permalinks allow URLs to be defined for the sections located at the top level of the content/ directory. This means that the trailing part of a URL such as domain.com/name-of-section/ can be replaced by any other URL components.

Permalinks are defined in the main configuration and follow this pattern:

[permalinks]
  [permalinks.section]
    name-of-section = '/what/you/want/instead/'

If the value has only one path component, then this would be equivalent to renaming the content/name-of-section directory.

Permalink patterns become vastly more useful through so-called permalink tokens: placeholder path components whose values are derived from the front matter or from the file itself. Since version v0.131.0, Hugo also accepts these tokens in the url front-matter field.

The most useful tokens include :year, :month (or :monthname), and :day. They can be used, for instance, to insert the date into the URL of blog posts. :filename or :slugorfilename is often used to place the file name into the URL; similarly, the name of the section can be inserted with :section.

Permalink tokens are either used in the url front-matter field or added to the main configuration:

[permalinks]
  [permalinks.page]
    blog = '/:section/:year-:month-:day-:slugorfilename/'

A post whose main content is defined under content/blog/post-1.md and whose front matter contains date: 2024-08-24 would, with the [permalinks.page] pattern for blog, be accessible under domain.com/blog/2024-08-24-post-1/.

Permalinks can be defined not only for pages and sections, but also for taxonomy terms:

[permalinks]
  [permalinks.term]
    tags = '/:slug/'
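The other variant mentioned earlier, tokens in the url front-matter field, could be sketched like this for the same post. The title is an assumption, and the value is quoted to keep the colons from being misread by the YAML parser. The pattern is meant to mirror the blog pattern from the main configuration.

```yaml
---
title: Post 1
date: 2024-08-24
# Tokens are resolved from this page's own date and file name
url: "/blog/:year-:month-:day-:slugorfilename/"
---
```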

For multilingual websites, it may be advisable to define permalinks relative to the individual languages. To do this, they are added under [languages]:

[languages]
  [languages.de]
    [languages.de.permalinks]
      [languages.de.permalinks.page]
        blog = '/blogbeitraege/:year-:month-:day-:slugorfilename/'
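For comparison, the same per-language setting might be written as follows if the main configuration is kept in YAML (a hugo.yaml file is assumed here). The nesting is then expressed by indentation alone, so there are no table headers that have to be repeated in full.

```yaml
languages:
  de:
    permalinks:
      page:
        # Pattern for regular pages in the blog section of the German site
        blog: "/blogbeitraege/:year-:month-:day-:slugorfilename/"
```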

Article from October 3, 2024.