Internet-Draft | Matroska Tags | November 2024 |
Lhomme, et al. | Expires 28 May 2025 |
This document defines the Matroska tags, namely the tag names and their respective semantic meaning.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on 28 May 2025.
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
Matroska is a multimedia container format defined in [RFC9559]. It can store timestamped multimedia data as well as chapters and tags. The Tag elements add important metadata to identify and classify the information found in a Matroska Segment. A Tag can apply to a whole Segment, separate Tracks elements, individual Chapter elements, or Attachments elements.
Some details about tagging are already present in Section 24 of [RFC9559].
While the Matroska tagging framework allows anyone to create their own custom tags, it's important to have a common set of values for interoperability. This document intends to define a set of common tag names used in Matroska.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
When a SimpleTag is nested within another SimpleTag, the nested SimpleTag becomes an attribute of its parent SimpleTag. For instance, if you wanted to store the date on which a singer became the lead performer, then your SimpleTag tree would look something like this:

Targets
  TagTrackUID = {track UID of tagged content}
ARTIST = "Pet Shop Boys"
  LEAD_PERFORMER = "Neil Tennant"
    DATE_STARTED = "1981-08"
This corresponds to this layout of EBML elements:

<Tags>
  <Tag>
    <Targets>
      <TagTrackUID>{track UID of tagged content}</TagTrackUID>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Pet Shop Boys</TagString>
      <!-- sub tag(s) about the ARTIST -->
      <SimpleTag>
        <TagName>LEAD_PERFORMER</TagName>
        <TagString>Neil Tennant</TagString>
        <!-- sub tag(s) about the LEAD_PERFORMER -->
        <SimpleTag>
          <TagName>DATE_STARTED</TagName>
          <TagString>1981-08</TagString>
        </SimpleTag>
      </SimpleTag>
    </SimpleTag>
  </Tag>
</Tags>
In this way, it becomes possible to store any SimpleTag as an attribute of another SimpleTag.
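
As an illustration only, the nesting described above can be sketched in Python with the standard xml.etree.ElementTree module. The XML form is just a textual view of the EBML elements, not the on-disk format, and the helper names simple_tag and attributes are hypothetical, introduced for this sketch.

```python
import xml.etree.ElementTree as ET


def simple_tag(name, value, *children):
    """Build a <SimpleTag> element; `children` are nested SimpleTag attributes.

    Hypothetical helper for this sketch, not part of any Matroska library.
    """
    tag = ET.Element("SimpleTag")
    ET.SubElement(tag, "TagName").text = name
    ET.SubElement(tag, "TagString").text = value
    tag.extend(children)  # nested SimpleTag elements become attributes
    return tag


def attributes(tag):
    """Return the direct attributes of a SimpleTag as {TagName: TagString}."""
    return {
        child.findtext("TagName"): child.findtext("TagString")
        for child in tag.findall("SimpleTag")
    }


# The ARTIST example: DATE_STARTED describes LEAD_PERFORMER,
# which in turn describes ARTIST.
artist = simple_tag(
    "ARTIST", "Pet Shop Boys",
    simple_tag("LEAD_PERFORMER", "Neil Tennant",
               simple_tag("DATE_STARTED", "1981-08")),
)
```

Calling attributes(artist) returns only the direct children of the ARTIST tag; attributes of LEAD_PERFORMER have to be requested on that nested element.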
Multiple items SHOULD NOT be stored as a list in a single TagString. If there is more than one tag value with the same name to be stored, then more than one SimpleTag SHOULD be used.
Official TagName values MUST consist of UTF-8 capital letters, numbers and the underscore character '_'.
Official TagName values MUST NOT contain any space.
Official TagName values MUST NOT start with the underscore character '_'; see Section 3.1.
It is RECOMMENDED to start a tag name with the underscore character '_' for non-official tags that are not meant to make it to the list of official tags.
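
The naming rules above could be checked with a sketch like the following. It assumes that "capital letters" and "numbers" mean ASCII A-Z and 0-9, which is an interpretation rather than something stated in this document, and the function names are hypothetical.

```python
import re

# Assumption: "capital letters" and "numbers" are read as ASCII A-Z and 0-9.
# The first character may not be an underscore; spaces are never allowed.
_OFFICIAL_STYLE = re.compile(r"[A-Z0-9][A-Z0-9_]*")


def is_official_style(name: str) -> bool:
    """True if `name` follows the character rules for official TagName values."""
    return _OFFICIAL_STYLE.fullmatch(name) is not None


def is_private_style(name: str) -> bool:
    """True if `name` uses the recommended '_' prefix for non-official tags."""
    return name.startswith("_")
```

Note that is_official_style only checks the form of a name; it does not tell whether the name is actually in the list of official tags.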
The TargetTypeValue element allows tagging of different parts that are inside or outside a given file. For example, in an audio file with one song, you could have information about the album it comes from, or even the CD set, even if these are not found in the file.
For applications to know which level a kind of information (e.g., "TITLE") relates to (CD title or track title), we also need a set of official TargetTypeValue values and TargetType names. That also means the same tag name can have different meanings depending on its TargetTypeValue; otherwise, we would end up with 7 "TITLE_" tag names.
For human readability, a TargetType string can be added next to the corresponding TargetTypeValue. Audio and video have different TargetType values. The following tables summarize the TargetType values found in Section 5.1.8.1.1.2 of [RFC9559] for audio and video content:
TargetTypeValue | Audio TargetType | Comment |
---|---|---|
70 | COLLECTION | the high hierarchy consisting of many different lower items |
60 | EDITION / ISSUE / VOLUME / OPUS | a list of lower levels grouped together |
50 | ALBUM / OPERA / CONCERT | the most common grouping level of music (e.g., an album) |
40 | PART / SESSION | when an album has different logical parts |
30 | TRACK / SONG | the common parts of an album |
20 | SUBTRACK / PART / MOVEMENT | corresponds to parts of a track for audio (e.g., a movement) |
10 | - | the lowest hierarchy found in music |

TargetTypeValue | Video TargetType | Comment |
---|---|---|
70 | COLLECTION | the high hierarchy consisting of many different lower items |
60 | SEASON / SEQUEL / VOLUME | a list of lower levels grouped together |
50 | MOVIE / EPISODE / CONCERT | the most common grouping level of video (e.g., an episode for TV series) |
40 | PART / SESSION | when an episode has different logical parts |
30 | CHAPTER | the common parts of a movie or episode |
20 | SCENE | a sequence of continuous action in a film or video |
10 | SHOT | the lowest hierarchy found in movies |
Tags from a TargetTypeValue apply to all lower TargetTypeValue levels. This means that if a CD has the same artist for all tracks, you just need to set the "ARTIST" tag at TargetTypeValue 50 (ALBUM) and not at each TargetTypeValue 30 (TRACK), but you can also repeat the value for each track. If some tracks of that CD have no known "ARTIST", the value MUST be set to nothing, a void string "" as detailed in Section 24.2 of [RFC9559], so that the album "ARTIST" doesn't apply.
If a tag with a given TagName is found at a TargetTypeValue, only the values of that TagName found at that level are valid at that TargetTypeValue level. In other words, the TagName values from upper TargetTypeValue levels don't apply at that level.
Multiple SimpleTag elements with the same TagName can be used at a given TargetTypeValue level when each SimpleTag contains a TagString. For example, this can be useful to find a single "ARTIST" even when that artist appears in a collaboration. The concatenation of each TagString represents the value for the TagName at this level. The presentation, for instance with a separator, is up to the application.
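
The inheritance and multi-value rules above can be sketched as a small lookup. This is only one possible reading, under a simplified and hypothetical model: each tag is a (TargetTypeValue, TagName, TagString) tuple, and the list is assumed to contain only the tags that already target the entity of interest.

```python
from typing import List, Tuple

# Hypothetical model: (TargetTypeValue, TagName, TagString), already filtered
# to the Tag elements that target the entity being examined.
TagEntry = Tuple[int, str, str]


def effective_values(tags: List[TagEntry], name: str, level: int) -> List[str]:
    """Return the TagString values of `name` that apply at `level`."""
    # Levels at or above the requested one that carry this TagName.
    levels = [lv for lv, n, _ in tags if n == name and lv >= level]
    if not levels:
        return []
    # The nearest level wins; values from upper levels no longer apply.
    nearest = min(levels)
    values = [v for lv, n, v in tags if lv == nearest and n == name]
    # A void string "" means "no value" and blocks the inherited one.
    return [v for v in values if v != ""]
```

With an album-level "ARTIST" and no track-level one, a lookup at level 30 returns the album value. Adding a track-level "ARTIST" with a void string makes the same lookup return an empty list. How several returned values are joined for display is left to the application, for example with " / ".join(values).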
There are three organizational tags defined in Section 4.2: "TOTAL_PARTS", "PART_NUMBER", and "PART_OFFSET".
These tags allow specifying the ordering of some tags within another group of tags.
For example, suppose you have an album with 10 tracks and you want to tag the second track from it. You set "TOTAL_PARTS" to "10" at TargetTypeValue 50 (ALBUM). It means the "ALBUM" contains 10 lower parts. The lower part in question is the first lower TargetTypeValue that is specified in the file. So, if it's TargetTypeValue = 30 (TRACK), then that means the album contains 10 tracks. If TargetTypeValue is 20 (MOVEMENT), that means the album contains 10 movements, etc. And since it's the second track within the album, the "PART_NUMBER" at TargetTypeValue 30 (TRACK) is set to "2".
If the parts are split into multiple logical entities, you can also use "PART_OFFSET". For example, if you are tagging the third track of the second CD of a double CD album with a total of 10 tracks, the "TOTAL_PARTS" at TargetTypeValue 50 (ALBUM) is "10", the "PART_NUMBER" at TargetTypeValue 30 (TRACK) is "3", and the "PART_OFFSET" at TargetTypeValue 30 (TRACK) is "5", which is the number of tracks on the first CD.
When a TargetTypeValue level doesn't exist, it MUST NOT be specified in the files, so that the "TOTAL_PARTS" and "PART_NUMBER" elements match the same levels.
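
As a worked illustration of the three organizational tags, the sketch below derives a position within the upper level. It assumes that adding "PART_OFFSET" to "PART_NUMBER" yields the position counted across all parts; this is an interpretation of the double CD example above, and the function names are hypothetical.

```python
def overall_position(part_number: str, part_offset: str = "0") -> int:
    """Position of a part counted across all logical entities.

    Assumption for this sketch: PART_OFFSET is the number of parts that come
    before the current logical entity (e.g. the tracks on the first CD).
    """
    return int(part_offset) + int(part_number)


def describe_part(part_number: str, total_parts: str, part_offset: str = "0") -> str:
    """Format the position as 'N/TOTAL', rejecting inconsistent values."""
    position = overall_position(part_number, part_offset)
    total = int(total_parts)
    if not 1 <= position <= total:
        raise ValueError("PART_NUMBER/PART_OFFSET do not fit within TOTAL_PARTS")
    return f"{position}/{total}"
```

For the double CD example, describe_part("3", "10", "5") gives "8/10": the third track of the second CD is the eighth of the album's ten tracks.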
Here is an example of an audio record with 2 tracks in a single file, corresponding to [DaFunk]. There is one Tag element for the record, and one Tag element per track on the record. Each track is identified by a chapter.
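The Tag for the record:

Targets
  TargetTypeValue = 50
ARTIST = "Daft Punk"
TITLE = "Da Funk"
TOTAL_PARTS = "2"

The Tag for the first track:

Targets
  TargetTypeValue = 30
  TagChapterUID = 12345
TITLE = "Da Funk"
PART_NUMBER = "1"

The Tag for the second track:

Targets
  TargetTypeValue = 30
  TagChapterUID = 67890
TITLE = "Rollin' & Scratchin'"
PART_NUMBER = "2"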
This corresponds to this layout of EBML elements:

<Tags>
  <!-- description of the whole file/record -->
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Daft Punk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Da Funk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>TOTAL_PARTS</TagName>
      <TagString>2</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the first track/chapter -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <TagChapterUID>12345</TagChapterUID>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Da Funk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>1</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the second track/chapter -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <TagChapterUID>67890</TagChapterUID>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Rollin' &amp; Scratchin'</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>2</TagString>
    </SimpleTag>
  </Tag>
</Tags>
Here is an example using the "PART_OFFSET" tag. It corresponds to a file that contains the third track on the second CD of the 2-CD album "The Orb's Adventures Beyond The Ultraworld" [OrbUltraworld]:

The Tag for the album:

Targets
  TargetTypeValue = 50
ARTIST = "Orb"
  SORT_WITH = "Orb, The"
TITLE = "The Orb's Adventures Beyond The Ultraworld"
TOTAL_PARTS = "10"

The Tag for the third track of the second CD:

Targets
  TargetTypeValue = 30
TITLE = "Outlands"
PART_NUMBER = "3"
PART_OFFSET = "5"
This corresponds to this layout of EBML elements:

<Tags>
  <!-- description of the whole album -->
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Orb</TagString>
      <SimpleTag>
        <TagName>SORT_WITH</TagName>
        <TagString>Orb, The</TagString>
      </SimpleTag>
    </SimpleTag>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>The Orb's Adventures Beyond The Ultraworld</TagString>
    </SimpleTag>
    <!-- the number of sub elements in this album (10 tracks) -->
    <SimpleTag>
      <TagName>TOTAL_PARTS</TagName>
      <TagString>10</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the third track of the second CD -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Outlands</TagString>
    </SimpleTag>
    <!-- This is the third track of the second CD -->
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>3</TagString>
    </SimpleTag>
    <!-- The first CD contains 5 tracks -->
    <SimpleTag>
      <TagName>PART_OFFSET</TagName>
      <TagString>5</TagString>
    </SimpleTag>
  </Tag>
</Tags>
A Tag element has a single Targets element with a single TargetTypeValue element, but the Targets element can contain multiple TagTrackUID, TagEditionUID, TagChapterUID, and TagAttachmentUID elements.
When multiple values are found using the same Tag UID element (e.g., TagTrackUID), a logical OR is applied on these elements. In other words, the tags apply to each entity defined by a UID. This is the list of UIDs the tags apply to (e.g., list of TagTrackUID). Such a list may contain a single UID element.
When different lists of Tag UID elements are found (e.g., a list of TagTrackUID and a list of TagChapterUID), a logical AND is applied between those lists. In other words, the tags apply only to the entities matching a UID in each list of Tag UID elements.
These operations allow factorizing tags that would otherwise need to be repeated multiple times.
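
The OR-within-a-list and AND-between-lists rules can be sketched as follows. The model is hypothetical: the Targets of a Tag are a mapping from a Tag UID element name to a set of UIDs, and the context is the same kind of mapping describing what is currently being examined (for instance, one track within one chapter). Skipping an empty or absent list, so that it does not restrict the match, is an assumption of this sketch.

```python
from typing import Dict, Set

# Hypothetical model: element name -> set of UIDs,
# e.g. {"TagTrackUID": {123}, "TagChapterUID": {67890}}.
UidLists = Dict[str, Set[int]]


def tag_applies(targets: UidLists, context: UidLists) -> bool:
    """True if a Tag with these Targets applies to the given context."""
    for kind, uids in targets.items():
        if not uids:
            # Assumption: an empty list does not restrict the match.
            continue
        # Logical OR inside one list: any UID in common is enough.
        if not (uids & context.get(kind, set())):
            # Logical AND between lists: one failing list rejects the Tag.
            return False
    return True
```

A Tag targeting chapters 12345 and 67890 applies to either chapter. A Tag targeting track 123 and chapter 67890 applies only when both the track and the chapter match.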
Here is an example of a Tag applying to 2 chapters, using the same [DaFunk] example as in Section 3.3.1:

Targets
  TargetTypeValue = 30
  TagChapterUID = 12345
  TagChapterUID = 67890
WRITTEN_BY = "Thomas Bangalter"
WRITTEN_BY = "Guy-Manuel de Homem-Christo"
PRODUCER = "Thomas Bangalter"
PRODUCER = "Guy-Manuel de Homem-Christo"
This corresponds to this layout of EBML elements:

<Tags>
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <!-- chapter with Da Funk -->
      <TagChapterUID>12345</TagChapterUID>
      <!-- chapter with Rollin' & Scratchin' -->
      <TagChapterUID>67890</TagChapterUID>
    </Targets>
    <!-- first writer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>WRITTEN_BY</TagName>
      <TagString>Thomas Bangalter</TagString>
    </SimpleTag>
    <!-- second writer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>WRITTEN_BY</TagName>
      <TagString>Guy-Manuel de Homem-Christo</TagString>
    </SimpleTag>
    <!-- first producer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>PRODUCER</TagName>
      <TagString>Thomas Bangalter</TagString>
    </SimpleTag>
    <!-- second producer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>PRODUCER</TagName>
      <TagString>Guy-Manuel de Homem-Christo</TagString>
    </SimpleTag>
  </Tag>
</Tags>
Some combinations of different Tag UID elements are not possible.
A TagChapterUID and a TagAttachmentUID can't be mixed because there is no overlap between a Chapter and an Attachment that would make sense. An attachment applies to the whole segment and can be tied to tracks, via \Segment\Tracks\TrackEntry\AttachmentLink as defined in Section 5.1.4.1.24 of [RFC9559], but not to chapters.
Mixing TagEditionUID and TagChapterUID elements is also of no use because each Chapter UID would need to be in one of the Chapter Edition UIDs. That would be the same as not using the list of TagEditionUID at all.
The following table shows the allowed combinations between lists of Tag UID elements:
UID elements | Track | Edition | Chapter | Attachment |
---|---|---|---|---|
Track | YES | YES | YES | with matching AttachmentLink |
Edition | YES | YES | NO | YES |
Chapter | YES | NO | YES | NO |
Attachment | with matching AttachmentLink | YES | NO | YES |
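
The table above can be transcribed into a small lookup, for instance to reject invalid Targets when writing a file. This is a sketch only: the "COND" marker stands for "with matching AttachmentLink", and the function names and the attachment_link_matches parameter are hypothetical.

```python
# Transcription of the table; "COND" means "with matching AttachmentLink".
_COMBINATIONS = {
    frozenset({"Track"}): "YES",
    frozenset({"Track", "Edition"}): "YES",
    frozenset({"Track", "Chapter"}): "YES",
    frozenset({"Track", "Attachment"}): "COND",
    frozenset({"Edition"}): "YES",
    frozenset({"Edition", "Chapter"}): "NO",
    frozenset({"Edition", "Attachment"}): "YES",
    frozenset({"Chapter"}): "YES",
    frozenset({"Chapter", "Attachment"}): "NO",
    frozenset({"Attachment"}): "YES",
}


def combination(a: str, b: str) -> str:
    """Look up one cell of the table; the table is symmetric."""
    return _COMBINATIONS[frozenset({a, b})]


def targets_are_valid(kinds, attachment_link_matches: bool = False) -> bool:
    """Check every pair of UID list kinds present in one Targets element."""
    unique = sorted(set(kinds))
    for i, a in enumerate(unique):
        for b in unique[i + 1:]:
            status = combination(a, b)
            if status == "NO":
                return False
            if status == "COND" and not attachment_link_matches:
                return False
    return True
```

Whether an AttachmentLink actually matches has to be determined from the track entries by the caller; the sketch only takes the result as a flag.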
Here is an example of a Tag applying to a single track and a single chapter. It represents the composer of the music in a part of a movie. The file may contain a second audio track with audio commentary not including that music, so we only tag the track with the music.

Targets
  TargetTypeValue = 30
  TagTrackUID = 123
  TagChapterUID = 67890
COMPOSER = "Hans Zimmer"
This corresponds to this layout of EBML elements:

<Tags>
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <!-- track with the music -->
      <TagTrackUID>123</TagTrackUID>
      <!-- chapter with the music -->
      <TagChapterUID>67890</TagChapterUID>
    </Targets>
    <!-- composer of the music in that chapter when using that audio track -->
    <SimpleTag>
      <TagName>COMPOSER</TagName>
      <TagString>Hans Zimmer</TagString>
    </SimpleTag>
  </Tag>
</Tags>
This document inherits security considerations from the EBML [RFC8794] and Matroska [RFC9559] documents.
Tag values can be either TagString or TagBinary blobs. In both cases, issues can happen if the parsing of the data fails.
Most of the time, strings are kept as-is and don't pose a security issue, apart from invalid UTF-8 values.
String tags that are parsed, like "REPLAYGAIN_GAIN" or "REPLAYGAIN_PEAK" defined in Section 4.10, string tags following the rules from Section 3.2.2, or string tags following other strict formats like URLs may cause issues when the string is bogus or in an unexpected format.
Binary tags that need to be parsed, like "MCDI" defined in Section 4.11, may cause issues when the data is bogus or incomplete.
Due to the nature of nested SimpleTag elements, it is possible to exhaust the memory of the host app by using very deep nesting. A host app MAY add some limits to the amount of nesting possible to avoid such issues.
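
A nesting limit of the kind mentioned above might look like the following sketch. The limit of 16 levels is an arbitrary value chosen for illustration, not one given in this document, and the dictionary model with a "children" key is hypothetical. A real parser would more likely apply the same check while reading the elements, before allocating the nested structures.

```python
# Hypothetical limit chosen for this sketch; the document does not define one.
MAX_NESTING = 16


def within_nesting_limit(simple_tags, max_depth: int = MAX_NESTING) -> bool:
    """Check that no SimpleTag is nested deeper than `max_depth` levels.

    Each tag is modeled as a dict with an optional "children" list.
    The walk is iterative so that hostile input cannot exhaust the call stack.
    """
    stack = [(tag, 1) for tag in simple_tags]
    while stack:
        tag, depth = stack.pop()
        if depth > max_depth:
            return False
        stack.extend((child, depth + 1) for child in tag.get("children", []))
    return True
```

In this sketch a top-level SimpleTag counts as depth 1, so the ARTIST, LEAD_PERFORMER, DATE_STARTED chain from Section 3 has a depth of 3.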