The source code of each content document identifies its primary language
Screen readers and braille displays may not automatically detect a book’s primary language. Providing this information in the source code for each section ensures the content is processed and rendered correctly.
Objectives
- Ensure screen readers use the correct pronunciation for the content.
- Facilitate automatic translation.
- Improve content indexing based on language.
Implementation
- Add the lang attribute to the root html element of the Content Document to specify the primary language of the text.
- Ensure the language code complies with the IANA Language Subtag Registry. In practice, for English, use lang="en" in HTML, or both lang="en" and xml:lang="en" in XHTML. If the language varies within the book, use the lang attribute (and xml:lang for XHTML) on specific elements to indicate language shifts within a section.
- Apply the lang attribute to elements such as p, div, or span to signal passages in a different language (e.g., <p lang="fr">). If the primary language changes across different parts of the content, set the lang attribute (and xml:lang for XHTML) on the relevant elements, such as head, body, or title, to specify the dominant language for those parts.
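The markup described above can be sketched as follows. This is only an illustration: the sample document, its text, and the helper name paragraph_languages are hypothetical, and the snippet relies solely on Python's standard xml.etree.ElementTree module. It embeds a small XHTML content document whose primary language is English, with one French passage, and reports the language in effect for each paragraph.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical content document: English as the primary language,
# declared on the root html element, plus one French paragraph.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head><title>Chapter 1</title></head>
  <body>
    <p>She opened the letter.</p>
    <p lang="fr" xml:lang="fr">Je reviendrai demain.</p>
  </body>
</html>"""


def paragraph_languages(xhtml_text):
    """Return (text, language) pairs for every p element.

    An element without its own language attribute inherits the
    language of its nearest ancestor that declares one.
    """
    root = ET.fromstring(xhtml_text)
    results = []

    def walk(element, inherited):
        # A value set on the element itself overrides the inherited one.
        current = element.get(XML_LANG) or element.get("lang") or inherited
        if element.tag == f"{{{XHTML_NS}}}p":
            results.append(("".join(element.itertext()).strip(), current))
        for child in element:
            walk(child, current)

    walk(root, None)
    return results


for text, language in paragraph_languages(SAMPLE):
    print(language, text)
```

The first paragraph inherits "en" from the root element, while the second carries its own "fr" declaration, which is the behaviour a screen reader needs in order to switch pronunciation at the right place.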
Control
- Check that the root html element of the Content Document includes the lang attribute (e.g., lang="en") and, for XHTML, the xml:lang attribute to indicate the primary language.
- Ensure the xml:lang attribute is present in XHTML documents to maintain compatibility with legacy systems and tools that require it.
- Verify that the lang attribute is used on relevant elements (e.g., p, div, span) for any multilingual content to indicate language changes within the document.
- Confirm that the language for metadata and content in sections such as head, title, and body is explicitly defined with the lang attribute to ensure linguistic consistency.
- Verify that language codes comply with the IANA Language Subtag Registry and match the content’s language. Note that generic codes like mul (multiple languages) and und (undetermined) must not be used. Additionally, while xml:lang can supplement the lang attribute, it is not sufficient on its own to meet this requirement.
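Some of the checks above lend themselves to a quick script. The following is a minimal sketch, not a replacement for epubcheck or human review: the function name check_root_language and the wording of its messages are hypothetical, it inspects only the root element of an XHTML document, and it does not look codes up in the IANA registry or judge whether the declared language matches the actual content.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Generic codes that the checklist above rules out.
DISALLOWED = {"mul", "und"}


def check_root_language(xhtml_text):
    """Return a list of problems found on the root element (empty if none)."""
    root = ET.fromstring(xhtml_text)
    problems = []
    lang = root.get("lang")
    xml_lang = root.get(XML_LANG)

    if not lang:
        problems.append("root html element has no lang attribute")
    elif lang.split("-")[0].lower() in DISALLOWED:
        problems.append(f"generic language code not allowed: {lang}")

    if not xml_lang:
        problems.append("root html element has no xml:lang attribute")
    elif lang and xml_lang.lower() != lang.lower():
        problems.append(f"lang ({lang}) and xml:lang ({xml_lang}) differ")

    return problems


# Hypothetical document that uses a generic code and omits xml:lang.
print(check_root_language(
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="mul"><body/></html>'
))
# prints ['generic language code not allowed: mul', 'root html element has no xml:lang attribute']
```

An empty list means only that these structural checks passed; whether the code is registered and actually matches the text still needs to be confirmed by a person.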
Validation
- Reported by epubcheck.
- Needs to be verified by a human.