IIIFのHTML

IIIFのHTML

[2] IIIFHTML要素の一部を XML構文で記述する独自のマーク付け言語を採用し、 JSON-LD における値として使っています。

[3] HTML要素XML構文として使うというと XHTML を想起しますが、 XML名前空間は使っていないようです。

[4] 通常の HTML とは似て非なるものですから、互換性には難があります。 IIIF 以外でこの構文を採用するべきではありません。

[1] Presentation API 2.1.1 — IIIF | International Image Interoperability Framework () <http://iiif.io/api/presentation/2.1/#html-markup-in-property-values>

Minimal HTML markup may be included in the description, attribution and metadata properties. It must not be used in label or other properties. This is included to allow manifest creators to add links and simple formatting instructions to blocks of text. The content must be well-formed XML and therefore must be wrapped in an element such as p or span. There must not be whitespace on either side of the HTML string, and thus the first character in the string must be a ‘<’ character and the last character must be ‘>’, allowing a consuming application to test whether the value is HTML or plain text using these. To avoid a non-HTML string matching this, it is recommended that an additional whitespace character be added to the end of the value.

In order to avoid HTML or script injection attacks, clients must remove:

Tags such as script, style, object, form, input and similar.

All attributes other than href on the a tag, src and alt on the img tag.

CData sections.

XML Comments.

Processing instructions.

Clients should allow only a, b, br, i, img, p, and span tags. Clients may choose to remove any and all tags, therefore it should not be assumed that the formatting will always be rendered.

{"description": {"@value": "<p>Some <b>description</b></p>", "@language": "en-latn"}}