[19] This page describes how source positions of nodes, attributes, tokens, and errors
are stored and handled in [[manakai]]'s DOM/HTML-related [[Perl]] modules.

* DocumentIndex

[2] A DocumentIndex is a positive integer used to identify a source data stream.

[3] The value [CODE[-1]] represents that the source is an unknown data stream.

* CharacterIndex

[4] A CharacterIndex is a non-negative integer representing the location of a character in the given source character stream.  It is the offset from the beginning of the character string.

;; [5] It is unclear whether "an unknown character index" is necessary or not at the time of writing.  If necessary, the value [CODE[-1]] will be used.

* LineNumber

[43] A LineNumber is a positive integer representing the line in the given source character stream
in which the character appears.

[47] The LineNumber of the first line in the source character stream is 1.

;; [44] It is unclear whether "an unknown line number" is necessary or not.
If necessary, the value -1 will be used.

* ColumnNumber

[45] A ColumnNumber is a non-negative integer representing the column in the line
in the given source character stream at which the character appears.

[46] The ColumnNumber of the first character in the line is 1.

[48] The ColumnNumber of the newline can be 0 (a ColumnNumber in the next line) or
equal to the number of the characters in the previous line (a ColumnNumber
in the previous line).

;; [49] It is unclear whether "an unknown column number" is necessary or not.
If necessary, the value -1 will be used.  Note that it might be possible to
use -1 for the CR of a CRLF sequence...

* IndexedStringSegment

[6] An IndexedStringSegment is an array reference containing a character string,
a DocumentIndex, and a CharacterIndex.

[7] The pair of the DocumentIndex and the CharacterIndex identifies the source
location of the first character of the character string.

[8] An IndexedStringSegment represents the character string.

* IndexedString

[1] An IndexedString is an array reference containing zero or more IndexedStringSegments,
representing the string that consists of the concatenation of the strings represented by
the IndexedStringSegments, in order.
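
As a rough illustration, the following Python sketch models the Perl array references as Python lists.  The helper name [CODE[indexed_string_to_str]] and the sample values are hypothetical and not part of manakai.

```python
# An IndexedStringSegment is modelled as [string, DocumentIndex, CharacterIndex].
# An IndexedString is modelled as a list of such segments.

def indexed_string_to_str(indexed_string):
    """Return the string represented by an IndexedString:
    the concatenation of the segments' character strings, in order."""
    return "".join(segment[0] for segment in indexed_string)

# Hypothetical sample: "abc" starts at character 10 of document 1,
# "de" comes from an unknown source (DocumentIndex -1).
sample = [["abc", 1, 10], ["de", -1, 0]]
represented = indexed_string_to_str(sample)
```

An empty list is a valid IndexedString in this model and represents the empty string.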

* IndexIndexMappingSegment

[22] An IndexIndexMappingSegment is an array reference whose items are
a CharacterIndex, a DocumentIndex, and a CharacterIndex.

[23] It represents that the character at the first CharacterIndex
corresponds to the character at the last CharacterIndex in the document
identified by the DocumentIndex.

* IndexIndexMapping

[24] An IndexIndexMapping is an array reference whose items are zero or more
IndexIndexMappingSegments.

[25] The first CharacterIndex in an earlier IndexIndexMappingSegment in the IndexIndexMapping
must be less than or equal to the first CharacterIndex of a later IndexIndexMappingSegment.

[35] It defines a mapping from the CharacterIndex of a character to
a pair of DocumentIndex and CharacterIndex.

[29] The [DFN[relevant segment]] of a mapping
for a CharacterIndex is the result of these steps:
[FIG(steps)[
= [32] Let [VAR[result]] be a segment (0, -1, 0).
= [30] For each segment in the mapping, in order:
== [31] If the first CharacterIndex of the segment is greater than the given CharacterIndex,
return [VAR[result]] and abort these steps.
== [33] Otherwise, set [VAR[result]] to the segment.
= [34] Return [VAR[result]].
]FIG]

[26] The [DFN[mapped (DocumentIndex, CharacterIndex) pair]] of a character obtained by an
IndexIndexMapping is the pair of DocumentIndex and the last CharacterIndex
of the relevant segment of the IndexIndexMapping for the
CharacterIndex identifying the character.

;; [28] Note that if there is more than one IndexIndexMappingSegment whose first
CharacterIndexes are equal, only the last of them is used; the earlier ones are ignored.
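
A minimal sketch of the relevant segment steps and of the mapped pair lookup, written in Python with Perl array references modelled as lists.  The function names are hypothetical and not part of manakai.  The loop keeps the most recent segment whose first CharacterIndex does not exceed the given one.

```python
# An IndexIndexMappingSegment is modelled as
# [CharacterIndex, DocumentIndex, CharacterIndex].

def relevant_segment(mapping, char_index):
    """Return the relevant segment of `mapping` for `char_index`."""
    result = [0, -1, 0]  # default segment: unknown document
    for segment in mapping:
        if segment[0] > char_index:
            return result
        result = segment
    return result

def mapped_di_ci_pair(mapping, char_index):
    """Return the mapped (DocumentIndex, CharacterIndex) pair.

    The pair is taken from the relevant segment as-is."""
    segment = relevant_segment(mapping, char_index)
    return (segment[1], segment[2])

# Hypothetical mapping; two segments share the first CharacterIndex 4,
# so the earlier of the two is never selected.
sample_mapping = [[0, 1, 5], [4, 2, 0], [4, 3, 7], [9, 1, 20]]
pair_at_4 = mapped_di_ci_pair(sample_mapping, 4)
```

With an empty mapping, or when every segment starts after the given CharacterIndex, the default segment applies and the DocumentIndex is -1.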

* IndexLCMappingSegment

[36] An IndexLCMappingSegment is an array reference whose items are
a CharacterIndex, a LineNumber, and a ColumnNumber.

[37] It represents that the character at the CharacterIndex has
LineNumber and ColumnNumber.

* IndexLCMapping

[38] An IndexLCMapping is an array reference whose items are zero or more
IndexLCMappingSegments.

[39] The CharacterIndex in an earlier IndexLCMappingSegment in the IndexLCMapping
must be less than or equal to the CharacterIndex of a later IndexLCMappingSegment.

[40] It defines a mapping from the CharacterIndex of a character to
a pair of LineNumber and ColumnNumber.

[41] The [DFN[mapped (LineNumber, ColumnNumber) pair]] of a character obtained by an
IndexLCMapping is the pair of LineNumber and the ColumnNumber
of the relevant segment of the IndexLCMapping for the CharacterIndex identifying the character.

;; [42] Note that if there is more than one IndexLCMappingSegment whose
CharacterIndexes are equal, only the last of them is used; the earlier ones are ignored.
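
Because the segments are required to be ordered by CharacterIndex, the lookup can also be done with a binary search.  The following Python sketch shows one way to do so; the function name is hypothetical, Perl array references are modelled as lists, and the fallback pair (-1, 0) is an assumption derived from applying the default segment (0, -1, 0) to an IndexLCMapping.

```python
from bisect import bisect_right

# An IndexLCMappingSegment is modelled as
# [CharacterIndex, LineNumber, ColumnNumber].

def mapped_line_column(lc_map, char_index):
    """Return the mapped (LineNumber, ColumnNumber) pair for `char_index`.

    The pair stored in the relevant segment is returned as-is;
    no offset is added."""
    keys = [segment[0] for segment in lc_map]
    # Position just after the last segment whose CharacterIndex
    # is less than or equal to char_index.
    pos = bisect_right(keys, char_index)
    if pos == 0:
        return (-1, 0)  # assumption: values of the default segment
    _, line, column = lc_map[pos - 1]
    return (line, column)

# Hypothetical mapping with two segments for CharacterIndex 6.
sample_lc_map = [[0, 1, 1], [6, 2, 1], [6, 2, 0], [12, 3, 1]]
pair_at_6 = mapped_line_column(sample_lc_map, 6)
```

Using [CODE[bisect_right]] rather than [CODE[bisect_left]] makes the later of several segments with equal CharacterIndexes win, which matches the linear steps.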

* DocumentIndexData

[50] A DocumentIndexData is a hash reference.  It contains properties of a document.

[51] If there is a [CODE[map]] key, its value must be an IndexIndexMapping.
Its IndexIndexMappingSegments' first CharacterIndexes are considered as
indexes in this document.

[52] If there is an [CODE[lc_map]] key, its value must be an IndexLCMapping.
Its IndexLCMappingSegments' first CharacterIndexes are considered as
indexes in this document.

[27] If there is a [CODE[url]] key, its value must be a character string.
It is the URL of the document, if known.  It must be an absolute URL.
However, the consumer of the value must not assume that it is an absolute URL.

[55] Applications can use other keys for their own purposes.  They should
make their best effort to avoid key conflicts with future additions
and with other applications.

* DocumentIndexDataSet

[53] A DocumentIndexDataSet is an array reference whose items are DocumentIndexData
or undef.

[54] If the [VAR[i]]-th item in the array reference is a DocumentIndexData,
it is the DocumentIndexData for the document whose DocumentIndex is [VAR[i]].
If the [VAR[i]]-th item in the array reference is undef, or there is no
[VAR[i]]-th item in the array reference, DocumentIndex [VAR[i]] is not
assigned to any document.
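
The lookup described above can be sketched in Python as follows.  Perl hash references are modelled as dicts and undef as [CODE[None]]; the function name, the sample URL, and the reading of [VAR[i]] as the array index are assumptions made for this illustration.

```python
def get_document_index_data(di_data_set, document_index):
    """Return the DocumentIndexData for `document_index`,
    or None if the DocumentIndex is not assigned to any document."""
    if document_index < 0 or document_index >= len(di_data_set):
        return None  # there is no such item in the array
    return di_data_set[document_index]  # may itself be None (undef)

# Hypothetical data set: only DocumentIndex 1 is assigned.
sample_set = [
    None,
    {"url": "https://example.com/a.html", "lc_map": [[0, 1, 1]]},
]
data = get_document_index_data(sample_set, 1)
```

The explicit range check matters in Python because a negative list index would otherwise count from the end, so -1 (the unknown source) must be rejected before indexing.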

* Web IDL Perl binding extensions

[9] If the type of an argument is IndexedStringSegment, run these steps:
[FIG(steps)[
= Let [VAR[value]] be the specified value.
= If [VAR[value]] is not an array reference, throw a [[TypeError]] and abort these steps.
= Let [VAR[new value]] be a new array reference.
= Push ToString([VAR[value]]->[0]) to [VAR[new value]].
= Push ToNumber([VAR[value]]->[1]) to [VAR[new value]].
= Push ToNumber([VAR[value]]->[2]) to [VAR[new value]].
= Return [VAR[new value]].
]FIG]

[10] If the type of an argument is IndexedString, run these steps:
[FIG(steps)[
= Let [VAR[value]] be the specified value.
= If [VAR[value]] is not an array reference, throw a [[TypeError]] and abort these steps.
= If there is an item in [VAR[value]] which is not an array reference, throw a [[TypeError]] and abort these steps.
= Let [VAR[new value]] be a new array reference.
= For each item in [VAR[value]], in order,
== Apply >>9 to the item and push the result into [VAR[new value]].
= Return [VAR[new value]].
]FIG]
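
The two conversions above can be sketched in Python as follows.  Array references are modelled as lists, [CODE[str]] stands in for ToString, and [CODE[to_number]] is a hypothetical stand-in for ToNumber; none of these names are part of manakai.  The sketch assumes every segment has three items.

```python
def to_number(x):
    """Hypothetical stand-in for ToNumber."""
    if isinstance(x, (int, float)):
        return x
    try:
        return int(x)
    except ValueError:
        return float(x)

def convert_segment(value):
    """Convert an argument of type IndexedStringSegment."""
    if not isinstance(value, list):
        raise TypeError("IndexedStringSegment must be an array reference")
    # Build a new array; the caller's array is not reused.
    return [str(value[0]), to_number(value[1]), to_number(value[2])]

def convert_indexed_string(value):
    """Convert an argument of type IndexedString."""
    if not isinstance(value, list):
        raise TypeError("IndexedString must be an array reference")
    if any(not isinstance(item, list) for item in value):
        raise TypeError("every item must be an array reference")
    return [convert_segment(item) for item in value]

converted = convert_indexed_string([["ab", "1", 3]])
```

Because a new array is built at each level, later changes to the caller's arrays do not affect the converted value.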

* DOM [CODE(DOMi)@en[Node]] extensions

[11] Each [CODE(DOMi)@en[Node]] and [CODE(DOMi)@en[Attr]] has an associated
source document index and source character index.  Initially, they are unset.

[FIG[
[PRE(IDL code)[
partial interface Node {
  IndexedStringSegment manakaiGetSourceLocation ();
  void manakaiSetSourceLocation (IndexedStringSegment? new);
};
partial interface Attr {
  IndexedStringSegment manakaiGetSourceLocation ();
  void manakaiSetSourceLocation (IndexedStringSegment? new);
};
]PRE]
]FIG]

[12] The [CODE(DOMm)@en[[[manakaiGetSourceLocation]]]] method [['''MUST''']]
run these steps:
[FIG(steps)[
= If the source document index of the [[context object]] is set,
== Return a new IndexedStringSegment whose character string is the empty string,
DocumentIndex is the source document index of the [[context object]], and
CharacterIndex is the source character index of the [[context object]].
= Otherwise, if the [[context object]] is a [CODE(DOMi)@en[[[CharacterData]]]],
[CODE(DOMi)@en[[[Attr]]]], or [CODE(DOMi)@en[[[AttributeDefinition]]]]
and its data or value consists of one or more
IndexedStringSegments,
== Let [VAR[segment]] be the first IndexedStringSegment of the data or value
of the [[context object]].
== Return a new IndexedStringSegment whose character string is the empty string,
DocumentIndex is the DocumentIndex of [VAR[segment]],
and CharacterIndex is the CharacterIndex of [VAR[segment]].
= Otherwise, return a new IndexedStringSegment whose character string is the empty string,
DocumentIndex is -1, and CharacterIndex is 0.
]FIG]

[13] The [CODE(DOMm)@en[[[manakaiSetSourceLocation]]]] method [['''MUST''']]
run these steps:
[FIG(steps)[
= If the argument is [[null]], unset the source document index 
and the source character index of the [[context object]].
= Otherwise, set the source document index of the [[context object]]
to the DocumentIndex of the argument and the source character index of the 
[[context object]] to the CharacterIndex of the argument.
]FIG]
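
The behavior of the two methods can be sketched with a small Python class.  The class [CODE[SketchNode]], its attribute names, and the snake_case method names are hypothetical; unset values are modelled as [CODE[None]], and an IndexedStringSegment as a three-item list.

```python
class SketchNode:
    """Hypothetical stand-in for a Node or Attr."""

    def __init__(self, indexed_value=None):
        self._source_di = None  # source document index, initially unset
        self._source_ci = None  # source character index, initially unset
        # IndexedString of the data or value for CharacterData-like,
        # Attr-like, or AttributeDefinition-like objects; None otherwise.
        self._indexed_value = indexed_value

    def manakai_get_source_location(self):
        if self._source_di is not None:
            return ["", self._source_di, self._source_ci]
        if self._indexed_value:  # one or more segments
            first = self._indexed_value[0]
            return ["", first[1], first[2]]
        return ["", -1, 0]

    def manakai_set_source_location(self, new):
        if new is None:
            self._source_di = None
            self._source_ci = None
        else:
            # The character string of the argument is not used.
            self._source_di = new[1]
            self._source_ci = new[2]

text = SketchNode([["abc", 1, 10], ["d", 1, 40]])
location = text.manakai_get_source_location()
```

An explicitly set location takes precedence over the location of the first segment, and unsetting it restores the fallback.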

[14] The data of a [CODE(DOMi)@en[[[CharacterData]]]] node,
the value of an [CODE(DOMi)@en[[[AttributeDefinition]]]] node, and
the value of an [CODE(DOMi)@en[[[Attr]]]] object are internally stored
as IndexedString (or equivalent).

[15] When they are to be modified by a method that is not aware of IndexedString,
the modification [['''MUST''']] be performed in a way that does not make the DocumentIndex and CharacterIndex
contained in the IndexedString incorrect.  When it is not desired to preserve
the DocumentIndex and CharacterIndex of affected IndexedStringSegments,
or when a [CODE(DOMi)@en[[[DOMString]]]] is inserted, their DocumentIndex
and CharacterIndex [['''MUST''']] be set to -1 and 0, respectively.
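
For instance, appending a plain string to internally stored data might look like the following Python sketch.  The function name is hypothetical, the internal IndexedString is modelled as a list of three-item lists, and skipping the empty string is a choice made for this illustration only.

```python
def append_dom_string(indexed_string, text):
    """Append a plain string to an IndexedString in place.

    The new segment has no known source, so its DocumentIndex is -1
    and its CharacterIndex is 0.  Existing segments are left untouched,
    so their indexes stay correct."""
    if text != "":  # illustration choice: do not add empty segments
        indexed_string.append([text, -1, 0])
    return indexed_string

data = [["abc", 1, 10]]
append_dom_string(data, "xyz")
```

After the call, the first segment still points at character 10 of document 1, and the appended text is marked as coming from an unknown source.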

[FIG[
[PRE(IDL code)[
partial interface Node {
  IndexedString? manakaiGetIndexedString ();
  void manakaiAppendIndexedString (IndexedString s);
};
partial interface Attr {
  IndexedString? manakaiGetIndexedString ();
  void manakaiAppendIndexedString (IndexedString s);
};
]PRE]
]FIG]

[16] The [CODE(DOMm)@en[[[manakaiGetIndexedString]]]] method
[['''MUST''']] return a new IndexedString which represents the same
string as the [CODE(DOMa)@en[[[textContent]]]] attribute
of the [[context object]].  DocumentIndex and CharacterIndex contained
in it [['''MUST''']] be set to appropriate values.

[17] The [CODE(DOMm)@en[[[manakaiAppendIndexedString]]]] method
[['''MUST''']] act as if the [CODE(DOMm)@en[[[manakaiAppendText]]]]
method is invoked with the character string of the IndexedString argument.
In addition, DocumentIndex and CharacterIndex contained in the IndexedString
argument [['''MUST''']] be used to update the data or value of relevant
objects.

[FIG[
[PRE(IDL code)[
partial interface Element {
  IndexedString? manakaiGetAttributeIndexedStringNS (DOMString? namespace, DOMString localName);
  void manakaiSetAttributeIndexedStringNS (DOMString? namespace, DOMString name, IndexedString value);
};
]PRE]
]FIG]

[20] The [CODE(DOMm)@en[[[manakaiGetAttributeIndexedStringNS]]]] method [['''MUST''']]
act as if the [CODE(DOMm)@en[[[getAttributeNS]]]] method is invoked, except
the method [['''MUST''']] return a new [CODE(DOMi)@en[[[IndexedString]]]]
instead of a [CODE(DOMi)@en[[[DOMString]]]].

[21] The [CODE(DOMm)@en[[[manakaiSetAttributeIndexedStringNS]]]] method [['''MUST''']]
act as if the [CODE(DOMm)@en[[[setAttributeNS]]]] method is invoked
with the same [VAR[namespace]] and [VAR[name]] arguments,
and [VAR[value]] argument which is a [CODE(DOMi)@en[[[DOMString]]]] equivalent to
the character string of the IndexedString argument.
In addition, DocumentIndex and CharacterIndex contained in the IndexedString
argument [['''MUST''']] be used to set the value of the attribute.

;; [18] Exactly how DocumentIndex and CharacterIndex are handled by the various
methods defined here or by the DOM Standard is a quality of implementation issue.