<html xmlns="http://www.w3.org/1999/xhtml"><head></head><body><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="1" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[1]</anchor-end> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">コーパスのタグセット</anchor>各種の 
<dfn><code>g</code></dfn> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">要素</anchor>は、
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">外字</anchor>を表します。</p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="2" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[2]</anchor-end> 
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">コーパス</anchor>の定める<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字集合</anchor>に含まれない<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字</anchor>は、
<code>g</code>
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">要素</anchor>と<code>〓</code>または近い<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字</anchor>で表します。
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">属性</anchor>に<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字</anchor>の情報を記述します。
<src xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;29</anchor-internal> #page=2, #page=18, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;32</anchor-internal> #page=7, #page=29</src></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="3" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[3]</anchor-end> 
また、<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">敬意欠字</anchor>の<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">空白</anchor>は <code>g</code> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">要素</anchor>とします。
<src xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;29</anchor-internal> #page=3, #page=18, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;32</anchor-internal> #page=8, #page=29</src></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="4" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[4]</anchor-end> 
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">必須</anchor>の <dfn><code>type</code></dfn> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">属性</anchor>で性質を記述します。
<src xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;29</anchor-internal> #page=18, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;32</anchor-internal> #page=29</src></p><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="5" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[5]</anchor-end> <dfn><code>外字</code></dfn>
<src xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;29</anchor-internal> #page=18, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;32</anchor-internal> #page=29</src>
は、<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">コーパス</anchor>の<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字集合</anchor>で表せないことを意味します。</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="6" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[6]</anchor-end> <dfn><code>包摂</code></dfn>
<src xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;29</anchor-internal> #page=18</src>
は、
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">JIS X 0213</anchor> の<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">包摂規準</anchor>に従うと表せないものの、
拡張包摂規準 (<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">コーパス</anchor>の<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">包摂規準</anchor>) で <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">JIS X 0213</anchor>
の<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字</anchor>に<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">縮退</anchor>されたものを意味します。</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="7" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[7]</anchor-end> <dfn><code>敬意欠字</code></dfn>
<src xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;29</anchor-internal> #page=18, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;32</anchor-internal> #page=29</src>
は、
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">敬意欠字</anchor>を表します。</li></ul><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="8" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[8]</anchor-end> 
<dfn><code>ref</code></dfn> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">属性</anchor>で<code>外字</code>を説明します。
<code>U+4E00</code> のような <code>U+</code> 形式で <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Unicode符号位置</anchor>を記述したり、
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">自然言語</anchor>で説明したりできます。
<src xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;29</anchor-internal> #page=18, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;32</anchor-internal> #page=29</src></p><comment-p xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="9" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[9]</anchor-end> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">XML文書</anchor>なので本来<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Unicode文字</anchor>を記述する構文は不要なのですが、
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">コーパス</anchor>が利用する<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字</anchor>の<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">集合</anchor>を定め、それ以外は<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">外字</anchor>扱いにする、
という方針のためこのような仕様になっています。</comment-p><refs xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><ul xmlns="http://www.w3.org/1999/xhtml"><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[29]</anchor-end> <cite><sw-l xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">『明六雑誌コーパス』の仕様</sw-l></cite>,
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">近藤明日子</anchor>,
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">田中牧郎</anchor>,
<time>2023-11-26T08:07:49.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://web.archive.org/web/20171116111759/http://pj.ninjal.ac.jp/corpus_center/cmj/doc/07kondo.pdf">https://web.archive.org/web/20171116111759/http://pj.ninjal.ac.jp/corpus_center/cmj/doc/07kondo.pdf</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="31" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[31]</anchor-end> <cite xml:lang="ja">国立国語研究所学術情報リポジトリ</cite>, <time>2023-11-26T08:12:17.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://repository.ninjal.ac.jp/records/3302">https://repository.ninjal.ac.jp/records/3302</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[32]</anchor-end> 
<cite><sw-l xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">『国民之友コーパス』解説書<sw-br></sw-br>第1.1 版</sw-l></cite>,
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><sw-l>近藤明日子</sw-l></anchor>,
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><sw-l>2014</sw-l></anchor>,
<code>kokumin_manual_v1_1.pdf</code></li></ul></li></ul></refs><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="10" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[10]</anchor-end> 
関連: <code>missingCharacter</code></p></body></html>