<html xmlns="http://www.w3.org/1999/xhtml"><head></head><body><section><h1>UTF-8-encoded byte sequences</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="9" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[9]</anchor-end> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Perl</anchor> can treat a <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">byte sequence</anchor> as a <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">character string</anchor>. Doing so involves an implicit <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">type conversion</anchor> equivalent to <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">isomorphic decoding</anchor>.
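</p><p>The conversion behind entry [9] can be sketched outside Perl. The Python sketch below is an illustration only, not Perl's actual implementation: it assumes that the isomorphic mapping sends each byte value 0x00-0xFF to the code point with the same number (the mapping Python calls <code>latin-1</code>), and the two helper names are hypothetical.</p>

```python
def isomorphic_decode(data):
    """Map each byte to the code point with the same value (0x00-0xFF)."""
    return "".join(chr(b) for b in data)


def isomorphic_encode(text):
    """Inverse mapping; raises ValueError if a code point is above 0xFF."""
    return bytes(ord(ch) for ch in text)


# Arbitrary bytes (here, the UTF-8 encoding of U+3042) viewed as a string.
raw = b"\xe3\x81\x82"
as_text = isomorphic_decode(raw)

# Each byte became exactly one character in the Latin1 range.
assert as_text == "\u00e3\u0081\u0082"
assert as_text == raw.decode("latin-1")

# Storing that string as UTF-8 gives a UTF-8-encoded byte sequence:
# every byte of 0x80 or above grows to two bytes.
stored = as_text.encode("utf-8")
assert stored == b"\xc3\xa3\xc2\x81\xc2\x82"

# Reversing both steps recovers the original bytes.
assert isomorphic_encode(stored.decode("utf-8")) == raw
```

<p>Under these assumptions the round trip is lossless, because every byte value corresponds to exactly one code point.</p><p>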
<sw-see xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"> <anchor>utf8::upgrade</anchor> </sw-see></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="1" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[1]</anchor-end> 
One way to handle a <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">byte sequence</anchor> in <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">JSON</anchor> is to treat each byte as the corresponding <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Latin1</anchor> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">character</anchor>.
This operation is equivalent to <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">isomorphic decoding</anchor>.</p><p><sw-see xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"> <anchor>Bytes as characters</anchor> </sw-see></p></section><section><h1>UTF-8-encoded Latin1</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="4" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[4]</anchor-end> <cite xml:lang="en">GitHub - grantm/encoding-fixlatin: CPAN module: Fixes Latin-1 and CP1252 characters in UTF8 data</cite>, <time>2025-06-25T08:12:05.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/grantm/encoding-fixlatin">https://github.com/grantm/encoding-fixlatin</anchor-external></p></section><section><h1>UTF-8-encoded TIS 620</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="2" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[2]</anchor-end> 
<cite xml:lang="en-GB">thaiconv | Lyndon Hill</cite>, <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Lyndon Hill</anchor>, <time>2025-04-16T20:26:33.000Z</time>, <time>2025-08-02T09:22:11.073Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://www.lyndonhill.com/projects/thaiconv.html">https://www.lyndonhill.com/projects/thaiconv.html</anchor-external></p><blockquote><dl><dt><b>Cross coded UTF-8</b></dt><dd>TIS-620 that has been converted to UTF-8 Latin1 (0xA0-0xF0). For example, the Thai character that has the value 160 in TIS-620 may have the Latin representation é, this character gets converted to the Unicode for é. This mode is likely to be converted correctly only if the cross coding and decoding occur in the same locality.</dd></dl></blockquote></section><section><h1>UTF-8-encoded EUC-JP</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="23" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[23]</anchor-end> <cite xml:lang="ja">¥«¥¿¥í¥°Æâ¸¡º÷¡Ã¶õÄ´À½ÉÊ¥«¥¿¥í¥°¡Ã¥À¥¤¥­¥ó¹©¶È³ô¼°²ñ¼Ò¡Ã</cite>, <time>2025-11-23T08:22:31.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://web.archive.org/web/20251123062132id_/https://ec.daikinaircon.com/cgi-bin/ecatalog/fulltextsearch.cgi?order=new&amp;kwd=%3F%3F%3F%3Fp&amp;old=&amp;pg=70">https://web.archive.org/web/20251123062132id_/https://ec.daikinaircon.com/cgi-bin/ecatalog/fulltextsearch.cgi?order=new&amp;kwd=%3F%3F%3F%3Fp&amp;old=&amp;pg=70</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="24" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[24]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="23" 
xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;23</anchor-internal> is <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">EUC-JP</anchor> that has been encoded as <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">UTF-8</anchor>.</p></section><section><h1><code>UTF8CP1252</code></h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="6" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[6]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="5" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;5</anchor-internal></p></section><section><h1><code>UTF8UTF8</code></h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="5" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[5]</anchor-end> <cite xml:lang="en"><anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">compact_enc_det</anchor>/compact_enc_det/compact_enc_det.cc at master · google/compact_enc_det · GitHub</cite>, <time>2025-11-24T04:01:07.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/google/compact_enc_det/blob/master/compact_enc_det/compact_enc_det.cc#L50">https://github.com/google/compact_enc_det/blob/master/compact_enc_det/compact_enc_det.cc#L50</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="205" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[205]</anchor-end> 
<cite xml:lang="en"><anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">compact_enc_det</anchor>/util/encodings/encodings.pb.h at master · google/compact_enc_det · GitHub</cite>, <time>2025-05-20T15:13:42.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/google/compact_enc_det/blob/master/util/encodings/encodings.pb.h#L150">https://github.com/google/compact_enc_det/blob/master/util/encodings/encodings.pb.h#L150</anchor-external></p><blockquote><pre>  // Some external vendors make the common input error of
  // converting MSFT_CP1252 to UTF8 *twice*. No output conversion needed.
  UTF8UTF8             = 63,</pre></blockquote><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="212" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[212]</anchor-end> 
<cite xml:lang="en-US">Encode::DoubleEncodedUTF8 - Fix double encoded UTF-8 bytes to the correct one - metacpan.org</cite>, <time>2025-06-25T08:08:29.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://metacpan.org/pod/Encode::DoubleEncodedUTF8">https://metacpan.org/pod/Encode::DoubleEncodedUTF8</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="214" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[214]</anchor-end> <cite>Transliteration Tools for Indian Languages | ashishware.com</cite>, <time>2025-02-23T07:19:42.000Z</time>, <time>2025-07-13T08:53:33.598Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://ashishware.com/2006/06/25/Transl.shtml/">https://ashishware.com/2006/06/25/Transl.shtml/</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="215" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[215]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="214" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;214</anchor-internal> The first half of the body text is genuine <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">UTF-8</anchor>. The second half may be <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">UTF-8</anchor> that was <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">UTF-8</anchor>-encoded a second time.</p></section><section><h1>Font-dependent encoding</h1><p><sw-see xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"> <anchor>Font-dependent encoding</anchor> </sw-see></p></section><section><h1>Related</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="3" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[3]</anchor-end> For cases that involve data corruption rather than a plain conversion to <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">UTF-8</anchor>, see <anchor 
xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Repairing character encodings</anchor>.</p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="213" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[213]</anchor-end> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Repairing character encodings</anchor>,
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Isomorphic encoding</anchor>,
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Characters and bytes encoded as characters</anchor></p></section><section><h1>Notes</h1></section></body></html>