<html xmlns="http://www.w3.org/1999/xhtml"><head></head><body><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="18" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[18]</anchor-end> <dfn>universalchardet</dfn> は、<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字</anchor>の出現頻度の統計データを元に<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">バイト列</anchor>の<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字コード</anchor>を<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">判定<title xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:">文字コード判定</title></anchor>する手法とその実装です。</p><section><h1>原典</h1><refs xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><ul xmlns="http://www.w3.org/1999/xhtml"><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="1" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[1]</anchor-end>
<dfn><cite>A composite approach to language/encoding detection</cite></dfn> (<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Shanjian Li</anchor> 著, <time>2007-09-08 17:38:48 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html">http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[29]</anchor-end> 
移転確認 <time>2019-11-11T09:04:10.300Z</time></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="30" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[30]</anchor-end> 
<cite>A composite approach to language/encoding detection</cite>, <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Shanjian Li</anchor>, <time>2019-11-11 18:03:59 +09:00</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://www-archive.mozilla.org/projects/intl/universalcharsetdetection">https://www-archive.mozilla.org/projects/intl/universalcharsetdetection</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="31" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[31]</anchor-end> 
<cite>A composite approach to language/encoding detection</cite>, <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Shanjian Li</anchor>, <time>2025-05-17T08:51:04.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://www-archive.mozilla.org/projects/intl/universalcharsetdetection">https://www-archive.mozilla.org/projects/intl/universalcharsetdetection</anchor-external></li></ul></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="41" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[41]</anchor-end> 
<cite>How to build standalone universal charset detector from Mozilla source</cite>, <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Frank Yung-Fong Tang</anchor>, <time>2025-05-17T09:02:46.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://www-archive.mozilla.org/projects/intl/detectorsrc">https://www-archive.mozilla.org/projects/intl/detectorsrc</anchor-external></li></ul></refs><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[32]</anchor-end> 
当時は正常に表示されていたはずなのに現在 <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="31" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;31</anchor-internal> は<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字化け</anchor>しています。なんという皮肉でしょうか。
現在 <code>Content-Type:</code> が <code>text/html; charset=UTF-8</code>
とされていますが、実際には <code>windows-1252</code> で<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">符号化</anchor>されているようです。
<time>2025-05-17T08:52:58.800Z</time></p></section><section><h1>実装</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="17" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[17]</anchor-end> 本家 <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Mozilla</anchor> の <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">C++</anchor> の実装の他、各言語への移植版が存在しています。</p><refs xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><ul xmlns="http://www.w3.org/1999/xhtml"><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="36" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[36]</anchor-end> chardet 本家<ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="42" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[42]</anchor-end> 
<cite xml:lang="en">gecko-dev/intl/chardet at 24978cfe7cef443392a56e0958a9d63a6cb93219 · mozilla/gecko-dev · GitHub</cite>, <time>2025-05-17T09:11:30.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/mozilla/gecko-dev/tree/24978cfe7cef443392a56e0958a9d63a6cb93219/intl/chardet">https://github.com/mozilla/gecko-dev/tree/24978cfe7cef443392a56e0958a9d63a6cb93219/intl/chardet</anchor-external></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="34" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[34]</anchor-end> <cite>Java port of Mozilla's Automatic Charset Detection algorithm.</cite>, <time>2010-11-24T20:42:37.000Z</time>, <time>2025-05-17T08:55:03.700Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://jchardet.sourceforge.net/">https://jchardet.sourceforge.net/</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="35" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[35]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="36" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;36</anchor-internal> の移植</li></ul></li></ul></refs><refs xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><ul xmlns="http://www.w3.org/1999/xhtml"><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="33" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[33]</anchor-end> universalchardet 本家<ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="43" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[43]</anchor-end> 
<cite xml:lang="en">gecko-dev/extensions/universalchardet at 67d9eda26a48c9024492de5931478fe14daa2cd1 · mozilla/gecko-dev · GitHub</cite>, <time>2025-05-17T09:24:08.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/mozilla/gecko-dev/tree/67d9eda26a48c9024492de5931478fe14daa2cd1/extensions/universalchardet">https://github.com/mozilla/gecko-dev/tree/67d9eda26a48c9024492de5931478fe14daa2cd1/extensions/universalchardet</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="79" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[79]</anchor-end> 
<cite xml:lang="en">#115114 autodetect universal detects french as Central European (ISO-… · mozilla/gecko-dev@3f7e5bd · GitHub</cite>, <time>2025-05-18T05:11:26.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/mozilla/gecko-dev/commit/3f7e5bdeed3308eed1bb85a2f417e86ad8bab417">https://github.com/mozilla/gecko-dev/commit/3f7e5bdeed3308eed1bb85a2f417e86ad8bab417</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="84" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[84]</anchor-end> 
<cite xml:lang="en">183354 patch by alexey@optus.net (Alexey Chernyak) r=ftang sr=blizzar… · mozilla/gecko-dev@68332f7 · GitHub</cite>, <time>2025-10-18T09:51:28.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/mozilla/gecko-dev/commit/68332f717f14e8f2467ca4f2c521ed8fe6eff71d">https://github.com/mozilla/gecko-dev/commit/68332f717f14e8f2467ca4f2c521ed8fe6eff71d</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="45" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[45]</anchor-end> 
<time>平成23(2011)年<attrvalue xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:">2011</attrvalue></time>に
<code>UTF-32BE</code>,
<code>UTF-32LE</code>,
<code>X-ISO-10646-UCS-4-3412</code>,
<code>X-ISO-10646-UCS-4-2143</code>
が削除されるなど、
<cite>HTML Standard</cite> / <cite>Encoding Standard</cite>
方面の動きと連動した機能削減が行われている。</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="21" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[21]</anchor-end> <cite xml:lang="en-US">mozilla-central: files</cite> (<time>2012-05-26 11:36:03 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://hg.mozilla.org/mozilla-central/file/tip/extensions/universalchardet/">http://hg.mozilla.org/mozilla-central/file/tip/extensions/universalchardet/</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="44" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[44]</anchor-end> 現在では新実装 (<anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="24" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;24</anchor-internal>) への置き換えのため中身がなくなっている
<time>2025-05-17T09:24:56.800Z</time></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="2" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[2]</anchor-end> <del><cite>seamonkey/extensions/universalchardet/src/base/</cite> (<time>2007-04-24 00:11:24 +09:00</time> 版)<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://lxr.mozilla.org/seamonkey/source/extensions/universalchardet/src/base/">http://lxr.mozilla.org/seamonkey/source/extensions/universalchardet/src/base/</anchor-external></del></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="37" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[37]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="36" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;36</anchor-internal> の置き換え</li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="3" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[3]</anchor-end>
<cite xml:lang="en">Universal Encoding Detector: character encoding auto-detection in Python</cite> (<time>2007-11-18 20:19:09 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://chardet.feedparser.org/">http://chardet.feedparser.org/</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="4" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[4]</anchor-end>
<cite xml:lang="ja">Universalchardet - やる気向上作戦</cite> (<time>2009-02-11 14:40:44 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.void.in/wiki/Universalchardet">http://www.void.in/wiki/Universalchardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="52" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[52]</anchor-end> 消滅確認 <time>2025-05-18T03:27:01.600Z</time></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="53" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[53]</anchor-end> 
<cite>やる気向上作戦: universalchardet</cite>, <time>2025-05-18T03:25:57.000Z</time>, <time>2006-05-05T13:36:00.469Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://web.archive.org/web/20060505133528/http://www.void.in/wiki/universalchardet">https://web.archive.org/web/20060505133528/http://www.void.in/wiki/universalchardet</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="54" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[54]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="33" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;33</anchor-internal> の派生</li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="5" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[5]</anchor-end>
<cite xml:lang="ja">RubyForge: Universal Encoding Detector: Project Info</cite> (<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Bruce Williams (http://codefluency.com), for Ruby Central (http://rubycentral.org)</anchor> 著, <time>2009-03-15 11:44:11 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://rubyforge.org/projects/chardet">http://rubyforge.org/projects/chardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="55" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[55]</anchor-end> 消滅確認 <time>2025-05-18T03:36:03.200Z</time></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="56" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[56]</anchor-end> 
<cite xml:lang="en">RubyForge: Universal Encoding Detector: Project Filelist</cite>, <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Bruce Williams (http://codefluency.com), for Ruby Central (http://rubycentral.org)</anchor>, <time>2025-05-18T03:34:46.000Z</time>, <time>2007-12-24T18:21:12.435Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://web.archive.org/web/20071224181754/http://rubyforge.org/frs/?group_id=1506&amp;release_id=4726">https://web.archive.org/web/20071224181754/http://rubyforge.org/frs/?group_id=1506&amp;release_id=4726</anchor-external></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="7" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[7]</anchor-end>
<cite>自娱自乐的Emacser: 查看 universalchardet 高频字表的 perl 程序</cite> (<time>2009-02-25 12:16:07 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://wenbinhome.blogspot.com/2006/12/universalchardet-perl.html">http://wenbinhome.blogspot.com/2006/12/universalchardet-perl.html</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="11" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[11]</anchor-end> <cite>juniversalchardet - Java port of universalchardet - Google Project Hosting</cite>
( (<time>2012-05-26 11:25:24 +09:00</time> 版))
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://code.google.com/p/juniversalchardet/">http://code.google.com/p/juniversalchardet/</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="38" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[38]</anchor-end> <cite xml:lang="en">Google Code Archive - Long-term storage for Google Code Project Hosting.</cite>, <time>2025-05-17T08:59:51.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://code.google.com/archive/p/juniversalchardet/">https://code.google.com/archive/p/juniversalchardet/</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="39" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[39]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="4" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;4</anchor-internal> の派生</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="50" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[50]</anchor-end> 
<cite xml:lang="en">GitHub - seratch/juniversalchardet: Automatically exported from code.google.com/p/juniversalchardet</cite>, <time>2025-05-18T03:23:09.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/seratch/juniversalchardet/tree/master">https://github.com/seratch/juniversalchardet/tree/master</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="51" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[51]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="11" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;11</anchor-internal> そのまま</li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="48" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[48]</anchor-end> 
<cite xml:lang="en">GitHub - albfernandez/juniversalchardet: Originally exported from code.google.com/p/juniversalchardet</cite>, <time>2025-05-18T03:13:10.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/albfernandez/juniversalchardet">https://github.com/albfernandez/juniversalchardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="49" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[49]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="11" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;11</anchor-internal> の派生版</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="65" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[65]</anchor-end> 
<cite xml:lang="en">Support for US-ASCII · albfernandez/juniversalchardet@fcf1898 · GitHub</cite>, <time>2025-05-18T04:29:38.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/albfernandez/juniversalchardet/commit/fcf1898ab7bae3f7308d991e225c3016e460b352">https://github.com/albfernandez/juniversalchardet/commit/fcf1898ab7bae3f7308d991e225c3016e460b352</anchor-external></li></ul></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="13" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[13]</anchor-end> <cite>uchardet - universalchardet - Google Project Hosting</cite>
( (<time>2012-05-26 11:26:23 +09:00</time> 版))
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://code.google.com/p/uchardet/">http://code.google.com/p/uchardet/</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="57" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[57]</anchor-end> <cite xml:lang="en">GitHub - BYVoid/uchardet: An encoding detector library ported from Mozilla</cite>, <time>2025-05-18T04:17:34.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/BYVoid/uchardet">https://github.com/BYVoid/uchardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="58" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[58]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="13" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;13</anchor-internal> から移転</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="86" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[86]</anchor-end> 
<cite xml:lang="en">Single Byte charsets: high ctrl character ratio lowers confidence. · BYVoid/uchardet@55b4f23 · GitHub</cite>, <time>2025-10-19T13:38:53.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/BYVoid/uchardet/commit/55b4f23971db61">https://github.com/BYVoid/uchardet/commit/55b4f23971db61</anchor-external></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="59" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[59]</anchor-end> <cite>uchardet</cite>, <time>2022-12-08T21:31:14.000Z</time>, <time>2025-05-18T04:18:19.511Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://www.freedesktop.org/wiki/Software/uchardet/">https://www.freedesktop.org/wiki/Software/uchardet/</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="60" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[60]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="57" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;57</anchor-internal> から移転</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="62" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[62]</anchor-end> 
<cite xml:lang="ja">uchardet / uchardet · GitLab</cite>, <time>2025-05-18T04:21:17.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://gitlab.freedesktop.org/uchardet/uchardet">https://gitlab.freedesktop.org/uchardet/uchardet</anchor-external></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="61" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[61]</anchor-end> 独自の改良多数</li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="66" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[66]</anchor-end> 
<cite xml:lang="en">GitHub - PyYoshi/uchardet: uchardet is an encoding detector library, which takes a sequence of bytes in an unknown character encoding and attempts to determine the encoding of the text. Returned encoding names are iconv-compatible.</cite>, <time>2025-05-18T04:31:18.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/PyYoshi/uchardet">https://github.com/PyYoshi/uchardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="67" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[67]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="59" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;59</anchor-internal> 旧所在からの派生</li></ul></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="46" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[46]</anchor-end> 
<cite xml:lang="en">GitHub - errepi/ude: A C# port of Mozilla Universal Charset Detector.</cite>, <time>2025-05-18T03:04:53.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/errepi/ude">https://github.com/errepi/ude</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="47" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[47]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="11" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;11</anchor-internal> の移植</li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="82" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[82]</anchor-end> <cite xml:lang="en">GitHub - CharsetDetector/UTF-unknown: Character set detector build in C# - .NET 5+, .NET Core 2+, .NET standard 1+ &amp; .NET 4+</cite>, <time>2025-05-19T12:49:25.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/CharsetDetector/UTF-unknown">https://github.com/CharsetDetector/UTF-unknown</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="83" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[83]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="46" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;46</anchor-internal>, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="59" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;59</anchor-internal> からの派生</li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="14" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[14]</anchor-end> <cite>nuniversalchardet - C# Port of UniversalCharDet - Google Project Hosting</cite>
( (<time>2012-05-26 11:26:34 +09:00</time> 版))
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://code.google.com/p/nuniversalchardet/">http://code.google.com/p/nuniversalchardet/</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="22" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[22]</anchor-end> <cite>perl-web-encodings/lib/Web/Encoding/UnivCharDet.pod at master · manakai/perl-web-encodings · GitHub</cite> (<time>2013-05-03 09:26:53 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/manakai/perl-web-encodings/blob/master/lib/Web/Encoding/UnivCharDet.pod">https://github.com/manakai/perl-web-encodings/blob/master/lib/Web/Encoding/UnivCharDet.pod</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="40" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[40]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="33" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;33</anchor-internal> の移植</li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="12" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[12]</anchor-end> 
<cite xml:lang="en">GitHub - dcramer/chardet: Forked version of chardet</cite>, <time>2025-05-17T04:07:06.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/dcramer/chardet">https://github.com/dcramer/chardet</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="23" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[23]</anchor-end> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/sigmavirus24/charade">https://github.com/sigmavirus24/charade</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="15" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[15]</anchor-end> 
<cite xml:lang="en">GitHub - sv24-archive/charade: NO LONGER MAINTAINED. USE chardet/chardet. Fork of chardet to support Python 2 and 3 in one code base.</cite>, <time>2025-05-17T04:07:18.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/sv24-archive/charade">https://github.com/sv24-archive/charade</anchor-external></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="16" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[16]</anchor-end> 
<cite xml:lang="en">GitHub - chardet/chardet: Python character encoding detector</cite>, <time>2025-05-17T04:07:28.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/chardet/chardet">https://github.com/chardet/chardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="85" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[85]</anchor-end> 
<cite xml:lang="en">Support CP949, fixes #10 by puzzlet · Pull Request #13 · sv24-archive/charade · GitHub</cite>, <time>2025-10-19T10:20:29.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/sv24-archive/charade/pull/13/files">https://github.com/sv24-archive/charade/pull/13/files</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="76" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[76]</anchor-end> 
<cite xml:lang="en">Added a prober for MacRoman encoding. · chardet/chardet@c292b52 · GitHub</cite>, <time>2025-05-18T05:07:29.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/chardet/chardet/commit/c292b52a97e57c95429ef559af36845019b88b33">https://github.com/chardet/chardet/commit/c292b52a97e57c95429ef559af36845019b88b33</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="77" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[77]</anchor-end> 
<cite xml:lang="en">adding capital letter sharp S and ISO-8859-15 (#222) · chardet/chardet@bba08ba · GitHub</cite>, <time>2025-05-18T05:08:03.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/chardet/chardet/commit/bba08ba5d656e38282a365af44e477b6f5ee74c7">https://github.com/chardet/chardet/commit/bba08ba5d656e38282a365af44e477b6f5ee74c7</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="78" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[78]</anchor-end> 
<cite xml:lang="en">add prober for Johab Korean (#207) · chardet/chardet@6282d49 · GitHub</cite>, <time>2025-05-18T05:09:14.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/chardet/chardet/commit/6282d499bf6400ba1845119265a3d5e57c8b1fa1">https://github.com/chardet/chardet/commit/6282d499bf6400ba1845119265a3d5e57c8b1fa1</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="26" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[26]</anchor-end> 
<cite xml:lang="en">Detection of windows-1250 · Issue #87 · chardet/chardet</cite>, <time>2025-05-17T04:15:34.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/chardet/chardet/issues/87">https://github.com/chardet/chardet/issues/87</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="80" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[80]</anchor-end> 
<cite xml:lang="en">1 sentence utf-8 detected as Windows-1252 · Issue #185 · chardet/chardet</cite>, <time>2025-05-18T05:12:12.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/chardet/chardet/issues/185">https://github.com/chardet/chardet/issues/185</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="27" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[27]</anchor-end> 
<cite xml:lang="en"><strong>[</strong>WIP<strong>]</strong> Retrain SBCS Models and some refactoring by dan-blanchard · Pull Request #99 · chardet/chardet · GitHub</cite>, <time>2025-05-17T04:16:05.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/chardet/chardet/pull/99">https://github.com/chardet/chardet/pull/99</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="81" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[81]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="57" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;57</anchor-internal> の影響</li></ul></li></ul></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="8" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[8]</anchor-end> 
<cite xml:lang="en">GitHub - aadsm/jschardet: Character encoding auto-detection in JavaScript (port of python's chardet)</cite>, <time>2025-05-16T10:08:45.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/aadsm/jschardet">https://github.com/aadsm/jschardet</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="63" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[63]</anchor-end> 
<cite>CRAN: Package uchardet</cite>, <time>2025-05-18T04:02:20.000Z</time>, <time>2025-05-18T04:28:17.163Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://cran.r-project.org/web/packages/uchardet/index.html">https://cran.r-project.org/web/packages/uchardet/index.html</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="70" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[70]</anchor-end> 
<cite xml:lang="en">Google Code Archive - Long-term storage for Google Code Project Hosting.</cite>, <time>2025-05-18T04:55:39.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://code.google.com/archive/p/nuniversalchardet/">https://code.google.com/archive/p/nuniversalchardet/</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="72" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[72]</anchor-end> 
<cite xml:lang="en">GitHub - thuleqaid/rust-chardet: rust version of chardet</cite>, <time>2025-05-18T04:59:30.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/thuleqaid/rust-chardet">https://github.com/thuleqaid/rust-chardet</anchor-external></li><li>
<anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="75" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[75]</anchor-end> 
<cite xml:lang="en-US">Encode::Detect::Detector - Detects the encoding of data - metacpan.org</cite>, <time>2025-05-18T05:05:23.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://metacpan.org/release/JGMYERS/Encode-Detect-1.01/view/Detector.pm">https://metacpan.org/release/JGMYERS/Encode-Detect-1.01/view/Detector.pm</anchor-external></li><li>
<anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="73" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[73]</anchor-end> 
<cite xml:lang="en">GitHub - Joungkyun/libchardet: libchardet - Mozilla's Universal Charset Detector C/C++ API</cite>, <time>2025-05-18T05:02:38.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/Joungkyun/libchardet">https://github.com/Joungkyun/libchardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="74" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[74]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="33" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;33</anchor-internal>, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="75" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;75</anchor-internal>, <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="57" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;57</anchor-internal> の派生</li></ul></li></ul></refs><refs xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><ul xmlns="http://www.w3.org/1999/xhtml"><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="24" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[24]</anchor-end> 
<cite xml:lang="en">GitHub - hsivonen/chardetng: A character encoding detector for legacy Web content.</cite>, <time>2025-05-17T04:08:29.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/hsivonen/chardetng">https://github.com/hsivonen/chardetng</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="64" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[64]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="33" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;33</anchor-internal> の置き換え</li></ul></li></ul></refs><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="20" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[20]</anchor-end> 次に示すのは独立した実装ではなく、他の実装のラッパーとして機能するものです。</p><refs xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"><ul xmlns="http://www.w3.org/1999/xhtml"><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="6" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[6]</anchor-end>
<cite xml:lang="en">Whatpm — Perl Modules for Web Hypertext Application Technologies (beta)</cite> (<time>2009-01-12 11:36:16 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://suika.fam.cx/www/markup/html/whatpm/readme#whatpm-charset-universalchardet">http://suika.fam.cx/www/markup/html/whatpm/readme#whatpm-charset-universalchardet</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="19" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[19]</anchor-end> <cite>xyzzy.lisp.universalchardet/site-lisp/uchardet at master · southly/xyzzy.lisp.universalchardet · GitHub</cite> (<time>2012-05-26 11:32:30 +09:00</time> 版) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/southly/xyzzy.lisp.universalchardet/tree/master/site-lisp/uchardet">https://github.com/southly/xyzzy.lisp.universalchardet/tree/master/site-lisp/uchardet</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="68" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[68]</anchor-end> <cite xml:lang="en">GitHub - PyYoshi/cChardet: universal character encoding detector</cite>, <time>2025-05-18T04:33:26.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/PyYoshi/cChardet">https://github.com/PyYoshi/cChardet</anchor-external><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="69" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[69]</anchor-end> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="66" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;66</anchor-internal> 用</li></ul></li><li>
<anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="71" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[71]</anchor-end> 
<cite xml:lang="en">GitHub - emk/rust-uchardet: Rust wrapper for the uchardet character encoding detection library</cite>, <time>2025-05-18T04:58:52.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/emk/rust-uchardet">https://github.com/emk/rust-uchardet</anchor-external></li></ul></refs></section><section><h1>標準化</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="10" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[10]</anchor-end> UNIVCHARDET 自体は1実装に過ぎず、何らかの標準でも、標準によって義務付けられた実装でもありませんが、
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Web Applications 1.0</anchor> は<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字符号化</anchor>の決定算法の中で出現頻度分析に基づく推定の利用を認めており、
その具体例として <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="1" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;1</anchor-internal> を挙げています。</p></section><section><h1>判定算法で失敗する例</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="9" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[9]</anchor-end> 
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">HTML documents misinterpreted by charset sniffer</anchor> を参照してください。</p></section><section><h1>メモ</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="25" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[25]</anchor-end> 
<anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="24" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;24</anchor-internal> は諸々改善されていることがドキュメントにあるが、近年の <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Webブラウザー</anchor>の、
旧来<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">文字コード</anchor>を切り捨てていく方向に則ってるのが不安要素だなあ。</p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="28" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[28]</anchor-end> 
<anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="27" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;27</anchor-internal> は1バイト符号化の判定に大きな変更を加えている (従来未対応の文字コードへの対応を含む。)
が、マージされずに長年放置されている。</p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="87" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[87]</anchor-end> 
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">ISO-8859-2</anchor> / <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Windows-1250</anchor> の判定問題
<anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="79" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;79</anchor-internal> <anchor-internal xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="26" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">&gt;&gt;26</anchor-internal>
<sw-see xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:"> <anchor>文字コードの判定</anchor> </sw-see></p></section></body></html>