<html xmlns="http://www.w3.org/1999/xhtml"><head></head><body><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="23" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[23]</anchor-end> The <dfn><code>/robots.txt</code></dfn> of a <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Web server</anchor> is a
<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">file</anchor> that describes <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">instructions to the robots</anchor> of <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">search engines</anchor> about <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">crawling</anchor> that <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Web server</anchor>.</p><section><h1>Specifications</h1><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="2" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[2]</anchor-end> <cite>robotstxt.org</cite> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.robotstxt.org/">http://www.robotstxt.org/</anchor-external><ul><li>[ROBOTS94] <cite>A Standard for Robot Exclusion</cite> 
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.robotstxt.org/wc/norobots.html">http://www.robotstxt.org/wc/norobots.html</anchor-external></li><li>[ROBOTS97] <cite>A Method for Web Robots Control</cite> 
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.robotstxt.org/wc/norobots-rfc.html">http://www.robotstxt.org/wc/norobots-rfc.html</anchor-external></li></ul></li><li><anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">HTML 4</anchor><ul><li><cite>The robots.txt file</cite>
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="IW" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="HTML4:&quot;appendix/notes.html#h-B.4.1.1&quot;">IW:HTML4:&quot;appendix/notes.html#h-B.4.1.1&quot;</anchor-external></li></ul></li></ul><p>[ROBOTS94] is the consensus reached in 1994. An <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Internet Draft</anchor> [ROBOTS97]
was subsequently written in 1997 but was never completed. HTML 4 explains the file in Appendix B (informative)
but does not normatively specify it.</p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="3" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[3]</anchor-end>
The explanation in HTML 4.0 contained many errors,
which were corrected in HTML 4.01.</p><p>HTML 4.01
<csection xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:">A.1.2 Errors that were corrected</csection>
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="IW" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="HTML4:&quot;appendix/changes.html#h-A.1.2&quot;">IW:HTML4:&quot;appendix/changes.html#h-A.1.2&quot;</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="4" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[4]</anchor-end>
<cite>sitemaps.org - Protocol</cite> (<time>2007-04-11 20:52:23 +09:00</time> version) <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.sitemaps.org/protocol.html#submit_robots">http://www.sitemaps.org/protocol.html#submit_robots</anchor-external>
(<anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Anonymous</anchor> <weak xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">2007-04-12 11:00:38 +00:00</weak>)</p></section><section><h1>Protocol</h1><ul><li><code>Crawl-Delay:</code></li></ul></section><section><h1>Implementations</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="21" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[21]</anchor-end> When saving multiple <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">URL</anchor>s, <code>wget</code> obeys <code>robots.txt</code> by default,
but this can be disabled by changing its configuration (for example, <code>robots = off</code> in <code>.wgetrc</code>).</p><comment-p xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:"><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="22" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[22]</anchor-end> Because <code xmlns="http://www.w3.org/1999/xhtml">robots.txt</code> is in practice operated with <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">search engines</anchor> in mind,
it is questionable whether <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">robots</anchor> with a different purpose, such as <code xmlns="http://www.w3.org/1999/xhtml">wget</code>,
need to obey it.</comment-p></section><section><h1>History</h1><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="1" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[1]</anchor-end> <em>robotはぢきについて</em> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://c-moon.jp/robots.shtml">http://c-moon.jp/robots.shtml</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="6" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[6]</anchor-end> <cite>自分のサイトを更新チェックされたくない - はてなアンテナのヘルプ</cite>
(<time>2011-09-09 12:18:55 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://hatenaantenna.g.hatena.ne.jp/keyword/%E8%87%AA%E5%88%86%E3%81%AE%E3%82%B5%E3%82%A4%E3%83%88%E3%82%92%E6%9B%B4%E6%96%B0%E3%83%81%E3%82%A7%E3%83%83%E3%82%AF%E3%81%95%E3%82%8C%E3%81%9F%E3%81%8F%E3%81%AA%E3%81%84?kid=19#robots">http://hatenaantenna.g.hatena.ne.jp/keyword/%E8%87%AA%E5%88%86%E3%81%AE%E3%82%B5%E3%82%A4%E3%83%88%E3%82%92%E6%9B%B4%E6%96%B0%E3%83%81%E3%82%A7%E3%83%83%E3%82%AF%E3%81%95%E3%82%8C%E3%81%9F%E3%81%8F%E3%81%AA%E3%81%84?kid=19#robots</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="7" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[7]</anchor-end> <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">ACAP<title xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:10:">Automated Content Access Protocol</title></anchor></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="8" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[8]</anchor-end> <cite>WWW::RobotsRules - search.cpan.org</cite>
(<time>2013-03-10 05:21:28 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://search.cpan.org/dist/lcwa/lib/lwp/lib/WWW/RobotRules.pm">http://search.cpan.org/dist/lcwa/lib/lwp/lib/WWW/RobotRules.pm</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="9" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[9]</anchor-end> <cite>WWW::RobotRules::Parser - search.cpan.org</cite>
(<time>2013-03-10 05:22:34 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://search.cpan.org/dist/WWW-RobotRules-Parser/lib/WWW/RobotRules/Parser.pm">http://search.cpan.org/dist/WWW-RobotRules-Parser/lib/WWW/RobotRules/Parser.pm</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="10" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[10]</anchor-end> <cite>WWW::RobotRules - search.cpan.org</cite>
(<time>2013-03-10 05:23:35 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://search.cpan.org/dist/WWW-RobotRules/lib/WWW/RobotRules.pm">http://search.cpan.org/dist/WWW-RobotRules/lib/WWW/RobotRules.pm</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="11" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[11]</anchor-end> <cite>WWW::RobotRules::Extended - search.cpan.org</cite>
(<time>2013-03-10 05:28:09 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://search.cpan.org/dist/WWW-RobotRules-Extended/lib/WWW/RobotRules/Extended.pm">http://search.cpan.org/dist/WWW-RobotRules-Extended/lib/WWW/RobotRules/Extended.pm</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="12" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[12]</anchor-end> <cite>robots.txtにおけるAllowとDisallowとSitemapの優先順位 - 45式::雑記</cite>
(by <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">渡辺四ん五(4n5)</anchor>, <time>2012-02-29 17:58:01 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.45shiki.net/blog/2009/12/b000924.htm">http://www.45shiki.net/blog/2009/12/b000924.htm</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="13" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[13]</anchor-end> <cite xml:lang="en">Robots exclusion standard - Wikipedia, the free encyclopedia</cite>
(<time>2013-03-10 00:00:25 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://en.wikipedia.org/wiki/Robots_exclusion_standard">http://en.wikipedia.org/wiki/Robots_exclusion_standard</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="14" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[14]</anchor-end> <cite>Official Google Webmaster Central Blog: Improving on Robots Exclusion Protocol</cite>
(<time>2014-03-19 08:31:54 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://googlewebmastercentral.blogspot.jp/2008/06/improving-on-robots-exclusion-protocol.html">http://googlewebmastercentral.blogspot.jp/2008/06/improving-on-robots-exclusion-protocol.html</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="15" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[15]</anchor-end> <cite>The Web Robots Pages</cite>
(<time>2013-12-03 09:21:42 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.robotstxt.org/faq/future.html">http://www.robotstxt.org/faq/future.html</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="16" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[16]</anchor-end> <cite xml:lang="en">Robots.txt Specifications - Webmasters — Google Developers</cite>
(<time>2012-08-02 09:24:38 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt">https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="17" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[17]</anchor-end> <cite xml:lang="en">How to Create a Robots.txt File - Bing Webmaster Tools</cite>
(<time>2014-03-20 07:09:51 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.bing.com/webmaster/help/how-to-create-a-robots-txt-file-cb7c31ec">http://www.bing.com/webmaster/help/how-to-create-a-robots-txt-file-cb7c31ec</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="18" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[18]</anchor-end> <cite xml:lang="ja">著作権法施行規則</cite>
(<time>2014-10-09 01:08:42 +09:00</time> version)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://law.e-gov.go.jp/htmldata/S45/S45F03501000026.html#1000000000007000000000000000000000000000000000000000000000000000000000000000000">http://law.e-gov.go.jp/htmldata/S45/S45F03501000026.html#1000000000007000000000000000000000000000000000000000000000000000000000000000000</anchor-external></p><figure class="quote"><figcaption><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="26" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[26]</anchor-end> <cite xml:lang="ja">Applebot について - Apple サポート</cite>
(<time>2015-06-02 11:01:40 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://support.apple.com/ja-jp/HT204683">https://support.apple.com/ja-jp/HT204683</anchor-external></figcaption><blockquote><p>Applebot is the web crawler for Apple. It is used by products including Siri and Spotlight Suggestions. It respects customary robots.txt rules and robots meta tags. It originates in the 17.0.0.0 net block.</p><p>User-agent strings will contain “Applebot” together with additional agent information. The following is an example.</p><p>Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Applebot/0.1)</p><p>If robots instructions do not mention Applebot but do mention Googlebot, the Apple robot will follow the Googlebot instructions.</p></blockquote></figure><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="27" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[27]</anchor-end> When a <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">development server</anchor> or similar host should not be <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">crawled</anchor> at all, a <code>robots.txt</code> of the form <figure><pre class="code">User-agent: *
Disallow: /</pre></figure> should be returned.</p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="5" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[5]</anchor-end> <cite xml:lang="en">Robots exclusion standard - Wikipedia</cite>
(<time>2016-11-01 14:15:50 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://en.wikipedia.org/wiki/Robots_exclusion_standard">https://en.wikipedia.org/wiki/Robots_exclusion_standard</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="19" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[19]</anchor-end> <cite xml:lang="en">An Analysis of the World's Leading robots.txt Files</cite>
(by <anchor xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:">Ben Frederickson</anchor>, <time>2017-10-20 22:51:02 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://www.benfrederickson.com/robots-txt-analysis/">http://www.benfrederickson.com/robots-txt-analysis/</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="20" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[20]</anchor-end> <cite xml:lang="ja">トップ100万ウェブサイトのrobots.txtを解析した人とその結果 | 秋元@サイボウズラボ・プログラマー・ブログ</cite>
(<time>2018-01-04 13:08:43 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="http://developer.cybozu.co.jp/akky/2017/11/one-million-robots-txt-analyzed/">http://developer.cybozu.co.jp/akky/2017/11/one-million-robots-txt-analyzed/</anchor-external></p><figure class="quote"><figcaption><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="24" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[24]</anchor-end> <cite xml:lang="ja">robots.txt ファイルについて - Search Console ヘルプ</cite>
(<time>2018-09-23 17:04:59 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://support.google.com/webmasters/answer/6062608#robotted-but-indexed">https://support.google.com/webmasters/answer/6062608#robotted-but-indexed</anchor-external></figcaption><blockquote><p>Google does not crawl or index content that is blocked by robots.txt, but if a blocked URL is linked from other places on the web, Google may still discover that URL and index it.</p></blockquote></figure><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="25" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[25]</anchor-end> <cite xml:lang="ja">Wayback Machineがrobots.txtを無視するようになるかも? | 海外SEO情報ブログ</cite>
(<time>2018-11-22 13:17:16 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://www.suzukikenichi.com/blog/wayback-machine-planning-to-ignore-robotstxt/">https://www.suzukikenichi.com/blog/wayback-machine-planning-to-ignore-robotstxt/</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="28" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[28]</anchor-end> <cite xml:lang="en">Official Google Webmaster Central Blog: Formalizing the Robots Exclusion Protocol Specification</cite>
(<time>2019-07-01 23:33:39 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://webmasters.googleblog.com/2019/07/rep-id.html">https://webmasters.googleblog.com/2019/07/rep-id.html</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="29" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[29]</anchor-end> <cite xml:lang="en">Official Google Webmaster Central Blog: Google's robots.txt parser is now open source</cite>
(<time>2019-07-01 23:33:39 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://webmasters.googleblog.com/2019/07/repp-oss.html">https://webmasters.googleblog.com/2019/07/repp-oss.html</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="30" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[30]</anchor-end> <cite xml:lang="en">google/robotstxt: The repository contains Google's robots.txt parser and matcher as a C++ library (compliant to C++11).</cite>
(<time>2019-07-02 09:52:10 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://github.com/google/robotstxt">https://github.com/google/robotstxt</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="31" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[31]</anchor-end> <cite xml:lang="en">draft-rep-wg-topic-00 - Robots Exclusion Protocol</cite>
(<time>2019-07-02 03:55:27 +09:00</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://tools.ietf.org/html/draft-rep-wg-topic-00">https://tools.ietf.org/html/draft-rep-wg-topic-00</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="32" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[32]</anchor-end> <cite xml:lang="en">GNU Wget 1.20 Manual</cite>
(<time>2020-10-01T07:32:00.000Z</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://www.gnu.org/software/wget/manual/wget.html#index-wgetrc-commands">https://www.gnu.org/software/wget/manual/wget.html#index-wgetrc-commands</anchor-external></p><p><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="33" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[33]</anchor-end> <cite xml:lang="en-US">Robots.txt meant for search engines don’t work well for web archives - Internet Archive Blogs</cite>
(<time>2021-05-20T09:10:02.000Z</time>)
<anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/">https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/</anchor-external></p><ul><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="34" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[34]</anchor-end> <cite xml:lang="ja">生成AIによるクロールを拒否する - はてなブログ ヘルプ</cite>, <time>2025-07-31T02:42:48.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://help.hatenablog.com/entry/ai-crawling">https://help.hatenablog.com/entry/ai-crawling</anchor-external></li><li><anchor-end xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:anchor="35" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:">[35]</anchor-end> 
<cite xml:lang="ja">生成AIによるクロールを拒否する設定ができるようになりました - はてなブログ開発ブログ</cite>, <time>2025-07-31T02:43:24.000Z</time> <anchor-external xmlns="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resScheme="URI" xmlns:a0="urn:x-suika-fam-cx:markup:suikawiki:0:9:" a0:resParameter="https://staff.hatenablog.com/entry/2025/07/30/150351">https://staff.hatenablog.com/entry/2025/07/30/150351</anchor-external></li></ul></section></body></html>