[37] The URL parser takes a string input, an optional base URL record base, and an optional encoding encoding override, and proceeds as follows >>36.
[47] In other words, a blob: URL is associated with the object it refers to at the moment the URL parser is applied to a URL string and produces a URL record.
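[51] URL parsing is an operation performed relative to a document or an environment settings object.
[52] To parse a string input relative to a document document, the following must be done >>23.
[54] To parse a string input relative to an environment settings object settings, the following must be done >>23.
[56] The URL record that these return (when they do not return failure) is called the resulting URL record >>23.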
[49] The basic URL parser >>36 takes the following inputs.
[50] If a URL is given, the basic URL parser modifies it. Otherwise, it returns a URL record or failure. In addition, it reports zero or more syntax violations.
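As a rough illustration of the "URL record or failure" contract, the sketch below wraps the `URL` constructor of the URL Standard's API, which parses a string against an optional base. The helper name `parseOrFailure` is hypothetical, and `null` is used here as a stand-in for failure. A `URL` object is only an API-level view of a URL record, and syntax violations are not exposed through this API.

```typescript
// Hypothetical helper: returns a URL object on success, or null for failure.
function parseOrFailure(input: string, base?: string): URL | null {
  try {
    // The constructor throws a TypeError when parsing returns failure.
    return base === undefined ? new URL(input) : new URL(input, base);
  } catch {
    return null;
  }
}

const absoluteResult = parseOrFailure("https://example.com/a/b?x=1#frag");
const relativeResult = parseOrFailure("../c", "https://example.com/a/b/");
// A relative input without a base cannot be parsed.
const failedResult = parseOrFailure("../c");

console.log(absoluteResult?.pathname, relativeResult?.href, failedResult);
```

Returning `null` rather than throwing keeps the two outcomes (record or failure) visible in the return type, which mirrors how the parser is described above.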
[48] Outside Web browsers, where handling of blob: URLs does not need to be implemented, it is sufficient to implement the basic URL parser rather than the URL parser >>36.
[60] Web browsers generally invoke the URL parser, so there appear to be few other situations in which the basic URL parser is invoked directly.
[61] The basic URL parser is invoked when determining the origin of a blob: URL.
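The following sketch shows how this surfaces through the `URL` API, assuming a runtime that follows the URL Standard's origin computation for blob: URLs. The blob: URL below is made up for illustration and refers to no actual object, so the origin can only come from parsing the part after `blob:`.

```typescript
// Made-up blob: URL; no Blob object is registered under this identifier.
const blobUrl = new URL("blob:https://example.com/550e8400-e29b-41d4-a716-446655440000");

// The scheme is "blob", and everything after it is kept as the path.
console.log(blobUrl.protocol);
console.log(blobUrl.pathname);

// The origin is derived by parsing the path as a URL in its own right.
console.log(blobUrl.origin);
```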
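[33] Until the URL Standard appeared, the relationship among the operations on URLs (that is, on strings interpreted as URLs), namely parsing, resolution, comparison, and canonicalization, was not clearly understood.
[34] The URL Standard precisely redefined parsing, resolution, and basic canonicalization as a single indivisible operation, and specified that comparison is performed on its result (or, in some cases, on the result of further context-dependent operations).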
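A minimal sketch of what this means in practice, using the `URL` API: one call parses, resolves against the base, and canonicalizes, and comparison is then done on the serialized results. The helper `sameUrl` is a hypothetical name, and comparing `href` strings is just one simple way to compare the results. It throws if either input fails to parse.

```typescript
// Hypothetical helper: compares two inputs after parsing them against a base.
function sameUrl(a: string, b: string, baseUrl: string): boolean {
  return new URL(a, baseUrl).href === new URL(b, baseUrl).href;
}

const compareBase = "http://example.com/a/b/";

// Upper-case scheme and host, a default port, and dot segments are all
// normalized as part of parsing.
const fromAbsolute = new URL("HTTP://EXAMPLE.COM:80/a/./b/../c", compareBase).href;
// A relative reference is resolved by the same operation.
const fromRelative = new URL("../c", compareBase).href;

console.log(fromAbsolute, fromRelative);
console.log(sameUrl("HTTP://EXAMPLE.COM:80/a/./b/../c", "../c", compareBase));
```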
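[87] The URL Standard, the current specification for URLs, rigorously defines the behavior of the URL parser for arbitrary strings, including those that are not valid URLs. Every string is deterministically converted to either a URL record or failure.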
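The sketch below feeds a few strings that are not valid URLs to the `URL` constructor, assuming a runtime that conforms to the URL Standard. The helper `serializeOrFailure` is a hypothetical name, and the string `"failure"` is only a marker chosen for this example.

```typescript
// Hypothetical helper: serializes the parse result, or reports failure.
function serializeOrFailure(input: string): string {
  try {
    return new URL(input).href;
  } catch {
    return "failure";
  }
}

// Leading and trailing spaces are stripped; the space in the path is percent-encoded.
const spaced = serializeOrFailure("  http://example.com/a b  ");
// Backslashes are treated like slashes for special schemes such as http.
const backslashed = serializeOrFailure("http:\\\\example.com\\a");
// No scheme and no base: the result is failure.
const schemeless = serializeOrFailure("not a url");

console.log(spaced, backslashed, schemeless);
```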
[86] HTML sometimes uses URL templates that contain special strings such as %s or %%. These are not valid URLs, but the URL resolution operation is sometimes applied to them.
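As an illustration, the sketch below resolves a template containing %s against a base using the `URL` API, assuming a runtime that conforms to the URL Standard. The template, the base, and the substitution step at the end are made up for this example; how a placeholder is actually filled in depends on the feature that defines the template.

```typescript
// Made-up template and base. "%s" is not a valid percent-encoded sequence,
// but the parser leaves it in place in the query.
const templateUrl = new URL("/search?q=%s", "http://example.com/dir/page");
console.log(templateUrl.href);

// Hypothetical substitution step performed after resolution.
const filledUrl = templateUrl.href.replace("%s", encodeURIComponent("a b&c"));
console.log(filledUrl);
```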
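[88] URL resolution by XML Catalogs is not the kind of relative URL expansion described in this entry; it is a process of looking up a URL in an XML catalog.
[3] RIF specifies that the RFC 3986 algorithm is applied directly to IRIs.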
[4] A "URI" in SSML 1.1 is an anyURI, yet the specification says to apply the RFC 3986 algorithm (without any particular qualification).
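[2] Open Packaging specifies how to resolve a Unicode string to a part name (ECMA 376 Part 2, Annex A). A part name is the path portion of a pack URI. This resolution is divided into three stages: Unicode string → IRI → URI → part name. For the first two stages, see the "IRI" entry. The last stage, URI → part name, performs relative reference resolution (ibid., A.3). This resolution method is an extension of the relative reference resolution of RFC 3986.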
[1] When the relative URI reference ? is resolved against the base http://foo.example/#bar, some implementations produce http://foo.example/? and others produce http://foo.example/?#bar.
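For reference, this case can be checked against the `URL` API as sketched below. This assumes a runtime that conforms to the URL Standard, in which the fragment of the base is not carried over to the result. It says nothing about which of the older implementations mentioned above behaved which way.

```typescript
const questionOnly = new URL("?", "http://foo.example/#bar");

// The query is present but empty, and there is no fragment.
console.log(questionOnly.href);
console.log(JSON.stringify(questionOnly.hash));
```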
[5] XQuery 3.0: An XML Query Language http://www.w3.org/TR/xquery-30/#id-resolve-relative-uri
[6] XML Path Language (XPath) 3.0 http://www.w3.org/TR/xpath-30/#dt-resolve-relative-uri
[7] XPath and XQuery Functions and Operators 3.0 http://www.w3.org/TR/xpath-functions-3/#func-resolve-uri
[8] Canonical XML Version 2.0 http://www.w3.org/TR/2010/WD-xml-c14n2-20100831/
[9] XQuery 1.0 and XPath 2.0 Functions and Operators (Second Edition) http://www.w3.org/TR/2010/REC-xpath-functions-20101214/#func-resolve-uri
[10] SPARQL 1.1 Query Language http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#relIRIs
[11] Live URL Resolution Viewer (version of 2008-07-05 22:53:22 +09:00) http://suika.fam.cx/www/url/urlresolution
[12] 冬様もすなる☆日記というもの (July 2008) (by わかば) http://suika.fam.cx/~wakaba/d/d200807#d6-1
[13] OASIS Open Document Format for Office Applications (OpenDocument) Version 1.2 - Part 3: Packages http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part3.html#a3_7Usage_of_IRIs_Within_Packages
[14] Support fragments as long as base URL is non-null. Fixes https://www.… · whatwg/url@7468c39 https://github.com/whatwg/url/commit/7468c397f600b72f650f52fc02466f051bf96ad3
[15] Fix fragment against no relative scheme base URL again. · whatwg/url@7f5036c https://github.com/whatwg/url/commit/7f5036c15e1f09ec37c850bbec5f3bedc418a451
[16] Support relative URLs for unknown schemes. Please review! Fixes https… · whatwg/url@b266a43 https://github.com/whatwg/url/commit/b266a43fc9df0e8607074bd4d336a517e2010009
[17] Remove the "element's base URL" indirection · whatwg/html@199c0c0 https://github.com/whatwg/html/commit/199c0c0569bc0e312bb70bffa8ef2f85231f4cd1
[18] Clarify callers of the 'resolve a URL' algorithm · whatwg/html@7c2db6f https://github.com/whatwg/html/commit/7c2db6f8c24f8c20fa362ee49433de966330a1fe
[19] Fix #77: always decode "%2e" in a URL's path · whatwg/url@bee5ad8 https://github.com/whatwg/url/commit/bee5ad8041adfe6fc676c527bbc3f3cf4562ef67
[20] Drop dependencies on Encoding Standard's decoder concept · whatwg/url@37f9329 https://github.com/whatwg/url/commit/37f932928378c0df521034cfd223f4ba603ef476
[21] Use the "get an output encoding" from the Encoding Standard · whatwg/url@a9197f7 https://github.com/whatwg/url/commit/a9197f7714e6b125f1f760ca1aa661530261773c
[22] Update integration with Encoding Standard · whatwg/html@6a31c26 https://github.com/whatwg/html/commit/6a31c26cf12e39dab1a488e75dd56c03d6786d39
[24] URLs are parsed and produce records · whatwg/html@30bc255 https://github.com/whatwg/html/commit/30bc2557105ad62881ec9670f253febbc9761b44
[26] Fix #823: Don't strip characters from the URL in meta refresh · whatwg/html@543d8c1 https://github.com/whatwg/html/commit/543d8c1598bf5bff0b8febc50dff11dcb42f8768
[27] Fix #101: always strip U+0009, U+000A, and U+000D · whatwg/url@7b40216 https://github.com/whatwg/url/commit/7b40216f809c7fe3c9a1680b5c1b06a771c9ebd8
[28] Stop after setting an url’s query to null · whatwg/url@4f1c2dd https://github.com/whatwg/url/commit/4f1c2ddbdb866b1150819622ec04a86813294059
[29] curl handling of HTTP 301 redirection fails when response location header starts with http:///<domain> (3 slashes between protocol and domain)) · Issue #791 · curl/curl https://github.com/curl/curl/issues/791
[30] Issue 385645 - chromium - Security: Multiple leading slashes in URLs may confuse some server-side XSS filters - Monorail https://bugs.chromium.org/p/chromium/issues/detail?id=385645
--path-as-is
Tell curl to not handle sequences of /../ or /./ in the given URL path. Normally curl will squash or merge them according to standards but with this option set you tell it not to do that.
[32] Fix #521: all URLs can be relative · whatwg/html@e99f0bd https://github.com/whatwg/html/commit/e99f0bd112e0216698bf9397b64b3f8064995756
[58] Point out that the basic URL parser is for everyone. Fixes https://ww… · whatwg/url@9dbbde1 https://github.com/whatwg/url/commit/9dbbde11609d351dfed9b6ee066de9d32b164148
[62] Bug 161364 – URLParser should handle relative URLs that start with // https://bugs.webkit.org/show_bug.cgi?id=161364
[63] 1064700 – Determine which characters need to be percent encoded https://bugzilla.mozilla.org/show_bug.cgi?id=1064700
Matched the behavior of the old URL parser when %2E is in the URL path (r207805)
Prevented interpreting host of URLs with unrecognized schemes as an IPv4 address (r208086)
[65] Percent encode fragments too (by annevk) https://github.com/whatwg/url/commit/373dbedbbf0596f723ce8a195923da98b698aeb0
[66] No need for null passwords (by annevk) https://github.com/whatwg/url/commit/5e0b05e95a81fdd539c7b1bf97e69b3df701384f
[67] Stop decoding all %2e's in paths (by annevk) https://github.com/whatwg/url/commit/fbff6834a8a03576261f777d0e0afea5c1bc5a09
[68] Return failure in state override scheme parsing (by annevk) https://github.com/whatwg/url/commit/4617e33b27d386bbf1db8c04316961d46aaa1397
An URL that does not have a protocol prefix will be assumed to be a file URL. Depending on the build, an URL that looks like a Windows path with the drive letter at the beginning will also be assumed to be a file URL (usually not the case in builds for unix-like systems).
[70] Add opaque hosts (by annevk) https://github.com/whatwg/url/commit/30362553e9ce9fc706d3492bd61886e399fc94e2
[71] Change path parsing for non-special URLs (by annevk) https://github.com/whatwg/url/commit/b087fe2ab215caf656a94b067c9a69ae78f03c8f
[72] File state did not correctly deal with lack of base URL (by annevk) https://github.com/whatwg/url/commit/698f3e8f1d7de6d84c78ac81209fd780aca5ab7e
[73] Meta: state idempotence clearly as a goal (by annevk) https://github.com/whatwg/url/commit/6688baa5a3032a0acdb2149c41aef64b30106802
[74] Cleanup API for file and non-special URLs (by annevk) https://github.com/whatwg/url/commit/cf616f9d3fca44bd5329e992519a4236a39b0cb7
[75] Editorial: use the Infra Standard for URL's path (by annevk) https://github.com/whatwg/url/commit/2f99502dc12b781f5bf6a062257ba031c7129c1e
[76] Paths need to be copied from base URLs (by annevk) https://github.com/whatwg/url/commit/cdbe5dcf40c5c98d46c0a4a9b8984914eba38c2f
[77] Attempt to explain valid input better (by annevk) https://github.com/whatwg/url/commit/50cb9ab9d8f70cc2bc72e91976bfaea0ad0fd330
[78] A file URL cannot have credentials (by annevk) https://github.com/whatwg/url/commit/9b2eb10eb8436adaf6620b1864b25442152f205b
[79] Add empty host concept for file and non-special URLs (by rmisev) https://github.com/whatwg/url/commit/5807b28261e44a47e31683230137da395ddc79d8
[80] Restrict protocol around "file" (by annevk) https://github.com/whatwg/url/commit/462fdc14732aae4b0b9c5334f37962d8c235caf9
[81] XQuery 3.1: An XML Query Language https://www.w3.org/TR/2017/REC-xquery-31-20170321/#dt-resolve-relative-uri
[82] XML Path Language (XPath) 3.1 https://www.w3.org/TR/2017/REC-xpath-31-20170321/#dt-resolve-relative-uri
[83] XPath and XQuery Functions and Operators 3.1 https://www.w3.org/TR/2017/REC-xpath-functions-31-20170321/#func-resolve-uri
[84] URL: trim leading slashes of file URL paths (by annevk) https://github.com/whatwg/url/commit/6103e0a58eb2460d409056fb2b93b015941b64f2
[85] Fix Windows drive letter handling in the file state (by rmisev) https://github.com/whatwg/url/commit/fe6b251739e225555f04319f19c70c031a5d99eb
[89] Normalize port after updating scheme (by TimothyGu) https://github.com/whatwg/url/commit/0f53958338bbaec3882f902897873da59ba7e8bd
[90] Deprecations and Removals in Chrome 61 | Web | Google Developers https://developers.google.com/web/updates/2017/08/chrome-61-deprecations
[91] Fix Windows drive letter handling in the file slash state (by rmisev) https://github.com/whatwg/url/commit/2eef975e989cb5ae2d62467394778fd6778ddec9
[92] Drive letters get duplicated when resolving Windows file: URL with base · Issue #303 · whatwg/url https://github.com/whatwg/url/issues/303
[93] remaining variable ambiguity · Issue #308 · whatwg/url https://github.com/whatwg/url/issues/308
[94] Fix Windows drive letter handling in the file slash state by rmisev · Pull Request #343 · whatwg/url https://github.com/whatwg/url/pull/343
[95] Meta: disambiguate links to "resolve" (by ricea) https://github.com/whatwg/streams/commit/47eab2faf0af026a33bd22abe3af583edbdbceb9
[96] Disambiguate links to "resolve" by ricea · Pull Request #878 · whatwg/streams https://github.com/whatwg/streams/pull/878
[97] Change query state slightly to better deal with non-UTF-8 encodings (by annevk) https://github.com/whatwg/url/commit/f0e4390bf882446445e944215524ff3877aac95a
[98] "html" error mode somewhat incompatible with URLs · Issue #139 · whatwg/encoding https://github.com/whatwg/encoding/issues/139
[99] Change query state slightly to better deal with non-UTF-8 encodings by annevk · Pull Request #386 · whatwg/url https://github.com/whatwg/url/pull/386
[100] Editorial: avoid setting encoding multiple times (by annevk) https://github.com/whatwg/url/commit/b385374e5d1b48f82b8040395786469ae655f435
[101] Percent-encode ' in queries of URLs with special schemes (by achristensen07) https://github.com/whatwg/url/commit/6ef17ebe1220a7e7c0cfff0785017502ee18808b
[102] percent-encode ' in queries of URLs with special schemes · Issue #348 · whatwg/url https://github.com/whatwg/url/issues/348
[103] percent-encode ' in queries of URLs with special schemes by achristensen07 · Pull Request #395 · whatwg/url https://github.com/whatwg/url/pull/395
[108] Use File API to resolve blob: URLs and their origins (by mkruisselbrink) https://github.com/whatwg/url/commit/d2ef633869b3f31d8c7e3bb76602400e4d2c126c
[46] curl - How To Use https://curl.haxx.se/docs/manpage.html#--path-as-is
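[111] >>109 A remote cause of the vulnerability is that, for a long time before the URL Standard, no clear specification of parsing existed. That said, leaving the problem unaddressed after the URL Standard was established is the responsibility of the Python community. This is one example demonstrating that clearly defining parsing behavior matters not only for interoperability but also for security.
[112] It is also an example of the breakdown of the 20th-century IETF design approach of presenting only production rules for the syntax and leaving the rest to Postel's law.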
[113] HISTORY & SECURITY: URL Parsing Differences Between Implementations Security Issues · Issue #766 · whatwg/url · GitHub, https://github.com/whatwg/url/issues/766