[1] 【[[URI]]】 [[安全]]ではないので、 URI 中で ([[URI符号化]]せずに) 使用してはいけない[[文字]](群)。 Unsafe characters。

* 意味

[8] 
歴代仕様の定義や一般の理解には少しずつずれがあります。

* 関連

[9] [[URLにおける文字]]

- [[予約文字]]
- [[百分率符号化]]
- [[IRI]]

[2] [[URL安全]]

* 歴史

**RFC 1630

[FIG(quote)[ [5] Unsafe characters
>In canonical form, certain characters such as spaces, control
characters, some characters whose ASCII code is used differently in
different national character variant 7 bit sets, and all 8bit
characters beyond DEL (7F hex) of the ISO Latin-1 set, shall not be
used unencoded. This is a recommendation for trouble-free
interchange, and as indicated below, the encoded set may be extended
or reduced.

正規形では、[[間隔]], [[制御文字]],
その [[ASCII]] 符号が変形7ビット集合で異なる国家文字に使われている幾つかの文字、
[CODE(char)[[[DEL]]]] (16進
[CODE(code)[7F]]) より先の全ての8ビット文字である
[[ISO]] [[Latin-1]] 集合のような文字は符号化せずに使用しません。
これは障害のない交換のための勧告で、
次 [INS[(訳注: [[URI符号化]]の節)]]
に示すように、符号化する集合は増やしたり減らしたり出来ます。 [INS[[WEAK[(訳注: 特別な意味を持つ文字を上述の文字に加えて符号化するとか、特別な意味を持たない符号化された文字を復号するとかの意味。詳しくは RFC 1630 の該当部分を参照のこと。)]]]]
]FIG]

[3] RFC 1630 の [[BNF]]
では [CODE[unsafe]] は定義されていません。

** RFC 1736

[FIG(quote)[ [6] 
>Unsafe:
>Characters can be unsafe for a number of reasons.  The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems.  The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it.  The character "%" is unsafe because it is used for
encodings of other characters.  Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".

文字は幾つかの理由で非安全になり得ます。
間隔文字は、 URL 
が転写されたり植字されたり文書処理プログラムの処理の対象になった時に意味のある間隔が消滅したり意味のない間隔が混入したりする可能性があるので非安全です。
文字 [CODE(char)[<]] 及び
[CODE(char)[>]] は、
自由文中で URL の周りの区切子として使うので非安全です。
引用符 ([CODE(char)["]])
を URL を区切るのに使うシステムもあります。
文字 [CODE(char)[#]] は、
World Wide Web や他のシステムで
URL とこれに続き得る素片・アンカー識別子を区切るのに使うので、
非安全であって常に符号化するべきです。
文字 [CODE(char)[%]]
は他の文字の符号化に使うので非安全です。
他の文字は関門やその他の転送エージェントがしばしばこれらの文字を修正することが知られているので非安全です。
その文字は 
[CODE(char)[{]], [CODE(char)[}]], [CODE(char)[|]], [CODE(char)[\]], [CODE(char)[^]], [CODE(char)[~]], [CODE(char)[[]], [CODE(char)[] ]], [CODE(char)[`]] です。

>All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.

全ての非安全文字は URL
中では常に符号化しなければなりません。
例えば、文字 [CODE(char)[#]]
は、たとえ通常素片又はアンカーの識別子を扱わないシステムにおいてであっても、
URL 中では符号化しなければなりません。
こうすれば URL がこれらを使用するほかのシステムに複製されても、
URL 符号化を変更する必要がなくなります。
]FIG]

[7] 
RFC 1738 の BNF では
[CODE[unsafe]] は定義されていません。

** RFC 2068

[4] [[RFC2068]] は [[HTTP/1.1]]
なので URI RFCs の本筋からはそれますが。

>
-unsafe         = CTL | SP | <"> | "#" | "%" | "<" | ">"

* メモ