Content-Transfer-Encoding: header (MIME)

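[1] In the header of a MIME entity, this field indicates the content transfer encoding (CTE) applied to the entity. When no CTE is applied, it indicates the kind of octets the body consists of (its domain).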

Specifications

[34] RFC 4409 (IETF Proposed Standard) urn:ietf:rfc:4409
- 8.4. Transfer Encode
[32] RFC 7231 - Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content (version of 8/6/2014, 8:54:02 PM) https://tools.ietf.org/html/rfc7231#appendix-A.5
[42] OData Version 4.0 Part 1: Protocol Plus Errata 01 (version of 9/4/2014, 7:00:00 AM) http://docs.oasis-open.org/odata/odata/v4.0/odata-v4.0-part1-protocol.html#_Toc393958809
[44] OData Version 4.0 Part 1: Protocol Plus Errata 01 (version of 9/4/2014, 7:00:00 AM) http://docs.oasis-open.org/odata/odata/v4.0/odata-v4.0-part1-protocol.html#_Toc370374937

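Syntax

To do: rewrite the BNF section.

Values

To do: rewrite the table of values, adding the non-standard values reported in the notes.

How should this page relate to the wiki page on transfer encodings?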

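Default value

[53] For the default value in each protocol, see MIME.

What does MIME itself specify? For email the default is 7bit.

[35] BEEP

When a MIME entity carried over BEEP has no explicit Content-Transfer-Encoding: header field, the default is binary (RFC 3080, section 2.2).

Handling unsupported values

According to MIME, the entity should be treated as application/octet-stream.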

Applications

PO files

[225] The header of a PO file, as extended by GNU gettext, has to specify a Content-Transfer-Encoding field alongside fields such as MIME-Version, but in practice the field has no effect at all. The value can always be 8bit.

RFC 2045 BNF

encoding := "Content-Transfer-Encoding" ":" mechanism
mechanism := "7bit" / "8bit" / "binary" / "quoted-printable" / "base64" / ietf-token / x-token

Values are case-insensitive.

The default in MIME is "7bit". HTTP has no CTE (at least in the final specifications published as RFCs), but the closest equivalent of a default there is "binary".
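The grammar and defaults above can be sketched as a small parser. This is a minimal illustration, not a complete header parser: it assumes the field value has already been extracted and carries no RFC 822 comments, and the names `parse_cte`, `DEFAULT_BY_PROTOCOL` and the protocol keys are hypothetical.

```python
# Mechanisms named explicitly in the RFC 2045 grammar.
STANDARD_MECHANISMS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}

# Defaults mentioned in this article; the keys are illustrative labels only.
DEFAULT_BY_PROTOCOL = {"mail": "7bit", "beep": "binary", "http": "binary"}

# RFC 2045 tspecials: characters that may not appear in a token.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(text):
    """True if text is a non-empty run of printable ASCII without SPACE or tspecials."""
    return bool(text) and all(0x20 < ord(c) < 0x7F and c not in TSPECIALS for c in text)


def parse_cte(value, protocol="mail"):
    """Return (mechanism, kind) for a Content-Transfer-Encoding field value.

    value is None when the header field is absent, in which case the
    per-protocol default applies.
    """
    if value is None:
        return DEFAULT_BY_PROTOCOL[protocol], "default"
    mechanism = value.strip().lower()  # values are case-insensitive
    if not is_token(mechanism):
        raise ValueError("not a single token: %r" % value)
    if mechanism in STANDARD_MECHANISMS:
        return mechanism, "standard"
    if mechanism.startswith("x-"):
        return mechanism, "x-token"
    return mechanism, "ietf-token"  # syntactically valid, but unregistered in practice
```

For instance, `parse_cte("BASE64")` and `parse_cte("base64")` give the same result, and `parse_cte(None)` falls back to `7bit` for mail.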

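Transfer-encoding values that describe transport capability

"7bit" and "8bit" transfer encodings

No encoding is applied. Lines are at most 998 octets long (an SMTP limit) and are separated by CRLF. In 7bit, no octet with a value of 128 or above appears.

The value 0 (NUL) does not appear. 13 (CR) and 10 (LF) appear only as part of a CRLF.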

[66] RFC 2045 2.7

   "7bit data" refers to data that is all represented as relatively
   short lines with 998 octets or less between CRLF line separation
   sequences [RFC-821].  No octets with decimal values greater than 127
   are allowed and neither are NULs (octets with decimal value 0).  CR
   (decimal value 13) and LF (decimal value 10) octets only occur as
   part of CRLF line separation sequences.

In other words, 7bit data consists entirely of relatively short lines, with at most 998 octets between CRLF line separators [RFC-821]. Octets above decimal 127 and NUL (decimal 0) are not allowed, and CR (decimal 13) and LF (decimal 10) may appear only as part of a CRLF line separator.

[67] RFC 2045 2.8

   "8bit data" refers to data that is all represented as relatively
   short lines with 998 octets or less between CRLF line separation
   sequences [RFC-821]), but octets with decimal values greater than 127
   may be used.  As with "7bit data" CR and LF octets only occur as part
   of CRLF line separation sequences and no NULs are allowed.

8bit data has the same line structure (at most 998 octets between CRLF line separators [RFC-821]), except that octets above decimal 127 may be used. As with 7bit data, CR and LF may appear only as part of a CRLF line separator, and NUL is not allowed.

"binary" transfer encoding

[57] binary is an unencoded octet sequence with no restrictions. It cannot be used in SMTP email.

[68] RFC 2045 2.9

"Binary data" refers to data where any sequence of octets whatsoever is allowed.

Binary data is data in which any sequence of octets at all is allowed.
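The three definitions from RFC 2045 sections 2.7 to 2.9 can be turned into a check that reports the narrowest domain a body fits into. The sketch below is illustrative only; the function name `narrowest_domain` is an assumption, and the check looks at raw octets without considering any character set.

```python
MAX_LINE_OCTETS = 998  # limit between CRLF separators, per RFC 2045 2.7 and 2.8


def narrowest_domain(data: bytes) -> str:
    """Return "7bit", "8bit" or "binary" for an unencoded body."""
    for line in data.split(b"\r\n"):
        # After splitting on CRLF, any remaining CR or LF is a bare one,
        # which (like NUL or an over-long line) is only legal in binary data.
        if (len(line) > MAX_LINE_OCTETS
                or b"\r" in line or b"\n" in line or b"\x00" in line):
            return "binary"
    # Line structure is fine; the only remaining question is the high bit.
    return "8bit" if any(octet > 127 for octet in data) else "7bit"
```

A sender could compare this result with the label it is about to emit: labelling a body `7bit` when the function returns `8bit` or `binary` would be exactly the mislabelling that RFC 2045 section 6.2 prohibits.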

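Transfer encodings by protocol

[58] For the transfer encodings of each protocol (their defaults and limits), see the protocol section of the MIME article.

ietf-token transfer encodings

The registry for ietf-token values is http://www.iana.org/assignments/transfer-encodings. No additional names have been registered. The MIME RFCs say that new CTEs should not be defined, for the sake of interoperability.

x-token transfer encodings

For uuencode, which was in use before MIME, names such as x-uuencode, x-uue and x-uu are used.

x-gzip64 is gzip compression followed by base64. Perl apparently has MIME::Decoder::Gzip64 for it.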
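As a rough illustration of "gzip, then base64", the following sketch uses only the Python standard library. It assumes that x-gzip64 is nothing more than a gzip stream wrapped in MIME-style base64 lines, which is all this article states; real implementations may differ in framing details, and the function names are hypothetical.

```python
import base64
import gzip


def encode_gzip64(data: bytes) -> bytes:
    """Compress with gzip, then wrap the result in 76-column base64 lines."""
    return base64.encodebytes(gzip.compress(data))


def decode_gzip64(body: bytes) -> bytes:
    """Reverse of encode_gzip64: undo base64 first, then gunzip."""
    return gzip.decompress(base64.decodebytes(body))
```

The encoded body stays within the 7bit domain, since base64 output is plain US-ASCII.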

Something like a rationale for why no compression CTE was standardized in MIME: http://www.mew.org/ml/mew-dist-1.94/msg07639.html. I see. I am not persuaded, but I can understand it.

7bit	%x01-09 / %x0B-0C / %x0E-7F / CRLF	MIME
8bit	%x01-09 / %x0B-0C / %x0E-FF / CRLF	MIME
base64	Base64 (radix 64)	MIME
binary	%x00-FF	MIME
quoted-printable	Quoted-Printable ("=" escapes)	MIME
x-gzip64	GNU Zip + Base64
x-uu	uuencode
x-uue	uuencode
x-uuencode	uuencode
x-uuencoded	uuencode

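[30] Several CTE names for uuencode have been in use since shortly after MIME appeared. MIME did not adopt uuencode because it uses characters that are not always safe on email transport paths. From the MIME advocates' standpoint the message was "do not use uuencode, use Base64", and for that reason no CTE name for uuencode was standardized.

In reality, not every MUA implemented MIME, and software that could decode Base64 was hardly available (what existed was mostly MIME-specific), so many people, including users of MIME MUAs, used uuencode rather than Base64. Developers of MIME MUAs met that demand by implementing uuencode, and using it as a CTE within the MIME framework was the natural choice. But there was no standardized CTE name, and so a number of CTE names for uuencode came into being. (Any later attempt to add one would simply be dismissed by the MIME advocates: as the RFC says, adding CTEs is bad for interoperability.)

The MIME advocates no doubt had an elegant future in mind, but they underestimated reality, and this can be counted as one of MIME's failures.

History

RFC 2045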

[69] 6. Content-Transfer-Encoding Header Field

   Many media types which could be usefully transported via email are
   represented, in their "natural" format, as 8bit character or binary
   data.  Such data cannot be transmitted over some transfer protocols.
   For example, RFC 821 (SMTP) restricts mail messages to 7bit US-ASCII
   data with lines no longer than 1000 characters including any trailing
   CRLF line separator.

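Many media types that could usefully be transported by email are represented, in their "natural" format, as 8bit character or binary data, and some transfer protocols cannot carry such data. For example, RFC 821 (SMTP) restricts mail messages to 7bit US-ASCII data with lines of at most 1000 characters, including the trailing CRLF line separator.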

   It is necessary, therefore, to define a standard mechanism for
   encoding such data into a 7bit short line format.  Proper labelling
   of unencoded material in less restrictive formats for direct use over
   less restrictive transports is also desireable.  This document
   specifies that such encodings will be indicated by a new "Content-
   Transfer-Encoding" header field.  This field has not been defined by
   any previous standard.

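A standard mechanism is therefore needed for encoding such data into a 7bit short-line format. It is also desirable to label unencoded material in less restrictive formats properly, so that it can be used directly over less restrictive transports. This document specifies that such encodings are indicated by a new "Content-Transfer-Encoding" header field, which no previous standard has defined.

Translator's note: this wording would make sense in RFC 1341, but in RFC 2045 the field is hardly "new", nor is it undefined by "any previous standard". Typo: desireable → desirable

6.1. Content-Transfer-Encoding Syntax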

The Content-Transfer-Encoding field's value is a single token specifying the type of encoding, as enumerated below. Formally:

The field value is a single token that names the encoding type, chosen from the list below.

     encoding := "Content-Transfer-Encoding" ":" mechanism

     mechanism := "7bit" / "8bit" / "binary" /
                  "quoted-printable" / "base64" /
                  ietf-token / x-token

These values are not case sensitive -- Base64 and BASE64 and bAsE64 are all equivalent. An encoding type of 7BIT requires that the body is already in a 7bit mail-ready representation. This is the default value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the Content-Transfer-Encoding header field is not present.

The values are case-insensitive: Base64, BASE64 and bAsE64 are all equivalent. The encoding type 7BIT requires the body to be in a 7bit, mail-ready representation already. It is the default: when the Content-Transfer-Encoding header field is absent, "Content-Transfer-Encoding: 7BIT" is assumed.

6.2. Content-Transfer-Encodings Semantics

   This single Content-Transfer-Encoding token actually provides two
   pieces of information.  It specifies what sort of encoding
   transformation the body was subjected to and hence what decoding
   operation must be used to restore it to its original form, and it
   specifies what the domain of the result is.

This single token actually conveys two pieces of information: which encoding transformation was applied to the body (and therefore which decoding operation restores the original form), and what the domain of the result is.

   The transformation part of any Content-Transfer-Encodings specifies,
   either explicitly or implicitly, a single, well-defined decoding
   algorithm, which for any sequence of encoded octets either transforms
   it to the original sequence of octets which was encoded, or shows
   that it is illegal as an encoded sequence.  Content-Transfer-Encodings
   transformations never depend on any additional external
   profile information for proper operation. Note that while decoders
   must produce a single, well-defined output for a valid encoding no
   such restrictions exist for encoders: Encoding a given sequence of
   octets to different, equivalent encoded sequences is perfectly legal.

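The transformation part of a Content-Transfer-Encoding specifies, explicitly or implicitly, a single well-defined decoding algorithm that, for any sequence of encoded octets, either recovers the original octet sequence or shows that the input is illegal as an encoded sequence. The transformations never need additional external profile information to operate properly. Decoders must produce a single, well-defined output for a valid encoding, but encoders are under no such restriction: encoding the same octet sequence into different, equivalent encoded sequences is perfectly legal.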

Three transformations are currently defined: identity, the "quoted-printable" encoding, and the "base64" encoding. The domains are "binary", "8bit" and "7bit".

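Three transformations are currently defined: identity, the "quoted-printable" encoding and the "base64" encoding. The domains are "binary", "8bit" and "7bit".

Translator's note: the count of three includes the identity transformation, that is, no encoding at all.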

   The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all
   mean that the identity (i.e. NO) encoding transformation has been
   performed.  As such, they serve simply as indicators of the domain of
   the body data, and provide useful information about the sort of
   encoding that might be needed for transmission in a given transport
   system.  The terms "7bit data", "8bit data", and "binary data" are
   all defined in Section 2.

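The values "7bit", "8bit" and "binary" all mean that the identity transformation (that is, no encoding) has been applied. They serve simply as indicators of the domain of the body data and provide useful information about the kind of encoding that might be needed for transmission in a given transport system. The terms "7bit data", "8bit data" and "binary data" are defined in Section 2.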

   The quoted-printable and base64 encodings transform their input from
   an arbitrary domain into material in the "7bit" range, thus making it
   safe to carry over restricted transports.  The specific definition of
   the transformations are given below.

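The quoted-printable and base64 encodings transform input from an arbitrary domain into material within the "7bit" range, which makes it safe to carry over restricted transports. The transformations themselves are defined later.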

   The proper Content-Transfer-Encoding label must always be used.
   Labelling unencoded data containing 8bit characters as "7bit" is not
   allowed, nor is labelling unencoded non-line-oriented data as
   anything other than "binary" allowed.

The proper Content-Transfer-Encoding label must always be used. Unencoded data containing 8bit characters must not be labelled "7bit", and unencoded data that is not line-oriented must not be labelled anything other than "binary".

   Unlike media subtypes, a proliferation of Content-Transfer-Encoding
   values is both undesirable and unnecessary.  However, establishing
   only a single transformation into the "7bit" domain does not seem
   possible.  There is a tradeoff between the desire for a compact and
   efficient encoding of largely- binary data and the desire for a
   somewhat readable encoding of data that is mostly, but not entirely,
   7bit.  For this reason, at least two encoding mechanisms are
   necessary: a more or less readable encoding (quoted-printable) and a
   "dense" or "uniform" encoding (base64).

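Unlike media subtypes, a proliferation of Content-Transfer-Encoding values is neither desirable nor necessary. Even so, settling on a single transformation into the "7bit" domain does not seem possible. There is a tradeoff between wanting a compact, efficient encoding for largely binary data and wanting a somewhat readable encoding for data that is mostly, but not entirely, 7bit. At least two encoding mechanisms are therefore needed: a more or less readable one (quoted-printable) and a "dense" or "uniform" one (base64).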
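The tradeoff can be observed directly by encoding the same octets both ways and comparing sizes. The sketch below uses Python's standard `quopri` and `base64` modules; picking the shorter output is only one possible heuristic and is an assumption of this example, not something RFC 2045 prescribes. The name `pick_encoding` is hypothetical.

```python
import base64
import quopri


def pick_encoding(data: bytes) -> str:
    """Suggest "quoted-printable" or "base64" by comparing encoded sizes."""
    qp_size = len(quopri.encodestring(data))
    b64_size = len(base64.encodebytes(data))
    # Mostly-ASCII text stays close to its original size (and stays readable)
    # in quoted-printable; binary data triples there but grows only by about
    # one third in base64.
    return "quoted-printable" if qp_size <= b64_size else "base64"
```

Mostly-7bit text with an occasional accented letter comes out as quoted-printable, while data that uses the whole octet range comes out as base64.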

   Mail transport for unencoded 8bit data is defined in RFC 1652.  As of
   the initial publication of this document, there are no standardized
   Internet mail transports for which it is legitimate to include
   unencoded binary data in mail bodies.  Thus there are no
   circumstances in which the "binary" Content-Transfer-Encoding is
   actually valid in Internet mail.  However, in the event that binary
   mail transport becomes a reality in Internet mail, or when MIME is
   used in conjunction with any other binary-capable mail transport
   mechanism, binary bodies must be labelled as such using this
   mechanism.

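Mail transport of unencoded 8bit data is defined in RFC 1652. When this document was first published, no standardized Internet mail transport legitimately allowed unencoded binary data in mail bodies, so there are no circumstances in which the "binary" Content-Transfer-Encoding is actually valid in Internet mail. If binary mail transport does become a reality in Internet mail, or when MIME is used with some other binary-capable mail transport mechanism, binary bodies must be labelled as such with this mechanism.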

   NOTE: The five values defined for the Content-Transfer-Encoding field
   imply nothing about the media type other than the algorithm by which
   it was encoded or the transport system requirements if unencoded.

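Note: the five values defined for the Content-Transfer-Encoding field imply nothing about the media type, beyond the algorithm by which the data was encoded or, for unencoded data, the requirements on the transport system.

6.3. New Content-Transfer-Encodings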

   Implementors may, if necessary, define private Content-Transfer-
   Encoding values, but must use an x-token, which is a name prefixed by
   "X-", to indicate its non-standard status, e.g.,
   "Content-Transfer-Encoding: x-my-new-encoding".  Additional
   standardized Content-Transfer-Encoding values must be specified by a
   standards-track RFC.
   The requirements such specifications must meet are given in RFC 2048.
   As such, all content-transfer-encoding namespace except that
   beginning with "X-" is explicitly reserved to the IETF for future
   use.

Implementors may define private Content-Transfer-Encoding values if necessary, but must use an x-token, a name beginning with "X-", to mark them as non-standard, as in "Content-Transfer-Encoding: x-my-new-encoding". Additional standardized values must be specified by a standards-track RFC, and RFC 2048 gives the requirements such specifications must meet. The entire content-transfer-encoding namespace, apart from names beginning with "X-", is thus explicitly reserved to the IETF for future use.

Unlike media types and subtypes, the creation of new Content-Transfer-Encoding values is STRONGLY discouraged, as it seems likely to hinder interoperability with little potential benefit

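Unlike media types and subtypes, creating new Content-Transfer-Encoding values is strongly discouraged, because it is likely to hinder interoperability for little potential benefit.

Translator's note: the sentence does not end with a period. Was something dropped?

6.4. Interpretation and Use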

   If a Content-Transfer-Encoding header field appears as part of a
   message header, it applies to the entire body of that message.  If a
   Content-Transfer-Encoding header field appears as part of an entity's
   headers, it applies only to the body of that entity.  If an entity is
   of type "multipart" the Content-Transfer-Encoding is not permitted to
   have any value other than "7bit", "8bit" or "binary".  Even more
   severe restrictions apply to some subtypes of the "message" type.

A Content-Transfer-Encoding header field in a message header applies to the entire body of that message. One that appears among an entity's headers applies only to the body of that entity. If an entity is of type "multipart", the Content-Transfer-Encoding must not have any value other than "7bit", "8bit" or "binary". Even stricter restrictions apply to some subtypes of the "message" type.

   It should be noted that most media types are defined in terms of
   octets rather than bits, so that the mechanisms described here are
   mechanisms for encoding arbitrary octet streams, not bit streams.  If
   a bit stream is to be encoded via one of these mechanisms, it must
   first be converted to an 8bit byte stream using the network standard
   bit order ("big-endian"), in which the earlier bits in a stream
   become the higher-order bits in a 8bit byte.  A bit stream not ending
   at an 8bit boundary must be padded with zeroes. RFC 2046 provides a
   mechanism for noting the addition of such padding in the case of the
   application/octet-stream media type, which has a "padding" parameter.

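Note that most media types are defined in terms of octets rather than bits, so the mechanisms described here encode arbitrary octet streams, not bit streams. A bit stream to be encoded with one of these mechanisms must first be converted to a stream of 8bit bytes using the network standard bit order ("big-endian"), in which earlier bits in the stream become the higher-order bits of each byte. A bit stream that does not end on an 8bit boundary must be padded with zeroes. For the application/octet-stream media type, RFC 2046 provides a "padding" parameter for recording that such padding was added.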

The encoding mechanisms defined here explicitly encode all data in US-ASCII. Thus, for example, suppose an entity has header fields such as:

The encoding mechanisms defined here explicitly encode all data in US-ASCII. Suppose, for example, that an entity has header fields such as:

     Content-Type: text/plain; charset=ISO-8859-1
     Content-transfer-encoding: base64

This must be interpreted to mean that the body is a base64 US-ASCII encoding of data that was originally in ISO-8859-1, and will be in that character set again after decoding.

This must be interpreted to mean that the body is a base64 US-ASCII encoding of data that was originally in ISO-8859-1 and that will be in that character set again after decoding.
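The two layers can be separated explicitly with Python's standard `email` package: first the CTE is removed to recover the original octets, and only then is the charset applied. The sample body `Y2Fm6Q==` (the ISO-8859-1 octets of "café") and the function name `decode_entity` are made up for this sketch.

```python
import email

RAW_ENTITY = (
    "Content-Type: text/plain; charset=ISO-8859-1\r\n"
    "Content-transfer-encoding: base64\r\n"
    "\r\n"
    "Y2Fm6Q==\r\n"
)


def decode_entity(raw: str):
    """Return (octets, text): the body after CTE removal, then after charset decoding."""
    message = email.message_from_string(raw)
    # Step 1: undo the transfer encoding; the result is ISO-8859-1 octets again.
    octets = message.get_payload(decode=True)
    # Step 2: interpret those octets using the charset parameter of Content-Type.
    text = octets.decode(message.get_content_charset())
    return octets, text
```

Calling `decode_entity(RAW_ENTITY)` yields the octets `b"caf\xe9"` and the text `"café"`.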

   Certain Content-Transfer-Encoding values may only be used on certain
   media types.  In particular, it is EXPRESSLY FORBIDDEN to use any
   encodings other than "7bit", "8bit", or "binary" with any composite
   media type, i.e. one that recursively includes other Content-Type
   fields.  Currently the only composite media types are "multipart" and
   "message".  All encodings that are desired for bodies of type
   multipart or message must be done at the innermost level, by encoding
   the actual body that needs to be encoded.

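Certain Content-Transfer-Encoding values may be used only with certain media types. In particular, using any encoding other than "7bit", "8bit" or "binary" with a composite media type, that is, one that recursively includes other Content-Type fields, is expressly forbidden. The only composite media types at present are "multipart" and "message". Any encoding wanted for a body of type multipart or message must be applied at the innermost level, by encoding the actual body that needs it.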
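The restriction on composite types amounts to a simple predicate. The sketch below follows RFC 2045 as quoted here and deliberately ignores later relaxations for individual message/* subtypes; the name `cte_allowed` is hypothetical.

```python
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}
COMPOSITE_TYPES = {"multipart", "message"}


def cte_allowed(content_type: str, cte: str) -> bool:
    """True if RFC 2045 permits this CTE value on this media type."""
    top_level = content_type.split("/", 1)[0].strip().lower()
    if top_level in COMPOSITE_TYPES:
        # Composite bodies may only be labelled with an identity encoding;
        # real encodings belong on the innermost parts.
        return cte.strip().lower() in IDENTITY_ENCODINGS
    return True
```

A validator could walk a message tree and report every composite entity for which this returns false.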

   It should also be noted that, by definition, if a composite entity
   has a transfer-encoding value such as "7bit", but one of the enclosed
   entities has a less restrictive value such as "8bit", then either the
   outer "7bit" labelling is in error, because 8bit data are included,
   or the inner "8bit" labelling placed an unnecessarily high demand on
   the transport system because the actual included data were actually
   7bit-safe.

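Note also that, by definition, if a composite entity has a transfer-encoding value such as "7bit" while one of the enclosed entities has a less restrictive value such as "8bit", then either the outer "7bit" label is wrong, because 8bit data is included, or the inner "8bit" label places an unnecessarily high demand on the transport system, because the enclosed data is actually 7bit-safe.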

   NOTE ON ENCODING RESTRICTIONS:  Though the prohibition against using
   content-transfer-encodings on composite body data may seem overly
   restrictive, it is necessary to prevent nested encodings, in which
   data are passed through an encoding algorithm multiple times, and
   must be decoded multiple times in order to be properly viewed.
   Nested encodings add considerable complexity to user agents:  Aside
   from the obvious efficiency problems with such multiple encodings,
   they can obscure the basic structure of a message.  In particular,
   they can imply that several decoding operations are necessary simply
   to find out what types of bodies a message contains.  Banning nested
   encodings may complicate the job of certain mail gateways, but this
   seems less of a problem than the effect of nested encodings on user
   agents.

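Note on encoding restrictions: prohibiting content-transfer-encodings on composite body data may seem overly restrictive, but it is needed to prevent nested encodings, where data passes through an encoding algorithm several times and has to be decoded several times to be viewed properly. Nested encodings add considerable complexity to user agents. Besides the obvious efficiency problems, they can obscure the basic structure of a message; in particular, several decoding operations may be needed just to find out what types of bodies a message contains. Banning nested encodings may complicate the work of certain mail gateways, but that seems a smaller problem than the effect nested encodings would have on user agents.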

Any entity with an unrecognized Content-Transfer-Encoding must be treated as if it has a Content-Type of "application/octet-stream", regardless of what the Content-Type header field actually says.

An entity with an unrecognized Content-Transfer-Encoding must be treated as if its Content-Type were "application/octet-stream", whatever the Content-Type header field actually says.
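A receiver can apply this rule before dispatching on the media type. In the sketch below, the set of recognized encodings is limited to the five standard values; an implementation that also understands some x-token would add it to the set. The names `RECOGNIZED` and `effective_content_type` are hypothetical.

```python
RECOGNIZED = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def effective_content_type(content_type: str, cte=None) -> str:
    """Return the media type to act on, given the declared type and CTE value."""
    if cte is not None and cte.strip().lower() not in RECOGNIZED:
        # The body cannot be decoded, so its declared type cannot be trusted.
        return "application/octet-stream"
    return content_type
```

With this rule, a `text/html` part labelled `x-yenc` would be offered to the user as opaque data rather than rendered.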

   NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND
   CONTENT-TRANSFER-ENCODING: It may seem that the
   Content-Transfer-Encoding could be
   inferred from the characteristics of the media that is to be encoded,
   or, at the very least, that certain Content-Transfer-Encodings could
   be mandated for use with specific media types.  There are several
   reasons why this is not the case. First, given the varying types of
   transports used for mail, some encodings may be appropriate for some
   combinations of media types and transports but not for others.  (For
   example, in an 8bit transport, no encoding would be required for text
   in certain character sets, while such encodings are clearly required
   for 7bit SMTP.)

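Note on the relationship between Content-Type and Content-Transfer-Encoding: it may seem that the Content-Transfer-Encoding could be inferred from the characteristics of the media being encoded or, at the very least, that particular encodings could be mandated for particular media types. There are several reasons why this is not so. First, given the variety of transports used for mail, an encoding may suit some combinations of media type and transport but not others. (For example, on an 8bit transport, text in certain character sets needs no encoding, whereas 7bit SMTP clearly requires one.)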

   Second, certain media types may require different types of transfer
   encoding under different circumstances.  For example, many PostScript
   bodies might consist entirely of short lines of 7bit data and hence
   require no encoding at all.  Other PostScript bodies (especially
   those using Level 2 PostScript's binary encoding mechanism) may only
   be reasonably represented using a binary transport encoding.
   Finally, since the Content-Type field is intended to be an open-ended
   specification mechanism, strict specification of an association
   between media types and encodings effectively couples the
   specification of an application protocol with a specific lower-level
   transport.  This is not desirable since the developers of a media
   type should not have to be aware of all the transports in use and
   what their limitations are.

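Second, a media type may need different transfer encodings in different circumstances. Many PostScript bodies, for example, consist entirely of short lines of 7bit data and need no encoding at all, while others (especially those using Level 2 PostScript's binary encoding mechanism) can be reasonably represented only with a binary transport encoding. Finally, the Content-Type field is meant to be an open-ended specification mechanism, so strictly tying media types to encodings would in effect couple the specification of an application protocol to a specific lower-level transport. That is undesirable, because the developers of a media type should not have to know every transport in use and its limitations.

6.5. Translating Encodings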

   The quoted-printable and base64 encodings are designed so that
   conversion between them is possible.  The only issue that arises in
   such a conversion is the handling of hard line breaks in
   quoted-printable encoding output. When converting from quoted-printable to
   base64 a hard line break in the quoted-printable form represents a
   CRLF sequence in the canonical form of the data. It must therefore be
   converted to a corresponding encoded CRLF in the base64 form of the
   data.  Similarly, a CRLF sequence in the canonical form of the data
   obtained after base64 decoding must be converted to a
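   quoted-printable hard line break, but ONLY when converting text data.

The quoted-printable and base64 encodings are designed so that conversion between them is possible. The only issue in such a conversion is the handling of hard line breaks in quoted-printable output. When converting from quoted-printable to base64, a hard line break in the quoted-printable form represents a CRLF sequence in the canonical form of the data, so it must be converted to the corresponding encoded CRLF in the base64 form. Likewise, a CRLF sequence in the canonical form obtained after base64 decoding must be converted to a quoted-printable hard line break, but only when converting text data.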
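The quoted-printable to base64 direction can be sketched by going through the canonical form: soft line breaks (`=` at end of line) disappear on decoding, while hard line breaks survive as CRLF octets and are then base64-encoded like any other data. The sketch assumes the quoted-printable body already uses CRLF line endings; `qp_to_base64` is a hypothetical name.

```python
import base64
import quopri


def qp_to_base64(qp_body: bytes) -> bytes:
    """Re-encode a quoted-printable body (CRLF line endings) as base64."""
    # Decoding removes soft breaks and "=XX" escapes; hard breaks stay as CRLF.
    canonical = quopri.decodestring(qp_body)
    # The CRLF octets are now ordinary data inside the base64 stream.
    return base64.encodebytes(canonical)
```

The reverse direction needs more care, because a CRLF found after base64 decoding should become a hard line break only when the data is known to be text.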

6.6. Canonical Encoding Model

   There was some confusion, in the previous versions of this RFC,
   regarding the model for when email data was to be converted to
   canonical form and encoded, and in particular how this process would
   affect the treatment of CRLFs, given that the representation of
   newlines varies greatly from system to system, and the relationship
   between content-transfer-encodings and character sets.  A canonical
   model for encoding is presented in RFC 2049 for this reason.

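Earlier versions of this RFC caused some confusion about the model for when email data is converted to canonical form and encoded, in particular about how that process affects the treatment of CRLFs (given that newline representations vary greatly between systems), and about the relationship between content-transfer-encodings and character sets. For this reason a canonical model for encoding is presented in RFC 2049.

6.7. Quoted-Printable Content-Transfer-Encoding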

Quoted-Printable

6.8. Base64 Content-Transfer-Encoding

Base64

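CTE in HTTP

[39] HTTP has transfer codings (Transfer-Encoding:) and content codings (Content-Encoding:). Both are defined as distinct from CTE. Transfer codings and CTE serve similar purposes, but the available values and encodings are entirely different, and one cannot be converted directly into the other.

[41] Converting from MIME to HTTP requires decoding the CTE, and converting from HTTP to MIME means applying a CTE as needed >>32.

[46] HTTP92 intended to adopt MIME for HTTP, so it listed Content-Transfer-Encoding: as one of the HTTP headers and referred to the MIME RFC for the details >>47. (It also said that provisions specific to HTTP were needed >>47.)

[50] RFC 4229, citing HTTP92, registers Content-Transfer-Encoding: in the IANA registry as an HTTP header with status "provisional" >>49.

[48] In the later HTTP specifications published as RFCs, however, HTTP is described as a message format derived from MIME, and Content-Transfer-Encoding: is not an HTTP header.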

[38] RFC 1945 (HTTP/1.0) C.4; RFC 2068 (HTTP/1.1) 19.4.4; RFC 2616 (HTTP/1.1) 19.4.5 No Content-Transfer-Encoding

HTTP does not use the Content-Transfer-Encoding (CTE) field of ~~~~RFC 1521~~ MIME~~ RFC 2045. Proxies and gateways from MIME-compliant protocols to HTTP ~~must~~ MUST remove any ~~{removed by Errata} non-identity~~ CTE ~~{removed by Errata} ("quoted-printable" or "base64")~~ encoding prior to delivering the response message to an HTTP client.

[15] HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC 2045. Proxies and gateways from MIME-compliant protocols to HTTP must remove any non-identity CTE encoding ("quoted-printable" or "base64") before delivering the response message to an HTTP client.

Proxies and gateways from HTTP to MIME-compliant protocols are responsible for ensuring that the message is in the correct format and encoding for safe transport on that protocol, where "safe transport" is defined by the limitations of the protocol being used. Such a proxy or gateway ~~should~~ SHOULD label the data with an appropriate Content-Transfer-Encoding if doing so will improve the likelihood of safe transport over the destination protocol.

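[16] Proxies and gateways from HTTP to MIME-compliant protocols are responsible for ensuring that the message is in the correct format and encoding for safe transport on that protocol, where "safe transport" is determined by the limitations of the protocol in use. Such a proxy or gateway should label the data with an appropriate Content-Transfer-Encoding if that is likely to improve the chances of safe transport over the destination protocol.

Note: changes without an annotation are those made between RFC 1945 and RFC 2068.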
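The MIME to HTTP direction described above can be sketched with Python's standard `email` package: decode the body according to its CTE and drop the Content-Transfer-Encoding field before handing the result to the HTTP side. This is a simplified illustration for a single non-multipart entity; the function name `mime_entity_to_http` and the sample entity are assumptions of the example.

```python
import email


def mime_entity_to_http(raw_entity: bytes):
    """Return (headers, body) with any CTE removed, suitable for an HTTP payload."""
    message = email.message_from_bytes(raw_entity)
    # get_payload(decode=True) undoes quoted-printable or base64 and
    # returns the body unchanged for identity encodings.
    body = message.get_payload(decode=True)
    headers = [
        (name, value)
        for name, value in message.items()
        if name.lower() != "content-transfer-encoding"  # HTTP has no such field
    ]
    return headers, body
```

A real gateway would also have to handle multipart bodies part by part and recompute any length information, neither of which is shown here.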

[226] RFC 2660 - The Secure HyperText Transfer Protocol (version of 7/20/2013, 1:21:48 PM) http://tools.ietf.org/html/rfc2660#section-2.5

[43] OData requires Content-Transfer-Encoding: binary to be specified in HTTP response messages >>42.

[45] OData also requires Content-Transfer-Encoding: binary in the headers of the body parts of a multipart/mixed contained in an HTTP response message (these are MIME headers by definition) >>44.

[62] FIPA MTP: in an HTTP request, the field may be specified in the MIME headers of the multipart/mixed entities.

[63] Recherche - Les services de l'État en Gironde, 5/22/2025, 2:50:00 AM, 5/23/2025, 10:54:08 AM https://www.gironde.gouv.fr/contenu/recherche/(offset)/1350?SearchText=&dateFilter[]=
- [64] https://www.gironde.gouv.fr/contenu/telechargement/45738/310633/file/G%C3%A9n%C3%A9rac.zip
  - [65] content-transfer-encoding: binary

Notes

[2] 2003-10-12 01:11:04 +00:00 Anonymous: I have seen the value "8 bit". It was in spam, though.
[3] >>1 The phrase "MIME format" is used a lot, but its meaning is vague. Sometimes it really does refer to a MIME entity; sometimes it just means Base64, or Base64 wrapped uuencode-style, or even a media type.
[4] Broken implementations apparently return values such as 7bits.
[5] >>4 IW:Google:"\"Content-Transfer-Encoding: 7bits\"", IW:Google:"\"Content-Transfer-Encoding: 8bits\"" : these turn up more hits than expected. If this many can be found on the WWW, a considerable amount is probably in circulation.
[6] Content-Transfer-Encoding: uuencode is also fairly common; Becky!, for example, supports decoding it : Becky! Ver.2の改版履歴 http://www.becky-users.net/history2.html
[7] CBFlib Manual http://www.iucr.org/iucr-top/cif/imgcif/CBFlib.html#3.2.1 : this software adopts MIME for its file format but supports its own CTEs x-base8, x-base10 and x-base16.
[8] >>7 In cbfext98 v0.7.1 http://www.iucr.org/iucr-top/cif/imgcif/cbfext98.html#_array_data.data all three are written with an extra hyphen, as in x-base-8. Even base-64 gets one. The description of Base8/10/16 is in that document.
[9] The ISG+IRT/JIM System: Quick Tour - Demonstration On-Line http://www.cc.gatech.edu/projects/calton+morimori/ISG+IRT-JIM/MMFR-000920-01/demo1.html : it uses a MIME-like vocabulary in XML, with X-URL as a CTE value. This is not URI encoding; it appears to mean something like Content-Type: message/external-body; access-type=url (the body is a URI).
[10] ContentTransferEncodingプロパティ http://www.hitachi-to.co.jp/prod/prod_2/inter/emk/help/IMimeEntity/property/IMimeEntityContentTransferEncoding.htm : this implementation mentions 7-bit, 8-bit, x-uu, x-uue, x-uuencode, uu, uue, uuencode, x-xx, x-xxe, x-xxencode, xx, xxe, xxencode, x-binhex, x-gzip64 and gzip64.
[11] Some MUAs apparently send x-unknown for some reason.
[12] Among the uuencode family, x-uuencode seems likely to be understood by the most MUAs. Just an impression, though.
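A lenient receiver might fold the misspellings reported in these notes onto the names it actually implements. The mapping below is purely an assumption made for illustration: the notes only record that such values occur, not that treating them as synonyms is safe, and `ALIASES` and `normalize_cte` are hypothetical names.

```python
# Observed spellings (from the notes above) mapped to the name assumed to be meant.
ALIASES = {
    "7bits": "7bit", "7-bit": "7bit",
    "8bits": "8bit", "8-bit": "8bit", "8 bit": "8bit",
    "uu": "x-uuencode", "uue": "x-uuencode", "uuencode": "x-uuencode",
    "x-uu": "x-uuencode", "x-uue": "x-uuencode", "x-uuencoded": "x-uuencode",
}


def normalize_cte(value: str) -> str:
    """Lower-case the value, collapse whitespace and apply the alias table."""
    key = " ".join(value.strip().lower().split())
    return ALIASES.get(key, key)
```

Values that are not in the table pass through unchanged, so an unknown name can still be treated as unrecognized afterwards.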

[13] File - ドリームキャストでできること http://www.mars.jstar.ne.jp/~a_zone/dc/file.htm : uses x-dreamcast-file. It looks like plain binary, though.

Mysterious, but a surprising number of examples can be found. Why is it so widespread? I have no idea.

[14] Bommanews technical http://b-news.sourceforge.net/tech.html : uses X-Bommanews-224-ZZZ for internal purposes. 224 is the number of octet values used (0x20 to 0xFF makes 224), and ZZZ is the encoding block size, currently fixed at 40.

Although it is meant for internal use, it shows up all over Usenet for some reason. No actual examples other than X-Bommanews-224-40 have been found so far.

[15] x-pp-base128 : a dubious 8-bit format (see PostPet V3はOpenGL採用 http://slashdot.jp/comments.pl?sid=39332&cid=150006)
[16] x-ferrum-uids : IronDoc febase.h http://www.treedragon.com/ged/irondoc/febase.htm#fe_box
[17] >>15 Apparently used together with media types of the form x-postpet/*. Whether the CTE and the content type can be separated is unknown.
[18] >>16 Apparently used together with the media types x-ferrum-head/* or x-ferrum-menu/*.
[19] X-Zm-base64 : Zm is presumably an abbreviation of the MUA's name. There is an example at http://hp.ujf.cas.cz/mail/WA98_official/199707/19970718.html, but it is plain Base64.
[20] The Information and Content Exchange (ICE) Format and Protocol http://www.w3.org/TR/1998/NOTE-ice-19981026 : not strictly MIME, but the content-transfer-encoding attribute, "derived from" MIME, takes the value x-native-xml or base64. The specification says the former uses character entities in the standard XML way, but since expanding entities is the XML processor's job, it can be regarded as the counterpart of 7bit, 8bit or binary inside an XML document.

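[21] gzip is also seen fairly often, but since most examples are in HTTP, it is probably being confused with Content-Encoding. Does that really work? (Don't they at least check it in a web browser?)

[28] >>21 x-gzip is also common. How serious are they?

[22] x-ace : came up briefly on the IDN mailing list when people asked what to do about IDN in email, but the discussion apparently never got as far as a concrete format.
[23] x-base65 : http://groups.google.com/groups?selm=3AB60548.B2C16EF0%40remotepoint.com
[24] >>23 There are no other examples, only 64 alphabet characters can be found, and when he asked what base65 was nobody knew, so it looks like some kind of mistake. Still, writing x-base65 rather than base65 is an odd way to be mistaken.

[25] >>10 There are real examples of xxe, but what it is remains unclear.

[29] >>25 Apparently a uuencode variant that uses +, -, 0-9, A-Z and a-z as its alphabet.

[26] x-custom3to4 : unknown. It looks like uuencode, but perhaps it is something different. Could someone verify?

[27] x-pgp-version : a past proposal to implement PGP/MIME as a CTE. Whether a draft specification, an implementation or real examples exist is unknown.

[28] x-yenc : yEnc. Notorious for the heated argument it caused on the relevant mailing lists, after which part of the yEnc community supposedly came to dislike MIME.

There are examples in the wild all the same. As with x-uuencode, the body contains the whole yEnc encoding, metadata included.

There was even this gem: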

Content-Type: application/binary
Content-Transfer-Encoding: x-yenc; line=128; size=2345436;
  name=021005-301.dwg.zip

The mysterious media type application/binary aside, the use of parameters on the CTE is delightful. (Using parameters on a CTE goes against MIME's original design, but I think it is a reasonable stance, although it currently violates the standard. (I think that is right in general, but yEnc already carries line, size and name as metadata in the body. It is unclear what duplicating them at the MIME level achieves. Or is the idea to use some "pure" yEnc as the CTE and specify the metadata at the MIME level?))

[25] x-usercode : looks like xxencode, but unclear.

[31] Content-Transfer-Encoding: packet was once proposed (today's Transfer-Encoding: chunked).

[33] plain: apparently popular in spam (Anonymous, 2004-06-13 13:19:31 +00:00)

[36] Free email services and mailing lists sometimes add advertisements or list information at the beginning or end of the message body. They often do not take Content-Type: or Content-Transfer-Encoding: into account, so non-text/plain messages and Base64-encoded messages can end up corrupted. (Anonymous)

[37] Originally, (non-identity) content transfer encodings such as Base64 were prohibited on composite types (message/* and multipart/*). For message/*, RFC 5335 relaxed this so that each subtype can specify whether they are prohibited (they remain prohibited for the existing subtypes). message/global does in fact allow content transfer encodings.

For details, see the articles on the individual media types.

[227] RFC 2660 - The Secure HyperText Transfer Protocol (version of 11/9/2014, 5:12:37 AM) http://tools.ietf.org/html/rfc2660#section-2.5

[228] RFC 4130 - MIME-Based Secure Peer-to-Peer Business Data Interchange Using HTTP, Applicability Statement 2 (AS2) (version of 9/21/2014, 12:13:43 PM) http://tools.ietf.org/html/rfc4130#appendix-A.3

[51] RFC 6838 - Media Type Specifications and Registration Procedures (version of 2/10/2015, 3:35:08 PM) http://tools.ietf.org/html/rfc6838#section-4.8

[52] Content-Transfer-Encoding問題 ‐ 通信用語の基礎知識 (version of 6/9/2015, 10:23:13 AM) http://www.wdic.org/w/WDIC/Content-Transfer-Encoding%E5%95%8F%E9%A1%8C

However, Microsoft Internet Mail attaches the header "Content-Transfer-Encoding: 8bit" in this case, which is clearly wrong. ISO-2022-JP is a 7-bit encoding, so this is clearly an error, and it causes problems while the mail is in transit to the recipient.

[54] These days even email commonly arrives with Content-Transfer-Encoding: binary. 8/7/2015, 4:51:53 AM

[55] Email sometimes has no explicit CTE and yet is effectively 8bit. 8/7/2015, 5:00:26 AM

[56] RFC 7030 - Enrollment over Secure Transport (6/19/2016, 7:27:56 AM) https://tools.ietf.org/html/rfc7030#section-4.1.3

A successful response MUST be a certs-only CMC Simple PKI Response, as defined in [RFC5272], containing the certificates described in the following paragraph. The HTTP content-type of "application/pkcs7-mime" is used. The Simple PKI Response is sent with a Content-Transfer-Encoding of "base64" [RFC2045].

[59] SendRawEmail - Amazon Simple Email Service (10/15/2019, 9:32:31 PM) https://docs.aws.amazon.com/ja_jp/ses/latest/APIReference/API_SendRawEmail.html

Amazon SES allows you to specify 8-bit Content-Transfer-Encoding for MIME message parts. However, if Amazon SES has to modify the contents of your message (for example, if you use open and click tracking), 8-bit content isn't preserved.

[60] w3m おぼえがき(2001年12月) (12/27/2001, 3:30:43 PM, 10/24/2020, 11:52:40 PM) http://www2u.biglobe.ne.jp/~hsaka/w3mnote.cgi?month=200112

[61] rfc2060, 6/9/2021, 3:43:30 AM https://datatracker.ietf.org/doc/html/rfc2060#page-66

Content-Transfer-Encoding: binary