[1] 次に示すのは、 RFC 3023 と RFC 2376 の差分, そしてその和訳です。 詳しくは RFC 3023 参照。
[ISO-10646] ISO/IEC, Information Technology - Universal Multiple- Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane, May 1993.
[ISO-8897] ISO (International Organization for Standardization) ISO 8879:1986(E) Information Processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML). First edition -- 1986- 10-15.
[REC-XML] T. Bray, J. Paoli, C. M. Sperberg-McQueen, "Extensible Markup Language (XML)" World Wide Web Consortium Recommendation REC- xml-19980210. http://www.w3.org/TR/1998/REC-xml-19980210.
[ASCII] "US-ASCII. Coded Character Set -- 7-Bit American Standard Code for Information Interchange", ANSI X3.4-1986, 1986.
[CSS] Bos, B., Lie, H.W., Lilley, C. and I. Jacobs, "Cascading
Style Sheets, level 2 (CSS2) Specification", World Wide
Web Consortium Recommendation REC-CSS2, May 1998,
<http://www.w3.org/TR/REC-CSS2/>.
[ISO8859] "ISO-8859. International Standard -- Information Processing -- 8-bit Single-Byte Coded Graphic Character Sets -- Part 1: Latin alphabet No. 1, ISO-8859-1:1987", 1987.
[MathML] Ion, P. and R. Miner, "Mathematical Markup Language
(MathML) 1.01", World Wide Web Consortium Recommendation
REC-MathML, July 1999, <http://www.w3.org/TR/REC-MathML/>.
[PNG] Boutell, T., "PNG (Portable Network Graphics)
Specification", World Wide Web Consortium Recommendation
REC-png, October 1996, <http://www.w3.org/TR/REC-png>.
[RDF] Lassila, O. and R.R. Swick, "Resource Description
Framework (RDF) Model and Syntax Specification", World
Wide Web Consortium Recommendation REC-rdf-syntax,
February 1999, <http://www.w3.org/TR/REC-rdf-syntax/>.
[RFC0821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, August 1982.
[RFC0977] Kantor, B. and P. Lapsley, "Network News Transfer Protocol", RFC 977, February 1986.
DEL[-
1557] Choi, U., Chon, K. and H. Park, "Korean Character Encoding
for Internet Messages", RFC 1557, December 1993.
[RFC1652] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D. Crocker, "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, July 1994.
[RFC-1874] Levinson, E., "SGML Media Types", RFC 1874, December 1995.
[RFC-2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC-2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.
[RFC2048] Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", RFC 2048, November 1996.
[RFC2060] Crispin, M., "Internet Message Access Protocol - Version 4rev1", RFC 2060, December 1996.
[RFC-2068] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068, January 1997.
[RFC2077] Nelson, S., Parks, C. and Mitra, "The Model Primary Content Type for Multipurpose Internet Mail Extensions", RFC 2077, January 1997.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2130] Weider, C., Preston, C., Simonsen, K., Alvestrand, H., Atkinson, R., Crispin, M. and P. Svanberg, "The Report of the IAB Character Set Workshop held 29 February - 1 March, 1996", RFC 2130, April 1997.
[RFC-2279] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 2279, January 1998.
[RFC2376] Whitehead, E. and M. Murata, "XML Media Types", RFC 2376, July 1998.
[RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax.", RFC 2396, August 1998.
[RFC2413] Weibel, S., Kunze, J., Lagoze, C. and M. Wolf, "Dublin Core Metadata for Resource Discovery", RFC 2413, September 1998.
[RFC2445] Dawson, F. and D. Stenerson, "Internet Calendaring and Scheduling Core Object Specification (iCalendar)", RFC 2445, November 1998.
[RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S. and D. Jensen, "HTTP Extensions for Distributed Authoring -- WEBDAV", RFC 2518, February 1999.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999.
[RFC2703] Klyne, G., "Protocol-independent Content Negotiation Framework", RFC 2703, September 1999.
[RFC2781] Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO 10646", RFC 2781, Februrary 2000.
[RFC2801] Burdett, D., "Internet Open Trading Protocol - IOTP Version 1.0", RFC 2801, April 2000.
[SGML] International Standard Organization, "Information Processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML)", ISO 8879, October 1986.
[SVG] Ferraiolo, J., "Scalable Vector Graphics (SVG)", World
Wide Web Consortium Candidate Recommendation SVG, November
2000, <http://www.w3.org/TR/SVG>.
[XHTML] Pemberton, S. and et al, "XHTML 1.0: The Extensible
HyperText Markup Language", World Wide Web Consortium
Recommendation xhtml1, January 2000,
<http://www.w3.org/TR/xhtml1>.
[XML] Bray, T., Paoli, J., Sperberg-McQueen, C.M. and E. Maler,
"Extensible Markup Language (XML) 1.0 (Second Edition)",
World Wide Web Consortium Recommendation REC-xml, October
2000, <http://www.w3.org/TR/REC-xml>.
[XSLT] Clark, J., "XSL Transformations (XSLT) Version 1.0", World
Wide Web Consortium Recommendation xslt, November 1999,
<http://www.w3.org/TR/xslt>.
[UNICODE] The Unicode Consortium, "The Unicode Standard -- Version 2.0", Addison-Wesley, 1996.
Chris Newman and Yaron Y. Goland both contributed content to the security considerations section of this document. In particular, some text in the security considerations section is copied verbatim from work in progress, draft-newman-mime-textpara-00, by permission of the author. Chris Newman additionally contributed content to the encoding considerations sections. Dan Connolly contributed content discussing when to use text/xml. Discussions with Ned Freed and Dan Connolly helped refine the author's understanding of the text media type; feedback from Larry Masinter was also very helpful in understanding media type registration issues.
Members of the W3C XML Working Group and XML Special Interest group have made significant contributions to this document, and the authors would like to specially recognize James Clark, Martin Duerst, Rick Jelliffe, Gavin Nicol for their many thoughtful comments.
E. James Whitehead, Jr. Dept. of Information and Computer Science University of California, Irvine Irvine, CA 92697-3425
EMail: ejw@@ics.uci.edu
Murata Makoto (Family Given) Fuji Xerox Information Systems, KSP 9A7, 2-1, Sakado 3-chome, Takatsu-ku, Kawasaki-shi, Kanagawa-ken, 213 Japan
EMail: murata@@fxis.fujixerox.co.jp
MURATA Makoto (FAMILY Given) IBM Tokyo Research Laboratory 1623-14, Shimotsuruma Yamato-shi, Kanagawa-ken 242-8502 Japan
Phone: +81-46-215-4678 EMail: mmurata@@trl.ibm.co.jp
Simon St.Laurent simonstl.com 1259 Dryden Road Ithaca, New York 14850 USA
EMail: simonstl@@simonstl.com URI: http://www.simonstl.com/
Dan Kohn Skymoon Ventures 3045 Park Boulevard Palo Alto, California 94306 USA
Phone: +1-650-327-2600 EMail: dan@@dankohn.com URI: http://www.dankohn.com/
Although the use of a suffix was not considered as part of the original MIME architecture, this choice is considered to provide the most functionality with the least potential for interoperability problems or lack of future extensibility. The alternatives to the '+xml' suffix and the reason for its selection are described below.
A.1 Why not just use text/xml or application/xml and let the XML
processor dispatch to the correct application based on the referenced DTD?
text/xml and application/xml remain useful in many situations, especially for document-oriented applications that involve combining XML with a stylesheet in order to present the data. However, XML is also used to define entirely new data types, and an XML-based format such as image/svg+xml fits the definition of a MIME media type exactly as well as image/png[PNG] does. (Note that image/svg+xml is not yet registered.) Although extra functionality is available for MIME processors that are also XML processors, XML-based media types -- even when treated as opaque, non-XML media types -- are just as useful as any other media type and should be treated as such.
Since MIME dispatchers work off of the MIME type, use of text/xml or application/xml to label discrete media types will hinder correct dispatching and general interoperability. Finally, many XML documents use neither DTDs nor namespaces, yet are perfectly legal XML.
A.2 Why not create a new subtree (e.g., image/xml.svg) to represent XML
MIME types?
The subtree under which a media type is registered -- IETF, vendor (*/vnd.*), or personal (*/prs.*); see [RFC2048] for details -- is completely orthogonal from whether the media type uses XML syntax or not. The suffix approach allows XML document types to be identified within any subtree. The vendor subtree, for example, is likely to include a large number of XML-based document types. By using a suffix, rather than setting up a separate subtree, those types may remain in the same location in the tree of MIME types that they would have occupied had they not been based on XML.
A.3 Why not create a new top-level MIME type for XML-based media types?
The top-level MIME type (e.g., model/*[RFC2077]) determines what kind of content the type is, not what syntax it uses. For example, agents using image/* to signal acceptance of any image format should certainly be given access to media type image/svg+xml, which is in
all respects a standard image subtype. It just happens to use XML to describe its syntax. The two aspects of the media type are completely orthogonal.
XML-based data types will most likely be registered in ALL top-level categories. Potential, though currently unregistered, examples could include application/mathml+xml[MathML] and image/svg+xml[SVG].
A.4 Why not just have the MIME processor 'sniff' the content to
determine whether it is XML?
Rather than explicitly labeling XML-based media types, the processor could look inside each type and see whether or not it is XML. The processor could also cache a list of XML-based media types.
Although this method might work acceptably for some mail applications, it would fail completely in many other uses of MIME. For instance, an XML-based web crawler would have no way of determining whether a file is XML except to fetch it and check. The same issue applies in some IMAP4[RFC2060] mail applications, where the client first fetches the MIME type as part of the message structure and then decides whether to fetch the MIME entity. Requiring these fetches just to determine whether the MIME type is XML could have significant bandwidth and latency disadvantages in many situations.
Sniffing XML also isn't as simple as it might seem. DOCTYPE declarations aren't required, and they can appear fairly deep into a document under certain unpreventable circumstances. (E.g., the XML declaration, comments, and processing instructions can occupy space before the DOCTYPE declaration.) Even sniffing the DOCTYPE isn't completely reliable, thanks to a variety of issues involving default values for namespaces within external DTDs and overrides inside the internal DTD. Finally, the variety in potential character encodings (something XML provides tools to deal with), also makes reliable sniffing less likely.
A.5 Why not use a MIME parameter to specify that a media type uses XML
syntax?
For example, one could use "Content-Type: application/iotp; alternate-type=text/xml" or "Content-Type: application/iotp; syntax=xml".
Section 5 of [RFC2045] says that "Parameters are modifiers of the media subtype, and as such do not fundamentally affect the nature of the content". However, all XML-based media types are by their nature
always XML. Parameters, as they have been defined in the MIME architecture, are never invariant across all instantiations of a media type.
More practically, very few if any MIME dispatchers and other MIME agents support dispatching off of a parameter. While MIME agents on the receiving side will need to be updated in either case to support (or fall back to) generic XML processing, it has been suggested that it is easier to implement this functionality when acting off of the media type rather than a parameter. More important, sending agents require no update to properly tag an image as "image/svg+xml", but few if any sending agents currently support always tagging certain content types with a parameter.
A.6 How about labeling with parameters in the other direction (e.g.,
application/xml; Content-Feature=iotp)?
This proposal fails under the simplest case, of a user with neither knowledge of XML nor an XML-capable MIME dispatcher. In that case, the user's MIME dispatcher is likely to dispatch the content to an XML processing application when the correct default behavior should be to dispatch the content to the application responsible for the content type (e.g., an ecommerce engine for application/iotp+xml[RFC2801], once this media type is registered).
Note that even if the user had already installed the appropriate application (e.g., the ecommerce engine), and that installation had updated the MIME registry, many operating system level MIME registries such as .mailcap in Unix and HKEY_CLASSES_ROOT in Windows do not currently support dispatching off a parameter, and cannot easily be upgraded to do so. And, even if the operating system were upgraded to support this, each MIME dispatcher would also separately need to be upgraded.
A.7 How about a new superclass MIME parameter that is defined to apply
to all MIME types (e.g., Content-Type: application/iotp; $superclass=xml)?
This combines the problems of Appendix A.5 and Appendix A.6.
If the sender attaches an image/svg+xml file to a message and includes the instructions "Please copy the French text on the road sign", someone with an XML-aware MIME client and an XML browser but no support for SVG can still probably open the file and copy the text. By contrast, with superclasses, the sender must add superclass support to her existing mailer AND the receiver must add superclass support to his before this transaction can work correctly.
If the receiver comes to rely on the superclass tag being present and applications are deployed relying on that tag (as always seems to happen), then only upgraded senders will be able to interoperate with those receiving applications.
A.8 What about adding a new parameter to the Content-Disposition header
or creating a new Content-Structure header to indicate XML syntax?
This has nearly identical problems to Appendix A.7, in that it requires both senders and receivers to be upgraded, and few if any operating systems and MIME dispatchers support working off of anything other than the MIME type.
A.9 How about a new Alternative-Content-Type header?
This is better than Appendix A.8, in that no extra functionality needs to be added to a MIME registry to support dispatching of information other than standard content types. However, it still requires both sender and receiver to be upgraded, and it will also fail in many cases (e.g., web hosting to an outsourced server), where the user can set MIME types (often through implicit mapping to file extensions), but has no way of adding arbitrary HTTP headers.
A.10 How about using a conneg tag instead (e.g., accept-features:
(syntax=xml))?
When the conneg protocol is fully defined, this may potentially be a reasonable thing to do. But given the limited current state of conneg[RFC2703] development, it is not a credible replacement for a MIME-based solution.
A.11 How about a third-level content-type, such as text/xml/rdf?
MIME explicitly defines two levels of content type, the top-level for the kind of content and the second-level for the specific media type. [RFC2048] extends this in an interoperable way by using prefixes to specify separate trees for IETF, vendor, and personal registrations. This specification also extends the two-level type by using the ' +xml' suffix. In both cases, processors that are unaware of these later specifications treat them as opaque and continue to interoperate. By contrast, adding a third-level type would break the current MIME architecture and cause numerous interoperability failures.
A.12 Why use the plus ('+') character for the suffix '+xml'?
As specified in Section 5.1 of [RFC2045], a tspecial can't be used:
tspecials := "(" / ")" / "<" / ">" / "@@" / "," / ";" / ":" / "\" / <"> "/" / "[" / "]" / "?" / "="
It was thought that "." would not be a good choice since it is already used as an additional hierarchy delimiter. Also, "*" has a common wildcard meaning, and "-" and "_" are common word separators and easily confused. The characters %'`#& are frequently used for quoting or comments and so are not ideal.
That leaves: ~!$^+{}|
Note that "-" is used heavily in the current registry. "$" and "_" are used once each. The others are currently unused.
It was thought that '+' expressed the semantics that a MIME type can be treated (for example) as both scalable vector graphics AND ALSO as XML; it is both simultaneously.
A.13 What is the semantic difference between application/foo and
application/foo+xml?
MIME processors that are unaware of XML will treat the '+xml' suffix as completely opaque, so it is essential that no extra semantics be assigned to its presence. Therefore, application/foo and application/foo+xml SHOULD be treated as completely independent media types. Although, for example, text/calendar+xml could be an XML version of text/calendar[RFC2445], it is possible that this (hypothetical) new media type would include new semantics as well as new syntax, and in any case, there would be many applications that support text/calendar but had not yet been upgraded to support text/calendar+xml.
A.14 What happens when an even better markup language (e.g., EBML) is
defined, or a new category of data?
In the ten years that MIME has existed, XML is the first generic data format that has seemed to justify special treatment, so it is hoped that no further suffixes will be necessary. However, if some are later defined, and these documents were also XML, they would need to specify that the '+xml' suffix is always the outermost suffix (e.g., application/foo+ebml+xml not application/foo+xml+ebml). If they were not XML, then they would use a regular suffix (e.g., application/foo+ebml).
A.15 Why must I use the '+xml' suffix for my new XML-based media type?
You don't have to, but unless you have a good reason to explicitly disallow generic XML processing, you should use the suffix so as not to curtail the options of future users and developers.
Whether the inventors of a media type, today, design it for dispatch to generic XML processing machinery (and most won't) is not the critical issue. The core notion is that the knowledge that some media type happens to use XML syntax opens the door to unanticipated kinds of processing beyond those envisioned by its inventors, and on this basis identifying such encoding is a good and useful thing.
Developers of new media types are often tightly focused on a particular type of processing that meets current needs. But there is no need to rule out generic processing as well, which could make your media type more valuable over time. It is believed that registering with the '+xml' suffix will cause no interoperability problems whatsoever, while it may enable significant new functionality and interoperability now and in the future. So, the conservative approach is to include the '+xml' suffix.
There are numerous and significant differences between this specification and [RFC2376], which it obsoletes. This appendix summarizes the major differences only.
First, text/xml-external-parsed-entity and application/xml-external- parsed-entity are added as media types for external parsed entities, and text/xml and application/xml are now prohibited.
Second, application/xml-dtd is added as a media type for external DTD subsets and external parameter entities, and text/xml and application/xml are now prohibited.
Third, "utf-16le" and "utf-16be" are added. RFC 2781 has introduced these BOM-less variations of the UTF-16 family.
Fourth, a naming convention ('+xml') for XML-based media types has been added, which also updates [RFC2048] as described in Section 7. By following this convention, an XML-based media type can be easily recognized as such.
This document reflects the input of numerous participants to the ietf-xml-mime@@imc.org mailing list, though any errors are the responsibility of the authors. Special thanks to:
Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed, Yaron Goland, Rick Jelliffe, Larry Masinter, David Megginson, Keith Moore, Chris Newman, Gavin Nicol, Marshall Rose, Jim Whitehead and participants of the XML activity at the W3C.
Copyright (C) The Internet Society (19982001). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Funding for the RFC Editor function is currently provided by the Internet Society.