Is it possible to include HTML text or CDATA in an XML attribute?

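When I try to put HTML text or CDATA inside an XML attribute, my parser keeps failing with "XML Parser failure: Unterminated attribute". Is there a way to do this, or is it not allowed by the standard?
If an attribute is not of a tokenized or enumerated type, it is processed as CDATA. For details on how attributes are processed, see Extensible Markup Language (XML) 1.0 (Fifth Edition):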
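To tie this back to the question: "CDATA" here names an attribute type (plain character data), not a `<![CDATA[ ... ]]>` section. The XML grammar does not allow a literal `<` inside an attribute value, so markup has to be escaped (`&lt;`, `&amp;`) before it goes into an attribute. Below is a minimal sketch using Python's standard library. The element name `item`, the attribute name `note`, and the sample string are made up for illustration, and the error the asker saw probably comes from a different parser than the one used here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Hypothetical HTML fragment we want to carry in an attribute.
html = '<b>Hello</b> & "goodbye"'

# A literal "<" inside an attribute value is not well-formed XML,
# so the parser rejects this document.
try:
    ET.fromstring('<item note="<b>Hello</b>"/>')
    raw_ok = True
except ET.ParseError:
    raw_ok = False

# quoteattr() escapes &, < and > and wraps the result in suitable quotes.
doc = f"<item note={quoteattr(html)}/>"

# After parsing, the application sees the original, unescaped text again.
round_trip = ET.fromstring(doc).get("note")

print(raw_ok)
print(round_trip == html)
```

If the HTML is large, putting it in element content (where a CDATA section is allowed) is usually easier to read than an escaped attribute.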

3.3.1 Attribute Types

XML attribute types are of three kinds: a string type, a set of tokenized types, and enumerated types. The string type may take any literal string as a value; the tokenized types are more constrained. The validity constraints noted in the grammar are applied after the attribute value has been normalized as described in 3.3.3 Attribute-Value Normalization.

06000

… …

3.3.3 Attribute-Value Normalization

Before the value of an attribute is passed to the application or checked for validity, the XML processor MUST normalize the attribute value by applying the algorithm below, or by using some other method such that the value passed to the application is the same as that produced by the algorithm.

  1. All line breaks MUST have been normalized on input to #xA as described in 2.11 End-of-Line Handling, so the rest of this algorithm operates on text normalized in this way.
  2. Begin with a normalized value consisting of the empty string.
  3. For each character, entity reference, or character reference in the unnormalized attribute value, beginning with the first and continuing to the last, do the following:
    • For a character reference, append the referenced character to the normalized value.
    • For an entity reference, recursively apply step 3 of this algorithm to the replacement text of the entity.
    • For a white space character (#x20, #xD, #xA, #x9), append a space character (#x20) to the normalized value.
    • For another character, append the character to the normalized value.

If the attribute type is not CDATA, then the XML processor MUST further process the normalized attribute value by discarding any leading and trailing space (#x20) characters, and by replacing sequences of space (#x20) characters by a single space (#x20) character.
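The algorithm above, including the extra trimming for non-CDATA types, can be sketched in a few lines of Python. This is an illustrative simplification, not how any real parser is implemented: the names `normalize_attribute`, `_step3`, `PREDEFINED` and `TOKEN` are invented here, the entity table is a plain dictionary of replacement texts, and well-formedness is not checked (a stray `&` is passed through as an ordinary character).

```python
import re

# Simplified replacement texts for the five predefined entities (assumption:
# a real processor would get further entities from the DTD).
PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}

# One token per match: hex char ref, decimal char ref, entity ref, or any single character.
TOKEN = re.compile(
    r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([A-Za-z_:][\w.:\-]*);|(.)", re.DOTALL
)

def _step3(text, entities, out):
    """Step 3: walk the value and append to the normalized output."""
    for m in TOKEN.finditer(text):
        kind = m.lastindex
        if kind == 1:                      # character reference (hex)
            out.append(chr(int(m.group(1), 16)))
        elif kind == 2:                    # character reference (decimal)
            out.append(chr(int(m.group(2))))
        elif kind == 3:                    # entity reference: recurse on its replacement text
            _step3(entities[m.group(3)], entities, out)  # KeyError if undeclared
        elif m.group(4) in " \r\n\t":      # literal white space becomes a space
            out.append(" ")
        else:                              # any other character is copied
            out.append(m.group(4))

def normalize_attribute(raw, entities=None, is_cdata=True):
    table = {**PREDEFINED, **(entities or {})}
    # Step 1: line breaks are normalized to #xA on input.
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    out = []                               # Step 2: start from the empty string
    _step3(text, table, out)
    value = "".join(out)
    if not is_cdata:
        # Non-CDATA types: trim spaces and collapse runs of spaces.
        value = re.sub(" +", " ", value.strip(" "))
    return value

print(repr(normalize_attribute("a\tb&#9;c")))
print(repr(normalize_attribute("  foo \r\n  bar ", is_cdata=False)))
```

Passing `entities={"nl": "\n"}` (a hypothetical entity) and normalizing `"x&nl;y"` shows the recursive case: the newline comes from replacement text rather than a character reference, so it ends up as a space.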

Note that if the unnormalized attribute value contains a character reference to a white space character other than space (#x20), the normalized value contains the referenced character itself (#xD, #xA or #x9). This contrasts with the case where the unnormalized value contains a white space character (not a reference), which is replaced with a space character (#x20) in the normalized value and also contrasts with the case where the unnormalized value contains an entity reference whose replacement text contains a white space character; being recursively processed, the white space character is replaced with a space character (#x20) in the normalized value.
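The difference between a literal white space character and a character reference is easy to observe with a real parser. The short check below uses Python's `xml.etree.ElementTree`, which sits on top of expat; the element name `a` and attribute name `x` are arbitrary, and other parsers are expected, but not verified here, to behave the same way.

```python
import xml.etree.ElementTree as ET

# The Python escapes \t and \n put a literal tab and newline into the document,
# while &#9; and &#10; are character references to the same characters.
tab_value = ET.fromstring('<a x="a\tb&#9;c"/>').get("x")
newline_value = ET.fromstring('<a x="1\n2&#10;3"/>').get("x")

# Literal white space is normalized to a space; referenced characters survive.
print(repr(tab_value))
print(repr(newline_value))
```

This is why a line break that must survive in an attribute value has to be written as `&#10;` rather than typed directly.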

All attributes for which no declaration has been read SHOULD be treated by a non-validating processor as if declared CDATA.

It is an error if an attribute value contains a reference to an entity for which no declaration has been read.

Original source: https://www.jb51.cc/xml/293511.html
