Splitting an XML-like string into tokens in C# or SQL

I want to split an XML-like string into tokens in C# or SQL.
For example, the input string looks like this:

<entry><AUTHOR>C. Qiao</AUTHOR> and <AUTHOR>R.Melhem</AUTHOR>,"<TITLE>Reducing Communication </TITLE>",<DATE>1995</DATE>. </entry>

and I want this output:

C       AUTHOR
.       AUTHOR
Qiao    AUTHOR
and 
R       AUTHOR
.       AUTHOR
Melhem  AUTHOR,"
Reducing        TITLE
Communication   TITLE
",1995    DATE
.

Solution

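Here is a first attempt at solving this, under the following assumptions:
1. The XML string is valid (that is, there are no stray characters between the tags),
like this: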

string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR>
                                  <AUTHOR>R.Melhem</AUTHOR>
                                  <TITLE>Reducing Communication </TITLE>
                                  <DATE>1995</DATE>
                           </ENTRY>";

2. Splitting is done on the space character ' '.

string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR>
                              <AUTHOR>R.Melhem</AUTHOR>
                              <TITLE>Reducing Communication </TITLE>
                              <DATE>1995</DATE>
                       </ENTRY>";
        XElement doc = XElement.Parse(xml);
        foreach (XElement element in doc.Elements())
        {

            var values = element.Value.Split(' ');
            foreach (string value in values)
            {
                Console.WriteLine(element.Name + " " + value);
            }
        }

This prints:

AUTHOR C.
AUTHOR Qiao
AUTHOR R.Melhem
TITLE Reducing
TITLE Communication
TITLE
DATE 1995
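
The bare `TITLE` line in that output comes from the trailing space inside the TITLE element, which makes `Split(' ')` return an empty last entry. A minimal sketch of one way to avoid it, assuming empty tokens are never wanted; the class name `SpaceSplitSketch` and the method `Tokenize` are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class SpaceSplitSketch
{
    // Returns one "NAME token" line per space-separated token of each
    // direct child of the root element.
    public static List<string> Tokenize(string xml)
    {
        var lines = new List<string>();
        XElement root = XElement.Parse(xml);
        foreach (XElement element in root.Elements())
        {
            // RemoveEmptyEntries drops the empty string that a leading,
            // trailing, or doubled space would otherwise produce.
            string[] values = element.Value.Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string value in values)
            {
                lines.Add(element.Name + " " + value);
            }
        }
        return lines;
    }
}
```

Calling `SpaceSplitSketch.Tokenize(xml)` on the string above should give the same lines as before, minus the empty TITLE entry.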

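Edit:

To split on both "." and the space character, the best idea is to use a regular expression. Because the delimiters sit inside a capturing group, Regex.Split returns them as tokens too; the whitespace ones are then filtered out. Like this: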

var values = Regex.Split(element.Value,@"(\.| )");
        foreach (string value in values.Where(x=>!String.IsNullOrWhiteSpace(x)))
        {
            Console.WriteLine(element.Name + " " + value);
        }

You can add more delimiters if you like. The example above gives you the following:

AUTHOR C
AUTHOR .
AUTHOR Qiao
AUTHOR R
AUTHOR .
AUTHOR Melhem
TITLE Reducing
TITLE Communication
DATE 1995
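
As a sketch of what adding more delimiters could look like, the alternation can be written as a single character class. The delimiter set here (period, comma, double quote, any whitespace) is an assumption chosen to match the sample input, and `DelimiterSplitSketch` is a hypothetical helper name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimiterSplitSketch
{
    // Assumed delimiter set: period, comma, double quote, any whitespace.
    // The capturing group makes Regex.Split keep the delimiters as tokens.
    private static readonly Regex Delimiters = new Regex(@"([.,""\s])");

    public static List<string> Split(string text)
    {
        return Delimiters.Split(text)
                         // Drop empty strings and whitespace delimiters.
                         .Where(token => !string.IsNullOrWhiteSpace(token))
                         .ToList();
    }
}
```

With this set, punctuation such as `,` and `"` comes back as separate tokens, while spaces and line breaks are discarded.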

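Edit 2:
Here is an example that works with the original string. It is most likely not the best approach, because it does not keep the tokens in the correct order, but it should be very close: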

string xml = @" <entry>
                            <AUTHOR>C. Qiao</AUTHOR> 
                            and 
                            <AUTHOR>R.Melhem</AUTHOR>,""<TITLE>Reducing Communication </TITLE>"",<DATE>1995</DATE>. 
                           </entry>";
            //Parse xml to XDocument
            XDocument doc = XDocument.Parse(xml);

            // Get first element (we only have one)
            XElement element = doc.Descendants().FirstOrDefault();

            //Create a copy of an element for use by child elements.
            XElement copyElement = new XElement(element);
            //Remove all child nodes from root leaving only text
            element.Elements().Remove();

            //Splitting based on the tokens specified
            var values = Regex.Split(element.Value,@"(\.| |\,|\"")");
            foreach (string value in values.Where(x => !String.IsNullOrWhiteSpace(x)))
            {
                Console.WriteLine(value);
            }
            //Getting children nodes and splitting the same way
            foreach (XElement elem in copyElement.Elements())
            {
                var val = Regex.Split(elem.Value,@"(\.| |\,|\"")");
                foreach (string value in val.Where(x => !String.IsNullOrWhiteSpace(x)))
                {
                    Console.WriteLine(value + " " + elem.Name);
                }
            }
            //You can try to play with DescendantsAndSelf 
            //to see if you can do it in single action and with order preserved.
            //foreach (XElement elem in element.DescendantsAndSelf())
            //{
            //    //....
            //}

This prints the following:

and
,
"
"
,
.
C AUTHOR
. AUTHOR
Qiao AUTHOR
R AUTHOR
. AUTHOR
Melhem AUTHOR
Reducing TITLE
Communication TITLE
1995 DATE
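
One possible way to keep the tokens in document order is to walk the root's child nodes, which include both the child elements and the loose text between them, in a single pass. This is only a sketch under stated assumptions: the markup is well-formed, the tagged elements are direct children of the root, and the delimiter set is the one used above plus any whitespace. The names `OrderedTokenizerSketch` and `Tokenize` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class OrderedTokenizerSketch
{
    // Assumed delimiter set: period, comma, double quote, any whitespace.
    private static readonly Regex Delimiters = new Regex(@"([.,""\s])");

    // Returns (token, tag) pairs in document order. The tag is an empty
    // string for text that sits directly under the root element.
    public static List<KeyValuePair<string, string>> Tokenize(string xml)
    {
        var result = new List<KeyValuePair<string, string>>();
        XElement root = XElement.Parse(xml);

        // Nodes() yields child elements and text nodes interleaved,
        // in the order they appear in the input.
        foreach (XNode node in root.Nodes())
        {
            string text;
            string tag;
            if (node is XElement child)
            {
                text = child.Value;
                tag = child.Name.LocalName;
            }
            else if (node is XText plain)
            {
                text = plain.Value;
                tag = "";
            }
            else
            {
                continue; // comments, processing instructions, etc.
            }

            foreach (string token in Delimiters.Split(text)
                         .Where(t => !string.IsNullOrWhiteSpace(t)))
            {
                result.Add(new KeyValuePair<string, string>(token, tag));
            }
        }
        return result;
    }
}
```

Each pair can then be printed as `token + "\t" + tag`. Note that this emits every punctuation mark as its own untagged token, so the layout differs slightly from the output requested in the question, where `,"` shares a line with the preceding AUTHOR token.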
