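How can I fix an HTML parser that inserts HTML into a paragraph in an Apache POI docx?
I am trying to insert HTML into a docx using Apache POI. Jsoup works well for parsing the HTML, and this answer helped me a lot, but I am stuck on inserting UL and LI elements into the docx, because inserting paragraphs at a new position is causing problems.
The question that helped me: How to set define different styles for the same paragraph
The UL parser I added: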
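import java.math.BigInteger;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.poi.xwpf.usermodel.XWPFAbstractNum;
import org.apache.poi.xwpf.usermodel.XWPFNumbering;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.apache.xmlbeans.XmlCursor;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.NodeVisitor;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTAbstractNum;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTLvl;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STLevelSuffix;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STNumberFormat;
// Assumption: CSSOMParser comes from the CSS Parser library.
import com.steadystate.css.parser.CSSOMParser;

public class UnorderedListParser implements NodeVisitor {
    String nodeName;
    boolean needNewRun;
    boolean isItalic;
    boolean isBold;
    boolean isUnderlined;
    int fontSize;
    boolean insertImage = false;
    Node anchorNode = null;
    boolean liStarted = false;
    String fontColor;
    final CSSOMParser parser = new CSSOMParser();
    XWPFParagraph paragraph;
    XWPFRun run;
    List<String> textList = new ArrayList<String>();

    UnorderedListParser(XWPFParagraph paragraph) {
        this.paragraph = paragraph;
        this.run = paragraph.createRun();
        this.nodeName = "";
        this.needNewRun = false;
        this.isItalic = false;
        this.isBold = false;
        this.isUnderlined = false;
        this.fontSize = 11;
        this.fontColor = "000000";
        this.insertImage = false;
    }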

    // Called when the traversal enters a node: collect the text of each <li>.
    @Override
    public void head(Node node, int depth) {
        nodeName = node.nodeName();
        System.out.println("Start1 " + nodeName + ": " + node);
        if ("li".equals(nodeName)) {
            liStarted = true;
        }
        if ("#text".equals(nodeName)) {
            if (liStarted) {
                textList.add(((TextNode) node).text());
            }
        }
    }

    // Called when the traversal leaves a node: write the list once </ul> is reached.
    @Override
    public void tail(Node node, int depth) {
        nodeName = node.nodeName();
        System.out.println("End1 " + nodeName);
        if ("li".equals(nodeName)) {
            liStarted = false;
        }
        if ("ul".equals(nodeName)) {
            try {
                System.out.println("going into create bullet list");
                createBulletList(paragraph, run, textList);
                run = paragraph.createRun();
            } catch (Exception e) {
                // Print the cause instead of silently swallowing it.
                System.out.println("into exception: " + e);
                e.printStackTrace();
            }
        }
    }

    // Parses an inline style attribute into a property -> value map.
    public static Map<String, String> getStyleMap(Node element) {
        Map<String, String> keymaps = new HashMap<>();
        if (!element.hasAttr("style")) {
            return keymaps;
        }
        // e.g. margin-top:-80px !important;color:#fcc;border-bottom:1px solid #ccc; background-color: #333; text-align:center
        String styleStr = element.attr("style");
        // Split into declarations on ';', then split each one at its first ':'
        // so that a value containing ':' is not cut short.
        for (String declaration : styleStr.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon <= 0) {
                continue; // empty or malformed declaration
            }
            String key = declaration.substring(0, colon).trim();
            String value = declaration.substring(colon + 1).trim();
            if (!key.isEmpty() && !value.isEmpty()) {
                keymaps.put(key, value);
            }
        }
        return keymaps;
    }

    public static void createBulletList(XWPFParagraph paragraph, XWPFRun run, List<String> documentList) {
        System.out.println("all good");
        CTAbstractNum cTAbstractNum = CTAbstractNum.Factory.newInstance();
        // Next we set the AbstractNumId. This requires care.
        // Since we are in a new document we can start numbering from 0.
        // But if we have an existing document, we must determine the next free number first.
        cTAbstractNum.setAbstractNumId(BigInteger.valueOf(0));
        // Bullet list
        CTLvl cTLvl = cTAbstractNum.addNewLvl();
        cTLvl.addNewNumFmt().setVal(STNumberFormat.BULLET);
        cTLvl.addNewSuff().setVal(STLevelSuffix.SPACE);
        cTLvl.addNewLvlText().setVal("•");
        XWPFAbstractNum abstractNum = new XWPFAbstractNum(cTAbstractNum);
        XWPFNumbering numbering = paragraph.getDocument().createNumbering();
        BigInteger abstractNumID = numbering.addAbstractNum(abstractNum);
        BigInteger numID = numbering.addNum(abstractNumID);
        XmlCursor cursor;
        for (String string : documentList) {
            paragraph.setNumID(numID);
            // Font size for the bullet point, in half points (22 = 11 pt)
            paragraph.getCTP().getPPr().addNewRPr().addNewSz().setVal(BigInteger.valueOf(22));
            run = paragraph.createRun();
            run.setText(string);
            run.setFontSize(11);
            // Move the cursor to the next start token after this paragraph
            // and insert a new paragraph there for the next item.
            cursor = paragraph.getCTP().newCursor();
            cursor.toEndToken();
            while (cursor.toNextToken() != XmlCursor.TokenType.START);
            paragraph = paragraph.getDocument().insertNewParagraph(cursor);
        }
        System.out.println("all good1");
        // Insert one more paragraph after the list.
        cursor = paragraph.getCTP().newCursor();
        cursor.toEndToken();
        while (cursor.toNextToken() != XmlCursor.TokenType.START);
        paragraph = paragraph.getDocument().insertNewParagraph(cursor);
    }

    public XWPFParagraph returnParagraph() {
        return paragraph;
    }
}
The HTML I am using:
<p><a href="" target="_blank">dda</a></p><p><br></p><ul><li>1</li><li>2</li><li>3</li></ul><p>CRM stands for<span style="color: rgb(0,0);"> “custom</span><span style="color: rgb(57,92,92);">er relationship management” and it’s software that stores customer contact information like names,addresses,and phone numbers,as well as keeps track of customer activity like website visits,pho</span><span style="color: rgb(0,0);">ne calls,email,and more.</span></p><p><span style="color: rgb(0,0);">Discover Customer 360,the world’s #1 CR</span>M. Connect to your customers in a more intelligent way by uniting sales,service,marketing,commerce,IT,and analytics. All powered by our global community of Trailblazers.</p><p><br></p><p><br></p><p><img src="https://test.com" alt="Porter charge to drop laptop from home2.jpg"></img></p>
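A side note on the colors in the sample HTML: the spans carry values such as `rgb(57,92,92)`, while the parser keeps `fontColor` as a six-digit hex string like "000000", which is the form `XWPFRun.setColor` accepts. The following is only a minimal sketch of that conversion using the standard library; the class name `CssColor` and the method `rgbToHex` are hypothetical and not part of the original code. Malformed values, such as the two-component `rgb(0,0)` that appears in the sample, are reported as empty rather than guessed at.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not in the original post): converts a CSS
// "rgb(r,g,b)" value into a six-digit uppercase hex string.
final class CssColor {
    // Matches exactly three integer components, with optional whitespace.
    private static final Pattern RGB = Pattern.compile(
            "rgb\\(\\s*(\\d{1,3})\\s*,\\s*(\\d{1,3})\\s*,\\s*(\\d{1,3})\\s*\\)");

    private CssColor() {}

    static Optional<String> rgbToHex(String cssValue) {
        if (cssValue == null) {
            return Optional.empty();
        }
        Matcher m = RGB.matcher(cssValue.trim().toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            return Optional.empty(); // not a three-component rgb() value
        }
        int r = Integer.parseInt(m.group(1));
        int g = Integer.parseInt(m.group(2));
        int b = Integer.parseInt(m.group(3));
        if (r > 255 || g > 255 || b > 255) {
            return Optional.empty(); // component out of range
        }
        return Optional.of(String.format("%02X%02X%02X", r, g, b));
    }
}
```

Under these assumptions, a caller could write something like `CssColor.rgbToHex(styleMap.get("color")).ifPresent(run::setColor)` and fall back to the default `fontColor` when the value cannot be parsed.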