Parsing
One of the first things you'll probably want to do is parse an XML document of some kind. This is easy to do in dom4j. The following code demonstrates how.
    import java.net.URL;

    import org.dom4j.Document;
    import org.dom4j.DocumentException;
    import org.dom4j.io.SAXReader;

    public class Foo {

        public Document parse(URL url) throws DocumentException {
            SAXReader reader = new SAXReader();
            Document document = reader.read(url);
            return document;
        }
    }
Using Iterators
A document can be navigated using a variety of methods that return standard Java Iterators. For example:
    public void bar(Document document) throws DocumentException {
        Element root = document.getRootElement();

        // iterate through child elements of root
        for (Iterator<Element> it = root.elementIterator(); it.hasNext();) {
            Element element = it.next();
            // do something
        }

        // iterate through child elements of root with element name "foo"
        for (Iterator<Element> it = root.elementIterator("foo"); it.hasNext();) {
            Element foo = it.next();
            // do something
        }

        // iterate through attributes of root
        for (Iterator<Attribute> it = root.attributeIterator(); it.hasNext();) {
            Attribute attribute = it.next();
            // do something
        }
    }
Powerful Navigation with XPath
In dom4j, XPath expressions can be evaluated on the Document or on any Node in the tree (such as an Attribute, Element or ProcessingInstruction). This allows complex navigation throughout the document with a single line of code. For example:
    public void bar(Document document) {
        List<Node> list = document.selectNodes("//foo/bar");

        Node node = document.selectSingleNode("//foo/bar/author");

        String name = node.valueOf("@name");
    }
For example, if you wish to find all the hypertext links in an XHTML document, the following code would do the trick.
    public void findLinks(Document document) throws DocumentException {
        // select every href attribute of every <a> element
        List<Node> list = document.selectNodes("//a/@href");

        for (Iterator<Node> iter = list.iterator(); iter.hasNext();) {
            Attribute attribute = (Attribute) iter.next();
            String url = attribute.getValue();
        }
    }
If you need any help learning the XPath language, we highly recommend the Zvon tutorial, which allows you to learn by example.
Fast Looping
If you ever have to walk a large XML document tree, then for performance we recommend the fast looping method, which avoids the cost of creating an Iterator object for each loop. For example:
    public void treeWalk(Document document) {
        treeWalk(document.getRootElement());
    }

    public void treeWalk(Element element) {
        // index-based loop: no Iterator is allocated per element
        for (int i = 0, size = element.nodeCount(); i < size; i++) {
            Node node = element.node(i);
            if (node instanceof Element) {
                treeWalk((Element) node);
            } else {
                // do something…
            }
        }
    }
Creating a New XML Document
Often in dom4j you will need to create a new document from scratch. Here's an example of doing that.
    import org.dom4j.Document;
    import org.dom4j.DocumentHelper;
    import org.dom4j.Element;

    public class Foo {

        public Document createDocument() {
            Document document = DocumentHelper.createDocument();
            Element root = document.addElement("root");

            Element author1 = root.addElement("author")
                .addAttribute("name", "James")
                .addAttribute("location", "UK")
                .addText("James Strachan");

            Element author2 = root.addElement("author")
                .addAttribute("name", "Bob")
                .addAttribute("location", "US")
                .addText("Bob McWhirter");

            return document;
        }
    }
Writing a Document to a File
A quick and easy way to write a Document (or any Node) to a Writer is via the write() method.
    FileWriter out = new FileWriter("foo.xml");
    document.write(out);
    out.close();
If you want to change the format of the output, such as pretty printing or a compact format, or you want to use Writer or OutputStream objects as the destination, then you can use the XMLWriter class.
    import java.io.FileWriter;
    import java.io.IOException;

    import org.dom4j.Document;
    import org.dom4j.io.OutputFormat;
    import org.dom4j.io.XMLWriter;

    public class Foo {

        public void write(Document document) throws IOException {
            // write to a file; try-with-resources closes the FileWriter
            try (FileWriter fileWriter = new FileWriter("output.xml")) {
                XMLWriter fileXmlWriter = new XMLWriter(fileWriter);
                fileXmlWriter.write(document);
                fileXmlWriter.close();
            }

            // pretty print the document to System.out
            OutputFormat format = OutputFormat.createPrettyPrint();
            XMLWriter writer = new XMLWriter(System.out, format);
            writer.write(document);

            // compact format to System.out
            format = OutputFormat.createCompactFormat();
            writer = new XMLWriter(System.out, format);
            writer.write(document);
            writer.close();
        }
    }
Converting to and from Strings
If you have a reference to a Document or any other Node, such as an Attribute or Element, you can turn it into the default XML text via the asXML() method.
    Document document = …;
    String text = document.asXML();
If you have some XML as a String, you can parse it back into a Document using the helper method DocumentHelper.parseText().
    String text = "<person> <name>James</name> </person>";
    Document document = DocumentHelper.parseText(text);
Styling a Document with XSLT
Applying XSLT to a Document is quite straightforward using the JAXP API from Oracle. This allows you to work with any XSLT engine, such as Xalan or Saxon. Here is an example of using JAXP to create a transformer and then applying it to a Document.
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamSource;

    import org.dom4j.Document;
    import org.dom4j.io.DocumentResult;
    import org.dom4j.io.DocumentSource;

    public class Foo {

        public Document styleDocument(Document document, String stylesheet) throws Exception {
            // load the transformer using JAXP
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));

            // now style the given document
            DocumentSource source = new DocumentSource(document);
            DocumentResult result = new DocumentResult();
            transformer.transform(source, result);

            // return the transformed document
            Document transformedDoc = result.getDocument();
            return transformedDoc;
        }
    }
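To try the JAXP half of this on its own, the following is a minimal sketch that uses only classes bundled with the JDK. It swaps dom4j's DocumentSource and DocumentResult for plain StreamSource and StreamResult, so it does not exercise dom4j at all. The class name JaxpSketch, the inline stylesheet, and the sample XML are illustrative assumptions, not part of the dom4j guide.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class JaxpSketch {

    // Hypothetical stylesheet: writes the name attribute of each <author>, followed by ';'
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"root/author\">"
        + "<xsl:value-of select=\"@name\"/><xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result as a String
    public static String transform(String xml, String stylesheet) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><author name=\"James\"/><author name=\"Bob\"/></root>";
        System.out.println(transform(xml, STYLESHEET)); // prints "James;Bob;"
    }
}
```

Because the transformer is obtained through TransformerFactory.newInstance(), the same code should run unchanged against whichever XSLT engine is configured on the classpath.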
XMLPULL
Main features:
1. Simple interface
2. Implementation independent
3. Ease of use
4. Versatility
5. Performance
6. Minimal requirements (on the device, memory, and so on)
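First we need to create a parser instance. This takes three steps:
1. Get an instance of the XMLPULL factory.
2. (Optional) By default the factory produces parsers that are not namespace aware; to change this, setNamespaceAware() must be called.
3. Create an instance of the parser.
The following code implements these steps: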
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(true);
    XmlPullParser xpp = factory.newPullParser();
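The next step is to set the parser input:

    xpp.setInput(new FileReader(args[i]));

Parsing can then begin.
To retrieve the next event, a typical XMLPULL application calls next() repeatedly and stops once the END_DOCUMENT event is returned.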
    public void processDocument(XmlPullParser xpp) throws Exception {
        int eventType = xpp.getEventType();
        do {
            if (eventType == XmlPullParser.START_DOCUMENT) {
                System.out.println("Start document");
            } else if (eventType == XmlPullParser.END_DOCUMENT) {
                // not reached: the loop condition exits before END_DOCUMENT is processed
                System.out.println("End document");
            } else if (eventType == XmlPullParser.START_TAG) {
                processStartElement(xpp);
            } else if (eventType == XmlPullParser.END_TAG) {
                processEndElement(xpp);
            } else if (eventType == XmlPullParser.TEXT) {
                processText(xpp);
            }
            eventType = xpp.next();
        } while (eventType != XmlPullParser.END_DOCUMENT);
    }
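XMLPULL is not bundled with the JDK, so the loop above needs an XMLPULL implementation on the classpath. For readers who want to experiment with the same pull pattern without extra dependencies, here is a rough sketch using StAX (javax.xml.stream), which is a different API with a similar shape: the application asks for the next event and dispatches on its type. The class name StaxPullSketch and the log format are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxPullSketch {

    // Pulls events one at a time and records them as a space-separated log
    public static String events(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // report adjacent character data as a single CHARACTERS event
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

        StringBuilder log = new StringBuilder();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                log.append("start:").append(reader.getLocalName()).append(' ');
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                log.append("end:").append(reader.getLocalName()).append(' ');
            } else if (event == XMLStreamConstants.CHARACTERS && !reader.isWhiteSpace()) {
                log.append("text:").append(reader.getText()).append(' ');
            }
        }
        reader.close();
        return log.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // prints "start:person start:name text:James end:name end:person"
        System.out.println(events("<person><name>James</name></person>"));
    }
}
```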
Let's look at how to process a start tag. Processing an end tag is very similar; the main difference is that an end tag has no attributes.
    public void processStartElement(XmlPullParser xpp) {
        String name = xpp.getName();
        String uri = xpp.getNamespace();
        if ("".equals(uri)) {
            System.out.println("Start element: " + name);
        } else {
            System.out.println("Start element: {" + uri + "}" + name);
        }
    }
Now let's look at how to retrieve and print element content:
    public void processText(XmlPullParser xpp) throws XmlPullParserException {
        // getTextCharacters fills the holder with the start offset and the length
        int[] holderForStartAndLength = new int[2];
        char[] ch = xpp.getTextCharacters(holderForStartAndLength);
        int start = holderForStartAndLength[0];
        int length = holderForStartAndLength[1];
        System.out.print("Characters: \"");
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
                case '\\': System.out.print("\\\\"); break;
                case '"': System.out.print("\\\""); break;
                case '\n': System.out.print("\\n"); break;
                case '\r': System.out.print("\\r"); break;
                case '\t': System.out.print("\\t"); break;
                default: System.out.print(ch[i]); break;
            }
        }
        System.out.print("\"\n");
    }
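The escaping loop in processText can be checked in isolation, without a parser. The sketch below lifts the same switch into a hypothetical helper, EscapeSketch.escape, which takes a buffer plus a start offset and length (the same triple the parser hands back) and returns the escaped text instead of printing it.

```java
public class EscapeSketch {

    // Escapes backslash, double quote, newline, carriage return and tab
    // in the slice ch[start .. start + length) and returns the result
    public static String escape(char[] ch, int start, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
                case '\\': sb.append("\\\\"); break;
                case '"': sb.append("\\\""); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default: sb.append(ch[i]); break;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // the text of interest sits in the middle of a larger buffer
        char[] buffer = "xxa\tb\n\"c\"yy".toCharArray();
        System.out.println(escape(buffer, 2, 7)); // prints a\tb\n\"c\"
    }
}
```

Working on an offset and length rather than a fresh String mirrors why the parser exposes its internal buffer: the text can be processed without copying it first.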