XML parsing with SAX: how to handle HTML as text within XML

I get an xml response from an external server.

Following some tutorials, I got a SAX parser working.

There is a small problem still remaining.

Within the response there is, for example, a description tag containing HTML like this:

<description><p><strong>Title</strong></p>Description</description> 

After parsing, the description field of my object contains only "<".

Is it possible to tell my parser to handle the HTML as plain text?

Or maybe there are other possibilities to solve this problem.

Thank you.


Since you didn't include your code, I have to guess at what you wrote. A common bug in SAX handler implementations is not handling the fact that an element's text may be delivered across multiple characters() calls. You need to append each chunk to a buffer and only use the accumulated text when you get the endElement() event.
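A minimal sketch of that aggregation, assuming the HTML inside description arrives escaped (&lt;p&gt;...), which would explain a field holding only "<". The class name DescriptionHandler and the parse() helper are made up for illustration; they are not from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects all text inside <description>.
public class DescriptionHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private boolean inDescription = false;
    private String description;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("description".equals(qName)) {
            inDescription = true;
            buffer.setLength(0); // start with an empty buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append
        // instead of overwriting the previous chunk.
        if (inDescription) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("description".equals(qName)) {
            description = buffer.toString(); // text is complete only here
            inDescription = false;
        }
    }

    public String getDescription() {
        return description;
    }

    // Convenience helper (an assumption for this sketch): parse a string.
    public static String parse(String xml) throws Exception {
        DescriptionHandler handler = new DescriptionHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getDescription();
    }
}
```

If the HTML is not escaped but sent as real child elements, this handler would only collect the text between the tags; in that case the markup has to be rebuilt from the startElement()/endElement() events, or the server has to escape it or wrap it in CDATA.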


Aside from fixing the SAX problem, you might consider using StAX (javax.xml.stream) instead: it performs about as well as SAX, but is often a bit more convenient. You can also force coalescing of textual content (XMLInputFactory.IS_COALESCING) to avoid problems like the one you are encountering with SAX.
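A rough sketch of the StAX route, again assuming the HTML in description is escaped text. It sets IS_COALESCING as suggested and reads the element with getElementText(); the class name StaxDescriptionReader and the method readDescription() are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: returns the text of the first <description> element.
public class StaxDescriptionReader {
    public static String readDescription(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "description".equals(reader.getLocalName())) {
                    // Reads everything up to the matching end tag.
                    // Throws XMLStreamException if a child element appears,
                    // i.e. if the HTML is not escaped.
                    return reader.getElementText();
                }
            }
            return null; // no <description> element found
        } finally {
            reader.close();
        }
    }
}
```

The pull style keeps the reading logic in one place instead of spreading state across three callbacks, which is most of the convenience mentioned above.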
