Parse XML using SAX in Java

XML stands for Extensible Markup Language. SAX (Simple API for XML) is an event-based, sequential-access parser API for XML documents. SAX is an alternative to the DOM parser.

The DOM parser operates on the document as a whole, whereas the SAX parser operates on each piece of the document sequentially. SAX is suitable for large XML files because it does not consume much memory: it processes the document sequentially and keeps only what the handler chooses to store. DOM loads the whole document into memory, so it is not suitable for large documents.
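To make "event-based" concrete, here is a minimal sketch that records the order in which a SAX parser reports elements for a tiny in-memory document. The class name SaxEventTrace, the trace method, and the sample XML are made up for illustration; only the standard javax.xml.parsers and org.xml.sax classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace
{
    // Hypothetical helper: parses the XML text and returns the element
    // events in the order the parser fires them.
    static List<String> trace(String xml) throws Exception
    {
        List<String> events = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler()
        {
            @Override
            public void startElement(String uri, String localName,
                String qName, Attributes attributes)
            {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName)
            {
                events.add("end " + qName);
            }
        });
        return events;
    }

    public static void main(String[] args) throws Exception
    {
        // Made-up sample document, not the real feed.
        String xml = "<rss><item><title>Hello</title></item></rss>";
        for (String event : trace(xml))
            System.out.println(event);
    }
}
```

The parser never hands over a tree. It only calls back as it reads each tag, so the handler decides what, if anything, to keep in memory.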

The following Java program, ParseXMLSAX.java, reads the Google News feed in RSS/XML, goes through each “item” element, and prints out its “title”.

import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class ParseXMLSAX 
{
    static class MyHandler extends DefaultHandler
    {
        // True while the parser is inside an <item> element.
        boolean newItem = false;
        // Collects the text of the current item's <title>; null otherwise.
        StringBuilder title = null;

        @Override
        public void startElement(String uri, String localName,
            String qName, Attributes attributes) throws SAXException
        {
            if (qName.equals("item"))
                newItem = true;
            else if (qName.equals("title") && newItem)
                title = new StringBuilder();
        }

        @Override
        public void endElement(String uri, String localName,
            String qName) throws SAXException
        {
            if (qName.equals("item"))
            {
                // Leave the item so that titles outside items are ignored.
                newItem = false;
            }
            else if (qName.equals("title") && title != null)
            {
                System.out.println(title);
                title = null;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length)
            throws SAXException
        {
            // The parser may deliver one text node in several chunks,
            // so append instead of assigning.
            if (title != null)
                title.append(ch, start, length);
        }
    }
    
    public static void main(String[] args)
    {
        String url = "http://news.google.com/?output=rss";
        try
        {
            SAXParser parser = 
                SAXParserFactory.newInstance().newSAXParser();
            DefaultHandler handler = new MyHandler();
            parser.parse(url, handler);
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}

Sample output:

Nelson Mandela helped popularize use sanctions - Los Angeles Times
In rainy Washington, Obama lights national Christmas tree - Reuters
Jobless rate is at its lowest level in 5 years - The Seattle Times
US defends global cell phone tracking - Jamaica Observer
Spain handed 2010 final repeat - SPORT24
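Since the live feed changes all the time, it can help to try the same handler logic on a small in-memory document. The following is only a sketch under that assumption: the class ItemTitleCollector, the itemTitles method, and the sample feed text are invented for illustration and are not part of the program above. It returns the item titles as a list instead of printing them, and it skips the channel-level title.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ItemTitleCollector
{
    // Hypothetical helper: returns the text of every <title> found
    // inside an <item>, in document order.
    static List<String> itemTitles(String xml) throws Exception
    {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler()
        {
            private boolean inItem = false;
            private StringBuilder title = null;

            @Override
            public void startElement(String uri, String localName,
                String qName, Attributes attributes)
            {
                if (qName.equals("item"))
                    inItem = true;
                else if (qName.equals("title") && inItem)
                    title = new StringBuilder();
            }

            @Override
            public void endElement(String uri, String localName, String qName)
            {
                if (qName.equals("item"))
                {
                    inItem = false;
                }
                else if (qName.equals("title") && title != null)
                {
                    titles.add(title.toString());
                    title = null;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length)
            {
                // One text node may arrive in several calls, so append.
                if (title != null)
                    title.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return titles;
    }

    public static void main(String[] args) throws Exception
    {
        // Made-up feed text with one channel title and two item titles.
        String feed = "<rss><channel><title>Made-up feed</title>"
            + "<item><title>First &amp; second</title></item>"
            + "<item><title>Third</title></item>"
            + "</channel></rss>";
        System.out.println(itemTitles(feed)); // prints [First & second, Third]
    }
}
```

Returning a list rather than printing keeps the parsing step separate from the output step, which makes the handler easier to check against a known input.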
