18 April 2012

Fixing SAXParser Error “The system cannot find the file specified” for DTD files

When parsing an XML file with the SAXParser class, you may run into an error related to a .dtd file that cannot be found.

Example: We are parsing the file D:\homologene\build65\homologene.xml.

The first lines of the XML are:

<?xml version="1.0"?>
<!DOCTYPE ... "HomoloGene.dtd">
<...>
  <...>
    <...>
      <...>3</...>

We see a DOCTYPE declaration that points to a DTD file. DTD stands for Document Type Definition, and it defines the allowed structure of the XML file. Because the declaration names the DTD with a relative path, the parser resolves that path against the location of the XML file and tries to load the DTD from the same directory.
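The lookup location can be illustrated with plain URI resolution. This is only a sketch of the idea, not the parser's own code: the class name DtdLocationDemo and the method resolveSystemId are made up for this example, and the file: URI is our rendering of the Windows path from the example.

```java
import java.net.URI;

// Hypothetical helper; it only illustrates how a relative DTD name
// ends up next to the XML file.
class DtdLocationDemo {
    // Resolves a (possibly relative) system identifier against the document URI.
    static URI resolveSystemId(URI documentUri, String systemId) {
        return documentUri.resolve(systemId);
    }

    public static void main(String[] args) {
        URI xml = URI.create("file:/D:/homologene/build65/homologene.xml");
        // The DTD name replaces the last path segment of the document URI.
        System.out.println(resolveSystemId(xml, "HomoloGene.dtd"));
    }
}
```

The resolved location is in the same directory as the XML file, which matches the path in the error message.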

When parsing we get the following error:

java.io.FileNotFoundException: D:\homologene\build65\HomoloGene.dtd (The system cannot find the file specified)
  at java.io.FileInputStream.open(Native Method)
  at java.io.FileInputStream.<init>(Unknown Source)
  at java.io.FileInputStream.<init>(Unknown Source)
  at sun.net.www.protocol.file.FileURLConnection.connect(Unknown Source)
  at sun.net.www.protocol.file.FileURLConnection.getInputStream(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
  at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
  at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
  at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
  at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
  at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
  at loader.homologene.HomologeneParser.parseFile(HomologeneParser.java:95)
  at loader.homologene.HomologeneParser.parse(HomologeneParser.java:70)
  at loader.homologene.HomologeneMain.parse(HomologeneMain.java:33)

The reason for the error is that the DTD file does not exist on the filesystem.

To suppress this error and parse the XML without using the DTD, we override the resolveEntity() method that DefaultHandler inherits from the EntityResolver interface, and return an empty input source instead of the DTD:

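import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ParseTest extends DefaultHandler
{
  public void parse() throws ParserConfigurationException, SAXException, IOException
  {
    SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
    SAXParser saxParser = saxParserFactory.newSAXParser();
    // Pass this object as the handler so the parser calls our resolveEntity()
    saxParser.parse(new File("D:\\homologene\\build65\\homologene.xml"), this);
  }

  // Return an empty source for every external entity, so the DTD file is never opened
  @Override
  public InputSource resolveEntity(String publicId, String systemId)
  {
    return new InputSource(new ByteArrayInputStream("".getBytes()));
  }

  // ...other handler functions...
}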
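To try the override without the HomoloGene data, the following sketch parses a small in-memory document whose DOCTYPE points to a DTD that does not exist. The class name EmptyDtdDemo, the sample XML and the file name Missing.dtd are made up for illustration. The handler also records which system identifiers the parser asked for.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original loader code.
class EmptyDtdDemo extends DefaultHandler {
    // System identifiers the parser asked us to resolve.
    final List<String> requested = new ArrayList<>();
    // Number of start tags seen while parsing.
    int elementCount = 0;

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        requested.add(systemId);
        // An empty source stands in for the DTD, so no file is opened.
        return new InputSource(new StringReader(""));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elementCount++;
    }

    // Parses an XML string with a fresh handler and returns it for inspection.
    static EmptyDtdDemo parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        EmptyDtdDemo handler = new EmptyDtdDemo();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE root SYSTEM \"Missing.dtd\">"
                + "<root><item>3</item></root>";
        EmptyDtdDemo result = parse(xml);
        System.out.println("elements: " + result.elementCount);
        System.out.println("requested: " + result.requested);
    }
}
```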

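With this override in place, the file is parsed without the error. The workaround has a cost, though: entities and default attribute values declared in the DTD are no longer available, and the document cannot be validated against it. It is better to find and use the DTD file, so use this workaround at your own risk.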
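A possible alternative, if you would rather not touch the handler, is to tell the parser not to load external DTDs at all. The sketch below assumes a Xerces-based parser, such as the one bundled with the JDK, which recognises the load-external-dtd feature; other parsers may reject the feature URI with an exception. The class name NoExternalDtdDemo and the sample XML are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; assumes a Xerces-based SAX parser.
class NoExternalDtdDemo {
    // Xerces-specific feature; it only has an effect when validation is off.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses the XML without loading its external DTD and counts the start tags.
    static int countElements(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        SAXParser parser = factory.newSAXParser();
        final int[] count = {0};
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE root SYSTEM \"Missing.dtd\">"
                + "<root><item>3</item></root>";
        System.out.println("elements: " + countElements(xml));
    }
}
```

The same caveat applies as for the empty resolver: anything the DTD would have declared is ignored.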