[Solved] org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes


ZeroByteFileException is a subclass of TikaException. AutoDetectParser throws it when asked to extract content from a file (or any input stream) that contains zero bytes.


public class ZeroByteFileException extends TikaException

Constructors

  • ZeroByteFileException(String msg): Creates the exception with the given message.

ZeroByteFileException Example

Here is an example that parses the content and metadata of a text file with AutoDetectParser. It throws the exception because the file is empty (zero bytes).

package com.fiot.tika.exceptions.handling;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.apache.tika.parser.txt.TXTParser; // unused here; needed only if you switch to TXTParser

import org.xml.sax.SAXException;

public class TikaTextParserExample {

   public static void main(final String[] args) throws IOException, SAXException, TikaException {

      // handler and metadata receive the parsed text and properties
      BodyContentHandler handler = new BodyContentHandler();
      Metadata metadata = new Metadata();
      ParseContext pcontext = new ParseContext();

      // AutoDetectParser picks the concrete parser from the detected file type
      Parser parser = new AutoDetectParser();
      try (FileInputStream inputstream = new FileInputStream(new File("C:\\Users\\Saurabh Gupta\\Desktop\\TIKA\\BLANK-FILE.txt"))) {
         parser.parse(inputstream, handler, metadata, pcontext); // throws ZeroByteFileException for an empty file
      }

      System.out.println("Contents of the text document:" + handler.toString());
      System.out.println("Metadata of the text document:");
      String[] metadataNames = metadata.names();

      for (String name : metadataNames) {
         System.out.println(name + " : " + metadata.get(name));
      }
   }
}

Output


Exception in thread "main" org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122)
    at com.fiot.tika.exceptions.handling.TikaTextParserExample.main(TikaTextParserExample.java:29)

Solutions

There are two ways to avoid ZeroByteFileException:

  1. Always check the file size before parsing it.
  2. If you already know the content type of the file, use the specific parser for it. For example, in the case above, replace the AutoDetectParser line with the TXTParser instance below and no exception will occur.

Parser parser = new TXTParser();
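
The first option can be sketched with the standard library alone. This is a minimal illustration, not part of Tika: the class name EmptyFileCheck and the helper isZeroByteFile are hypothetical names chosen for this example, and the temporary file stands in for the blank file used above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class EmptyFileCheck {

    // Hypothetical helper: true when the path is a regular file that holds zero bytes.
    static boolean isZeroByteFile(Path path) throws IOException {
        return Files.isRegularFile(path) && Files.size(path) == 0L;
    }

    public static void main(String[] args) throws IOException {
        // A freshly created temp file is empty, like BLANK-FILE.txt in the example above.
        Path blank = Files.createTempFile("blank-file", ".txt");
        try {
            if (isZeroByteFile(blank)) {
                // Skip the parser entirely instead of letting it throw.
                System.out.println("Skipping empty file: " + blank.getFileName());
            } else {
                // Only at this point would the file be handed to parser.parse(...).
                System.out.println("File has content, safe to parse.");
            }
        } finally {
            Files.deleteIfExists(blank);
        }
    }
}
```

In the original example, the same guard would sit just before the parser.parse(...) call, so an empty file is reported or skipped rather than ending the program with a stack trace.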

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/ZeroByteFileException.html