[Solved] org.apache.tika.exception.EncryptedDocumentException: Unable to process: document is encrypted


EncryptedDocumentException is a subclass of TikaException. It is thrown when a Tika parser tries to extract content from an encrypted (password-protected) document, for example a Microsoft Word file.

 public class EncryptedDocumentException extends TikaException

Constructors

  • EncryptedDocumentException()
  • EncryptedDocumentException(String info)
  • EncryptedDocumentException(String info, Throwable th)
  • EncryptedDocumentException(Throwable th)

The exception type and message depend on the format of the encrypted file (docx or doc):

  • File password-protected.docx : org.apache.tika.exception.EncryptedDocumentException: Unable to process: document is encrypted
  • File password-protected.doc : org.apache.poi.EncryptedDocumentException: Cannot process encrypted word file

Here are the stack traces for both document types:

Tika password-protected.docx


Exception in thread "main" org.apache.tika.exception.EncryptedDocumentException: Unable to process: document is encrypted
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:245)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)

Tika password-protected.doc


Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@119e7782
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)
Caused by: org.apache.poi.EncryptedDocumentException: Cannot process encrypted word file
    at org.apache.poi.hwpf.model.FileInformationBlock.<init>(FileInformationBlock.java:77)
    at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:155)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:218)
    at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:80)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:199)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
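
As the two traces show, the docx case throws the Tika exception directly, while the doc case wraps the POI exception inside a generic TikaException. A single catch block on one type therefore misses the other. The following is a minimal sketch of one way to detect both cases by walking the cause chain and matching on the simple class name. The class name EncryptionCheck, the helper isEncrypted and the nested stand-in exception are hypothetical and exist only so the sketch runs without Tika or POI on the classpath. In real code you would pass in the Throwable caught around the parse call.

```java
public class EncryptionCheck {

    /** Stand-in for the real Tika/POI exception types; used only by the demo below (assumption). */
    static class EncryptedDocumentException extends RuntimeException {
        EncryptedDocumentException(String message) {
            super(message);
        }
    }

    /**
     * Returns true if t, or any exception in its cause chain, has the simple
     * class name "EncryptedDocumentException". This matches both
     * org.apache.tika.exception.EncryptedDocumentException and
     * org.apache.poi.EncryptedDocumentException without importing either.
     */
    static boolean isEncrypted(Throwable t) {
        int depth = 0;
        // The depth limit guards against a cyclic cause chain.
        while (t != null && depth < 32) {
            if ("EncryptedDocumentException".equals(t.getClass().getSimpleName())) {
                return true;
            }
            t = t.getCause();
            depth++;
        }
        return false;
    }

    public static void main(String[] args) {
        // docx case: the encrypted-document exception is thrown directly.
        Throwable docx = new EncryptedDocumentException("Unable to process: document is encrypted");

        // doc case: the encrypted-document exception is the cause of a wrapper exception.
        Throwable doc = new Exception(
                "Unexpected RuntimeException from OfficeParser",
                new EncryptedDocumentException("Cannot process encrypted word file"));

        // Unrelated failure: must not be reported as encrypted.
        Throwable other = new Exception("some other failure");

        System.out.println(isEncrypted(docx));  // true
        System.out.println(isEncrypted(doc));   // true
        System.out.println(isEncrypted(other)); // false
    }
}
```

Matching on the class name keeps the helper independent of the Tika and POI versions in use. If both libraries are on the classpath, an instanceof check against each type is the stricter alternative.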

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/EncryptedDocumentException.html
