All posts by Saurabh Gupta

My name is Saurabh Gupta. I completed my MCA at MMMEC Gorakhpur and have approximately 12 years of experience in information technology, mainly in Java/J2EE. During this time I have worked with multiple organizations and clients, and with many technologies and frameworks.

[Solved] org.apache.tika.parser.utils.DataURISchemeParseException


DataURISchemeParseException is a subclass of TikaException. This exception occurs when the syntax or encoding of the data in a data URI does not match the data URI scheme.

public class DataURISchemeParseException extends TikaException

Constructors

  • DataURISchemeParseException(String msg)

Data URI Scheme?

The data URI scheme is a URI scheme that provides a way to include data inline in web pages as if it were an external resource. It is useful for embedding CSS or images directly in a web page, so that no separate HTTP request is needed to download them.

For more detail about the data URI scheme, you can refer to this link: https://en.wikipedia.org/wiki/Data_URI_scheme
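
To make the format concrete, here is a minimal sketch that builds and decodes a base64 data URI using only the JDK. The class and method names (DataUriSketch, encode, decode) are made up for this illustration and are not part of TIKA, and the sketch handles only the base64 form of the scheme.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class DataUriSketch {

    // Builds a base64 data URI of the form data:<media type>;base64,<payload>
    static String encode(String mediaType, byte[] data) {
        return "data:" + mediaType + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    // Decodes the payload of a data URI; only the base64 form is handled in this sketch
    static byte[] decode(String uri) {
        if (!uri.startsWith("data:")) {
            throw new IllegalArgumentException("Not a data URI: " + uri);
        }
        int comma = uri.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("Missing ',' separator");
        }
        String header = uri.substring(5, comma);
        if (!header.endsWith(";base64")) {
            throw new IllegalArgumentException("Only base64 data URIs are handled here");
        }
        return Base64.getDecoder().decode(uri.substring(comma + 1));
    }

    public static void main(String[] args) {
        String uri = encode("text/plain", "Hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(uri);
        System.out.println(new String(decode(uri), StandardCharsets.UTF_8));
    }
}
```

A malformed header or an invalid base64 payload makes decode() fail with an IllegalArgumentException, which is the same kind of syntax or encoding mismatch that the exception above reports.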

References

https://tika.apache.org/1.22/api/org/apache/tika/parser/utils/DataURISchemeParseException.html

Python Overview


Python is a general-purpose, interpreted, interactive, object-oriented, high-level scripting and programming language. Python is highly readable: it mostly uses English keywords and has fewer syntactical constructions than other languages.

Python was developed by Guido van Rossum in the late 1980s and first released in 1991. Python source code is freely available under the Python Software Foundation License, which is compatible with the GNU GPL (General Public License).

Why Python?

  • Python is Interpreted: Python does not require a separate compilation step. It is processed by an interpreter at runtime, similar to PHP and Perl.
  • Python is Interactive: You can write a Python program at the Python prompt and interact with the interpreter directly.
  • Python is Object-Oriented: Python supports the object-oriented style of programming, which encapsulates code within objects.
  • Python is a Beginner’s Language: Python supports a wide range of applications, from simple text processing to WWW browsers to games. Its keywords are common English words, which makes a program easier to understand.

Python Features

Python is one of the most widely used languages for application development. Here are the most important features that make it a preferred choice:

  • Easy-to-learn: Python has a simple structure, English keywords, and a clearly defined syntax. This allows beginners to pick it up easily.
  • Easy-to-read: Python code is clearly defined and uses indentation-based formatting, so it is easy to read.
  • Easy-to-maintain: Python’s source code is easy to maintain.
  • A broad standard library: Python’s library is cross-platform compatible and very portable.
  • Interactive Mode: Python has an interactive mode in which you can write code at the Python prompt and interact with the interpreter. It allows testing and debugging of snippets of code.
  • Extendable: You can add low-level modules to the Python interpreter. These modules enable programmers to add to or customize their tools to be more efficient.
  • Portable: Python is portable because it runs on a variety of hardware platforms and provides the same interface on all of them.
  • Scalable: Python provides a better structure and support for large programs than shell scripting.
  • Databases: Python provides interfaces to all major databases used in commercial applications.
  • GUI Programming: Python supports GUI applications that can be created and ported to many system calls, libraries, and windowing systems, such as Macintosh and Windows MFC.

Python Characteristics

These are the most important characteristics of Python Programming Language:

  • Python supports structured and functional programming methods and also supports some features of OOP.
  • Python can be used as a scripting language or can be compiled to byte-code for developing large applications.
  • Python supports dynamic type checking and also provides very high-level dynamic data types.
  • Python supports automatic garbage collection, like Java.
  • Python can be easily integrated with C, C++, ActiveX, CORBA, COM, and Java.

Where to use Python?

Python is a very popular language for application development and scripting, and nowadays it is one of the most popular languages for artificial intelligence and machine learning.

  • Web Development: Django, Bottle, Pyramid, Tornado, Flask, Web2py
  • GUI Development: Tkinter, PyQt, PySide, Kivy, wxPython, PyGObject
  • Software Development: Buildbot, Trac, Roundup
  • System Administration: Ansible, OpenStack, Salt
  • Scientific and Numeric: Pandas, IPython, SciPy

[Solved] org.apache.tika.parser.chm.exception.ChmParsingException


ChmParsingException is a subclass of TikaException. This exception occurs when there is a problem with the CHM file.

public class ChmParsingException extends TikaException

Constructors

  • ChmParsingException(String description)

CHM?

CHM (Compiled HTML Help) is a help format used for software documentation, which consists of HTML pages, indexes, and other navigation tools. These files are compressed and deployed in binary format.

CHM files support the following features:

  • Data compression.
  • Built-in search engine.
  • One file can merge multiple .chm files.
  • Extended character support, although Unicode is not fully supported.

 

[Solved] org.apache.tika.io.EndianUtils.BufferUnderrunException


BufferUnderrunException is a subclass of TikaException. This exception occurs when a buffer is fed at a lower rate than it is read, so that a read finds fewer bytes in the stream than it needs. There can be many reasons for this, such as a connection interruption, a corrupted hard drive, or a CPU speed issue.

public static class EndianUtils.BufferUnderrunException extends TikaException

Constructors

  • BufferUnderrunException()

Solutions

Because this issue can have multiple causes, there are multiple possible solutions, depending on the need:

  1. Increase the buffer size.
  2. Perform hard drive defragmentation before burning to external devices.
  3. Avoid burning data onto a device over the network.
  4. Always take a backup of the data before transferring it.
  5. Run hard drive scanning software to identify corrupted files on the machine before exporting them.
  6. Give TIKA a higher memory limit, and meet the CPU and hard drive speed requirements, to ensure enough RAM is available.
  7. Make sure the device consuming the data, or the network connection, is functioning properly.
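
Here is a minimal sketch, using only the JDK, of what an underrun looks like when a fixed-width value is read from a stream that ends too early. The class and method names are made up for this illustration, and EOFException stands in for TIKA's exception; this is not TIKA's own implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class LittleEndianReadSketch {

    // Reads a 4-byte little-endian int and fails if the stream ends early
    static int readIntLE(InputStream in) throws IOException {
        int b0 = in.read();
        int b1 = in.read();
        int b2 = in.read();
        int b3 = in.read();
        // read() returns -1 at end of stream, so a negative result means an underrun
        if ((b0 | b1 | b2 | b3) < 0) {
            throw new EOFException("Buffer underrun: fewer than 4 bytes left in stream");
        }
        return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }

    public static void main(String[] args) throws IOException {
        byte[] complete = {0x01, 0x00, 0x00, 0x00};
        System.out.println(readIntLE(new ByteArrayInputStream(complete)));

        byte[] truncated = {0x01, 0x00};
        try {
            readIntLE(new ByteArrayInputStream(truncated));
        } catch (EOFException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the truncated input plays the role of a corrupted file or an interrupted transfer: the reader expects four bytes and only two arrive.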

References

https://tika.apache.org/1.22/api/org/apache/tika/io/EndianUtils.BufferUnderrunException.html

[Solved] org.apache.tika.exception.TikaMemoryLimitException


TikaMemoryLimitException is a subclass of TikaException. This exception generally occurs when there are lots of nested or embedded files within a document.

For example:

  1. Maven jars: where one jar contains a pom that references other dependencies.
  2. Git objects.
  3. Word documents having lots of embedded files.

Parsing these nested/embedded files requires a large amount of memory. When the parser's memory consumption reaches the maximum limit, it throws this exception.

Solutions

  1. Set the memory usage limit for TIKA as high as possible, at least more than 1 GB.
  2. Make it a common practice to shield the input stream with CloseShieldInputStream so that it can fail if it reaches the max limit.

Generally in TIKA, these allocations come from TikaInputStream.get(InputStream, TemporaryResources), which checks the type of the InputStream to identify whether it supports mark or not:

  • BufferedInputStream
  • ByteArrayInputStream

Unfortunately, because it is common practice to wrap InputStreams in CloseShieldInputStreams, this type check can fail and cause the exception even if mark is in fact supported.

public class TikaMemoryLimitException extends TikaException

Constructors

  • TikaMemoryLimitException(String msg)

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/TikaMemoryLimitException.html

[Solved] org.apache.tika.mime.MimeTypeException


MimeTypeException is a subclass of TikaException. This exception occurs when there is a mismatch between the selected parser and the document MIME type, or when the MIME type is not supported by TIKA.

public class MimeTypeException extends TikaException

Constructors

  • MimeTypeException(String message): Constructs a MimeTypeException with the specified detail message.
  • MimeTypeException(String message, Throwable cause): Constructs a MimeTypeException with the specified detail message and root cause.

References

https://tika.apache.org/1.22/api/org/apache/tika/mime/MimeTypeException.html

TIKA: MS-Excel Content and Metadata Extraction


In this program, you will see the complete steps to extract the content and metadata of an MS-Excel file by using the TIKA OOXMLParser.

Sample File

TIKA MS Excel File Content and Metadata extraction

Complete Example

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.microsoft.ooxml.OOXMLParser;
import org.apache.tika.sax.BodyContentHandler;

import org.xml.sax.SAXException;

public class TikaMSExcelParserExample {

	public static void main(final String[] args) throws IOException, TikaException, SAXException {

		// set up the content handler, metadata, input stream and parse context
		BodyContentHandler handler = new BodyContentHandler();
		Metadata metadata = new Metadata();
		FileInputStream inputstream = new FileInputStream(new File("C:\\Users\\Saurabh Gupta\\Desktop\\TIKA\\TIKA-MS-EXCEL.xlsx"));
		ParseContext pcontext = new ParseContext();

		// OOXml parser
		OOXMLParser msofficeparser = new OOXMLParser();
		msofficeparser.parse(inputstream, handler, metadata, pcontext);
		System.out.println("Contents of the excel document:" + handler.toString());
		System.out.println("Metadata of the excel document:");
		String[] metadataNames = metadata.names();

		for (String name : metadataNames) {
			System.out.println(name + ": " + metadata.get(name));
		}
	}
}

Output


Contents of the excel document:Sheet1
    First Name  Last Name   DOB
    Saurabh Gupta   10-Dec-85
    Gaurav  Kumar   12-May-86
    Rahul   Roi 12-Jun-10
    Raghvendra  Rana    5-Jan-95
    Tanaya  Jain    13-Mar-85



Metadata of the excel document:
date: 2019-11-23T00:25:08Z
extended-properties:AppVersion: 15.0300
meta:creation-date: 2006-09-16T00:00:00Z
extended-properties:Application: Microsoft Excel
extended-properties:Company: 
Creation-Date: 2006-09-16T00:00:00Z
dcterms:created: 2006-09-16T00:00:00Z
custom:WorkbookGuid: e742a774-13a6-49b2-8ba3-1b6118163781
dcterms:modified: 2019-11-23T00:25:08Z
Last-Modified: 2019-11-23T00:25:08Z
Last-Save-Date: 2019-11-23T00:25:08Z
Application-Version: 15.0300
protected: false
meta:save-date: 2019-11-23T00:25:08Z
Application-Name: Microsoft Excel
modified: 2019-11-23T00:25:08Z
publisher: 
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
dc:publisher: 

[Solved] org.apache.tika.exception.TikaConfigException


TikaConfigException is a subclass of TikaException. This exception occurs when there is an error in the Tika config file. It can also occur when one or more of the parsers fail to initialize from that erroneous config.

public class TikaConfigException extends TikaException

Constructors

  • TikaConfigException(String msg): Creates an instance of the exception with a message.
  • TikaConfigException(String msg, Throwable cause): Creates an instance of the exception with a message and a cause.

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/TikaConfigException.html

[Solved] org.apache.tika.exception.CorruptedFileException


CorruptedFileException is a subclass of TikaException. This exception occurs when the parse absolutely, positively has to stop because of corrupted content. This exception must not be caught and swallowed if an embedded parser throws it.

public class CorruptedFileException extends TikaException

Constructors

  • CorruptedFileException(String msg): This constructor is used to throw the exception with an error message.
  • CorruptedFileException(String msg, Throwable cause): This constructor is used to throw the exception with a message and the cause.

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/CorruptedFileException.html

[Solved] org.apache.tika.exception.AccessPermissionException


AccessPermissionException is a subclass of TikaException. This exception occurs when a document/file does not allow content extraction. For example, it is most common for PDF documents, which can restrict content extraction.

public class AccessPermissionException extends TikaException

Solutions

Always check the file's read, write, and execute permissions before using it with TIKA, and perform operations accordingly.

File file = new File("TEST-File");
Path path = file.toPath();

With Java NIO Libraries (java.nio.file.Files, which works on a Path)
boolean isRegularFile = Files.isRegularFile(path);
boolean isHidden = Files.isHidden(path); // may throw IOException
boolean isReadable = Files.isReadable(path);
boolean isExecutable = Files.isExecutable(path);
boolean isSymbolicLink = Files.isSymbolicLink(path);
boolean isWritable = Files.isWritable(path);

With Java IO Libraries (java.io.File)
boolean isReadable = file.canRead();
boolean isWritable = file.canWrite();
boolean isExecutable = file.canExecute();

Constructors

Here is the list of constructors for this exception class:

  • AccessPermissionException(): Default constructor.
  • AccessPermissionException(String info): Constructor with an exception message.
  • AccessPermissionException(String info, Throwable th): Constructor with an exception message and a cause.
  • AccessPermissionException(Throwable th): Constructor with a cause.

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/AccessPermissionException.html

[Solved] org.apache.tika.exception.UnsupportedFormatException


UnsupportedFormatException is a subclass of TikaException. Parsers throw this exception when they encounter a file format that they do not support. This generally happens when the versions of a format cannot be differentiated by MIME type.

For example, the MIME type application/wordperfect covers all versions of the WordPerfect format, while the parser only supports 6.x.

Solution

Whenever possible, distinguish file formats by a specific MIME type so that any unsupported version is handled by EmptyParser. If the versions cannot be distinguished by MIME type, the parser has to distinguish them itself and throw this exception.

Here is a complete list of supported Format, Parsers, and Mime Type for TIKA

TIKA Supported Document Formats, Parsers and MIME Type

public class UnsupportedFormatException extends TikaException


Constructors

  • UnsupportedFormatException(String msg)

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/UnsupportedFormatException.html

[Solved] org.apache.tika.exception.EncryptedDocumentException: Unable to process: document is encrypted


EncryptedDocumentException is a subclass of TikaException. This exception occurs when a TIKA parser tries to extract the content of an encrypted Microsoft Word document.

public class EncryptedDocumentException extends TikaException

Constructors

  • EncryptedDocumentException()
  • EncryptedDocumentException(String info)
  • EncryptedDocumentException(String info, Throwable th)
  • EncryptedDocumentException(Throwable th)

The exception message and exception type depend on the type of encrypted file (docx or doc):

  • File password-protected.docx : org.apache.tika.exception.EncryptedDocumentException: Unable to process: document is encrypted
  • File password-protected.doc : org.apache.poi.EncryptedDocumentException: Cannot process encrypted word file

Here are the stack traces for both types of documents:

Tika password-protected.docx


Exception in thread "main" org.apache.tika.exception.EncryptedDocumentException: Unable to process: document is encrypted
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:245)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)

Tika password-protected.doc


Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@119e7782
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)
Caused by: org.apache.poi.EncryptedDocumentException: Cannot process encrypted word file
    at org.apache.poi.hwpf.model.FileInformationBlock.<init>(FileInformationBlock.java:77)
    at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:155)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:218)
    at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:80)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:199)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/EncryptedDocumentException.html

[Solved] org.apache.tika.exception.TikaException: Error creating OOXML extractor


TikaException is the most common checked exception that needs to be handled while using the TIKA APIs.

Constructors

These are the two constructors of the TikaException class.

  • TikaException(String msg): Creates a TikaException with a message.
  • TikaException(String msg, Throwable cause): Creates a TikaException with a message and the cause of the exception.

Example

In this example, parsing the content and metadata of a PDF file throws a TikaException because the parser used does not support PDF. By mistake (or copy-paste), the example uses OOXMLParser, which is generally used to parse Microsoft Office documents.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.microsoft.ooxml.OOXMLParser;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;
import org.apache.tika.parser.txt.TXTParser;

import org.xml.sax.SAXException;

public class TikaPdfParserExample {

   public static void main(final String[] args) throws IOException,SAXException, TikaException {

      //set up the content handler, metadata, input stream and parse context
      BodyContentHandler handler = new BodyContentHandler();
      Metadata metadata = new Metadata();
      FileInputStream inputstream = new FileInputStream(new File("C:\\Users\\Saurabh Gupta\\Desktop\\TIKA\\PDF-FILE.pdf"));
      ParseContext pcontext=new ParseContext();

      //OOXMLParser is used here by mistake; it cannot parse a PDF file
      Parser  parser = new OOXMLParser();
      parser.parse(inputstream, handler, metadata,pcontext);
      System.out.println("Contents of the text document:" + handler.toString());
      System.out.println("Metadata of the text document:");
      String[] metadataNames = metadata.names();

      for(String name : metadataNames) {
         System.out.println(name + " : " + metadata.get(name));
      }
   }
}

Output


Exception in thread "main" org.apache.tika.exception.TikaException: Error creating OOXML extractor
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:209)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:110)
    at com.fiot.tika.exceptions.handling.TikaTextParserExample.main(TikaTextParserExample.java:31)
Caused by: org.apache.poi.openxml4j.exceptions.NotOfficeXmlFileException: No valid entries or contents found, this is not a valid OOXML (Office Open XML) file
    at org.apache.poi.openxml4j.util.ZipArchiveThresholdInputStream.getNextEntry(ZipArchiveThresholdInputStream.java:143)
    at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:47)
    at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:106)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:299)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:110)
    ... 2 more
Caused by: java.util.zip.ZipException: Unexpected record signature: 0X46445025
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:260)
    at org.apache.poi.openxml4j.util.ZipArchiveThresholdInputStream.getNextEntry(ZipArchiveThresholdInputStream.java:139)
    ... 6 more

Solutions

Always use AutoDetectParser in TIKA if you are not sure about the document type; otherwise, use the specific parser for the document type.
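
As a side note, the last cause in the stack trace above already tells you which kind of file was passed in. The sketch below decodes the record signature 0X46445025 from the ZipException, assuming (my reading of the trace, not something the TIKA documentation states) that the ZIP reader interprets the first four bytes of the file as a little-endian number. The class and method names are made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class RecordSignatureSketch {

    // Turns a 32-bit little-endian signature back into the four bytes read from the file
    static String signatureToAscii(int signature) {
        byte[] bytes = ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(signature)
                .array();
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Signature reported by the ZipException in the stack trace
        System.out.println(signatureToAscii(0x46445025));
    }
}
```

The four bytes spell %PDF, the marker at the start of a PDF file, whereas an OOXML file is a ZIP archive and starts with the letters PK. This is why OOXMLParser reports that the file is not a valid OOXML file.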

References

https://tika.apache.org/1.8/api/org/apache/tika/exception/TikaException.html

[Solved] org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes


ZeroByteFileException is a subclass of TikaException. ZeroByteFileException occurs when AutoDetectParser is used to extract the content of a file that has no text, i.e. zero bytes. In this case, the auto-detect parser throws ZeroByteFileException.


public class ZeroByteFileException extends TikaException

Constructors

  • ZeroByteFileException(String msg): This constructor is used to throw the exception with a message.

ZeroByteFileException Example

Here is an example that parses the content and metadata of a text file by using AutoDetectParser. It throws an exception because the file has no content (zero bytes).

package com.fiot.tika.exceptions.handling;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.apache.tika.parser.txt.TXTParser;

import org.xml.sax.SAXException;

public class TikaTextParserExample {

   public static void main(final String[] args) throws IOException,SAXException, TikaException {

      //set up the content handler, metadata, input stream and parse context
      BodyContentHandler handler = new BodyContentHandler();
      Metadata metadata = new Metadata();
      FileInputStream inputstream = new FileInputStream(new File("C:\\Users\\Saurabh Gupta\\Desktop\\TIKA\\BLANK-FILE.txt"));
      ParseContext pcontext=new ParseContext();

      //auto detect document parser
      Parser  parser = new AutoDetectParser();
      parser.parse(inputstream, handler, metadata,pcontext);
      System.out.println("Contents of the text document:" + handler.toString());
      System.out.println("Metadata of the text document:");
      String[] metadataNames = metadata.names();

      for(String name : metadataNames) {
         System.out.println(name + " : " + metadata.get(name));
      }
   }
}

Output


Exception in thread "main" org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122)
    at com.fiot.tika.exceptions.handling.TikaTextParserExample.main(TikaTextParserExample.java:29)

Solutions

There are two ways to handle ZeroByteFileException:

  1. Always check the file size before using it.
  2. If you already know the content type of the file, use the specific parser. For example, in the above case, replace the parser line with the text parser instance below and no exception will occur.

Parser parser = new TXTParser();
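
The first solution above can be sketched with the JDK alone. The class and method names (EmptyFileCheckSketch, hasContent) are made up for this illustration; the idea is simply to skip files of zero bytes before handing them to a parser.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class EmptyFileCheckSketch {

    // Returns true only for a regular, readable file that contains at least one byte
    static boolean hasContent(Path path) throws IOException {
        return Files.isRegularFile(path) && Files.isReadable(path) && Files.size(path) > 0;
    }

    public static void main(String[] args) throws IOException {
        Path blank = Files.createTempFile("blank", ".txt");
        Path filled = Files.createTempFile("filled", ".txt");
        Files.write(filled, "This is TIKA Test".getBytes(StandardCharsets.UTF_8));
        try {
            System.out.println(hasContent(blank));   // empty file: skip parsing
            System.out.println(hasContent(filled));  // non-empty file: can be handed to a parser
        } finally {
            Files.deleteIfExists(blank);
            Files.deleteIfExists(filled);
        }
    }
}
```

In the example above, you would call such a check on BLANK-FILE.txt before creating the FileInputStream and only call parser.parse() when it returns true.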

References

https://tika.apache.org/1.22/api/org/apache/tika/exception/ZeroByteFileException.html

TIKA Document Content Extraction


TIKA supports various parsers for different types of document formats. TIKA chooses the right parser and extracts content based on the document type.

Here you can get a complete list of TIKA supported document formats:

TIKA Supported Formats and Parsers

TIKA Content Extraction

There are two ways to extract content from a document by TIKA API:

  1. TIKA Facade class: Tika.parseToString()
  2. Parser Class : Parser.parse()

TIKA Facade class : Tika.parseToString()

The parseToString() method of the Tika facade class is used to extract content from a document. Tika internally uses the following steps to extract content from the document:

  1. Tika detects the document type.
  2. Based on the document type, it selects a suitable parser from the parser repository.
  3. The selected parser parses the document and extracts the content.
Tika tika = new Tika();
String content = tika.parseToString(file);

Example : TIKA Extract Content by Tika.parseToString()

Here in this program, you will see complete steps to extract content by the Tika facade class.

import java.io.File;
import java.io.IOException;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

import org.xml.sax.SAXException;

public class TikaContentExtraction1 {

   public static void main(final String[] args) throws IOException, TikaException {

      File file = new File("hello.txt");

      //Instantiating Tika facade class
      Tika tika = new Tika();

      String filecontent = tika.parseToString(file);
      System.out.println("Document Content: " + filecontent);
   }
}

Output


Document Content:
This is
TIKA
Test

Parser Interface: Parser.parse()

In TIKA, the parser package provides several interfaces and classes to extract the content of a document. Here is a list of the interfaces, classes, and methods used to extract content:

Parser Interface

TIKA supports multiple parsers according to document format. All these parser classes implement the Parser interface, for example PDFParser, Mp3Parser, OfficeParser, etc.

See Also: TIKA Supported Documents Format and Parsers

CompositeParser

CompositeParser uses the composite design pattern internally, which allows a group of parsers to be used through a single instance. It gives access to all the parsers that implement the Parser interface.

AutoDetectParser

AutoDetectParser is a subclass of CompositeParser that provides automatic document type detection. It automatically detects the document type and sends the document to the appropriate parser class by using the composite methodology.

parse() method

The parse() method of the Parser interface is used to extract content and metadata from the given document. Here is the prototype of the parse() method and a description of its parameters:

parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context);

TIKA supports several individual parser classes, i.e. XMLParser, PDFParser, Mp3Parser, etc., which can be used to parse a specific document type. If you want a generic way of parsing, TIKA provides CompositeParser and AutoDetectParser, which will automatically detect the document type and select the specific parser for extracting the content and metadata.


Parser parser = new CompositeParser();
   (or)
Parser parser = new AutoDetectParser();
   (or)
Create an object of any individual parser supported by the TIKA library.

Object: Description
InputStream stream: The input stream of a file.
ContentHandler handler: Tika sends the document content as XHTML to this handler, from which the text content is extracted by the SAX API.
Metadata metadata: Metadata tells about the internal information of the document. This object is used as both a source and a target of document metadata.
ParseContext context: This object is used when the parsing process needs to be customized as per client needs.

Steps to Extract Document content by Parser

  • Step 1: Create an instance of an input stream of the document.
File file = new File(filepath);
FileInputStream inputstream = new FileInputStream(file);
   or
InputStream stream = TikaInputStream.get(new File(filename));

Note: FileInputStream doesn’t support random access reads, which are needed to process some file formats efficiently. We can use TikaInputStream for random access to the file.

  • Step 2: Create an instance of ContentHandler.
    TIKA supports these three content handlers:
Content Handler: Description
BodyContentHandler: This class picks the body part of the XHTML output and writes that content to the output writer or output stream. Then it redirects the XHTML content to another content handler instance.
LinkContentHandler: This class picks only the links (href) of the document, which is useful for crawlers.
TeeContentHandler: This class is useful when several tools need to receive the same content simultaneously.

Example

BodyContentHandler handler = new BodyContentHandler();
  • Step 3: Create an instance of Metadata
Metadata metadata = new Metadata();
  • Step 4: Create an instance of ParseContext
ParseContext context = new ParseContext();
  • Step 5: Call the parse() method
    Call the parse() method on the parser instance with the arguments given below.

parser.parse(inputstream, handler, metadata, context);
  • Step 6: Extract Document Content

Call the handler.toString() method to get the parsed content of the document as text.

Complete Example: Extract Document Content

In this example, you will see the complete steps to extract content with a TIKA supported parser.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;

import org.xml.sax.SAXException;

public class TikaContentExtractionByParser {

   public static void main(final String[] args) throws IOException,SAXException, TikaException {

      File file = new File("hello.txt");

      //parse() method parameters
      Parser parser = new AutoDetectParser();
      BodyContentHandler handler = new BodyContentHandler();
      Metadata metadata = new Metadata();
      FileInputStream inputstream = new FileInputStream(file);
      ParseContext context = new ParseContext();

      //parsing the file hello.txt
      parser.parse(inputstream, handler, metadata, context);

      System.out.println("Document Content : " + handler.toString());
   }
}

Output

Document Content:
This is
TIKA
Test

In further posts, you will learn how to extract content and metadata from documents.

TIKA Language Detection


Language detection is required when documents need to be classified based on language. TIKA provides a separate class, LanguageIdentifier, to detect the language of a text.

The LanguageIdentifier class uses the following algorithms to detect language:

Profiling Corpus Algorithm

This algorithm creates a profile for each language from its dictionary of common words, for example common English words like a, an, the, etc. The language whose profile best matches the common words in the text is then chosen.

Here the terms are used as follows:

Corpus: a collection of the most commonly used terms of a written language.
Profiling: a dictionary of words of each language.

Drawback: If two languages have similar characters and words, it is difficult to decide the language based on the frequency of words.
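
The idea can be sketched in a few lines of plain Java. This is only an illustration of word-based profiling, not the code TIKA uses: the class name, the two tiny word lists, and the scoring rule are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class WordProfileSketch {

    // Tiny hand-picked lists of common words, for illustration only
    static final Map<String, List<String>> PROFILES = new LinkedHashMap<>();
    static {
        PROFILES.put("en", Arrays.asList("a", "an", "the", "is", "and", "of"));
        PROFILES.put("de", Arrays.asList("der", "die", "das", "ist", "und", "ein"));
    }

    // Returns the language whose common words occur most often, or "unknown" if none match
    static String detect(String text) {
        String[] words = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        String best = "unknown";
        int bestScore = 0;
        for (Map.Entry<String, List<String>> profile : PROFILES.entrySet()) {
            int score = 0;
            for (String word : words) {
                if (profile.getValue().contains(word)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = profile.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(detect("English is so funny and the weather is fine."));
        System.out.println(detect("Das Wetter ist gut und der Himmel ist blau."));
    }
}
```

The drawback mentioned above shows up quickly with such a sketch: a short text that contains none of the listed words, or words shared by two languages, cannot be classified reliably.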

N-gram Algorithm

To overcome this drawback of the “Profiling Corpus Algorithm”, a newer approach profiles the corpus using character sequences of a given length. Such a sequence of characters is called an N-gram, where N is the length of the sequence.

The N-gram approach helps to detect European languages such as English. Tika uses a 3-gram approach for language detection. The N-gram approach also works well for short texts.
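
To see what 3-grams look like, here is a minimal sketch that only splits a text into overlapping character sequences. The class NGramDemo and its ngrams() method are hypothetical names for this example. A real detector would additionally count how often each N-gram occurs and compare the counts with stored language profiles.

```java
import java.util.ArrayList;
import java.util.List;

class NGramDemo {

   // Splits the text into overlapping character sequences of length n.
   static List<String> ngrams(String text, int n) {
      List<String> result = new ArrayList<>();
      for (int i = 0; i + n <= text.length(); i++) {
         result.add(text.substring(i, i + n));
      }
      return result;
   }

   public static void main(String[] args) {
      System.out.println(ngrams("hello", 3)); // prints [hel, ell, llo]
   }
}
```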

TIKA Supported Languages

ISO 639-1 defines 184 standard languages, but Tika can detect only the following 18:

da—Danish de—German et—Estonian
el—Greek en—English es—Spanish
fi—Finnish fr—French hu—Hungarian
is—Icelandic it—Italian nl—Dutch
no—Norwegian pl—Polish pt—Portuguese
ru—Russian sv—Swedish th—Thai

How to detect Language by Tika?

The getLanguage() method of the LanguageIdentifier class returns the language of the text content passed to it.

//Create a LanguageIdentifier object based on the content.
LanguageIdentifier object = new LanguageIdentifier("English is so funny.");
//Get the language code of the given content.
String lang = object.getLanguage();

Example: Detect Language from Text

This example shows the steps to get the language of the given content.

import java.io.IOException;

import org.apache.tika.exception.TikaException;
import org.apache.tika.language.LanguageIdentifier;

import org.xml.sax.SAXException;

public class LanguageDetection {

   public static void main(String args[])throws IOException, SAXException, TikaException {

      LanguageIdentifier object = new LanguageIdentifier("English is so funny.");
      String lang = object.getLanguage();
      System.out.println("Detected Language is : " + lang);
   }
}

Output


Detected Language is : en

Example: Detect Language from Document Contents

To detect the language of a document, we first need to parse the document by using the parse() method. The parse() method stores the parsed content in the handler object. The content of the handler object is then passed to the LanguageIdentifier constructor to identify the language.

//Get metadata and extract content by parser parse() method.
parser.parse(inputstream, handler, metadata, context);
//Pass content as parameter of constructor of LanguageIdentifier
LanguageIdentifier object = new LanguageIdentifier(handler.toString());

Complete Example

Here are the complete steps to extract the content of a document and detect its language.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.apache.tika.language.*;

import org.xml.sax.SAXException;

public class TikaDocumentLanguageDetection{

   public static void main(final String[] args) throws IOException, SAXException, TikaException {

      //Instantiating a file object
      File file = new File("hello.txt");

      //Create objects of required arguments for parse() method.
      Parser parser = new AutoDetectParser();
      BodyContentHandler handler = new BodyContentHandler();
      Metadata metadata = new Metadata();
      FileInputStream content = new FileInputStream(file);

      //Get metadata and extract content by parser parse() method.
      parser.parse(content, handler, metadata, new ParseContext());

      LanguageIdentifier object = new LanguageIdentifier(handler.toString());
      System.out.println("File Content :" + handler.toString());
      System.out.println("Language Name :" + object.getLanguage());
   }
}

Output


File Content : English is so funny.
Language Name : en

TIKA Document Type Detection


The detect() method of the TIKA facade class is used to detect the document type of an input file.

Example

This program detects the file type of the input file.

import java.io.File;
import org.apache.tika.Tika;
public class TikaTypeDetection {

   public static void main(String[] args) throws Exception {

      //Suppose hello.txt is in your current directory
      File file = new File("hello.txt");

      //Instantiate the Tika facade class
      Tika tika = new Tika();

      //detect file type using detect method
      String filetype = tika.detect(file);
      System.out.println(filetype);
   }
}

Output


text/plain

TIKA Supported Document Formats


TIKA supports the following document formats. The list also shows the parser and MIME types for each format.

Format Parser MIME Type
HyperText Markup Language HtmlParser text/html
application/vnd.wap.xhtml+xml
application/x-asp
application/xhtml+xml
XML and derived formats DcXMLParser
Microsoft Office document formats OfficeParser
OOXMLParser application/vnd.ms-powerpoint.template.macroenabled.12
application/vnd.ms-excel.addin.macroenabled.12
application/vnd.openxmlformats-officedocument.wordprocessingml.template
application/vnd.ms-excel.sheet.binary.macroenabled.12
application/vnd.openxmlformats-officedocument.wordprocessingml.document
application/vnd.ms-powerpoint.slide.macroenabled.12
application/vnd.ms-visio.drawing
application/vnd.ms-powerpoint.slideshow.macroenabled.12
application/vnd.ms-powerpoint.presentation.macroenabled.12
application/vnd.openxmlformats-officedocument.presentationml.slide
application/vnd.ms-excel.sheet.macroenabled.12
application/vnd.ms-word.template.macroenabled.12
application/vnd.ms-word.document.macroenabled.12
application/vnd.ms-powerpoint.addin.macroenabled.12
application/vnd.openxmlformats-officedocument.spreadsheetml.template
application/vnd.ms-xpsdocument
application/vnd.ms-visio.drawing.macroenabled.12
application/vnd.ms-visio.template.macroenabled.12
model/vnd.dwfx+xps
application/vnd.openxmlformats-officedocument.presentationml.template
application/vnd.openxmlformats-officedocument.presentationml.presentation
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
application/vnd.ms-visio.stencil
application/vnd.ms-visio.template
application/vnd.openxmlformats-officedocument.presentationml.slideshow
application/vnd.ms-visio.stencil.macroenabled.12
application/vnd.ms-excel.template.macroenabled.12
OldExcelParser application/vnd.ms-excel.workspace.3
application/vnd.ms-excel.workspace.4
application/vnd.ms-excel.sheet.2
application/vnd.ms-excel.sheet.3
application/vnd.ms-excel.sheet.4
SpreadsheetMLParser
WordMLParser application/vnd.ms-wordml
Word2006MlParser application/vnd.ms-word2006ml
MSOwnerFileParser application/x-ms-owner
OpenDocument Format OpenDocumentParser application/x-vnd.oasis.opendocument.presentation
application/vnd.oasis.opendocument.chart
application/x-vnd.oasis.opendocument.text-web
application/x-vnd.oasis.opendocument.image
application/vnd.oasis.opendocument.graphics-template
application/vnd.oasis.opendocument.text-web
application/x-vnd.oasis.opendocument.spreadsheet-template
application/vnd.oasis.opendocument.spreadsheet-template
application/vnd.sun.xml.writer
application/x-vnd.oasis.opendocument.graphics-template
application/vnd.oasis.opendocument.graphics
application/vnd.oasis.opendocument.spreadsheet
application/x-vnd.oasis.opendocument.chart
application/x-vnd.oasis.opendocument.spreadsheet
application/vnd.oasis.opendocument.image
application/x-vnd.oasis.opendocument.text
application/x-vnd.oasis.opendocument.text-template
application/vnd.oasis.opendocument.formula-template
application/x-vnd.oasis.opendocument.formula
application/vnd.oasis.opendocument.image-template
application/x-vnd.oasis.opendocument.image-template
application/x-vnd.oasis.opendocument.presentation-template
application/vnd.oasis.opendocument.presentation-template
application/vnd.oasis.opendocument.text
application/vnd.oasis.opendocument.text-template
application/vnd.oasis.opendocument.chart-template
application/x-vnd.oasis.opendocument.chart-template
application/x-vnd.oasis.opendocument.formula-template
application/x-vnd.oasis.opendocument.text-master
application/vnd.oasis.opendocument.presentation
application/x-vnd.oasis.opendocument.graphics
application/vnd.oasis.opendocument.formula
application/vnd.oasis.opendocument.text-master
iWorks document formats IWorkPackageParser application/vnd.apple.keynote
application/vnd.apple.iwork
application/vnd.apple.numbers
application/vnd.apple.pages
WordPerfect document formats WordPerfectParser application/vnd.wordperfect; version=5.1
application/vnd.wordperfect; version=5.0
application/vnd.wordperfect; version=6.x
org.apache.tika.parser.xml.DcXMLParser
application/xml
image/svg+xml
QuattroProParser application/x-quattro-pro; version=9
Portable Document Format PDFParser application/pdf
Electronic Publication Format EpubParser application/x-ibooks+zip
application/epub+zip
FictionBookParser application/x-fictionbook+xml
org.gagravarr.tika.FlacParser
audio/x-oggflac
audio/x-flac
Rich Text Format RTFParser application/rtf
Compression and packaging formats CompressorParser application/zlib
application/x-gzip
application/x-bzip2
application/x-compress
application/x-java-pack200
application/x-lzma
application/deflate64
application/x-lz4
application/x-snappy
application/x-brotli
application/gzip
application/x-bzip
application/x-xz
PackageParser application/x-tar
application/java-archive
application/x-arj
application/x-archive
application/zip
application/x-cpio
application/x-tika-unix-dump
application/x-7z-compressed
RarParser application/x-rar-compressed
AppleSingleFileParser application/applefile
Text formats TXTParser
Feed and Syndication formats FeedParser application/atom+xml
application/rss+xml
IptcAnpaParser text/vnd.iptc.anpa
Help formats ChmParser application/vnd.ms-htmlhelp
application/x-chm
application/chm
Audio formats AudioParser audio/vnd.wave
audio/x-wav
audio/basic
audio/x-aiff
MidiParser application/x-midi
audio/midi
Mp3Parser audio/mpeg
Mp4Parser video/x-m4v
application/mp4
video/3gpp
video/3gpp2
video/quicktime
audio/mp4
video/mp4
VorbisParser audio/vorbis
OpusParser audio/opus
audio/ogg; codecs=opus
SpeexParser audio/ogg; codecs=speex
audio/speex
FlacParser
Image formats ImageParser image/png
image/vnd.wap.wbmp
image/x-jbig2
image/bmp
image/x-xcf
image/gif
image/x-icon
image/x-ms-bmp
JpegParser image/jpeg
TiffParser image/tiff
PSDParser image/vnd.adobe.photoshop
BPGParser image/bpg
image/x-bpg
WebPParser image/webp
ICNSParser image/icns
TesseractOCRParser
WMFParser image/wmf
EMFParser image/emf
Video formats FLVParser video/x-flv
Mp4Parser video/x-m4v
application/mp4
video/3gpp
video/3gpp2
video/quicktime
audio/mp4
video/mp4
OggParser audio/ogg
application/kate
application/ogg
video/daala
video/x-ogguvs
video/x-ogm
audio/x-oggpcm
video/ogg
video/x-dirac
video/x-oggrgb
video/x-oggyuv
TheoraParser video/theora
PooledTimeSeriesParser
Java class files and archives ClassParser application/java-vm
Source code SourceCodeParser text/x-c++src
text/x-groovy
text/x-java-source
Mail formats MboxParser application/mbox
RFC822Parser message/rfc822
OutlookPSTParser application/vnd.ms-outlook-pst
OfficeParser application/x-tika-msoffice-embedded; format=ole10_native
application/msword
application/vnd.visio
application/vnd.ms-project
application/x-tika-msworks-spreadsheet
application/x-mspublisher
application/vnd.ms-powerpoint
application/x-tika-msoffice
application/sldworks
application/x-tika-ooxml-protected
application/vnd.ms-excel
application/vnd.ms-outlook
TNEFParser application/vnd.ms-tnef
application/x-tnef
application/ms-tnef
CAD formats DWGParser image/vnd.dwg
Font formats TrueTypeParser application/x-font-ttf
AdobeFontMetricParser application/x-font-adobe-metric
Scientific formats DIFParser application/dif+xml
GDALParser application/x-gsc
image/x-ozi
application/x-pds
image/eir
application/x-usgs-dem
application/aaigrid
application/x-bag
application/elas
application/x-rs2
application/x-tsx
application/x-lcp
image/geotiff
application/x-mbtiles
application/x-cappi
application/x-netcdf
application/x-gsag
application/x-epsilon
application/x-ace2
application/jaxa-pal-sar
image/x-pcraster
application/x-msgn
image/arg
application/x-hdf
image/x-mff
application/x-kro
image/x-hdf5-image
image/x-dimap
image/x-srp
image/big-gif
application/x-envi
application/x-cosar
application/x-ntv2
image/bmp
application/x-doq2
application/x-bt
application/x-kml
application/x-gmt
application/x-rst
application/vrt
application/pcisdk
application/x-ctg
application/x-e00-grid
application/x-rik
image/ida
image/x-mff2
application/sdts-raster
application/x-snodas
image/jp2
image/sar-ceos
application/terragen
application/x-wcs
application/leveller
application/x-ingr
application/x-gtx
image/sgi
application/x-pnm
image/raster
application/fits
application/x-r
image/gif
application/x-envi-hdr
application/x-http
application/x-rmf
application/x-ecrg-toc
application/aig
application/x-rpf-toc
image/adrg
application/x-srtmhgt
application/x-generic-bin
application/jdem
image/x-airsar
application/x-webp
application/x-ngs-geoid
application/x-pcidsk
image/x-fujibas
application/x-wms
application/x-map
image/ceos
application/xpm
application/x-zmap
image/envisat
application/x-ers
application/x-doq1
application/x-isis2
application/x-nwt-grd
application/x-ppi
image/ilwis
application/x-isis3
application/x-nwt-grc
application/x-blx
application/gff
application/x-ndf
image/jpeg
application/x-geo-pdf
application/x-l1b
image/fit
application/x-gsbg
application/x-sdat
application/x-ctable2
application/x-grib
application/x-coasp
application/x-dipex
application/grass-ascii-grid
image/fits
application/x-til
application/x-dods
image/png
application/x-gxf
application/x-gs7bg
application/x-cpg
application/x-lan
application/x-xyz
image/bsb
application/x-p-aux
application/dted
application/x-rasterlite
image/nitf
image/hfa
application/x-fast
application/x-los-las
GeographicInformationParser text/iso19139+xml
GeoParser application/geotopic
GribParser application/x-grib2
HDFParser application/x-hdf
ISArchiveParser application/x-isatab
NetCDFParser application/x-netcdf
MatParser application/x-matlab-data
Executable programs and libraries ExecutableParser application/x-msdownload
application/x-sharedlib
application/x-elf
application/x-object
application/x-executable
application/x-coredump
Crypto formats Pkcs7Parser application/pkcs7-signature
application/pkcs7-mime
TSDParser
Database formats SQLite3Parser
JackcessParser application/x-msaccess
DBFParser application/x-dbf
Natural Language Processing SentimentParser
JournalParser
Image and Video object recognition Tika recognition package

References

https://tika.apache.org/1.22/formats.html

TIKA Reference API


Java Programmers can integrate the Tika library in their applications by using the Tika facade class and other below classes.

Tika Class

Tika facade class abstracts the complexity and provides simple methods to explore the functionalities of TIKA.

package: org.apache.tika

Constructors

The following are the constructors of the Tika class:

Constructor Description
Tika () Creates a Tika facade using the default configuration.
Tika (Detector detector) Creates a Tika facade using the given detector instance.
Tika (Detector detector, Parser parser) Creates a Tika facade using the given detector and parser instances.
Tika (Detector detector, Parser parser, Translator translator) Creates a Tika facade using the given detector, parser, and translator instances.
Tika (TikaConfig config) Creates a Tika facade using the given TikaConfig object.

Methods and Description

The following are the important methods of the Tika facade class:

Method Description
String parseToString (File file) Parses the given file and returns the extracted text content as a String. By default, the length of the returned string is limited.
int getMaxStringLength () Returns the maximum length of strings returned by the parseToString methods.
void setMaxStringLength (int maxStringLength) Sets the maximum length of strings returned while extracting data from the file.
Reader parse (File file) Parses the given file and returns the extracted text content as a java.io.Reader object.
String detect (InputStream stream, Metadata metadata) Accepts an InputStream and a Metadata object as parameters and returns the document type name.
String translate (InputStream text, String targetLanguage) Accepts an InputStream and a String naming the language that the text should be translated into. It returns the text translated into the target language, attempting to auto-detect the source language.

Parser Interface

This interface is implemented by all the parser classes of the Tika package.

package: org.apache.tika.parser

Methods

This is the important method of the Tika Parser interface:

Methods Description
parse (InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Parses the given document input stream into a sequence of XHTML SAX events. After parsing, it places the metadata in the Metadata object and the extracted document content in the ContentHandler object.

Metadata Class

The Metadata class implements various interfaces such as CreativeCommons, Geographic, HttpHeaders, Message, MSOffice, ClimateForcast, TIFF, TikaMetadataKeys, TikaMimeKeys, and Serializable to support various data models.

package: org.apache.tika.metadata

Constructors

Constructor Description
Metadata() Constructs new, empty metadata.

Methods

Methods Description
add (Property property, String value) Adds a new metadata property in the form of key/value pair.
add (String name, String value) Adds a new metadata property in the form of key/value pair.
String get (Property property) Returns the property’s value (if any).
String get (String name) Returns the key’s value (if any).
Date getDate (Property property) Returns the value of the given metadata property as a Date.
String[] getValues (Property property) Returns all the values of metadata associated with property.
String[] getValues (String name) Returns all the values of a given metadata key.
String[] names() Returns all the key names of metadata elements in a metadata object.
set (Property property, Date date) Sets the date of the given metadata property.
set(Property property, String[] values) Sets multiple values for a metadata property.

LanguageIdentifier Class

This class is used to identify the language of the given content.

package: org.apache.tika.language

Constructors

Constructor Description
LanguageIdentifier (LanguageProfile profile) Instantiates the language identifier for parameter LanguageProfile.
LanguageIdentifier (String content) Instantiates the language identifier for text content.

Methods

Methods Description
String getLanguage () Returns the language of the content of current LanguageIdentifier object.

TIKA Environment Setup for Applications


In the previous post, Apache Tika Introduction, you got an idea of Apache Tika and its uses. In this post, you will learn about the TIKA environment setup for applications.

As programmers, we can integrate Apache TIKA in a Windows, Linux, or other OS environment by using:

  • Command-line
  • Tika API
  • Command-line interface (CLI) of TIKA
  • Graphical User interface (GUI) of TIKA
  • The source code.

System Requirements

  • JDK: Java SE 2 JDK 1.6 or above
  • Memory: 1 GB RAM (recommended)
  • Disk Space: No minimum requirement
  • Operating System Version: Windows XP or above, Linux

Tika Environment Setup Steps

  • Step 1: Set JAVA_HOME and Path as mentioned on the below link.
    JAVA_HOME and PATH Setup Steps
  • Step 2: Add these libraries in your CLASSPATH or pom.xml to use TIKA APIs.

<dependency>
   <groupId>org.apache.tika</groupId>
   <artifactId>tika-core</artifactId>
   <version>1.6</version>
</dependency>
<dependency>
   <groupId>org.apache.tika</groupId>
   <artifactId>tika-parsers</artifactId>
   <version>1.6</version>
</dependency>
<dependency>
   <groupId>org.apache.tika</groupId>
   <artifactId>tika</artifactId>
   <version>1.6</version>
</dependency>
<dependency>
   <groupId>org.apache.tika</groupId>
   <artifactId>tika-serialization</artifactId>
   <version>1.6</version>
</dependency>
<dependency>
   <groupId>org.apache.tika</groupId>
   <artifactId>tika-app</artifactId>
   <version>1.6</version>
</dependency>
<dependency>
   <groupId>org.apache.tika</groupId>
   <artifactId>tika-bundle</artifactId>
   <version>1.6</version>
</dependency>

Apache Tika Introduction


Apache Tika provides a generic API for content detection, analysis, and extraction across many file formats. Internally, Tika uses various document parsers to extract metadata and structured text content from different file types, for example PDFs, spreadsheets, text files, images, etc.

The latest Tika version, 1.22, was released on 1st Aug 2019 by the Apache Software Foundation. Tika is written entirely in Java and is cross-platform.

Tika Version History

Year Development
2006 The idea of Tika was proposed in front of the Lucene Project Management Committee.
2006 The concept of Tika and its benefits in the Jackrabbit project was discussed.
2007 Tika entered the Apache Incubator.
2008 Both 0.1 and 0.2 Versions were released and Tika graduated from the incubator to the Lucene sub-project.
2009 This year Tika Versions 0.3, 0.4, and 0.5 were released.
2010 Versions 0.6 and 0.7 were released and Tika graduated into a top-level Apache project.
2011 Tika 1.0 was released, and the book “Tika in Action” was released in the same year.
2019 Tika 1.22 was released, adding support for CSV and HWP file types.

Why Tika?

As per https://filext.com/, there are around 25K to 50K file extensions (structured and unstructured), and the number is growing day by day. To deal with so many formats, Tika provides a universal Java API that supports around 1400 file types, covering the most common and popular formats.

Tika provides content extraction, metadata extraction, and language identification capabilities. Although Tika is written in Java, it can also be used from other languages through its RESTful services and CLI tools.

Where to use Apache Tika?

  • Search Engine: Search engines use Tika to index the text content of digital documents.
  • Document Analysis: Analyze documents such as images and PDFs based on their extracted content.
  • Digital Asset Management (DAM): Organizations that maintain a library of documents, images, videos, ebooks, and drawings use it to classify them based on common features.
  • Content Analysis: Analyze the content of a website and take care of the user’s interests, like Amazon showing movies and products based on the user’s visits. It also supports machine learning based on content.

Features of Tika

  • Unified parser interface: Tika wraps the most suitable parser libraries behind a single parser interface. This relieves the developer of the burden of selecting a suitable parser library for each file type encountered.
  • Low memory usage: Tika consumes few memory resources, so it is easy to embed in Java applications. We can also use Tika in applications that run on platforms with fewer resources, such as a mobile PDA.
  • Fast processing: Tika can quickly detect and extract content from applications.
  • Flexible metadata: Tika understands all types of metadata models that are used to describe files.
  • Parser integration: Tika can use the various parser libraries available for each document type in the same application.
  • MIME-type detection: Tika can detect and extract content from all MIME types.
  • Language detection: Tika includes a language identification feature, so it can be used to classify documents by language on multilingual websites.

Mask Zipcode/Pincode/Postal Code on Web Page


A lot of the information that you share freely, such as your date of birth, phone number, ZIP code, and email address, is very valuable to criminals. Think of how many accounts require you to verify your identity by entering your birthday or your ZIP code before making a transaction.

On its own, your ZIP code might not be of much value, but criminals will take that information and post it on underground sites where they buy, sell, and trade bundles of personal information. From those sites, criminals can purchase enough of your personal information to use it for fraud.

See Also :

CVV Masking Example

Here we consider a pincode of 5 digits. Masking hides all 5 digits. For example, if my pincode is 75038, then after masking it is displayed as XXXXX. Some pincodes have six digits; take this as an assignment and change the code so that it supports both five and six digits.
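
The masking logic itself is small. As a rough sketch in Java (the downloadable example works on a web page field, so this is not that source code, and the class name ZipCodeMasker is an assumption for illustration), every digit can be replaced with X, which covers both five and six digit codes:

```java
class ZipCodeMasker {

   // Replaces every digit with 'X', whatever the length of the code.
   static String mask(String zipCode) {
      return zipCode.replaceAll("[0-9]", "X");
   }

   public static void main(String[] args) {
      System.out.println(mask("75038"));  // prints XXXXX
      System.out.println(mask("201301")); // prints XXXXXX
   }
}
```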

original zipcode masked zipcode

Note :

This masking example assumes a standard text length and text field format, which can vary between organizations. If your organization uses a different format or text size, modify the source code according to your needs. If you face any issue, drop a comment and I will try to respond as soon as possible.

Download Source code

To download the source code of this example, click the download link below.

Mask Pincode

Drop your questions in the comments section.

Mask Passport On Web Page


The passport serves as proof of your identity and your nationality which itself can be advantageous depending on your destination. As such, you should prioritize its security above all else when you travel overseas.

Consequences of Passport Number Identity Theft

Under no circumstances should you disclose any sensitive information about yourself, passport number included. It may just be a jumble of digits, but it’s a very valuable commodity for criminals if they ever get their hands on it.
There are lots of consequences of passport identity theft, as below:

  • A criminal could easily manufacture a passport with your passport number, full legal name, and date of birth, and another person’s picture on it. This may not get them into the US (easily), but it could conceivably get them to a nearby country, and from there they could travel to the US or on to other countries.
  • Criminals all over the world can use a stolen passport to commit a crime or do anything in someone else’s name.
  • The passport is used as an official identity document if you don’t have a driving license. Think about someone using a fraudulent passport and causing an accident on the road.
  • A fraudulent passport with your personal information, such as full legal name, date of birth (available on Facebook), and address, plus another person’s picture, can be used to impersonate you. It can serve as identity proof to open and access your credit/debit cards, social security, email, medical records, and tax records, and even to harm your reputation.
  • A lot of passports have a chip too, which stores all your personal data.

Passport Masking Example

Here we consider a passport number of 8 characters. Masking hides the initial 4 characters. For example, if my passport number is A1234765, then after masking it is displayed as XXXX4765. The length of a passport number can also vary by country; take this as an assignment and change the code so that it supports both 8 and more characters.
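
A possible way to handle both 8 and more characters is to keep only the last four characters readable. The following is a minimal Java sketch of that idea, not the downloadable web page source; the class name PassportMasker and the choice of four visible characters are assumptions for illustration.

```java
class PassportMasker {

   // Number of trailing characters that stay readable (assumed value).
   private static final int VISIBLE = 4;

   // Replaces everything except the last VISIBLE characters with 'X'.
   static String mask(String passportNumber) {
      int hidden = Math.max(passportNumber.length() - VISIBLE, 0);
      StringBuilder masked = new StringBuilder();
      for (int i = 0; i < hidden; i++) {
         masked.append('X');
      }
      masked.append(passportNumber.substring(hidden));
      return masked.toString();
   }

   public static void main(String[] args) {
      System.out.println(mask("A1234765"));  // prints XXXX4765
      System.out.println(mask("AB1234567")); // prints XXXXX4567
   }
}
```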

Original passport mask passport

Note :
This masking example assumes a standard text length and text field format, which can vary between organizations. If your organization uses a different format or text size, modify the source code according to your needs. If you face any issue, drop a comment and I will try to respond as soon as possible.

Download Source code


To download the source code of this example, click the download link below.

Mask Passport

Drop your questions in the comments section.

Java: Collection Framework Introduction


A collection is an object (also called a container) that deals with a group of objects or elements as a single unit. These objects are called the elements of the collection. Collections are used to perform operations such as storing, retrieving, manipulating, communicating, and aggregating data.

A collection can hold different kinds of data. Based on the data, we can decide which collection data structure to use.

  • Homogeneous
  • Heterogeneous
  • Unique
  • Duplicate

Real-life Example of Collection

Here are some real-life examples of collections:

  • LinkedList: Your web page visiting history.
  • LinkedList: Train Coaches are connected to each other.
  • LinkedList: Our brain. When we remember something, memory follows associations because one memory links to another. This way we recall things in sequence.
  • Stack & LinkedList: A stack of plates or towels at a party. The top plate is always picked first.
  • Queue & LinkedList: A queue/line of people standing at the railway ticket window or for food in the mess.
  • A classroom is a collection of students.
  • A money bank is a collection of coins.
  • A school bag is a collection of books.

Why need collection?

There are four ways to store values in Java.

1: Variable approach

If we need to handle one, two, three, or a small number of values, the variable approach is good. But if we need to deal with many objects, such as 5000, the variable approach has some drawbacks:

  • A variable can store only one value at a time.
  • Readability and reusability of the code go down.
  • JVM will take more time for the execution.

2: Using a class object approach

Using a class object, we can store a fixed number of values of different types. For example, suppose we create a class named Person.

class Person {
   String name;
   int age;
}

If you create an object of the Person class like this

Person p1 = new Person(); // How many values can this Person object store?

The answer is only two, i.e. name and age. If you want to store a third value, it is not possible.

3: Using an array object approach

An array improves the readability of code by using a single variable for a huge number of values, but arrays have several problems and limitations.


Student[] st = new Student[5000];
  1. An array can store only homogeneous data.
  2. An array is static and fixed in length.
  3. An array works as a basic data structure, but the array itself offers no ready-made methods when you need to sort objects, search for a specific item, etc.

4: Collection Object:

By using a collection object, we can store the same or different data without any size limitation.
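
The difference between the fixed-length array and a growing collection can be sketched as follows. The class ArrayVersusCollection and its fill() method are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayVersusCollection {

   // Adds 'count' numbers and one String to a list; the list grows as needed.
   static List<Object> fill(int count) {
      List<Object> values = new ArrayList<>();
      for (int i = 0; i < count; i++) {
         values.add(i);
      }
      values.add("a String stored next to the Integers");
      return values;
   }

   public static void main(String[] args) {
      int[] fixed = new int[3];          // the length is fixed at creation
      System.out.println(fixed.length);  // prints 3

      List<Object> values = fill(5000);  // no size is declared up front
      System.out.println(values.size()); // prints 5001
   }
}
```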

What is a Framework in Java?

A framework is a set of several classes and interfaces which provide a readymade architecture.

What is a Collections Framework?

A collections framework provides a unified, readymade architecture for storing and manipulating a group of objects. All collections frameworks contain the following:

  • Interfaces: Interfaces generally form a hierarchy and allow collection objects to be manipulated independently of the details of their representation.
  • Implementations: Concrete implementations of the collection interfaces, backed by data structures.
  • Algorithms: The methods that perform useful operations, such as searching and sorting, on objects that implement collection interfaces.
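
The three parts can be seen together in a small sketch. List is the interface, ArrayList is the implementation, and Collections.sort() and Collections.binarySearch() are the algorithms. The class name FrameworkPartsDemo and the sample names are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FrameworkPartsDemo {

   // Returns a sorted copy of the given names.
   static List<String> sortedCopy(List<String> names) {
      List<String> copy = new ArrayList<>(names); // implementation: ArrayList
      Collections.sort(copy);                     // algorithm: sorting
      return copy;
   }

   // Returns the index of the name in a sorted list, or a negative value if absent.
   static int find(List<String> sortedNames, String name) {
      return Collections.binarySearch(sortedNames, name); // algorithm: searching
   }

   public static void main(String[] args) {
      List<String> names = new ArrayList<>();     // interface: List
      names.add("Saurabh");
      names.add("Amit");
      names.add("Neha");

      List<String> sorted = sortedCopy(names);
      System.out.println(sorted);               // prints [Amit, Neha, Saurabh]
      System.out.println(find(sorted, "Neha")); // prints 1
   }
}
```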

Benefits of Collections Framework

Collections Framework provides lots of benefits:

  • Reduces programming effort: The Collections framework provides useful data structures and algorithms so that developers can concentrate on programming logic only.
  • Increases program performance and quality: The Collections framework provides high-quality implementations of data structures and algorithms for good performance. Because the implementations of each interface are interchangeable, programs can be tuned easily by switching collection implementations.
  • Allows interoperability among unrelated APIs: The collection interfaces are a common vocabulary, so unrelated APIs can pass collections back and forth.
  • Reduces effort to learn and to use new APIs: Most APIs are common across the framework because they are inherited from the Collection interface. Only the APIs that are specific to a data structure need to be remembered.
  • Reduces effort to design new APIs: Designers do not have to start from scratch each time they create an API that relies on collections; they can use the standard collection interfaces.
  • Fosters software reuse: New data structures that follow the standard collection interfaces are reusable, and they are easy for developers to learn.

References

Java : Collection Framework Hierarchy


All the classes and interfaces of the collection framework are in the java.util package. The hierarchy below shows how these classes and interfaces relate to each collection type.

Java Collection Framework Hierarchy

Iterable Interface

The Iterable interface is the root interface for all the collection classes. The Collection interface extends Iterable, so every class that implements Collection also implements Iterable.

The Iterable interface contains only one abstract method.

  • Iterator<T> iterator(): Returns an iterator over elements of type T.

Iterator Interface

The iterator interface provides the facility of iterating the elements in a forward direction only.
For more detail: Java: Iterator Interface methods and Examples
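
As a rough illustration of forward-only iteration, the sketch below walks a list with an explicit Iterator and with a for-each loop. The class name IteratorSketch and the dropEven() helper are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorSketch {

    // Removes every even number while iterating forward through a copy of the list.
    static List<Integer> dropEven(List<Integer> input) {
        List<Integer> result = new ArrayList<>(input);
        Iterator<Integer> it = result.iterator();
        while (it.hasNext()) {
            int value = it.next();
            if (value % 2 == 0) {
                it.remove(); // removes the element last returned by next()
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        // The for-each loop works on any Iterable and calls iterator() internally.
        for (int n : numbers) {
            System.out.print(n + " ");
        }
        System.out.println();
        System.out.println(dropEven(numbers));
    }
}
```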

Collection Interface

The Collection interface builds the foundation for the Collection framework. It is implemented by all the list, set, and queue classes of the framework (Map implementations are the exception). It declares the common methods that all of these classes provide.
For more detail: Java: Collection Interface methods and Examples

List Interface

List interface is the subtype/child interface of the Collection interface. It stores elements as an ordered sequence and allows duplicate values.

Implementation classes for List interface are ArrayList, LinkedList, Vector, and Stack.

For example, the List interface can be instantiated with any of its implementation classes:

List<String> list1 = new ArrayList<>();
List<String> list2 = new LinkedList<>();
List<String> list3 = new Vector<>();
List<String> list4 = new Stack<>();

For more detail: Java: List Interface methods and Examples

ArrayList Class

The ArrayList implements the List interface. It’s having the following features:

  • ArrayList uses a dynamic array data structure to store objects and elements.
  • ArrayList allows duplicate objects and elements.
  • ArrayList maintains the insertion order.
  • ArrayList is non-synchronized.
  • ArrayList elements/objects can be accessed randomly.

For more detail: Java: ArrayList Interface methods and Examples

LinkedList Class

LinkedList implements the List interface. It’s having the following features:

  • LinkedList uses a doubly linked list data structure to store elements.
  • LinkedList allows duplicate elements.
  • LinkedList maintains the insertion order.
  • LinkedList is not synchronized.
  • LinkedList insertion and removal are fast once the position is reached, because no shifting of elements is required.

For more detail: Java: LinkedList Class methods and Examples

Vector Class

Vector Class implements List interface. It’s having the following features:

  • Vector is similar to the ArrayList class.
  • Vector uses a dynamic array as its data structure to store elements.
  • Vector is synchronized.
  • Vector contains many legacy methods that are not part of the Collection Framework.

For more detail: Java: Vector Class methods and Examples

Stack Class

The Stack is the subclass of the Vector class. It’s having the following features:

  • Stack extends Vector with last-in-first-out (LIFO) behavior.
  • Stack contains all of the methods of the Vector class.
  • Stack also provides its own methods, such as E push(E item), E pop(), E peek(), boolean empty(), and int search(Object o), which define its features.

For more detail: Java: Stack Class methods and Examples
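
The LIFO behavior described above can be sketched as follows. StackSketch and reverse() are illustrative names; the Stack methods used are the standard java.util.Stack ones.

```java
import java.util.Stack;

class StackSketch {

    // Reverses a string by pushing each character and popping them back (LIFO).
    static String reverse(String text) {
        Stack<Character> stack = new Stack<>();
        for (char c : text.toCharArray()) {
            stack.push(c);
        }
        StringBuilder out = new StringBuilder();
        while (!stack.empty()) {
            out.append(stack.pop());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Stack<Integer> stack = new Stack<>();
        stack.push(10);
        stack.push(20);
        System.out.println(stack.peek());     // looks at the top element without removing it
        System.out.println(stack.pop());      // removes and returns the top element
        System.out.println(stack.search(10)); // 1-based distance from the top
        System.out.println(reverse("java"));
    }
}
```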

Queue Interface

Queue Interface extends the Collection interface. It’s having the following features:

  • Queue implementations typically order elements in FIFO (first-in-first-out) order. PriorityQueue is an exception because it orders elements by priority.
  • Queue can be defined as an ordered list that is used to hold the elements which are about to be processed.
  • Queue interface is extended by the Deque interface and implemented by classes such as PriorityQueue, ArrayDeque, and LinkedList.

Queue interface can be instantiated as:

Queue<String> q1 = new PriorityQueue<>();
Queue<String> q2 = new ArrayDeque<>();

For more detail: Java: Queue Interface methods and Examples

Here are the details of the classes and interfaces that implement or extend the Queue interface.

PriorityQueue

The PriorityQueue class implements the Queue interface.

  • PriorityQueue holds the elements or objects which are to be processed by their priorities.
  • PriorityQueue doesn’t allow null values to be stored in the queue.

For more detail: Java: PriorityQueue Class methods and Examples
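
A minimal sketch of both points, assuming Integer elements with their natural ordering (smallest first). PriorityQueueSketch and drainInPriorityOrder() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

class PriorityQueueSketch {

    // Drains a PriorityQueue; poll() always returns the smallest remaining element.
    static List<Integer> drainInPriorityOrder(List<Integer> input) {
        Queue<Integer> queue = new PriorityQueue<>(input);
        List<Integer> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            result.add(queue.poll());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(drainInPriorityOrder(Arrays.asList(30, 10, 20)));

        Queue<Integer> queue = new PriorityQueue<>();
        try {
            queue.add(null); // null elements are rejected
        } catch (NullPointerException e) {
            System.out.println("null is not allowed");
        }
    }
}
```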

Deque Interface

Deque stands for double-ended queue, which allows us to perform operations at both ends.

  • Deque extends the Queue interface.
  • Deque allows adding and removing elements at both ends.

Deque can be instantiated as:

Deque<String> d = new ArrayDeque<>();

For more detail: Java: Deque Interface methods and Examples

ArrayDeque Class

ArrayDeque class implements the Deque interface.

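  • ArrayDeque is a resizable-array implementation of the Deque interface.
  • ArrayDeque allows adding or removing elements at both ends.
  • ArrayDeque has no capacity restrictions, and is likely to be faster than Stack when used as a stack and faster than LinkedList when used as a queue.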

For more detail: Java: ArrayQueue Class methods and Examples
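
Working at both ends of an ArrayDeque might look like the sketch below. ArrayDequeSketch and demo() are names invented for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ArrayDequeSketch {

    static String demo() {
        Deque<String> deque = new ArrayDeque<>();
        deque.addLast("b");   // [b]
        deque.addFirst("a");  // [a, b]
        deque.addLast("c");   // [a, b, c]
        String first = deque.pollFirst(); // removes "a" from the head
        String last = deque.pollLast();   // removes "c" from the tail
        return first + last + deque.peekFirst(); // "b" is the only element left
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```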

Set Interface

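Set Interface extends the Collection Interface and is present in the java.util package.

  • Set doesn’t allow duplicate elements or objects.
  • Set does not guarantee any ordering by itself. Ordering depends on the implementation: HashSet is unordered, LinkedHashSet keeps insertion order, and TreeSet is sorted.
  • Set allows at most one null element in implementations that permit null, such as HashSet and LinkedHashSet.
  • Set is implemented by HashSet, LinkedHashSet, and TreeSet.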

Set can be instantiated as:

Set<String> s1 = new HashSet<>();
Set<String> s2 = new LinkedHashSet<>();
Set<String> s3 = new TreeSet<>();

For more detail: Java: Set Interface methods and Examples

HashSet

HashSet class implements Set Interface. It’s having the following features:

  • HashSet internally uses a hash table (backed by a HashMap instance) for storage.
  • HashSet uses hashing technique for storage of the elements.
  • HashSet always contains unique items.

For more detail: Java: HashSet Class methods and Examples

LinkedHashSet

LinkedHashSet class implements Set Interface. It’s having the following features:

  • LinkedHashSet stores items in a hash table with a doubly linked list running through its entries.
  • LinkedHashSet stores unique elements.
  • LinkedHashSet maintains the insertion order.
  • LinkedHashSet allows null elements.

For more detail: Java: LinkedHashSet Class methods and Examples

SortedSet Interface

SortedSet Interface extends Set Interface. It’s having the following features:

  • SortedSet provides a total ordering on its elements.
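  • SortedSet elements are arranged in ascending order, according to their natural ordering or a Comparator supplied when the set is created.
  • SortedSet provides additional methods that take advantage of the ordering, such as first(), last(), headSet(), and tailSet().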

The SortedSet can be instantiated as:

SortedSet<String> set = new TreeSet<>();

For more detail: Java: SortedSet Interface methods and Examples

TreeSet Class

TreeSet class implements the SortedSet interface.  It’s having the following features:

  • TreeSet uses a tree data structure (a red-black tree) for storage.
  • TreeSet also contains unique elements.
  • TreeSet access and retrieval are reasonably fast: add, remove, and contains take O(log n) time.
  • TreeSet elements are stored in ascending order.

For more detail: Java: TreeSet Class methods and Examples
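
To compare the ordering behavior of the three Set implementations, here is a small sketch. SetOrderingSketch and its two helper methods are hypothetical names; the input words are arbitrary sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderingSketch {

    // LinkedHashSet drops duplicates and keeps the order of first insertion.
    static List<String> inInsertionOrder(List<String> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    // TreeSet drops duplicates and keeps elements in ascending order.
    static List<String> inSortedOrder(List<String> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "pear", "fig");

        Set<String> hashSet = new HashSet<>(input); // no ordering guarantee
        System.out.println(hashSet.size());         // duplicates are dropped
        System.out.println(inInsertionOrder(input));
        System.out.println(inSortedOrder(input));
    }
}
```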

Map Interface

In the collection framework, a map stores values on the basis of key and value pairs. Each pair is known as an entry. A map has the following features:

  • Map contains unique keys.
  • Map allows duplicate values.
  • Map is useful to search, update or delete elements on the basis of a key.
  • Map is the root interface in the Map hierarchy of the Collection Framework. It does not extend the Collection interface.
  • Map interface is extended by SortedMap and implemented by HashMap, LinkedHashMap, TreeMap, and Hashtable.
  • Map implementation classes HashMap and LinkedHashMap allow null keys and values. TreeMap allows null values but, with natural ordering, doesn’t allow a null key.
  • Map doesn’t implement Iterable, so it can’t be iterated directly. To traverse it, use one of its collection views: keySet(), values(), or entrySet().

For more detail: Java: Map Class methods and Examples
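
Traversing a map through its views could be sketched like this. MapTraversalSketch, totalQuantity() and the stock data are illustrative assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapTraversalSketch {

    // Sums all values by walking the entry set view of the map.
    static int totalQuantity(Map<String, Integer> stock) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : stock.entrySet()) {
            total += entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new LinkedHashMap<>();
        stock.put("green", 10);
        stock.put("red", 20);
        stock.put("green", 15); // same key: the old value 10 is replaced

        for (String key : stock.keySet()) {
            System.out.println(key + " -> " + stock.get(key));
        }
        System.out.println(totalQuantity(stock));
    }
}
```

When both key and value are needed, entrySet() avoids the extra get() lookup that a keySet() loop performs for each key.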

HashMap Class

HashMap class implements Map interface. It’s having following features:

  • HashMap uses a hash table as its data structure.
  • HashMap stores values based on keys.
  • HashMap contains unique keys.
  • HashMap allows duplicate values.
  • HashMap doesn’t maintain order.
  • HashMap class allows only one null key and multiple null values.
  • HashMap is not synchronized.
  • HashMap default initial capacity is 16 buckets with a load factor of 0.75.

For more detail:

LinkedHashMap Class

LinkedHashMap class extends the HashMap class. It’s having the following features:

  • LinkedHashMap contains values based on the key.
  • LinkedHashMap contains unique elements.
  • LinkedHashMap may have one null key and multiple null values.
  • LinkedHashMap is not synchronized.
  • LinkedHashMap maintains the insertion order.
  • LinkedHashMap default initial capacity is 16 with a load factor of 0.75.

For more detail: Java: LinkedHashMap Class methods and Examples

TreeMap Class

TreeMap class implements the SortedMap interface. It has the following features:

  • TreeMap uses a red-black tree as its data structure.
  • TreeMap contains values based on the key. It implements the NavigableMap interface and extends the AbstractMap class.
  • TreeMap contains only unique keys.
  • TreeMap doesn’t allow null keys when natural ordering is used, but it allows null values.
  • TreeMap is not synchronized.
  • TreeMap keeps its entries in ascending key order.

For more detail: Java: TreeMap Class methods and Examples

Hashtable Class

Hashtable class implements the Map interface and extends the Dictionary class to store keys and values as pairs. It has the following features:

  • Hashtable stores entries in an array of buckets, where each bucket is a list of nodes (key and value pairs).
  • Hashtable class is in the java.util package.
  • Hashtable contains unique keys.
  • Hashtable doesn’t allow null key or value.
  • Hashtable is synchronized.
  • Hashtable initial default capacity is 11 whereas the load factor is 0.75.

For more detail: Java: HashTable Class methods and Examples
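
The null-key rules of the Map implementations described above can be checked with a short experiment. NullKeySketch and acceptsNullKey() are hypothetical names, and the TreeMap result assumes natural ordering on Java 7 or later.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;

class NullKeySketch {

    // Returns true when the given map accepts a null key.
    static boolean acceptsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap:   " + acceptsNullKey(new HashMap<String, Integer>()));
        System.out.println("TreeMap:   " + acceptsNullKey(new TreeMap<String, Integer>()));
        System.out.println("Hashtable: " + acceptsNullKey(new Hashtable<String, Integer>()));

        Map<String, Integer> tree = new TreeMap<String, Integer>();
        tree.put("a", null); // null values are fine in a TreeMap
        System.out.println(tree);
    }
}
```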

See Also:

Java: Arrays vs Collections


In Java, Arrays and Collections both deal with a group of objects, but they differ a lot in their data structures and in the operations they support.

Here are some most common differences:

Difference between Arrays & Collections

Arrays | Collections
Arrays have a fixed length. | Collections are growable in nature, i.e. they can increase or decrease in size.
Arrays are not recommended in terms of memory concerns. | Collections use different data structures and are recommended with respect to memory.
Arrays are recommended with respect to performance. | Collections are not recommended with respect to performance.
Arrays can store only homogeneous (same type) data. | Collections can hold both homogeneous and heterogeneous elements.
Arrays have no ready-made methods of their own. | Collections provide a large set of ready-made methods.
Arrays can work with primitives and object types. | Collections can hold only objects, not primitives. If you pass a primitive, it is internally converted to an object.
See Also: Array & Arrays Class Examples | See Also: Collection Framework and Examples
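
A few of these differences can be seen in a short sketch. ArrayVsListSketch and squares() are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayVsListSketch {

    // Builds a list of the first n squares; the list grows as elements are added.
    static List<Integer> squares(int n) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            result.add(i * i); // the int is converted to an Integer object
        }
        return result;
    }

    public static void main(String[] args) {
        int[] fixed = new int[3]; // holds primitives, length is fixed at creation
        fixed[0] = 1;
        System.out.println(fixed.length);

        List<Integer> growable = squares(5);
        growable.remove(Integer.valueOf(4)); // the list can also shrink
        System.out.println(growable + " size=" + growable.size());
    }
}
```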

Java: HashCode and Equals Contract


In Java, java.lang.Object is the superclass of all classes and provides two very important methods:

public boolean equals(Object obj)
public int hashCode()

See Also:

Internally, these methods are used whenever the equality of objects needs to be checked. From a developer’s perspective, they matter most when a user-defined class is used as a key in a HashMap, because then the developer has to implement both methods in that class. Developers who are not aware of the contract between hashCode() and equals() often make mistakes and do not get the expected results from a HashMap. Before going into the details of the hashCode() and equals() contract, let’s first discuss the problem.

Most Common Problem

In this example, the green and red Car objects are stored successfully in a HashMap, but when the map is asked to retrieve the green car, the object is not found and null is returned.

import java.util.HashMap;
import java.util.Map;

public class Car {
	private String color;

	public Car(String color) {
		this.color = color;
	}

	public boolean equals(Object obj) {
		if (!(obj instanceof Car))
			return false;
		if (obj == this)
			return true;
		return this.color.equals(((Car) obj).color);
	}
	public static void main(String[] args) {
		Car a1 = new Car("green");
		Car a2 = new Car("red");

		// HashMap stores car type and its quantity
		Map<Car, Integer> m = new HashMap<>();
		m.put(a1, 10);
		m.put(a2, 20);
		// this new green car has a different hash code from a1, so the map treats it as a different key
		System.out.println(m.get(new Car("green")));
	}
}
         

Output

null

The above program prints null. However, we can check that the car object is stored in the map by inspecting it in the debugger:

Java hashCode and Equals Contract

The problem in the above program is that the hashCode() method is not implemented.

Before going through the solution in detail, we need to understand the contract between the equals() and hashCode() methods.

hashcode() and equals() Method Contract

The contract between equals() and hashCode() is that:

  1.  If two objects are equal, then they must have the same hash code value.
  2.  If two objects have the same hashcode value, they may or may not be equal.

The main idea is that a Map uses hashing to find an object faster than a linear search. In this case, because the hashCode() method is not implemented, the default method of the Object class is called, which typically returns different hash code values for different objects. So in the above case, the object stored in the HashMap and the object used for retrieval have different hash code values, even though equals() treats them as equal.

See Also: Java: Hashmap Working

A HashMap stores its entries in an array of buckets. The hash code of a key selects the bucket, and each bucket holds the entries whose keys map to it. You can think of hash codes as a sequence of storage boxes, where different kinds of stuff are stored in different boxes. It is more efficient to organize stuff into different boxes than to pile everything in the same garage. So it’s good practice to distribute hashCode values evenly across the boxes.

The solution is to add a hashCode() method to the Car class. Here the hash value is based on the length of the color string, as implemented below. This satisfies the contract, although colors of the same length will share a bucket.

import java.util.HashMap;
import java.util.Map;

public class Car {
	private String color;

	public Car(String color) {
		this.color = color;
	}

	public boolean equals(Object obj) {
		if (!(obj instanceof Car))
			return false;
		if (obj == this)
			return true;
		return this.color.equals(((Car) obj).color);
	}

	public int hashCode(){
		return this.color.length();
	}

	public static void main(String[] args) {
		Car a1 = new Car("green");
		Car a2 = new Car("red");

		// HashMap stores car type and its quantity
		Map<Car, Integer> m = new HashMap<>();
		m.put(a1, 10);
		m.put(a2, 20);
		// this new green car is equal to a1 and has the same hash code, so the lookup succeeds
		System.out.println(m.get(new Car("green")));
	}

}

Output

10
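
Using the string length as a hash code keeps the contract but spreads keys poorly. As an alternative sketch, not taken from the original post, the hash can be derived from the same field that equals() compares by using java.util.Objects. The class name ColoredCar is hypothetical, chosen to avoid clashing with the Car class above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical variant of the Car class above.
final class ColoredCar {
    private final String color;

    ColoredCar(String color) {
        this.color = color;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof ColoredCar)) {
            return false;
        }
        return Objects.equals(color, ((ColoredCar) obj).color);
    }

    @Override
    public int hashCode() {
        // Built from the same field that equals() compares.
        return Objects.hash(color);
    }

    public static void main(String[] args) {
        Map<ColoredCar, Integer> stock = new HashMap<>();
        stock.put(new ColoredCar("green"), 10);
        stock.put(new ColoredCar("red"), 20);
        System.out.println(stock.get(new ColoredCar("green")));
    }
}
```

Equal objects produce equal hash codes here, so the first rule of the contract holds, and the lookup with a new green car finds the stored entry.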

Java: Aggregation and Composition (HAS-A)


Pre-requisite: Java: OOPS Concepts, Java: Class, Java: Object

If objects have a “HAS-A” association, there can be two types of relationship between them:

  1. Aggregation
  2. Composition

Aggregation

Aggregation is a special type of association where objects have their own independent life cycle, but there is ownership. The owner and the associated objects have a “HAS-A” relationship.

For Example, A person may associate with an Organization but he/she has an independent existence in the world.

Benefits of Aggregation

  • Aggregation provides reusability.

Note: For code reusability, use Inheritance if the classes have an IS-A relationship, and use Aggregation if they have a HAS-A relationship.

Example of Aggregation

Here the Employee class is associated with the Address class through an Aggregation (HAS-A) relationship. The Employee class contains an Address object, and this address object holds its own information like city, state, country, etc. The two objects are not strongly associated: if an employee loses the job, the address still exists unchanged.

public class Address {
	String city, state, country;

	public Address(String city, String state, String country) {
		this.city = city;
		this.state = state;
		this.country = country;
	}

}

 

 

public class Employee {
	int id;
	String name;
	Address address;

	public Employee(int id, String name, Address address) {
		this.id = id;
		this.name = name;
		this.address = address;
	}

	void display() {
		System.out.println(id + " " + name);
		System.out.println(address.city + " " + address.state + " " + address.country);
	}

	public static void main(String[] args) {
		Address address1 = new Address("Noida", "UP", "India");
		Address address2 = new Address("Greater Noida", "UP", "India");

		Employee e = new Employee(101, "Saurabh", address1);
		Employee e2 = new Employee(102, "Gaurav", address2);

		e.display();
		e2.display();

	}
}

Composition

Composition is a special, more restrictive type of aggregation where one object is strongly associated with another. When the contained object in a “HAS-A” relationship cannot exist without the object that contains it, it’s a case of composition.

For Example:

  • Wheels in a car.
  • House has a room. Here the room can’t exist without the house.
  • An employee has a job. A job can’t exist without an employee.

Benefits of Composition

  • The composition provides reusability.
  • The composition can control the visibility of another object and reuse only when needed.

Examples of Compositions
Here the Person and Job classes have a composition relationship. In this case, a Job object is created at runtime whenever a new Person is created, and we can change the salary inside Person without any change on the client side.

public class Person {
	//composition has-a relationship
    private Job job;

    public Person(){
        this.job=new Job();
        job.setSalary(1000L);
    }
    public long getSalary() {
        return job.getSalary();
    }
}
public class Job {
	private String role;
    private long salary;
    private int id;

    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    public long getSalary() {
        return salary;
    }
    public void setSalary(long salary) {
        this.salary = salary;
    }
    public int getId() {
        return id;
    }
    public void setId(int id) {
        this.id = id;
    }

}
public class TestComposition {

	public static void main(String[] args) {
		Person person = new Person();
		long salary = person.getSalary();

	}

}

Examples of Compositions & Aggregation
Let’s take the example of a University, its departments, and the professors associated with the departments.

University & Department Association: It’s a composition association because a department cannot exist without its university.
Department and Professors Association: It’s an aggregation association because the two are associated but not dependent on each other: if a department closes, its professors still exist. Both are independent entities.

import java.util.List;

public class University {
	 private List<Department> departments;

     public void destroy(){
         //it's composition, when I destroy a university I also
		 //destroy the departments. They can't live without a university instance
         if(departments!=null) {
             for(Department d : departments) d.destroy();
             departments.clear();
             departments = null;
         }
     }

}
public class Department {
	private List<Professor> professors;
    private University university;

    Department(University univ){
        this.university = univ;
        //check here univ not null throw whatever depending on your needs
    }

    public void destroy(){
        //It's aggregation here, if we fire any professor then they
		//will come out of the department but they can still keep living
        if(professors!=null) {
            for(Professor p:professors)
                p.fire(this);
            professors.clear();
            professors = null;
        }
    }

}
public class Professor {
	 private String name;
     private List<Department> attachedDepartments;

     public void destroy(){

     }

     public void fire(Department d){
         attachedDepartments.remove(d);
     }
}

Java: Inheritance (IS-A)


Pre-requisite: Java: OOPS Concepts, Java: Class, Java: Object

Inheritance is a process where a child class acquires the properties and behaviors of the parent class. Inheritance is used when one object is based on another object. Here the parent class is also called a superclass and the child class is called a subclass.

For example, Person is a parent class and Employee is a subclass of Person, which acquires the properties and behavior of the Person class.

Advantage of inheritance

What is code Reusability in Inheritance?

Inheritance lets a child class reuse the fields and methods of the parent class instead of declaring them again.

Syntax of Inheritance Declaration

class Subclass-name extends Superclass-name  
{  
   //fields 
   //methods   
}

In Java, the extends keyword is used to inherit a class, so that a new class derives the properties and methods of an existing class. The class that is inherited from is called the parent class or superclass, and the new derived class is called the child class or subclass.

What You Can Do in a Subclass?

A subclass inherits all of the public and protected members (fields, methods and nested classes) of its parent class, no matter what package the subclass is in. If the subclass is in the same package as its parent, it also inherits the package-private members of the parent class. Private members are never inherited.

Constructors are not members of the class, so they are not inherited by child class, but the constructor of the parent class can be invoked from the child class by using super.

You can use the inherited members as-is, replace them, hide them, or supplement them with new members:

  • Inherited parent class fields can be used directly, just like any other class field.
  • You can declare a field in the child class with the same name as the one in the parent class, thus hiding it (not recommended).
  • You can declare new fields in the child class that are not in the parent class.
  • The inherited parent class methods can be used directly as they are.
  • You can write an instance method in the child class that has the same signature as the one in the parent class, i.e. Method Overriding.
  • You can write a new static method in the child class with the same signature as the one in the parent class, thus hiding it.
  • You can declare new methods in the child class that are not in the parent class.
  • You can write a child class constructor that invokes the constructor of the parent class, either implicitly or by using the keyword super.
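
The difference between hiding and overriding in the list above can be sketched as follows. The classes Parent, Child and HidingVsOverridingSketch are hypothetical and exist only for this illustration.

```java
class Parent {
    String name = "parent field";

    static String kind() {          // static method: can be hidden
        return "Parent.kind";
    }

    String describe() {             // instance method: can be overridden
        return "Parent.describe";
    }
}

class Child extends Parent {
    String name = "child field";    // hides Parent.name

    static String kind() {          // hides Parent.kind
        return "Child.kind";
    }

    @Override
    String describe() {             // overrides Parent.describe
        return "Child.describe, super says " + super.describe();
    }
}

class HidingVsOverridingSketch {
    public static void main(String[] args) {
        Parent ref = new Child();
        System.out.println(ref.name);       // field is chosen by the reference type
        System.out.println(ref.describe()); // method is chosen by the object type
        System.out.println(Parent.kind());
        System.out.println(Child.kind());
    }
}
```

Hidden fields and static methods are resolved from the declared type at compile time, while overridden instance methods are resolved from the actual object at runtime.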

Example of Inheritance

In this example, Animal is a parent class which is extended by the Dog child class. The Animal class has the properties name, breed, age and color, and one method print() to print all these values. A subclass inherits members declared public or protected. Here the fields are private, so Dog reaches them only through the inherited public print() method. The Dog class’s print() method calls the parent class’s print() method by using the super keyword.

Java inheritance Example
//POJO Class
public class Animal {
	// private variables
	private String name;
	private String breed;
	private int age;
	private String color;

	// Parameterized constructor
	public Animal(String name, String breed, int age, String color) {
		this.name = name;
		this.breed = breed;
		this.age = age;
		this.color = color;
	}
    //Method of class
	public String print() {
		return "Animal [name=" + name + ", breed=" + breed + ", age=" + age + ", color=" + color + "]";
	}
}
public class Dog extends Animal {
	private String type;

	public Dog(String name, String breed, int age, String color, String type) {
		// call super class constructor to initialize
		super(name, breed, age, color);
		this.type = type;
	}

	// Overriding method of parent class
	@Override
	public String print() {
		return "Dog [type=" + type + ", print()=" + super.print() + "]";
	}
}
public class TestInheritance {

	public static void main(String[] args) {

		 //Instance with Parameterized Constructor
	     Animal dog=new Animal("Tommy","Small Dog",5,"Black");
	     //calling super class method because the object is an Animal
	     System.out.println(dog.print());

	     Animal dog_tiny=new Dog("Tiny","Small Dog",4,"Black","Bichon Friese");
	     //calling sub class method because the object is a Dog, even though the reference type is Animal
	     System.out.println(dog_tiny.print());

	     Dog dog_big=new Dog("Tufan","Big Dog",4,"White","Spotted");
	     //calling sub class method because the object is a Dog
	     System.out.println(dog_big.print());
	}
}

Output


Animal [name=Tommy, breed=Small Dog, age=5, color=Black]
Dog [type=Bichon Friese, print()=Animal [name=Tiny, breed=Small Dog, age=4, color=Black]]
Dog [type=Spotted, print()=Animal [name=Tufan, breed=Big Dog, age=4, color=White]]

From this output, you can see that the print() method that runs is chosen by the type of the object, not by the type of the reference.

Points about Inheritance

  • The extends keyword is used to implement inheritance.
  • Java doesn’t support multiple inheritance of classes. A similar effect is possible by implementing multiple interfaces.
  • Inheritance has an “IS-A” relationship.
  • Excepting Object Class, which has no parent class, every class has one and only one direct parent class (single inheritance). In the absence of any other explicit parent class, every class is implicitly a child class of Object.

Type of Inheritance in Java

For classes, Java supports only three types of inheritance:

  1. Single
  2. Multi-Level
  3. Hierarchical

Java Types Of Inheritance

Note:

  • Multiple Inheritance is not supported in java through the class.
  • Multiple and Hybrid inheritance is supported in java through interface only.

Single Level Inheritance

The Animal and Dog example above is a case of single level inheritance.

Multi-Level Inheritance

class Animal {
	void eat() { System.out.println("Animal eating..."); }
}
class Dog extends Animal {
	void bark() { System.out.println("Dog barking..."); }
}
class BabyDog extends Dog {
	void weep() { System.out.println("Baby Dog weeping..."); }
}
class MultiLevelInheritanceTest {
	public static void main(String args[]) {
		BabyDog d = new BabyDog();
		d.weep();
		d.bark();
		d.eat();
	}
}

Output


Baby Dog weeping...
Dog barking...
Animal eating...

Hierarchical Inheritance

class Animal {
	void eat() { System.out.println("Animal eating..."); }
}
class Dog extends Animal {
	void bark() { System.out.println("Dog barking..."); }
}
class Cat extends Animal {
	void meow() { System.out.println("Cat meowing..."); }
}
class TestHierchicalInheritance3 {
	public static void main(String args[]) {
		Cat c = new Cat();
		c.meow();
		c.eat();
		//c.bark();//Compile Time Error
	}
}

Output


Cat meowing...
Animal eating...

Why multiple inheritance is not supported in java?

Java doesn’t support multiple inheritance for classes, in order to avoid ambiguity between common members like fields or methods.

Suppose we have three classes ClassA, ClassB and ClassC, where ClassC extends ClassA and ClassB, and both parents have a common method display(). If you call the display() method from the child class, it is not clear which method should be called because of the ambiguity.
Since Java does not support multiple inheritance for classes, the code below fails with a compile-time error.

class ClassA{
void display(){System.out.println("Inside Class A");}
}
class ClassB{
void display(){System.out.println("Inside Class B");}
}
//Compile time error here: a class cannot extend more than one class
class ClassC extends ClassA,ClassB{
 //suppose it were allowed
 public static void main(String args[]){
   ClassC obj=new ClassC();
   //it would be ambiguous which display() to call
   obj.display();
 }
}
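
As noted earlier, multiple inheritance is available through interfaces. The sketch below assumes Java 8 or later, where interfaces can carry default methods. The names Walker, Swimmer, Duck and MultipleInterfaceSketch are invented for this example.

```java
interface Walker {
    default String move() {
        return "walking";
    }
}

interface Swimmer {
    default String move() {
        return "swimming";
    }
}

// A class may implement several interfaces. Because both declare move(),
// the compiler forces this class to override it and resolve the ambiguity.
class Duck implements Walker, Swimmer {
    @Override
    public String move() {
        return Walker.super.move() + " and " + Swimmer.super.move();
    }
}

class MultipleInterfaceSketch {
    public static void main(String[] args) {
        System.out.println(new Duck().move());
    }
}
```

The ambiguity that blocks multiple class inheritance is handled here explicitly: the implementing class must say which inherited behavior it wants.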

 

Java: Constructors


In Java, constructors are used to create objects and initialize their state. A constructor contains a collection of statements that execute at the time of object creation.

Type of Constructors:

  • Default Constructor: A constructor without any argument.
  • Parameterized Constructor: A constructor with one or more arguments.

Points to Remember for Constructor

  • Constructor always has the same name as the class name in which it exists.
  • Constructor can not be used with keywords final, abstract, synchronized and static.
  • A constructor declaration can use access modifiers to control its access, i.e. to restrict other classes from calling the constructor.
  • All Java classes must have at least one constructor. If you do not explicitly declare any constructor, the Java compiler automatically provides a no-argument constructor at compile time, also called the default constructor.
  • If you declare any parameterized constructor, the compiler no longer provides the default constructor, so you must write a no-argument constructor yourself if you need one.
  • The default constructor calls the parent class’s no-argument constructor, i.e. super(). If the class extends no parent class explicitly, it calls the constructor of the Object class, which is directly or indirectly the parent of every class.

Example of Constructors

//Class Declaration
public class Animal {
	//instance variables
    String name;
    String breed;
    int age;
    String color;

    //Default Constructor : Constructor without parameter
    public Animal()
    {
    this.name="Default";
    }

   // Parameterize Constructor : Constructor with parameter
    public Animal(String name, String breed,
                   int age, String color)
    {
        this.name = name;
        this.breed = breed;
        this.age = age;
        this.color = color;
    }
	//Parameterized constructor
	//Constructor Overloading
	public Animal(String name, String breed)
    {
        this.name = name;
        this.breed = breed;
    } 

}

Object Creation By Constructor

Here you will see both ways to create objects of the Animal class, by using the default and the parameterized constructors.

public class TestClass
{
	public static void main(String[] args)
	{
	 //Instance with Default Constructor
	 Animal dog_default=new Animal();

	 //Instance with Parameterized Constructor
     Animal dog_tommy=new Animal("Tommy","Small Dog",5,"Black");
	}
}

Does the Constructor return any value?

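A constructor doesn’t have a return type in its declaration. The reference to the new instance is returned by the new expression that invokes the constructor, not by the constructor itself. We can write a bare return; statement inside a constructor, but it can’t return a value.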

Constructor Overloading

Similar to methods, constructors can be overloaded, which allows an object to be created in several ways. The Java compiler differentiates between these constructors based on their signature (i.e. the number, type, and order of parameters).

What is Constructor Chaining?

Constructor chaining is the process of calling a constructor from another constructor for the current object.

Constructor chaining can be performed in two ways:

  • Within the same class: Use this() to call another constructor of the same class.
  • From base class: Use super() to call a base class constructor.

Rules of constructor chaining

  • The this() call must always be the first statement in the constructor.
  • There should always be at least one constructor without a this() call, otherwise the chain would never end.
  • Constructor chaining can be performed in any order.

Constructor Chaining Example: Within Same Class

Use this() to call constructors in the same class. Here you will see the last constructor, with two arguments, calling the four-argument constructor through this().

//Class Declaration
public class Animal {
	//instance variables
    String name;
    String breed;
    int age;
    String color;

    //Default Constructor : Constructor without parameter
    public Animal()
    {
    this.name="Default";
    }

   // Parameterize Constructor : Constructor with parameter
    public Animal(String name, String breed,
                   int age, String color)
    {
        this.name = name;
        this.breed = breed;
        this.age = age;
        this.color = color;
    }
	//Parameterized constructor
	//Constructor Overloading
	public Animal(String name, String breed)
    {
	   //constructor chaining
	    this(name,breed,1,"black");
    }
}

Constructor Chaining Example: From Base Class

Use super() to call a constructor of the base class. Here constructor chaining occurs through inheritance. A subclass constructor calls the superclass constructor first, so that initialization of the subclass object starts with the data members of the superclass. There can be multiple levels of classes in the inheritance chain; every subclass constructor calls up the chain until the class at the top is reached.


public class Person{
    private String name;
    protected String citizenship;

    public Person()
    {
    }

    public Person(String name, String citizenship) {
        super();
        this.name = name;
        this.citizenship = citizenship;
    }

    public void print() {
		System.out.println("Citizen:"+ citizenship + ",Name:" + name);
	}

}

 

public class Employee extends Person {
	private int employeeId;
	private String department;
	private int salary;

	public Employee() {

	}

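	public Employee(int employeeId, String name, String department, String citizen) {
		//constructor chaining within the same class, with a default salary of 0
		this(employeeId, name, department, citizen, 0);
	}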

	public Employee(int employeeId, String name, String department, String citizen, int salary) {
		// super keyword use to call parent constructor
		//constructor chaining to parent class
		super(name, citizen);
		this.employeeId = employeeId;
		this.department = department;
		this.salary = salary;
		System.out.println("Employee Constructor Executed.");
	}

}

What is the use of constructor chaining?

Constructor chaining lets several constructors share one piece of initialization code: the shorter constructors supply default values and delegate to a single constructor that does the actual work, instead of each constructor repeating the same assignments. This removes duplication and makes the program more readable.
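The order in which chained constructors run can be checked with a small sketch. The classes ChainLog, Shape, and Circle below are hypothetical and exist only to record which constructor body executes when.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that records the order of constructor calls.
class ChainLog {
    static final List<String> calls = new ArrayList<>();
}

class Shape {
    Shape(String name) {
        ChainLog.calls.add("Shape(String)");
    }
}

class Circle extends Shape {
    final double radius;

    Circle() {
        this(1.0);                  // chains to Circle(double) with a default radius
        ChainLog.calls.add("Circle()");
    }

    Circle(double radius) {
        super("circle");            // chains to the superclass constructor
        this.radius = radius;
        ChainLog.calls.add("Circle(double)");
    }
}

class ChainDemo {
    public static void main(String[] args) {
        new Circle();
        // prints [Shape(String), Circle(double), Circle()]
        System.out.println(ChainLog.calls);
    }
}
```

The superclass constructor body finishes first, then the delegated constructor, and finally the constructor that was originally invoked.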

Constructors Vs Methods

Constructor | Method
A constructor must have the same name as the class in which it is defined. | A method normally does not have the same name as the class in which it is defined.
A constructor does not declare a return type. | A method declares a return type, or void if it does not return any value.
A constructor is called only once per object, at the time of object creation. | A method can be called any number of times.
If a class declares no constructor, the compiler provides a default constructor. | The compiler never provides a default method.
Constructors are invoked implicitly when an object is created. | Methods are invoked explicitly.

Java: Object


Pre-Requisite: Java:Class
In the previous post, you learned about Java class declaration, implementation, and types of classes. Here we discuss the object, which is the basic unit of the object-oriented paradigm and represents real-life entities.

An instance of a class is known as an object. All instances of a class share the same attributes and behaviors (i.e. the methods of the class), but the attribute values are specific to each object; these values are called its state.

A Java program may have many objects, and these objects interact with each other by invoking methods. An object consists of:

  • Identity: It distinguishes an object from all other objects and enables one object to interact with other objects.
  • State: It is represented by attributes/properties values of an object.
  • Behavior: It is represented by methods of an object to behave in a particular state.
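The three parts of an object can be seen in a few lines. This is only a sketch; the Counter class is hypothetical and not part of the Animal example.

```java
// Hypothetical class used to illustrate identity, state, and behavior.
class Counter {
    private int count;              // state

    void increment() {              // behavior
        count++;
    }

    int getCount() {
        return count;
    }
}

class ObjectDemo {
    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        Counter alias = a;          // second reference to the same object

        a.increment();              // changes the state of a only

        System.out.println(a.getCount());  // 1
        System.out.println(b.getCount());  // 0
        System.out.println(a == alias);    // true: same identity
        System.out.println(a == b);        // false: two distinct objects
    }
}
```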

Object Class

Let’s consider a class Animal, which can have objects such as Dog, Cow, Lion, Elephant, etc. Each of these animals has different values for properties like name, breed, age, and color.

//Class Declaration
public class Animal {
	//instance variables
    String name;
    String breed;
    int age;
    String color;

    //Default Constructor : Constructor without parameter
    public Animal()
    {
    this.name="Default";
    }

   // Parameterize Constructor : Constructor with parameter
    public Animal(String name, String breed,
                   int age, String color)
    {
        this.name = name;
        this.breed = breed;
        this.age = age;
        this.color = color;
    } 

    //Instance methods
    public String getName()
    {
        return name;
    } 

    public String getBreed()
    {
        return breed;
    } 

    public int getAge()
    {
        return age;
    } 

    public String getColor()
    {
        return color;
    }
    //methods override from Object class
	@Override
	public String toString() {
		return "Animal [name=" + name + ", breed=" + breed + ", age=" + age + ", color=" + color + "]";
	}
}

Example of an object: Dog

Declaration of Object

We declare a variable as (type variable_name;). This tells the compiler that the variable refers to data of the given type. For a primitive type, the declaration allocates space for the value itself. For a reference variable, the declared type can be a class, an abstract class, or an interface, but only a concrete class can be instantiated with new; in Java, we cannot directly create an object of an abstract class or an interface.

Syntax

access_modifier class_name object_name;

Example

Animal dog_tommy;

The declaration above states that the variable dog_tommy is of type Animal, but it does not create any object instance. A field declared this way holds null until an object is assigned to it, and a local variable must be assigned before it can be used.

Creation & Initializing an object

We create an instance of a class by using the new operator. It allocates memory for a new object and returns a reference to that memory. The new operator also calls the class constructor to initialize the attributes of the object.

Syntax

//default constructor
access_modifier class_name object_name=new class_name();
//parameterize constructor
access_modifier class_name object_name=new class_name(arg1,arg2..);

Example

//Default Constructor
Animal dog_default=new Animal();

//Parameterized Constructor
Animal dog_tommy=new Animal("Tommy","Small Dog",5,"Black");

Note :

  • All classes have at least one constructor. If you do not explicitly declare any constructor, the Java compiler automatically provides a no-argument constructor at compile time, also called the default constructor.
  • If you declare any parameterized constructor, the compiler no longer provides the default constructor, so you must write a no-argument constructor yourself if you still need one.
  • The default constructor calls the parent class's no-argument constructor, i.e. super();. If the class does not extend a parent explicitly, this is the constructor of the Object class, which is directly or indirectly the parent of every class.
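The first two notes can be illustrated with a short sketch. The classes Point and Named are hypothetical; Point declares no constructor, while Named declares only a parameterized one.

```java
// No constructor declared: the compiler supplies a no-argument one.
class Point {
    int x;
    int y;
    String label;
}

// Only a parameterized constructor is declared, so the compiler
// does NOT supply a no-argument constructor for this class.
class Named {
    final String name;

    Named(String name) {
        this.name = name;
    }
}

class DefaultConstructorDemo {
    public static void main(String[] args) {
        Point p = new Point();            // uses the compiler-provided constructor
        System.out.println(p.x);          // 0
        System.out.println(p.label);      // null

        Named n = new Named("Tommy");
        System.out.println(n.name);       // Tommy
        // new Named();                   // would not compile: no such constructor
    }
}
```

Fields that no constructor assigns keep their default values (0 for int, null for references).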

Complete Example

//Class Declaration
public class Animal {
	//instance variables
    String name;
    String breed;
    int age;
    String color;

    //Default Constructor : Constructor without parameter
    public Animal()
    {
    this.name="Default";
    }

   // Parameterize Constructor : Constructor with parameter
    public Animal(String name, String breed,
                   int age, String color)
    {
        this.name = name;
        this.breed = breed;
        this.age = age;
        this.color = color;
    } 

    //Instance methods
    public String getName()
    {
        return name;
    } 

    public String getBreed()
    {
        return breed;
    } 

    public int getAge()
    {
        return age;
    } 

    public String getColor()
    {
        return color;
    }
    //methods override from Object class
	@Override
	public String toString() {
		return "Animal [name=" + name + ", breed=" + breed + ", age=" + age + ", color=" + color + "]";
	}
}

public class TestClass
{
	public static void main(String[] args)
	{
	 //Instance with Default Constructor
	 Animal dog_default=new Animal();
	 System.out.println(dog_default);

	 //Instance with Parameterized Constructor
	 Animal dog_tommy=new Animal("Tommy","Small Dog",5,"Black");
	 System.out.println(dog_tommy);
	}
}

Output


Animal [name=Default, breed=null, age=0, color=null]
Animal [name=Tommy, breed=Small Dog, age=5, color=Black]

Ways to create an object of a class

Using the new operator is the most common way to create an instance of a class. Java also provides other ways to create an instance:

  1. Object by new Operator
  2. Object by Class.forName().newInstance()
  3. Object by clone() method
  4. Object by Deserialization
  5. Object for Anonymous Class

Follow this link to know about each object creation ways in detail: Java: Object Creation Ways
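Three of these ways can be compared side by side in a rough sketch. The Token class is hypothetical. The sketch uses getDeclaredConstructor().newInstance() for the reflective case, because calling newInstance() directly on a Class object has been deprecated since Java 9.

```java
// Hypothetical class that can be created by new, by reflection, and by clone().
class Token implements Cloneable {
    String value = "init";

    public Token() {
    }

    @Override
    public Token clone() {
        try {
            return (Token) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen because Token implements Cloneable
            throw new AssertionError(e);
        }
    }
}

class CreationDemo {
    // Creates a Token through reflection and wraps the checked exceptions
    static Token viaReflection() {
        try {
            return Token.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Token byNew = new Token();
        Token byReflection = viaReflection();
        Token byClone = byNew.clone();

        System.out.println(byReflection.value);   // init
        System.out.println(byClone.value);        // init
        System.out.println(byClone == byNew);     // false: clone is a separate object
    }
}
```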

 

Java: Type of Classes


Pre-Requisite: Java: Class
In the previous post, you got an understanding of Java class declaration and creation.

Types of classes

Java supports lots of types of classes:

  • Concrete Class
  • Abstract Class
  • POJO Class
  • Static Class
  • Nested Class/Inner Class
  • Final Class
  • Anonymous Class
  • Lambda Expression

Here we will discuss these classes in detail.

Concrete Class

A concrete class is a normal class that is not declared abstract. It provides an implementation for every method it inherits from a parent class or from interfaces, as well as for its own methods, so it can be instantiated.

Example: Concrete Class

//Example Concrete Class
public class CalculatorTest {
    static int add(int a , int b)
    {
    	return a+b;
    }
    static int subtract(int a , int b)
    {
    	return a-b;
    }
    static int multiply(int a , int b)
    {
    	return a*b;
    }
    static int division(int a , int b)
    {
    	return a/b;
    }
	public static void main(String[] args) {
      System.out.println("4+5 ="+add(4,5));
      System.out.println("4-5 ="+subtract(4,5));
      System.out.println("4*5 ="+multiply(4,5));
      System.out.println("4/5 ="+division(4,5));
	}
}

See Also: Java: Concrete Class Examples

Abstract Class

A class declared with the keyword abstract is known as an abstract class in Java. It can have both abstract methods and concrete methods (methods with a body). An abstract class cannot be instantiated; it must be extended by a subclass that implements its abstract methods.

Example: Abstract Class

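//Vehicle is an abstract class that declares engine(); Car, Bike, Truck and Bus
//are concrete subclasses that implement it (their definitions are not shown here).
//The name of this test class is illustrative.
public class VehicleTest {
	public static void main(String[] args)
	{
		Vehicle v=new Car();
		v.engine();
		v=new Bike();
		v.engine();
		v=new Truck();
		v.engine();
		v=new Bus();
		v.engine();
	}
}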
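Since the class definitions for the example above are not shown, the following is a minimal, self-contained sketch of the same idea. The method bodies, the describe() method, and the engine descriptions are assumptions made for illustration, not the original definitions; here engine() returns a String instead of printing.

```java
// Assumed shape of an abstract class with one abstract and one concrete method.
abstract class Vehicle {
    // Abstract method: no body, each concrete subclass must implement it
    abstract String engine();

    // Concrete method shared by all subclasses
    String describe() {
        return getClass().getSimpleName() + " has a " + engine();
    }
}

class Car extends Vehicle {
    @Override
    String engine() {
        return "1500 cc engine";
    }
}

class Bike extends Vehicle {
    @Override
    String engine() {
        return "150 cc engine";
    }
}

class AbstractDemo {
    public static void main(String[] args) {
        // Vehicle v = new Vehicle();   // would not compile: Vehicle is abstract
        Vehicle v = new Car();
        System.out.println(v.describe());   // Car has a 1500 cc engine
        v = new Bike();
        System.out.println(v.describe());   // Bike has a 150 cc engine
    }
}
```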

See Also: Java: Abstract Class Examples

POJO Class

A class with only private variables and public getter and setter methods is called a POJO (Plain Old Java Object) class. The private variables can be accessed only through these getter and setter methods, so the class is completely encapsulated.

Example: Pojo Class

//POJO Class
public class Animal {
	//private  variables
    private String name;
    private String breed;
    private int age;
    private String color;
    //Getter and setter methods
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}
	public String getBreed() {
		return breed;
	}
	public void setBreed(String breed) {
		this.breed = breed;
	}
	public int getAge() {
		return age;
	}
	public void setAge(int age) {
		this.age = age;
	}
	public String getColor() {
		return color;
	}
	public void setColor(String color) {
		this.color = color;
	}
}

See Also: Java: Pojo Class Examples

Static Class

A static class is a nested class declared as a static member of its enclosing class. It can be instantiated without an instance of the outer class.

Example: Static Class

import java.util.Scanner;

public class StaticClasses {
	static int s;// static variable

	// static method
	static void printSum(int a, int b) {
		s = a + b;
		System.out.println(a + "+" + b + "=" + s);
	}

	static class NestedStaticClass//static class
	{
		static//static block
		{
			System.out.println("Inside Nested Class Static Block");
		}

		public void display()
		{
			Scanner scanner=new Scanner(System.in);
			System.out.println("Enter value of a:");
			int a= scanner.nextInt();
			System.out.println("Enter value of b:");
			int b= scanner.nextInt();

			printSum(a,b);
			System.out.println("Sum of numbers a+b:" +s);

		}
	}
}

public class StaticClassTest {

	public static void main(String[] args) {
		StaticClasses.NestedStaticClass nss=new StaticClasses.NestedStaticClass();
		//call method of nested class method
		nss.display();
	}
}

See Also: Java: Static Keyword & Examples

Nested Class/Inner Class

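A class declared inside another class is called a nested class. A non-static nested class is also called an inner class, and each of its instances is tied to an instance of the outer class.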

Example: Nested Class/Inner Class

public class OuterClass {

	//nested/Inner Class: Class inside the class
	class NestedClass
	{
		public void innerMethod()
		{
			System.out.println("Inner Class Print");
		}
	}

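	public static void main(String[] args) {
		System.out.println("Outer Class Print");
		//an inner class instance is created through an outer class instance
		OuterClass outer = new OuterClass();
		OuterClass.NestedClass inner = outer.new NestedClass();
		inner.innerMethod();
	}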

}

See Also: Java: Nested/Inner Class Examples

Final Class

A class declared with the final keyword is known as a final class. It cannot be extended by another class. Examples: java.lang.System, java.lang.String.

Example: Final Class

public final class FinalClass {
 public void display()
 {
	 System.out.println("Display final class method.");
 }
}

//show compiler error :
//"The type BaseClass can not subclass the final class FinalClass"
class BaseClass extends FinalClass{
	public void display()
	 {
		 System.out.println("Display base class method.");
	 }
}

See Also: Java: Final Keyword & Examples

Anonymous Inner Class

An anonymous class is an inner class without a name, for which only a single object is created. Such a class is useful when you need an instance with some methods of a class or interface overridden, without having to declare a named subclass.

Anonymous inner classes are mostly used in writing implementation classes for listener interfaces in graphics programming.

Example: Anonymous Class

//Using Anonymous Inner class Thread that extends a Class
class MyThread
{
    public static void main(String[] args)
    {
        //Here we are using Anonymous Inner class
        //that extends a class i.e. Here a Thread class
        Thread t = new Thread()
        {
            public void run()
            {
                System.out.println("Child Thread");
            }
        };
        t.start();
        System.out.println("Main Thread");
    }
}

See Also: Java: Anonymous Class Examples

Lambda Expression

Lambda expressions were added in Java 8. They are useful for creating instances of functional interfaces (interfaces with a single abstract method). For example, java.lang.Runnable has one abstract method, run().

Example: Lambda Expression

public class LamdaTest {
	public static void main(String[] args) {
		new Thread(() -> System.out.println("This is Lamda test")).start();
	}
}

See Also: Java: Lamda Expression Examples

 

Java: Class


A class is a blueprint or prototype from which objects are created. A class has properties and behaviors (i.e. methods) that are common to all objects of the same type.

Object Class

Syntax of Class Declaration


Access_Modifier Non_access_modifiers class class_name 
    extends super_class 
    implements interface1, interface2..
{
fields
......
default constructor
......
parameterized constructor
......
methods
......
}
  • Access Modifiers: An access modifier defines the access scope of a class, its fields, and its methods. If none is specified, default (package-private) access applies. See Also: Java: Access Modifiers/Specifiers
  • Class name: The class name should begin with a capital letter and follow camel-case notation. See Also: Java: Identifier Naming Conventions
  • Non-Access Modifiers (if any): Non-access modifiers can also be used at the class level to give a class special behavior. See Also: Java: Non-Access Modifiers
  • Superclass (if any): A class can extend only one class, called its parent class or superclass.
  • Interfaces (if any): A class can implement one or more interfaces. The interfaces are preceded by the keyword implements and separated by commas.
  • Body: A class body is surrounded by curly braces, { }.
  • Fields: The fields of a class are variables that hold the state of the class and its objects.
  • Methods: Class methods are defined to implement the behavior of the class and its objects. See Also: Java Methods
  • Constructors: Java class constructors are used to initialize new objects. See Also: Java: Constructors

Note:

  • If a class is declared with the access modifier public, the Java file name must be the same as the class name. One Java file can have only one public top-level class.
  • Every class extends the Object class, i.e. Object is the superclass of all classes. See Also: Java: java.lang.Object Class & Methods

Java Class Example

This is a very simple example of a class named Animal. It has variables, constructors, methods, and an overridden method of the superclass Object.

//Class Declaration
public class Animal {
	//instance variables
    String name;
    String breed;
    int age;
    String color;

    //Default Constructor : Constructor without parameter
    public Animal()
    {
    this.name="Default";
    }

   // Parameterize Constructor : Constructor with parameter
    public Animal(String name, String breed,
                   int age, String color)
    {
        this.name = name;
        this.breed = breed;
        this.age = age;
        this.color = color;
    } 

    //Instance methods
    public String getName()
    {
        return name;
    } 

    public String getBreed()
    {
        return breed;
    } 

    public int getAge()
    {
        return age;
    } 

    public String getColor()
    {
        return color;
    }
    //methods override from Object class
	@Override
	public String toString() {
		return "Animal [name=" + name + ", breed=" + breed + ", age=" + age + ", color=" + color + "]";
	}
}

Now we have a basic understanding of class declaration and implementation. In the next post, you will learn about the types of classes and their uses.

Types of classes

Java supports lots of types of classes:

  • Concrete Class
  • Abstract Class
  • POJO Class
  • Static Class
  • Nested Class/Inner Class
  • Final Class
  • Anonymous Class
  • Lambda Expression

Follow this link to know about all these classes and uses.

See Also: Java: Type of classes

Java: StringBuilder Class & Examples


Java StringBuilder is used to create a mutable string object. It was introduced in JDK 1.5. It is similar to the StringBuffer class; the only difference is that it is non-synchronized and therefore not thread-safe.

See Also: String Vs StringBuffer Vs StringBuilder

Constructors of StringBuilder class

Constructor | Description
StringBuilder() | Creates an empty string builder with the default initial capacity of 16.
StringBuilder(String str) | Creates a string builder with the given string.
StringBuilder(int length) | Creates an empty string builder with the given capacity as length.

Methods of StringBuilder class

Method | Description
public StringBuilder append(String s) | Appends the specified string to the current string. The append() method is overloaded, e.g. append(char), append(boolean), append(int), append(float), append(double), etc.
public StringBuilder insert(int offset, String s) | Inserts the specified string into this string at the specified position. The insert() method is overloaded, e.g. insert(int, char), insert(int, boolean), insert(int, int), insert(int, float), insert(int, double), etc.
public StringBuilder replace(int startIndex, int endIndex, String str) | Replaces the characters from startIndex (inclusive) to endIndex (exclusive) with the given string.
public StringBuilder delete(int startIndex, int endIndex) | Deletes the characters from startIndex (inclusive) to endIndex (exclusive).
public StringBuilder reverse() | Reverses the string.
public int capacity() | Returns the current capacity of the string builder.
public void ensureCapacity(int minimumCapacity) | Ensures that the capacity is at least equal to the given minimum, increasing it only if needed.
public char charAt(int index) | Returns the character at the specified index position.
public int length() | Returns the total count of characters in the string.
public String substring(int startIndex) | Returns the substring from the specified startIndex to the end.
public String substring(int startIndex, int endIndex) | Returns the substring from startIndex (inclusive) to endIndex (exclusive).

Example: StringBuilder append() method

The StringBuilder append() method concatenates the given argument to this string.

StringBuilder sb=new StringBuilder("Facing Issues On ");
sb.append("IT");//Original String will change
System.out.println(sb);//new string "Facing Issues On IT"

Example: StringBuilder insert() method

The StringBuilder insert() method inserts the given string at the given position.

StringBuilder sb=new StringBuilder("Facing On IT");
sb.insert(6," Issues");//insert string at index 6
System.out.println(sb);//new string "Facing Issues On IT"

Example: StringBuilder replace() method

The StringBuilder replace() method replaces the characters from the specified startIndex (inclusive) to endIndex (exclusive) with the given string.

StringBuilder sb=new StringBuilder("Facing Saurabh On IT");
sb.replace(7,14,"Issues");//replace saurabh with  issues
System.out.println(sb);//new string "Facing Issues On IT"

Example: StringBuilder delete() method

The StringBuilder delete() method deletes the characters from the specified startIndex (inclusive) to endIndex (exclusive).

StringBuilder sb=new StringBuilder("Facing Saurabh On IT");
sb.delete(7,14);//delete saurabh
System.out.println(sb);//new string "Facing  On IT"

Example: StringBuilder reverse() method

The StringBuilder reverse() method reverses the current string.

StringBuilder sb=new StringBuilder("Facing Saurabh On IT");
sb.reverse();
System.out.println(sb);//new string "TI nO hbaruaS gnicaF"

See Also: Java: Ways to reverse String

Example: StringBuilder capacity() method

The StringBuilder capacity() method returns the current capacity of the builder. The default capacity of the builder is 16. If the number of characters exceeds the current capacity, the capacity is increased to (oldcapacity*2)+2, or to the required length if that is larger. For example, if the current capacity is 16, the next capacity will be (16*2)+2=34.

StringBuilder sb=new StringBuilder();
System.out.println(sb.capacity());//default 16
sb.append("Facing Issues On IT");
System.out.println(sb.capacity());//now (16*2)+2=34 i.e (old_capacity*2)+2
sb.append("Learn from Others Experinces");
System.out.println(sb.capacity()); //now (34*2)+2=70 i.e (old_capacity*2)+2

Example: StringBuilder ensureCapacity() method

The StringBuilder ensureCapacity() method ensures that the capacity is at least equal to the given minimum. If the current capacity is less than the given minimum, the capacity is increased to the larger of the minimum and (oldcapacity*2)+2. For example, if the current capacity is 16, ensureCapacity(20) raises it to (16*2)+2=34, while ensureCapacity(40) raises it to 40.

StringBuilder sb=new StringBuilder();
System.out.println(sb.capacity());//default 16
sb.ensureCapacity(40);//capacity becomes 40 because 40 is larger than (16*2)+2=34
sb.append("Facing Issues On IT");
System.out.println(sb.capacity());//still 40 because the 19 characters fit
sb.append("Learn from Others Experinces");
System.out.println(sb.capacity()); //now (40*2)+2=82 i.e (oldcapacity*2)+2
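The remaining read-only methods from the methods table, charAt(), length(), and substring(), can be sketched in the same way. The class name StringBuilderQueries is only illustrative.

```java
class StringBuilderQueries {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("Facing Issues On IT");

        System.out.println(sb.length());          // 19
        System.out.println(sb.charAt(7));         // I
        System.out.println(sb.substring(7));      // Issues On IT
        System.out.println(sb.substring(7, 13));  // Issues

        // substring() returns a new String; the builder itself is unchanged
        System.out.println(sb);                   // Facing Issues On IT
    }
}
```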

 

Java: EnumSet Class


Java EnumSet class is the specialized Set implementation for use with enum types. It inherits AbstractSet class and implements the Set interface.

EnumSet class hierarchy

The hierarchy of EnumSet class is given in the figure given below.
EnumSet

EnumSet class declaration

Let’s see the declaration for java.util.EnumSet class.

public abstract class EnumSet<E extends Enum<E>> extends AbstractSet<E> implements Cloneable, Serializable

Methods of Java EnumSet Class

Method | Description
static <E extends Enum<E>> EnumSet<E> allOf(Class<E> elementType) | It is used to create an enum set containing all of the elements in the specified element type.
static <E extends Enum<E>> EnumSet<E> copyOf(Collection<E> c) | It is used to create an enum set initialized from the specified collection.
static <E extends Enum<E>> EnumSet<E> noneOf(Class<E> elementType) | It is used to create an empty enum set with the specified element type.
static <E extends Enum<E>> EnumSet<E> of(E e) | It is used to create an enum set initially containing the specified element.
static <E extends Enum<E>> EnumSet<E> range(E from, E to) | It is used to create an enum set initially containing all of the elements in the range defined by the two specified endpoints.
EnumSet<E> clone() | It is used to return a copy of this set.

Example :EnumSet

import java.util.*;
enum days {
  SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY
}
public class EnumSetExample {
  public static void main(String[] args) {
    Set<days> set = EnumSet.of(days.TUESDAY, days.WEDNESDAY);
    // Traversing elements
    Iterator<days> iter = set.iterator();
    while (iter.hasNext())
      System.out.println(iter.next());
  }
}

Output :


TUESDAY
WEDNESDAY

Java EnumSet Example: allOf() and noneOf()

import java.util.*;
enum days {
  SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY
}
public class EnumSetExample {
  public static void main(String[] args) {
    Set<days> set1 = EnumSet.allOf(days.class);
    System.out.println("Week Days:"+set1);
    Set<days> set2 = EnumSet.noneOf(days.class);
    System.out.println("Week Days:"+set2);
  }
}

Output :


Week Days:[SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY]
Week Days:[]
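The range(), copyOf(), and clone() methods from the methods table can be sketched as follows. The enum name Day and the helper methods are hypothetical and used only for this illustration.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;

// Hypothetical enum for this sketch
enum Day {
    SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY
}

class EnumSetRangeExample {
    // All constants from MONDAY to FRIDAY, both endpoints included
    static EnumSet<Day> workdays() {
        return EnumSet.range(Day.MONDAY, Day.FRIDAY);
    }

    // An enum set built from an ordinary (non-empty) collection
    static EnumSet<Day> weekend() {
        List<Day> picked = Arrays.asList(Day.SATURDAY, Day.SUNDAY);
        return EnumSet.copyOf(picked);
    }

    public static void main(String[] args) {
        EnumSet<Day> workdays = workdays();
        System.out.println(workdays);   // [MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY]

        // Iteration follows the declaration order of the enum, not insertion order
        System.out.println(weekend());  // [SUNDAY, SATURDAY]

        // clone() returns an independent copy
        EnumSet<Day> copy = workdays.clone();
        copy.remove(Day.FRIDAY);
        System.out.println(copy.size());      // 4
        System.out.println(workdays.size());  // 5
    }
}
```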

How to reverse String in Java?


In Java, we can reverse a String in several ways, as given below:

  1. StringBuffer
  2. StringBuilder
  3. Character Iteration

Reverse String in Java: By StringBuffer

public class ReverseStringExample1 {
    public static String reverseString(String str){
        StringBuffer sb=new StringBuffer(str);
        //StringBuffer in-built method
        sb.reverse();
        return sb.toString();
    }
}

Reverse String in Java: By StringBuilder

public class ReverseStringExample2 {
    public static String reverseString(String str){
        StringBuilder sb=new StringBuilder(str);
        //StringBuilder in-built method
        sb.reverse();
        return sb.toString();
    }
}

Reverse String in Java: By Character Iteration

public class ReverseStringExample3 {
public static String reverseString(String str){
    char ch[]=str.toCharArray();
    String rev="";
    //run loop in reverse order for each character
    for(int i=ch.length-1;i>=0;i--){
        rev+=ch[i]; //append characters
    }
    return rev;
}
}

Complete Example: Reverse String In Java

Here all the ways to reverse a String in Java are consolidated.

public class TestStringInJava {
    public static void main(String[] args) {
        System.out.println(ReverseStringExample1.reverseString("My Name is Saurabh."));
        System.out.println(ReverseStringExample2.reverseString("Facing Issues on IT"));
        System.out.println(ReverseStringExample3.reverseString("Learn From Others Experinces"));
    }
}

Output

.hbaruaS si emaN yM
TI no seussI gnicaF
secnirepxE srehtO morF nraeL

 

Java: ArrayList Vs Vector Class


java.util.ArrayList and java.util.Vector both implement the List interface and maintain insertion order. They differ in several ways, as below:

ArrayList vs Vector

ArrayList | Vector
ArrayList is not synchronized. | Vector is synchronized.
ArrayList increases its array size by 50% if the number of elements exceeds its capacity. | Vector increases its array size by 100%, i.e. doubles it, when the total number of elements exceeds its capacity.
ArrayList is not a legacy class. It was introduced in JDK 1.2. | Vector is a legacy class.
ArrayList is fast because it is non-synchronized. | Vector is slow because it is synchronized, i.e. in a multithreading environment it holds the other threads in the runnable or non-runnable state until the current thread releases the lock of the object.
ArrayList uses the Iterator interface to traverse the elements. | A Vector can use the Iterator interface or the Enumeration interface to traverse the elements.
See Also: ArrayList Class | See Also: Java: Vector Class

ArrayList Example: traversing by the iterator

import java.util.*;
class ArrayListExample{
 public static void main(String args[]){    

  //creating arraylist of String
  List<String> al=new ArrayList<>();
  //adding objects in arraylist
  al.add("Saurabh");
  al.add("Mahesh");
  al.add("Jonny");
  al.add("Anil");
  //traverse elements using Iterator
  Iterator<String> itr=al.iterator();
  while(itr.hasNext()){
   System.out.println(itr.next());
  }
 }
}

Output

Saurabh
Mahesh
Jonny
Anil

Vector Example: Traversing by Enumerator

import java.util.*;
class VectorExample{
 public static void main(String args[]){
  Vector<String> v=new Vector<>();//creating vector
  v.add("Umrao");//method of Collection
  v.addElement("Isha");//method of Vector
  v.addElement("Kush");
  //traverse elements using Enumeration
  Enumeration<String> e=v.elements();
  while(e.hasMoreElements()){
   System.out.println(e.nextElement());
  }
 }
}

Output

Umrao
Isha
Kush

Java: String Vs StringBuffer Vs StringBuilder


String in Java

The String class represents a sequence of characters.

String Instance Creation
String instance can be created in two ways:

  • By assigning as Literals
    String title = "Facing Issues On IT";
  • By using the new keyword
    String title = new String("Facing Issues On IT");

Points to Remember for Java String

  • The String class is immutable in Java, so it is safe to share a String across different threads or functions.
  • When you create a String using double quotes, the JVM first looks in the string pool for a String with the same value. If a match is found, it returns that reference; otherwise it creates the String object and places it in the pool. This way the JVM saves a lot of space by reusing the same String in different places. If the new operator is used, however, a new String is always created in heap memory.
  • The + operator is overloaded to concatenate two strings. Internally, the compiler typically implements it with StringBuilder (StringBuffer before JDK 1.5).
  • String overrides the equals() and hashCode() methods. Two Strings are equal only if they have the same characters in the same order. Note that equals() is case sensitive, so if you do not want a case-sensitive check, use the equalsIgnoreCase() method.
  • A String value represents a string in the UTF-16 format.
  • String is a final, immutable class with all fields final except “private int hash”. This field holds the hashCode() value; it is computed only when hashCode() is first called and is then cached in this field. Because the hash is computed from the final fields of the String class, every call to hashCode() returns the same result. To the caller it looks as if the calculation happens every time, but internally the cached value is returned.
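The pool and equality points above can be checked with a short sketch. The class name StringPoolDemo is only illustrative.

```java
class StringPoolDemo {
    public static void main(String[] args) {
        String a = "Facing Issues On IT";              // literal: placed in the string pool
        String b = "Facing Issues On IT";              // same literal: reuses the pooled object
        String c = new String("Facing Issues On IT");  // new: separate object on the heap

        System.out.println(a == b);              // true: same pooled reference
        System.out.println(a == c);              // false: different objects
        System.out.println(a.equals(c));         // true: same characters in the same order
        System.out.println(a == c.intern());     // true: intern() returns the pooled reference

        System.out.println("IT".equals("it"));            // false: equals() is case sensitive
        System.out.println("IT".equalsIgnoreCase("it"));  // true

        // Equal strings always produce equal hash codes
        System.out.println(a.hashCode() == c.hashCode()); // true
    }
}
```

Comparing with == checks references only, so equals() is the right choice when the content matters.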

Why StringBuffer & StringBuilder?

The String class is immutable in Java, i.e. any manipulation of a String, such as concatenation, substring, or reverse, always generates a new String and leaves the older String for garbage collection.

For this reason, Java provides StringBuffer (available since JDK 1.0) and StringBuilder (added in JDK 1.5).

StringBuffer and StringBuilder are mutable objects and provide append(), insert(), delete() and substring() methods for String manipulation.
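The difference between an immutable String and a mutable builder can be seen in a minimal sketch. The class name MutabilityDemo is only illustrative.

```java
class MutabilityDemo {
    public static void main(String[] args) {
        String s = "Facing";
        String t = s.concat(" Issues");   // returns a new String, s is untouched

        System.out.println(s);            // Facing
        System.out.println(t);            // Facing Issues
        System.out.println(s == t);       // false: two different objects

        StringBuilder sb = new StringBuilder("Facing");
        StringBuilder same = sb.append(" Issues");  // modifies sb in place and returns it

        System.out.println(sb);           // Facing Issues
        System.out.println(sb == same);   // true: still the same object
    }
}
```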

StringBuffer vs StringBuilder

Apart from the similarities, Java StringBuffer and StringBuilder have these differences:

  • StringBuffer is thread-safe because all of its methods are synchronized, but the main disadvantage is performance.
  • StringBuilder is not synchronized and therefore not thread-safe, but it is faster.

Note: If you are working in a single-threaded environment, go with StringBuilder; when a buffer is shared between threads, use StringBuffer. In general scenarios for string manipulation, StringBuilder is better suited than StringBuffer because it avoids the cost of synchronization.

String vs StringBuffer vs StringBuilder

String | StringBuffer | StringBuilder
Immutable | Mutable | Mutable
Since JDK 1.0 | Since JDK 1.0 | Since JDK 1.5
Thread safe (immutable) | Thread safe | Not thread safe
Not applicable (immutable) | Synchronized | Not synchronized
Performance slow in manipulation | Performance slow in manipulation | Performance faster in manipulation
String concat (+) typically uses StringBuilder internally (StringBuffer before JDK 1.5) | NA | NA
See Also: String Class Examples | See Also: StringBuffer Examples | See Also: StringBuilder Examples

Java: Abstract Class Vs Interface


Abstract classes and interfaces in Java are both used to provide abstraction, but there are many differences:

Abstract Class | Interface
Abstraction (0 to 100%) | Abstraction (100%)
Extended by a class using the ‘extends’ keyword | Implemented by a class using the ‘implements’ keyword
Cannot be instantiated, but can be run if it declares a main() method | Completely abstract, i.e. cannot be instantiated
Can have abstract and non-abstract methods | Methods are implicitly abstract and have no body (before Java 8)
Allows final, non-final, static and non-static variables | Allows only static and final variables
Members can be private, protected, etc. | Members are public by default
Can extend only one class but can implement multiple interfaces | Can extend other interfaces only
Traditionally described as faster than an interface | Traditionally described as slower because of an extra indirection (in practice the difference is usually small)
See Also: Java Abstract Class Examples | See Also: Java Interface Examples
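A few rows of this comparison can be sketched in code. The names Drivable, Vehicle and Car below are hypothetical and chosen only for illustration: the abstract class holds instance state and a concrete method, the interface holds a constant and an abstract method, and one class combines both.

```java
interface Drivable {
    int MAX_SPEED = 120;  // implicitly public, static and final
    String drive();       // implicitly public and abstract
}

abstract class Vehicle {
    private final String name;  // an abstract class may keep instance state

    protected Vehicle(String name) {
        this.name = name;
    }

    // Non-abstract method shared by all subclasses.
    String describe() {
        return name + " with " + wheels() + " wheels";
    }

    abstract int wheels();      // each subclass must provide this
}

// A class extends one abstract class and may implement several interfaces.
class Car extends Vehicle implements Drivable {
    Car() {
        super("Car");
    }

    @Override
    int wheels() {
        return 4;
    }

    @Override
    public String drive() {
        return "Driving up to " + MAX_SPEED;
    }
}

class AbstractVsInterfaceDemo {
    public static void main(String[] args) {
        Car car = new Car();
        System.out.println(car.describe()); // Car with 4 wheels
        System.out.println(car.drive());    // Driving up to 120
    }
}
```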


Java: Interface


An interface is the blueprint of a class to achieve total abstraction.

  • Abstract Class: Abstraction (0 to 100%)
  • Interface: Abstraction (100%)

Points to remember for Interface in Java

  • An interface in Java is declared with the interface keyword.
  • Before Java 8, an interface allows only abstract methods, i.e. methods that don’t have a body.
  • An interface allows static constants.
  • An interface supports multiple inheritance.
  • Methods need not be declared as public and abstract; all interface methods without a body are implicitly public and abstract.
  • All fields of an interface are public, static and final.
  • An interface represents the IS-A relationship.
  • An interface cannot be instantiated, just like an abstract class.

Note:
Interfaces have been enhanced in later Java versions:

  • Java 8 introduced default and static methods in interfaces.
  • Java 9 added support for private methods in interfaces.
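The three kinds of methods can be seen together in a short sketch. It assumes Java 9 or later, and the Greeter interface is invented for this illustration:

```java
interface Greeter {

    String name(); // ordinary abstract method

    // Java 8: default method with a body, inherited by implementing classes.
    default String greet() {
        return prefix() + name();
    }

    // Java 8: static method, called on the interface itself.
    static Greeter of(String name) {
        return () -> name;
    }

    // Java 9: private helper, visible only inside the interface.
    private String prefix() {
        return "Hello, ";
    }
}

class InterfaceMethodsDemo {
    public static void main(String[] args) {
        Greeter greeter = Greeter.of("Java");
        System.out.println(greeter.greet()); // Hello, Java
    }
}
```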

Why use Java interface?

The main reasons to use an interface in Java are:

  • An interface is used to achieve abstraction.
  • An interface is used to support multiple inheritance.
  • An interface is used to achieve loose coupling.

How to declare an interface?

An interface in Java is declared by using the interface keyword. All interface methods declared without a body are public and abstract by default. All fields declared inside the interface are public, static and final by default.

A Java class that implements an interface must implement all the methods declared in the interface, unless the class itself is abstract.

Syntax:

interface interface_name{  
     // declare constant fields by default public , static and final  
    // declare methods that public and abstract by default   
} 

Note: If you do not add any modifier to the methods and variables of a Java interface, the compiler adds them during compilation as below:

  • The Java compiler adds public and abstract before interface methods that have no body.
  • The Java compiler adds public, static and final before fields inside the interface.
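One way to check this is reflection. The sketch below uses a hypothetical Config interface declared without any modifiers and prints the modifiers that the compiled interface actually carries:

```java
import java.lang.reflect.Modifier;

interface Config {
    int TIMEOUT = 30; // no modifiers written here
    void load();      // no modifiers written here
}

class ImplicitModifierDemo {

    // Modifiers the compiler attached to the TIMEOUT field.
    static String fieldModifiers() {
        try {
            return Modifier.toString(Config.class.getField("TIMEOUT").getModifiers());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Modifiers the compiler attached to the load() method.
    static String methodModifiers() {
        try {
            return Modifier.toString(Config.class.getMethod("load").getModifiers());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fieldModifiers());  // public static final
        System.out.println(methodModifiers()); // public abstract
    }
}
```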

Java Classes and Interface Relationship

In java a class extends another class, an interface extends another interface, but a class implements an interface.

Interface Example

In the given example, the Shape interface has only one method, draw(). This method is implemented by the Rectangle and Circle classes.

In a real scenario, an interface provides an abstraction to users, i.e. the interface is defined by one party and its methods are implemented by different providers. It is then used by someone else, who calls the methods while the implementation stays hidden from them.


interface Shape {
    void draw();
}

class Rectangle implements Shape {
    public void draw() { System.out.println("Drawing Rectangle Shape"); }
}

class Circle implements Shape {
    public void draw() { System.out.println("Drawing Circle Shape"); }
}

class InterfaceExample1 {
    public static void main(String args[]) {
        // The same Shape reference can point to any implementation.
        Shape d = new Circle();
        d.draw();
        d = new Rectangle();
        d.draw();
    }
}

Output


Drawing Circle Shape
Drawing Rectangle Shape

Multiple inheritance in Java through interfaces

Multiple inheritance in Java can be achieved by a class implementing multiple interfaces, or by an interface extending multiple interfaces.

interface Printable {
    void print();
}

interface Shape {
    void draw();
}

class Rectangle implements Shape, Printable {
    public void draw() { System.out.println("Drawing Rectangle Shape"); }
    public void print() { System.out.println("Print Rectangle Area"); }
}

class Circle implements Shape, Printable {
    public void draw() { System.out.println("Drawing Circle Shape"); }
    public void print() { System.out.println("Print Circle Area"); }
}

class TestMultipleInterface {
    public static void main(String args[]) {
        // A Shape reference only exposes draw(), so use the concrete type
        // to reach the methods of both interfaces.
        Circle c = new Circle();
        c.draw();
        c.print();
        Rectangle r = new Rectangle();
        r.draw();
        r.print();
    }
}

Output


Drawing Circle Shape
Print Circle Area
Drawing Rectangle Shape
Print Rectangle Area

Multiple inheritance is not supported through classes in Java, but it is possible through interfaces. Why?

Multiple inheritance is not supported for classes because it leads to ambiguity, but it is supported for interfaces because there is no ambiguity: the method implementation is provided by the implementing class.

As shown in the example below, the Showable and Printable interfaces declare the same display() method, but its implementation is provided by the class MultipleInterfaceExmple, so there is no ambiguity.

interface Printable {
    void display();
}

interface Showable {
    void display();
}

class MultipleInterfaceExmple implements Printable, Showable {
    // One implementation satisfies display() from both interfaces.
    public void display() {
        System.out.println("Print Test Multiple Inheritance in java");
    }

    public static void main(String args[]) {
        MultipleInterfaceExmple obj = new MultipleInterfaceExmple();
        obj.display();
    }
}

Output

Print Test Multiple Inheritance in java 

Interface inheritance

A class implements an interface, but one interface can extend another interface.

interface Printable {
    void display();
}

interface Showable extends Printable {
    void show();
}

// Implementing Showable requires both show() and the inherited display().
class InterfaceInheritanceExmple implements Showable {
    public void show() { System.out.println("Show Interface Inheritance in java"); }
    public void display() { System.out.println("Display Interface Inheritance in java"); }

    public static void main(String args[]) {
        InterfaceInheritanceExmple obj = new InterfaceInheritanceExmple();
        obj.show();
        obj.display();
    }
}

Output

Show Interface Inheritance in java
Display Interface Inheritance in java

What is a marker or tagged interface?

An interface that has no methods is known as a marker or tagged interface, for example Cloneable, Serializable, Remote, etc. A marker interface in Java is used to give information to the JVM (or to library code) so that objects of that type are treated specially.

Follow the link below for detailed knowledge of marker interfaces with examples:

Marker Interface in Java and Uses
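As a quick illustration of how a marker changes behaviour, the sketch below relies on the documented contract of Object.clone(), which throws CloneNotSupportedException unless the class implements Cloneable. The class names Marked, Unmarked and MarkerDemo are made up.

```java
class Marked implements Cloneable {
    // Cloning works because the class carries the Cloneable marker.
    boolean tryClone() {
        try {
            super.clone();
            return true;
        } catch (CloneNotSupportedException e) {
            return false;
        }
    }
}

class Unmarked {
    // The same code fails at runtime because the marker is missing.
    boolean tryClone() {
        try {
            super.clone();
            return true;
        } catch (CloneNotSupportedException e) {
            return false;
        }
    }
}

class MarkerDemo {
    public static void main(String[] args) {
        System.out.println(new Marked().tryClone());   // true
        System.out.println(new Unmarked().tryClone()); // false
    }
}
```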

What is nested Interface?

An interface declared within another interface or class is called a nested interface.

Follow the link below for detailed knowledge of nested interfaces with examples:

Java: Nested Interface

 

Java: Nested Interface


An interface declared within another interface or class is called a nested interface.

Nested interfaces in Java are used to group related interfaces so that they are easier to maintain. A nested interface can’t be accessed directly; it must be referred to through its outer interface or class.

For example, a nested interface is like an almirah (cupboard) inside a room: to reach the almirah, you first need to enter the room.

In the Java collection framework, Entry is a nested interface of Map and is accessed as Map.Entry.
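For instance, iterating over a map goes through Map.Entry. This is a minimal sketch and MapEntryDemo is just an illustrative name:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapEntryDemo {

    // Joins all entries as key=value pairs, reading each one via Map.Entry.
    static String describe(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<>(); // keeps insertion order
        map.put("one", 1);
        map.put("two", 2);
        System.out.println(describe(map)); // one=1;two=2;
    }
}
```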

Points to remember for nested interfaces

  • Nested interfaces are implicitly static.
  • A nested interface can have any access modifier when declared inside a class, but when declared inside an interface it is always public.
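Both points can be checked with reflection. In the sketch below the names Outer, Holder and NestedModifierDemo are hypothetical, and no modifiers are written on the nested interfaces themselves:

```java
import java.lang.reflect.Modifier;

interface Outer {
    interface Inner { }   // nested in an interface: implicitly public and static
}

class Holder {
    interface Visible { } // nested in a class: implicitly static, package-private here
}

class NestedModifierDemo {

    static boolean isStatic(Class<?> type) {
        return Modifier.isStatic(type.getModifiers());
    }

    static boolean isPublic(Class<?> type) {
        return Modifier.isPublic(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(isStatic(Outer.Inner.class));    // true
        System.out.println(isPublic(Outer.Inner.class));    // true
        System.out.println(isStatic(Holder.Visible.class)); // true
        System.out.println(isPublic(Holder.Visible.class)); // false
    }
}
```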

Syntax of Nested Interface inside Class

class class_name{  
 ...  
 interface nested_interface_name{  
  ...  
 }  
}

Syntax of Nested Interface inside Interface

interface interface_name{  
 ...  
 interface nested_interface_name{  
  ...  
 }  
}  

Example of Nested Interface: Interface within Interface

interface Readable{
  void show();
  interface Message{
   void messageDetail();
  }
}

Access of Nested Interface within Interface

We access the Message interface through its outer interface Readable because it cannot be accessed directly.

class NestedInterfaceExample1 implements Readable.Message{

 public void messageDetail(){
 System.out.println("Hello !!! you are calling messageDetail method.");
 }  

 public static void main(String args[]){
  //upcasting here
  Readable.Message message=new NestedInterfaceExample1();
  message.messageDetail();
 }
}

Output

Hello !!! you are calling messageDetail method.

Note: For the above example, when you compile the Readable interface, the compiler internally creates a public and static interface as given below:

public static interface Readable$Message
{
  public abstract void messageDetail();
}

As you can see above, the Java compiler internally makes the nested interface public and static.

Example of Nested Interface: Interface within Class

The example below shows an interface declared inside a class and how we can access it.

class ClassA{
  interface Message{
   void messageDetail();
  }
}

Access of Nested Interface within Class

class NestedInterfaceExample2 implements ClassA.Message{
 public void messageDetail(){
 System.out.println("Hello !!! you are calling messageDetail method.");
 }  

 public static void main(String args[]){
 //upcasting here
  ClassA.Message message=new NestedInterfaceExample2();
  message.messageDetail();
 }
}

Output

Hello !!! you are calling messageDetail method.

Can we define a class inside the interface?

Yes. If we define a class inside an interface, the Java compiler automatically treats it as a public static nested class. The example below shows how to define a class within an interface:

interface M{
class A{}
}
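Because the nested class is static, it can be created through the interface name without any outer instance. Here is a short sketch of that usage, with hypothetical names Measurable, Unit and ClassInInterfaceDemo:

```java
interface Measurable {
    double area();

    // A class declared inside an interface is implicitly public and static.
    class Unit implements Measurable {
        public double area() {
            return 1.0;
        }
    }
}

class ClassInInterfaceDemo {
    public static void main(String[] args) {
        // No outer instance is needed; the class is reached via the interface name.
        Measurable m = new Measurable.Unit();
        System.out.println(m.area()); // 1.0
    }
}
```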