It is fairly easy to utilize the Apache POI API in Eclipse.

Download the most current version of POI from Apache and include the jars from the following directories in the Java Build Path in an Eclipse project's properties. "./" is the root directory of the downloaded API. 

  • ./lib
  • ./ooxml-lib
  • ./

After you have included all external libraries in the Java Build Path of your project, you can start using the API.

Reference an existing Excel file with POI

The below code snippet shows how an existing Excel file can be referenced with POI. Since there are different classes for different file types, a WorkbookFactory should be used to instantiate objects from corresponding classes. The factory will determine, if one of the following classes should be used.

  • HSSFWorkbook for .xls (Excel 2003 or earlier versions)
  • XSSFWorkbook .xlsx (Excel 2007 and later)

You however should not use one of the above classes. Instead you should use the interface Workbook to reference instances of one of the above classes.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
 
	...
	Workbook wrk;
	FileInputStream in;
	String path = "A file path";
 
	/*
	 * Reference a file.
	 */
	in = new FileInputStream(path);
	/*
	 * The factory decides based on the provided file format (.xls or .xlsx)
	 * which workbook type (HSSF for Excel 2003 and earlier or XSSF for
	 * Excel 2007 and later).
	 */
	wrk = WorkbookFactory.create(in);
	...

Reference a sheet in an Excel Workbook

Referencing a sheet within a workbook works accordingly. Since the WorkbookFactory returned the correct implementation of the interface Workbook the rest is easy.

Use the method getSheet(String sheetName) from the interface Workbook to get an instantiation of the interface Sheet. A returned object will of course be of a class corresponding the Excel file type referenced by an instance of the interface Workbook.

Be careful with the imports though as there is another Sheet interface in a package related to PowerPoint sheets.

1
2
3
4
import org.apache.poi.ss.usermodel.Sheet;
	...
	Sheet sheet = wrk.getSheet("Sheet1");
	...

Determine last row in an Excel sheet

As Microsoft Excel VBA developers already will know, it can be quite tricky to determine the last row with data in an Excel sheet. Below is a method, which will work according to the Apache API docs.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
public boolean hasNextRow() {
    int rowCount = 0;
    
    rowCount = this.currentSheet.getLastRowNum();
    if (rowCount == 0) {
      rowCount = this.currentSheet.getPhysicalNumberOfRows();
    }
    
    if (this.rowIndex < rowCount) {
      return true;
    } else {
      return false;
    }
  }

However, I did not test it properly yet, it will give you a hint where to look for a solution. Apparently the method getLastRowNum() has to be considered unsafe whenever it returns zero. Then the method getPhysicalNumberOfRows() can return a more reliable result.

Be careful though, as this might still be not what you want. The last row might not match the last row with data but point to empty rows with formatting according to Microsoft Excel's UsedRange object.

I will test this properly and finish this article with some updates on things like this as they are good to know when working with POI.

POI Component APIs

Apache POI delivers several APIs for different document types. Here is a list from their homepage, which will give you a hint on where to look for APIs for different Microsoft document types.

  • Excel (SS=HSSF+XSSF)
  • Word (HWPF+XWPF)
  • PowerPoint (HSLF+XSLF)
  • OpenXML4J (OOXML)
  • OLE2 Filesystem (POIFS)
  • OLE2 Document Props (HPSF)
  • Outlook (HSMF)
  • Visio (HDGF)
  • TNEF (HMEF)
  • Publisher (HPBF)

Articles

Below you can find some resources regarding POI on my website.

Find last row in Excel file using POI

Reading and writing MS Excel files using POI

References

How to read from an Excel file using POI

http://www.avajava.com/tutorials/lessons/how-do-i-read-from-an-excel-file-using-poi.html

POI API documentation

http://poi.apache.org/apidocs/index.html

Apache POI homepage

http://poi.apache.org/