.xlsx format (MS Excel 2007)

Today I worked out how to parse the new MS Excel 2007 .xlsx format. It turns out to be just a .zip archive with a bunch of files inside, most of them .xml:

\_rels\.rels
\docProps\core.xml
\docProps\app.xml
\xl\_rels\workbook.xml.rels
\xl\externalLinks\_rels\externalLink1.xml.rels
\xl\externalLinks\externalLink1.xml
\xl\printerSettings\printerSettings1.bin
\xl\theme\theme1.xml
\xl\worksheets\_rels
\xl\worksheets\sheet1.xml
\xl\worksheets\_rels\sheet1.xml.rels
\xl\calcChain.xml
\xl\workbook.xml
\xl\sharedStrings.xml
\xl\styles.xml
\[Content_Types].xml
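
A listing like the one above can be produced with the standard java.util.zip classes. The following is only a minimal sketch: the class name XlsxEntryLister and the helper buildSampleArchive are made up for illustration, and the sample archive is a three-entry stand-in built in memory, not a real workbook. Note that inside the archive the entry names use forward slashes (xl/workbook.xml), not the backslashes shown in the listing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class XlsxEntryLister {

    /** Returns the names of all entries in a zip stream, in archive order. */
    public static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Hypothetical helper: builds a tiny stand-in archive so the sketch runs without a real .xlsx. */
    public static byte[] buildSampleArchive() throws IOException {
        String[] sampleNames = {"[Content_Types].xml", "xl/workbook.xml", "xl/worksheets/sheet1.xml"};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : sampleNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<placeholder/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // With an argument, list that file; otherwise fall back to the in-memory sample.
        InputStream in = args.length > 0
                ? Files.newInputStream(Paths.get(args[0]))
                : new ByteArrayInputStream(buildSampleArchive());
        for (String name : listEntries(in)) {
            System.out.println(name);
        }
    }
}
```

Passing the path of a real .xlsx file as the first argument should print its entries instead of the sample ones.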

As expected, the data I need is in the \xl\worksheets\sheet1.xml file, which is an ordinary XML file:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  ………
  <cols>
    <col width="20.7109375" />
    …….
  </cols>
  <sheetData>
    <row r="2" spans="2:12" s="9" customFormat="1" ht="23.25">
      <c r="B2" s="22" t="s">
        <v>16</v>
      </c>

So I think parsing this in Java will be easier than parsing the old .xls format with Apache POI.
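
To give an idea of what such parsing might look like, here is a rough sketch that uses the StAX reader from the JDK (javax.xml.stream) to collect each cell reference together with the raw text of its <v> element. The class name SheetCellReader and the SAMPLE string are assumptions for illustration: the sample is the fragment above cut down to a single cell, with the closing tags added so that it is well-formed. One caveat: as far as I understand the format, t="s" marks a shared string, so the 16 in that cell would be an index into \xl\sharedStrings.xml rather than the text itself. The sketch does not resolve that index.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class SheetCellReader {

    // Assumed sample: the fragment from the post reduced to one cell, closing tags added.
    public static final String SAMPLE =
            "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
          + "<sheetData><row r=\"2\" spans=\"2:12\">"
          + "<c r=\"B2\" s=\"22\" t=\"s\"><v>16</v></c>"
          + "</row></sheetData></worksheet>";

    /** Maps each cell reference (for example "B2") to the raw text of its v element. */
    public static Map<String, String> readCells(String sheetXml) throws XMLStreamException {
        Map<String, String> cells = new LinkedHashMap<>();
        XMLStreamReader xml = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(sheetXml));
        String currentRef = null;
        try {
            while (xml.hasNext()) {
                if (xml.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = xml.getLocalName();
                if ("c".equals(name)) {
                    // Remember which cell we are in; its value arrives in a nested v element.
                    currentRef = xml.getAttributeValue(null, "r");
                } else if ("v".equals(name) && currentRef != null) {
                    cells.put(currentRef, xml.getElementText());
                }
            }
        } finally {
            xml.close();
        }
        return cells;
    }

    public static void main(String[] args) throws XMLStreamException {
        for (Map.Entry<String, String> cell : readCells(SAMPLE).entrySet()) {
            System.out.println(cell.getKey() + " = " + cell.getValue());
        }
    }
}
```

A streaming reader is used here only because worksheets can get large; for small files a DOM parser would probably do just as well.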
