Processing Excel Files with JavaScript

Joe Bologna
JavaScript in Plain English
3 min read · Nov 26, 2020


I use Node.js for a lot of stuff. When I wanted to create an inventory system in JavaScript, I considered the alternatives. Since the program is for personal use, writing a UI to edit the data had little value, so I decided to use a spreadsheet. At first I used Google Sheets, but I was always exporting the file after saving it, and the copies would get out of sync. Very annoying and error prone.

As much as I’d rather use open source software, Excel is still the simplest solution. Unfortunately, Excel spreadsheets are XLSX files. XLSX is a rather complex zipped-XML format and difficult to parse. I looked for an NPM module to read XLSX files, but didn’t find one I liked.

After considerable head scratching, I realized R (https://www.r-project.org/) is specifically designed to process tabular data from all sorts of sources. I decided to write a simple R script to do the conversion.

The readxl R package can read XLSX files into a table. Assuming the data does not contain a tab character, the table can be output to a Tab Separated Values (TSV) file. A TSV file is easy to parse in JavaScript.

The following R program, run from the command line, does the conversion:

# xltotsv.R: read the spreadsheet and write it out as tab-separated text
require('readxl')
a <- readxl::read_xlsx("inventory_data.xlsx")
write.table(a, file = "inventory_data.txt", sep = "\t", na = "", quote = FALSE, row.names = FALSE)

Save it as xltotsv.R and execute it from the command line with:

R -f xltotsv.R

At this point, the TSV file can be read into an array using Node.js and each row split into fields. It’s even possible to use the first row as the attribute names for each object. There are NPM packages to do this, but it’s rather simple to do by hand. Let’s create a getDB() function to return the db and the field names:

// Location Container Box Item Qty Value
import fs from 'fs';

// convert TSV to JSON using the first line as field names
function getDB() {
  const invTSV = fs.readFileSync('inventory_data.txt', 'utf-8');
  const lines = invTSV.split('\n');
  // header row gives the six field names
  const fields = lines[0].replace('\r', '').split('\t').slice(0, 6);
  const db = [];
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i].replace('\r', '');
    if (line === '') continue; // skip blank lines (e.g. the trailing newline)
    const values = line.split(/ *\t/g); // split on tabs, dropping spaces before them
    const entry = {};
    for (let j = 0; j < fields.length; j++) {
      entry[fields[j]] = values[j];
    }
    db.push(entry);
  }
  return { db, fields };
}

Converting the file into rows of an array is easily done by splitting the long string at the newline character at the end of each line.

The TSV file is in DOS format, so the trailing \r character needs to be removed from each line.

In this example each row is converted to an object: {} creates an empty object, and attributes are added using the entry[attribute] bracket notation. This builds objects in familiar JSON notation:

{attribute:"value"}

Processing the array of objects is now a simple matter of using Array methods such as map, filter, and forEach.
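For example, here is a quick sketch. The field names come from the header comment above; the sample rows are made up for illustration:

```javascript
// Sample entries shaped like the objects getDB() returns;
// the rows here are invented for the example.
const db = [
  { Location: 'Garage', Container: 'Shelf A', Box: '1', Item: 'Drill', Qty: '1', Value: '80' },
  { Location: 'Attic', Container: 'Bin 2', Box: '3', Item: 'Books', Qty: '12', Value: '60' },
];

// TSV values are strings, so convert them before doing arithmetic.
const totalValue = db.reduce((sum, e) => sum + Number(e.Value), 0);
console.log(totalValue); // 140

// List every item stored in the garage.
const inGarage = db.filter((e) => e.Location === 'Garage').map((e) => e.Item);
console.log(inGarage); // [ 'Drill' ]
```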

At this point, I save the array as a JSON file and use it for whatever I want. For my inventory program, I wrote a mobile app using React Native and published it using Expo. The data is retrieved from AWS so I can access it from anywhere; it’s necessary to protect access to the JSON file. How I wrote the mobile app and use AWS is somewhat involved, so I’ll describe the method I used in a future article.
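Saving and reloading the array is a one-liner each way with JSON.stringify and JSON.parse. A minimal sketch, with the file name and sample data made up for illustration:

```javascript
import fs from 'fs';

// Stand-in for the db array returned by getDB().
const db = [{ Item: 'Drill', Qty: '1' }];

// Pretty-print with two-space indentation so the file is easy to inspect.
fs.writeFileSync('inventory_data.json', JSON.stringify(db, null, 2));

// Reading it back later is the reverse operation.
const restored = JSON.parse(fs.readFileSync('inventory_data.json', 'utf-8'));
console.log(restored[0].Item); // Drill
```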

Originally published at https://focusedforsuccess.com on November 26, 2020.
