Implement bulk create from Excel files using an input stream #134

Open
asishallab opened this issue Jul 1, 2020 · 0 comments
Labels
enhancement New feature or request

Comments

@asishallab
Member

Introduction

Vocen supports uploading comma-separated values (CSV) tables to bulk create records from their rows. The table is copied to the server and processed as a stream: each row is read, validated, and, if no validation error occurs, stored in the underlying data storage. This is an asynchronous, non-blocking process. The user is informed of its outcome via email, which contains either the validation errors or a success message.

Extension to support XLSX files

Offer the option to upload XLSX (MS Excel) files. Apply the same methodology to uploaded Excel tables, i.e. copy the file to the server, open it as a stream, process it row (record) by row, validate each record, and, if valid, store it in the underlying database. If the database is a relational DB, wrap the whole bulk create in a transaction. Expect the uploaded XLSX table to contain only a single sheet, no plots, and no formulae.
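The transaction requirement could be sketched as follows. This is only an illustration under stated assumptions: `withTransaction`, `bulkCreate`, and the `db` adapter with `begin()`, `commit()`, and `rollback()` are hypothetical names, not part of the existing code base, and the in-memory `createFakeDb` merely stands in for a relational database. The sketch also assumes that a single invalid row aborts and rolls back the whole upload, which the description above does not state explicitly.

```javascript
/**
 * Runs `work` inside a transaction: commit on success, roll back on any error.
 * `db` is a hypothetical adapter exposing begin(), commit() and rollback().
 */
async function withTransaction(db, work) {
  const tx = await db.begin();
  try {
    const result = await work(tx);
    await db.commit(tx);
    return result;
  } catch (err) {
    await db.rollback(tx);
    throw err;
  }
}

/** In-memory stand-in for a relational database, for illustration only. */
function createFakeDb() {
  const committed = [];
  return {
    committed,
    async begin() { return { pending: [] }; },
    async commit(tx) { committed.push(...tx.pending); },
    async rollback(tx) { tx.pending.length = 0; },
  };
}

/**
 * Stores all rows or none (assumption: any validation error aborts the upload).
 * `rows` may be an array or any (async) iterable of records.
 * `validate` returns a list of error messages, empty if the row is valid.
 */
async function bulkCreate(db, rows, validate) {
  return withTransaction(db, async (tx) => {
    for await (const row of rows) {
      const messages = validate(row);
      if (messages.length > 0) {
        throw new Error(messages.join('; '));
      }
      tx.pending.push(row);
    }
    return tx.pending.length;
  });
}
```

With a real relational back end, the adapter methods would map onto the transaction API of whichever database library is in use. The point of the sketch is only that validation and storage of every row happen inside one commit-or-rollback scope.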

Framework / NPM package

There is a great variety of packages offering read streams for XLSX files. Choose carefully which one to use. The criteria should be:

  • Is the package still maintained? Check the commit frequency, e.g. on GitHub.
  • How many weekly downloads has it had over the past years, i.e. is the package among the most popular?
  • Can we also use the package to write XLSX tables?

Use npm trends to evaluate package popularity. Based on a preliminary assessment, the packages node-excel-stream and exceljs appear to be good choices, the latter being by far the more popular.

Implementation

Heed our coding principles and guidelines. Document methods with JSDoc and inline comments, add documentation to our manual, and, most importantly, write modular code. Modularity suggests that code identical between CSV and XLSX upload and processing should live in functions shared by the two implementations, with format-specific code used only where it is needed. Consider using a function that is passed a read stream, coming from either XLSX or CSV, to process the rows. That function would not know how the stream it receives was created.
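A minimal sketch of such a shared, format-agnostic function might look like this. The names `processRows`, `validate`, and `store` are hypothetical. The sketch assumes that the CSV and XLSX readers each yield one plain object per row, and that invalid rows are skipped and collected for the email report rather than aborting the run. Node.js object-mode readable streams are async iterable, so the function only relies on `for await` and never inspects how the stream was created.

```javascript
const { Readable } = require('stream');

/**
 * Validates and stores rows from any async-iterable source.
 * The caller decides whether the rows come from a CSV or an XLSX reader.
 *
 * @param {AsyncIterable<object>} rows - one record per row
 * @param {(row: object) => string[]} validate - error messages, empty if valid
 * @param {(row: object) => Promise<void>} store - persists one valid record
 * @returns {Promise<{stored: number, errors: {row: number, messages: string[]}[]}>}
 */
async function processRows(rows, validate, store) {
  const errors = [];
  let stored = 0;
  let rowNumber = 0;

  for await (const row of rows) {
    rowNumber += 1;
    const messages = validate(row);
    if (messages.length > 0) {
      // Assumption: remember the error for the report and move on.
      errors.push({ row: rowNumber, messages });
      continue;
    }
    await store(row);
    stored += 1;
  }

  return { stored, errors };
}

// Usage with a stand-in stream; a real caller would pass the CSV or XLSX reader.
async function demo() {
  const rows = Readable.from([{ name: 'a' }, { name: '' }, { name: 'c' }]);
  const saved = [];
  const report = await processRows(
    rows,
    (r) => (r.name ? [] : ['name is required']),
    async (r) => { saved.push(r); }
  );
  return { report, saved };
}
```

The returned report carries what the outcome email needs: the number of stored records and the per-row validation errors.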

Cassandra support

Currently, bulk creation is not implemented for Cassandra. For now, skip the Cassandra implementation; it will be done in a later step.

asishallab added the enhancement (New feature or request) label on Jul 1, 2020