Version 0.3.0
### Added
- A benchmark comparison against other xlsx reading libraries.
- Several new assertions.
- Although the OS reclaims memory obtained through malloc when the
program terminates, it's good practice to free it manually. For that
reason a new hook method, XLSXDrone::Workbook.close_workbooks(), is now
called at program termination.

### Changed
- The error reporting system of the C library was updated, so the Ruby
one had to be updated as well. Those changes are reflected here.
- The XLSXDrone::Workbook#load_sheet() method is safer now. It raises
an exception if ANY problem arises, instead of returning nil. This
means it never returns nil: it either returns a valid XLSXDrone::Sheet
object or raises an exception.
- If the user tries to use a workbook that was already closed, an
exception explaining the situation is raised; no more segfaults.

### Fixed
- The native binding was improved to work well with x64.
- UTF-8 strings are now read successfully!
- Fixed several problems with date, time & date time Excel values.
damian-m-g committed Jan 13, 2021
1 parent 976dce3 commit 67a9510
Showing 28 changed files with 364 additions and 120 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -3,3 +3,8 @@ Gemfile.lock
ProgrammerNotes.md
*.gem
releases/
test/benchmark/*.xlsx
!test/benchmark/xlsx_200000_rows.xlsx
.yardoc/
coverage/
doc/
26 changes: 24 additions & 2 deletions CHANGELOG.md
@@ -1,9 +1,31 @@
# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) featuring Added, Changed, Deprecated,
Removed, Fixed, Security, and others; and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.3.0] - 2021-01-12
### Added
- A benchmark comparison against other xlsx reading libraries.
- Several new assertions.
- Although the OS reclaims memory obtained through malloc when the program terminates, it's good practice to free it
manually. For that reason a new hook method, XLSXDrone::Workbook.close_workbooks(), is now called at program
termination.

### Changed
- The error reporting system of the C library was updated, so the Ruby one had to be updated as well. Those changes are
reflected here.
- The XLSXDrone::Workbook#load_sheet() method is safer now. It raises an exception if ANY problem arises, instead of
returning nil. This means it never returns nil: it either returns a valid XLSXDrone::Sheet object or raises an
exception.
- If the user tries to use a workbook that was already closed, an exception explaining the situation is raised; no more
segfaults.

### Fixed
- The native binding was improved to work well with x64.
- UTF-8 strings are now read successfully!
- Fixed several problems with date, time & date time Excel values.

## [0.2.0] - 2019-04-05
### Added
8 changes: 8 additions & 0 deletions Gemfile
@@ -7,6 +7,14 @@ group :development do
gem 'bundler'
gem 'yard'
gem 'test-unit'
gem 'simplecov'
end

group :test do
gem 'roo'
gem 'creek'
gem 'rubyXL'
gem 'simple_xlsx_reader'
end

group :production do
14 changes: 14 additions & 0 deletions LICENSE
@@ -0,0 +1,14 @@
Copyright 2021 Damián M. G.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit
persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the
Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Empty file removed LICENSE.md
Empty file.
66 changes: 56 additions & 10 deletions README.md
@@ -1,4 +1,37 @@
# PORCUPINE_RUBY
# xlsx_drone

[![](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/damian-m-g/xlsx_drone_rb/master/data/shields/simplecov.json)](#xlsx_drone)
[![](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/damian-m-g/xlsx_drone_rb/master/data/shields/test_suite.json)](#xlsx_drone)
[![](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/damian-m-g/xlsx_drone_rb/master/data/shields/assertions.json)](#xlsx_drone)

Fast _Microsoft Excel's_ **\*.xlsx** reader. Binding of [C's xlsx_drone](https://github.com/damian-m-g/xlsx_drone) lib.

## Table of contents

* [Summary](#summary)
* [Installation](#installation)
* [Usage](#usage)
* [API](#api)
* [TODO](#todo)
* [License](#license)

## Summary

The xlsx_drone gem stands out for its runtime speed, because almost all of the information gathering happens in native C code built for speed.

[You can find a benchmark inside the repository](https://github.com/damian-m-g/xlsx_drone_rb/blob/master/test/benchmark/speed.rb) that measures the **reading speed** of the best-known (and most used) Ruby libraries for reading/writing *.xlsx files. The results gathered on my old notebook, reading 200000 rows × 3 columns (number, string and date), are as follows:

![](data/README.md_images/bm_result.png)

**2x** faster than the next fastest library.

You can run this test on your own computer with the `rake bm` task.

## Installation

Use the _gem_ command that comes with your Ruby installation:

`gem install xlsx_drone`

## Usage

@@ -10,17 +43,30 @@ wb = XLSXDrone.open(path_to_xlsx) #: XLSXDrone::Workbook

sheets_amount = wb.sheets_amount #: Integer
# you can pass its index (starts with 1) or its name as argument
ws = wb.load_sheet(1) #: XLSXDrone::Sheet
sheet = wb.load_sheet(1) #: XLSXDrone::Sheet
puts "Sheet #1 name: #{sheet.name}"

1.upto(ws.last_row) do |row|
p ws.read_cell(row, 'A')
p ws.read_cell(row, 'B')
1.upto(sheet.last_row) do |row|
p sheet.read_cell(row, 'A')
p sheet.read_cell(row, 'B')
end

# remember to close the wb once done
wb.close()
```

## Known problems
## API

You can generate the full documentation with the `rake yard` task, although ~90% of the API (and its most useful part) is shown above.

## TODO

Every idea for a new feature is weighed carefully against the essence of the library, which is to be fast and simple. Hence, the following TODOs may be taken up or dismissed on that basis.

Also, consider that this TODO list is tied to the [C's xlsx_drone](https://github.com/damian-m-g/xlsx_drone#todo) TODO list. Changes implemented there will be _immediately_ mirrored here.

- C's xlsx_drone plans to provide **writing support** for xlsx files. As soon as this is implemented there, I'll perform the necessary binding.
- Consider making `XLSXDrone::Workbook#load_sheet()` keep a reference to the loaded sheet as an accessible instance variable (i.e.: @loaded_sheets).

**Feel free to [make (or upvote)](https://github.com/damian-m-g/xlsx_drone_rb/issues) any feature request.**

## License

So far, it doesn't work on Ruby x64 versions.
#### [MIT](https://github.com/damian-m-g/xlsx_drone_rb/blob/master/LICENSE)
13 changes: 13 additions & 0 deletions Rakefile
@@ -23,3 +23,16 @@ task :build do
end
puts('Gem(s) moved.')
end

desc 'run benchmark'
task :bm do
load('test/benchmark/speed.rb')
end

# you will execute this before every new version release
desc 'perform measures & produce badges metadata'
task :badges do
# TODO: Should parse coverage/index.html and produce coverage data only with lib/**/*.rb files, hardcoding value for now
# TODO: Should parse all assertions and produce sum of all of them, hardcoding value for now
# TODO: Should produce test suite pass badge only if all test passes, hardcoding value for now
end
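The `:badges` task above is still a stub, and the badge files it is meant to produce follow the shields.io endpoint shape (`schemaVersion`, `label`, `message`, `color`). A minimal sketch of how such a payload could be generated is shown here; `badge_payload` is a hypothetical helper name, not something defined in this commit.

```ruby
require 'json'

# Hypothetical helper (not part of the commit): builds the JSON payload
# that a shields.io endpoint badge reads.
def badge_payload(label, message, color)
  JSON.generate({ schemaVersion: 1, label: label, message: message.to_s, color: color })
end

puts badge_payload('test assertions', 112, 'informational')
# prints {"schemaVersion":1,"label":"test assertions","message":"112","color":"informational"}
```

Writing the returned string to a file under `data/shields/` would be the remaining step; how the numbers themselves get measured is left open by the TODO comments.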
Binary file added data/README.md_images/bm_result.png
1 change: 1 addition & 0 deletions data/shields/assertions.json
@@ -0,0 +1 @@
{"schemaVersion":1,"label":"test assertions","message":"112","color":"informational"}
1 change: 1 addition & 0 deletions data/shields/simplecov.json
@@ -0,0 +1 @@
{"schemaVersion":1,"label":"coverage","message":"91.79%","color":"green"}
1 change: 1 addition & 0 deletions data/shields/test_suite.json
@@ -0,0 +1 @@
{"schemaVersion":1,"label":"test suite","message":"pass","color":"brightgreen"}
Binary file removed ext/libporcupine_x32.dll
Binary file not shown.
Binary file removed ext/libporcupine_x64.dll
Binary file not shown.
Binary file added ext/xlsx_drone_x64.dll
Binary file not shown.
Binary file added ext/xlsx_drone_x86.dll
Binary file not shown.
5 changes: 5 additions & 0 deletions lib/xlsx_drone.rb
@@ -10,3 +10,8 @@

# turn off err printing from the native library
XLSXDrone::NativeBinding.xlsx_set_print_err_messages(0)

# ensure that all opened workbooks get closed (the OS would reclaim the memory anyway; this is just good housekeeping)
at_exit do
XLSXDrone::Workbook.close_workbooks()
end
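The body of `XLSXDrone::Workbook.close_workbooks()` is not visible in this diff. One common way to back such a hook is a class-level registry of open objects that is drained at exit. The sketch below illustrates that pattern with a hypothetical `DemoWorkbook` class; it is an assumption about the general technique, not the gem's actual implementation.

```ruby
# Hypothetical stand-in for a workbook that owns a native resource.
class DemoWorkbook
  @opened = []

  class << self
    # Workbooks that were opened and not yet closed.
    attr_reader :opened

    # Close every workbook that is still open.
    def close_all
      # iterate over a copy, because #close mutates the registry
      @opened.dup.each(&:close)
    end
  end

  attr_reader :path

  def initialize(path)
    @path = path
    @closed = false
    self.class.opened << self
  end

  def closed?
    @closed
  end

  # Release the (imaginary) native resource; safe to call more than once.
  def close
    return if @closed

    @closed = true
    self.class.opened.delete(self)
  end
end

# Runs when the interpreter shuts down normally.
at_exit { DemoWorkbook.close_all }
```

Keeping `close` idempotent means a workbook closed by the user and then visited again by the exit hook causes no harm.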
3 changes: 3 additions & 0 deletions lib/xlsx_drone/exceptions.rb
@@ -9,6 +9,9 @@ class IndexOutOfBounds < RuntimeError; end

# May happen on xlsx_load_sheet().
class NonExistent < RuntimeError; end

# May happen when trying to interact with a workbook that was already closed.
class WorkbookClosed < RuntimeError; end
end

# Errors caused by the system itself.
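A minimal sketch of the "raise instead of returning nil or segfaulting" behaviour described in the changelog follows. `DemoBook`, `DemoWorkbookClosed` and `DemoNonExistent` are hypothetical stand-ins that only mimic the idea; the real classes wrap native calls that are not reproduced here.

```ruby
# Hypothetical error classes, mirroring the idea of dedicated exceptions.
class DemoWorkbookClosed < RuntimeError; end
class DemoNonExistent < RuntimeError; end

# Hypothetical workbook that only knows its sheet names.
class DemoBook
  def initialize(sheet_names)
    @sheet_names = sheet_names
    @closed = false
  end

  def close
    @closed = true
  end

  # Accepts a 1-based index or a sheet name.
  # Returns the sheet name or raises; it never returns nil.
  def load_sheet(reference)
    ensure_open!
    name =
      case reference
      when Integer then reference >= 1 ? @sheet_names[reference - 1] : nil
      when String then @sheet_names.find { |n| n == reference }
      end
    raise DemoNonExistent, "no sheet matches #{reference.inspect}" if name.nil?

    name
  end

  private

  # Guard shared by every method that would touch the native handle.
  def ensure_open!
    raise DemoWorkbookClosed, 'the workbook was already closed' if @closed
  end
end

book = DemoBook.new(%w[Summary Data])
puts book.load_sheet(1)
book.close
begin
  book.load_sheet(1)
rescue DemoWorkbookClosed => e
  puts "rescued: #{e.message}"
end
```

Callers can then rely on the return value being usable and handle failures in a single `rescue`, rather than checking for nil after every call.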
84 changes: 38 additions & 46 deletions lib/xlsx_drone/native_binding.rb
@@ -1,80 +1,72 @@
# Namespace (protector) of the library.
module XLSXDrone

# All things related to the binding with the native C library.
module NativeBinding

PLATFORM_X64 = RUBY_PLATFORM.match(/64/) ? true : false
EXT_PATH = "#{File.dirname(File.dirname(File.dirname(__FILE__)))}/ext"
DLL_PATH = PLATFORM_X64 ? "#{EXT_PATH}/libporcupine_x64.dll" : "#{EXT_PATH}/libporcupine_x32.dll"

DLL_PATH = PLATFORM_X64 ? "#{EXT_PATH}/xlsx_drone_x64.dll" : "#{EXT_PATH}/xlsx_drone_x86.dll"
class XLSXWorkbookT < FFI::Struct

byte_index = 0


layout \
:deployment_path, :pointer, byte_index,
:shared_strings_xml, :pointer, byte_index += FFI.type_size(FFI::Type::POINTER),
:n_styles, :int, byte_index += FFI.type_size(FFI::Type::POINTER),
:styles, :pointer, byte_index += FFI.type_size(FFI::Type::INT),
:n_sheets, :int, byte_index += FFI.type_size(FFI::Type::POINTER),
:sheets, :pointer, byte_index += FFI.type_size(FFI::Type::INT)
:deployment_path, :pointer,
:shared_strings_xml, :pointer,
:n_styles, :int,
:styles, :pointer,
:n_sheets, :int,
:sheets, :pointer
end

class XLSXStyleT < FFI::Struct

byte_index = 0


layout \
:style_id, :int, byte_index,
:related_type, :int, byte_index += FFI.type_size(FFI::Type::INT),
:format_code, :pointer, byte_index += FFI.type_size(FFI::Type::INT)
:style_id, :int,
:related_category, :int,
:format_code, :pointer
end

class XLSXReferenceToRowT < FFI::Struct

byte_index = 0


layout \
:row_n, :int, byte_index,
:sheetdata_child_i, :int, byte_index += FFI.type_size(FFI::Type::INT)
:row_n, :int,
:sheetdata_child_i, :int
end

class XLSXSheetT < FFI::Struct

byte_index = 0


layout \
:xlsx, :pointer, byte_index,
:name, :pointer, byte_index += FFI.type_size(FFI::Type::POINTER),
:sheet_xml, :pointer, byte_index += FFI.type_size(FFI::Type::POINTER),
:sheetdata, :pointer, byte_index += FFI.type_size(FFI::Type::POINTER),
:last_row, :int, byte_index += FFI.type_size(FFI::Type::POINTER),
:last_row_looked, XLSXReferenceToRowT, byte_index += FFI.type_size(FFI::Type::INT)
:xlsx, :pointer,
:name, :pointer,
:sheet_xml, :pointer,
:sheetdata, :pointer,
:last_row, :int,
:last_row_looked, XLSXReferenceToRowT
end

class XLSXCellValue < FFI::Union

layout \
:pointer_to_char_value, :pointer,
:int_value, :int,
:long_long_value, :long_long,
:double_value, :double
end

class XLSXCellT < FFI::Struct

byte_index = 0


layout \
:style, :pointer, byte_index,
:value_type, :int, byte_index += FFI.type_size(FFI::Type::POINTER),
:value, XLSXCellValue, byte_index += FFI.type_size(FFI::Type::INT)
:style, :pointer,
:value_type, :int,
:value, XLSXCellValue
end

extend FFI::Library
ffi_lib DLL_PATH

# function attachings
attach_function :xlsx_get_xlsx_errno, [], :int
attach_function :xlsx_set_print_err_messages, [:int], :void
attach_function :xlsx_open, [:string, :pointer], :int
attach_function :xlsx_load_sheet, [:pointer, :int, :string], :pointer
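One plausible reading of the struct changes above (an assumption; the commit does not spell it out) is that the removed hand-computed offsets were a running sum of field sizes, which ignores alignment padding. With 4-byte pointers every field is naturally aligned already, but with 8-byte pointers a pointer that follows an `int` has to be padded up to the next multiple of 8. The sketch below compares both calculations in plain Ruby; the size/alignment tables and helper names are illustrative and assume typical 32-bit and 64-bit ABIs.

```ruby
# [size, alignment] in bytes for typical 64-bit and 32-bit ABIs (assumed values).
X64 = { pointer: [8, 8], int: [4, 4] }.freeze
X86 = { pointer: [4, 4], int: [4, 4] }.freeze

# Offsets as a running sum of field sizes, with no padding.
def packed_offsets(fields, abi)
  offset = 0
  fields.map do |type|
    current = offset
    offset += abi.fetch(type).first
    current
  end
end

# Offsets with natural alignment: each field starts at a multiple of its alignment.
def aligned_offsets(fields, abi)
  offset = 0
  fields.map do |type|
    size, align = abi.fetch(type)
    offset = (offset + align - 1) / align * align # round up to the alignment
    current = offset
    offset += size
    current
  end
end

# Field types of the workbook struct, in declaration order.
WORKBOOK_FIELDS = %i[pointer pointer int pointer int pointer].freeze

p packed_offsets(WORKBOOK_FIELDS, X64)  # prints [0, 8, 16, 20, 28, 32]
p aligned_offsets(WORKBOOK_FIELDS, X64) # prints [0, 8, 16, 24, 32, 40]
p packed_offsets(WORKBOOK_FIELDS, X86) == aligned_offsets(WORKBOOK_FIELDS, X86) # prints true
```

Calling `layout` without explicit offsets, as the new code does, lets FFI work out the offsets and padding for the current platform.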