Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data-forge not working nicely with collectionsjs #7

Closed
btruhand opened this issue Jun 13, 2018 · 3 comments
Closed

data-forge not working nicely with collectionsjs #7

btruhand opened this issue Jun 13, 2018 · 3 comments

Comments

@btruhand
Copy link

First and foremost, I do not think this is a fault in data-forge but I feel like I should raise the issue here so others and author(s) are aware of it

I have the following code:

const dataForge = require('data-forge');
const util = require('util');
//const c = require('collections/fast-map');

function test() {
  const timestamps = [ [ '2018-05-21 15:38:04' ],
    [ '2018-05-21 15:38:09' ],
    [ '2018-05-21 15:38:09' ],
    [ '2018-05-21 15:38:08' ] ];
  const tsDataFrame = new dataForge.DataFrame({
    columnNames: ['Timestamp'],
    rows: timestamps
  }).setIndex('Timestamp');
  let groupedDf = tsDataFrame.groupBy(row => row.Timestamp).select(tsGroup => ({
    Timestamp: tsGroup.first().Timestamp,
    QPS: tsGroup.count()
  })).inflate();
  console.log(groupedDf.toString());
}

test();

Running the above code will give me the following result

__index__  Timestamp            QPS
---------  -------------------  ---
0          2018-05-21 15:38:04  1
1          2018-05-21 15:38:09  2
2          2018-05-21 15:38:08  1

which is what I expected

However if //const c = require('collections/fast-map'); is uncommented, running it again I will get

__index__  [object Object]  false
---------  ---------------  -----
0
1
2

Clearly this is a mistake. After hours of debugging I can at least spot one possible reason for the error. In the build version of data-forge within DataFrame.prototype.toString function we have the following (cut down for brevity's sake)

    DataFrame.prototype.toString = function () {
        var columnNames = this.getColumnNames();
        var header = ["__index__"].concat(columnNames);
        // more things down below
    };

Doing console.log(columnNames) with collectionsjs required gives me the following:

[ SelectIterable {
    iterable: [ [Object], [Object], [Object] ],
    selector: [Function] },
  false ]

Without collectionsjs I will get the expected result: [ 'Timestamp', 'QPS' ]

Inspecting further the getColumnNames function tells me that Array.from is used which is overridden by collectionsjs implementation: https://github.com/montagejs/collections/blob/master/shim-array.js#L26

I managed to fix things on data-forge side by doing a seemingly unnecessary function call:

let groupedDf = tsDataFrame.groupBy(row => row.Timestamp).select(tsGroup => ({
  Timestamp: tsGroup.first().Timestamp,
  QPS: tsGroup.count()
})).inflate().resetIndex();

This will give me the correct result regardless if collectionsjs was used or not

There's an issue raised already in collectionsjs regarding Array.from montagejs/collections#169 and there is also a PR montagejs/collections#173. I'm not sure about the progress of either

@ashleydavis
Copy link
Member

Thanks for raising the issue.

If I remove the use of Array.from would that solve the problem?

@btruhand
Copy link
Author

I never did try... though I personally think it's better off keeping data-forge code as is. I'd write it down as a bug on collectionsjs than it is of data-forge

You can either just close this ticket, or put some labels that indicate that this is only some form of warning (or perhaps document it in a wiki)

@ashleydavis
Copy link
Member

Ok thanks for logging the issue.

Please be sure to put a star on the repo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants