Table API discussion #1

Open
Zelenyy opened this issue Feb 18, 2020 · 2 comments

Zelenyy commented Feb 18, 2020

  1. Accessing a column by its string name has some disadvantages:
  • Access is impossible if the data has missing or repeated header values.
  • A table can have column names in an unusual encoding; for example, Kaggle data sets contain strings in national encodings.

It would be better to have an independent column ID.

  2. A requested API for automatically filling a user class, for example:
table.asSequence<MyData?>()
  3. An extension library with helper functions: max, min, average, etc.
  4. We need to investigate whether row indexing is necessary.
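
To make the `table.asSequence<MyData?>()` request concrete, here is a minimal sketch of what row-to-class mapping could look like. Everything in it is an assumption for illustration: `SimpleTable`, `MyData`, and `toMyData` are hypothetical stand-ins, not the library's API, and the mapper is written by hand instead of being derived from the class automatically.

```kotlin
// Hypothetical stand-in for a table: column name -> column values.
class SimpleTable(private val columns: Map<String, List<Any?>>) {
    val rowCount: Int get() = columns.values.firstOrNull()?.size ?: 0

    // A row is exposed as a map from column name to the cell value.
    fun row(index: Int): Map<String, Any?> =
        columns.mapValues { (_, values) -> values[index] }
}

// Example user class to be filled from rows (assumed shape).
data class MyData(val x: Double, val label: String)

// Lazily maps every row to T?; the mapper returns null when a row does not fit T.
fun <T : Any> SimpleTable.asSequence(mapper: (Map<String, Any?>) -> T?): Sequence<T?> =
    (0 until rowCount).asSequence().map { mapper(row(it)) }

// Hand-written mapper for MyData; a real API might generate this from the class.
fun toMyData(row: Map<String, Any?>): MyData? {
    val x = row["x"] as? Double ?: return null
    val label = row["label"] as? String ?: return null
    return MyData(x, label)
}

// Builds a small table with one missing value and maps it to MyData? rows.
fun sampleMappedRows(): List<MyData?> {
    val table = SimpleTable(
        mapOf(
            "x" to listOf(1.0, null, 3.0),
            "label" to listOf("a", "b", "c")
        )
    )
    return table.asSequence(::toMyData).toList()
}
```

The nullable element type mirrors the `MyData?` in the request: a row with a missing value yields `null` instead of failing the whole sequence.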

Zelenyy commented Feb 18, 2020

  5. Some sugar:
val Rows.names: List<String> get() = header.map { it.name }
val Rows.types: List<KClass<*>> get() = header.map { it.type }
  6. The Table interface does not forbid columns of different sizes. Is that intended?
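
For reference, a self-contained sketch of the two extension properties. The `ColumnHeader` and `Rows` declarations here are simplified, hypothetical stand-ins (the real interfaces may differ), and `header` is assumed to be a `List` so that the resulting order is well defined.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-ins for the library's header and Rows types.
class ColumnHeader<out T : Any>(val name: String, val type: KClass<out T>)

interface Rows {
    // Assumed to be a List (not a Collection) so the column order is stable.
    val header: List<ColumnHeader<*>>
}

// Column names in header order.
val Rows.names: List<String> get() = header.map { it.name }

// Column value types in header order.
val Rows.types: List<KClass<*>> get() = header.map { it.type }

// A tiny Rows instance with two columns, used only to exercise the properties.
fun sampleHeaderRows(): Rows = object : Rows {
    override val header: List<ColumnHeader<*>> = listOf(
        ColumnHeader("x", Double::class),
        ColumnHeader("label", String::class)
    )
}
```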

altavir commented Feb 19, 2020

  1. We can't completely avoid strings, since they are the primary identifiers for columns. It is possible to introduce generic IDs, but that would significantly complicate the API without much added value. Even now it is not necessary to use a string each time. Once a column header is created, it can be used to access the column, like this:
val header = ...
val value = table[8, header]

I will add some kind of method to access columns the same way, like `table[header]` or `table.columns[header]`.
I do not want to allow headless columns at all, since I do not know how to treat them.

  2. We can use a Row schema to wrap actual rows, similar to what we do in plotly.kt. I will create a separate issue for that.

  3. Maybe some extensions for number columns.

  4. A primary numeric index is necessary in tables (but not in Rows). The old table version also had optional additional indexes for fast queries. I think we will do that later.

  5. OK, you can do it yourself. The only question is whether we want to guarantee the order of columns. Currently it is not guaranteed at the API level. To fix that, we need to replace Collection with List in the API.

  6. It is checked in a builder. It should be impossible to construct a table with columns of different sizes.
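
As a rough illustration of the first and last points, the sketch below shows header-based cell access (`table[row, header]`) together with a builder that refuses columns of different sizes. All names (`ColumnHeader`, `ColumnTable`, `ColumnTableBuilder`) are hypothetical and simplified; this is not the library's implementation, only one way such a check could be arranged.

```kotlin
// Hypothetical stand-ins; the names below are not the library's real API.
class ColumnHeader<T : Any>(val name: String)

class ColumnTable internal constructor(
    private val columns: Map<String, List<Any?>>,
    val rowCount: Int
) {
    // table[row, header]: the header carries the value type, so callers need no cast.
    @Suppress("UNCHECKED_CAST")
    operator fun <T : Any> get(row: Int, header: ColumnHeader<T>): T? =
        columns.getValue(header.name)[row] as T?
}

class ColumnTableBuilder {
    // LinkedHashMap keeps the insertion order of columns.
    private val columns = LinkedHashMap<String, List<Any?>>()

    // Registers a column; repeated header names are rejected up front.
    fun <T : Any> column(header: ColumnHeader<T>, values: List<T?>): ColumnTableBuilder = apply {
        require(header.name !in columns) { "Duplicate column '${header.name}'" }
        columns[header.name] = values
    }

    // Rejects columns of different sizes before a table is ever created.
    fun build(): ColumnTable {
        val sizes = columns.values.map { it.size }.distinct()
        require(sizes.size <= 1) { "Columns have different sizes: $sizes" }
        return ColumnTable(columns.toMap(), sizes.firstOrNull() ?: 0)
    }
}

// Example: two equally sized columns build fine and can be read by header.
fun sampleColumnValue(): Double? {
    val x = ColumnHeader<Double>("x")
    val table = ColumnTableBuilder().column(x, listOf(1.0, 2.0)).build()
    return table[1, x]
}
```

Because the constructor is not public in this sketch, the builder is the only way to obtain a table, which is what makes the size check a guarantee and not just a convention.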
