Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve main string class doco #114

Open
Tracked by #110
apjanke opened this issue Jan 30, 2024 · 2 comments
Open
Tracked by #110

Improve main string class doco #114

apjanke opened this issue Jan 30, 2024 · 2 comments
Assignees
Milestone

Comments

@apjanke
Copy link
Owner

apjanke commented Jan 30, 2024

The main string class doco needs some work. The main intro sentence, "A string array of Unicode strings.", is really clumsy, the whole thing focuses too much on Unicode, and it's not clear who the audience is. (Like, I'm basically talking to coders who already understand basic computer text representation and the standard Octave text types.)

Update it.

This is expanded from #110.

@apjanke apjanke self-assigned this Jan 30, 2024
@github-project-automation github-project-automation bot moved this to Needs triage in Octave-Tablicious Jan 30, 2024
@apjanke apjanke added this to the 0.4.0 milestone Jan 30, 2024
@apjanke apjanke mentioned this issue Jan 30, 2024
15 tasks
@apjanke apjanke changed the title Update main string doco Improve main class string doco Jan 30, 2024
@pr0m1th3as
Copy link

The implementation of string class you merged in main branch is missing the <missing> values as it is
.
As a result

>> sz = [4 3];
>> varTypes = {'double','datetime','string'};
>> T = table('Size',sz,'VariableTypes',varTypes)
T =
table: 4 rows x 3 variables
  VariableNames: Var1, Var2, Var3
>> prettyprint(T)
----------------------
| Var1 | Var2 | Var3 |
----------------------
| 0    | NaT  |      |
| 0    | NaT  |      |
| 0    | NaT  |      |
| 0    | NaT  |      |
----------------------

instead of

>> prettyprint(T)
---------------------------
| Var1 | Var2 |   Var3    |
---------------------------
| 0    | NaT  | <missing> |
| 0    | NaT  | <missing> |
| 0    | NaT  | <missing> |
| 0    | NaT  | <missing> |
---------------------------

I really think you should reconsider adopting my approach on this issue.

@apjanke
Copy link
Owner Author

apjanke commented Feb 3, 2024

I don't think my string is missing missing values. It's just that string({''}) is not how you create them. You need to use string(missing) or NaS instead. (string(missing) is the Matlab way of doing it; NaS is my own extension.)

There was a bug in string.missing that caused it and NaS to return "" instead of missings; find a fix here: 9260210.

I agree there's a case for making a "preallocated" table like you're referring to, including one populated with missings. I just don't think a change to the string constructor semantics is required for that.

I took a stab at my own implementation of the table "preallocation" constructor over here on branch WIP/preallocation-ctor-form. Have a look and see what you think.

One thing is that Matlab's table() constructor, for the Size/VariableTypes preallocation form, defines the fill values for variables differently from array expansion, and they're not the missings you're asking for here. Here's the doco in the table() help page. In particular, the fill value used for string is "", not string(missing). And the default for double is 0, not NaN.

image

For Matlab compatibility, I made my table('Size', 'VarTypes') constructor match Matlab's fill behavior.

I think there's a case for making preallocated tables that use NaN and string(missing) as fill values instead of 0 and "". I think that way actually makes more sense, and is less error-prone. To support that, Tablicious could provide a static method or extension to the constructor arguments to support that. My implementation on that branch has the code to do so, but it's not part of the public interface yet; it's in tblish.table.internal.blankTable. (Packages named +internal are by convention (at least in Matlab) internal-use stuff that's not part of the public API of a piece of code, even though they have public access at the language level.)

@apjanke apjanke modified the milestones: 0.4.0, 0.5.0 Feb 3, 2024
@apjanke apjanke changed the title Improve main class string doco Improve main string class doco Feb 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Needs triage
Development

No branches or pull requests

2 participants