Files with semicolons and lines with different numbers of columns do not work #124

Closed
jbarotin opened this issue Mar 8, 2022 · 9 comments

@jbarotin

jbarotin commented Mar 8, 2022

Hi,

Your tool looks awesome. Unfortunately, I work with embedded devices that generate non-standard CSV files. You can find a sample of the content below:

MYVALUE;126;001
MYOTHERVALUE;THISALSOAVALUE
37;1;2;3;4;6;7;8;9;10;11;13;15;17;19;21;23;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;45;46
17/03/20-14:00:04;1901346120;146;146;147;65535;65535;65535;2341;2335;2338;1027;4998;1027;0;32768;1224125;129;561;725;64;505;323;65535;65535;65535;65535;65535;65535;65535;65535;65535;65535;65535;65535;300;4;0
YOURVALUE;127;002
MYOTHERVALUE;THISANICEVALUE
37;1;2;3;4;6;7;8;9;10;11;13;15;17;19;21;23;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;45;46
17/03/20-14:00:04;1901346338;147;147;147;65535;65535;65535;2338;2332;2334;1028;4998;1028;0;32768;1222219;131;556;726;64;507;327;65535;65535;65535;65535;65535;65535;65535;65535;65535;65535;65535;65535;300;4;0

When I try to open one of them in tidy-viewer, I run this command:

$ tidy-viewer CHALE1_MODBUS_000102_070000.csv  -s ";"

And I get this crash:

thread 'main' panicked at 'a csv record: Error(UnequalLengths { pos: Some(Position { byte: 16, line: 2, record: 1 }), expected_len: 3, len: 2 })', src/main.rs:353:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
@alexhallam
Owner

Thanks for this.

Since this package is based on the GNU R package pillar, I checked what that package would output. This is what I saw.

  `MYVALUE;126;001`                                                             
  <chr>                                                                         
1 MYOTHERVALUE;THISALSOAVALUE                                                   
2 37;1;2;3;4;6;7;8;9;10;11;13;15;17;19;21;23;25;26;27;28;29;30;31;32;33;34;35;3…
3 17/03/20-14:00:04;1901346120;146;146;147;65535;65535;65535;2341;2335;2338;102…
4 YOURVALUE;127;002                                                             
5 MYOTHERVALUE;THISANICEVALUE                                                   
6 37;1;2;3;4;6;7;8;9;10;11;13;15;17;19;21;23;25;26;27;28;29;30;31;32;33;34;35;3…
7 17/03/20-14:00:04;1901346338;147;147;147;65535;65535;65535;2338;2332;2334;102…

This is a pretty ragged (#79) CSV. I would say that there is nothing I could do that would make this look better than something like cat.

Here is one suggestion. I am assuming that you have some ability to modify the code on your embedded device. If that is true, then modify the code to generate "tidy data". The general idea is shown in the image below. If you want more details, here is the paper

image

Let me know if this is useful. I would like to see tv used with embedded devices. If there is some edge case where it is impossible to modify code on embedded devices to make 'tidy data' I would like to know.

@jbarotin
Author

jbarotin commented Mar 8, 2022

Thanks very much for your answer. Unfortunately, I can't change the format of these files because the IoT devices are not ours.

I was thinking of a solution that creates a table where the number of columns corresponds to the highest number of items on any line of the document, but I understand if that's not possible.

image
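
For readers who want to try the "pad to the widest line" idea outside of tv, here is a minimal sketch using awk. The function name `pad_ragged` is hypothetical and not part of tv. It assumes the separator is `;`, that no field contains a quoted `;`, and that the file is small enough to hold in memory. It is not how tv or rust-csv handle ragged files.

```shell
# Hypothetical helper (not part of tv): pad each row of a ragged,
# semicolon-separated file with empty fields up to the widest row.
# Reads the named file(s), or stdin when no file is given.
pad_ragged() {
  awk -F';' '
    NF == 0 { next }                       # skip blank lines
    {
      n++
      row[n] = $0                          # remember the row as-is
      nf[n] = NF                           # and its field count
      if (NF > max) max = NF               # track the widest row
    }
    END {
      for (i = 1; i <= n; i++) {
        line = row[i]
        for (j = nf[i]; j < max; j++) line = line ";"   # append empty fields
        print line
      }
    }
  ' "$@"
}
```

Something like `pad_ragged your.csv | tv -s ";"` should then give every record the same length, which avoids the UnequalLengths error. The first line is still treated as the header, so the result may not look the way you want.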

@alexhallam
Owner

Like this?

image

@alexhallam
Owner

I think I see what you are saying. This goes back to (#79). My ability to make this happen depends on what I can get away with using rust-csv. This is a problem that needs a solution.

@jbarotin
Author

jbarotin commented Mar 8, 2022

That's better, but the ideal view for us would split the line "17/03/20-14:00:04;1901346120;146;146;147;6... " into columns. Maybe I can try specifying the number of columns.

Edit: I tested the -n option and it is still not working.

@alexhallam
Owner

Sorry, -n sets the number of rows to output. The number of columns is determined automatically from the width of the terminal. This will take some thought, so a proper fix will not be quick.

Here is another question about the data. Do you only need the rows that start with a date? If so, maybe we can grep those date lines only and then pipe them to tv.

@jbarotin
Author

jbarotin commented Mar 8, 2022

maybe we can grep those date lines only then pipe to tv

That works as a workaround, thanks.

@alexhallam
Owner

grep "^17/03" your.csv | tv -s ";"

image

It is a little funky without a formal header saying what the values are, but this is something. It is just a workaround. That is the fastest solution I can whip up at the moment.
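
The grep above only matches rows from 17/03. If the file spans several days, a slightly more general filter could match the timestamp shape instead. This is only a sketch: the helper name `date_rows` is hypothetical, and the pattern assumes every data row starts with a timestamp shaped like the one in the sample (NN/NN/NN-NN:NN:NN followed by `;`).

```shell
# Hypothetical helper: keep only rows that begin with a timestamp
# shaped like 17/03/20-14:00:04 followed by a semicolon.
# Reads the named file, or stdin when no file is given.
date_rows() {
  grep -E '^[0-9]{2}/[0-9]{2}/[0-9]{2}-[0-9]{2}:[0-9]{2}:[0-9]{2};' "$@"
}
```

Usage would be something like `date_rows your.csv | tv -s ";"`.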

@alexhallam
Owner

And even a little nicer with headers:

# make a new file with headers
cat <<EOF >test.csv
date;1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37
EOF

# append the grepped data to the header file, then view it
# (&& instead of |, so tv only runs after grep has finished writing test.csv)
grep "^17/03" your.csv >> test.csv && tv test.csv -s ";"
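
If typing the header by hand gets tedious, it could be generated from the width of the first data row instead. A rough sketch follows. The helper name `with_header` is hypothetical, and it assumes the first field is the date and that the remaining columns are simply numbered, as in the hand-written header above.

```shell
# Hypothetical helper: print a "date;1;2;...;N-1" header sized to the
# first row, then pass every row through unchanged.
# Reads the named file, or stdin when no file is given.
with_header() {
  awk -F';' '
    NR == 1 {
      header = "date"
      for (i = 1; i < NF; i++) header = header ";" i   # one number per remaining column
      print header
    }
    { print }
  ' "$@"
}
```

With this, the temporary file is not needed: `grep "^17/03" your.csv | with_header | tv -s ";"`.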
