Rewrite of tokenizer and introduction of object-based test cases #208

VisLab · 2024-10-25T20:53:33Z

This PR rewrites the tokenizer, which is the syntax parser for HED strings. In addition to the errors caught by the previous parser, the tokenizer now catches errors from bad slashes and blanks. An object-based data framework for the test cases was introduced to make it easy to add test cases for the tokenizer.
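As a rough illustration of what object-based test cases can look like, the sketch below keeps the test data in plain objects and runs them through a single loop. The field names (`name`, `string`, `expectError`), the expected outcomes, and the `hasSlashError` checker are assumptions made for this example; they are not the schema in tests/testData or the actual tokenizer.

```javascript
// Hypothetical test-case objects: field names and expected outcomes are
// illustrative assumptions, not the format used in this repository.
const tokenizerTests = [
  { name: 'simple-tags', string: 'Red, Blue', expectError: false },
  { name: 'leading-slash', string: '/Red', expectError: true },
  { name: 'double-slash', string: 'Red//Blue', expectError: true },
  { name: 'slash-after-paren', string: '(/Red)', expectError: true },
]

// Stand-in checker, NOT the real tokenizer: it flags a tag that starts or
// ends with a slash or contains two slashes in a row.
function hasSlashError(hedString) {
  return hedString
    .split(/[(),]/)
    .map((piece) => piece.trim())
    .some((piece) => piece.startsWith('/') || piece.endsWith('/') || piece.includes('//'))
}

// Generic runner: adding a test means adding an object, not writing new code.
function runCases(cases, checker) {
  return cases.map((testCase) => ({
    name: testCase.name,
    passed: checker(testCase.string) === testCase.expectError,
  }))
}

const results = runCases(tokenizerTests, hasSlashError)
console.log(results)
```

Because the cases are plain data, the same objects could in principle be serialized and read by a test runner in another language.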

A start was also made on converting the bids.spec tests to the object framework, both to describe more clearly what is being tested and to make the test cases portable to the Python validator.


VisLab · 2024-10-26T15:55:02Z

@happy5214 I have updated and moved the converter tests to the tests directory. I am going to go ahead and merge to the develop branch. I also corrected a bug in the tokenizer.

In the next PR, I am going to convert more of the bids.spec tests to bidsTests.spec and make a first pass at converting events.spec to the separated test format. Eventually, we may be able to pull some of the common functions out into a test utilities module, but I don't think we should worry about that yet.

VisLab · 2024-10-27T21:45:54Z

@happy5214 Could you please review? I think this should be merged into develop and soon into master.

Summary:

The main changes are organizational. I started converting the existing tests to tests based on separate data files that can easily be ported for use as tests for the Python tools. This is a big job and will take some time. The data files are now in tests/testData.
There were two converters.js files. I renamed one of them to tagConverter.js and moved it to the parser directory.
I moved all of the .spec.js files that were scattered through the code to the tests subdirectory.
I corrected a bug in the tokenizer: it wasn't properly detecting a slash after an open parenthesis.
I corrected a bug in the column splicer: it was crashing when {HED} was used as a splice but there was no HED column in the TSV file.
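The tokenizer fix mentioned above concerns a slash that follows an open parenthesis. A minimal sketch of that kind of check is shown here; `findSlashAfterOpenParen` is a hypothetical function written for this example, and skipping blanks between the parenthesis and the slash is an assumption, not a description of how the rewritten tokenizer works.

```javascript
// Illustrative character scan only; the real tokenizer is more involved.
function findSlashAfterOpenParen(hedString) {
  const issues = []
  let lastSignificant = null // last non-blank character seen so far
  for (let i = 0; i < hedString.length; i++) {
    const ch = hedString[i]
    if (ch === ' ') continue // assumption: blanks do not reset the context
    if (ch === '/' && lastSignificant === '(') {
      issues.push({ index: i, message: 'Slash directly after open parenthesis' })
    }
    lastSignificant = ch
  }
  return issues
}

console.log(findSlashAfterOpenParen('(/Red, Blue)'))
console.log(findSlashAfterOpenParen('(Red/Blue)'))
```

Tracking the last significant character, rather than only the immediately preceding one, lets the same check cover the case where blanks separate the parenthesis from the slash.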

VisLab · 2024-10-28T11:17:13Z

Revised the PR to not treat a missing HED column in tsv as an error when {HED} is used.
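One way to picture this behavior is a splice that substitutes an empty string when the row has no HED column, instead of raising an error. The helper below, `spliceHedColumn`, is hypothetical and handles only flat, comma-separated templates without parentheses; it is a sketch of the idea, not the library's column splicer.

```javascript
// Hypothetical helper written for illustration; not the actual splicer.
function spliceHedColumn(template, row) {
  // A row without a HED column contributes nothing rather than causing an error.
  const hedValue = 'HED' in row ? row.HED : ''
  return template
    .split('{HED}')
    .join(hedValue)
    .split(',')
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0) // drop the empty slot left by a missing column
    .join(', ')
}

console.log(spliceHedColumn('Red, {HED}, Blue', { onset: 1 }))
console.log(spliceHedColumn('Red, {HED}, Blue', { onset: 1, HED: 'Green' }))
```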

happy5214 · 2024-10-29T21:41:08Z

We walked through these changes during a Zoom meeting.

VisLab added 27 commits October 11, 2024 13:34

Updated the regex

f4ccac6

Updated the splitter

769aa3b

Merge branch 'master' of https://github.com/hed-standard/hed-javascript…

9b2bb73

… into update-spec

Merge branch 'master' of https://github.com/hed-standard/hed-javascript…

550cdb2

… into update-spec

Updating to start tokenizer

5127943

Updated the test of tokenizer

01ecd6a

First pass at testing tokenizer

7b72f78

Updated the tokenizer

0758bfe

More experiments with the tokenizer

906bdbc

First pass at rewritten tokenizer

2d8a51f

Working on slash handling

bd73724

Updated the tokenizer to handle empty tags

a918907

Basic tests are running

f9b659d

Initial implementation of the new tokenizer

1d3307a

stringParser tests now pass with new tokenizer

c427407

Worked on the other tests

c397a12

Separated the tests on the original tokenizer temporarily

6ef9400

Corrected the error message on curly brace recursion

62f5a2b

Removed temporary test

cf03a72

Skipped tests on old tokenizer

cd53a2c

bidsTests just started

f6034df

Updated bids tests to include a valid and invalid case

3473342

Continued reorganizing the bids spec tests

1c2f686

Updated the tests

31e3229

Cleaned up so all tests run --- added capability to run individual tests

ca1676d

Merge branch 'develop' of https://github.com/hed-standard/hed-javascript

0585b5f

into update-tokenizer

Updated the README

ff0dafe

VisLab requested a review from happy5214 October 25, 2024 20:53

VisLab added the enhancement (New feature or request) and quality (Code quality, not must-fix) labels Oct 25, 2024

VisLab added the bids (BIDS integration), tests (Issues related to testcases), validation (Tag validation issues), and parsing (String parsing) labels Oct 25, 2024

VisLab added 2 commits October 25, 2024 18:48

Working on the converter spec tests

17e77a4

Updated tests to agree with tokenizer messages.

28c97ea

VisLab added 3 commits October 26, 2024 16:14

Working on the tests

4b9fb93

Added test for {HED} but no tsv HED column

194717b

Updated tests for curly brace recursion

aa3884e

Missing HED column when using {HED} is not an error

24f1ffc

happy5214 approved these changes Oct 29, 2024


happy5214 merged commit 9561c87 into hed-standard:develop Oct 29, 2024
3 of 4 checks passed
