Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support running btests on Windows #80

Merged
merged 27 commits into from
Jan 13, 2023
Merged

Conversation

timwoj
Copy link
Member

@timwoj timwoj commented Dec 12, 2022

This PR covers the majority of the changes that need to happen to the btest script in order to run tests on Windows. There's a few known issues I'll go into at the end. The big differences here are:

  • Switching to the multiprocess module instead of the built-in multiprocessing module, since the default one has some problems with pickling our internal types.
  • Changing executing processes on Windows to wrap them in a simple bash script, which lets them run correctly under msys2 (the bash environment installed with Git on Windows).
  • Passing data from the test manager down into the child processes via a proxied dictionary. mulitprocess on Windows doesn't seem to like our use of globals here and complains that it can't find objects. The proxied dictionary works around this, and is probably the more correct way of doing this anyways.

Known issues:

  • Running tests against debug builds hits an assertion down in the Windows OS libraries. I'm pretty sure this is a libkqueue bug, and opened Closing a file descriptor on Windows might cause an assertion mheily/libkqueue#151 to have them investigate. The easy work-around is to not close the event_queue file descriptor in iosource_mgr, and that can be used for testing.
  • Shutting down a run with the -j flag with ctrl-c doesn't work right. There's something different about signal masking on Windows. I spent a day trying to figure this one out and haven't solved it yet.
  • There's a problem with pickling our internal types using the dill library used by multiprocess that changes that internal ID stored for the types. run_cmdseq previously used isinstance to check whether a command being run was a CmdSeq but because the type IDs are different it was returning false (on all platforms). There's currently a work-around where we just check the type name, but it's likely possible to fix this via implementing custom pickling methods for those types.

These changes will require some modifications to the Zeek tree to work right as well, but this PR is independent of those.

@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch 3 times, most recently from abd5f37 to 142cfba Compare December 12, 2022 19:56
Copy link
Member

@ckreibich ckreibich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This must have been pretty darn painful. Just a few quick comments here as you're finalizing this.

setup.py Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch from 754bc09 to dd84f74 Compare December 13, 2022 21:46
Copy link
Contributor

@awelzel awelzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a bit of a pass.

btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
btest Show resolved Hide resolved
btest Outdated Show resolved Hide resolved
@awelzel
Copy link
Contributor

awelzel commented Dec 15, 2022

Switching to the multiprocess module instead of the built-in multiprocessing module, since the default one has some problems with pickling our internal types.

From the naive side: Is there something we fix/work-around in the types involved? Adding non-stdlib dependency when there wasn't one before seems unfortunate to me.

@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch 2 times, most recently from 10005c0 to 2338efc Compare December 16, 2022 23:07
@timwoj
Copy link
Member Author

timwoj commented Dec 16, 2022

I rebased and squashed down all of the fixup commits. I'm down to just two btests failing:

  • tests.diag-file: This one looks like the .stdout and .stderr files get reset between commands run (meaning each time a bash script is fired up), which shouldn't be happening. This causes the output from one the internal btest commands to get lost, which the tests requires for success.
  • tests.environment: This one fails because the test calls pwd in the middle of it and compares it to the testbase value read from btest.cfg. On Windows, pwd will return a POSIX-style path (so /c/...) whereas btest currently uses pseudo-Windows-style paths internally (C:/...). It's making me want to switch everything over to use POSIX-style paths since it would help with some consistency between POSIX and Windows in other places, but the last time I tried to do that everything went south quickly.

I'll look over the rest of the comments above after I get back from the holiday break.

@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch from 2338efc to 6596525 Compare December 16, 2022 23:51
@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch from 6596525 to 17df3c8 Compare January 4, 2023 23:04
@timwoj
Copy link
Member Author

timwoj commented Jan 4, 2023

From the naive side: Is there something we fix/work-around in the types involved? Adding non-stdlib dependency when there wasn't one before seems unfortunate to me.

Prior to switching over to the other library, I did attempt to fix the types it was failing on. It turns into a game of whack-a-mole pretty fast, with one type after another failing for varying reasons. I got fairly far into it and then ran into a type that I couldn't figure out how to fix, and figured there had to be a better way to go about it than manually fixing every type and hoping I didn't miss something.

@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch 2 times, most recently from 39e7e54 to 0e4e68d Compare January 4, 2023 23:44
btest Outdated Show resolved Hide resolved
@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch 7 times, most recently from 096dfca to 3a224d7 Compare January 6, 2023 22:32
setup.py Outdated Show resolved Hide resolved
@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch 5 times, most recently from c56c7bc to e56ecd0 Compare January 10, 2023 00:32
@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch 2 times, most recently from 44d2e48 to 66021f1 Compare January 13, 2023 22:19
timwoj and others added 25 commits January 13, 2023 15:33
If this option isn't here, the Windows runners will reset all of the
line endings when it clones to \r\n. This breaks a few of the tests
because the comparison will have the wrong line endings.
This is mostly suggestions from pycharm, but also includes some comments
from when I was tracing through code.
The reason for this switch is primarily because the stock
multiprocessing library has very poor support for pickling of
non-primitive types on Windows.
As per the comment, some serializer/unserializers don't produce the identical
type when unserializing, failing isinstance().
Using the 'spawn' method for multiprocessing causes the global
state to get lost when moving from the parent into the child
processes. Rebuilding it by looping over a subset and reinserting
them into globals() ensures that they exist.
This changes how runSubprocess works on Windows to insert all of the
calls within a temporary bash script. This ensures that the entire
environment is available when running the processes, which doesn't
work when simply calling subprocess.check_call().
Windows has some issue where `hash()` returns different values for the
same string in the different child processes. crc32() returns the same
values in each.
This fixes a problem on Windows where multiple TEST-EXEC statements in a
test could cause those files to be overwritten by subsequent TEST-EXECs,
causing failures.
The original tests.environment btest doesn't work on Windows due to some
path differences in the output. This adds a new test that does the same
things except does some additional conversions in the test script itself
to remove those differences.
@timwoj timwoj force-pushed the topic/timw/windows-support-2 branch from 66021f1 to ff90899 Compare January 13, 2023 22:38
@timwoj timwoj merged commit f313d13 into master Jan 13, 2023
@timwoj timwoj deleted the topic/timw/windows-support-2 branch January 13, 2023 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants