Preload is useless with tainting #262
Comments
Hmm. It may be possible to add some logic to the preload tools that accounts for this. Maybe something like this: This can add a stage that needs to have taint checking turned on. When the runner is initializing stages, it can launch a new perl interpreter to start the stage instead of doing a plain fork. The one it starts can have taint enabled at the start. This would only really work for top-level stages, as the new process will not be able to inherit anything preloaded in earlier stages. But child stages of the taint one(s) would inherit taint checking. I am not sure how soon I could get to this, but using the specification above, if you wanted to look into writing a PR, I should be able to review it and/or give pointers, and release it once it is working. |
So after bashing my head against this for a few hours, I'm not really sure how to achieve what you suggest and could do with a few pointers. So I found where you fork in I also played with a far easier but much hackier approach where the yath script would re-exec itself at the start if you passed a taint flag or it read it from the config file; this way the whole thing runs under tainting. I had to sprinkle a few untainting regexes in places like |
If you want to add one or more taint options to yath itself so that the entire thing runs with taint, I would review and probably merge that. One thing to note is that yath itself does not need to start with taint, only the |
So I expanded the test cases to try and understand the runner logic better, is my understanding correct? Each test is always executed from a new process, those processes can be forks of perl which in turn can have libraries preloaded into them. The shebangs in test files influence which tests share a preloaded fork. I expanded the test file to look like this:
#!perl
use Test2::V0 -target => 'MyExpensive';
diag "pid: $$";
diag "ppid: ${\getppid}";
diag "preload: @{[ $ENV{T2_HARNESS_PRELOAD} // 0 ]}";
diag "taint: ${^TAINT}";
diag "warning: $^W";
pass;
done_testing;
Turned it into three files which vary in shebang:
Added a bit more debug to the "expensive" library:
package MyExpensive;
warn "$$ is loading me...\n";
sleep 3;
1;
Added a shell test for good measure:
#!/bin/sh
echo '#' preload: $T2_HARNESS_PRELOAD >&2
echo ok 1
echo 1..1
And this is the output I get:
So a few observations (feel free to correct any of them).
|
Oh it's worth noting that I couldn't get any of the above to work without applying my hack from #211 (comment). I kinda get the impression that tainting isn't a very popular use case? |
Shebang processing is mostly done here: https://github.com/Test-More/Test2-Harness/blob/master/lib/Test2/Harness/TestFile.pm However, the decision to skip the preload for taint happens here: https://github.com/Test-More/Test2-Harness/blob/master/lib/Test2/Harness/Runner/Job.pm#L419 Basically, it refuses to do preload+fork for any flags other than -w, as it is the only one that can be enabled at runtime. I am not sure about the popularity of taint mode. I have never worked anywhere that uses it. |
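To illustrate why -w is the odd one out, here is a minimal sketch (not code from Test2-Harness; `probe_runtime_flags` is a hypothetical name). It relies only on documented core behaviour: `$^W` is writable at runtime, while `${^TAINT}` is documented in perlvar as read-only.

```perl
use strict;
use warnings;

# Hypothetical helper: report which interpreter flags can be flipped after startup.
sub probe_runtime_flags {
    my %result;

    # $^W mirrors the -w switch and is writable, so a forked child can turn it on.
    {
        local $^W = 0;
        $^W = 1;
        $result{warnings_settable} = $^W ? 1 : 0;
    }

    # ${^TAINT} mirrors -T/-t but is read-only, so the assignment below dies
    # and the eval returns false.
    my $before = ${^TAINT};
    my $ok     = eval { ${^TAINT} = 1; 1 };
    $result{taint_settable} = ($ok && ${^TAINT} != $before) ? 1 : 0;

    return \%result;
}

my $flags = probe_runtime_flags();
print "warnings settable at runtime: $flags->{warnings_settable}\n";
print "taint settable at runtime: $flags->{taint_settable}\n";
```

This is why a preloaded, non-tainted process cannot simply fork a child and switch taint on afterwards.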
Thanks for the info. As for potential fixes, I think that regardless of whether it happens in the Runner or the Preloader, somewhere a fork needs to be replaced with a fork + exec perl -T, which means working out some way to convert the state of the currently running fork into some kind of IPC process to pass to the newly formed interpreter. That's the bit I'm really struggling with, and it's what motivated my approach to just say screw it and run the whole process tree under a tainted perl instead. If anyone has any bright ideas on how to do the IPC I'm all ears, because at present I'm pretty stalled on this issue. |
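For reference, the bare "fork + exec perl -T" step on its own might look like the sketch below. It is an illustration only: `taint_command` and `spawn_tainted` are hypothetical helpers, not Test2-Harness functions, and it deliberately skips the hard part discussed above (handing runner state to the new interpreter). Note that anything preloaded in the parent is lost at the exec.

```perl
use strict;
use warnings;

# Hypothetical helper: command line for a fresh, taint-enabled interpreter.
sub taint_command {
    my (@args) = @_;
    return ($^X, '-T', @args);
}

# Hypothetical helper: fork, exec the tainted interpreter in the child,
# and return the child's exit code to the parent.
sub spawn_tainted {
    my (@args) = @_;
    my @cmd = taint_command(@args);

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: replace this process image; preloaded modules do not survive.
        exec { $cmd[0] } @cmd;
        die "exec failed: $!";
    }

    waitpid($pid, 0);
    return $? >> 8;
}

# The child reports its own taint state via ${^TAINT}.
my $exit = spawn_tainted('-e', 'print "taint in child: ${^TAINT}\n"');
print "child exit code: $exit\n";
```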
I think the best option is to add a taint mode option. Then clean up any places where data in the runner is currently considered tainted, like your hack up above. |
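The cleanup step usually comes down to the standard untainting idiom: validate the value with a regex and use the captured group. A minimal sketch, assuming a conservative character whitelist is acceptable for the values in question (`untaint_path` is a hypothetical helper, not part of the harness):

```perl
use strict;
use warnings;

# Hypothetical helper: untaint a path by whitelisting its characters.
# Returns the untainted copy, or nothing if the value does not validate.
# Note that this whitelist still permits '..' path components.
sub untaint_path {
    my ($path) = @_;
    return unless defined $path;

    # A successful capture yields an untainted copy of the matched text.
    if ($path =~ m{\A([\w./-]+)\z}) {
        return $1;
    }
    return;
}

for my $candidate ('t/foo.t', 'lib/MyExpensive.pm', 'foo; rm -rf /') {
    my $clean = untaint_path($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

Keeping the validation in one helper would avoid scattering ad hoc regexes through the runner.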
Thanks, I'll give this approach a go. |
Quick update, this approach does seem to work. I have a very rough around the edges PoC commit that seems to DTRT. ATM it hardcodes tainting on and disables the shebang flag checking but they're all fixable. The initial work is in the draft PR #264. Not wild about the sprinkling of |
Sorry for the radio silence; other stuff came up and I haven't been able to touch this much. This is what's currently known to be left to do (there's probably more!):
# This approach won't scale if we allow even more switches.
my @allowed_switches = '-w';
# Allow taint and taint + warnings if we're a tainted runner.
push @allowed_switches => qw/-T -wT -Tw/ if ${^TAINT};
my $allowed_switches = join '|', map { quotemeta } @allowed_switches;
my $allowed_switches_re = qr/\s*(?:$allowed_switches)\s*/;
return $self->{+USE_FORK} = 0 if grep { $_ !~ $allowed_switches_re } $self->switches;
# We're running under taint but the test hasn't requested taint.
return $self->{+USE_FORK} = 0 if ${^TAINT} && !grep { /\s*-w?Tw?\s*/ } $self->switches;
The complication mainly stems from parsing the shebang with regexes and accounting for argument bundling. I'm hesitant to start actually parsing the shebang properly, as we'd have to ensure we parse it the same way perl does. Suggestions on how to improve this logic are VERY welcome! So mainly I'm blocked on points 2 and 3. In terms of good news though, the patch works! I've been trying it out at $WORK and it really shaves heaps of time off our test suite run, so that's good 🎉 (99% of our tests enable tainting since our product also does). |
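One possible way to handle argument bundling without listing every permutation is to split each switch into single-letter flags and check those against a set. The sketch below is only an illustration of that idea: `can_use_fork` is a hypothetical standalone function, it takes the taint state as a parameter instead of reading `${^TAINT}`, and it treats any switch that is not a plain bundle of letters (for example one that takes an argument) as a reason to refuse the fork. It does not try to match perl's own shebang parsing.

```perl
use strict;
use warnings;

# Hypothetical check. $switches is an array ref of shebang switches;
# $tainted is true when the runner itself has taint mode on.
sub can_use_fork {
    my ($switches, $tainted) = @_;

    # -w can be turned on after the fork; -T only matches a tainted runner.
    my %allowed = (w => 1);
    $allowed{T} = 1 if $tainted;

    my %seen;
    for my $switch (@$switches) {
        # Only plain bundles such as -w, -T or -wT are understood here.
        my ($bundle) = $switch =~ /\A\s*-([A-Za-z]+)\s*\z/ or return 0;
        for my $flag (split //, $bundle) {
            return 0 unless $allowed{$flag};
            $seen{$flag} = 1;
        }
    }

    # A tainted runner should not fork for a test that did not ask for taint.
    return 0 if $tainted && !$seen{T};
    return 1;
}

for my $case ([['-w'], 0], [['-T'], 0], [['-wT'], 1], [['-w'], 1]) {
    my ($switches, $tainted) = @$case;
    printf "switches=%s tainted=%d fork=%d\n",
        join(' ', @$switches), $tainted, can_use_fork($switches, $tainted);
}
```

Anchoring the match with `\A` and `\z` also means a switch like `-wl` is rejected, where an unanchored pattern would accept it because it contains `-w`.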
Given the following stripped down example:
lib/MyExpensive.pm:
lib/MyPreload.pm:
t/foo.t:
Running the tests with preload shows everything passes and the extra second isn't attributed to the test's "startup" cost:
However if we add a tainting shebang to the test file:
Then the test fails and takes longer:
Now imagine that you have hundreds of tests, all with tainting enabled and lots of expensive libraries that take seconds to load, suddenly you're in my situation where $WORK's codebase takes tens of minutes to test :-(
I thought this was a regression since I'm sure this used to work but I've yet to find a version where it works correctly.
I appreciate this is hard to make work, as you can only adjust the taint mode at perl startup time and the worker you're forking doesn't have it enabled. Could we either have multiple workers based on the set of interpreter flags we see in test files, or an ugly hack to run all of yath under taint, worker and all?