# Test Runner Ideation

- Note
- Current Implementation
- Current Issues
- Potential Issues / Considerations
- Desired Features
- Possible Implementation
## Note

This document explores possible ideas for a wrapper around our tests to provide better feedback for the developer on failure.
## Current Implementation

We currently run our tests from the makefile(1)(2)(3) with an explicit call to our binary and a path to a specific test, e.g. `$(XVFB) ./dbuild/bin/foedag --script tests/TestGui/compiler_flow.tcl`.
## Current Issues

- Any failure stops the current CI target, and often no error message is provided.
- Test results are inconsistent on failure.
  - Many tests print nothing on failure.
  - Something in our current harness suppresses or redirects many of the error messages foedag would natively print, which makes debugging very hard.
- foedag.log is the best source of clues for a failure, but it is hard to access from CI and gets overwritten on the next test run.
  - Some tests have added explicit rules to print foedag.log on failure to deal with this.
  - We should have a standard solution that works for all tests.
- The test run quits on the first failure.
  - For CI protecting against bad check-ins this is okay, but if we ever made a feature change that we knew would break multiple tests, it would be nice to run all the tests and get a report back of all the failures.
- While GitHub is supposed to have some logic to kill long-running CI jobs, we had an issue where a test hung for half a day without erroring out.
  - With a top-level test runner, we can hopefully catch and time out long-running tests.
## Potential Issues / Considerations

- Different test types
  - We currently have batch tests, `--script` tests, `--replay` tests, unit tests, valgrind smoke tests, and possibly others.
  - Need to consider how to dynamically encapsulate these different types (see the sketch after this list).
  - Some things, like google test (unit tests), are already encapsulated, so we might just need to collect the results and show them.
  - We might target the test runner at our gui tests first and expand it to other test types as we find it necessary.
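As a rough illustration of how those types could be encapsulated, a minimal sketch in Python (the structure and names are assumptions, not a design decision):

```python
from enum import Enum

# Hypothetical tags for the test types listed above, so a runner could
# dispatch on them instead of hard-coding each invocation style.
class TestType(Enum):
    BATCH = "batch"        # foedag batch-mode runs
    SCRIPT = "script"      # gui runs driven by --script
    REPLAY = "replay"      # gui runs driven by --replay
    UNIT = "unit"          # google test binaries, which already self-report
    VALGRIND = "valgrind"  # valgrind smoke tests
```

For the UNIT type we would likely just run the binary and collect its own pass/fail summary rather than re-encapsulating it.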
- Different run goals
  - CI's primary goal is to prevent bad check-ins, so any failure is a good reason to stop. When debugging, though, we might want to run all tests before failing out.
  - We might want a way to choose whether we error out on the first failure (see the sketch below).
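A minimal sketch of how the wrapper could expose that choice; the flag name is an assumption:

```python
import argparse

# Hypothetical CLI: CI keeps the default fail-fast behavior, while a
# developer debugging a broad change passes --keep-going to see all failures.
parser = argparse.ArgumentParser(description="test runner wrapper (sketch)")
parser.add_argument("--keep-going", action="store_true",
                    help="run every test instead of stopping on the first failure")
args = parser.parse_args()
fail_fast = not args.keep_going
```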
## Desired Features

- Come up with a unified approach that will ALWAYS provide the developer with some valid error info on test failure.
- Provide access to foedag.log after the test run.
  - Capturing it for all tests would be ideal, but a first implementation could capture it only for tests that error out.
- Provide a report of all test results and possibly run metrics.
  - Not necessary, but this would be nice down the road.
## Possible Implementation

- Create a wrapper test runner.
  - Since we are executing system-level commands, we aren't bound to any language.
  - Something simple like python would be an easy approach (see the sketch after this list).
    - Currently Raptor has python for the ip catalog, but FOEDAG doesn't, so we'd need to decide on packaging python with foedag to allow this.
  - Tcl is an available option, but I'd avoid it unless we want to stick to languages we already use.
    - Tcl has low adoption outside of EDA and uncommon coding conventions, which makes code maintenance difficult given the limited number of developers with Tcl experience.
    - This wrapper needs to be able to monitor and handle subprocesses, which Tcl might not be able to do.
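To illustrate the subprocess handling Python gives us essentially for free, a minimal sketch (the helper name and default timeout are assumptions):

```python
import subprocess

# Run one test command, capturing all output and enforcing a timeout.
def run_one(cmd, timeout_sec=600):
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,  # keep stdout/stderr instead of losing them
            text=True,
            timeout=timeout_sec,  # kill hung tests instead of stalling CI
        )
        return result.returncode, result.stdout, result.stderr
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing leaks;
        # recovering partial output here is possible but platform-dependent.
        return None, "", ""
```

subprocess.run() covers capture and timeout out of the box; if we later need live output streaming or process-group cleanup, we would drop down to subprocess.Popen.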
- Create some way of listing our tests and the details they might need to run (a possible shape is sketched after this list).
  - Possible entries for a given test:
    - Binary path (probably always foedag or raptor, but we might want to provide a way to choose).
    - Arguments
      - A lot of gui runs will have common arguments like `--script` and `--replay`, but less common ones like `--compiler openfpga` need to be considered when trying to separate and abstract all this.
    - Path to script
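One possible shape for that list, written as a Python literal purely for illustration; every field name, the timeout, and the collect entry are assumptions:

```python
# Hypothetical test list; only the compiler_flow command comes from our
# current makefile, everything else is illustrative.
TESTS = [
    {
        "name": "compiler_flow",
        "binary": "./dbuild/bin/foedag",  # probably always foedag or raptor
        "args": ["--script", "tests/TestGui/compiler_flow.tcl"],
        "timeout_sec": 600,               # per-test timeout override
        "collect": ["foedag.log"],        # files to archive after the run
    },
    # ... one entry per test; --replay, --batch, or --compiler openfpga
    # variants would only change "args"
]
```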
- Iterate the list of tests (a run-loop sketch follows this list).
  - Start a timeout counter and execute the test in a way that lets us capture any output and control the process.
  - On completion:
    - Store any metrics we might care about, for example runtime.
    - Store foedag.log.
      - It would be good to do this for every test, but depending on how this info is accessed, it might make more sense to capture only the failed ones.
      - We might need to collect other .rpt files as well, so we might need a dynamic way to copy additional files.
    - Store the pass/fail result.
  - On timeout:
    - Cancel the process.
    - Store foedag.log.
    - Note that the test timed out.
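Putting the pieces together, a sketch of that loop, reusing the hypothetical run_one() helper and TESTS list from above (the archive layout is also an assumption):

```python
import shutil
import time
from pathlib import Path

def run_all(tests, archive_dir="test_archive", fail_fast=True):
    results = []
    for test in tests:
        dest = Path(archive_dir) / test["name"]
        dest.mkdir(parents=True, exist_ok=True)

        start = time.monotonic()
        rc, out, err = run_one([test["binary"], *test["args"]],
                               timeout_sec=test.get("timeout_sec", 600))
        runtime = time.monotonic() - start

        # Copy foedag.log (and any other requested files) into the archive
        # before the next test overwrites them.
        for name in test.get("collect", []):
            if Path(name).exists():
                shutil.copy(name, dest / name)

        status = "timeout" if rc is None else ("pass" if rc == 0 else "fail")
        results.append({"name": test["name"], "status": status,
                        "runtime_sec": round(runtime, 1),
                        "stdout": out, "stderr": err})
        if fail_fast and status != "pass":
            break
    return results
```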
- Archive the results.
- The CI itself will need some way to determine whether it should error out.
  - The test runner wrapper can probably do this.
  - We need some way to pull/parse the results from a failure.
    - Need to research CI to see what is possible.
  - Ideally, it would be nice to parse and collect all these results into an html report that the developer can view after a run (a minimal sketch follows).
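As a first step toward that report, a sketch that archives a machine-readable summary the CI (or a later html generator) could consume; the file format and path are assumptions:

```python
import json

def write_summary(results, path="test_archive/summary.json"):
    # Persist full results for later inspection or html generation.
    with open(path, "w") as fh:
        json.dump(results, fh, indent=2)
    # Print a one-line-per-test summary so the CI log shows everything.
    for r in results:
        print(f"{r['status']:7} {r['name']} ({r['runtime_sec']}s)")
    return [r["name"] for r in results if r["status"] != "pass"]
```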
- Alert CI to the result of the test run process (a sketch follows).
  - Probably as simple as returning 0 or 1 in the makefile.
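Tying it together, the wrapper's exit status is all the makefile needs to propagate; a sketch using the hypothetical run_all(), write_summary(), TESTS, and fail_fast pieces above:

```python
import sys

if __name__ == "__main__":
    results = run_all(TESTS, fail_fast=fail_fast)
    failed = write_summary(results)
    sys.exit(1 if failed else 0)  # nonzero tells make (and CI) to error out
```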