
Test Runner Ideation


Note

This document explores possible ideas for a wrapper around our tests to provide better feedback for the developer on failure.

Current Implementation

We currently run our tests from the Makefile with an explicit call to our binary and a path to a specific test. Ex: $(XVFB) ./dbuild/bin/foedag --script tests/TestGui/compiler_flow.tcl

  • Any failure stops the current CI target and often no error message is provided

Current Issues

  • Test results are inconsistent on failure

    • Many tests print nothing on failure

    • Something in our current harness suppresses or redirects a lot of the error messages foedag would natively print, making debugging very hard.

  • Foedag.log is the best source of clues for failure, but it is hard to access from CI and gets overwritten on the next test run

    • Some tests have added explicit rules to print foedag.log on failure to deal with this

    • We should have a standard solution that works for all tests

  • Test run quits on first failure

    • For CI protecting against bad check-ins, this is ok, but if we ever made a feature change that we knew would break multiple tests, it would be nice to run all the tests and get a report back of all the failures.

  • While GitHub is supposed to have some logic to kill long-running CI jobs, we had an issue where a test hung for half a day without erroring out.

    • With a top-level test runner, we can hopefully catch and time out long-running tests

Potential Issues / Considerations

  • Different test types

    • Currently we have batch tests, --script tests, --replay tests, unit tests, Valgrind smoke tests, and possibly others

    • Need to consider how to dynamically encapsulate these different types (see the sketch after this list)

    • Some things like Google Test (unit tests) are already encapsulated, so we might just need to collect the results and show them

    • We might target the test runner at our GUI tests first and expand to other test types as we find it necessary.
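
As a strawman for encapsulating the test types, each test could be wrapped in a small object that knows how to build its own command line. A minimal sketch, assuming hypothetical names (nothing here is existing FOEDAG code; --script and --replay are the only flags taken from this doc):

```python
from dataclasses import dataclass, field

@dataclass
class TestCase:
    """One entry in the test list; 'kind' distinguishes the test types above."""
    name: str
    kind: str                            # e.g. "script", "replay", "batch", "unit"
    binary: str = "./dbuild/bin/foedag"
    args: list = field(default_factory=list)

    def command(self):
        """Build the full command line for this test's type."""
        if self.kind in ("script", "replay"):
            return [self.binary, f"--{self.kind}"] + self.args
        return [self.binary] + self.args
```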

  • Different run goals

    • CI’s primary goal is to prevent bad check-ins, so any failure is a good reason to stop. When debugging, we might want to run all tests before failing out.

    • We might want a way to toggle whether we error out on the first failure
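
If we go with a wrapper script, that toggle could be a simple command-line flag on the runner. A hypothetical sketch (the flag name is illustrative):

```python
import argparse

parser = argparse.ArgumentParser(description="test runner (hypothetical)")
# Default to fail-fast for CI; developers can opt into running everything.
parser.add_argument("--keep-going", action="store_true",
                    help="run all tests and report all failures at the end "
                         "instead of stopping on the first one")
args = parser.parse_args()
```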

Desired Features

  • Come up with a unified approach that will ALWAYS provide the developer with some valid error info on test failure.

  • Provide access to foedag.log after test run

    • Capturing the log for all tests would be ideal, but it is ok if first implementations only capture it on failure

  • Provide a report of all test results and possibly run metrics

    • Not necessary, but would be nice down the road

Possible Implementation

  • Create some wrapper test runner

    • Since we are executing system level commands, we aren’t bound to any language

    • Something simple like Python would be an easy approach

    • Currently Raptor ships Python for its IP catalog, but FOEDAG doesn’t, so we’d need to decide on packaging Python with FOEDAG to allow this

    • Tcl is an available option, but I’d avoid it unless we want to stick to languages we already use

    • Tcl has low adoption outside of EDA and uncommon coding conventions, which makes code maintenance difficult due to the limited number of developers with Tcl experience

    • This wrapper needs to be able to monitor and handle subprocesses, which Tcl might not be able to do (see the sketch below)
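
To illustrate the subprocess point, here is a minimal Python sketch of launching one test, capturing all of its output, and killing it on timeout (the command and timeout value are placeholders):

```python
import subprocess

def run_one(cmd, timeout_sec=600):
    """Run one test command; return (returncode, output, timed_out)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    try:
        output, _ = proc.communicate(timeout=timeout_sec)
        return proc.returncode, output, False
    except subprocess.TimeoutExpired:
        proc.kill()                     # cancel the hung process
        output, _ = proc.communicate()  # drain whatever it printed before hanging
        return -1, output, True
```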

  • Create some way of listing our tests and details they might need to run

    • Possible entries for a given test (illustrated in the sketch after this list):

      • Binary path (probably always foedag or raptor, but might want to provide a way to choose)

      • Arguments

        • A lot of GUI runs will have common arguments like --script and --replay, but less common ones like --compiler openfpga need to be considered when trying to separate and abstract all this

      • Path to script
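
As a strawman, this listing could just be data that the runner iterates, e.g. a Python (or JSON/YAML) structure with the entries above. Only the first command below comes from our current Makefile; the second is hypothetical:

```python
TESTS = [
    {
        "name": "compiler_flow",
        "binary": "./dbuild/bin/foedag",
        "args": ["--script", "tests/TestGui/compiler_flow.tcl"],
    },
    {
        # hypothetical example of a less common invocation
        "name": "openfpga_flow",
        "binary": "./dbuild/bin/foedag",
        "args": ["--compiler", "openfpga", "--script", "tests/TestGui/openfpga_flow.tcl"],
    },
]
```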

  • Iterate the list of tests

    • Start a timeout counter and execute the test in a way that we can capture any output and control the process (see the sketch after this list)

    • On completion:

      • Store any metrics we might care about

        • For example, runtime

      • Store foedag.log

        • Would be good to do this for every test, but depending on how this info is accessed it might make more sense just to capture failed ones.

        • Might need to collect other .rpt files, so we might need a dynamic way to copy other files.

      • Store pass/fail result

    • On timeout:

      • Cancel the process

      • Store foedag.log

      • Note that the test timed out
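
Putting that loop together, a minimal sketch of iterating the tests, timing each one, and archiving foedag.log on failure or timeout (the paths and results directory are assumptions, not an agreed layout):

```python
import shutil
import subprocess
import time
from pathlib import Path

RESULTS_DIR = Path("test_results")  # hypothetical archive location

def run_all(tests, timeout_sec=600):
    """Run every test, recording status and runtime, and saving foedag.log."""
    results = []
    for test in tests:
        cmd = [test["binary"]] + test["args"]
        start = time.monotonic()
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True,
                                  timeout=timeout_sec)
            status = "pass" if proc.returncode == 0 else "fail"
        except subprocess.TimeoutExpired:
            status = "timeout"          # subprocess.run kills the child for us
        runtime = time.monotonic() - start  # one metric we might care about
        # Copy the log before the next test overwrites it; here only on failure.
        if status != "pass" and Path("foedag.log").exists():
            dest = RESULTS_DIR / test["name"]
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy("foedag.log", dest)
        results.append({"name": test["name"], "status": status, "runtime": runtime})
    return results
```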

  • Archive results

    • The CI itself will need some way to determine if it should error out

      • The test runner wrapper can probably do this

    • We need some way to pull/parse the results from a failure

      • Need to research CI to see what is possible

    • Ideally, it would be nice to parse and collect all these results into an HTML report that can be viewed by the developer after a run.

  • Alert CI to the result of the test run process

    • Probably as simple as the runner returning 0 or 1 to the Makefile (see the sketch below)
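
Tying the last two items together, the runner could end by writing a simple summary and signaling CI through its exit code. A minimal sketch with a plain-text report (an HTML report could replace it later), assuming the results list from the sketch above:

```python
import sys
from pathlib import Path

def finish(results, report_path="test_results/report.txt"):
    """Write a summary report and exit nonzero if any test failed."""
    failures = [r for r in results if r["status"] != "pass"]
    Path(report_path).parent.mkdir(parents=True, exist_ok=True)
    with open(report_path, "w") as f:
        for r in results:
            f.write(f"{r['name']}: {r['status']} ({r['runtime']:.1f}s)\n")
        f.write(f"\n{len(results) - len(failures)}/{len(results)} tests passed\n")
    # The exit code is what the Makefile / CI step actually checks.
    sys.exit(1 if failures else 0)
```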