`beariver`: failure with Yampa's battery of tests #270

miguel-negrao · 2022-03-14T15:02:27Z

Out of curiosity, I ran Yampa's test suite on bearriver by changing the dependency on Yampa to bearriver. First I discovered that some functions that exist in Yampa, such as snap and delay, are missing from bearriver. After commenting out the tests that depend on those functions and running the rest, I saw that many of them fail. Is bearriver expected to pass these tests? Could the failures be related to the fact that bearriver doesn't have the concept of a value at time 0 for an SF?

Yampa QC properties
  SF based on (**2) equal to SF on (^2)):             FAIL
    *** Failed! Falsified (after 83 tests):
    (50.24434,[(58.47182466972261,34.557312),(20.231661236485348,-50.80688),(7.218444010637784,-74.01073),(40.20546318749015,65.252304),(15.82790001400401,-55.067417),(13.652100846100907,77.0322),(64.53345505896199,67.87616),(76.35816589737523,23.783476),(62.232960300918435,10.726906),(66.73224746429403,80.5771),(66.94842923478188,27.716297),(54.76567385622039,37.090862),(40.5183381532102,45.29092),(42.628835407070326,-56.079567),(42.696083046921764,61.636703),(59.2966619891153,-20.217981),(34.43015173705496,-80.32405),(16.831737961477277,-77.2843),(52.738055534626405,67.438354),(57.18878711347577,55.46689),(20.64014580270434,80.58429),(40.03824449417514,-28.975924),(23.88014834398699,-6.0626917),(6.663921472329724,39.893562),(14.471230185535603,-38.895615),(67.47273452698106,-7.2321978),(23.253679117506643,-13.111708),(80.3291991288804,10.027201),(47.54587550032761,-64.94131),(59.506870185827104,4.3038564),(66.6474801739591,22.597166),(58.877341009195916,-67.222984),(66.53837048814208,62.83117),(70.70930628939345,-79.662895),(22.839840661953776,-1.8905973),(68.92267968551033,-1.4134943),(2.544375087752849,-54.88906),(36.4934335586765,19.597475),(57.18540104946238,-34.70285),(72.00286803529683,-75.75716),(69.19667501673068,49.632004),(12.804752095681293,-67.79812)])
    Use --quickcheck-replay=528891 to reproduce.
    Use -p '/SF based on (**2) equal to SF on (^2))/' to rerun this test only.
  Identity:                                           OK
    +++ OK, passed 100 tests.
  Arrow Naturality:                                   OK (0.02s)
    +++ OK, passed 100 tests.
  Naturality:                                         OK (0.02s)
    +++ OK, passed 100 tests.
  Basic > Identity (1):                               OK
    +++ OK, passed 100 tests.
  Basic > Identity (2):                               OK
    +++ OK, passed 100 tests.
  Basic > Constant:                                   OK
    +++ OK, passed 100 tests.
  Basic > Initially:                                  OK
    +++ OK, passed 100 tests.
  Basic > Time:                                       FAIL
    *** Failed! Falsified (after 4 tests):
    (-1.8570651,[(2.394013962470563,1.1119848),(2.960133907944883,0.401466),(2.117922182509095,-0.61746645)])
    Use --quickcheck-replay=888729 to reproduce.
    Use -p '$0=="Yampa QC properties.Basic > Time"' to rerun this test only.
  Basic > Time (fixed delay):                         FAIL
    *** Failed! Falsified (after 1 test):
    (0.0,[(0.25,0.0)])
    Use --quickcheck-replay=173968 to reproduce.
    Use -p '/Basic > Time (fixed delay)/' to rerun this test only.
  Basic > localTime:                                  FAIL
    *** Failed! Falsified (after 4 tests):
    (2.0827584,[(1.087108041397159,-2.408621)])
    Use --quickcheck-replay=641317 to reproduce.
    Use -p '$0=="Yampa QC properties.Basic > localTime"' to rerun this test only.
  Basic > localTime (fixed delay):                    FAIL
    *** Failed! Falsified (after 1 test):
    (0.0,[(0.25,0.0)])
    Use --quickcheck-replay=372519 to reproduce.
    Use -p '/Basic > localTime (fixed delay)/' to rerun this test only.
  Collections > parB:                                 OK
    +++ OK, passed 100 tests.
  Arrows > Composition (1):                           OK
    +++ OK, passed 100 tests.
  Arrows > Composition (2):                           OK
    +++ OK, passed 100 tests.
  Arrows > Composition (3):                           FAIL
    *** Failed! Falsified (after 1 test):
    (0.0,[(0.25,0.0)])
    Use --quickcheck-replay=303294 to reproduce.
    Use -p '/Arrows > Composition (3)/' to rerun this test only.
  Derivatives > Comparison with known derivative (1): FAIL
    *** Failed! Falsified (after 1 test):
    (6.283143965558951e-3,[(1.0e-3,1.2566039883352607e-2)])
    Use --quickcheck-replay=367489 to reproduce.
    Use -p '/Derivatives > Comparison with known derivative (1)/' to rerun this test only.
  Derivatives > Comparison with known derivative (2): FAIL
    *** Failed! Falsified (after 1 test):
    (0.0,[(1.0e-3,0.0)])
    Use --quickcheck-replay=539382 to reproduce.
    Use -p '/Derivatives > Comparison with known derivative (2)/' to rerun this test only.
  Events > No event:                                  OK
    +++ OK, passed 100 tests.
  Events > Now:                                       OK
    +++ OK, passed 100 tests.
  Events > After 0.0:                                 FAIL
    *** Failed! Falsified (after 1 test):
    (0.0,[])
    Use --quickcheck-replay=4527 to reproduce.
    Use -p '/Events > After 0.0/' to rerun this test only.
  Arrows > First (1):                                 OK
    +++ OK, passed 100 tests.
  Arrows > First (2):                                 OK
    +++ OK, passed 100 tests.
  Arrows > Second (1):                                OK
    +++ OK, passed 100 tests.
  Arrows > Second (2):                                OK
    +++ OK, passed 100 tests.
  Arrows > Identity (0):                              OK (0.02s)
    +++ OK, passed 100 tests.
  Arrows > Identity (2):                              OK (0.02s)
    +++ OK, passed 100 tests.
  Arrows > Associativity:                             OK (0.04s)
    +++ OK, passed 100 tests.
  Arrows > Function lifting composition:              OK
    +++ OK, passed 100 tests.
  Arrows > First:                                     OK
    +++ OK, passed 100 tests.
  Arrows > Distributivity of First:                   OK (0.02s)
    +++ OK, passed 100 tests.
  Arrows > Commutativity of id on first:              OK (0.03s)
    +++ OK, passed 100 tests.
  Arrows > Nested firsts:                             OK (0.03s)
    +++ OK, passed 100 tests.

9 out of 33 tests failed (0.31s)

yampa-test> Test suite yampa-quicheck failed
Test suite failure for package yampa-test-0.13.3
    yampa-quicheck:  exited with: ExitFailure 1
Logs printed to console


ivanperez-keera · 2022-03-14T15:31:21Z

It is possible, yes.

It's worth investigating why some are failing. As you say, since bearriver has a first sample with delta 0, things could fail.

A failure could suggest a number of things, including:

That bearriver is broken.
That a Yampa concept does not translate as expected to BR.
That a testing library is broken.
That the test is expressed in a way that is not abstract enough.

Unless it's really obvious, I would expect that this issue might have to be split in multiple other issues.
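The "first sample with delta 0" mentioned above can be contrasted with a separate initial sample using a small model. The sketch below is only an illustration: `YampaStream`, `DeltaStream`, `toDeltaStream`, `timeYampa` and `timeDelta` are hypothetical names, not part of Yampa or bearriver, and the all-deltas representation is an assumption based on the description in this thread.

```haskell
-- Toy model only: these are NOT the real Yampa or bearriver definitions.
type DTime = Double

-- Yampa-style input: one initial sample, then (delta, sample) pairs.
type YampaStream a = (a, [(DTime, a)])

-- Assumed bearriver-style view: every sample carries a delta,
-- and the first sample is paired with delta 0.
type DeltaStream a = [(DTime, a)]

-- Convert the Yampa-style stream into the all-deltas view.
toDeltaStream :: YampaStream a -> DeltaStream a
toDeltaStream (a0, rest) = (0, a0) : rest

-- Running time with a distinguished initial sample:
-- time is 0 at the first sample, then the deltas accumulate.
timeYampa :: YampaStream a -> [DTime]
timeYampa (_, rest) = scanl (+) 0 (map fst rest)

-- Running time in the all-deltas view: accumulate every delta,
-- including the leading 0, and drop the seed of the scan.
timeDelta :: DeltaStream a -> [DTime]
timeDelta = drop 1 . scanl (+) 0 . map fst
```

In this toy model both views produce the same running time, so a delta-0 first sample would not by itself change what `time` outputs. Any difference would have to come from how specific combinators or the test harness treat that first step.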

miguel-negrao · 2022-03-14T16:11:44Z

Looking at the test prop_basic_time_increasing, if I add some traces:

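import Debug.Trace (trace)  -- needed for the added traces

prop_basic_time_increasing =
   forAll myStream $ evalT $ Always $ prop (sf, pred)
 where myStream :: Gen (SignalSampleStream Float)
       myStream = uniDistStream

       -- Pair the current time with the previous one (initially -1).
       sf   :: SF a (Time, Time)
       sf   = loopPre (-1 :: Time) sfI

       -- Trace every value produced by time.
       sfI :: SF (a,Time) ((Time, Time), Time)
       sfI =  ((time >>> arr (\t -> trace ("time " <> show t) t )) *** identity) >>> arr resort

       -- Trace the (new, old) pair and feed the new time back.
       resort :: (Time, Time) -> ((Time,Time),Time)
       resort (newT, oldT) = trace ("resort " <> show (newT, oldT)) ((newT, oldT), newT)

       -- The property: time must be strictly increasing.
       pred :: a -> (Time, Time) -> Bool
       pred _ (t,o) = (t > o)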

then the output is:

  Basic > Time:                                       time 0.0
resort (0.0,-1.0)
time 0.0
resort (0.0,-1.0)
time 0.6998458781665305
resort (0.6998458781665305,-1.0)
time 0.6998458781665305
resort (0.6998458781665305,0.6998458781665305)
FAIL
    Passed:
    (0.0,[])
    
    Failed:
    (0.4182935,[(0.6998458781665305,-0.30343592)])
    
    *** Failed! Falsified (after 2 tests):
    (0.4182935,[(0.6998458781665305,-0.30343592)])
    Use --quickcheck-replay=483600 to reproduce.
    Use -p '$0=="Yampa QC properties.Basic > Time"' to rerun this test only.

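It's strange that time is evaluated twice with value 0, although the previous time is still -1 in both evaluations. Then, once the first delta is consumed, time outputs the same value twice, and in the second evaluation the previous time already equals the current one. That shouldn't happen, and it makes the test fail.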

miguel-negrao · 2022-03-16T10:37:01Z

The function evalSF seems to be correct. The issue is in evalT, which implements a linear temporal logic. The definition of evalT is somewhat convoluted, and somewhere in it something is behaving differently. There is a comment in the source code:

-- Important question: because this FRP implement uses CPS,
-- it is stateful, and sampling twice in one time period
-- is not necessarily the same as sampling once. This means that
-- tauApp, or next, might not work correctly. It's important to
-- see what is going on there... :(

Indeed, although I couldn't confirm it, I have the feeling somehow the same "tick" is being evaluated twice.
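One way to see how evaluating the same tick twice could break the property `t > o` is a small continuation-style model. This is only a sketch under assumptions: the `SF` type, `timeWithPrev`, `runOnce` and `runResampled` below are hypothetical and are not the definitions used by Yampa, bearriver or yampa-test, and the sketch does not reproduce the exact trace shown earlier in the thread.

```haskell
-- Toy continuation-style signal function; NOT the real library type.
type DTime = Double

newtype SF a b = SF { step :: DTime -> a -> (b, SF a b) }

-- Emits (current time, previously emitted time), loosely mirroring
-- the loopPre (-1) construction used in the test.
timeWithPrev :: DTime -> DTime -> SF a (DTime, DTime)
timeWithPrev t0 prev = SF $ \dt _ ->
  let t = t0 + dt
  in ((t, prev), timeWithPrev t t)

-- Intended evaluation: each sample is stepped exactly once.
runOnce :: SF a b -> [(DTime, a)] -> [b]
runOnce _  []               = []
runOnce sf ((dt, a) : rest) =
  let (b, sf') = step sf dt a
  in b : runOnce sf' rest

-- Hypothetical faulty evaluation: after stepping a sample, the
-- continuation is sampled again within the same time period (delta 0).
runResampled :: SF a b -> [(DTime, a)] -> [b]
runResampled _  []               = []
runResampled sf ((dt, a) : rest) =
  let (b1, sf1) = step sf  dt a
      (b2, sf2) = step sf1 0  a
  in b1 : b2 : runResampled sf2 rest

-- The property checked by the test: time strictly increases.
increasing :: (DTime, DTime) -> Bool
increasing (t, o) = t > o
```

With `runOnce`, every pair produced from a stream with positive deltas satisfies `increasing`. With `runResampled`, the second sampling of each tick yields a pair whose two components are equal, which violates the strict inequality in the same way as the last `resort` line of the trace.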

ivanperez-keera · 2022-04-02T17:08:59Z

Quick question: did you try this with develop or with your version of bearriver that fixes #266?

miguel-negrao · 2022-04-02T18:49:40Z

It was with the version of bearriver in hackage.

ivanperez-keera · 2022-06-10T10:25:52Z

Partly related: #40.

ivanperez-keera · 2024-06-23T02:00:58Z

#40 is near completion (only part of 3 modules is missing now). After that, my intention is to enable Yampa's tests with bearriver and address everything that fails. How long that will take will mostly depend on how many things fail.

ivanperez-keera · 2024-09-21T14:22:38Z

All of Yampa's API will be covered when #426 is completed, which should happen as part of the next release.

After that, we can run this again and identify which tests "misbehave".

ivanperez-keera · 2024-10-11T10:01:59Z

#426 and, consequently, #40 are now complete. We can now run these tests again and see what breaks.

ivanperez-keera · 2024-10-12T15:59:51Z

I've just re-run the tests. I've found the following issues:

bearriver doesn't make submodules available under the FRP.Yampa namespace.
~~embed has a different signature, so the tests don't compile without composing every call to embed with runIdentity~~ (addressed in bearriver: the function embed is API-incompatible #439).
embedSynch imposes a constraint on MonadFail, so tests also do not compile out of the box.
Some tests hang. I've located individually the ones that do.
Some tests fail.
~~loopPre is duplicated in bearriver~~ (addressed in bearriver: loopPre is defined twice #438).
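The `runIdentity` workaround mentioned for `embed` in the list above can be sketched with a self-contained model. Everything here is an assumption made for illustration: `SFM`, `embedM`, `embedPure` and `sumSF` are hypothetical stand-ins, not bearriver's actual types or functions, and the input stream is simplified to plain `(DTime, a)` pairs.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

type DTime = Double

-- Hypothetical monadic signal function; NOT bearriver's real SF type.
newtype SFM m a b = SFM { stepM :: DTime -> a -> m (b, SFM m a b) }

-- Stand-in for an embed whose result lives in a monad.
-- The initial sample is stepped with delta 0.
embedM :: Monad m => SFM m a b -> (a, [(DTime, a)]) -> m [b]
embedM sf (a0, rest) = go sf ((0, a0) : rest)
  where
    go _ []              = pure []
    go s ((dt, a) : xs)  = do
      (b, s') <- stepM s dt a
      bs      <- go s' xs
      pure (b : bs)

-- Pure wrapper: fixes the monad to Identity and unwraps the result,
-- so call sites written against a pure embed keep compiling.
embedPure :: SFM Identity a b -> (a, [(DTime, a)]) -> [b]
embedPure sf = runIdentity . embedM sf

-- Example signal function: running sum of the inputs.
sumSF :: Monad m => Int -> SFM m Int Int
sumSF acc = SFM $ \_ a ->
  let acc' = acc + a
  in pure (acc', sumSF acc')
```

Defining the wrapper once, instead of composing `runIdentity` at every call site, keeps the test code unchanged apart from an import.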

The list of changes is extensive. Addressing this as part of one issue will be nearly impossible. I'll split this into smaller issues (e.g., per module).

ivanperez-keera changed the title ~~beariver - yampa tests~~ beariver fails with Yampa's battery of tests Mar 14, 2022

ivanperez-keera added the bug label Mar 14, 2022

ivanperez-keera changed the title ~~beariver fails with Yampa's battery of tests~~ beariver: failure with Yampa's battery of tests May 7, 2022
