Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combine event loop cleanup with other style/modernization cleanup #4

Open
wants to merge 29 commits into
base: main
Choose a base branch
from

Conversation

mattsta
Copy link
Contributor

@mattsta mattsta commented Aug 18, 2024

Previous nice refactorings/structure/cleanups provided by @gnzsnz plus some additional style and technical debt cleanups found along the way.

Easiest way of running tests is now poetry install --with dev; poetry run pytest -vs -n auto (but some of the tests have timing conflicts, so maybe 1 out of 10 or 20 test runs show false errors when running concurrently... we can also just run poetry run pytest -vs for regular sequential running).

(and docs building is now: poetry install --with docs; poetry run sphinx-build -b html docs html — we probably don't need the generated docs in the repo itself — we could set up something like the ib_async CI docs-to-website-on-commit config here too).

Status: all tests are currently passing on 3.10 and 3.12 and the rest works for me 🤷‍♂️

gnzsnz and others added 12 commits August 15, 2024 15:05
Python 3.10 as a minimum seems safe (4 years old right now).

Also added more dev deps including pytest with a concurrency runner.
Tests can now be run concurrently with:

poetry run pytest -n auto
Improves:

- uses only more modern python versions for tests
- uses poetry for installs and running instead of pip directly
More new lines between different logical sections for easier reading.

Cleaned up some other loop creation to be inline when possible.

Replaced `asyncio.ensure_future()` in ops/transform.py with
`create_task()` for more modern APIs and all the tests still pass.
Primarily due to version changes over time, this all used to work in
older versions, but Python 3.12 started issuing new warnings and errors
unless manual steps are used for event loop handling instead of automatic steps.

Fixes #2
Interestingly, ruff and µsort have different opinions on how headers
should be sorted. Guess one has to win out over time (µsort prefers to
have blank lines between regular "import" lines and "from X import Y"
lines).
Fixes:

- Incorporates doc deps into poetry (poetry install --with dev,docs)
- Removes manual version.py parsing to get version number
- Fixes "no language" error in docs spec
- Fixes doc syntax problem in docstring

Docs can now be built using:

poetry run sphinx-build -b html docs html
version this, version that, it worked, but we were overwriting things
incorrectly even though it didn't impact the final otuput.
@mattsta mattsta requested a review from gnzsnz August 19, 2024 19:07
@gnzsnz
Copy link
Collaborator

gnzsnz commented Aug 20, 2024

FYI

/dev/eventkit on next*
$ poetry run ruff check --select "NPY201" eventkit/
All checks passed!

I think there is no need to update ruff rules as described here

  • I added a test for emit_threadsafe, that is impacted by get_event_loop
  • event.py had import after the Event class declaration. I run ruff on it and sorted imports

@gnzsnz
Copy link
Collaborator

gnzsnz commented Aug 20, 2024

@mattsta I was planning to add github actions to automate pypi release process as described here

https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/

i see that you use poetry, which can be used to publish to pypi as well so i wanted to double check this with you. with github actions we can set a rule to publish any tag that follows a certain rule, usually v0.0.1 format. so once all tests pass, a new tag will push to pypi.

let me know your thoughts

Copy link
Collaborator

@gnzsnz gnzsnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm good to go

@gnzsnz
Copy link
Collaborator

gnzsnz commented Aug 20, 2024

  • event.py had import after the Event class declaration. I run ruff on it and sorted imports --> rolled back causes circular import problems

Originally slots were using just nested ararys indexed by position, but
we can do better now.

This project was originally written in Python 3.6 before many more
modern things like dataclasses exist, so now we can use
dataclass(slots=True) for automatic slots generation as well as easily
use dataclasses for inner-encapsulated data management.
@mattsta
Copy link
Contributor Author

mattsta commented Aug 24, 2024

Thanks! Looks good all around. Nice adding an extra test too. The packaging via actions should be fine if you want to manage the auth tokens inside the actions environment. I haven't looked into setting all those configs up before.

How about one more before update before a version bump... I worked a while this week to modernize the data layout away from positional nested arrays into more self-managing objects.

All tests still pass for me, but give it a look to see if anything looks out of place or if any naming should be improved anywhere. 🎯

@mattsta
Copy link
Contributor Author

mattsta commented Aug 24, 2024

on second look of my diff, I noticed I left in @dataclass(slots=False) because that's what caused the tests to pass.

There's a trick with dataclasses and automatic slots and weakrefs where they don't work nicely together.

The only difference between the original __slots__ implementation and @dataclass(slots=False) is the slots=False version is about 5 ns slower to access an instance variable property (because slots is faster than regular instance variable storage).

It's okay if we want to revert the Event() part back to manual __slots__, or we could keep it in the newer code organization layout.

@gnzsnz
Copy link
Collaborator

gnzsnz commented Aug 27, 2024

@mattsta , I'm reviewing your code. I wrote a few test for Slot and Slots classes. I will post them.

@gnzsnz
Copy link
Collaborator

gnzsnz commented Aug 28, 2024

@mattsta I added 2 new files for github actions, "build and test" and "pypi publish". I added pypy to the matrix, but it's failing. from what i can see pypy seems to be handling garbage collector differently, so the tests that delete the event handler fail to pass the assertions. it's not a big problem, eventkit and ib_async work with pypy (i have tested) but there might be surprises.

I added as well dependabot, which will keep github actions up-to-date, please enable it on settings > Code security > dependabot (I don't have access)

I loosing my faith with PEP 514, there is a huge backlog and no progress what so ever on any request. the only issues that are closed have been closed by the author.

@mattsta
Copy link
Contributor Author

mattsta commented Aug 28, 2024

I added pypy to the matrix, but it's failing.

oh, great idea. Nice to keep pypy verified too. I'll test more locally to see if I can get it to start working again. I did run the tests locally with pypy 3.12 recently and it worked, but maybe that was before some other recent tests/changes.

I added as well dependabot, which will keep github actions up-to-date, please enable it

will do. thanks for playing yaml warrior with all those things.

loosing my faith with PEP 514,

It was nice to try at least? Honestly, I think we know eventkit isn't really used by any other project in the world than the IBKR client (the eventkit downloads are exactly the same as ib_async/insync downloads: https://pypistats.org/packages/eventkit), so we could easily release it as a new independent package or not even use a package and just reference it as a github URL dependency using a tag for fetching directly.

We could also do a bit of a feature-reduction for parts not necessary for basic event notifications. I'm not sure anybody has ever actually used things like the numpy-backed ops/array.py:Array() implementation, etc? I don't see why we need numpy to be a dependency here really (we could move the Array() file to only testing, then add numpy as a testing dependency though just to maintain it as an example).

I think we could just declare the core usage here is only adding event handlers to named events, then running event handlers when named events receive event updates. If somebody wants to create a graph of numpy processing elements, it can be easily composed from existing built-in operations without needing to provide those exactly internally (or people can subclass/copy locally where needed too).

@mattsta
Copy link
Contributor Author

mattsta commented Aug 28, 2024

Found some pypy details with weakref problems:

Here are some more technical details. This issue affects the precise time at which del methods are called, which is not reliable or timely in PyPy (nor Jython nor IronPython). It also means that weak references may stay alive for a bit longer than expected. This makes “weak proxies” (as returned by weakref.proxy()) somewhat less useful: they will appear to stay alive for a bit longer in PyPy, and suddenly they will really be dead, raising a ReferenceError on the next access. Any code that uses weak proxies must carefully catch such ReferenceError at any place that uses them. (Or, better yet, don’t use weakref.proxy() at all; use weakref.ref().)

https://doc.pypy.org/en/latest/cpython_differences.html

The failing tests work if I manually inject a GC between the delete and the next check in test_functor and test_method. The manual GC allows all the remaining tests pass (minus regular timing issues for the delayed tests), so ymmv.

(also this error has always been there back to the last official release, so it is unrelated to any of our recent changes/refactors).

Fails

        del obj2

        self.assertEqual(len(event), 1)

Works

        del obj2
        gc.collect()

        self.assertEqual(len(event), 1)

@mattsta
Copy link
Contributor Author

mattsta commented Aug 28, 2024

dependabot, which will keep github actions up-to-date, please enable it

I think this is just enabled by the action being committed. If I go to enable it in the settings, all it says is "commit this to enable the feature":

# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file

version: 2
updates:
  - package-ecosystem: "" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"

@gnzsnz
Copy link
Collaborator

gnzsnz commented Aug 29, 2024

Ok, so I went through a rabbit hole.

I have solved "all" (knock on good) the warnings with the tests. Specially the warnings stating that the event loop is closed.

I didn't like the functools.cache and I realized that is only needed so the tests don't complain. Which is why I removed, that's when I felt through the rabbit hole. And probably the Edwald's reason to use a global in utils.py

There is a mix of things causing the warnings, the main one is pytest-xdist, and a bad combination between unittest module and asyncio. So to summarize my changes

  • functools.cache is out
  • pytest-async plugin in
  • I had to rewrite all tests so they don't use unittest, so now all test are on pytest
  • All test pass, except when they are run with pytest-xdist, poetry run pytest -vs -n auto gives a warning. It's a warning, not an error

task: <Task pending name='Task-18' coro=<Aiterate._looper() done, defined at ~/Documents/dev/eventkit/eventkit/ops/create.py:42> wait_for=>

  • updated Event.repeat, as the parameters where in the wrong position. it should be Ok now.
  • update Aiterate and Wait to use tasks. removed __del__ method from Aiterate as it causes warnings. we can put it back and live with warnings on tests, which is not bad.
  • right now everything is task based, except timing module which uses loop.call_soon and loop.call_at.
  • I'm looking Slots class so it uses tasks rather than ensure futures. coming soon.

All test pass now, but i will check that ib_async stills works. I have some code that uses ticker.updateEvent.trades().timebars it will be a good test

- functools.cache is out
- pytest-async plugin in
- I had to rewrite all tests so they don't use unittest, so now all test are on pytest
- All test pass, except when they are run with pytest-xdist, poetry run pytest -vs -n auto gives a warning. It's a warning, not an error

 > task: <Task pending name='Task-18' coro=<Aiterate._looper() done, defined at ~/Documents/dev/eventkit/eventkit/ops/create.py:42> wait_for=>

- updated Event.repeat, as the parameters where in the wrong position. it should be Ok now.
- update Aiterate and Wait to use tasks. removed __del__ method from Aiterate as it causes warnings. we can put it back and live with warnings on tests, which is not bad.
- right now everything is task based, except timing module which uses loop.call_soon and loop.call_at
@mattsta
Copy link
Contributor Author

mattsta commented Aug 29, 2024

Some of this may be a bit too much now?

Overall, this is a tiny abandoned project whose only real goal is to "trigger callback." If refactor attempts are introducing behavior changes and warnings and requiring workarounds, it's probably not worth it.

I had to rewrite all tests so they don't use unittest, so now all test are on pytest

amusingly, I also did the same thing and saw it just ran slower overall under pytest logic instead of UnitTest classes, so I didn't add them back. I still have my pytest converted tests sitting around and now all those tests now fail due to the event loop and task/future changes...

It appears all the original tests are now broken because of event loop logic changes, so technically all the tests don't work and we fixed broken tests by rewriting the tests instead of maintaining original code compatibility?

All test pass, except when they are run with pytest-xdist, poetry run pytest -vs -n auto gives a warning. It's a warning, not an error

New warnings introduced where they didn't exist before seems not entirely clean.

updated Event.repeat, as the parameters where in the wrong position. it should be Ok now.

Nice catch!

update Aiterate and Wait to use tasks. removed del method from Aiterate as it causes warnings. we can put it back and live with warnings on tests, which is not bad.

This is another one of those "it worked before, but now it doesn't work after all the changes," so it doesn't work?

I agree create_task() is cleaner, but it just doesn't work for the purposes here. Spending hours of bug fixes and research and refactoring to convert ensure_future() into create_task() for no actual benefit isn't really worth it (other than our nerd logic of "everything must be the newest").

Moving to create_task() does introduce behavior changes (see: create_task leaving all those Pending events on close which doesn't happen with ensure_future(loop=loop) and also create_task() failing in all the tests which weren't running with an already active event loop, so the tests were rewritten to accommodate the behavior changes, but it means all the original uses cases the tests were checking for are also now broken).

right now everything is task based, except timing module which uses loop.call_soon and loop.call_at.

It does seem like the right thing to do, right? I had also tried that a couple weeks ago and found it just doesn't work. I ran the original tests and they all fail because, I think, they used to work before the event loop was running by placing tasks on it, then they started after the event loop started. create_task() only works on a running event loop which breaks all the original assumptions around how these APIs work (which, for a 5-8 year old abandoned project, just isn't worth the effort to break compatibility?).

addendum

(also, yes the original pattern of "allow attaching futures to a non-running event loop so they run when the loop starts" is a non-standard practice, but it's how this thing has worked since it was created, so do we want to break it now? — another alternative is: if our refactors break existing features, we could just remove the existing non-core features too. simple as. I would be surprised if anybody in the world actually used any of the "advanced operator" or time delay or numpy APIs at all.

i guess one good example: if we want to keep these async/create_task update changes and break backwards usage, we could also just remove the Event.run() thing which assumes it is controlling an entire event loop itself. That's why the original tests all had "modern" problems because every test wanted full control over the event loop.)

@gnzsnz
Copy link
Collaborator

gnzsnz commented Aug 30, 2024

Hi,

It appears all the original tests are now broken because of event loop logic changes, so technically all the tests don't work and we fixed broken tests by rewriting the tests instead of maintaining original code compatibility?

the original test are implemented using unittest which does not have good asyncio compatibility, pytest-asyncio handles that nicely reference. The test are basically the same than before, in terms of coverage, but now they run in async methods.

broker? no they are not broken

This is another one of those "it worked before, but now it doesn't work after all the changes," so it doesn't work?

it does work, all test pass. the warning is caused by pytest-xdist that closes the loop before everything is done.

we are already using create_task with no problem in transform.Map

I have tested it with ib_async and all works fine. The eventkit sample notebook works fine, everything is working fine. There is no impact in changing ensure_future by the create (which is the recommended way going forward). ensure_future itself returns Tasks objects

with these changes we have tests working in modern way (no need for workarounds for test to pass, like the function cache, or use sync methods to test async)

@mattsta
Copy link
Contributor Author

mattsta commented Aug 30, 2024

Well, yes, overall the changes are great! Improvements to an abandoned project for stability and modernization are always a good thing.

But we have to acknowledge the new ensure_future->create_task changes break the original usage.

Originally the patterns allowed were:

  • create event pipeline (events are added to a non-running loop then started later!)
  • start event loop

or:

  • start event loop
  • create event pipeline

These updates change the pattern to not work with the original usage of (create events -> start loop) and now everything must be (start loop -> create events). That's why all the original tests break is because ensure_future allows adding tasks to a non-running event loop. It doesn't really have to do with UnitTest being bad at async.

I think the original use case was just because of the way he liked to write things in notebooks and test them later all at once, but the usage of the library has now changed to remove an entire category of use cases which worked before.

Does it matter? Probably not. But we broke the interface, all the original tests broke, then we changed the tests to match the new interface changes instead of maintaining test compat over time.

Since the original way doesn't work anymore, we can delete the .run() since it will never work again, and I guess also remove the Aiterate interface because now it leaves dead tasks in the background we can't clean up nicely since they now they are tasks and not futures? Should be good to go then.

@gnzsnz
Copy link
Collaborator

gnzsnz commented Sep 1, 2024

Indeed

ensure_future->create_task changes break the original usage.

I missed this scenario, there is no test for it, because it was basically everywhere, until I replace all for await. But this is not creating the event loop errors.

I've done an analysis and the errors are coming from the "new" util.get_event_loop() implementation. It is not equivalent to asyncio.get_event_loop_policy().get_event_loop(). The errors are solved by the cache, but to be honest I'm not 100% convinced by the cache solution.

I'm not sure if it's just a problem with the tests, and how the event loop is started/stopped during test or a problem with util.get_event_loop() implementation itself.

There is a note on asyncio.get_running_event_loop

This function can only be called from a coroutine or a callback.

and in asyncio.get_event_loop

using the get_running_loop() function is preferred to get_event_loop() in coroutines and callbacks.

so asyncio.get_running_event_loop comes with strings attached.

The way I see it the options are:

  • use the legacy asyncio.get_event_loop_policy().get_event_loop() which raises a warning, and eventually it will stop working.
  • use the new solution which clearly is having side effects.

Personally I rather live with the warning. If you are Ok let me know I will summit a commit with the changes.

And please do your analysis, I might have missed something. Personally never went this deep on the event loop

new test for exception handling during emit/slots.__call__
enable pytest log on console
@gnzsnz
Copy link
Collaborator

gnzsnz commented Sep 1, 2024

fix and unit test for #5

new setting on pyproject.toml to actually see the log message

@gnzsnz
Copy link
Collaborator

gnzsnz commented Sep 6, 2024

The way I see it the options are:

* use the legacy `asyncio.get_event_loop_policy().get_event_loop()` which raises a warning, and eventually it will stop working.

* use the [new solution](https://github.com/ib-api-reloaded/eventkit/issues/2#issuecomment-2295201637) which clearly is having side effects.

rolled back changes, with "legacy get event loop"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants