-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support PyTorch Lightning: more carefully avoid problematic attribute accesses #31
Support PyTorch Lightning: more carefully avoid problematic attribute accesses #31
Conversation
Hi Kale, thanks so much for this pull request, and for your careful development and testing here. All your points seem astute and reasonable:
I suspect all your changes are great (and will happily credit you as a contributor on the main page--thanks so much), but can I circle back on this next week? I am in the middle of a grant deadline so that is eating up all my time at the moment. Also, quick tangential question in the meantime--since you are one of the first (besides me) to do a substantial pull request for TorchLens, how hard has it been to work with the codebase? I am aware that pretty much everything is implemented in the massive |
As the test suite continued to run, the 97-106th tests failed. I think these are mostly to do with
I took the liberty of pushing another commit to mark this test as xfail. That will let other hypothetical contributors know which tests are supposed to pass, and is also a necessary first step towards setting up automated tests.
Yes, this is easy to do.
Of course, take your time!
This change wasn't too hard, but it also didn't really require understanding much about the code. My thought process was basically, "The code seems to be encountering an error while attempting to 'restore model attributes'. Probably by the time it's restoring things, it already has whatever data it needs, so I can just suppress the error and everything will be fine." So I added the try/catch block, ran the code, got another similar stack trace, and repeated that process until it worked. I was actually a bit uneasy submitting a PR despite how little I understand the code, and that's one of the reasons I paid so much attention to the tests. It was pretty easy for me to figure out which file to look in for what kind of information, although I guess part of that might just be that most of the information is in I think my biggest challenge with the code base is that I so far haven't been able to build a mental map for how the code works. I know that everything happens in Granted, I haven't put much effort into trying to understand the code. I'm just saying all of this because I know it can be helpful to get the perspective of someone who isn't as familiar with things. Because I can't help myself, I hope you don't mind if I give some unsolicited opinions on object-oriented software architecture. Broadly, the ideas I want to put forth are that (i) functions are easier to understand than methods, and therefore (ii) the only reason to use a method is when access to private class data is required. So if you're looking to refactor things (which I'm not saying would be a good use of your time), I suspect that a lot of The reason I think functions are easier to understand than methods is that they usually have simpler inputs and outputs. Methods always have access to Without This brings up the question of when methods should be used. As I wrote above, I think the answer is that they should be used when access to private class data is required. This relates to the idea that the purpose of a class is generally to hide some messy private data behind a clean public API. Only methods can do this. However, it's still true that methods are more complicated than functions, perhaps even more so when comparing methods that are responsible for maintaining the integrity of an object's private data with functions that are just accessing an object's public API. This leads to a few rules of thumb that I use when writing classes:
Bringing this all back to If you made it this far, thanks for reading. I hope I didn't come across as too critical or arrogant. |
Hi Kale, Apologies for the headaches with the unit tests, so far I've only been using them internally so any snafus are probably errors on my end. Currently the tests err on the side of being exhaustive (checking as many kinds of architectures as I can think of), so they do end up taking awhile. Thanks for all your great suggestions for how to streamline the tests and dependencies, when I have time I will do some tidying up on this front. And, thanks for all the feedback on the code organization! A second pair of eyes is super helpful after so long working solo on this package:
All very good point and it would probably be good to write a readme explaining this stuff. The general "flowchart" is:
Please don't apologize, I really appreciate the hot takes :] This was my first serious software project and I've actually been wanting a spot-check on my design choices. I totally agree with you that splitting up the code could do a lot for readability and that it could stand to be a lot more faithful to OOP best practices. I will think about all your specific suggestions. My one mental roadblock is... what if we actually do want to make a lot of changes to |
You can get the best of both worlds by using
I took some time to try to actually understand the code, so I could give some more informed feedback. I agree with the design goal of having everything available via attributes. Without changing that, I have a number of specific suggestions that might make the code easier to understand and/or modify: Convert redundant attributes into properties.Are you familiar with
Of all the suggestions I'm making here, I think this one is the best value for effort. It wouldn't be a big change, but it would simplify these classes a lot. Group related attributes into objects
For example, all of the I'll note that the attributes of both classes are already grouped via comments in their respective constructors. I'm basically just advocating for these same groups to be formalized into real data structures. Separate user-facing and private attributesSome of the
I think the code would be easier to understand if there was a more clear division between what's meant to be presented to the user, and what's just for internal use. For instance, one thing I'm still not sure about is whether "addresses" (which seem to be recursive data structures used to extract data from arbitrary python objects) exist for internal or external use. A simple way to make this distinction would be to prefix internal attributes with an underscore (e.g. Make a
|
This is all fantastic feedback, thanks so much for this deep dive. I think many of these ideas are no-brainers and would greatly improve the code. I have a K99 grant to push out the door in the next 48 hours, but will respond in detail after ;) |
Grant submitted, finally time to respond properly to your great comments :)
Fantastic idea and a no-brainer. This needs to be done.
Also a great suggestion and no-brainer. This will clean up a lot of the "plumbing" code.
I have thought about more limited versions of this--at the very least, I think having classes for describing the modules and for describing the parameters would be sensible. One thing to think about is the usability tradeoff of adding extra levels of nesting. It saves some keystrokes to only have one level of nesting when indexing ModelHistory or TensorLogEntry, but on the other hand it really is a big list of attributes and this could be clunky to navigate (and it's easy to overlook attributes that a user might find helpful). Here I am inclined to default to whatever is easier from the user standpoint.
100% agreed. This would be an easy fix and help to further tidy up the classes.
Indeed, this could be useful. In fact in my original version of TorchLens I did exactly this, with separate data structures for the object that logs the forward pass and the user-facing object. But maybe it would be cleaner to have separate data structures for these. I think this is very sensible.
I agree with this. One thought I had was to have torchlens decorate PyTorch upon being imported, but have there be a "toggle" such that the logging functionality is only executed in the context of
It's a good point: there is not a one-to-one mapping between function calls and tensors. I made the choice to "conflate" them for a few reasons:
But, these were subjective judgment calls that I'm not 100% wedded to :] I do think that making modules their own dedicated data structure could be useful (e.g., a
I agree with this, and to answer your question, the visualization code has no role in building
It exists solely for visualization purposes. The "rolled" graph option corresponds not to collapsing the modules (that's separate), but to reformatting the graph to show any recurrent loops. The issue is this: the visualization code uses the info from each
I actually did this initially until I ran into one vexing design choice: PyTorch forward hooks can't see any input tensors that were fed in as kwargs to the module's This shouldn't be the source of the mistaken module assignments (since they're just different ways of triggering code to be executed when the
The barcodes are used for a few different purposes, not all of them for labeling objects. They are used for individuating parameters, though, which is where your suggestion is helpful. I will say that the barcodes (8 chars long, any alphanumeric character) have 200 trillion+ possible values, so the odds of a collision are tiny... but you're right that there's nothing to lose from being kosher here. |
Also, reviewing the pull request today. It looks great so I think I'll be able to merge it shortly. I will credit you as a contributor on the readme :] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look great, thanks so much for doing this.
This PR makes a small change that allows TorchLens to work on PyTorch Lightning modules. If you're not familiar, Lightning is a framework that tries to abstract-away a lot of the training loop code that you'd otherwise have to write. The problem arises because when a Lightning module is used outside the context of a Lightning training loop, it turns out that attempting to access one particular attribute of the module results in an exception being raised. TorchLens ends up triggering this error (in three different places) as it attempts to monkeypatch the module.
This PR attempts to solve the problem by simply skipping any "inaccessible" attributes that are encountered. Some minor comments about the implementation/tests:
requirements.test.txt
, and I changed the way the others are imported so that the affected tests xfail but the rest of the test suite still runs.pyproject.toml
is a better way to specify extra dependencies thanrequirements.*.txt
. The main advantage is that it's more standardized, so it's easier for third-party tools to interact with. If you want, I'd be happy to make a PR that transitions the project fromsetup.py
topyproject.toml
. While I'm at it, I could also make it so that GitHub automatically runs the test suite on each PR.test_varying_loop_noparam2
test fails both in the main branch and in this branch. I'm assuming that this is not related to the changes I made.black
because it looked like it would change a lot of code I didn't write.