Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix bug that prevents deepcopy and pickling if object has a parent #37

Merged
merged 4 commits into from
Jan 21, 2024

Conversation

bela127
Copy link

@bela127 bela127 commented Jan 23, 2023

This bug is definitely the cause for: SheffieldML/GPy#666
and most certainly the cause for: SheffieldML/GPy#583

A lot of multiprocessing frameworks internally use pickle so using a other 'non protocol' copy method is not possible.
Implementation of the copy protocol should not have a hard to find bug.

The problem with existing code is the following:

def __deepcopy__(self, memo: dict):
        s = self.__new__(self.__class__) # fresh instance
        memo[id(self)] = s # be sure to break all cycles --> self is already done
        # The above line can cause hard to understand exceptions/bugs, because s is not 'done' its state attributes need to be copied first
        # If a state attribute has a link to a parent object (like '_parent_'), the parent is copied first, using the uninitialized copy s
        # thereby throwing an exception when attributes of the child are accessed while copying the parent.
        # so a subclass should not link to its parent or remove the link when beeing copied.

        state = self.__getstate__()
        updated_state = copy.deepcopy(state, memo) # standard copy
        s.__setstate__(updated_state)

The following code tests the behavior, and highlights the bug (it raises an AttributeError):

import pickle
import copy

import GPy
import numpy as np


min_support = -1
max_support = 1
support_points = 10
query_shape = (1,)
result_shape = (1,)

kern = GPy.kern.Brownian() #use a kernel

rng = np.random.RandomState(None)
support = rng.uniform(min_support, max_support, (support_points, *query_shape))

flat_support = support.reshape((support.shape[0], -1))

results = np.random.normal(0, 1, (1, *result_shape))

flat_results = results.reshape((1, -1))

m = GPy.models.GPRegression(flat_support[:1], flat_results, kern, noise_var=0.0) #use the kernel in a model, so the kernel has a parent


obj = copy.deepcopy(kern) #Now deepcopy the kernel ( now the '_parent_' attribute is set)
print(obj)


bytes = pickle.dumps(kern)


obj = pickle.loads(bytes)
print(obj)

The Error will be:

  File "/home/bela/Cloud/code/Git/alts-experiments/.venv/lib/python3.9/site-packages/paramz/core/nameable.py", line 63, in name
    return self._name
AttributeError: 'Brownian' object has no attribute '_name'

This is because of the following:

deepcopy tries to copy 'kern' -> adds it to 'memo' before it is fully copied-> finds 'parent' attribute that needs to be copied first -> begins copying the parent -> finds the 'kern' attribute in the parent -> looks it up in the 'memo' -> now the parent has an unfinished copy of 'kern' -> setstate of the parent tries to access attribute of this unfinished copy -> raises attribute error

This pull request will fix this bug, and the provided test as well as multiprocessing will work.

@MartinBubel
Copy link
Collaborator

Hi @bela127
thanks for taking the effort on this one and sorry for the lack of action since then!

I recently joined the team of maintainers and will bring some life into this project (with the intention of patching paramz.
I will do a review within the next days and probably re-route the PR to some other branch than master (first need to create some develop branch...).

Best regards
Martin

@bela127
Copy link
Author

bela127 commented Jan 16, 2024

Hi @bela127 thanks for taking the effort on this one and sorry for the lack of action since then!

I recently joined the team of maintainers and will bring some life into this project (with the intention of patching paramz. I will do a review within the next days and probably re-route the PR to some other branch than master (first need to create some develop branch...).

Best regards Martin

If there is anything you need, let me know.

Copy link
Collaborator

@MartinBubel MartinBubel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bela127
Thank you for your contribution and your patience!

I have made some comments on changes that I am uncertain about.

Also, I think it we would be a good habit to add a test for the updated behavior.
Is that something you could provide?

Best,
Martin

paramz/core/parentable.py Show resolved Hide resolved
@MartinBubel MartinBubel changed the base branch from master to development January 21, 2024 11:27
@MartinBubel MartinBubel merged commit eef6b70 into sods:development Jan 21, 2024
7 checks passed
MartinBubel added a commit that referenced this pull request Jan 21, 2024
update changelog after mergin #37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants