Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xrun: Fix potential concurrency issues #67

Merged
merged 2 commits into from
Aug 12, 2024

Conversation

Deedone
Copy link
Contributor

@Deedone Deedone commented Apr 16, 2024

Add reference counting to container objects to prevent unexpected failures when container object is freed right after return from get_container function.
Add locking to pause/resume functions.

xrun/src/xrun.c Outdated
struct container *container = NULL;

k_mutex_lock(&container_lock, K_FOREVER);
container = find_container_locked(container_id);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is prone to ABA problem. Reference owner should hold own pointer to the container.

xrun/src/xrun.c Show resolved Hide resolved
xrun/src/xrun.c Outdated
return ret;
}

put_container(container);
return unregister_container_id(container_id);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just call put_container for a second time here. With a comment, of course.

xrun/src/xrun.c Outdated
ret = domain_unpause(container->domid);
if (ret) {
put_container(container);
return ret;
}

container->status = RUNNING;
put_container(container);
return 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ret = domain_unpause(container->domid);
if (ret) {
put_container(container);
return ret;
}
container->status = RUNNING;
put_container(container);
return 0;
}
ret = domain_unpause(container->domid);
if (ret) {
goto out;
}
container->status = RUNNING;
out:
put_container(container);
return ret;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not mandatory. Just will look more readable from my standpoint. The same suggestion across all put_container calls

@Deedone Deedone force-pushed the container_refcount branch from d72414c to 960d7e5 Compare April 17, 2024 10:06
@Deedone Deedone changed the title xrun: Add reference counting to container objects xrun: Fix potential concurrency issues Apr 17, 2024
@GrygiriiS
Copy link

Similar comment as #72 (comment)

* - Add reference counting (like linux kref) to
* ensure that this object does not disappear under
* your feet.
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a pity, but this function has another concurrency issue: code below checks if container exists by using get_container. Problem is that if register_container_id is called twice with the same container id, it may register both containers. You need to hold the container lock whole time: obtain the lock, check if container already exists, create new one, free the lock.

This is another issue, so you can add TODO for now. Or make a separate patch that fixes this.

}

return unregister_container_id(container_id);
/* Put container twice to drop the last reference */
put_container(container);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... should you call domain_destroy when you are freeing container? Or it will be okay to have a container object without alive domain inside?

@Deedone Deedone force-pushed the container_refcount branch from 20879d6 to 9b61505 Compare August 2, 2024 08:31
Copy link
Collaborator

@lorc lorc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should to. Thanks.

How you tested this, BTW?

Acked-by: Volodymyr Babchuk <[email protected]>

@lorc
Copy link
Collaborator

lorc commented Aug 2, 2024

Oh, and don't forget to fix checkpatch.

@Deedone Deedone force-pushed the container_refcount branch from 9b61505 to 2078370 Compare August 3, 2024 07:54
@Deedone
Copy link
Contributor Author

Deedone commented Aug 3, 2024

Tested with unit tests and trying to start/stop/pause/unpause the aos monkey container.

Copy link
Collaborator

@firscity firscity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you
Reviewed-by: Dmytro Firsov <[email protected]>

Add reference counting to container objects to prevent unexpected
failures when container object is freed right after return from
get_container function.

Signed-off-by: Mykyta Poturai <[email protected]>
Acked-by: Volodymyr Babchuk <[email protected]>
Reviewed-by: Dmytro Firsov <[email protected]>
Add locking to pause and resume operations to prevent container->status
from getting out of sync with the actual state of the container.

Signed-off-by: Mykyta Poturai <[email protected]>
Acked-by: Volodymyr Babchuk <[email protected]>
Reviewed-by: Dmytro Firsov <[email protected]>
@Deedone Deedone force-pushed the container_refcount branch from 2078370 to fe9f3e7 Compare August 12, 2024 06:39
@firscity firscity merged commit 2136b2f into xen-troops:main Aug 12, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants