-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parallel restraint #218
base: master
Are you sure you want to change the base?
parallel restraint #218
Conversation
There are some tasks have to run in order at first or at the end. So I introduce three lists, pre/parallel/post in recipe running phase. Add PARALLEL parameter to put tasks in corresponding list. The default value to pre, so xml without any this param will no change. Round 1, the pre list tasks will run in order. like config/setup tasks. Round 2, the parallel list tasks will parallel execution. Round 3, the post list tasks run in order too. like reservesys, tasks cannot run in parallel Signed-off-by: Ping Fang <[email protected]>
Use glib thread pool launch nproc thread worker to execution task_handler. Build per-thread data structure pass it to handler nstead of app_data. Signed-off-by: Ping Fang <[email protected]>
Once tasks parallel execute, we can't use single app_data to track all tasks. The per-task index used to transfer between restraintd and rstrnt_xxx. Signed-off-by: Ping Fang <[email protected]>
load_eval function will compute currently system load through /proc/loadavg, /proc/stat, /proc/meminfo. This is used to prevent local watchdog timeout caused by other parallel tasks. Signed-off-by: Ping Fang <[email protected]>
The local server need task index to identify requester. Add get_nproc get max available thread number. Signed-off-by: Ping Fang <[email protected]>
Replace all internal interface, pass per-thread task data structure. Instead of use a single default main loop, create their own main loop foreach thread. The thread loop remove and reattach event source until no more tasks, then stop loop. Signed-off-by: Ping Fang <[email protected]>
Sometimes parallel execution will produce some presure or some stress testing parallel running. By evaluating system load, extend local timer needed 15 min at most 3 times to avoid unexpected timeout. There is only one external/remote watchdog in beaker side. We update it only when requested time longer then currently remain time. Signed-off-by: Ping Fang <[email protected]>
There maybe some tasks will conflict with others. For example, those tasks need to modify sysfs. The restraint will check this keyword and only allow one conflicted task execute at a time. There is a reserved keyword zreboot used for zstream. Signed-off-by: Ping Fang <[email protected]>
free these three lists at the recipe cleanup Signed-off-by: Ping Fang <[email protected]>
Add mutex to protect race condition of conflict keywords Signed-off-by: Ping Fang <[email protected]>
Signed-off-by: Ping Fang <[email protected]>
Signed-off-by: Ping Fang <[email protected]>
This pull request introduces 2 alerts when merging b2d368a into cf36cbc - view on LGTM.com new alerts:
|
In short, this PR should be REJECTED. Long version now
Basically right now this patch would do more harm than good to Restraint maintainers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment above ^
I agree with Martin. This is a high risk change which I voiced concerns about a long time ago. Even the smallest of changes may work fine for one user but fail miserably for other users. |
In short, let the tests run in parallel.