This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

Support dynamic fitness threshold #147

Open
EronWright opened this issue May 2, 2017 · 3 comments

Comments

@EronWright

EronWright commented May 2, 2017

I'd like the "good enough" calculation to vary with the urgency of the task. For example, imagine that a task should be scheduled within 30 seconds. At first, the fitness bar should be held high, then gradually lowered as the task request gets older; at the 30-second mark, "anything that meets the hard constraints will do".

I suggest that the FitnessGoodEnoughFunction accept the TaskAssignmentResult to convey the TaskRequest along with the fitness measurement.
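The decaying bar described above can be sketched in isolation. This is a minimal illustration, not Fenzo code: the class `DecayingFitnessThreshold`, its method names, the linear decay, and the 0.9 starting bar are all assumptions made for the example. It also presumes the caller can supply the age of the task request, which is exactly the information the good-enough function would need to receive.

```java
import java.time.Duration;

/**
 * Hypothetical helper (not part of Fenzo): a fitness threshold that starts
 * high and decays linearly to 0.0 as the task request ages.
 */
final class DecayingFitnessThreshold {
    private final double initialThreshold; // bar for a brand-new request, in [0, 1]
    private final Duration deadline;       // age at which any feasible assignment is accepted

    DecayingFitnessThreshold(double initialThreshold, Duration deadline) {
        if (initialThreshold < 0.0 || initialThreshold > 1.0) {
            throw new IllegalArgumentException("initialThreshold must be in [0, 1]");
        }
        if (deadline.isZero() || deadline.isNegative()) {
            throw new IllegalArgumentException("deadline must be positive");
        }
        this.initialThreshold = initialThreshold;
        this.deadline = deadline;
    }

    /** Threshold for a request of the given age; 0.0 at or past the deadline. */
    double thresholdAt(Duration age) {
        if (age.isNegative()) {
            return initialThreshold;
        }
        double fraction = (double) age.toMillis() / deadline.toMillis();
        if (fraction >= 1.0) {
            return 0.0;
        }
        return initialThreshold * (1.0 - fraction);
    }

    /** True if the measured fitness clears the bar for a request of this age. */
    boolean isGoodEnough(double fitness, Duration age) {
        return fitness >= thresholdAt(age);
    }
}
```

With a 30-second deadline and a starting bar of 0.9, a fitness of 0.5 would be rejected for a 5-second-old request and accepted for a 20-second-old one. Other decay curves (stepwise, exponential) would fit the same shape.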

@EronWright
Author

A possible workaround may be to dynamically adjust the fitness calculation, to 'improve' the fitness as the task grows older or more urgent. It's a hack that I'd rather avoid.

@spodila
Contributor

spodila commented May 3, 2017

The idea of making the fitness good-enough evaluation dynamic has existed since the beginning, but it hasn't made it into the implementation yet. For this to work effectively, I believe two pieces are needed:

  1. Make the evaluation of whether a fitness is good enough dynamic, based on additional factors such as task-level SLAs (for example, start or finish time deadlines).

  2. Enhance the scheduler so that it does not use the best available assignment if the good-enough function says it is not good enough. Currently, the good-enough function is used only to stop iterating over additional VMs once an assignment that is good enough has been found.

Your suggestion of having the function accept the task assignment result can address the first piece above. However, the second piece is required as well for this to work. Another question to answer is whether the scale-up evaluation must trigger a scale up if the task remains pending because of this new way of "failing assignment". It can likely be argued both ways, and further thought is needed to decide.
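To make the second piece concrete, here is a simplified sketch of the selection loop. It is not Fenzo's actual scheduler code: `AssignmentSelectionSketch`, `select`, and the `requireGoodEnough` flag are hypothetical names, and the input list is assumed to hold fitness values only for VMs that already satisfy the hard constraints.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.function.DoublePredicate;

/** Hypothetical illustration of how the good-enough function is used during selection. */
final class AssignmentSelectionSketch {
    private AssignmentSelectionSketch() {}

    /**
     * Returns the index of the chosen VM, or empty if the task should stay pending.
     *
     * @param fitnessPerVm      fitness of each feasible VM, in iteration order
     * @param goodEnough        the good-enough predicate
     * @param requireGoodEnough false: fall back to the best VM seen (behavior described as current);
     *                          true: leave the task pending if nothing is good enough (proposed)
     */
    static OptionalInt select(List<Double> fitnessPerVm,
                              DoublePredicate goodEnough,
                              boolean requireGoodEnough) {
        int bestIndex = -1;
        double bestFitness = 0.0;
        boolean foundGoodEnough = false;

        for (int i = 0; i < fitnessPerVm.size(); i++) {
            double fitness = fitnessPerVm.get(i);
            if (bestIndex < 0 || fitness > bestFitness) {
                bestIndex = i;
                bestFitness = fitness;
            }
            // The predicate only short-circuits the search over further VMs.
            if (goodEnough.test(fitness)) {
                foundGoodEnough = true;
                break;
            }
        }

        if (bestIndex < 0) {
            return OptionalInt.empty(); // no feasible VM at all
        }
        if (requireGoodEnough && !foundGoodEnough) {
            return OptionalInt.empty(); // proposed: reject a best-but-not-good-enough fit
        }
        return OptionalInt.of(bestIndex);
    }
}
```

With fitness values 0.3, 0.5, 0.4 and a bar of 0.8, the first mode picks the VM at index 1 while the second leaves the task pending. Whether that pending state should count toward scale-up is the open question raised above and is not modeled here.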

Additional thoughts welcome.

Regarding your workaround specifically, an alternative would be to implement it as a constraint instead. The constraint could return 0.0 if it doesn't find a good enough fitness (just use the same fitness function inside it that you set on Fenzo) and the task is still within its 30s window for finding a better fit. However, this can also trigger a scale-up action if you are using the cluster autoscaling feature with the shortfall analyzer. This workaround may be easier because you have the information you need within the constraint implementation, and it can work on a per-task basis.
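The logic of that constraint-based workaround might look roughly like the sketch below. It deliberately does not reproduce Fenzo's constraint interface: `PatientFitnessConstraint`, `allows`, the generic VM type `V`, and the fixed threshold are assumptions for illustration, and the decision is expressed as a plain boolean.

```java
import java.time.Duration;
import java.util.function.ToDoubleFunction;

/**
 * Hypothetical constraint logic (not Fenzo's interface): reject a VM while the
 * task is still within its patience window and the fitness is below the bar.
 */
final class PatientFitnessConstraint<V> {
    private final ToDoubleFunction<V> fitnessFn; // same fitness function given to the scheduler
    private final double threshold;              // bar applied during the patience window
    private final Duration patience;             // e.g. 30 seconds

    PatientFitnessConstraint(ToDoubleFunction<V> fitnessFn, double threshold, Duration patience) {
        this.fitnessFn = fitnessFn;
        this.threshold = threshold;
        this.patience = patience;
    }

    /** True if the task may be placed on this VM given how long it has been waiting. */
    boolean allows(V vm, Duration taskAge) {
        if (taskAge.compareTo(patience) >= 0) {
            return true; // window expired: only the other hard constraints matter
        }
        return fitnessFn.applyAsDouble(vm) >= threshold;
    }
}
```

Because the rejection is reported as a constraint failure, any autoscaling logic that reacts to failed assignments would still see it, which is the side effect mentioned above.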

@EronWright changed the title from "Provide TaskAssignmentResult to FitnessGoodEnoughFunction" to "Support dynamic fitness threshold" on May 4, 2017
@pradykaushik

@EronWright I feel that looking at the current load on the cluster would also be beneficial. The load could be used to decide whether to apply the fitness good-enough function or to stick with the regular fitness calculation.
A simple way to determine whether the cluster is loaded (assuming you're running a single framework on Mesos) would be to check the frequency and the resource values (how much CPU, how much memory, etc.) of Mesos resource offers.
The higher the load on a machine, the fewer resource offers one would receive from that machine. In such a scenario, one might want to switch to a fitness good-enough calculation to increase the likelihood that the task is assigned by Fenzo's task scheduler.
@spodila I'd like to hear your thoughts on this.
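One way to picture the offer-based load check is a sliding window over recent offers. The sketch below is only an illustration of that heuristic: `OfferWindow`, its methods, and the cut-off parameters are hypothetical, nothing here calls Mesos or Fenzo, and the caller is assumed to feed in each offer's timestamp and CPU amount.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding window over recent resource offers (no Mesos or Fenzo calls). */
final class OfferWindow {
    private final long windowMillis;
    private final Deque<Long> times = new ArrayDeque<>();  // arrival time of each offer
    private final Deque<Double> cpus = new ArrayDeque<>(); // CPUs carried by each offer
    private double cpuSum = 0.0;

    OfferWindow(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records one offer; timestamps are expected in non-decreasing order. */
    void record(long nowMillis, double offeredCpus) {
        times.addLast(nowMillis);
        cpus.addLast(offeredCpus);
        cpuSum += offeredCpus;
        evict(nowMillis);
    }

    int offerCount(long nowMillis) {
        evict(nowMillis);
        return times.size();
    }

    double offeredCpus(long nowMillis) {
        evict(nowMillis);
        return cpuSum;
    }

    /** Heuristic: few offers or little offered CPU in the window suggests a loaded cluster. */
    boolean looksLoaded(long nowMillis, int minOffers, double minCpus) {
        return offerCount(nowMillis) < minOffers || offeredCpus(nowMillis) < minCpus;
    }

    // Drops offers that are older than the window.
    private void evict(long nowMillis) {
        while (!times.isEmpty() && nowMillis - times.peekFirst() > windowMillis) {
            times.removeFirst();
            cpuSum -= cpus.removeFirst();
        }
    }
}
```

A scheduler wrapper could consult `looksLoaded` before deciding how strict the good-enough bar should be; suitable window sizes and cut-offs would have to be tuned per cluster.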
