This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

Incorrect handling of reserved resources #166

Open
EronWright opened this issue Dec 5, 2017 · 2 comments

Comments


EronWright commented Dec 5, 2017

Problem Description
Fenzo misinterprets offers containing a mix of reserved and unreserved resources, causing it to fail to consider all offered resources. For example, given an offer of 2 reserved CPUs and 3 unreserved CPUs, Fenzo behaves as though the offer contains 2 (or 3) CPUs, not 5 CPUs as it should.

This situation arises when the operator (or another framework in the same role) reserves a subset of a host for the framework's role. This is an increasingly common phenomenon due to:

  1. the dynamic reservation feature, which makes it easy for an operator to make fine-grained reservations.
  2. the growing popularity of the dcos-commons library, which makes extensive use of dynamic reservations. A framework based on that library may use the same role as a Fenzo-based framework, leading to unintended side-effects.

Here's an example depicting the resources within such an offer (2 cpus for myrole, 3 unreserved):

cpus(myrole):2.0; mem(myrole):4096.0; ports(myrole):[1025-2180];
disk(*):28829.0; cpus(*):3.0; mem(*):10766.0; ports(*):[2182-3887,8082-8180,8182-32000]

Problem Location
The root cause is within com.netflix.fenzo.plugins.VMLeaseObject. The VMLeaseObject assumes that a given resource name (e.g. cpus) will appear at most once in the offer.

Suggested fix
VMLeaseObject should aggregate all resources with the same name (subject to a set of roles to filter on).
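The aggregation suggested above could look roughly like the following. This is only a sketch under stated assumptions: ScalarResource and ResourceAggregator are hypothetical stand-ins invented for illustration, not Fenzo or Mesos types, and the real fix would operate on the Mesos protobuf resources held by the lease.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a scalar Mesos resource (NOT the Mesos protobuf type).
final class ScalarResource {
    final String name;  // e.g. "cpus" or "mem"
    final String role;  // e.g. "myrole", or "*" for unreserved
    final double value;

    ScalarResource(String name, String role, double value) {
        this.name = name;
        this.role = role;
        this.value = value;
    }
}

// Hypothetical helper showing aggregation by resource name with a role filter.
final class ResourceAggregator {
    // Sums every resource called `name` whose role is in `roles`,
    // instead of assuming the name appears at most once in the offer.
    static double total(List<ScalarResource> offer, String name, Set<String> roles) {
        double sum = 0.0;
        for (ScalarResource r : offer) {
            if (r.name.equals(name) && roles.contains(r.role)) {
                sum += r.value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Scalar resources from the example offer above.
        List<ScalarResource> offer = Arrays.asList(
            new ScalarResource("cpus", "myrole", 2.0),
            new ScalarResource("mem", "myrole", 4096.0),
            new ScalarResource("cpus", "*", 3.0),
            new ScalarResource("mem", "*", 10766.0));
        Set<String> roles = new HashSet<>(Arrays.asList("myrole", "*"));

        System.out.println(total(offer, "cpus", roles)); // prints 5.0
        System.out.println(total(offer, "mem", roles));  // prints 14862.0
    }
}
```

Passing only "*" in the role set would yield 3.0 cpus, which is how a framework that does not want to consume reserved resources could restrict the totals.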

A suggested workaround is for the framework to use an alternate implementation of com.netflix.fenzo.VirtualMachineLease. See example here.


EronWright commented Dec 5, 2017

A note for framework developers: to use reserved resources effectively, build the resource objects in your TaskInfo from the specific resources contained in the offer. If you accept an offer like the one shown above, your TaskInfo should contain multiple cpus resources, one per role (e.g. [cpus(myrole):2.0; cpus(*):3.0; ...]). See examples here and here.
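One way to derive the per-role cpus entries for a task is to draw the requested amount from the offered shares in order. The sketch below assumes reserved shares are listed first and uses hypothetical RoleShare and TaskResourceSplitter types made up for illustration. In a real framework each resulting share would be turned into a Mesos resource on the TaskInfo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one role's share of a scalar resource such as cpus.
final class RoleShare {
    final String role;  // e.g. "myrole", or "*" for unreserved
    final double value;

    RoleShare(String role, double value) {
        this.role = role;
        this.value = value;
    }

    @Override
    public String toString() {
        return "cpus(" + role + "):" + value;
    }
}

// Hypothetical helper that splits a request across the offered shares.
final class TaskResourceSplitter {
    // Takes from each offered share in list order until `needed` is covered.
    // Throws if the offer cannot satisfy the request.
    static List<RoleShare> split(List<RoleShare> offered, double needed) {
        List<RoleShare> result = new ArrayList<>();
        double remaining = needed;
        for (RoleShare share : offered) {
            if (remaining <= 1e-9) {
                break;
            }
            double take = Math.min(share.value, remaining);
            if (take > 0) {
                result.add(new RoleShare(share.role, take));
                remaining -= take;
            }
        }
        if (remaining > 1e-9) {
            throw new IllegalArgumentException("offer does not cover the request");
        }
        return result;
    }

    public static void main(String[] args) {
        // The cpus from the offer in the issue: 2 reserved for myrole, 3 unreserved.
        List<RoleShare> offered = Arrays.asList(
            new RoleShare("myrole", 2.0),
            new RoleShare("*", 3.0));

        // A task needing 4 cpus uses both reserved cpus plus 2 unreserved ones.
        System.out.println(split(offered, 4.0)); // prints [cpus(myrole):2.0, cpus(*):2.0]
    }
}
```

Consuming reserved shares first is a design choice for this sketch, not something the issue prescribes. A framework may prefer a different order.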

@emaxerrno

oh @EronWright this is a really good idea! thanks
