Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix chef_user check for existing users #13

Open
patcon opened this issue Feb 28, 2017 · 7 comments
Open

Fix chef_user check for existing users #13

patcon opened this issue Feb 28, 2017 · 7 comments

Comments

@patcon
Copy link

patcon commented Feb 28, 2017

I'm getting an error when using chef_user resource after a second user is created. It then starts failing for the first one, saying it already exists. So it's trying to add the existing user. Digging in, i think I've found the issue

chef-users seems to get saved in run-state as things like this:

node.run_state['chef-users'] = "delivery\n"
node.run_state['chef-users'] = "\n"
node.run_state['chef-users'] = "delivery\nsomeother\n"
node.run_state['chef-users'] = "someother\ndelivery\n"

The current guard for that is:

not_if { node.run_state['chef-users'].index(/^#{new_resource.username}$/) }

This doesn't seem right, as the code produces the following results:

# comment shows return
node.run_state['chef-users'] = "delivery\n" # 0
node.run_state['chef-users'] = "\n" # nil
node.run_state['chef-users'] = "delivery\nsomeother\n" # 0
node.run_state['chef-users'] = "someother\ndelivery\n" # 10

I suspect this isn't what we want?

Created a fix commit here: https://github.com/patcon/chef_stack/commits/blendive/develop

Let me know if you recognize this as an issue, and happy to create a PR

cc: @mrjcleaver

@patcon
Copy link
Author

patcon commented Feb 28, 2017

Found the same issue in chef_org resource. Working on that too

@patcon
Copy link
Author

patcon commented Feb 28, 2017

Not sure how to reason about this bit of chef_org and chef_user:
https://github.com/ncerny/chef_stack/blob/master/resources/chef_org.rb#L33

(this was helpful context: https://github.com/chef/chef-rfc/blob/master/rfc056-load-and-converge.md#non-existence)

Seems like it perhaps just worked by happenstance before? Could you advise what it was trying to guard against, so I know how to treat it?

Thanks!

@ncerny
Copy link
Owner

ncerny commented Mar 2, 2017

Hey @patcon! Thanks for taking an interest in chef_stack!

I believe the code as written is correct:

chef (12.12.15)> node.run_state['chef-orgs'] = <<-EOF
chef"> delivery
chef"> chef
chef"> test123
chef"> EOF
 => "delivery\nchef\ntest123\n"
chef (12.12.15)>
chef > if node.run_state['chef-orgs'].index(/^delivery$/)
chef ?>   p 'exists'
chef ?> else
chef >   p 'does not exist'
chef ?> end
"exists"
 => "exists"
chef (12.12.15)> if node.run_state['chef-orgs'].index(/^chef$/)
chef ?>   p 'exists'
chef ?> else
chef >   p 'does not exist'
chef ?> end
"exists"
 => "exists"
chef (12.12.15)> if node.run_state['chef-orgs'].index(/^franklin$/)
chef ?>   p 'exists'
chef ?> else
chef >   p 'does not exist'
chef ?> end
"does not exist"
 => "does not exist"

Thus, the resource block to create the user/org will only execute if that index returns a falsey (such as 'false' or 'nil'). Any other return (such as '0' or '10') will be truey, and prevent the resource from executing.

The load_current_value is saying "If the user/org is not found in this list of users/orgs, then the current value (the object that describes the state of the system prior to convergence) does not exist - IE, the User/Org does not exist prior to converge. Chef uses this to determine if it needs to converge a resource or not.

Was there a specific problem you were seeing that made you look at this code?

@ncerny
Copy link
Owner

ncerny commented Mar 2, 2017

I see you said the issue was it tries to recreate the first user after you add a second. Do you have the output from a run that we can take a look at?

@patcon
Copy link
Author

patcon commented Mar 2, 2017

Hm. Odd. But yep, save a partial log for our internal ticket:
https://gist.github.com/patcon/2703540e68984252a4be13bcde1c620b

You're right, forgot that zero is truey in ruby-land. Perhaps that's a reason to use something more intuitive like match, but I digress :)

The issue was pretty persistent for us, although there was a small amount of inconsistency on first few runs, but then settled into consistent failure while reconverging. Making the chancg resolved it.

(Fwiw, our script knife bootstraps with our top-level run-list item on each run, in case that confuses the run_state logic somehow.)

If nothing sticks out, I'll have to revert away from our fork and try to figure out the real reason later. Any insight is appreciated!

@ncerny
Copy link
Owner

ncerny commented Mar 3, 2017

Good point. I don't see a reason not to use match!

The only time I've seen this behavior is when the server isn't responding to the user-list/org-list. Which we should handle more gracefully. Is it possible that the server is restarting while the recipe is running? Or is it possible that high load is causing the issue?

As for your change - I'm happy to merge it in; however, I don't like \b\b. I prefer ^$ because it's easier to read from a regex perspective. We should also change to using match instead of index. But if that fixes your problem, then there doesn't seem to be a reason not to do it. If you want to make a PR with those changes, we'll get a review going.

@patcon
Copy link
Author

patcon commented Mar 4, 2017

Hm. We were using a system with much lower specs than recommended, so perhaps that's slowed things down and brought up an edgecase. We've moved to manual install for the next little while in the interest of speedy demos, but I'll investigate when we come back to using chef-services cookbook :)

Thanks for being a great sounding board @ncerny!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants