This repository has been archived by the owner on Dec 13, 2022. It is now read-only.

[rabbitmq] v4.2.2 ERROR: execute[rabbitmqctl add_user guest] (/var/chef/cache/cookbooks/rabbitmq/providers/user.rb line 86) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '2' #934

Open
alextricity25 opened this issue Apr 14, 2014 · 21 comments

Comments

@alextricity25

I get this error after running chef-client on ha-controller1. It has happened to one other user as well; can anyone else reproduce it? chef-client tries to start the rabbitmq-server service, but it already seems to be running. The error can be bypassed by killing the rabbitmq processes and then re-running chef-client.

  * rabbitmq_user[add guest user] action add[2014-04-14T13:21:18-05:00] INFO: Processing rabbitmq_user[add guest user] action add (rabbitmq-openstack::server line 56)
[2014-04-14T13:21:19-05:00] INFO: Adding RabbitMQ user 'guest'.


Recipe: <Dynamically Defined Resource>
  * execute[rabbitmqctl add_user guest] action run[2014-04-14T13:21:19-05:00] INFO: Processing execute[rabbitmqctl add_user guest] action run (/var/chef/cache/cookbooks/rabbitmq/providers/user.rb line 86)

================================================================================
Error executing action `run` on resource 'execute[rabbitmqctl add_user guest]'
================================================================================


Mixlib::ShellOut::ShellCommandFailed
------------------------------------
Expected process to exit with [0], but received '2'
---- Begin output of rabbitmqctl add_user guest 'guest' ----
STDOUT: Creating user "guest" ...
STDERR: Error: unable to connect to node rabbit@controller1: nodedown

DIAGNOSTICS
===========

nodes in question: [rabbit@controller1]

hosts, their running nodes and ports:
- controller1: [{rabbit,58031},{rabbitmqctl14926,39798}]

current node details:
- node name: rabbitmqctl14926@controller1
- home dir: /var/lib/rabbitmq
- cookie hash: 2i6c2xRuc35IpsySTngwLg==
---- End output of rabbitmqctl add_user guest 'guest' ----
Ran rabbitmqctl add_user guest 'guest' returned 2


Resource Declaration:
---------------------
# In /var/chef/cache/cookbooks/rabbitmq/providers/user.rb

 86:     execute "rabbitmqctl add_user #{new_resource.user}" do
 87:       command cmdStr
 88:       Chef::Log.info "Adding RabbitMQ user '#{new_resource.user}'."
 89:       new_resource.updated_by_last_action(true)
 90:     end
 91:   end



Compiled Resource:
------------------
# Declared in /var/chef/cache/cookbooks/rabbitmq/providers/user.rb:86:in `block in class_from_file'

execute("rabbitmqctl add_user guest") do
  action "run"
  retries 0
  retry_delay 2
  guard_interpreter :default
  command "rabbitmqctl add_user guest 'guest'"
  backup 5
  returns 0
  cookbook_name "rabbitmq-openstack"
end



[2014-04-14T13:21:19-05:00] INFO: Running queued delayed notifications before re-raising exception

Running handlers:
[2014-04-14T13:21:19-05:00] ERROR: Running exception handlers
Running handlers complete

[2014-04-14T13:21:19-05:00] ERROR: Exception handlers complete
[2014-04-14T13:21:19-05:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
Chef Client failed. 6 resources updated in 34.689104079 seconds
[2014-04-14T13:21:19-05:00] ERROR: execute[rabbitmqctl add_user guest] (/var/chef/cache/cookbooks/rabbitmq/providers/user.rb line 86) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '2'
---- Begin output of rabbitmqctl add_user guest 'guest' ----
STDOUT: Creating user "guest" ...
STDERR: Error: unable to connect to node rabbit@controller1: nodedown

DIAGNOSTICS
===========

nodes in question: [rabbit@controller1]

hosts, their running nodes and ports:
- controller1: [{rabbit,58031},{rabbitmqctl14926,39798}]

current node details:
- node name: rabbitmqctl14926@controller1
- home dir: /var/lib/rabbitmq
- cookie hash: 2i6c2xRuc35IpsySTngwLg==
---- End output of rabbitmqctl add_user guest 'guest' ----
Ran rabbitmqctl add_user guest 'guest' returned 2
[2014-04-14T13:21:19-05:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
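
A note on reading the DIAGNOSTICS block: the "hosts, their running nodes and ports" line lists a `rabbit` node on controller1, so something is registered under that name even though rabbitmqctl reports nodedown. That suggests the broker process exists but cannot be reached under that node name. Below is a minimal Ruby sketch for pulling that line apart. `registered_nodes` and `broker_registered?` are hypothetical helper names, not part of rabbitmqctl or the cookbook, and the regex assumes the output format shown in this log.

```ruby
# Hypothetical helpers: parse the "hosts, their running nodes and ports"
# lines of rabbitmqctl's DIAGNOSTICS output (format assumed from the log above).
def registered_nodes(diagnostics)
  nodes = {}
  diagnostics.each_line do |line|
    # Matches e.g. "- controller1: [{rabbit,58031},{rabbitmqctl14926,39798}]"
    m = line.match(/^\s*-\s+(\S+):\s+\[(.*)\]\s*$/)
    next unless m
    pairs = m[2].scan(/\{([^,{}]+),(\d+)\}/)
    nodes[m[1]] = pairs.map { |name, port| [name, Integer(port)] }.to_h
  end
  nodes
end

# True when a node called "rabbit" is listed for the given host.
def broker_registered?(diagnostics, host)
  registered_nodes(diagnostics).fetch(host, {}).key?("rabbit")
end

sample = <<~OUT
  nodes in question: [rabbit@controller1]

  hosts, their running nodes and ports:
  - controller1: [{rabbit,58031},{rabbitmqctl14926,39798}]

  current node details:
  - node name: rabbitmqctl14926@controller1
OUT

p registered_nodes(sample)
puts broker_registered?(sample, "controller1")
```

If the broker shows up here but `rabbitmqctl` still says nodedown, hostname resolution and the Erlang cookie are the usual things to look at next.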
@breu
Contributor

breu commented Apr 14, 2014

What OS is this for? Was the rabbitmq-server already installed and running on this box before the chef-client run (i.e. you had chef-server installed on this same node before you ran chef-client)?

@alextricity25
Author

OS: Ubuntu 12.04.4
Chef-client was run on a clean box, meaning rabbit was not installed at any moment prior to running chef-client.

@breu
Contributor

breu commented Apr 14, 2014

cool @ELEXTRO. Was there an error before the guest user creation around installing or starting rabbitmq-server for the first time? Is there a stale rabbitmq-server process running? Can you gist up the log files in /var/log/rabbitmq?
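
For the stale-process question, a rough Ruby sketch that filters `ps` output for processes owned by the broker's service account. `rabbit_processes` is a hypothetical helper, the `rabbitmq` user name is an assumption based on the Ubuntu package, and the sample rows are made up for illustration.

```ruby
# Hypothetical helper: list processes owned by the broker's service account
# from `ps -eo pid,user,args` output (the first line is the column header).
def rabbit_processes(ps_output, user = "rabbitmq")
  ps_output.each_line.drop(1).map(&:split)
           .select { |fields| fields[1] == user }
           .map { |fields| [Integer(fields[0]), fields.drop(2).join(" ")] }
end

# Made-up sample rows, only to show the expected shape of the input.
sample = <<~PS
  PID USER     COMMAND
  1 root     /sbin/init
  1412 rabbitmq epmd -daemon
  1450 rabbitmq beam.smp -sname rabbit
PS

rabbit_processes(sample).each { |pid, cmd| puts "#{pid}\t#{cmd}" }
```

On a node you would feed it the real output of `ps -eo pid,user,args`. Anything listed before the first chef-client run on a clean box would count as stale.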

@alextricity25
Author

Definitely.
https://gist.github.com/elextro/10678883

It doesn't look like rabbit had any errors when it was first starting up. I also pasted the relevant chef output in the gist.

@breu
Contributor

breu commented Apr 14, 2014

@ELEXTRO There should be some more log files in /var/log/rabbitmq like rabbit@*.log. Can you gist that one up for me?

@alextricity25
Author

Sorry about this, the logs have gotten kind of long :/ Let me know if you want me to re-create.

https://gist.github.com/elextro/10737171

@Itxaka

Itxaka commented Apr 15, 2014

Just wanted to point out that I experienced this error as well with 4.2.2, same setup (2 controllers in HA, Ubuntu Server 12.04.4 fresh install).

Just in case, I ended up going back to 4.2.1, which didn't have the error.

@breu
Contributor

breu commented Apr 15, 2014

@ELEXTRO and @Itxaka - So, we haven't run into any of these problems with our gate jobs nor QE testing. Can you help me reproduce the circumstances that led up to the problem? I'd like to understand how you deployed, where you deployed and what you deployed on. As much detail as you can provide would assist us in tracking this down.

  • Is this physical hardware?
  • Where is the chef-server at? What version of chef server and client?
  • What operating system are you deploying on?
  • What version of rabbitmq-server do you have installed?
  • What version, branch or tag of the cookbooks are you deploying? Please provide the SHA if possible.
  • How many nodes are you deploying on?
  • What roles are you assigning to each node?
  • In what order did you run the initial chef-client run on each node after applying the roles?
  • Please provide the environment that you are using (redact any personal or identifiable information)

Thanks! I hope we can figure out what is causing this.

@alextricity25
Author

Sure thing @rackerjoe.

  • I'm running both controllers, the compute hosts, and chef-server on physical Dell R710s.
  • The chef-server lives alone on a Dell R710. Version 11.6.0. I used the support tool script to configure it. The chef-client is 11.12.2, and I'm using "curl -L https://www.opscode.com/chef/install.sh | sudo bash" to install it.
  • All of the boxes are running a clean version of Ubuntu 12.04.4 x86_64 with kernel version 3.11.0-18-generic
  • It looks like chef installs this rabbit version:
    • install version 3.1.5-1 of package /var/chef/cache/rabbitmq-server_3.1.5-1_all.deb
  • I'm using: git checkout v4.2.2. Then the submodule commands listed in the docs. Submodule init, sync, update.
    root@chef-server:~/chef-cookbooks# git rev-parse HEAD:
    bd8ca2a
  • I have 2 controllers, and 3 computes. This ERROR appears after running chef-client on ha-controller1.
  • Here are the roles:
    controller1: role[ha-controller1], role[single-network-node]
    controller2: role[ha-controller2]
    compute1,2,3: role[single-compute-node]
  • I ran chef-client in this order:
    ha-controller1 -- ERROR appears here.
  • The environment:
    https://gist.github.com/elextro/10747733

I just ran chef-client on a clean box to re-create, and it happened again. Here are the rabbit logs (fresh out of the oven).
https://gist.github.com/elextro/10748666

@rcbjenkins
Contributor

=ERROR REPORT==== 15-Apr-2014::12:07:58 ===
Mnesia(rabbit@controller1): ** ERROR ** (core dumped to file: "/var/lib/rabbitmq/MnesiaCore.rabbit@controller1_1397_581678_444704")
** FATAL ** {error,{"Cannot rename disk_log file",latest_log,
"/var/lib/rabbitmq/mnesia/rabbit@controller1/PREVIOUS.LOG",
{log_header,trans_log,"4.3","4.5",rabbit@controller1,
{1397,581678,433457}},
enoent}}

^ looks like you’ve got an issue with your disk/logfiles/rabbit dir structure, which is crashing the log handler, which is crashing the rest.
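
A quick way to sanity-check the directory named in that Mnesia error, as a Ruby sketch. `mnesia_dir_problems` is a hypothetical helper, the default path is the one from the log above, and the writability check is only meaningful when run as the user the broker runs as.

```ruby
# Hypothetical helper: report obvious problems with a Mnesia directory.
# Returns an array of human-readable problem strings (empty when none found).
def mnesia_dir_problems(dir)
  return ["#{dir} does not exist or is not a directory"] unless File.directory?(dir)

  problems = []
  # Only meaningful when run as the account the broker runs under.
  problems << "#{dir} is not writable by the current user" unless File.writable?(dir)
  problems
end

# Path taken from the error report above; pass another one as the first argument.
dir = ARGV.first || "/var/lib/rabbitmq/mnesia/rabbit@controller1"
problems = mnesia_dir_problems(dir)
puts(problems.empty? ? "no obvious problems with #{dir}" : problems)
```

This only covers the simplest causes of an `enoent` on rename (missing directory, wrong permissions); it does not inspect the log files themselves.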


@Apsu
Contributor

Apsu commented Apr 15, 2014

^ should mention that was my comment. Replying to the email doesn't credit me, apparently.

@bunchc

bunchc commented Apr 16, 2014

Oddly enough, I can make this happen in one environment but not another. Both are 12.04.4 and running in virtualbox. Now to see what I did differently.

@cloudnull
Contributor

@bunchc

Some questions:

  • Assuming that you are talking about your chef environment, can you gist up your environments so that we can review the differences?
  • Additionally, can you provide the versions of chef and chef-server that you have installed on your nodes?
  • How were chef and chef-server installed on your nodes?
  • How many nodes are you using for the deployment and what are they being used for?
  • Are you sharing rabbit between chef and openstack, i.e. are chef-server and controller1 installed on the same node?

I, like several others, am attempting to narrow down this issue. Any information you can share would be great.

@bunchc

bunchc commented Apr 16, 2014

This is the working one:
https://gist.github.com/anonymous/f79169742afa515d2ef7
This one does not:
https://gist.github.com/anonymous/727781f4cbf62d6f1b15

Chef server is installed per the rpcs scripts, chef client by using the curl | sudo bash method.
Working node list:
controller1 not created (virtualbox)
controller2 not created (virtualbox)
compute1 not created (virtualbox)
cinder1 not created (virtualbox)
chef not created (virtualbox)

Not working list:
rpcs-controller-01 not created (virtualbox)
rpcs-controller-02 not created (virtualbox)
rpcs-compute-01 not created (virtualbox)
rpcs-compute-02 not created (virtualbox)
rpcs-chef-01 not created (virtualbox)

re: rabbit, it shouldn't be being shared.

Other notes: Same vagrant install, same vbox install, same 12.04.4 base box.

@bunchc

bunchc commented Apr 16, 2014

RE: Environments, I had meant vagrant environments (one using vagrant-hostmanager, the other not), tis why I think the problem is in something I'm doing. Will know more soon.

@jedipunkz

I had the same problem and resolved it.

In my case, I changed /etc/hosts on the controller node. The controller node has several network interfaces, and rabbitmq-server needs the node's hostname in /etc/hosts to map to the address on the network that has the gateway.

% sudo ${EDITOR} /etc/hosts
xxx.xxx.xxx.xxx    testnode

xxx.xxx.xxx.xxx must be an address on the public network (the one with the gateway), not on the private network. So this is a rabbitmq problem.
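
To check an /etc/hosts file against this, here is a minimal Ruby sketch. `parse_hosts` and `maps_to?` are hypothetical helpers, and `192.0.2.10` is a documentation-range placeholder standing in for the address on the gateway network; substitute your own.

```ruby
# Hypothetical helper: parse hosts-file text into
# { hostname => [addresses, in file order] }.
def parse_hosts(text)
  table = Hash.new { |hash, key| hash[key] = [] }
  text.each_line do |line|
    addr, *names = line.sub(/#.*/, "").split
    next if addr.nil?
    names.each { |name| table[name] << addr }
  end
  table
end

# True when the first entry for `hostname` is `expected_addr`.
def maps_to?(text, hostname, expected_addr)
  parse_hosts(text)[hostname].first == expected_addr
end

# Placeholder content; on a node, use File.read("/etc/hosts") instead.
sample = <<~HOSTS
  127.0.0.1   localhost
  127.0.1.1   testnode      # loopback alias
  192.0.2.10  testnode      # assumed address on the gateway network
HOSTS

p parse_hosts(sample)["testnode"]
puts maps_to?(sample, "testnode", "192.0.2.10")
```

The check compares only the first entry because that is typically the one a lookup through the hosts file returns; whether that ordering matters on a given system is an assumption to verify.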

@bunchc

bunchc commented Apr 18, 2014

Ok, so, the difference in how I was building things seems to have made all the difference in the world. When things were /not/ functioning, I had installed chef client using curl | sudo bash on the nodes. When they were working, I had used knife bootstrap from the chef server.

@jacobsevart

I am experiencing the same issue. I'm on Ubuntu 12.04 (since it's allegedly stable) under Vagrant (in preparation for a rollout to metal). It's the precise64 image hosted by Vagrant itself.

I have chef-client v. 11.12.8-1 installed via knife bootstrap on my two Openstack nodes. The one which gets role[single-controller] or role[ha-controller1] (I've tried it both ways) errors out with the message above. I have verified this around 15 times; it is consistent from completely clean state.

I've set all the nodes' hostnames via Vagrant and pointed them at each other through /etc/hosts files. Getting them actual FQDNs is never going to happen (we'd like this to be behind the firewall in production as well), but maybe it's related to the hostname setting in some way?

jedipunkz can you share more details of your solution? I can't quite tell what you're saying there.

@jacobsevart

UPDATE: rebooting the node and then re-running chef-client seems to do the trick.

@claco
Contributor

claco commented Jul 12, 2014

@jacobsevart "Vagrant". That's the problem. Specifically, depending on the image, who made it, and the version of Vagrant, it will update the hosts file, sometimes not in a sane manner, which causes name resolution to fail, which in turn causes the rabbit install to fail.
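
To see whether the host part of the node name resolves through a given hosts file, here is a small Ruby sketch using the standard library's `Resolv::Hosts`. `node_host` and `hosts_file_addresses` are hypothetical helper names, and the temporary file only stands in for /etc/hosts.

```ruby
require "resolv"
require "tempfile"

# Host part of an Erlang node name such as "rabbit@controller1".
def node_host(node_name)
  node_name.split("@", 2).last
end

# Addresses that the given hosts file lists for the node's host part.
# An empty array means the name does not resolve through that file.
def hosts_file_addresses(node_name, hosts_path = "/etc/hosts")
  Resolv::Hosts.new(hosts_path).getaddresses(node_host(node_name)).map(&:to_s)
end

# Demonstration against a throwaway file with placeholder addresses.
Tempfile.create("hosts") do |file|
  file.write("127.0.0.1 localhost\n192.0.2.10 controller1\n")
  file.flush
  p hosts_file_addresses("rabbit@controller1", file.path)
  p hosts_file_addresses("rabbit@controller2", file.path)
end
```

On a real node, call `hosts_file_addresses("rabbit@controller1")` with the default path. This looks only at the hosts file, not DNS.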

@siddheshwar-more

@jacobsevart @claco @jedipunkz @bunchc

I am also facing the same problem, related to RabbitMQ.

chef-client logs-

STDOUT: Creating user "guest" ...
STDERR: Error: unable to connect to node 'rabbit@clo-test-23': nodedown

DIAGNOSTICS

nodes in question: ['rabbit@clo-test-23']

hosts, their running nodes and ports:

  • clo-test-23: [{rabbit,47919},{rabbitmqctl5931,39926}]

current node details:

  • node name: 'rabbitmqctl5931@clo-test-23'
  • home dir: /var/lib/rabbitmq
  • cookie hash: 2i6c2xRuc35IpsySTngwLg==
    ---- End output of rabbitmqctl add_user guest 'guest' ----
    Ran rabbitmqctl add_user guest 'guest' returned 2
    [2014-11-22T12:32:55+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

Steps I followed-

I created one VM on AWS EC2, where I installed chef-client 11.16.4 from https://downloads/getchef.com

  1. download chef-client
  2. Install chef-client $ dpkg -i chef-client_.deb
  3. git clone --recursive https://github.com/rcbops/chef-cookbooks
  4. edit environments/example.json
  5. knife cookbook upload -o cookbooks --all -s 'http://localhost:4000'
  6. knife role from file roles/_.rb -s http://localhost:4000
  7. knife environment from file example.json -s 'http://localhost:4000'
  8. chef-client -E example_environment -r "role[allinone]" -c ~/chef-repo/.chef/knife.rb

I'm using chef-zero instead of chef server

  1. open new terminal session
  2. export PATH=$PATH:/opt/chef/embedded/bin/:/opt/chef/bin/
  3. irb
  4. require 'chef_zero/server'
     server = ChefZero::Server.new(host: "127.0.0.1", port: 4000)
     server.start

Can you please help me resolve this issue?

Thanks!
