[Rdo-list] rdo-manager failures: instack-install-undercloud failing for non-obvious reasons

Jiří Stránský jistr at redhat.com
Wed Apr 22 15:08:32 UTC 2015


On 22.4.2015 17:05, Jiří Stránský wrote:
> On 21.4.2015 23:00, James Slagle wrote:
>> On Tue, Apr 21, 2015 at 04:37:26PM -0400, Lars Kellogg-Stedman wrote:
>>> Running "instack-install-undercloud" is failing for me:
>>>
>>>     + echo 'puppet apply exited with exit code 6'
>>>     puppet apply exited with exit code 6
>>>     + '[' 6 '!=' 2 -a 6 '!=' 0 ']'
>>>     + exit 6
>>>     [2015-04-21 20:13:20,426] (os-refresh-config) [ERROR] during configure
>>>     phase. [Command '['dib-run-parts',
>>>     '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit
>>>     status 6]
>>>
>>> Unfortunately, the failure doesn't provide much in the way of useful
>>> information.  If I scroll up several pages, I find:
>>>
>>>     Notice: /Stage[main]/Rabbitmq::Install::Rabbitmqadmin/File[/usr/local/bin/rabbitmqadmin]/ensure: defined content as '{md5}63d7331e825c865a97b7a8d1299841ff'
>>>     Error: /Stage[main]/Main/Rabbitmq_user[neutron]: Could not evaluate: Command is still failing after 180 seconds expired!
>>>     Error: /Stage[main]/Main/Rabbitmq_user[heat]: Could not evaluate: Command is still failing after 180 seconds expired!
>>>     Error: /Stage[main]/Main/Rabbitmq_user[ceilometer]: Could not evaluate: Command is still failing after 180 seconds expired!
>>>     Error: /Stage[main]/Main/Rabbitmq_user[nova]: Could not evaluate: Command is still failing after 180 seconds expired!
>>>     Error: /Stage[main]/Main/Rabbitmq_vhost[/]: Could not evaluate: Command is still failing after 180 seconds expired!
>>>
>>> But again, that doesn't really tell me what is failing either (a
>>> command is still failing? Which command?).
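
On reading the exit code itself: the `'[' 6 '!=' 2 -a 6 '!=' 0 ']'` test in the trace suggests the script runs puppet with `--detailed-exitcodes`, although the thread does not confirm this. Assuming it does, a small helper (the function name is hypothetical) can translate the code into words:

```shell
# Sketch only: assumes puppet apply was run with --detailed-exitcodes.
# describe_puppet_exit is an illustrative name, not part of instack-undercloud.
describe_puppet_exit() {
    case "$1" in
        0) echo "no changes, no failures" ;;
        2) echo "changes applied, no failures" ;;
        4) echo "failures during the transaction" ;;
        6) echo "changes applied, but some resources failed" ;;
        *) echo "other error (exit code $1)" ;;
    esac
}
```

Under that assumption, exit code 6 means some resources were applied and others (here the Rabbitmq_user and Rabbitmq_vhost ones) failed.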
>>
>> Unfortunately we're pretty much at the mercy of puppet and all of the external
>> puppet modules here in terms of its helpful output, and the point at which it
>> chooses to stop applying after an error is encountered. Perhaps some people
>> more familiar with puppet might chime in here on how to improve this.
>
> Changing "puppet apply" to "puppet apply -d" here [1] should give you
> more output, including the commands that are run at each step.
> (I hope I've found the right spot; I'm not very familiar with
> instack-undercloud.) Perhaps an env variable could be added to switch on
> the debug output? (It shouldn't be on by default, I guess, because Puppet
> then prints a lot of output, including potentially sensitive info.)
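
A rough sketch of what such a switch could look like. PUPPET_DEBUG and puppet_debug_flag are made-up names for illustration, not existing instack-undercloud settings, and the manifest path in the usage comment is a placeholder:

```shell
# Sketch only: PUPPET_DEBUG is a hypothetical variable name.
# Prints "-d" when debugging is requested and nothing otherwise,
# so debug output stays off by default.
puppet_debug_flag() {
    if [ "${PUPPET_DEBUG:-0}" = "1" ]; then
        echo "-d"
    fi
}

# Possible use inside the configure.d script (MANIFEST is a placeholder):
#   puppet apply $(puppet_debug_flag) "$MANIFEST"
```

Running the installer with PUPPET_DEBUG=1 in the environment would then turn the debug output on for that run only.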
>
> Regarding the problem as a whole, the hostname/domain issue you outlined
> in another e-mail might be a good clue. It certainly wouldn't be the
> first time I've seen problems with Puppet caused by FQDN settings. In
> general it's a good idea to verify that `facter fqdn` prints the same
> thing as `hostname -f` before running Puppet.

Actually, now I recall that staypuft-installer has a check for this ^^
built in, and if the two values don't match, it refuses to run. Maybe we
should do the same with instack-undercloud.
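
A minimal sketch of such a pre-flight check, assuming `facter` and `hostname` are on the PATH. The function names are illustrative only; this is not how staypuft-installer actually implements it:

```shell
# Sketch only: fqdn_matches and check_fqdn are hypothetical helper names.

# Succeeds only if both values are non-empty and identical.
fqdn_matches() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Compares `facter fqdn` with `hostname -f` and reports a mismatch on stderr.
check_fqdn() {
    facter_fqdn=$(facter fqdn 2>/dev/null)
    host_fqdn=$(hostname -f 2>/dev/null)
    if ! fqdn_matches "$facter_fqdn" "$host_fqdn"; then
        echo "FQDN mismatch: facter fqdn='$facter_fqdn', hostname -f='$host_fqdn'" >&2
        return 1
    fi
}

# Possible use at the top of the install script:
#   check_fqdn || exit 1
```

An empty value on either side is treated as a mismatch, so a host with no resolvable FQDN is also rejected.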

J.

>
> Cheers
>
> J.
>
> [1]
> https://github.com/rdo-management/instack-undercloud/blob/6f75c8dc3c37d489763b7310a7b57d00e1e70da2/elements/puppet-stack-config/os-refresh-config/configure.d/50-puppet-stack-config#L7
>
>>
>>>
>>> It looks like rabbitmq is having some problems:
>>>
>>>     [stack@localhost ~]$ sudo rabbitmqctl status
>>>     Status of node rabbit@localhost ...
>>>     Error: unable to connect to node rabbit@localhost: nodedown
>>>
>>>     DIAGNOSTICS
>>>     ===========
>>>
>>>     attempted to contact: [rabbit@localhost]
>>>
>>>     rabbit@localhost:
>>>     * connected to epmd (port 4369) on localhost
>>>     * epmd reports node 'rabbit' running on port 25672
>>>     * TCP connection succeeded but Erlang distribution failed
>>>     * suggestion: hostname mismatch?
>>>     * suggestion: is the cookie set correctly?
>>>
>>>     current node details:
>>>     - node name: rabbitmqctl20640@stack
>>>     - home dir: /var/lib/rabbitmq
>>>     - cookie hash: 4DA3U2yua3rw7wYLr+PbiQ==
>>>
>>> If I manually stop and then start rabbitmq:
>>>
>>>       sudo systemctl stop rabbitmq-server
>>>       sudo systemctl start rabbitmq-server
>>>
>>> It seems to work:
>>>
>>>     # rabbitmqctl status
>>>     Status of node rabbit@stack ...
>>>     [{pid,20946},
>>>      {running_applications,
>>>          [{rabbitmq_management,"RabbitMQ Management Console","3.3.5"},
>>>     ...
>>>
>>> After manually starting rabbit and re-running
>>> instack-install-undercloud, the process is able to successfully create
>>> the rabbitmq_user resources and completes successfully.
>>
>> Are you on RHEL 7.1 or CentOS 7? I'll try to reproduce locally and see if I can
>> get to the bottom of it.
>>
>>>
>>> --
>>> Lars Kellogg-Stedman <lars at redhat.com> | larsks @ {freenode,twitter,github}
>>> Cloud Engineering / OpenStack          | http://blog.oddbit.com/
>>
>>
>>
>>> _______________________________________________
>>> Rdo-list mailing list
>>> Rdo-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>
>>> To unsubscribe: rdo-list-unsubscribe at redhat.com
>>
>> --
>> -- James Slagle
>> --