[Rdo-list] rdo-manager failures: instack-install-undercloud failing for non-obvious reasons
Jiří Stránský
jistr at redhat.com
Wed Apr 22 15:08:32 UTC 2015
On 22.4.2015 17:05, Jiří Stránský wrote:
> On 21.4.2015 23:00, James Slagle wrote:
>> On Tue, Apr 21, 2015 at 04:37:26PM -0400, Lars Kellogg-Stedman wrote:
>>> Running "instack-install-undercloud" is failing for me:
>>>
>>> + echo 'puppet apply exited with exit code 6'
>>> puppet apply exited with exit code 6
>>> + '[' 6 '!=' 2 -a 6 '!=' 0 ']'
>>> + exit 6
>>> [2015-04-21 20:13:20,426] (os-refresh-config) [ERROR] during configure
>>> phase. [Command '['dib-run-parts',
>>> '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit
>>> status 6]
>>>
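For reference, exit code 6 is meaningful by itself if the script runs "puppet apply" with --detailed-exitcodes. The 0/2 check in the trace above suggests it does, but that is an assumption on my part. In that mode 2 means changes were applied, 4 means there were failures, and 6 means both. A minimal sketch of a helper that spells this out (the function name is hypothetical, nothing like it exists in instack-undercloud):

```shell
#!/bin/sh
# Hypothetical helper: translate a "puppet apply --detailed-exitcodes"
# exit status into a human-readable message.
describe_puppet_exit() {
    case "$1" in
        0) echo "no changes, no failures" ;;
        2) echo "changes applied, no failures" ;;
        4) echo "failures during the run" ;;
        6) echo "changes applied and failures during the run" ;;
        *) echo "run failed or unknown exit code" ;;
    esac
}
```

Something like `echo "puppet apply: $(describe_puppet_exit $rc)"` next to the existing echo would at least say that resources failed, rather than only printing the number.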
>>> Unfortunately, the failure doesn't provide much in the way of useful
>>> information. If I scroll up several pages, I find:
>>>
>>> Notice: /Stage[main]/Rabbitmq::Install::Rabbitmqadmin/File[/usr/local/bin/rabbitmqadmin]/ensure: defined content as '{md5}63d7331e825c865a97b7a8d1299841ff'
>>> Error: /Stage[main]/Main/Rabbitmq_user[neutron]: Could not evaluate: Command is still failing after 180 seconds expired!
>>> Error: /Stage[main]/Main/Rabbitmq_user[heat]: Could not evaluate: Command is still failing after 180 seconds expired!
>>> Error: /Stage[main]/Main/Rabbitmq_user[ceilometer]: Could not evaluate: Command is still failing after 180 seconds expired!
>>> Error: /Stage[main]/Main/Rabbitmq_user[nova]: Could not evaluate: Command is still failing after 180 seconds expired!
>>> Error: /Stage[main]/Main/Rabbitmq_vhost[/]: Could not evaluate: Command is still failing after 180 seconds expired!
>>>
>>> But again, that doesn't really tell me what is failing either (a
>>> command is still failing? Which command?).
>>
>> Unfortunately we're pretty much at the mercy of puppet and all of the external
>> puppet modules here in terms of its helpful output, and the point at which it
>> chooses to stop applying after an error is encountered. Perhaps some people
>> more familiar with puppet might chime in here on how to improve this.
>
> Changing "puppet apply" to "puppet apply -d" here [1] should give you
> more output including the commands which are being run at each step.
> (Hoping i've found the right spot, i'm not very familiar with
> instack-undercloud.) Perhaps an env variable could be added to switch on
> the debug output? (It shouldn't be on by default i guess because Puppet
> prints a lot of stuff then, including potentially sensitive info.)
>
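A rough sketch of what such a switch could look like. PUPPET_DEBUG is a made-up variable name, not an existing instack-undercloud option, and the manifest path in the comment is a placeholder:

```shell
#!/bin/sh
# Hypothetical toggle: print "-d" (puppet's debug flag) only when
# PUPPET_DEBUG is set to an enabling value; print nothing otherwise.
puppet_debug_flag() {
    case "${PUPPET_DEBUG:-}" in
        1|true|yes) echo "-d" ;;
        *) echo "" ;;
    esac
}

# Example call site (placeholder path, left as a comment on purpose):
#   puppet apply $(puppet_debug_flag) /path/to/manifest.pp
```

With the variable unset the command line stays as it is today, so debug output (and whatever sensitive info it contains) only shows up when someone asks for it explicitly.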
> Regarding the problem as a whole, the hostname/domain issue you outlined
> in another e-mail might be a good clue. It certainly wouldn't be the
> first time i've seen problems with Puppet caused by FQDN settings. In
> general it's a good idea to verify that `facter fqdn` prints the same
> thing as `hostname -f` before running Puppet.
Actually, now I recall that staypuft-installer has this check built in:
if the two values don't match, it refuses to run. Maybe we should do the
same in instack-undercloud.
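A minimal sketch of such a pre-flight check, assuming facter and hostname are both on the PATH. The function names are hypothetical, and this is not how staypuft-installer actually implements it:

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of instack-undercloud.

# Return 0 only if both values are non-empty and identical.
fqdn_values_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Compare what Facter and the OS report; complain and fail on mismatch.
check_fqdn() {
    facter_fqdn="$(facter fqdn 2>/dev/null)"
    hostname_fqdn="$(hostname -f 2>/dev/null)"
    if ! fqdn_values_match "$facter_fqdn" "$hostname_fqdn"; then
        echo "FQDN mismatch: facter fqdn='$facter_fqdn'," \
             "hostname -f='$hostname_fqdn'" >&2
        return 1
    fi
}
```

Calling `check_fqdn || exit 1` before Puppet is invoked would stop the install early with a readable message instead of a timeout three minutes into the run.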
J.
>
> Cheers
>
> J.
>
> [1]
> https://github.com/rdo-management/instack-undercloud/blob/6f75c8dc3c37d489763b7310a7b57d00e1e70da2/elements/puppet-stack-config/os-refresh-config/configure.d/50-puppet-stack-config#L7
>
>>
>>>
>>> It looks like rabbitmq is having some problems:
>>>
>>> [stack@localhost ~]$ sudo rabbitmqctl status
>>> Status of node rabbit@localhost ...
>>> Error: unable to connect to node rabbit@localhost: nodedown
>>>
>>> DIAGNOSTICS
>>> ===========
>>>
>>> attempted to contact: [rabbit@localhost]
>>>
>>> rabbit@localhost:
>>> * connected to epmd (port 4369) on localhost
>>> * epmd reports node 'rabbit' running on port 25672
>>> * TCP connection succeeded but Erlang distribution failed
>>> * suggestion: hostname mismatch?
>>> * suggestion: is the cookie set correctly?
>>>
>>> current node details:
>>> - node name: rabbitmqctl20640@stack
>>> - home dir: /var/lib/rabbitmq
>>> - cookie hash: 4DA3U2yua3rw7wYLr+PbiQ==
>>>
>>> If I manually stop and then start rabbitmq:
>>>
>>> sudo systemctl stop rabbitmq-server
>>> sudo systemctl start rabbitmq-server
>>>
>>> It seems to work:
>>>
>>> # rabbitmqctl status
>>> Status of node rabbit@stack ...
>>> [{pid,20946},
>>> {running_applications,
>>> [{rabbitmq_management,"RabbitMQ Management Console","3.3.5"},
>>> ...
>>>
>>> After manually starting rabbit and re-running
>>> instack-install-undercloud, the process is able to successfully create
>>> the rabbitmq_user resources and completes successfully.
>>
>> Are you on RHEL 7.1 or CentOS 7? I'll try to reproduce locally and see if I can
>> get to the bottom of it.
>>
>>>
>>> --
>>> Lars Kellogg-Stedman <lars at redhat.com> | larsks @ {freenode,twitter,github}
>>> Cloud Engineering / OpenStack | http://blog.oddbit.com/
>>
>>
>>
>>> _______________________________________________
>>> Rdo-list mailing list
>>> Rdo-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>
>>> To unsubscribe: rdo-list-unsubscribe at redhat.com
>>
>> --
>> -- James Slagle
>> --