[rdo-users] [TripleO] [RDO] overcloud deploy failed at ansible plan setup

Ruslanas Gžibovskis ruslanas at lpic.lt
Thu Oct 3 10:22:50 UTC 2019


Digging into the ansible playbook and config, I found that it is trying to
connect as tripleo-admin, but the login is not accepted: the SSH key held by
mistral does not match the one authorized for the tripleo-admin user.

()[mistral at undercloud104 /]$ less /var/lib/mistral/C104/ansible.cfg
()[mistral at undercloud104 /]$ ssh -l tripleo-admin -i /var/lib/mistral/.ssh/tripleo-admin-rsa localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:0/Axj7n0cQU9eKCFisOpI2HeaOZeI05RhNa/qT/2/2A.
ECDSA key fingerprint is MD5:fa:e4:df:f5:8b:63:41:ae:c3:a3:2d:7d:55:2d:7f:65.
Are you sure you want to continue connecting (yes/no)? yes
Failed to add the host to the list of known hosts (/var/lib/mistral/.ssh/known_hosts).
tripleo-admin at localhost's password:
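
The fallback to a password prompt suggests the key itself was rejected. To see why, the same login can be retried with key-only authentication and verbose output. The following is only a sketch: the function name try_key_only_login is made up for illustration, most options mirror the ones in the Ansible log quoted further down, and IdentitiesOnly=yes is an addition (not in that log) so that ssh offers only the given key.

```shell
#!/bin/sh
# try_key_only_login KEY USER HOST
# Hypothetical helper: attempts a login that may use ONLY the given private
# key, so a rejected key fails right away instead of asking for a password.
try_key_only_login() {
    ssh -v \
        -o IdentityFile="$1" \
        -o IdentitiesOnly=yes \
        -o PreferredAuthentications=publickey \
        -o PasswordAuthentication=no \
        -o KbdInteractiveAuthentication=no \
        -o StrictHostKeyChecking=no \
        -o UserKnownHostsFile=/dev/null \
        -l "$2" "$3" true
}

# Example, run from inside the mistral container (paths taken from the thread):
# try_key_only_login /var/lib/mistral/.ssh/tripleo-admin-rsa tripleo-admin localhost
```

With -v, ssh prints its debug lines on stderr, including which key it offers and how the server answers, which should show whether the key is refused.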

I have generated the public part from the private key that is being used, and
I am redeploying the cloud to see if it works.
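
The mismatch can also be checked directly, without a redeploy. A minimal sketch follows, assuming the tripleo-admin user keeps its keys in /home/tripleo-admin/.ssh/authorized_keys (that path is an assumption, not confirmed in this thread); key_is_authorized is a hypothetical helper name. The public part of a private key can be derived with ssh-keygen -y -f.

```shell
#!/bin/sh
# key_is_authorized PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Hypothetical helper: returns 0 if the base64 key blob from PUBKEY_FILE
# appears as a whole field in AUTHORIZED_KEYS_FILE, 1 if it does not,
# and 2 if PUBKEY_FILE contains no key blob.
key_is_authorized() {
    # In a public key line ("ssh-rsa AAAA... comment") the blob is field 2.
    blob=$(awk '{ print $2; exit }' "$1")
    [ -n "$blob" ] || return 2
    # Compare field by field, because authorized_keys lines may start with options.
    awk -v b="$blob" '
        { for (i = 1; i <= NF; i++) if ($i == b) found = 1 }
        END { exit !found }
    ' "$2"
}

# Example (the authorized_keys path is an assumption):
# ssh-keygen -y -f /var/lib/mistral/.ssh/tripleo-admin-rsa > /tmp/tripleo-admin-rsa.pub
# key_is_authorized /tmp/tripleo-admin-rsa.pub /home/tripleo-admin/.ssh/authorized_keys \
#     && echo "key is authorized" || echo "key is NOT authorized"
```

If the key turns out not to be authorized, that would be consistent with the password prompt seen above.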

On Thu, 3 Oct 2019 at 09:47, Ruslanas Gžibovskis <ruslanas at lpic.lt> wrote:

> now deployment failed on first ansible job with:
>
> <localhost> ssh_retry: attempt: 8, ssh return code is 255. cmd (['ssh',
> '-o', 'UserKnownHostsFile=/dev/null', '-o', 'StrictHostKeyChecking=no',
> '-o', 'ControlMaster=auto', '-o', 'ControlPersist=30m', '-o',
> 'ServerAliveInterval=5', '-o', 'ServerAliveCountMax=5', '-o',
> 'IdentityFile="/var/lib/mistral/.ssh/tripleo-admin-rsa"', '-o',
> 'KbdInteractiveAuthentication=no', '-o',
> 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey',
> '-o', 'PasswordAuthentication=no', '-o', 'User="tripleo-admin"', '-o',
> 'ConnectTimeout=30', '-o',
> 'ControlPath=/var/lib/mistral/C104/ansible-ssh/ff320dd376', 'localhost',
> "/bin/sh -c '/usr/bin/python2 && sleep 0'"]...), pausing for 30 seconds
> fatal: [undercloud]: UNREACHABLE! => {"changed": false, "msg": "SSH Error:
> data could not be sent to remote host \"localhost\". Make sure this host
> can be reached over ssh", "unreachable": true}
>
> Interestingly, it fails to connect to localhost. I have checked, and SSH to
> localhost works for the stack user:
> [stack at undercloud104 ~]$ ssh -l stack localhost
> The authenticity of host 'localhost (::1)' can't be established.
> ECDSA key fingerprint is SHA256:0/Axj7n0cQU9eKCFisOpI2HeaOZeI05RhNa/qT/2/2A.
> ECDSA key fingerprint is MD5:fa:e4:df:f5:8b:63:41:ae:c3:a3:2d:7d:55:2d:7f:65.
> Are you sure you want to continue connecting (yes/no)? yes
> Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
> Last login: Thu Oct  3 09:04:45 2019 from 10.120.123.5
> [stack at undercloud104 ~]$ logout
> Connection to localhost closed.
> [stack at undercloud104 ~]$
>
> It also works from the mistral container:
> ()[mistral at undercloud104 /]$ ssh -l stack -i /var/lib/mistral/.ssh/tripleo-admin-rsa localhost
> The authenticity of host 'localhost (::1)' can't be established.
> <...>
> Last login: Thu Oct  3 09:05:32 2019 from ::1
> [stack at undercloud104 ~]$
>
> Any ideas?
> If I should send this to a different group or page, please let me know.
>
> Thank you
>
> On Wed, 2 Oct 2019 at 15:25, Ruslanas Gžibovskis <ruslanas at lpic.lt> wrote:
>
>> I just noticed that I do not have a controller, even though I have
>> ControllerCount: 1 in node-info.yaml:
>> (undercloud) [stack at undercloud104 ~]$ less c104/node-info.yaml
>> parameter_defaults:
>>   OvercloudControllerFlavor: control
>>   OvercloudComputeFlavor: compute
>>   ControllerCount: 1
>>   ComputeCount: 3
>> (undercloud) [stack at undercloud104 ~]$ openstack baremetal node list -c UUID -c "Instance UUID" -c "Power State" -c "Provisioning State"
>> +--------------------------------------+---------------+-------------+--------------------+
>> | UUID                                 | Instance UUID | Power State | Provisioning State |
>> +--------------------------------------+---------------+-------------+--------------------+
>> | 7162d96f-65c9-4f87-b9ef-982e75dc8abc | None          | power off   | available          |
>> | db52c6da-67dd-4e21-baa0-455937822300 | None          | power off   | available          |
>> | e5420f96-05c5-4ebb-b9a6-499b8b9b6841 | None          | power off   | available          |
>> | dcb47213-f2fa-48a9-bf1a-2d2ebdf13784 | None          | power off   | available          |
>> | 8bc1e8e4-9c51-44bf-81be-b1dd1d19dec4 | None          | power off   | available          |
>> | c9687294-c336-481c-b0a3-1464d4209ba9 | None          | power off   | available          |
>> | 2f398a0b-a73c-4bc9-affd-df62e1eaa262 | None          | power off   | available          |
>> | b011dfb9-aa3c-4f9f-ade8-88ee96f8ae16 | None          | power on    | active             |
>> | f2fb7e39-73f5-42da-a3d1-03f16fa6457e | None          | power off   | available          |
>> | c75e7582-1f1f-424b-8614-2110cc0a7539 | None          | power on    | active             |
>> | f4ed164a-56d3-4536-a665-aa626b9346b9 | None          | power off   | available          |
>> | 44d9d25b-3e88-42d4-b17f-770b262584bb | None          | power off   | available          |
>> | 3bccd4ae-4e1c-419f-b2b1-124af13e4fce | None          | power off   | available          |
>> | 83857f28-8b2f-4d08-9354-0f9488867d62 | None          | power off   | available          |
>> | 095da91b-bba9-4a49-a7b0-2af00f26f309 | None          | power on    | active             |
>> | 9d3475c6-4cdf-4406-8df8-beaff1a1db45 | None          | power off   | available          |
>> +--------------------------------------+---------------+-------------+--------------------+
>>
>>
>> I believe this is the cause, but I am not sure how to fix it. I am trying
>> to redeploy now.
>>
>> On Wed, 2 Oct 2019 at 13:16, Ruslanas Gžibovskis <ruslanas at lpic.lt>
>> wrote:
>>
>>> Hi team.
>>>
>>> I am using CentOS 7 with the OpenStack Stein repo and "OpenStack Stein
>>> Trunk Tested".
>>> yum repolist:
>>> !base/7/x86_64
>>> !centos-ceph-nautilus/7/x86_64
>>> !centos-nfs-ganesha28/7/x86_64
>>> !centos-openstack-stein/7/x86_64
>>> !centos-qemu-ev/7/x86_64
>>> !extras/7/x86_64
>>> !rdo-trunk-stein-tested
>>> !updates/7/x86_64
>>>
>>> During deployment everything looks promising, but it fails in the last
>>> steps. Judging by the error, it looks like an Ansible version problem.
>>>
>>> QUESTION:
>>> What else can I check to debug this? I am stuck now.
>>>
>>>
>>> files used for install:
>>> (undercloud) [stack at undercloud104 ~]$ ls -ld *
>>> drwxrwxr-x.  3 stack stack  4096 Oct  2 09:43 c104
>>> -rw-r--r--.  1 stack stack  4683 Sep 30 12:19 cert.2019-09-06.pem
>>> -rw-rw-r--.  1 stack stack   543 Oct  1 14:21 deploy.sh
>>> drwxrwxr-x. 17 stack stack  4096 Sep 30 12:58 generated-openstack-tripleo-heat-templates
>>> -rw-rw-r--.  1 stack stack  8632 Sep 30 09:30 hosts.json
>>> -rw-rw-r--.  1 stack stack     0 Oct  1 14:16 install-undercloud.log
>>> drwxr-xr-x.  2 stack stack  4096 Sep 27 15:25 repos
>>> lrwxrwxrwx.  1 stack stack    58 Sep 30 14:11 scripts -> generated-openstack-tripleo-heat-templates/network/scripts
>>> -rw-------.  1 stack stack   775 Sep 30 17:42 stackrc
>>> drwxrwxr-x.  2 stack stack    40 Sep 27 11:17 tripleo-config-generated-env-files
>>> -rw-------.  1 stack root   9697 Sep 30 17:18 tripleo-undercloud-passwords.yaml
>>> -rw-------.  1 stack root   2250 Sep 30 17:18 undercloud-passwords.conf
>>> -rw-r--r--.  1 stack stack 14405 Sep 30 16:57 undercloud.conf
>>> (undercloud) [stack at undercloud104 ~]$ ls -ld repos/*
>>> -rw-r--r--. 1 stack stack 1664 Sep 27 15:25 repos/CentOS-Base.repo
>>> -rw-r--r--. 1 stack stack 1309 Sep 27 15:25 repos/CentOS-CR.repo
>>> -rw-r--r--. 1 stack stack  956 Sep 27 15:25 repos/CentOS-Ceph-Nautilus.repo
>>> -rw-r--r--. 1 stack stack  649 Sep 27 15:25 repos/CentOS-Debuginfo.repo
>>> -rw-r--r--. 1 stack stack  630 Sep 27 15:25 repos/CentOS-Media.repo
>>> -rw-r--r--. 1 stack stack  715 Sep 27 15:25 repos/CentOS-NFS-Ganesha-28.repo
>>> -rw-r--r--. 1 stack stack 1290 Sep 27 15:25 repos/CentOS-OpenStack-stein.repo
>>> -rw-r--r--. 1 stack stack  612 Sep 27 15:25 repos/CentOS-QEMU-EV.repo
>>> -rw-r--r--. 1 stack stack 1331 Sep 27 15:25 repos/CentOS-Sources.repo
>>> -rw-r--r--. 1 stack stack  353 Sep 27 15:25 repos/CentOS-Storage-common.repo
>>> -rw-r--r--. 1 stack stack 6639 Sep 27 15:25 repos/CentOS-Vault.repo
>>> -rw-r--r--. 1 stack stack  314 Sep 27 15:25 repos/CentOS-fasttrack.repo
>>> (undercloud) [stack at undercloud104 ~]$
>>>
>>> Last lines in output were:
>>> Removing short term keys locally
>>> Enabling ssh admin - COMPLETE.
>>> Waiting for messages on queue 'tripleo' with no timeout.
>>> Config downloaded at /var/lib/mistral/C104
>>> The action raised an exception
>>> [action_ex_id=49561c17-928d-4d66-a2ad-466b57c13253, action_cls='<class
>>> 'mistral.actions.action_factory.AnsibleGenerateInventoryAction'>',
>>> attributes='{}', params='{u'work_dir': u'/var/lib/mistral/C104',
>>> u'ansible_python_interpreter': None, u'ansible_ssh_user': u'tripleo-admin',
>>> u'undercloud_key_file': u'/var/lib/mistral/.ssh/tripleo-admin-rsa',
>>> u'plan_name': u'C104', u'ssh_network': u'ctlplane'}']
>>>  list index out of range
>>> Overcloud configuration failed.
>>> (undercloud) [stack at undercloud104 ~]$
>>>
>>>
>>> --
>>> Ruslanas Gžibovskis
>>> +370 6030 7030
>>>
>>
>>
>> --
>> Ruslanas Gžibovskis
>> +370 6030 7030
>>
>
>
> --
> Ruslanas Gžibovskis
> +370 6030 7030
>


-- 
Ruslanas Gžibovskis
+370 6030 7030