<div dir="ltr">Hi Anda,<div><br></div><div>All the issues seem to be related. If you're using tunneled networks, you need to configure tenant networks on both the controller and the compute nodes.</div><div><br></div><div>Also, if you're using static IPs, you should have internal networks defined and bind them in ServiceNetMap.</div><div><br></div><div>On the compute nodes, if you don't use an external network, make sure you have the default route and a route to 169.254.169.254/32 on the ctlplane network, something like this:</div><div><br></div><div><div><b>network_config:</b></div><div><b>            -</b></div><div><b>              type: interface</b></div><div><b>              name: nic1</b></div><div><b>              use_dhcp: false</b></div><div><b>              dns_servers: {get_param: DnsServers}</b></div><div><b>              addresses:</b></div><div><b>                -</b></div><div><b>                  ip_netmask:</b></div><div><b>                    list_join:</b></div><div><b>                      - '/'</b></div><div><b>                      - - {get_param: ControlPlaneIp}</b></div><div><b>                        - {get_param: ControlPlaneSubnetCidr}</b></div><div><b>              routes:</b></div><div><b>                -</b></div><div><b>                  ip_netmask: 169.254.169.254/32</b></div><div><b>                  next_hop: {get_param: EC2MetadataIp}</b></div><div><b>                -</b></div><div><b>                  default: true</b></div><div><b>                  next_hop: {get_param: ControlPlaneDefaultRoute}</b></div></div><div><br></div><div>Hope it helps.</div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Feb 2, 2018 at 9:04 AM, Anda Nicolae <span dir="ltr"><<a href="mailto:anicolae@lenovo.com" target="_blank">anicolae@lenovo.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">





<div lang="EN-US" link="blue" vlink="purple">
<div class="m_6589963827243982282WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hi all,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Thanks for the info about the 2 networks (external and ctlplane) that I need on the overcloud VMs (controller and compute).<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Now br-ex on my overcloud VMs has the external IP address and I am able to ping overcloud VMs on both external and ctlplane IP addresses.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Also, since for the external network I use static IPs, in my ips-from-pool-all.yaml, I have:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">OS::TripleO::Compute::Ports::<wbr>ExternalPort: ../network/ports/external_<wbr>from_pool_compute.yaml<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">external_from_pool_compute.yaml is similar to the external_from_pool.yaml file. I've noticed that if I use noop.yaml, the external IP is not assigned to the eth0 interface on the compute node.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I hope it is correct to use it like this.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I have continued with my overcloud deployment and I've noticed that some progress has been made:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">- Controller resource is now in CREATE_COMPLETE state<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">- although deployment still fails, I can connect to the overcloud VMs via both ctlplane IP and external IP and check the logs, after the failure of the deploy
 operation<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The Compute resource fails with the reason "CREATE aborted". I've looked in /var/log/messages on the overcloud compute VM and noticed the following error messages, which keep repeating:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Feb  2 03:09:36 localhost os-collect-config: Source [ec2] Unavailable.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Feb  2 03:09:36 localhost os-collect-config: /var/lib/os-collect-config/<wbr>local-data not found. Skipping<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Feb  2 03:09:36 localhost os-collect-config: No local metadata found (['/var/lib/os-collect-config/<wbr>local-data'])<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Feb  2 03:10:16 localhost os-collect-config: HTTPConnectionPool(host='169.<wbr>254.169.254', port=80): Max retries exceeded with url: /latest/meta-data/ (Caused
 by ConnectTimeoutError(<requests.<wbr>packages.urllib3.connection.<wbr>HTTPConnection object at 0x2752190>, 'Connection to 169.254.169.254 timed out. (connect timeout=10.0)'))<u></u><u></u></span></p>
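<p class="MsoNormal">The connection timeout in the last log line can be checked independently of os-collect-config. The following is only a minimal sketch: the helper name <code>can_connect</code> is hypothetical, the address and port are the ones from the log line above, and the code is written so that it should run on both Python 2.7 and Python 3. It only tests whether a TCP connection can be opened; it does not fetch any metadata.</p>

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration; it is not part of os-collect-config.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Covers refused connections, unreachable networks and timeouts.
        return False
    conn.close()
    return True
```

<p class="MsoNormal">On the compute node, calling <code>can_connect("169.254.169.254", 80)</code> would be expected to return False for as long as the node has no working route to the metadata address, which matches the timeout reported above.</p>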
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">From heat-engine.log, I have:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">2018-02-01 19:26:32.253 3348 DEBUG neutronclient.v2_0.client [req-c27f050c-b743-4e1d-a706-<wbr>e01e63a43b49 fdfcf2f659a94e57829dbefc618f3d<wbr>3b 453c1e37b83f4f8e8a49dab299e822<wbr>4d
 - - -] Error message: {"NeutronError": {"message": "Port 0292b718-2c28-4b0c-a517-<wbr>c481c547b711 could not be found.", "type": "PortNotFound", "detail": ""}} _handle_fault_response /usr/lib/python2.7/site-<wbr>packages/neutronclient/v2_0/<wbr>client.py:266<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I have 2 questions regarding the deployment:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">1. Do any of the error messages above cause the failed deployment of the Compute resource?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">2. In my network-environment.yaml, I haven't set InternalApiNetCidr, TenantNetCidr, InternalApiNetworkVlanID, TenantNetworkVlanID.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Do I need to set these in order to make the overcloud deployment work?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Thanks,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Anda<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Anda Nicolae
<br>
<b>Sent:</b> Wednesday, January 31, 2018 12:40 PM<br>
<b>To:</b> 'Pedro Sousa'<br>
<b>Cc:</b> <a href="mailto:rasca@redhat.com" target="_blank">rasca@redhat.com</a>; <a href="mailto:users@lists.rdoproject.org" target="_blank">users@lists.rdoproject.org</a><br>
<b>Subject:</b> RE: [rdo-users] RHOSP 10 failed overcloud deployment<u></u><u></u></span></p>
</div>
</div><span class="">
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I've just run 'neutron net-list' on the undercloud node and I have the 2 networks, ctlplane and external.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">My belief was that I don't need the external network, I only need the provision (ctlplane) network for the deployment.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I don't have a DHCP server for my external network.
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Do I need to set the external IP address for the compute node and for the controller node in the yaml files from templates folder?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Thanks,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Anda<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
</span><p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Pedro Sousa [<a href="mailto:pgsousa@gmail.com" target="_blank">mailto:pgsousa@gmail.com</a>]
<br><span class="">
<b>Sent:</b> Wednesday, January 31, 2018 12:32 PM<br>
<b>To:</b> Anda Nicolae<br>
</span><b>Cc:</b> <a href="mailto:rasca@redhat.com" target="_blank">rasca@redhat.com</a>; <a href="mailto:users@lists.rdoproject.org" target="_blank">
users@lists.rdoproject.org</a></span></p><div><div class="h5"><br>
<b>Subject:</b> Re: [rdo-users] RHOSP 10 failed overcloud deployment<u></u><u></u></div></div><p></p><div><div class="h5">
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">Hi Anda,<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">some things you could check:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Do you have 2 networks on director (ctlplane and external) and are they reachable from the overcloud nodes?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">It seems to me that you have network issues, and that's why you're seeing those long timeouts.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">For the "<span style="font-size:9.5pt;font-family:"Arial","sans-serif";color:#222222;background:white">Message: No valid host was found. There are not enough hosts available" message, you could check "/var/log/nova/nova-conductor.log".</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Regards<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">On Wed, Jan 31, 2018 at 10:14 AM, Anda Nicolae <<a href="mailto:anicolae@lenovo.com" target="_blank">anicolae@lenovo.com</a>> wrote:<u></u><u></u></p>
<p class="MsoNormal" style="margin-bottom:12.0pt">I've let the deployment run overnight and it failed after almost 4hrs with the errors below. Do you happen to know the config file where I can decrease the timeout? I looked in /etc/nova/nova.conf and in ironic
 config files but I couldn't find anything relevant.<br>
<br>
The errors are:<br>
<br>
[overcloud.Compute.0]: CREATE_FAILED  ResourceInError: resources[0].resources.<wbr>NovaCompute: Went to status ERROR due to "Message: Unknown, Code: Unknown"<br>
[overcloud.Controller.0]: CREATE_FAILED  Resource CREATE failed: ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"<br>
<br>
It is unclear to me why the above errors occur, since in my instackenv.json I declared node capabilities for both the compute and the controller node to be greater than those of the compute and controller flavors from 'openstack flavor list'.<br>
<br>
However, I've found this link and I am looking over it:<br>
<a href="https://docs.openstack.org/ironic/latest/admin/troubleshooting.html#nova-returns-no-valid-host-was-found-error" target="_blank">https://docs.openstack.org/<wbr>ironic/latest/admin/<wbr>troubleshooting.html#nova-<wbr>returns-no-valid-host-was-<wbr>found-error</a><br>
<br>
<span class="m_6589963827243982282im">Thanks,</span><br>
<span class="m_6589963827243982282im">Anda</span><br>
<br>
<span class="m_6589963827243982282im">-----Original Message-----</span><br>
<span class="m_6589963827243982282im">From: Raoul Scarazzini [mailto:<a href="mailto:rasca@redhat.com" target="_blank">rasca@redhat.com</a>]</span><br>
<span class="m_6589963827243982282im">Sent: Tuesday, January 30, 2018 8:17 PM</span><br>
<span class="m_6589963827243982282im">To: Anda Nicolae; <a href="mailto:users@lists.rdoproject.org" target="_blank">users@lists.rdoproject.org</a></span><br>
<span class="m_6589963827243982282im">Subject: Re: [rdo-users] RHOSP 10 failed overcloud deployment</span><u></u><u></u></p>
<div>
<div>
<p class="MsoNormal">On 01/30/2018 04:39 PM, Anda Nicolae wrote:<br>
> Got it.<br>
><br>
> I've noticed that it spends quite some time in CREATE_IN_PROGRESS state for OS::Heat::ResourceGroup resource (on Controller node).<br>
> Overcloud deployment fails after 4h. I will check which config file sets the overcloud deployment timeout and decrease it.<br>
><br>
> Thanks,<br>
> Anda<br>
<br>
Also check the network settings. The 4h timeout is the default when something is unreachable.<br>
<br>
--<br>
Raoul Scarazzini<br>
<a href="mailto:rasca@redhat.com" target="_blank">rasca@redhat.com</a><br>
______________________________<wbr>_________________<br>
users mailing list<br>
<a href="mailto:users@lists.rdoproject.org" target="_blank">users@lists.rdoproject.org</a><br>
<a href="http://lists.rdoproject.org/mailman/listinfo/users" target="_blank">http://lists.rdoproject.org/<wbr>mailman/listinfo/users</a><br>
<br>
To unsubscribe: <a href="mailto:users-unsubscribe@lists.rdoproject.org" target="_blank">users-unsubscribe@lists.<wbr>rdoproject.org</a><u></u><u></u></p>
</div>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div></div></div>
</div>

</blockquote></div><br></div>