<div dir="ltr"><div><div><div><div><div><div><div><div><div><div>Hi Sasha/Dan, <br>Yep, that's the bug I opened yesterday about this.  <br></div><br></div></div></div></div>sshd and firewall rules look OK, based on the tests below:<br></div>I can ssh into the virt host from my laptop as root, which checks the 10.X.X.X net.<br></div>I can also ssh from the instack VM to the virt host, which checks the 192.168.122.X net. <br><br></div>Unless I should check ssh with another user; if so, which one? <br></div>I doubt the ssh user or firewall caused the problem, as the controller was installed successfully and it uses the same ssh virt power-on method. <br><br>The deployment is still up and stuck; if anyone wants to take a look, contact me privately for access details. <br><br></div><div>I will review and use the virt console, virt journal, and timeout tips on the next deployment.  <br><br></div><div>Thanks<br></div><div>Tzach<br> </div><div><div><div><br></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Oct 14, 2015 at 5:07 AM, Sasha Chuzhoy <span dir="ltr"><<a href="mailto:sasha@redhat.com" target="_blank">sasha@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I hit the same (or similar) issue on my BM environment, though I managed to complete the 1+1 deployment on VM successfully.<br>
I see it's reported already: <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1271289" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1271289</a><br>
<br>
Ran a deployment with:   openstack overcloud deploy --templates --timeout 90 --compute-scale 3 --control-scale 1<br>
The deployment fails, and I see that all but one of the overcloud nodes are still in BUILD status.<br>
<br>
[stack@undercloud ~]$ nova list<br>
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+<br>
| ID                                   | Name                    | Status | Task State | Power State | Networks            |<br>
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+<br>
| b15f499e-79ed-46b2-b990-878dbe6310b1 | overcloud-controller-0  | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.23 |<br>
| 4877d14a-e34e-406b-8005-dad3d79f5bab | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |<br>
| 0fd1a7ed-367e-448e-8602-8564bf087e92 | overcloud-novacompute-1 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.21 |<br>
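| 51630a7d-c140-47b9-a071-1f2fdb45f4b4 | overcloud-novacompute-2 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.22 |<br>
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+<br>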
<br>
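A minimal sketch, assuming the rows of the table above are saved as text: it tallies the Status column and lists the nodes still in BUILD. The helper names parse_rows and stuck_nodes are hypothetical and are not part of any OpenStack client.<br>
<br>
<pre>
```python
from collections import Counter

# Sample rows copied from the `nova list` output above.
NOVA_LIST = """\
| b15f499e-79ed-46b2-b990-878dbe6310b1 | overcloud-controller-0  | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.23 |
| 4877d14a-e34e-406b-8005-dad3d79f5bab | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
| 0fd1a7ed-367e-448e-8602-8564bf087e92 | overcloud-novacompute-1 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.21 |
| 51630a7d-c140-47b9-a071-1f2fdb45f4b4 | overcloud-novacompute-2 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.22 |
"""


def parse_rows(table_text):
    """Return (name, status) pairs from the pipe-delimited rows of a nova list table."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +---+ borders and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 6 or cells[0] == "ID":
            continue  # skip the header row and malformed rows
        rows.append((cells[1], cells[2]))
    return rows


def stuck_nodes(table_text):
    """Names of nodes whose Status column is still BUILD."""
    return [name for name, status in parse_rows(table_text) if status == "BUILD"]


status_counts = Counter(status for _, status in parse_rows(NOVA_LIST))
print(dict(status_counts))
print(stuck_nodes(NOVA_LIST))
```
</pre>
For the table above this reports three nodes in BUILD and one ACTIVE, which matches the "all but one" observation.<br>
<br>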
<br>
Will try to investigate further tomorrow.<br>
<br>
Best regards,<br>
Sasha Chuzhoy.<br>
<span class="im HOEnZb"><br>
----- Original Message -----<br>
> From: "Tzach Shefi" <<a href="mailto:tshefi@redhat.com">tshefi@redhat.com</a>><br>
> To: "Dan Sneddon" <<a href="mailto:dsneddon@redhat.com">dsneddon@redhat.com</a>><br>
> Cc: <a href="mailto:rdo-list@redhat.com">rdo-list@redhat.com</a><br>
</span><span class="im HOEnZb">> Sent: Tuesday, October 13, 2015 6:01:48 AM<br>
> Subject: Re: [Rdo-list] Overcloud deploy stuck for a long time<br>
><br>
</span><div class="HOEnZb"><div class="h5">> So I gave it a few more hours. In the heat resource list nothing has failed;<br>
> there are only create_complete and some init_complete states.<br>
><br>
> Nova shows:<br>
> | 61aaed37-4993-4165-93a7-3c9bf6b10a21 | overcloud-controller-0 | ACTIVE | - | Running | ctlplane=192.0.2.8 |<br>
> | 7f9f4f52-3ee6-42d9-9275-ff88582dd6e7 | overcloud-novacompute-0 | BUILD | spawning | NOSTATE | ctlplane=192.0.2.9 |<br>
><br>
><br>
> nova show 7f9f4f52-3ee6-42d9-9275-ff88582dd6e7<br>
> +--------------------------------------+----------------------------------------------------------+<br>
> | Property | Value |<br>
> +--------------------------------------+----------------------------------------------------------+<br>
> | OS-DCF:diskConfig | MANUAL |<br>
> | OS-EXT-AZ:availability_zone | nova |<br>
> | OS-EXT-SRV-ATTR:host | instack.localdomain |<br>
> | OS-EXT-SRV-ATTR:hypervisor_hostname | 4626bf90-7f95-4bd7-8bee-5f5b0a0981c6 |<br>
> | OS-EXT-SRV-ATTR:instance_name | instance-00000002 |<br>
> | OS-EXT-STS:power_state | 0 |<br>
> | OS-EXT-STS:task_state | spawning |<br>
> | OS-EXT-STS:vm_state | building |<br>
><br>
> Checking the nova log, this is what I see:<br>
><br>
> nova-compute.log:{"nodes": [{"target_power_state": null, "links": [{"href": "<br>
> <a href="http://192.0.2.1:6385/v1/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6" rel="noreferrer" target="_blank">http://192.0.2.1:6385/v1/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6</a> ",<br>
> "rel": "self"}, {"href": "<br>
> <a href="http://192.0.2.1:6385/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6" rel="noreferrer" target="_blank">http://192.0.2.1:6385/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6</a> ", "rel":<br>
> "bookmark"}], "extra": {}, "last_error": " Failed to change power state to<br>
> 'power on'. Error: Failed to execute command via SSH : LC_ALL=C<br>
> /usr/bin/virsh --connect qemu:///system start baremetalbrbm_1.",<br>
> "updated_at": "2015-10-12T14:36:08+00:00", "maintenance_reason": null,<br>
> "provision_state": "deploying", "clean_step": {}, "uuid":<br>
> "4626bf90-7f95-4bd7-8bee-5f5b0a0981c6", "console_enabled": false,<br>
> "target_provision_state": "active", "provision_updated_at":<br>
> "2015-10-12T14:35:18+00:00", "power_state": "power off",<br>
> "inspection_started_at": null, "inspection_finished_at": null,<br>
> "maintenance": false, "driver": "pxe_ssh", "reservation": null,<br>
> "properties": {"memory_mb": "4096", "cpu_arch": "x86_64", "local_gb": "40",<br>
> "cpus": "1", "capabilities": "boot_option:local"}, "instance_uuid":<br>
> "7f9f4f52-3ee6-42d9-9275-ff88582dd6e7", "name": null, "driver_info":<br>
> {"ssh_username": "root", "deploy_kernel":<br>
> "94cc528d-d91f-4ca7-876e-2d8cbec66f1b", "deploy_ramdisk":<br>
> "057d3b42-002a-4c24-bb3f-2032b8086108", "ssh_key_contents": "-----BEGIN( I<br>
> removed key..)END RSA PRIVATE KEY-----", "ssh_virt_type": "virsh",<br>
> "ssh_address": "192.168.122.1"}, "created_at": "2015-10-12T14:26:30+00:00",<br>
> "ports": [{"href": "<br>
> <a href="http://192.0.2.1:6385/v1/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6/ports" rel="noreferrer" target="_blank">http://192.0.2.1:6385/v1/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6/ports</a> ",<br>
> "rel": "self"}, {"href": "<br>
> <a href="http://192.0.2.1:6385/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6/ports" rel="noreferrer" target="_blank">http://192.0.2.1:6385/nodes/4626bf90-7f95-4bd7-8bee-5f5b0a0981c6/ports</a> ",<br>
> "rel": "bookmark"}], "driver_internal_info": {"clean_steps": null,<br>
> "root_uuid_or_disk_id": "9ff90423-9d18-4dd1-ae96-a4466b52d9d9",<br>
> "is_whole_disk_image": false}, "instance_info": {"ramdisk":<br>
> "82639516-289d-4603-bf0e-8131fa75ec46", "kernel":<br>
> "665ffcb0-2afe-4e04-8910-45b92826e328", "root_gb": "40", "display_name":<br>
> "overcloud-novacompute-0", "image_source":<br>
> "d99f460e-c6d9-4803-99e4-51347413f348", "capabilities": "{\"boot_option\":<br>
> \"local\"}", "memory_mb": "4096", "vcpus": "1", "deploy_key":<br>
> "BI0FRWDTD4VGHII9JK2BYDDFR8WB1WUG", "local_gb": "40", "configdrive":<br>
> "H4sICGDEG1YC/3RtcHpwcWlpZQDt3WuT29iZ2HH02Bl7Fe/G5UxSqS3vLtyesaSl2CR4p1zyhk2Ct+ateScdVxcIgiR4A5sAr95xxa/iVOUz7EfJx8m7rXyE5IDslro1mpbGox15Zv6/lrpJ4AAHN/LBwXMIShIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADhJpvx+5UQq5EqNtvzldGs+MIfewJeNv53f/7n354F6xT/3v/TjH0v/chz0L5+8Gv2f3V+n0s+Pz34u/dj982PJfvSTvxFVfXQ7vfyBlRfGvOZo+kQuWWtNVgJn/jO/d6kHzvrGWlHOjGn0TDfmjmXL30kZtZSrlXPFREaVxQM5Hon4fdl0TU7nCmqtU6urRTlZVRP1clV+knwqK/F4UFbPOuVGKZNKFNTbgVFvwO+PyPmzipqo1solX/6slszmCuKozBzKuKPdMlE5ma<br>
><br>
><br>
> Any ideas on how to resolve a compute node stuck in spawning? Its state hasn't<br>
> changed for a few hours now.<br>
><br>
> Tzach<br>
><br>
><br>
> On Mon, Oct 12, 2015 at 11:25 PM, Dan Sneddon < <a href="mailto:dsneddon@redhat.com">dsneddon@redhat.com</a> > wrote:<br>
><br>
><br>
><br>
> On 10/12/2015 08:10 AM, Tzach Shefi wrote:<br>
> > Hi,<br>
> ><br>
> > The server is running CentOS 7.1, with a VM running the undercloud. I got up<br>
> > to the overcloud deploy stage.<br>
> > It looks like it's stuck; nothing has advanced for a while.<br>
> > Any ideas on what to check?<br>
> ><br>
> > [stack@instack ~]$ openstack overcloud deploy --templates<br>
> > Deploying templates in the directory<br>
> > /usr/share/openstack-tripleo-heat-templates<br>
> > [91665.696658] device vnet2 entered promiscuous mode<br>
> > [91665.781346] device vnet3 entered promiscuous mode<br>
> > [91675.260324] kvm [71183]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff<br>
> > [91675.291232] kvm [71200]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff<br>
> > [91767.799404] kvm: zapping shadow pages for mmio generation wraparound<br>
> > [91767.880480] kvm: zapping shadow pages for mmio generation wraparound<br>
> > [91768.957761] device vnet2 left promiscuous mode<br>
> > [91769.799446] device vnet3 left promiscuous mode<br>
> > [91771.223273] device vnet3 entered promiscuous mode<br>
> > [91771.232996] device vnet2 entered promiscuous mode<br>
> > [91773.733967] kvm [72245]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff<br>
> > [91801.270510] device vnet2 left promiscuous mode<br>
> ><br>
> ><br>
> > Thanks<br>
> > Tzach<br>
> ><br>
> ><br>
> > _______________________________________________<br>
> > Rdo-list mailing list<br>
> > <a href="mailto:Rdo-list@redhat.com">Rdo-list@redhat.com</a><br>
> > <a href="https://www.redhat.com/mailman/listinfo/rdo-list" rel="noreferrer" target="_blank">https://www.redhat.com/mailman/listinfo/rdo-list</a><br>
> ><br>
> > To unsubscribe: <a href="mailto:rdo-list-unsubscribe@redhat.com">rdo-list-unsubscribe@redhat.com</a><br>
> ><br>
><br>
> You're going to need a more complete command line than "openstack<br>
> overcloud deploy --templates". For instance, if you are using VMs for<br>
> your overcloud nodes, you will need to include "--libvirt-type qemu".<br>
> There are probably a couple of other parameters that you will need.<br>
><br>
> You can watch the deployment using this command, which will show you<br>
> the progress:<br>
><br>
> watch "heat resource-list -n 5 | grep -v COMPLETE"<br>
><br>
> You can also explore which resources have failed:<br>
><br>
> heat resource-list [-n 5]| grep FAILED<br>
><br>
> And then look more closely at the failed resources:<br>
><br>
> heat resource-show overcloud <resource><br>
><br>
> There are some more complete troubleshooting instructions here:<br>
><br>
> <a href="http://docs.openstack.org/developer/tripleo-docs/troubleshooting/troubleshooting-overcloud.html" rel="noreferrer" target="_blank">http://docs.openstack.org/developer/tripleo-docs/troubleshooting/troubleshooting-overcloud.html</a><br>
><br>
> --<br>
> Dan Sneddon | Principal OpenStack Engineer<br>
> <a href="mailto:dsneddon@redhat.com">dsneddon@redhat.com</a> | <a href="http://redhat.com/openstack" rel="noreferrer" target="_blank">redhat.com/openstack</a><br>
> <a href="tel:650.254.4025" value="+16502544025">650.254.4025</a> | dsneddon:irc @dxs:twitter<br>
><br>
><br>
><br>
> --<br>
> Tzach Shefi<br>
> Quality Engineer, Redhat OSP<br>
> <a href="tel:%2B972-54-4701080" value="+972544701080">+972-54-4701080</a><br>
><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><font size="4"><b><span>Tzach Shefi</span></b></font><br>Quality Engineer, Redhat OSP<br><span><a href="callto:+972-52-4534729" target="_blank">+972-54-4701080</a></span></div></div>
</div>