Re: [Rdo-list] RE(2) Failure to start openstack-nova-compute on Compute Node when testing delorean RC2 or CI repo on CentOS 7.1

Monday, 11 May 2015

Steve,

Thanks!

I pulled magnum from git on devstack, dropped the magnum db, created a new
one, and tried to create a bay. Now I'm getting "went to status error due to
unknown", as shown below.

nova list and magnum bay-list show:

ubuntu@magnum:~/devstack$ nova list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+-----------------------------------------------------------------------+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks                                                              |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+-----------------------------------------------------------------------+
| 797b6057-1ddf-4fe3-8688-b63e5e9109b4 | te-h5yvoiptrmx3-0-4w4j2ltnob7a-kube_node-vg7rojnafrub | ERROR  | -          | NOSTATE     | testbay-6kij6pvui3p7-fixed_network-46mvxv7yfjzw=10.0.0.5, 2001:db8::f |
| c0b56f08-8a4d-428a-aee1-b29ca6e68163 | testbay-6kij6pvui3p7-kube_master-z3lifgrrdxie         | ACTIVE | -          | Running     | testbay-6kij6pvui3p7-fixed_network-46mvxv7yfjzw=10.0.0.3, 2001:db8::d |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+-----------------------------------------------------------------------+

ubuntu@magnum:~/devstack$ magnum bay-list
+--------------------------------------+---------+------------+---------------+
| uuid                                 | name    | node_count | status        |
+--------------------------------------+---------+------------+---------------+
| 87e36c44-a884-4cb4-91cc-c7ae320f33b4 | testbay | 2          | CREATE_FAILED |
+--------------------------------------+---------+------------+---------------+

e3a65b05f", "flannel_network_subnetlen": "24",
"fixed_network_cidr": "10.0.0.0/24",
"OS::stack_id": "d0246d48-23e0-4aa0-87e0-052b2ca363e8",
"OS::stack_name": "testbay-6kij6pvui3p7", "master_flavor": "m1.small",
"external_network_id": "e3e2a633-1638-4c11-a994-7179a24e826e",
"portal_network_cidr": "10.254.0.0/16", "docker_volume_size": "5",
"ssh_key_name": "testkey", "kube_allow_priv": "true",
"number_of_minions": "2", "flannel_use_vxlan": "false",
"flannel_network_cidr": "10.100.0.0/16", "server_flavor": "m1.medium",
"dns_nameserver": "8.8.8.8", "server_image": "fedora-21-atomic-3"},
"id": "d0246d48-23e0-4aa0-87e0-052b2ca363e8",
"outputs": [{"output_value": ["2001:db8::f", "2001:db8::e"],
"description": "No description given", "output_key": "kube_minions_external"},
{"output_value": ["10.0.0.5", "10.0.0.4"],
"description": "No description given", "output_key": "kube_minions"},
{"output_value": "2001:db8::d",
"description": "No description given", "output_key": "kube_master"}],
"template_description": "This template will boot a Kubernetes cluster with
one or more minions (as specified by the number_of_minions parameter, which
defaults to \"2\").\n"}}
 log_http_response /usr/local/lib/python2.7/dist-packages/heatclient/common/http.py:141

2015-05-11 17:31:15.968 30006 ERROR magnum.conductor.handlers.bay_k8s_heat
[-] Unable to create bay, stack_id: d0246d48-23e0-4aa0-87e0-052b2ca363e8,
reason: Resource CREATE failed: ResourceUnknownStatus: Resource failed -
Unknown status FAILED due to "Resource CREATE failed:
ResourceUnknownStatus: Resource failed - Unknown status FAILED due to
"Resource CREATE failed: ResourceInError: Went to status error due to
"Unknown"""
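
The reason string in the log above is several Heat failure messages nested inside each other, which makes the actual cause hard to spot. A minimal Python sketch for peeling the layers apart follows. The function name `unwrap_heat_reason` and the assumption that every nesting level is introduced by ` due to "` are mine, based only on this one log line, and are not part of Magnum or Heat.

```python
def unwrap_heat_reason(reason):
    """Split a nested failure reason into its layers, outermost first.

    Assumes (from the single log line in this thread) that each nesting
    level is introduced by ' due to "' and closed by one trailing quote.
    """
    marker = ' due to "'
    layers = []
    rest = reason.strip()
    while marker in rest:
        head, rest = rest.split(marker, 1)
        layers.append(head)
        # Drop the closing quote that belongs to this nesting level.
        if rest.endswith('"'):
            rest = rest[:-1]
    layers.append(rest)
    return layers


# The reason text copied from the conductor log above.
reason = (
    'Resource CREATE failed: ResourceUnknownStatus: Resource failed - '
    'Unknown status FAILED due to "Resource CREATE failed: '
    'ResourceUnknownStatus: Resource failed - Unknown status FAILED due to '
    '"Resource CREATE failed: ResourceInError: Went to status error due to '
    '"Unknown"""'
)

for depth, layer in enumerate(unwrap_heat_reason(reason)):
    print("  " * depth + layer)
```

The innermost layers are a ResourceInError whose cause is reported only as "Unknown", so Heat itself has no further detail. That is consistent with the kube_node instance shown in ERROR state by nova list, which suggests the next place to look is the Nova side rather than Magnum or Heat.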

Any idea?

Thanks!
-Arash

On Mon, May 11, 2015 at 2:04 AM, Steven Dake (stdake) <stdake(a)cisco.com>
wrote:

  Arash,

  The short of it is Magnum 2015.1.0 is DOA.

  Four commits have hit the repository in the last hour to fix these
 problems.  Now Magnum works with the Kubernetes 0.15 v1beta3 examples,
 with the exception of the service object.  We are actively working
 on that problem upstream; I'll update when it's fixed.

  To see my run check out:

 http://ur1.ca/kc613 -> http://paste.fedoraproject.org/220479/13022911

  To upgrade and see everything working but the service object, you will
 have to remove your openstack-magnum package if using my COPR repo or git
 pull on your Magnum repo if using devstack.

  Boris - interested to hear the feedback on a CentOS distro operation
 once we get that service bug fixed.

  Regards
 -steve

   From: Arash Kaffamanesh <ak(a)cloudssky.com>
 Date: Sunday, May 10, 2015 at 4:10 PM
 To: Steven Dake <stdake(a)cisco.com>
 Cc: "rdo-list(a)redhat.com" <rdo-list(a)redhat.com>
 Subject: Re: [Rdo-list] RE(2) Failure to start openstack-nova-compute on
 Compute Node when testing delorean RC2 or CI repo on CentOS 7.1

   Steve,

  Thanks for your kind advice.

  I'm trying to go first through the quick start for magnum with devstack
 on ubuntu and I'm also
 following this guide to create a bay with 2 nodes:

 http://git.openstack.org/cgit/openstack/magnum/tree/doc/source/dev/dev-qu...

  I got fairly far, but when running this step, which creates the service
 that provides a discoverable endpoint for the redis sentinels in the cluster:

   magnum service-create --manifest ./redis-sentinel-service.yaml --bay testbay

 I'm getting:

 ERROR: Invalid resource state. (HTTP 409)

 In the console, I see:

 2015-05-10 22:19:44.010 4967 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on 127.0.0.1:5672
 2015-05-10 22:19:44.050 4967 WARNING wsme.api [-] Client-side error: Invalid resource state.
 127.0.0.1 - - [10/May/2015 22:19:44] "POST /v1/rcs HTTP/1.1" 409 115

 The testbay is running with 2 nodes properly:

 ubuntu@magnum:~/kubernetes/examples/redis$ magnum bay-list

 | 4fa480a7-2d96-4a3e-876b-1c59d67257d6 | testbay | 2          | CREATE_COMPLETE |

 Any ideas, where I could dig for the problem?

 By the way, after running "magnum pod-create .." the status shows "failed":

 ubuntu@magnum:~/kubernetes/examples/redis/v1beta3$ magnum pod-create --manifest ./redis-master.yaml --bay testbay

 +--------------+---------------------------------------------------------------------+
 | Property     | Value                                                               |
 +--------------+---------------------------------------------------------------------+
 | status       | failed                                                              |

 And the pod-list shows:

 ubuntu@magnum:~$ magnum pod-list

 +--------------------------------------+--------------+
 | uuid                                 | name         |
 +--------------------------------------+--------------+
 | 8d6977c1-a88f-45ee-be6c-fd869874c588 | redis-master |

 I also tried setting the status to running in the pod database table, but it didn't
help.

 P.S.: I also tried to run the whole thing on Fedora 21 with devstack, but I ran into more
problems than on Ubuntu.

 Many thanks in advance for your help!

 Arash

 On Mon, May 4, 2015 at 12:54 AM, Steven Dake (stdake) <stdake(a)cisco.com>
 wrote:

>  Boris,
>
>  Feel free to try out my Magnum packages here.  They work in containers,
> not sure about CentOS.  I’m not certain the systemd files are correct (I
> didn’t test that part) but the dependencies are correct:
>
>  https://copr.fedoraproject.org/coprs/sdake/openstack-magnum/
>
>  NB you will have to run through the quickstart configuration guide here:
>
>  https://github.com/openstack/magnum/blob/master/doc/source/dev/dev-manual...
>
>  *Regards*
> *-steve*
>
>   From: Boris Derzhavets <bderzhavets(a)hotmail.com>
> Date: Sunday, May 3, 2015 at 11:20 AM
> To: Arash Kaffamanesh <ak(a)cloudssky.com>
> Cc: "rdo-list(a)redhat.com" <rdo-list(a)redhat.com>
>
> Subject: Re: [Rdo-list] RE(2) Failure to start openstack-nova-compute on
> Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
>
>   Arash,
>
> Please disregard this notice:
>
> > You wrote:
>
> >> What I noticed here, if I associate a floating ip to a VM with 2
> >> interfaces, then I'll lose the connectivity to the instance and Kilo
>
> Your environment and mine use different types of VMs.
>
> Boris.
>
>  ------------------------------
> Date: Sun, 3 May 2015 16:51:54 +0200
> Subject: Re: [Rdo-list] RE(2) Failure to start openstack-nova-compute on
> Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
> From: ak(a)cloudssky.com
> To: bderzhavets(a)hotmail.com
> CC: apevec(a)gmail.com; rdo-list(a)redhat.com
>
> Boris, thanks for your kind feedback.
>
> I did a 3 node Kilo RC2 virt setup on top of my Kilo RC2, which was
> installed on bare metal.
> The installation was successful on the first run.
>
> The network looks like this:
> https://cloudssky.com/.galleries/images/kilo-virt-setup.png
>
>  For this setup I added the latest CentOS cloud image to glance, ran an
> instance (controller), enabled root login,
> added ifcfg-eth1 to the instance, created a snapshot from the controller,
> added the repos to this instance, yum updated,
> rebooted and spawned the network and compute1 VM nodes from that snapshot.
> (To be able to ssh into the VMs over the 20.0.1.0 network, I created the gate
> VM with a floating ip assigned and installed OpenVPN on it.)
>
>  What I noticed here: if I associate a floating ip with a VM that has 2
> interfaces, I lose connectivity to the instance and Kilo goes crazy
> (the AIO controller on bare metal somehow loses its br-ex
> interface, but I didn't try to reproduce it again).
>
>  The packstack file was created in interactive mode with:
>
>  packstack --answer-file= --> press enter
>
>  I accepted most default values and selected trove and heat to be
> installed.
>
>  The answers are on pastebin:
>
>  http://pastebin.com/SYp8Qf7d
>
> The generated packstack file is here:
>
>  http://pastebin.com/XqJuvQxf
>
>  The br-ex interfaces and changes to eth0 are created on network and
> compute nodes correctly (output below).
> And one nice thing for me, coming from Havana, was to see how easy it has
> become to create an image in Horizon
> by uploading an image file (in my case rancheros.iso and centos.qcow2
> worked like a charm).
> Now it's time to discover Ironic, Trove and Manila, and if someone has some
> tips or guidelines on how to test these
> new exciting things or has any news about Murano or Magnum on RDO, then
> I'll be even luckier and more excited
> than I am now about Kilo :-)
> Thanks!
> Arash
> ---
> Some outputs here:
> [root@controller ~(keystone_admin)]# nova hypervisor-list
> +----+---------------------+-------+---------+
> | ID | Hypervisor hostname | State | Status  |
> +----+---------------------+-------+---------+
> | 1  | compute1.novalocal  | up    | enabled |
> +----+---------------------+-------+---------+
> [root@network ~]# ovs-vsctl show
> 436a6114-d489-4160-b469-f088d66bd752
>     Bridge br-tun
>         fail_mode: secure
>         Port "vxlan-14000212"
>             Interface "vxlan-14000212"
>                 type: vxlan
>                 options: {df_default="true", in_key=flow, local_ip="20.0.2.19", out_key=flow, remote_ip="20.0.2.18"}
>         Port br-tun
>             Interface br-tun
>                 type: internal
>         Port patch-int
>             Interface patch-int
>                 type: patch
>                 options: {peer=patch-tun}
>     Bridge br-int
>         fail_mode: secure
>         Port br-int
>             Interface br-int
>                 type: internal
>         Port int-br-ex
>             Interface int-br-ex
>                 type: patch
>                 options: {peer=phy-br-ex}
>         Port patch-tun
>             Interface patch-tun
>                 type: patch
>                 options: {peer=patch-int}
>     Bridge br-ex
>         Port br-ex
>             Interface br-ex
>                 type: internal
>         Port phy-br-ex
>             Interface phy-br-ex
>                 type: patch
>                 options: {peer=int-br-ex}
>         Port "eth0"
>             Interface "eth0"
>
>     ovs_version: "2.3.1"
>
>
> [root@compute~]# ovs-vsctl show
> 8123433e-b477-4ef5-88aa-721487a4bd58
>     Bridge br-int
>         fail_mode: secure
>         Port int-br-ex
>             Interface int-br-ex
>                 type: patch
>                 options: {peer=phy-br-ex}
>         Port patch-tun
>             Interface patch-tun
>                 type: patch
>                 options: {peer=patch-int}
>         Port br-int
>             Interface br-int
>                 type: internal
>     Bridge br-tun
>         fail_mode: secure
>         Port br-tun
>             Interface br-tun
>                 type: internal
>         Port patch-int
>             Interface patch-int
>                 type: patch
>                 options: {peer=patch-tun}
>         Port "vxlan-14000213"
>             Interface "vxlan-14000213"
>                 type: vxlan
>                 options: {df_default="true", in_key=flow, local_ip="20.0.2.18", out_key=flow, remote_ip="20.0.2.19"}
>     Bridge br-ex
>         Port phy-br-ex
>             Interface phy-br-ex
>                 type: patch
>                 options: {peer=int-br-ex}
>         Port "eth0"
>             Interface "eth0"
>         Port br-ex
>             Interface br-ex
>                 type: internal
>
>     ovs_version: "2.3.1"
>
>
>
>
>
>
>  On Sat, May 2, 2015 at 9:02 AM, Boris Derzhavets <bderzhavets(a)hotmail.com> wrote:
>
>  Thank you once again it really works.
>
> [root@ip-192-169-142-127 ~(keystone_admin)]# nova hypervisor-list
> +----+----------------------------------------+-------+---------+
> | ID | Hypervisor hostname                    | State | Status  |
> +----+----------------------------------------+-------+---------+
> | 1  | ip-192-169-142-127.ip.secureserver.net | up    | enabled |
> | 2  | ip-192-169-142-137.ip.secureserver.net | up    | enabled |
> +----+----------------------------------------+-------+---------+
>
> [root@ip-192-169-142-127 ~(keystone_admin)]# nova hypervisor-servers ip-192-169-142-137.ip.secureserver.net
> +--------------------------------------+-------------------+---------------+----------------------------------------+
> | ID                                   | Name              | Hypervisor ID | Hypervisor Hostname                    |
> +--------------------------------------+-------------------+---------------+----------------------------------------+
> | 16ab7825-1403-442e-b3e2-7056d14398e0 | instance-00000002 | 2             | ip-192-169-142-137.ip.secureserver.net |
> | 5fa444c8-30b8-47c3-b073-6ce10dd83c5a | instance-00000004 | 2             | ip-192-169-142-137.ip.secureserver.net |
> +--------------------------------------+-------------------+---------------+----------------------------------------+
>
> with only one issue:
>
>  during the AIO run CONFIG_NEUTRON_OVS_TUNNEL_IF=
>  during the Compute Node setup CONFIG_NEUTRON_OVS_TUNNEL_IF=eth1
>
>  and in the end this leaves a mess in the ml2_vxlan_endpoints table. I had to
>  manually update ml2_vxlan_endpoints and restart
>  neutron-openvswitch-agent.service on both nodes; afterwards the VMs on the
>  compute node obtained access to the metadata server.
>
>  I also believe that deleting the corresponding records from the tables
>  "compute_nodes" and "services" in sync (along with disabling nova-compute
>  on the Controller) could turn the AIO host into a real Controller.
>
> Boris.
>
>  ------------------------------
> Date: Fri, 1 May 2015 22:22:41 +0200
> Subject: Re: [Rdo-list] RE(1) Failure to start openstack-nova-compute on
> Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
> From: ak(a)cloudssky.com
> To: bderzhavets(a)hotmail.com
> CC: apevec(a)gmail.com; rdo-list(a)redhat.com
>
>  I got the compute node working by adding the delorean-kilo.repo on the
> compute node, yum updating the compute node, rebooting, extending the
> packstack file from the first AIO install with the IP of the compute node,
> and running packstack again with NetworkManager enabled. I did a second
> yum update on the compute node before the 3rd packstack run,
> and now it works :-)
>
>  In short, for RC2 we have to get nova-compute running on the compute node
> by hand before running packstack again from the controller on top of an
> existing AIO install.
>
>  Now I have 2 compute nodes (controller AIO with compute + 2nd compute)
> and could spawn a
> 3rd cirros instance which landed on 2nd compute node.
> ssh'ing into the instances over the floating ip works fine too.
>
>  Before running packstack again, I set:
>
> EXCLUDE_SERVERS=<ip of controller>
>
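
The answer-file changes described above (adding the compute node's IP and setting EXCLUDE_SERVERS before re-running packstack) can be sketched as a small Python helper. This is only an illustration: the helper name `set_answer` is hypothetical, the IP addresses are placeholders, and the use of `CONFIG_COMPUTE_HOSTS` as the key that lists compute nodes is my assumption, since the message does not name the key that was edited.

```python
def set_answer(text, key, value):
    """Return answer-file text with `key` set to `value`.

    An existing `key=...` line is replaced; commented-out lines are left
    alone; the key is appended at the end if it is not present.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            break
    else:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Placeholder addresses: 192.0.2.10 stands in for the AIO controller,
# 192.0.2.11 for the new compute node.
answers = "[general]\nCONFIG_COMPUTE_HOSTS=192.0.2.10\nEXCLUDE_SERVERS=\n"
answers = set_answer(answers, "CONFIG_COMPUTE_HOSTS", "192.0.2.10,192.0.2.11")
answers = set_answer(answers, "EXCLUDE_SERVERS", "192.0.2.10")
print(answers)
```

Working on a copy of the answer file and comparing it with the original before the next packstack run makes it easy to confirm that only these two keys changed.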
>  [root@csky01 ~(keystone_osx)]# virsh list --all
>  Id    Name                           Status
> ----------------------------------------------------
>  2     instance-00000001              laufend --> means running in German
>
>  3     instance-00000002              laufend --> means running in German
>
>
>  [root@csky06 ~]# virsh list --all
>  Id    Name                           Status
> ----------------------------------------------------
>  2     instance-00000003              laufend --> means running in German
>
>
>  == Nova managed services ==
>
> +----+------------------+----------------+----------+---------+-------+----------------------------+-----------------+
> | Id | Binary           | Host           | Zone     | Status  | State | Updated_at                 | Disabled Reason |
> +----+------------------+----------------+----------+---------+-------+----------------------------+-----------------+
> | 1  | nova-consoleauth | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
> | 2  | nova-conductor   | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
> | 3  | nova-scheduler   | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
> | 4  | nova-compute     | csky01.csg.net | nova     | enabled | up    | 2015-05-01T19:46:40.000000 | -               |
> | 5  | nova-cert        | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
> | 6  | nova-compute     | csky06.csg.net | nova     | enabled | up    | 2015-05-01T19:46:38.000000 | -               |
> +----+------------------+----------------+----------+---------+-------+----------------------------+-----------------+
>
>
>  On Fri, May 1, 2015 at 9:02 AM, Boris Derzhavets <bderzhavets(a)hotmail.com> wrote:
>
>  Ran packstack --debug --answer-file=./answer-fileRC2.txt
> 192.169.142.137_nova.pp.log.gz attached
>
> Boris
>
>  ------------------------------
> From: bderzhavets(a)hotmail.com
> To: apevec(a)gmail.com
> Date: Fri, 1 May 2015 01:44:17 -0400
> CC: rdo-list(a)redhat.com
> Subject: [Rdo-list] Failure to start openstack-nova-compute on Compute
> Node when testing delorean RC2 or CI repo on CentOS 7.1
>
> Following the instructions at
> https://www.redhat.com/archives/rdo-list/2015-April/msg00254.html
> packstack fails:
>
> Applying 192.169.142.127_nova.pp
> Applying 192.169.142.137_nova.pp
> 192.169.142.127_nova.pp:                             [ DONE ]
> 192.169.142.137_nova.pp:                          [ ERROR ]
> Applying Puppet manifests                         [ ERROR ]
>
> ERROR : Error appeared during Puppet run: 192.169.142.137_nova.pp
> Error: Could not start Service[nova-compute]: Execution of
> '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for
> openstack-nova-compute.service failed. See 'systemctl status
> openstack-nova-compute.service' and 'journalctl -xn' for details.
> You will find full trace in log
> /var/tmp/packstack/20150501-081745-rIpCIr/manifests/192.169.142.137_nova.pp.log
>
> In both cases (RC2 or CI repos)  on compute node 192.169.142.137
> /var/log/nova/nova-compute.log
> reports :-
>
> 2015-05-01 08:21:41.354 4999 INFO oslo.messaging._drivers.impl_rabbit
> [req-0ae34524-9ee0-4a87-aa5a-fff5d1999a9c ] Delaying reconnect for 1.0
> seconds...
> 2015-05-01 08:21:42.355 4999 INFO oslo.messaging._drivers.impl_rabbit
> [req-0ae34524-9ee0-4a87-aa5a-fff5d1999a9c ] Connecting to AMQP server on
> localhost:5672
> 2015-05-01 08:21:42.360 4999 ERROR oslo.messaging._drivers.impl_rabbit
> [req-0ae34524-9ee0-4a87-aa5a-fff5d1999a9c ] AMQP server on localhost:5672
> is unreachable: [Errno 111] ECONNREFUSED. Trying again in 11 seconds.
>
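
The nova-compute log lines above show the service retrying against localhost:5672. When the log is long, a short Python sketch like the following can list every AMQP endpoint reported as unreachable. The regular expression is written against the message wording in this excerpt only, and the function name `unreachable_amqp_targets` is hypothetical.

```python
import re

# Matches the "AMQP server on HOST:PORT is unreachable" wording seen in the
# excerpt above. It is not meant to handle IPv6 literals.
AMQP_UNREACHABLE = re.compile(
    r"AMQP server on (?P<host>[^\s:]+):(?P<port>\d+)\s+is unreachable"
)


def unreachable_amqp_targets(log_text):
    """Return the sorted, de-duplicated (host, port) pairs reported unreachable."""
    # Collapse whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return sorted(
        {(m.group("host"), int(m.group("port")))
         for m in AMQP_UNREACHABLE.finditer(flat)}
    )


sample = """\
2015-05-01 08:21:42.360 4999 ERROR oslo.messaging._drivers.impl_rabbit
[req-0ae34524-9ee0-4a87-aa5a-fff5d1999a9c ] AMQP server on localhost:5672
is unreachable: [Errno 111] ECONNREFUSED. Trying again in 11 seconds.
"""

print(unreachable_amqp_targets(sample))
```

If the only target found is localhost on a node that does not run the broker, that would be consistent with the compute node still using a default broker host in its configuration. This is an assumption to verify in the node's nova.conf, not something the log alone proves.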
> It seems to be looking for the AMQP server on the wrong host; it should be
> 192.169.142.127.
> On 192.169.142.127:
>
> [root@ip-192-169-142-127 ~]# netstat -lntp | grep 5672
> ==>  tcp        0      0 0.0.0.0:25672           0.0.0.0:*               LISTEN      14506/beam.smp
>      tcp6       0      0 :::5672                 :::*                    LISTEN      14506/beam.smp
>
> [root@ip-192-169-142-127 ~]# iptables-save | grep 5672
> -A INPUT -s 192.169.142.127/32 -p tcp -m multiport --dports 5671,5672 -m comment --comment "001 amqp incoming amqp_192.169.142.127" -j ACCEPT
> -A INPUT -s 192.169.142.137/32 -p tcp -m multiport --dports 5671,5672 -m comment --comment "001 amqp incoming amqp_192.169.142.137" -j ACCEPT
>
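
The netstat and iptables output above indicates that the broker is listening on port 5672 on the controller and that both hosts are allowed in. A plain TCP connect from the compute node is one way to confirm that the port is actually reachable over the network. Below is a minimal Python sketch; the helper name `can_connect` and the 3-second default timeout are arbitrary choices for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example, to be run on the compute node (addresses taken from this thread):
#   can_connect("192.169.142.127", 5672)  # the controller's AMQP port
#   can_connect("localhost", 5672)        # what nova-compute is trying now
```

A successful connect only shows that something accepts TCP connections on that port; it does not validate AMQP credentials or the nova configuration.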
> Answer-file is attached
>
> Thanks.
> Boris
>
> _______________________________________________
> Rdo-list mailing list
> Rdo-list(a)redhat.com
> https://www.redhat.com/mailman/listinfo/rdo-list
> To unsubscribe: rdo-list-unsubscribe(a)redhat.com
>
