[Rdo-list] RE(2) Failure to start openstack-nova-compute on Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
Arash Kaffamanesh
ak at cloudssky.com
Mon May 11 18:05:47 UTC 2015
Steve,
Thanks!
I pulled magnum from git on devstack, dropped the magnum db, created a new
one and tried to create a bay; now I'm getting "went to status error due to
unknown" as below.
nova list and magnum bay-list show:
ubuntu at magnum:~/devstack$ nova list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+-----------------------------------------------------------------------+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks                                                              |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+-----------------------------------------------------------------------+
| 797b6057-1ddf-4fe3-8688-b63e5e9109b4 | te-h5yvoiptrmx3-0-4w4j2ltnob7a-kube_node-vg7rojnafrub | ERROR  | -          | NOSTATE     | testbay-6kij6pvui3p7-fixed_network-46mvxv7yfjzw=10.0.0.5, 2001:db8::f |
| c0b56f08-8a4d-428a-aee1-b29ca6e68163 | testbay-6kij6pvui3p7-kube_master-z3lifgrrdxie         | ACTIVE | -          | Running     | testbay-6kij6pvui3p7-fixed_network-46mvxv7yfjzw=10.0.0.3, 2001:db8::d |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+-----------------------------------------------------------------------+
ubuntu at magnum:~/devstack$ magnum bay-list
+--------------------------------------+---------+------------+---------------+
| uuid                                 | name    | node_count | status        |
+--------------------------------------+---------+------------+---------------+
| 87e36c44-a884-4cb4-91cc-c7ae320f33b4 | testbay | 2          | CREATE_FAILED |
+--------------------------------------+---------+------------+---------------+
The magnum conductor debug log shows the tail of the Heat stack response
(the paste starts mid-line):
e3a65b05f", "flannel_network_subnetlen": "24", "fixed_network_cidr": "
10.0.0.0/24", "OS::stack_id": "d0246d48-23e0-4aa0-87e0-052b2ca363e8",
"OS::stack_name": "testbay-6kij6pvui3p7", "master_flavor": "m1.small",
"external_network_id": "e3e2a633-1638-4c11-a994-7179a24e826e",
"portal_network_cidr": "10.254.0.0/16", "docker_volume_size": "5",
"ssh_key_name": "testkey", "kube_allow_priv": "true", "number_of_minions":
"2", "flannel_use_vxlan": "false", "flannel_network_cidr": "10.100.0.0/16",
"server_flavor": "m1.medium", "dns_nameserver": "8.8.8.8", "server_image":
"fedora-21-atomic-3"}, "id": "d0246d48-23e0-4aa0-87e0-052b2ca363e8",
"outputs": [{"output_value": ["2001:db8::f", "2001:db8::e"], "description":
"No description given", "output_key": "kube_minions_external"},
{"output_value": ["10.0.0.5", "10.0.0.4"], "description": "No description
given", "output_key": "kube_minions"}, {"output_value": "2001:db8::d",
"description": "No description given", "output_key": "kube_master"}],
"template_description": "This template will boot a Kubernetes cluster with
one or more minions (as specified by the number_of_minions parameter, which
defaults to \"2\").\n"}}
log_http_response
/usr/local/lib/python2.7/dist-packages/heatclient/common/http.py:141
2015-05-11 17:31:15.968 30006 ERROR magnum.conductor.handlers.bay_k8s_heat
[-] Unable to create bay, stack_id: d0246d48-23e0-4aa0-87e0-052b2ca363e8,
reason: Resource CREATE failed: ResourceUnknownStatus: Resource failed -
Unknown status FAILED due to "Resource CREATE failed:
ResourceUnknownStatus: Resource failed - Unknown status FAILED due to
"Resource CREATE failed: ResourceInError: Went to status error due to
"Unknown"""
Any ideas?
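(If it would help with debugging, I can dig deeper into the nested Heat
stacks and the failed instance, e.g. with something like
  heat stack-list --show-nested | grep -i failed
  heat resource-list testbay-6kij6pvui3p7
  nova show 797b6057-1ddf-4fe3-8688-b63e5e9109b4 | grep fault
and post the output here.)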
Thanks!
-Arash
On Mon, May 11, 2015 at 2:04 AM, Steven Dake (stdake) <stdake at cisco.com>
wrote:
> Arash,
>
> The short of it is Magnum 2015.1.0 is DOA.
>
> Four commits have hit the repository in the last hour to fix these
> problems. Magnum now works with the v1beta3 examples from Kubernetes 0.15,
> with the exception of the service object. We are actively working on that
> problem upstream; I'll update when it's fixed.
>
> To see my run check out:
>
> http://ur1.ca/kc613 -> http://paste.fedoraproject.org/220479/13022911
>
> To upgrade and see everything working but the service object, you will
> have to remove your openstack-magnum package if using my COPR repo or git
> pull on your Magnum repo if using devstack.
>
> Boris - interested to hear the feedback on a CentOS distro operation
> once we get that service bug fixed.
>
> Regards
> -steve
>
>
> From: Arash Kaffamanesh <ak at cloudssky.com>
> Date: Sunday, May 10, 2015 at 4:10 PM
> To: Steven Dake <stdake at cisco.com>
>
> Cc: "rdo-list at redhat.com" <rdo-list at redhat.com>
> Subject: Re: [Rdo-list] RE(2) Failure to start openstack-nova-compute on
> Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
>
> Steve,
>
> Thanks for your kind advice.
>
> I'm first going through the Magnum quickstart with devstack on Ubuntu, and
> I'm also following this guide to create a bay with 2 nodes:
>
>
> http://git.openstack.org/cgit/openstack/magnum/tree/doc/source/dev/dev-quickstart.rst
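>
> For reference, the bay creation itself was along these lines (flags from
> memory, external network name is the devstack default; please double-check
> against the quickstart above):
>
>   magnum baymodel-create --name testbaymodel --image-id fedora-21-atomic-3 \
>     --keypair-id testkey --external-network-id public --dns-nameserver 8.8.8.8 \
>     --flavor-id m1.medium --docker-volume-size 5 --coe kubernetes
>   magnum bay-create --name testbay --baymodel testbaymodel --node-count 2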
>
> I got fairly far, but when running the step that creates the service to
> provide a discoverable endpoint for the redis sentinels in the cluster:
>
> magnum service-create --manifest ./redis-sentinel-service.yaml --bay testbay
>
>
> I'm getting:
>
>
> ERROR: Invalid resource state. (HTTP 409)
>
>
> In the console, I see:
>
>
> 2015-05-10 22:19:44.010 4967 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on 127.0.0.1:5672
>
> 2015-05-10 22:19:44.050 4967 WARNING wsme.api [-] Client-side error: Invalid resource state.
>
> 127.0.0.1 - - [10/May/2015 22:19:44] "POST /v1/rcs HTTP/1.1" 409 115
>
>
> The testbay is running with 2 nodes properly:
>
> ubuntu at magnum:~/kubernetes/examples/redis$ magnum bay-list
>
> | 4fa480a7-2d96-4a3e-876b-1c59d67257d6 | testbay | 2 | CREATE_COMPLETE |
>
>
> Any ideas, where I could dig for the problem?
>
>
> By the way, after running "magnum pod-create .." the status shows "failed":
>
>
> ubuntu at magnum:~/kubernetes/examples/redis/v1beta3$ magnum pod-create --manifest ./redis-master.yaml --bay testbay
>
> +--------------+---------------------------------------------------------------------+
>
> | Property | Value |
>
> +--------------+---------------------------------------------------------------------+
>
> | status | failed |
>
>
> And the pod-list shows:
>
> ubuntu at magnum:~$ magnum pod-list
>
> +--------------------------------------+--------------+
>
> | uuid | name |
>
> +--------------------------------------+--------------+
>
> | 8d6977c1-a88f-45ee-be6c-fd869874c588 | redis-master |
>
>
> I also tried setting the status to "running" directly in the pod database table, but it didn't help.
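>
> (That was just a manual UPDATE in the magnum database, something like the
> following, with table/column names from memory:
>
>   mysql magnum -e "UPDATE pod SET status='Running' WHERE name='redis-master';"
>
> so no surprise it didn't fix the underlying problem.)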
>
> P.S.: I also tried to run the whole thing on Fedora 21 with devstack, but I ran into more problems than on Ubuntu.
>
>
> Many thanks in advance for your help!
>
> Arash
>
>
>
> On Mon, May 4, 2015 at 12:54 AM, Steven Dake (stdake) <stdake at cisco.com>
> wrote:
>
>> Boris,
>>
>> Feel free to try out my Magnum packages here. They work in containers,
>> not sure about CentOS. I’m not certain the systemd files are correct (I
>> didn’t test that part) but the dependencies are correct:
>>
>> https://copr.fedoraproject.org/coprs/sdake/openstack-magnum/
>>
>> NB you will have to run through the quickstart configuration guide here:
>>
>>
>> https://github.com/openstack/magnum/blob/master/doc/source/dev/dev-manual-devstack.rst
>>
>> Regards
>> -steve
>>
>> From: Boris Derzhavets <bderzhavets at hotmail.com>
>> Date: Sunday, May 3, 2015 at 11:20 AM
>> To: Arash Kaffamanesh <ak at cloudssky.com>
>> Cc: "rdo-list at redhat.com" <rdo-list at redhat.com>
>>
>> Subject: Re: [Rdo-list] RE(2) Failure to start openstack-nova-compute on
>> Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
>>
>> Arash,
>>
>> Please, disregard this notice :-
>>
>> >You wrote :-
>>
>> >> What I noticed here, if I associate a floating ip to a VM with 2
>> >> interfaces, then I'll lose the connectivity to the instance and Kilo
>>
>> Different types of VMs in your environment and in mine.
>>
>> Boris.
>>
>> ------------------------------
>> Date: Sun, 3 May 2015 16:51:54 +0200
>> Subject: Re: [Rdo-list] RE(2) Failure to start openstack-nova-compute on
>> Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
>> From: ak at cloudssky.com
>> To: bderzhavets at hotmail.com
>> CC: apevec at gmail.com; rdo-list at redhat.com
>>
>> Boris, thanks for your kind feedback.
>>
>> I did a 3-node Kilo RC2 virt setup on top of my Kilo RC2 which was
>> installed on bare metal.
>> The installation was successful on the first run.
>>
>> The network looks like this:
>> https://cloudssky.com/.galleries/images/kilo-virt-setup.png
>>
>> For this setup I added the latest CentOS cloud image to glance, ran an
>> instance (controller), enabled root login, added ifcfg-eth1 to the
>> instance, created a snapshot from the controller, added the repos to this
>> instance, yum updated, rebooted, and spawned the network and compute1 VM
>> nodes from that snapshot.
>> (To be able to ssh into the VMs over the 20.0.1.0 network, I created a gate
>> VM with a floating IP assigned and installed OpenVPN on it.)
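>>
>> A rough sketch of that sequence (kilo-era CLI; names like "centos71",
>> "controller-snap" and the net IDs are just placeholders):
>>
>>   glance image-create --name centos71 --disk-format qcow2 --container-format bare \
>>     --file CentOS-7-x86_64-GenericCloud.qcow2
>>   nova boot --image centos71 --flavor m1.medium --key-name mykey \
>>     --nic net-id=<mgmt-net> --nic net-id=<data-net> controller
>>   # inside the VM: enable root login, add ifcfg-eth1, add the repos, yum update, reboot
>>   nova image-create controller controller-snap
>>   nova boot --image controller-snap --flavor m1.medium \
>>     --nic net-id=<mgmt-net> --nic net-id=<data-net> network
>>   nova boot --image controller-snap --flavor m1.medium \
>>     --nic net-id=<mgmt-net> --nic net-id=<data-net> compute1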
>>
>> What I noticed here: if I associate a floating IP to a VM with 2
>> interfaces, then I lose connectivity to the instance and Kilo goes crazy
>> (the AIO controller on bare metal somehow loses its br-ex interface, but I
>> didn't try to reproduce it again).
>>
>> The packstack answer file was created in interactive mode with:
>>
>> packstack --answer-file= --> press enter
>>
>> I accepted most default values and selected trove and heat to be
>> installed.
>>
>> The answers are on pastebin:
>>
>> http://pastebin.com/SYp8Qf7d
>>
>> The generated packstack file is here:
>>
>> http://pastebin.com/XqJuvQxf
>> The br-ex interfaces and the changes to eth0 are created correctly on the
>> network and compute nodes (output below).
>> One nice thing for me, coming from Havana, was to see how easy it has
>> become to create an image in Horizon by uploading an image file (in my case
>> rancheros.iso and centos.qcow2 worked like a charm).
>> Now it's time to discover Ironic, Trove and Manila, and if someone has tips
>> or guidelines on how to test these exciting new things, or any news about
>> Murano or Magnum on RDO, I'll be even more excited than I already am about
>> Kilo :-)
>> Thanks!
>> Arash
>> ---
>> Some outputs here:
>> [root at controller ~(keystone_admin)]# nova hypervisor-list
>> +----+---------------------+-------+---------+
>> | ID | Hypervisor hostname | State | Status |
>> +----+---------------------+-------+---------+
>> | 1 | compute1.novalocal | up | enabled |
>>
>> +----+---------------------+-------+---------+
>> [root at network ~]# ovs-vsctl show
>> 436a6114-d489-4160-b469-f088d66bd752
>>     Bridge br-tun
>>         fail_mode: secure
>>         Port "vxlan-14000212"
>>             Interface "vxlan-14000212"
>>                 type: vxlan
>>                 options: {df_default="true", in_key=flow, local_ip="20.0.2.19", out_key=flow, remote_ip="20.0.2.18"}
>>         Port br-tun
>>             Interface br-tun
>>                 type: internal
>>         Port patch-int
>>             Interface patch-int
>>                 type: patch
>>                 options: {peer=patch-tun}
>>     Bridge br-int
>>         fail_mode: secure
>>         Port br-int
>>             Interface br-int
>>                 type: internal
>>         Port int-br-ex
>>             Interface int-br-ex
>>                 type: patch
>>                 options: {peer=phy-br-ex}
>>         Port patch-tun
>>             Interface patch-tun
>>                 type: patch
>>                 options: {peer=patch-int}
>>     Bridge br-ex
>>         Port br-ex
>>             Interface br-ex
>>                 type: internal
>>         Port phy-br-ex
>>             Interface phy-br-ex
>>                 type: patch
>>                 options: {peer=int-br-ex}
>>         Port "eth0"
>>             Interface "eth0"
>>     ovs_version: "2.3.1"
>>
>>
>> [root at compute~]# ovs-vsctl show
>> 8123433e-b477-4ef5-88aa-721487a4bd58
>>     Bridge br-int
>>         fail_mode: secure
>>         Port int-br-ex
>>             Interface int-br-ex
>>                 type: patch
>>                 options: {peer=phy-br-ex}
>>         Port patch-tun
>>             Interface patch-tun
>>                 type: patch
>>                 options: {peer=patch-int}
>>         Port br-int
>>             Interface br-int
>>                 type: internal
>>     Bridge br-tun
>>         fail_mode: secure
>>         Port br-tun
>>             Interface br-tun
>>                 type: internal
>>         Port patch-int
>>             Interface patch-int
>>                 type: patch
>>                 options: {peer=patch-tun}
>>         Port "vxlan-14000213"
>>             Interface "vxlan-14000213"
>>                 type: vxlan
>>                 options: {df_default="true", in_key=flow, local_ip="20.0.2.18", out_key=flow, remote_ip="20.0.2.19"}
>>     Bridge br-ex
>>         Port phy-br-ex
>>             Interface phy-br-ex
>>                 type: patch
>>                 options: {peer=int-br-ex}
>>         Port "eth0"
>>             Interface "eth0"
>>         Port br-ex
>>             Interface br-ex
>>                 type: internal
>>     ovs_version: "2.3.1"
>>
>>
>>
>>
>>
>>
>> On Sat, May 2, 2015 at 9:02 AM, Boris Derzhavets <
>> bderzhavets at hotmail.com> wrote:
>>
>> Thank you once again, it really works.
>>
>> [root at ip-192-169-142-127 ~(keystone_admin)]# nova hypervisor-list
>> +----+----------------------------------------+-------+---------+
>> | ID | Hypervisor hostname | State | Status |
>> +----+----------------------------------------+-------+---------+
>> | 1 | ip-192-169-142-127.ip.secureserver.net | up | enabled |
>> | 2 | ip-192-169-142-137.ip.secureserver.net | up | enabled |
>> +----+----------------------------------------+-------+---------+
>>
>> [root at ip-192-169-142-127 ~(keystone_admin)]# nova hypervisor-servers
>> ip-192-169-142-137.ip.secureserver.net
>>
>> +--------------------------------------+-------------------+---------------+----------------------------------------+
>> | ID                                   | Name              | Hypervisor ID | Hypervisor Hostname                    |
>> +--------------------------------------+-------------------+---------------+----------------------------------------+
>> | 16ab7825-1403-442e-b3e2-7056d14398e0 | instance-00000002 | 2             | ip-192-169-142-137.ip.secureserver.net |
>> | 5fa444c8-30b8-47c3-b073-6ce10dd83c5a | instance-00000004 | 2             | ip-192-169-142-137.ip.secureserver.net |
>> +--------------------------------------+-------------------+---------------+----------------------------------------+
>>
>> with only one issue:
>>
>> during the AIO run:            CONFIG_NEUTRON_OVS_TUNNEL_IF=
>> during the Compute Node setup: CONFIG_NEUTRON_OVS_TUNNEL_IF=eth1
>>
>> which finally results in a mess in the ml2_vxlan_endpoints table. I had to
>> manually update ml2_vxlan_endpoints and restart
>> neutron-openvswitch-agent.service on both nodes; afterwards the VMs on the
>> compute node obtained access to the metadata server.
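>>
>> The manual fix was essentially this (column names from memory; check the
>> schema on your install first):
>>
>>   mysql neutron -e "SELECT * FROM ml2_vxlan_endpoints;"
>>   mysql neutron -e "UPDATE ml2_vxlan_endpoints SET ip_address='<correct tunnel ip>' WHERE ip_address='<wrong ip>';"
>>   systemctl restart neutron-openvswitch-agent.service    # on both nodes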
>>
>> I also believe that deleting the matching records from the "compute_nodes"
>> and "services" tables (along with disabling nova-compute on the Controller)
>> could turn the AIO host into a real Controller.
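>>
>> Roughly (untested, table names from memory):
>>
>>   nova service-disable <aio-host> nova-compute
>>   mysql nova -e "DELETE FROM services WHERE host='<aio-host>' AND `binary`='nova-compute';"
>>   mysql nova -e "DELETE FROM compute_nodes WHERE hypervisor_hostname='<aio-host>';"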
>>
>> Boris.
>>
>> ------------------------------
>> Date: Fri, 1 May 2015 22:22:41 +0200
>> Subject: Re: [Rdo-list] RE(1) Failure to start openstack-nova-compute on
>> Compute Node when testing delorean RC2 or CI repo on CentOS 7.1
>> From: ak at cloudssky.com
>> To: bderzhavets at hotmail.com
>> CC: apevec at gmail.com; rdo-list at redhat.com
>>
>> I got the compute node working by adding the delorean-kilo.repo on the
>> compute node, yum updating the compute node and rebooting, then extending
>> the packstack answer file from the first AIO install with the IP of the
>> compute node and running packstack again with NetworkManager enabled. I did
>> a second yum update on the compute node before the 3rd packstack run, and
>> now it works :-)
>>
>> In short, for RC2 we have to get nova-compute running on the compute node
>> by hand before running packstack again from the controller on top of an
>> existing AIO install.
>>
>> Now I have 2 compute nodes (the controller AIO with compute + a 2nd
>> compute) and could spawn a 3rd cirros instance, which landed on the 2nd
>> compute node.
>> SSHing into the instances over the floating IPs works fine too.
>>
>> Before running packstack again, I set:
>>
>> EXCLUDE_SERVERS=<ip of controller>
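>>
>> For completeness, the sequence was roughly this (answer-file keys from
>> memory):
>>
>>   # on the compute node: drop delorean-kilo.repo into /etc/yum.repos.d/, then
>>   yum -y update && reboot
>>   # on the controller: edit the answer file from the AIO run, e.g.
>>   #   CONFIG_COMPUTE_HOSTS=<controller ip>,<compute ip>
>>   #   EXCLUDE_SERVERS=<controller ip>
>>   packstack --answer-file=<existing answer file>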
>>
>> [root at csky01 ~(keystone_osx)]# virsh list --all
>> Id Name Status
>> ----------------------------------------------------
>> 2 instance-00000001 laufend --> means running in German
>>
>> 3 instance-00000002 laufend --> means running in German
>>
>>
>> [root at csky06 ~]# virsh list --all
>> Id Name Status
>> ----------------------------------------------------
>> 2 instance-00000003 laufend --> means running in German
>>
>>
>> == Nova managed services ==
>>
>> +----+------------------+----------------+----------+---------+-------+----------------------------+-----------------+
>> | Id | Binary           | Host           | Zone     | Status  | State | Updated_at                 | Disabled Reason |
>> +----+------------------+----------------+----------+---------+-------+----------------------------+-----------------+
>> | 1  | nova-consoleauth | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
>> | 2  | nova-conductor   | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
>> | 3  | nova-scheduler   | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
>> | 4  | nova-compute     | csky01.csg.net | nova     | enabled | up    | 2015-05-01T19:46:40.000000 | -               |
>> | 5  | nova-cert        | csky01.csg.net | internal | enabled | up    | 2015-05-01T19:46:42.000000 | -               |
>> | 6  | nova-compute     | csky06.csg.net | nova     | enabled | up    | 2015-05-01T19:46:38.000000 | -               |
>> +----+------------------+----------------+----------+---------+-------+----------------------------+-----------------+
>>
>>
>> On Fri, May 1, 2015 at 9:02 AM, Boris Derzhavets <
>> bderzhavets at hotmail.com> wrote:
>>
>> Ran packstack --debug --answer-file=./answer-fileRC2.txt
>> 192.169.142.137_nova.pp.log.gz attached
>>
>> Boris
>>
>> ------------------------------
>> From: bderzhavets at hotmail.com
>> To: apevec at gmail.com
>> Date: Fri, 1 May 2015 01:44:17 -0400
>> CC: rdo-list at redhat.com
>> Subject: [Rdo-list] Failure to start openstack-nova-compute on Compute
>> Node when testing delorean RC2 or CI repo on CentOS 7.1
>>
>> Following the instructions at
>> https://www.redhat.com/archives/rdo-list/2015-April/msg00254.html
>> packstack fails:
>>
>> Applying 192.169.142.127_nova.pp
>> Applying 192.169.142.137_nova.pp
>> 192.169.142.127_nova.pp: [ DONE ]
>> 192.169.142.137_nova.pp: [ ERROR ]
>> Applying Puppet manifests [ ERROR ]
>>
>> ERROR : Error appeared during Puppet run: 192.169.142.137_nova.pp
>> Error: Could not start Service[nova-compute]: Execution of
>> '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for
>> openstack-nova-compute.service failed. See 'systemctl status
>> openstack-nova-compute.service' and 'journalctl -xn' for details.
>> You will find full trace in log
>> /var/tmp/packstack/20150501-081745-rIpCIr/manifests/192.169.142.137_nova.pp.log
>>
>> In both cases (RC2 or CI repos) on compute node 192.169.142.137
>> /var/log/nova/nova-compute.log
>> reports :-
>>
>> 2015-05-01 08:21:41.354 4999 INFO oslo.messaging._drivers.impl_rabbit
>> [req-0ae34524-9ee0-4a87-aa5a-fff5d1999a9c ] Delaying reconnect for 1.0
>> seconds...
>> 2015-05-01 08:21:42.355 4999 INFO oslo.messaging._drivers.impl_rabbit
>> [req-0ae34524-9ee0-4a87-aa5a-fff5d1999a9c ] Connecting to AMQP server on
>> localhost:5672
>> 2015-05-01 08:21:42.360 4999 ERROR oslo.messaging._drivers.impl_rabbit
>> [req-0ae34524-9ee0-4a87-aa5a-fff5d1999a9c ] AMQP server on localhost:5672
>> is unreachable: [Errno 111] ECONNREFUSED. Trying again in 11 seconds.
>>
>> Seems like it is looking for the AMQP server at the wrong host. It should
>> be 192.169.142.127.
>>
>> On 192.169.142.127 :-
>>
>> [root at ip-192-169-142-127 ~]# netstat -lntp | grep 5672
>> ==> tcp        0      0 0.0.0.0:25672      0.0.0.0:*          LISTEN      14506/beam.smp
>>     tcp6       0      0 :::5672            :::*               LISTEN      14506/beam.smp
>>
>> [root at ip-192-169-142-127 ~]# iptables-save | grep 5672
>> -A INPUT -s 192.169.142.127/32 -p tcp -m multiport --dports 5671,5672 -m comment --comment "001 amqp incoming amqp_192.169.142.127" -j ACCEPT
>> -A INPUT -s 192.169.142.137/32 -p tcp -m multiport --dports 5671,5672 -m comment --comment "001 amqp incoming amqp_192.169.142.137" -j ACCEPT
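>>
>> If that is the cause, pointing the compute node at the controller's AMQP
>> host and restarting nova-compute should be enough as a workaround, roughly:
>>
>>   # /etc/nova/nova.conf on 192.169.142.137
>>   [DEFAULT]
>>   rabbit_host = 192.169.142.127
>>   # (or the equivalent option in the [oslo_messaging_rabbit] section on Kilo)
>>
>>   systemctl restart openstack-nova-compute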
>>
>> Answer-file is attached
>>
>> Thanks.
>> Boris
>>
>>
>> _______________________________________________
>> Rdo-list mailing list
>> Rdo-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/rdo-list
>>
>> To unsubscribe: rdo-list-unsubscribe at redhat.com
>>
>>
>>
>>
>