[rdo-list] [Meeting] RDO meeting- (2016-11-16) Minutes
by Chandan kumar
==============================
#rdo: RDO meeting - 2016-11-16
==============================
Meeting started by chandankumar at 15:00:59 UTC. The full logs are
available at
https://meetbot.fedoraproject.org/rdo/2016-11-16/rdo_meeting_-_2016-11-16...
.
Meeting summary
---------------
* roll call (chandankumar, 15:01:07)
* Liberty EOL : 2016-11-17 (chandankumar, 15:04:10)
* LINK: https://etherpad.openstack.org/p/RDO-Meeting (chandankumar,
15:05:13)
* LINK: https://trello.com/c/fV69VODx/165-rdo-release-eol
(chandankumar, 15:06:25)
* LINK: https://etherpad.openstack.org/p/DLRN-new-RDO-release
(chandankumar, 15:14:33)
* ACTION: chandankumar to start sending BZ reports to rdo-list
(chandankumar, 15:18:04)
* ACTION: jpena to remove Liberty worker from DLRN servers (jpena,
15:19:06)
* ACTION: apevec practice EOL vault.centos.org move on Kilo (not EOLed
properly yet!) (apevec, 15:19:06)
* ACTION: apevec write announcement and work with rbowen to publicize
it (apevec, 15:20:24)
* rdopkg reviews (chandankumar, 15:22:06)
* altarch status (chandankumar, 15:24:52)
* ppc64le architecture will be enabled soon (number80, 15:27:07)
* help testing https://bugzilla.redhat.com/show_bug.cgi?id=1391444
(number80, 15:27:48)
* chair for next meeting (chandankumar, 15:29:10)
* ACTION: jpena to chair for next meeting. (chandankumar, 15:29:39)
* open floor (chandankumar, 15:29:46)
* LINK: https://github.com/openstack-packages/rdopkg/issues/66
(jruzicka, 15:39:45)
Meeting ended at 15:46:06 UTC.
Action Items
------------
* chandankumar to start sending BZ reports to rdo-list
* jpena to remove Liberty worker from DLRN servers
* apevec practice EOL vault.centos.org move on Kilo (not EOLed properly
yet!)
* apevec write announcement and work with rbowen to publicize it
* jpena to chair for next meeting.
Action Items, by person
-----------------------
* apevec
* apevec practice EOL vault.centos.org move on Kilo (not EOLed
properly yet!)
* apevec write announcement and work with rbowen to publicize it
* chandankumar
* chandankumar to start sending BZ reports to rdo-list
* jpena
* jpena to remove Liberty worker from DLRN servers
* jpena to chair for next meeting.
* **UNASSIGNED**
* (none)
People Present (lines said)
---------------------------
* chandankumar (72)
* apevec (26)
* number80 (25)
* zodbot (15)
* Duck (15)
* dmsimard (9)
* jruzicka (8)
* jpena (6)
* amoralej (5)
* trown (4)
* weshay (4)
* rdogerrit (4)
* dmellado (3)
* hrybacki (3)
* adarazs (1)
* coolsvap (1)
* jschluet (1)
* jpena|lunch (1)
* mengxd (1)
Generated by `MeetBot`_ 0.1.4
.. _`MeetBot`: http://wiki.debian.org/MeetBot
7 years, 11 months
[rdo-list] RDO: opendaylight (ODL) packstack support
by Fellert, Rafi
Hi,
1. I'm a new subscriber to rdo-list.
2. I want to run a demo in which ODL is installed on CentOS 7.
3. Is there a way, or a Packstack version, that supports ODL installation?
Thanks, Rafi.
7 years, 11 months
Re: [rdo-list] [TripleO] Newton large baremetal deployment issues
by Charles Short
Hi Graeme,
Thanks for the reply.
I used these images -
http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/d...
I installed the stable repo following the documentation here -
http://docs.openstack.org/developer/tripleo-docs/installation/installatio...
for example -
sudo curl -L -o /etc/yum.repos.d/delorean-newton.repo https://trunk.rdoproject.org/centos7-newton/current/delorean.repo
sudo curl -L -o /etc/yum.repos.d/delorean-deps-newton.repo http://trunk.rdoproject.org/centos7-newton/delorean-deps.repo
The difficulty I am having is that when I test with a small deployment
all works fine. So you would assume just adding more compute nodes would
not be an issue.
Testing this is painful because a large deployment takes so long to
fail. Scale seems to be the only issue.
I will try to get you some logs.
Regards
Charles
> So the symptoms you are showing me above almost definitely lead me to
> believe that neutron-server failed on the undercloud, which would
> explain why the deploy and nova failed to work. It could have failed
> before or during the deploy. We regularly see instances where
> neutron-server times out upon system boot (takes slightly longer to
> start than systemd expects), so we need to start it manually.
>
> To be clear, the undercloud has been installed using this repo:
>
> http://buildlogs.centos.org/centos/7/cloud/x86_64/rdo-trunk-newton-tested/
>
> Which overcloud images are you using? I'm not seeing any provided in
> that repo, and I just want to make sure the undercloud and overcloud
> packages match (as the tripleo-heat-templates package on the undercloud
> has to align with the openstack-puppet-modules package on the overcloud
> images).
>
> Also, is it possible to get a copy of all the neutron-server log from
> the undercloud? If we can understand why neutron-server failed, that is
> the first step towards getting a working deployment.
>
> It would be great if we could get a full sosreport with all the system
> logs, to check for other errors. I'm assuming there were no problems
> with the 'openstack undercloud install' process?
>
> Regards,
>
> Graeme
>
7 years, 11 months
[rdo-list] [TripleO] Newton large baremetal deployment issues
by Charles Short
Hi,
I am running the TripleO Newton stable release and am deploying on bare
metal with CentOS.
I have 64 nodes, and the Undercloud has plenty of resources, as it is one
of the nodes with 294 GB of memory and 64 CPUs.
The provisioning network is 1 Gbps.
I have tried tuning the Undercloud, using the tuning section (10.7) of
this document as a guide:
https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/p...
My Undercloud passes validations in Clapper
https://github.com/rthallisey/clapper
I am deploying with Network Isolation and 3 Controllers in HA.
If I create a stack with 3 Controllers and 3 compute nodes, this takes
about 1 hour.
If I create a stack with 3 Controllers and 15 compute nodes, this takes
about 1 hour.
Both stacks pass Clapper validations.
During deployment I can see that the first 20 to 30 mins use all the
bandwidth available for the overcloud image deployment, and then hardly
any bandwidth is used whilst the rest of the configuration takes place.
So I try a stack with 40 nodes. This is where I have issues.
I set the timeout to 4 hours and leave it overnight to deploy.
Every time, the deployment fails because it hits the timeout.
During the 40 node deployment the overcloud image is distributed in
about 45 mins to all nodes, and then all nodes appear ACTIVE and have an
IP address on the deployment network.
So it would appear that the rest of the low bandwidth configuration is
taking well over 3 hours to complete. This seems excessive.
I have configured nova.conf for deployment concurrency (from the tuning
link above) and configured the heat.conf 'num_engine_workers' to be 32,
taking into account this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1370516
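For anyone checking the same setting, here is a minimal sketch for
reading it back from the config file. It assumes heat.conf lives at
/etc/heat/heat.conf and that num_engine_workers sits in the [DEFAULT]
section. ini_get is a hypothetical helper written for this example, not
part of any OpenStack tool.

```shell
# Print the value of KEY in [SECTION] of an INI-style file.
# Usage: ini_get FILE SECTION KEY
# (hypothetical helper; commented-out lines starting with '#' are ignored)
ini_get() {
    awk -v section="$2" -v key="$3" '
        # A section header switches matching on or off.
        /^\[.*\]$/ { in_section = ($0 == ("[" section "]")); next }
        in_section && $0 !~ /^[ \t]*#/ {
            split($0, kv, "=")
            k = kv[1]
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
            }
        }' "$1"
}

# Example (path and section are assumptions, see above):
# ini_get /etc/heat/heat.conf DEFAULT num_engine_workers
```

If the value printed is not the one expected, the heat engine service
most likely needs a restart before an edited value takes effect.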
So my question is how do I tune my Undercloud to speed up the deployment?
Looking at htop during deployment I can see heat is using many CPUs, but
the work pattern is NOT distributed. What typically happens is all the
CPUs are at 0 to 1 % used apart from one which is at 50 to 100%. This
one CPU id changes regularly, but there is no concurrent distributed
workload across all the CPUs that the heat processes are running on. Is
heat really multi-threaded, or does it have limitations so it can only
really do proper work on one CPU at a time (which I am seeing in htop)?
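One rough way to put numbers on the htop observation is to total the CPU
share of the heat-engine processes. The sketch below assumes procps ps
and that the workers have "heat-engine" in their command line.
summarise_cpu is a hypothetical helper written for this example. Note
that ps reports %CPU averaged over each process's lifetime, not an
instantaneous reading, so this complements htop rather than replacing it.

```shell
# Summarise CPU use from ps output read on stdin.
# Expects header-less lines of: PID PSR %CPU COMMAND...
# (hypothetical helper; PSR is the CPU the process last ran on)
summarise_cpu() {
    awk '{ total += $3; if ($3 > max) max = $3; n++ }
         END { printf "procs=%d total_cpu=%.1f max_cpu=%.1f\n", n, total, max }'
}

# Example on the undercloud (the process name is an assumption):
# ps -e -o pid= -o psr= -o pcpu= -o args= | grep '[h]eat-engine' | summarise_cpu
```

A total_cpu that stays close to 100 with many workers configured would
be consistent with the work being effectively serialised, although it
does not by itself show why.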
Thanks
Charles
--
Charles Short
Cloud Engineer
Virtualization and Cloud Team
European Bioinformatics Institute (EMBL-EBI)
Tel: +44 (0)1223 494205
7 years, 11 months