Hi,
So I have finally tried OSP9 and here are the results -
3 Controllers, 40 Compute - 1 hour 20 mins to deploy.
This is much more the sort of deployment time I was expecting :)
I then tried TripleO Newton Stable again with 3 Controllers 40 Compute -
4 hours and counting.....
The two deployment scripts (for OSP9 and TripleO Newton) were pretty
much identical (allowing for any changes between releases).
During the OSP9 deployment I could use nova list to list the nodes. The
Undercloud API access was in general very responsive.
During the TripleO Newton deployment 'nova list' hangs -
ERROR (ClientException): The server has either erred or is incapable of
performing the requested operation. (HTTP 500)
Undercloud API access was very sluggish.
I noticed Keystone was stuck at 140% CPU for most of the deployment
(albeit multi-threaded), which is not the case for OSP9.
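To put a number on the API slugginess rather than eyeballing it, something like the following could be run periodically during a deploy. This is only a rough sketch: time_cmd is a hypothetical helper name, not part of any OpenStack tooling, and it assumes GNU coreutils timeout is available on the undercloud.

```shell
# Hypothetical helper (not part of OpenStack): run a command under a
# time limit and report its exit status and wall-clock duration.
# Usage: time_cmd <limit_seconds> <command> [args...]
time_cmd() {
    local limit="$1"; shift
    local start end rc
    start=$(date +%s)
    # timeout(1) kills the command if it runs longer than the limit;
    # the command's own output is discarded, only the summary is printed.
    timeout "$limit" "$@" >/dev/null 2>&1 && rc=0 || rc=$?
    end=$(date +%s)
    echo "exit=${rc} elapsed=$((end - start))s cmd=$*"
}

# Example, on the undercloud with the usual credentials sourced:
#   time_cmd 60 nova list
```

Running the same check on both the OSP9 and Newton underclouds would give comparable figures for how responsive 'nova list' is during the deployment.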
I know it is hard to compare two releases, but the difference is enormous.
I will stick with OSP9 for now, as it works properly for me out of the
box for large deployments.
Charles
On 14/11/2016 09:01, Charles Short wrote:
Hi Graeme,
Thanks for the reply.
I used these images -
http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/d...
I installed the stable repo following the documentation here -
http://docs.openstack.org/developer/tripleo-docs/installation/installatio...
for example -
sudo curl -L -o /etc/yum.repos.d/delorean-newton.repo \
    https://trunk.rdoproject.org/centos7-newton/current/delorean.repo
sudo curl -L -o /etc/yum.repos.d/delorean-deps-newton.repo \
    http://trunk.rdoproject.org/centos7-newton/delorean-deps.repo
The difficulty I am having is that when I test with a small deployment
all works fine. So you would assume just adding more compute nodes
would not be an issue.
Testing this is painful due to the time it takes for a large
deployment to fail. It seems to be only scale that is the issue.
I will try and get you some logs
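For pulling out the interesting part of a large log before sending it, a minimal sketch along these lines may help. recent_errors is a hypothetical helper name, and the log path in the example is an assumption based on the usual RDO layout, not something confirmed in this thread.

```shell
# Hypothetical helper: print the last N lines of a log file that
# mention ERROR or TRACE (N defaults to 50).
# Usage: recent_errors <logfile> [count]
recent_errors() {
    local logfile="$1" count="${2:-50}"
    grep -E 'ERROR|TRACE' "$logfile" | tail -n "$count"
}

# Example; the path is an assumption and may differ on your undercloud:
#   recent_errors /var/log/neutron/server.log 100
```

This only filters by keyword, so the full log (or a sosreport) is still the better thing to attach when possible.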
Regards
Charles
> So the symptoms you are showing me above almost definitely lead me to
> believe that neutron-server failed on the undercloud, which would
> explain why the deploy and nova failed to work. It could have failed
> before or during the deploy. We regularly see instances where
> neutron-server times out upon system boot (takes slightly longer to
> start than systemd expects), so we need to start it manually.
>
> To be clear, the undercloud has been installed using this repo:
>
> http://buildlogs.centos.org/centos/7/cloud/x86_64/rdo-trunk-newton-tested/
>
>
> Which overcloud images are you using? I'm not seeing any provided in
> that repo, and I just want to make sure the undercloud and overcloud
> packages match (as the tripleo-heat-templates package on the undercloud
> has to align with the openstack-puppet-modules package on the overcloud
> images).
>
> Also, is it possible to get a copy of the full neutron-server log from
> the undercloud? If we can understand why neutron-server failed, that is
> the first step towards getting a working deployment.
>
> It would be great if we could get a full sosreport with all the system
> logs, to check for other errors. I'm assuming there were no problems
> with the 'openstack undercloud install' process?
>
> Regards,
>
> Graeme
>
--
Charles Short
Cloud Engineer
Virtualization and Cloud Team
European Bioinformatics Institute (EMBL-EBI)
Tel: +44 (0)1223 494205