On 16/11/16 08:39, Charles Short wrote:
So I have finally tried OSP9 and here are the results -
3 Controllers 40 Compute - 1 hour 20 mins to deploy.
This is much more the sort of deployment time I was expecting :)
I then tried TripleO Newton Stable again with 3 Controllers 40 Compute -
4 hours and counting.....
The two deployment scripts (for OSP9 and TripleO Newton) were pretty
much identical (allowing for any changes between releases)
During the OSP9 deployment I could use nova list to list the nodes. The
Undercloud API access was in general very responsive.
During the TripleO Newton deployment 'nova list' hangs -
ERROR (ClientException): The server has either erred or is incapable of
performing the requested operation. (HTTP 500)
Undercloud API access was very sluggish.
I noticed Keystone was stuck at 140% CPU for most of the deployment (albeit
multi-threaded), which is not the case for OSP9.
I know it is hard to compare two releases, but the difference is enormous.
I will stick with OSP9 for now, as it works properly for me out of the
box for large deployments.
Hi Charles,
Thanks for letting us know about your results, this obviously narrows it
down to an issue introduced in the Newton timeframe (OSP 9 is Mitaka). If
you are still able to provide any logs from the undercloud where the
update failed or took ages, that would be good, as I would like to
understand any issue like this before it hits other users. However, if
you haven't got the time/inclination at this stage, I understand.
I'll keep an eye out for it affecting other users.
On 14/11/2016 09:01, Charles Short wrote:
> Hi Graeme,
> Thanks for the reply.
> I used these images -
> I installed the stable repo following the documentation here -
> for example -
> sudo curl -L -o /etc/yum.repos.d/delorean-newton.repo
> sudo curl -L -o /etc/yum.repos.d/delorean-deps-newton.repo
> The difficulty I am having is that when I test with a small deployment
> all works fine. So you would assume just adding more compute nodes
> would not be an issue.
> Testing this is painful due to the time it takes for a large
> deployment to fail. It seems to be only scale that is the issue.
> I will try and get you some logs
> Regards
> Charles
>> So the symptoms you are showing me above almost definitely lead me to
>> believe that neutron-server failed on the undercloud, which would
>> explain why the deploy and nova failed to work. It could have failed
>> before or during the deploy. We regularly see instances where
>> neutron-server times out upon system boot (takes slightly longer to
>> start than systemd expects), so we need to start it manually.
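
A minimal sketch of the manual check described above, assuming the service runs as a systemd unit named neutron-server on the undercloud. The helper name ensure_active is hypothetical and is not part of TripleO.

```shell
# Hypothetical helper: start a systemd unit only if it is not already active.
# Assumes the unit name passed in (e.g. neutron-server) exists on this host.
ensure_active() {
    unit="$1"
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is already active"
    else
        echo "$unit is not active, starting it"
        sudo systemctl start "$unit"
    fi
}

# Usage (on the undercloud): ensure_active neutron-server
```

If the unit still fails to start, 'systemctl status neutron-server' should show the most recent error lines.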
>> To be clear, the undercloud has been installed using this repo
>> Which overcloud images are you using? I'm not seeing any provided in
>> that repo, and I just want to make sure the undercloud and overcloud
>> packages match (as the tripleo-heat-templates package on the undercloud
>> has to align with the openstack-puppet-modules package on the overcloud
>> images).
>> Also, is it possible to get a copy of the full neutron-server log from
>> the undercloud? If we can understand why neutron-server failed, that is
>> the first step towards getting a working deployment.
>> It would be great if we could get a full sosreport with all the system
>> logs, to check for other errors. I'm assuming there were no problems
>> with the 'openstack undercloud install' process?
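
One possible way to gather the logs requested above is sketched here. It assumes neutron-server logs to the systemd journal on the undercloud. The helper name collect_unit_log and the output directory are hypothetical.

```shell
# Hypothetical helper: save the systemd journal for one unit to a file
# and print the path of the file that was written.
collect_unit_log() {
    unit="$1"
    outdir="${2:-.}"          # default to the current directory
    mkdir -p "$outdir"
    sudo journalctl -u "$unit" --no-pager > "$outdir/$unit.journal.log"
    echo "$outdir/$unit.journal.log"
}

# Usage (on the undercloud):
#   collect_unit_log neutron-server /tmp/undercloud-logs
# A full report with all system logs can be generated separately with:
#   sudo sosreport
```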
>> Regards,
>> Graeme
Graeme Gillies
Principal Systems Administrator
Openstack Infrastructure
Red Hat Australia