[rdo-list] [TripleO] Newton large baremetal deployment issues
Graeme Gillies
ggillies at redhat.com
Tue Nov 15 23:40:30 UTC 2016
On 16/11/16 08:39, Charles Short wrote:
> Hi,
>
> So I have finally tried OSP9 and here are the results -
>
> 3 Controllers 40 compute - 1 hour 20 mins to deploy.
>
> This is much more the sort of deployment time I was expecting :)
>
> I then tried TripleO Newton Stable again with 3 Controllers 40 Compute -
>
> 4 hours and counting.....
>
> The two deployment scripts (for OSP9 and TripleO Newton) were pretty
> much identical (allowing for any changes between releases)
>
> During the OSP9 deployment I could use nova list to list the nodes. The
> Undercloud API access was in general very responsive.
>
> During the TripleO Newton deployment 'nova list' hangs -
> ERROR (ClientException): The server has either erred or is incapable of
> performing the requested operation. (HTTP 500)
> Undercloud API access was very sluggish.
> I noticed Keystone was stuck at 140% CPU for most of the deployment
> (albeit multi-threaded), which is not the case for OSP9.
>
> I know it is hard to compare two releases, but the difference is enormous.
> I will stick with OSP9 for now as this for me works properly out of the
> box for large deployments.
>
> Charles
Hi Charles,
Thanks for letting us know about your results. This narrows it down to
an issue introduced in the Newton timeframe (OSP 9 is Mitaka). If you
are still able to provide any logs from the undercloud that failed/took
ages for the update, that would be good, as I would like to understand
any issue like this before it hits other users. However, if you haven't
got the time or inclination at this stage, I understand.
I'll keep an eye out for it affecting other users.
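If it helps with gathering the logs, here is a rough sketch of a small
shell helper that bundles whichever log files exist into one tarball.
The function name collect_logs is made up for illustration, and it is
only a sketch, not a substitute for a full sosreport.

```shell
#!/bin/sh
# Bundle whichever of the given log files are readable into one tarball.
# Usage: collect_logs OUTPUT.tar.gz FILE [FILE...]
# (collect_logs is a hypothetical helper name, not part of any tool.)
collect_logs() {
    out="$1"
    shift
    for f in "$@"; do
        # Drop the file being examined from the front of the argument list...
        shift
        if [ -r "$f" ]; then
            # ...and re-append it at the end only if it is readable.
            set -- "$@" "$f"
        else
            echo "skipping (missing or unreadable): $f" >&2
        fi
    done
    # After the loop, "$@" holds only the readable files.
    if [ "$#" -eq 0 ]; then
        echo "no readable logs found" >&2
        return 1
    fi
    tar -czf "$out" "$@"
}
```

On the undercloud it could be run as root along the lines of
"collect_logs undercloud-logs.tar.gz /var/log/neutron/server.log
/var/log/keystone/keystone.log /var/log/heat/heat-engine.log". Those
paths are my assumption of the usual RDO locations and may differ on
your system.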
Regards,
Graeme
>
> On 14/11/2016 09:01, Charles Short wrote:
>> Hi Graeme,
>>
>> Thanks for the reply.
>>
>> I used these images -
>>
>> http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/delorean/
>>
>>
>> I installed the stable repo following the documentation here -
>>
>> http://docs.openstack.org/developer/tripleo-docs/installation/installation.html
>>
>>
>> for example -
>>
>> sudo curl -L -o /etc/yum.repos.d/delorean-newton.repo \
>>   https://trunk.rdoproject.org/centos7-newton/current/delorean.repo
>>
>> sudo curl -L -o /etc/yum.repos.d/delorean-deps-newton.repo \
>>   http://trunk.rdoproject.org/centos7-newton/delorean-deps.repo
>>
>>
>> The difficulty I am having is that when I test with a small deployment
>> all works fine. So you would assume just adding more compute nodes
>> would not be an issue.
>> Testing this is painful due to the time it takes for a large
>> deployment to fail. It seems to be only scale that is the issue.
>>
>> I will try and get you some logs
>>
>> Regards
>>
>> Charles
>>
>>
>>
>>> So the symptoms you are showing me above almost definitely lead me to
>>> believe that neutron-server failed on the undercloud, which would
>>> explain why the deploy and nova failed to work. It could have failed
>>> before or during the deploy. We regularly see instances where
>>> neutron-server times out upon system boot (takes slightly longer to
>>> start than systemd expects), so we need to start it manually.
>>>
>>> To be clear, the undercloud has been installed using this repo:
>>>
>>> http://buildlogs.centos.org/centos/7/cloud/x86_64/rdo-trunk-newton-tested/
>>>
>>>
>>> Which overcloud images are you using? I'm not seeing any provided in
>>> that repo, and I just want to make sure the undercloud and overcloud
>>> packages match (as the tripleo-heat-templates package on the undercloud
>>> has to align with the openstack-puppet-modules package on the overcloud
>>> images).
>>>
>>> Also, is it possible to get a copy of all the neutron-server log from
>>> the undercloud? If we can understand why neutron-server failed, that is
>>> the first step towards getting a working deployment.
>>>
>>> It would be great if we could get a full sosreport with all the system
>>> logs, to check for other errors. I'm assuming there were no problems
>>> with the 'openstack undercloud install' process?
>>>
>>> Regards,
>>>
>>> Graeme
>>>
>>
>
--
Graeme Gillies
Principal Systems Administrator
Openstack Infrastructure
Red Hat Australia