[Rdo-list] CPU high load issue RDO Juno controllers

Javier Pena javier.pena at redhat.com
Wed Aug 26 13:55:56 UTC 2015


----- Original Message ----- 

> Hi all,
>
> I have a 3-controller deployment in HA using Juno/CentOS 7, based on this
> howto: https://github.com/beekhof/osp-ha-deploy/blob/master/HA-keepalived.md
>
> Usually, after some 2-3 weeks, I start to see high CPU load issues on my
> controllers, as you can see below. I have to restart all my services for
> everything to get back to normal.
>
> Some pointers:
> 1º This can happen in any controller, so I don't think it's a hardware issue.
>
> 2º My cloud is not in production yet, so I don't have much usage.
>
> 3º I only see high load on OpenStack-related services.
>
> Can anyone provide me some tips to figure out what's wrong here? Thanks.
>
> top - 05:05:38 up 28 days, 4:02, 1 user, load average: 12.79, 12.18, 12.14
> Tasks: 420 total, 10 running, 410 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 82.6 us, 16.1 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 1.3 si, 0.0 st
> KiB Mem : 32732120 total, 19646776 free, 6891364 used, 6193980 buff/cache
> KiB Swap: 16457724 total, 16457724 free, 0 used. 23890352 avail Mem
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 22270 root 20 0 161808 20516 4364 R 24.9 0.1 0:08.24 cinder
> 19234 swift 20 0 246148 15076 1620 R 24.7 0.0 2:25.64 swift-object-au
> 22268 root 20 0 178528 22904 5000 R 24.2 0.1 0:08.48 keystone
> 33342 keystone 20 0 468852 69132 4148 R 24.2 0.2 539:01.19 keystone-all
> 22264 root 20 0 141596 19652 4076 R 22.7 0.1 0:07.56 neutron
> 22262 root 20 0 170680 21736 4924 R 19.7 0.1 0:08.24 keystone
> 22537 root 20 0 203540 9884 3656 R 19.7 0.0 0:01.19 neutron-rootwra
> 22273 root 20 0 141504 19800 4076 R 19.5 0.1 0:07.63 glance
> 33423 neutron 20 0 453068 73784 2024 S 16.0 0.2 26:53.02 neutron-server
> 33338 keystone 20 0 468880 69124 4148 S 11.5 0.2 540:48.66 keystone-all
> 22288 rabbitmq 20 0 640756 17068 2512 S 11.2 0.1 0:04.12 beam.smp
> 32813 glance 20 0 352124 55292 5928 S 9.2 0.2 661:41.51 glance-api
> 32863 nova 20 0 385864 74056 7196 S 9.2 0.2 673:00.23 nova-api
>

Hi Pedro,

It's hard to tell what is going wrong from a top output alone, but since Keystone seems to be involved, I'll give it a try.

Re-checking the how-to, I realized there is nothing about Keystone token flushing. This is required (see http://docs.openstack.org/admin-guide-cloud/identity_troubleshoot.html#flush-expired-tokens-from-the-token-database-table), so enabling the cron job might help. Basically, just add a line to /etc/crontab with the following contents:

1 * * * * keystone keystone-manage token_flush >>/var/log/keystone/keystone-tokenflush.log 2>&1
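
If you want to see how bad the token buildup is before flushing, a quick check would be something like this (assuming the default SQL token backend and a MySQL/MariaDB database named keystone; adjust the credentials to your setup):

mysql -u keystone -p keystone -e "SELECT COUNT(*) FROM token WHERE expires < NOW();"

A count in the hundreds of thousands is a good hint that expired tokens are behind the load.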

I would recommend running the first execution interactively; if there are many expired tokens in the database, it could take a long time to complete.
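
For that first run, something along these lines should do (assuming a standard RDO install, where keystone-manage is in the path and Keystone runs as the keystone user):

sudo -u keystone keystone-manage token_flush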

If this doesn't help, you may need to check the log files to see what is causing the high load.
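
For example, something like the following would show recent errors from the services that look busiest in your top output (the log paths are the usual RDO defaults, so adjust them if your installation differs):

grep -i error /var/log/keystone/keystone.log | tail -n 20
grep -i error /var/log/neutron/server.log | tail -n 20
grep -i error /var/log/glance/api.log | tail -n 20
grep -i error /var/log/nova/nova-api.log | tail -n 20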

Regards,
Javier

> Regards,
> Pedro Sousa
>
> _______________________________________________
> Rdo-list mailing list
> Rdo-list at redhat.com
> https://www.redhat.com/mailman/listinfo/rdo-list
>
> To unsubscribe: rdo-list-unsubscribe at redhat.com



