I've manually removed all gnocchi, aodh, collectd and ceilometer containers.
The load average was low, but the real problem was RabbitMQ: I manually stopped all RabbitMQ containers and started them again, and the cloud got back on track.
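
In case it helps anyone hitting the same symptom, here is a minimal sketch of how the "queue with messages but no consumer" check could be scripted. It assumes `rabbitmqctl list_queues name messages consumers` is run somewhere `rabbitmqctl` is available, for example inside the RabbitMQ container. The explicit column list is an addition of this sketch; the command quoted below prints only the queue name and message count. The function names are hypothetical helpers, not part of any OpenStack or RabbitMQ tooling.

```python
import subprocess


def parse_list_queues(output):
    """Parse `rabbitmqctl list_queues name messages consumers` output.

    Returns a list of (name, messages, consumers) tuples. Banner and
    header lines such as "Listing queues" are skipped because they do
    not consist of a name followed by two integer fields.
    """
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        name, messages, consumers = parts
        if not (messages.isdigit() and consumers.isdigit()):
            continue
        rows.append((name, int(messages), int(consumers)))
    return rows


def stuck_queues(rows):
    """Return names of queues that hold messages but have no consumer."""
    return [name for name, messages, consumers in rows
            if messages > 0 and consumers == 0]


def run_rabbitmqctl():
    """Run rabbitmqctl and return its stdout (assumes it is on PATH)."""
    result = subprocess.run(
        ["rabbitmqctl", "list_queues", "name", "messages", "consumers"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `stuck_queues(parse_list_queues(run_rabbitmqctl()))` would list the queues that have a backlog and nobody reading from them.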

Yes, you are totally right. I am planning to change the Gnocchi backend to file before enabling it again.

Thanks a lot,
Khodayar


On Tue, Oct 27, 2020 at 5:07 PM Matthias Runge <mrunge@redhat.com> wrote:
On 27/10/2020 13:47, Khodayar Doustar wrote:
> Matthias,
>
> I've done that like this:
>
> openstack overcloud deploy --templates -r ..........
> -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml
>
> but it didn't change anything.
> I can remove the containers manually, but:
>
> 1. I thought it should remove the container itself when you undo
> something, isn't that right?

I've been told it doesn't.
You need to describe explicitly what you want the installer to do.

If you remove the collectd containers, you should get the cloud back
into a usable state quickly, since collectd won't overwhelm gnocchi
anymore. Again, using traditional telemetry in such a large deployment
is not a good idea. Use Ceph as the gnocchi backend if you really need
gnocchi.

Matthias


> 2. The main problem is that the cloud is down. The CLI works, but we
> cannot access any machine, we cannot log in to Horizon, and every 5
> minutes we get this error on all of our controllers:
>
> haproxy[830994]: proxy cinder has no server available!
>
> As far as I can tell, this queue has no consumer in RabbitMQ:
>
> /]# rabbitmqctl list_queues | awk '$2 > 0'
> Listing queues
> cinder-scheduler_fanout_d94a6e6429db48848ca49d04ce5f4d6b 11277
>
> Thanks,
>
> On Tue, Oct 27, 2020 at 10:29 AM Matthias Runge <mrunge@redhat.com> wrote:
>
>     On 26/10/2020 22:27, Khodayar Doustar wrote:
>     > Hi,
>     >
>     > Maybe you remember me from the last thread:
>     >
>     > I have enabled legacy telemetry services and my cloud got
>     > overloaded and went down.
>     >
>     > I have tried to undo everything I’ve done, with no luck. Load average
>     > is lower, but the gnocchi and ceilometer containers are still there,
>     > and the main problem is timeouts on various services/endpoints.
>
>     You need to undefine these services in tripleo-heat-templates,
>     comparable to [1] and also remove the containers.
>
>     Matthias
>
>     [1]
>     https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/environments/disable-telemetry.yaml
>
>     --
>     Matthias Runge <mrunge@redhat.com>
>
>     Red Hat GmbH, http://www.de.redhat.com/,
>     Registered seat: Grasbrunn,
>     Commercial register: Amtsgericht Muenchen, HRB 153243,
>     Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael
>     O'Neil
