[rdo-users] TripleO Monitoring Tool/Method

Fri Oct 23 17:53:39 UTC 2020

On 23/10/2020 17:56, Khodayar Doustar wrote:
> Great, I just don't understand what is the "new" monitoring if that's
> the "old" one?

The "old" one is OpenStack Telemetry, which comprises ceilometer, aodh,
gnocchi, and panko (now deprecated and removed).

The "new" one would be STF: collectd and ceilometer, with transport via
amqp1 and data ingestion by prometheus and elasticsearch.

Matthias
> 
> Regards,
> 
> On Fri, Oct 23, 2020 at 5:07 PM Matthias Runge <mrunge at redhat.com> wrote:
> 
>     Hi.
> 
>     For STF deployment, you can follow the doc you linked.
> 
>     Legacy refers to "old" monitoring, meaning ceilometer, aodh and gnocchi.
> 
>     Depending on your setup, you'll also enable ceilometer for OpenStack
>     usage reporting.
> 
>     Because you would also like to use cloudkitty, you will have to enable
>     gnocchi (IIRC). That is something you'll have to check.
> 
>     Matthias
> 
>     On 23/10/2020 16:02, Khodayar Doustar wrote:
>     > Hi,
>     >
>     > Thanks a lot Matthias,
>     >
>     > And is it also enough to follow this doc here:
>     > https://infrawatch.github.io/documentation ? Because I'm using TripleO
>     > on CentOS, maybe you are using Original RHOSP?
>     >
>     > Is it "legacy" relative to the new one, which is STF? Or is it "legacy"
>     > relative to some other modern monitoring?
>     > Does it mean that if I'm going to use STF I won't need this Legacy?
>     > (considering that I'm going to implement CloudKitty as well)
>     >
>     > Regards,
>     > Khodayar
>     >
>     > On Fri, Oct 23, 2020 at 3:37 PM Matthias Runge <mrunge at redhat.com> wrote:
>     >
>     >     Hi,
>     >
>     >     yes of course I'm using STF, and it's not complicated.
>     >     It's always a good idea to separate your monitoring stack from the
>     >     monitored infrastructure. How would you know your stack is down if
>     >     notifications are also sent from that stack?
>     >
>     >     With the tripleo-heat-templates you linked, you basically enable
>     >     legacy telemetry (ceilometer, aodh, gnocchi).
>     >
>     >     If you are running 40 computes, that is not a small stack anymore.
>     >     I would recommend using ceph as the backend.
>     >
>     >     Also, depending on your use case and your collectd settings, you
>     >     may want to poll less often. The parameter is
>     >     CollectdDefaultPollingInterval; I have set it here to something
>     >     like 5 secs, but in your case I would suggest using 600 (the same
>     >     as for Ceilometer).
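
As a rough illustration of the two tuning suggestions above, a TripleO
environment file could look like the sketch below. The parameter names
CollectdDefaultPollingInterval and GnocchiBackend are the ones mentioned in
this thread; the file name is hypothetical, and the value "rbd" for a
ceph-backed gnocchi is an assumption that should be checked against the
tripleo-heat-templates release in use.

```yaml
# telemetry-tuning.yaml (hypothetical file name)
parameter_defaults:
  # Poll collectd plugins every 600 seconds instead of every few seconds.
  CollectdDefaultPollingInterval: 600
  # Assumption: 'rbd' selects ceph as the gnocchi storage backend; verify the
  # allowed values for GnocchiBackend in your tripleo-heat-templates release.
  GnocchiBackend: rbd
```

Such a file would typically be passed to the overcloud deploy command with an
additional -e option, after the telemetry environment files.
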
>     >
>     >     Matthias
>     >
>     >
>     >     On 23/10/2020 11:09, Khodayar Doustar wrote:
>     >     > Matthias,
>     >     >
>     >     > Thanks a lot for your answer.
>     >     > Yes, you win the bet :) I've used swift and am currently struggling
>     >     > to disable collectd to make my cloud usable again! :))
>     >     >
>     >     > I've seen this STF (Service Telemetry Framework) but it seems a
>     >     > little bit too complicated. I should implement an OKD cluster to
>     >     > monitor my openstack, isn't it too much work?
>     >     > Have you tried it yourself?
>     >     >
>     >     > If I understand correctly, with your first and main opinion you mean
>     >     > adding these files to my overcloud deploy command:
>     >     >
>     >     > /usr/share/openstack-tripleo-heat-templates/environments/enable-legacy-telemetry.yaml
>     >     >
>     >     > /usr/share/openstack-tripleo-heat-templates/environments/services/collectd.yaml
>     >     >
>     >     > and for performance tuning I've checked this page:
>     >     >
>     >     > https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/deployment_recommendations_for_specific_red_hat_openstack_platform_services/config-recommend-telemetry_config-recommend-telemetry#config_telemetry-small-overcloud_config-recommend-telemetry
>     >     >
>     >     > Is that what you mean?
>     >     > If so, I should make my cloud usable again and just change
>     >     > GnocchiBackend to a path to a file on a shared file system (i.e. NFS),
>     >     > because I have 4 controller nodes; the rest is exactly what I've done
>     >     > up to now.
>     >     >
>     >     > Thanks a lot,
>     >     > Khodayar
>     >     >
>     >     > On Fri, Oct 23, 2020 at 10:01 AM Matthias Runge <mrunge at redhat.com> wrote:
>     >     >
>     >     >     On 22/10/2020 17:46, Khodayar Doustar wrote:
>     >     >     > Hi everybody,
>     >     >     >
>     >     >     > I am searching for a good and useful method to monitor my
>     >     >     > 40-node cloud.
>     >     >     >
>     >     >     > I have tried
>     >     >     >
>     >     >     > - Prometheus + Grafana (with
>     >     >     > https://github.com/openstack-exporter/openstack-exporter) but it
>     >     >     > cannot monitor nodes load and cpu usage etc.
>     >     >     > and
>     >     >     > - Gnocchi + Collectd + Grafana, but it imposes an unbelievable load
>     >     >     > on the nodes and makes the whole cloud completely unusable!
>     >     >     >
>     >     >     > I've tried to use Graphite + Grafana but I failed.
>     >     >     >
>     >     >     > Do you have any suggestions?
>     >     >
>     >     >
>     >     >     Hi,
>     >     >
>     >     >     yes, I have some opinions here.
>     >     >
>     >     >     My proposal here is:
>     >     >
>     >     >     - use collectd to collect low-level metrics from your baremetal
>     >     >     machines
>     >     >     - use ceilometer to collect OpenStack-related info, like project
>     >     >     usage, etc. That is nothing you'd get by using node-exporter
>     >     >     - hook them both together and send metrics over to something called
>     >     >     Service Telemetry Framework. The configuration *is* included in
>     >     >     tripleo. The website has documentation available:
>     >     >     https://infrawatch.github.io/documentation
>     >     >     - graphite + grafana (plus collectd) is also a single-node setup and
>     >     >     won't provide you reliability.
>     >     >     - collectd also provides the ability to send events, which can be
>     >     >     acted on. That is not included if you use node-exporter,
>     >     >     openstack-exporter etc. Prometheus monitoring creates events from
>     >     >     metrics, but will be slow to detect failed components.
>     >     >
>     >     >     Since prometheus is meant to be single server, there is no HA per se
>     >     >     in prometheus. That makes handling prometheus on standalone machines
>     >     >     a bit awkward, or you'd need an infrastructure taking care of that.
>     >     >
>     >     >     In your tests with gnocchi, collectd and grafana, I bet you used
>     >     >     swift as backend for gnocchi storage. That is not a good idea and
>     >     >     may lead to bad performance.
>     >     >
>     >     >     Matthias
>     >     >
>     >     >     --
>     >     >     Matthias Runge <mrunge at redhat.com>
>     >     >
>     >     >     Red Hat GmbH, http://www.de.redhat.com/,
>     >     >     Registered seat: Grasbrunn,
>     >     >     Commercial register: Amtsgericht Muenchen, HRB 153243,
>     >     >     Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael
>     >     >     O'Neil
>     >     >
>     >     >     _______________________________________________
>     >     >     users mailing list
>     >     >     users at lists.rdoproject.org
>     >     >     http://lists.rdoproject.org/mailman/listinfo/users
>     >     >
>     >     >     To unsubscribe: users-unsubscribe at lists.rdoproject.org
>     >     >
>     >
>     >
>     >
> 
> 
> 

-- 
Matthias Runge <mrunge at redhat.com>

Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neil