[rdo-users] TripleO Monitoring Tool/Method

Khodayar Doustar khodayard at gmail.com
Fri Oct 23 15:56:55 UTC 2020


Great, but I just don't understand: what is the "new" monitoring, if
that's the "old" one?

Regards,

On Fri, Oct 23, 2020 at 5:07 PM Matthias Runge <mrunge at redhat.com> wrote:

> Hi.
>
> For STF deployment, you can follow the doc you linked.
>
> Legacy refers to "old" monitoring, meaning ceilometer, aodh and gnocchi.
>
> Depending on your setup, you'll also enable ceilometer for OpenStack
> usage reporting.
>
> Because you would also like to use CloudKitty, you will have to enable
> gnocchi (IIRC); that is something you'll have to check.
>
> Matthias
>
> On 23/10/2020 16:02, Khodayar Doustar wrote:
> > Hi,
> >
> > Thanks a lot Matthias,
> >
> > And is it also enough to follow the doc here:
> > https://infrawatch.github.io/documentation ? I ask because I'm using
> > TripleO on CentOS; maybe you are using the original RHOSP?
> >
> > Is it "legacy" relative to the new one, which is STF? Or does "Legacy"
> > refer to some other modern monitoring?
> > Does that mean that if I'm going to use STF, I won't need this legacy
> > stack? (Considering that I'm going to implement CloudKitty as well.)
> >
> > Regards,
> > Khodayar
> >
> > On Fri, Oct 23, 2020 at 3:37 PM Matthias Runge
> > <mrunge at redhat.com> wrote:
> >
> >     Hi,
> >
> >     Yes, of course I'm using STF, and it's not complicated.
> >     It's always a good idea to separate your monitoring stack from the
> >     monitored infrastructure. How would you know your stack is down, if
> >     notifications are also sent from that stack?
> >
> >     With the tripleo-heat-templates you linked, you basically enable
> >     legacy telemetry (ceilometer, aodh, gnocchi).
> >
> >     If you are running 40 computes, that is not a small stack anymore. I
> >     would recommend using ceph as the backend.
> >
> >     Also, depending on your use-case and your settings (for collectd),
> >     you may want to tune the polling interval; the parameter is
> >     CollectdDefaultPollingInterval. I have set it here to something like
> >     5 secs, but in your case I would suggest using 600 (the same as for
> >     Ceilometer).
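> >
> >     As a sketch, an extra environment file for that could look like this
> >     (the rbd value assumes the ceph suggestion above; double-check the
> >     parameter names against your templates):
> >
> >       parameter_defaults:
> >         # store gnocchi metrics in ceph rather than swift
> >         GnocchiBackend: rbd
> >         # poll every 600 seconds, same as ceilometer
> >         CollectdDefaultPollingInterval: 600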
> >
> >     Matthias
> >
> >
> >     On 23/10/2020 11:09, Khodayar Doustar wrote:
> >     > Matthias,
> >     >
> >     > Thanks a lot for your answer.
> >     > Yes, you win the bet :) I've used swift and currently struggling to
> >     > disable collectd to make my cloud usable again! :))
> >     >
> >     > I've seen this STF (Service Telemetry Framework), but it seems a
> >     > little bit too complicated. I would have to implement an OKD
> >     > cluster just to monitor my OpenStack; isn't that too much work?
> >     > Have you tried it yourself?
> >     >
> >     > If I understand correctly, with your first and main suggestion you
> >     > mean adding these files to my overcloud deploy command:
> >     >
> >     > /usr/share/openstack-tripleo-heat-templates/environments/enable-legacy-telemetry.yaml
> >     > /usr/share/openstack-tripleo-heat-templates/environments/services/collectd.yaml
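> >     >
> >     > i.e. roughly this (a sketch; the other -e files of my deployment
> >     > are elided):
> >     >
> >     >   openstack overcloud deploy --templates \
> >     >     -e /usr/share/openstack-tripleo-heat-templates/environments/enable-legacy-telemetry.yaml \
> >     >     -e /usr/share/openstack-tripleo-heat-templates/environments/services/collectd.yaml
> >     >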
> >     > And for performance tuning, I've checked this page:
> >     >
> >     > https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/deployment_recommendations_for_specific_red_hat_openstack_platform_services/config-recommend-telemetry_config-recommend-telemetry#config_telemetry-small-overcloud_config-recommend-telemetry
> >     >
> >     > Is that what you mean?
> >     > If so, I should make my cloud usable again and just switch
> >     > GnocchiBackend to file-based storage on a shared file system
> >     > (e.g. NFS), because I have 4 controller nodes; the rest is exactly
> >     > what I've done up to now.
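> >     >
> >     > Something like this, I suppose (a sketch; the NFS path is made up,
> >     > and I'd still have to verify the file-path parameter name in the
> >     > templates):
> >     >
> >     >   parameter_defaults:
> >     >     # file backend instead of swift
> >     >     GnocchiBackend: file
> >     >     # hypothetical NFS mount shared by all 4 controllers
> >     >     GnocchiFileBasePath: /srv/nfs/gnocchi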
> >     >
> >     > Thanks a lot,
> >     > Khodayar
> >     >
> >     > On Fri, Oct 23, 2020 at 10:01 AM Matthias Runge
> >     > <mrunge at redhat.com> wrote:
> >     >
> >     >     On 22/10/2020 17:46, Khodayar Doustar wrote:
> >     >     > Hi everybody,
> >     >     >
> >     >     > I am searching for a good and useful method to monitor my
> >     >     > 40-node cloud.
> >     >     >
> >     >     > I have tried
> >     >     >
> >     >     > - Prometheus + Grafana (with
> >     >     > https://github.com/openstack-exporter/openstack-exporter),
> >     >     > but it cannot monitor node load, CPU usage, etc.
> >     >     > and
> >     >     > - Gnocchi + Collectd + Grafana, but it puts an unbelievable
> >     >     > load on the nodes and makes the whole cloud completely
> >     >     > unusable!
> >     >     >
> >     >     > I've tried to use Graphite + Grafana but I failed.
> >     >     >
> >     >     > Do you have any suggestions?
> >     >
> >     >
> >     >     Hi,
> >     >
> >     >     Yes, I have some opinions here.
> >     >
> >     >     My proposal here is:
> >     >
> >     >     - use collectd to collect low-level metrics from your
> >     >     baremetal machines
> >     >     - use ceilometer to collect OpenStack-related info, like
> >     >     project usage, etc. That is nothing you'd get by using
> >     >     node-exporter
> >     >     - hook them both together and send the metrics over to
> >     >     something called Service Telemetry Framework; see the sketch
> >     >     after this list. The configuration *is* included in tripleo,
> >     >     and the website has documentation available:
> >     >     https://infrawatch.github.io/documentation
> >     >     - graphite + grafana (plus collectd) is also a single-node
> >     >     setup and won't provide you reliability
> >     >     - collectd also provides the ability to send events, which can
> >     >     be acted on. That is not included if you use node-exporter,
> >     >     openstack-exporter, etc. Prometheus monitoring creates events
> >     >     from metrics, but will be slow to detect failed components.
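> >     >
> >     >     For the STF hookup, the deploy roughly gains the qdr "write"
> >     >     environments plus a connector file you create from the
> >     >     infrawatch docs (a sketch from memory; exact file names vary
> >     >     by release, so check the documentation):
> >     >
> >     >       openstack overcloud deploy --templates \
> >     >         -e environments/metrics/ceilometer-write-qdr.yaml \
> >     >         -e environments/metrics/collectd-write-qdr.yaml \
> >     >         -e stf-connectors.yaml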
> >     >
> >     >     Since prometheus is meant to be a single server, there is no
> >     >     HA per se in prometheus. That makes handling prometheus on
> >     >     standalone machines a bit awkward, unless you have an
> >     >     infrastructure taking care of that.
> >     >
> >     >     In your tests with gnocchi, collectd and grafana, I bet you
> >     >     used swift as the backend for gnocchi storage. That is not a
> >     >     good idea and may lead to bad performance.
> >     >
> >     >     Matthias
> >     >
>
> --
> Matthias Runge <mrunge at redhat.com>
>
> Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
> Commercial register: Amtsgericht Muenchen, HRB 153243,
> Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neil