[rdo-users] TripleO Monitoring Tool/Method

Laurent Dumont laurentfdumont at gmail.com
Fri Oct 23 18:05:28 UTC 2020


I guess I'm still not super clear on how the Framework is fed internal
OpenStack events.

This is something that we are actively discussing with RH as part of my
$dayjob. We basically want to get some real-time visibility into OpenStack
events (nova, neutron, cinder, glance) and are looking into ways to feed
our tooling from RabbitMQ notifications. We were told that this specific
approach is risky and not supported. The notification queue does not seem
to be present in OSP13, and most documentation details how to remove it
when you are on versions <13.
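
For reference, a minimal sketch of what consuming such a notification
involves once a message body has been read from the queue. It assumes the
oslo.messaging 2.0 envelope, where the actual notification is a JSON string
stored under the "oslo.message" key. The sample message and the helper names
(unwrap_notification, summarize) are hypothetical, and the code that reads
from RabbitMQ is deliberately left out.

```python
import json


def unwrap_notification(raw_body):
    """Return the inner notification dict from a raw message body."""
    outer = json.loads(raw_body)
    # Assumption: the 2.0 envelope keeps the real message as a JSON string
    # under "oslo.message". Anything without that key is returned as-is.
    if isinstance(outer, dict) and "oslo.message" in outer:
        return json.loads(outer["oslo.message"])
    return outer


def summarize(notification):
    """Keep only the fields an event consumer typically needs."""
    return {
        "event_type": notification.get("event_type"),
        "publisher_id": notification.get("publisher_id"),
        "timestamp": notification.get("timestamp"),
        "payload": notification.get("payload", {}),
    }


# Hypothetical sample, hand-written for illustration only.
sample = json.dumps({
    "oslo.version": "2.0",
    "oslo.message": json.dumps({
        "event_type": "compute.instance.power_off.end",
        "publisher_id": "compute.host-1",
        "timestamp": "2020-10-23 18:05:28.000000",
        "payload": {"instance_id": "hypothetical-uuid", "state": "stopped"},
    }),
})

event = summarize(unwrap_notification(sample))
print(event["event_type"], event["payload"]["state"])
```

Keeping the unwrapping separate from the transport means the same function
works whether the body arrives straight from RabbitMQ or through another
forwarder.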

We already have performance metrics (for computes, through collectd +
cAdvisor) that are fed to our own Prometheus.

With the STF, is there a way to limit our integration to Ceilometer and the
event forwarding system? And what is actually driving the feed of OpenStack
events to Ceilometer itself?

I'm open to running the STF as a whole, but I'd rather pick what I really
need, which is visibility into internal OpenStack events (VM start, stop,
migrate, port create, delete and so forth).
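
As an illustration of that selection, a small sketch that keeps only the
event types of interest. The patterns are assumptions based on commonly seen
nova and neutron notification names and should be checked against what the
deployment really emits; is_wanted is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Assumed event_type patterns for VM start/stop/migrate and port
# create/delete. Verify them against real notifications before relying
# on this list.
WANTED_PATTERNS = (
    "compute.instance.power_on.*",
    "compute.instance.power_off.*",
    "compute.instance.live_migration.*",
    "port.create.*",
    "port.delete.*",
)


def is_wanted(event_type, patterns=WANTED_PATTERNS):
    """Return True if event_type matches any of the shell-style patterns."""
    return any(fnmatchcase(event_type, pattern) for pattern in patterns)


# Hypothetical incoming event types.
incoming = [
    "compute.instance.power_off.end",
    "image.update",
    "port.create.end",
]
selected = [name for name in incoming if is_wanted(name)]
print(selected)
```

Matching on patterns rather than exact names keeps the ".start" and ".end"
variants of each operation without listing them one by one.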

Thanks!

On Fri, Oct 23, 2020 at 1:58 PM Matthias Runge <mrunge at redhat.com> wrote:

> On 23/10/2020 17:47, Laurent Dumont wrote:
> > What do you mean by collectd and events? Is there a way to make
> > collectd collect RabbitMQ messages/events and forward them?
> >
> Collectd does not connect to rabbitmq (in this setup).
>
> collectd (as the collecting agent) collects metrics, but also events. An
> event can be: process "nova" died, interface "eth0" went down.
> These events are not handled by prometheus (as that is a metrics store).
>
> ceilometer also creates/forwards metrics and events. In STF, this data
> (metrics and events, from both sources, collectd and ceilometer) is
> delivered via amqp1 to prometheus (for metrics) and elasticsearch
> (for events).
>
> Does that answer your question?
>
> Matthias
>
> > On Fri, Oct 23, 2020 at 4:01 AM Matthias Runge <mrunge at redhat.com>
> > wrote:
> >
> >     On 22/10/2020 17:46, Khodayar Doustar wrote:
> >     > Hi everybody,
> >     >
> >     > I am searching for a good and useful method to monitor my 40-node
> >     > cloud.
> >     >
> >     > I have tried
> >     >
> >     > - Prometheus + Grafana (with
> >     > https://github.com/openstack-exporter/openstack-exporter) but it
> >     > cannot monitor node load, CPU usage, etc.
> >     > and
> >     > - Gnocchi + Collectd + Grafana, but it puts an unbelievable load on
> >     > the nodes and makes the whole cloud completely unusable!
> >     >
> >     > I've tried to use Graphite + Grafana but I failed.
> >     >
> >     > Do you have any suggestions?
> >
> >
> >     Hi,
> >
> >     yes, I have some opinions here.
> >
> >     My proposal here is:
> >
> >     - use collectd to collect low-level metrics from your baremetal
> >     machines
> >     - use ceilometer to collect OpenStack-related info, like project
> >     usage, etc. That is nothing you'd get by using node-exporter
> >     - hook them both together and send metrics over to something called
> >     Service Telemetry Framework. The configuration *is* included in
> >     tripleo. The website has documentation available:
> >     https://infrawatch.github.io/documentation
> >     - graphite + grafana (plus collectd) is also a single-node setup and
> >     won't provide you reliability.
> >     - collectd also provides the ability to send events, which can be
> >     acted on. That is not included if you use node-exporter,
> >     openstack-exporter, etc. Prometheus monitoring creates events from
> >     metrics, but will be slow to detect failed components.
> >
> >     Since prometheus is meant to be a single server, there is no HA per
> >     se in prometheus. That makes handling prometheus on standalone
> >     machines a bit awkward, or you'd need an infrastructure taking care
> >     of that.
> >
> >     In your tests with gnocchi, collectd and grafana, I bet you used
> >     swift as the backend for gnocchi storage. That is not a good idea and
> >     may lead to bad performance.
> >
> >     Matthias
> >
> >     --
> >     Matthias Runge <mrunge at redhat.com>
> >
> >     Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
> >     Commercial register: Amtsgericht Muenchen, HRB 153243,
> >     Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael
> >     O'Neil
> >
>