On 23/10/2020 17:47, Laurent Dumont wrote:
What do you mean by collectd and events? Is there a way to make
collectd collect RabbitMQ messages/events and forward them?
collectd does not connect to RabbitMQ (in this setup).
collectd (as the collecting agent) collects metrics, but also events. An
event can be: process "nova" died, interface "eth0" went down.
These events are not handled by Prometheus (as that is a metrics store).
ceilometer also creates/forwards metrics and events. In STF, this data
(metrics and events, from both sources, collectd and ceilometer) is
delivered via amqp1 to Prometheus (for metrics) and Elasticsearch
(for events).
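To picture the split described above, here is a minimal Python sketch. It
is not STF code: the Record shape and the route() function are hypothetical
and only mirror the rule that metrics go to Prometheus and events go to
Elasticsearch.

```python
from dataclasses import dataclass


# Hypothetical record shape; real collectd/ceilometer payloads look different.
@dataclass
class Record:
    source: str  # "collectd" or "ceilometer"
    kind: str    # "metric" or "event"
    body: str


def route(record: Record) -> str:
    """Return the backend that stores this kind of record in the setup above."""
    if record.kind == "metric":
        return "prometheus"
    if record.kind == "event":
        return "elasticsearch"
    raise ValueError(f"unknown record kind: {record.kind!r}")


if __name__ == "__main__":
    samples = [
        Record("collectd", "metric", "cpu load on a node"),
        Record("collectd", "event", 'interface "eth0" went down'),
        Record("ceilometer", "metric", "project usage"),
    ]
    for r in samples:
        print(f"{r.source:10} {r.kind:6} -> {route(r)}")
```

The source of a record does not influence the routing here; only its kind
does, which matches the description above.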
Does that answer your question?
Matthias
On Fri, Oct 23, 2020 at 4:01 AM Matthias Runge <mrunge(a)redhat.com> wrote:
On 22/10/2020 17:46, Khodayar Doustar wrote:
> Hi everybody,
>
> I am searching for a good and useful method to monitor my 40-node
> cloud.
>
> I have tried
>
> - Prometheus + Grafana (with
> https://github.com/openstack-exporter/openstack-exporter) but it
> cannot monitor node load, CPU usage, etc.
> and
> - Gnocchi + collectd + Grafana, but it puts an unbelievable load on the
> nodes and makes the whole cloud completely unusable!
>
> I've tried to use Graphite + Grafana but I failed.
>
> Do you have any suggestions?
Hi,
yes, I have some opinions here.
My proposal here is:
- use collectd to collect low-level metrics from your baremetal machines
- use ceilometer to collect OpenStack-related info, like project usage,
etc. That is something you won't get from node-exporter.
- hook them both together and send the metrics over to something called
Service Telemetry Framework. The configuration *is* included in tripleo.
The documentation is available at
https://infrawatch.github.io/documentation
- graphite + grafana (plus collectd) is also a single-node setup and
won't give you reliability.
- collectd can also send events, which can be acted on. That is not
included if you use node-exporter, openstack-exporter, etc. Prometheus
monitoring derives events from metrics, but will be slow to detect
failed components.
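To give a rough feel for why polling-based detection is slow, here is a
small Python sketch of a simplified model. The function name, the model,
and the numbers are assumptions for illustration, not measurements: a
failure can happen right after a scrape, the alert rule is then evaluated
up to one evaluation interval later, and the condition may have to hold
for a configured duration before the alert fires.

```python
def worst_case_detection_seconds(scrape_interval: float,
                                 evaluation_interval: float,
                                 for_duration: float = 0.0) -> float:
    """Rough upper bound on the delay between a failure and a firing alert.

    Simplified model (an assumption, not Prometheus internals):
    - the failure happens just after a scrape, so it is first seen
      up to one scrape interval later;
    - the alert rule notices it up to one evaluation interval after that;
    - the condition must then hold for for_duration before the alert fires.
    Scrape timeouts and notification delivery are ignored.
    """
    return scrape_interval + evaluation_interval + for_duration


if __name__ == "__main__":
    # Illustrative values only, not taken from any real deployment.
    delay = worst_case_detection_seconds(30, 30, 60)
    print(f"alert may fire up to ~{delay:.0f}s after the failure")
```

An event sent by the collecting agent does not have to wait for the next
poll, which is the point made in the last item above.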
Since Prometheus is meant to run as a single server, there is no HA per se
in Prometheus. That makes running Prometheus on standalone machines a bit
awkward, or you'd need an infrastructure taking care of that.
In your tests with gnocchi, collectd and grafana, I bet you used swift
as the backend for gnocchi storage. That is not a good idea and may lead
to bad performance.
Matthias
--
Matthias Runge <mrunge(a)redhat.com>
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neil
_______________________________________________
users mailing list
users(a)lists.rdoproject.org
http://lists.rdoproject.org/mailman/listinfo/users
To unsubscribe: users-unsubscribe(a)lists.rdoproject.org