<div dir="ltr">Great, I just don't understand what the "new" monitoring is if that's the "old" one?<div><br></div><div>Regards,</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Oct 23, 2020 at 5:07 PM Matthias Runge <<a href="mailto:mrunge@redhat.com">mrunge@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi.<br>
<br>
For STF deployment, you can follow the doc you linked.<br>
<br>
Legacy refers to "old" monitoring, meaning ceilometer, aodh and gnocchi.<br>
<br>
Depending on your setup, you'll also enable ceilometer for OpenStack<br>
usage reporting.<br>
<br>
Because you would also like to use CloudKitty, you will have to enable<br>
gnocchi (IIRC). That is something you'll have to check.<br>
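<br>
A rough sketch of how the pieces named in this thread might fit together on the command line, assuming the standard <code>openstack overcloud deploy --templates -e ...</code> invocation. The two environment file paths and the CollectdDefaultPollingInterval value of 600 are taken from the messages quoted below; the file name <code>collectd-interval.yaml</code> and the helper names are hypothetical, and the sketch only prints text, it does not deploy anything.<br>
<pre>
```python
import shlex

# Environment files quoted further down in this thread.
TEMPLATES = "/usr/share/openstack-tripleo-heat-templates/environments"
ENV_FILES = [
    f"{TEMPLATES}/enable-legacy-telemetry.yaml",
    f"{TEMPLATES}/services/collectd.yaml",
]


def polling_env(interval_seconds):
    """Return the text of a small Heat environment file that sets the
    collectd polling interval (parameter name as given in the thread)."""
    return (
        "parameter_defaults:\n"
        f"  CollectdDefaultPollingInterval: {interval_seconds}\n"
    )


def deploy_command(extra_env_files):
    """Build the argument list for the deploy command; nothing is executed."""
    cmd = ["openstack", "overcloud", "deploy", "--templates"]
    for path in ENV_FILES + list(extra_env_files):
        cmd += ["-e", path]
    return cmd


if __name__ == "__main__":
    # Hypothetical file name; its content would be the text printed first.
    print(polling_env(600), end="")
    print(shlex.join(deploy_command(["collectd-interval.yaml"])))
```
</pre>
Any other environment files already used for the overcloud would have to stay on the same command line; they are left out here because the thread does not list them.<br>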
<br>
Matthias<br>
<br>
On 23/10/2020 16:02, Khodayar Doustar wrote:<br>
> Hi,<br>
> <br>
> Thanks a lot Matthias,<br>
> <br>
> Is it also enough to follow this doc<br>
> here: <a href="https://infrawatch.github.io/documentation" rel="noreferrer" target="_blank">https://infrawatch.github.io/documentation</a>? I ask because I'm using TripleO<br>
> on CentOS; maybe you are using the original RHOSP?<br>
> <br>
> Is it "legacy" compared to the new one, which is STF? Or is "Legacy" some<br>
> other modern monitoring?<br>
> Does it mean that if I'm going to use STF I won't need this Legacy<br>
> (considering that I'm going to implement CloudKitty as well)?<br>
> <br>
> Regards,<br>
> Khodayar<br>
> <br>
> On Fri, Oct 23, 2020 at 3:37 PM Matthias Runge <<a href="mailto:mrunge@redhat.com" target="_blank">mrunge@redhat.com</a>> wrote:<br>
> <br>
>     Hi,<br>
> <br>
>     yes of course I'm using STF, and it's not complicated.<br>
>     It's always a good idea to separate your monitoring stack from the<br>
>     monitored infrastructure. How would you know your stack is down, if<br>
>     notifications are also sent from that stack?<br>
> <br>
>     With the tripleo-heat-templates you linked, you basically enable legacy<br>
>     telemetry (ceilometer, aodh, gnocchi).<br>
> <br>
>     If you are running 40 computes, that is not a small stack anymore. I<br>
>     would recommend using Ceph as the backend.<br>
> <br>
>     Also, depending on your use case and your settings (for collectd), you<br>
>     may want to poll less often; the parameter is<br>
>     CollectdDefaultPollingInterval. I have set it here to something like 5<br>
>     seconds, but in your case I would suggest using 600 (same as for<br>
>     Ceilometer).<br>
> <br>
>     Matthias<br>
> <br>
> <br>
>     On 23/10/2020 11:09, Khodayar Doustar wrote:<br>
>     > Matthias,<br>
>     ><br>
>     > Thanks a lot for your answer.<br>
>     > Yes, you win the bet :) I used swift and am currently struggling to<br>
>     > disable collectd to make my cloud usable again! :))<br>
>     ><br>
>     > I've seen this STF (Service Telemetry Framework) but it seems a little<br>
>     > bit too complicated. I should implement an OKD cluster to monitor my<br>
>     > openstack, isn't it too much work?<br>
>     > Have you tried it yourself?<br>
>     ><br>
>     > If I understand correctly, your first and main suggestion means<br>
>     > adding these files to my overcloud deploy command:<br>
>     ><br>
>     ><br>
>     /usr/share/openstack-tripleo-heat-templates/environments/enable-legacy-telemetry.yaml<br>
>     ><br>
>     /usr/share/openstack-tripleo-heat-templates/environments/services/collectd.yaml<br>
>     ><br>
>     > and for performance tuning I've checked this page:<br>
>     ><br>
>     <a href="https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/deployment_recommendations_for_specific_red_hat_openstack_platform_services/config-recommend-telemetry_config-recommend-telemetry#config_telemetry-small-overcloud_config-recommend-telemetry" rel="noreferrer" target="_blank">https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/deployment_recommendations_for_specific_red_hat_openstack_platform_services/config-recommend-telemetry_config-recommend-telemetry#config_telemetry-small-overcloud_config-recommend-telemetry</a><br>
>     ><br>
>     > Is that what you mean?<br>
>     > If so, I should make my cloud usable again and just change GnocchiBackend<br>
>     > to a file path on a shared file system (e.g. NFS), since I have 4<br>
>     > controller nodes; the rest is exactly what I've done up to now.<br>
>     ><br>
>     > Thanks a lot,<br>
>     > Khodayar<br>
>     ><br>
>     > On Fri, Oct 23, 2020 at 10:01 AM Matthias Runge <<a href="mailto:mrunge@redhat.com" target="_blank">mrunge@redhat.com</a>> wrote:<br>
>     ><br>
>     >     On 22/10/2020 17:46, Khodayar Doustar wrote:<br>
>     >     > Hi everybody,<br>
>     >     ><br>
>     >     > I am searching for a good and useful method to monitor my 40-node cloud.<br>
>     >     ><br>
>     >     > I have tried<br>
>     >     ><br>
>     >     > - Prometheus + Grafana (with<br>
>     >     > <a href="https://github.com/openstack-exporter/openstack-exporter" rel="noreferrer" target="_blank">https://github.com/openstack-exporter/openstack-exporter</a>), but it<br>
>     >     > cannot monitor node load, CPU usage, etc.<br>
>     >     > and <br>
>     >     > - Gnocchi + Collectd + Grafana, but it puts an unbelievable load on the<br>
>     >     > nodes and makes the whole cloud completely unusable!<br>
>     >     ><br>
>     >     > I've tried to use Graphite + Grafana but I failed.<br>
>     >     ><br>
>     >     > Do you have any suggestions?<br>
>     ><br>
>     ><br>
>     >     Hi,<br>
>     ><br>
>     >     yes, I have some opinions here.<br>
>     ><br>
>     >     My proposal here is:<br>
>     ><br>
>     >     - use collectd to collect low-level metrics from your baremetal machines<br>
>     >     - use ceilometer to collect OpenStack-related info, like project usage,<br>
>     >     etc. That is not something you'd get by using node-exporter<br>
>     >     - hook them both together and send metrics over to something called<br>
>     >     Service Telemetry Framework. The configuration *is* included in tripleo.<br>
>     >     The website has documentation available:<br>
>     >     <a href="https://infrawatch.github.io/documentation" rel="noreferrer" target="_blank">https://infrawatch.github.io/documentation</a><br>
>     >     - graphite + grafana (plus collectd) is also a single-node setup and<br>
>     >     won't give you reliability.<br>
>     >     - collectd also provides the ability to send events, which can be acted<br>
>     >     on. That is not included if you use node-exporter, openstack-exporter,<br>
>     >     etc. Prometheus monitoring creates events from metrics, but will be slow<br>
>     >     to detect failed components.<br>
>     ><br>
>     >     Since prometheus is meant to be single server, there is no HA per se in<br>
>     >     prometheus. That makes handling prometheus on standalone machines a bit<br>
>     >     awkward, or you'd need an infrastructure taking care of that.<br>
>     ><br>
>     >     In your tests with gnocchi, collectd and grafana, I bet you used swift<br>
>     >     as backend for gnocchi storage. That is not a good idea and may lead to<br>
>     >     bad performance.<br>
>     ><br>
>     >     Matthias<br>
>     ><br>
>     >     --<br>
>     >     Matthias Runge <<a href="mailto:mrunge@redhat.com" target="_blank">mrunge@redhat.com</a>><br>
>     ><br>
>     >     Red Hat GmbH, <a href="http://www.de.redhat.com/" rel="noreferrer" target="_blank">http://www.de.redhat.com/</a>,<br>
>     >     Registered seat: Grasbrunn,<br>
>     >     Commercial register: Amtsgericht Muenchen, HRB 153243,<br>
>     >     Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael<br>
>     >     O'Neil<br>
>     ><br>
>     >     _______________________________________________<br>
>     >     users mailing list<br>
>     >     <a href="mailto:users@lists.rdoproject.org" target="_blank">users@lists.rdoproject.org</a><br>
>     >     <a href="http://lists.rdoproject.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.rdoproject.org/mailman/listinfo/users</a><br>
>     ><br>
>     >     To unsubscribe: <a href="mailto:users-unsubscribe@lists.rdoproject.org" target="_blank">users-unsubscribe@lists.rdoproject.org</a><br>
>     ><br>
> <br>
> <br>
>     -- <br>
>     Matthias Runge <<a href="mailto:mrunge@redhat.com" target="_blank">mrunge@redhat.com</a>><br>
> <br>
>     Red Hat GmbH, <a href="http://www.de.redhat.com/" rel="noreferrer" target="_blank">http://www.de.redhat.com/</a>,<br>
>     Registered seat: Grasbrunn,<br>
>     Commercial register: Amtsgericht Muenchen, HRB 153243,<br>
>     Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael<br>
>     O'Neil<br>
> <br>
<br>
<br>
-- <br>
Matthias Runge <<a href="mailto:mrunge@redhat.com" target="_blank">mrunge@redhat.com</a>><br>
<br>
Red Hat GmbH, <a href="http://www.de.redhat.com/" rel="noreferrer" target="_blank">http://www.de.redhat.com/</a>, Registered seat: Grasbrunn,<br>
Commercial register: Amtsgericht Muenchen, HRB 153243,<br>
Man.Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neil<br>
<br>
</blockquote></div>