[rdo-list] CentOS OpsTools (logging, monitoring, etc.) SIG proposal
Rich Megginson
rmeggins at redhat.com
Sat May 21 00:01:07 UTC 2016
On 05/20/2016 05:55 PM, Arash Kaffamanesh wrote:
> Great!
>
> I'm currently working on how to bring log monitoring and application
> performance monitoring under the same roof for cloud-native, highly
> distributed applications on top of OpenStack with Cloud Foundry or
> OpenShift and Kubernetes add-ons, and on defining some best practices
> (needs) for building a simple yet effective cloud-native application
> monitoring solution for BizDevOps (yet another buzzword :-)).
>
> My 10 BizDevOps needs are:
>
> 1. Bring log and performance monitoring under the same roof, by
> providing a seamless correlation between log and performance metrics.
> 2. Provide intuitive pre-built monitoring interfaces and dashboards
> for everybody and for different roles and organizations
> (BizDevOps) (note: people lack the time and sometimes the skills
> to configure a monitoring tool).
> 3. Build dedicated dashboards for transaction and correlation
> analysis to identify the usual suspects, such as memory leaks,
> garbage collection, and saturated thread pools, as well as the
> hundreds of unusual suspects that might be the root cause of problems.
> 4. Enhance the quality of logs (at the PaaS and application level),
> define custom metrics that are specific to our cloud-native
> applications, and visualize these metrics on custom dashboards for
> tenants with different roles.
> 5. Analyze long-term trends, such as: how big is my database and how
> fast is it growing? How quickly is my daily active user count growing?
> 6. Implement innovative ideas such as data mining, forecasting and
> advanced analytics support to provide added value to the
> monitoring solution.
> 7. Get alerts on issues before customers notice, use the monitoring
> tool as an early warning system, and analyze application
> performance before and after new code deployments.
> 8. If remediation actions are triggered through the monitoring
> solution, require human approval before the script is executed
> (this provides a better understanding of the root cause of the
> problem and how to eliminate it in the long term).
> 9. Implement a simple yet effective alerting system with a clear
> escalation path and low noise (rules that generate alerts
> for developers or operators should be simple to understand and
> represent a clear failure).
> 10. Combine heavy use of white-box monitoring with modest but critical
> uses of black-box monitoring and learn from others like Google
> about how they are monitoring their highly distributed systems:
> https://www.oreilly.com/ideas/monitoring-distributed-systems
>
These are good.
> To meet the needs above, I'm investigating the following tools for
> bringing log and performance monitoring under the same roof:
>
> * ELK / EFK Stack
> * Hawkular: http://www.hawkular.org/
>
EFK is already being used by OpenShift and RDO, and Hawkular is already
being used by OpenShift - these will be among our first packages to support.
> * Stagemonitor http://www.stagemonitor.org/
> * cAdvisor https://github.com/google/cadvisor
>
>
> I think these BizDevOps tools might be the right choice to start
> with, and I'd be happy to help.
>
> Cheers,
> Arash
>
>
> On Fri, May 20, 2016 at 9:08 PM, Matthias Runge <mrunge at redhat.com>
> wrote:
>
> On 20/05/16 16:12, Rich Megginson wrote:
> > We are trying to start up a CentOS OpsTools SIG
> > https://wiki.centos.org/SpecialInterestGroup for logging,
> > monitoring, etc.
> >
> > The intention is that this would be the upstream for development and
> > packaging of tools related to logging (EFK stack, etc.), monitoring,
> > and other opstools, as a single place where packages can be consumed
> > by RDO, OpenShift Origin, and other upstream projects - pool our
> > resources, share the lessons learned, and enable cross project log
> > aggregation and correlation (e.g. running OpenShift on top of
> > OpenStack on top of Ceph/Gluster - do my OpenShift application errors
> > correlate with Nova errors? file system errors?). This would also be
> > a place for installers (puppet manifests, ansible playbooks), and
> > possibly testing/CI and containers.
> >
> > If you are interested, please chime in in the email thread:
> > https://lists.centos.org/pipermail/centos-devel/2016-May/014777.html
> >
> Thank you for the reminder, Rich.
>
> We already have quite a few interested people. The reason I didn't
> mention this here is that it has a broader focus than just RDO.
>
> On the other hand, it will clearly be usable with RDO, and it will
> help RDO operators get to the root of issues as they occur.
>
> If any of you are interested or can help, please join us on the
> centos-devel mailing list and express your interest there. It will
> help us speed things up.
> --
> Matthias Runge <mrunge at redhat.com>
>
> Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
> Commercial register: Amtsgericht Muenchen, HRB 153243,
> Managing Directors: Paul Argiry, Charles Cachera, Michael Cunningham,
> Michael O'Neill
>
> _______________________________________________
> rdo-list mailing list
> rdo-list at redhat.com
> https://www.redhat.com/mailman/listinfo/rdo-list
>
> To unsubscribe: rdo-list-unsubscribe at redhat.com