On Sun, Jan 7, 2018 at 11:24 PM, Tristan Cacqueray <tdecacqu@redhat.com> wrote:
Hello David and Wesley, please find some comments inlined below.

On January 5, 2018 6:39 pm, Wesley Hayutin wrote:
On Fri, Jan 5, 2018 at 12:36 PM, David Moreau Simard <dms@redhat.com> wrote:

There are already plans [1] to add the software factory implementation of
Grafana on review.rdoproject.org, you can see what it looks like on
softwarefactory-project.io [2].

The backend to this grafana implementation is currently influxdb, not
graphite.
However, there are ongoing discussions to either support both graphite and
influxdb simultaneously or optionally either one.

We're interested in leveraging this influxdb (or graphite) and grafana
implementation for monitoring data in general (uptime, resources, disk
space, load, etc.) so our goals align here.
We both agree that using graphite would be a plus in order to re-use the
same queries in the grafana dashboard but at the same time, influxdb is
more "modern" and easier to work with -- this is why we might end up
deploying both, we'll see.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1514086
[2]: https://softwarefactory-project.io/grafana/


Note that the current influxdb/grafana integration covers instance system
metrics (cpu, mem, network and i/o). We are working on getting zuul and
nodepool metrics, but the upstream queries need to be adapted for influxdb,
hence we may look at integrating graphite/carbon too so that it is easier.
There is also this tool that can make influxdb a backend for graphite:
https://github.com/InfluxGraph/influxgraph

Also note that we are integrating grafyaml into the config repo so that
grafana dashboards can be proposed and updated by regular users too.


This is great news David, thank you for sharing.
Given that this is already planned for Software Factory and we have an
immediate need, I'm wondering how to proceed.
Does the RDO Infra team have an estimate of when graphite/influxdb/grafana
will be moved to production?

While we could set up the grafana/influxdb service, and we should
in the near future, it seems like this CI use case needs some more
tinkering, and I think it would be easier to start with another
dedicated setup until the requirements are better defined.


Some possibilities come to mind, depending on when it moves to prod

1.  The TripleO-CI team waits for prod
2.  TripleO CI would stand up a test instance of graphite/influxdb and
grafana and start to work out what we need to send and how to send data
3.  Is it possible to use the stage instance of RDO SF as a testbed for
TripleO-CI's work?  Meaning we send metrics and use the stage instance with
backing up the data in mind?

What do you think?
Thanks



I think 1. will happen shortly, and this will bring a grafana setup
accessible from the top-menu.

Though I think 2. is probably easier to begin with, and we could
configure the new graphite/influxdb backend in the existing grafana.

Not sure what you mean by 3. If there is a graphite/influxdb service in
rdo-prod tenant, then you could use it for tripleo-ci work of course.
The backup of RDO SF is managed by this playbook:
https://softwarefactory-project.io/r/gitweb?p=software-factory/sf-ops.git;a=blob;f=backup/ansible/backup.yml
We could add_host the new backend and back up its data similarly.


Here are some more thoughts:

Depending on how the metrics are pushed, we may need some kind of
authorization mechanism and a job secret to allow external clients to
push new metrics.

It seems like we could set up a post-run step to push job metrics. Perhaps
we could leverage ara sqldump to extract per-task duration.
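
To make the post-run idea a bit more concrete, here is a minimal sketch of
what such a push could format and send. It relies on two things I am sure
of: carbon's plaintext protocol is one "path value timestamp" line per
metric (default port 2003), and InfluxDB line protocol is
"measurement,tags fields timestamp". Everything else (the metric path, the
tag and field names, the host) is a placeholder, and the sketch assumes
names without spaces or commas so that no escaping is needed.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one metric for carbon's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def influx_line(measurement, tags, fields, timestamp=None):
    """Format one point in InfluxDB line protocol (nanosecond timestamp).

    Assumes tag/field names and values contain no spaces, commas or
    equals signs, so no escaping is performed. Field values are written
    as floats.
    """
    ts = int(time.time()) if timestamp is None else int(timestamp)
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {ts * 10**9}"


def send_to_carbon(lines, host, port=2003):
    """Send already-formatted plaintext lines to a carbon listener.

    The host is hypothetical; 2003 is carbon's default plaintext port.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall("".join(lines).encode("ascii"))
```

A post-run task would then call something like
graphite_line("tripleo.ci.some_job.duration", 5400) and hand the result to
send_to_carbon(), with whatever authorization we end up choosing in front
of the listener.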

Software Factory may also automatically set up a job duration graph
dashboard per project; here is a new user story to track this work:
https://tree.taiga.io/project/morucci-software-factory/us/897


Alternatively, we could also use the zuul sql reporter database, which
already records the start/end time of each job. Here is a gnuplot of that
data:
https://fedorapeople.org/~tdecacqu/tripleo-ci/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset018-pike.png
This could probably be integrated in the zuul-web dashboard upstream.
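
For illustration, extracting durations from start/end times could look
roughly like the sketch below. The table name (zuul_build), the column
names (job_name, start_time, end_time) and the timestamp format are
assumptions made for the example, not a statement about the actual
reporter schema, and sqlite3 is used only to keep it self-contained.

```python
import sqlite3
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def job_durations(conn, job_name):
    """Return (start_time, duration_in_seconds) for each finished build.

    Assumes a hypothetical table zuul_build(job_name, start_time, end_time)
    with timestamps stored as text in TIME_FORMAT.
    """
    rows = conn.execute(
        "SELECT start_time, end_time FROM zuul_build "
        "WHERE job_name = ? AND start_time IS NOT NULL "
        "AND end_time IS NOT NULL ORDER BY start_time",
        (job_name,),
    ).fetchall()
    durations = []
    for start, end in rows:
        delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(
            start, TIME_FORMAT
        )
        durations.append((start, delta.total_seconds()))
    return durations
```

The resulting list of (start, seconds) pairs is the kind of series that
could be fed to gnuplot or to a dashboard.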

Alternatively, the elasticsearch data could also be used to construct a
similar graph in kibana, though it seems like it's missing a duration field.


Regards,
-Tristan

Thanks for the feedback David, Tristan.
We will be discussing your feedback tomorrow directly after the tripleo meeting on #tripleo.

You guys are always welcome to join, just ping on #oooq / #tripleo for details about the meeting.
We're going to spend about 20min in a Q&A session about the tools.

We'll follow up with our plans on this thread.

Thanks all!
 






David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]

On Fri, Jan 5, 2018 at 12:13 PM, Wesley Hayutin <whayutin@redhat.com>
wrote:

Greetings,

At the end of 2017, a number of the upstream multinode scenario jobs
started to run over our required deployment times [1].  In an effort to
better understand the performance of the deployment and CI, the tripleo
cores requested that a Graphite and Grafana server be stood up so that we
can analyze the core issues more effectively.

There is a certain amount of urgency with the issue as our upstream
coverage is impacted.  The TripleO-CI team is working on the deployment of
both tools in a dev-ops style in RDO-Cloud this sprint.  Nothing yet has
been deployed.

The TripleO CI team is also working with upstream infra to send metrics
and data to the upstream Graphite and Grafana servers.  It is not clear yet
if we have permission or access to the upstream tools.

I wanted to publicly announce this work to the RDO infra community to
inform and to gather any feedback anyone may have.  There are two scopes of
work here: the initial tooling to stand up the infra and the longer term
maintenance of the tools.  Perhaps there are already plans to build these
into RDO SF.

Please reply with your comments and concerns.
Thank you!


[1] https://github.com/openstack-infra/tripleo-ci/commit/7a2edf70eccfc7002d26fd1ce1eef803ce8d0ba8



_______________________________________________
dev mailing list
dev@lists.rdoproject.org
http://lists.rdoproject.org/mailman/listinfo/dev

To unsubscribe: dev-unsubscribe@lists.rdoproject.org


