Re: [rdo-list] Updates to RDO slaves and jobs in ci.centos.org

Friday, 21 April 2017

Yeah, I was only taking into account the time spent running run_test.sh, which
shouldn't be impacted by slowness in rdo-ci-slave01. These are my
findings for job weirdo-master-promote-puppet-openstack-scenario002:

RDO-Cloud: 39mins
n30.dusty: 33mins
n13.pufty: 60mins
n54.cursty: 58mins

I think it's pretty good.
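
To put those numbers side by side, here is a minimal sketch that expresses each run as a percentage relative to the RDO-Cloud run. The durations are the ones listed above; the function name and the choice of RDO-Cloud as the baseline are only assumptions made for illustration.

```python
# run_test.sh durations in minutes, copied from the findings above
durations = {
    "RDO-Cloud": 39,
    "n30.dusty": 33,
    "n13.pufty": 60,
    "n54.cursty": 58,
}


def relative_to_baseline(minutes, baseline_minutes):
    """Return the difference from the baseline as a rounded percentage."""
    return round(100 * (minutes - baseline_minutes) / baseline_minutes)


baseline = durations["RDO-Cloud"]
# Print the fastest node first
for node, minutes in sorted(durations.items(), key=lambda item: item[1]):
    delta = relative_to_baseline(minutes, baseline)
    print(f"{node:12s} {minutes:3d} min  {delta:+d}% vs RDO-Cloud")
```

A negative percentage means the node finished faster than RDO-Cloud, a positive one means it was slower.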

On Fri, Apr 21, 2017 at 1:55 PM, David Moreau Simard <dms(a)redhat.com> wrote:
...
 The performance is not great because of "rdo-ci-slave01", which Ansible
 runs from.

 We all know that node has performance problems (especially i/o).
 For example, a promote job [1] will take 1 hour and 4 minutes while the
 equivalent generic job [2] (run on a cloudslave) will finish in about 35
 minutes.

 I mean, it takes rdo-ci-slave01 more than five (5!) minutes to just
 bootstrap the job (clone weirdo, virtualenv with ara, ansible, shade and
 initialize ara).
 The same thing takes less than 30 seconds on a cloudslave.

 [1]:
 https://ci.centos.org/job/weirdo-master-promote-packstack-scenario001/1080/
 [2]:
 https://ci.centos.org/view/rdo/view/weirdo/job/weirdo-generic-packstack-s...

 David Moreau Simard
 Senior Software Engineer | Openstack RDO

 dmsimard = [irc, github, twitter]

 On Apr 21, 2017 4:22 AM, "Alfredo Moralejo Alonso" <amoralej(a)redhat.com>
 wrote:
>
> On Fri, Apr 21, 2017 at 2:40 AM, David Moreau Simard <dms(a)redhat.com>
> wrote:
> > WeIRDO jobs were tested manually on the rdo-ci-slave01 (promote slave)
> > on which the jobs would not run successfully yesterday.
> >
> > Everything now looks good after untangling the update issue from
> > yesterday and WeIRDO promote jobs have been switched to rdo-cloud.
> >
>
> Nice! I've seen weirdo jobs in
>
> https://ci.centos.org/view/rdo/view/promotion-pipeline/job/rdo_trunk-prom...
>
> run in RDO Cloud with pretty good performance. They seem to run
> slower than jobs running on dusty servers in ci.centos but faster than
> the rest of the servers.
>
> I'll keep an eye on it too to find out if there is any abnormal behavior.
>
>
> > I'll be monitoring this closely but let me know if you see any problems.
> >
> > David Moreau Simard
> > Senior Software Engineer | Openstack RDO
> >
> > dmsimard = [irc, github, twitter]
> >
> >
> > On Thu, Apr 20, 2017 at 12:26 AM, David Moreau Simard <dms(a)redhat.com>
> > wrote:
> >> Hi,
> >>
> > >> There have been a few updates worth mentioning and explaining to a wider
> >> audience as far as RDO is concerned on the ci.centos.org environment.
> >>
> >> First, please note that all packages on the five RDO slaves have been
> >> updated to the latest version.
> >> We had not yet updated to 7.3.
> >>
> > >> The rdo-ci-slave01 node (the "promotion" slave) ran into some issues
> > >> that took some time to fix: EPEL was enabled and it picked up python
> > >> packages it shouldn't have.
> > >> Things seem to be back in order now, but some jobs might have failed in
> > >> a weird way; triggering them again should be fine.
> >>
> >> Otherwise, all generic WeIRDO jobs are now running on OpenStack
> >> virtual machines provided by the RDO Cloud.
> > >> This is done by using the "rdo-virtualized" slave tag.
> > >> The "rdo-promote-virtualized" tag will be used for the weirdo promote
> > >> jobs once we're sure there are no more issues running them on the
> > >> promotion slave.
> >>
> > >> These tags are designed to work with WeIRDO jobs only for the time
> > >> being. Please contact me if you'd like to run virtualized workloads
> > >> from ci.centos.org.
> >>
> > >> This amounts to around 35 fewer jobs running on Duffy
> > >> ci.centos.org hardware on a typical day (including generic
> > >> weirdo jobs and promote weirdo jobs).
> >>
> >> I've re-shuffled the capacity around a bit, considering we've now
> >> freed significant capacity for bare-metal based TripleO jobs.
> >> The slave threads are now as follows:
> > >> - rdo-ci-slave01: 12 threads (up from 11), tagged with "rdo-promote"
> > >> and "rdo-promote-virtualized"
> >> - rdo-ci-cloudslave01: 6 threads (up from 4), tagged with "rdo"
> >> - rdo-ci-cloudslave02: 6 threads (up from 4), tagged with "rdo"
> >> - rdo-ci-cloudslave03: 8 threads (up from 4), tagged with
> >> "rdo-virtualized"
> >> - rdo-ci-cloudslave04: 8 threads (down from 15), tagged with
> >> "rdo-virtualized"
> >>
> > >> There is a specific reason why cloudslave03 and cloudslave04 amount to
> > >> 16 threads between the two: it is to match the quota we have been
> > >> given in terms of capacity at RDO cloud.
> > >> The threads will be used to artificially limit the number of jobs run
> > >> against the cloud concurrently without needing to implement queueing
> > >> on our end.
> >>
> > >> You'll otherwise notice the net effect for the "rdo" and "rdo-promote"
> > >> tags isn't much. At least for the time being, it's very much the same
> > >> since I've re-allocated cloudslave03 to load balance virtualized jobs.
> > >> However, jobs are likely to be more reliable and faster now that they
> > >> won't have to retry for nodes because we're less likely to hit
> > >> rate-limiting.
> >>
> >> I'll monitor the situation over the next few days and bump the numbers
> >> if everything is looking good.
> > >> That said, I'd like to hear your feedback on whether you feel things are
> > >> looking better and whether we are running into "out of inventory" errors
> > >> less often.
> >>
> >> Let me know if you have any questions,
> >>
> >> David Moreau Simard
> >> Senior Software Engineer | Openstack RDO
> >>
> >> dmsimard = [irc, github, twitter] 
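
For reference, the thread allocation quoted above can be tallied per tag with a short sketch like the one below. The thread counts and tags are taken from the list in the original message; the data layout and the function name are hypothetical and only serve to check the totals, in particular the 16 threads behind "rdo-virtualized".

```python
from collections import defaultdict

# Threads and tags per slave, as listed in the original message
SLAVES = {
    "rdo-ci-slave01": (12, ["rdo-promote", "rdo-promote-virtualized"]),
    "rdo-ci-cloudslave01": (6, ["rdo"]),
    "rdo-ci-cloudslave02": (6, ["rdo"]),
    "rdo-ci-cloudslave03": (8, ["rdo-virtualized"]),
    "rdo-ci-cloudslave04": (8, ["rdo-virtualized"]),
}


def threads_per_tag(slaves):
    """Sum the threads available to each tag across all slaves."""
    totals = defaultdict(int)
    for threads, tags in slaves.values():
        for tag in tags:
            totals[tag] += threads
    return dict(totals)


for tag, total in sorted(threads_per_tag(SLAVES).items()):
    print(f"{tag}: {total} threads")
```

Since each thread runs one job at a time, the "rdo-virtualized" total is also the upper bound on concurrent jobs sent to the cloud under this allocation.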
