[rdo-list] Building overcloud images with TripleO and RDO

Matt Young myoung at redhat.com
Wed Aug 17 05:55:18 UTC 2016


Hi!

First, thanks for digging into the recent issue detailed below.

I agree that:

---

We can always do more to document what processes and tooling we have, and
clearly we need to!

Open mechanisms that allow anyone to build images, see how it all works,
customize, override, debug, etc. are indeed both ideal and, IMHO, a
requirement.

Declarative mechanisms are an ideal base for tools that can adapt, that
are a joy to use, can evolve rationally, and are robust enough in the
long term to be used broadly. We have a variety of use cases, scenarios,
and users to consider. It’s a big tent!

---

Regarding ansible-role-tripleo-image-build (artib) [1], I wanted to chime
in briefly with a few thoughts and references to help advance the
conversation. Earlier this year this was all new to me, so I thought
others might be in the same place.

---

tripleo-quickstart (oooq) [2] has the primary goal of providing an “easy
button” for new users/contributors. It is designed to deploy current
stable bits [3] to libvirt quickly and reliably, using declarative
configuration for topology and inputs [4]. It does so using the
documented steps to deploy TripleO, which encourages learning and
onboarding. It can run the generated scripts (on the undercloud) or the
user can run them themselves. It manages to do so in a way that is a
boon to CI and iterative development workflows. A set of building
blocks [5] exists, is growing, and is being used in our CI today.
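
By “declarative configuration for topology and inputs” I mean a small
YAML settings file. The sketch below is a hypothetical, trimmed example
in that style; the key names are from my memory of the configs
documented at [4] and may differ between releases, so treat it as
illustrative only:

  # my-config.yml - hypothetical oooq settings (key names approximate)
  undercloud_memory: 12288

  overcloud_nodes:
    - name: control_0
      flavor: control
    - name: compute_0
      flavor: compute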

artib packages images for use with oooq, but does not attempt to define
a new build mechanism. It uses a shared upstream library (tripleo-common
[6]) which invokes DIB [7] to create the images. The input to
tripleo-common is declarative YAML, and there’s a CLI interface provided
as well.
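
For reference, that declarative input is a YAML file describing the disk
images to build. The sketch below is a trimmed, hypothetical example in
that style (the exact keys and element names are from memory and vary
between releases, so don’t treat it as authoritative):

  disk_images:
    - imagename: overcloud-full       # output image name
      type: qcow2                     # output format
      elements:                       # diskimage-builder elements to compose
        - overcloud-full
        - baremetal
      packages:                       # extra RPMs baked into the image
        - tmux
      environment:                    # variables exported to DIB
        DIB_LOCAL_IMAGE: CentOS-7-x86_64-GenericCloud.qcow2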

Regarding the pre-built images, we’re using them not because it’s “too
hard” but rather because it saves time in CI. Using the same tools,
anyone can create their own images, tweak them, put custom test or debug
tools in them, etc. One can also simply (optionally) use one to quickly
verify a bug, or experiment, or ____. We are also building images today
to enable things like the oooq-usbkey [8] (cool, right?!)
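
To make the “tweak them” part concrete: one way to add debug tooling is
to copy the image definition YAML and extend it before rebuilding. A
hypothetical sketch, using the same (approximate) keys as above:

  disk_images:
    - imagename: overcloud-full-debug   # custom image name (made up here)
      type: qcow2
      elements:
        - overcloud-full
        - baremetal
      packages:
        - tcpdump                       # extra troubleshooting tools
        - strace                        # baked into the custom image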


---

+1 to talking more about this, and to submitting blueprints on how we
can advance the tooling around image building. I’m an interested party
who’s new to OpenStack, and I look forward to collaborating with (all
of) you.

Let’s brainstorm how we might improve both the discoverability and
utility of the various image building tools, or what they might look
like moving forward. What are we missing? What can we do or make better?

In my view this begins with listening - to understand the requirements,
constraints, and expectations that folks have for these tools. Let’s
recognize up front that while converging/consolidating the toolchains
might be a possible outcome, this is OpenStack. Carrots > Sticks, and it
might just happen naturally.

I think these tools have immense potential to help the OpenStack
development process. Given the recent advances in both containers and
composable roles, as well as the plethora of tools we already have, I
look forward to seeing how we can improve our collective union-set
“utility belt” [9].

Matt


[1] https://github.com/redhat-openstack/ansible-role-tripleo-image-build
[2] https://github.com/openstack/tripleo-quickstart
[3] http://artifacts.ci.centos.org/rdo/images/master/delorean/stable/
[4] https://github.com/openstack/tripleo-quickstart/blob/master/doc/configuring.md
[5] https://github.com/redhat-openstack?utf8=%E2%9C%93&query=ansible-role-tripleo
[6] https://github.com/openstack/tripleo-common/tree/master/tripleo_common
[7] http://docs.openstack.org/developer/diskimage-builder/
[8] https://www.rdoproject.org/tripleo/oooq-usbkey/
[9] yes...batman. He is/was self-created!


On 08/11/2016 11:36 PM, Graeme Gillies wrote:
> Hi,
>
> I spent the last day or two trying to get to the bottom of the issue
> described at [1], which turned out to be because the version of galera
> that is in EPEL is higher than what we have in RDO mitaka stable, and
> when it attempts to get used, mariadb-galera-server fails to start.
>
> In order to understand why epel was being pulled in, how to stop it, and
> how this seemed to have slipped through CI/testing, I've been trying to
> look through and understand the whole state of the image building
> process across TripleO, RDO, and our CI.
>
> Unfortunately what I've discovered hasn't been great. It looks like
> there are at least 3 different paths being used to build images.
> Apologies if anything below is incorrect; it's incredibly convoluted and
> difficult to follow for someone who isn't intimately familiar with it
> all (like myself).
>
> 1) Using "openstack overcloud image build --all", which I assume is the
> method end users are supposed to be using, or at least it's the method
> documented in the docs. This uses diskimagebuilder under the hood, but
> the logic behind it is in python (under python-tripleoclient), with a
> lot of things hardcoded in it.
>
> 2) Using tripleo.sh, which, while it looks like it calls "openstack
> overcloud image build", also has some of its own logic and messes with
> things like the ~/.cache/image-create/source-repositories file, which I
> believe is how the issue at [1] passed CI in the first place.
>
> 3) Using the ansible role ansible-role-tripleo-image-build [2], which
> looks like it also uses diskimagebuilder, but through a slightly
> different approach: an ansible library that can take an image
> definition via yaml (neat!) and then calls diskimagebuilder using
> python-tripleo-common as an intermediary. This is a different code path
> (though the code itself looks similar) from python-tripleoclient.
>
> I feel this issue is hugely important as I believe it is one of the
> biggest barriers to having more people adopt RDO/TripleO. Too often
> people encounter issues with deploys that are hard to nail down because
> we have no real understanding of exactly how they built the images, nor,
> as an Operator, do I feel like I have a clear understanding of what I
> get when I use different options. The bug at [1] is a classic example
> of something I should never have hit.
>
> We do have stable images available at [3] (built using method 3);
> however, there are a number of problems with just using them:
>
> 1) I think it's perfectly reasonable for people to want to build their
> own images. It's part of the Open Source philosophy, we want things to
> be Open and we want to understand how things work, so we can customise,
> extend, and troubleshoot ourselves. If your image building process is so
> convoluted that you have to say "just use our prebuilt ones", then you
> have done something wrong.
>
> 2) The images don't get updated (the current mitaka ones were built in
> April)
>
> 3) There is nowhere on the RDO website, nor on the tripleo website,
> that actually references their location. So as a new user, you have
> exactly zero chance of finding these images and using them.
>
> I'm not sure what the best process is to start improving this, but it
> looks like it's complicated enough and involves enough moving pieces
> that a spec against tripleo might be the way to go? I am thinking the
> goal would be to move towards everyone having one way, one code path,
> for building images with TripleO, that could be utilised by all use
> cases out there.
>
> My thinking is the method would take image definitions in a yaml format
> similar to how ansible-role-tripleo-image-build works, and we can just
> ship a bunch of different yaml files for all the different image
> scenarios people might want. e.g.
>
> /usr/share/tripleo-images/centos-7-x86_64-mitaka-cbs.yaml
> /usr/share/tripleo-images/centos-7-x86_64-mitaka-trunk.yaml
> /usr/share/tripleo-images/centos-7-x86_64-trunk.yaml
>
> Etc. You could then have a symlink called default.yaml which points
> to whatever scenario you wish people to use by default, and the scenario
> could be overridden by a command line argument. Basically this is
> exactly how mock [4] works, and it has proven to be a nice, clean,
> easy-to-use workflow for people to understand. The added bonus is that
> if people wanted to build their own images, they could copy one of the
> existing files as a template to start with.
>
> If people feel this is worthwhile (and I hope they do), I'm interested
> in understanding what the next steps would be to get this to happen.
>
> Regards,
>
> Graeme
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1365884
> [2] https://github.com/redhat-openstack/ansible-role-tripleo-image-build
> [3]
> http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/mitaka/cbs/
> [4] https://fedoraproject.org/wiki/Mock



