On 12/05/14 14:08 +0100, Martyn Taylor wrote:
All,
I had a discussion about HA orchestration this morning with Petr
Chalupa, particularly around the HA Controller node deployment. This
role behaves slightly differently to the
other roles in a Staypuft deployment in that it requires more than one
puppet run to complete.
Up to now we have worked on the assumption that once we have received
a successful puppet run report in foreman, then the node associated
with the role is configured and ready to go. We use this for
scheduling the next list of nodes in a given deployment.
We do have a workaround for the HA Controller issue described above in
the Astapor modules. Blocking is implemented in the subsequent puppet
modules that depend on the HA Controller services. This means
that any dependent modules will wait until the controller completes
before proceeding. This results in the following behaviour.
Sequence:
- Controller nodes are provisioned.
- First puppet run returns successful.
- LVM Block Storage is provisioned.
- Controller node puppet run 2 completes.
- LVM Block Storage puppet run completes.
In this case, the LVM block storage is provisioned before the
controllers are complete, but will block until the Controller puppet
run 2 completes.
This workaround is sufficient for the time being, but really what we
would like is to have Staypuft orchestrate the whole process, rather
than having it partially orchestrated by the puppet modules and
partially by Staypuft.
The difficulty we have right now in Staypuft is that (without knowing
the specific implementation details of the puppet modules) there is
no clear way to detect whether a node with role X is complete and
whether we are able to schedule the next roles in the sequence.
What we need here is a clear interface for determining the status of
a puppet class and/or HostGroup for the Astapor modules.
I have 2 questions around this:
1. Does there currently exist any way to consistently detect the
status of a role/list of classes within Foreman for Astapor classes
that we can utilize?
   - If so, can we do this without knowing the implementation details
of the Astapor puppet modules? (We do not want to, for example, look
for class-specific facts in Foreman, since these vary between classes
and may change in Astapor.)
2. If not 1, is it possible to add something to the puppet modules
to explicitly show that a class/HostGroup is complete? I am thinking
something along the lines of reporting a "Ready" flag back to Foreman.
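
A minimal sketch of how such a "Ready" flag could gate scheduling without the orchestrator knowing anything about individual puppet classes. This is written in Python purely for illustration; the `HostStatus` fields, the role names, and both functions are hypothetical and are not part of Staypuft, Foreman, or Astapor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostStatus:
    """What the orchestrator knows about one host (all fields hypothetical)."""
    role: str
    last_report_ok: bool  # the most recent puppet run report was successful
    ready: bool           # generic "Ready" flag reported by the puppet modules


def role_is_complete(role: str, hosts: List[HostStatus]) -> bool:
    """A role is complete once every host in it reports success AND ready."""
    members = [h for h in hosts if h.role == role]
    return bool(members) and all(h.last_report_ok and h.ready for h in members)


def may_schedule(role: str, order: List[str], hosts: List[HostStatus]) -> bool:
    """A role may start only when every role before it in `order` is complete."""
    preceding = order[:order.index(role)]
    return all(role_is_complete(r, hosts) for r in preceding)
```

With this shape, a controller whose first puppet run succeeded but whose flag is still false holds back the block storage role, and no class-specific facts are inspected.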
I'll have to think about it more, but we already have a fact similar
to this that we use in quickstack for determining whether ha-mysql is
ready, so we can decide whether to do certain other steps. Crag had
some concern that we were seeing odd behavior with puppet agent
running as a service though; I'm not sure whether he and Petr looked
at it Friday. In case they did not, his theory was that the puppet
facts from the node were not getting updated correctly between agent
runs when the agent was running as a service. It seemed that the node
was reporting in and the next run still did not have the new value for
the fact (so in this case, the second run should show
ha_mysql_ready=true or similar). The fact was correct when puppet
agent was run in the foreground for each run, so I believe the thought
was that when the agent ran as a service, facts were being cached and
not updated. I am unsure whether this has been either proved or
disproved yet; I'm just mentioning it in case it is a real issue.
Anyway, if that were _not_ an issue, it would be simple enough to add
a controller_ready fact or similar to quickstack. I am still not sure
if this is the best approach, but it is definitely feasible: we have
all the information available to us to report back such a thing.
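
As a rough illustration of the consuming side, the sketch below polls for a readiness fact until it turns true or a timeout expires, so the caller does not need to know how many puppet runs the role takes. It is Python for illustration only: `fetch_facts` is a hypothetical callable standing in for however the facts would actually be retrieved, and the `controller_ready` name, timeout, and interval are assumptions.

```python
import time
from typing import Callable, Mapping


def wait_for_fact(fetch_facts: Callable[[], Mapping[str, str]],
                  fact_name: str = "controller_ready",
                  timeout: float = 1800.0,
                  interval: float = 30.0,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until the named fact reads "true"; return False on timeout."""
    deadline = clock() + timeout
    while True:
        # Fact values are treated as strings here, so normalise before comparing.
        value = str(fetch_facts().get(fact_name, "")).strip().lower()
        if value == "true":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Polling tolerates a fact that shows up a run later than expected, but it would not by itself distinguish a cached stale value from a controller that is genuinely not ready.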
-j
If none of the above, any other suggestions?
Cheers
Martyn