On 12/05/14 14:08 +0100, Martyn Taylor wrote:
All,
I had some discussion about HA orchestration this morning with 
Petr Chalupa, particularly around the HA Controller node 
deployment.  This role behaves slightly differently to the 
other roles in a Staypuft deployment in that it requires more than one 
puppet run to complete.
Up to now we have worked on the assumption that once we have received 
a successful puppet run report in Foreman, the node associated 
with the role is configured and ready to go.  We use this for 
scheduling the next list of nodes in a given deployment.
We do have a workaround for the HA Controller issue described above in 
the astapor modules.  Blocking is implemented in the puppet modules 
that depend on the HA Controller services, meaning that any dependent 
modules will wait until the controller completes before proceeding.  
This results in the following behaviour:
Sequence:
- Controller nodes provisioned.
- First controller puppet run returns successful.
- LVM block storage is provisioned.
- Controller puppet run 2 completes.
- LVM block storage puppet run completes.
In this case, the LVM block storage is provisioned before the 
controllers are complete, but it will block until controller puppet 
run 2 completes.
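For illustration, here is a minimal sketch of what such a blocking 
guard in a dependent puppet module could look like.  The endpoint 
address, retry timings, and class names are assumptions for the 
example, not the actual astapor implementation:

  # Hypothetical blocking guard in a dependent module: poll the
  # controller's API endpoint until it answers, for up to ~10 minutes.
  exec { 'wait-for-controller':
    command   => '/usr/bin/curl -sf -o /dev/null http://192.0.2.10:5000/',
    path      => ['/bin', '/usr/bin'],
    tries     => 60,    # 60 attempts...
    try_sleep => 10,    # ...10 seconds apart
  }

  # Nothing in the (assumed) dependent storage class runs until the
  # guard succeeds.
  Exec['wait-for-controller'] -> Class['quickstack::storage::lvm']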
This workaround is sufficient for the time being.  But really what we 
would like is for Staypuft to orchestrate the whole process, rather 
than have it orchestrated partially by the puppet modules and 
partially by Staypuft.
The difficulty we have right now in Staypuft is that, without knowing 
the specific implementation details of the puppet modules, there is 
no clear way to detect whether a node with role X is complete so that 
we can schedule the next roles in the sequence.
What we need here is a clear interface for determining the status of a 
puppet class and/or HostGroup for the Astapor modules.
I have two questions around this:
1.  Does there currently exist any way to consistently detect the 
status of a role/list of classes within Foreman for Astapor classes 
that we can utilize?
   - If so, can we do this without knowing the implementation details 
of the Astapor puppet modules?  (We do not want to, for example, look 
for class-specific facts in Foreman, since these vary between classes 
and may change in Astapor.)
2.  If not, is it possible to add something to the puppet modules 
to explicitly show that a class/HostGroup is complete?  I am thinking 
of something along the lines of reporting a "Ready" flag back to 
Foreman (see the sketch below).
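As an illustration of the "Ready" flag idea, here is a minimal sketch 
using an external fact written by the puppet modules themselves.  It 
relies on facter's external facts support (facter >= 1.7); the fact 
name and the class it is tied to are assumptions:

  # Hypothetical "Ready" flag: once the (assumed) final controller
  # class has converged, persist an external fact.  Facter reads
  # key=value files from /etc/facter/facts.d, and the agent uploads
  # the fact to Foreman on its next run, where Staypuft could poll
  # for it.
  file { '/etc/facter/facts.d/controller_ready.txt':
    ensure  => file,
    content => "controller_ready=true\n",
    require => Class['quickstack::pacemaker::common'],  # assumed name
  }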
 
I'll have to think about it more, but we already have a fact similar
to this that we use in quickstack for determining if ha-mysql is
ready, so we can decide whether to do certain other steps.  Crag had
some concern that we were seeing an odd behavior with puppet agent
running as a service though; not sure if he and Petr looked at it
Friday or not.  In case they did not, his theory was that the puppet
facts from the node were not getting updated correctly between agent
runs when the agent was running as a service.  It seemed that the node
was reporting in and the next run still did not have the new value for
the fact (so in this case, the second run should show
ha_mysql_ready=true or similar).  The fact was correct when puppet
agent was run in the foreground for each run, so I believe the thought
was that when the agent ran as a service, facts were being cached and
not updated.  I am unsure if this has yet been either proved or
disproved; just mentioning it in case it is a real issue.
Anyway, if that were _not_ an issue, it would be simple enough to add
a controller_ready fact or similar to quickstack.  I am still not sure
if this is the best approach, but it is definitely feasible; we have
all the information available to us to report back such a thing.  A
rough sketch of how a dependent class might consume such a fact
follows below.
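The fact name and the guarded class here are assumptions; str2bool()
and pick() come from puppetlabs-stdlib:

  # Hypothetical consumer: controller_ready is absent until the
  # controllers' final run completes, so dependent work is deferred.
  if str2bool(pick($::controller_ready, 'false')) {
    include quickstack::storage::lvm    # assumed dependent class
  } else {
    notify { 'controller not ready; deferring dependent configuration': }
  }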
-j
If none of the above, any other suggestions?
Cheers
Martyn