Thanks for this effort. I'd fully support the public wiki page as it
allows for explanations and incremental improvements along with the
script.
Yeah, I definitely need to get that up :)
Have you had a look at what it would take to do a rolling upgrade
rather than taking down all the services?
Yeah. So, Lars just wrote back about the service-by-service approach,
which also works well and provides smaller downtime gaps and fewer
"committed" components during the upgrade if something goes wrong.
Further, it also lets you upgrade everything except nova, and then
deploy a new havana nova cluster that shares resources with the
still-on-grizzly deployment. With that, you can slowly migrate your
instances to the new nova deployment, growing that deployment as nodes
become vacated from the old one, eventually deprecating the old one
completely. That offers tenant, credential, image, and volume continuity
across the upgrade, as well as a much larger timeframe to migrate
everything, rather than just "nova is down, move everything!!" :)
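
For what it's worth, the drain-and-grow loop described above can be sketched in a few lines of plain Python. This is only a toy model of the ordering, not anything that talks to nova: the names plan_migration, seed_nodes and capacity are made up for illustration, and it assumes every node holds the same number of instances and that a vacated node can be redeployed straight into the new cluster.

```python
def plan_migration(old_nodes, seed_nodes, capacity):
    """Plan instance moves from an old deployment to a new one.

    old_nodes:  dict mapping old node name -> list of instance names
    seed_nodes: names of the nodes the new deployment starts with
    capacity:   assumed maximum number of instances per node
    Returns (moves, new_nodes), where moves is a list of
    (instance, source_node, destination_node) tuples.
    """
    new_nodes = {name: [] for name in seed_nodes}
    old_nodes = {name: list(insts) for name, insts in old_nodes.items()}
    moves = []

    while old_nodes:
        # Drain the least-loaded old node first so hardware frees up quickly.
        src = min(old_nodes, key=lambda n: len(old_nodes[n]))
        while old_nodes[src]:
            # Pick the first new node that still has room.
            dst = next(
                (n for n in new_nodes if len(new_nodes[n]) < capacity), None
            )
            if dst is None:
                raise RuntimeError("new deployment is full; add a seed node")
            inst = old_nodes[src].pop()
            new_nodes[dst].append(inst)
            moves.append((inst, src, dst))
        # The vacated node leaves the old deployment and joins the new one.
        del old_nodes[src]
        new_nodes[src] = []

    return moves, new_nodes


if __name__ == "__main__":
    moves, new_nodes = plan_migration(
        {"c1": ["a", "b"], "c2": ["c"]}, ["n1"], capacity=2
    )
    for inst, src, dst in moves:
        print(f"move {inst}: {src} -> {dst}")
```

The point of the sketch is just that the new deployment only needs a small seed of hardware up front; each vacated node adds capacity for draining the next one.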
To me, "rolling upgrades" means something else, which I think is
probably what you're referring to. My upstream work right now is
targeted at making sure that we can upgrade nova components individually
without needing the parallel nova deployment above. We're getting ever
closer to this being a reality, and I _hope_ we will start gating on
"not breaking live upgrades" in the Icehouse or J timeframes.
--Dan