<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Sep 22, 2021 at 9:53 AM Guido Langenbach <<a href="mailto:guido.langenbach@cloudseeds.de">guido.langenbach@cloudseeds.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hey guys,<br><br>We migrated an OpenStack installation of one of our customers to TripleO RDO when Rocky was released. Back then deployment times were fine, as the whole cluster didn't have that many nodes yet.<br><br>Since then we have upgraded to Ussuri and now run 3 controllers and 44 compute nodes. Our deployment times increased exponentially with each set of compute nodes we added. With our current setup, a complete deployment run takes about 15 to 16 hours. Our clusters currently hold the following resources:<br><br>VMs: ~2100<br>Networks: ~500<br>Ports: ~6100<br>Volumes: ~5700<br><br></div></blockquote><div><br></div><div>This sounds like it's related to <a href="https://bugs.launchpad.net/tripleo/+bug/1915761">https://bugs.launchpad.net/tripleo/+bug/1915761</a>. This should be fixed in the latest versions, and we backported the fix to Train, so it would be interesting to see whether you're missing some of these patches. 
We've had reports of updates taking about 4.5 hours with newer versions of Train, so your numbers seem to point to either missing patches related to that bug or an execution configuration problem.</div><div><br></div><div>Additionally, in newer versions we switched the execution strategy to try to speed things up. That change should be available in the latest versions of Train/Ussuri via <a href="https://review.opendev.org/q/topic:%22strategy-improvements%22+(status:open%20OR%20status:merged)">https://review.opendev.org/q/topic:%22strategy-improvements%22+(status:open%20OR%20status:merged)</a></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">We implemented ARA for now so we can get exact measurements of each ansible-playbook run and see what takes the most time. My question is: How big are your production OpenStack environments and how long does it take you to deploy?<br><br></div></blockquote><div><br></div><div>Are you running ansible-playbook by hand? And do you have an ansible.cfg? We added an `openstack tripleo config generate ansible` command that generates a starting ansible.cfg similar to what we use in the Mistral execution.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Which methods do you guys use to scale up compute nodes? (spoiler: --skip-deploy-identifier doesn't seem to work properly)<br>Is blacklisting all other compute nodes the right move? Do you even blacklist the controllers as well?<br><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div><br></div><div>Best Regards,</div><div><br></div><div>Guido</div><div dir="ltr"><br></div></div></div></div></div></div></div></div></div></div></div></div>
_______________________________________________<br>
users mailing list<br>
<a href="mailto:users@lists.rdoproject.org" target="_blank">users@lists.rdoproject.org</a><br>
<a href="http://lists.rdoproject.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.rdoproject.org/mailman/listinfo/users</a><br>
</blockquote></div></div>
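<div>Since the reply above asks whether an ansible.cfg is in use, here is a minimal sketch for checking which runtime-related settings such a file carries. It assumes the standard Ansible options `forks`, `strategy` and `callback_whitelist` under `[defaults]` and `pipelining` under `[ssh_connection]`. The helper name `summarize_ansible_cfg` and the sample values are hypothetical and are not what `openstack tripleo config generate ansible` actually emits.</div>

```python
import configparser

# Illustrative sample only: these values are assumptions, not the
# contents of a TripleO-generated ansible.cfg.
SAMPLE_CFG = """
[defaults]
forks = 25
callback_whitelist = profile_tasks

[ssh_connection]
pipelining = True
"""


def summarize_ansible_cfg(cfg_text):
    """Return the runtime-related settings found in an ansible.cfg string.

    Options that are absent fall back to Ansible's documented defaults
    (forks=5, strategy=linear, pipelining=False).
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    callbacks = parser.get("defaults", "callback_whitelist", fallback="")
    return {
        "forks": parser.getint("defaults", "forks", fallback=5),
        "strategy": parser.get("defaults", "strategy", fallback="linear"),
        "pipelining": parser.getboolean(
            "ssh_connection", "pipelining", fallback=False
        ),
        # profile_tasks prints per-task timings, which helps locate slow steps
        "profile_tasks": "profile_tasks" in callbacks,
    }


if __name__ == "__main__":
    for key, value in summarize_ansible_cfg(SAMPLE_CFG).items():
        print(f"{key}: {value}")
```

<div>A low `forks` value or disabled pipelining on a 44-node deployment would be one possible form of the execution configuration problem mentioned above, though that is a guess rather than a diagnosis.</div>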