Hi Ashish,

On Tue, May 12, 2020 at 11:39 AM Ashish Kurian <ashishbnv@gmail.com> wrote:
Hello Yatin,

I am deploying on bare metal. The machine runs CentOS 7 with 16 GB of RAM. I hope this is enough information about my machine configuration. Let me know if you need more details, such as the kernel version.
Yes, this information is sufficient for this issue, and 16 GB is too little for the configuration you are trying to deploy. As Phil said, with 16 GB of swap your installation might complete, but adding more RAM is better than relying on swap, because swap will make it slow.

Following your suggestion, I tried the Train release. The undercloud deployment succeeded and the process went much further. However, it failed (remained stuck) at the overcloud deployment phase.


Do you think this could be due to insufficient memory? I have ordered another 16 GB of RAM and it will arrive today.
Yes, this could also be related to the configuration, so I suggest you try again after upgrading the RAM. Did you check which task it was stuck at?

Also, what is the correct procedure to rerun the deployment rather than starting from scratch with a new CentOS installation?
Reinstalling CentOS on your bare-metal machine is not needed. (If you deploy master, though, make sure CentOS 8 is installed as the base OS; for Train or older you can stay on CentOS 7.)
quickstart takes care of cleaning up the previous deployment when you redeploy. Just make sure you clean up the workspace (the default is ~/.quickstart, so rm -rf ~/.quickstart) or pass --clean to quickstart.sh. You can also check out the phased installation [1].

[1] https://docs.openstack.org/tripleo-quickstart/latest/getting-started.html
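
The workspace cleanup described above could be scripted roughly as follows. This is only a sketch: `clean_workspace` is a hypothetical helper name, not part of quickstart. The only facts taken from this thread are the default workspace path ~/.quickstart and that removing it is the suggested step before a rerun.

```python
import shutil
from pathlib import Path


def clean_workspace(workspace="~/.quickstart"):
    """Delete the quickstart workspace directory if it exists.

    Hypothetical helper; for the default path it does what
    `rm -rf ~/.quickstart` does. Returns True if a directory was removed.
    """
    path = Path(workspace).expanduser()
    if not path.is_dir():
        # Nothing to clean up (first run, or already cleaned).
        return False
    shutil.rmtree(path)
    return True
```

Calling `clean_workspace()` before rerunning quickstart.sh would cover the manual cleanup; passing --clean to quickstart.sh is the alternative mentioned above.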
 

Best Regards,
Ashish Kurian


On Mon, May 11, 2020 at 9:44 AM YATIN KAREL <yatinkarel@gmail.com> wrote:
Hi Ashish,

On Sun, May 10, 2020 at 12:08 AM Ashish Kurian <ashishbnv@gmail.com> wrote:

Hello Folks,

I am still waiting for some assistance from this group. I really cannot proceed without it.
Can you share more details about your environment so people on the list have more context: which release you are trying to deploy, whether you are deploying on bare metal or a VM, what the configuration of the bare-metal machine or VM is, how you are trying to deploy, and so on?
Recently there was a bug related to slow overcloud nodes, https://bugs.launchpad.net/tripleo/+bug/1873892, which is fixed in Train.

In your case it's the SSH connection to the undercloud that is failing, and I believe that is also due to slow nodes (as you said, you are able to SSH to the undercloud manually). I assume you are using tripleo-quickstart to deploy; if so, you can try a similar retry/pause in quickstart to see if it helps.
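
To illustrate the retry/pause idea, here is a minimal sketch, assuming that a plain non-interactive `ssh undercloud true` is a good enough readiness check. It is not the actual fix from the bug above nor anything shipped in quickstart; `retry` and `ssh_ready` are hypothetical names, and the attempt count and pause length are arbitrary.

```python
import subprocess
import time


def retry(action, attempts=10, pause=30, sleep=time.sleep):
    """Call action() until it returns True or the attempts run out.

    Returns the 1-based number of the attempt that succeeded, or 0 if
    every attempt failed. `sleep` is injectable so the loop can be tested.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            sleep(pause)  # give a slow node time to finish booting
    return 0


def ssh_ready(host="undercloud"):
    """Return True if a non-interactive ssh login to host succeeds.

    Assumed check: BatchMode avoids password prompts, ConnectTimeout
    bounds each attempt.
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host, "true"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Running `retry(ssh_ready)` before the step that fails would wait up to roughly five minutes for the undercloud to accept logins; a result of 0 means it never became reachable.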

You can also join the freenode channels #tripleo and #oooq for quicker feedback.
 

Best Regards,
Ashish Kurian


On Mon, May 4, 2020 at 6:44 PM Ashish Kurian <ashishbnv@gmail.com> wrote:
Hello Folks,

For my previous question to the mailing list, Arkady was able to figure out the exact error message that was being generated in the logs. I am forwarding my email conversation with Arkady so that all the information is collected in the email.

Additionally, I am attaching the undercloud logs collected with the LogTool utility in run mode 8, in case they are needed.

Can anyone on this list help me identify what is wrong with the template, and tell me where I can find this template so I can take a look at it?

Best Regards,
Ashish Kurian


---------- Forwarded message ---------
From: Arkady Shtempler <ashtempl@redhat.com>
Date: Mon, May 4, 2020 at 4:23 PM
Subject: Re: [rdo-users] [TripleO] Undercloud unreachable
To: Ashish Kurian <ashishbnv@gmail.com>


Hi Ashish!

I was able to find these errors in builder-undercloud.log. They are not related to the SSH problem you have, but they indicate a FATAL error in the templates being used.

~~~~~~~~~~~~~~~~ /home/ashtempl/zahlabut/home/stack/builder-undercloud.log ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

2058-  Updating   : linux-firmware-20191203-76.gite8a0f4c.el7.noarch         273/629
2059-  Installing : kernel-3.10.0-1127.el7.x86_64                            274/629
2060:grubby fatal error: unable to find a suitable template
2061:grubby fatal error: unable to find a suitable template

2062-  Updating   : libreport-filesystem-2.1.11-53.el7.centos.x86_64         275/629
2063-  Updating   : mdadm-4.1-4.el7.x86_64                                   276/629
2064-  Updating   : 1:libguestfs-1.40.2-9.el7.x86_64                         277/629
2065-  Updating   : 1:python-libguestfs-1.40.2-9.el7.x86_64                  278/629
2066-  Updating   : fence-agents-all-4.2.1-30.el7.x86_64                     279/629
2067-  Updating   : 2:docker-1.13.1-161.git64e9980.el7_8.x86_64              280/629
2068-  Updating   : conntrack-tools-1.4.4-7.el7.x86_64                       281/629



2421-No '/dev/log' or 'logger' included for syslog logging
2422-No '/dev/log' or 'logger' included for syslog logging
2423:grubby fatal error: unable to find a suitable template
2424:grubby fatal error: unable to find a suitable template

2425-  Verifying  : 10:qemu-kvm-common-ev-2.12.0-44.1.el7_8.1.x86_64           1/629
2426-  Verifying  : 1:grub2-tools-2.02-0.81.el7.centos.x86_64                  2/629
2427-  Verifying  : certmonger-0.78.4-12.el7.x86_64                            3/629
2428-  Verifying  : boost-program-options-1.53.0-28.el7.x86_64                 4/629
2429-  Verifying  : dracut-config-rescue-033-568.el7.x86_64                    5/629
2430-  Verifying  : libvirt-daemon-driver-qemu-4.5.0-33.el7.x86_64             6/629
2431-  Verifying  : 1:libguestfs-1.40.2-9.el7.x86_64                           7/629



1780-  Updating   : ipa-common-4.6.6-11.el7.centos.noarch                      3/629
1781-  Updating   : setup-2.8.71-11.el7.noarch                                 4/629newaliases: warning: valid_hostname: invalid character 40(decimal)...<--LogTool-LINE IS TOO LONG!
1782:newaliases: fatal: unable to use my own hostname
1783-
1784-warning: /etc/shadow created as /etc/shadow.rpmnew
1785-  Updating   : 32:bind-license-9.11.4-16.P2.el7_8.2.noarch                5/629
1786-  Updating   : subscription-manager-rhsm-certificates-1.24.26-1.el7.c     6/629
1787-  Updating   : ipa-client-common-4.6.6-11.el7.centos.noarch               7/629
1788-  Updating   : 1:grub2-pc-modules-2.02-0.81.el7.centos.noarch             8/629
1789-  Updating   : libvirt-bash-completion-4.5.0-33.el7.x86_64                9/629
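
The excerpts above mark matching lines with ":" and surrounding context lines with "-". A minimal sketch of producing that kind of listing is below. It is not LogTool's implementation; `find_fatal` is a hypothetical name, and matching on the word "fatal" is an assumption based on the lines shown here.

```python
import re

# Assumption: a line is interesting if it contains the word "fatal".
FATAL = re.compile(r"\bfatal\b", re.IGNORECASE)


def find_fatal(lines, context=2):
    """Yield (line_number, marker, text) for fatal lines plus context.

    marker is ':' for a matching line and '-' for a context line,
    mirroring the excerpt format above. Line numbers are 1-based.
    """
    lines = [line.rstrip("\n") for line in lines]
    hits = {i for i, line in enumerate(lines) if FATAL.search(line)}
    wanted = set()
    for i in hits:
        start = max(0, i - context)
        stop = min(len(lines), i + context + 1)
        wanted.update(range(start, stop))
    for i in sorted(wanted):
        marker = ":" if i in hits else "-"
        yield i + 1, marker, lines[i]
```

Opening builder-undercloud.log and printing `f"{n}{m}{text}"` for each tuple from `find_fatal(log_file)` would give a listing in the same shape as the excerpts.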


BTW, maybe you can try running LogTool yourself. You need run mode number 8, that is:
8) - Export ERRORs/WARNINGs from Undercloud logs
The tool is very simple to use: you just clone it to your Undercloud host and start PyTool.py.

Thanks!

On Mon, May 4, 2020 at 5:05 PM Ashish Kurian <ashishbnv@gmail.com> wrote:
Hello Arkady,

Please find the two sets of log files.

Just to make your analysis easier, quickstart is failing at this playbook task:

TASK [Gathering Facts] ********************************************************************************************************************************************************************************
task path: /home/ashish/.quickstart/playbooks/quickstart.yml:67

Best Regards,
Ashish Kurian


On Mon, May 4, 2020 at 3:50 PM Arkady Shtempler <ashtempl@redhat.com> wrote:
Hi!

I'm not sure that you'll be able to get them all in one zip file that won't exceed max email attachment size.
Let's start with log files that you have under /home/stack and /var/log

Thanks!



On Mon, May 4, 2020 at 4:36 PM Ashish Kurian <ashishbnv@gmail.com> wrote:
Hello Arkady,

Do you need all of them? How should I provide them to you? Over email or something else?

Best Regards,
Ashish Kurian


On Mon, May 4, 2020 at 3:33 PM Arkady Shtempler <ashtempl@redhat.com> wrote:
Hi Ashish!

On your Undercloud host you have a bunch of logs under:
['/var/log', '/home/stack', '/usr/share/', '/var/lib/']

Thanks!


On Mon, May 4, 2020 at 4:28 PM Ashish Kurian <ashishbnv@gmail.com> wrote:
Hello Arkady,

I appreciate your help. Of course I can provide the required log files. However, can you let me know which log files you are looking for and where they are located?

Best Regards,
Ashish Kurian


On Mon, May 4, 2020 at 3:17 PM Arkady Shtempler <ashtempl@redhat.com> wrote:
Hi Ashish!

Is it possible to get access to your log files?

Thanks!

On Mon, May 4, 2020 at 2:28 PM Ashish Kurian <ashishbnv@gmail.com> wrote:
Hello Folks,

During my TripleO installation, I keep failing to reach the undercloud, with this message:

MSG:

Data could not be sent to remote host "undercloud". Make sure this host can be reached over ssh: Warning: Permanently added 'undercloud' (ECDSA) to the list of known hosts.
System is booting up. See pam_nologin(8)
Authentication failed.

When I manually SSH into the undercloud using the actual commands, I am able to log in.

Can someone help me figure out what the problem might be?

Best Regards,
Ashish Kurian
_______________________________________________
users mailing list
users@lists.rdoproject.org
http://lists.rdoproject.org/mailman/listinfo/users

To unsubscribe: users-unsubscribe@lists.rdoproject.org


--
Yatin Karel


Thanks and Regards
Yatin Karel