On Fri, Nov 10, 2017 at 5:12 PM, James LaBarre <jlabarre@redhat.com> wrote:
I keep trying to build tripleo-quickstart with different parameters, and
at the end of a 2+ hour process, it fails at validation.  Problem is,
even with a "verbose" option in the command, I have absolutely no clue
on why the build failed.  Just where does the process keep its log
information?

With the verbose option, it bombs out at:

=================================

TASK [did the deployment pass or fail?]
****************************************
task path: /root/.quickstart/playbooks/quickstart-extras-overcloud.yml:41
Friday 10 November 2017  16:57:13 -0500 (0:00:00.071)       2:12:25.331
*******
fatal: [localhost]: FAILED! => {
    "changed": false,
    "failed": true,
    "failed_when_result": true,
    "invocation": {
        "module_args": {
            "var": "overcloud_deploy_result"
        },
        "module_name": "debug"
    },
    "overcloud_deploy_result": "failed"
}
=================================

I'd really like to figure out where it's failing, rather than having to
run yet another 2 or more hour run just to see it fail at the end (I had
tried caching the image files locally with the hopes I could cut the
build time down, but that hasn't cut off enough time).


Starting the build with:

export OOOtimestamp=`date +%Y%m%d_%H%M`
export GenConfig=minimal
export ReleaseName=pike
export ClusterNodes=config/nodes/3ctlr_1comp.yml
bash quickstart.sh --playbook quickstart-extras.yml --bootstrap
--no-clone --config config/general_config/${GenConfig}.yml -t all -S
overcloud-validate -R ${ReleaseName} -N $ClusterNodes -v 127.0.0.2 |&
tee ~/OOOlog_${OOOtimestamp}.txt


(I'm using the variables so I can have a consistent run script, should
probably make the timestamp log name run as part of the command)
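
One way to fold the timestamp into the command itself is sketched below. log_name is a hypothetical helper, not something quickstart provides; it only reuses the OOOlog_ naming and date format from the script above.

```shell
# Hypothetical helper: build the log file name at invocation time,
# so no separate "export OOOtimestamp=..." line is needed.
log_name() {
    printf '%s/OOOlog_%s.txt' "$HOME" "$(date +%Y%m%d_%H%M)"
}

# Usage, with the same quickstart.sh arguments as above:
#   bash quickstart.sh ... -v 127.0.0.2 |& tee "$(log_name)"
```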

It seems like this job would be the closest equivalent [1].
Pike has been passing consistently there, so we'll have to figure out why your
deployment is failing.

If your deployment is failing, there should be files in /home/stack/ named failed_deployment* [2].
There are two files there, and they should lead you down the path to what is failing.
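
A minimal sketch for dumping those files once you are logged in to the undercloud. show_failed_deployments is a hypothetical helper name; the glob is chosen to match the file name seen in [2], and the default directory is the /home/stack location mentioned above.

```shell
# Hypothetical helper: print every failed_deployment* file in a directory.
# Defaults to /home/stack; pass another directory as the first argument.
show_failed_deployments() {
    fd_dir="${1:-/home/stack}"
    fd_found=0
    for fd_file in "$fd_dir"/failed_deployment*; do
        # Skip the unexpanded glob when nothing matches.
        [ -f "$fd_file" ] || continue
        fd_found=1
        printf '==== %s ====\n' "$fd_file"
        cat "$fd_file"
    done
    if [ "$fd_found" -eq 0 ]; then
        echo "no failed_deployment* files in $fd_dir" >&2
        return 1
    fi
}
```

Run it as show_failed_deployments on the undercloud, or point it at a directory of logs you copied elsewhere.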

Also, there is no reason to restart from scratch each time.  If your deployment fails:
1. ssh -F ~/.quickstart/ssh.config.ansible undercloud
2. Debug and/or change the params in overcloud-deploy.sh
3. Delete the overcloud: source stackrc; openstack stack delete $name (normally "overcloud")
4. Rerun overcloud-deploy.sh from the undercloud
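
The steps above could be wrapped roughly like this. This is only a sketch: redeploy_overcloud and the DRY_RUN switch are hypothetical, and it assumes stackrc and overcloud-deploy.sh both sit in the login user's home directory on the undercloud. Step 2 (editing overcloud-deploy.sh) still has to be done by hand first.

```shell
# Hypothetical wrapper: delete the overcloud stack and rerun the deploy
# script on the undercloud, without rebuilding the whole environment.
# Set DRY_RUN=1 to print the remote commands instead of running them.
redeploy_overcloud() {
    rd_stack="${1:-overcloud}"
    # Assumption: stackrc and overcloud-deploy.sh live in the remote home dir.
    # --yes skips the confirmation prompt, --wait blocks until the delete ends.
    rd_cmd="source ~/stackrc && openstack stack delete --yes --wait ${rd_stack} && bash ~/overcloud-deploy.sh"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$rd_cmd"
    else
        ssh -F ~/.quickstart/ssh.config.ansible undercloud "$rd_cmd"
    fi
}
```

Running DRY_RUN=1 redeploy_overcloud first shows exactly what would be sent to the undercloud.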

Ansible will not format errors very well, so quickstart doesn't even try.  All the logs
relevant to your deployment will be on the undercloud in /home/stack and of course
the logs on the overcloud nodes in /var/log will be useful as well.
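
For a quick first pass over those logs, something like the following may help. scan_logs is a hypothetical helper, and the 'error|failed' pattern is just a guess at what is worth looking for, not a list of what the deployment actually emits.

```shell
# Hypothetical helper: show the last matching lines from *.log files
# under a directory (default /home/stack), with file name and line number.
scan_logs() {
    sl_dir="${1:-/home/stack}"
    sl_count="${2:-20}"
    grep -rniE --include='*.log' 'error|failed' "$sl_dir" | tail -n "$sl_count"
}
```

On an overcloud node, scan_logs /var/log 50 would do the same for the logs there (likely needs root).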

You will probably find it handy to look through our logs as well.
Hope that helps.

[1] https://thirdparty.logs.rdoproject.org/jenkins-promote-rhel-pike-rdo_trunk-virtha-3ctlr_1comp_192gb-28/
[2] http://logs.openstack.org/89/457989/2/check-tripleo/legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024/b33f955/logs/undercloud/home/zuul/failed_deployment_list.log.txt.gz

 
