Hi all,
For the last few days, I have been monitoring a spike in disk space utilization for logs.rdoproject.org. The current situation is:
- 94% of space used, with less than 140GB out of 2TB available.
- This week, the log pruning script has been reclaiming less space than new logs are consuming.
- I expect the situation to improve over the weekend, but we're definitely running out of space.
I have looked at a random job (https://review.opendev.org/639324, patch set 26), and found that each run is consuming 1.2 GB of disk space in logs. The worst offenders I have found are:
- atop.bin.gz files (one per job, 8 jobs per recheck), ranging between 15 and 40 MB each
- logs/undercloud/home/zuul/tempest/.stackviz directory on tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001 jobs, which is a virtualenv eating up 81 MB.
Can we sync up on how you are calculating these results? They do not match ours.
I see each job consuming about 215M of space, and we are close on stackviz at 83M. Oddly, I don't see atop.bin.gz in our calculations, so I'll have to look into that.
I've checked it directly using du on the logserver. By 1.2 GB I meant the aggregate of the 8 jobs running for a single patchset. PS26 is currently using 2.5 GB and had one recheck.
About the atop.bin.gz file:
# find . -name atop.bin.gz -exec du -sh {} \;
16M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/042cb8f/logs/undercloud/var/log/atop.bin.gz
16M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/e4171d7/logs/undercloud/var/log/atop.bin.gz
28M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-rocky-branch/ffd4de9/logs/undercloud/var/log/atop.bin.gz
26M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-rocky-branch/34d44bf/logs/undercloud/var/log/atop.bin.gz
25M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-stein-branch/b89761d/logs/undercloud/var/log/atop.bin.gz
24M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-stein-branch/9ade834/logs/undercloud/var/log/atop.bin.gz
29M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/a10447d/logs/undercloud/var/log/atop.bin.gz
44M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/99a5f9f/logs/undercloud/var/log/atop.bin.gz
15M ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/c8a8c60/logs/subnode-2/var/log/atop.bin.gz
33M ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/c8a8c60/logs/undercloud/var/log/atop.bin.gz
16M ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/73ef532/logs/subnode-2/var/log/atop.bin.gz
33M ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/73ef532/logs/undercloud/var/log/atop.bin.gz
40M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/109d5ae/logs/undercloud/var/log/atop.bin.gz
45M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/c2ebeae/logs/undercloud/var/log/atop.bin.gz
39M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/7fe5bbb/logs/undercloud/var/log/atop.bin.gz
16M ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/5e6cb0f/logs/undercloud/var/log/atop.bin.gz
40M ./tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039/c6bf5ea/logs/undercloud/var/log/atop.bin.gz
40M ./tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039/6ec5ac6/logs/undercloud/var/log/atop.bin.gz
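To get one total for the listing above instead of adding the sizes up by hand, something like the following could work. It is only a sketch: total_bytes_by_name is a hypothetical helper, not an existing script on the logserver, and it sums apparent file sizes in bytes, so the result can differ slightly from the block-based figures du reports.

```shell
# Hypothetical helper, shown for illustration only.
# Usage: total_bytes_by_name ROOT NAME
# Prints the combined apparent size, in bytes, of every regular file
# called NAME found under ROOT (0 if there are none).
total_bytes_by_name() {
    find "$1" -type f -name "$2" -exec sh -c '
        # Print the byte count of each file handed over by find.
        for f in "$@"; do
            wc -c < "$f"
        done
    ' sh {} + | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}
```

Run from the directory of the patch set, that would be: total_bytes_by_name . atop.bin.gz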
Can I safely delete all .stackviz directories? I guess that would give us some breathing room.
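If the answer is yes, a dry run that only lists the matching directories is probably worth doing before removing anything. A minimal sketch, assuming the root of the log tree is passed as the first argument; prune_named_dirs is a hypothetical helper, not an existing script on the logserver, and whether the directories are safe to remove still needs to be confirmed.

```shell
# Hypothetical helper, shown for illustration only.
# Usage: prune_named_dirs ROOT NAME [--delete]
# Without --delete it only prints the matching directories (dry run).
prune_named_dirs() {
    root=$1
    name=$2
    mode=${3:-}
    if [ "$mode" = "--delete" ]; then
        # -prune keeps find from descending into a directory that is
        # about to be removed.
        find "$root" -type d -name "$name" -prune -exec rm -rf {} +
    else
        find "$root" -type d -name "$name" -prune -print
    fi
}
```

For example, "prune_named_dirs . .stackviz" lists the candidates, and adding --delete as a third argument removes them.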