Problems With mon_disk_space Script? (VMware KB 2058187)

I recently attempted to deploy the mon_disk_space script from VMware KB 2058187. The instructions in the KB are straightforward; users only need to modify the two values below to get started:

# Please provide email for alert messages
email='wmilliron@example.com'
# Please provide percentage threshold for PostgreSQL used disk space
thresh=10

The script should send an email to the address provided when the PostgreSQL volume's used space, as a percentage, exceeds the configured threshold. For my testing, I set the initial value to 10, knowing it would trigger the email.
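For readers who want to see what a threshold check of this kind looks like, here is a minimal sketch. It is not the code from the KB script: the over_threshold helper is a name I made up, and the mount point / is only a placeholder, since the actual PostgreSQL volume path on a VCSA may differ.

```shell
#!/bin/sh
# Hypothetical helper (not from the KB script): succeeds when the used
# percentage ($1) is strictly greater than the threshold ($2).
over_threshold() {
    [ "$1" -gt "$2" ]
}

# Placeholder mount point for illustration only; substitute the volume you care about.
mount_point=/

# df -P prints one line per filesystem; column 5 is the capacity, e.g. "37%".
used=$(df -P "$mount_point" | awk 'NR==2 {gsub("%","",$5); print $5}')

if over_threshold "$used" 10; then
    echo "ALERT: $mount_point is at ${used}% (threshold 10%)"
else
    echo "OK: $mount_point is at ${used}%"
fi
```

With a threshold as low as 10, almost any volume in real use will report an alert, which is why that value is handy for a first test.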

After copying the script to /etc/cron.hourly on the VCSA and running ‘chmod 700 /etc/cron.hourly/mon_disk_space’ to ensure the script is executable by cron, emails were still not showing up, even after waiting over an hour. The troubleshooting began…

First, make sure cron is attempting to execute the script by running:

grep cron /var/log/vmware/messages

You should find entries similar to this in the log:

run-parts[51761]: (/etc/cron.hourly) starting mon_disk_space
run-parts[51761][51796]: (/etc/cron.hourly) finished mon_disk_space

If you see those entries, then cron is able to execute the script, so the problem seems to be within the script itself. If you take a look at line 9 of the provided script, the variable ‘db_type’ is populated by running:

cat /etc/odbc.ini | grep DB_TYPE | awk -F= '{print $2}' | tr -d ' '

When I run that single command against my 6.7 VCSA, I get these duplicate values:

vcsa [ ~ ] cat /etc/odbc.ini | grep DB_TYPE | awk -F= '{print $2}' | tr -d ' '
PostgreSQL
PostgreSQL

Let’s take a look at the provided script again. Lines 10-12 expect a single “PostgreSQL” entry, but the VCSA returns two values. Because of this mismatch the script exits early, which explains why no emails are sent.
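The failure is easy to reproduce away from the appliance. The sketch below uses a temporary stand-in for /etc/odbc.ini; the section names and file contents are invented for illustration, and the string comparison is my assumption about the kind of check the script performs, not a copy of its lines 10-12.

```shell
#!/bin/sh
# Stand-in for /etc/odbc.ini with two DB_TYPE entries (contents are made up).
ini=$(mktemp)
cat > "$ini" <<'EOF'
[dsn_one]
DB_TYPE = PostgreSQL
[dsn_two]
DB_TYPE = PostgreSQL
EOF

# Same pipeline as line 9 of the script, pointed at the stand-in file.
db_type=$(cat "$ini" | grep DB_TYPE | awk -F= '{print $2}' | tr -d ' ')

# db_type now holds two lines, so it is not equal to the single word.
if [ "$db_type" = "PostgreSQL" ]; then
    result="match"
else
    result="no match"
fi
echo "$result"

rm -f "$ini"
```

This prints "no match": the variable holds "PostgreSQL" twice, separated by a newline, so an equality test against a single "PostgreSQL" cannot succeed.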

Simply adding a ‘uniq’ to line 9 will cause the script to produce a single, unique value. Line 9 of mon_disk_space ends up looking like this:

db_type=`cat /etc/odbc.ini | grep DB_TYPE | awk -F= '{print $2}' | uniq | tr -d ' '`
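One caveat: uniq only collapses duplicate lines that are adjacent, which is the case in the output above. If the duplicates were ever separated by a different line, sort -u would be the more defensive choice. The sketch below runs both variants against a made-up stand-in for /etc/odbc.ini (the section names are invented); the sort -u version is my own alternative, not something from the KB.

```shell
#!/bin/sh
# Stand-in for /etc/odbc.ini with two DB_TYPE entries (contents are made up).
ini=$(mktemp)
cat > "$ini" <<'EOF'
[dsn_one]
DB_TYPE = PostgreSQL
[dsn_two]
DB_TYPE = PostgreSQL
EOF

# The fix described above: uniq collapses the two adjacent duplicate lines.
fixed=$(cat "$ini" | grep DB_TYPE | awk -F= '{print $2}' | uniq | tr -d ' ')

# Alternative (my assumption, not from the KB): sort -u also handles
# duplicates that are not adjacent.
alt=$(grep DB_TYPE "$ini" | awk -F= '{print $2}' | tr -d ' ' | sort -u)

echo "uniq:    $fixed"
echo "sort -u: $alt"

rm -f "$ini"
```

Both variants yield a single PostgreSQL value for this input.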

After making the change, I manually triggered the cron job by running run-parts /etc/cron.hourly. The alert properly triggered, and the email showed up in my inbox. Lastly, don’t forget to go back and modify the alerting threshold on line 6 of the script to something more sensible.
