SysAdmin at Large: VMware

Monday, November 5, 2012

Fixing Cron Jobs in Zimbra 8 on CentOS 6.3

In a previous post, I detailed some preliminary steps for installing Zimbra 8 GA on CentOS 6.3. Today, I realized that my server status was not accurately reporting the status of various Zimbra services and server metrics. A little digging around showed me that none of the Zimbra 8 cron jobs were running or properly scheduled. Here is how I solved my problem:

First, install cronie:

# yum install cronie

This will provide us with a familiar crontab interface, and it will make it much easier to add the Zimbra 8 cron jobs to our system. Next, we must remove exim so that we don't have another sendmail on our system to pollute our running process name space:

# rpm -e --nodeps exim

Leaving exim installed will conflict with Zimbra's MTA. If the MTA fails to start, you might encounter the following errors:

[zimbra@zimbra ~]$ zmmtactl start
Rewriting configuration files...done.
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.10-20120422.2z/conf/main.cf
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.10-20120422.2z/conf/master.cf
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.10-20120422.2z/conf/master.cf.in
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.10-20120422.2z/conf/tag_as_foreign.re
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.10-20120422.2z/conf/tag_as_originating.re

Run the following commands as root:

# /opt/zimbra/libexec/zmfixperms --verbose --extended
# su - zimbra
# zmmtactl restart

Next, navigate to the Zimbra 8 crontabs directory:

[root@zimbra crontabs]# pwd

/opt/zimbra/zimbramon/crontabs

From here, we have to concatenate all of our cron files into one easy to digest crontab:

[root@zimbra crontabs]# cat crontab >> crontab.zimbra

[root@zimbra crontabs]# cat crontab.ldap >> crontab.zimbra

[root@zimbra crontabs]# cat crontab.logger >> crontab.zimbra

[root@zimbra crontabs]# cat crontab.mta >> crontab.zimbra

[root@zimbra crontabs]# cat crontab.store >> crontab.zimbra

Finally, we use crontab to load the new crontab.zimbra file:

[root@zimbra crontabs]# crontab crontab.zimbra

A quick 'crontab -l' will show our current crontab, and after a few minutes, Zimbra 8 on CentOS 6.3 should be properly reporting server statistics, performing regular house keeping, and more. There may be a better way to go about this, but I haven't found one as of yet. Before putting this into production, please test thoroughly. If I encounter any other issues, I will either update this post or create an entirely new post (depending on the scope of the problem).

Tuesday, September 4, 2012

vSphere 4.0.0 and vmInventory.xml

Yesterday, one of my ESXi 4.0 hosts decided to seg fault. Fortunately, this was during labor day when almost everyone (except us SysAdmins) was enjoying a hard earned day off. Also, I am fortunate that no production services happened to be running on that host.

After bringing the server back up, I discovered a few items needed to be fixed:

I had a corrupted software iSCSI configuration.
I had a corrupted vSwitch (the one I am using for vMotions).
I now had a corrupted vmInventory.xml

As I have HA configured on the cluster, the VMs somehow evacuated the host, and while the running states were migrated over, the VMs now showed as orphaned in vCenter. Shortly after that, the VMs disappeared entirely from the inventory. Thankfully, all services somehow managed to stay up. I could ping and SSH all day long on the VMs that were supposedly down.

I used the instructions in the following VMware KB to attempt to repair vmInventory.xml:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007541

Unfortunately this did not work. After a few hours of repairing software iSCSI, and restoring and rebuilding by vSwitch (in addition to checking the integrity of other vSwitches), it dawned on me that I could perhaps just brows through my LUNs and re-add orphaned VMs to the appropriate host. Well, some of the vmx files were locked, and it was fun getting those unlocked. Finally, after all files were unlocked, I successfully added one of my VMs to the inventory...

... only, it stated the VM was invalid. Damn it! So close. I verified all critical VMs were up and running, then I went home to see my wife and daughter, ate a very belated dinner, and then went to sleep.

This afternoon, it occurred to me that the object of my quest resided in vmware.log. On a hunch, I browsed to the datastore of the VM, pulled down a copy of vmware.log, and I was not surprised to find the log contained information about the last running ESXi host, and the details of the vmx (in case it had to be rebuilt). I then attempted to re-add the VM back to the proper host, and vCenter indicated that I had indeed placed the VM on the right host.

As a side effect, I learned a few things about my environment that I need to adjust, and ways I can improve my monitoring and my infrastructure, not the least of which is scheduling some time to upgrade ESXi and vCenter. Now that the "vRAM Tax" has been eliminated in vSphere 5.1, I am seriously considering renewing my service contract instead of picking a new hypervisor platform.