Friday, September 14, 2012

Zimbra 8 GA on CentOS 6.3

Much to the pleasure of many SysAdmins, Zimbra 8 General Availability has been announced by VMware.  I have been planning to migrate my commercial email platform to the Zimbra Open Source Edition.  (I will migrate clients over using imapsync.)  Due to my upcoming Spacewalk deployment, I decided early this year to standardize my department on CentOS.  Naturally, I figured it was time to update my CentOS VM template to 6.3 and give Zimbra 8 a test.
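
For what it's worth, a typical imapsync run looks roughly like the following; the host names, account, and passwords are placeholders, and in practice you would loop over a list of mailboxes:

# imapsync --host1 oldmail.example.com --user1 user@example.com --password1 'oldpass' \
           --host2 mail.example.com --user2 user@example.com --password2 'newpass'

imapsync copies folders and messages over IMAP, so clients can be cut over with minimal downtime.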

When I set up my CentOS Linux Virtual Machines, I prefer to attach multiple SCSI vmdk files as my guest storage; usually 25 GB thin-provisioned disks.  Then, when building my template, I tend to place /boot and swap on one disk, and use the rest of the space for LVM.  This allows me to quickly add more space to my Virtual Machine down the line without having to power down or restart my guest OS; a feature that is very important for capacity planning and meeting my SLAs.  I also like to forgo setting resource reservations, and opt instead for resource limits.  My standard VM template has 4 GB RAM, 4x vCPUs, a 4000 MHz CPU limit, and a 4096 MB (4 GB) RAM limit.  Also, installing VMware Tools, while sometimes a pain (and outside the scope of this blog), is beneficial for tighter management from vCenter.
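
Growing a guest this way is just a matter of hot-adding another vmdk and folding it into the volume group.  A minimal sketch, assuming the new disk shows up as /dev/sdc and that the volume group and logical volume are named vg_data and lv_data (adjust the SCSI host number, device, and LVM names to match your own template):

# echo "- - -" > /sys/class/scsi_host/host2/scan
# pvcreate /dev/sdc
# vgextend vg_data /dev/sdc
# lvextend -L +20G /dev/vg_data/lv_data
# resize2fs /dev/vg_data/lv_data

resize2fs will grow an ext4 filesystem online, so the guest never has to come down.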

After choosing the "basic server" package set from my 64-bit net install, I ran the following command:

# yum install wget sudo sysstat libidn gmp libtool-ltdl compat-glib vixie-cron nc perl

Then, after using WinSCP to upload my zcs folder to /root, I changed into that directory and made the installer scripts executable:

# chmod +x install.sh && chmod +x ./bin/get_plat_tag.sh 

I also discovered that Zimbra 8 expects libstdc++.so.6 in /usr/lib.  So, I made a symbolic link:

# ln -s /usr/lib64/libstdc++.so.6 /usr/lib/libstdc++.so.6

Then, I modified /etc/hosts with the public IP and FQDN of my new Zimbra host.  Also, be sure to set up a DNS MX record in your zone file (it is better to have DNS up to date before installing ZCS).
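
For illustration, with a hypothetical host mail.example.com at 203.0.113.10, the /etc/hosts entry looks like this:

203.0.113.10    mail.example.com    mail

and the corresponding zone file records look something like this:

mail.example.com.    IN    A     203.0.113.10
example.com.         IN    MX    10    mail.example.com.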

Now, since this edition of Zimbra is built for RHEL, we need to run the following command to install Zimbra on our host:

# ./install.sh --platform-override

That's it.  From here, it is up to the SysAdmin to configure iptables according to their internal policy, and to make any other changes to the system as needed (while following change control and management best practices, naturally).  The easiest Zimbra build to start with is the "all-in-one" server, and I refer the reader to the Zimbra Documentation, Wiki, and community forums for installation and configuration instructions.  You may also want to consider adding a commercial SSL certificate from your favorite vendor (if/when putting this host into production).
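
As a starting point for that firewall work, something along these lines opens the usual Zimbra ports on CentOS 6 (trim the port list to match your policy; an all-in-one box does not necessarily need every one of these exposed publicly):

# iptables -I INPUT -p tcp -m multiport --dports 25,80,110,143,443,465,587,993,995,7071 -j ACCEPT
# service iptables save

Port 7071 is the admin console; you may prefer to restrict it to your management network rather than open it to the world.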

One of the features of Zimbra that I love so much is the modular design - this makes building Zimbra clusters much simpler than with products from some unnamed competitors.  =)  For more advanced deployments, I recommend the reader consult the resources listed above.

Tuesday, September 4, 2012

vSphere 4.0.0 and vmInventory.xml

Yesterday, one of my ESXi 4.0 hosts decided to seg fault.  Fortunately, this was during Labor Day, when almost everyone (except us SysAdmins) was enjoying a hard-earned day off.  I was also fortunate that no production services happened to be running on that host.

After bringing the server back up, I discovered a few items needed to be fixed:

  • I had a corrupted software iSCSI configuration.
  • I had a corrupted vSwitch (the one I use for vMotion).
  • I now had a corrupted vmInventory.xml.

As I have HA configured on the cluster, the VMs somehow evacuated the host, and while the running states were migrated over, the VMs now showed as orphaned in vCenter.  Shortly after that, the VMs disappeared entirely from the inventory.  Thankfully, all services somehow managed to stay up.  I could ping and SSH into the supposedly down VMs all day long.

I used the instructions in the following VMware KB to attempt to repair vmInventory.xml:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007541

Unfortunately, this did not work.  After a few hours of repairing software iSCSI, and restoring and rebuilding my vSwitch (in addition to checking the integrity of the other vSwitches), it dawned on me that I could perhaps just browse through my LUNs and re-add the orphaned VMs to the appropriate host.  Well, some of the vmx files were locked, and it was fun getting those unlocked.  Finally, after all the files were unlocked, I successfully added one of my VMs to the inventory...

... only, it stated the VM was invalid.  Damn it!  So close.  I verified all critical VMs were up and running, then I went home to see my wife and daughter, ate a very belated dinner, and then went to sleep.

This afternoon, it occurred to me that the object of my quest resided in vmware.log.  On a hunch, I browsed to the datastore of the VM, pulled down a copy of vmware.log, and was not surprised to find that the log contained information about the last running ESXi host, as well as the details of the vmx (in case it had to be rebuilt).  I then re-added the VM to the proper host, and vCenter indicated that I had indeed placed the VM on the right one.
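
For anyone who would rather stay on the command line, the same information should turn up from the ESXi shell (Tech Support Mode); the datastore and VM names below are placeholders:

# grep -i hostname /vmfs/volumes/datastore1/myvm/vmware.log
# vim-cmd solo/registervm /vmfs/volumes/datastore1/myvm/myvm.vmx

The first command digs the host details out of the log, and the second registers the VM directly on the host you are connected to, bypassing the datastore browser.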

As a side effect, I learned a few things about my environment that I need to adjust, and ways I can improve my monitoring and my infrastructure, not the least of which is scheduling some time to upgrade ESXi and vCenter.  Now that the "vRAM Tax" has been eliminated in vSphere 5.1, I am seriously considering renewing my service contract instead of picking a new hypervisor platform.