I agree with Ed that this is a general question, and most responses will include one of VMware's favorite answers to most questions: "it depends." That being said, here are a few tasks to get you started that I believe are important to the daily health of the environment. Our environment is over 90% virtualized, but I wear multiple hats, so what I am "able" to do versus what I think "should" be done is an endless struggle.
Prerequisites:
Overall, you want to have a thorough knowledge of the virtual inventory you are supporting. You will want to know the answers to the following questions. How many servers, physical and virtual? What models? Hardware specs? Software versions? Have you (or your company) decided on a specific ROI target for virtualization (number of VMs per Host, etc.)? A lot of this information can be gathered through scripts run on a scheduled basis to give you that 10,000-foot view of your architecture. These reports will probably also identify some items that need to be addressed, which will add further to your tasks. After that you will be better prepared to dig into the minutiae.
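As a rough illustration of the kind of scheduled report I mean, here is a minimal Python sketch that counts VMs per Host from an inventory export and flags Hosts above a target ratio. The CSV layout (`vm` and `host` columns), the sample names, and the target of 2 are all made up for the example; your own export and target will look different.

```python
import csv
import io
from collections import Counter


def vms_per_host(rows):
    """Count VMs per host from inventory rows (dicts with a 'host' key)."""
    return Counter(row["host"] for row in rows)


def hosts_over_target(counts, target_ratio):
    """Return the hosts whose VM count exceeds the target VMs-per-host ratio."""
    return sorted(host for host, n in counts.items() if n > target_ratio)


# Hypothetical export; in practice this would come from a scheduled report.
SAMPLE_CSV = """vm,host
web01,esx-a
web02,esx-a
db01,esx-a
app01,esx-b
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts = vms_per_host(rows)
print(dict(counts))
print(hosts_over_target(counts, target_ratio=2))
```

The same idea extends to models, hardware specs, and software versions: add columns to the export and group on them.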
- Check the health of all Hosts and VM objects in vCenter. Are there any active alarms in vCenter? Have you set up any alarms in the first place? Do the alarms automatically trigger notification or any type of incident tracking mechanism?
- Are all vCenter plug-ins functioning properly?
- Do you have any Host Hardware issues? Alarms, bad memory, power supply or capacity issues?
- Are all Hosts in compliance with Host Profiles?
- Are there any resource bottlenecks? Memory, CPU, Disk, Network? Do you have any, or need any, additional tools to have a better handle on this?
- Are you running at your optimum resource levels? In other words, is the load properly distributed?
- Are you running out of resources anywhere? LUNs with low disk space, etc. Do you need to start looking at budgeting for additional capacity?
- Check for Firmware updates on Host hardware
- Check for ESX Patches
- Check for VM Patches
- Check VMware Tools version
- Run scripts to identify the existence of VMs with snapshots and follow up to see if they are still needed.
- Have you schmoozed with your Storage Admins lately? A good idea since you cannot get very far without them.
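Two of the checks above (old snapshots and LUNs running low on space) lend themselves to a simple script. Here is a minimal Python sketch, assuming you have already pulled the snapshot and datastore details into plain records by whatever tool you use. The field names, the sample data, and the thresholds (7 days, 15% free) are assumptions for illustration, not recommendations.

```python
from datetime import datetime, timedelta


def stale_snapshots(snapshots, now, max_age_days=7):
    """Return (vm, snapshot name, age in days) for snapshots older than max_age_days, oldest first."""
    cutoff = timedelta(days=max_age_days)
    stale = [
        (s["vm"], s["name"], (now - s["created"]).days)
        for s in snapshots
        if now - s["created"] > cutoff
    ]
    return sorted(stale, key=lambda item: item[2], reverse=True)


def low_space_datastores(datastores, min_free_pct=15.0):
    """Return (name, free percent) for datastores with less than min_free_pct free."""
    flagged = []
    for d in datastores:
        free_pct = 100.0 * d["free_gb"] / d["capacity_gb"]
        if free_pct < min_free_pct:
            flagged.append((d["name"], round(free_pct, 1)))
    return flagged


# Hypothetical sample data standing in for a real export.
now = datetime(2024, 1, 31)
snapshots = [
    {"vm": "web01", "name": "pre-patch", "created": datetime(2024, 1, 2)},
    {"vm": "db01", "name": "upgrade-test", "created": datetime(2024, 1, 29)},
]
datastores = [
    {"name": "lun-01", "capacity_gb": 2000, "free_gb": 150},
    {"name": "lun-02", "capacity_gb": 1000, "free_gb": 400},
]

print(stale_snapshots(snapshots, now))
print(low_space_datastores(datastores))
```

The output is only a follow-up list: someone still has to ask the VM owner whether the snapshot is needed, or talk to the Storage Admins about the LUN.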
Once again, these tasks are just a starting point and by no means an exhaustive job description. Plus, any one of them could turn up an issue that gets you side-tracked for days or weeks.