The best things about virtual computing are its ease of scale, management, portability and consolidation. Virtualization also presents unique challenges for IT team members, who ask the following questions:
- How do I monitor this constantly evolving and changing environment?
- What metrics are the most important?
- Do we keep using the legacy tools we’re used to using?
- Do I procure something that’s purposely built for the virtual infrastructure (VI)?
Like most IT decisions, there are pros and cons to each approach, but what if we took a bit of a hybrid approach? What if the VI could be monitored and reported on within an existing toolset?
Important vSphere metrics
Exactly what do you need to be aware of if you already have a monitoring tool? In my experience, the most common pain point and area of concern is disk, otherwise known as the vSphere datastore. Disk is a shared resource, and its performance problems exist for many different reasons. A common pitfall is when slower, legacy storage is repurposed for the VI, simply because it’s what’s on hand. The second common pitfall is misconfiguration and overprovisioning. When monitoring disk, you’ll want to focus on the following areas:
- Kernel latency
- Queue length
- Device read/write latency
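As a rough illustration of how the three disk areas above could feed an alert in an existing toolset, here is a minimal Python sketch. The `DatastoreSample` and `DiskThresholds` names and the default limits are assumptions made for this example; they are not vendor guidance and should be tuned for your own storage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DiskThresholds:
    # Illustrative defaults only (assumptions); tune them per storage array.
    kernel_latency_ms: float = 2.0
    queue_length: float = 32.0
    device_latency_ms: float = 20.0


@dataclass
class DatastoreSample:
    # One hypothetical sample of the disk metrics listed above.
    datastore: str
    kernel_latency_ms: float
    queue_length: float
    device_read_latency_ms: float
    device_write_latency_ms: float


def disk_warnings(sample: DatastoreSample,
                  limits: Optional[DiskThresholds] = None) -> List[str]:
    """Return one warning string per metric that exceeds its threshold."""
    limits = limits or DiskThresholds()
    checks = [
        ("kernel latency (ms)", sample.kernel_latency_ms, limits.kernel_latency_ms),
        ("queue length", sample.queue_length, limits.queue_length),
        ("device read latency (ms)", sample.device_read_latency_ms, limits.device_latency_ms),
        ("device write latency (ms)", sample.device_write_latency_ms, limits.device_latency_ms),
    ]
    return [
        f"{sample.datastore}: {label} is {value}, above {limit}"
        for label, value, limit in checks
        if value > limit
    ]


if __name__ == "__main__":
    sample = DatastoreSample("datastore-01", 3.5, 12, 8.0, 31.0)
    for warning in disk_warnings(sample):
        print(warning)
```

Keeping the limits in their own object lets a slower, repurposed array carry different thresholds than newer storage without touching the check itself.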
Compute is made up of CPU and memory resources, which are distributed from the parent hypervisor. Like disk, compute is a shared resource, and its performance issues often stem from individual VM resource misconfigurations. A few examples include:
- Oversized and undersized VMs
- Resource pools
- VM CPU and memory reservations
- Out-of-date or missing VMware Tools
The misconfigurations above place unnecessary load on the parent hypervisor. ESX host CPU scheduling issues show up as VM CPU % Wait and Host % Ready, and can stem from oversized vCPU allocations. Host memory issues surface through transparent page sharing, ballooning and swapping. Ballooning is a technique that individual VMs use to give guest physical memory back to their parent hypervisor when the parent runs short of physical memory. A balloon driver (vmmemctl) is installed through VMware Tools within the guest OS. This memory management technique allows the guest OS to communicate with the parent hypervisor: the balloon inflates and hands memory that’s no longer in use back to the parent. That said, ballooning can cause performance degradation if guest memory is low and paging has begun.
When ballooning is not enough, the dreaded swapping occurs. Swapping is a last-ditch memory management technique taken by a host to reclaim memory. This is achieved by swapping guest physical memory out to the VM’s individual swap file on the datastore. For a complete rundown on the vSphere memory management techniques mentioned here, read this vSphere white paper.
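The escalation described above (ballooning first, swapping as a last resort) can be turned into a simple severity ranking. The Python sketch below is only an illustration: the `MemoryPressure` enum and `classify_memory_pressure` function are hypothetical names, and treating any nonzero balloon or swap value as significant is a simplifying assumption.

```python
from enum import IntEnum


class MemoryPressure(IntEnum):
    # Ordered so that a higher value means a more severe state.
    NONE = 0
    BALLOONING = 1
    SWAPPING = 2


def classify_memory_pressure(balloon_kb: float, swap_kb: float) -> MemoryPressure:
    """Rank a VM by the most severe reclamation technique in use.

    Assumption: any nonzero value counts. In practice you may want a
    minimum size or duration before raising an alert.
    """
    if swap_kb > 0:
        return MemoryPressure.SWAPPING
    if balloon_kb > 0:
        return MemoryPressure.BALLOONING
    return MemoryPressure.NONE


if __name__ == "__main__":
    # Hypothetical (balloon KB, swap KB) readings per VM.
    readings = {"web-01": (0, 0), "sql-01": (524288, 0), "app-02": (262144, 8192)}
    ranked = sorted(readings,
                    key=lambda vm: classify_memory_pressure(*readings[vm]),
                    reverse=True)
    for vm in ranked:
        print(vm, classify_memory_pressure(*readings[vm]).name)
```

Sorting by severity puts swapping VMs at the top of the list, which matches the order in which you would normally investigate them.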
Compute: Host and VM CPU & memory
- Average memory usage (KB)
- Balloon (KB)
- Swap (KB)
- Consumed
- Active
- Average CPU usage (MHz)
- CPU ready time
- CPU wait
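CPU ready time is typically exposed as a summation in milliseconds over a sampling interval, which is hard to compare between charts with different intervals. The Python sketch below converts it to a percentage. The `cpu_ready_percent` function name is an assumption, and the 20-second default reflects the vCenter real-time chart interval; pass the interval your data actually uses.

```python
def cpu_ready_percent(ready_summation_ms: float,
                      interval_seconds: float = 20.0,
                      num_vcpus: int = 1) -> float:
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    ready % = ready_ms / (interval_s * 1000) * 100, optionally divided by
    the vCPU count to get a per-vCPU average for a VM-level summation.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if num_vcpus < 1:
        raise ValueError("num_vcpus must be at least 1")
    interval_ms = interval_seconds * 1000.0
    return ready_summation_ms * 100.0 / interval_ms / num_vcpus


if __name__ == "__main__":
    # 1,000 ms of ready time in a 20-second sample on a 2-vCPU VM.
    print(cpu_ready_percent(1000, interval_seconds=20, num_vcpus=2))
```

Normalizing per vCPU makes an oversized VM comparable with a small one, which helps when chasing the oversized vCPU allocations mentioned earlier.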
For a complete list of all VMware performance metrics collected within Veeam Management Pack, along with their definitions, visit the Veeam MP for VMware metric definitions help center.
Current monitoring state
Before jumping ahead to the end state, I’ll start with some of the tools that I’ve seen folks use to monitor the vSphere VI:
- ESXTop
- vCenter performance tab
- Point monitoring solutions
- Virtual team
- Application teams
- DBA team
Each of these tools serves an individual purpose. ESXTop provides a great deal of detailed real-time performance information, but only at an individual ESX(i) host level, which makes ongoing monitoring very challenging. In addition, ESXTop does not provide any historical performance reporting. vCenter Server was designed as a centralized control and management tool for the vSphere environment, not as a monitoring or reporting tool. Application monitoring tools have zero awareness of the underlying hypervisor, while on the flip side, virtual environment point solutions have zero insight into the application level, leaving you completely blind to what’s running within the OS. Put simply, the problem with each of these current-state tools is that they fail to see the big picture.
Future state: Veeam + System Center Operations Manager
The Microsoft System Center suite allows users to maintain, update, manage and monitor environments. One of the components of System Center is Operations Manager ‒ a management and monitoring framework. Operations Manager is extensible through management packs (MPs) that contain the knowledge, rules, monitors, graphical views, reports and more to manage and monitor applications and infrastructure components. Many organizations invest a great deal of time in creating business processes that wrap around the monitors and alarms presented through Operations Manager. Veeam Management Pack for Microsoft System Center provides app-to-metal visibility, which fills the visibility gap that’s created when operating systems that once ran on physical hardware are virtualized.
Allow me to simplify. Prior to virtualization, folks ran operating systems and applications on bare-metal hardware where the Microsoft Monitoring Agent was installed. This provided monitoring teams with the visibility they needed to effectively monitor the hardware, operating systems and applications. When you virtualize those workloads, a single bare-metal server now runs many virtualized servers and applications, and the monitoring agent is still present within each guest. Unfortunately, Operations Manager is completely unaware of the underlying hypervisor. Veeam Management Pack allows teams to merge these two and get the best of both worlds! IT admins can still use the tools they’re used to and benefit from the business processes previously built, while the virtualization administrator gets all of the deep vSphere-related performance statistics, along with the historical reporting, capacity planning and intelligence needed to effectively run the VI.
Summary
The fast-paced and rapidly evolving virtual data center is often forgotten and left unmonitored. As an IT professional, it is extremely important to have both the tools and the knowledge to do the job and to ensure that all of your applications and services are running optimally. Whether you’re running VMware vSphere or Microsoft Hyper-V (to be covered in depth in a subsequent blog), there are performance metrics that you should watch closely. Please share your monitoring stories with the community below!