System Center Operations Manager is a great component of the System Center suite. It’s a framework that helps you monitor and manage every component of your infrastructure, from hardware such as compute, storage and network to the virtualization layer, VMs, operating systems and applications. It can be your single pane of glass to view and manage your entire infrastructure.
And it gets better: in combination with the other System Center components such as Service Manager, Orchestrator and more, it is your window to manage the entire lifecycle of your infrastructure.
But as always in infrastructure management, life isn’t as shiny as you hope. Operations Manager is actually a rather “dumb” framework. With “dumb” I don’t mean that it is a bad solution. I mean that it is a powerful framework that won’t give you any information until you feed it the knowledge it needs to become the best monitoring solution. The framework lets you create dashboards, performance views, reports, monitors, rules and much more, but unless you tell it what to look for and which information to collect… it won’t do anything.
So how do we get that information, that knowledge, into the system so it starts working? The answer… Management Packs.
What are Management Packs
So what are Management Packs exactly? I like to explain it as follows: a management pack is a big box that contains the information needed to monitor and manage a part of your infrastructure. There are management packs for applications such as SQL, Exchange and more. There are also management packs for operating systems, for hardware (delivered by the hardware vendor itself), 3rd party management packs for non-Microsoft applications like Oracle, and many more. Each management pack (or set of packs) contains the events it needs to keep an eye on (and alert on if necessary), other rules and monitors, the specific performance counters that it needs to collect and store in the framework’s database, views, reports, dashboards and so on.
Depending on what you have in your environment, you import the matching management packs into your system so that it can start monitoring and give you the information you need!
So life is good again? All shiny and well? Not really… I have seen too many cases where a huge number of management packs were imported and then left to sit there, throwing a bunch of alerts at you. This mostly results in the monitoring guy or girl saying that the solution doesn’t work and ignoring it completely.
The reason for this is very simple. No two environments are the same, and you need to fine-tune those management packs to fit yours.
Management packs are written with thresholds, rules and monitors that are considered best practices for an average environment. The thing is, an average environment doesn’t exist. So you need to fine-tune those thresholds and more yourself, so that they fit your environment.
Let me give you a small example from the past that explains this very well. A couple of years ago, Microsoft released a management pack for Active Directory (there are of course newer versions of that MP now to reflect current versions, but follow me here for a second…). In that MP, there was a rule that created an alert when you didn’t have enough DCs running in your environment. When less than 50% of your DCs were available, it started to alert that your AD environment wasn’t healthy anymore. It did exactly the same thing from the moment you had fewer than 3 DCs alive, even if you didn’t have 3 to begin with. The reason at that time was that Microsoft’s best practice was to have at least 3 DCs running. Many companies in Europe, however, didn’t have 3 DCs but only 2 (or sometimes only 1…). So in those cases, that alert was always there. Overriding that alert with new thresholds was the solution (again, fit to your environment), but it needed to be done. This is probably a very simple example and the fix could be made very quickly, but you can imagine many more scenarios that will need to be adjusted. Think about the 90% memory rule inside an OS, or the alert that tells you when you are running out of disk space. Some companies want to have at least 10% available before alerting, others think that 5% is still enough…
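To make the logic of that example a bit more tangible, here is a minimal sketch in Python. It is not how the management pack itself is implemented (overrides are configured in Operations Manager, not in code like this), and the names `DcThresholds` and `dc_alert` are made up purely for illustration. It only shows how the two default thresholds described above interact, and what changing them for a smaller environment does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DcThresholds:
    """Hypothetical thresholds; the defaults mirror the values described above."""
    min_available_ratio: float = 0.5  # alert when less than 50% of DCs are available
    min_available_count: int = 3      # alert when fewer than 3 DCs are alive


DEFAULTS = DcThresholds()


def dc_alert(total_dcs: int, available_dcs: int,
             thresholds: DcThresholds = DEFAULTS) -> bool:
    """Return True when this illustrative rule would raise an alert."""
    if total_dcs <= 0:
        # No DCs known at all: treat as unhealthy.
        return True
    below_ratio = available_dcs / total_dcs < thresholds.min_available_ratio
    below_count = available_dcs < thresholds.min_available_count
    return below_ratio or below_count


# An environment with only 2 DCs, both healthy, still alerts with the defaults.
always_alerting = dc_alert(total_dcs=2, available_dcs=2)

# An "override" that lowers the minimum count to fit this environment.
override = DcThresholds(min_available_count=2)
healthy_after_override = dc_alert(total_dcs=2, available_dcs=2, thresholds=override)
one_dc_down = dc_alert(total_dcs=2, available_dcs=1, thresholds=override)
```

With the defaults, the two-DC environment alerts permanently even though nothing is wrong. With the adjusted threshold it stays quiet until a DC actually goes down, which is exactly what you want from an alert.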
So we need to fine-tune.
Importing an MP into System Center Operations Manager won’t magically make your life shiny. You need to adjust that MP to fit your specific needs. But where do you start with this process, and how do you do it?
This is the first part of a blog post series that explains what you need to do and how you can do it in your own environment. In the next parts, we are going to look at the options you have and how to apply them, including some best practice tips.