Simplifying things for our application team consumers: Inputs
Application teams know their applications. We know our Service Monitoring Service. We need to define very simple inputs that will enable our application and service teams to easily consume our service. The great news for us is that we can succeed with just two simple inputs:
- No matter the operating system or platform, every piece of capacity (including network devices) writes events.
- We need connectors to every class of capacity that our application teams utilize so we can harvest the event information (e.g., SNMP network traps, Windows Server Event Logs, Internet of Things device logs).
- From there, our application teams can configure their code to write events for whatever conditions they deem necessary.
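To make the idea of connectors concrete, here is a minimal sketch of what a normalized event record might look like, with one illustrative connector adapter. The `MonitoringEvent` fields, the `from_snmp_trap` function, and the trap dictionary shape are all assumptions for illustration, not a real schema:

```python
# Sketch of a normalized event that every connector (SNMP traps, Windows
# Event Logs, IoT device logs, application-written events) maps its native
# format into. Field names are illustrative assumptions.
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class MonitoringEvent:
    source: str        # connector that harvested the event, e.g. "snmp"
    host: str          # the piece of capacity that wrote the event
    severity: str      # normalized severity: "info", "warning", "critical"
    message: str       # original event text
    received_at: datetime


def from_snmp_trap(trap: dict) -> MonitoringEvent:
    """Hypothetical connector adapter: translate an already-parsed SNMP
    trap into the normalized event shape."""
    return MonitoringEvent(
        source="snmp",
        host=trap["agent_addr"],
        severity={1: "critical", 2: "warning"}.get(trap.get("level"), "info"),
        message=trap["oid_text"],
        received_at=datetime.now(timezone.utc),
    )
```

Once every connector emits the same shape, downstream alerting, ticketing, and escalation logic only has to understand one record type, regardless of the class of capacity that produced the event.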
- Inbound email connector
- Another reality is that our application teams will build custom synthetic monitors and/or consume third-party synthetic monitoring solutions. Often, those monitors can only be configured to send email alerts.
- Also, many cloud services—on which our application teams depend—only send email-based alerts and notifications.
- However, we need all alerts in our Service Monitoring Service so that we have the metadata in one place. And remember, we need to manage centralized escalations through our center-point hub.
- Because some alerts will only arrive via email and we need all alerts in our system, we must account for inbound email alerts.
- We can easily account for this need in the following manner:
- Establish a monitoring inbox per application team and/or service.
- Instruct each application team to include the monitoring inbox on the distribution for any email alerts that they configure.
- Our Service Monitoring Service will programmatically check each monitoring inbox on a scheduled frequency (e.g. every minute) to look for new email alerts.
- Our service will consume those alerts, and based on the sender and/or subject of the emails, we will generate alerts for the application team.
- At that point, the alert is like any other alert in our system and we can use our normal outputs to notify and/or ticket the application team.
- NOTE: our code should clean up the monitoring inboxes so we do not run into “inbox full” situations that break our monitoring capabilities.
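The sender/subject mapping step above can be sketched as a small routing function. The rule shapes, team names, and `Alert` fields are illustrative assumptions (a real implementation would also poll each monitoring inbox on a schedule, e.g. with the standard-library `imaplib`, and delete processed messages to keep the inbox from filling up):

```python
# Minimal sketch of mapping an inbound email alert to an internal alert
# based on sender and/or subject. Rules and field names are assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    team: str          # owning application team
    source: str        # where the alert came from, here always "email"
    summary: str       # human-readable description


# Hypothetical routing rules: match on exact sender and/or subject keyword.
# A subject of None means "any subject from this sender".
RULES = [
    {"team": "payments", "sender": "synthetics@vendor.example", "subject": "DOWN"},
    {"team": "web",      "sender": "alerts@cloud.example",      "subject": None},
]


def email_to_alert(sender: str, subject: str) -> Optional[Alert]:
    """Map an inbound email to an Alert based on sender and/or subject."""
    for rule in RULES:
        if rule["sender"] != sender.lower():
            continue
        if rule["subject"] and rule["subject"] not in subject.upper():
            continue
        return Alert(team=rule["team"], source="email", summary=subject)
    return None  # unmatched mail is ignored (or flagged for manual triage)
```

From this point the generated alert flows through the same notification and ticketing outputs as any other alert in the system.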
A quick aside on performance monitoring
One of the common pitfalls with any Service Monitoring Service is that we may try to take on too much. Too broad a scope will destroy any project. Even though we realize the risk, we often try to do everything for everyone because we do not want to see an application team go their own way with tooling.
Performance Monitoring is a very application-centric need. Even though there are some commonalities, every application team will want different frequencies of data collection, different periods of data retention, different capabilities for performance triggers based on the data, etc. If we try to solve Performance Monitoring as part of our Service Monitoring Service initiative, we will distract from our goal of creating a virally adopted Service Monitoring Service. We cannot allow ourselves to venture into the performance monitoring quagmire.
At some point, our business may want to create a centralized “Performance Monitoring Service” in support of our Capacity and Performance Management processes. We may be the individuals that are chartered to build that service. And, I will likely write a blog series about the Performance Monitoring Service in the future.
Even so, the Performance Monitoring Service should be decoupled from the Service Monitoring Service. If we try to make the two services one and the same, we are much more likely to fail. Until such a time that the Performance Monitoring Service is a reality for our business, we should instruct our application teams to “do whatever they need to do” with respect to performance monitoring and performance monitoring triggers for their application. When they want to alert, ticket, or notify for a particular performance trigger, they should simply create an event. We will consume that event just like any other event.
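To illustrate the hand-off described above, here is a hedged sketch of what "simply create an event" might look like from an application team's side. The `write_event` function name, its payload, and the threshold value are all assumptions; the point is that the team keeps ownership of the performance trigger logic, and only the threshold breach crosses into our service as an ordinary event:

```python
# Team-owned performance trigger: the team decides the metric, frequency,
# and threshold; our Service Monitoring Service only ever sees the event.
from typing import Optional


def write_event(severity: str, message: str) -> dict:
    """Stand-in for however the team emits events (log line, API call, etc.)."""
    return {"severity": severity, "message": message}


def check_latency(p95_ms: float, threshold_ms: float = 500.0) -> Optional[dict]:
    """Fire an event only when the (hypothetical) p95 latency threshold is breached."""
    if p95_ms > threshold_ms:
        return write_event(
            "warning",
            f"p95 latency {p95_ms:.0f}ms exceeds {threshold_ms:.0f}ms",
        )
    return None  # within threshold: no event, nothing for our service to consume
```

The design point is the decoupling: our service never needs to know the team's collection frequency, retention policy, or trigger math, only the resulting event.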
We have covered a lot of ground in this blog series—ITSM and cloud, ITSM and application development, ITSM and tools, Service Monitoring Service Inputs and Outputs, and performance monitoring. I have no doubts that some of the points will spark passionate discussions. We could write books on each of those topics, and I am confident that we will write those books together over time.
Remember, my intention is not to convince you of any particular point. You do not have to agree with me. But I want you all to think critically, to plan deliberately, and to act urgently. The future of Service Management for our businesses is at stake. We owe it to our industry and our businesses to drive the Service Management evolution that is needed—even if some of those discussions are difficult discussions. The writing is on the wall and we must act now.
More blog posts in the Building Service Monitoring as a Service with an Eye on the Cloud series
Read the first blog post from Carroll Moon, Service Monitoring as a Strategic Opportunity.
Read the second post, The Future of Service Management in the Era of the Cloud.
Read the third post, One Team - One Set of Service Management Objectives.
Read the fourth post, Service Monitoring Service Outputs.
Read the sixth post, Building Trust in the Service Monitoring Service.
Read the seventh post, Making the Service Monitoring Service Viral.
Read the eighth post, Service Monitoring Application Development.
Read the ninth post, Monitoring Service Health.
Read the tenth post, Delivering the Service Monitoring Service.
Read the final post, The service monitoring service – rounding it all up.