Microsoft Azure Operational Insights Preview Series – Log Management (Part 3)


I’ve covered the System Update Assessment and Malware Assessment Intelligence Packs so far. The third Intelligence Pack that I will cover is Log Management. This intelligence pack may sound very simple, but I think it can be quite powerful.

On the overview page you will find the Log Management tile. The information there is more informational than actionable, but that is due to the nature of this Intelligence Pack.

image

Clicking on the tile leads you to more detailed information.

image

Before you see any data in these tiles, you need to configure the Log Management Intelligence Pack. Click on the configure tile to do that:

image

Here you enter which event logs you want to collect and at which severity level. It is very simple: you enter the display name of the event log and check the levels you want collected. As you can see in my example, I’ve added 3 event logs and chosen to gather only error events for them. A couple of hours later you will see the counts of gathered events for the different event logs going up.

There are a couple of reasons why I think this Intelligence Pack is very powerful. First, when you search the event log on a server you often cannot go back more than 2-3 days. Second, it is very hard to search events across more than one server.

Let’s start to explore this Intelligence Pack with some queries.

Let’s say that I want to see what cluster events have been logged for a particular server for the past 7 days:

Type:Event EventLog:System Computer:"server.contoso.com" Source:Microsoft-Windows-FailoverClustering | Select Computer,EventID,ParameterXml | measure count() as count by EventID

image

What I can see from here is that I have a lot of errors about failing roles and cluster resources. The first 4 EventIDs are all about such failures.

Let’s see what is shown if I do not target a particular server but instead look at the count across all servers:

Type:Event EventLog:System Source:Microsoft-Windows-FailoverClustering | Select Computer,EventID,ParameterXml | measure count() as count by EventID

image

We can see that the top 4 EventIDs are the same when searching through the whole environment. Let’s see if we can find how many of these events each server generates:

Type:Event EventLog:System Source:Microsoft-Windows-FailoverClustering | Select Computer,EventID,ParameterXml | measure count() as count by Computer

image

We can see that only 3 servers are generating these errors, and one of them generates considerably more than the other two.

Here we may suspect that something is wrong with the server that has the most events, but we may also want to find out whether they occurred on a particular day or are happening every day:
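To make the grouping step more concrete, here is a small Python sketch of what a "measure count() by <field>" clause conceptually does: filter the records, then count them per value of one field. This is not how the service works internally, and the sample records, host names and EventID values below are made up for illustration; only the field names (Computer, EventID, Source) come from the queries in this post.

```python
from collections import Counter

# Hypothetical sample records, NOT real data from the screenshots.
# Field names mirror the ones used in the search queries.
CLUSTER_SOURCE = "Microsoft-Windows-FailoverClustering"
events = [
    {"Computer": "node1.contoso.com", "EventID": 1069, "Source": CLUSTER_SOURCE},
    {"Computer": "node1.contoso.com", "EventID": 1205, "Source": CLUSTER_SOURCE},
    {"Computer": "node1.contoso.com", "EventID": 1069, "Source": CLUSTER_SOURCE},
    {"Computer": "node2.contoso.com", "EventID": 1069, "Source": CLUSTER_SOURCE},
    {"Computer": "node3.contoso.com", "EventID": 7031, "Source": "Service Control Manager"},
]


def count_by(records, field, source=None):
    """Count records per value of `field`, optionally keeping only one Source."""
    selected = (r for r in records if source is None or r["Source"] == source)
    return Counter(r[field] for r in selected)


# Rough analogue of "... | measure count() as count by Computer"
per_computer = count_by(events, "Computer", source=CLUSTER_SOURCE)

# Rough analogue of "... | measure count() as count by EventID"
per_event_id = count_by(events, "EventID", source=CLUSTER_SOURCE)

for computer, count in per_computer.most_common():
    print(computer, count)
```

Swapping the field name is all it takes to move from "which events are most common" to "which server produces them", which is the same pivot the two queries above make.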

Type:Event EventLog:System Source:Microsoft-Windows-FailoverClustering | Select Computer,EventID,ParameterXml | measure count() by TimeGenerated interval 1DAYS

image

The result shows that these events occur every day, but most of the errors come from the last 3 days, which suggests that I may need to tighten up the change management process.
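As a rough illustration of the daily bucketing behind "measure count() by TimeGenerated interval 1DAYS", the Python sketch below groups timestamps by calendar day and counts each group. The timestamps are invented for the example, and the assumption that TimeGenerated values arrive as ISO 8601 strings is mine, not something the service documents here.

```python
from collections import Counter
from datetime import datetime


def count_per_day(timestamps):
    """Bucket ISO 8601 timestamps into calendar days and count each bucket."""
    return Counter(datetime.fromisoformat(ts).date().isoformat() for ts in timestamps)


# Hypothetical TimeGenerated values, made up for illustration only.
sample = [
    "2015-01-10T08:15:00",
    "2015-01-10T21:40:00",
    "2015-01-11T03:05:00",
    "2015-01-12T11:30:00",
    "2015-01-12T11:31:00",
    "2015-01-12T16:45:00",
]

daily = count_per_day(sample)
for day in sorted(daily):
    print(day, daily[day])
```

A sudden jump in one of the daily buckets is the kind of signal that prompts a closer look at what changed on that day.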

I hope this gave you a good understanding of what the Log Management Intelligence Pack is and how it can help you in your own cases.
