My Webinar on Log Analytics in Operations Management Suite

Yesterday I did a webinar for my company Lumagate on Azure OpInsights, which is now called Operations Management Suite.

If you are interested in viewing this webinar, you can find it in Lumagate's content library.

Azure Operational Insights Becomes Microsoft Operations Management Suite and Gains Some Features

Yes, I know, another name change. Let's hope that this will be the last one. Microsoft Operations Management Suite actually includes 4 services:

  • Operational Insights
  • Backup (Azure Backup)
  • Site Recovery (Azure Site Recovery)
  • Automation (Azure Automation)

Note that the portal for this new service is Operational Insights' portal, and that Azure is no longer in the name of the service.

Now, before showing what is new in Microsoft Operations Management Suite (Operational Insights), I would also like to note that you will manage the services of this stack in different portals.

I personally find this confusing but wanted to share it with you.

So let’s see what is new.

First, the Microsoft Operations Management Suite portal UI is slightly changed:

image

Intelligence Packs are now called Solutions. The probable reason for that is that they will no longer represent only Intelligence Packs but more.

image

There are 3 new solutions representing the other services besides Operational Insights: Automation, Backup and Azure Site Recovery. All of these solutions are currently at a very basic level and provide a small integration surface. Let's look at each of them.

Automation Solution:

image

This solution allows you to connect to your Azure Automation account and shows some basic stats from it. There are a lot of links to the Azure portal where you can go and manage Azure Automation. This solution also serves as the deployment mechanism for the Azure Automation Hybrid Worker, which I've shown in a previous post. And that seems to be the whole integration story for now.

Backup Solution:

image

Like the Automation solution, the Backup solution offers similar integration. You can connect to your Azure Backup vault and see some basic stats. There are links to go and manage Azure Backup in the Azure portal.

Azure Site Recovery Solution:

image

This solution seems better than the above two: it presents its data in a richer way and, again, links to manage Azure Site Recovery in the Azure portal.

As I see it, none of these 3 solutions currently stores data in Operational Insights; they basically get their data from the other services directly.

Of course, other integrations between these services already exist: you can do some automation for Site Recovery or Backup, and Site Recovery can leverage runbooks in Azure Automation.

There are some other changes as well. Settings is slightly changed:

image

Some things are advertised as coming soon, such as gathering logs from Linux servers through a Linux agent and getting logs from AWS storage.

image

The Usage page is also changed and now looks more useful:

image

And at last, you can actually get syslog data from Linux machines located in Azure without an agent. You can do that by going to the Azure portal and adding a storage account as a data source. You will see that there is a new data type called Syslog (linux):

image

The bad news is that I couldn't find any information on how to enable Diagnostics in Azure for Linux VMs so that data fills up that table. I hope documentation on that will come soon.

Azure Automation – Hybrid Worker Setup

It is May 4th, which marks the start of Microsoft Ignite 2015, and of course new Azure features are starting to appear. First, Azure Automation is now available in the preview portal:

image

This new experience also gives you the option to create graphical runbooks:

image

Although this looks great, I think I will still prefer writing text (PowerShell) runbooks, but graphical runbooks will be useful for folks with less PowerShell experience.

Now let's move on to the more interesting feature, and that is the Hybrid Worker. The Hybrid Worker lets you execute runbooks on an on-prem machine instead of in Azure. If you click on that tile and configure it, you will see how to enable it:

image

So, in short, Azure Automation uses Azure Operational Insights to deploy the Hybrid Worker bits to the on-prem machine. After that you log on locally to the machine and execute a cmdlet in order to connect the Hybrid Worker to your Azure Automation account. It is still unclear to me whether the Hybrid Worker will use OpInsights' proxy if you do not have a direct connection to the Internet.

Let's see how we can do all these steps.

First I go to my Operational Insights workspace and add the Automation Intelligence Pack:

image

After that, the bits needed for the Hybrid Worker will be deployed to all your directly attached agents:

image

What I've noticed is that those bits are not deployed to a connected SCOM server or to agents connected through SCOM. So you should use directly connected OpInsights agents.

After that you can log on to a machine with a direct agent and execute a cmdlet to connect that machine to Azure Automation, basically promoting it to a Hybrid Worker.

The cmdlet in question is Add-HybridRunbookWorker, and we first need to find which module it is in. After some digging I found that all Azure Automation Hybrid Worker bits are located in the following directory:

C:\Program Files\Microsoft Monitoring Agent\Agent\AzureAutomationFiles

image
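
If you prefer to locate the module manifest from PowerShell instead of browsing the folder, a minimal recursive search for module manifests (.psd1 files) does the job:

Get-ChildItem "C:\Program Files\Microsoft Monitoring Agent\Agent\AzureAutomationFiles" -Recurse -Filter *.psd1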

In the HybridRegistration folder you will find a module that you can import:

Import-Module "C:\Program Files\Microsoft Monitoring Agent\Agent\AzureAutomationFiles\HybridRegistration\HybridRegistration.psd1"

image

After that you can see what cmdlets are available:

Get-Command -Module HybridRegistration

image

Looking at the help of Add-HybridRunbookWorker, we can see what parameters are available:

image
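
If you want to pull up that help yourself, the standard Get-Help cmdlet works once the module is imported:

Get-Help Add-HybridRunbookWorker -Detailed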

It seems Endpoint, Token and Name are mandatory, but where can I find those? Actually, two of them are easy to find in the Azure Automation account information. Just go to your Automation account main page; at the top there is a key icon that you can click, which will give you the Primary Access Key, Secondary Access Key and URL.

image

Your Primary Access Key is your Token parameter, the URL is your Endpoint parameter, and for the Name parameter you can choose whatever makes sense for you:

Add-HybridRunbookWorker -EndPoint https://we-agentservice-prod-1.azure-automation.net/accounts/XXXXXXXXXXXXXXXXXXXXXXXXXXX -Token "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" -Name StanHW

image

After that you should immediately see your Hybrid Worker in the portal:

image

Now you can create a simple runbook like this for testing:

workflow TestHybrid
{
    Write-Output $env:computername
}


If you run it and choose Azure:

image

The output will be “Client”:

image

But if you execute it on your Hybrid Worker:

image

The output will be “DC01”, which is the name of the server I've added as a Hybrid Worker:

image

When you execute a runbook on a Hybrid Worker, you will see a process called Orchestrator.Sandbox being started:

image
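
You can also catch that process from PowerShell on the Hybrid Worker itself while the job is running:

Get-Process -Name "Orchestrator.Sandbox" -ErrorAction SilentlyContinue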

So, a few conclusions from me on this:

  • It seems this feature currently works only with the OpInsights Direct Agent and not with SCOM-connected servers.
  • It is not clear if the Hybrid Worker uses the proxy of the OpInsights Direct Agent if there is one.
  • Executing runbooks in Azure looks a little bit faster.
  • It seems you currently cannot schedule runbooks to be executed on a Hybrid Worker.
  • You cannot test runbooks on a Hybrid Worker; you have to publish the runbook and execute it.
  • It seems you are billed the same way whether you execute runbooks in Azure or on a Hybrid Worker.
  • The tile for Automation in OpInsights is currently empty and unclickable.
  • PowerShell modules uploaded to Azure Automation are not distributed to Hybrid Workers.
  • If you use the same Name when registering the Hybrid Worker on more than one server, they will work in high-availability/load-balancing mode. I am not sure what happens if a node is down.

Microsoft Azure Operational Insights Preview Series – General Availability (Part 18)

Previously on Microsoft Azure Operational Insights Preview Series:

This will be the last blog post of this series, simply because I've received an e-mail that Azure Operational Insights will become Generally Available on May 4th. This date matches my prediction that it would happen around the Microsoft Ignite conference. I will still continue to write blog posts on the service, but they will not be part of a series. Here is the official Azure e-mail we all received about the service:

image
I hope this series was helpful to you.

Microsoft Azure Operational Insights Preview Series – Security and Audit (Part 17)

Previously on Microsoft Azure Operational Insights Preview Series:

The Security and Audit Intelligence Pack is probably the most powerful IP of all. It gathers a lot of logs: basically every security log on every machine you are monitoring with Operational Insights. And if you have tried doing that with SCOM Audit Collection Services in the past, you know it is not an easy job. Azure Operational Insights solves that problem: you just enable the IP, the OpInsights team takes care of supporting the infrastructure for all this data and of updating the IP itself, and you simply consume the end result and do your analysis on the data.

Now, before enabling this IP, keep in mind that it uploads a lot of data. For 57 machines, of which 1 is an SMB storage server, 4 are Hyper-V servers and the rest are virtual machines, we have seen up to 44 GB of uploaded data per day. As this information is based on the preview, always look for the latest data on this topic.
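
For a rough sense of scale, that works out to about 44 GB / 57 machines ≈ 0.8 GB per machine per day, though the per-machine share will obviously vary with each server's role.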

You can find the Security and Audit IP in the Gallery:

image

Until data has been gathered, you will not be able to click on the tile and dive deeper:

image

After several hours you will see data:

image

Clicking on the tile opens a lot of information:

image

You will notice that this data is scoped to the last 1 day. The reason for this is to show you day-to-day trends.

I will not focus on every single tile in this dashboard, as clicking on any tile leads you to a search query. Those predefined queries are there to help you explore, so that you can build your own queries that make sense in your environment.

We can find a KB article with some security Event IDs and search by them:

http://support.microsoft.com/kb/977519

Let's say I choose Event 4720, which should show me which user accounts have been created:

Type=SecurityEvent EventID=4720 | Select TargetUserName,UserPrincipalName,TargetSid,TargetDomainName,SubjectAccount

It is a very simple query that I can execute, and I can, for example, scope it to the last 7 days:

image

Although that query is very simple, it gives me powerful results, as I search across every domain controller I am monitoring. Imagine situations where I have two separate environments but want to see results from both of them in one place. That is the power of Operational Insights.
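
As a small variation (assuming the SecurityEvent type also exposes a Computer field; verify the exact field name against your own data), you can aggregate the created accounts per domain controller with the Measure command:

Type=SecurityEvent EventID=4720 | Measure count() by Computer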

When you are trying this service, I would recommend converting every audit you did in the past related to Windows security logs into search queries. Keep in mind the service is still in preview, so there might be some glitches or missing scenarios. Whether this service is GA or in preview, it is always good to express your suggestions on UserVoice to help improve it.

Azure Automation is Faster Than Me

The original title of this post was supposed to be “Strategy on Running Runbooks in Azure Automation”, and I have wanted to write it for more than a month. Why did the title change during this period? Simple: there was a limitation on running runbooks in Azure Automation, and that limitation was significantly increased, which pushed me to change the title. Let me explain in detail what the limitation was, what has happened, and what the strategy is that you might still need and is always good to implement.

Azure Automation had a limitation: if a runbook ran for more than 30 minutes, the job would be restarted, either from the beginning of the runbook or from the last checkpoint you had made in it. Those 30 minutes have now been increased to 3 hours. I found that out just before writing this article, as I always verify sources before writing a blog post. A month ago this limit was 30 minutes, and I know because I hit it. I was a little busy, so I couldn't write a blog post on it right away, and after a month things had changed. This limitation is described over here under the Fair Share section.

But even with 3 hours, I guess in some extreme cases you can still hit that limit. My proposal is to have one root runbook that starts other child runbooks one after another. In the child runbooks you put tasks that you make sure run in under 3 hours, or tasks that can be executed asynchronously. Asynchronous tasks usually return a job ID that you can query, for example every 5 seconds, to check the status. Before doing the check you can create a checkpoint (Checkpoint-Workflow) in your runbook, so that if the runbook takes longer than 3 hours and is restarted, it continues right from your last safe point. In your root runbook you can use the same approach for starting the child runbooks: get the job ID, make a checkpoint and wait for the status of the job. There is a good example on starting a runbook and checking its job status here.

Here is some sample code from me to illustrate the logic:

workflow ExampleRoot
{
    param (
        [Parameter(Mandatory=$true)][string]$Param1
    )

    $Creds = Get-AutomationPSCredential -Name 'Creds'

    ########################### RUNBOOK 1 #############################################################
    # Connect to Azure Subscription
    $AzureAccount = Add-AzureAccount -Credential $Creds

    # Parameters are passed as a hashtable; Param1 is assumed to be the child runbook's parameter name
    $job1 = Start-AzureAutomationRunbook -AutomationAccountName CloudAdmin `
                                         -Name                  "ExampleChild1" `
                                         -Parameters            @{ Param1 = $Param1 }

    # Checkpoints cannot persist credentials, so clear them first
    $Creds = $null
    # Create Checkpoint
    Checkpoint-Workflow
    $Creds = Get-AutomationPSCredential -Name 'Creds'

    # Connect to Azure Subscription again after the checkpoint
    $AzureAccount = Add-AzureAccount -Credential $Creds

    # Poll the job every 5 seconds until it reaches a terminal state
    $doLoop1 = $true
    While ($doLoop1)
    {
        Start-Sleep -Seconds 5
        $job1 = Get-AzureAutomationJob -AutomationAccountName CloudAdmin `
                                       -Id                    $job1.Id
        $status1 = $job1.Status
        $doLoop1 = (($status1 -ne "Completed") -and ($status1 -ne "Failed") -and ($status1 -ne "Suspended") -and ($status1 -ne "Stopped"))
    }

    $Output1 = Get-AzureAutomationJobOutput -AutomationAccountName CloudAdmin `
                                            -Id                    $job1.Id `
                                            -Stream                Output

    $RemovedAzureAccount = Remove-AzureAccount -Name          $AzureAccount.Id `
                                               -Force         `
                                               -WarningAction SilentlyContinue

    ########################### RUNBOOK 2 #############################################################

    # Connect to Azure Subscription
    $AzureAccount = Add-AzureAccount -Credential $Creds

    $job2 = Start-AzureAutomationRunbook -AutomationAccountName CloudAdmin `
                                         -Name                  "ExampleChild2" `
                                         -Parameters            @{ Param1 = $Param1 }

    # Clear credentials again before the next checkpoint
    $Creds = $null
    # Create Checkpoint
    Checkpoint-Workflow
    $Creds = Get-AutomationPSCredential -Name 'Creds'

    # Connect to Azure Subscription
    $AzureAccount = Add-AzureAccount -Credential $Creds

    $doLoop2 = $true
    While ($doLoop2)
    {
        Start-Sleep -Seconds 5
        $job2 = Get-AzureAutomationJob -AutomationAccountName CloudAdmin `
                                       -Id                    $job2.Id
        $status2 = $job2.Status
        $doLoop2 = (($status2 -ne "Completed") -and ($status2 -ne "Failed") -and ($status2 -ne "Suspended") -and ($status2 -ne "Stopped"))
    }

    $Output2 = Get-AzureAutomationJobOutput -AutomationAccountName CloudAdmin `
                                            -Id                    $job2.Id `
                                            -Stream                Output

    $RemovedAzureAccount = Remove-AzureAccount -Name          $AzureAccount.Id `
                                               -Force         `
                                               -WarningAction SilentlyContinue
}

Logically this would look something like this:

image

Now there is one tip I want to share here. You will notice that before making a checkpoint I clear the credentials variable I am using with $Creds = $null. Currently the checkpoint cannot handle credentials, and your runbook will probably fail if you do not clear all the credential variables you are using before making a checkpoint. This issue is described here. After clearing the credentials and making the checkpoint, you can get your credentials again from Assets. Example:

    $Creds = $null
    # Create Checkpoint
    Checkpoint-Workflow
    $Creds = Get-AutomationPSCredential -Name 'Creds'

 

If you are using more than one credential, you will obviously need to clear each of them and get them again after the checkpoint.
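
For example, with two hypothetical credential assets named 'Creds1' and 'Creds2', the pattern around the checkpoint would look like this:

$Creds1 = $null
$Creds2 = $null
# Create Checkpoint
Checkpoint-Workflow
$Creds1 = Get-AutomationPSCredential -Name 'Creds1'
$Creds2 = Get-AutomationPSCredential -Name 'Creds2'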

Whether Azure Automation has that time limit or not, it is always good to make checkpoints after certain actions. This will make your runbooks better.

I hope this was helpful.

Microsoft Azure Operational Insights Preview Series – Collecting Logs from Azure Diagnostics (Part 16)

Previously on Microsoft Azure Operational Insights Preview Series:

This blog post is about a feature of OpInsights that you may or may not know about. Besides ingesting data through agents or SCOM, OpInsights can ingest data through Azure Storage as well. And you can place data in Azure Storage through an Azure feature like Azure Diagnostics. So let's see how all this works.

First you will need to link your OpInsights workspace to your Azure subscription and add an Azure Storage account to it. You can check Part 10 of my series for this, but in Azure you should have the following configured for the storage:

image

Now that we have this in place, let's see what we can actually ingest. Azure Diagnostics can collect different types of data, but currently OpInsights can ingest only some of it. The current matrix of which logs can be ingested and from which source is the following:

image

Now let’s see how to configure Windows Event logs for a VM.

To do this you will need to go to the Azure Preview portal:

https://portal.azure.com

Click Browse -> Virtual Machines

image

Select one of the virtual machines for which you want to activate Azure Diagnostics:

image

Click on the monitoring tile:

image

Select Diagnostics Settings and change status from Off to On:

image

Basically, for a virtual machine, you can gather any Windows event log as long as you enable it. In my case I've also selected to collect everything from Verbose to Critical; you can of course decide to collect only events above Warning.

You will also need to place these logs in the same storage account that is used by Operational Insights. When you are ready, click Save.

After around one hour if you execute the following query:

* | Measure count() by SourceSystem

You should see Events from source Azure Storage showing up:

image

Of course you can also enable Azure Diagnostics with Azure PowerShell. You can find an example of this, along with how to enable Azure Diagnostics on web roles and worker roles, on the Azure Operational Insights documentation site.
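
As a rough sketch with the Service Management cmdlets (the cloud service, VM, storage account and configuration file names below are placeholders you would replace with your own), enabling the diagnostics extension on a VM looks something like this:

# Build a storage context for the account that is linked to Operational Insights
$storageKey = (Get-AzureStorageKey -StorageAccountName "mystorageaccount").Primary
$storageContext = New-AzureStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey $storageKey

# wadcfg.xml is an Azure Diagnostics configuration file describing which event logs to collect
Get-AzureVM -ServiceName "MyCloudService" -Name "MyVM" |
    Set-AzureVMDiagnosticsExtension -DiagnosticsConfigurationPath "C:\config\wadcfg.xml" -StorageContext $storageContext |
    Update-AzureVM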
