How to selectively collect security events by using Microsoft Operations Management Suite?

UPDATE: Post updated on 20th April with information about security events required to populate pre-defined queries and tiles that come with the Security and Audit solution.

 

The Operations Management Suite (OMS) Security and Audit solution enables you to collect security events from managed computers. Once enabled, it configures the agents on managed computers to send all security events to the Cloud. Depending on the audit policy and on the size and complexity of your environment, this can produce a large volume of event data sent to the OMS workspace, causing you to reach the free daily data transfer limit or incur higher service costs. In a large-scale enterprise environment, the amount of security data logged on Active Directory domain controllers can be tremendous.

If your managed computers are not attached to OMS directly and you use the OMS Connector for SCOM instead, you can configure your SCOM environment to selectively collect security events from managed computers. This gives you the flexibility to control:

  1. which security events should be sent to the Cloud and
  2. which managed computers should be within the scope of the security event log collection.

This blog post gives an overview of the required steps.

For this exercise, we will store all configuration and overrides in a separate management pack.

Create a new SCOM management pack in the on-premises SCOM environment.

041216_1614_Howtoselect1.png

To configure integration with the OMS workspace, we then create a new computer group that will contain the Windows Computer objects of the managed computers that should be in scope of the security event log collection.

041216_1614_Howtoselect2.png

Leave it empty for the moment, without any members, and store it in the custom management pack created in the previous step.

To make the computer objects that will be members of the new computer group OMS-enabled, go to the Administration workspace, expand Operations Management Suite, and then click Managed Computers. Click the Add a Computer/Group link in the Tasks pane and search for the newly created computer group. Click Add, and then click OK.

3a - OMS Add Group

The newly created computer group should now be visible in the Managed Computers node under Operations Management Suite in the Administration workspace of the Operations console.

3b - OMS Add Group

When the Security and Audit solution is added to the OMS workspace, the Microsoft System Center Advisor Security Event Collection management pack is imported into the on-premises SCOM environment that is connected to the OMS workspace.

Microsoft System Center Advisor Security Event Collection

This management pack contains the Collect Security Events rule.

4a - Collect Security Events rule

It collects security events for the Security and Audit Solution and sends them to the Cloud.

To change the default behavior of the Security and Audit solution, in which OMS-enabled managed computers send all security events to the Cloud, we first need to create an override. The purpose of this override is to disable the Collect Security Events rule. This is done by creating an override for the Enabled parameter for all objects of the Windows Computer class. To give this override precedence, make sure that it is created with the Enforced parameter enabled, as shown in the screenshot below.

4b - Create override for the Collect Security Events rule

Store the override in the previously created custom management pack.

To confirm that the override has been created as expected, open the Overrides Summary for the Collect Security Events rule.

4c - Overrides summary for the Collect Security Events rule

Now that we have an override in place that ensures security events are no longer sent to the Cloud by default from OMS-enabled computers, the next step is to create a custom rule that defines the list of specific security events to include in or exclude from the security event log collection. Start the Create Rule Wizard and select Collection Rules -> Event Based -> NT Event Log. Make sure that the new rule is stored in the custom management pack that we created earlier.

5a - rule creation

Give it a meaningful name and description, scope it over the Windows Computer class and leave it disabled by default, as shown in the screenshot below.

5b - rule creation

Specify Security as the name of the event log to read events from.

5c - rule creation

Define the list of specific security events that you want to include in or exclude from the security event log collection on the Build Event Expression page of the Create Rule Wizard. The configuration of this rule depends on the audit policy, the compliance requirements of your company, and the specifics of your IT environment. In the example shown in the screenshot below, I’ve decided to exclude four specific security events from the collection.

5d - rule creation

NOTE: Scroll down to see the list of security events that need to be collected for the Security and Audit solution.

Finish the Create Rule Wizard, open the Properties of the newly created rule, and then go to the Configuration tab. Notice that, as created, this rule stores collected security events in the OperationsManager and OperationsManagerDW databases.

6 - rule properties before

This default response must be modified, which is done by editing the custom management pack where the rule is stored. To do that, export the custom management pack and open it in Notepad or your preferred XML editor. Locate the WriteActions node of the custom security event collection rule and replace the following configuration:

<WriteActions>
  <WriteAction ID="WriteToDB" TypeID="SystemCenter!Microsoft.SystemCenter.CollectEvent" />
  <WriteAction ID="WriteToDW" TypeID="SCDW!Microsoft.SystemCenter.DataWarehouse.PublishEventData" />
</WriteActions>

… with this:

<WriteActions>
  <WriteAction ID="HttpWA" TypeID="IPTypes!Microsoft.SystemCenter.CollectHighVolumeDirectChannelCloudEvent" />
</WriteActions>

The CollectHighVolumeDirectChannelCloudEvent write action is a reference to the High Volume Direct Channel Cloud Event Collection Write Action that is defined in the Microsoft System Center Advisor Types Library management pack. This write action ensures that data is sent directly from the agent (managed computer) to the OMS cloud rather than to the Management Server first. In addition, data sent by this write action is indexed as the SecurityEvent data type on the OMS side.

To be able to use this write action in our management pack, we must also add a reference to the Microsoft System Center Advisor Types Library management pack to our custom management pack:

<Reference Alias="IPTypes">
  <ID>Microsoft.IntelligencePacks.Types</ID>
  <Version>7.0.10609.0</Version>
  <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
</Reference>

Now, scroll down to the DisplayStrings section of the custom management pack XML file and remove the entries for the WriteToDW and WriteToDB sub-elements, which refer to the write actions that we no longer use. If you don’t do that, you will most likely receive an error during the management pack import.

management pack edit

Finally, scroll up to the Identity section of the custom management pack XML file and increase the version number. Now you are ready to import the updated custom management pack.

Once the management pack import is finished, open the Properties of the custom rule again and go to the Configuration tab. You should now see the High Volume Direct Channel Cloud Event Collection Write Action as the default response action for this rule.
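The manual XML edits described above (swap the write actions, add the IPTypes reference, drop the stale display strings, bump the version) could also be scripted. The following is only a rough sketch in Python, assuming the exported management pack is an unsealed XML file with the usual Manifest / Monitoring / LanguagePacks layout and no default XML namespace. The function names and the rule ID passed in are hypothetical. Compare the output with the original file before importing it, because ElementTree may not preserve comments or unused namespace declarations.

```python
import xml.etree.ElementTree as ET

# Values taken from the XML fragments shown in this post.
CLOUD_WRITE_ACTION = {
    "ID": "HttpWA",
    "TypeID": "IPTypes!Microsoft.SystemCenter.CollectHighVolumeDirectChannelCloudEvent",
}
OLD_WRITE_ACTION_IDS = {"WriteToDB", "WriteToDW"}


def bump_version(version):
    """Increase the last component of a dotted version such as 1.0.0.0."""
    parts = version.strip().split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


def retarget_rule(mp_xml, rule_id):
    """Return management pack XML in which rule_id sends events to the cloud."""
    root = ET.fromstring(mp_xml)

    # 1. Replace the rule's write actions with the cloud write action.
    rule = root.find(f".//Rule[@ID='{rule_id}']")
    if rule is None:
        raise ValueError(f"rule {rule_id!r} not found")
    write_actions = rule.find("WriteActions")
    if write_actions is None:
        raise ValueError("rule has no WriteActions node")
    for action in list(write_actions):
        write_actions.remove(action)
    ET.SubElement(write_actions, "WriteAction", CLOUD_WRITE_ACTION)

    # 2. Add the reference to the Advisor Types Library if it is missing.
    references = root.find("./Manifest/References")
    if references.find("Reference[@Alias='IPTypes']") is None:
        ref = ET.SubElement(references, "Reference", {"Alias": "IPTypes"})
        ET.SubElement(ref, "ID").text = "Microsoft.IntelligencePacks.Types"
        ET.SubElement(ref, "Version").text = "7.0.10609.0"
        ET.SubElement(ref, "PublicKeyToken").text = "31bf3856ad364e35"

    # 3. Remove display strings that point at the removed write actions.
    for strings in root.iter("DisplayStrings"):
        for entry in list(strings):
            if (entry.get("ElementID") == rule_id
                    and entry.get("SubElementID") in OLD_WRITE_ACTION_IDS):
                strings.remove(entry)

    # 4. Increase the management pack version.
    version = root.find("./Manifest/Identity/Version")
    version.text = bump_version(version.text)

    return ET.tostring(root, encoding="unicode")
```

A typical use would be to read the exported file into a string, call retarget_rule with the ID of the custom rule, and write the result to a new file that is then imported.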

6 - rule properties after

Now the only thing remaining is to configure an override for the custom rule in order to enable it for the computer group we created earlier. Right-click the custom rule and then select Overrides -> Override the Rule -> For a group.

7a - override creation

Set the override value of the Enabled parameter to True.

7b - override creation

Open the Overrides Summary for the custom rule to double-check that the override was created successfully.

7c - override creation

Now populate the custom computer group and watch security events flow into the Microsoft OMS workspace.

When configuring custom security event collection rules, take into account that the Security and Audit solution requires the following security events in order to populate its pre-defined queries, visualization tiles, and graphs:

21
43
104
219
400, 410
517
560
865-868
882
1000
1001
1007, 1008
1102
2087
3001, 3002
3003, 3004
3010, 3023
4104
4624
4625
4626
4657
4663
4688
4698
4699
4700
4701
4702
4713
4719
4720
4722
4740
4767
4912
4946
4947
4948
4949
4950
4956
5024
5025
5029
5030
5033
5034
5035
5037
5038
5858
6281
8002-8007
10154
64004
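Before excluding events in a custom rule, it can help to check the planned exclusions against the list above. The sketch below is one possible way to do that in Python. It expands entries such as "865-868" into individual IDs and reports which excluded IDs the solution still needs. The function names are hypothetical, and the list text is expected to be pasted in by the reader.

```python
import re


def expand_event_ids(spec):
    """Expand text such as '400, 410' or '865-868' into a set of event IDs."""
    ids = set()
    for token in re.split(r"[,\s]+", spec.strip()):
        if not token:
            continue
        if "-" in token:
            low, high = token.split("-", 1)
            ids.update(range(int(low), int(high) + 1))
        else:
            ids.add(int(token))
    return ids


def excluded_but_required(required_spec, excluded_ids):
    """Return the excluded event IDs that also appear in the required list."""
    return sorted(expand_event_ids(required_spec) & set(excluded_ids))
```

An empty result means that none of the planned exclusions removes an event that the pre-defined queries and tiles depend on.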

 

For more information, refer to the Anatomy of an Event Collection Rule for Azure Operational Insights (Advanced targeting when using OpsMgr attach) blog post by Daniele Muscetta. Big thanks to Satya Vel for help around this topic.

 


Audit Collection Services: number of events per day

If you ever wanted to know how many events per day are collected and stored within the Audit Collection Services database of your Operations Manager infrastructure, you can run the following SQL query against the OperationsManagerAC database:

SELECT
 CASE
 WHEN(GROUPING(CONVERT(VARCHAR(20), CollectionTime, 102)) = 1) THEN 'All Days' 
 ELSE CONVERT(VARCHAR(20), CollectionTime, 102)
 END AS DayAdded, 
 COUNT(*) AS EventsPerDay 
FROM AdtServer.dvAll5
GROUP BY CONVERT(VARCHAR(20), CollectionTime, 102) WITH ROLLUP 
ORDER BY DayAdded DESC


Message Queuing 6.0 Management Pack: Improper default configuration still present

This post is a follow-up to my previous blog post on the topic of the improper default configuration of the Message Queuing Management Pack for Operations Manager.

Message Queuing 6.0 Management Pack has recently been released by Microsoft. You can find it here – http://www.microsoft.com/en-us/download/details.aspx?id=36775.

According to the Management Pack documentation, it is designed to monitor Message Queuing version 6.0 only and it supports the following platforms:

    Windows Server 2012

    Windows 8

I was curious to check whether the “Collect MSMQ Log Detail Script Events” rule is still included and wanted to see its configuration. It seems that it is still configured to collect all events from the Operations Manager event log where Event Source equals Health Service Script.

So, be careful if you import this management pack, because its default configuration could cause event flooding and might burden the Operations Manager database server.

See my previous post on this topic to learn how to create an override that disables the “Collect MSMQ Log Detail Script Events” rule.


ConfigMgr: The case of the missing asset inventory data aka faulty Server Locator Point

I ran into an issue where asset inventory data was missing from managed devices, so I started troubleshooting by examining the SCCM client log files on the affected devices. The InventoryAgent.log file, which records the activities of the Hardware (and Software) Inventory Client Agent, showed that inventory was running on schedule without any errors being logged, and that the inventory report was created and sent to the Management Point.

Inventory: Collection Task completed in 6.657 seconds
Inventory: 19 Collection Task(s) failed.
Inventory: Temp report = D:\WINDOWS\system32\CCM\Inventory\Temp\c270ec00-8fce-40ce-823b-cd1af0f998ce.xml
Inventory: Starting reporting task.    
Reporting: 17 report entries created.
Inventory: Reporting Task completed in 0.235 seconds
Inventory: Successfully sent report. Destination:mp:MP_HinvEndpoint, ID: {5715713A-8EBD-4F52-A65F-9177CF67474F}, Timeout: 80640 minutes MsgMode: Signed, Not Encrypted
Inventory: Cycle completed in 16.484 seconds
Inventory: Action completed.

This led me to the conclusion that the problem was not on the client side. The next step was to check the server side…

The Dataldr.log file on the server side records the processing of MIF files and hardware inventory reports sent to the Site Server. There I realized that there were no records of hardware inventory reports from the affected clients being processed. So, on one hand, records on the client side show that hardware inventory reports are generated and sent to the Management Point. On the other hand, there are no records of those reports reaching the server. Data seems to be lost somewhere between the client and the server, so let’s look at what happens in between.

The fact that the affected devices were WORKGROUP devices indicated that something was wrong with how workgroup devices locate the Management Point. Because workgroup devices cannot query information published in Active Directory, they use Server Locator Points to locate Management Points.

To assess the health of the Server Locator Point, I first performed the following checks:
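The client-side check described above can be repeated across many devices with a small script. The following Python sketch assumes the log messages have the plain form shown in the excerpt above (a real InventoryAgent.log wraps each message in additional metadata, which this sketch simply ignores). The function name and the shape of the returned summary are hypothetical.

```python
import re

# Patterns based on the message text shown in the log excerpt above.
FAILED_RE = re.compile(r"Inventory: (\d+) Collection Task\(s\) failed")
SENT_RE = re.compile(r"Inventory: Successfully sent report\. Destination:(\S+?),")


def summarize_inventory_cycle(lines):
    """Summarize whether an inventory report was sent and how many tasks failed."""
    summary = {"failed_tasks": 0, "report_sent": False, "destination": None}
    for line in lines:
        failed = FAILED_RE.search(line)
        if failed:
            summary["failed_tasks"] = int(failed.group(1))
        sent = SENT_RE.search(line)
        if sent:
            summary["report_sent"] = True
            summary["destination"] = sent.group(1)
    return summary
```

If report_sent is true on the client while the server shows no trace of the report, the problem is more likely in between than on the client itself.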

  • Check if the SMS_SERVER_LOCATOR_POINT service is running as expected

  • Check the health status of the SMS_SERVER_LOCATOR_POINT component in the Configuration Manager Console under Site Database → System Status → Site Status → Primary Site → Component Status.

 

Since neither check indicated any problem with the SLP, I also looked at the IIS log files to see whether there were any entries showing HTTP requests reaching the SLP component.

I use the SMS Trace utility to parse IIS log files. To narrow the entries down to the ones I was interested in, I used the following filter: show all lines that contain the “/sms_slp/slp.dll” substring. This shows the HTTP requests made against the Server Locator Point.

This is where I realized that we were not dealing with an SLP service failure (service not running). Instead, it was a case of SLP service corruption (the service runs but does not work as expected). The server responded with the 403 Forbidden HTTP status code to all client requests against the Server Locator Point.

As a quick fix, I removed and reinstalled the Server Locator Point. After a successful reinstallation, the SLP started to respond to client requests, clients were able to find the Management Point, and the asset inventory information started to flow in.

For more information about valid SLP queries and responses, refer to this article on the TechNet blog: Questions about Service Locator Points (SLP’s) – http://blogs.technet.com/b/mwiles/archive/2011/03/21/questions-about-service-locator-points-slp-s.aspx
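The same filter can be applied without SMS Trace. The sketch below, in Python, assumes the IIS logs are in the W3C extended format, where a "#Fields:" directive names the space-separated columns and the status code is logged in the "sc-status" column. It keeps only requests against /sms_slp/slp.dll and counts them per HTTP status. The function name is hypothetical.

```python
from collections import Counter

SLP_PATH = "/sms_slp/slp.dll"


def slp_status_counts(lines):
    """Count HTTP status codes for requests against the Server Locator Point."""
    fields = []
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            # Column names for the records that follow this directive.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and other directives
        if SLP_PATH not in line.lower():
            continue  # same substring filter as used in SMS Trace
        record = dict(zip(fields, line.split()))
        status = record.get("sc-status")
        if status is not None:
            counts[status] += 1
    return counts
```

A result dominated by 403 would match the symptom described here: the service is running and receives requests, but refuses all of them.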


Improper default configuration of the Message Queuing Management Pack for Operations Manager

While fine-tuning the Operations Manager infrastructure for a customer, a high number of collected events in the SCOM database came to my attention. So, I spent some time analysing where all the collected events came from in order to better understand the root cause.

The first thing that caught my eye was that the majority of collected events had “Health Service Script” as the Event Source property.

The next step was to understand which rule collects “Health Service Script” events. It turned out that 92% of all collected events were collected by a single rule, “Collect MSMQ Log Detail Script Events”.

Structure of collected events (per rule)

This rule is configured with an expression which collects all events from the Operations Manager event log where Event Source equals Health Service Script.

Collect MSMQ Log Detail Script Events rule configuration

Needless to say, most of the “Health Service Script” events are neither related to the MSMQ management pack nor relevant for understanding the health state of managed devices from the MSMQ perspective. Collecting all “Health Service Script” events might cause event flooding on the Operations Manager database side.

The “Collect MSMQ Log Detail Script Events” rule is included in the Message Queuing Management Pack for Operations Manager and is enabled by default. To my knowledge, it is included in the following management packs:

  • Microsoft MSMQ 2003 management pack,
  • Microsoft MSMQ 2008 management pack,
  • Microsoft MSMQ 2008 R2 management pack and
  • Message Queuing 6.0 Management Pack.

To prevent event flooding, create an override that disables the “Collect MSMQ Log Detail Script Events” rule for all applicable target classes by changing the value of the Enabled parameter from True (default value) to False (override value).

Override configuration

The limit of Software Inventory rules

On the Inventory Collection tab of the Software Inventory Client Agent Properties, I recently stumbled upon an issue where the New icon was greyed out and disabled, not allowing me to add any new rules defining the software inventory scope.


After some googling around, this behaviour turned out to be documented and by design. In Configuration Manager 2007, there is a maximum limit of 64 rules. I am not sure whether it applies to Configuration Manager 2012 as well, but it is worth keeping this limit in mind. Sometimes you learn well-documented things the hard way.

For more information, refer to the Configuring Software Inventory Rules article on Microsoft Technet – http://technet.microsoft.com/en-us/library/cc181215.aspx


The effect of configuring filter on an ACS Collector

As a follow-up to my previous post, I want to share my findings on the effect of configuring a filter on the ACS Collector side in order to collect and store only the security events that are relevant for your reporting purposes.

To limit the events stored in the ACS database to the specific subset required for reporting purposes, the AdtAdmin.exe /SetQuery command-line utility was used.

Setting up the ACS Collector filter drastically reduced the total number of events collected on a daily basis, as can be seen in the diagram below.

This diagram shows the total number of events stored in the Audit Collection Services database on a daily basis. On day 12, the ACS Collector filter was set up to filter out all unnecessary security events from collection. A picture is worth a thousand words…
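The filter passed to AdtAdmin.exe /SetQuery is a WQL query. The post does not show the query that was actually used, so the following Python sketch only illustrates how such a query could be assembled from a list of event IDs to exclude. It assumes the commonly documented form of the ACS filter, a SELECT against the AdtsEvent class with a NOT (...) condition on EventId. The function name is hypothetical, and the event IDs must come from your own reporting requirements.

```python
def build_acs_filter(excluded_event_ids):
    """Build a WQL filter that drops the given event IDs (assumed query form)."""
    ids = sorted({int(event_id) for event_id in excluded_event_ids})
    if not ids:
        # No exclusions: collect everything.
        return "SELECT * FROM AdtsEvent"
    condition = " OR ".join(f"EventId={event_id}" for event_id in ids)
    return f"SELECT * FROM AdtsEvent WHERE NOT ({condition})"
```

The resulting string would then be supplied as the query argument of AdtAdmin.exe /SetQuery on the ACS Collector. It is worth verifying the query against the AdtAdmin documentation and testing it on a non-production collector first.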

Total number of events stored on a daily basis within the ACS database before/after configuring ACS Collector