Channel: Kevin Holman's System Center Blog

Creating a SCOM Service Monitor that allows overrides for Interval Frequency and Samples


 


 

The “built in” service monitor in SCOM is hard-coded for how often it checks the service state, and for how many consecutive checks must return “not running” before it alarms.  This is unfortunate, as customers often want to customize these values.  This article explains how.

 

All the built-in service monitoring uses Monitors that reference the Microsoft.Windows.CheckNTServiceStateMonitorType MonitorType, which is in the Microsoft.Windows.Library MP.

This MonitorType has a hard coded definition with <Frequency>30</Frequency> and <MatchCount>2</MatchCount>.  This means by default, monitors that use this will inspect the service state every 30 seconds, and alarm when it is not running after two consecutive checks.  However – the challenge is – Microsoft did not expose these values as override-able parameters.
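For reference, this is roughly what the hard-coded portion of the built-in MonitorType looks like (a paraphrased sketch of just the relevant members, not the complete definition from the Microsoft.Windows.Library MP):

```xml
<!-- Sketch only: note the literal values instead of $Config/...$ references,
     which is why they cannot be overridden -->
<DataSource ID="DS" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProvider">
  <ComputerName>$Config/ComputerName$</ComputerName>
  <ServiceName>$Config/ServiceName$</ServiceName>
  <Frequency>30</Frequency>
</DataSource>
<ConditionDetection ID="ServiceNotRunning" TypeID="System!System.ExpressionFilter">
  <!-- expression omitted -->
  <SuppressionSettings>
    <MatchCount>2</MatchCount>
  </SuppressionSettings>
</ConditionDetection>
```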

What if you want to check the service every 60 seconds, and alarm only after it has been consistently down for 15 samples (15 consecutive minutes)?  We can do that.  We have the tools.

 

Basically – we need to create our own MonitorType, which will expose these values.  Here is an example:

<UnitMonitorType ID="Contoso.Demo.Service.MonitorType" Accessibility="Internal">
  <MonitorTypeStates>
    <MonitorTypeState ID="Running" NoDetection="false" />
    <MonitorTypeState ID="NotRunning" NoDetection="false" />
  </MonitorTypeStates>
  <Configuration>
    <xsd:element name="ComputerName" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="ServiceName" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="IntervalSeconds" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="CheckStartupType" minOccurs="0" maxOccurs="1" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="Samples" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  </Configuration>
  <OverrideableParameters>
    <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
    <OverrideableParameter ID="CheckStartupType" Selector="$Config/CheckStartupType$" ParameterType="string" />
    <OverrideableParameter ID="Samples" Selector="$Config/Samples$" ParameterType="int" />
  </OverrideableParameters>
  <MonitorImplementation>
    <MemberModules>
      <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProvider">
        <ComputerName>$Config/ComputerName$</ComputerName>
        <ServiceName>$Config/ServiceName$</ServiceName>
        <Frequency>$Config/IntervalSeconds$</Frequency>
        <DisableCaching>true</DisableCaching>
        <CheckStartupType>$Config/CheckStartupType$</CheckStartupType>
      </DataSource>
      <ProbeAction ID="Probe" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProbe">
        <ComputerName>$Config/ComputerName$</ComputerName>
        <ServiceName>$Config/ServiceName$</ServiceName>
      </ProbeAction>
      <ConditionDetection ID="ServiceRunning" TypeID="System!System.ExpressionFilter">
        <Expression>
          <Or>
            <Expression>
              <And>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <Value Type="String">$Config/CheckStartupType$</Value>
                    </ValueExpression>
                    <Operator>NotEqual</Operator>
                    <ValueExpression>
                      <Value Type="String">false</Value>
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <XPathQuery Type="Integer">Property[@Name='StartMode']</XPathQuery>
                    </ValueExpression>
                    <Operator>NotEqual</Operator>
                    <ValueExpression>
                      <Value Type="Integer">2</Value>
                      <!-- 0=BootStart 1=SystemStart 2=Automatic 3=Manual 4=Disabled -->
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
              </And>
            </Expression>
            <Expression>
              <SimpleExpression>
                <ValueExpression>
                  <XPathQuery Type="Integer">Property[@Name='State']</XPathQuery>
                </ValueExpression>
                <Operator>Equal</Operator>
                <ValueExpression>
                  <Value Type="Integer">4</Value>
                  <!-- 0=Unknown 1=Stopped 2=StartPending 3=StopPending 4=Running 5=ContinuePending 6=PausePending 7=Paused 8=ServiceNotFound 9=ServerNotFound -->
                </ValueExpression>
              </SimpleExpression>
            </Expression>
          </Or>
        </Expression>
      </ConditionDetection>
      <ConditionDetection ID="ServiceNotRunning" TypeID="System!System.ExpressionFilter">
        <Expression>
          <And>
            <Expression>
              <Or>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <XPathQuery Type="Integer">Property[@Name='StartMode']</XPathQuery>
                    </ValueExpression>
                    <Operator>Equal</Operator>
                    <ValueExpression>
                      <Value Type="Integer">2</Value>
                      <!-- 0=BootStart 1=SystemStart 2=Automatic 3=Manual 4=Disabled -->
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
                <Expression>
                  <And>
                    <Expression>
                      <SimpleExpression>
                        <ValueExpression>
                          <Value Type="String">$Config/CheckStartupType$</Value>
                        </ValueExpression>
                        <Operator>Equal</Operator>
                        <ValueExpression>
                          <Value Type="String">false</Value>
                        </ValueExpression>
                      </SimpleExpression>
                    </Expression>
                    <Expression>
                      <SimpleExpression>
                        <ValueExpression>
                          <XPathQuery Type="Integer">Property[@Name='StartMode']</XPathQuery>
                        </ValueExpression>
                        <Operator>NotEqual</Operator>
                        <ValueExpression>
                          <Value Type="Integer">2</Value>
                          <!-- 0=BootStart 1=SystemStart 2=Automatic 3=Manual 4=Disabled -->
                        </ValueExpression>
                      </SimpleExpression>
                    </Expression>
                  </And>
                </Expression>
              </Or>
            </Expression>
            <Expression>
              <SimpleExpression>
                <ValueExpression>
                  <XPathQuery Type="Integer">Property[@Name='State']</XPathQuery>
                </ValueExpression>
                <Operator>NotEqual</Operator>
                <ValueExpression>
                  <Value Type="Integer">4</Value>
                  <!-- 0=Unknown 1=Stopped 2=StartPending 3=StopPending 4=Running 5=ContinuePending 6=PausePending 7=Paused 8=ServiceNotFound 9=ServerNotFound -->
                </ValueExpression>
              </SimpleExpression>
            </Expression>
          </And>
        </Expression>
        <SuppressionSettings>
          <MatchCount>$Config/Samples$</MatchCount>
        </SuppressionSettings>
      </ConditionDetection>
    </MemberModules>
    <RegularDetections>
      <RegularDetection MonitorTypeStateID="Running">
        <Node ID="ServiceRunning">
          <Node ID="DS" />
        </Node>
      </RegularDetection>
      <RegularDetection MonitorTypeStateID="NotRunning">
        <Node ID="ServiceNotRunning">
          <Node ID="DS" />
        </Node>
      </RegularDetection>
    </RegularDetections>
    <OnDemandDetections>
      <OnDemandDetection MonitorTypeStateID="Running">
        <Node ID="ServiceRunning">
          <Node ID="Probe" />
        </Node>
      </OnDemandDetection>
      <OnDemandDetection MonitorTypeStateID="NotRunning">
        <Node ID="ServiceNotRunning">
          <Node ID="Probe" />
        </Node>
      </OnDemandDetection>
    </OnDemandDetections>
  </MonitorImplementation>
</UnitMonitorType>

 

Essentially – we have taken the hard-coded values and changed them to accept a $Config/Value$ passed parameter.  This allows the Monitor to PASS these values to the MonitorType, where they are used in the DataSource and ConditionDetection.  Even if you don’t fully understand that, it’s ok, because I will be wrapping all this up in a consumable VSAE Fragment that is easy to implement.

The changes made to allow data to be passed in were:

          <Frequency>$Config/IntervalSeconds$</Frequency>
          <MatchCount>$Config/Samples$</MatchCount>

In the <Configuration> section we added:

          <xsd:element name="IntervalSeconds" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
          <xsd:element name="Samples" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />

In the <OverrideableParameters> section – we added:

          <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
          <OverrideableParameter ID="Samples" Selector="$Config/Samples$" ParameterType="int" />

In the DataSource – one new value should be added when using Microsoft.Windows.Win32ServiceInformationProvider on a recurring schedule:

           <DisableCaching>true</DisableCaching>

This is very important, as this will cause the datasource to output data every time, even if nothing has changed.  We need this for the number of samples (MatchCount) to work as desired.

Now that we have this new MonitorType – we can reference it in our own Monitors.  Here is an example of a Monitor using this:

<UnitMonitor ID="Contoso.Demo.Spooler.Service.Monitor" Accessibility="Public" Enabled="true" Target="Windows!Microsoft.Windows.Server.OperatingSystem" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="Contoso.Demo.Service.MonitorType" ConfirmDelivery="false">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="Contoso.Demo.Spooler.Service.Monitor.Alert.Message">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Property[@Name='Name']$</AlertParameter1>
      <AlertParameter2>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/PrincipalName$</AlertParameter2>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="Running" MonitorTypeStateID="Running" HealthState="Success" />
    <OperationalState ID="NotRunning" MonitorTypeStateID="NotRunning" HealthState="Error" />
  </OperationalStates>
  <Configuration>
    <ComputerName />
    <ServiceName>spooler</ServiceName>
    <IntervalSeconds>30</IntervalSeconds>
    <CheckStartupType>true</CheckStartupType>
    <Samples>2</Samples>
  </Configuration>
</UnitMonitor>

 

Once you implement this Monitor – you will see the new options exposed in overrides:

[screenshot: override properties showing the new Interval Frequency and Samples parameters]

 

 

So the key takeaways are:

  • The built-in service monitoring does not allow for a configurable interval and sample count.
  • We can customize this using a custom MonitorType that allows these values to be passed in.
  • When using the Microsoft.Windows.Win32ServiceInformationProvider on a schedule, we MUST set <DisableCaching>true</DisableCaching>.

 

 

This example has been added to my Fragment Library for you to download at:

https://gallery.technet.microsoft.com/SCOM-Management-Pack-VSAE-2c506737

(see:  Monitor.Service.WithAlert.FreqAndSamples.mpx)

 

To learn more about using MP Fragments, and how EASY they are to use with Visual Studio:

https://blogs.technet.microsoft.com/kevinholman/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/

https://www.youtube.com/watch?v=9CpUrT983Gc

 

To make using fragments REALLY EASY with Silect MP Author Pro, watch the video:

https://blogs.technet.microsoft.com/kevinholman/2017/03/22/management-pack-authoring-the-really-fast-and-easy-way-using-silect-mp-author-and-fragments/

https://www.youtube.com/watch?v=E5nnuvPikFw

 

 



QuickTip: Disabling workflows to optimize for large environments


 


 

One of the coolest things about SCOM is how much monitoring you get out of the box.

That said, one of the biggest performance impacts to SCOM is all that out-of-the-box monitoring, plus all the Management Packs you import.  This has a cumulative effect and, over time, can impact the speed of the console because of all the activity happening.

I have long stated that the biggest performance relief you can give to SCOM is to reduce the number of workflows, reduce the classes and relationships, and keep things simple.

SCOM 2007 shipped back in March 2007.  In the 10 years since, we have continuously added management packs to a default installation of SCOM, and continuously added workflows to the existing MPs.

For the most part – this is good.  These packs add more and more monitoring and capabilities “out of the box”.  However, in many cases, they can also add load to the environment.  They discover class instances, relationships, add state calculation, etc.  In small SCOM environments (under 1000 agents) this will have very little impact.  But at large enterprise scale, every little thing counts.

 

I have already written about some of the optional things you can consider (IF you don’t use the features), such as removing the APM MP’s, and removing the Advisor MP’s.

 

Here is one I came across today with a customer:

 

I noticed on the server that hosts the “All Management Servers Resource Pool” we have some out of the box PowerShell script based rules that were timing out after 300 seconds, and running every 15 minutes:

Collect Agent Health States (ManagementGroupCollectionAgentHealthStatesRule)

Collect Management Group Active Alerts Count (ManagementGroupCollectionAlertsCountRule)



 

These scripts do things like “Get-SCOMAgent” and “Get-SCOMAlert”.  They were timing out, running constantly for 5 minutes, then getting killed by the timeout limit, then starting over again.  This kind of thing will have significant impact on SQL blocking, SDK utilization, and overall performance.

 

Now, in small environments this isn’t a big deal, and these will return results quickly with little impact.  However, in a VERY large environment, Get-SCOMAgent can take 10 minutes or more just to return the data!  If you have hundreds of thousands of open alerts, it can take just as long to run the Alert SDK queries as well.

The only thing these two rules do is populate a SCOM Health dashboard – and they are of little value:

 

[screenshot: SCOM Health dashboard]

 

I recommend that larger environments disable these two rules, as they are very resource intensive for very minimal value.  If you would like to keep them, override the interval to 86400 seconds, set the timeout to 600 seconds, and set a sync time so each rule runs off peak at a slightly different time – for example 23:00 (11pm) for one and 23:20 (11:20pm) for the other – so they aren't both running at the same time.  If a rule cannot complete in 10 minutes, disable it.
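If you keep your overrides in a custom override MP rather than setting them in the console, they would look roughly like this.  This is a sketch only: the “Dashboard” and “SC” reference aliases, the override IDs, and the Context class are assumptions that must match your own MP's references, and the overridable parameter names depend on the rule's module configuration.

```xml
<Overrides>
  <!-- Disable the rule entirely (the recommendation for large environments).
       Aliases and Context class are hypothetical placeholders. -->
  <RulePropertyOverride ID="Override.Disable.AgentHealthStatesRule"
      Context="SC!Microsoft.SystemCenter.AllManagementServersPool"
      Enforced="false"
      Rule="Dashboard!ManagementGroupCollectionAgentHealthStatesRule"
      Property="Enabled">
    <Value>false</Value>
  </RulePropertyOverride>
  <!-- Or keep the rule, but run it once a day -->
  <RuleConfigurationOverride ID="Override.Interval.AgentHealthStatesRule"
      Context="SC!Microsoft.SystemCenter.AllManagementServersPool"
      Enforced="false"
      Rule="Dashboard!ManagementGroupCollectionAgentHealthStatesRule"
      Parameter="IntervalSeconds">
    <Value>86400</Value>
  </RuleConfigurationOverride>
</Overrides>
```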

 


 

Additionally, in this same MP (Microsoft.SystemCenter.OperationsManager.SummaryDashboard) there are two discoveries.

Collect Agent Versions (ManagementGroupDiscoveryAgentVersions)

Collect agent configurations (ManagementGroupDiscoveryAgentConfiguration)

These discoveries run once per hour, and also run things like Get-SCOMAgent – which is bad for large environments, especially with that frequency.

The only thing they do is populate this dashboard:

 

[screenshot: agent versions and configuration dashboard]

 

I rarely ever see this dashboard being used, and recommend large environments disable these discoveries as well.


 

Speed up that SCOM deployment!


Adding direct agent OMS Workspace and Proxy via PowerShell


 


We have some REALLY good documentation on this here:  https://docs.microsoft.com/en-us/azure/log-analytics/log-analytics-windows-agents

 

PowerShell and the Agent Scripting Objects make it really easy to control the OMS direct agent configuration on thousands of agents, using SCOM.

 

Here are some PowerShell examples:

# Load agent scripting object
$AgentCfg = New-Object -ComObject AgentConfigManager.MgmtSvcCfg

# Get all AgentCfg methods and properties
$AgentCfg | Get-Member

# Check to see if this agent supports OMS
$AgentSupportsOMS = $AgentCfg | Get-Member -Name 'GetCloudWorkspaces'

# Get all configured OMS Workspaces
$AgentCfg.GetCloudWorkspaces()

# Add OMS Workspace
$AgentCfg.AddCloudWorkspace($WorkspaceID,$WorkspaceKey)

# Remove OMS Workspace
$AgentCfg.RemoveCloudWorkspace($WorkspaceID)

# Get the OMS proxy if configured
$AgentCfg.proxyUrl

# Set a proxy for the OMS Agent
$AgentCfg.SetProxyUrl($ProxyURL)


I added these tasks to ADD and REMOVE OMS workspaces from the MMA, in the latest version of the SCOM Management helper pack:

https://blogs.technet.microsoft.com/kevinholman/2017/05/09/agent-management-pack-making-a-scom-admins-life-a-little-easier/


Monitoring for Time Drift in your enterprise


 


 

Time sync is critical in today’s networks.  Experiencing time drift across devices can cause authentication breakdowns, reporting miscalculations, and wreak havoc on interconnected systems.  This article shows a demo management pack to monitor for time sync across your Windows devices.

The basic idea is to monitor all systems and compare their local time against a target reference time server, using W32Time.  Here is the core command from the PowerShell script:

$cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples

The script will take two parameters, the reference server and the threshold for how much time drift is allowed.

Here is the PowerShell script:

#=================================================================================
# Time Skew Monitoring Script
# Kevin Holman
# Version 1.0
#=================================================================================
param([string]$RefServer,[int]$Threshold)
#=================================================================================
# Constants section - modify stuff here:
# Assign script name variable for use in event logging
$ScriptName = "Demo.TimeDrift.PA.ps1"
# Set samples to the number of w32time samples you wish to include
[int]$Samples = '1'
# For testing - assign values instead of parameters to the script
#[string]$RefServer = 'dc1.opsmgr.net'
#[int]$Threshold = '10'
#=================================================================================
# Gather script start time
$StartTime = Get-Date
# Gather who the script is running as
$WhoAmI = whoami
# Load MomScript API and PropertyBag function
$momapi = New-Object -ComObject 'MOM.ScriptAPI'
$bag = $momapi.CreatePropertyBag()
# Log script event that we are starting task
$momapi.LogScriptEvent($ScriptName,9250,0,"Starting script")
# Start MAIN body of script:
# Getting the required data
$cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples
IF ($cmd -match 'error')
{
  # Log error and quit
  $momapi.LogScriptEvent($ScriptName,9250,2,"Getting TimeDrift from Reference Server returned an error. Reference server is ($RefServer). Output of command is ($cmd)")
  EXIT
}
ELSE
{
  # Assume we got good results from cmd
  $Skew = $cmd[-1..($Samples * -1)] | ConvertFrom-Csv -Header "Time","Skew" | Select -ExpandProperty Skew
  $Result = $Skew | % { $_ -replace "s","" } | Measure-Object -Average | Select -ExpandProperty Average
}
# The problem is that you can have time skew in two directions: positive or negative.
# You can either check both directions, or simply convert the result to a positive number.
IF ($Result -lt 0)
{
  $Result = $Result * -1
}
$TimeDriftSeconds = [math]::Round($Result,2)
# Determine if the average time skew is higher than your threshold and report this back to SCOM
IF ($TimeDriftSeconds -gt $Threshold)
{
  $bag.AddValue("TimeSkew","True")
  $momapi.LogScriptEvent($ScriptName,9250,2,"Time Drift was detected. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}
ELSE
{
  $bag.AddValue("TimeSkew","False")
  # Log good event for testing
  #$momapi.LogScriptEvent($ScriptName,9250,0,"Time Drift was OK. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}
# Add stuff into the propertybag
$bag.AddValue("RefServer",$RefServer)
$bag.AddValue("Threshold",$Threshold)
$bag.AddValue("TimeDriftSeconds",$TimeDriftSeconds)
# Log an event for script ending and total execution time
$EndTime = Get-Date
$ScriptTime = ($EndTime - $StartTime).TotalSeconds
$ScriptTime = [math]::Round($ScriptTime,2)
$momapi.LogScriptEvent($ScriptName,9250,0,"`n Script has completed. `n Reference server is ($RefServer). `n Threshold is ($Threshold) seconds. `n Value is ($TimeDriftSeconds) seconds. `n Runtime was ($ScriptTime) seconds.")
# Output the propertybag
$bag

 

Next, we will put the script into a Probe Action (PA), which will be called by a DataSource (DS) with a scheduler.  The reason we break this out is that we want to “share” this datasource between a monitor and a rule.  The monitor will watch for the time skew, while the rule will collect the skew as a perf counter, so we can monitor for trends in the environment.
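To illustrate the composition, a shared DataSource wrapping the scheduler and the script Probe Action might look like this.  This is a sketch only: the module IDs (Demo.TimeDrift.DataSource, Demo.TimeDrift.ProbeAction) are placeholders, and the actual MP in the download may differ.  (SimpleReccuringSchedule is spelled that way in the SCOM schema itself.)

```xml
<DataSourceModuleType ID="Demo.TimeDrift.DataSource" Accessibility="Internal">
  <Configuration>
    <xsd:element name="IntervalSeconds" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="RefServer" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="Threshold" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  </Configuration>
  <ModuleImplementation Isolation="Any">
    <Composite>
      <MemberModules>
        <DataSource ID="Scheduler" TypeID="System!System.Scheduler">
          <Scheduler>
            <SimpleReccuringSchedule>
              <Interval Unit="Seconds">$Config/IntervalSeconds$</Interval>
            </SimpleReccuringSchedule>
            <ExcludeDates />
          </Scheduler>
        </DataSource>
        <!-- Demo.TimeDrift.ProbeAction is a hypothetical ID for the module wrapping the PowerShell script above -->
        <ProbeAction ID="PA" TypeID="Demo.TimeDrift.ProbeAction">
          <RefServer>$Config/RefServer$</RefServer>
          <Threshold>$Config/Threshold$</Threshold>
        </ProbeAction>
      </MemberModules>
      <Composition>
        <Node ID="PA">
          <Node ID="Scheduler" />
        </Node>
      </Composition>
    </Composite>
  </ModuleImplementation>
  <OutputType>System!System.PropertyBagData</OutputType>
</DataSourceModuleType>
```

Because the monitor and the rule both reference this one DataSource with identical configuration, SCOM “cooks down” to a single running instance of the scheduler and script.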

 

So the key components of the MP are the DS, the PA (containing the script), the MonitorType and the Monitor, the Perf collection rule, and some views to show this off:

 

[screenshot: MP components]

 

When a threshold is breached, the monitor raises an alert:

[screenshot: time drift alert]

 

The performance view will show you the trending across your systems:

[screenshot: performance view of time drift]

 

On the monitor (and rule) you can modify the reference server:

[screenshot: monitor overrides showing the reference server]

 

One VERY IMPORTANT concept: if you change anything, you must make identical overrides on BOTH the monitor and the rule, otherwise you will break cookdown, and the script will run twice for each interval.  So be sure to set IntervalSeconds, RefServer, and Threshold the same on both the monitor and the rule.  If you want the monitor to run much more frequently than the default of once an hour, that’s fine – you might not want the perf data collected more than once per hour anyway.  That will break cookdown, but it only breaks once per hour, which is probably less of an impact than overcollecting performance data.

From here, you could add in a recovery to force a resync of w32time if you wanted, or add in additional alert rules for w32time events.

 

The example MP is available here:

https://gallery.technet.microsoft.com/SCOM-Management-Pack-to-bca30237

How to create a SCOM group from an Active Directory Computer Group


 


 

There have been a bunch of examples of this published over the years.  Some of them worked well, but I was never happy with many of them, as they were often VBScript based, hard to troubleshoot, and required lots of editing each time you wanted to reuse them.  Many were error prone, and didn’t work if the AD group contained computers that didn’t exist in SCOM, because SCOM will reject the entire discovery data payload in that case.

If you too were looking for a reliable and easy way to do this, well, look no further!  I have created an MP Fragment in my fragment library for this: 

https://gallery.technet.microsoft.com/SCOM-Management-Pack-VSAE-2c506737

 

This MP Fragment will make creating SCOM groups of Windows Computers from Active Directory groups super easy!  This is a nice way to “delegate” the ability for end users to control what servers will appear in their scopes, as they often have the ability to easily add and remove computers from their AD groups, but they do not have access to SCOM Group memberships.

I am going to demonstrate using Silect MP Author Pro to reuse this Fragment, and you can also easily use Visual Studio with VSAE.  If you’d like to read more on either of those, see:

https://blogs.technet.microsoft.com/kevinholman/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/

https://blogs.technet.microsoft.com/kevinholman/2017/03/22/management-pack-authoring-the-really-fast-and-easy-way-using-silect-mp-author-and-fragments/

 

In Silect MP Author Pro – create a new, empty management pack, and select “Import Fragment”

 


 

Browse the fragment and choose:  Class.Group.ADGroupWindowsComputers.mpx


 

We need to simply input the values here, such as:

[screenshot: fragment values to input]

 

Click “Import”.

Silect MP Author Pro will automagically handle the references for you, so just say “Yes” on the popup:

[screenshot: references prompt]

 

That’s it!

 

Save it, and deploy it!


 

If you look in SCOM after a few minutes – you should see your group:

 

[screenshot: the new group in the SCOM console]

 

The rule to populate the group runs once a day by default, but it will also run immediately upon import.  Look for event ID 7500 in the OpsMgr event log on the Management Server that hosts your All Management Servers Resource Pool object.


 

Once you see these events and no errors in them – you can view group membership in SCOM:

 

[screenshot: group membership]

 

So easy.  And you don’t have to know anything about XML, or even Management Packs to do it!

Using Visual Studio with VSAE works exactly the same way – you simply have to do a manual Find/Replace for each item.  See the VSAE method in the link above.

 

Want to dig deeper into how this is put together?  Read on:

The MP we generate is very basic.  There is a Class (the group definition), a Relationship (the group contains Windows Computers), and a Discovery (which queries AD and discovers the relationship to the existing Windows Computers in SCOM).


 

The script is below.  We basically connect to AD, find the group by name, query its members, look each member up to see if it exists in SCOM, and if it does, add it to the group.

We log events along the way to help in troubleshooting if anything doesn’t work, and record the completion and total script runtime, like all my SCOM scripts.

#=================================================================================
# Group Population script based on AD group membership
#
# Kevin Holman
# v1.2
#=================================================================================
param($SourceID, $ManagedEntityID, $ADGroup, $LDAPSearchPath)

# Manual Testing section - put stuff here for manually testing script - typically parameters:
#=================================================================================
# $SourceId = '{00000000-0000-0000-0000-000000000000}'
# $ManagedEntityId = '{00000000-0000-0000-0000-000000000000}'
# $ADGroup = "SCOM Computers Group"
# $LDAPSearchPath = "LDAP://DC=opsmgr,DC=net"
#=================================================================================
# Constants section - modify stuff here:
#=================================================================================
# Assign script name variable for use in event logging
$ScriptName = "##CompanyID##.##AppName##.##GroupNameNoSpaces##.Group.Discovery.ps1"
$EventID = "7500"
#=================================================================================
# Starting Script section - All scripts get this
#=================================================================================
# Gather the start time of the script
$StartTime = Get-Date
# Load MOMScript API
$momapi = New-Object -comObject MOM.ScriptAPI
# Load SCOM Discovery module
$DiscoveryData = $momapi.CreateDiscoveryData(0, $SourceId, $ManagedEntityId)
# Set variables to be used in logging events
$whoami = whoami
# Log script event that we are starting task
$momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script is starting. `n Running as ($whoami).")
#=================================================================================
# Connect to local SCOM Management Group Section
#=================================================================================
# Clear any previous errors
$Error.Clear()
# Import the OperationsManager module and connect to the management group
$SCOMPowerShellKey = "HKLM:\SOFTWARE\Microsoft\System Center Operations Manager\12\Setup\Powershell\V2"
$SCOMModulePath = Join-Path (Get-ItemProperty $SCOMPowerShellKey).InstallDirectory "OperationsManager"
Import-Module $SCOMModulePath
New-DefaultManagementGroupConnection "localhost"
IF ($Error)
{
  $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: Failure loading OperationsManager module or unable to connect to the management server. `n Terminating script. `n Error is: ($Error).")
  EXIT
}
#=================================================================================
# Begin MAIN script section
#=================================================================================
# Log event for captured parameters
$momapi.LogScriptEvent($ScriptName,$EventID,0,"`n ADGroup: ($ADGroup) `n LDAP search path: ($LDAPSearchPath).")
# Connect to AD using LDAP search to find the DN for the Group
$Searcher = New-Object DirectoryServices.DirectorySearcher
$Searcher.Filter = '(&(objectCategory=group)(cn=' + $ADGroup + '))'
$Searcher.SearchRoot = $LDAPSearchPath
$Group = $Searcher.FindAll()
$GroupDN = @()
# Now that we have the group object, trim to get the DN in order to search for members
$GroupDN = $Group.path.TrimStart("LDAP://")
# If we found the group in AD by the DisplayName log a success event, otherwise log an error
IF ($GroupDN)
{
  $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Successfully found group in AD: ($GroupDN).")
}
ELSE
{
  $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: Did not find group in AD: ($ADGroup) using ($LDAPSearchPath). `n Terminating script.")
  EXIT
}
# Search for members of the group
$Searcher.Filter = '(&(objectCategory=computer)(memberOf=' + $GroupDN + '))'
$ADComputerObjects = $Searcher.FindAll()
$ADComputerObjectsCount = $ADComputerObjects.Count
IF ($ADComputerObjectsCount -gt 0)
{
  $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Successfully found ($ADComputerObjectsCount) members in the group: ($GroupDN).")
}
ELSE
{
  $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: Did not find any members in the AD group: ($GroupDN). `n Terminating script.")
  EXIT
}
# Set namelist array to empty
$namelist = @()
# Loop through each computer object and get an array of FQDN hostnames
FOREACH ($ADComputerObject in $ADComputerObjects)
{
  [string]$DNSComputerName = $ADComputerObject.Properties.dnshostname
  $namelist += $DNSComputerName
}
# Check SCOM and get back any matching computers
# This is necessary to filter the list for relationship discovery, because if we return any computers missing from SCOM the Management Server will reject the discovery
# We pass the namelist array of FQDNs to Get-SCOMClassInstance to only pull back matching systems from the SDK, as opposed to getting all Windows Computers and then parsing, which is assumed slower in large environments
$ComputersInSCOM = Get-SCOMClassInstance -Name $namelist
$ComputersInSCOMCount = $ComputersInSCOM.Count
# Logging event
$momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Found ($ComputersInSCOMCount) matching computers in SCOM from the ($ADComputerObjectsCount) total computers in the AD group ($GroupDN).")
# Discovery Section
# Set the group instance we will discover members of
$GroupInstance = $DiscoveryData.CreateClassInstance("$MPElement[Name='##CompanyID##.##AppName##.##GroupNameNoSpaces##.Group']$")
# Loop through each SCOM computer and add a group membership containment relationship to the discovery data
FOREACH ($ComputerInSCOM in $ComputersInSCOM.DisplayName)
{
  $ServerInstance = $DiscoveryData.CreateClassInstance("$MPElement[Name='Windows!Microsoft.Windows.Computer']$")
  $ServerInstance.AddProperty("$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", $ComputerInSCOM)
  $RelationshipInstance = $DiscoveryData.CreateRelationshipInstance("$MPElement[Name='##CompanyID##.##AppName##.##GroupNameNoSpaces##.Group.Contains.Windows.Computers']$")
  $RelationshipInstance.Source = $GroupInstance
  $RelationshipInstance.Target = $ServerInstance
  $DiscoveryData.AddInstance($RelationshipInstance)
}
# Return Discovery Items Normally
$DiscoveryData
# Return Discovery Bag to the command line for testing (does not work from ISE)
# $momapi.Return($DiscoveryData)
#=================================================================================
# End MAIN script section
#=================================================================================
# Log an event for script ending and total execution time
$EndTime = Get-Date
$ScriptTime = ($EndTime - $StartTime).TotalSeconds
$momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script Ending. `n Script Runtime: ($ScriptTime) seconds.")
#=================================================================================
# End Script

 

Key recommendations:

1.  Don’t run the <IntervalSeconds> frequency too often.  If updating the group once a day is ok, leave it at the default.  If you need it more frequent, that’s fine – just remember it’s a script, and all scripts running on the management servers add to the overall load; we are also submitting discovery data about relationships and searching through AD each time.

2.  The default timeout is set to 5 minutes.  If the script cannot complete in less than that, something is WRONG – most likely how long it takes to find the group in AD.  If that is true for you, you need to optimize the AD query section and the LDAP search path.

3.  If you have a lot of AD based SCOM groups, consider adding a staggered sync time to each discovery, so they don’t all run at the same time, or on the same interval.
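
For recommendation 3, the usual way to stagger these discoveries is to give each one a different SyncTime on the same daily interval.  The snippet below is a sketch only; it assumes your discovery datasource is built on the standard Microsoft.Windows.TimedPowerShell.DiscoveryProvider, which exposes both IntervalSeconds and SyncTime, and the script name is a placeholder:

```xml
<!-- Sketch: stagger AD group discoveries by offsetting SyncTime on a daily interval -->
<DataSource ID="DS" TypeID="Windows!Microsoft.Windows.TimedPowerShell.DiscoveryProvider">
  <IntervalSeconds>86400</IntervalSeconds>
  <!-- Give the next group's discovery 02:30, the next 03:00, and so on -->
  <SyncTime>02:00</SyncTime>
  <ScriptName>CompanyID.AppName.ADGroup.Discovery.ps1</ScriptName>
  <!-- ScriptBody, Parameters, and TimeoutSeconds omitted for brevity -->
</DataSource>
```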

Event 18054 errors in the SQL application log – in SCOM 2016 deployments


 

image

 

When SCOM is installed, it doesn’t just create the databases on the SQL instance; it also adds messages for the different error scenarios to the sys.messages catalog in the Master database for the instance.

This is why after moving a database, or restoring a DB backup to a rebuilt SQL server, or when using SQL AlwaysOn and failing over to another instance - we might end up missing this data. 

These are important because they give very good detailed data about the error and how to resolve it.  If you see these – you need to update your SQL instance with some scripts.  Or – if you KNOW you are using SQL AlwaysOn, or migrating a DB – be PROACTIVE and handle this manually, up front.

 

Examples of these events on the SQL server:

Log Name:      Application
Source:        MSSQL$I01
Date:          10/23/2010 5:40:14 PM
Event ID:      18054
Task Category: Server
Level:         Error
Keywords:      Classic
User:          OPSMGR\msaa
Computer:      SQLDB1.opsmgr.net
Description:
Error 777980007, severity 16, state 1 was raised, but no message with that error number was found in sys.messages. If error is larger than 50000, make sure the user-defined message is added using sp_addmessage.

You might also notice some truncated events in the OpsMgr event log, on your RMS or management servers:

Event Type:    Warning
Event Source:    DataAccessLayer
Event Category:    None
Event ID:    33333
Date:        10/23/2010
Time:        5:40:13 PM
User:        N/A
Computer:    OMMS3
Description:
Data Access Layer rejected retry on SqlError:
Request: p_DiscoverySourceUpsert -- (DiscoverySourceId=f0c57af0-927a-335f-1f74-3a3f1f5ca7cd), (DiscoverySourceType=0), (DiscoverySourceObjectId=74fb2fa8-94e5-264d-5f7e-57839f40de0f), (IsSnapshot=True), (TimeGenerated=10/23/2010 10:37:36 PM), (BoundManagedEntityId=3304d59d-5af5-ba80-5ba7-d13a07ed21d4), (IsDiscoveryPackageStale=), (RETURN_VALUE=1)
Class: 16
Number: 18054
Message: Error 777980007, severity 16, state 1 was raised, but no message with that error number was found in sys.messages. If error is larger than 50000, make sure the user-defined message is added using sp_addmessage.

Event Type:    Error
Event Source:    Health Service Modules
Event Category:    None
Event ID:    10801
Date:        10/23/2010
Time:        5:40:13 PM
User:        N/A
Computer:    OMMS3
Description:
Discovery data couldn't be inserted to the database. This could have happened because  of one of the following reasons:

     - Discovery data is stale. The discovery data is generated by an MP recently deleted.
     - Database connectivity problems or database running out of space.
     - Discovery data received is not valid.

The following details should help to further diagnose:

DiscoveryId: 74fb2fa8-94e5-264d-5f7e-57839f40de0f
HealthServiceId: bf43c6a9-8f4b-5d6d-5689-4e29d56fed88
Error 777980007, severity 16, state 1 was raised, but no message with that error number was found in sys.messages. If error is larger than 50000, make sure the user-defined message is added using sp_addmessage..

I have created some SQL scripts which are taken from the initial installation files, and you can download them below.  You simply run them in SQL Management studio to get this data back.
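
If you just want to verify that the messages are missing before running the fix, you can query sys.messages directly.  The check below is illustrative, using the message id from the event above; the actual message text that the fix scripts re-add comes from the SCOM installation files and is elided here:

```sql
-- Does the message id from the 18054 event exist in this instance?
SELECT message_id, severity, text
FROM sys.messages
WHERE message_id = 777980007;

-- The fix scripts re-add the missing messages with sp_addmessage, in the form:
-- EXEC sp_addmessage @msgnum = 777980007, @severity = 16,
--      @msgtext = N'<text from the install scripts>', @replace = 'replace';
```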

 

These are for SCOM 2016 ONLY!!!!

Download link:   https://gallery.technet.microsoft.com/SQL-to-fix-event-18054-4d6d9ec1

 

SCOM 2012R2 files are here:  https://blogs.technet.microsoft.com/kevinholman/2016/02/10/event-18054-errors-in-the-sql-application-log-in-scom-2012-r2-deployments/

Demo SCOM Script Template


 

image

 

I thought I’d take a moment to publish my SCOM script template.

Whenever I am writing a SCOM script for monitoring, discovery, or automation, there are some “standards” that I want in all my scripts.

 

1.  I personally feel that all scripts running in SCOM should at the MINIMUM log a script starting event, and a script completed event with runtime in seconds.  This helps anyone evaluating the server, or agent, see just how many scripts are running and on what kind of frequency.

2.  I like to log “who” the script is executed by (what account, whether RunAs or the default agent action account).

3.  I like to have an examples section for manually assigning script variables, which is very handy when testing/troubleshooting.

4.  I assign a ScriptName and EventID variables in the script, for consistency when logging events.

5.  I load examples for discovery scripts, propertybags for monitoring scripts, and just remove what isn't needed.  I find this easier and more consistent than going and grabbing an example from some other script I wrote previously.

6.  I have a section on connecting to the SCOM SDK, for scripts that will run automation on the SCOM management server.  I found this method to be the most reliable, as there are scenarios where the cmdlets just stop working under the MonitoringHost.exe process.

 

I don’t have a lot of “fluff” in here…. I never like it when I have to page down 3 or 4 pages to get to what a script is actually doing…. this is mostly just the meat and potatoes.

 

#================================================================================= # Describe Script Here # # Author: Kevin Holman # v1.2 #================================================================================= param($SourceId, $ManagedEntityId, $ComputerName, $Param1, $Param2) # Manual Testing section - put stuff here for manually testing script - typically parameters: #================================================================================= # $SourceId = '{00000000-0000-0000-0000-000000000000}' # $ManagedEntityId = '{00000000-0000-0000-0000-000000000000}' # $ComputerName = "computername.domain.com" # $Param1 = "foo" # $Param2 = "bar" #================================================================================= # Constants section - modify stuff here: #================================================================================= # Assign script name variable for use in event logging. # ScriptName should be the same as the ID of the module that the script is contained in $ScriptName = "CompanyID.AppName.Workflow.RuleMonitorDiscoveryDSWA.ps1" $EventID = "1234" #================================================================================= # Starting Script section - All scripts get this #================================================================================= # Gather the start time of the script $StartTime = Get-Date #Set variable to be used in logging events $whoami = whoami # Load MOMScript API $momapi = New-Object -comObject MOM.ScriptAPI #Log script event that we are starting task $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script is starting. 
`n Running as ($whoami).") #================================================================================= # Discovery Script section - Discovery scripts get this #================================================================================= # Load SCOM Discovery module $DiscoveryData = $momapi.CreateDiscoveryData(0, $SourceId, $ManagedEntityId) #================================================================================= # PropertyBag Script section - Monitoring scripts get this #================================================================================= # Load SCOM PropertyBag function $bag = $momapi.CreatePropertyBag() #================================================================================= # Connect to local SCOM Management Group Section - If required #================================================================================= # I have found this to be the most reliable method to load SCOM modules for scripts running on Management Servers # Clear any previous errors $Error.Clear() # Import the OperationsManager module and connect to the management group $SCOMPowerShellKey = "HKLM:\SOFTWARE\Microsoft\System Center Operations Manager\12\Setup\Powershell\V2" $SCOMModulePath = Join-Path (Get-ItemProperty $SCOMPowerShellKey).InstallDirectory "OperationsManager" Import-module $SCOMModulePath New-DefaultManagementGroupConnection "localhost" IF ($Error) { $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: Unable to load OperationsManager module or unable to connect to Management Server. `n Terminating script. 
`n Error is: ($Error).") EXIT } #================================================================================= # Begin MAIN script section #================================================================================= #Put your stuff in here #================================================================================= # End MAIN script section # Discovery Script section - Discovery scripts get this #================================================================================= # Example discovery of a class with properties $instance = $DiscoveryData.CreateClassInstance("$MPElement[Name='Your.Custom.Class']$") $instance.AddProperty("$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", $ComputerName) $instance.AddProperty("$MPElement[Name='System!System.Entity']/DisplayName$", $ComputerName) $instance.AddProperty("$MPElement[Name='Your.Custom.Class']/Property1$", $Param1) $instance.AddProperty("$MPElement[Name='Your.Custom.Class']/Property2$", $Param2) $DiscoveryData.AddInstance($instance) # Return Discovery Items Normally $DiscoveryData # Return Discovery Bag to the command line for testing (does not work from ISE) # $momapi.Return($DiscoveryData) #================================================================================= # PropertyBag Script section - Monitoring scripts get this #================================================================================= # Output a fixed Result = BAD for a monitor example $bag.AddValue("Result","BAD") # Output other data from script into bag $bag.AddValue("Param1",$Param1) $bag.AddValue("Param2",$Param2) # Return all bags $bag #================================================================================= # End of script section #================================================================================= #Log an event for script ending and total execution time. 
$EndTime = Get-Date $ScriptTime = ($EndTime - $StartTime).TotalSeconds $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script Completed. `n Script Runtime: ($ScriptTime) seconds.") #================================================================================= # End of script

 

Do you have stuff you like to place in every script?  If so – let me know in the comments!

SCOM 2016 now supports SQL 2016 SP1


Please take this SCOM survey – make your voice heard!


 

The SCOM Product Group is looking for YOUR feedback.

Please respond to the survey:

https://aka.ms/scomsurvey

 

System Center is changing.  We are aligning to the Windows Server Semi-Annual Channel update process, which will provide for frequent and continuous releases.  (see announcement​ here.)

Because of this - the SCOM product team would love to hear your inputs on this change.  And we would also like to hear from you on other improvements that we should focus on.

 

Your survey response will help us plan and prioritize features for the new release model.

 

 

Additionally, always remember that your voice for your requests, or to vote up other requests you feel are important, is USERVOICE.  If you look at the top uservoice requests, these are the features that the product group has been adding to SCOM 2016, and working on for the next version.  If you don’t join the conversation, we won’t know what’s important to YOU!

https://aka.ms/scom

How to create a SCOM group from a SQL CMDB Query


 

imageimage

 

I wrote an example of this a long time ago.  I was never happy with it, as it was VBScript based, hard to troubleshoot, and required lots of editing each time you wanted to reuse it.  It was also error prone, and didn’t work if the SQL query results contained computers that didn’t exist in SCOM, as SCOM will reject the entire discovery data payload in that case.

 

If you too were looking for a reliable and easy way to do this, well, look no further!  I have created an MP Fragment in my fragment library for this: 

https://gallery.technet.microsoft.com/SCOM-Management-Pack-VSAE-2c506737

 

This MP Fragment will make creating SCOM groups of Windows Computers from a SQL query super easy!  This is a nice way to “delegate” the ability for end users to control what servers will appear in their scopes, as they often have the ability to easily add and remove computers from a database or CMDB, but they do not have access to SCOM Group memberships.

 

I am going to demonstrate using Silect MP Author Pro to reuse this Fragment, and you can also easily use Visual Studio with VSAE.  If you’d like to read more on either of those, see:

https://blogs.technet.microsoft.com/kevinholman/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/

https://blogs.technet.microsoft.com/kevinholman/2017/03/22/management-pack-authoring-the-really-fast-and-easy-way-using-silect-mp-author-and-fragments/

 

In Silect MP Author Pro – create a new, empty management pack, and select “Import Fragment”

image

 

Browse the fragment and choose:  Class.Group.SQLQueryBasedGroupWindowsComputers.mpx

 

image

 

We need to simply input the values here, such as:

 

image

 

Click “Import”

Silect MP Author Pro will automagically handle the references for you, so just say “Yes” on the popup:

 

image

 

 

That’s IT!   Surprised smile

 

Save it, and deploy it!

 

image

 

If you look in SCOM after a few minutes – you should see your group:

 

image

 

The rule to populate it runs once a day by default, but it will run immediately upon import.  Look for event ID 7500 in the OpsMgr event log on the Management Server that hosts your All Management Servers Resource Pool object.

 

image

 

Once you see these events and no errors in them – you can view group membership in SCOM:

 

image

 

So easy.  And you don’t have to know anything about XML, or even Management Packs to do it!

 

Using Visual Studio with VSAE works exactly the same way – you simply have to do a manual Find/Replace for each item.  See the VSAE method in the link above.

 

Want to dig deeper into how this is put together?  Read on:

The MP we generate is very basic.  There is a Class (the Group definition), a Relationship (the Group contains Windows Computers), and a Discovery (which queries SQL and discovers the relationship to the existing Windows Computers in SCOM).

 

image

 

The script is below:

We basically connect to SQL, return a list of FQDNs from the query, look each result up to see if it exists in SCOM, and if it does, add it to the group.

We will log events along the way to help in troubleshooting if anything doesn’t work, and record the completion and total script runtime, like all my SCOM scripts.

 

#================================================================================= # Group Population script based on SQL Query # Your query should return a list of FQDN names only # # Kevin Holman # v1.1 #================================================================================= param($SourceID, $ManagedEntityID) # Manual Testing section - put stuff here for manually testing script - typically parameters: #================================================================================= # $SourceId = '{00000000-0000-0000-0000-000000000000}' # $ManagedEntityId = '{00000000-0000-0000-0000-000000000000}' # $SQLServer = "FOO" # $SQLDBName = "CMDB" # $SQLQuery = "SELECT SERVERNAME from serverlist" #================================================================================= # Constants section - modify stuff here: #================================================================================= # Assign script name variable for use in event logging $ScriptName = "FAB.MyApp.SCOMComputerGroupFromSQL.SQLBased.Group.Discovery.ps1" $EventID = "7501" $SQLServer = "SQL2A" $SQLDBName = "CMDB" $SQLQuery = "select SERVERNAME from serverlist" #================================================================================= # Starting Script section #================================================================================= # Gather the start time of the script $StartTime = Get-Date # Load MOMScript API $momapi = New-Object -comObject MOM.ScriptAPI # Load SCOM Discovery module $DiscoveryData = $momapi.CreateDiscoveryData(0, $SourceId, $ManagedEntityId) #Set variables to be used in logging events $whoami = whoami #Log script event that we are starting task $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script is starting. 
`n Running as ($whoami).") #================================================================================= # Connect to local SCOM Management Group Section #================================================================================= # Clear any previous errors $Error.Clear() # Import the OperationsManager module and connect to the management group $SCOMPowerShellKey = "HKLM:\SOFTWARE\Microsoft\System Center Operations Manager\12\Setup\Powershell\V2" $SCOMModulePath = Join-Path (Get-ItemProperty $SCOMPowerShellKey).InstallDirectory "OperationsManager" Import-module $SCOMModulePath New-DefaultManagementGroupConnection "localhost" IF ($Error) { $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: Failure loading OperationsManager module or unable to connect to the management server. `n Terminating script. `n Error is: ($Error).") EXIT } #================================================================================= # Begin MAIN script section #================================================================================= #Log event for captured parameters $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n SQLServer: ($SQLServer) `n SQLDatabase: ($SQLDBName). `n SQL Query: ($SQLQuery).") # Health Service class section # We need this list of SCOM agents, so we can only submit discovery data for a Healthservice in SCOM otherwise SCOM will reject the discovery data, and this will clean up deleted stale Windows Computer objects that will remain until the next discovery # Clear any previous errors $Error.Clear() # Get all instances of a existing Health Service class $HS = Get-SCOMClass -Name "Microsoft.SystemCenter.Healthservice" | Get-SCOMClassInstance $HSNames = $HS.DisplayName $HSCount = $HSNames.count IF($Error) { $momapi.LogScriptEvent($ScriptName,$EventID,1, "`n FATAL ERROR: Unable to gather Healthservice instances from SCOM. 
`n Error is: $Error") EXIT } ELSE { $momapi.LogScriptEvent($ScriptName,$EventID,0, "`n Get all Health Service Objects has completed. `n Returned ($HSCount) Health Service Objects from SCOM.") } # END Health Service class section # Connect to and then Query the database $SqlConnection = New-Object System.Data.SqlClient.SqlConnection $SqlConnection.ConnectionString = "Server=$SQLServer;Database=$SQLDBName;Integrated Security=True" $SqlCmd = New-Object System.Data.SqlClient.SqlCommand $SqlCmd.CommandText = $SqlQuery $SqlCmd.Connection = $SqlConnection $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter $SqlAdapter.SelectCommand = $SqlCmd $ds = New-Object System.Data.DataSet $SqlAdapter.Fill($ds) | Out-Null $SqlConnection.Close() # Check for errors connecting to SQL IF ($Error) { $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: There was an attempting to connect to and query SQL. `n Terminating script. `n Error is: ($Error).") EXIT } # Set the output to a variable [array]$SQLNames = $ds.Tables[0] $SQLNamesCount = $SQLNames.Count IF ($SQLNamesCount -ge 1) { $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Successfully collected ($SQLNamesCount) records from the SQL query.") } ELSE { $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: There was an error getting records from SQL or no records were returned. `n Number of objects returned: ($SQLNamesCount). 
`n Terminating script.") EXIT } # Set namelist array to empty [array]$NameList = @() # Loop through each Name from SQL and build an array of FQDN hostnames FOREACH ($SQLName in $SQLNames) { #Get the Hostname property from SQL [string]$DNSComputerName = $SQLName[0] $NameList += $DNSComputerName } $NameListCount = $NameList.Count #Discovery Section #Set the group instance we will discover members of $GroupInstance = $DiscoveryData.CreateClassInstance("$MPElement[Name='FAB.MyApp.SCOMComputerGroupFromSQL.SQLBased.Group']$") # Loop through each SCOM computer and add a group membership containment relationship to the discovery data $i=0; FOREACH ($Name in $NameList) { #Check to make sure the name we got from AD exists as a Healthservice in this Management Group IF ($Name -in $HSNames) { $i = $i+1 $ServerInstance = $DiscoveryData.CreateClassInstance("$MPElement[Name='Windows!Microsoft.Windows.Computer']$") $ServerInstance.AddProperty("$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", $Name) $RelationshipInstance = $DiscoveryData.CreateRelationshipInstance("$MPElement[Name='FAB.MyApp.SCOMComputerGroupFromSQL.SQLBased.Group.Contains.Windows.Computers']$") $RelationshipInstance.Source = $GroupInstance $RelationshipInstance.Target = $ServerInstance $DiscoveryData.AddInstance($RelationshipInstance) } } IF ($i -ge 1) { $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Successfully found ($i) Computers in SCOM from the original ($NameListCount) DNS names from the query.") } ELSE { $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR: No computers in SCOM were found matching the ($NameListCount) DNS names from the query. 
`n Terminating script.") EXIT } # Return Discovery Items Normally $DiscoveryData # Return Discovery Bag to the command line for testing (does not work from ISE) # $momapi.Return($DiscoveryData) #================================================================================= # End MAIN script section # End of script section #================================================================================= #Log an event for script ending and total execution time. $EndTime = Get-Date $ScriptTime = ($EndTime - $StartTime).TotalSeconds $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script Ending. `n Script Runtime: ($ScriptTime) seconds.") #================================================================================= #End Script

 

 

 

Key recommendations:

1.  Don’t run your frequency <intervalseconds> too often.  If updating the group once a day is ok, leave it at the default.  If you need it more frequently, that’s fine; just remember it’s a script, and all scripts running on the management servers have an impact on the overall load, plus we are submitting discovery data about relationships each time, and searching through SQL and SCOM via the SDK.

2.  The default timeout is set to 5 minutes.  If you cannot complete this in less, something is WRONG.  If that is true for you, you need to find out where it is taking too long.

3.  If you have a lot of SQL based SCOM groups, consider adding a staggered sync time to each discovery, so they don’t all run at the same time, or on the same interval.

Using SCOM AD integration – but with a CMDB instead of LDAP wizards


 

imageimage

 

AD integration has been around since SCOM 2007 first shipped.  The concept is simple: deploy agents to the OS as part of a build process, but with the SCOM agent left unconfigured.  The agent then checks Active Directory in its local domain, and receives its management group and management server assignment from there.

 

Historically, there were two challenges using AD integration: 

First, we only provided a very simple UI, that allowed you to use LDAP queries to try and assign agents based on some criteria in AD.  While this worked for some people, in large enterprise environments, this rarely provided a good level of granularity or a good method of load balancing the agent numbers, without constant adjustments. 

Second, it wasn’t always reliable.  The UI wizards wrote these rules to the Default Management Pack, and when you made a change, it would delete these rules, and write new rules, and sometimes this process broke, leaving your AD publishing in a broken state.  The risk was high, because a mistake or a hiccup might cause all your agents to drop out of monitoring (worst case).

Because of these challenges, a lot of customers stopped using AD integration over the years.

What if I told you – you don’t have to use the built in AD LDAP wizards to configure AD integration?  Instead, you could use a CMDB datasource, to give you the potential for more granular control, and easier load balancing control of management servers.

I didn’t come up with this idea.  One of my customers actually did, a long time ago.  They have been using it for years to manage their agents with great success.  I have simply re-written the design using PowerShell and simplified it to demonstrate here.

 

At the end of the day – AD integration is controlled simply by one or more rules in a management pack.  The rule is made up of two components:

  • Datasource
  • Write Action

The Datasource is the custom script, that will query the CMDB, and return output necessary for the write action.  For our datasource, we will use a simple scheduler, and the Microsoft.Windows.PowerShellPropertyBagProbe DS.  The output will be propertybags for each computer we return from the CMDB.
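
Structurally, that datasource is just a composite of a scheduler and the PowerShell propertybag probe.  The sketch below shows only the shape of the member modules, not the complete module type (the Configuration, OutputType, and the full script body are omitted, and the timeout value is just this demo’s default):

```xml
<MemberModules>
  <DataSource ID="Scheduler" TypeID="System!System.SimpleScheduler">
    <IntervalSeconds>$Config/IntervalSeconds$</IntervalSeconds>
    <SyncTime />
  </DataSource>
  <ProbeAction ID="Probe" TypeID="Windows!Microsoft.Windows.PowerShellPropertyBagProbe">
    <ScriptName>Demo.CMDB.ADIntegration.PsCMDBQuery.DS.ps1</ScriptName>
    <!-- ScriptBody carries the datasource script; Parameters pass SQLQuery and RuleId -->
    <TimeoutSeconds>300</TimeoutSeconds>
  </ProbeAction>
</MemberModules>
```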

The Write Action is very specific - Microsoft.SystemCenter.ADWriter – this is the “worker bee” in the equation – it takes inputs for DNSHostName and DistinguishedName of each server, and writes that to the specialized container and security groups in AD.  So if we are going to use a custom datasource with this write action – we simply need to pass these important items as propertybags to the Write Action.

 

Here is an example datasource script:

#================================================================================= # # Get Server list from CMDB for SCOM ADIntegration # Output FQDN and DistinguishedName into PropertyBags # # Kevin Holman # v 1.2 # #================================================================================= param($SQLQuery,$RuleId) # Manual Testing section - put stuff here for manually testing script - typically parameters: #================================================================================= # $RuleId = "Demo.CMDB.ADIntegration.MGNAME.MSNAME.Rule" # $SQLQuery = "WITH OrderedServers AS #( # SELECT SERVERNAME, ROW_NUMBER() OVER(ORDER BY SERVERNAME) AS RowNumber # FROM serverlist # WHERE MG = 'PROD' #) #SELECT SERVERNAME #FROM OrderedServers #WHERE RowNumber BETWEEN 1 and 2" #================================================================================= # Constants section - modify stuff here: #================================================================================= # Assign script name variable for use in event logging $ScriptName = "Demo.CMDB.ADIntegration.PsCMDBQuery.DS.ps1" $EventID = 9500 $SQLServer = "SQLSERVERNAME\INSTANCENAME" $SQLDBName = "CMDB" #================================================================================= # Starting Script section - All scripts get this #================================================================================= # Gather the start time of the script $StartTime = Get-Date #Set variable to be used in logging events $whoami = whoami # Load MOMScript API $momapi = New-Object -comObject MOM.ScriptAPI #Log script event that we are starting task $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script is starting. 
`n Running as ($whoami).") #================================================================================= # PropertyBag Script section - Monitoring scripts get this #================================================================================= # Load SCOM PropertyBag function $bag = $momapi.CreatePropertyBag() #================================================================================= # Begin MAIN script section #================================================================================= # Log an event for the parameters $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Rule: ($RuleId) `n SQL Server: ($SQLServer) `n SQLDBName: ($SQLDBName) `n SQLQuery: ($SQLQuery)") #Clear any previous errors $Error.Clear() # Query the CMDB database to get the servers and properties $SqlConnection = New-Object System.Data.SqlClient.SqlConnection $SqlConnection.ConnectionString = “Server=$SQLServer;Database=$SQLDBName;Integrated Security=True$SqlCmd = New-Object System.Data.SqlClient.SqlCommand $SqlCmd.CommandText = $SQLQuery $SqlCmd.Connection = $SqlConnection $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter $SqlAdapter.SelectCommand = $SqlCmd $ds = New-Object System.Data.DataSet $SqlAdapter.Fill($ds) $SqlConnection.Close() #Check for errors connecting to SQL IF($Error) { $momapi.LogScriptEvent($ScriptName,$EventID,1,"`n FATAL ERROR connecting to SQL server. `n Rule: ($RuleId). `n SQLServer: ($SQLServer). `n SQLDBName: ($SQLDBName). `n SQLQuery: ($SqlQuery). 
`n Error is: ($error).") EXIT } #Loop through each row of the SQL query output $i=0; $j=0; FOREACH ($row in $ds.Tables[0].Rows) { #Increment our counter to get number of computers returned from query $i = $i+1 #Get the FQDN from the SQL data $FQDN = $row[0].ToString().Trim() #Get the domain from the FQDN $Domain = $FQDN.Substring($FQDN.IndexOf(".") + 1) #Get the HostName and DomainComputer Account Name for DN use below $FQDNSplit = $FQDN.Split(".") #Get the HostName $HostName = $FQDNSplit[0] #Get the DomainComputerAccount $DomainComputerAccountName = $FQDNSplit[1] + "\" + $HostName + "$" #Get the Distinguished Name $ADS_NAME_INITTYPE_DOMAIN = 1 $ADS_NAME_TYPE_NT4 = 3 $ADS_NAME_TYPE_1779 = 1 $NameTranslate = New-Object -ComObject "NameTranslate" #Clear any previous errors $Error.Clear() #Connect to Active directory $NameTranslate.GetType().InvokeMember("Init", "InvokeMethod", $NULL, $NameTranslate, ($ADS_NAME_INITTYPE_DOMAIN, $Domain)) | Out-Null #We need to check for an error at this point because this is where we connect to a domain and this might fail if we dont have rights or are firewalled IF($Error) { $momapi.LogScriptEvent($ScriptName,$EventID,1, "`n FATAL ERROR connecting to Active Directory. `n Terminating script. `n Rule: ($RuleId) `n Domain: ($Domain) `n Error is: ($error).") EXIT } #Connect to AD and look up computer object from CMDB $NameTranslate.GetType().InvokeMember("Set", "InvokeMethod", $NULL, $NameTranslate, ($ADS_NAME_TYPE_NT4, $DomainComputerAccountName)) | Out-Null $DN = $NameTranslate.GetType().InvokeMember("Get", "InvokeMethod", $NULL, $NameTranslate, $ADS_NAME_TYPE_1779) #We need to check for an error at this point because this is where we find the computer in AD and it might not exist IF($Error) { $momapi.LogScriptEvent($ScriptName,$EventID,2, "`n NON FATAL WARNING connecting to Active Directory to find computer from CMDB. `n This usually mean that a computer exists in the CMDB bues does not exist in AD or the CMDB record is bad. 
`n Rule: ($RuleId) `n Domain: ($Domain). `n ComputerName = ($DomainComputerAccountName). `n Error is: ($error).") } ELSE { # Assume no errors so we will continue #Increment our counter to get number of computers returned from AD $j = $j+1 # Debugging: #Write-Host "Servername: $FQDN" #Write-Host "HostName: $HostName" #Write-Host "Domain Name: $Domain" #Write-Host "Domain Computer Name: $DomainComputerAccountName" #Write-Host "DN: $DN" #$momapi.LogScriptEvent($ScriptName,$EventID,0, "`n Debug: `n Rule: ($RuleId) `n FQDN: ($FQDN) `n HostName: ($HostName). `n DomainName: ($Domain). `, Domain Computer Account Name: ($DomainComputerAccountName). `n DN: ($DN)") #Create a propertybag for each computer $bag = $momapi.CreatePropertyBag() #Put the hostname and DN in the bag. #Includes a value for the folder name so that we can tell which folder the data is from. $bag.AddValue("distinguishedName",$DN) $bag.AddValue("dNSHostName",$FQDN) #Return each property bag as we create and populate it. $bag } } $QueryComputerCount = $i $ADComputerCount = $j $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Rule: ($RuleId). `n CMDB query looped through ($QueryComputerCount) computers. `n AD query found ($ADComputerCount) matching computers.") #================================================================================= # End MAIN script section # End of script section #================================================================================= #Log an event for script ending and total execution time. $EndTime = Get-Date $ScriptTime = ($EndTime - $StartTime).TotalSeconds $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script Completed. `n Script Runtime: ($ScriptTime) seconds.") #================================================================================= # End of script

 

Each rule passes in a custom SQL query and a RuleID (the RuleID is used only for logging, so we can tell which rule is running the shared data source).

The script runs the SQL query, loops through each server name in the result set, and queries AD to get the dNSHostName and DN of the object (if it exists).  Those items are placed into property bags to be consumed by the Write Action.

The rule has two parameters configured:

<Parameters>
  <Parameter>
    <Name>SQLQuery</Name>
    <!-- Use ANY query that returns only fully qualified domain names -->
    <Value>
      WITH OrderedServers AS
      (
        SELECT SERVERNAME, ROW_NUMBER() OVER(ORDER BY SERVERNAME) AS RowNumber
        FROM serverlist
        WHERE MG = 'PROD'
      )
      SELECT SERVERNAME FROM OrderedServers WHERE RowNumber BETWEEN 1 and 3
    </Value>
  </Parameter>
  <Parameter>
    <Name>RuleId</Name>
    <Value>Demo.CMDB.ADIntegration.DOMAIN.MS1.Rule</Value>
    <!-- We use this to help identify the rule calling the script for troubleshooting -->
  </Parameter>
</Parameters>

In the above SQL query, I limit the results to row numbers 1 through 3 for the first management server.  If you had thousands of agents, you could use this method to assign, say, 2000 agents to each management server, using the row-number range as the separator for each rule.
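The arithmetic for carving the CMDB result set into per-rule batches can be sketched as follows.  This is just an illustration (not part of the MP); the batch size of 2000 and total of 5000 agents are hypothetical values:

```python
# Sketch: compute the RowNumber BETWEEN ranges to paste into each rule's SQL query.
# Each (start, end) tuple becomes "WHERE RowNumber BETWEEN start AND end" in one rule.
def batch_ranges(total_agents, batch_size):
    """Return (start, end) row-number ranges, one per rule/management server."""
    ranges = []
    start = 1
    while start <= total_agents:
        end = min(start + batch_size - 1, total_agents)
        ranges.append((start, end))
        start = end + 1
    return ranges

# 5000 agents split into batches of 2000 -> one range per management server rule
print(batch_ranges(5000, 2000))  # [(1, 2000), (2001, 4000), (4001, 5000)]
```

Each resulting range maps to one rule, so each management server is assigned a predictable slice of the CMDB.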

That’s it.

Next – the Write Action:

<WriteAction ID="WA" RunAs="SC!Microsoft.SystemCenter.ADWriterAccount" TypeID="SC!Microsoft.SystemCenter.ADWriter">
  <ManagementServerName>d46bb8b5-48b2-c607-4890-33efd9416450</ManagementServerName>  <!-- This needs to be changed to the GUID of the Windows Computer object for your management server -->
  <Domain>DOMAIN.net</Domain>  <!-- This needs to be changed to the domain name you want to publish to -->
  <UserAndDomain />
  <Password />
  <SecureReferenceId />
  <dNSXPath>DataItem/Property[@Name='dNSHostName']</dNSXPath>
  <distinguishedNameXPath>DataItem/Property[@Name='distinguishedName']</distinguishedNameXPath>
</WriteAction>

 

We will need a distinct rule with a unique write action for each management server we want to assign agents to.  The write action must contain the management server’s GUID in the <ManagementServerName> XML tag.  To get a list of the correct GUIDs:

SELECT bme.DisplayName, bme.BaseManagedEntityId
FROM BaseManagedEntity bme
JOIN MTV_HealthService mtvhs ON bme.DisplayName = mtvhs.DisplayName
WHERE bme.Fullname like 'Microsoft.Windows.Computer:%'
AND mtvhs.IsManagementServer = 1
ORDER BY bme.fullname

 

Next, provide the domain we are publishing to.

That’s it!

Now – you can publish to any domain, using one rule for each management server.  You can also assign agents to your gateways, using the same process.

 

I have published an example MP you can use as a template here:

https://gallery.technet.microsoft.com/SCOM-Active-Directory-abe8b3d1

 

You still use the normal RunAs account configuration you always would, and can scope different RunAs publishing accounts to different management servers or Gateways.  Additional reading on that:

https://docs.microsoft.com/en-us/system-center/scom/manage-ad-integration-agent-assignment

https://blogs.technet.microsoft.com/smsandmom/2008/05/21/opsmgr-2007-how-to-enable-ad-integration-for-an-untrusted-domain/

UR4 for SCOM 2016 – Step by Step


 

image

 

KB Article for OpsMgr:  https://support.microsoft.com/en-us/help/4024941/update-rollup-4-for-system-center-2016-operations-manager

Download catalog site:  http://www.catalog.update.microsoft.com/Search.aspx?q=4024941

Updated UNIX/Linux Management Packs:  https://www.microsoft.com/en-us/download/details.aspx?id=29696

Recommended hotfix page:  https://blogs.technet.microsoft.com/kevinholman/2009/01/27/which-hotfixes-should-i-apply/

 

 

NOTE:  I get this question every time we release an update rollup:   ALL SCOM Update Rollups are CUMULATIVE.  This means you do not need to apply them in order, you can always just apply the latest update.  If you have deployed SCOM 2016 and never applied an update rollup – you can go straight to the latest one available. 

There is some documentation in the UR4 KB article that conflicts with the above statement, so I will try to explain.  The KB article states that you must apply UR2 before UR4.  This is misleading.  UR2 fixed a bug in the update process that allows any subsequent UR (post-UR2) to be uninstalled.  So if you want the ABILITY to uninstall UR4, you should apply UR2 or UR3 first.  However, once you apply UR4, you have the fix, and any subsequent UR’s will be able to be uninstalled.  In my opinion, there is little to no value in applying UR2 first.  Simply apply UR4 to whatever version you are at (since it IS cumulative), and any subsequent Update Rollups will have the ability to be uninstalled.

 

Ok, now that’s clear as mud, let’s get rolling:

 

Key fixes:
  • Adds support for TLS 1.2.

    • For more information about how to set up, configure, and run your environment with TLS 1.2, see the following article in the Microsoft Knowledge Base:  4051111 TLS 1.2 Protocol Support Deployment Guide for System Center 2016

  • APM Crash fix:  This update resolves an issue that causes a crash of IIS application pools that are running under CLR 2.0 when the APM feature is installed on the server as part of SCOM Agent. The code now uses appropriate memory instructions, based on the CLR version.

  • Fix:  Addresses an issue in which the APM AppDiagnostics console fails to create a Problem Management rule due to a FormatException. The appropriate string is now used for formatting, and the Problem Management wizard is able to run without issues.

  • Fix:  When a log file is being monitored by SCOM, Monagent locks the file and won't allow it to be renamed.

  • Fix:  Failure of GetOpsMgrDBWatcherDiscovery.ps1 script causes the Monitoring Host to crash.

  • Fix:  WMI Health monitor doesn't work if WINRM is configured to use https only.  (servers configured with HTTP SPN’s)

  • WMI Health monitor doesn't work if SPN http://servername is set to a user account.  (servers configured with HTTP SPN’s)

  • Fix:  SCOMpercentageCPUTimeCounter.ps1 script generates WMI errors that are caused by Service Principle Name (SPN) configuration issues.  (servers configured with HTTP SPN’s)

  • Product knowledge of "Windows Cluster Service Discovery" includes an incorrect reference to "Windows NT."

  • After a network outage, the management server does not reconnect to the gateway server if the gateway server was installed with the /ManagementServerInitiatesConnection=True option.

  • A configuration change to the network device triggers a rediscover of the device, and this process changes the SNMP agent address.

  • The UseMIAPI registry subkey prevents collection of custom performance rules data for all Linux servers.

 

 


Let’s get started.

 

 

From reading the KB article – the order of operations is:

  1. Install the update rollup package on the following server infrastructure:
    • Management server or servers
    • Audit Collection Server (ACS) 
    • Web Console server role computers
    • Gateway
    • Operations Console role computers
  2. Apply SQL scripts.
  3. Manually import the management packs.
  4. Apply Agent Updates
  5. Update Nano Agents
  6. Update Unix/Linux MP’s and Agents

 


 
1.  Management Servers

image

It doesn’t matter which management server I start with.  I simply make sure I only patch one management server at a time to allow for agent failover without overloading any single management server.

I can apply this update manually via the MSP files, or I can use Windows Update.  I have 2 management servers, so I will demonstrate both.  I will do the first management server manually.  This management server holds 3 roles, and each must be patched:  Management Server, Web Console, and Console.

The first thing I do when I download the updates from the catalog is copy the cab files for my language to a single location, and then extract the contents:

 

image

 

Once I have the MSP files, I am ready to start applying the update to each server by role.

***Note:  You MUST log on to each server role as a Local Administrator and SCOM Admin, and your account must also have the System Administrator role on the SQL instances that host your OpsMgr databases.

 

My first server holds the Management Server and Web Console roles, and has the OpsMgr console installed, so I copy those update files locally and execute them per the KB, from an elevated command prompt:

image

 

This launches a quick UI which applies the update.  It will bounce the SCOM services as well.  The update usually does not provide any feedback on success or failure, but I did get a reboot prompt.  You can choose “No” and then reboot after applying all the SCOM role updates.

 

image

 

You can check the application log for the MsiInstaller events to show completion:

Log Name:      Application
Source:        MsiInstaller
Event ID:      1036
Description:
Windows Installer installed an update. Product Name: System Center Operations Manager 2016 Server. Product Version: 7.2.11719.0. Product Language: 1033. Manufacturer: Microsoft Corporation. Update Name: System Center 2016 Operations Manager Update Rollup 4 Patch. Installation success or error status: 0.

You can also spot check a couple DLL files for the file version attribute.

 

image

 

Next up – run the Web Console update:

 

image

 

 

This runs much faster.   A quick file spot check:

 

image

 

Lastly – install the Console Update (make sure your console is closed):

 

image

 

A quick file spot check:

 

image

 

Or help/about in the console:

 

image

 

 

Additional Management Servers:

image

Apply the UR updates for Server, Web Console, and Console roles as needed for all additional management servers.  You may also use Windows Update.

 

Updating ACS (Audit Collection Services)

image

One of my management servers is also my ACS Audit Collection Server role.  I will apply the update for that.

 

From an elevated command prompt:

image

 

image

Note the above image states “Operations Manager 2012”.  This is a known issue documented in the KB article. 

 

Updated files:

image

 

 

Updating Gateways:

image

Generally I can use Windows Update or manual installation.  I will proceed with manual:

 

image

 

The update launches a UI and quickly finishes.

Then I will spot check the DLL’s:

image

 

 

I can also spot-check the \AgentManagement folder, and make sure my agent update files are dropped here correctly:

image

***NOTE:  You can delete any older UR update files from the \AgentManagement directories.  The UR’s do not clean these up and they provide no purpose for being present any longer.

I could also apply the GW update via Windows Update.

 

 

 

2. Apply the SQL Scripts

image

***Note – this update SQL script did NOT change from UR3 to UR4.  If you had previously applied this file in UR3, you may skip this step.  However, if you are unsure, you may always re-apply it.  Reapplication will never hurt.

 

In the path on your management servers, where you installed/extracted the update, there is ONE SQL script file: 

%SystemDrive%\Program Files\Microsoft System Center 2016\Operations Manager\Server\SQL Script for Update Rollups

(note – your path may vary slightly depending on if you have an upgraded environment or clean install)

Next – let’s run the script to update the OperationsManager (Operations) database.  Open a SQL management studio query window, connect it to your Operations Manager database, and then open the script file (update_rollup_mom_db.sql).  Make sure it is pointing to your OperationsManager database, then execute the script.

You should run this script with each UR, even if you ran it on a previous UR.  The script body can change, so as a best practice, always re-run it.

image

 

Click the “Execute” button in SQL mgmt. studio.  The execution could take a considerable amount of time and you might see a spike in processor utilization on your SQL database server during this operation.  

I have had customers state this takes from a few minutes to as long as an hour. In MOST cases – you will need to shut down the SDK, Config, and Monitoring Agent (healthservice) on ALL your management servers in order for this to be able to run with success.

You will see the following (or similar) output: 

image

 

IF YOU GET AN ERROR – STOP!  Do not continue.  Try re-running the script several times until it completes without errors.  In a production environment with lots of activity, you will almost certainly have to shut down the services (sdk, config, and healthservice) on your management servers, to break their connection to the databases, to get a successful run.

 

 

3. Manually import the management packs

image

 

There are 36 management packs in this update!  Most of these we don’t need, so read carefully.

 

The path for these is on your management server, after you have installed the “Server” update:

\Program Files\Microsoft System Center 2016\Operations Manager\Server\Management Packs for Update Rollups

However, the majority of them are Advisor/OMS, and language specific.  Only import the ones you need, and that are correct for your language.  

This is the initial import list: 

image

image

 

What NOT to import:

The Advisor MP’s are only needed if you are using the Microsoft Operations Management Suite cloud service (previously known as Advisor, and Operations Insights) and have your on-premises SCOM environment connected to the cloud service.

DON’T import ALL the languages – ONLY ENU, or any other languages you might require.

The Alert Attachment MP update is only needed if you are already using that MP for the very specific other MP’s that depend on it (rare).

The IntelliTrace Profiling MP requires IIS MP’s and is only used if you want this feature in conjunction with APM.

 

So I remove what I don’t want or need – and I have this:

image

#Note:  If the “Install” button is greyed out – this means you might already have one or more of these MP’s with the same version installed.  In my case, I had already applied an out-of-band MP hotfix for “System Center Internal Library” version 7.0.8437.10, so I had to remove that to continue.  Only do this if it is blocking you from continuing.

 

 

 

4.  Update Agents

image

Agents should be placed into pending actions by this update for any agent that was not manually installed (remotely manageable = yes):

image

 

If your agents are not placed into pending management, this is generally caused by not running the update from an elevated command prompt, by having manually installed agents (which are not placed into pending by design), or by using Windows Update to apply the update rollup for the Server role patch.

You can approve these – which will result in a success message once complete:

image

 

You normally could verify the PatchLevel by going into the console and opening the view at:  Monitoring > Operations Manager > Agent Details > Agents by Version

image

HOWEVER – due to a bug in UR4 agent patch – it is still reporting as UR3.  Sad smile

 

I *strongly* recommend you take a look at this community MP, which helps you see the “REAL” agent version in the console’s “Agent Managed” view:

https://blogs.technet.microsoft.com/kevinholman/2017/02/26/scom-agent-version-addendum-management-pack/

image

 

And my SCOM Management Group Management mp (updated for UR4), which will help show you REAL UR levels based on a better discovery.  This has long been a pain point in SCOM:

https://blogs.technet.microsoft.com/kevinholman/2017/05/09/agent-management-pack-making-a-scom-admins-life-a-little-easier/

image

image

 

 
 
5.  Update UNIX/Linux MPs and Agents

image

 

The UNIX/Linux MP’s and agents, at the time of this article’s publishing, have not changed since SCOM 2016 UR3 was released.

You can get the current Unix/Linux MP updates here:  https://www.microsoft.com/en-us/download/details.aspx?id=29696

The current version of these MP’s for SCOM 2016 UR4 is 7.6.1076.0 – and includes agents with version 1.6.2-339

 

Make sure you download the correct version for your SCOM deployment:

image

 

Download, extract, and import ONLY the updated Linux/UNIX MP’s that are relevant to the OS versions that you want to monitor:

image

 

This will take a considerable amount of time to import, and consume a lot of CPU on the management servers and SQL server until complete.

Once it has completed, you will need to restart the Healthservice (Microsoft Monitoring Agent) on each management server, in order to get them to update their agent files at \Program Files\Microsoft System Center 2016\Operations Manager\Server\AgentManagement\UnixAgents

You should see the new files dropped with new timestamps:

image

 

Now you can deploy the agent updates:

image

image

 

Next – you decide if you want to input credentials for the SSH connection and upgrade, or if you have existing RunAs accounts that are set up to do the job (Agent Maintenance/SSH Account)

image

image

 

If you have any issues, make sure your SUDOERS file has the correct information pertaining to agent upgrade:

https://blogs.technet.microsoft.com/kevinholman/2016/11/11/monitoring-unixlinux-with-opsmgr-2016/

 

 

6.  Update the remaining deployed consoles

image

 

This is an important step.  I have consoles deployed around my infrastructure – on my Orchestrator server, SCVMM server, on my personal workstation, on all the other SCOM admins on my team, on a Terminal Server we use as a tools machine, etc.  These should all get the matching update version.

 

 

Review:

image

Now at this point, we would check the OpsMgr event logs on our management servers, check for any new or strange alerts coming in, and ensure that there are no issues after the update.

 

Known Issues:

1.  The ACS update shows “Operations Manager 2012” in the UI but is actually for SCOM 2016.

2.  The Patchlist for agents mistakenly shows UR3, and was not updated for UR4.  This can be worked around by importing my “Agent Version” and “SCOM Management” mp’s.

3.  The Web Console is broken for any Silverlight dashboards.  When you view a Silverlight Dashboard, you are prompted to “Configure” but even when you configure it, you are prompted over and over again.  This is because the certificate being published in the utility is not correct for the version hosted on the Web Console server.  You can follow the steps in the UR4 KB article to resolve the issue.

Alerting on SNMP traps in SCOM – Without discovering the SNMP Device


 

Well, sort of, anyway.  Smile

 

I have written on SNMP monitoring in SCOM a few times:

https://blogs.technet.microsoft.com/kevinholman/2011/07/20/opsmgr-2012-discovering-a-network-device/

https://blogs.technet.microsoft.com/kevinholman/2015/02/03/snmp-trap-monitoring-with-scom-2012-r2/

https://blogs.technet.microsoft.com/kevinholman/2015/12/16/how-to-discover-a-windows-computer-as-a-network-device-in-scom-2012/

https://blogs.technet.microsoft.com/kevinholman/2016/04/20/writing-a-custom-class-for-your-network-devices/

 

This one will be a little different.

One of the challenges I have heard many times with SCOM is that we must discover a network device in order to monitor it or receive SNMP traps from it.

This can be a big problem for customers, as they often have network devices that only send traps, and are not query-able via SNMP GET requests.  And if we cannot get a device discovered, we can’t generate an alert or collect the trap.

 

Let’s make that a little simpler.  This article will demonstrate how we can create a new class for our network devices, discover them from a simple CSV text file, and then monitor them for SNMP traps.

This post will be based on the work of Tatiana Golovina here:  https://blogs.technet.microsoft.com/scpferublog/2013/12/09/snmp-scom-2012-109/

 

The idea is to create a management pack with the following:

1.  A new Resource Pool, that will host our network devices and load balance them.

2.  A Class that will define our SNMP network device and any properties.

3.  A discovery, which is a PowerShell script to discover our network devices.

4.  Rules, to alert on all traps, specific traps from a specific OID, and specific traps where a varbind contains specific data

 

Our class definitions look like the following:

<ClassTypes>
  <ClassType ID="Demo.SNMPDevice.Class" Accessibility="Public" Abstract="false" Base="System!System.ApplicationComponent" Hosted="false" Singleton="false" Extension="false">
    <Property ID="DeviceName" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
    <Property ID="IP" Type="string" AutoIncrement="false" Key="true" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
    <Property ID="SNMPCommunity" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
    <Property ID="Description" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
    <Property ID="Owner" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
  </ClassType>
  <ClassType ID="Demo.SNMPDevice.ResourcePool" Accessibility="Public" Abstract="false" Base="SC!Microsoft.SystemCenter.ManagementServicePool" Hosted="false" Singleton="true" Extension="false" />
</ClassTypes>

 

There is a datasource for the Resource Pool, and a Discovery for that as well, but those aren't important here.  They just create the pool and load the relationship so the pool hosts our devices.

Then – there is a discovery that discovers against the CSV file and creates our instances:

 

image

 

You can override this for Interval, SyncTime, Timeout, and the CSV path you want to discover from:

image

 

My example CSV is very basic – you can add or remove fields you want to discover.  For my example I include the device name, IP, Base64 SNMP Community String, a description, and the Owner.  You may change these as you wish, but you have to change the discovery script as well if you do.

We require a DeviceName, IP Address, and Community String (base64) at a minimum:

image
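To show the shape of the data the discovery works from, here is a minimal sketch of parsing such a CSV.  The column names mirror the class properties above, but the sample rows (device names, IPs) are made up for illustration, and I am assuming a header row:

```python
import csv, io

# Illustrative CSV matching the class properties: DeviceName, IP, SNMPCommunity
# (Base64), Description, Owner. The two device rows are hypothetical examples.
sample = """DeviceName,IP,SNMPCommunity,Description,Owner
router01,10.0.0.1,cHVibGlj,Core router,NetTeam
switch01,10.0.0.2,cHVibGlj,Access switch,NetTeam
"""

devices = []
for row in csv.DictReader(io.StringIO(sample)):
    # DeviceName, IP, and the Base64 community string are required at a minimum
    if row["DeviceName"] and row["IP"] and row["SNMPCommunity"]:
        devices.append(row)

print([d["DeviceName"] for d in devices])  # ['router01', 'switch01']
```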

 

The Base64 Community string is described here:  https://blogs.technet.microsoft.com/brianwren/2009/01/21/snmp-queries/  I have included the one for “public” in my example.
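If you need the Base64 value for a different community string, it is simply a standard Base64 encoding of the ASCII text.  A quick sketch in Python, using the well-known “public” string as the example:

```python
import base64

def community_to_base64(community):
    """Base64-encode an SNMP community string (plain ASCII -> Base64 text)."""
    return base64.b64encode(community.encode("ascii")).decode("ascii")

print(community_to_base64("public"))  # cHVibGlj
```

The output for “public” matches the value included in the example CSV.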

 

This will discover your objects:

image

 

And let you generate alerts when traps are sent to the SCOM Management server that hosts these objects:

 

image

 

I have included three different rule examples:

1.  A rule to alert on ANY trap sent from the device.

2.  A rule to alert on ANY trap sent from the device that comes from a specific OID.

3.  A rule that will filter a specific payload in the trap, such as data in a specific Varbind.

 

You can look at the rule configurations to better understand this method, to start creating your own rules.

 

This MP contains a Resource Pool dedicated to these devices.  You must configure this, as by default it will use automatic membership and distribute your network devices across all management servers.  This means each device must send its traps to EACH management server in the pool, because we do not control which management server hosts the device.  For this reason, especially for testing, you may want to set the pool membership to manual and limit the management servers.  Many devices can only send traps to two IP destinations, so it would be wise to choose two management servers for pool membership – to give high availability and load balancing – or just one management server for simplicity and testing:

 

image

image

image

 

So with a simple CSV, we can quickly “discover” our network devices and start getting traps from them.

 

 

You can download the example management pack and sample CSV file here:

https://gallery.technet.microsoft.com/SCOM-Management-Pack-to-ac659c99

UR14 for SCOM 2012 R2 – Step by Step


 

image

 

KB Article for OpsMgr:  https://support.microsoft.com/en-us/help/4024942/update-rollup-14-for-system-center-2012-r2-operations-manager

Download catalog site:  http://www.catalog.update.microsoft.com/Search.aspx?q=4024942

UNIX/Linux Management Packs:  https://www.microsoft.com/en-us/download/details.aspx?id=29696

 

NOTE:  I get this question every time we release an update rollup:   ALL SCOM Update Rollups are CUMULATIVE.  This means you do not need to apply them in order, you can always just apply the latest update.  If you have deployed SCOM 2012R2 and never applied an update rollup – you can go straight to the latest one available.  If you applied an older one (such as UR3) you can always go straight to the latest one!

 

NOTE: There is an issue with the UR14 Web Console update for client connectivity.  This is the same issue that existed in UR13.  Because of this, you should only apply it if you have first mitigated the certificate issue created by the update for your clients.  This is documented in the “Known Issues” at the bottom of this page.

 

 

Key Fixes:

 

 

Let’s get started.

 

From reading the KB article – the order of operations is:

  1. Install the update rollup package on the following server infrastructure:
    • Management servers
    • Audit Collection servers 
    • Gateway servers
    • Web console server role computers
    • Operations console role computers
    • Reporting
  2. Apply SQL scripts.
  3. Manually import the management packs.
  4. Update Agents
  5. Unix/Linux management packs and agent updates (if required)

 

 

Management Servers

image

It doesn’t matter which management server I start with.  There is no need to begin with whoever holds the “RMSe” role.  I simply make sure I only patch one management server at a time to allow for agent failover without overloading any single management server.

I can apply this update manually via the MSP files, or I can use Windows Update.  I recommend the manual approach rather than Windows Update, for several reasons: patches applied via Windows Update will not put agents into Pending Management, and the process is more difficult to control precisely.

 

The first thing I do when I download the updates from the catalog, is copy the cab files for my language to a single location then extract the contents:

image

 

Once I have the MSP files, I am ready to start applying the update to each server by role.

***Note:  You MUST log on to each server role as a Local Administrator and SCOM Admin, and your account must also have the System Administrator role on the SQL instances that host your OpsMgr databases.

 

My first server holds the Management Server and Web Console roles, and has the OpsMgr console installed, so I copy those update files locally and execute them per the KB, from an elevated command prompt:

 

image

 

This launches a quick UI which applies the update.  It will bounce the SCOM services as well.  The update usually does not provide any feedback that it had success or failure. 

You *MAY* be prompted for a reboot.  You can click “No” and do a single reboot after fully patching all roles on this server.

 

You can check the application log for the MsiInstaller events to show completion:

Log Name:      Application
Source:        MsiInstaller
Event ID:      1022
Description:
Product: System Center Operations Manager 2012 Server - Update 'System Center 2012 R2 Operations Manager UR14 Update Patch' installed successfully.

 

You can also spot check a couple DLL files for the file version attribute:

 

image

 

 

Next up – run the Web Console update:

image

 

This runs much faster.   A quick file spot check:

image

 

Lastly – install the console update (make sure your console is closed):

image

 

A quick file spot check:

image

 

 
 
Additional Management Servers:

image

I now move on to my additional management servers, applying the server update, then the console update and web console update where applicable, just like above.

 

 

 

Updating ACS (Audit Collection Services)

image

You would only need to update ACS if you had installed this optional role.

On any Audit Collection Collector servers, you should run the update included:

 

image

 

image

 

A spot check of the files:

image

 

 

Updating Gateways:

image

I can use Windows Update or manual installation.

image

 

The update launches a UI and quickly finishes.

You MAY be prompted for a reboot.

 

Then I will spot check the files:

image

 

I can also spot-check the \AgentManagement folders, and make sure my agent update files are dropped here correctly:

image

 

***NOTE:  You can delete any older UR update files from the \AgentManagement directories.  The UR’s do not clean these up and they provide no purpose for being present any longer.

 

 

Reporting Server Role Update

image

I kick off the MSP from an elevated command prompt:

image

 

This runs VERY fast and does not provide any feedback on success or failure.  So I spot check the files:

image

 

NOTE: There is an RDL file update available to fix a bug in business hours based reporting.  See the KB article for more details.  You can update this RDL optionally if you use that type of reporting and you feel you are impacted.

 

 

 
Apply the SQL Scripts

 

In the path on your management servers, where you installed/extracted the update, there are two SQL script files: 

%SystemDrive%\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\SQL Script for Update Rollups

(note – your path may vary slightly depending on if you have an upgraded environment or clean install)

image

First – let’s run the script to update the OperationsManagerDW (Data Warehouse) database.  Open a SQL management studio query window, connect it to your Operations Manager DataWarehouse database, and then open the script file (UR_Datawarehouse.sql).  Make sure it is pointing to your OperationsManagerDW database, then execute the script.

You should run this script with each UR, even if you ran it on a previous UR.  The script body can change, so as a best practice, always re-run it.

If you see a warning about line endings, choose Yes to continue.

 

image

 

Click the “Execute” button in SQL mgmt. studio.  The execution could take a considerable amount of time and you might see a spike in processor utilization on your SQL database server during this operation.

You will see the following (or similar) output:   “Command(s) completed successfully.”

 

image

Next – let’s run the script to update the OperationsManager (Operations) database.  Open a SQL management studio query window, connect it to your Operations Manager database, and then open the script file (update_rollup_mom_db.sql).  Make sure it is pointing to your OperationsManager database, then execute the script.

You should run this script with each UR, even if you ran it on a previous UR.  The script body can change, so as a best practice, always re-run it.

 

image

 

Click the “Execute” button in SQL mgmt. studio.  The execution could take a considerable amount of time and you might see a spike in processor utilization on your SQL database server during this operation.  

I have had MOST customers state this takes from a few minutes to as long as an hour.  In MOST cases – you will need to shut down the SDK, Config, and Microsoft Monitoring Agent (healthservice) on ALL your management servers in order for this to be able to run with success.  So prepare for the outage accordingly.

You will see the following (or similar) output: 

 

image

or

image

IF YOU GET AN ERROR – STOP!  Do not continue.  Try re-running the script several times until it completes without errors.  In a production environment with lots of activity, you will almost certainly have to shut down the services (sdk, config, and healthservice) on your management servers, to break their connection to the databases, to get a successful run.


 

 

 

Manually import the management packs

image

 

There are 60 management packs in this update!   Most of these we don’t need, so read carefully.

The path for these is on your management server, after you have installed the “Server” update:

\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\Management Packs for Update Rollups

However, the majority of them are Advisor/OMS, and language specific.  Only import the ones you need, and that are correct for your language. 

I will remove all the MP’s for other languages (keeping only ENU), and I am left with the following:

 

image

 

What NOT to import:

The Advisor MP’s are only needed if you are connecting your on-prem SCOM environment to the OMS cloud service, (Previously known as Advisor, and Operations Insights).

The APM MP’s are only needed if you are using the APM feature in SCOM.

The Alert Attachment and TFS MP bundle is only used for specific scenarios, such as DevOps scenarios where you have integrated APM with TFS, etc.  If you are not currently using these MP’s, there is no need to import or update them.  I’d skip this MP import unless you already have these MP’s present in your environment.

However, the Image and Visualization libraries deal with Dashboard updates, and these always need to be updated.

I import all of these shown above without issue.

 

 

Update Agents

image

Agents should be placed into Pending Management by this update for any agent that was not manually installed (remotely manageable = yes):  

On the management servers where you used Windows Update to patch, the agents did not show up in this list.  Only agents reporting to a management server that was patched manually show up in this list, FYI.   The experience is NOT the same when using Windows Update vs. a manual update.  If yours don’t show up – you can try running the update for that management server again – manually, from an elevated command prompt.

 

image

 

If your agents are not placed into pending management – this is generally caused by not running the update from an elevated command prompt, or having manually installed agents which will not be placed into pending.

In this case – my agents that were reporting to a management server that was updated using Windows Update – did NOT place agents into pending.  Only the agents reporting to the management server for which I manually executed the patch worked.

I re-ran the server MSP file manually on these management servers, from an elevated command prompt, and they all showed up.
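Running the server update manually looks something like the following, from an elevated command prompt (the .msp file name below is a placeholder, use the actual server update file extracted from the UR download):

```powershell
# Run from an elevated prompt on the management server
msiexec.exe /update "C:\Temp\<UR-Server-Update-AMD64>.msp"
```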

You can approve these – which will result in a success message once complete:

image

 

Soon you should start to see PatchList getting filled in from the Agents By Version view under Operations Manager monitoring folder in the console:

image

 

 

I *strongly* recommend you take a look at this community MP, which helps you see the “REAL” agent version number in the “Agent Managed” view in the console:

https://blogs.technet.microsoft.com/kevinholman/2017/02/26/scom-agent-version-addendum-management-pack/

image

 

And my SCOM Management Group Management mp (updated for UR14), which will help show you REAL UR levels based on a better discovery.  This has long been a pain point in SCOM:

https://blogs.technet.microsoft.com/kevinholman/2017/05/09/agent-management-pack-making-a-scom-admins-life-a-little-easier/

image

image

 

 

Update Unix/Linux MPs and Agents

image

The current Linux MP’s can be downloaded from:

https://www.microsoft.com/en-us/download/details.aspx?id=29696

7.5.1070.0 is the current SCOM 2012 R2 UR12-UR14 release version.  

Obviously – you skip this if you don’t use xPlat monitoring.  If you already have this version applied, then also skip it.

****Note – take GREAT care when downloading – that you select the correct download for SCOM 2012 R2.  You must scroll down in the list and select the MSI for 2012 R2:

image

 

Download the MSI and run it.  It will extract the MP’s to C:\Program Files (x86)\System Center Management Packs\System Center 2012 R2 Management Packs for Unix and Linux\

Update any MP’s you are already using.   These are mine for RHEL, SUSE, and the Universal Linux libraries. 

 

image

 

You will likely observe VERY high CPU utilization of your management servers and database server during and immediately following these MP imports.  Give it plenty of time to complete the process of the import and MPB deployments.

Next – you need to restart the “Microsoft Monitoring Agent” service on any management servers which manage Linux systems.  I don’t know why – but my MP’s never drop/update the UNIX/Linux agent files in the \Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\AgentManagement\UnixAgents\DownloadedKits folder until this service is restarted.
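For example, on each management server that manages Linux systems:

```powershell
# HealthService is the service name behind the "Microsoft Monitoring Agent" display name
Restart-Service -Name HealthService
```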

Next up – you would upgrade your agents on the Unix/Linux monitored agents.  You can now do this straight from the console:

image

You can input credentials or use existing RunAs accounts if those have enough rights to perform this action.

Finally:

image

 

 

 

 

Update the remaining deployed consoles

image

This is an important step.  I have consoles deployed around my infrastructure – on my Orchestrator server, SCVMM server, on my personal workstation, on all the other SCOM admins on my team, on a Terminal Server we use as a tools machine, etc.  These should all get the matching update version.

You can use Help > About to bring up a dialog box to check your console version:

image

 

 

 

Review:

image

Now at this point, we would check the OpsMgr event logs on our management servers, check for any new or strange alerts coming in, and ensure that there are no issues after the update.

 

 

 

Known issues:

See the existing list of known issues documented in the KB article.

1.  Many people are reporting that the SQL script is failing to complete when executed. You should attempt to run this multiple times until it completes without error.  You might need to stop the Exchange correlation engine, stop all the SCOM services on the management servers, and/or bounce the SQL server services in order to get a successful completion in a busy management group.  The errors reported appear as below:

——————————————————
(1 row(s) affected)
(1 row(s) affected)
Msg 1205, Level 13, State 56, Line 1
Transaction (Process ID 152) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Msg 3727, Level 16, State 0, Line 1
Could not drop constraint. See previous errors.
——————————————————–

2.  The Web Console update breaks the Web Console Silverlight in UR14.

Issue:

Once you apply the UR14 Web Console update, the initial web console connection constantly prompts you to “Configure” Silverlight.  You can run Configure, but the prompt repeats, and the web console is not usable, as you cannot get past the configure prompt.  If you choose “Skip”, the web console will also not be usable.

Cause:

When we initially connect to the Web console, we check to ensure the client has a code signing certificate that matches the .XAP files that are part of the web console.  If we detect that the client does not have the correct certificate, we will prompt to Configure this.  We include a file silverlightclientconfiguration.exe on the webserver which basically does two things:  (1) modifies the registry to AllowElevatedTrustAppsInBrowser, and (2) installs the Microsoft code signing certificate that was used to sign the .XAP files.

We included an updated set of .XAP files for the Web Console in UR14, and these were signed with the latest MS Code Signing certificate (Expiring 8/11/2018)

When we update the cert for signing, we are SUPPOSED to include this cert in the silverlightclientconfiguration.exe file.  However, this file was not updated with the new cert in UR13 or UR14.  It contains the same certs that worked in UR12.

The result is that users are prompted to “Configure” the Silverlight plugin, but even after running Configure, they continually get re-prompted because they do not have the correct certificate, which allows for Silverlight Elevated Trust Apps in Browser.

Known Workarounds:

Manually handle the certificate distribution.  Either via registry file, or import the cert into the trusted publishers.  You can export this cert by viewing the digital signature configuration on either of the XAP files or the SilverlightClientConfiguration.exe file:

To work around this issue, follow these steps:

  1. Click Configure in the dialog box.
  2. When you are prompted to run or save the SilverlightClientConfiguration.exe file, click Save.
  3. Run the SilverlightClientConfiguration.exe file.
  4. Right-click the .exe file, click Properties, and then select the Digital Signatures tab.
  5. Select the certificate that has Digest Algorithm as SHA256, and then click Details.
  6. In the Digital Signature Details dialog box, click View Certificate.
  7. In the dialog box that appears, click Install Certificate.
  8. In the Certificate Import Wizard, change the store location to Local Machine, and then click Next.
  9. Select the Place all certificates in the following store option and then select Trusted Publishers.
  10. Click Next and then click Finish.
  11. Refresh your browser window.

 

This will install the correct certificate in the client certificate store, so that you are able to use the SCOM web console.

image
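If you need to push this certificate to many clients, the same import can be scripted.  This is a sketch only, and the .cer path is a placeholder for the certificate you exported from the digital signature of the XAP files or SilverlightClientConfiguration.exe as described above:

```powershell
# Add the exported code signing certificate to the Local Machine Trusted Publishers store
certutil.exe -addstore TrustedPublisher "C:\Temp\MSCodeSigning.cer"
```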

VSAE now supports Visual Studio 2017


imageimage

 

System Center Visual Studio Authoring Extensions (VSAE) has been updated for VS2017

 

https://www.microsoft.com/en-us/download/details.aspx?id=30169

 

Read more about it here:  https://blogs.technet.microsoft.com/momteam/2017/12/01/system-center-visual-studio-authoring-extensionvsae-support-for-visual-studio-2017/

 

You voted.  Microsoft responded.  This had over 700 votes on Uservoice:

https://systemcenterom.uservoice.com/forums/293064-general-operations-manager-feedback/suggestions/18560653-updated-vsae-to-support-visual-studio-2017

 

Version 1.3.0.0 was released on 11/29/2017. 


Using Hashtables in your SCOM scripts


 

image

 

When we work with getting data out of SCOM via PowerShell, or building automations, something that can really help speed things up are Hashtables.

 

I was recently working with a customer to build a discovery for a new class, and get properties from other existing classes.  I had to build an array of these class objects (Windows Computers) and loop through them each time I wanted to use one.  As I found out – this became VERY time consuming.

In this case, I was doing something basic like this:

$WCarr = Get-SCOMClass -Name "Microsoft.Windows.Computer" | Get-SCOMClassInstance

That creates an array ($WCarr) and fills it with all the Windows Computer objects in the management group. 

The customer has about 10,000 Windows Computers in this SCOM Management group, so this operation takes about 85 seconds to complete.  Not a big deal for my script.

 

However, in a subsequent part of the script, I needed to loop through about 600 objects, and on each one I needed to get properties from the Windows Computer object.  Here is an example:

FOREACH ($WC in $WCarr)
{
     IF ($NameImLookingFor -eq $WC.DisplayName)
     {
          $MatchingComputerObject = $WC
     }
}

A cleaner way to write this same thing would be:

$MatchingComputerObject = $WCarr | where {$_.DisplayName -eq $NameImLookingFor}

 

I am basically just going through all 10,000 records in memory, until I find the one I am looking for.  At the customer, this process takes about 25 seconds.

25 seconds isn’t a big deal, right?

However, remember, I am doing this same thing 600 times!  Because I need to find the matching object FOR EACH of my 600 objects that I am adding discovery data for.

(600 * 25 seconds) = 15,000 seconds.  15000 seconds / 3600 seconds = 4.17 HOURS!!!!  This is no bueno!  I need to run this on a regular basis, and I cannot have it hammering my management servers like this.

 

So I take a different approach – what if I just go back to the SDK, and look up the Windows Computer object each time?

$MatchingComputerObject = Get-SCOMClass -Name "Microsoft.Windows.Computer" | Get-SCOMClassInstance | Where-Object -Property DisplayName -eq $NameImLookingFor

That takes 70 seconds and hits the SDK service and the SQL database every time.  Much worse!  Now my script will take 12 hours and hammer the SDK while it is running.  Ugh.
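You can verify these timings in your own environment by wrapping each approach in Measure-Command:

```powershell
# Time the in-memory array scan (Measure-Command returns a TimeSpan)
Measure-Command {
    $MatchingComputerObject = $WCarr | where {$_.DisplayName -eq $NameImLookingFor}
} | Select-Object TotalSeconds
```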

 

Enter in, the Hash Table

 

A Hash Table is defined as an associative array, with a compact data structure that stores key/value pairs.  Here is a simple example:

Create a new empty hashtable:

$HashTable = @{}

Add stuff to it:

$HashTable.Add('Kevin','Rocks')

Kevin is the “Key” in the Hashtable.  Rocks is the value.  In this case it is a simple string, but it could also be an object, which is powerful!

If I want to retrieve the Value for the Key “Kevin” I would:

$HashTable.'Kevin'

Which returns the value, “Rocks”.

Similarly – we can input an object:

$HashTable.Add('DC1.opsmgr.net',(Get-SCOMClass -Name "Microsoft.Windows.Computer" | Get-SCOMClassInstance | Where-Object -Property DisplayName -eq "DC1.opsmgr.net"))

In the above example, I am using the string FQDN as the Key, and inputting the Windows Computer object as the Value for the hashtable.

Now I can quickly retrieve the object without needing to go to the SDK, or loop through an array to find that object.
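One caution when filling the hashtable:  the .Add() method throws an exception if the Key already exists, and a lookup on a missing Key simply returns nothing.  The ContainsKey() method covers both situations:

```powershell
# Guard against duplicate keys when filling the hashtable
IF (-NOT $HashTable.ContainsKey($WC.DisplayName))
{
    $HashTable.Add("$($WC.DisplayName)",$WC)
}

# Cheap existence test before trusting a lookup result
IF ($HashTable.ContainsKey($NameImLookingFor))
{
    $WC = $HashTable.($NameImLookingFor)
}
```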

 

Here is a real world example:

# Get all the Windows Computers into my Array
$WCarr = Get-SCOMClass -Name "Microsoft.Windows.Computer" | Get-SCOMClassInstance

# Create an empty Hashtable
$HashTable = @{}

# Fill the hashtable from the array using the unique FQDN as the Key
FOREACH ($WC in $WCarr)
{
     $HashTable.Add("$($WC.DisplayName)",$WC)
}

# Loop through each instance of my custom array and get properties from the Windows Computer as needed
FOREACH ($MyObject in $MyObjects)
{
     # Retrieve the individual matching object from the Hashtable
     $WC = $HashTable.($MyObject.DisplayName)

     # Get some property from the Windows Computer Object
     $IPAddress = $WC.'[Microsoft.Windows.Computer].IPAddress'.Value
}

 

Now – when I need to retrieve a matching object – I don’t need to parse through all 10,000 objects each time, I simply request the specific object by the indexed Key from the hashtable.

 

How long does this take?

Well I am glad you asked!

The following line from above: 

$WC = $HashTable.($MyObject.DisplayName)

takes exactly 0.4 MILLISECONDS.

That’s right.  Milliseconds.

What took 25 SECONDS before, now takes four tenths of a millisecond.

The script section, which took 4.17 HOURS before, now completes in 0.24 SECONDS.

 

Script time matters.  Make sure yours aren't running any longer than they have to.

I really do recommend placing a start time and end time, and total runtime in all your SCOM scripts – to help you monitor these kinds of things closely:

https://blogs.technet.microsoft.com/kevinholman/2017/09/10/demo-scom-script-template/
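A minimal version of that pattern looks like this:

```powershell
# Record the start time at the top of the script
$StartTime = Get-Date

# ... script body ...

# Calculate and output the total runtime at the end of the script
$EndTime = Get-Date
$ScriptTime = (New-TimeSpan -Start $StartTime -End $EndTime).TotalSeconds
Write-Output "Script completed.  Runtime: ($ScriptTime) seconds."
```

For scripts running inside management packs, the linked template logs this runtime to the Operations Manager event log instead, which is more useful on agents.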

Operations Manager 1801 is available!


 

image

 

This is a major change from all previous versions of SCOM. 

Operations Manager 1801 is the first release of SCOM built on the Semi-Annual Channel release cycle.  All new features and updates will now be delivered in this Semi-Annual Channel (SAC), following the same Semi-Annual release cycle that Windows Server is using. 

 

Read about the announcement here:  https://azure.microsoft.com/en-us/blog/first-system-center-semi-annual-channel-release-now-available/

Download it here:  https://www.microsoft.com/en-us/evalcenter/evaluate-system-center-release

Documentation:  https://docs.microsoft.com/en-us/system-center/scom/welcome?view=sc-om-1801

 

Basically – you have two choices for your Operations Manager deployments:

  • Semi-Annual Channel (SAC)
  • Long Term Servicing Channel (LTSC)

SAC:

  • Consistent updates on a semi-annual basis
  • All new features will be introduced into this release
  • Each build is supported for 18 months after release

LTSC:

  • Updates will continue in the form of UR’s, but these will primarily contain fixes, with no new features added.
  • Longer term (5 years of mainstream) support lifecycle.

 

Key feature updates in Operations Manager 1801:

  • Improved HTML5 dashboard experience
  • Enhanced SDK performance
  • Linux Logfile monitoring enhancements
  • Linux Kerberos support
  • GUI support for entering SCOM License key
  • Service Map integration
  • Updates and recommendations for third-party Management Packs
  • System Center Visual Studio Authoring Extension (VSAE) support for Visual Studio 2017

 

As always – submit and/or vote for feature requests, fixes, enhancements here:  https://systemcenterom.uservoice.com/forums/293064-general-operations-manager-feedback

Operations Manager 1801 – Quickstart Deployment Guide


 

There is already a very good deployment guide posted on Microsoft Docs here:   https://docs.microsoft.com/en-us/system-center/scom/deploy-overview?view=sc-om-1801

 

The following article will cover a basic install of System Center Operations Manager 1801.   The concept is to perform a limited deployment of OpsMgr, only utilizing as few servers as possible, but enough to demonstrate the roles and capabilities.  For this reason, this document will cover a deployment on 3 servers. A dedicated SQL server, and two management servers will be deployed.  This will allow us to show the benefits of high availability for agent failover, and the highly available resource pool concepts.  This is to be used as a template only, for a customer to implement as their own pilot or POC, or customized deployment guide. It is intended to be general in nature and will require the customer to modify it to suit their specific data and processes.

This also happens to be a very typical scenario for small environments for a production deployment.  This is not an architecture guide or intended to be a design guide in any way. This is provided "AS IS" with no warranties, and confers no rights. Use is subject to the terms specified in the Terms of Use.


 
Server Names\Roles:
  • SQL1             SQL Database Services, Reporting Services
  • SCOM1         Management Server Role, Web Console Role, Console
  • SCOM2         Management Server Role, Web Console Role, Console

Windows Server 2016 will be installed as the base OS for all platforms.  All servers will be a member of the AD domain.

SQL 2016 SP1  will be the base standard for all database and SQL reporting services.


 
 
High Level Deployment Process:

1.  In AD, create the following accounts and groups, according to your naming convention:

  • DOMAIN\OMAA                 OM Server Action Account
  • DOMAIN\OMDAS               OM Config and Data Access Account
  • DOMAIN\OMREAD             OM Datawarehouse Reader Account
  • DOMAIN\OMWRITE            OM Datawarehouse Write Account
  • DOMAIN\SQLSVC               SQL Service Account
  • DOMAIN\OMAdmins          OM Administrators security group

2.  Add the OMAA, OMDAS, OMREAD, and OMWRITE accounts to the “OMAdmins” global group.

3.  Add the domain user accounts for yourself and your team to the “OMAdmins” group.
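Steps 1 through 3 can also be scripted.  This is a rough sketch using the ActiveDirectory PowerShell module; the account names follow the example convention above, so adjust them (and the OU placement) to your own standards:

```powershell
Import-Module ActiveDirectory
$Password = Read-Host -AsSecureString -Prompt "Service account password"

# Create the service accounts
FOREACH ($Name in @('OMAA','OMDAS','OMREAD','OMWRITE','SQLSVC'))
{
    New-ADUser -Name $Name -AccountPassword $Password -Enabled $true -PasswordNeverExpires $true
}

# Create the admin group and add the OM accounts to it
New-ADGroup -Name 'OMAdmins' -GroupScope Global
Add-ADGroupMember -Identity 'OMAdmins' -Members 'OMAA','OMDAS','OMREAD','OMWRITE'
```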

4.  Install Windows Server 2016 to all server role servers.

5.  Install Prerequisites and SQL 2016 SP1.

6.  Install the Management Server and Database Components

7.  Install the Reporting components.

8.  Deploy Agents

9.  Import Management packs

10.  Set up security (roles and run-as accounts)


 
 
Prerequisites:

1.  Install Windows Server 2016 to all Servers

2.  Join all servers to domain.

3.  Install the Report Viewer controls to any server that will receive a SCOM console.  Install them from https://www.microsoft.com/en-us/download/details.aspx?id=45496  There is a prereq for the Report View controls which is the “Microsoft System CLR Types for SQL Server 2014” (ENU\x64\SQLSysClrTypes.msi) available here:   https://www.microsoft.com/en-us/download/details.aspx?id=42295

4.  Install all available Windows Updates.

5.  Add the “OMAdmins” domain global group to the Local Administrators group on each server.

6.  Install IIS on any management server that will also host a web console:

Open PowerShell (as an administrator) and run the following:

Add-WindowsFeature NET-WCF-HTTP-Activation45,Web-Static-Content,Web-Default-Doc,Web-Dir-Browsing,Web-Http-Errors,Web-Http-Logging,Web-Request-Monitor,Web-Filtering,Web-Stat-Compression,Web-Mgmt-Console,Web-Metabase,Web-Asp-Net,Web-Windows-Auth -Restart

Note:  The server needs to be restarted at this point, even if you are not prompted to do so.  If you do not reboot, you will get false failures about prerequisites missing for ISAPI/CGI/ASP.net registration.

7.  Install SQL 2016 SP1 to the DB server role

  • Setup is fairly straightforward. This document will not go into details and best practices for SQL configuration. Consult your DBA team to ensure your SQL deployment is configured for best practices according to your corporate standards.
  • Run setup, choose Installation > New SQL Server stand-alone installation…

image

  • When prompted for feature selection, install ALL of the following:
    • Database Engine Services
    • Full-Text and Semantic Extractions for Search
    • Reporting Services – Native

image

  • On the Instance configuration, choose a default instance, or a named instance. Default instances are fine for testing, labs, and production deployments. Production clustered instances of SQL will generally be a named instance. For the purposes of the POC, choose default instance to keep things simple.
  • On the Server configuration screen, set SQL Server Agent to Automatic.  You can accept the defaults for the service accounts, but I recommend using a Domain account for the service account.  Input the DOMAIN\sqlsvc account and password for the Agent, Engine, and Reporting services.
  • Check the box to grant Volume Maintenance Task to the service account for the DB engine.  This will help performance when autogrow is needed.

image

  • On the Collation Tab – you can use the default which is SQL_Latin1_General_CP1_CI_AS
  • On the Account provisioning tab – add your personal domain user account and/or a group you already have set up for SQL admins. Alternatively, you can use the OMAdmins global group here. This will grant more rights than is required to all OMAdmin accounts, but is fine for testing purposes of the POC.
  • On the Data Directories tab – set your drive letters correctly for your SQL databases, logs, TempDB, and backup.
  • On the Reporting Services Configuration – choose to Install and Configure. This will install and configure SRS to be active on this server, and use the default DBengine present to house the reporting server databases. This is the simplest configuration. If you install Reporting Services on a stand-alone (no DBEngine) server, you will need to configure this manually.
  • Choose Install, and setup will complete.
  • You will need to disable Windows Firewall on the SQL server, or make the necessary modifications to the firewall to allow all SQL traffic.  See http://msdn.microsoft.com/en-us/library/ms175043.aspx
  • When you complete the installation – you might consider also downloading and installing SQL Server Management Studio Tools from the installation setup page, or https://msdn.microsoft.com/en-us/library/mt238290.aspx
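Rather than disabling the firewall entirely, a sketch of opening just the standard SQL ports (a default instance listens on TCP 1433; SQL Browser uses UDP 1434 for named instances):

```powershell
# Allow inbound SQL Server and SQL Browser traffic
New-NetFirewallRule -DisplayName 'SQL Server' -Direction Inbound -Protocol TCP -LocalPort 1433 -Action Allow
New-NetFirewallRule -DisplayName 'SQL Browser' -Direction Inbound -Protocol UDP -LocalPort 1434 -Action Allow
```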


     
     
    SCOM Step by step deployment guide:

    1.  Install the Management Server role on SCOM1.

    • Log on using your personal domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
    • Run Setup.exe
    • Click Install
    • Select the following, and then click Next:
      • Management Server
      • Operations Console
      • Web Console
    • Accept or change the default install path and click Next.
    • You might see an error from the Prerequisites here. If so – read each error and try to resolve it.
    • On the Proceed with Setup screen – click Next.
    • On the specify an installation screen – choose to create the first management server in a new management group.  Give your management group a name. Don’t use any special or Unicode characters, just simple text.  KEEP YOUR MANAGEMENT GROUP NAME SIMPLE, and don’t put version info in there.  Click Next.
    • Accept the license.  Next.
    • On the Configure the Operational Database screen, enter in the name of your SQL database server name and instance. In my case this is “SQL1”. Leave the port at default unless you are using a special custom fixed port.  If necessary, change the database locations for the DB and log files. Leave the default size of 1000 MB for now. Click Next.
    • On the Configure the Data Warehouse Database screen, enter in the name of your SQL database server name and instance. In my case this is “SQL1”. Leave the port at default unless you are using a special custom fixed port.  If necessary, change the database locations for the DB and log files. Leave the default size of 1000 MB for now. Click Next.  
    • On the Web Console screen, choose the Default Web Site, and leave SSL unchecked. If you have already set up SSL for your default website with a certificate, you can choose SSL.  Click Next.
    • On the Web Console authentication screen, choose Mixed authentication and click Next.
    • On the accounts screen, change the accounts to Domain Account for ALL services, and enter in the unique DOMAIN\OMAA, DOMAIN\OMDAS, DOMAIN\OMREAD, DOMAIN\OMWRITE accounts we created previously. It is a best practice to use separate accounts for distinct roles in OpsMgr, although you can also just use the DOMAIN\OMDAS account for all SQL Database access roles to simplify your installation (Data Access, Reader, and Writer accounts). Click Next.
    • On the Diagnostic Data screen – click Next.
    • On the Microsoft Update screen – choose OFF, and click Next.
    • Click Install.
    • Close when complete.
    • The Management Server will be very busy (CPU) for several minutes after the installation completes. Before continuing it is best to give the Management Server time to complete all post install processes, complete discoveries, database sync and configuration, etc. 10-15 minutes is typically sufficient.

     

    2.  (Optional)  Install the second Management Server on SCOM2.

    • Log on using your domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
    • Run Setup.exe
    • Click Install
    • Select the following, and then click Next:
      • Management Server
      • Operations Console
      • Web Console
    • Accept or change the default install path and click Next.
    • Resolve any issues with prerequisites, and click Next.
    • Choose “Add a management server to an existing management group” and click Next.
    • Accept the license terms and click Next.
    • Input the servername\instance hosting the Ops DB. Select the correct database from the drop down and click Next.
    • Accept the Default Web Site on the Web Console page and click Next.
    • Use Mixed Authentication and click Next.
    • On the accounts screen, choose Domain Account for ALL services, and enter in the unique DOMAIN\OMAA, DOMAIN\OMDAS accounts we created previously.  Click Next.
    • On the Diagnostic Data screen – click Next.
    • On the Microsoft Update screen – choose OFF, and click Next.
    • Click Install.
    • Close when complete.

     

    3.  Install SCOM Reporting Role on the SQL server.

    • Log on using your domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
    • Locate the SCOM media. Run Setup.exe. Click Install.
    • Select the following, and then click Next:
      • Reporting Server
    • Accept or change the default install path and click Next.
    • Resolve any issues with prerequisites, and click Next.
    • Accept the license and click Next.
    • Type in the name of a management server, and click Next.
    • Choose the correct local SQL reporting instance and click Next.
    • Enter in the DOMAIN\OMREAD account when prompted. It is a best practice to use separate accounts for distinct roles in OpsMgr, although you can also just use the DOMAIN\OMDAS account for all SQL Database access roles to simplify your installation. You MUST input the same account here that you used for the OM Reader account when you installed the first management server.  Click Next.
    • On the Diagnostic Data screen – click Next.
    • On the Microsoft Update screen – choose OFF, and click Next.
    • Click Install.
    • Close when complete.

     

    You have a fully deployed SCOM Management group at this point.

     

     

    image

     
    What’s next?

     

    Once you have SCOM up and running, these are some good next steps to consider for getting some use out of it and keep it running smoothly:

     

    1.  Fix the Database permissions for Scheduled Maintenance Mode

    2.  Set the Operations Manager Administrators User Role

    • Add your OMAdmins Global Group.  Ensure you, your team, and the SCOM DAS and Action accounts are members of this group FIRST.
    • Remove BUILTIN\Administrators from the Operations Manager Administrators - User Role, to secure your SCOM installation.

    3.  Set SCOM License

    4.  Optimize SQL Server for growth and performance

    5.  Set up SQL maintenance jobs

    6.  Configure Data Warehouse Retention

    7.  Optimize your management servers registry

    8.  Enable Agent Proxy as a default setting

    9.  Configure Administration Settings per your requirements:

    • Database Grooming
    • Automatic Alert Resolution
    • Heartbeat configuration (modify only if required)
    • Manual Agent Installs (Reject, Review, or Accept)

    10.  Backup Unsealed Management packs

    11.  Deploy an agent to the SQL DB server.

    12.  Import management packs.

    • https://docs.microsoft.com/en-us/system-center/scom/manage-mp-import-remove-delete
    • Using the console – you can import MP’s using the catalog, or directly importing from disk.  I recommend always downloading MP’s and importing from disk.  You should keep a MP repository of all MP’s both current and previous, both for disaster recovery and in the case you need to revert to an older MP at any time.
    • Import the Base OS and SQL MP’s at a minimum.
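Importing from disk can also be done with PowerShell.  A sketch, assuming a local MP repository folder (the path below is a placeholder):

```powershell
# Import all management pack files found in the repository folder
Get-ChildItem -Path 'C:\MPRepository\CurrentMPs' -Include *.mp,*.mpb,*.xml -Recurse |
    Import-SCOMManagementPack
```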

    13.  Configure Notifications:

    14.  Deploy Unix and Linux Agents

    15.  Configure Network Monitoring

    16.  Configure SQL MP RunAs Security:

    17.  Continue with optional activities from the Quick Reference guide:

    18.  (Optional) Configure your management group to support APM monitoring.

    19.  (Optional) Deploy Audit Collection Services

    20.  Learn MP authoring.

    What versions of SCOM can be upgraded to Operations Manager 1801?


     

    image

     

    If you would like to move your SCOM deployment on to the Semi-Annual Channel, what versions does Microsoft support moving from?

    The answer is here:  https://docs.microsoft.com/en-us/system-center/scom/plan-system-requirements?view=sc-om-1801#supported-coexistence

     

    There are two methods to get to SCOM 1801:

    • Migration (parallel install with coexistence)
    • In Place Upgrade

    Both are supported.

     

    System Center Operations Manager 1801 supports an in-place upgrade from the following *minimum* versions:

    • System Center 2012 R2 UR12 to the latest update rollup
    • System Center 2016 RTM to the latest update rollup

     

    For a side by side migration, with multi-homing, we recommend that environments be on the latest Update Rollup.  So if you are doing a side by side (parallel) migration, at the time of this release, the most current update rollups are:

    • System Center Operations Manager 2012 R2 UR14
    • System Center Operations Manager 2016 UR4

    Monitoring AD Certificate Services on Windows Server 2012 R2 and Windows Server 2016


     

    We have management packs for Active Directory Certificate Services (ADCS) on Windows Server 2012/2012 R2 and Windows Server 2016.

    WS 2012 and 2012R2:  https://www.microsoft.com/en-us/download/details.aspx?id=34765

    WS 2016:  https://www.microsoft.com/en-us/download/details.aspx?id=56671

     

    However, there is an issue with the recently released ADCS MP for WS 2016.  A change was made in the library MP which modified some class property names.  This breaks MP upgrade compatibility, so customers using the ADCS MP’s for Windows Server 2012 and 2012 R2 cannot “add” the ADCS for Windows Server 2016 MP’s to the management group.


     

    You might see these errors:

     

    Certificate Services Common Library could not be imported.

    If any management packs in the Import list are dependent on this management pack, the installation of the dependent management packs will fail.

    Verification failed with 5 errors:
    -------------------------------------------------------
    Error 1:
    Found error in 2|Microsoft.Windows.CertificateServices.Library|7.1.10100.0|Microsoft.Windows.CertificateServices.Library|| with message:
    Version 10.0.0.0 of the management pack is not upgrade compatible with older version 7.1.10100.0. Compatibility check failed with 4 errors:

    -------------------------------------------------------
    Error 2:
    Found error in 1|Microsoft.Windows.CertificateServices.Library/31bf3856ad364e35|1.0.0.0|Microsoft.Windows.CertificateServices.CAWatcher|| with message:
    Publicly accessible ClassProperty (WatcheeName) has been removed in the newer version of this management pack.
    -------------------------------------------------------
    Error 3:
    Found error in 1|Microsoft.Windows.CertificateServices.Library/31bf3856ad364e35|1.0.0.0|Microsoft.Windows.CertificateServices.CAWatcher|| with message:
    Publicly accessible ClassProperty (IsWatcheeOnline) has been removed in the newer version of this management pack.
    -------------------------------------------------------
    Error 4:
    Found error in 1|Microsoft.Windows.CertificateServices.Library/31bf3856ad364e35|1.0.0.0|Microsoft.Windows.CertificateServices.CAWatcher|| with message:
    Publicly accessible ClassProperty (WatcheeHierarchyEntryPoint) has been removed in the newer version of this management pack.
    -------------------------------------------------------
    Error 5:
    Found error in 1|Microsoft.Windows.CertificateServices.Library/31bf3856ad364e35|1.0.0.0|Microsoft.Windows.CertificateServices.CAWatcher|| with message:
    New Key ClassProperty item (WatcherName) has been added in the newer version of this management pack.
    -------------------------------------------------------

     

     

    There is a workaround:

    1.  Back up any unsealed MP’s that reference the ADCS 2012 MP’s, then delete those unsealed MP’s, followed by all the ADCS 2012 MP’s.

    2.  Import only the Microsoft.Windows.CertificateServices.Library.mp version 10.0.0.0.

    3.  Import all the rest of the MP’s, including 2012, 2012 R2, and 2016 for ADCS, then re-import the unsealed MP’s you had to remove.
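    The backup and removal portion of this workaround can be scripted with the OperationsManager module. A rough sketch, assuming you want to export every unsealed MP before deleting anything (the wildcard MP name below is illustrative — verify which MP’s and dependencies exist in your own environment first):

    ```powershell
    Import-Module OperationsManager

    # Back up all unsealed MP's before deleting anything.
    # C:\MPBackup is a placeholder path.
    Get-SCOMManagementPack | Where-Object { -not $_.Sealed } |
        Export-SCOMManagementPack -Path "C:\MPBackup"

    # Find the installed ADCS MP's and remove them.
    # Removal will fail while unsealed MP's still reference them,
    # so delete those referencing unsealed MP's first.
    Get-SCOMManagementPack -Name "Microsoft.Windows.CertificateServices.*" |
        Remove-SCOMManagementPack
    ```

    With the old MP’s gone, the version 10.0.0.0 library imports cleanly and the rest of the ADCS MP’s can follow.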

     


     

    If you only need to monitor ADCS on Windows Server 2016, simply delete your existing ADCS MP’s first, then you can import these as normal.
