Archive for September, 2006

HP EVA: Monitor EVA’s error log for critical issues

September 23, 2006
The EVA Management Pack from HP monitors alerts sent by WEBES to MOM via the DESTA service. We configured DESTA to send to MOM via an XML feed. What happens to monitoring of your EVA systems when DESTA crashes or there are other errors on the EVA management server? You will receive no notification about critical EVA issues. It is therefore essential to monitor a log file that will inform you of issues with the EVA management server. These issues affect things like the phone home capability and the DESTA service feeding MOM.
If you want to ensure WEBES and DESTA are healthy and all systems are operational, then you might want to do the following:
1) Create a rule to monitor the DESTA service.
This is a simple task.  Create a new rule.  The data provider should be: Internally-generated Event
The criteria should be:
Source: Microsoft Operations Manager
Event ID: 21207
Click on advanced and select ‘Parameter 5’ equals ‘DESTA_Service’
Fill out the rest of the tabs as appropriate. We also added a response for when the DESTA service is down or crashes, which is to net start the DESTA service:
Click Add and select ‘Execute a command or batch file’
Then fill out the blanks accordingly:
Application: cmd.exe
Command Line: /c net start DESTA_Service
Initial Directory: c:\winnt\system32\
Select the radio button for Agent Computer.
Now you are monitoring the DESTA service on your EVA management server.
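If you want to sanity check the same logic outside of MOM, here is a minimal Python sketch of what the rule and its response do: look at the service state and run net start if it is not running. The function names are my own for illustration, and it assumes the English-language output of the Windows `sc query` command. It is not how MOM implements the rule internally.

```python
import subprocess

SERVICE_NAME = "DESTA_Service"  # service name used in the rule above


def service_is_running(sc_query_output: str) -> bool:
    """Return True if `sc query` output reports the RUNNING state."""
    for line in sc_query_output.splitlines():
        # The state line looks like: "        STATE              : 4  RUNNING"
        if line.strip().startswith("STATE"):
            return "RUNNING" in line
    return False


def ensure_service_running(name: str = SERVICE_NAME) -> bool:
    """Start the service if it is not running (Windows only).

    Returns True if a start was attempted, False if it was already running.
    """
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    if service_is_running(result.stdout):
        return False
    # Same action as the MOM response: net start DESTA_Service
    subprocess.run(["net", "start", name], check=False)
    return True
```

Calling ensure_service_running() on the EVA management server would mimic the response above. Only the parsing function is safe to run on a non-Windows machine.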
After you set up this rule, let’s create three rules that look for critical errors in your director_err.txt log file.
2) Create a new log file provider.
Create a new rule in the appropriate rule group (I recommend making a computer group that contains just your HP EVA management servers.  After you make this new group, associate it with a new rule group that will contain three simple rules).
The rule will be an event rule. When you get to the provider portion of the rule, click on ‘NEW’ near the bottom of the dialog box. You will be prompted to name this new provider. I called it HPEVAErrorlog. The type is Generic single-line log file. You will then be prompted to enter the directory where the file is located and give the name of the log file. For our environment it was located in ‘C:\Program Files\Hewlett-Packard\svctools\specific\Webes\logs’ and the name is ‘director_err.txt’. Once you create this provider you will never have to create it again; you will only have to select it for the next few rules that you create.
The first rule (that you already started to create) should have Parameter 4 matches wildcard ‘*Cannot establish socket connection with the server.*’.
The second rule will use the same provider and Parameter 4 matches wildcard ‘*Error parse error*’.
The third and final rule will use the same provider and Parameter 4 matches wildcard ‘*OutOfMemoryError*’.
These three lines were identified as crucial by HP.  If these text segments are found in the log file, then you will have to troubleshoot the issue you are having with the management server.  These issues may prevent WEBES from sending MOM HP EVA related alerts via the DESTA service.  So it would be smart for you to monitor for these three lines of text.
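To check what the three wildcard rules would catch before you build them in MOM, here is a rough Python sketch that applies the same kind of wildcard match to each line of a log file. It is only an illustration: the function names are mine, I am assuming each pattern is wrapped in asterisks so it matches anywhere in the line, and I am assuming a case-sensitive match, which may not be exactly how the MOM log file provider behaves.

```python
from fnmatch import fnmatchcase

# The three text segments HP identified, wrapped in wildcards (assumption)
CRITICAL_PATTERNS = [
    "*Cannot establish socket connection with the server.*",
    "*Error parse error*",
    "*OutOfMemoryError*",
]


def critical_matches(line: str) -> list:
    """Return the patterns that match a single log line."""
    return [p for p in CRITICAL_PATTERNS if fnmatchcase(line, p)]


def scan_log(path: str) -> list:
    """Return (line number, text) for every line that matches a pattern."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            text = line.rstrip("\n")
            if critical_matches(text):
                hits.append((number, text))
    return hits
```

Point scan_log at a copy of director_err.txt and it will list the lines that should have raised an alert.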
To test if the rules are working, just create the text file ‘director_err.txt’ in that directory (one may not be present) and ensure that you have the above text in the file.  MOM will pick it up and alert you accordingly.
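If you would rather script the test than edit the file by hand, a small sketch like this one appends the three text segments to the log file. The function name and the choice to append rather than overwrite are my own. Use it on a test server, since it writes to whatever directory you give it.

```python
import os

# One test line per rule, using the text segments from the rules above
TEST_LINES = [
    "Cannot establish socket connection with the server.",
    "Error parse error",
    "OutOfMemoryError",
]


def write_test_log(directory: str, name: str = "director_err.txt") -> str:
    """Append the test lines to the log file, creating it if needed."""
    path = os.path.join(directory, name)
    # Append mode so an existing log is not wiped out
    with open(path, "a", encoding="utf-8") as log:
        for line in TEST_LINES:
            log.write(line + "\n")
    return path
```

For our environment the directory would be the WEBES logs folder mentioned above.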
Good luck.
Categories: MOM

MOM Reporting: Event Analysis – What gives?

September 21, 2006
As you are aware, I am having some issues with the Exchange SLA Scorecard. I have been trying to use MOM reporting to do Alert and Event analysis for event ID 9539. Alert analysis worked as intended. I ran the Alert analysis report and it returned all instances of that particular alert on our Exchange servers (the ones that the agent picked the event up on).
I then ran the Event Analysis report.  You get prompted to select a date range, computer group, event type, event id, provider instance name, source, sort by, and sort order.
The criteria I selected:
Computer group: MOM 2005 Agents
Event type: Informational
Event Id: 9539
Provider instance name: Application
Source: MSExchangeIS Mailbox Store
The results should be all 9539 events, right? WRONG. I got 9539, 9523, and 1221 events that all use the same source. Why? I don’t know. I will be bringing this up with MSFT. I asked Marcus Oh for some help, and he checked the query in the .rdl file and didn’t find any wildcards, so I don’t know why it’s returning more than just the 9539 events.
Categories: MOM

Exchange SLA Scorecard woes Part 1

September 21, 2006
The Exchange SLA scorecard contains about three or four rules. Two of the most important concern a store being dismounted or mounted. These two rules look for event IDs 9539 and 9523. These two rules are redundant and do not raise alerts. In the Exchange Management Pack, in the Information Store Service child rule group, there are two rules for these events. The store dismounted rule is set to raise an alert.
We have a six-way cluster (four active virtual servers and two inactive) that had stores dismounted on September 13. MOM did not alert on this 9539 event, nor did it collect it. Therefore, when you go to view a report in the Exchange SLA scorecard you don’t see any outages for this cluster. MOM should have collected these events and even alerted on them, but it did not.
We are working on this issue with MSFT, but so far we haven’t gotten very far. MSFT recommended uninstalling and reinstalling the agents. We have done this. We have also checked the configuration for the computer groups and rule groups. We even ensured that the agents are alerting by doing the end-to-end monitoring test and creating a separate computer group and rule group for these particular clusters. Our tests proved that agent communication is working; it’s just a matter of finding out why these events were not picked up by MOM.
I’ll keep you posted as we make progress.
Categories: MOM

Exchange SLA Scorecard 1.5

September 2, 2006

It has been a long time coming, but we were able to implement this new Exchange SLA scorecard via an MCS engagement. I can’t get into the specifics of the install due to an NDA; however, I can talk about the configuration.

After we were able to install the scorecard the fun began! What fun am I talking about? Mapping server names to their Exchange roles! If you have a large Exchange deployment, then get ready to spend a few hours doing this. You will see a list of all the Exchange servers and you will have to manually touch each one and select what it is. This can only be done initially to FEP, BRD, and PUB servers (basically standalone Exchange servers). Once you are done mapping all of those servers manually (max display of servers per page is twenty – ugh!!) you get to map CLUSTERS! Oh what a joy this is! The MCS consultant got a few laughs as I made some sarcastic comments about how fun and high tech this was. You select one cluster (Exchange Virtual Server) and then you get to go through multiple screens and add all the other cluster names and physical nodes to that Exchange Virtual Server.

All kidding aside, you would think that MSFT would have a way to automate this via XML or even a text file, but they don’t. Besides setup (server role assignment and mapping) being a very manual process, there was a problem with sorting in their drop-down lists. All my resources were listed in alphabetical order until the last twenty servers. I suppose MSFT didn’t anticipate a very large Exchange organization. I was losing my mind because I couldn’t find some of my clusters, then I scrolled all the way to the bottom of the drop-down menu and saw about fifteen servers that were not in alphabetical order.

I haven’t logged a DCR yet, but I hope that the MCS consultant gave that feedback to the team who created the SLA scorecard.

Other than the time-consuming process of assigning the server roles and mapping clusters, the install was pretty painless and the scorecards look really nice. A definite must for anyone who is running Exchange and MOM. If you have clustered Exchange servers, then you will need an MCS engagement to get the 1.5 version of the scorecard and assistance with installing it.

Categories: MOM