I applied the CU3 update to our infrastructure yesterday. Later that afternoon, I started to approve and process agent updates. In the evening I got pinged on OCS by our OCS and GroupChat engineer. He asked if I was doing an install on OCS, because “SCOM” was restarting all of the OCS and GroupChat services. I told him that this wasn’t possible and that the agent install shouldn’t bounce application services. After looking at one of the boxes, it was apparent that RestartManager had bounced several services after the SCOM agent update took place. I had patched other Windows 2008 servers earlier that day without any issue. I am still uncertain what caused this to happen on our OCS and GroupChat servers. However, if it happens to you, here is what to look for and how to resolve it.
Although the push shows as “Successful”, you will find that some of the installs were not. The quick way to find them is through an alert view or this view in the console:
All of the above Critical states are agents that experienced problems during install. Pick one and log on to that box. Checking the SCOM agent service, you will find it in a “Starting” state:
After you verify that the SCOM service is “Starting”, open Task Manager, where you should find MOMAgentInstaller.exe still running:
Kill this and the HealthService.exe process:
Now start the SCOM agent service and verify that your DLLs have been updated to the .49 version. The Application and SCOM event logs show what potentially happened. In the Application log, after the SCOM agent install started, RestartManager began cycling several services, and the SCOM agent had been hung since the incident started:
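The manual recovery steps above can be scripted. The following is a minimal sketch, not a tested production tool. It assumes the agent's service name is HealthService (an assumption, the post only names the process) and that the built-in `sc`, `taskkill` and `net` commands are on the path. The function names are hypothetical.

```python
import re
import subprocess

# Assumption (not stated in the post): the SCOM agent service is named
# "HealthService". The process names come from the steps above.
SERVICE_NAME = "HealthService"
STUCK_PROCESSES = ("MOMAgentInstaller.exe", "HealthService.exe")


def parse_service_state(sc_output):
    """Return the state name (e.g. START_PENDING) from `sc query` output, or None."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def kill_commands(processes=STUCK_PROCESSES):
    """Build one `taskkill /F /IM <name>` command per process name."""
    return [["taskkill", "/F", "/IM", name] for name in processes]


def recover_agent():
    """Kill the hung installer and agent, then start the service again.

    Meant to be run on the affected server from an elevated prompt.
    """
    result = subprocess.run(
        ["sc", "query", SERVICE_NAME], capture_output=True, text=True
    )
    state = parse_service_state(result.stdout)
    # A service shown as "Starting" in the Services console reports START_PENDING.
    if state != "START_PENDING":
        print(f"{SERVICE_NAME} state is {state}; leaving it alone.")
        return
    for command in kill_commands():
        # check=False: one of the processes may already have exited.
        subprocess.run(command, check=False)
    subprocess.run(["net", "start", SERVICE_NAME], check=False)
```

Calling `recover_agent()` on a box where the service is stuck performs the same kill-and-restart sequence. Verifying the DLL versions afterwards is still a manual step in this sketch.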
So be careful about pushing agent updates to Windows 2008 servers if the Restart Manager service is running and allowed to run, as it may cause application outages for you.
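To check a server for the Restart Manager activity described above, the Application log can be queried from a script. This is a rough sketch that shells out to `wevtutil`. It assumes the events are logged under the provider name Microsoft-Windows-RestartManager, which may differ from the source name shown in your Event Viewer. The function names are hypothetical.

```python
import subprocess

# Assumption: Restart Manager events appear in the Application log under
# this provider name.
PROVIDER = "Microsoft-Windows-RestartManager"


def build_query_command(max_events=20):
    """Build a wevtutil command that lists the newest Restart Manager events."""
    xpath = f"*[System[Provider[@Name='{PROVIDER}']]]"
    return [
        "wevtutil", "qe", "Application",
        f"/q:{xpath}",       # XPath filter on the event provider
        f"/c:{max_events}",  # maximum number of events to return
        "/rd:true",          # newest events first
        "/f:text",           # plain-text output instead of XML
    ]


def recent_restart_manager_events(max_events=20):
    """Run the query on the local server and return the raw text output."""
    result = subprocess.run(
        build_query_command(max_events), capture_output=True, text=True
    )
    return result.stdout
```

Running `print(recent_restart_manager_events())` on a server right after an agent push shows whether anything was logged under that provider around the install time.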
I had to roll CU3 out to production today, and one of my agents threw an odd error:
The Agent Management Operation Agent Install failed for remote computer servername.domain.com.
Install account: myaccount
Error Code: 80070641
Error Description: The Windows Installer Service could not be accessed. This can occur if you are running Windows in safe mode, or if the Windows Installer is not correctly installed. Contact your support personnel for assistance.
Microsoft Installer Error Description:
For more information, see Windows Installer log file “(null)” on the Management Server.
I thought this was odd and had never seen it before. I did a quick Google search for this issue and found this KB, which mentioned that the Windows Installer service could be unregistered or corrupt. After I followed the steps in the article, I tried to install the update to the agent again and it was successful. Very nice!
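As a side note, the error code in the message above can be decoded to confirm that it points at Windows Installer. The sketch below splits an HRESULT into its standard fields. Facility 7 is the Win32 facility and code 1601 is ERROR_INSTALL_SERVICE_FAILURE, which matches the error description. The function name and the small lookup table are illustrative only.

```python
# Illustrative lookup with only the one Win32 code relevant here.
KNOWN_WIN32_CODES = {1601: "ERROR_INSTALL_SERVICE_FAILURE"}


def decode_hresult(hex_code):
    """Split an HRESULT given as a hex string into failure flag, facility and code."""
    value = int(hex_code, 16)
    code = value & 0xFFFF
    return {
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 7 is FACILITY_WIN32
        "code": code,                       # low 16 bits: the Win32 error code
        "name": KNOWN_WIN32_CODES.get(code),
    }


print(decode_hresult("80070641"))
```

The same helper works for other 8007xxxx codes reported by agent pushes, although the name lookup would need more entries.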