The issue we had was that a Barcode user selected all assets (55,000) to download to the scanner. However, the same troubleshooting method can be used to discover any hanging web process on the Asset Management Server.
My user instructions for Barcode tell users to always select a single location and then all child resources. They then need to verify that the number of assets to download is not equal to the total number of assets in the asset database. To verify this, they need to look at the Rows selected count before hitting Select All. We need to make sure that only the selected subgroup is going to be downloaded.
The problem was that all we knew on the administrative side was that the server was slow. We could see in the logs that Barcode was trying to do something. However, when we looked at the barcode upload report, it showed only 3 uploads in the last couple of days.
So it looked like someone was trying to do a barcode download and having it fail.
We tried restarting the IIS service to see if that would clear any hung processes. The server would clear for a moment and then w3wp.exe would go back up to 300-700MB of memory and 50-80% CPU utilization. We even tried rebooting the server and it would still hang.
We looked at the services and found no errors. Everything was working as designed.
Finally, after 2 days, one of the Barcode users called to say that he was having issues with his barcode scanner. It would not complete the download. One of the warehouse guys went over to the user's desk and walked him through the steps of downloading to the scanner.
The warehouse worker noticed that the end user was downloading 55,000 assets.
Once we figured out which user it was, I looked up their SID on our AD-ENT server.
Ran this query in SQL Enterprise Manager:
SELECT * FROM usersettings
WHERE sid LIKE 'S-1-5-21-XXXXXXXXXXX-XXXXXXXXXXXX'
Looked at the modified times to make sure that they matched the times the end user was trying to download.
Had user log off their computer.
Then ran the delete query.
DELETE FROM usersettings
WHERE sid LIKE 'S-1-5-21-XXXXXXXXX-XXXXXXXX-XXXXXXX-XXXX'
Restarted IIS on the Asset Management Server to clear the security cache. If you don't do this, the user's old settings won't clear. The user's SID needs to rebuild its rights from the Altiris database.
If you have the time, I would recommend rebooting the server.
Going Forward
How do we find the user or process hogging the server?
Use the IIS transaction logs. In our case they are located on the Asset Management Server at:
\WWW\inet\W3SVC1
Look for the most recent EX%%%%%%.log.
Save off the most recent file to the local computer.
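Delete the first two lines.
Delete the first entry on the header line (in a W3C-format IIS log this is the "#Fields:" label) and shift everything left so the column names line up with the data.
Open the file in Excel, using space as the delimiter.
Sort by the time-taken column, largest first.
Note the largest time-taken values. IIS records time-taken in milliseconds.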
sc-bytes  cs-bytes  time-taken
118760    1241      3474140
118712    990       3420468
118712    1241      1971868
119030    15357     277312
The longest request took a mind-boggling 57.90 minutes (3,474,140 ms)!!!
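The Excel steps above can also be scripted. The following is only a minimal Python sketch, assuming the log is in the W3C extended format with a "#Fields:" header line and that time-taken is in milliseconds. The function names (parse_w3c_log, slowest_requests, ms_to_minutes) are made up for this example, and the sample text reuses the four rows shown above rather than a real log file.

```python
# Sample text standing in for an IIS log; the data rows are the ones shown above.
SAMPLE_LOG = """#Version: 1.0
#Fields: sc-bytes cs-bytes time-taken
118712 1241 1971868
119030 15357 277312
118760 1241 3474140
118712 990 3420468
"""


def parse_w3c_log(text):
    """Return one dict per request, keyed by the names on the #Fields line."""
    fields = []
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # Column names follow the "#Fields:" label.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other header lines carry no request data
        values = line.split()
        if len(values) != len(fields):
            continue  # skip truncated or malformed lines
        rows.append(dict(zip(fields, values)))
    return rows


def slowest_requests(rows, top=3):
    """Sort by time-taken, largest first, and keep the top few."""
    return sorted(rows, key=lambda r: int(r["time-taken"]), reverse=True)[:top]


def ms_to_minutes(ms):
    """Convert milliseconds to minutes."""
    return int(ms) / 60000.0


rows = parse_w3c_log(SAMPLE_LOG)
slowest = slowest_requests(rows)
for r in slowest:
    print(f'{r["time-taken"]} ms = {ms_to_minutes(r["time-taken"]):.2f} min')
```

To run this against a real log, read the saved file into a string and pass it to parse_w3c_log instead of SAMPLE_LOG. There is no need to delete header lines by hand because the parser skips them.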
The user name is in the cs-username column:
cs-username
Domain\UserID
Domain\UserID
Domain\UserID
The process is in the cs-uri-stem column:
cs-uri-stem
/Altiris/BarCode/DownloadToScanner.aspx
/Altiris/BarCode/DownloadToScanner.aspx
/Altiris/BarCode/DownloadToScanner.aspx
/Altiris/BarCode/DownloadToScanner.aspx
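To see at a glance which user and page are eating the server, the same log can be totalled per cs-username and cs-uri-stem. This is a rough Python sketch under a few assumptions: the log has a "#Fields:" header that includes cs-username, cs-uri-stem and time-taken, and user names contain no spaces. The function name total_time_by_user_and_page is hypothetical. The sample rows pair the placeholder user and page from above with the time-taken values from the earlier table, and that pairing is assumed for illustration. The last row (Domain\OtherUser) is invented purely for contrast.

```python
from collections import defaultdict

# Raw string so the backslash in Domain\UserID is kept literally.
SAMPLE_LOG = r"""#Fields: cs-username cs-uri-stem time-taken
Domain\UserID /Altiris/BarCode/DownloadToScanner.aspx 3474140
Domain\UserID /Altiris/BarCode/DownloadToScanner.aspx 3420468
Domain\UserID /Altiris/BarCode/DownloadToScanner.aspx 1971868
Domain\OtherUser /Altiris/Example/Page.aspx 1200
"""


def total_time_by_user_and_page(text):
    """Sum time-taken (ms) per (user, page) pair, largest total first."""
    fields = []
    totals = defaultdict(int)
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # drop the "#Fields:" label itself
            continue
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(fields, line.split()))
        try:
            ms = int(row["time-taken"])
        except (KeyError, ValueError):
            continue  # no usable time-taken value on this line
        key = (row.get("cs-username", "-"), row.get("cs-uri-stem", "-"))
        totals[key] += ms
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


ranking = total_time_by_user_and_page(SAMPLE_LOG)
for (user, page), ms in ranking:
    print(f"{user:20} {page:45} {ms / 60000:.1f} min")
```

The first entry in ranking is the user and page with the most accumulated request time, which is the one to call before touching the usersettings table.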
So I know the hang is caused by a user trying to do a barcode download.
I can now use my previous process to look up the user's SID and delete their rows from the dbo.Usersettings table, after calling the user and telling them to log out.
I restart IIS and tell the user to log back in and try the same process of doing a download.
CPU use drops from 50-75% usage down to 5-10%.
Everyone can now work again, YEAH!!