VMware vSphere


SQL Server Performance issues as a VM


  • 1.  SQL Server Performance issues as a VM

    Posted Aug 08, 2008 05:10 PM

    Since converting an MSSQL server to a virtual machine we have seen severe performance drops and an array of unusual errors. Here's the scenario:

    3 Dell PowerEdge 2950 servers each configured as follows:

    2 PROC, Quad core Intel E5430 2.66Ghz

    8GB RAM

    2 onboard Broadcom GB NICs (used for service console and admin network only)

    1 Intel GB 4 port NIC (2 for VMotion, 2 for Production VMs)

    2 QLogic QLA2432 4GB HBAs

    EMC CX300 SAN

    Admin, VMotion and Production networks are physically separate, each with its own GB switch.

    We are running ESX 3.5 on all hosts with VC 2.5 setup as a VM. HA, DRS, VMotion all enabled. There are six VMs running currently arranged like so:

    Host1

    WebServer VM (4 vCPU, 3GB RAM, Win2K)

    Host2

    WebServer2 VM (1 vCPU, 1GB RAM, Win2K3), secondary sites, minimal traffic

    WebServer3 VM (1 vCPU, 1GB RAM, Win2K3), not in production yet, no traffic

    VirtualCenter VM (1 vCPU, 2GB RAM, Win2K3)

    DataImport VM (1 vCPU, 1GB RAM, Win2K), processes data files to load into SQLServer

    Host3

    SQLServer VM (4 VCPU, 3GB RAM, Win2K, MSSQL2000 sp4) primary backend database

    The WebServer, DataImport and SQLServer VMs were all converted using Converter Enterprise.

    All VMs are stored on the SAN in LUN A, except SQLServer in LUN B.

    All SQLServer databases (.mdf files) are in two RDMs to LUNs C & D.

    All SQLServer log files are in the VM.

    We operate approximately 200 SQL databases ranging in size from 20MB to 20GB. Most are around 750MB. There is a lot of SQL activity. We receive daily update files from clients to load into their databases. These files are stored in a separate physical server and processed by the DataImport VM which does most of the insert/update/delete functions. The web portals themselves are mostly just queries. Very few inserts or updates. The past couple weeks since converting everything to VMware we have seen a lot of performance issues. Web Portal requests routinely return ASP timeouts while waiting for responses from SQL. We have also seen a lot of other intermittent errors since the conversion like these:

    "The RPC Server is unavailable" when accessing some pages.

    and

    "Microsoft OLE DB Provider for SQL Server error '80004005'

    DBNETLIB ConnectionOpen (PreLoginHandshake()). General network error. Check your network documentation."

    Both of these are intermittent and will often go away after refreshing a page. We've also seen these creep up on our web server: "Out of process application '/LM/W3SVC/1/ROOT/xxxxx' terminated unexpectedly."

    After some monitoring with esxtop, the VI client Performance tab, perfmon, and our switches, the network, SAN fabric and memory do not appear to be the bottleneck. The only thing that jumps out at me is the SQLServer CPU usage. Since the conversion, the VI client Performance tab and esxtop consistently show above 90% CPU usage while our DataImport programs are running. If I shut the importers down, usage drops to 30% or below. Also, perfmon and taskman in the guest show barely 25% usage at the same time that esxtop shows 100%.

    I'm at a loss of where to look next. SQL Server was able to handle all site and importer requests perfectly prior to the conversion and that was with lesser hardware. Any help would be greatly appreciated.

    Thanks.

    Tom



  • 2.  RE: SQL Server Performance issues as a VM

    Posted Aug 08, 2008 06:54 PM

    Tom, is your storage setup completely different from your former physical setup? Are you sharing the physical disks behind the SQL Server LUNs with other LUNs?

    Are you using load balancing within your virtual switch (which policy are you using)? Try using only one NIC for production and see what happens.

    Did you check the following article (irq sharing) http://www.vmware.com/resources/techresources/1061 ?

    Best wishes

    Spex



  • 3.  RE: SQL Server Performance issues as a VM

    Posted Aug 08, 2008 08:34 PM

    We were using the CX300 in a direct attach to an old PowerEdge 2650. The databases were the only thing stored on the SAN at that time and those LUNs have not changed at all. Those are the RDMs set up on the new SQLServer VM. We added another drive cage to the CX300 when we converted. The new cage houses all the drives for the VMs. The two LUNs each encompass a 6 disk RAID5 array so the SQLServer VM is residing on its own disk array.

    I am using load balancing on each of my virtual switches based on Port ID. I'll check the article you've linked.

    Thanks.

    Tom



  • 4.  RE: SQL Server Performance issues as a VM

    Posted Aug 11, 2008 07:07 PM

    Is it a 32 or 64 bit SQL Server? Ours is a 64 bit



  • 5.  RE: SQL Server Performance issues as a VM

    Posted Aug 11, 2008 04:23 PM

    Any chance a mod can move this thread to the ESX 3.5 forum? Looks like I posted it in the wrong area.



  • 6.  RE: SQL Server Performance issues as a VM

    Posted Aug 11, 2008 07:04 PM

    You have only mentioned CPU; what about memory, network and disk load? You can use esxtop to see outstanding I/O in the HBA queue as well as NIC and memory statistics. I have a virtual MS SQL 2005 server with 2 vCPUs and 3GB of memory used for MOSS 2007 and I would also like it to run better. I had to add an extra vCPU due to bad performance, but as you might know, adding vCPUs will not always improve performance because the vCPUs of a VM have to be scheduled in sync (or almost in sync).



  • 7.  RE: SQL Server Performance issues as a VM

    Posted Aug 11, 2008 09:22 PM

    I can't really find any statistics on memory, network, or disk that make me think there is a problem there. I've used esxtop and the Performance graphs in VI Client and nothing looks excessive. Network usage on the host will spike at 2.5Mbps. Memory usage maxes out at 1.8GB (3GB allocated to the VM) and that is just because SQL will use everything it's given. Zero ballooning as this is the only VM on the host. Disk I/O is barely touched from what I can see. When running under the heaviest load I still saw nothing in any of the HBA (0, 1 or 2) queues when monitoring the host with esxtop. Monitoring the traffic on our fiber switches even showed minimal activity. The only thing that seems excessive is the CPU on the SQLServer VM. Even then the usage according to taskman and perfmon is very low.

    It is a 32bit SQL Server.



  • 8.  RE: SQL Server Performance issues as a VM

    Posted Aug 11, 2008 09:56 PM

    Any chance you can take it back to 2 CPUs, and then test again?



  • 9.  RE: SQL Server Performance issues as a VM

    Posted Aug 11, 2008 10:04 PM

    I can try tonight. I tried reducing the number of PROCs SQL Server utilized to just 2, hoping to free up some of the resources for the OS or whatever else needed them so badly. This only made things worse. But that still had 4 vCPUs on the VM, just with 2 assigned to SQL rather than all 4.



  • 10.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 03:54 AM

    Reducing the SQLServer VM to 2 vCPUs resulted in even worse performance. This time I was seeing timeouts and general errors even without our data processing programs running. I set the VM back the way it was and reconfigured the WebServer1 VM to 2 vCPUs and 2GB of RAM just in case it was having an effect. I'm now at a loss to the real source. Before I was certain it was within the SQLServer VM. Now I was able to run numerous queries from Query Analyzer and most of them had no problems running. Those that did run slower were not unacceptable. However, our websites themselves are still seeing a lot of timeouts and other errors such as the RPC error and the General network error I referenced earlier.



  • 11.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 05:12 AM

    So increasing the CPUs at both client and server increases performance to an acceptable level, but before virtualizing the SQL Server everything ran fine? We are missing something in the SQL Server performance problem. I've read an article that discussed the query optimizer and claimed that virtualizing SQL would invalidate all the performance tuning rules in the query optimizer, but that would also be true whenever technology is improved (replacing 10K rpm disks with 15K rpm disks). Have you tried using Process Monitor to see where all the power is going? I think I will try it when I'm at work in an hour or so. 64 bit performs better when you have a high load and 4 vCPUs (see )



  • 12.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 06:09 AM

    Did you try to disable load balancing? Traffic between the web frontend and the SQL DB shouldn't need that...

    Regards

    Spex



  • 13.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 06:12 AM

    In my installation I see a match between the load in the SQL vm and what VirtualCenter is saying. I noticed that I have Data Execution Protection enabled and I'm not running fibres, but they can reduce context switching, which might help in a virtualized environment.



  • 14.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 09:47 AM

    That is really peculiar. Like you, I'm not really certain your issue is within SQL itself now. Try reducing the vCPUs for the web server again to 1 and see what it does. We always give our databases multiple CPUs (normally never above 2 even for our heavy hitters as performance seems to be best at 2) but our webservers always get 1 because they seem to do better with a single vCPU.



  • 15.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 10:08 AM

    Anders, that is exactly correct. Prior to virtualization the performance was excellent on lesser hardware. I have not used Process Monitor before so I'm not sure what I can extract out of the information. I tried it and found 250,000 entries to the screen within moments and cancelled the capture. According to the list of running processes in task manager sqlserver.exe is chewing up about 25%, sqlagent.exe is using 15% and System Idle Process is using 60%. This is when operating under load with our data importers running. Without load the figures are around 15%, 5% and 80% respectively.

    I don't know that upgrading to a 64-bit OS and instance of SQL Server is an option at this time.

    How can I make sure I have Data Execution Protection enabled and what do you mean by "not running on fibres"?

    Spex, what load balancing are you referring to? NIC load balancing on the host?

    william, I'll see what I can do about changing it to 1 vCPU. The real reason I suspected the SQL Server to begin with was of course the various SQL related errors (RPC and PreLoginHandshake as noted above) and the strange discrepancy between taskman and VI client performance counters on the CPU usage. Also when the system started presenting these problems it was difficult to even execute queries from Query Analyzer directly from the SQLServer VM. I had trouble even pulling down the list of DBs within Query Analyzer. This does not seem to be the case anymore. I'm starting to wonder if it's somewhere in the communication between the WebServer VM and the SQLServer VM. Performance from my data importers on their VM is fine (even better in some cases). The WebServer2 VM I listed above has three sites that use small SQL databases on the SQLServer VM. These have no problems at all with performance to my knowledge. I'm going to do some further testing by duplicating a large database with access from a freshly installed Windows system and see if it incurs the performance hits or errors when the SQLServer is under load like the existing sites do.

    Thanks for all your input guys. Keep it coming please.



  • 16.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 01:04 PM

    Have you made sure that all hardware specific drivers and vendor specific application (management agent, e.g.) have been removed from the virtualized server? These can also make life difficult. The 64-bit is just an option if the current performance is unacceptable and we can't find any other solution.

    Data Execution Protection is set on the system (Right-click My Computer, properties, somewhere on one of the tabs) and in SQL 2005 it is also mentioned on the SQL Server properties.

    In SQL Server you can run threads or fibres (disables CLR if you're running SQL2005). On the SQL Server properties, you can enable and disable fibres. Enabling fibres might improve your performance (based on the helptext saying it reduces context switching).

    Have you checked duplex setting on the network links on both the hosts and switches, they can also cause some nasty problems and it can look like your symptoms as well. Just to make sure.



  • 17.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 05:52 PM

    I could not find any setup in Windows 2000 for Data Execution Protection. I did see it in Windows 2003 where you described. I have not tried fibers yet. All NICs and switches are running at 1000MB Full duplex.

    I created a new VM with Windows Server 2003 Enterprise R2, fully patched with VMTools installed. 1 vCPU, 1024MB RAM. I then setup IIS 6 and put a dummy site up with access to a duplicate database on the existing SQL Server. When our data importers are running this system also experienced issues with timeouts, and all of the errors described at the beginning. So I'm back to thinking it's the SQLServer VM.



  • 18.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 05:57 PM

    I may be hazy(in fact I know I am) but is this a new buildout of vmware, or are other vm's behaving as expected?



  • 19.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 07:21 PM

    william, this is a new install of VMware with most of the VMs converted from physical machines using Converter Enterprise. Other VMs appear to be working fine, but I just may not have noticed any problems since they are not under heavy load. We are getting strange errors on our Web server once in a while. "Out-of-process" errors. The exact message is in the first post of this thread.



  • 20.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 09:24 PM

    Long post, but I really think these are all worth investigating...

    My gut-feel, "short answer" would also be to investigate the possibility of a fresh install of the DB server in order to try and resolve the matter. P2V operations, while better than in the past, sometimes have a tendency to pull through legacy stuff that doesn't really belong in the VM. It could be that there is an extra device driver or hardware management agent that's not playing well with ESX. (Just double-checking: can you confirm that in Device Manager, Computer, your system is set to use a Multiprocessor HAL?)

    Longer answer would be to try and locate the bottleneck, as with troubleshooting any performance issue. Fortunately, there are some excellent tools available to help with this.

    - VMware Tools - simple to miss, potentially huge impact. If you don't have the Tools installed in the VM, there could potentially be a massive CPU overhead in the network stack.

    - Antivirus - just a sanity check. Disable temporarily if installed and see if anything changes.

    - W2K SP4 Update Release 1. This is required for Windows 2000 to be supported on VMware ESX, so make sure it's installed. Not sure if it could explain your symptoms, but worth a look.

    - CPU - esxtop is the tool to use for real-time troubleshooting; the most important value you want to look out for in most CPU-bound cases is %READY. If this is higher than 10% in esxtop, something is preventing the VM from being scheduled to run as frequently as it should. I know you're saying it's the only VM on the system, but look at the possibility of contention somewhere. Also, make sure you're not allocating more than half the available number of cores to a single VM; see the performance tuning best practices guide for more details: http://www.vmware.com/pdf/vi_performance_tuning.pdf

    - Disk - IOmeter with a decent profile can help compare storage figures before/after - I'm not sure if your SQL data was on the Clariion previously as well; even if you can't compare the figures to anything on your side, if you post them here I can compare them with what I've got (IOmeter: http://www.iometer.org/; use the 8K OLTP profile from this definition file with queue depths of 4, 16, 64 and 256: http://arethusa.tweakers.net/~femme/iometer/workloads.icf)

    - Networking - use iperf to verify that you've got reliable, high-speed connections between the various virtual machines. Run "iperf -s -w 256k" on one node, and "iperf -c <address of the first node> -w 256k" on the other node. (http://dast.nlanr.net/Projects/Iperf/#download)

    One last thing - can you please check your vmkwarning files (located in /var/log/) to see if there are any errors/warnings popping up on any of the ESX hosts?
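    A note on the %READY check above when reading the VI Client charts instead of esxtop: the charts report CPU Ready as a summation in milliseconds per sample rather than as a percentage. The sketch below converts one into the other. It assumes the real-time chart's 20-second sample interval; the function names are made up for illustration, and the 10% threshold is simply the rule of thumb quoted in this post.

```python
def ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU Ready summation (milliseconds) into a percentage
    of the sampling interval.

    interval_s defaults to 20 seconds, which is an assumption about the
    chart being read (real-time view). Use the interval of your chart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0)


def exceeds_threshold(ready_ms: float,
                      interval_s: float = 20.0,
                      threshold_pct: float = 10.0) -> bool:
    """True when the ready time is above the rule-of-thumb threshold."""
    return ready_percent(ready_ms, interval_s) > threshold_pct


if __name__ == "__main__":
    # Hypothetical sample values, not measurements from this thread.
    for sample_ms in (600, 2000, 5000):
        pct = ready_percent(sample_ms)
        print(sample_ms, "ms ->", pct, "% ready, above threshold:",
              exceeds_threshold(sample_ms))
```

    Whether the summation is reported per vCPU or for the whole VM depends on which object is selected in the chart, so compare like with like.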



  • 21.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 10:33 PM

    I'm considering the fresh install of the DB now. It's not a pleasant thought at this point.

    We are running a Multiprocessor HAL on all multiproc VMs and Singleprocessor HAL on the single proc VMs.

    I removed all specialty and unused device drivers from Windows and uninstalled any unused software (EMC PowerPath, QLogic SANSurfer and such) last week.

    VMware Tools is installed on all VMs.

    Antivirus was removed from all VMs and was to be reevaluated once everything was converted and running smoothly.

    All Windows VMs are fully patched.

    esxtop %RDY never went above 3% when running under full load, while %USED on each vmm was above 95%.

    I've never used IOmeter or iperf. I'll look into those right away.

    I checked the vmkwarning files on the host and the following is all I found:

    # cat /var/log/vmkwarning.3

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.009 cpu6:1040)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.009 cpu6:1040)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.009 cpu6:1040)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.205 cpu5:1041)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.205 cpu5:1041)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.205 cpu5:1041)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.207 cpu6:1040)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.207 cpu6:1040)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.207 cpu6:1040)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.480 cpu5:1041)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.480 cpu5:1041)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.480 cpu5:1041)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.498 cpu6:1040)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.498 cpu6:1040)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.498 cpu6:1040)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.507 cpu5:1041)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.507 cpu5:1041)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.507 cpu5:1041)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.843 cpu5:1041)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.843 cpu5:1041)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.843 cpu5:1041)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.860 cpu5:1041)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.860 cpu5:1041)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.860 cpu5:1041)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.868 cpu5:1041)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.868 cpu5:1041)WARNING: SCSI: 1249: LegacyMP Plugin could not claim path: vmhba0:288:0. Not supported

    Jul 23 09:40:02 host3 vmkernel: 35:00:13:14.868 cpu5:1041)WARNING: ScsiPath: 3180: Plugin 'legacyMP' had an error (Not supported) while claiming path 'vmhba0:C0:T288:L0'.Skipping the path.

    Jul 23 10:50:36 host3 vmkernel: 35:01:23:49.001 cpu6:1058)WARNING: SCSI: 4526: Manual switchover to path vmhba2:0:2 begins.

    Jul 23 10:50:36 host3 vmkernel: 35:01:23:49.002 cpu6:1058)WARNING: SCSI: 4575: Manual switchover to vmhba2:0:2 completed unsuccessfully.

    Jul 23 10:50:36 host3 vmkernel: 35:01:23:49.017 cpu3:1057)WARNING: SCSI: 4526: Manual switchover to path vmhba2:1:2 begins.

    Jul 23 10:50:37 host3 vmkernel: 35:01:23:50.063 cpu1:1057)WARNING: SCSI: 4567: Manual switchover to vmhba2:1:2 completed successfully.

    Jul 23 10:50:40 host3 vmkernel: 35:01:23:53.206 cpu4:1055)WARNING: SCSI: 4526: Manual switchover to path vmhba2:0:3 begins.

    Jul 23 10:50:40 host3 vmkernel: 35:01:23:53.206 cpu4:1055)WARNING: SCSI: 4575: Manual switchover to vmhba2:0:3 completed unsuccessfully.

    Jul 23 10:50:40 host3 vmkernel: 35:01:23:53.210 cpu0:1057)WARNING: SCSI: 4526: Manual switchover to path vmhba2:1:3 begins.

    Jul 23 10:50:40 host3 vmkernel: 35:01:23:53.583 cpu0:1057)WARNING: SCSI: 4567: Manual switchover to vmhba2:1:3 completed successfully.

    Jul 23 10:59:31 host3 vmkernel: 35:01:32:43.905 cpu4:1055)WARNING: SCSI: 4526: Manual switchover to path vmhba2:1:1 begins.

    Jul 23 10:59:31 host3 vmkernel: 35:01:32:43.905 cpu4:1055)WARNING: SCSI: 4575: Manual switchover to vmhba2:1:1 completed unsuccessfully.

    Jul 23 10:59:31 host3 vmkernel: 35:01:32:43.989 cpu6:1056)WARNING: SCSI: 4526: Manual switchover to path vmhba1:0:1 begins.

    Jul 23 10:59:33 host3 vmkernel: 35:01:32:46.350 cpu4:1055)WARNING: SCSI: 4526: Manual switchover to path vmhba1:1:2 begins.

    Jul 23 10:59:33 host3 vmkernel: 35:01:32:46.389 cpu4:1055)WARNING: SCSI: 4575: Manual switchover to vmhba1:1:2 completed unsuccessfully.

    Jul 23 10:59:33 host3 vmkernel: 35:01:32:46.451 cpu0:1057)WARNING: SCSI: 4526: Manual switchover to path vmhba1:0:2 begins.

    Jul 23 10:59:35 host3 vmkernel: 35:01:32:47.860 cpu0:1057)WARNING: SCSI: 4567: Manual switchover to vmhba1:0:2 completed successfully.

    Jul 23 10:59:35 host3 vmkernel: 35:01:32:48.019 cpu6:1056)WARNING: SCSI: 4567: Manual switchover to vmhba1:0:1 completed successfully.

    Jul 23 10:59:35 host3 vmkernel: 35:01:32:48.039 cpu4:1058)WARNING: SCSI: 4526: Manual switchover to path vmhba1:0:0 begins.

    Jul 23 10:59:35 host3 vmkernel: 35:01:32:48.041 cpu4:1058)WARNING: SCSI: 4567: Manual switchover to vmhba1:0:0 completed successfully.

    Jul 23 10:59:39 host3 vmkernel: 35:01:32:51.991 cpu0:1057)WARNING: SCSI: 4526: Manual switchover to path vmhba1:1:3 begins.

    Jul 23 10:59:39 host3 vmkernel: 35:01:32:51.994 cpu3:1057)WARNING: SCSI: 4575: Manual switchover to vmhba1:1:3 completed unsuccessfully.

    Jul 23 10:59:39 host3 vmkernel: 35:01:32:51.997 cpu6:1056)WARNING: SCSI: 4526: Manual switchover to path vmhba1:0:3 begins.

    Jul 23 10:59:39 host3 vmkernel: 35:01:32:52.029 cpu6:1056)WARNING: SCSI: 4567: Manual switchover to vmhba1:0:3 completed successfully.

    #
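    With this many repeated lines it is easier to read the log as counts. The following is a minimal sketch, assuming the warning layout shown in the excerpt above ("WARNING: <module>: <number>: <message>"); the function names are hypothetical, and it is meant to be run on a workstation against a copy of the vmkwarning file rather than on the service console.

```python
import re
from collections import Counter

# Matches the part of a vmkernel warning that follows the cpu/world prefix,
# e.g. "WARNING: SCSI: 4575: Manual switchover to vmhba2:0:2 completed ..."
WARNING_RE = re.compile(
    r"WARNING: (?P<module>\w+): (?P<code>\d+): (?P<message>.*)$"
)

# Captures the path of a switchover that did not succeed.
FAILED_RE = re.compile(
    r"Manual switchover to (?:path )?(\S+) completed unsuccessfully"
)


def summarize_warnings(lines):
    """Count warnings per (module, code) pair."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line.strip())
        if match:
            counts[(match["module"], int(match["code"]))] += 1
    return counts


def failed_switchovers(lines):
    """Return the paths whose manual switchover completed unsuccessfully."""
    paths = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths


if __name__ == "__main__":
    # Three lines copied from the excerpt above.
    sample = [
        "Jul 23 09:40:01 host3 vmkernel: 35:00:13:14.009 cpu6:1040)WARNING: SCSI: 279: SCSI device type 0xd is not supported. Cannot create target vmhba0:288:0",
        "Jul 23 10:50:36 host3 vmkernel: 35:01:23:49.001 cpu6:1058)WARNING: SCSI: 4526: Manual switchover to path vmhba2:0:2 begins.",
        "Jul 23 10:50:36 host3 vmkernel: 35:01:23:49.002 cpu6:1058)WARNING: SCSI: 4575: Manual switchover to vmhba2:0:2 completed unsuccessfully.",
    ]
    for key, count in summarize_warnings(sample).items():
        print(key, count)
    print("failed paths:", failed_switchovers(sample))
```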



  • 22.  RE: SQL Server Performance issues as a VM

    Posted Aug 13, 2008 03:56 PM

    I hate trying to troubleshoot problems with inconsistent symptoms...

    So in order to keep my websites running as smoothly as possible for my clients I've started shutting down our data importers during the day and only running them at night. Last night I started the importers and approximately 35 were processing files at once. This is a large number to do at once and usually drags the system down, even in the old physical setup. So I jumped on a few of the sites to see if the errors were still showing up like before. I had ZERO problems logging in, executing large reports, and general site navigation. I also ran a good number of queries from Query Analyzer with very little delay. Most returned results in under 5 seconds, even on large reports. The longest was about 15 seconds which was expected with the particular query. This is all while the 35 data imports were running. CPU usage in the VI client was pegged at 100%. In the guest it ran at about 40-50%.

    I let the importers run overnight and checked them first thing this morning. About 15 were running when I looked and I did have some problems with sites this morning. Several of them returned timeouts and one returned the RPC error. The data importers are currently set to slice out about 1/3 of our clients per instance. So we have three instances of the program running with about 60 clients per instance. I shut down one of these and the sites all started behaving again, even with the other two still running and about 6 clients still processing files. So I let those two run. About an hour later I started the third instance again which brought another 5 clients processing files to the game, totaling about 10. CPU in VI client still averaging high 90s. Guest averaging 30-35%. Site usage appears normal at the moment.

    Right now I only have two files processing and the CPU in VI client is STILL averaging 90% usage! Guest is averaging 20%. I understand that there will be some discrepancy but a 70% difference!? So far today things are running smoothly. I don't believe anything is solved, but for some reason it's working well enough today.

    I will probably spend the day today looking into IOmeter and iperf.



  • 23.  RE: SQL Server Performance issues as a VM

    Posted Aug 13, 2008 07:01 PM

    IOmeter results.



  • 24.  RE: SQL Server Performance issues as a VM

    Posted Aug 13, 2008 07:34 PM

    More IOmeter results.



  • 25.  RE: SQL Server Performance issues as a VM

    Posted Aug 13, 2008 10:05 PM

    iperf test results at 256k TCP window size:

    Source      Destination  Bandwidth
    ----------  -----------  --------------
    WebServer1  SQLServer1   913 Mbits/sec
    Importer1   SQLServer1   810 Mbits/sec
    WebServer2  SQLServer1   883 Mbits/sec
    WebServer3  SQLServer1   841 Mbits/sec
    SQLServer1  WebServer1   919 Mbits/sec
    Importer1   WebServer1   1.54 Gbits/sec  *running on same host
    WebServer2  WebServer1   928 Mbits/sec
    WebServer3  WebServer1   939 Mbits/sec
    WebServer1  Importer1    1.48 Gbits/sec  *running on same host
    SQLServer1  Importer1    839 Mbits/sec
    WebServer2  Importer1    920 Mbits/sec
    WebServer3  Importer1    933 Mbits/sec
    WebServer1  WebServer2   928 Mbits/sec
    Importer1   WebServer2   924 Mbits/sec
    SQLServer1  WebServer2   919 Mbits/sec
    WebServer3  WebServer2   1.40 Gbits/sec  *running on same host
    WebServer1  WebServer3   930 Mbits/sec
    Importer1   WebServer3   875 Mbits/sec
    SQLServer1  WebServer3   920 Mbits/sec
    WebServer2  WebServer3   1.58 Gbits/sec  *running on same host



  • 26.  RE: SQL Server Performance issues as a VM

    Posted Aug 13, 2008 10:51 PM

    Check whether you are hitting the max TCP port limit in Windows on your SQL Server. It can lead to strange errors and general network error messages. It's common if you have a high traffic web site or batch jobs that incorrectly open too many SQL connections in a loop. By default the highest port number Windows will hand out for a client TCP connection is 5000. You can increase it to 65000 with the MaxUserPort registry key. The key TcpTimedWaitDelay also plays a role here. By default it's 4 minutes before the port of a closed TCP connection is released (the TIME_WAIT state). You can set it as low as 30 seconds.
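    To see why these two settings matter together, here is a rough back-of-the-envelope sketch. It assumes the client port range starts at 1024 (an assumption, not stated in this post) and that every connection is closed by this machine and sits in TIME_WAIT for the full delay, so treat the result as an order-of-magnitude ceiling rather than a measurement. The function name is made up for illustration.

```python
def max_sustained_connection_rate(max_user_port: int = 5000,
                                  timed_wait_delay_s: int = 240,
                                  first_client_port: int = 1024) -> float:
    """Rough ceiling on new TCP connections per second before the
    client port range runs out.

    first_client_port=1024 is an assumption about where the range starts.
    Every port is assumed to stay unusable for timed_wait_delay_s seconds
    after its connection closes.
    """
    if timed_wait_delay_s <= 0:
        raise ValueError("timed_wait_delay_s must be positive")
    usable_ports = max_user_port - first_client_port + 1
    if usable_ports <= 0:
        raise ValueError("max_user_port must not be below first_client_port")
    return usable_ports / timed_wait_delay_s


if __name__ == "__main__":
    # Defaults described above: port limit 5000, 4 minute wait.
    print("default settings:", round(max_sustained_connection_rate(), 1),
          "connections/sec")
    # Values suggested above: MaxUserPort 65000, TcpTimedWaitDelay 30 s.
    print("tuned settings:",
          round(max_sustained_connection_rate(65000, 30), 1),
          "connections/sec")
```

    Under these assumptions the defaults allow only about 16 to 17 new connections per second on a sustained basis, which a batch job that opens a connection per statement can exceed easily.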

    On your SQL Server run the following from a command prompt when you encounter problems:

    netstat -nao

    If you see client port numbers close to 5000 for the SQL port (1433) you know you have this problem. The TCP connections are not closing fast enough. If you have local batch jobs running change these to Named Pipes / Shared Memory connections instead of TCP/IP connections. Let the web servers and other external apps use TCP/IP. You can find more info about it here:

    http://support.microsoft.com/kb/196271

    http://support.microsoft.com/kb/328476

    It's possible that you are hitting the 5000 port limit on your web servers as well.
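    The netstat check described above can be scripted so it is easy to repeat while the importers are running. This is a minimal sketch, assuming the usual column layout of Windows netstat output (Proto, Local Address, Foreign Address, State); the sample text is invented for illustration and the function name is hypothetical.

```python
from collections import Counter

SQL_PORT = 1433  # default SQL Server TCP port, as mentioned above

# Invented sample in the usual "netstat -nao" layout, for illustration only.
SAMPLE = """\
Active Connections

  Proto  Local Address      Foreign Address    State         PID
  TCP    0.0.0.0:1433       0.0.0.0:0          LISTENING     1200
  TCP    10.0.0.5:1433      10.0.0.7:4987      TIME_WAIT     0
  TCP    10.0.0.5:1433      10.0.0.7:4990      ESTABLISHED   1200
  TCP    10.0.0.5:3389      10.0.0.9:2011      ESTABLISHED   900
  UDP    0.0.0.0:1434       *:*                              1200
"""


def _port(address: str) -> int:
    """Extract the port from an 'ip:port' string."""
    return int(address.rsplit(":", 1)[1])


def summarize_sql_connections(netstat_text: str, sql_port: int = SQL_PORT):
    """Return (state counts, highest client port) for TCP connections
    that involve sql_port on either side. Listening sockets are ignored."""
    states = Counter()
    client_ports = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "TCP":
            continue
        local, foreign, state = fields[1], fields[2], fields[3]
        if state == "LISTENING":
            continue
        try:
            local_port, foreign_port = _port(local), _port(foreign)
        except (IndexError, ValueError):
            continue  # skip lines that do not look like ip:port pairs
        if local_port == sql_port:
            client_ports.append(foreign_port)   # remote client of this server
        elif foreign_port == sql_port:
            client_ports.append(local_port)     # this machine is the client
        else:
            continue
        states[state] += 1
    return states, max(client_ports, default=None)


if __name__ == "__main__":
    states, highest = summarize_sql_connections(SAMPLE)
    print("connection states:", dict(states))
    print("highest client port:", highest)
```

    To use it on real data, redirect the netstat output to a file and pass the file's contents to the function. A large TIME_WAIT count together with a highest client port near the configured limit would point at the problem described in this post.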



  • 27.  RE: SQL Server Performance issues as a VM

    Posted Aug 13, 2008 11:29 PM

    Thanks Argyle. netstat is indeed showing TCP ports near 5000 even now when the system is mostly idle. I'll check into this as soon as possible. I'll also see about changing our data importers to connect using named pipes instead of TCP/IP. Unfortunately I won't be able to do anything with this until tomorrow night since I will be away from the office tomorrow and don't want to make changes just before I leave town.



  • 28.  RE: SQL Server Performance issues as a VM

    Posted Aug 14, 2008 10:11 AM

    Those network performance tests look very good - nothing wrong there.

    Looking at the disk performance tests, I'm a bit concerned - the figures do not compare well at all to what I've seen on other setups. To give you some sort of indication, I once tested 5x 36GB 10k SCSI drives on an HP ProLiant DL380 G3 system (old) with the built in RAID controller and no write cache; at a queue depth of 64, I saw 500 Transactions per Second with the 8K OLTP test. On an HP EVA 4400 with 16 spindles and a vRAID1 LUN, I got over 5000 transactions per second.

    Having said that, your tests are a bit difficult to compare, since you ran the test on all disks simultaneously, but your combined transaction rate is around 300. This can be explained, though, if the storage subsystem was very busy at the time.

    For the sake of making sure, would it be possible to run the tests on only one disk at a time? You don't have to run it on all your data disks - perhaps just one data disk and one log disk.



  • 29.  RE: SQL Server Performance issues as a VM

    Posted Aug 14, 2008 01:20 PM

    Apparently the support forums here will autopost any autoresponder I have set up...



  • 30.  RE: SQL Server Performance issues as a VM

    Posted Aug 14, 2008 04:05 PM

    Tom,

    When you get back tomorrow, try setting a CPU reservation on your SQL VM. Since it's the only thing on the host, try setting it to 400% (100% * 4 vCPU). I don't know why this makes a difference in this scenario, but I have seen instances where setting the reservation (even with no contention) makes a difference. The other thing I would recommend is the clean install in a new VM. There have been reports of performance issues with P2V'ed VMs - not frequent, but it does happen.
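    For reference, the "400%" figure works out as follows. This is only a worked-arithmetic sketch: it assumes the reservation is entered in MHz and uses the 2.66GHz core speed and 4 vCPUs listed in the original post. The function name is hypothetical, and the host may not be able to grant the full amount once its own overhead is accounted for.

```python
def full_cpu_reservation_mhz(vcpus, core_mhz):
    """MHz needed to reserve 100% of one physical core per vCPU."""
    return vcpus * core_mhz


# 4 vCPUs on 2.66GHz cores, as described at the top of the thread.
reservation = full_cpu_reservation_mhz(vcpus=4, core_mhz=2660)
print(reservation)  # 10640
```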

    Ken Cline

    Technical Director, Virtualization

    Wells Landers

    VMware Communities User Moderator



  • 31.  RE: SQL Server Performance issues as a VM

    Posted Aug 18, 2008 11:55 PM

    I have a clean VM setup with SQL installed. I'm starting the process of moving the databases to the new VM. Only about a dozen will be moved initially to make sure there are no other side effects I'm not anticipating.



  • 32.  RE: SQL Server Performance issues as a VM

    Posted Aug 19, 2008 01:12 PM

    I hope this fixes it for you! You've certainly been persistent, and I commend you for your efforts. Please do keep us informed about how things go with the new VM.

    Thanks!

    Ken Cline

    Technical Director, Virtualization

    Wells Landers

    VMware Communities User Moderator



  • 33.  RE: SQL Server Performance issues as a VM

    Posted Aug 19, 2008 06:27 PM

    So far so good, but only 7 databases have been moved and three of those are rarely used. I will be moving another 50 later today hopefully.



  • 34.  RE: SQL Server Performance issues as a VM

    Posted Aug 20, 2008 12:37 AM

    90+ databases have been moved. It's off hours, but even with the data processors running, things are looking good. We'll find out tomorrow morning if things are stable enough for me to push the rest of the databases over. I've noticed that the gap between VI Client performance and Windows perfmon has closed significantly. Before it was 100% VI, 25% guest. Now it is 40% VI and 45% guest.



  • 35.  RE: SQL Server Performance issues as a VM

    Posted Aug 25, 2008 04:29 PM

    All 200 databases were moved last Wednesday. Thursday and Friday ran well, but we won't know how things are really operating until the end of the day today or tomorrow morning. So far today things are running smoothly. (fingers crossed)



  • 36.  RE: SQL Server Performance issues as a VM

    Posted Aug 25, 2008 06:38 PM

    Thanks for the update, Tom. Glad to hear things are better. I wish I knew what it was about the P2V process that sometimes causes these problems...

    Ken Cline

    Technical Director, Virtualization

    Wells Landers

    VMware Communities User Moderator



  • 37.  RE: SQL Server Performance issues as a VM

    Posted Sep 02, 2008 09:28 PM

    Well we have been running over a week with zero problems. I'm going to consider this one resolved. I guess it was just a bad conversion. Thanks everyone for your help with this.



  • 38.  RE: SQL Server Performance issues as a VM

    Posted Sep 02, 2008 11:40 PM

    Thank you for your efforts - and for sticking with it. Many folks would have given up and simply said "it's a bad idea to virtualize SQL Server". I wish VMware would be able to figure out what is happening in the conversion process and fix it... maybe one day ;)

    Ken Cline

    Technical Director, Virtualization

    Wells Landers

    VMware Communities User Moderator



  • 39.  RE: SQL Server Performance issues as a VM

    Posted Sep 03, 2008 04:42 AM

    That is great news for you and other VMware & SQL users.

    Perhaps it would be a good idea for VMware and other virtualization vendors to start investigating what goes wrong in some conversion projects. With enough badly performing systems it should be possible to find the culprits.



  • 40.  RE: SQL Server Performance issues as a VM

    Posted Aug 15, 2008 09:05 AM

    Looking at the symptoms, I'm quite sure it has nothing to do with VMware, SQL or hardware, and that you are hitting the max TCP port limit in Windows. I'd be interested to hear the results after changing the registry key MaxUserPort to 60000 and the key TcpTimedWaitDelay to 30 seconds, as mentioned in a previous post (on the SQL server, but sometimes also needed on the web servers). Also look for any code that opens SQL connections in a loop :D. I'd also limit the max RAM SQL Server can use to the total RAM in the system minus 512 to 1024 MB, to leave some dedicated RAM for the OS, which handles network processes. Otherwise SQL Server has a tendency to take it all and not give it back in a timely manner when the OS needs it.
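    To make the two suggestions above concrete, here is a hedged Python sketch. It renders the two values as text in .reg file format under the Tcpip\Parameters key, and it computes the suggested SQL memory cap range for a given amount of RAM. The function names are hypothetical, the sketch only generates text and changes nothing on any machine, and the generated text should be checked against the Microsoft KB articles linked earlier before it is imported anywhere.

```python
TCPIP_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)


def build_reg_file(settings, key=TCPIP_PARAMETERS_KEY):
    """Render DWORD settings as .reg file text (values written as 8-digit hex)."""
    lines = ["Windows Registry Editor Version 5.00", "", "[" + key + "]"]
    for name, value in settings.items():
        lines.append('"{}"=dword:{:08x}'.format(name, value))
    return "\n".join(lines) + "\n"


def sql_memory_cap_range_mb(total_ram_mb):
    """Range for the SQL memory cap: total RAM minus 1024 MB up to total RAM minus 512 MB."""
    return total_ram_mb - 1024, total_ram_mb - 512


# Values suggested in the post: 60000 ports, 30 second TIME_WAIT delay.
REG_TEXT = build_reg_file({"MaxUserPort": 60000, "TcpTimedWaitDelay": 30})
print(REG_TEXT)

# The SQL VM in this thread has 3GB (3072 MB) of RAM.
print(sql_memory_cap_range_mb(3072))
```

    For the 3GB VM discussed here, the cap range comes out at 2048 to 2560 MB.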



  • 41.  RE: SQL Server Performance issues as a VM

    Posted Aug 15, 2008 03:24 PM

    I just had a report of the problem resurfacing and was able to duplicate the errors: timeout errors from the web server. They have come and gone, but I'm sure they will be back.

    Ken, I changed the CPU reservation last night to approximately 400% of the available MHz.

    Argyle, I haven't had a chance to modify the registry entries yet. According to the articles you linked, I would need to change the max TCP port limit on my client (WebServer1), not the SQL Server. Also, I've checked my code and we are using pooled connections. Using perfmon I can pull up the current user connections to SQL, and it's currently at 268 with a maximum of 300. netstat also gets nowhere near 5000 entries, even though the port numbers it lists are occasionally in the 4000s. The RAM on the VM is set to 3GB and SQL is using 1.7GB. I believe the max RAM available to SQL Server 2000 Standard is 2.0GB, so it has plenty of room to hit its cap and still leave enough for the OS.

    I will still attempt to change the registry entries soon though.



  • 42.  RE: SQL Server Performance issues as a VM

    Posted Aug 18, 2008 03:19 PM

    I made the registry modifications to MaxUserPort and TcpTimedWaitDelay with values of 60000 and 30 respectively. I was notified of problems with my websites at approximately 7:00am this morning. Netstat on the SQL Server showed roughly 300 open TCP connections. Netstat on the web server showed approximately 200 TCP connections as either ESTABLISHED or TIME_WAIT and about 2400 entries like "UDP 0.0.0.0:29835 :". The site issues did not go away until I shut down one of my data importers. Even then only a few sites were operating smoothly.



  • 43.  RE: SQL Server Performance issues as a VM

    Posted Aug 18, 2008 05:32 PM

    Ok. It sounded really similar to a problem we had with TCP ports, especially the "RPC Server Unavailable" and "PreLoginHandshake()). General network error" messages. Are you running any reindexing or update statistics jobs on the databases involved?



  • 44.  RE: SQL Server Performance issues as a VM

    Posted Aug 18, 2008 05:44 PM

    You did have me hopeful for a while, but I guess that just wasn't it. We don't have any reindexing other than the usual automatic stuff done by SQL Server. We do back up and clear the transaction logs every hour, but that only lasts about 5 minutes and the issues are constant.



  • 45.  RE: SQL Server Performance issues as a VM

    Posted Aug 18, 2008 09:30 PM

    Reindexing isn't automatic, but running it weekly can help performance a lot. You should at least run it once on each database when you move servers or change hardware or disks. If you do not want to run an entire reindex right now (it can take time, and the log can grow depending on table sizes), you could start by running "sp_updatestats" on each database via Query Analyzer to see if it improves performance. A SQL query plan can turn really bad, with timeouts as a result, if the index and statistics data is old.

    Sometimes when you have many (or big) data load operations it can help to run an index rebuild straight after, but again, whether that's possible depends on the database size.
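    With around 200 databases, running sp_updatestats by hand gets tedious. The following is a minimal Python sketch that only generates a T-SQL script to paste into Query Analyzer. The database names are hypothetical placeholders and the function name is made up for illustration. Nothing here connects to SQL Server.

```python
def build_updatestats_script(database_names):
    """Build one 'USE <db> / EXEC sp_updatestats / GO' batch per database."""
    batches = []
    for name in database_names:
        # Bracket-quote the name, doubling any closing bracket it contains.
        quoted = "[" + name.replace("]", "]]") + "]"
        batches.append("USE {}\nEXEC sp_updatestats\nGO".format(quoted))
    return "\n".join(batches) + "\n"


# Hypothetical database names, for illustration only.
print(build_updatestats_script(["ClientA", "ClientB"]))
```

    Generating the script rather than executing it keeps the sketch safe to run anywhere and lets the output be reviewed, or limited to a handful of databases first, before anything is run against production.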



  • 46.  RE: SQL Server Performance issues as a VM

    Posted Aug 18, 2008 09:40 PM

    I'm sorry, I should have been more clear. Reindexing occurs on a weekly schedule on Saturday at 10:00am. It is not done during the week. The only scheduled changes during the week are the transaction log backups/truncations that occur every hour.



  • 47.  RE: SQL Server Performance issues as a VM

    Posted Aug 15, 2008 06:18 PM

    jhanekom, I finally got the updated iometer results. These look better.

    1100 average IOs on the log file drive (RAID 5 on the VM's LUN)

    3000 average IOs on the data file drive (RAID 10 RDM from the SQL VM) <-- Enough acronyms in that?

    Anyway, here are the result files.



  • 48.  RE: SQL Server Performance issues as a VM

    Posted Aug 12, 2008 06:03 PM

    Perhaps it would be worth creating a Windows 2003 VM with SQL 2000 and copying a database over to see if the problems persist in a clean installation.



  • 49.  RE: SQL Server Performance issues as a VM

    Posted Aug 13, 2008 11:30 PM

    Your post has been moved to the Performance forum

    Dave Mishchenko

    VMware Communities User Moderator