SQL Performance Tuning When You Can't Fix the Queries – By Brent Ozar

Speaker: Brent Ozar
Published: June 2015

Do you want your SQL Server VM to be as fast as possible, despite having time or budget constraints when it comes to upgrading and reinstallation? Good news! Brent Ozar and Veeam® have a webcast with a workaround that's designed just for you, so you can learn more about SQL performance tuning.

In this demo-focused webcast, Microsoft Certified Master Brent Ozar explains the VM-friendly performance features of SQL Server 2012, 2014, and even the next version, 2016. If you are a system admin or DBA running your production workloads in VMware, Hyper-V and Xen, and if your end users are unhappy with the performance, this session will address your challenges.

Additionally, by joining Brent Ozar you will learn how to measure the performance of a running application, and then improve that performance by simply tuning a few knobs and switches. He explains which features in SQL Server are right for your workloads and gives you some additional resources you can use to go faster and learn more.

At the end, Brent will wrap up his SQL performance tuning webcast with a discussion about achieving availability for your SQL Server with Veeam. He'll even tell you how to get fast, transaction-level recovery of your SQL databases with Veeam Explorer™, including agentless, transaction-log backup and replay to restore your SQL databases to a precise point-in-time to meet your recovery time and point objectives (RTPO™).

Video Transcript 


So, hi folks, my name is Brent Ozar, and today we're going to be talking about making your SQL Servers go faster when they're under virtualization. Everything that I'm going to be doing here is going to be on my own VM as I'm doing demos, but as I'm going through this, it's really important for you to know what the bottlenecks are on your own SQL Server.

Run sp_AskBrent


What you want to do here is go run sp_AskBrent. This is not a stored procedure that's included with SQL Server. This is a totally free stored procedure that I give away. I was constantly dealing with people who would run into my cubicle and say, “Hey dude, can you tell me, is anything going on with the SQL Server right now? Is there a reason why it's slow?”

And I just gave them a stored procedure and said, here, go run this, and you can see whether or not there's anything going on on the SQL Server. You can get this now for free from Brentozar.com/go/download. There's a brand-new version as of today, or of course if you're watching the recording, a brand-new version as of yesterday, or whenever; it's flexible. It has all kinds of interesting parameters. If I just run sp_AskBrent by itself, what it does is take a five-second sample of your perf counters, DMVs, wait stats, all kinds of stuff. It takes a snapshot, waits five seconds, then takes another snapshot to tell you what's going on right now on your SQL Server. And of course, the way my SQL Server works, I have no problems right now. I guarantee you your SQL Server does not return those results.
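The walkthrough above boils down to a one-line call (the procedure name and download location are as he gives them; the exact output columns depend on the version you download):

```sql
-- Default run: samples perf counters, wait stats, and DMVs,
-- waits five seconds, samples again, and reports the difference.
EXEC dbo.sp_AskBrent;
```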

Troubleshooting demo


Now, we're not going to troubleshoot what's going on on your SQL Server right now. We're going to troubleshoot what's been going on on your virtual machine since you started it, so we have new parameters for sp_AskBrent: @Seconds = 0. Usually you would use @Seconds to tell me how long you want to sample, say a ten-second sample or a 30-second sample. We're not even going to take a sample. I'm just going to say: since the time this SQL Server started up, tell me what's been going on with it. And I'm also going to turn on expert mode. @ExpertMode = 1 returns a lot more information. So let's go ahead and run it and see how this thing works. I'm going to click Execute, and it returns immediately. That's because it doesn't have to go sample things.
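The parameter names follow the convention he describes; a sketch of that call:

```sql
-- @Seconds = 0 skips the live sample and reports cumulative stats
-- since the SQL Server instance last started.
-- @ExpertMode = 1 returns extra result sets (wait stats, file stats, and more).
EXEC dbo.sp_AskBrent @Seconds = 0, @ExpertMode = 1;
```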

It's just going to immediately take off and tell you what's going on with the SQL Server. The top result set gives you a quick picture of what's going on on the SQL Server right now: how many batch requests per second you're doing, how busy the CPU is, how many databases you have and what their total size is. For today's webcast, I'm not worried about any of that. What I'm worried about is something called wait statistics, and that's where the second set of results comes in.

Wait statistics


What SQL Server is doing is, any time it's running a query, it is tracking how long it waits on storage, CPU, network, locks, anything your SQL Server could be waiting on. It's tracking at all times. Now, it isn't telling me which queries waited on which things, it's just reporting: hey, I waited three seconds on CPU; hey, I waited three seconds on locking. What sp_AskBrent does here when I tell it @Seconds = 0 is say, “Hey, go back through the entire history of the SQL Server, as long as it's been up,” cause these statistics reset whenever SQL Server restarts. “Since you've been up, how many seconds have gone by, and how many seconds have I spent waiting on things?”
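Under the hood, the cumulative numbers he describes come from the sys.dm_os_wait_stats DMV, which you can also query directly; a minimal sketch (which wait types are worth filtering out as benign varies by environment):

```sql
-- Cumulative waits since the instance started (resets on restart).
SELECT TOP 10
       wait_type,
       wait_time_ms / 1000.0 AS wait_time_seconds,
       waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
ORDER BY wait_time_ms DESC;
```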

The wait type here tells me what it is that I've been waiting on, and these wait types are really cryptic. You look at the names and you're like, “What?” And you can Google for them and get all kinds of bad advice. It really all boils down to the common wait stats that I see in virtual machines, since we're talking about virtual machines; there's really just a handful that I see on a regular basis. PAGEIOLATCH means I needed to read a page off of disk. Everything in SQL Server is stored in 8KB pages, and they're the same 8KB pages whether they're in memory or on disk. If you go insert, update, or delete a row, SQL Server grabs that page off of disk, puts it in memory, makes the changes to it, and then writes that whole entire page back down to disk. A PAGEIOLATCH wait means SQL Server needed to read a page off of disk in order to put that page in memory. There are similar-sounding waits, like PAGELATCH or LATCH, but the three parts of PAGEIOLATCH all together, page plus IO plus latch, mean we're reading data from storage. Now, this is typically from SELECT queries, but it can also be from insert, update, or delete queries if we have to go read related data off of disk. And when people think about reading data from disk, some of them will go, “Storage is too slow.”

How to fix the problem


Maybe the ways that you fix PAGEIOLATCH are: make storage faster, add more memory to cache data, tune your indexes so that you read fewer pages, or tune your queries so that they don't need to read as much data. It's funny: database administrators usually see these 15-second I/O warnings and go, “Oh, storage is going really slow, my SAN admin sucks.” Turn this list upside down, cause the stuff at the top isn't under your control. You usually want to turn this list upside down and start by tuning your queries. If you can't do that, then you make the indexes match your queries, and if you can't do that, then you cache more data in RAM so that you don't have to hit disk all the time. The last resort is making storage faster, cause that's expensive. It's really hard to make storage faster, and it's not like it's suddenly going to be dramatically faster overnight, night and day.

Now, I see PAGEIOLATCH very frequently as a problem in virtual machines because often my VMs are undersized. They don't have enough memory, and I've got a whole bunch of VMs all crammed into the same shared storage. Today, Doug Lane, one of our consultants, was working with a client who had over a hundred virtual machines all sharing about a hundred hard drives. I'm going to go tweak a couple of things to reduce my network usage here, cause I'm hearing some people say, “Oh, things are lagging a little bit.” I'll drag those off and close everything else so we can go at this thing a little bit clearer; that might help, shutting down products that are also going at the same time. Cause I'm paranoid, right? Aren't you? You're a database administrator; you're probably paranoid.

SOS_SCHEDULER_YIELD


So, that's how I'm going to go fix PAGEIOLATCH, and I'm going to show you in demos how I'll go about playing around with that. That's not the only wait I see all the time in virtual machines. Another wait that I see all the time is SOS_SCHEDULER_YIELD, which happens to be one my VM is facing here. When I see that, I often also see CXPACKET. What both of these waits mean is that I'm dealing with queries that need to get on the CPU a lot. They need a lot of CPU time. SOS_SCHEDULER_YIELD generally means that one thread of one query wants more CPU time; CXPACKET is parallel queries dividing up their work. Now, I'm not going to talk about CXPACKET in this particular webcast. If you want to learn more about it, go to BrentOzar.com/go/cxpacket.

There's a couple of really simple CPU settings you can make that make a world of difference with CXPACKET. Those settings are up there. One attendee asks, “What does the CX stand for?” Class exchange: threads waiting for their class to exchange packets. I talk about that in a separate 30-minute video over at BrentOzar.com/go/cxpacket, if you're seeing that as your number one wait. The good news is that those are some simple performance tweaks that aren't going to cost you a lot. SOS_SCHEDULER_YIELD, though, means that we've got queries that need as much CPU power as they can possibly get, and they're not getting it. Now, what's interesting is that to fix SOS_SCHEDULER_YIELD, people often think, “Oh, I need to add more CPUs, or I need to move to faster CPUs,” when maybe they should tune their indexes or tune their queries.

SOS_SCHEDULER_YIELD gets this same exact list, and I'm going to tell you that you want to flip it upside down, especially in a world of virtualization. The idea isn't to constantly add more CPUs; CPUs are expensive. What I probably want to do instead is tune my queries and my indexes in order to make this stuff go faster. So let's start going into some demos on my particular SQL Server. My big waits are SOS_SCHEDULER_YIELD and CXPACKET, but I want to start by tuning just one specific query. I want to start really simply. So what I'm going to do here is give you a tour of my SQL Server. This is actually my personal SQL Server.

Stack Overflow


I'm running SQL Server 2014 on here. The demos that I'm going to show you work the same on older versions of SQL Server as well. I am using the Stack Overflow database. Stack Overflow is that questions-and-answers site where you go and post questions and other people do your hard work for you. Now, the cool part about Stack Overflow, one of the magical parts, is that they make the database available via Creative Commons. Anybody can go in and download that database. I've got instructions on how to get this script and the Stack Overflow database over at Brentozar.com/go/faster. It is about a 60-gig download and it involves BitTorrent, so if you're doing this at work, this is probably something you're going to have to do at home. You get a bunch of XML files with the contents of the database.

These are the actual questions and answers and users from the Stack Overflow database. This is so much more fun than screwing around with AdventureWorks; I never want to see a bicycle demo again as long as I live. Sometimes when I'm dealing with SQL Server demos, and I talk about this in the script in here, I'll go in and tie my SQL Server's hands behind its back in order to purposely slow it down as much as I can. I'm not doing that here today. I'm setting my parallelism options appropriately and I'm giving my SQL Server as much memory as I possibly can. So what I'm going to do is run a single query. We're going to go off and run this query, execute it. Let's pretend that my whole SQL Server's job duty is just to run this one query. Execute, execute.

It's pretty quick. It's not terrible; it goes off and does its thing. I might think, “Well, that's fine. I don't need to tune that query too much,” but I need to throw a lot more load at my little test lab here. So I'm going to copy out that query, because I want to have this thing run thousands of times, and I'm going to go fire up something called SQLQueryStress. SQLQueryStress is a totally free utility from Adam Machanic, a great MVP out of the Boston area. What it does is let you point it at any database server. I'm pointing it at my lab here, which I call PROD1. Of course, you shouldn't point this at a production instance.

I can pick which authentication method I want to use, like SQL or Windows logins, and then I can point it at a specific database. I'm going to point it at my Stack Overflow database, and then I'm going to click Test Connection, cause I really like it when I pass tests. Then I'm going to click OK and paste in the exact same query that I was running. I'm going to set it to run 10,000 times across 25 threads. I can't even keep a straight face; I love doing load tests. Then I'm going to click the Go button. Bam. Now I can see that this thing is off and running, and I can see how many iterations it's doing. This is so wonderful for load tests, when I need to run the same query lots of times. Now, if I go off and look at Task Manager, under Performance, you can kind of see where I'm going to be having some problems.

This is good for load testing. I want to drive my SQL Server right to the edge, but I need to know what's going on on the SQL Server. What are the bottlenecks holding this thing back from going faster? So, if I go... Salaam just asked, “I just tuned in, can you please tell me what he's doing?” No, Salaam, this is why you should be on time. This is why you're not succeeding in life. Now, what I'm going to do here is use that sp_AskBrent procedure again. I'm going to say go monitor for five seconds, and just for laughs I'm going to turn expert mode on, cause I'm pretty smart and I can deal with it. Now when I run sp_AskBrent, it's going to take a sample, wait for five seconds, then take another sample of things like perfmon counters, wait stats, and DMV queries.

When your SQL Server is really heavily under load, this may take longer than five seconds to run, because it has its own processing to do too, and of course you know this thing is under load, cause you saw how bad my CPU was. So what does sp_AskBrent come back and tell me? “Hey buddy, you have a problem with high CPU utilization,” and it gives me a little more detail there. At the moment it took the sample, the CPU was running at around 71%. It also tells me what my high waits are. Let me go kill this query just so you guys don't have to deal with hearing my CPU fan go off. So it tells me that on this VM, my big wait stat was SOS_SCHEDULER_YIELD, and in the 11 seconds that we took the sample for, we had 843 seconds of CXPACKET.

Wait, I know what you're thinking: “Brent, your math sucks. That can't possibly be right, because you have more than 11 seconds of waits.” Well, remember, my VM has four cores, each of which could have hundreds of queries all waiting for CPU, all piled up like a clown car, so that's exactly what I have going on here. I have the one query running across 25 threads, and it is just absolutely hammering the CPU. Another interesting thing about sp_AskBrent is it will also tell you which files you're reading and writing from. I'm not even reading or writing from any files during that sample. Everything that I need is in RAM; I'm just churning through CPU as hard as I possibly can. So how do I go about fixing that? Well, let me scroll down a little further.

If I look at the execution plan for this query, if I display the estimated execution plan (I used a keyboard shortcut, but of course you can click Query, Display Estimated Execution Plan and get the same result), SQL Server says, “Man, I really need an index, because I constantly have to go through and scan this entire Users table, and it's a whole lot of work.” Okay, well, hmm, that kind of makes sense. I should index for my queries in order to reduce my CPU use, but that's not always the problem. This is what gets kind of interesting with SQL Server. Let's go look at that Users table in Stack Overflow. I'm going to go over to Object Explorer, expand the Users table, and expand the indexes. Hey, you don't have the index I really need.

Create a nonclustered index


Alright, let's go create it and see what a difference it makes: a nonclustered index. Now, you don't really want to leave it with the default name. I like to name my indexes after the fields they're on, so LastAccessDate, and “Includes,” cause I'm also including the Reputation field on that index. I should also tell you, just as a side thing: whenever you're dealing with vendor databases, which often happens in virtualization, what you end up doing is putting your own initials in the name, like a DGO index, so that when the vendor says, “Hey, you're not supposed to have any indexes on here,” I can delete all of my indexes really quickly and I don't have to wonder which ones were mine and which were the vendor's. I'm going to go ahead and execute that, and while it goes and creates the index, I'm going to keep talking.
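The exact CREATE INDEX statement isn't shown in the transcript; assuming the Stack Overflow Users table schema, the index he describes (keyed on LastAccessDate, including Reputation, with owner initials in the name) might look something like:

```sql
-- "DGO" prefix marks this as our index, not the vendor's.
CREATE NONCLUSTERED INDEX DGO_LastAccessDate_Includes
ON dbo.Users (LastAccessDate)
INCLUDE (Reputation);
```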

The other thing you can do here is put your company name in there. For example, if you work for ACME Enterprises and this is an application from somebody else, like SharePoint, you can put ACME in there, and then you'll immediately be able to identify your indexes versus somebody else's. Or if you're trying to get somebody fired. Let's say that I want to get Don Calvin fired; he's one of the attendees. I can say DC, and whenever the vendor asks what's going on, I can be like, “Well, you know, that Don Calvin, he's always adding indexes to our tables.” We're going to ignore that, so we're going to close that, and then I'm going to haul off and run my query again, or go get my execution plan again, and now, presto, it does an index seek.

Well, that's pretty awesome. That's a lot better than before. Actually, it's not, but I wanted to show you that. I'm going to cheat, because I should have had another index hiding there, lying in wait. Everybody talk amongst yourselves here, cause I'm going to go cheat: drop index. I usually have a stored procedure that drops all my indexes, but I don't want to run that today; that would be too bad. Let's create a different index, just on LastAccessDate. So, go create the index. Nothing to see here, move along. Close. Oops, let's go look at our execution plan again. So even though I have an index, and I've kind of cheated a little bit here, even though I have an index on LastAccessDate, SQL Server is choosing not to use it.

SQL Server


See, sometimes SQL Server makes bad decisions. Sometimes SQL Server decides to ignore indexes or use the wrong indexes. One of the things I love about Microsoft is they've been really improving execution plans over time, but they don't always turn those features on for everybody by default. They would rather have you get predictable execution plans than constantly changing execution plans. So they've made improvements that aren't turned on, even though I'm on a brand-spanking-new version of SQL Server. So if I go down a little further, I'm going to show you something kind of slick. Let me fire off that load test query again, just so that you can verify this thing still doesn't go fast. Even though I added an index, SQL Server still isn't going fast, and look at my CPU. Okay, so what I was saying was: trace flag 4199. It makes a difference in how execution plans get built.

When I turn this on, it turns it on for everybody, for the entire SQL Server. Now, all I did was turn on this trace flag, and I'm going to go start my load test again. You remember how this was going kind of pokey. Let's hit Go and check out how much faster those numbers are going. Now let's do something a little different as well, and I've got to move a little quickly so that my load test doesn't finish before I do. It absolutely takes off. Before, this thing was only moving up a few queries at a time. Now it's running up hundreds or thousands of queries at a time. It will actually finish that whole load test in a matter of seconds.
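Turning the flag on server-wide, as he does here, is one DBCC command (note it affects every query on the instance, and it doesn't survive a restart unless you also add it as a startup parameter):

```sql
-- -1 applies the trace flag globally rather than to this session only.
DBCC TRACEON (4199, -1);

-- Verify that it's enabled:
DBCC TRACESTATUS (4199);
```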

So, what's the difference now? If I look at sp_AskBrent, what's the difference in my waits? The difference is that SOS_SCHEDULER_YIELD completely disappeared. Whoa, what is this trace flag 4199? What is going on? If I go and look at the execution plan for that query, suddenly SQL Server is using a different plan. Now SQL Server is doing an index seek immediately followed by a key lookup. I know what you're thinking immediately; let me put my notes up here so you guys who are visual can see it. “Hey dude, let's turn on trace flag 4199 everywhere!” Well, the thing is, this is a targeted fix that works for some queries in some situations, not all queries across the board, and some queries can actually get slower rather than faster. It is a new way of building execution plans, and there's some risk involved with that.

You want to know that it's out there, and I've got notes linking to a KB article with an example of a bug that's involved with 4199. Alberto asks, “What if I used a query hint instead?” There is an option I can use at the end of my query, OPTION (QUERYTRACEON), that will let me force on a trace flag for one particular query. That's cool if you can do that, but I'd rather you just fix the queries rather than turning on trace flags in individual queries. Let me show you instead a better way to get the improvements that are in trace flag 4199. What I've done is I've turned off trace flag 4199, cause I'm just not really a big fan of running trace flags.
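The per-query hint Alberto asks about looks like this (the query itself is a hypothetical stand-in; QUERYTRACEON scopes the flag to a single statement and requires appropriate permissions):

```sql
SELECT Id, DisplayName
FROM dbo.Users
WHERE LastAccessDate >= '2015-01-01'
OPTION (QUERYTRACEON 4199);  -- trace flag applies to this statement only
```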

Compatibility mode


What I can do instead is right-click on the database, click Properties, and up at the top of the Options list for this database, there's this compatibility mode switch. For years you've been playing around with it, and you never really understood what it did, and you switched it back and forth the way, you know, Trogdor does the light switch rave. I guess it's not Trogdor; that's a long story. Anyway. Ah, see, now I've gone Homestar Runner; now I want to go tell all kinds of Homestar Runner jokes. For the longest time, up until SQL Server 2014, this didn't really do anything. It didn't really make a difference in the way execution plans get built. But starting with SQL Server 2014, just by switching your database into 2014 compatibility mode, you get the improvements that were in trace flag 4199, without having to enable trace flags.

And without having to screw around with your queries. So now, if I just change my compat mode to 2014 and say OK, and I go back and run my load test again (whoops, wrong machine there), now if I run query stress again and hit Go, I get the super-fast execution plans that used to require trace flag 4199. Now I don't have to worry about what my vendor says in terms of compatibility; as long as they say you can run on SQL Server 2014, I get magically faster execution plans, most of the time. Like trace flag 4199, your mileage can vary dramatically with this. Sometimes things get better; sometimes they get a lot worse. And so what I've got in here is a list of explanations.
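The same switch he flips in the Properties dialog can be done in T-SQL (the database name here is assumed; 120 is the compatibility level number for SQL Server 2014):

```sql
-- Switching to 2014 compatibility enables the new
-- cardinality estimator for this database's queries.
ALTER DATABASE StackOverflow
SET COMPATIBILITY_LEVEL = 120;  -- 120 = SQL Server 2014
```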

List of explanations


Let me show you down here. So this paragraph right here: if you are on SQL Server 2012, trace flag 4199 can make an amazing difference with certain execution plans. If you're unable to tune things via indexing or query tuning, make sure that you go off and try this. But really, once you get onto SQL Server 2014, now the party starts, and you can use the brand-new cardinality estimator to suddenly get better execution plans in most cases, for most workloads. You've got to be careful with that, cause sometimes it makes things worse instead of better. What does this have to do with virtualization, you might be asking? Virtualization doesn't change query tuning. You still have to do your homework. You still have to find the execution plans that are going slow. But first you have to go off and look at sp_AskBrent to say: tell me what my waits have been since the SQL Server started. What is it that I've been waiting on? What caused those waits? How can I influence my query plans to be better?

Reporting workload


So, the next thing that we're going to look at is a reporting workload. We were looking at a really tiny transactional workload in the beginning. What we're going to look at next is this reporting table I've built on top of Stack Overflow, cause I like to do all kinds of analysis about tags and about users: which ones are answering which kinds of questions, which questions I should focus on answering. I'm going to show you the results of one table. I'm just going to do a big old SELECT out of one of my tables, and I've got a table that might look kind of like your own data warehousing stuff: a couple of IDs, and then all kinds of fields like answers this year, answers last year, answers total, comments this year, comments last year, questions this year, and so forth.

This is really common for data warehouses, where you have all kinds of fields in a table and people query it in all kinds of crazy ways. I'll give you an example. Let's go through and look at some of the reporting queries that I do. I'll show you here, and I'm going to start executing it; while it runs, I'm going to talk through it. It goes through my reporting table, groups my stuff together by tag, and for the popular tags, tags that have more than a thousand active users, tells me which ones have the oldest users. So across all of Stack Overflow, which tags have the oldest users in them? Because I'm getting a little old, I have gray hair now. What else do gray-haired people like to answer?
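His reporting table and column names aren't shown on screen, but the query he narrates (group by tag, keep tags with over a thousand active users, rank by average user age) would be shaped roughly like this hypothetical sketch:

```sql
-- Hypothetical schema: dbo.TagReporting(Tag, UserId, UserAge, ...)
SELECT Tag,
       COUNT(DISTINCT UserId) AS ActiveUsers,
       AVG(UserAge)           AS AvgUserAge
FROM dbo.TagReporting
GROUP BY Tag
HAVING COUNT(DISTINCT UserId) > 1000
ORDER BY AvgUserAge DESC;
```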

And this is really typical for a reporting query. You see these things take lots of time; it's usually normal for reporting queries to take 30, 60, 90 seconds. People go, “Man, how can I make these kinds of queries run faster?” We'll let this thing finish. While it runs, I'm going to go look at... oops, we just finished! Alright, so this query just finished, and like you would probably guess: the tags legacy, Delphi, Visual SourceSafe, IBM midrange, legacy code, COBOL. These are all very old technologies, so only us geezers go off and work in those particular tags. The people in them have an average age of 37. Damn it, I'm even older than the average mainframe user on Stack Overflow. Man, it sucks to be getting old. So the query took about 51 seconds to run.

That's pretty usual for reporting queries. We'll look at one other one just as an example: which tags' users have the lowest reputation? So we go out and group things by tag again, but now find me the ones who have the worst reputation, people who answer questions but nobody seems to like their work. I know, man, I don't like answering questions about Microsoft Access; I would have a low reputation there. Again, this query is going to take 30 to 60 seconds, and I want to watch it with sp_AskBrent while it runs. Let's go look at sp_AskBrent for a second. Oh, and of course, dang it, the other query finished already; we'll go run it again here in a second. On this one you can see CXPACKET. While this query ran, CXPACKET is parallelism: SQL Server branching out lots of threads in order to get our work done.

CXPACKET doesn't necessarily mean you have a parallelism problem. It often really means that you're just doing too much work, and I can certainly empathize with that. It's taking 30 seconds to run my query; that is too much work. So while we've got it here, we can just go look at our users: who has the worst reputation? Android people, iOS, Objective-C, Ruby on Rails. Hey, .NET doesn't come in until like number 10; that's kind of cool. So this is normal for data-warehousing types of queries. This stuff is really hard to make go fast. So what we're going to do is scroll way down, and I'm going to create something called a nonclustered columnstore index. Usually when I show you how to create an index, you would look at this list of fields and go, “Ooh, no one should ever create an index on that many fields.”

Columnstore index


And most of the time you would be right, but a columnstore index is a little bit different, and while it goes off and creates, I'm going to explain how it works. A columnstore index is really an index on every field individually. Each field has its own index, just by itself. Normally this doesn't make any sense, but it makes sense for data warehousing kinds of queries, where you have a really wide, fat table with lots of columns in it. Your users don't query all of them at the same time; they only query a few different columns individually, but I can never predict which columns they're going to be, or how they're going to be grouped or sorted. It's really kind of funny stuff. And while this thing runs, cause it's going to take a few minutes to create this index, that really matches up with the kinds of reporting queries that I'm doing.
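The nonclustered columnstore index he's creating covers the wide list of reporting columns; a sketch, using hypothetical column names matching the table he described earlier:

```sql
-- In SQL Server 2012, adding this makes the table read-only
-- until the index is dropped or rebuilt.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_TagReporting
ON dbo.TagReporting
   (Tag, UserId, AnswersThisYear, AnswersLastYear, AnswersTotal,
    CommentsThisYear, CommentsLastYear, QuestionsThisYear);
```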

Let me jump way back up there. So if I look at these queries, some of them are grouped by tags, some are grouped by user ID, by display name, by location, or by reputation. Their ORDER BYs are different, their counts are different; it's all over the place. This is what data warehousing is really like. Now, if I were a really good DBA, I would design perfect indexes for every one of these queries. Nobody has time for that; that just doesn't work. So instead, this nonclustered columnstore index that I'm creating works perfectly for reporting queries that slice and dice across all of my columns. Now, I will also say: notice how long this is taking. This is part of the drawback of columnstore indexes. In SQL Server 2012, columnstore indexes were read-only. As soon as you made a nonclustered columnstore index on a table, you froze that table.

It couldn't be inserted, updated, or deleted. Right now you're thinking, “Brent, who would ever put that kind of index on a table?” It's perfect for data warehouses, where we just do a bunch of loads at night and then people run ugly queries all day long. Still, people don't like that a lot. What they like is that in SQL Server 2014, there are clustered columnstore indexes, and you can insert, update, and delete across those, so you can put them across all kinds of tables. But not OLTP tables; it's not really for transactional stuff. This is still really for reporting workloads. Now, let me put my notes on this up here: here's where columnar indexes make sense for your virtual machines. When your chief wait type is PAGEIOLATCH, when I'm having to constantly read a lot of pages from disk, and my tables are very wide, and my report users don't query all the columns, they sort and group by different columns all the time.

And, for SQL Server 2012, where I can afford to rebuild this index nightly. Starting with SQL Server 2016, there are even more improvements to column store that make this thing run even faster. But as we can see with this fricking VM and this slow ol' demo, it is taking a long time to run, because of course creating a clustered column store index or non-clustered column store index on this is pretty slow.



So, I'll go and look at the questions while this thing finishes up. Don asks when column store indexes came out; they only came out in 2012, as we talked about earlier in the pre-show. If performance really matters to you, it's really important to be on current versions of SQL Server, AKA 2012 and 2014. This is just kind of what table stakes cost in order to do high performance databases. Oh, it finally finished. All right, so three minutes, 44 seconds later, let's go off and run our queries. Now, before, these things were taking maybe 30 to 60 seconds to execute. Boom, it's done. I'm gonna do that again just cause it's kind of neat to see. But it's done. Let's go try the next one.

Which tags have the lowest reputation? Boom, boom, boom, boom, done. See, this is why I can't go off script. And then we'll even get to run the one that we didn't have time to run before, because it was taking so long, and boom, it's done. This is miraculously amazing, especially for when companies tell us we have to put our data warehouses inside VMs and we don't have any choice. This is one of those instant, magical, awe-inspiring experiences. So, for my guy from Veeam®, make sure to raise your hand and tell me in the notes; I don't see you in here, so I don't know where you're at, and I don't know when he's going to come in and do his demo. Otherwise I'm just going to keep right on going and we're going to keep talking SQL Server, cause I have a ridiculous amount of demos. And I should actually stop and do questions here too, cause there's a lot of good questions.

So, Gregory says, "what are some best practices for virtual machine CPU setup and data store LUN setup?" For CPU, I want to have as few as I can get away with, starting with four. I don't go any lower than four because that's the smallest amount you can license for a virtual machine, unless you're licensing at the host level. If you're licensing at the host level, then the smallest I would go is two. I know there's crappy books out there that say just start with a single processor. Yeah, that's cool, as long as you don't need to do backups or DBCC. I kind of like to be able to do a backup or a DBCC while my VM is still running. Kishore asks, will trace flags cause any delay on the server? He's asking about 3605, 1204 and 1222. Every trace flag you turn on is a different branch of the SQL Server code.

Typically, these branches do more work, so yes. 1204 and 1222, for example, are for tracing deadlocks. If you are not actively troubleshooting deadlocks, turn those back off. Let's see here, next question. Kishore asks, for most of our tables, if I try to put on column store indexes, will there be a delay for my inserts and updates? Yes. Any index you add is going to cause what I call DUI operations (delete, update and insert) to go slower, so I want to add only enough indexes to support my workloads, no more. Generally speaking, I aim for five or fewer indexes, each with five or fewer fields in them. You can totally go higher; at Stack Overflow, we do go higher. There are some tables that have dozens of non-clustered indexes on them, but of course that's because they have all solid-state storage and there's not really that much of a delay for doing inserts, updates and deletes.
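For reference, turning the deadlock trace flags he mentions on and off looks like this; these are the standard DBCC trace flag commands, with -1 meaning the flag applies server-wide:

```sql
-- Turn on deadlock reporting to the error log while actively troubleshooting.
DBCC TRACEON (1222, -1);   -- detailed XML-style deadlock output

-- When you've finished troubleshooting, turn it back off so the extra
-- code branch stops doing work:
DBCC TRACEOFF (1222, -1);

-- Verify which trace flags are still enabled globally:
DBCC TRACESTATUS (-1);
```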

Scott says, "how do you deal with application vendors that won't let you tune their queries?" I have a daily battle with this. You're going about the battle backwards. You have to use the carrot instead of the stick. This is what they talk about with motivating horses and donkeys. You could put a carrot in front of them and kind of lead them on forward, or you could beat their backend with a stick. There are times when you have to beat somebody's backend with a stick, but that doesn't work with vendors. The only thing that works with vendors is carrots. You have to say, "Hey, check this out. I found this really cool trick to make this query go faster. Here's the before execution plan, here's the after, here's how I can prove that it made a difference. Here you go, hope you enjoy that. You don't have to credit me or anything."

I just wanted to drop that off to let you know, cause I happened to be doing query tuning on something else. And then shut up. That's the hardest part: shutting up. You can't say, "you suck as a vendor, you gotta do your code differently." You've gotta be, "Hey, here's some free gifts, here's some free candy," and constantly be upbeat and positive every time you work with them. Guess what? They'll actually hire you for consulting work if you're nice about that. Let's see here, do do do. Orson asks what happened to the Blitz app; all we have is stored procedures out there. Yeah, we only do stored procedures now, not Windows apps, cause it was getting old troubleshooting setup and dealing with everybody's Windows apps.

Tim says, "can you address hyper-threading? It'll be good for folks to hear in a…" blah blah blah. Tim, ask me questions; don't put paragraph-long rants in there. So, generally speaking, I am a big fan of hyper-threading. Hyper-threading doubles the number of virtual processors that are available to you at the host level. You do have to understand the CPU cache, though. Say I've got a quad-core processor that happens to have four megs of cache on it, and each core gets a meg worth of cache passed through virtualization up here. However, when I turn on hyper-threading, now I have eight virtual processors instead of four, and each virtual processor gets half the amount of cache. So there are situations where that can backfire on you. Sean says, recently we increased memory on our virtual machine, and instead of getting better, performance deteriorated. What could have gone wrong?

Oh man, there's a laundry list of things that could have gone wrong. The first thing you've got to do is step back and ask, what is the big wait type that we were facing? Let me jump back up to the very first query here. So you've got to go get sp_AskBrent from BrentOzar.com/go/download and look at the top wait. If you run it with @Seconds = 0, it will tell you how much wait time you've had since your virtual SQL Server restarted. Look at the top wait; adding memory has to be tied into whatever that wait is. There are some waits, like SOS_SCHEDULER_YIELD on my screen here, where memory won't make a single bit of difference. Justin says, "was sp_AskBrent the only one updated today?" Yes. The rest of them are a few weeks old at least; we tend not to update them all at the same time.

Let's see here. Mitch says, "I'm having trouble finding best practices on backing up the VMware vCenter SQL Server." Yeah, same as a user database. What I try and do is back it up with transaction log backups off to a file share. I don't want to back it up to local drives; I want to back it up to a file share that lives on different storage than my virtual machines. Very important, cause if I lose my SAN, I do not want to lose my vCenter databases. Back it up to a different file share. I am a big fan of doing log backups every minute. "Every minute? Brent, you're crazy." I am crazy, but that has nothing to do with what's going on right here. Backing up every minute doesn't cost you any extra; it doesn't add more load. Think about it: if you put a certain amount of data in the database every minute, would you rather wait and then kick your SQL server in the junk by trying to back it all up, say, every hour, or would you rather nibble off parts of the backup by doing those backups every minute throughout the course of the hour?
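A minimal sketch of the every-minute log backup he describes, assuming a hypothetical vCenter database name and UNC path; in practice this would run as a SQL Agent job scheduled every minute:

```sql
-- Build a timestamped file name so each minute's backup is a separate file,
-- written to a file share on different storage than the VMs themselves.
DECLARE @file nvarchar(400) =
    N'\\BackupShare\SQL\vCenterDB\vCenterDB_log_'
    + REPLACE(REPLACE(CONVERT(nvarchar(19), SYSDATETIME(), 126), N':', N''), N'-', N'')
    + N'.trn';

BACKUP LOG [vCenterDB] TO DISK = @file WITH CHECKSUM;
```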

There's usually much less overhead when we back up more frequently. The next thing people tell me is, "but Brent, that'll create all kinds of transaction log files. I can't point and click my way to restoring all those." Yes, put your big boy pants on, because when it comes time to restore mission-critical databases, you do not use the mouse. Use a script that just restores all of the databases in a folder, no matter how many backup files there are. If you go to our site and search for backup, we've got links to scripts. There's a good one out on MSSQLTips that shows you how to do that.
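The scripted restore he's describing follows this general shape. Names and paths here are hypothetical; the scripts he points to generate the middle section automatically from whatever files are in the folder:

```sql
-- Restore the most recent full backup, leaving the database in RESTORING state.
RESTORE DATABASE [vCenterDB]
    FROM DISK = N'\\BackupShare\SQL\vCenterDB\vCenterDB_full.bak'
    WITH NORECOVERY, REPLACE;

-- Apply every log backup in order, oldest first, still with NORECOVERY.
RESTORE LOG [vCenterDB]
    FROM DISK = N'\\BackupShare\SQL\vCenterDB\vCenterDB_log_20150601T1201.trn'
    WITH NORECOVERY;
-- ...repeat for each remaining .trn file in sequence...

-- Bring the database online once the last log has been applied.
RESTORE DATABASE [vCenterDB] WITH RECOVERY;
```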

Let's see here. Mike says, "if you back up logs every minute, won't that make your msdb backup tables explode in size?" Yes, but you should be cleaning those out and only keeping 14 days' worth of history in there anyway, so it doesn't get that unmanageable. And he says, "won't that make the indexes explode with fragmentation?" Fragmentation doesn't matter. Okay, I'm making an overstatement there: internal fragmentation matters, external fragmentation does not, especially in a virtualization world where all of your reads and writes are random anyway. Internal fragmentation, free space on pages, that's a different story. Mark says, "does the use of transaction replication impact your advice to do log backups every minute?" Transactional replication is one of those things we were talking about before the webcast started. Actually, often when I do these webcasts, people tune in like 15 minutes early and I do Q&A during startup.
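Keeping msdb's backup history trimmed to 14 days, as he suggests, can be done with the built-in cleanup procedure, typically scheduled as a nightly Agent job:

```sql
-- Delete backup and restore history older than 14 days from msdb.
DECLARE @oldest datetime = DATEADD(DAY, -14, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @oldest;
```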

Usually when I find transactional replication, somebody goes, "well, it's because we have more workload than the SQL server could possibly support. We have big data," and then we go in and look at it… nobody's ever tuned indexes, and it's like a 100-gig database that could fit in RAM for like a thousand dollars' worth of RAM. We go, "why don't you just do that, and then you cut your licensing costs in half?" Whew. So, in terms of, does it affect my advice? Yeah. I usually tell people, "step back and tell me what you're really using replication for." John says, "as a VMware admin, what would you recommend me to look at for memory utilization requirements? We don't really have a good DBA and I have inherited some of that responsibility." Doesn't that responsibility suck? Okay, so here's the deal. If you ask any SQL Server DBA how much memory they need, they don't know either.

I know all these people in this webcast, they're trying to look really smart, and they're like, "I know. I would know that." No, you don't. I deal with DBAs all the time, and all they ever say is, I want more memory. Often you don't need more memory. If you are a VM admin and you're facing performance problems on an existing virtual machine, run sp_AskBrent with @Seconds = 0 and expert mode turned on. When you run it, down in this wait stats section, if your biggest wait is PAGEIOLATCH, this means we are waiting to read data from disk. If your biggest wait type is PAGEIOLATCH, and users are complaining, and you can't tune the indexes, which is pretty typical when you don't have a DBA, right? Add more memory to that virtual machine. So how much memory is enough?
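The diagnostic call he describes looks like this, assuming the parameter names used by the sp_AskBrent versions of that era:

```sql
-- Waits accumulated since the SQL Server service last restarted,
-- with the expert-mode detail result sets (including wait stats).
EXEC dbo.sp_AskBrent @Seconds = 0, @ExpertMode = 1;
```

Look at the top wait in the wait stats result set; if it's PAGEIOLATCH and you can't tune indexes, more memory for the VM is a reasonable next step.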

This is one of those things where even DBAs can't give you a really good answer. My usual answer is, the smallest SQL server I would ever build is 32 gigs of RAM. "32 gigs of RAM? Brent, you're crazy." Hold up a second. Remember how much SQL Server costs. Standard Edition, the el cheapo edition of SQL Server, is $2,000 US per core, and the smallest amount you can license is four cores, which means the smallest SQL Server you can build is $8,000 just for licensing. Now, the scenario gets a little bit different whenever you license at the host level; there, you can pile them all in like a clown car. But if you don't have a DBA, I'm going to keep things simple, cause I bet you're licensing by the guest. $8,000 worth of software: you don't run that on a machine that has less memory than my MacBook here on the desk. This thing costs about a thousand dollars. Don't run it on something that's got less memory than my laptop, and that's my smallest laptop; I've got a bigger one over there that has more. So that's why I wouldn't go less than 32 gigs of RAM, as long as you're licensing by the guest.

So, let's see, last question we'll do before we bail out here. Gregory asks for suggestions, based on VM settings, on how to lower CXPACKET. And that's probably a good point for us to stop. So, if you want to go learn more about that, go to BrentOzar.com/go/faster and hit enter. I've got a whole page there dedicated just to showing you how to make slow SQL Servers go faster, and we've got lots of information in there on how to do query tuning and how to reduce different types of waits. I have a section in there just on CXPACKET, and I'd also tell you to go watch Brent tune queries. So, if you scroll down through some of them, there's a section on performance tuning SQL queries. There's a whole video that I did over at SQLRally in Copenhagen, I think, on how to reduce CXPACKET as part of that as well.

You get to watch me go through and tune the queries to reduce that. So, we'll wrap things up here. I want to say thank you to Veeam for sponsoring today's webcast. Veeam puts out products, like their backup products for SQL Server, that are fantastic. What these things will do is back up all of your databases on the SQL Server, and they'll live in combination with native backups and with log shipping; they work with log shipping as well, so they'll coexist peacefully alongside those. When you want to pluck an object out of your backup, Veeam has that capability. Most often, when I'm a database administrator, the request I get isn't to restore the whole server; it's just, go get me one table out, cause somebody hosed up a table or a set of records or a stored procedure. No one actually wants to restore the whole server.

Veeam has the ability to pluck individual objects out. They can spin up that SQL Server totally isolated, pluck individual objects out, and drop them right back into your SQL Server. They can even put them in under different names, so you can set them side by side and then go pick out exactly which rows you want to deal with afterwards. Way better than dealing with native SQL Server backups, especially if you're in a shop where you only have sysadmins and VM admins and no full-time database administrators. You can use one pane of glass to manage all of your backups instead of hassling with native SQL Server backups. So, thanks everybody for joining me today. Hope you had fun. Thanks to Veeam for sponsoring it, and I will see you guys on our next webcast. Bye everybody.

Duration: 50:38