Resource Library


Author: Paul Szelesi, Manager Solutions Architects, Regional Technical Sales EMEA

Veeam® Backup & Replication™ suits all businesses, from a single virtual machine to thousands of virtual machines, and the management server is crucial to operating it with maximum efficiency.
This paper will help you understand how to reduce risk on the management server using high availability and Veeam’s built-in processes, ensuring continuity in protection.

Download this whitepaper to learn:
  • How Veeam protects your business in the event it all goes wrong, using:
         - SQL Server AlwaysOn Availability Groups offering high availability across your infrastructure
         - Veeam configuration backup files
  • How to recover from a failed management server
  • Disaster recovery (DR) with Veeam: how to protect your management server across multiple sites
White Paper
June 07, 2019
Author: Tibor Karaszi, Consultant, SQL Server expert, MVP

The purpose of this paper is to describe SQL Server backup in general, as well as the options you have for backing up your Microsoft SQL Server databases in conjunction with Veeam®. The paper is targeted both at the SQL Server DBA and at the backup operator who may have more experience with Veeam and less with SQL Server. This is not intended to be a reference paper covering all the options in the graphical user interfaces (GUIs) or the SQL commands. For that, please refer separately to the Veeam and SQL Server documentation that accompanies each product.

Transaction logging and management of the transaction log

SQL Server supports transactions. Every modification is logged in the transaction log before the modification is performed on the actual data page. The transaction log lives in the ldf file(s) of the database. Please see the first paper in this series for a more detailed discussion of storage architecture and transaction logging.

Ultimately, it is the DBA’s responsibility to make sure the transaction log doesn’t fill up the disk, as log records are generated for our modifications.

Virtual Log Files (VLFs)

The transaction log file (or files) is internally divided into Virtual Log Files (VLFs). This is done automatically by SQL Server, and a DBA typically does not have to be aware of VLFs. There are some disadvantages to having “too many” VLFs, which can happen when the ldf file has grown frequently in small increments. Operations such as startup and restore of the database can be slower with many VLFs. Search the Internet for terms such as “VLF” and “shrink” and you will find details on how to determine whether you have too many VLFs and how to properly manage them.

So, think of the ldf file internally as a series of VLFs. A VLF can be in use or it can be free for SQL Server to use (slightly simplified, but enough for our purposes). Also, imagine SQL Server having a series of log records with a head and a tail. When the head reaches the end of the current VLF, SQL Server has to find a VLF that it can use. If all VLFs in the ldf file are in use, then the ldf file has to grow – or if it cannot grow, then the modification will return an error message and fail.

What you need to do is make VLFs reusable. We sometimes refer to this as “truncate the log,” or as I prefer to say “empty the log.” However, technically, we make SQL Server mark as many VLFs as possible as OK to use – as free, reusable, or “OK to overwrite”.

The recovery model setting

A database option called the recovery model governs management of the transaction log. The available models are full, simple and bulk logged. Most installations and databases are in either simple or full recovery model. The default value – what you get when you create a database – is inherited from the model database, which by default is in full recovery.
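As a minimal sketch (the database name MyDatabase is hypothetical), you can inspect and change the recovery model like this:

-- Check the recovery model of every database on the instance
SELECT name, recovery_model_desc FROM sys.databases
-- Change the recovery model of a (hypothetical) database
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE
ALTER DATABASE MyDatabase SET RECOVERY FULL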

Simple SQL Server recovery

This recovery model is designed to be used when you do not perform backup of the transaction log of the database. In simple recovery, it is not your responsibility to “empty the log” (or “truncate the log”), as SQL Server will do that for you. However, you can still end up with large ldf files due to long-running transactions and problems with the log reader when using transactional replication. Since SQL Server will truncate the log for you, you cannot perform backup of the transaction log – the BACKUP LOG command will return an error if you try.

The log is typically truncated when a checkpoint occurs (see the first paper), which happens automatically now and then. You can even force this using the CHECKPOINT command.

Full SQL Server recovery

In full recovery, it is your responsibility to truncate the log. This happens when you perform a backup of the transaction log, i.e., the BACKUP LOG command will truncate the log after producing the backup. It is worth mentioning that other backup types (full, differential, snapshot, etc.) do not empty the log – only a log backup will do this. If you are in full recovery and do not take log backups, the log file will continue to grow until it reaches its maximum size or the disk is full. There is a setting in Veeam that will make Veeam empty the transaction log after producing its snapshot backups, and essentially manage the transaction log for you, even if you are in full recovery. I will discuss this in more detail later. However, it is important to note that you will not want to use this setting in Veeam if you produce your own log backups (outside of Veeam).
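To illustrate (the database name and file path are hypothetical), a log backup in full recovery is simply:

-- BACKUP LOG truncates the log after producing the backup
BACKUP LOG MyDatabase TO DISK = N'D:\SQLBackups\MyDatabase_log.trn'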

Bulk logged SQL Server recovery

Bulk logged recovery is not commonly used, but for the right situation it can be valuable. In order to explain it properly, we first need to explain minimally logged operations. There are some operations that can be logged in a minimal fashion in the transaction log. One such operation is mass-loading data into a table, such as importing rows from a file. This is usually referred to as bulk loading data. Imagine you need to import one million rows of data from a file into a table. If fully logged, this operation will log at least one million log records – or two million, or three million, etc., as each index on the table is also maintained and reflected in the transaction log. In full recovery model, all operations are fully logged; there are no minimally logged operations. However, in simple or bulk logged recovery, these operations do not log the actual modifications of your data, but only the fact that storage is allocated (basically “now this extent is used by this table,” and so on).

In bulk logged recovery, these operations can be performed as minimally logged operations and you can still produce a log backup after those operations. Such a log backup will include not only log records from the ldf file, but also the data (extents) modified by the minimally logged operations. However, you can only produce such a log backup if the data files are available (having the data files available is not a requirement for a “normal” log backup). Also, you cannot restore this type of log backup to an arbitrary point in time using the STOPAT option of the RESTORE LOG command.

The other operations that can be minimally logged, besides bulk loading of data, are SELECT INTO and the create, rebuild and drop of indexes.

In the end, deciding which recovery model to use isn’t particularly difficult, if we leave bulk logged aside. If you are going to produce log backups, use full. If not, use simple. Or, if you only want Veeam snapshot backups, you can let Veeam truncate the log for you after each snapshot has been produced, if the database happens to be in full recovery.

SQL Server backup types

Full SQL Server backup

A full SQL backup includes everything in the database. SQL Server copies all of the data in the database’s data files (all extents) to the backup destination, which is typically a file. Changes made to the data while the backup is running are reflected in the transaction log, and when all data (all extents) has been copied, SQL Server also copies the log records that were produced while the backup was running. When you restore from a full backup, SQL Server copies all pages from the backup file into the data file(s) and all log records from the backup file into the ldf file(s), and finally performs the same type of recovery as when you start SQL Server (see the first paper in this series). For example, say you start a full backup at 02:00 and the backup finishes at 02:45. When you restore from that backup, the database will look like it did at 02:45 – not 02:00. A full backup is performed using the BACKUP DATABASE command.
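A minimal example (database name and path are hypothetical):

-- Produce a full backup of the entire database to a file
BACKUP DATABASE MyDatabase TO DISK = N'D:\SQLBackups\MyDatabase_full.bak'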

Differential SQL Server backup

A differential SQL backup is very much like a full backup, except that SQL Server will only back up the extents that have been modified since the last full backup. It also includes the log records produced while copying the extents, in exactly the same way as a full SQL Server backup. For example, let’s say you have a full backup F1 and then differential backups D1, D2 and D3.

When you restore, you would restore F1 and then D3, assuming you want to restore to as recent a time as possible (a full backup and then the last differential backup taken since). Note that a differential backup is based on the most recent full backup. Say you have F1, D1, D2, D3, F2, D4, D5 and D6. If you want to restore to D6, you would restore F2 and then D6. You cannot base D6 on the F1 backup.

The BACKUP DATABASE command is also used for differential backups, adding the DIFFERENTIAL option to the WITH clause.
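For example (hypothetical names again):

-- Back up only the extents modified since the last full backup
BACKUP DATABASE MyDatabase TO DISK = N'D:\SQLBackups\MyDatabase_diff.bak' WITH DIFFERENTIAL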

Differential SQL Server backups can be a huge space saver considering how much backup data is produced in the end. Here is an example from one of our customers (the figures have been slightly rounded). Initially, we did daily full backups. One such backup produced 100 GB of (compressed) backup data for the SQL Servers. This was stored on backup servers for four weeks, equaling 2.8 TB. We changed this to a weekly full backup and daily differential backups. About 1 GB of data was modified each day; therefore, we produced 121 GB per week (100 + 1 + 2 + 3 + 4 + 5 + 6), meaning 484 GB for four weeks. So, the amount of SQL Server backup data we produced and stored on the backup servers decreased from 2.8 TB to 0.48 TB.

We also had to adjust how long we kept the backup files on the local machines. With only three days of local retention, the machine might hold only differential backup files, meaning we could not perform a restore from what exists on that machine alone – something I always recommend being able to do if you let SQL Server produce backup files. So, we changed local retention from three days to 13 days and, in the end, the amount of data stored in the local backup files was reduced somewhat, but not significantly.

SQL Server Transaction log backup

A transaction log backup backs up the changes made since the last transaction log backup – similar in spirit to an incremental backup. Technically, SQL Server reads the log records in the ldf file and copies them to the backup file. Log backups have several advantages. First, you can produce a log backup even if the database files are damaged or lost (using the NO_TRUNCATE option of the BACKUP LOG command). In many cases this means you can achieve zero data loss in the event of an accident. Another advantage is the ability to perform log backups very frequently – perhaps every hour, every 10 minutes, or even every five minutes.

The command to produce a log backup is BACKUP LOG.
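For instance, a tail-log backup after losing the data files might look like the following sketch (names are hypothetical; NO_TRUNCATE lets the backup succeed even though the data files are gone):

-- Back up the tail of the log even though the database files are damaged or lost
BACKUP LOG MyDatabase TO DISK = N'D:\SQLBackups\MyDatabase_tail.trn' WITH NO_TRUNCATE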

It is important to note that when you restore SQL Server log backups, you need to restore them in sequence and cannot skip a log backup.

The restore sequence for SQL Server is pretty straightforward:

  1. Restore from a full SQL Server backup.
  2. If you have differential SQL backups, restore from the most recent differential backup produced after that full backup.
  3. If you have log backups, restore all subsequent log backups, with the option to stop at a certain point in time when you restore the last log backup.
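Put together, a restore to 10:00 might look like this sketch (all file names and the timestamp are hypothetical):

-- Restore the full backup, the most recent differential, then the log backups in sequence
RESTORE DATABASE MyDatabase FROM DISK = N'D:\SQLBackups\MyDatabase_full.bak' WITH NORECOVERY
RESTORE DATABASE MyDatabase FROM DISK = N'D:\SQLBackups\MyDatabase_diff.bak' WITH NORECOVERY
RESTORE LOG MyDatabase FROM DISK = N'D:\SQLBackups\MyDatabase_log1.trn' WITH NORECOVERY
-- Stop at the desired point in time when restoring the last log backup
RESTORE LOG MyDatabase FROM DISK = N'D:\SQLBackups\MyDatabase_log2.trn' WITH STOPAT = N'2019-06-05 10:00', RECOVERY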

Snapshot backup

SQL Server snapshot backups are completely different. At a high level of abstraction, your backup software tells SQL Server to stop performing I/O for a certain time period, and while SQL Server isn’t performing I/O, the backup software can produce a snapshot copy of the data in the database files. SQL Server is informed that this snapshot is being produced through the SQL Server VSS Writer service in the operating system. In other words, SQL Server does not produce any backup data; it just halts modification activity (not doing any I/O) while the snapshot is being produced (while being “frozen”). You can see that snapshots are produced by looking in the SQL Server errorlog file, where you will see messages such as “Freezing I/O for database …” for each database and, later, “Resuming I/O for database …”. An interesting and important fact is that SQL Server will consider such a snapshot a full backup – even though SQL Server did not produce any backup data itself. This is important from several viewpoints, as we will explain later. Another important fact is that this is a fully supported backup type. There is nothing strange about snapshot backups, assuming they are produced the right way (utilizing the SQL Server VSS Writer service).

More details about how snapshot backups work in SQL Server can be found in the following article: https://technet.microsoft.com/en-us/library/cc966520.aspx

The COPY_ONLY option

Sometimes you produce a backup simply to get a copy of a database to restore on a test server, for instance, especially when you want to avoid impacting the chain of your scheduled backups. For these purposes, the backup command has an option named COPY_ONLY. This is relevant for two backup types (an example follows the list):

  • COPY_ONLY used with full SQL Server backups. This means that the full backup will not impact your differential backups. For example, say you have scheduled weekly full backups (Sunday, for instance) and daily differential backups (all days except Sunday). Now, if you perform a full backup just for the purpose of getting a copy of your database, say on Tuesday afternoon, then the differential backups for the rest of the week will be based on this Tuesday “out-of-band” backup. Imagine that the administrator who performed this Tuesday full backup deletes the backup file after the restore. The following differential backups for that week are based on the Tuesday full backup – but it no longer exists. This is a disaster! So what we do is specify the COPY_ONLY option for this Tuesday “out-of-band” backup, and this way it will not impact the following differential backups.
  • COPY_ONLY used with transaction log backups. This is a far less common situation. When you specify COPY_ONLY for a log backup, that log backup will not impact the subsequent log backups. Basically, it will not truncate the log.
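A minimal sketch of both variants (names are hypothetical):

-- An out-of-band full backup that does not reset the differential base
BACKUP DATABASE MyDatabase TO DISK = N'D:\SQLBackups\MyDatabase_copy.bak' WITH COPY_ONLY
-- A log backup that does not truncate the log or break the log chain
BACKUP LOG MyDatabase TO DISK = N'D:\SQLBackups\MyDatabase_log_copy.trn' WITH COPY_ONLY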

More advanced SQL Server backup options

There are other backup options that we will not explain in this document, this being a document about Veeam and SQL Server backups. These other options, including backup at the file or filegroup level, are well described in the SQL Server documentation.

Scheduling SQL Server to perform its own backups

Scheduling SQL Server to perform its own backups is probably what most experienced SQL Server DBAs will initially be most comfortable with. Let me first say that there are several advantages to using Veeam to back up your SQL Server, so I suggest you also read the following section before deciding what strategy to choose. Having said that, if you want to produce your own backups to files, then there are some things you must consider when using Veeam. Basically, you want to avoid Veeam interfering with your SQL Server backups.

In the Veeam backup job, we strongly recommend you specify “Enable application-aware processing.” This makes Veeam perform the backup using the SQL Server VSS Writer service, which means that the machine snapshot will be a valid backup of the SQL Server databases as well. So even if you are also producing your own SQL Server full backups, you have a second level of safety in the Veeam machine snapshot backup of your databases. It also means that the Veeam snapshot backup is seen by SQL Server as a full database backup.

In order to play nice with your own SQL Server backups, you want to select “Applications” and make sure your Veeam backup is configured in a suitable manner.

The “Processing Settings” configuration dialog, the “General” tab

Regarding the “Applications” setting (which isn’t specific to SQL Server), imagine the SQL Server VSS Writer service isn’t available, for whatever reason. I strongly recommend you use the topmost option – to fail the backup in these situations. This way the backup operator is alerted of the error and can manage the situation.

As for the “Transaction logs” option, you want to select “Perform copy only” if you perform your own SQL Server backups. This way the Veeam snapshot backup will be seen by SQL Server as a COPY_ONLY backup and will not interfere with any differential backups that you produce in SQL Server. Even if you don’t produce differential backups today, you still want to select this since you or someone within your organization might want to start using differential SQL Server backups in the future.

The “SQL tab”

If you select “Process transaction logs with this job” and do not specify “Perform copy only,” then it is very important you select the “SQL” tab, which is not available if you select the copy only option.

If you select “Truncate logs” on the SQL tab, Veeam will perform a log backup to the file name “nul” after the snapshot has been produced. It does this for all databases that are in full or bulk logged recovery model. This renders your own log backups performed after this point useless. To avoid this, choose “Do not truncate logs.” However, if you want to perform your own SQL Server backups, then simply select “Perform copy only” on the General tab and Veeam will not interfere with your backup strategy. Simply put, the “Perform copy only” option exists to keep Veeam from interfering with your backup strategy.

How do you produce your backups?

Most SQL Server DBAs use some tool to generate the SQL Server backup commands. These tools typically also have the ability to do things besides performing backups, such as defragmenting indexes and checking that your databases are free of corruption, etc.

  • SQL Server comes with Maintenance Plans which provide the ability to, among other things, produce SQL Server backups. These are typically produced to disk and the maintenance plan components will name the backup files so you will have the database name, date, and time in the backup file name, as well as a clean-up process to remove old backup files.
  • There are other maintenance tools (scripts) available which have advantages compared to the maintenance plans that come with the product. Perhaps the most commonly used is Ola Hallengren’s Maintenance Solution (https://ola.hallengren.com/). One advantage of Ola’s tools is the smart index defragmentation handling, which checks the fragmentation level and only performs defragmentation for the indexes that are fragmented in the first place. This saves time (including reducing the time when the data isn’t available) and also saves space in the transaction log files and subsequent transaction log backups.

In the end, the above solutions will execute a job by the SQL Server Agent service. If you decide to let Veeam perform your SQL Server backup, then you will most likely still use some type of maintenance solution for all tasks except backup.

Should we use compression for the SQL Server backup files?

SQL Server has a compression option in the backup command. You may ask yourself whether or not it is appropriate to use this since Veeam will perform deduplication in the end.

Imagine a database where only a few pages have been modified between two backup occasions (two backup files). Without compression, these backup files will be mostly identical; only small parts of the files will differ (remember that we only modified a few pages). Theoretically, deduplication would pick up on this and store the matching data only once. Compare this to the case where you let SQL Server compress the backup data. Compression will likely “scramble” the bit pattern so that the backup files have little in common.

This might lead you to the conclusion that you shouldn’t compress SQL Server backups. However, because of the way deduplication works, the hypervisor doesn’t serve the data at a file-by-file level to the deduplication parts of Veeam. The end result is that deduplication might not deduplicate as much as is theoretically possible, and compression is likely to save storage in the end. As always, you should take into consideration the CPU cost for SQL Server to compress the backup data. The bottom line is that we do not recommend treating compression differently just because you happen to be in an environment where your SQL Server backup files will be picked up by Veeam.
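For reference, backup compression is just another option in the WITH clause (names are hypothetical):

-- Produce a compressed full backup; trades CPU for smaller backup files
BACKUP DATABASE MyDatabase TO DISK = N'D:\SQLBackups\MyDatabase_full.bak' WITH COMPRESSION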

Restore

We all know that it is important to practice restore and that a production failure is not the ideal time to practice a restore!

SQL Server has a restore GUI built into the SQL Server Management Studio tool. The restore GUI uses backup history, which is stored in a few tables in the msdb database, to construct your RESTORE commands – and then executes these RESTORE commands if you wish (or you can use the script button to script them to a query window). It is, of course, important that it gets the restore commands right, and this is where it gets a bit complicated.

The basic design principle of the restore GUI is that it uses backup history to figure out what RESTORE commands to execute, based on the date and time you specify that you want to restore the database to. Unfortunately, there are some “gotchas” to watch out for here.

First, Microsoft made a major change to the restore GUI between SQL Server 2008 R2 and SQL Server 2012, and there have been minor changes in other versions as well. Obviously, we cannot point out every behavior change in every version, so consider the points below things to be cautious about, and verify whether they apply to you – if you want to use the restore GUI in the first place.

Perhaps the most obvious aspect is that the restore GUI only knows about the backups taken up to the point in time of the restored machine. This might sound strange, so let me explain with an example. Say that you performed your Veeam snapshot on Wednesday at 04:00, you performed your SQL Server full backups Tuesday at 19:00, and transaction log backups every hour. Now, a problem occurred Wednesday at 10:43 and you want to restore the database to the state it had at 10:00 (your most recent log backup). This means you want to restore the Tuesday 19:00 full backup and all transaction log backups since, up to the one taken Wednesday at 10:00. Also, let’s say the virtual machine broke as well, so you start by restoring the virtual machine from your snapshot taken Wednesday at 04:00. Your SQL Server backup history will now be as of Wednesday 04:00, and there is no information in the restored backup history about the backups taken since 04:00. This means that the restore GUI in SQL Server can only help you restore to 04:00; the rest is up to you. This, of course, assumes you have the log backup files copied somewhere other than your virtual machine, which is VERY important.

Another aspect is that the restore GUI might try to include COPY_ONLY backups and even snapshot backups when generating its RESTORE commands. Consider snapshot backups, for instance. Once again, let’s take the above example, but do not restore the virtual machine from a Veeam snapshot backup. Now we want to use our SQL Server backups and bring the database to the state it had at 10:00. To do this, you specify the time 10:00 in the restore GUI, and it will generate the first RESTORE command from the 04:00 snapshot backup, pointing to a file name for a file that doesn’t exist (remember that this was a snapshot backup produced by Veeam). The restore GUI isn’t smart enough to realize that this is a snapshot backup and skip it. I am not saying this happens in all versions of SQL Server Management Studio, but I have seen it happen in SQL Server 2012 and 2014. This may eventually be fixed, assuming Microsoft considers it a bug in the first place. But it exemplifies how important it is that you practice your restore routines. And practicing restore cannot be easier than if you are using Veeam: simply perform a SureBackup restore and test and practice in that environment. Make sure you document your findings while doing this, as such a document is very valuable in a production emergency.

All of my clients have decided not to use the restore GUI for production restores. They have trained themselves to use the RESTORE commands from a query window, possibly aided by the ability to script the commands from the GUI and then make adjustments in that script, if necessary, before executing the SQL commands.

There is nothing wrong with doing your own backups when using SQL Server with Veeam. However, there are several advantages of letting Veeam perform your SQL Server backups, as I will identify in the next section.

We have several components in play in our modern environments, such as virtualization software, backup solutions, and snapshot solutions at various levels, etc. It is more important than ever that we have actually practiced how to perform a restore. A production situation is not the right time to be surprised.

Using Veeam for your SQL Server backups

Using Veeam for your SQL Server backups may feel a bit unusual for the seasoned SQL Server DBA, so let us first lay out some facts before we dive into the details.

    • There is nothing unsupported about the SQL Server backups produced by Veeam. Veeam uses supported and documented methods of producing SQL Server backups. More details will follow.
    • There are potentially huge savings in letting Veeam handle SQL Server backups. Consider the example we had above: SQL Server produced 2.8 TB per four weeks, to be stored on the backup server, when we didn’t use differential backups, and 0.48 TB when we did. What if we let Veeam do our SQL Server backups? It would take 0 TB, since we already have the backup in our machine snapshots – a significant saving on our backup servers. We also save disk space on each SQL Server virtual machine, since we do not produce and store SQL Server database or differential backup files on the local machines.
    • The restore process is performed entirely by the Veeam toolset. Veeam knows where all the backups are located, and you have the same tools to perform both the machine restore and the SQL Server restore – in many cases from the same place, your snapshot.
    • Your transaction log backups will be copied away to a separate location immediately after the log backups have been performed. This is an interesting aspect, since if you keep the log backups only on the SQL Server machine and the machine breaks altogether, then your log backups are lost as well.

The remainder of this document will focus on how Veeam works with SQL Server and what the SQL Server related configuration options in Veeam mean.

Veeam supports two types of SQL Server backups:

    • Snapshot backup. As explained above, this is a snapshot of the whole machine, from the point in time when the snapshot was produced. The SQL Server databases will be in a consistent state, thanks to the snapshot being produced with the assistance of the SQL Server VSS Writer service.
    • Transaction log backups. You can complement your snapshot backups with transaction log backups. As we will see, Veeam produces the log backups to files, using an ordinary BACKUP LOG command, and then copies the log backup files to the Veeam backup repository. Transaction log backups will only be performed for the databases that are in full or bulk logged recovery model, so make sure you have the desired recovery model for your database.

The full database backup – the snapshot

As we already explained above, Veeam produces a snapshot of your virtual machine using the VSS functionality in Windows. This, in conjunction with the SQL Server VSS Writer service, produces a consistent view of your databases. Veeam uses the VDI API to inform SQL Server that a backup is being performed, and SQL Server will consider this a full database backup. You might wonder whether it is important that SQL Server considers this a full backup (as long as it is supported and we can perform a restore from it). Well, there is one aspect that might not be immediately obvious, and that is the ability to perform log backups after a snapshot backup and to restore such log backups. Once again, here is a link to an article that describes how snapshot backups work with SQL Server and VDI (Virtual Device Interface, the API used): https://technet.microsoft.com/en-us/library/cc966520.aspx. The article is technical; however, it will help answer deeper technical questions you may have.

If you know you will not be doing any log backups, then you have two options to manage your transaction logs, i.e., make sure the ldf files don’t grow indefinitely:

      • You can make sure all your databases are in simple recovery model
      • You can let Veeam truncate the transaction logs after the snapshot has been produced. This is implemented by Veeam executing the BACKUP LOG command to the file name ‘nul’.

Using Veeam to produce transaction log backups

An interesting aspect with snapshot backups in SQL Server is that we can also complement them with log backups – just as we can use log backup to complement full and differential backups.

In order to do that, select “Backup logs periodically” in the “Processing Settings” dialog, on the SQL tab shown above. It is that easy!

The implementation of log backups in Veeam

If you are curious how log backups work, then read on. If not, then skip to the next section.

Veeam installs a couple of components on your SQL Server machines:

VeeamAgent, which executes the BACKUP LOG command with the frequency you have specified. This component is started from the backup server. The log backups are performed to files, so there is nothing strange going on here. The following is an example of a backup command executed by the VeeamAgent:

DECLARE @database_name_var NVARCHAR(255)
SET @database_name_var = N'Tandlaege'
BACKUP LOG @database_name_var TO DISK = N'C:\ProgramData\Veeam\Backup\SqlLogBackup\{78d18633-05ae-4613-b903-b2ea8854ad34}.bak'

As you can see, it sets the database name into a variable and uses that variable in the BACKUP LOG command.

VeeamLogShipper is the task that grabs the backup file produced by VeeamAgent and copies it to the Veeam backup repository (where the Veeam backup data is stored). If this server isn’t directly reachable from the SQL Server machine, then one or more intermediate machines – log shipping servers – are attempted. You can configure which log shipping server Veeam will attempt to use, but in the vast majority of cases you want to let Veeam determine this for you. The VeeamLogShipper component is implemented as a Windows service.

If you look at the backup files in the Veeam backup repository, you will find files with the extension VLB. These are several log backup files compressed into a single file.

Restore databases using Veeam

The easiest and simplest restore situation, from a SQL Server point of view, is to simply restore the whole machine snapshot to the restore point. There is nothing “SQL Server specific” about this type of restore. Of course, your databases will be in the state they were in when the snapshot was produced.

The other option is to restore from a Veeam restore point and then restore transaction log backups to the point in time you desire. Restoring to anything but your restore points is only available for databases in full or bulk logged recovery model – and only if you let Veeam produce log backups. You don’t have to be aware of what backups you have, the restore commands, etc. You use any of the Veeam tools and specify your restore point, and optionally a point in time, and Veeam will figure out what restore operations to perform.

Restore using the Veeam Backup & Replication tool

This is the tool that Veeam backup operators will be most familiar with. It is used to configure your Veeam environment, and you can also use it to restore, among a number of other things. If you want to restore to anything other than a restore point, use the “Restore Application Items,” “Microsoft SQL Server Databases” option when you right-click your machine with SQL Server installed:

You then select the restore point you want to base your restore from:

This will bring up a new tool titled “Veeam Explorer for Microsoft SQL Server”:

When you select a database, you will see what time interval you can restore within. This is based on the Restore point you selected earlier. In technical terms, you have the snapshot from that restore point and then the subsequent log backups taken since.

In this tool, you can select either Restore Database, or Export Database:

“Restore Database” will restore the selected database to the latest point in time, the selected point in time, or restore it onto a different SQL Server.

“Export Database” will create database files (mdf, ldf and possibly ndf) that you can copy to another SQL Server machine and attach to that SQL Server.

For either of the above two buttons, if you select one of the bottom two options (Restore/Export “point-in-time…” or “to another…”), you have the option to select a point in time and will be presented a dialog similar to the one below:

The top two options are self-explanatory. The “Perform restore to the specific transaction” option reads the transaction log backup file and, based on that, presents the modifications reflected in it. You can select an operation, and the database will be restored to the point in time just before that operation.

When is the local SQL Server required?

Some of the operations above require a local SQL Server (also known as “staging SQL Server”), installed on the Veeam server:

      • Restore to the state before selected transaction (Fine-tune restore point)
      • Export to selected point in time
      • Export to the state before selected transaction (Fine-tune restore point)

If features specific to a certain SQL Edition are used for the database, then the same Edition is required for the staging SQL Server. For example, the feature "Table and index partitioning" is supported only in Enterprise Edition. Furthermore, if the staging SQL Server is Express edition, then you will not be able to work with databases above the size limit for the Express edition (currently 10 GB).

Restore using Veeam Backup Enterprise Manager or Veeam Self Service File Restore

These two are really the same tool, except that the Self Service tool is limited to restoring onto the machine where you are sitting. So, if you are on the SQL Server machine, then using Self Service you can only restore onto this SQL Server machine.

The tool is designed to be easy to use. The SQL Server DBA will likely be more interested in this tool than in the full-blown Veeam Backup & Replication™ tool. Also, since this tool is web-based, parts of it can be incorporated and customized into other tools and solutions (using a REST API), for example by a service provider.

Use “Items” in this page when you want to restore SQL Server databases, as shown below:

By default, you will restore based on the most recent restore point. However, since you might have performed transaction log backups for the database, you want to click “Pick another restore point” so you can dig a bit deeper into when you want to restore to:

By clicking “Pick another restore point” you can select your restore point (your snapshot). You can use the slider to specify a point-in-time restore if there are transaction log backups available for the selected database. The default is to restore to the exact restore point. If you want to restore to a more recent point (because you have transaction log backups), use the slider to specify that point. The range of the slider is based on the selected restore point: back to the prior restore point, and up to the next restore point – or to the most recent transaction log backup, if this is the last restore point.

Note that there might be several restore points per day, i.e. several snapshots were produced that day. You select to the right of the day selector which restore point to use:

You can either restore the database to its original location, or select a different SQL Server to restore to:

The credentials used for the restore commands are specified for the Guest OS in the Veeam Backup & Replication tool.

Hybrid strategies

You can combine the two strategies outlined above, the “Let SQL Server produce backup files” strategy and the “Use Veeam to backup your SQL Servers” strategy.

The DBA centric hybrid strategy

This strategy has already been described. As long as you make sure the machine snapshot is produced using the SQL Server VSS Writer service (select “Enable application-aware processing”), the machine snapshot produced by Veeam is a valid restore option for your SQL Server data. And then, of course, you also have the backup files produced by your SQL Server, as described in that section. There are two distinct options for restore:

      • Perform restore using Veeam. Use Veeam snapshots (Restore Points) only.
      • Perform restore using SQL Server RESTORE command or GUI. Use whatever backup files you told SQL Server to produce (full, differential, log).

The backup operator centric hybrid strategy

This strategy means you will let Veeam produce its snapshot (restore point) and also transaction log backups. The question now is whether we can do this while also performing SQL Server backups to file – without SQL Server somehow interfering with the Veeam snapshot and log backups. The answer is simple… Yes, we can!

You can produce full backups any way you want – they won’t interfere with the Veeam snapshot or log backups. But you don’t want to use the COPY_ONLY option for these full backups, if you also want differential backups (see next paragraph).

If you want, you can complement these full backups with differential backups. For instance, you perform a full backup every night and a differential backup every four hours. Full and differential backups do not interfere in any way with the chain of log backups produced by Veeam.

However, in this scenario you want to avoid producing your own SQL Server log backups. Remember that in this scenario Veeam produces the log backups. Again, here are two distinct options for restore:

      • Perform restore using Veeam. Use Veeam snapshots (Restore points) and the log backups produced by Veeam.
      • Perform restore using SQL Server RESTORE command or GUI. Use whatever backup files you told SQL Server to produce (full, differential – but not log).

Summary

You can use Veeam for all of your SQL Server backup operations, or only to produce snapshots of your virtual machines. We suggest the person responsible for the overall backup strategy discuss with the DBA whether or not to use Veeam for your SQL Server data. There are several advantages for allowing Veeam to handle your SQL Server backups, including storage space saving on both the backup server and your SQL Server virtual machines, and a single interface for all of your backup and restore operations, etc.

Veeam uses supported measures to back up your SQL Server, and when you let Veeam also produce transaction log backups you have a single place for all restore operations. The backup operator will most likely be using the Veeam Backup & Replication tool for the restore operations, where the DBA might be more comfortable using the web-based interface. Regardless of which tool is used, you simply specify the date and time you want to restore to by specifying the restore point, and if you have log backups, then select exactly what time to restore to – and the restore tool will perform the restore operations for you.

And you can also restore onto a different SQL Server, or export to new database files which you yourself can copy to another SQL Server and attach.

Author: Tibor Karaszi, Consultant, SQL Server expert, MVP

The purpose of this paper is to provide you with background on SQL Server. We will explain what SQL Server is, where it came from, and its architecture and building blocks.

Relational Database Management System, RDBMS

An RDBMS represents data in what we usually call tables. (There are formal terms as well, but we won’t use them in this paper.) A table has a name and is a structure with one or more columns; each column also has a name. The data is represented as rows. A row has a value for each column (or NULL, meaning that the value is missing). A row is supposed to represent an entity (an order, a customer, etc.), while a table represents a set of entities (orders or customers). A database is a collection of tables. When you operate on the data (retrieve data, for instance), you work with sets. For instance, in the SQL language, a SELECT statement returns a number of rows (0, 1 or more).
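As a small, hypothetical illustration of these concepts:

-- A table of customers; each row represents one customer
CREATE TABLE Customers (CustomerId INT, Name NVARCHAR(50), City NVARCHAR(50))
-- A set-based operation: return all customers in a given city (0, 1 or more rows)
SELECT CustomerId, Name FROM Customers WHERE City = N'Stockholm'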

SQL, the language

In the 1970s, a language was developed for a prototype of a Relational Database Management System.

This language was later called SQL. Some say SQL stands for Structured Query Language, while others think it is simply three letters put together. The SQL language consists of many commands (loosely speaking) that allow us to create, secure, manage and operate on the data in our tables. We sometimes see a categorization of the commands in the SQL language:

DDL, Data Definition Language

This category was originally meant to include the commands with which we build the database – for instance, the CREATE TABLE command, and also DROP TABLE and ALTER TABLE (which allows us to add a column, for example). Over time, the DDL category came to include all types of CREATE, ALTER and DROP commands (DATABASE, LOGIN, USER, VIEW, PROCEDURE, etc.).

DCL, Data Control Language

Here we have the commands with which we define permissions – who is allowed to do what with the data. We have the GRANT, DENY and REVOKE commands in this category. A couple of examples:

GRANT CREATE TABLE TO Sue
GRANT SELECT ON Customers TO Joe

DML, Data Manipulation Language

These are the commands we use to add, remove, modify and read rows in our tables. The commands (also known as statements) are INSERT, DELETE, UPDATE and SELECT. These are the commands sent by your applications to SQL Server when you use your application to add orders, look at customers, etc.
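For instance, using the hypothetical Customers table from above:

-- Add, modify, read and remove rows
INSERT INTO Customers (CustomerId, Name, City) VALUES (1, N'Acme', N'Oslo')
UPDATE Customers SET City = N'Bergen' WHERE CustomerId = 1
SELECT CustomerId, Name, City FROM Customers
DELETE FROM Customers WHERE CustomerId = 1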

There are other commands in the SQL language that do not fit into the above categorization. For instance, there are commands to start a transaction or to check that a database is free from corruption.

SQL was originally an industry standard, but it was quickly standardized formally by the likes of ANSI and ISO. The standard is revised now and then, where new language elements can be added and even old elements removed. Each product has its own dialect of the SQL language, implementing parts of the SQL standard (which is huge) and adding its own language elements. Additions might be made in order to be competitive, or even just usable. For instance, the standard doesn’t say a word about indexes, which are an important performance structure.

There is a fair amount of criticism of the SQL language; some people say it was a quick-and-dirty implementation of a language for a relational database system. Many feel that SQL isn’t even a relational language. It is safe to say that the SQL language, in some aspects, doesn’t do the relational model justice, as it doesn’t allow us to explore the full potential of a relational database system. But SQL, “flawed” or not, has certainly proven itself useful.

Background and history of SQL Server

In the 1980s, Sybase had an RDBMS called SQL Server for the Unix environment. PCs were growing in popularity, and Sybase wanted to get into the PC market, so they worked with Ashton-Tate (dBase) and Microsoft to port and release the product for the PC architecture. At that point in time, the operating system for PC servers was OS/2. Versions 1.0, 1.1 and 4.2 were released for OS/2.

Then came Windows NT (what we today call “Windows”), so the product was ported to this new and hot operating system. In 1992, shortly after Windows NT was released, Microsoft released SQL Server 4.2 for Windows NT. Microsoft then separated from Sybase and developed SQL Server on their own. Versions 6.0 and 6.5 were released later, with no major changes or additions.

Then Microsoft re-wrote the whole product – what we might refer to as “the new architecture.” This was a while ago: SQL Server 7.0 was released in 1998. The goal was a storage structure that could last two or three decades, and considering that we are still using this architecture, it is safe to say it was a successful re-architecture of the product. In 7.0 we also had additions outside the database engine: Microsoft added OLAP Server for cubes and DTS for data export and import. With SQL Server 2000, we also got Reporting Services.

SQL Server 2005 had some pretty significant changes to the product, although not as major as 7.0. For instance, 2005 was when SQL Server Management Studio was released; those of you who have worked with SQL Server over the last decade have probably been using this tool. Another significant change was a whole new metadata structure, where Microsoft changed the system table structure totally – from, for instance, sysobjects (which was a physical table) to sys.tables (which is a view; the physical table is hidden from us). As for the other components, OLAP Server was renamed Analysis Services and DTS was renamed SQL Server Integration Services (SSIS), both re-written. This paper focuses on the database engine, or “the SQL engine.”

The more recent versions of SQL Server are 2008 R2, 2012, 2014, 2016 and, the latest, 2017. Each of these, of course, added potentially significant features to the product, but I like to think of them more as evolutions of the product. The fact is, if you worked with 2005, you will more than likely feel at home with 2017.

Architecture of SQL Server, the database engine

Note that for the remainder of this paper, when we say “SQL Server,” we refer to the database engine (if not otherwise specified).

SQL Server is a process, viewed from an operating system perspective. Memory allocated from the operating system is owned by the process, as are files that are opened, and the process has a number of threads that the operating system can schedule for execution. SQL Server is started as a service. If you check “Services” in your operating system, you may see several services whose friendly names start with “SQL.” The following is a breakdown of the most common SQL Server services. Note that the explanations are not meant to be exhaustive, but rather to give an idea of what each service does.

SQL Server

This is the database engine.

SQL Server Agent

We will refer to this simply as “Agent.” Agent allows you to schedule jobs for execution at specified times or intervals. A job consists of one or more commands to be executed, and these commands can be SQL commands, operating system commands, a PowerShell or VB script, etc. Agent also has alerting capabilities. You can define that Agent should send an email or execute a job when certain conditions are met – for instance, an error written by the database engine to the Windows event log, or a SQL Server performance monitor counter reaching some specified value.

SQL Server Analysis Services

SQL Server Analysis Services is also known as SSAS and allows you to build cubes over data in a data warehouse, or as of SQL Server 2012, use the tabular model instead of cubes. SSAS also has data mining functionality, allowing you to work over large amounts of data, using various algorithms to find patterns and trends that exist in data.

SQL Server Browser

This will be defined and discussed later in the text.

SQL Server Integration Services

This service is involved when executing SSIS packages, i.e. performing data transfer and such operations.

SQL Server Reporting Services

Reporting Services allows you to use a tool to design reports and upload those reports to the Reporting Services service. This service stores the reports in a database and publishes them either through a website or a SharePoint site.

SQL Server VSS Writer

This component allows software to “freeze” the I/O over one or more databases so the software can grab the data from the database files in a consistent state and produce a “snapshot” of the data in your SQL Server. Veeam is an example of software that can use VSS to produce such a snapshot.

Instances

For some of the above components (services), you can have more than one installed in an operating system. We call each installed component an instance. The components in question are the database engine, Analysis Services and Reporting Services. You can see by the service name whether a component allows several instances: there is a parenthesis with an instance name after the friendly service name. One instance can be a default instance (where you see MSSQLSERVER in parentheses), and you can have several named instances, for which you specify the instance name when you install each instance.

Instances are separated from each other, except that they, of course, live in the same operating system. Each instance also has its own folder structure – where the exe file is stored, for instance. This allows instances to be of different versions and build numbers. You can have totally different configurations at the instance level and also different security settings (including who the “super administrators,” sysadmin, are, etc.).

Database and storage structures

Each SQL Server instance can have several databases. A database is created using the CREATE DATABASE command (or using “New Database” in SSMS, which in turn uses the CREATE DATABASE command). However, there are five databases in SQL Server from the beginning:

master

The master database is the “root” database, so to speak. Certain configuration information that is common to the whole instance is stored in the master database. Examples are logins (sys.server_principals), configuration settings (sys.configurations) and linked servers (sys.servers). Master is special from another perspective as well: the startup, or boot, perspective. When you start your SQL Server, it reads command-line parameters from the registry. Some of those parameters are the full path and name of the two files that make up your master database (master.mdf and mastlog.ldf).

msdb

The msdb database is mainly used by SQL Server Agent. There are tables in this database designed to store your scheduled jobs, alerts, operators and anything else Agent works with. It is, however, a common misconception that msdb is only used by Agent. The database engine also uses msdb. For instance, each time you perform a backup or restore, information about that operation is written to a number of tables in the msdb database.
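For example, here is a sketch of how you might peek at that backup history (msdb.dbo.backupset is a real table; filter as needed):

-- Most recent backups recorded in msdb; type: D = full database, I = differential, L = log
SELECT database_name, type, backup_start_date, backup_finish_date
FROM msdb.dbo.backupset
ORDER BY backup_finish_date DESC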

model

When you create a database, SQL Server uses a template for that database. This template includes items such as database option settings. And yes, you have guessed it – this template database is of course the model database.

tempdb

As the name implies, this database is used for temporary storage. A number of things use tempdb, including:

  • Worktables used during execution of a SQL query
  • Explicitly created temporary tables (tables beginning with #) and table variables
  • Row versioning, where a transaction can get to the prior value for a row which is being modified by somebody else, instead of being blocked or doing a dirty read
  • The inserted and deleted tables that you have available in a trigger that fires for INSERT, UPDATE or DELETE

The tempdb database is re-created every time you start your SQL Server. The database file structure (number of database files, size and path) is not picked up from the model database, however; it is based on a system table, visible using master.sys.master_files.
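A quick way to see that file structure (sys.master_files is a real catalog view; the size column is in 8 KB pages):

SELECT name, physical_name, size FROM master.sys.master_files WHERE database_id = DB_ID(N'tempdb')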

The resource database

The resource database is in fact hidden from us. We cannot see this database in the Object Explorer in SQL Server Management Studio, nor does it show in sys.databases. It is hidden because it is not supposed to be altered. I once heard someone from Microsoft suggest we “think of it as a DLL file.” It only contains code, not data. But what code, you may ask? The code for your system stored procedures (like sp_detach_db, sp_help, etc.) and also for the system views (sys.databases, sys.tables, etc.) is included in the resource database. This allows Microsoft to replace this database when you patch your SQL Server, instead of running script files to re-create these objects in the master database (as we had in SQL Server 2000 and earlier).

Database files

Each database is stored over a set of files. Files are internally numbered from file one and up (see, for instance, sys.database_files). We have at least two files for each database: one data file (.mdf) and one transaction log file (.ldf). We can have additional data files (.ndf) and also more than one transaction log file (.ldf).

A data file belongs to a filegroup. We have a minimum of one filegroup for each database – the PRIMARY filegroup. When you create a table or an index, you can specify what filegroup will be used in the CREATE command. There’s always a default filegroup, which will be PRIMARY unless you change the default filegroup. If you do not specify what filegroup a table or index should be created on, then it will be created on the default filegroup.

You can have more than one data file for each database. The primary data file (.mdf) always belongs to the PRIMARY filegroup. Each filegroup can have more than one data file. If you have more than one file in the same filegroup, the files are filled proportionally to their sizes.

In the end, the filegroup concept allows you to specify on what file – or files – a table will be stored. Suppose, for example, you have one table you know will be heavily accessed. The table is of moderate size – large enough that it won’t fit in RAM through the standard caching mechanism in SQL Server – but you also have lots of other data in the database. Also, let’s say you have a fast disk (SSD) of moderate size that you can utilize on this server. Create a filegroup, add one file on this SSD disk and create your heavily accessed table on that filegroup (the filegroup with the file on the SSD disk). This is only one example of what you can use filegroups for; others are backup at the filegroup level, or performing database consistency checks at the filegroup level (DBCC CHECKFILEGROUP instead of DBCC CHECKDB).
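A minimal sketch of that setup (all names and paths are hypothetical):

-- Add a filegroup with a file on the SSD, then place the hot table on it
ALTER DATABASE MyDatabase ADD FILEGROUP FastStorage
ALTER DATABASE MyDatabase ADD FILE (NAME = N'FastData', FILENAME = N'S:\SQLData\FastData.ndf') TO FILEGROUP FastStorage
CREATE TABLE dbo.HotTable (OrderId INT NOT NULL, Amount DECIMAL(10,2)) ON FastStorage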

However, the vast majority of databases have only one filegroup and even only one data file (the .mdf file). And, in most cases, this is perfectly sufficient. I’m a strong believer in the KISS concept (keep it simple) and if more than one filegroup fails to give you advantages, then why should you have it?

Data file architecture

Each data file is divided into a set of pages, 8 KB in size. Pages are internally numbered from page 0 and up. A page can be unused, i.e. a free page. Pages are grouped into sets of eight consecutive pages, referred to as an “extent.” An extent can also be unused, if none of the pages in the extent is in use. We sometimes refer to this type of extent as “unallocated” (as opposed to allocated).

An allocated extent can be allocated in two ways: as a shared (or mixed) extent, where you find pages from different tables and indexes (hence the name), or as an extent essentially owned by a certain table or index, what we call a “uniform extent.” The first eight pages for each table or index come from shared extents, and subsequent pages come from uniform extents.

We have assumed two types of allocations in this discussion, data and indexes. There are other types as well, such as LOB allocations for instance, but data and index pages are enough to illustrate the page and extent allocation principle in SQL Server.

Table and index architecture

Logically, a table is defined over a number of columns. Each column has a name and a certain data type.

From a physical perspective, a table has rows stored in pages. However, the table (the data) can be structured in two different ways:

1. A Heap table

A heap table is a table for which we have not created a clustered index. SQL Server doesn't store the rows in any particular order. Rows and pages are essentially stored wherever there is free space. Needless to say, SQL Server tries to be conservative, not using more pages than necessary, and will try to fit as many rows as possible on each page, within reasonable limits.

2. A clustered table

A clustered table is a table on which you have created a clustered index. This index might have been created automatically, since SQL Server by default creates a clustered index on the primary key column (you can override this), or you can create the clustered index explicitly using the CREATE CLUSTERED INDEX command.
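For example (the table is made up, and mirrors the people example discussed below):

    -- The primary key would give us a clustered index by default,
    -- so we override that with NONCLUSTERED...
    CREATE TABLE dbo.Person
    (
        PersonId  int          NOT NULL PRIMARY KEY NONCLUSTERED,
        FirstName nvarchar(50) NOT NULL,
        LastName  nvarchar(50) NOT NULL
    );

    -- ...and cluster the table on LastName instead
    CREATE CLUSTERED INDEX CIX_Person_LastName ON dbo.Person (LastName);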

Now, this begs the question: What is an index?

Let's first look at this from the clustered index perspective. An index sorts the data. For example, let's say you represent people in a table, with columns such as first name, last name, city, etc. Also, let's say you define the clustered index over the last name column (last name being the index key). The rows are now stored in order of last name. Imagine a set of pages (for this index), where the header of the first page holds the address (page number) of the next page, and so on until the last page. A page also points back to the previous page. This is what we call a doubly linked list. The row with the lowest last name will be the first row on the first page, and the row with the highest last name will be the last row on the last page. In other words, if you follow the linked list from beginning to end, you will have read the people in order of last name. What we have described here is the leaf level of the index.

We also have a tree structure above this level. Take the first value of the index key (last name) on the first page, do the same for the second page, and so on. Store these values on another set of pages, along with the page numbers they point to. You now have the level above the leaf level. If one page isn't sufficient for this level, you keep building higher levels until you have exactly one root page. This is how SQL Server builds the index tree.

Non-clustered indexes

The description above covers a clustered index. As we know, a table either has a clustered index, or it doesn't (in which case it is a heap table). Regardless of which, we can also have a number of non-clustered indexes on the table. A non-clustered index is also defined over one or more columns (say first name, in our previous example). SQL Server builds the non-clustered index pretty much the same way as the clustered index, but in the leaf level of the index we don't have all the columns of the row (the data). Instead, we only have the index key column(s), in our example the first name column, along with a "pointer" to the data row. If the table is a heap table, this pointer is the file, page and row number for the row. If the table is a clustered table, it is instead the clustering key column values.
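Continuing the hypothetical Person table from earlier, a non-clustered index on first name could look like this:

    -- The leaf level stores FirstName plus the clustering key (LastName)
    -- as the "pointer" to the data row
    CREATE NONCLUSTERED INDEX IX_Person_FirstName ON dbo.Person (FirstName);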

You might realize that there is plenty more we could say about these things, digging deeper into index structures and index options. However, the above description is sufficient to understand how data in tables is stored, and the fact that we can create an index that SQL Server can use to find data efficiently. Imagine if SQL Server had to read all the pages that the table uses just to find the rows your query is looking for!

The optimizer and execution plans

The SQL language, a SELECT statement for instance, operates at a logical level. You specify what you want, not how it should be performed. For instance, loosely speaking, “give me all rows with the last name Johnson.” We want this to be executed as quickly and efficiently as possible by SQL Server.

For this, we have an optimizer. The optimizer is responsible for creating an execution plan. The execution plan defines how the query should be executed: for instance, what index should be used to find the rows we are searching for, in what order to join tables if the query combines data from more than one table, or how the join(s) should be performed (there are various ways, such as hash join, loop join and merge join). We can go on and on. The execution plan consists of a series of operators. Imagine a rather simple query where the innermost operator reads data from a table or index and passes the rows to the next outer operator, which might for instance sort the data (imagine an ORDER BY in the query).

There are various ways to get a representation of the execution plan. For instance, there are options in SQL Server Management Studio to see a graphical representation ("Show estimated execution plan" and "Show actual execution plan"). When you work with performance tuning, you are interested in execution plans: how the query is executed, what indexes are used, and whether we can somehow make SQL Server run the query faster, for instance using fewer system resources.
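For example, SET STATISTICS XML returns the actual plan as XML alongside the results (the table here is the hypothetical Person table from earlier):

    SET STATISTICS XML ON;
    SELECT FirstName, LastName
    FROM dbo.Person
    WHERE LastName = 'Johnson';
    SET STATISTICS XML OFF;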

Keeping track of used and free space

SQL Server keeps track of used versus free extents in each data file.

The third page, or page two (page numbers start at 0), in the file is a GAM (Global Allocation Map) page, which has a bit for each extent in the following approximately 4 GB of the database file. The GAM keeps track of whether the extent is in use (as shared or uniform), with the value 0, or free (unallocated), with the value 1.

The fourth page (page number three) is the SGAM (Secondary Global Allocation Map) page. The SGAM comes into play for shared extents. An extent allocated as a mixed extent with free pages has 1 in the SGAM; otherwise it has 0.

This makes it easy for SQL Server to find free storage. Imagine that SQL Server needs a free (unallocated) extent. SQL Server scans the GAM page and finds an extent having 1. Or, imagine that SQL Server needs a page from a shared extent. SQL Server scans the SGAM page and finds an extent having 1.

In other words, an extent can be represented by its GAM and SGAM bits (in that order) as follows:

  • 1, 0: a free, unallocated extent
  • 0, 0: a uniform extent, or a mixed extent with no free pages
  • 0, 1: a mixed extent with free pages
  • 1, 1: not used

As mentioned, the GAM and SGAM pages represent approximately 4 GB in the database file, and then we have another set of GAM and SGAM pages, and so on. The GAM and SGAM pages are always at fixed positions in the database files. How many pairs of GAMs and SGAMs there are depends on the size of the database file.

But how can we know which pages a table or index, or more precisely a heap or an index, is using? Every heap and every index has an IAM (Index Allocation Map) page. The IAM page also maps approximately 4 GB in the database file, with a bit for each extent that this heap or index is using. IAM pages are of course not at fixed positions, so SQL Server keeps track of the address (page number) of the first IAM page at the heap or index level. If the heap or index uses extents across an area larger than 4 GB in the database file, there are more IAM pages, with pointers from the first IAM page to the next one.

Finally, we have PFS (Page Free Space) pages. A PFS keeps track of approximately how full a page is. The first PFS is the second page (page number 1) in each data file. A PFS page maps approximately 64 MB in the data file, and PFS pages are repeated for each additional 64 MB portion. Each byte (not bit) in the PFS represents one page in that 64 MB area, recording roughly how much free space there is on the page. The PFS is not maintained for index pages, since that information is not of interest there: when you insert a new row into an index, we always know where the row goes, in the right position according to the index key.

So, there you have it: a look at the allocation architecture for SQL Server. Now, you might wonder whether we really have to know or care about these things. In most cases, no; you can happily administer and program your SQL Server with no knowledge of GAM, SGAM, IAM and PFS pages. However, this understanding can help you interpret some error messages, understand index fragmentation, or simply de-mystify the engine that is "under the hood" of SQL Server.
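If you want to see these structures for yourself, there are well-known but undocumented DBCC commands for inspecting allocations. A quick sketch (database and table names are placeholders; use a test system, as behavior may vary between versions):

    -- List the pages allocated to a table, including its IAM page(s)
    DBCC IND ('SalesDb', 'dbo.Person', -1);

    -- Dump a single page: database, file number, page number, detail level
    DBCC TRACEON (3604);            -- route DBCC PAGE output to the client
    DBCC PAGE ('SalesDb', 1, 2, 3); -- page 2 of file 1 is the first GAM page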

If you were to dig further you will discover more details. For example:

  • A heap or index can be partitioned (having more than one partition) and you will see that IAMs are actually at the partition level.
  • There are more allocation types (again, with their own IAMs) than heap and index. We have LOB pages, i.e. pages for data types such as varchar(max), nvarchar(max), varbinary(max), text, ntext and image. And there are also allocations for row-overflow pages: a row can have a combination of variable-length columns such that it no longer fits on one page, and this is when SQL Server stores some of the column values on such row-overflow pages.

If you want to dig deeper into this area, I recommend the SQL Server documentation (SQL Server Books Online, BOL) as a good starting point. Unfortunately, Microsoft has decided to no longer maintain some of the architectural sections of the product documentation, but we can use the SQL Server 2008 R2 version of BOL, which is still accurate:

https://technet.microsoft.com/en-us/library/cc280361(v=sql.105).aspx

Transaction logging

As you know, each database has at least one file for its transaction log, the ldf file. You can have more than one, but this won't give you any performance benefit, since they are used serially, one after the other. Here's a brief description of what happens when you modify data in SQL Server:

Every modification is always done in a transaction. Among other things, a transaction is defined as a number of modifications that should be performed all or none: an atomic operation. By default, a single modification statement, such as an INSERT, UPDATE or DELETE, is performed within its own transaction, meaning that if anything fails while the modification command is being executed, everything performed up to that point for that command will be rolled back.

You can also group several modification commands inside the same transaction, using commands such as BEGIN TRANSACTION, COMMIT TRANSACTION and ROLLBACK TRANSACTION.
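A minimal sketch of an explicit transaction (the table and columns are hypothetical):

    BEGIN TRANSACTION;

    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

    -- Both modifications become permanent together, or not at all
    COMMIT TRANSACTION;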

When a transaction is started, either implicitly using a modification command, or explicitly by a BEGIN TRANSACTION command, SQL Server will record in the transaction log that this session has started a transaction.

For each modification (a row is inserted, updated or deleted, and similarly for index modifications), SQL Server will make sure the modified page is in cache; every read and modification is served from cache. If the page isn't in cache already, it is read from disk and brought into cache. A log record is constructed to reflect the modification and written to the transaction log (not necessarily physically to the file yet). Now the page can be modified in cache. This happens for each modification within the transaction. At the end of the transaction, such as a commit, SQL Server reflects the commit in the transaction log and also makes sure all log records produced up to that point in time are physically written to disk ("force log write at commit"). This is why you want good write performance where the ldf file is located: writes to the ldf file are synchronous, and the application waits until the operating system and SQL Server have acknowledged that the write operation has been performed.

The pages that have been modified are dirty at this point: they have been modified since being brought into cache and don't look the same in cache as on disk. SQL Server performs checkpoints now and then, where it writes all dirty pages (for the database) to disk and reflects this in the transaction log. A checkpoint gives SQL Server a starting point for recovery, for instance when you start SQL Server. It finds the most recent checkpoint in the transaction log, uses the log to make sure that all modifications recorded there have actually been made to the data pages, and also rolls back any transactions that were incomplete when your SQL Server was stopped. We sometimes refer to this as REDO and UNDO. You can see information about this recovery process in the SQL Server errorlog file, from when you started your SQL Server.

Tools

You get several tools to help you manage SQL Server. The following is a list of the most commonly used and most important tools included.

SQL Server Management Studio (SSMS)

SSMS is quite simply a graphical user interface allowing you to manage your SQL Server. You can connect to a SQL Server (or some of the other components) and use Object Explorer to view your databases, tables, views etc. You can right-click on an object and get a context-menu allowing you to perform various tasks against that object. And you can open query windows, allowing you to type SQL queries and commands and execute them — and of course also save these as files, “script files.”

SQL Server Configuration Manager

This tool allows you to configure your server and perform changes that in general are stored outside of SQL Server (mostly in the Windows Registry) instead of inside SQL Server. These settings include items such as what Windows account each service is started as, startup parameters and network settings for your SQL Server.

SQLCMD.EXE

SQLCMD is a command-line tool for executing queries against a SQL Server. You can do this by specifying the query as a command-line option, by reading queries from an input file, or by using the tool interactively, where you get a prompt at which you can type queries and use GO to send them to your SQL Server.
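A couple of typical invocations (server and file names are placeholders):

    REM Run a single query against a named instance, using Windows authentication
    SQLCMD -S MyServer\MyInstance -E -Q "SELECT name FROM sys.databases"

    REM Execute the queries in a script file and write the results to an output file
    SQLCMD -S MyServer\MyInstance -E -i C:\Scripts\MyScript.sql -o C:\Scripts\Out.txt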

SQL Server Documentation (also known as Books Online)

This is exactly what it sounds like: the documentation for SQL Server. By default, this will take you to a website for the documentation, which has several limitations: navigation is slow compared to a local help application, and the search functionality is powered by Bing and not limited to SQL Server. You can switch the documentation to a local application using the "Manage Help Settings" tool: change the setting to local, and then download the parts you are most interested in using the same tool.

APIs, the client and network architecture

In this context, a client application ("client" for short) is an application that logs on to your SQL Server instance and submits SQL queries to it. This can be SSMS, SQLCMD, or any application that utilizes SQL Server to store and manage its data. Regardless of what this application is, there is some common ground regarding how it communicates with your SQL Server.

The developers who created this application needed a way to submit SQL queries from the application to your SQL Server instance. For example, say we are using a language like C#. We can't just put SQL queries in our source code and expect our (C#) compiler to understand them. We need a library (loosely speaking) with which we can send our SQL to SQL Server and, after execution, read the results returned from SQL Server. This is commonly known as our database API (Application Programming Interface).

There have been many database APIs over the years, including DB-Library, ADO, RDO, ODBC, OLE DB, ADO.NET and JDBC, to mention a few. The actual implementation of the API can vary, but it is common to implement it as one or more DLL files. From the programmer's perspective, it can be, for instance, function calls in a DLL, or a class library. Some APIs might be limited to a certain programming environment; ADO.NET, for instance, is only available in the .NET programming environment.

Whichever API is used, typically, the DLL that implements the API needs to be able to send the SQL command from the client application to your SQL Server. This is done using what we call a “Network Library,” or Netlib. This needs to be implemented in both the client machine and the server machine. On the server side, we sometimes refer to this as an Endpoint. Nowadays, there are three different Netlibs:

Shared Memory

This is only usable when the client application runs on the same machine as the SQL Server instance. Due to its architecture (a shared memory area), it is the fastest Netlib.

Named Pipes

This one is pretty much outdated, since it uses NetBIOS, something we have been trying to remove from our environments for the last couple of decades.

TCP/IP Sockets

This is what we typically use. This is also the default Netlib, the one that a client will try first – assuming the server isn’t on the same machine as the client (where it will try Shared Memory first).

Port numbers

When I say “SQL Server” and “port number” to somebody, they either look at me like I’m from Mars or they say 1433.

A default instance will, by default (it is changeable), listen on port 1433. And a client application, when you address the SQL Server instance using only the machine name, host name or IP address, will try 1433. That is how the client finds the right instance on your server machine. But what if you have several SQL Server instances? Remember that we can have only one default instance, and so only one instance listening on 1433.

A named instance picks a free port by asking the operating system when it first starts. This port number is saved in the registry, and on the next start the instance will try to use the same port number. You can see which port an instance is using by looking at the startup information in the SQL Server "errorlog" file, or by using the SQL Server Configuration Manager tool, where you can also change the port number if you wish. But we don't connect using the port number, you might say; we connect using the instance name, such as MachineName\InstanceName. So obviously something translates the instance name to a port number. This is the SQL Server Browser service. When you have the "\InstanceName" part in your connection, the client libraries send a request to the server machine on port 1434 (UDP), and the SQL Server Browser service on the server machine replies to the client with the port number that instance is listening on. We can also connect using the port number instead of the instance name, using MachineName,PortNumber (or host name or IP address instead of machine name).
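For example, assuming a named instance that happens to listen on port 50123 (an arbitrary number), both of these connections end up at the same instance:

    REM By instance name, resolved via the SQL Server Browser service (UDP 1434)
    SQLCMD -S MyServer\MyInstance -E

    REM Directly by port number, bypassing the Browser service
    SQLCMD -S MyServer,50123 -E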

Security and the logon process

We need a way to identify the people using SQL Server and make sure they can perform what they need, but not more. Or at least, this is what we should aim for. As a starting point, you need some identity in SQL Server and a way to authenticate yourself as that identity. In short, we are referring to a logon, or login, process. There are two types of logins in SQL Server:

SQL Server Logins

When you create a SQL Server login, you specify a name and a password. SQL Server stores the name and a hash of the password. Since SQL Server only stores a hashed version of the password, there is no way to "decrypt" the password, even if you find the stored bits on disk. When a client connects to a SQL Server instance using a SQL Server login, the login name and password are always sent encrypted from the client to the server, encrypted by the client libraries. The server libraries decrypt them, hash the password, and make sure the name exists and that the hashed password sent by the client matches the stored hash. This is sometimes known as a non-trusted connection. The best-known SQL login is sa, which exists from the beginning and is a member of the sysadmin role. In my opinion, we should never use the sa login: it should have a complex password, possibly be renamed, and definitely be disabled. An employee who needs system administrator privileges on your SQL Server should have their own login for such purposes instead of using the sa login.
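For example (the login name and password are of course made up):

    -- Create a SQL Server login; SQL Server stores only a hash of the password
    CREATE LOGIN AppLogin WITH PASSWORD = 'Str0ng&Long!Passphrase';

    -- And, per the recommendation above, disable the sa login
    ALTER LOGIN sa DISABLE;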

Windows Logins

When you create a Windows login, you specify a user, or a group, in your Windows environment (most likely in your domain). SQL Server grabs the SID (Security Identifier) from Windows and stores the name and SID. When you connect using a Windows login, through a trusted connection, SQL Server verifies that the SID representing the user, or the SID of any group the user is a member of, exists as a login in SQL Server. If you are allowed to connect to SQL Server through a Windows login that is a group, we can still identify the Windows user name inside SQL Server. This is important because we don't lose the ability to track who is connected or who did what.
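For example (the domain, user and group names are placeholders; the names tie in with the Sue and Accounting example further down):

    -- A Windows user and a Windows group as logins
    CREATE LOGIN [MYDOMAIN\Sue] FROM WINDOWS;
    CREATE LOGIN [MYDOMAIN\Accounting] FROM WINDOWS;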

Server roles

At the server level, there are roles you can add a login to as a member. There's the public role, of which every login is a member; this can be used to grant permissions that should apply to everyone. Then there are eight fixed server roles. You cannot change the permissions that come with a fixed server role, but you can of course decide who (if anyone) should be a member of that role. Probably the best-known server role is sysadmin. As a sysadmin, you can do anything, everywhere in your SQL Server instance, so membership should of course be handed out very carefully; think of it as SQL Server's equivalent of Domain Admin. Examples of other fixed server roles are dbcreator and processadmin. As of SQL Server 2012, we can also create our own server roles, add members and grant permissions to the role instead of to individual logins.
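A quick sketch, reusing the hypothetical logins from above (ALTER SERVER ROLE and CREATE SERVER ROLE are the SQL Server 2012 and later syntax):

    -- Add a login as a member of a fixed server role
    ALTER SERVER ROLE dbcreator ADD MEMBER [MYDOMAIN\Sue];

    -- As of SQL Server 2012, you can also create your own server role
    CREATE SERVER ROLE OpsRole;
    ALTER SERVER ROLE OpsRole ADD MEMBER AppLogin;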

Gaining access to a database – the user

A login only allows you to connect to the instance. This is meaningless unless you can also access the database(s) you need. This is the user concept. Sometimes we think of this as mapping a login to a database, or granting the login access to the database. But what we are really doing is creating a user (with a name) in the database that "points to" the login. This allows the login to access the database. You can then grant permissions to this user so that it can, for instance, SELECT from a table. In most cases, we of course use the same name for the user as for the login. The mapping from user to login is made using the login's SID, though. A Windows login's SID comes from Windows, and a SQL Server login's SID is generated by SQL Server when you create that login. There is always a user named dbo in each database; the login that owns a database has the dbo user in that database.
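A minimal sketch, reusing the hypothetical AppLogin and names from above:

    -- In the database, create a user for the login, then grant permissions to the user
    USE SalesDb;
    CREATE USER AppLogin FOR LOGIN AppLogin;
    GRANT SELECT ON dbo.Person TO AppLogin;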

Database roles

Just like the server roles, we also have database roles. We can add a user as a member of a database role, and the permissions granted to the database role are then also available to that user. There's a public database role of which every user is a member; you can grant permissions to it if you want the permissions to apply to all users in the database. And there are also fixed database roles. db_owner gives the same privileges as the dbo user. Examples of other fixed database roles are db_datareader and db_backupoperator.

We can also create our own database roles, assign permissions to them and add users to such a role.
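A minimal sketch (role and object names are made up; ALTER ROLE ... ADD MEMBER is the SQL Server 2012 and later syntax, earlier versions use sp_addrolemember):

    -- Create a database role, grant to the role, then add members
    CREATE ROLE SalesReaders;
    GRANT SELECT ON SCHEMA::dbo TO SalesReaders;
    ALTER ROLE SalesReaders ADD MEMBER AppLogin;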

Permissions

Some permissions can be inherited from the fixed server or database roles. Other permissions you grant specifically, either to a role (server or database, depending on the type of permission) or directly to a login or user. In general, you can perform an operation if you have been granted the privilege, unless there is a "deny" for that operation. Privileges granted accumulate, and the following paragraph is an example of this:

Sue connects to SQL Server using Windows authentication. The Sue Windows account exists as a login in SQL Server. Sue is also a member of the Accounting Windows group, and that group also exists as a Windows login on your SQL Server. The Sue login has a user in database A, and the Accounting login has access to database B. Sue will be able to access both database A and database B. In database A, the Sue user is also a member of a database role named X. In this database, Sue will be able to perform the operations that have been granted to her, and also the operations that have been granted to the X database role. The same principle goes for the B database, of course. The exception to this is DENY, which overrides GRANT. With DENY, you know that a login or user cannot perform the operation in question. But there is an exception to this as well: a sysadmin can do everything in the instance. SQL Server doesn't even check DENY for somebody who is sysadmin.
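A sketch of the DENY behavior just described (the role, table and account names are illustrative):

    -- The X role gives Sue SELECT on the whole dbo schema in database A...
    GRANT SELECT ON SCHEMA::dbo TO X;

    -- ...but a DENY wins: Sue cannot read this table (unless she is sysadmin)
    DENY SELECT ON dbo.Salaries TO [MYDOMAIN\Sue];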

Data transfer, exporting and importing data

It seems like the need for import and export has exploded over the last decade or so. We exchange data for business purposes, export for reporting, perform migrations, build data warehouses, etc. Years ago, the tools for export and import were very simple; as soon as you needed something more advanced, like understanding various file formats, you needed to go outside of what came with the product. Then SQL Server 7.0 was introduced and popularized "Data Warehousing." We needed a tool to export data from our production systems into our data warehouse, and version 7.0 introduced DTS (Data Transformation Services). Since then, we have seen various new tools, commands and improvements in this area.

Importing data, bulk loading

Importing data into a table is also known as loading data. There are specific code paths in SQL Server to facilitate loading data as quickly and efficiently as possible; this is different from performing regular INSERTs. Various types of optimization can take place in this scenario, such as minimized transaction logging, optimizations in how storage is allocated, and potentially bypassing database constraints and triggers. Some tools and commands designed for data loading always use the bulk loading functionality in SQL Server, while others let you choose whether to load the data using regular INSERTs or as a bulk load. Tools and functionality for bulk loading data include:

BCP.EXE, which is a console command (a command-line exe file). It allows you to export from a table to a file, or vice versa. It uses command-line switches to specify the table, file, direction, etc.

BULK INSERT is pretty much the same as BCP, but as a T-SQL command and only for import. An architectural difference between the two: with BULK INSERT, it is the database engine that reads the file to import; with BCP, it is the client (BCP.EXE) that reads from (or writes to) the file. An example of each follows.
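For example (the server, table and file names are placeholders):

    REM Export a table to a character-format file with BCP, using a trusted connection
    BCP SalesDb.dbo.Person out C:\Data\person.txt -S MyServer -T -c

    -- Import the same file with BULK INSERT; here it is the engine that reads the file
    BULK INSERT dbo.Person
    FROM 'C:\Data\person.txt'
    WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n');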

SQL Server Integration Services (SSIS) is a more advanced tool for export and import. We sometimes refer to these types of tools as ETL tools, as in Extract, Transform and Load. You define the transfer using a design tool. This came with the product and was named “SQL Server Business Intelligence Development Studio” in SQL Server 2008 and “SQL Server Data Tools” in SQL Server 2012. As of SQL Server 2014, this no longer comes with the product and is a separate download. Regardless of version, the design tool is in the end a plug-in to the Visual Studio development environment.

When you create an SSIS package, you add and configure Tasks. A task performs an action, such as running an EXE file, creating a folder, executing a SQL command or sending an email. There are a number of tasks available, and you connect these using "Precedence Constraints," which basically define the sequence in which your tasks execute. A special task type is the Data Flow Task, where you add data sources and destinations. And in between, you can have transformations, which can do things such as lookups, aggregations and calculations.

The package can then be executed using various methods. You can use the Execute Package Utility GUI program, DTEXEC.EXE command-line program or a SQL Server Agent job step. There is much more functionality in SSIS. You can also create SSIS packages using the easier to use but less powerful Import and Export Wizards.

Read now
Author: Pascal di Marco, Solutions Architect Southern EMEA

This guide explains how to achieve application consistency in MySQL and MariaDB databases with Veeam Backup & Replication. To answer the Availability challenge, modern tools must do more than safely transfer data blocks from one container to another. Application consistency allows for the best possible combination of safety and Recovery Time Objectives (RTOs) to keep business-critical operations online.

Read now
Author: Ali Salman, EMEA Solutions Architect, Veeam Software

SQL AlwaysOn Availability meets Veeam® Availability for the Always-On Enterprise®.

Virtualization provides agility and flexibility so businesses can achieve an Always-On Enterprise™. As part of the virtualization strategy, enterprises are virtualizing their mission-critical applications. Enterprise-level companies commonly use the Microsoft SQL platform for their Tier 1 applications, and this platform is frequently operated in VMware and Hyper-V environments.

For the AlwaysOn Availability of SQL database, organizations have the following choices:

  • Microsoft SQL Failover Cluster
  • Microsoft SQL AlwaysOn Availability Groups

To learn more, read the whitepaper Benefits of using SQL AlwaysOn Availability Groups for Virtual Environment & Veeam.

Read now
Author: Kirsten Stoner, Technical Analyst, Product Strategy

When deploying Tier-1 business applications, most administrators choose to go virtual for many reasons, including reducing hardware dependency, consolidating workloads, portability and the ability to improve application performance. Going virtual also means having the right recovery tools to let your critical business applications such as Microsoft Exchange, Active Directory, SharePoint and SQL Server be available 24/7 with no data loss.

An application-consistent backup tool is beneficial when it comes to applications such as Microsoft Exchange, SharePoint, SQL Server and Active Directory. These data sources can become quite large, and since they are critical to running an organization, they need to remain online while being backed up and in select restore situations. The data sources don't necessarily need to be recovered in their entirety; a full restore could potentially lead to downtime and data loss. Item-level recovery gives administrators the option to restore one or more items from the backup without going offline.

That is why, for every admin, item-level recovery might be the first resort instead of restoring the whole Tier-1 application workload, which is very time-consuming and in most cases unnecessary. To fulfill the Tier-1 recovery demand, Veeam® offers application-specific Explorers, allowing you to restore individual items from applications in the form of their native database types.

Read this white paper to learn details on specific application workloads and how you can utilize Veeam to perform item-level recovery. You’ll also read about:

  • 47 restore scenarios Veeam provides
  • Methods for item-level restores for Tier-1 applications
  • Data loss avoidance
  • And more!
Read now
White Paper
February 17, 2009
Author: Ricky El-Qasem, vExpert and startup warrior

MySQL requires a specific approach to transaction-consistent database backup, especially in a non-Microsoft operating system like Linux. If you are administering virtualized Linux VMs and looking for guidelines to protect your critical data, this white paper will provide you with scripting examples and best practices.

This white paper describes how to back up a Linux VM using scripts local to the MySQL database. In VMware environments, scripts can perform actions before and after the VM snapshot creation, thus enabling hot backups of running VMs. To achieve database consistency, you can choose whether to suspend the MySQL service or use an online dump of each MySQL database.

This guide discusses advantages and disadvantages of both options and gives detailed instructions on how to use each method. You will get scripting examples for VMware tools installation on your Ubuntu server and learn to create pre-freeze and post-thaw scripts.

Read this FREE step-by-step guide and learn how to:

  • Prepare your infrastructure for backup
  • Create scripts for hot backup with Linux and MySQL command-line
  • And much more!
Read now