View on SQL Server View

  • 2:06 AM
  • 0 comments

Advantages


Views are used for several different reasons:


1.     To hide data complexity. Instead of forcing your users to learn the T-SQL JOIN syntax you might wish to provide a view that runs a commonly requested SQL statement.
 
2.     To protect the data. If you have a table containing sensitive data in certain columns, you might wish to hide those columns from certain groups of users. For instance, customer names, addresses and their social security numbers might all be stored in the same table; however, for lower level employees like shipping clerks, you can create a view that only displays customer name and address. You can grant permissions to a view without allowing users to query the underlying tables. There are a couple of ways you might want to secure your data:
 
1.     Create a view to allow reading of only certain columns from a table. A common example of this would be the salary column in the employee table. You might not want all personnel to be able to read manager's or each other's salary. This is referred to as partitioning a table vertically and is accomplished by specifying only the appropriate columns in the CREATE VIEW statement.
 
2.     Create a view to allow reading only certain rows from a table. For instance, you might have a view for department managers. This way, each manager can provide raises only to the employees of his or her department. This is referred to as horizontal partitioning and is accomplished by providing a WHERE clause in the SELECT statement that creates a view.
 
      Enforcing some simple business rules. For example, if you wish to generate a list of customers that need to receive the fall catalog, you can create a view of customers that have previously bought your shirts during the fall.
 
      Data exports with BCP. If you are using BCP to export your SQL Server data into text files, you can format the data through views since BCP's formatting ability is quite limited.
 
      Customizing data. If you wish to display some computed values or column names formatted differently than the base table columns, you can do so by creating views.
Disadvantages


Even though views can be a great tool for securing and customizing data, they can be slow. Indeed, they are not any faster than the query that defines them. With SQL Server 2000, indexed views (also referred to as "materialized" views) are supported to overcome this limitation. 



Views can especially degrade the performance if they are based on other views. Therefore, it is recommended NOT to create views based on other views. All views should be created against base tables

From the database management system (DBMS) perspective, a view is a description of the data (a form of metadata). When a typical view is created, the metadata is defined by encapsulating a SELECT statement that defines a result set to be represented as a virtual table. When a view is referenced in the FROM clause of another query, this metadata is retrieved from the system catalog and expanded in place of the view's reference. After view expansion, the SQL Server query optimizer compiles a single execution plan for executing the query. The query optimizer searches though a set of possible execution plans for a query, and it chooses the lowest-cost plan it can find, based on estimates of the actual time it will take to execute each query plan.
In the case of a nonindexed view, the portions of the view necessary to solve the query are materialized at run time. Any computations such as joins or aggregations are done during query execution for each query referencing the view [Note1] . After a unique clustered index is created on the view, the view's result set is materialized immediately and persisted in physical storage in the database, saving the overhead of performing this costly operation at execution time.
The indexed view can be used in a query execution in two ways. The query can reference the indexed view directly, or, more importantly, the query optimizer can select the view if it determines that the view can be substituted for some or all of the query in the lowest-cost query plan. In the second case, the indexed view is used instead of the underlying tables and their ordinary indexes. The view does not need to be referenced in the query for the query optimizer to use it during query execution. This allows existing applications to benefit from the newly created indexed views without changing those applications.
Pros
Indexed views can increase query performance in the following ways:
  • Aggregations can be precomputed and stored in the index to minimize expensive computations during query execution.
  • Tables can be prejoined and the resulting data set stored.
  • Combinations of joins or aggregations can be stored.
Secondary, nonclustered indexes on views can provide additional query performance. Similar to nonclustered indexes on tables, nonclustered indexes on views may provide more options for the query optimizer to choose from during the compilation process.
Applications that benefit from the implementation of indexed views include:
  • Decision support workloads.
  • Data marts.
  • Data warehouses.
  • Online analytical processing (OLAP) stores and sources.
  • Data mining workloads.
From the query type and pattern point of view, the benefiting applications can be characterized as those containing:
  • Joins and aggregations of large tables.
  • Repeated patterns of queries.
  • Repeated aggregations on the same or overlapping sets of columns.
  • Repeated joins of the same tables on the same keys.
  • Combinations of the above.
Cons:
Adding indexes to the schema increases the overhead on the database because the indexes will require ongoing maintenance.
Not all queries will benefit from indexed views. Similar to ordinary indexes, if the indexed views are not used, there is no benefit. In this case, not only are performance gains not realized, but the additional cost of disk space, maintenance, and optimization is incurred. Online transaction processing (OLTP) systems with many writes, or database applications with frequent updates, may not be able to take advantage of indexed views because of the increased maintenance cost associated with updating both the view and underlying base tables.
Indexes on tables and indexed views should be designed concurrently to obtain the best results from each construct. Because both indexes and indexed views may be useful for a given query, designing them separately can lead to redundant recommendations that incur high storage and maintenance overhead. While you tune the physical design of a database, tradeoffs must be made between the performance requirements of a diverse set of queries and updates that the database system must support. Therefore, identifying an appropriate physical design for indexed views is a challenging task, and the Database Tuning Advisor should be used wherever it is possible.
Query optimization cost can increase substantially if there are many indexed views that the query optimizer may consider for a particular query. A query optimizer may consider all indexed views that are defined on any subset of tables in the query. Each view has to be investigated for the potential substitution before it is rejected. This may take some time, especially if there are hundreds of such views for a given query.

Indexed View Background:
Views are also known as virtual tables because the result set returned by the view has the same general form as a table with columns and rows, and views can be referenced just like tables in SQL statements. The result set of a standard view is not stored permanently in the database. Every time a query references a standard view, SQL Server substitutes the definition of the view into the query internally until a modified query is formed that only references base tables
From DBMS perspective, a view is a description of the data, a form of metadata. It is possible to create a unique clustered index on a view, as well as non-clustered indexes, to improve data access performance on the most complex queries. In SQL Server 2000 and 2005, a view that has a unique clustered index is referred to as an indexed view. The main benefit of using index view is that the query optimizer searches though a set of possible execution plans for a query, and chooses the lowest-cost plan it can find, based on estimates of the actual time it will take to execute each query plan.
In the case of a non-indexed view, the portions of the view necessary to solve the query are materialized at run time. Any computations such as joins or aggregations are done during query execution for each query referencing the view. After a unique clustered index is created on the view, the view's result set is materialized immediately and persisted in physical storage in the database, saving the overhead of performing this costly operation at execution time.
The indexed view can be used in a query execution in two ways. The query can reference the indexed view directly, or, more importantly, the query optimizer can select the view if it determines that the view can be substituted for some or the entire query in the lowest-cost query plan. In the second case, the indexed view is used instead of the underlying tables and their ordinary indexes. The view does not need to be referenced in the query for the query optimizer to use it during query execution. This allows existing applications to benefit from the newly created indexed views without changing those applications.
Advantages of Indexed View:
Indexed views improve the performance in the following ways:
  • Joins and aggregations that process many rows.
  • Join and aggregation operations that are frequently performed by many queries.
  • Decision support workloads.
  • The real benefits of Indexed Views are when we have aggregates that are too expensive to compute in real time.
  • Aggregations can be pre-computed and stored in the index to minimize expensive computations during query execution.
  • Tables can be pre-joined and the resulting data set stored.
  • Combinations of joins or aggregations can be stored.
  • Indexed views may be substituted by the query optimizer if it is determined that the cost of the query will be less than using the base table.
For a standard view, the overhead of dynamically building the result set for each query that references a view can be significant for views that involve complex processing of large numbers of rows, such as aggregating lots of data, or joining many rows. If such views are frequently referenced in queries, we can improve performance by creating a unique clustered index on the view. When a unique clustered index is created on a view, the result set is stored in the database just like a table with a clustered index is stored.
Another benefit of creating an index on a view is that the optimizer starts using the view index in queries that do not directly name the view in the FROM clause. Existing queries can benefit from the improved efficiency of retrieving data from the indexed view without having to be recoded.
As modifications are made to the data in the base tables, the data modifications are reflected in the data stored in the indexed view. The requirement that the clustered index of the view be unique improves the efficiency with which SQL Server can find the rows in the index that are affected by any data modification. The capability of the query optimizer to take advantage of indexed views when it processes queries has improved over earlier versions when both the query and the view definition contain the following matching elements:
Disadvantages of Indexed View:
To use an indexed view in all other editions, the NOEXPAND table hint must be used. Indexed views work best when the underlying data is infrequently updated. The maintenance of an indexed view can be greater than the cost of maintaining a table index. If the underlying data is updated frequently, the cost of maintaining the indexed view data may outweigh the performance benefits of using the indexed view.
Another cost of indexed views is that the data is actually stored. By applying the clustered index we are creating a copy of the data. So if we have several indexed views on a single table, we will have several copies of the data, along with the statistics of the data and the indexes and all of the overhead that goes with it.
Following are the disadvantages using index views that do not improve the performance:
  • OLTP systems that have many writes to the disk.
  • Databases that have many updates.
  • Queries that do not involve aggregations or joins.
  • High numbers of updates will most likely kill the benefit. If it is mainly reads then it will work fine.
  • The overuse of indexed views can be like winding a watch with a wrench.
  • Online index rebuilds are not supported for indexed views.
I agree with Lutz that Filtered index is better and top of his comments on filtered index versus indexed view; we can create our filtered index as a non-unique index and where as indexed views can only be created as a unique index. One point to keep in mind is we cannot create a filtered index on a view. However, the query optimizer can benefit from a filtered index defined on a table that is referenced in a view. With filtered index, online index rebuilds is possible. We can rebuild filtered indexes while they are available for queries. Online index rebuilds are not supported for indexed views.

When To Use A View

You need to have a goal in mind when creating a view. There are a number of scenarios where you will want to look for a view as a solution.
  • To hide the complexity of the underlying database schema, or customize the data and schema for a set of users.
  • To control access to rows and columns of data.
  • To aggregate data for performance.
Let’s take a look at each of these scenarios.

Complexity and Customization

Taking care of complex joins and filtering rules inside of a view can benefit other users. As an example, consider the following view from the Northwind database.
CREATE VIEW "Order Details Extended" AS
  SELECT 
     "Order Details".OrderID, 
     "Order Details".ProductID, 
     Products.ProductName, 
     "Order Details".UnitPrice, 
     "Order Details".Quantity, 
     "Order Details".Discount, 
     (CONVERT(money,("Order Details".UnitPrice*Quantity*(1-Discount)/100))*100) AS ExtendedPrice
  FROM 
      Products 
      INNER JOIN 
        "Order Details" ON 
          Products.ProductID = "Order Details".ProductID
A business user with an ad-hoc reporting tool can take advantage of the above view in building customized reports to support her goals. She can use the view to see all of the details about an order without finding the tables to join for product and order information, and without performing the calculation for the price discount. Not only does this make the database easier for the end user, but it also allows a DBA to make changes to the underlying tables without breaking end user reports.
View can also be useful to novice SQL developers on your team. Although stored procedures can remove complexity from a database only a view can be the target of an INSERT, UPATE, or DELETE statement thus functioning more like a real table. There are restrictions to modifying data through views. For more information, see “Modifying Data Through a View” on MSDN. A view can ensure the novice developer always accesses a set of tables in the most efficient manner.
Take care not to let the number of special case views explode in your database. Although it might be tempting to provide every users a view to meet specific filtering requirements, remember it is possible to SELECT from an existing view and provide additional filtering. A large number of views can be difficult to maintain, especially when breaking changes are required to the underlying schema.

Views As A Security Tool

SQL Server already has the ability to restrict access to column in a table (see a previous article on OdeToCode for more information on this topic). However, a view can be a useful tool to provide both column and row level security for a database. Even though you can deny access to the salary column of a payroll table to user Joe, Joe might be confused when he can see the payroll table but has an error appear when his SELECT statement includes the payroll column. A user-friendly solution is to deny Joe any access to the payroll table, but give Joe access to a view that retrieves data from payroll without salary information.
An even more interesting scenario happens when Joe is a department manager and requires access to the salary column, but only for those employees in his department. Row level security is a perfect fit for a solution with a view. You can construct a view to return only payroll records (with salary information) with Joe’s user name in the row (see the built-in SQL Server function USER_NAME()).

Aggregate Views

Another great use case for a view is if you need to roll up or aggregate data from a set of tables, as in the following Northwind view.
CREATE VIEW "Sales by Category" AS
  SELECT 
    Categories.CategoryID, 
    Categories.CategoryName, 
    Products.ProductName, 
    SUM("Order Details Extended".ExtendedPrice) AS ProductSales
  FROM 
    Categories 
      INNER JOIN 
       (Products INNER JOIN 
         (Orders INNER JOIN "Order Details Extended" ON 
            Orders.OrderID = "Order Details Extended".OrderID) 
         ON Products.ProductID = "Order Details Extended".ProductID) 
        ON Categories.CategoryID = Products.CategoryID
  GROUP BY 
    Categories.CategoryID, 
    Categories.CategoryName, 
    Products.ProductName
The view above will give us the sum of the total sales broken down by category and product. If we wanted to see just the total sales for Chai, we could use the following query.
SELECT ProductSales FROM [Sales by Category] WHERE ProductName = 'Chai'
Queries performing aggregations can be harder to optimize than other queries. Using can a view can not only hide the complexity but can also ensure anyone who needs the aggregation will be using an optimized query.

Read more »

Know System Databases

  • 10:23 PM
  • 1 comments
  • Master
    • Purpose – Core system database to manage the SQL Server instance. In SQL Server 2005, the Master database is the logical repository for the system objects residing in the sys schema. In SQL Server 2000 and previous editions of SQL Server, the Master database physically stored all of the system objects.
    • Prominent Functionality
      • Per instance configurations
      • Databases residing on the instance
      • Files for each database
      • Logins
      • Linked\Remote servers
      • Endpoints
    • Additional Information
      • The first database in the SQL Server startup process
      • In SQL Server 2005, needs to reside in the same directory as the Resource database
  • Resource
    • Purpose – The Resource database is responsible for physically storing all of the SQL Server 2005 system objects. This database has been created to improve the upgrade and rollback of SQL Server system objects with the ability to overwrite only this database.
    • Prominent Functionality
      • System object definition
    • Additional Information
      • Introduced in SQL Server 2005 to help manage the upgrade and rollback of system objects
      • Prior to SQL Server 2005 the system related data was stored in the master database
      • Read-only database that is not accessible via the SQL Server 2005 tool set
      • The database ID for the Resource database is 32767
      • The Resource database does not have an entry in master.sys.databases
  • TempDB
    • Purpose – Temporary database to store temporary tables (#temptable or ##temptale), table variables, cursors, work tables, row versioning, create or rebuild indexes sorted in TempDB, etc. Each time the SQL Server instance is restarted all objects in this database are destroyed, so permanent objects cannot be created in this database.
    • Prominent Functionality
      • Manage temporary objects listed in the purpose above
    • Additional Information
      • Each time a SQL Server instance is rebooted, the TempDB database is reset to its original state
  • Model
    • Purpose – Template database for all user defined databases
    • Prominent Functionality
      • Objects
      • Columns
      • Users
    • Additional Information
      • User defined tables, stored procedures, user defined data types, etc can be created in the Model database and will exist in all future user defined databases
      • The database configurations such as the recovery model for the Model database are applied to future user defined databases
  • MSDB
    • Purpose – Primary database to manage the SQL Server Agent configurations
    • Prominent Functionality
      • SQL Server Agent Jobs, Operators and Alerts
      • DTS Package storage in SQL Server 7.0 and 2000
      • SSIS Package storage in SQL Server 2005
    • Additional Information
      • Provides some of the configurations for the SQL Server Agent service
      • For the SQL Server 2005 Express edition installations, even though the SQL Server Agent service does not exist, the instance still has the MSDB database
  • Distribution
    • Purpose – Primary data to support SQL Server replication
    • Prominent Functionality
      • Database responsible for the replication meta data
      • Supports the data for transaction replication between the publisher and subscriber(s)
  • ReportServer
    • Purpose – Primary database for Reporting Services to store the meta data and object definitions
    • Prominent Functionality
      • Reports security
      • Job schedules and running jobs
      • Report notifications
      • Report execution history
  • ReportServerTempDB
    • Purpose – Temporary storage for Reporting Services
    • Prominent Functionality
      • Session information
      • Cache
  • Read more »

    Types of startup accounts

    • 12:49 AM
    • 2 comments

    Local User Account: This user account is created in your server where SQL Server is installed, this account does not have access to network resources.
    Local Service Account: This is a builtin windows account that is available for configuring services in windows. This account has permissions as same as accounts that are in the users group, thus it has limited access to the resources in the server. This account is not supported for SQL SERVER and AGENT services.

    Local System Account:
     This is a builtin windows account that is available for configuring services in windows. This is a highly privileged account that has access to all resources in the server with administrator rights.
    Network Service Account: This is a builtin windows account that is available for configuring services in windows. This has permissions to access resources in the network under the computer account.
    Domain Account: This account is a part of your domain that has access to network resources for which it is intended to have permission for. It is always advised to run SQL Server and related services under a domain account with minimum privilege need to run SQL Server and its related services.
    Where can you see the Local System, Local Service and Network Service accounts? These are windows in-built accounts that are part of the operating system and assigned to the users, you cannot use the account to login to the system, these accounts are meant to be used for securing and authentication mechanism.
    Read more »

    SQL Server 2012 Certification Exam Codes

    • 7:10 AM
    • 1 comments

    v  Querying Microsoft SQL Server (70-461) [BI and DBA]
    v  Administering a Microsoft SQL Server 2012 Database (70-462) [DBA]
    v  Building Data Warehouses with Microsoft SQL Server 2012 (70-463) [BI]
    v  Developing a Microsoft SQL Server 2012 Database (70-464) [DBA]
    v  Designing Database Solutions for SQL Server 2012 (70-465) [DBA]
    v  Implementing Data Models and Reports with Microsoft SQL Server 2012 (70-466) [BI]
    v  Designing Business Intelligence Solutions with Microsoft SQL Server 2012 Platform (70-467) [BI]

    Read more »

    FILESTREAM Example

    • 9:36 AM
    • 0 comments

    FILESTREAM Feature

    FILESTREAM was introduced in SQL Server 2008 for the storage and management of unstructured data. The FILESTREAM feature allows storing BLOB data (example: word documents, image files, music and videos etc) in the NT file system and ensures transactional consistency between the unstructured data stored in the NT file system and the structured data stored in the table.
    Storing BLOB data in NT file system, allows SQL Server to take advantage of the NTFS I/O streaming capabilities and at the same time, maintain transactional consistency of the data. FILESTREAM uses NT System Cache for caching file data. This minimizes the effect that FILESTREAM data might have on the Database Engine performance. When accessing FILESTREAM data through the streaming API, SQL Server buffer pool is not used and hence it does not reduce the amount of memory available for Database Engine query processing.
    The term ‘FILESTREAM data type’ or ‘FILESTREAM column’ is very commonly used and it gives an indication that the FILESTREAM feature is implemented as a data type. This is not true. FILESTREAM is not a data type; instead, it is an attribute that can be assigned to a VARBINARY (MAX) column. When the FILESTREAM attribute of a VARBINARY (MAX) column is set, it becomes a ‘FILESTREAM enabled’ column. Any data that you store in such columns will be stored in the NT file system as a disk files and a pointer to the disk file is stored in the table.  A VARBINARY (MAX) column with FILESTREAM attribute is not restricted to the 2 GB limit SQL Server imposes on Large Value Types. The size of the file is limited by the size of the disk volume only.
    When the FILESTREAM attribute is set, SQL Server stores the BLOB data in the NT file system and keeps a pointer the file, in the table. SQL Server ensures transactional consistency between the FILESTREAM data stored in the NT file system and the structured data stored in the tables.

    Installing and Configuring FILESTREAM

    The default installation of SQL Server 2008 disables FILESTREAM feature. It is quite easy to enable the FILESTREAM feature as part of a new SQL Server 2008 installation. If the SQL Server 2008 instance is already installed without FILESTREAM feature, it can still be enabled following the steps explained later in this section.

    Enabling FILESTREAM as part of installation

    The easiest way to enable FILESTREAM feature is to do so as part of the installation process. You will see a new tab labeled “FILESTREAM” on the “Database Engine Configuration” page of the installation wizard. This tab allows you to specify the FILESTREAM configuration options that you wish to enable on your new SQL Server Instance.
    FILESTREAM feature may be enabled with three different levels of access to the FILESTREAM data, namely: 
    1. Enable  FILESTREAM for Transact-SQL access
    2. Enable FILESTREAM for file I/O streaming access
    3. Allow remote clients to have streaming access to FILESTREAM data

    Enabling FILESTREAM during an Unattended installation

    If you are installing SQL Server 2008 in ‘Unattended mode’, you can use /FILESTREAMLEVEL and /FILESTREAMSHARENAME configuration options to configure FILESTREAM features. For more information about installing SQL Server 2008 in ‘Unattended mode’, see How to: Install SQL Server 2008 from the Command Prompt

    Enabling FILESTREAM after installation

    Enabling FILESTREAM feature on a SQL Server 2008 Instance, installed without FILESTREAM features, would require a little more work. To enable FILESTREAM feature, go to ‘SQL Server configuration Manager’ and right click on instance of SQL Server Service and select ‘properties’. The property page will show an additional tab labeled ‘FILESTREAM’, which looks exactly the same as the FILESTREAM configuration page shown as part of the installation wizard. The FILESTREAM feature can be enabled by setting the appropriate options on this page.
    If you wish to automate this process, you could try running the VBScript given at How to enable FILESTREAM from the command line. to do it.
    Once FILESTREAM feature is enabled, the next step is to configure FILESTREAM Access Level. This is an additional step that is required only if you configure FILESTREAM after the installation of SQL Server. Open SQL Server Management Studio and open the properties of the SQL Server 2008 instance. Select the ‘Advanced’ tab and change the ‘FILESTREAM Access Level’ to ‘Transact-SQL Access enabled’ or ‘Full access enabled’.
    Alternatively, ‘FILESTREAM Access Level’ can be configured using TSQL by running the following statement.
    EXEC sp_configure filestream_access_level, 2
    GO
    RECONFIGURE
    GO
    The last parameter to sp_configure specifies the Access Level, where 0 means ‘Disabled’, 1 means ‘Transact-SQL Access Enabled’ and 2 means ‘Full Access Enabled’

    Using FILESTREAM

    Once you have an instance of SQL server 2008 with FILESTREAM feature enabled, you are ready to go ahead and create FILESTREAM enabled databases. When you create a FILESTREAM enabled database, the FILESTREAM data should be placed in a separate file group having a special attribute that indicates that the file group contains FILESTREAM data.
    SQL Server allows FILESTREAM file groups to be on compressed volumes. If you are using FILESTREAM on a clustering environment, FILESTREAM file groups must be on shared disk resources.

    Creating a FILESTREAM enabled database

    A basic FILESTREAM enabled database should have the following storage components:
    • MDF File
    • LOG File
    • FILESTREAM Data Container
    MDF File and LOG files are familiar to us and do not need any explanation. However, FILESTREAM Data Container may be new to some. FILESTREAM Data Container is a special folder in the NT File System where SQL Server will store FILESTREAM data as disk files. We will examine FILESTREAM Data container a little later in this article.
    Here is the script that creates a FILESTREAM enabled database.
    CREATE DATABASE NorthPole
    ON
    PRIMARY (
          NAME = NorthPoleDB,
          FILENAME = 'C:\Temp\NP\NorthPoleDB.mdf'
    ), FILEGROUP NorthPoleFS CONTAINS FILESTREAM(
          NAME = NorthPoleFS,
        FILENAME = 'C:\Temp\NP\NorthPoleFS')
    LOG ON (                        
          NAME = NorthPoleLOG,
        FILENAME = 'C:\Temp\NP\NorthPoleLOG.ldf')
    GO
    Note the portion highlighted in yellow. That is the part that adds the FILESTREAM Data Container to the database. One important point to note here is that the last sub directory in the FILESTREAM Data Container path should not already exist. In the above example,  “c:\temp\np” should exist, however, the root level folder “NorthPoleFS” should not exist in “c:\temp\np” at the time of creating the database. SQL Server will create the root folder and configure it. If the folder already exists, the operation will fail with an error.

    Understanding FILESTREAM Data Container

    The root folder where FILESTREAM data of a database is stored is called FILESTREAM Data ContainerWhen you create a database with FILESTREAM feature enabled, the FILESTREAM Data Container will be created with the path you specify in your CREATE statement. The root directory in the FILENAME parameter will be configured as FILESTREAM Data Container. It is important that this folder should not exist at the time of creating the database. SQL Server will create this folder and will configure it as FILESTREAM Data Container.

    Reviewing the FILESTREAM Data Container

    Let us go ahead and take a look at the FILESTREAM data container. You may not do this every time you create a FILESTREAM enabled database or table. However, for the purpose of understanding the FILESTREAM Data Container structure, let us take a look at how the folder looks like.
    When we created the database, we had specified the path to the location that we wanted to have configured as the FILESTREAM Data Container for our database. If you have not modified the location I specified in the code snippet, it will be “C:\temp\NP\NorthPoleFS”.  Open windows explorer and navigate to the folder you specified as FILESTREAM Data Container while creating the database.
    You will see a folder named “$FSLOG” and a file “filestream.hdr” there. The folder “$FSLOG” is the FILESTREAM equivalent of the Database Transaction Log file. “filestream.hdr” contains important metadata information used by SQL Server internally. Make sure that you do not tamper with this file.

    Creating a table with FILESTREAM columns

    Once we have a FILESTREAM enabled database, we are ready to go ahead and create tables having FILESTREAM columns. A FILESTREAM column is a VARBINARY(MAX) column that has the FILESTREAM attribute enabled. Remember, FILESTREAM  is not a new DATA TYPE, it is a new attribute added to a VARBINARY(MAX) column.
    CREATE TABLE [dbo].[Items](
       [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
       [ItemNumber] VARCHAR(20),
       [ItemDescription] VARCHAR(50),
       [ItemImage] VARBINARY(MAX) FILESTREAM NULL
    )
    Note that every table that has a FILESTREAM column should have a UNIQUEIDENTIFIER column with ROWGUIDCOLand UNIQUE attributes.

    Reviewing the FILESTREAM Data Container Changes

    Let us go back to the FILESTREAM Data Container folder and see if we can detect any change since the last time we visited it. This time you will notice that a new folder is created in the root folder with a GUID value as its name. SQL Server will create a folder for each table that uses FILESTREAM enabled columns. Within this folder, a sub folder will be created for each FILESTREAM enabled column in the table.

    Inserting FILESTREAM data

    We have got the table created. Let us now add a couple of rows to the table we created. As you must have figured out by looking at the column names, this table is intended for storing product information.  Along with the item information (ItemNumber and ItemDescription) we will also store an image of the item. The image will be stored in the FILESTREAM enabled column.
    Let us add ‘Microsoft Mouse’ as the first item to this table. I have a cute image of a Microsoft Mouse on my hard disk and let us load the content of the image file and store that to the ‘ItemImage’ column.
     For the purpose of this example, let us use OPENROWSET(BULK..) to load the content of the image file (jpg) from the disk to a VARBINARY(MAX) variable. Once the content of the image is loaded to the variable, we can store it to the FILESTREAM enabled column. Let us see the code that does it.
    -- Declare a variable to store the image data
    DECLARE @img AS VARBINARY(MAX)

    -- Load the image data
    SELECT @img = CAST(bulkcolumn AS VARBINARY(MAX))
          FROM OPENROWSET(
                BULK
                'C:\temp\MicrosoftMouse.jpg',
                SINGLE_BLOB ) AS x
               
    -- Insert the data to the table          
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    SELECT NEWID(), 'MS1001','Microsoft Mouse', @img
    Before you run the above code, make sure that you change the name of the file (“c:\temp\MicrosoftMoust.jpg”) to a valid image file name that exists in your hard disk.

    Accessing FILESTREAM Data using TSQL

    Even though the actual data of a FILESTREAM enabled column is stored in the NT File System, it will be completely transparent to the TSQL code. You can access the data in the FILESTREAM enabled column just like any other column of the given table. The following example queries all the information from the “Items” table which returns the FILESTREAM data too.
    SELECT * FROM items
    /*
    ItemID           ItemNumber ItemDescription ItemImage
    ---------------- ---------- --------------- ------------
    EE9A1CB8-3514... MS1001     Microsoft Mouse 0xFFD8FFE...
    */
    Usually the TSQL code (stored procedure) does not really do any manipulation of the FILESTREAM data except for saving and retrieving it. It will be the client applications that actually use the FILESTREAM data. Let us look at a VB.NET client application that reads the image data that we just stored into the FILESTREAM data store and displays it in an image control. (To keep the source code listing minimal, I have not added any error handling code)
    'Create a connection to the database
    Dim ConStr As String
    ConStr = "Data Source=JACOBXPS\katmaifs;Initial Catalog=NorthPole" & _
             ";Integrated Security=True"
    Dim con As New SqlConnection(ConStr)
    con.Open()

    'Retrieve the FilePath() of the image file
    Dim sqlCommand As New SqlCommand()
    sqlCommand.Connection = con
    sqlCommand.CommandText = "SELECT ItemImage FROM items " + _
                             "WHERE ItemNumber = 'MS1001'"
    Dim buffer As Byte() = sqlCommand.ExecuteScalar()

    'Bind the image data to an image control
    Dim ms As MemoryStream = New MemoryStream(buffer)
    Dim bmp As Bitmap = New Bitmap(ms)
    ItemImage.Image = bmp

    'Cleanup
    con.Close()

    Accessing FILESTREAM data with Managed API

    Accessing FILESTREAM data using Win32 Streaming has a number of advantages over accessing it using TSQL. When accessing FILESTREAM data using TSQL, SQL Server reads the content of the FILESTREAM data file and serves it to the client. SQL Server memory is used for reading the content of the data file. Accessing FILESTREAM data using Win32 Streaming does not use SQL Server memory. In addition it allows the application to take advantage of the Streaming capabilities of the NT File System.
    Though accessing FILESTREAM data using Win32 Streaming has a number of advantages, it is a bit complicated to use, compared to the syntax needed to access it from TSQL. Before a client application can access the FILESTREAM data, it needs to find out the logical path that uniquely identifies the given file in the FILESTREAM data store. This can be achieved by using the “PathName” method of a FILESTREAM column. Note that the PathName()function is Case Sensitive. The following example shows how to retrieve the PathName() associated with the FILESTREAM data of a column.
    SELECT
          ItemNumber,
          ItemImage.PathName() AS FilePath
    FROM Items
    /*
    ItemNumber FilePath
    ---------- ----------------------------------------------
    MS1001     \\JACOBXPS\KATMAIFS\v1\NorthPole\dbo\Items\
               ItemImage\EE9A1CB8-3514-4115-8227-4E8F080E22E0
    */
    The next step is to begin a transaction and obtain a transaction context. A client application can start a transaction by calling the ‘BeginTransaction’ method of the SqlConnection object. After starting a transaction, the transaction context can be obtained by running the following query.
    SELECT GET_FILESTREAM_TRANSACTION_CONTEXT() AS TransactionContext
    /*
    TransactionContext
    -----------------------------------
    0xCB261797D1878D4A9F9562660124BBD8
    */
    Once the transaction context to the FILESTREAM data file is obtained, the file can be accessed using the‘SqlFileStream’ class. Here is the complete listing of a basic VB.NET application that reads FILESTREAM data using the Managed API and displays the picture on an Image Control.  (To keep the code listing minimal, I have not added any error handling code)
    ‘Create a connection to the database
    Dim ConStr As String
    ConStr = "Data Source=JACOBXPS\katmaifs;Initial Catalog=NorthPole" & _
             ";Integrated Security=True"
    Dim con As New SqlConnection(ConStr)
    con.Open()

    'Retrieve the FilePath() of the image file
    Dim sqlCommand As New SqlCommand()
    sqlCommand.Connection = con
    sqlCommand.CommandText = "SELECT ItemImage.PathName() AS PathName " + _
                             "FROM items WHERE ItemNumber = 'MS1001'"
    Dim filePath As String = sqlCommand.ExecuteScalar()

    'Obtain a Transaction Context
    Dim transaction As SqlTransaction = con.BeginTransaction("ItemTran")
    sqlCommand.Transaction = transaction
    sqlCommand.CommandText = "SELECT GET_FILESTREAM_TRANSACTION_CONTEXT()"
    Dim txContext As Byte() = sqlCommand.ExecuteScalar()

    'Open and read file using SqlFileStream Class
    Dim sqlFileStream As New SqlFileStream(filePath, txContext, FileAccess.Read)
    Dim buffer As Byte() = New Byte(sqlFileStream.Length) {}
    sqlFileStream.Read(buffer, 0, buffer.Length)

    'Bind the image data to an image control
    Dim ms As MemoryStream = New MemoryStream(buffer)
    Dim bmp As Bitmap = New Bitmap(ms)
    ItemImage.Image = bmp

    'Cleanup
    sqlFileStream.Close()
    sqlCommand.Transaction.Commit()
    con.Close()

    Updating FILESTREAM Data

    FILESTREAM data may be modified using either TSQL or using the FILESTREAM API. As mentioned earlier, modifying FILESTREAM data using the Streaming API has a number of advantages over accessing it using TSQL.
    Using TSQL to update one or more FILESTREAM columns is not different from updating regular columns. Though a FILESTREAM column can store more than 2 GB of data, it is not possible to access more than 2 GB of FILESTREAM data using TSQL, because 2 GB is the maximum size of any Large Value Type (VARBINARY, VARCHAR, NVARCHAR, TEXT, NTEXT) can store.
    Modifying FILESTREAM data using streaming API is pretty much the same as what we saw in the previous example. Before accessing the data you need to access the PathName(), start a transaction and obtain a transaction context before modifying the data using the SqlFileStream class.

    Deleting FILESTREAM Data

    When a row is deleted from a table having FILESTREAM enabled columns, the record is removed from the table and the FILESTREAM data file is removed from the FILESTREAM Data Container. FILESTREAM data from the FILESTREAM data container is removed by the FILESTREAM garbage collector. Since this is done in a separate background thread, you may still see the deleted file in the FILESTREAM data container until the garbage collector runs again.

    FILESTREAM Garbage Collector

    SQL Server triggers the FILESTREAM Garbage Collector thread when a CHECKPOINT occurs. So, after deleting FILESTREAM data using TSQL you might notice that the physical file is not removed from the FILESTREAM data container right away. The file will remain in the FILESTREAM data container until the next CHECKPOINT occurs and the garbage collector runs. FILESTREAM operations generate minimal log in the TRN log file and hence it might take longer for a CHECKPOINT to occur if the database is not highly transactional. In such a case, if you want to trigger the garbage collector thread, you should issue an EXPLICIT CHECKPOINT.

    FILESTREAM Feature Summary

    With FILESTREAM, the SQL Server team not only added a  feature to handle unstructured data, but also made sure that it smoothly integrates with many of the existing features of SQL Server.
    • FILESTREAM feature is available with all versions of SQL Server 2008, including SQL Server Express.
    • SQL Server Express database has a 4 GB limitation; however this limitation does not apply to the FILESTREAM data stored in a SQL Server Express database.
    • FILESTREAM Columns can be replicated.
    • FILESTREAM enabled databases can be used with LOG Shipping
    • FILESTREAM columns can be used in Full Text Indexes
    • FILESTREAM works with all recovery models
    • FILESTREAM File Groups can be placed on compressed disk volumes
    • The maximum size of the file that can be stored into the FILESTREAM data storage is limited by the size of the disk volume only.

    Restrictions on existing features

    Though the FILESTREAM feature smoothly integrates with many of the existing features of SQL Server, it adds restrictions to or completely disables a few other important features of SQL Server.
    • A FILESTREAM enabled database cannot be used for mirroring. This is one of the key restrictions that I dislike. Mirroring is a very interesting feature and it cannot be used with FILESTREAM.
    • FILESTREAM data is not available in database snapshots. If you create database snapshots and run a “SELECT * FROM table” query on a table with FILESTREAM columns, you will get an error.  All queries that you run on a FILESTREAM enabled table of a database snapshot should exclude FILESTREAM columns.

    FILESTREAM Limitations

    The FILESTREAM implementation in SQL Server 2008 comes with a few limitations. The following limitations make complete sense to me and I have no complaints on them.
    • FILESTREAM columns cannot be used in Index Keys.
    • FILESTREAM columns cannot be used in the INCLUDED columns of a Non Clustered Index
    • FILESTREAM columns cannot be used in a Table Valued Parameter
    • FILESTREAM columns cannot be used in a memory table
    • FILESTREAM  columns cannot be used in a global or local temp table
    • Statistics cannot be created on FILESTREAM columns
    • Computed columns having reference to FILESTREAM columns cannot be indexed
    • When FILESTREAM Data is accessed through Win 32 APIs, only READ COMMITTED ISOLATION level is supported.
    • FILESTREAM data can be stored only on local disk volumes
    However the following limitations are little restrictive and I really wish to have them removed in the future versions of SQL Server.
    • FILESTREAM data is not supported on Database Snapshots.  Well, that is acceptable, however what is the real pain is the error that we get if we run a “SELECT * FROM table” query on a FILESTREAM enabled table in a database snapshot. SQL Server should ideally return NULL values for FILESTREAM enabled columns, if it cannot create snapshots of FILESTREAM data. At present, it is a real pain to exclude all FILESTREAM enabled columns from all queries that runs on a Database Snapshot.
    • Transparent Data Encryption (TDE) does not encrypt FILESTREAM data. It would be interesting to have this restriction removed in the future.

    FILESTREAM Feature - Points to Remember

    Take note of the following points if you intend to use FILESTREAM in your application.
    • You cannot do database mirroring if you use FILESTREAM features
    • If you replicate FILESTREAM enabled columns, make sure that all subscribers are running on SQL Server 2008 or later versions.
    • If you intend to use LOG shipping, make sure that both primary and secondary servers are running on SQL Server 2008 or later versions.
    • If you are on a FAILOVER CLULSTERING environment, make sure that you place the FILESTREAM file groups in a shared disk and FILESTREAM should be enabled on each node.
    • You cannot access FILESTREAM data if you create database snapshots. If you try to access a FILESTREAM enabled column, SQL Server will raise an error.
    • SQL Server Instance should be configured with Integrated Security if Win 32 access to the FILESTREAM data is required

    FILESTREAM Best Practices

    • Place each FILESTREAM data container in a separate volume
    • Use the correct RAID level depending upon the nature of the application (read intensive, write intensive), expected work load etc
    • Do periodical disk defragmentation
    • Decide diligently on whether or not to use a compressed disk volume
    • Disable 8.3 names in NTFS
    • Disable last access time tracking in NTFS
    • FILESTREAM data container should not be on a fragmented volume
    • Make sure that the data being stored is appropriate for FILESTREAM storage. FILESTREAM storage is only good if the size of the data is more than 1 MB (approx)
    • Avoid multiple small appends to the FILESTREAM file
    • If FILESTREAM files are large, avoid using TSQL access to the FILESTREAM data.
    • If reads require only the first few bytes, then consider using TSQL access using substring() function
    • If the entire file is needed Win 32 access is desirable
    Read more »
     

    Copyright © 2010 SQL Cached, All Rights Reserved. Design by DZignine