SQL Server 2008: DDL (Create/ Alter/ Drop/ Truncate)
                            T-SQL        PL/SQL
Max. columns per table      1024         1000
Max. table row length       8060 bytes   255000 bytes
Recursive subquery depth    40           64
Array support               No           Yes
Q: What are the major differences between earlier versions of SQL Server and the
current version?
A: BI tools, .NET integration, data partitioning, and mirroring.
DBA Enhancement:
->New Installation Center with a different look and feel.
->Compressed backup: since less data is written to disk, backup time is usually
reduced (previously this capability was available only from third-party vendors (TPV)
such as Red Gate and Quest).
->Enhanced configuration and management of audits using the new Change Data
Capture (CDC) feature. This new feature makes tracking down changes and auditing
much easier than it has been in the past.
->New table-valued parameters: a user-defined table type can be passed to a stored
procedure (see the sketch after this list).
->FILESTREAM data type: the database engine stores all of the data associated with the
column in a disk file as opposed to the actual database.
->Sparse column support.
->Encryption enhancements: Transparent Data Encryption (TDE) is now built in, as
opposed to relying on TPV products such as NetLib. Key management and encryption
are also improved.
->New high-availability features such as hot-add CPU and hot-add memory. New
failover clustering enhancements are also available.
->Internal performance enhancements.
->Performance data management tools.
->Resource Governor is a nice new feature to help manage workload by limiting the
resources available to a process.
->Plan freezing is meant to offer greater predictability when it comes to executing a
particular query in SQL Server 2008.
->Declarative Management Framework (DMF). This is basically a new policy-based
management system for SQL Server 2008.
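To illustrate the table-valued parameter enhancement, here is a minimal sketch; the
table type dbo.OrderIDList and procedure dbo.ProcessOrders are hypothetical names:
CREATE TYPE dbo.OrderIDList AS TABLE (OrderID int NOT NULL);
GO
CREATE PROCEDURE dbo.ProcessOrders
    @Orders dbo.OrderIDList READONLY  -- table-valued parameters must be READONLY
AS
SELECT COUNT(*) AS OrdersReceived FROM @Orders;
GO
DECLARE @ids dbo.OrderIDList;
INSERT INTO @ids VALUES (1), (2), (3);
EXEC dbo.ProcessOrders @Orders = @ids;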
Developer Enhancement:
->LINQ support: basically a mapping of database objects to programming objects.
->MERGE statement: performs insert, update, or delete operations on a target table
based on the results of a join with a source table.
->Spatial data types for storing location-based data.
BI Enhancement:
->Better cube designer, an improved dimension and attribute designer, and enhanced
data mining structures.
->Ability to script in C# as well.
->In SSRS the configuration has changed; for example, it now supports rich text format.
->SharePoint integration: Reporting Services now integrates closely with Microsoft
SharePoint.
->Reporting Services no longer requires IIS.
->Export-to-Word support.
->Better graphing.
Q: We use an application that uses SQL 2005 DMVs. Will it work with SQL 2008?
A: For the most part the DMVs remain intact. However, you should test your application
to be certain.
Q: Using the Resource Governor, is it possible to limit the resources for a
particular application?
A: Yes. You would need to create a classifier function that tests each new session
to see whether APP_NAME() returns the name of the application you wish to limit. For
sessions with your application's name, the function should return the name of the
workload group with which that application should be associated, as sketched below.
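A minimal sketch; the pool, group, and application names are hypothetical:
USE master;
GO
CREATE RESOURCE POOL LimitedPool WITH (MAX_CPU_PERCENT = 20);
CREATE WORKLOAD GROUP LimitedGroup USING LimitedPool;
GO
CREATE FUNCTION dbo.fnClassifier() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    -- Sessions from the named application go to the restricted group
    IF APP_NAME() = N'MyReportingApp'
        RETURN N'LimitedGroup';
    RETURN N'default';
END;
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fnClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;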
Data type
Q: If the server won’t convert my datatype to hold fractions, what happens when I
get the sum of something too big to be held?
A: Suppose that you have a table of tinyints. The max value for a tinyint is 255. If the
sum is, say, 400, the server is smart enough to promote the datatype to an integer. It
skips right over smallint. In short, you don’t have to worry about datatype conversion
during sums.
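A quick illustration of the promotion (the table name is hypothetical):
CREATE TABLE dbo.Scores (score tinyint NOT NULL);
INSERT INTO dbo.Scores VALUES (200), (200);
SELECT SUM(score) AS total FROM dbo.Scores;  -- returns 400 as int, no overflow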
Data Conversion
Q: What is the difference between implicit and explicit data conversion?
A: Converting data from one data type to another is often handled automatically by
SQL Server; this is called implicit data conversion.
Explicit conversions require the CAST or CONVERT function.
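For example:
DECLARE @d decimal(10, 2);
SET @d = 5;                                   -- implicit: int converts to decimal
SELECT CAST('123' AS int),                    -- explicit, with CAST
       CONVERT(varchar(10), GETDATE(), 101);  -- explicit, with a style code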
Q: What is boxing?
A: Boxing occurs when a value stored as a specific data type is converted or cast to a
different data type.
Q: What is unboxing?
A: Unboxing occurs when a value is converted back to its original data type after it had
been converted or cast to a data type other than the original data type.
Constraints
Entity Integrity
Entity integrity defines a row as a unique entity for a particular table. Entity integrity
enforces the integrity of the identifier columns or the primary key of a table, through
UNIQUE indexes, UNIQUE constraints or PRIMARY KEY constraints.
Domain Integrity
Domain integrity is the validity of entries for a specific column. You can enforce domain
integrity to restrict the type by using data types, restrict the format by using CHECK
constraints and rules, or restrict the range of possible values by using FOREIGN KEY
constraints, CHECK constraints, DEFAULT definitions, NOT NULL definitions, and
rules.
Referential Integrity
Referential integrity preserves the defined relationships between tables when rows are
entered or deleted. In SQL Server, referential integrity is based on relationships
between foreign keys and primary keys or between foreign keys and unique keys,
through FOREIGN KEY and CHECK constraints. Referential integrity makes sure that
key values are consistent across tables. This kind of consistency requires that there are
no references to nonexistent values and that if a key value changes, all references to it
change consistently throughout the database.
When you enforce referential integrity, SQL Server prevents users from doing the
following:
Adding rows to, or changing rows in, a related table if there is no associated row in the
primary table.
Changing values in a primary table that would cause orphaned rows in a related table.
Deleting rows from a primary table if there are matching related rows.
Primary Key
A table typically has a column or combination of columns that contain values that
uniquely identify each row in the table. This column, or columns, is called the primary
key (PK) of the table and enforces the entity integrity of the table. You can create a
primary key by defining a PRIMARY KEY constraint when you create or modify a table.
A table can have only one PRIMARY KEY constraint, and a column that participates in
the PRIMARY KEY constraint cannot accept null values. Because PRIMARY KEY
constraints guarantee unique data, they are frequently defined on an identity column.
When you specify a PRIMARY KEY constraint for a table, the Database Engine
enforces data uniqueness by creating a unique index for the primary key columns. This
index also permits fast access to data when the primary key is used in queries.
Therefore, the primary keys that are chosen must follow the rules for creating unique
indexes.
If a PRIMARY KEY constraint is defined on more than one column, values may be
duplicated within one column, but each combination of values from all the columns in
the PRIMARY KEY constraint definition must be unique.
Foreign Key
A foreign key (FK) is a column or combination of columns that is used to establish and
enforce a link between the data in two tables. You can create a foreign key by defining a
FOREIGN KEY constraint when you create or modify a table. In a foreign key reference,
a link is created between two tables when the column or columns that hold the primary
key value for one table are referenced by the column or columns in another table. This
column becomes a foreign key in the second table.
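A minimal sketch of both constraints; the table and constraint names are hypothetical:
CREATE TABLE dbo.Customers
(
    CustomerID int NOT NULL CONSTRAINT PK_Customers PRIMARY KEY,
    Name nvarchar(100) NOT NULL
);
CREATE TABLE dbo.Orders
(
    OrderID int NOT NULL CONSTRAINT PK_Orders PRIMARY KEY,
    CustomerID int NOT NULL  -- foreign key referencing the primary table
        CONSTRAINT FK_Orders_Customers REFERENCES dbo.Customers (CustomerID),
    OrderDate datetime NOT NULL
);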
Unique Key
You can use UNIQUE constraints to make sure that no duplicate values are entered in
specific columns that do not participate in a primary key. Although both a UNIQUE
constraint and a PRIMARY KEY constraint enforce uniqueness, use a UNIQUE
constraint instead of a PRIMARY KEY constraint when you want to enforce the
uniqueness of a column, or combination of columns, that is not the primary key.
Multiple UNIQUE constraints can be defined on a table, whereas only one PRIMARY
KEY constraint can be defined on a table. Also, unlike PRIMARY KEY constraints,
UNIQUE constraints allow for the value NULL. However, as with any value participating
in a UNIQUE constraint, only one null value is allowed per column. A UNIQUE
constraint can be referenced by a FOREIGN KEY constraint.
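For example (hypothetical table):
CREATE TABLE dbo.Employees
(
    EmployeeID int NOT NULL PRIMARY KEY,
    SSN char(9) NULL CONSTRAINT UQ_Employees_SSN UNIQUE  -- non-key column; one NULL allowed
);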
Check Constraint
CHECK constraints enforce domain integrity by limiting the values that are accepted by
a column. They are similar to FOREIGN KEY constraints in that they control the values
that are put in a column. The difference is in how they determine which values are valid:
FOREIGN KEY constraints obtain the list of valid values from another table, and
CHECK constraints determine the valid values from a logical expression that is not
based on data in another column. For example, the range of values for a salary column
can be limited by creating a CHECK constraint that allows for only data that ranges from
$15,000 through $100,000. This prevents salaries from being entered beyond the
regular salary range.
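The salary example might look like this, extending the hypothetical dbo.Employees
table sketched above:
ALTER TABLE dbo.Employees
    ADD Salary money NULL
        CONSTRAINT CK_Employees_Salary CHECK (Salary >= 15000 AND Salary <= 100000);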
Default
When you load a row into a table with a DEFAULT definition for a column, you implicitly
instruct the Database Engine to insert a default value in the column when a value is not
specified for it. If a column does not allow for null values and does not have a DEFAULT
definition, you must explicitly specify a value for the column, or the Database Engine
returns an error that states that the column does not allow null values. The value
inserted into a column that is defined by the combination of the DEFAULT definition and
the nullability of the column can be summarized as shown in the following table.
                         No entry,               No entry,
Column definition        no DEFAULT definition   DEFAULT definition   Enter a null value
Allows null values       NULL                    Default value        NULL
Disallows null values    Error                   Default value        Error
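A short sketch of a DEFAULT definition (hypothetical table):
CREATE TABLE dbo.AuditLog
(
    LogID int IDENTITY(1, 1) PRIMARY KEY,
    CreatedAt datetime NOT NULL CONSTRAINT DF_AuditLog_CreatedAt DEFAULT (GETDATE())
);
INSERT INTO dbo.AuditLog DEFAULT VALUES;  -- CreatedAt receives the default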
NULL
The nullability of a column determines whether the rows in the table can contain a null
value for that column. A null value, or NULL, is different from zero (0), blank, or a zero-
length character string such as "". NULL means that no entry has been made. The
presence of NULL typically implies that the value is either unknown or undefined.
-> Changes to PRIMARY KEY constraints are checked with FOREIGN KEY constraints
in related tables.
-> Foreign key columns are frequently used in join criteria when the data from related
tables is combined in queries by matching the column or columns in the FOREIGN KEY
constraint of one table with the primary or unique key column or columns in the other
table. An index enables the Database Engine to quickly find related data in the foreign
key table. However, creating this index is not required. Data from two related tables can
be combined even if no PRIMARY KEY or FOREIGN KEY constraints are defined
between the tables, but a foreign key relationship between two tables indicates that the
two tables have been optimized to be combined in a query that uses the keys as its
criteria. For more information about using FOREIGN KEY constraints with joins, see
SQL Server Books Online.
Q: When I use the INSERT statement with values or a query, what happens if one
of the values already exists in the table, and the INSERT operation violates a
PRIMARY KEY or UNIQUE constraint?
A: The entire INSERT operation will fail and you will receive an error message like
"Violation of UNIQUE KEY constraint…Cannot insert duplicate key in object
'dbo.TableY'". The entire INSERT operation will be terminated, so no records will be
inserted, even if only one violates the constraint. If this is not the behavior you intend,
use the MERGE statement instead.
Table
Local Temp Table    Global Temp Table    Permanent Temp Table    Normal Table
Created in tempdb   Created in tempdb    Created in tempdb       Created in any database
Q: Why would I want to use a temporary table to hold data if I can just go after the
base table?
A: There are two good reasons. First, if lots of users are altering data in the base table
and you are interested in a “snapshot” of the data, you can get that with a temp table. If
the base table is very “busy,” your table, usable only by you, will never require you to
wait while another user has pages locked. Second, tempdb is often placed on a fast
device, sometimes a solid state disk (an expensive bank of memory that acts just like a
disk drive), and sometimes placed in the server’s RAM.
Q: I created a table using sparse columns and a column set; now when I query
from the table, the sparse columns are missing!
A: When a table has a column set defined, individual sparse columns are not returned if
a SELECT * is done on the table. However, you can reference the sparse columns by
name, and they will be returned.
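A minimal sketch (hypothetical table):
CREATE TABLE dbo.Products
(
    ProductID int PRIMARY KEY,
    Color varchar(20) SPARSE NULL,
    Size varchar(20) SPARSE NULL,
    SpecialProperties xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
);
INSERT INTO dbo.Products (ProductID, Color) VALUES (1, 'Red');
SELECT * FROM dbo.Products;                 -- returns ProductID and SpecialProperties only
SELECT ProductID, Color FROM dbo.Products;  -- naming the sparse column returns it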
Q Can I create a table with just a single identity column, and nothing else?
A: Strangely, yes, but you cannot insert data into it unless you set identity_insert on,
defeating the purpose of the identity column. This is because you cannot provide an
empty column list to insert.
Q: Can I create a null identity column?
A: No.
Indexing
Q How do I know if SQL Server is using an index and choosing the best plan?
A: Use showplan to display the optimization plan and determine whether the server is
using an index. Is the plan the best one? You will need to do some analysis. Look at the
size of the table (in pages and rows) and the selectivity of the various indexes available
for the query. If you think that the server isn’t choosing a good plan, try forcing a
different one and see if the statistics io output proves you right. If so, consider restating
the query or breaking it up into parts, using a temporary table to store an intermediate
result set.
Q: What’s reformatting?
A: Sometimes SQL Server can’t resolve a query using the existing indexing resources,
or it decides that its own clustered index or temporary result set will work better than a
typical join given the table structure. In that case, the server can reformat the query. In
the showplan output, the server reports reformatting as a separate step in the query
plan, like this:
STEP 1
The type of query is INSERT
The update mode is direct
Worktable created for REFORMATTING
If important queries based on a particular table are consistently being reformatted, try
adding the index the server seems to be looking for. Often, reformatting takes time and
can result in poor query performance. (When it reformats, the server is creating a
nonreusable index for every query. If the same index were permanently available, it
wouldn't need to create one each time, so the query would be faster.) Sometimes, the only way to stop
reformatting is to change your clustered index. You will have to decide if it’s worth it.
Q: If a stored procedure throws an error message will the statements all be rolled
back?
A: No, a stored procedure is not necessarily a single transaction. The only time a stored
procedure is a single transaction is when the BEGIN TRANSACTION and COMMIT
statements are explicitly within the procedure.
Q: If I want to run select statements without using table locking hints what
transaction isolation level should I use?
A: You will want to use the READ UNCOMMITTED transaction isolation level, as this will
allow SELECT statements to run without taking shared locks and without requiring table
locking hints.
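For example (the table name is hypothetical):
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.Orders;  -- behaves like a NOLOCK hint on every table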
Q: I’m getting deadlocks on a server; how can I get more information about the
deadlocks?
A: You will want to enable trace flags 1204 and 1222. These flags will output
information about both statements that are involved in the deadlock to the ERRORLOG
file.
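For example, to enable both flags globally:
DBCC TRACEON (1204, 1222, -1);  -- -1 applies the flags server-wide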
Q: What option is used to control the rollback priority when dealing with
deadlocks?
A: You will want to use the SET DEADLOCK_PRIORITY option. You can use LOW,
NORMAL, or HIGH, or for more control use any whole value from -10 to 10,
giving you a total of 21 settings.
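For example:
SET DEADLOCK_PRIORITY LOW;  -- this session volunteers to be the deadlock victim
SET DEADLOCK_PRIORITY 5;    -- or any whole value from -10 to 10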
Q: Which are the transaction isolation levels that SQL Server offers?
A: READ UNCOMMITTED, READ COMMITTED, SNAPSHOT, REPEATABLE READ,
and SERIALIZABLE.
Q: Can I set user priorities so that some users requesting locks go to the head of
the line, while others have to wait?
A: No. This is an often requested feature, however, and may be implemented in a future
release.
Q: What cardinal sins should I be sure to avoid when dealing with transactions?
A: Never allow user interaction during a transaction. Don’t perform unrelated work
inside the same transaction. Don’t nest transactions.
Q: If I want to control the transaction isolation level, do I have to use table hints?
A: No, generally the “Read Committed” default transaction isolation is the optimal
choice. If you need to change the isolation for your entire session, you should use the
SET TRANSACTION ISOLATION LEVEL setting. That will affect all statements
submitted by your session. The table level hints should only be used when a specific
table needs special locking behaviors in a statement.
Q: What service do you need to run when you need to initiate transactions in
Linked Servers?
A: Distributed Transaction Coordinator (DTC).
Q: I want to get locks that span multiple servers. How do multiserver transactions
work?
A: MS SQL Server 6.5 provides distributed transactions, which are transactions that
span servers. Distributed transactions are managed by the Distributed Transaction
Coordinator, or DTC, which was shipped first with Windows NT v. 3.51 and is a part of
NT 4.0 as well.
Distributed transactions are based on the traditional two-phase commit (2PC) method of
ensuring transactional integrity across distributed systems. The DTC informs each
server when it's time to commit a transaction. Each server performs the first, preparatory
part of the commit and then tells the coordinator that the commit is ready to go through. When all of the servers
have completed the first phase of commit, the DTC releases the servers to finish the
commit process. If any server fails to complete the first commit phase, DTC orders all of
the servers to roll back.
Prior to this release of SQL Server, distributed transactions were only possible from a
client program written in C with the DB-Library API. The application was responsible for
handling all of the coordination of the commit process. Now 2PC is available to any
SQL-based or ODBC-based application.
1. Parse: Syntax—the commands are checked for syntactic validity. Do the commands
make sense?
2. Parse: Object references—If the syntax is okay, check for referenced objects. Do the
referenced tables exist? Do you have permission to access them? Do the columns exist
in those tables?
3. Optimize—The server considers different plans for getting and processing the data.
Are indexes available? Can they be used effectively, or would a table scan be faster?
This is a very complex process, which is being constantly improved. The result is a
query plan detailing how to access the data in the fastest way.
4. Compile—The query plan is compiled.
5. Execute—The compiled plan is executed at the server.
Q: Just WHILE and IF? Aren’t there CASE statements, DO UNTIL, FOR NEXT, any
of those commands?
A: No. You would be surprised how much work you can do with just IF and WHILE.
There is a CASE expression, but this is not used for flow control. The CASE expression
is most often used for conditional execution inside a SELECT or UPDATE statement.
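For example, a conditional expression inside a SELECT, using the pubs-style titles
table that the surrounding examples assume:
SELECT title, price,
       price_band = CASE
                        WHEN price < 10 THEN 'budget'
                        WHEN price < 25 THEN 'standard'
                        ELSE 'premium'
                    END
FROM titles;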
Q: What types of errors does SQL Server have, and what are their differences?
A: SQL Server has two types of error messages: information and error. They differ in
severity level: information messages have a severity level from 0 to 10; error messages
have a severity from 11 to 25.
Q: What are the differences between the @@ERROR function and a TRY…
CATCH block?
A: The @@ERROR function provides a limited way to handle errors, and it works in
previous versions of SQL Server. TRY…CATCH blocks provide better error handling
capabilities, but they only work in versions 2005 and 2008 of SQL Server.
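A minimal TRY…CATCH sketch:
BEGIN TRY
    SELECT 1 / 0;  -- forces a divide-by-zero error
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrNum, ERROR_MESSAGE() AS ErrMsg;
END CATCH;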
Q: When I am creating and testing localized error messages, how can I verify the
language of my session?
A: You can query the @@LANGUAGE function to determine your session’s language.
Q: What’s the difference between a raiserror and a print, and when would I use
them?
A: RAISERROR is used to indicate an exception. When things go wrong, RAISERROR
gives the user an idea about what happened and how to fix it. RAISERROR allows you
to specify an error number, severity, and state. PRINT is used to give supplemental
information; you can use PRINT to confirm that things succeeded.
Operator
Q: Why would I use FREETEXT search function instead of multiple LIKE '%value
%' comparisons?
A: A LIKE comparison with a leading wildcard is a highly time- and resource-intensive
operation that always requires a table scan. FREETEXT utilizes the full-text index
structure and delivers much better performance than the LIKE comparison.
Ranking function
Q: Do any of the ranking functions require an argument to be passed in?
A: Yes, the NTILE function requires that you supply the number of “tiles” you would like
to classify the rows by. None of the other ranking functions take an argument, however.
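For example (pubs-style titles table assumed):
SELECT title, price,
       ROW_NUMBER() OVER (ORDER BY price DESC) AS row_num,
       RANK()       OVER (ORDER BY price DESC) AS rnk,
       NTILE(4)     OVER (ORDER BY price DESC) AS quartile  -- only NTILE takes an argument
FROM titles;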
Ordering
Q: Is there a limit to the number of columns I can list in a single select statement?
A: Yes; in SQL Server 2008 a SELECT statement may contain up to 4,096 columns, and
a (non-wide) table can have at most 1,024 columns.
Scalar function
Q: What is the use of quotename and how it differ from replace?
A: As a best practice, one should wrap object names with QUOTENAME() when using
dynamic SQL, to avoid SQL injection. QUOTENAME() works correctly for object names
that are less than or equal to 128 characters in length but returns NULL if the name is
longer. (This limit is incorrectly shown as 258 in BOL, so watch out when you are
using QUOTENAME, and use REPLACE() instead for longer strings.) The only difference
between the two is that QUOTENAME adds the delimiting brackets automatically,
whereas with REPLACE you have to add them yourself.
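For example:
DECLARE @name sysname;
SET @name = 'Order Details';
SELECT QUOTENAME(@name);                       -- [Order Details]
SELECT '[' + REPLACE(@name, ']', ']]') + ']';  -- REPLACE: you add the brackets yourself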
Q: What can the WHERE clause do that the HAVING clause cannot?
A: The WHERE clause restricts the rows that are used to evaluate a query. If a row
can’t pass the conditions in the WHERE clause, it is not used in the calculation of an
aggregate in the query.
Q: Because the rowcount coming back from an aggregate is always one, is there
any other way to determine how many rows the aggregate acted on?
A: The best way is to perform a COUNT(<column>) in addition to your other aggregates.
This will return the number of rows that passed the WHERE test and are not null:
the same set of rows upon which any other aggregates in the query will have acted.
Join
Q: Is there a right outer join?
A: Yes, there are left, right, and full outer joins. The right outer join simply forces all the
rows to appear from the right-hand table in the join relationship rather than the left.
Q: What are the benefits to using ANSI or old style join syntax?
A: The ANSI syntax can prevent you from making a simple mistake. You cannot
accidentally create a cross join when you wanted an inner join, as you can do with the
old style syntax. ANSI style syntax makes the relationships between tables easier to
understand because it is explicitly spelled out for someone viewing the code. Also, it
requires you to specify join keys together with the tables to which they belong (instead
of separately, like with the old style). Finally, the ANSI join syntax is defined in the ANSI
SQL standard. If standards compliance is important to you and your company, this style
is guaranteed to be supported by MS SQL Server in future releases.
The old style syntax is less verbose than the ANSI style. It is more portable between
RDBMS’s because more of them recognize the old style. If you are using a combination
of MS 6.0 and 6.5 servers, the old style syntax will work on both, but the ANSI syntax
won’t. The old style syntax is also more recognizable by a wider range of people in the
industry (it has been around longer), so you will have an easier time getting help if you
use it.
There is no performance benefit to using one or the other join syntax.
Top/ Apply
Purging historic data might involve deleting millions of rows of data. Deleting such a
large set of rows in a single transaction has several drawbacks. A DELETE statement is
fully logged, and it will require enough space in the transaction log to accommodate the
whole transaction. During the delete operation (which can take a long time), no part of
the log from the oldest open transaction up to the current point can be overwritten.
Furthermore, if the transaction breaks in the middle for some reason, all the activity that
took place to that point will be rolled back, and this will take a while. Finally, when many
rows are deleted at once, SQL Server might escalate the individual locks held on the
deleted rows to an exclusive table lock, preventing both read and write access to the
target table until the DELETE is completed.
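A common workaround is to delete in small batches with TOP, keeping each transaction
short; the table name and filter below are hypothetical:
WHILE 1 = 1
BEGIN
    DELETE TOP (5000)
    FROM dbo.OrderHistory
    WHERE OrderDate < '20000101';
    IF @@ROWCOUNT < 5000 BREAK;  -- the last batch was partial, so we are done
END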
Distinct/ All
Set operation
Sub-query
Q: Can I mix the subquery and join syntax together in the same query?
A: Sure. Some users find that doing this helps them conceptualize how the server is
creating their result set. Remember the limitations that the subqueries place on you.
The 16-table limit still applies.
Q: What is a CTE?
A: A CTE (common table expression) can be thought of as a temporary result set that is
defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or
CREATE VIEW statement.
Q: Advantages of a CTE?
A: Improved readability and ease of maintenance of complex queries. CTEs can be
defined in user-defined routines, such as functions, stored procedures, triggers, or views.
Q: Can a single Common Table Expression (CTE) be used by more than one
statement?
A: No, a CTE is only valid for the single SELECT, INSERT, UPDATE, or DELETE
statement that references it. If multiple queries want to use the same CTE, the CTE
must be restated for each subsequent query.
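A short sketch (pubs-style titles table assumed):
WITH ExpensiveTitles AS
(
    SELECT title_id, title, price
    FROM titles
    WHERE price > 20
)
SELECT * FROM ExpensiveTitles;  -- the CTE is valid only for this one statement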
Cursor
Q: Why is the cursor so maligned in the newsgroups, trades, and conventional
wisdom?
A: Cursors are the most often misused feature of SQL Server. When used properly and
sparingly, cursors complement SQL by providing an easy way to handle problems that
set processing makes very difficult. Many programmers who migrated to SQL from
xBase or COBOL (myself included) found the migration much easier when they
discovered cursors. As a result, newcomers to SQL used cursors for everything. Using
cursors for normal SELECT and DELETE operations is a recipe for horrendous
performance and unmaintainable cursor code.
Q: Can I have more than one cursor active at the same time? On the same table?
A: Sure.
View
Q: What is a view?
A: A view is the saved text of a SELECT statement. A view doesn't store any data; it
merely refers to the data stored in tables.
For occasionally run queries, views perform well, even when compared with stored
procedures.
Views are a key component of a data access architecture, but I still prefer using stored
procedures for the abstraction layer because they offer error handling, better transaction
control, and the option of programmatic control.
If your application has a lot of dynamic SQL embedded within a .NET application, using
views to build a database abstraction layer may be easier than refactoring the application
to call stored procedures. (I would rather see a view directly access a table than allow
dynamic SQL to do it.)
One performance issue involving views concerns the locks that views can place on the
data. When the data and execution plan are already in the cache, stored procedures are
slightly faster than views; however, if the execution plan is not cached (meaning the
query has not been executed recently), views execute faster than stored procedures.
Best practice is to use views to support users' ad hoc queries, and stored procedures for
the application's data abstraction layer.
Q: View limitations.
A: A view can contain only a single SELECT statement (multiple result sets can be
combined with UNION), cannot accept parameters, cannot reference temporary tables,
and cannot include ORDER BY unless TOP is also used.
Q: How to modify data through a view?
A: Issue ordinary INSERT, UPDATE, and DELETE statements against the view. Each
modification may affect the columns of only one base table, the modified columns must
reference table columns directly (not aggregates or computed expressions), and if the
view was created WITH CHECK OPTION, the changed rows must remain visible
through the view.
Q: What is the advantage of using an updateable view over updating the table
directly, given that the updateable view will by definition always be based on a
single table?
A: A view is more flexible. For example, you may wish to restructure the underlying
tables without the need to change your client applications. You will only need to update
the view in this case, not everything that referenced the tables being restructured.
Q: If I delete a row from a view, does that change data in the table on which the
view is defined?
A: Yes. A view does not store data itself; it is simply a lens through which you can view,
in a different way, data that lives in normal tables.
Q: Why are there restrictions on the kind of views I can create an index on?
A: To make sure that it is logically possible to incrementally maintain the view, to restrict
the ability to create a view that would be expensive to maintain, and to limit the
complexity of the SQL Server implementation. A large set of views is nondeterministic
and context-dependent; their contents 'change' independently of DML operations.
These can't be indexed. Examples are any views that call GETDATE or
SUSER_SNAME in their definition.
Q: Why does the first index on a view have to be CLUSTERED and UNIQUE?
A: It must be UNIQUE to allow easy lookup of records in the view by key value during
indexed view maintenance, and to prevent creation of views with duplicates, which
would require special logic to maintain. It must be clustered because only a clustered
index can enforce uniqueness and store the rows at the same time.
Q: Why isn't my indexed view being picked up by the query optimizer for use in
the query plan?
A: There are three primary reasons the indexed view may not be being chosen by the
optimizer:
1) You are using a version other than Enterprise or Developer edition of SQL Server.
Only Enterprise and Developer editions support automatic query-to-indexed-view
matching. Reference the indexed view by name and include the NOEXPAND hint to
have the query processor use the indexed view in all other editions.
2) The cost of using the indexed view may exceed the cost of getting the data from the
base tables, or the query is so simple that a query against the base tables is fast and
easy to find. This often happens when the indexed view is defined on small tables. You
can use the NOEXPAND hint if you want to force the query processor to use the
indexed view. This may require you to rewrite your query if you don't initially reference
the view explicitly. You can get the actual cost of the query with NOEXPAND and
compare it to the actual cost of the query plan that doesn't reference the view. If they
are close, this may give you confidence that the decision of whether or not to use the
indexed view doesn't matter.
3) The query optimizer is not matching the query to the indexed view. Double-check the
definition of the view and the definition of the query to make sure that a structural match
between the two is possible. CASTS, converts, and other expressions that don't
logically alter your query result may prevent a match. Also, there are limits to the
expression normalization and equivalence and subsumption testing that SQL Server
performs. It may not be able to show that some equivalent expressions are the same, or
that one expression that is logically subsumed by the other is really subsumed, so it
may miss a match.
Q: My view has duplicates, but I really want to maintain it. What can I do?
A: Consider creating a view that groups by all the columns or expressions in the view
you want and adds a COUNT_BIG(*) column. Then create a unique clustered index on
the grouping columns. The grouping process ensures uniqueness. This isn't really the
same view, but it might satisfy your needs.
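A minimal sketch of that pattern; the tables are hypothetical, and the Amount column is
assumed NOT NULL, as indexed views require for SUM:
CREATE VIEW dbo.vOrderTotals
WITH SCHEMABINDING
AS
SELECT CustomerID,
       SUM(Amount)  AS TotalAmount,
       COUNT_BIG(*) AS RowCnt  -- COUNT_BIG(*) is required when the view uses GROUP BY
FROM dbo.Orders
GROUP BY CustomerID;
GO
CREATE UNIQUE CLUSTERED INDEX IX_vOrderTotals ON dbo.vOrderTotals (CustomerID);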
Q: I have a view defined on top of another view. SQL Server won't let me index the
top-level view. What can I do?
A: Consider expanding the definition of the nested view by hand into the top-level view
and then indexing it, indexing the innermost view, or not indexing the view.
UDF
Q: What are the types of functions, and how they are used?
A: Functions can be scalar or table-valued. A scalar function returns a single value,
whereas a table-valued function returns a table variable.
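A minimal sketch of both kinds; the function names are hypothetical, and titles is the
pubs-style sample table:
CREATE FUNCTION dbo.fnSquare (@n int)
RETURNS int
AS
BEGIN
    RETURN @n * @n;  -- scalar: returns a single value
END;
GO
CREATE FUNCTION dbo.fnTitlesByPrice (@min money)
RETURNS TABLE
AS
RETURN (SELECT title_id, title, price FROM dbo.titles WHERE price >= @min);  -- table-valued
GO
SELECT dbo.fnSquare(4);                  -- 16
SELECT * FROM dbo.fnTitlesByPrice(20);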
Procedure
Q: What is stored procedure?
A: A Transact-SQL stored procedure is a set of T-SQL code that is stored in a SQL
Server database and compiled when used.
Q: What are the maximum size and maximum number of parameters for a procedure?
A: The maximum stored procedure size is 128 MB. The maximum number of
parameters a procedure may receive is 1,024.
Q: Types of Procedure
A: User defined Stored Procedure-A Transact-SQL stored procedure is a set of T-
SQL code that is stored in a SQL Server database and compiled when used.
Temporary Procedures-You create temporary procedures the same way you create
temporary tables—a prefix of a single pound sign (#) creates a local temporary
procedure that is visible only to the current connection, whereas a double pound sign
prefix (##) creates a global temporary procedure all connections can access.
Temporary procedures are useful when you want to combine the advantages of using
stored procedures such as execution plan reuse and improved error handling with the
advantages of ad hoc code. Because you can build and execute a temporary stored
procedure at run-time, you get the best of both worlds. For the most part, sp_executesql
can alleviate the necessity for temporary procedures, but they're still nice to have
around when your needs exceed the capabilities of sp_executesql.
System Procedures-System procedures reside in the master database and are
prefixed with sp_. You can execute a system procedure from any database. When
executed from a database other than the master, a system procedure runs within the
context of that database. So, for example, if the procedure references the sysobjects
table (which exists in every database) it will access the one in the database that was
current when it was executed, not the one in the master database, even though the
procedure actually resides in the master.
Extended Stored Procedure-Extended procedures are routines residing in DLLs that
function similarly to regular stored procedures. They receive parameters and return
results via SQL Server's Open Data Services API and are usually written in C or C++.
They must reside in the master database and run within the SQL Server process space.
Although the two are similar, calls to extended procedures work a bit differently than
calls to system procedures. Extended procedures aren't automatically located in the
master database and they don't assume the context of the current database when
executed. To execute an extended procedure from a database other than the master,
you have to fully qualify the reference (e.g., EXEC master.dbo.xp_cmdshell 'dir').
Internal Procedures-A number of system-supplied stored procedures are neither true
system procedures nor extended procedures—they're implemented internally by SQL
Server. Examples of these include sp_executesql, sp_xml_preparedocument, most of
the sp_cursor routines, sp_reset_connection, and so forth. These routines have stubs in
master..sysobjects, and are listed as extended procedures, but they are actually
implemented internally by the server, not within an external ODS-based DLL. This is
important to know because you cannot drop these or replace them with updated DLLs.
They can be replaced only by patching SQL Server itself, which normally only happens
when you apply a service pack.
-> Functions cannot alter data or objects in a server. Procedures can manipulate data
and objects inside the database and server.
-> Procedures are called independently, while functions are called from within another
SQL statement.
-> Procedures allow you to enhance application security.
-> A function must always return a value (either scalar or table), while a procedure may
return a value or nothing at all.
-> Function parameters differ from procedure parameters in that a function parameter
must always be supplied in the call; to use a parameter's default you must pass the
DEFAULT keyword.
Q: Why does the server need to keep multiple plans in memory for a single
procedure?
A: Each execution of a procedure needs a private execution space. That’s where
session information (temporary table names, local variable values, cursor information,
and global variables) is stored, and it’s where the procedure actually runs. In order to
allow multiple users to execute the same procedure concurrently, SQL Server needs to
load multiple versions of the procedure.
Q: Do I need to worry about exactly matching the data types of parameters to the
data types of referenced columns?
A: Absolutely. When you are planning to pass a stored proc parameter to a WHERE
clause, you should make certain that the parameter matches in both data type and
length. (Don’t worry about nullability: all parameters and variables allow null values.)
Take the time to look up the data type using the Enterprise Manager or sp_help. If you
fail to pass the proper data type, SQL Server may be forced to guess about
optimization, and that’s never a good thing.
If you perform the same query in a procedure where the start and end dates are
parameters, SQL Server knows the values of the parameters at optimization time, so it
substitutes them in the query to develop an optimization plan:
(@start datetime, @end datetime)
as
select pubdate, price
from titles
where pubdate between @start and @end -- optimizer can look up parameter vals
return
The procedure works fine, but may perform horribly. The problem concerns the style of
update. Certain changes to a row cause SQL Server to perform a complete overhaul of
the row (a direct or deferred update), whereas other changes only cause a modification
of the changed column values (a direct update in place). The difference in performance
can be dramatic.
SQL Server decides what form of update to use at optimization time from your syntax
and from what it knows about the table. If the UPDATE statement in the stored
procedure modifies every column (even if the value hasn’t changed), update
performance may suffer. There are lots of workarounds, including coding every possible
variety of UPDATE statement in your procedure and running the one that corresponds
with the modified variables. A more modest approach would be to modify sets of related
data, especially in tables with many columns.
Q: When I find an error inside my procedure, @@error is always 0 inside the error
trap. What’s going on?
A: You've written the error trap incorrectly. Your code probably looks like this:
update titles -- statement that causes the error to occur
set title_id = 'RR1234'
if (@@error != 0) or (@@rowcount = 0)
-- check the error and rowcount
begin
raiserror ('Error in update. Error no is %d, rowcount is %d',
0, 1, @@error, @@rowcount)
return
end
and produces output like this:
Msg 2627, Level 14, State 1
Violation of PRIMARY KEY constraint 'UPKCL_titleidind':
Attempt to insert duplicate key in object 'titles'.
Command has been aborted.
Error in update. Error no is 0, rowcount is 0
The first four lines of the output are generated by SQL Server. The error number is 2627,
and the message provides more detailed information. The last line of output is from our
RAISERROR message. Notice that the value of the error that was substituted is 0. The
global variable @@error reports the error code from the last statement executed. If the
statement succeeds, @@error is 0; any failure results in a non-zero error code. In this
case, the error code reported in our RAISERROR statement is the status of the IF test,
which worked beautifully!
If you need to reuse the value of @@error, you must store it in a variable as soon as it
is available. (This is also true of @@rowcount.) Here's how to capture and report
@@error:
declare @err int, @rcount int
-- variables to store error and rowcount
update titles -- statement that causes the error to occur
set title_id = 'RR1234'
select @err = @@error, @rcount = @@rowcount
if (@err != 0) or (@rcount = 0)
-- check the error and rowcount
begin
raiserror ('Error in update. Error no is %d, rowcount is %d',
0, 1, @err, @rcount)
return
end
You can also retrieve a stored procedure's result set with OPENROWSET, for example:
SELECT * FROM
OPENROWSET('sqloledb','server_name';'user_name';'password','exec
database.owner.stored_procedure')
Q: Does storing of data in stored procedures increase the access time? Explain?
A: No; the opposite is usually true. A stored procedure's query plan is compiled and
cached when the procedure is first used, so data accessed through a stored procedure
is often retrieved faster than through equivalent ad hoc SQL: the time gap between
submitting the query and compiling it disappears on subsequent executions. To avoid
repeatedly compiling the same statements, the plan cache is used.
Trigger
Q: What is trigger?
A: A trigger is a special kind of stored procedure that automatically executes when an
event occurs in the database server.
Q: Types of trigger.
A: DML-DML triggers execute when a user tries to modify data through a data
manipulation language (DML) event. DML events are INSERT, UPDATE, or DELETE
statements on a table or view.
DDL-DDL triggers execute in response to a variety of data definition language (DDL)
events. These events primarily correspond to Transact-SQL CREATE, ALTER, and
DROP statements, and certain system stored procedures that perform DDL-like
operations.
Logon trigger-Logon triggers fire in response to the LOGON event that is raised when a
user session is being established.
Q: Use of trigger.
A:
- prevent changes (e.g. prevent an invoice from being changed after it's been mailed
out)
- log changes (e.g. keep a copy of the old data; see the sketch after this list)
- audit changes (e.g. keep a log of the users and roles involved in changes)
- enhance changes (e.g. ensure that every change to a record is time-stamped by the
server's clock, not the client's)
- enforce business rules (e.g. require that every invoice have at least one line item)
- execute business rules (e.g. notify a manager every time an employee's bank account
number changes)
- replicate data (e.g. store a record of every change, to be shipped to another database
later)
- enhance performance (e.g. update the account balance after every detail transaction,
for faster queries)
- DDL triggers can be used for auditing purposes.
- Use triggers only if logs for table needs to be maintained.
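A sketch of the "log changes" case; the audit table and its columns are hypothetical,
and titles is the pubs-style sample table:
CREATE TRIGGER trg_titles_price_audit
ON dbo.titles
AFTER UPDATE
AS
BEGIN
    -- inserted/deleted hold the new and old versions of the changed rows
    INSERT INTO dbo.titles_audit (title_id, old_price, new_price, changed_at)
    SELECT d.title_id, d.price, i.price, GETDATE()
    FROM deleted AS d
    JOIN inserted AS i ON i.title_id = d.title_id;
END;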
Q: What is the argument over stored procedures?
A: One side will argue that all access to the database should go through stored
procedures because it is more secure and shields applications from changing logic. The
other side will vehemently argue that you should avoid this because not all databases
support stored procedures, and each supports them differently, so your code is less
portable.
Q: If I define multiple triggers on a single table, what is the order of the triggers
firing?
A: By default the triggers execute in no guaranteed order, so you cannot rely on the
order of triggers firing, although sp_settriggerorder lets you designate which trigger fires
first and which fires last.
Q: What are the types of triggers, and how are they used?
A: Triggers can be DML, DDL, or logon types. DML triggers fire with insert, delete, and
update operations. DDL triggers fire with DDL operations, such as CREATE and
GRANT. Logon triggers fire when a login connects to an instance.
Simulating an identity column with an insert trigger and a counter table introduces
several performance problems. First, the trigger needs to use a cursor to
handle multirow inserts. Cursors are slow, so updates will be slower and locking will be
more complex. Second, the use of the counter_table introduces another lock, and this
will be a far more intrusive lock than the one on your primary table. Each user
performing an insert will need to visit this table and will do so as a part of his or her
transaction. You will single-thread all inserts using this method. Contrast this with a
memory semaphore used to handle identity counters, where the locking of an identity
value is not part of the overall transaction locking function.
This method also requires the use of an alternate key, some other method of uniquely
identifying the row you want to update. Without an alternate identifier, it’s impossible to
correlate a row in the inserted table with the row in the base table that will obtain the
new counter value.
Finally, the additional update on the base table will triple the logging and memory-based
data management work versus the comparable trigger approach.
Q: Why should I use the MERGE statement, when I can write the same
functionality with individual INSERT, UPDATE, and DELETE statements?
A: The MERGE statement is especially optimized, and results in better performance,
mainly due to locking optimizations. Also, a well-written MERGE statement will be
simpler to read and write than the equivalent expressed by the individual INSERT,
UPDATE, and DELETE statements.
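A minimal MERGE sketch; the source and target tables are hypothetical:
MERGE dbo.TargetProducts AS t
USING dbo.SourceProducts AS s
    ON t.ProductID = s.ProductID
WHEN MATCHED THEN
    UPDATE SET t.Price = s.Price
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductID, Price) VALUES (s.ProductID, s.Price)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;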
Q: Don’t functions take CPU time to operate? Why didn’t you mention the
performance implications of that?
A: Functions do take time to operate, but in the SQL Server world, the bottleneck on
performance is almost always disk I/O. CPU time is cheap but data access is
expensive. If a function can save disk I/O, the small CPU hit is worth it. Remember,
however, that applying a function to a column in the WHERE clause prevents the server
from using an index to resolve the query.
Q: Does the order of the tables in the FROM clause affect performance?
A: The short answer is “No.” MS SQL 6.5 was improved to deal with multitable queries
in a better way than ever before. In previous versions, some queries that referenced
over four tables could, in rare cases, miss a possibility for better join performance.
CLR
Q: When should I choose to use the CLR over Transact-SQL to implement
database objects?
A: You should use the CLR only when the functionality you are trying to implement is
very difficult or impossible to write using Transact-SQL. For example, you simply cannot
use Transact-SQL to read files or the registry. Also, if your functionality requires a
cursor in Transact-SQL, you may get more simplicity and better performance out of a
CLR aggregate.
Q: For the exam, do I need to learn the syntax of a .NET language like Visual
Basic and C#?
A: No, you will not be required to write .NET code or find deliberate mistakes in .NET
code. You should be able to read the code, and understand what it is trying to do. Focus
your efforts on understanding SQL statements, and concepts related to CLR integration,
like CREATE ASSEMBLY, PERMISSION_SET, and EXTERNAL NAME.
XML
Q: Can I specify the format of an XML document using FOR XML?
A: Yes. You need to use the EXPLICIT mode of the FOR XML clause.
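For example (pubs-style titles table assumed; PATH mode is shown as a simpler
alternative to EXPLICIT):
SELECT title_id, price FROM titles FOR XML AUTO;  -- one element per row, columns as attributes
SELECT title_id AS "@id", price
FROM titles
FOR XML PATH('title'), ROOT('titles');            -- custom shaping without EXPLICIT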
Q: Are there some features of XPath language that are not supported?
A: Yes, in Microsoft SQLXML 4.0, some features are not supported. For example, the
arithmetic operator mod is not supported. You can see a list of unsupported XPath
features in Books Online under "Using XPath Queries in SQLXML 4.0".
SSIS
Q: Describe the Integration Services architecture.
(i) Integration Services Service: The Integration Services service, available in SQL
Server Management Studio, monitors running Integration Services packages and
manages the storage of packages.
(ii) Integration Services Object Model: The Integration Services object model includes
managed application programming interfaces (APIs) for accessing Integration Services
tools, command-line utilities, and custom applications.
(iii) Integration Services Runtime and the runtime executable: The Integration Services
runtime saves the layout of packages, runs packages, and provides support for logging,
breakpoints, configuration, connections, and transactions. The Integration Services run-
time executables are the package, containers, tasks, and event handlers that
Integration Services includes, and custom tasks.
(iv) Integration Services Data Flow: The Data Flow task encapsulates the data flow
engine and data flow components. The data flow engine provides the in-memory buffers
that move data from source to destination, and calls the sources that extract data from
files and relational databases. The data flow engine also manages the transformations
that modify data, and the destinations that load data or make data available to other
processes. Integration Services data flow components are the sources, transformations,
and destinations that Integration Services includes. You can also include custom
components in a data flow.