Microsoft SQL Server

A Relational Database Management System (RDBMS) is a database management system that maintains data records and indexes in tables. Relationships may be created and maintained across and among the data and tables. In a relational database, relationships between data items are expressed by means of tables, and interdependencies among these tables are expressed by data values rather than by pointers. This allows a high degree of data independence. An RDBMS can recombine data items from different tables, providing powerful tools for data usage.

Database normalization is a data design and organization process applied to data structures, based on rules that help build relational databases. In relational database design, it is the process of organizing data to minimize redundancy. Normalization usually involves dividing a database into two or more tables and defining relationships between the tables. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.
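The idea of isolating a fact in one table can be sketched as follows. This is a minimal illustration in Python using the standard-library sqlite3 module as a stand-in for SQL Server; the customers and orders tables and their columns are hypothetical.

```python
import sqlite3

# In-memory database; SQLite is used here only because it runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Customer details live in exactly one place.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    -- Orders refer to the customer by key instead of repeating name and city.
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        item        TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada', 'London');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'monitor');
""")

# The city is stored once, so a single UPDATE is enough.
conn.execute("UPDATE customers SET city = 'Paris' WHERE customer_id = 1")

# Every order sees the new city through the relationship.
rows = conn.execute("""
    SELECT o.order_id, c.name, c.city
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

Had the city been repeated on every order row, the same change would have required updating each of those rows.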

A - Atomic. A transaction must be atomic: it is one unit of work that is either committed or rolled back as a whole, with no “in-between” case where something has been updated and something hasn’t.

C - Consistent. A committed transaction leaves the database in a valid state in which all defined rules and constraints hold.

I - Isolated. No other transaction sees the intermediate results of the current transaction.

D - Durable. Once a transaction has been committed, its changes persist even if the system crashes right afterwards.
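Atomicity and consistency can be sketched with a money transfer that either completes fully or leaves no trace. This is a minimal illustration in Python with the standard-library sqlite3 module standing in for SQL Server; the accounts table and the transfer helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: a balance may never go negative.
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one unit of work (hypothetical helper)."""
    try:
        # 'with conn' commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The credit to dst was undone together with the failed debit.
        return False


def balances(conn):
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()


failed = transfer(conn, 1, 2, 500)      # would overdraw account 1
after_failed = balances(conn)
succeeded = transfer(conn, 1, 2, 30)
after_success = balances(conn)
print(failed, after_failed, succeeded, after_success)
```

The failed transfer leaves both balances untouched even though its first UPDATE had already run, which is the all-or-nothing behaviour described above.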

A stored procedure is a named group of SQL statements that has been created and stored in the server database. Stored procedures accept input parameters, so a single procedure can be used over the network by several clients with different input data. When the procedure is modified, all clients automatically get the new version. Stored procedures reduce network traffic and improve performance, and they can be used to help ensure the integrity of the database.

Examples of system stored procedures include sp_helpdb, sp_renamedb and sp_depends.

  • Stored procedures can reduce network traffic and latency, boosting application performance.
  • Stored procedure execution plans can be reused, staying cached in SQL Server’s memory and reducing server overhead.
  • Stored procedures help promote code reuse.
  • Stored procedures can encapsulate logic. You can change stored procedure code without affecting clients.
  • Stored procedures provide better security for our data: users can be granted permission to execute a procedure without being given direct access to the underlying tables.

A trigger is a SQL procedure that initiates an action when an event (INSERT, DELETE or UPDATE) occurs. Triggers are stored in and managed by the DBMS. Triggers are used to maintain the referential integrity of data by changing the data in a systematic fashion. A trigger cannot be called or executed directly; the DBMS automatically fires the trigger as a result of a data modification to the associated table. Triggers are similar to stored procedures in that both consist of procedural logic that is stored at the database level. Stored procedures, however, are not event-driven and are not attached to a specific table as triggers are. Stored procedures are explicitly executed with an EXECUTE (EXEC) statement, while triggers are implicitly executed. In addition, triggers can also execute stored procedures.

Triggers use the inserted and deleted virtual tables, which hold the new and the old versions of the rows affected by the statement.

Nested Trigger: a trigger can also contain INSERT, UPDATE and DELETE logic within itself, so when the trigger is fired because of data modification it can also cause another data modification, thereby firing another trigger. A trigger that contains data modification logic within itself is called a nested trigger.
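An audit trigger of the kind described above can be sketched as follows. The example uses Python with the standard-library sqlite3 module as a stand-in, so it differs from SQL Server in one important way: SQLite triggers run once per row and expose OLD and NEW, whereas SQL Server triggers run once per statement and expose the deleted and inserted tables. The table and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER NOT NULL);
    CREATE TABLE price_audit (product_id INTEGER, old_price INTEGER, new_price INTEGER);

    -- Fired automatically by the DBMS; it is never called directly.
    CREATE TRIGGER trg_products_price
    AFTER UPDATE OF price ON products
    FOR EACH ROW
    BEGIN
        -- OLD and NEW play the role of SQL Server's deleted and inserted tables.
        INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
    END;

    INSERT INTO products VALUES (1, 100);
    UPDATE products SET price = 120 WHERE id = 1;
""")

audit_rows = conn.execute("SELECT * FROM price_audit").fetchall()
print(audit_rows)
```

The UPDATE statement never mentions price_audit; the row appears there only because the trigger fired.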

A view can be defined as a stored query. A simple view can be thought of as a subset of a table. It can be used for retrieving data, as well as for updating or deleting rows. Rows updated or deleted in the view are updated or deleted in the table the view was created from. As data in the original table changes, so does the data in the view, because a view is a way of looking at part of the original table. The results of using a view are not permanently stored in the database. The data accessed through a view is constructed using a standard T-SQL SELECT command and can come from one or many base tables or even other views.
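The point that a view stores a query rather than data can be sketched like this, using Python with the standard-library sqlite3 module as a stand-in. SQLite views are read-only, so the example shows only the read side; the employees table and the it_staff view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES (1, 'Ada', 'IT', 900), (2, 'Bob', 'HR', 700);

    -- The view stores only the SELECT, not a copy of the rows.
    CREATE VIEW it_staff AS
        SELECT id, name FROM employees WHERE dept = 'IT';
""")

before = conn.execute("SELECT name FROM it_staff ORDER BY id").fetchall()

# A change to the base table is visible through the view immediately.
conn.execute("INSERT INTO employees VALUES (3, 'Cy', 'IT', 800)")
after = conn.execute("SELECT name FROM it_staff ORDER BY id").fetchall()
print(before, after)
```

The view also hides the salary column, which is one common reason for exposing a view instead of the table itself.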

An index is a set of ordered references to the rows of a table. Indexes are created on an existing table to locate rows more quickly and efficiently. It is possible to create an index on one or more columns of a table, and each index is given a name. Users do not see the indexes; they are used only to speed up queries. Effective indexes are one of the best ways to improve performance in a database application.

A table scan happens when there is no index available to help a query. In a table scan, SQL Server examines every row in the table to satisfy the query. Table scans are sometimes unavoidable, but on large tables they have a severe impact on performance.
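The difference between a scan and an index lookup can be observed in a query plan. The sketch below uses Python with the standard-library sqlite3 module and its EXPLAIN QUERY PLAN statement as a stand-in; in SQL Server the equivalent information comes from the execution plan. The orders table, the index name and the plan helper are hypothetical, and the exact plan wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"c{i % 100}", i) for i in range(1000)],
)


def plan(conn, sql):
    """Return the human-readable plan steps for a query (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the description text.
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM orders WHERE customer = 'c42'"

# No index on customer yet: every row has to be examined.
plan_without_index = plan(conn, query)

conn.execute("CREATE INDEX ix_orders_customer ON orders (customer)")

# With the index, the engine can seek directly to the matching rows.
plan_with_index = plan(conn, query)

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan of the table, while the second names ix_orders_customer as the access path.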

A clustered index is a special type of index that reorders the way records in the table are physically stored. It defines the physical sorting of a database table’s rows. The leaf nodes of a clustered index contain the data pages. For this reason, each database table may have only one clustered index.

A non-clustered index is an index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf nodes of a non-clustered index do not consist of the data pages. Instead, the leaf nodes contain index rows. A non-clustered index is stored separately from the table data and contains a sorted list of references to the rows of the table.

A cursor is a database object used by applications to manipulate data in a set on a row-by-row basis, instead of the typical SQL commands that operate on all the rows in the set at one time.
In order to work with a cursor, we need to perform the following steps in order:

  1. Declare cursor
  2. Open cursor
  3. Fetch row from the cursor
  4. Process fetched row
  5. Close cursor
  6. Deallocate cursor
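The same row-by-row pattern can be sketched with a client-side cursor. This example uses Python with the standard-library sqlite3 module, whose cursor object is an analogue of, not the same thing as, a T-SQL cursor created with DECLARE CURSOR; closing it here covers both the close and deallocate steps. The invoices table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO invoices VALUES (1, 10), (2, 25), (3, 7);
""")

cur = conn.cursor()                                          # 1. declare
cur.execute("SELECT id, amount FROM invoices ORDER BY id")   # 2. open

running_total = 0
totals = []
row = cur.fetchone()                                         # 3. fetch
while row is not None:
    running_total += row[1]                                  # 4. process
    totals.append((row[0], running_total))
    row = cur.fetchone()                                     # fetch the next row

cur.close()                                                  # 5-6. close and release
print(totals)
```

A running total is used only because it makes the per-row processing visible; set-based SQL is usually preferable when it can express the same result.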

Both primary key and unique constraints enforce uniqueness of the column on which they are defined. By default, however, a primary key creates a clustered index on the column, whereas a unique constraint creates a non-clustered index.

Another major difference is that a primary key doesn’t allow NULLs, whereas a unique key allows a single NULL.

A One-to-One relationship can be implemented as a single table or, more rarely, as two tables with primary and foreign key relationships.

One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships.

Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.
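A junction table with a composite primary key can be sketched as follows, using Python with the standard-library sqlite3 module as a stand-in. The students, courses and enrollments tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(student_id),
        course_id  INTEGER NOT NULL REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO students VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO courses  VALUES (10, 'SQL'), (20, 'Python');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# A student can have many courses and a course can have many students.
pairs = conn.execute("""
    SELECT s.name, c.title
    FROM enrollments AS e
    JOIN students AS s ON s.student_id = e.student_id
    JOIN courses  AS c ON c.course_id  = e.course_id
    ORDER BY s.name, c.title
""").fetchall()

# The composite primary key rejects a second copy of the same pair.
try:
    conn.execute("INSERT INTO enrollments VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(pairs, duplicate_rejected)
```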

Under the default isolation level, SELECT statements take Shared (Read) locks. This means that multiple SELECT statements are allowed simultaneous access, but other processes are blocked from modifying the data. The updates will queue until all the reads have completed, and reads requested after the update will wait for the updates to complete. The result for our system is delay (because of blocking). When the NOLOCK hint is included in a SELECT statement, no shared locks are taken when data is read. The advantage for performance is that our reading of data will not block updates from taking place, and updates will not block our reading of data. The cost is a Dirty Read: another process could be updating the data at the exact time we are reading it, so there is no guarantee that the query retrieves committed or up-to-date data. The NOLOCK hint is therefore sometimes used to improve concurrency on a busy system, but it is suitable only for queries that can tolerate inaccurate results.

The DELETE command removes rows from a table based on the condition that we provide in a WHERE clause.
TRUNCATE removes all the rows from a table, and there will be no data in the table after we run the TRUNCATE command.

TRUNCATE

  • TRUNCATE is a DDL command.
  • TRUNCATE removes the data by deallocating the data pages used to store the table’s data, and only the page deallocations are recorded in the transaction log.
  • TRUNCATE is faster and uses fewer system and transaction log resources than DELETE.
  • TRUNCATE removes all rows from a table, but the table structure and its columns, constraints, indexes and so on remain. The counter used by an identity for new rows is reset to the seed for the column.
  • We cannot use TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint.
  • Because TRUNCATE TABLE does not log individual row deletions, it cannot activate a trigger.
  • TRUNCATE can be rolled back only when it runs inside an explicit transaction that has not yet been committed.

DELETE

  • DELETE is a DML command.
  • DELETE removes rows one at a time and records an entry in the transaction log for each deleted row.
  • DELETE does not reset the identity counter of the table, so we can use DELETE when we want to retain it. If we need to remove the table definition as well as its data, we can use the DROP TABLE statement.
  • DELETE can be used with or without a WHERE clause.
  • DELETE activates triggers.
  • DELETE can be rolled back when it runs inside a transaction that has not yet been committed.

User-defined functions allow us to define our own T-SQL functions that accept zero or more parameters and return a single scalar data value or a table data type.
There are three types of user-defined functions:

  • Scalar Functions (return a single value)
  • Inline Table-Valued Functions (contain a single T-SQL statement and return a table set)
  • Multi-Statement Table-Valued Functions (contain multiple T-SQL statements and return a table set)

Scalar User-Defined Function

A scalar user-defined function returns one of the scalar data types. The text, ntext, image and timestamp data types are not supported. We pass in zero or more parameters and get a return value.

Inline Table-Valued User-Defined Function

An inline table-valued user-defined function returns a table data type and is an alternative to a view: the function can pass parameters into a T-SQL SELECT command and in essence provides us with a parameterized, non-updateable view of the underlying tables.

Multi-Statement Table-Valued User-Defined Function

A multi-statement table-valued user-defined function returns a table and is also an alternative to a view, as the function can use multiple T-SQL statements to build the final result, whereas a view is limited to a single SELECT statement. It can also pass parameters into a T-SQL SELECT command or a group of them, which gives us a parameterized, non-updateable view of the data in the underlying tables. Within the CREATE FUNCTION command we must define the structure of the table that is returned. After it has been created, this type of user-defined function can be used in the FROM clause of a T-SQL command, unlike a stored procedure, which can also return record sets.
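A scalar user-defined function can be sketched with the standard-library sqlite3 module in Python, where the function body is written in Python and registered with create_function. This is only an analogue of CREATE FUNCTION in T-SQL and covers the scalar case only; the items table and the add_tax function are hypothetical.

```python
import sqlite3


def add_tax(amount_cents, percent):
    """Scalar function: takes two inputs and returns one value."""
    return amount_cents * (100 + percent) // 100


conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name add_tax with two arguments.
conn.create_function("add_tax", 2, add_tax)

conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, price_cents INTEGER);
    INSERT INTO items VALUES (1, 1000), (2, 250);
""")

# A scalar function can appear in the select list ...
with_tax = conn.execute(
    "SELECT id, add_tax(price_cents, 20) FROM items ORDER BY id"
).fetchall()

# ... and in the WHERE clause of the same kind of statement.
expensive = conn.execute(
    "SELECT id FROM items WHERE add_tax(price_cents, 20) > 500 ORDER BY id"
).fetchall()

print(with_tax, expensive)
```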

  • A stored procedure is compiled when it is first executed, and its execution plan is cached and reused on later calls.
  • Functions are often described as being compiled on every call, although SQL Server can cache execution plans for functions as well. A function cannot modify the data received as parameters.
  • A function must return a value, whereas in a stored procedure a return value is optional; a procedure can return zero or n values.
  • Functions can have only input parameters, whereas procedures can have input and output parameters.
  • Functions can be called from a procedure, whereas procedures cannot be called from a function.
  • A procedure allows SELECT as well as DML (INSERT/UPDATE/DELETE) statements, whereas a function allows SELECT statements but cannot modify database tables.
  • A function can be embedded in a SELECT statement, for example in the select list or in the WHERE or HAVING clause, whereas a stored procedure cannot.
  • Functions that return tables can be treated as another rowset and used in JOINs with other tables. Inline functions can be thought of as views that take parameters and can be used in JOINs and other rowset operations.
  • An exception can be handled by a TRY...CATCH block in a procedure, whereas a TRY...CATCH block cannot be used in a function.
  • We can use transactions in a procedure, whereas we cannot use transactions in a function.

Joins are used in queries to specify how different tables are related. Joins also let you select data from one table depending on data from another table.

Types of joins are:

  • INNER JOINs
  • OUTER JOINs
  • CROSS JOINs

OUTER JOINs are further classified as:

  • LEFT OUTER JOINs
  • RIGHT OUTER JOINs
  • FULL OUTER JOINs

There is also a particular case in which a table joins to itself, using one or two aliases to avoid confusion. This is a Self Join. A self join can be of any type, as long as the joined tables are the same. A self join is unusual in that it involves a relationship within a single table. The common example is a company with a hierarchical reporting structure in which one member of staff reports to another.
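The reporting-structure example can be sketched as a self join, using Python with the standard-library sqlite3 module as a stand-in. The employees table and its manager_id column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        manager_id INTEGER REFERENCES employees(id)  -- points back into the same table
    );
    INSERT INTO employees VALUES
        (1, 'Ada', NULL),
        (2, 'Bob', 1),
        (3, 'Cy',  1),
        (4, 'Di',  2);
""")

# The aliases e (employee) and m (manager) let the table play both roles.
reports = conn.execute("""
    SELECT e.name AS employee, m.name AS manager
    FROM employees AS e
    JOIN employees AS m ON m.id = e.manager_id
    ORDER BY e.id
""").fetchall()
print(reports)
```

Because this is an inner self join, Ada does not appear as an employee: she has no manager row to match.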

A cross join that does not have a WHERE clause produces the Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. The common example is a company that wants to combine each product with a pricing table in order to analyze each product at each price.
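The product-and-price example can be sketched as follows, using Python with the standard-library sqlite3 module as a stand-in. The products and price_tiers tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products    (name  TEXT);
    CREATE TABLE price_tiers (price INTEGER);
    INSERT INTO products    VALUES ('pen'), ('cup');
    INSERT INTO price_tiers VALUES (1), (2), (3);
""")

# No join condition: every product is paired with every price.
combos = conn.execute("""
    SELECT p.name, t.price
    FROM products AS p
    CROSS JOIN price_tiers AS t
    ORDER BY p.name, t.price
""").fetchall()

print(len(combos))   # 2 products x 3 prices = 6 rows
print(combos)
```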

The LEFT JOIN keyword returns all records from the left table (table1) and the matching records from the right table (table2). If there is no match, the row from the left table is still returned, with NULL in the columns that come from the right table.
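The behaviour for unmatched rows can be sketched as follows, using Python with the standard-library sqlite3 module as a stand-in; SQL NULL appears as None in Python. The customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO orders    VALUES (10, 1, 'keyboard');
""")

# Bob has no orders, but he is still returned because customers is the left table.
left_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()
print(left_rows)
```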

A full outer join is a method of combining tables so that the result includes unmatched rows of both tables. If you are joining two tables and want the result set to include unmatched rows from both tables, use a FULL OUTER JOIN clause.
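The result of a full outer join can be sketched as follows. SQL Server supports FULL OUTER JOIN directly; this example uses Python with the standard-library sqlite3 module, and because older SQLite versions lack that keyword it emulates the same result with a LEFT JOIN plus the unmatched rows from the other side. The emulation is an assumption of this sketch, not something SQL Server requires, and the tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
    -- Order 11 refers to a customer that does not exist.
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 3, 'mouse');
""")

full_rows = conn.execute("""
    -- All customers, with their orders where they have any.
    SELECT c.name, o.item
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id

    UNION ALL

    -- Plus the orders that matched no customer.
    SELECT NULL, o.item
    FROM orders AS o
    LEFT JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE c.customer_id IS NULL
""").fetchall()
print(full_rows)
```

The result contains the matched pair, the customer without an order, and the order without a customer, which is what FULL OUTER JOIN returns in a single clause.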

Both WHERE and HAVING clauses specify a search condition, but they apply at different stages: WHERE filters individual rows, while HAVING filters groups or aggregates.

The WHERE clause is applied to each row before the rows are grouped by the GROUP BY clause. The HAVING condition is applied after the grouping of rows has occurred.

HAVING can be used only with the SELECT statement. HAVING is typically used with a GROUP BY clause. When GROUP BY is not used, HAVING is applied to the whole result as a single implicit group.
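The order in which the two clauses apply can be sketched as follows, using Python with the standard-library sqlite3 module as a stand-in. The sales table and the thresholds are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('north', 100), ('north', 50),
        ('south', 30),  ('south', 20),
        ('east', 5),    ('east', 500);
""")

totals = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 20            -- row filter: drops the 5 before grouping
    GROUP BY region
    HAVING SUM(amount) >= 100     -- group filter: drops south (30 + 20 = 50)
    ORDER BY region
""").fetchall()
print(totals)
```

The east total is 500 rather than 505 because WHERE removed the small sale before the groups were formed; south disappears only at the HAVING stage.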

Sub-queries are often referred to as sub-selects, as they allow a SELECT statement to be executed within the body of another SQL statement. A sub-query is executed by enclosing it in a set of parentheses. Sub-queries are generally used to return a single row as an atomic value, though they may be used to compare values against multiple rows with the IN keyword.

A subquery is a SELECT statement that is nested within another T-SQL statement. If executed independently of the T-SQL statement in which it is nested, a subquery SELECT statement will return a result set. This means that a subquery can stand alone and does not depend on the statement in which it is nested (a correlated subquery, which references columns of the outer statement, is the exception). A subquery SELECT statement can return any number of values and can be found in the column list of a SELECT statement and in the FROM, WHERE, GROUP BY, HAVING, and/or ORDER BY clauses of a T-SQL statement. A subquery can also be used as a parameter to a function call. Basically, a subquery can be used anywhere an expression can be used.
A subquery has the following properties:

  • A subquery must be enclosed in parentheses.
  • A subquery cannot contain an ORDER BY clause unless TOP, OFFSET or FOR XML is also specified.
  • A subquery is conventionally placed on the right-hand side of the comparison operator.
  • A query can contain more than one subquery.

There are the following types of subqueries in SQL Server:

  • Single-row subquery, where the subquery returns only one row
  • Multiple-row subquery, where the subquery returns multiple rows
  • Multiple-column subquery, where the subquery returns multiple columns.
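Single-row and multiple-row subqueries can be sketched as follows, using Python with the standard-library sqlite3 module as a stand-in. The employees and departments tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE employees   (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, salary INTEGER);
    INSERT INTO departments VALUES (1, 'IT', 'London'), (2, 'HR', 'Paris');
    INSERT INTO employees   VALUES (1, 'Ada', 1, 900), (2, 'Bob', 2, 700), (3, 'Cy', 1, 800);
""")

# Single-row subquery: returns one value, used with a comparison operator.
above_average = conn.execute("""
    SELECT name
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY id
""").fetchall()

# Multiple-row subquery: returns several values, compared with IN.
in_london = conn.execute("""
    SELECT name
    FROM employees
    WHERE dept_id IN (SELECT dept_id FROM departments WHERE city = 'London')
    ORDER BY id
""").fetchall()

print(above_average, in_london)
```

Both subqueries are enclosed in parentheses, sit on the right-hand side of their operator, and could be run on their own.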