
SQL Server 70-461 Exam Fundamentals - Part 2 (Tips from the Querying Microsoft SQL Server Training Kit by Itzik Ben-Gan)

In continuation of my previous blog post, we will discuss a few more fundamentals here.

Table Columns and Rows

A table is often described in terms of “fields” and “records”, which is incorrect terminology. To keep our SQL vocabulary precise, we should say “columns” and “rows”.
“Fields” are what we have on the interfaces of client applications, and “records” are what we have in files and cursors.
Tables are logical objects, and they are made up of columns and rows.
Similarly, there is no such thing as a “NULL value”; NULL is a marker for a missing value, so we should simply say NULL rather than “NULL value”.

SQL Engine
This is an interesting part of SQL: how is a query actually processed? Let us answer that shortly.
SQL is an English-like language; it uses common English terms, which are translated into executable code at a later stage.
The original name was SEQUEL (Structured English Query Language), but due to a trademark dispute with an airline company it was renamed SQL (Structured Query Language).
Let us give some English sentence:

“Bring me a sandwich from the shop.” Observing this sentence, we can say the object (sandwich) comes before the location (shop).

This would simply be SELECT Sandwich FROM Shop. (This is what we call conceptual processing.)
But the SQL engine, being a machine, is instructed internally in a different fashion, much like instructing a robot.
Example: go to the shop, ask the shopkeeper, buy a sandwich, and bring it to me.
Similarly, the logical processing would read:
FROM Shop SELECT Sandwich.

The logical order always differs from the conceptual order.

Conceptual processing follows the order SELECT >> FROM >> WHERE >> GROUP BY >> HAVING >> ORDER BY
But
Logical processing (what the SQL engine actually does) follows
FROM >> WHERE >> GROUP BY >> HAVING >> SELECT >> ORDER BY
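
To make this concrete, here is a small sketch of a query written in the conceptual order, with the logical evaluation order marked in comments (the EMP table and Sal column come from the example further below; the DeptID column is an assumption for illustration):

SELECT   DeptID, AVG(Sal) AS AvgSal     -- 5. SELECT
FROM     EMP                            -- 1. FROM
WHERE    Sal > 0                        -- 2. WHERE
GROUP BY DeptID                         -- 3. GROUP BY
HAVING   AVG(Sal) >= 4000               -- 4. HAVING
ORDER BY AvgSal DESC;                   -- 6. ORDER BY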

Recap: if ORDER BY is specified in a query, the result is not relational (a relation has no order among its rows).

Example for Logical Processing:

SELECT EMPID, AVG(Sal) AS AvgSal FROM EMP WHERE AvgSal >= 4000
This query returns an error saying there is no column called “AvgSal”.

The reason is that, according to logical processing, the WHERE clause is evaluated before the SELECT clause, so the alias AvgSal does not exist yet when the filter runs.
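
A working version of the query (a sketch that keeps the EMP table and columns from the example and assumes the intent is the average salary per EMPID) moves the aggregate condition into a HAVING clause after a GROUP BY:

SELECT   EMPID, AVG(Sal) AS AvgSal
FROM     EMP
GROUP BY EMPID
HAVING   AVG(Sal) >= 4000;   -- the alias AvgSal is not available here either, so the expression is repeated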

NOTE: ORDER BY is the only clause that is allowed to refer to column aliases defined in the SELECT clause.

·         Rows in a table have no particular order, conceptually. There is no guarantee that rows are stored in the order in which they were inserted; the SQL Server engine decides the physical layout based on performance, memory, and its own tuning so that data can be retrieved faster.
·         When a SELECT statement has no ORDER BY, the rows that are easiest to access are returned first, irrespective of the order in which they were stored.
A table follows the rules of set theory if and only if (see the example after this list):
·         No ORDER BY clause is applied to the result.
·         The data in the table is a distinct set (no duplicate rows).
·         All attributes/columns are named; if a column has no name, the table is not relational.
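
For example, here is a small sketch against the same hypothetical EMP table: the first query returns a relational result, while the second does not, because its computed column has no name and ORDER BY gives the result a guaranteed order.

-- Relational: every column is named and there is no guaranteed row order
SELECT EMPID, Sal * 12 AS AnnualSal FROM EMP;

-- Not relational: the second column has no name and ORDER BY fixes the row order
SELECT EMPID, Sal * 12 FROM EMP ORDER BY EMPID;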

Difference between the WHERE and HAVING clauses?

·         The WHERE clause is evaluated before rows are grouped and therefore is evaluated per row. The HAVING clause is evaluated after rows are grouped and therefore is evaluated per group.
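
As an illustration, here is a sketch against the same hypothetical EMP table (the DeptID column is assumed): WHERE filters individual rows before grouping, while HAVING filters whole groups after grouping.

SELECT   DeptID, COUNT(*) AS NumEmps
FROM     EMP
WHERE    Sal >= 2000        -- evaluated per row, before grouping
GROUP BY DeptID
HAVING   COUNT(*) > 5;      -- evaluated per group, after grouping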
Let us look at another example, this time involving column aliases:

SELECT empID, YEAR(hireddate) AS yearhired, yearhired - 1 AS prevyear FROM emp;
This query will throw an error: invalid column name “yearhired”.

The reason is that, conceptually, T-SQL evaluates all expressions that appear in the same logical query processing phase in an all-at-once manner, so the alias yearhired does not exist yet while the other expressions in the SELECT clause are evaluated. SQL Server won’t necessarily physically process all expressions at the same point in time, but it has to produce a result as if it did.
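
Because of this all-at-once evaluation, the alias yearhired cannot be referenced by other expressions in the same SELECT clause. A working version (a sketch keeping the column names from the example) simply repeats the expression:

SELECT empID,
       YEAR(hireddate) AS yearhired,
       YEAR(hireddate) - 1 AS prevyear
FROM   emp;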

·         Sorting data adds cost to query processing.
·         ORDER BY is the only clause that is allowed to refer to column aliases defined in the SELECT clause. That’s because the ORDER BY clause is the only one evaluated after the SELECT clause.
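
For instance (continuing the same sketch), the alias that could not be reused inside the SELECT clause can be referenced in ORDER BY, because ORDER BY is evaluated after SELECT:

SELECT   empID, YEAR(hireddate) AS yearhired
FROM     emp
ORDER BY yearhired;   -- valid: the alias already exists by the time ORDER BY runs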
