When I wanted to know how to remove duplicates or in more sense, If i wanted to get Unique records, I went through a post which suggested to go for Rank function to find unique records.
Well some suggested to for Row_number function. Here is a example where I worked on both the types and let you the differences.
CREATE TABLE Table1 (ID INT, employee VARCHAR(100), sal INT, dept VARCHAR(100))
GO
INSERT INTO Table1
SELECT 1, 'dhinakaran', 100, 'finance'
UNION ALL
SELECT 2, 'bharath', 100, 'marketing'
UNION ALL
SELECT 3, 'job', 100, 'accounts'
UNION ALL
SELECT 4, 'job', 100, 'markting'
UNION ALL
SELECT 5, 'job', 200, 'infrastruture'
UNION ALL
SELECT 6, 'raghu', 300, 'markting'
UNION ALL
SELECT 7, 'raghu', 400, 'finance'
GO
SELECT employee, sal, dept
FROM Table1
GO
we can observe that Employee JOB varies in dept and we can see salary.
Select the data without duplicate records as well select the data with maximum Sal.
By using Rank ( ) Function, If we try, by following query, we will get Following result.
select employee, sal, dept from (select employee, sal, dept, rank () over
(partition by employee order by sal desc) Rank from table1 ) a where Rank=1
The same result will be obtained even If we try using ROW_NUMBER ( ). The snippet is shown Below.
The real Difference between ROW_NUMBER and RANK ( ) function will come When I insert a data of similar Type.
For example, Let me insert one more value:
insert INto table1 values(8,'Job',200,'infrastruture')
select * from table1
go
Now For this table, If I use, RANK ( ) function, Then result is not we are expecting it. It is perfectly obtained by Using ROW_NUMBER ( ) only.
Result of both Snippets is as follows:
Comments
Post a Comment