
Use of the MERGE Command: Update if changed, Insert when not exists, Delete from Target if not found in Source (Incremental Load) / UPSERT Command

Hello All,

Today we will look at how to achieve the Incremental Load concept, familiar from SSIS, in plain SQL.

SQL Server provides the MERGE command to accomplish this task.

We start by creating two tables:
  1. Source
  2. Destination
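-- Source table: holds the latest version of the data
Create table Src (ID Int, Name varchar(100), Designation varchar(100));
Insert Into Src (ID, Name, Designation)
Select 1, 'Dhina', 'Snr.Analyst'
Union All
Select 2, 'Scott', 'Lead Analyst'
Union All
Select 5, 'Peter', 'Jnr. Analyst';

-- Destination table: holds the older copy that must be brought up to date
Create table Dest (ID Int, Name varchar(100), Designation varchar(100));
Insert Into Dest (ID, Name, Designation)
Select 1, 'Dhina', 'Analyst'
Union All
Select 2, 'Scott', 'Lead Analyst'
Union All
Select 3, 'Brad', 'Test Analyst';

-- Compare the two tables before synchronizing
Select * From Src;
Select * From Dest;
Go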

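These two queries return the following rows (row order is not guaranteed without an ORDER BY):

Src
ID  Name   Designation
1   Dhina  Snr.Analyst
2   Scott  Lead Analyst
5   Peter  Jnr. Analyst

Dest
ID  Name   Designation
1   Dhina  Analyst
2   Scott  Lead Analyst
3   Brad   Test Analyst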

As shown, we need to update the employee 'Dhina', who was promoted from Analyst to Snr.Analyst, in the destination table.
We need to insert the new employee 'Peter' into the destination table "Dest", because he is not present there yet.
We shall also delete 'Brad', whose record is found in the destination table "Dest" but not in the source table "Src".

In this way, we keep the source and destination tables synchronized.

We have two types of loads: 
  • Full Load
  • Incremental Load
Full Load: Dropping or truncating the entire dataset in the destination and reloading every row from the source. This is not recommended when we have a huge number of records, or when other objects depend on the destination table.

Although this method is simple, it takes a lot of time, and the destination (target) table is unavailable while the load is running.
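As a rough illustration, a full load of the sample tables could look like the sketch below. It assumes the Src and Dest tables created earlier in this post; wrapping the two steps in a transaction is my own choice, not something the approach requires.

```sql
-- Full load sketch: empty the destination, then reload every row from the source.
-- Assumes the dbo.Src and dbo.Dest tables created earlier in this post.
BEGIN TRANSACTION;

    -- TRUNCATE TABLE fails if Dest is referenced by a foreign key;
    -- in that case DELETE FROM dbo.Dest would be needed instead.
    TRUNCATE TABLE dbo.Dest;

    -- Reload everything, changed or not.
    INSERT INTO dbo.Dest (ID, Name, Designation)
    SELECT ID, Name, Designation
    FROM   dbo.Src;

COMMIT TRANSACTION;
```

Every row is rewritten here, including 'Scott', whose record did not change at all. That wasted work is what the incremental approach avoids.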

Incremental Load: This method works only on the few records that have changed or been added recently. The old records are left untouched, and because it works on a small set of records, this method is faster and does not disturb any dependencies on the destination table.
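To make the idea concrete, here is a minimal sketch of an incremental load written by hand as three separate statements. It assumes the Src and Dest sample tables from this post and that ID uniquely identifies a row; it is one possible way to do it, not the only one.

```sql
-- Incremental load sketch without MERGE, assuming dbo.Src and dbo.Dest
-- from this post and that ID uniquely identifies a row.

-- 1. Update rows that exist on both sides but differ.
--    Note: <> never matches when either side is NULL, so nullable
--    columns would need extra IS NULL checks.
UPDATE d
SET    d.Name        = s.Name,
       d.Designation = s.Designation
FROM   dbo.Dest AS d
JOIN   dbo.Src  AS s ON s.ID = d.ID
WHERE  d.Name <> s.Name
   OR  d.Designation <> s.Designation;

-- 2. Insert rows that exist only in the source.
INSERT INTO dbo.Dest (ID, Name, Designation)
SELECT s.ID, s.Name, s.Designation
FROM   dbo.Src AS s
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Dest AS d WHERE d.ID = s.ID);

-- 3. Delete rows that no longer exist in the source.
DELETE d
FROM   dbo.Dest AS d
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Src AS s WHERE s.ID = d.ID);
```

With the sample data, step 1 touches only 'Dhina', step 2 adds 'Peter', and step 3 removes 'Brad'. The MERGE command expresses the same three steps in a single statement.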


An incremental load can also be built using SSIS, which I will explain in a separate post. Here we concentrate on how to achieve the same in SQL.


merge  [dbo].[Dest] as d
       using  [dbo].[Src] as s
       on d.ID=s.ID
when matched
       then update set d.Designation=s.Designation
when  not matched
       then insert (ID,Name,Designation)
       values (s.ID,s.Name,s.Designation)
when  not matched by source
       then delete;

Clause-by-clause explanation:

The MERGE keyword must be followed by the destination (target) table.
USING must be followed by the source table.

WHEN MATCHED: when records match on ID, the destination record is updated with the designation from the source. Note that, as written, the statement updates every matched row, whether or not its designation actually changed.

WHEN NOT MATCHED: when a source record has no match in the destination, it is a new record that is not yet present in the destination, so we insert it.

WHEN NOT MATCHED BY SOURCE: when a destination record has no match in the source table, we delete it.

After running the MERGE (upsert) statement above, Select * From Dest returns the rows below, and the two tables are synchronized:

ID  Name   Designation
1   Dhina  Snr.Analyst
2   Scott  Lead Analyst
5   Peter  Jnr. Analyst
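If you want the statement itself to report what it did, and to update only rows that really changed, a variation along the following lines may help. This is a sketch under my own assumptions: the extra AND condition, the Name column in the UPDATE, and the OUTPUT column aliases (MergeAction, NewID, OldID) are illustrative additions, not part of the statement above.

```sql
-- MERGE sketch with change detection and an audit of the actions taken.
-- Assumes dbo.Src and dbo.Dest from this post; the OUTPUT aliases are illustrative.
MERGE dbo.Dest AS d
USING dbo.Src  AS s
    ON d.ID = s.ID
-- Update only when a compared column differs.
-- Note: <> never matches when either side is NULL.
WHEN MATCHED AND (d.Name <> s.Name OR d.Designation <> s.Designation)
    THEN UPDATE SET d.Name        = s.Name,
                    d.Designation = s.Designation
WHEN NOT MATCHED BY TARGET
    THEN INSERT (ID, Name, Designation)
         VALUES (s.ID, s.Name, s.Designation)
WHEN NOT MATCHED BY SOURCE
    THEN DELETE
-- $action reports INSERT, UPDATE or DELETE for each affected row.
OUTPUT $action     AS MergeAction,
       inserted.ID AS NewID,
       deleted.ID  AS OldID;
```

Run against the original sample data (before any synchronization), this should report one UPDATE for ID 1, one INSERT for ID 5 and one DELETE for ID 3, while the unchanged row for 'Scott' is not touched. Run a second time, it should report no rows, because the tables are already in sync.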





