
ROW_NUMBER() using SSIS

Hi Everyone,

I would like to share how to achieve the ROW_NUMBER() functionality in SSIS.
Let us work through an example.


The business logic I had to follow was to assign a “Twin Code” to each record. For each family in the database, two or more members born on the same day are treated as twins, and the twins are assigned a code that enumerates them in order of birth.
In SQL, this can be achieved with a simple ROW_NUMBER() function.
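To make the target result concrete, here is a minimal C# sketch of what a ROW_NUMBER() partitioned by family and date of birth would compute. It is an illustration only: the Member class, the Name field, the TwinCodeDemo class and the sample rows are hypothetical, and it assumes DateOfBirth carries a time of day so that twins can be ordered by birth.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape; only FamilyID and DateOfBirth come from the post.
public class Member
{
    public int FamilyID;
    public string Name;
    public DateTime DateOfBirth;
}

public static class TwinCodeDemo
{
    // Groups rows by (FamilyID, calendar day of birth), orders each group by
    // the full DateOfBirth value and numbers the rows 1, 2, 3, ... per group.
    public static List<Tuple<Member, int>> Number(IEnumerable<Member> rows)
    {
        return rows
            .GroupBy(m => new { m.FamilyID, Day = m.DateOfBirth.Date })
            .SelectMany(g => g.OrderBy(m => m.DateOfBirth)
                              .Select((m, i) => Tuple.Create(m, i + 1)))
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data for illustration.
        var rows = new List<Member>
        {
            new Member { FamilyID = 1, Name = "Ann", DateOfBirth = new DateTime(2001, 5, 4, 8, 10, 0) },
            new Member { FamilyID = 1, Name = "Bob", DateOfBirth = new DateTime(2001, 5, 4, 8, 0, 0) },
            new Member { FamilyID = 1, Name = "Cid", DateOfBirth = new DateTime(2005, 1, 1) },
            new Member { FamilyID = 2, Name = "Dee", DateOfBirth = new DateTime(2005, 1, 1) },
        };

        foreach (var pair in Number(rows))
        {
            Console.WriteLine("{0} {1} {2}", pair.Item1.FamilyID, pair.Item1.Name, pair.Item2);
        }
    }
}
```

In this sample, Bob and Ann share a family and a birthday, so they would get codes 1 and 2, while Cid and Dee each start a new group at 1.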



To achieve the same in SSIS, we need a Data Flow task.

Connect an OLE DB Source to the Family table.



Next, add a Sort transformation, which plays the role of the ORDER BY clause in the ROW_NUMBER() function.

We sort by the FamilyID and DateOfBirth columns.



Now add a Script Component. Because we need to “partition by” FamilyID and DateOfBirth, we include those columns as inputs to the Script Component, which we name Partition.




To add the inputs, go to the Input Columns page of the Script Component and select the DateOfBirth and FamilyID columns.



Next, to create one more column that holds the ROW_NUMBER() values, I add an output column named Row_Rank on the Inputs and Outputs page.



Add the connection for the Script Component on its Connection Managers page.


Now add the following code and click OK to generate the row number.
/* Microsoft SQL Server Integration Services Script Component
*  Write scripts using Microsoft Visual C# 2008.
*  ScriptMain is the entry point class of the script.*/

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Date of birth of the previous row. It is declared at class level so that
    // it keeps its value between rows and can be compared with the current row.
    DateTime Category;

    // Running row number within the current group. It is reset to 1 whenever
    // a new group starts.
    int row_rank = 1;


    public override void PreExecute()
    {
        base.PreExecute();
        /*
          Add your code here for preprocessing or remove if not needed
        */
    }

    public override void PostExecute()
    {
        base.PostExecute();
        /*
          Add your code here for postprocessing or remove if not needed
          You can set read/write variables here, for example:
          Variables.MyIntVar = 100
        */
    }

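    // FamilyID of the previous row. It is kept as object so that the comparison
    // works whatever data type the FamilyID column has.
    object previous_family_id;

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // The input is sorted by FamilyID and DateOfBirth, so a new group starts
        // whenever either value differs from the previous row.
        if (!object.Equals(Row.FamilyID, previous_family_id) || Row.DateOfBirth.Date != Category)
        {
            // New group: restart the numbering and remember the new key values
            // for the comparison with the next row.
            row_rank = 1;
            previous_family_id = Row.FamilyID;
            Category = Row.DateOfBirth.Date;
        }
        else
        {
            // Same family and same date of birth: take the next number.
            row_rank++;
        }

        Row.RowRank = row_rank; // written to the Row_Rank output column
    }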

}
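The numbering logic can also be tried outside SSIS with a small console program. This is only a sketch under stated assumptions: the RowNumberSketch class, the AssignRowNumbers method and the sample rows are hypothetical, FamilyID is assumed to be an integer, and the input is assumed to be already sorted by FamilyID and DateOfBirth, as the Sort transformation would deliver it.

```csharp
using System;
using System.Collections.Generic;

public static class RowNumberSketch
{
    // Each pair is (FamilyID, DateOfBirth). The list must already be sorted by
    // FamilyID and then DateOfBirth, because each row is only compared with
    // the row before it.
    public static int[] AssignRowNumbers(IList<KeyValuePair<int, DateTime>> sortedRows)
    {
        var result = new int[sortedRows.Count];
        int previousFamily = 0;
        DateTime previousDay = DateTime.MinValue;
        int rowRank = 0;

        for (int i = 0; i < sortedRows.Count; i++)
        {
            int family = sortedRows[i].Key;
            DateTime day = sortedRows[i].Value.Date;

            // A new group starts on the first row or when either key changes.
            bool newGroup = i == 0 || family != previousFamily || day != previousDay;
            rowRank = newGroup ? 1 : rowRank + 1;
            result[i] = rowRank;

            previousFamily = family;
            previousDay = day;
        }
        return result;
    }

    public static void Main()
    {
        // Made-up sample data: twins in family 1, then two single births.
        var rows = new List<KeyValuePair<int, DateTime>>
        {
            new KeyValuePair<int, DateTime>(1, new DateTime(2001, 5, 4, 8, 0, 0)),
            new KeyValuePair<int, DateTime>(1, new DateTime(2001, 5, 4, 8, 10, 0)),
            new KeyValuePair<int, DateTime>(1, new DateTime(2005, 1, 1)),
            new KeyValuePair<int, DateTime>(2, new DateTime(2005, 1, 1)),
        };

        Console.WriteLine(string.Join(",", AssignRowNumbers(rows))); // prints 1,2,1,1
    }
}
```

The last two sample rows share a date of birth but belong to different families, which is why the family key has to be compared as well as the date.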





