Dan English's BI Blog

Welcome to my random thoughts in regards to Business Intelligence, databases, and other technologies

Informatica PowerCenter vs. Integration Services (SSIS) Comparison

Posted by denglishbi on June 12, 2010

This past year I got an opportunity to finally take the great Informatica product for a test drive.  I guess it wasn’t really a test drive since this was the ETL tool that was being used for the project, but it was my first time using the product (other than a brief hands-on with it during a vendor selection process when I worked at the casino).  As far as I know, this product hasn’t really changed since I previously saw it, so I was a bit surprised, especially with the rave reviews and with it being a leader according to Gartner – Magic Quadrant for Data Integration Tools November 2009.

Magic Quadrant for Data Integration Tools

Magic Quadrant for Data Integration Tools

Source: Gartner (November 2009)

The version of the product that I used was PowerCenter 8.6 (with a hotfix that provides the Mapping Analyst for Excel feature) and I know there is a new version out now that was recently released, but that is not the version that we were using and almost certain that the client will not be upgrading to new version going forward (will most likely be moving to Integration Services).

So let’s get back to my experience with the product.  Just a warning, this by no means is an extensive critique or in-depth review of the entire product, just my usage of the product.  In the past I used DataStage and DTS, but most recently I have been using SSIS for the past four years.

What I liked:

  • Versioning – has to be enabled out of the gate, but once turned on you have check-in/check-out versioning capabilities with the product which is a must in any development environment. one thing to note that the file and context menu options were not always available when I wanted them, so the user experience was a bit unpleasant.
  • Metadata repository – a huge plus I will have to admit. being able to search for items and access them for a central tool is very powerful and a nice feature – if you need this.
  • Reusable components – for example, items like sources and targets were centralized and reusable throughout the environment, so if you imported a new table definition of an existing table this would be reflected in any mappings it was being referenced in.
  • Mapping Architect for Visio – built-in feature even with Standard Edition that allows you to generate a Visio diagram of mapping
  • Mapping Analyst for Excel – another built-in feature once hotfix applied with Standard Edition that allows you to create source to target Excel files for mappings and this is bidirectional for creating mappings as well.
  • Built-in mapping documentation – has to be configured on server along with security permissions, but Data Analyzer report gets created with tabular report output off all the components in a mapping, not extremely useful, but it is something – no visuals like with DataStage documenation generator, but you can get that with the Visio generator component.
  • Traceability – you have the ability to select a column (port) in the mapping and trace it back or forward within the mapping to see where it was generated from or where it ends up in the mapping.

What I didn’t like:

  • Too clunky – not a good user experience or development environment. menu options too random or not where expected, seemed really buggy at times with context menus not being where I would expect.
  • ODBC data source setup – had to create DSNs for reference and utilize third party providers from DataDirect for connectivity
  • Not intuitive UI at all – even out of the gate when you open the Designer tool if you want to edit a mapping you first need to make sure you have opened the Mapping Designer Tool (can’t just double click on the mapping). Even to rename items this was buried at times and confusing – sometimes you could do this if you edited the item through context menu, other times you had to access from the file menu.  Even the warning and error icons used were confusing and didn’t provide tooltips – message displayed in separate windows.
    • Here is a invalid mapping – informatica mapping invalid
    • Here is a warning message for a mapping – informatica mapping warning
    • Validation message in output window – validate window
  • Confusing layout – within a mapping you can create multiple flows, but then I wondered which source to target flow will occur first and in what order will they happen. Well needless to say this was buried in the file menu under ‘Mappings’ –> ‘Target Load Plan…’ to get at the mapping execution sequence
  • Too many steps – in order to create an ETL mapping you need to do at least the minimum steps using two separate components: create repository folder, create ODBC DSN (local workstation and eventually on the server for a data source), import a source definition (source analyzer designer), import a target definition (target designer), create a mapping (mapping designer – source, source qualifier (required to define source query and columns – called ports), transforms, router, update, destination) – now if you want to run the mapping you need to open up more components to create a session, then a workflow, and finally you can execute the workflow and monitor the progress.  Wow, just wanted to do a simple dump from source to target and I just ate up half a day.
  • Can’t add annotations – on the mapping design surface you don’t have the ability to simply add a textbox to provide any comments or documentation about the flow. this would all have to be buried in each item in the designer in the description area. I find this feature in SSIS extremely useful and helpful for development and documentation purposes.
  • Single repository folder support – can only be connected to a single repository folder at a time. Unable to work or view two different mappings located in separate repository folders.
  • No ability to watch the data flow during runtime – you couldn’t watch the records and progress of a mapping while it was being execute. There is an ability to monitor stats, but you can’t see what is going on during each stage of the mapping visually like in DataStage or SSIS.  In the Monitor stats window you can only see what is going on for the source and target, not each step of the mapping like for expressions, lookups, routers, etc.
  • Extensibility / Flexibility – just wasn’t there for me. We had to implement some script files in order to create a looping mechanism to re-run a single mapping in a batch mode for historical load purposes – seemed like a basic operation that I could quickly setup in SSIS, but in Informatica I need to place two script files on the server and create two workflows to setup a looping process.  With SSIS I can easily extend the functionality with custom components, script components (.NET code), and open source components (CodePlex.com or SQLIS.com).
  • Trust but Verify – as one of my colleagues always told me (I think he knows who he is) you always need to double-check everything. If you modify a session in a workflow it might not be reflect in the task editor. I had this happen to me a few times.  Make sure you modify in the task editor and it should reflect properly in the workflow.
  • Viewing source and target data – now that I think about it I always went into an external tool to view my data. I don’t believe this functionality is available in the tool and if it is, I simply overlooked it or it wasn’t pointed out to me. You could validate your source qualifier queries, but that was about it.

Hopping between the different products within PowerCenter seemed really clunky to me.  I guess with SSIS at the most I might have two open with Business Intelligence Development Studio and SQL Server Management Studio, but with PowerCenter you would have at least three if not four if you need to access the Repository.  I also had a really hard time with the flexibility of the tool.  For the price of it I wanted to do some fairly basic items like adding annotations, dynamically setting source file names, looping, passing of variable values between mappings, watch mapping during process, add some custom code, etc. and it just wasn’t there for me or I just wasn’t aware of it.  Granted I did not go through any formal training and learned the product basically on my own in two days, but I was really blown away how even the DataStage product that I used over four or five years ago was light years ahead of this product in my opinion and SSIS is a dream.

Now with SSIS you don’t get a metadata repository or documentation, but with external add-ins (which most are free and open source – surprising that I bring up open source when mentioning Microsoft, but it is free and made available by the community).  Even the Metadata Repository Toolkit was upgraded for SQL 2008 and there is just a ton of SSIS components available on CodePlex along with numerous blog postings, forums help online, video tutorials, virtual labs, and much more all free with tons of samples to get you started. 

The fact that you get the SQL Server database engine, Integrations Services, Reporting Services, and Analysis Services (along with Data Mining) in the SQL Server 2008 product for a fraction of the price of just Informatica PowerCenter just blows me away; I just don’t get the comparison or logic as to why people would spend that kind of money to implement the product.  I can easily overlook some of the advantages of the Informatica product for the flexibility, extensibility, and additional features and functionality that I gain with SSIS any day of the week.  Clearly I was confused as far as the Magic Quadrant goes and couldn’t understand why SSIS was not higher up and pushing behind Informatica and IBM.sql server 2008

Additional references to check out:

About these ads

22 Responses to “Informatica PowerCenter vs. Integration Services (SSIS) Comparison”

  1. Thavash said

    I’ve always wondered how SSIS compared to other ETL tools , especially since some of these tools are so expensive. Anyway thanks for a great article.

  2. MD said

    Really appreciate this, a good article on how to sell SSIS over Informatica

  3. Oscar said

    Very good article, I was using PWC in my last project and it’s true that it takes you ages to build a simple ETL process. Now I’m with SSIS and it’s really quick. One good thing about PWC is that you can see the INSERT-SELECT sql that it generates for the database which is very useful sometimes.

  4. CCW said

    Dan,
    Interesting reading. We have been assessing the Metadata Manager component of PC, with the target requirements of supporting enterprise metadata management functions. Many of your observations (e.g., “clunky” UI) I would agree with from my experience.

    Since you were using the PC tool for its intended core purpose (ETL), and I was focused on the MM component, I am concerned that my suspicions are too well-founded. I just read the HotFix 13 Release Notes, and it seems to me like there are too many significant fixes occurring to be this far along the product’s life.

    So, considering the issues of architecture, design, and the attendant quality challenges, I don’t quite follow the Gartner assessment, either. But the INFA stock price has tripled over the last two years. ;)

  5. xBeanie said

    The reason SSIS is behind on the chart is because it has serious architectural flaws as an ETL tool. Despite the pipeline architecture marketed by MS, in practice the data flows are very limited (dont even have an update destination for crying out loud). Anything more than take data from A and dump it exactly as is into B ends up having to be done as code usually in the control flows as run this code, then run this code, then run this code. The tool has been obviously developed against 2 objectives; 1) raw performance (at the expense of doing anything actually useful to the data), and 2) lock in to SQL Server. Not quite as bad as Oracle’s OWB for lock-in but still MS openly pushes devs to use Stored Procedures (which defeats the purpose of using an ETL tool in the first place).

    Dont get me wrong, I am not anti-SSIS – it has its place and I definitely agree that tools like Informatica and Datastage are way overpriced. I am just telling it how it is, as someone with 15+ years experience in ETL with tools new and old including SSIS. If MS would only put in a bit more effort into the Data Flows it could really improve its position in the grids.

    • denglishbi said

      Everyone is going to have their own opinions and experiences and I appreciate your comments. SSIS does have an update/insert destination with the OLE DB Command. And just like in Informatica using the Update Strategy you would need to direct your rows appropriately in the Data Flow Task. Granted this is a row-by-row process and might not be optimal for large data sets. There are alternative options to implement a set based update process as well. The Enterprise version does include some Data Quality components and more are on the way as well. The product is moving the in right direction http://blogs.msdn.com/b/mattm/archive/2010/11/11/sql-server-denali-what-s-coming-next-in-ssis.aspx.

    • JT said

      Hi XBeanie,
      “dont even have an update destination for crying out loud”
      Strictly speaking I guess that’s true – there is no component called “Update Destination”. However, there *is* the OLE DB Command component which is capable of doing exactly the same thing (as far as I am aware). Is your issue simply the fact that it doesn’t have the word “destination” in the title?

      “MS openly pushes devs to use Stored Procedures ”
      Do they? Where? Could you provide a link/some links? I would be disappointed if there were formal documentation from Microsoft stating sprocs should always be used rather than SSIS dataflows.

      “lock in to SQL Server”
      I’d be interested to know why you think SSIS is locked in to SQL Server. To my knowledge the only features in SSIS that *require* SQL Server are the Fuzzy Lookup & Fuzzy Grouping components and those are not oft-used components.

      Looking forward to your reply.

      Regards
      Jamie

      • bob ama said

        “lock in to SQL Server”
        Absolutely not. From what I see, the MS BI offering is the only one you can have piecemeal, all the other ones appear to want ‘all or nothing’ buy in

        It’s free and gets better and better with each free version

        What’s not to like?

    • debsisabed said

      Hi xBeanie,
      How would you compare the development times between SSIS and DataStage for similar functionalities? Also, do you have an idea on the limiting volumes for SSIS please?

      Thanks
      Deb

    • Pierre said

      I would to add my comments to this as well. ETL stands for Extract, Transform and Load. In order words a tool to get data into a DW. In a well defined DW environment you shouldn’t need to do massive updates. I am guessing that the approach the MS designers took. Updates are for your SCDs and I can’t imagine, a very good reason right now, where one whould want to update more than one row of a dimension for the same change.

  6. Sanjay said

    Dan, Thank you for the posting.

    A very good article to have the comparision. I liked it for the points on Informatica as i worked on the product for quite sometime. But would request if you could detail – how easy (the steps in involved as detailed for informatica) is it in SSIS on comarision elements for better understading (for people who are well-versed with informatica and not with SSIS would help).

    Also the article has only the developer view and not on all aspects of product like
    – Administration
    – Architecture
    – Performace with Data Volume and robostness. Etc

    Analysis or information on this would help.

  7. JT said

    Hi Dan,
    Great article,

    “Wow, just wanted to do a simple dump from source to target and I just ate up half a day.”
    I find it amusing that you consider this to be advantage that SSIS has over INFA given one of the rods that SSIS is often beaten with by old DTS developers is the difficulty of doing a simple import or export. :)

    JT

    • denglishbi said

      Hi Jamie,

      Good to hear from you.

      I thought it was rather amusing myself and that the UI for the version of the Informatica product I was using (8.6). Very clunky and not at all what I would have expected. DataStage 4+ years ago seemed more user friendly to me. SSIS is leaps and bounds over it as far as ‘keeping it simple’ for the user, plus much more flexible and extensible.

      You can always go with the import/export functionality in SSMS to create the basic shell for a starting point if people are uncomfortable with starting from scratch.

  8. Lewis Wheeler said

    Excellent review – especially as getting a hold of PowerCenter in order to do a ‘User Review’ rather then functionality review is difficult. Thank you for the added link at the end – is definitly helping me in my system analysis for final year development piece at university.

    I would love to hear your thoughts on any Open Source systems such as Talend or CloverETL; especially from a user experience perspective.

    Lewis

  9. Great review!

    As an ETL Developer with formal (PRICEY) training on Informatica, I find the product to resemble much of the comments mentioned in the section of “what you disliked”. I believe in keeping cost low and managing time efficiently in order to deliver quality reports. While Informatica certainly has it’s pros, the cons may be bigger to new small business companies looking to capitalize off of business intelligence (http://usgovinfo.about.com/od/smallbusiness/a/sbadrives.htm). Not every resource will be able to ramp up on the Informatica as quickly as they would like to. While the demand to understand the historical, current and future trends of business operations increase in a fickle economy, simplicity along with extensible and affordability of SSIS seems feasible for more and more SBA as well as large corporate establishments.

    Thanks again!
    Maurice
    SSIS/BI Developer
    sqlbased.blogspot.com

  10. The capabilities or potential of these two tools is very deceptive. When you look at SSIS and Informatica just visually it may seem they are Apples and Oranges. However these two are not on the same street. If informatica were an Apple, SSIS is a Lemon. For truly rigourous and expensive Data Warehouse and Data Mart initiatives with a multitude of data from across the enterprise, you will have to be on a death-wish to pick SSIS. Just because it has the looks of an ETL tool, it does not mean its one. Yes its great for copying data from one SQL Server database to another. But is horribly flawed for doing a multitude of data cleansing, data transformation and flexible target operations based on application/data logic in the flow in any cogent manner. For Informatica this is its day job. Further at the core in Informatica the Logical layer of the tool is separate from the Physical Run time layer of the tool. There is no such separation in SSIS. Further SSIS has a totally obtuse integration of (or lack of) data types from ADO.NET, OLE DB and the mix of adapters for databases including unfortunate inclination to type everything as UNICODE, even when there is no sign of UNICODE in source or target. Informatica has its own data types in effect during the transformation process that will put data into any target database without having to jump conversion hoops. Then there are reusability, mapplets and a lot more small things that make a big difference. Last but not the least and this is why you pay money for proper ETL tool (not the seeming GUI for design), Informatica has its own engine that can run in Parallel, can Partition, load balance, run on a grid, be published easily as a web service and do a lot of things that are not within the universe of an SSIS which runs as a Windows Service and has no transformation engine to speak of.

    • John said

      As someone who has used each for 3+ years, I Completely agree. Debugging watch options and error handling are also much more comprehensive in Informatica than SSIS.

  11. S said

    Hi Dan,
    Do we have version control in SSIS? I need to do evaluation of ETL tools – Informatica, Datastage and SSIS against some parameters.
    Thanks
    S

    • denglishbi said

      You would need to use a source control system and they do have integration with SSIS (BIDS). You could use TFS, Subversion, StarTeam, etc. With SSIS 2012 they do have some versioning cabilities built into the server deployments in the SSISDB now as well.

  12. Mike said

    Thanks for your review. I was using SSIS before and had to do my last project with Informatica. I expected a lot, because of the Magic Quadrant and the high price.
    But in fact I am totally disappointed and agree with your review.
    Maybe I didn’t find it, but I couldn’t loop through a file directory to import a set of flat files, because I have to specify for each file a source. That’s a basic task in SSIS. Or just executing a simple SQL command needs a dummy mapping, dummy session, etc. I don’t want to create always dummy things and workarounds, it can’t be so difficult to develop a simple workflow task that executes a simpe SQL command considering the price of the product.
    The dynamic lookup is a nice idea, but in practise it didn’t work as I expected, because the cache gets refreshed before the database, so it was useless in my SCD2 scenario where I had to process multiple rows.

  13. Aired said

    I just thought that i’d drop you a line and tell you that I think you’re doing a fantastic job.
    Thanks! if you want, you can check out my blog at

    http://www.aired.in/2010/12/informatica-interview-question-answers.html

  14. panlondon said

    I am a newbie with Informatica. I basically found it very difficult to get started. I remember when I first started on SSIS I was able to develop very complex stuff within the first week. In informatica everything seems not very intuitive and not logical.The advantages the article mentions are correct but I can’t discount the fact that informatica is a very difficult dare I say unnatural tool to use! Between the 2 I will choose SSIS every time and can’t really recommend informatica, unless there is some special need that ssis can’t provide. SSIS just makes sense. The advantages ofinformatica don’t make up for the disadvantages.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 93 other followers

%d bloggers like this: