Call +61 2 9965 3781
Home Home Company Company Solutions Solutions Services Services Resources Resources Customers Customers Contact Contact

Building a Data Warehouse? Jump Off a Cliff!

Want to Seriously Throw Good Money After Bad? Start a BI Project! by Ian Nicholson On a group thread on LinkedIn recently I witnessed several consultants arguing over which ETL tool is best for building data warehouses. They all discussed the same, tired old approach they’ve been doing for years, completely oblivious to a few key stats:- 1. According to Gartner, 76% of the budget of any BI project is sunk into the data warehouse. 2. Also, according to Gartner, 70-80% of BI projects fail. 3. Of those that succeed, many are over time or over budget. And yet these same consultants are continuing to use those same ‘best’ ETL tools and approaches to ‘deliver’ BI projects, just like lemmings jumping off a cliff. So what’s going wrong? Let’s look at what’s currently happening... You decide to start your BI project. It is deemed that a Business Analyst (1st consultant) talks to the business to ‘get an understanding’ of their requirements. The BA give her results to a Solution Architect (2nd consultant) who creates the design and gets a Data Modeller (3rd Consultant, but possibly the same guy) to design a data model. Now the likelihood is, he will design a set of star-schemas - because that’s what you’ll need for the reports, right? Also, they can be delivered straight away for a quick win with the business. So the Data Modeller gives his design to a DBA (4th Consultant) to create the schema in the data warehouse server, and to the ETL developer (5th Consultant) to write the ETL code to populate the schema in the data warehouse. Once the schema is present, a report is written by the BI Developer (6th Consultant). Of course, all of this is overseen by the Project Manager (7th Consultant). Now, fast forward a few months to the first deliverable being revealed to the business. More often than not, the solution isn’t what the user expected, wanted or signed-for. Changes are required. Why? Usually, a business user doesn’t understand what is possible or what is required in a data warehouse. Only when they see a result do the pennies drop, and they realise they really needed something else. Either that, or during the time it took to deliver the report, the world moved on and focus changed. Whatever the reason, you must go back through that convoluted process above to make the changes. Now, did I mention that when the ETL Developer is writing his code, he has a problem. If he needs to get data from lots of data sources, he must do some further ‘manipulation’ in order to shoehorn it into a star-schema (a star-schema, by the way, is a table design made up of facts (things we want to measure: Invoice amount, quantity etc) surrounded by dimensions (how we want to sort and slice that data: Customer, Region, Product, Date, Sales Rep). Put simply, you cannot take data from multiple sources and just put it into a star-schema. A lot things must happen first. Bill Inmon, the Father of Data Warehousing, has written numerous books on this science, but in a nutshell he says you should make the data generic and put it into its Third Normal Form. Few people know how to do this properly, and this is where, I believe, that 76% of money and 80% failure rate begins. Your ETL Developer writes a lot of convoluted code on he understands which becomes a nightmare when changes are required. So how do we fix this? Well, I should be clear that Software Development Life Cycle (SDLC) as a development approach does NOT work for BI projects, since they tend to be fluid in nature and more of a journey than a destination. Agile development methodologies yield far more success in BI projects. Current ETL development tools are too limited and cumbersome in their approach. There is little if no automation available from these products. They are development tools, originally designed just to move data from one place to another, not build data warehouses. ETL tools do not build data warehouses. People using ETL tools build data warehouses, and ETL tools simply cannot deliver in an Agile fashion. Nowadays there is a new breed of Data Warehouse Automation tools available that do the job much faster - giving deliverables in hours or days, not weeks or months. In doing so, they slash development times and costs to a fraction of what they are currently. Data Warehouse Automation work by automatically designing the required schema and generating the code to populate it. It does this by analysing the model of the data sources, and applying the teachings of Bill Inmon to the design process. Using Data Warehouse Automation tools save time, money, deliver value sooner and virtually eliminate risk of project failure. Call or email us for further information on how we can assist you:
Copyright © 2013 Data Transformed Consulting Services Pty Ltd ACN:     162 629 183 ABN: 96 162 629 183
“Plans fail for the lack of counsel,  but with many advisers they succeed.” (Proverbs 15:22)
Data Transformed  YOUR DATA. CLEARLY.
“Insanity: doing the same thing over and over again and expecting different results.” Albert Einstein