FREE DM Review Site Registration!
Sign-up today and access DM Review on the Web!

Your FREE registration entitles you to:

FREE email newsletters

FREE access to all DM Review content

FREE access to web seminars, resource portals, our white paper library and more!

   
BI Review content and features are now in DMReview.com!

One brand, one Web site! DM Review is now the home of all the content you're used to at BIReview.com and much more. If you are registered at BIReview.com, you're already registered at DM Review. If not, take just a moment to sign up for all the free services we have for you at the new DMReview.com.

Can data warehouses be designed to cover all possible queries on dimensions that we throw at it without providing a list of reports to an IT department?

Question: We're in the process of updating a data warehouse to include new information. IT is quite keen to understand what reports we want to produce from the data. I was always told by past "experts" that one shouldn't worry too much about the reports since, in reality, queries from users cannot be predicted. The data we are adding is relatively simple: sales and budget numbers, a product hierarchy, a salesperson reporting hierarchy and a time dimension.

 

I thought that the data warehouse could be designed to cover all possible queries on these dimensions that we could throw at it without providing a list of reports to IT. Was I right?

 

Sid’s Answer: First of all, some reports can be predicted. The reports users get today that will be satisfied by the data warehouse will still need to be run, perhaps somewhat differently and delivered differently, but they will still be needed and can be predicted. Some types of queries are predictable, but you will never be able to anticipate all the queries. Ideally, the results of one query will generate an idea, which will be the basis for the next unpredictable query.

 

There will always be some requirements for queries for which no data exists – so much for a design that covers all queries. There will be some queries that will run, but your initial design will result in terrible performance and so you will either live with the terrible performance or you’ll make some changes to your design.

 

So it would be useful for the folks in IT to have some idea of the reports the users are planning to run so they can design the data warehouse for what is known. They will have to be on their toes to monitor user activity and to be able to make the appropriate changes to deal with queries and reports that are very different and unanticipated.

 

Joe Oates’ Answer: Actually, a list of reports can be quite useful, but you are right in believing that a list of reports alone cannot tell you everything that needs to be in a data warehouse. The main thing that you need to do is to talk to a representative sample of users and management to understand what they need from the data warehouse.

 

Undoubtedly, each of these users is getting at least one report. What I have found to be effective is to set up interviews with each of these users, including management, and use these reports as a basis for making the interviewee feel at ease. I ask each person to tell me what reports they use to make key decisions and, specifically, what items on the reports are used to make a decision. Then, I ask them something like “What information do you need that you are not getting from your reports, like sales by region or what are my top stores and salesmen?” After they tell you, you should keep asking the question “Can you think of anything else?”

 

If you can talk to 10 or more people and get their responses, you should be able to make up a pretty comprehensive list of the kinds of queries that are needed. If you want more detailed information on interviewing for data warehouse requirements, The Data Warehouse Lifecycle Toolkit (First and Second Editions) by Ralph Kimball, et al. has a very good section on interviewing.

 

Chuck Kelley’s Answer: I would use the business communities “reports” as a basis of what data I get from the source. I also like to use the “touch it, take it” method of building my data warehouse. If I touch some data, I take it all. That way we can get more data than is required by the reports. And, while I am in the source system(s), I look around for interesting data that also might be available.

 

The problem is that sometimes the owners of the source system will not allow us access except to what we need. Then we have to show what we need and use the “touch it, take it” to get other data as well.

In short, you were correct, but sometimes practice doesn’t allow us to always do the correct thing.


Sid Adelman is a principal in Sid Adelman & Associates, an organization specializing in planning and implementing data warehouses, in data warehouse and BI assessments, and in establishing effective data architectures and strategies. He is a regular speaker at DW conferences. Adelman chairs the "Ask the Experts" column on www.dmreview.com. He is a frequent contributor to journals that focus on data warehousing. He co-authored Data Warehouse Project Management and is the principal author on Impossible Data Warehouse Situations with Solutions from the Experts and Data Strategy. He can be reached at (818) 783-9634 or sid@sidadelman.com. Visit his Web site at www.sidadelman.com.

Joe Oates is an internationally known speaker, author and consultant on data warehousing. Oates has more than 30 years of experience in the successful management and technical development of business, real-time and data warehouse applications for industry and government clients. He has designed or helped design and implement more than 30 successful data warehouse projects.

Chuck Kelley is an internationally known expert in database and data warehousing technology. He has 30 years of experience in designing and implementing operational/production systems and data warehouses. Kelley has worked in some facet of the design and implementation phase of more than 50 data warehouses and data marts. He also teaches seminars, co-authored four books on data warehousing and has been published in many trade magazines on database technology, data warehousing and enterprise data strategies. He can be contacted at chuckkelley@usa.net.

For more information on related topics, visit the following channels:



Industry Vendors