The recent theme in books and articles on the subject of business intelligence is that in order to get value from these initiatives, you have to do something with the information: make better decisions, take action, measure your results...lather, rinse, repeat. I'm up on that crowded soapbox and I'll tell you why.
In order for business intelligence to be more than just a smarter way of doing what you're already doing, you have to be able to reach across departmental boundaries (which are imaginary, by the way, as in this humorous Coke commercial where one department tries to sue another) and put the data together in new ways that will give you access to the new insights that can make you more competitive. Everything you're learning about your own department? Your competitors already know all that stuff. You're not getting a leg up that way. But if you can ignore those artificial boundaries and think about things from a customer perspective--like an end-to-end process that cuts across multiple departments, for example--you can find ways to move your business to a new level of competition.
What's difficult about doing this is not necessarily a data issue. Typically each department/division has it's own goals that it needs to meet, it's own incentive programs and each department will focus on their goals which, by the very nature of organizations, will be different than those of another department. There is typically little or no incentive for the managers to allocate resources to activities that will not have a direct impact on their departmental goals.
In previous posts I have stated that data warehousing and business intelligence functions should be centralized with some resources assigned to various internal clients. There's an important reason why the organization needs to be set up this way. I can hear the protests right now: "But, but, but! Our needs are unique!" "Our IT doesn't get it!" "We need to move faster than that!" You know how I know you're saying this? Because that's exactly where I was a few years ago when I started working on our data warehousing project. But once we started working on projects that leveraged the data we had gathered, the light bulb came on as we began to realize the true value of what we had created. Creating a more efficient and elegant way to manage our division's processes with information was our primary aim in the beginning. As we began to accumulate data from various areas, our division executive started coming up with ways to combine data from different systems to get new information. For example, she requested analysis comparing customer satisfaction data with employee satisfaction data to see if there was any correlation. This may not sound groundbreaking, but historically these were two different processes with totally separate data stores and combining the data was something that had never been considered. It doesn't take a rocket scientist to figure out that the two were directly correlated and this information was used to make important management decisions.
Another great idea was connecting customer satisfaction data to health data. What we wanted to know was whether we could predict that someone was going to be unhappy simply based on the types of claims they were filing. The answer was amazing and not at all what we thought. Customers who were dissatisfied tended to fall under the category of either having filed very little or a whole, whole lot. Most of our satisfied customers were those who had filed enough claims to have some experience with our customer service department, but not enough to be buried in the avalanche of paperwork that often comes with a catastrophic illness or injury. What this told us was that we needed to approach two very different populations with different processes in order to bring about significant improvement in satisfaction.
The common thread in both of these scenarios was that these were data sets that had never been connected before and in both cases it took the support of an upper-level executive to make the analysis and subsequent actions take place. So if you're lucky enough to have someone in your upper ranks who "gets it" and asks provocative questions and then does something about the answers, you are on the road from simply faster and more accurate analysis to "competing" on analytics.
Friday, April 18, 2008
Wednesday, April 9, 2008
Web Analytics a Whole New Ball Game
Direct marketing has long been leading the charge in information-based decision-making and this was never more true than it is today as companies large and small find innovative ways to leverage the Internet. In the wake of the news of Yahoo! acquiring IndexTools web analytics in an effort to compete with Google Analytics, I thought I would share another web marketing story I read yesterday. DMReview's article "The New Online Reality at Fox Sports" on their BI channel gives an interesting view into how this multimedia giant is using analytics to leverage its web content on its sites that function as an income stream.
The article discusses Fox's "secret formula" that it uses to measure and tweak its NFL, NBA and MLB products. These metrics are considered critical to running a successful web marketing engine, but the article stresses the value the editors bring to the table to make content decisions initially, the quality of the reporting, and the deep understanding of their markets necessary to know what will be interesting to their viewers. In a highly competitive space like web sports media, analytics support, rather than replace, high-quality decision-making that keeps Fox Sports in the game.
Measuring time spent, where readers come from, where they go, revenue generated and a host of other web metrics is a daunting task, both in complexity and volume of data. In a sidebar, the article points out that standard BI software is not optimized to manage the massive data points that are tracked across many web pages and that this type of data is not typically fed into traditional data warehouses. Special web analytics packages fit this growing niche. Interestingly, the difficulty with using this data is not cited as a lack of data or technology to access it, but a lack of analysts who understand the data. It seems that web analytics is still uncharted territory to a great extent, awaiting a few cowboys to ride in and rustle the data. (Did I go too far with that analogy?) Similarly, companies are looking for marketing VP's and managers specifically with electronic media expertise.
If you have a website, you need to measure what is going on there, bottom line. Even tiny companies are figuring out how to leverage a web presence to gain market share, generate leads, and create new income streams, and to manage these efforts effectively, you need goals, metrics (leading indicators and success measurements), you need to know where your traffic is coming from to effectively focus your marketing efforts...all the same things you need to know for a traditional marketing program. But as this article points out, different technology is needed for this rapidly-changing channel. These days, there's no excuse for a company of any size not to be on the web and even less excuse for companies not to measure their efforts there, since it does cost to be on the web, but even small companies can have some level of analytics through something like Google Analytics which is, shockingly, free. It seems even analytics are being democratized.
For more information about the evolving field of web analytics, check out The Web Analytics Association.
The article discusses Fox's "secret formula" that it uses to measure and tweak its NFL, NBA and MLB products. These metrics are considered critical to running a successful web marketing engine, but the article stresses the value the editors bring to the table to make content decisions initially, the quality of the reporting, and the deep understanding of their markets necessary to know what will be interesting to their viewers. In a highly competitive space like web sports media, analytics support, rather than replace, high-quality decision-making that keeps Fox Sports in the game.
Measuring time spent, where readers come from, where they go, revenue generated and a host of other web metrics is a daunting task, both in complexity and volume of data. In a sidebar, the article points out that standard BI software is not optimized to manage the massive data points that are tracked across many web pages and that this type of data is not typically fed into traditional data warehouses. Special web analytics packages fit this growing niche. Interestingly, the difficulty with using this data is not cited as a lack of data or technology to access it, but a lack of analysts who understand the data. It seems that web analytics is still uncharted territory to a great extent, awaiting a few cowboys to ride in and rustle the data. (Did I go too far with that analogy?) Similarly, companies are looking for marketing VP's and managers specifically with electronic media expertise.
If you have a website, you need to measure what is going on there, bottom line. Even tiny companies are figuring out how to leverage a web presence to gain market share, generate leads, and create new income streams, and to manage these efforts effectively, you need goals, metrics (leading indicators and success measurements), you need to know where your traffic is coming from to effectively focus your marketing efforts...all the same things you need to know for a traditional marketing program. But as this article points out, different technology is needed for this rapidly-changing channel. These days, there's no excuse for a company of any size not to be on the web and even less excuse for companies not to measure their efforts there, since it does cost to be on the web, but even small companies can have some level of analytics through something like Google Analytics which is, shockingly, free. It seems even analytics are being democratized.
For more information about the evolving field of web analytics, check out The Web Analytics Association.
Monday, April 7, 2008
Data Warehouse and BI Support Organization: Functional or by Client Area?
If your organization is like the one I grew up in, no doubt there is some undulation in organization structure. Maybe things started off in the old "strategic business unit" structure where every business unit had just about every function except finance and HR. Then over time, as your company became larger, went public, felt a budget pinch, or whatever, they decided to go to a more functional organization to eliminate redundancies in areas such as marketing and possibly even some IT functions.
I promised to stay business-focused with this blog and I will. But as business people we are often engaged in the debate over where the data warehouse effort will live. Perhaps historically in your company, anything that smelled like IT is functionally organized in IT and prioritized through a corporate prioritization process. My bet is that these resources are scheduled for the next 3 years and not able to respond to your business needs with the speed you would like. Many times what results from this situation is data efforts, large and small, sprouting up throughout the company, from little Access databases to full-scale data warehouses owned and supported by business areas. Necessity is the mother of invention, but eventually these efforts will capture the attention of IT management and the debate (war) will be on.
I will place a stake in the ground and say the best scenario is something of a compromise. It is best for a data warehousing group to be centralized within IT, but to have a "customer-centric" organization. There is much to be supported: hardware, nightly ETL loads, metadata, end users, security, training, etc. It is impossible to ensure "one version of the truth" with data marts distributed throughout the organization unless they are connected to an overall architecture and it is very difficult to support these functions within a business department or even a business division. While business people may have the vision for how the data will be leveraged, their IT counterparts bring the special knowledge of what is involved in supporting such a major initiative and should be at the table to offer this value.
That said, I think that somewhere between a fully-centralized, functional approach and a siloed, isolated data mart is an enterprise-wide effort that is organized to quickly address immediate needs of business clients. In order for a data warehouse to be supported by the business, it must remain more than relevant, its relevance must be on the order of oxygen for the business users. And in order to remain that relevant, the data warehouse must be able to address the business needs swiftly and nimbly.
Exactly what that organization will look like will vary from company to company, but generally speaking, the subject matter experts--managers and analysts in each business area--need some control over priorities for the business intelligence resources. They also need to be very involved as a group in the overall priorities for how the data will be developed at the warehouse level, so they understand how their priorities will fit into the overall plan.
Like I said, how deeply into the process this client-focused organization would go varies by company, but in most cases even down to the ETL development and support level there will be some datasets that are particular to a business client and should be organized to support that particular client, balancing swift attention to business priorities with ensuring that the data does not become siloed. The result is a sort of cross-functional, client-focused team that lives in the IT area, is managed by IT managers, but acts like part of the business customer's team.
Additionally, as much of the report development that can be handed off to business analysts should be. These are usually cheaper resources than BI developers found in IT and can be directly driven by their manager's priorities. The important thing is to ensure that reports are properly documented.
The advantages for the manager and business analyst are clear. The advantage of an organization like this for IT management is that they get to balance the immediate needs of their business customers (whose support they need) with building the enterprise data warehouse, which is a constantly moving target with new systems coming in, old systems being retired, and new overall strategic initiatives coming into play every day. Finding a balance that takes all these considerations into account requires political savvy and a deep organizational knowledge of what will work organically within your company.
I promised to stay business-focused with this blog and I will. But as business people we are often engaged in the debate over where the data warehouse effort will live. Perhaps historically in your company, anything that smelled like IT is functionally organized in IT and prioritized through a corporate prioritization process. My bet is that these resources are scheduled for the next 3 years and not able to respond to your business needs with the speed you would like. Many times what results from this situation is data efforts, large and small, sprouting up throughout the company, from little Access databases to full-scale data warehouses owned and supported by business areas. Necessity is the mother of invention, but eventually these efforts will capture the attention of IT management and the debate (war) will be on.
I will place a stake in the ground and say the best scenario is something of a compromise. It is best for a data warehousing group to be centralized within IT, but to have a "customer-centric" organization. There is much to be supported: hardware, nightly ETL loads, metadata, end users, security, training, etc. It is impossible to ensure "one version of the truth" with data marts distributed throughout the organization unless they are connected to an overall architecture and it is very difficult to support these functions within a business department or even a business division. While business people may have the vision for how the data will be leveraged, their IT counterparts bring the special knowledge of what is involved in supporting such a major initiative and should be at the table to offer this value.
That said, I think that somewhere between a fully-centralized, functional approach and a siloed, isolated data mart is an enterprise-wide effort that is organized to quickly address immediate needs of business clients. In order for a data warehouse to be supported by the business, it must remain more than relevant, its relevance must be on the order of oxygen for the business users. And in order to remain that relevant, the data warehouse must be able to address the business needs swiftly and nimbly.
Exactly what that organization will look like will vary from company to company, but generally speaking, the subject matter experts--managers and analysts in each business area--need some control over priorities for the business intelligence resources. They also need to be very involved as a group in the overall priorities for how the data will be developed at the warehouse level, so they understand how their priorities will fit into the overall plan.
Like I said, how deeply into the process this client-focused organization would go varies by company, but in most cases even down to the ETL development and support level there will be some datasets that are particular to a business client and should be organized to support that particular client, balancing swift attention to business priorities with ensuring that the data does not become siloed. The result is a sort of cross-functional, client-focused team that lives in the IT area, is managed by IT managers, but acts like part of the business customer's team.
Additionally, as much of the report development that can be handed off to business analysts should be. These are usually cheaper resources than BI developers found in IT and can be directly driven by their manager's priorities. The important thing is to ensure that reports are properly documented.
The advantages for the manager and business analyst are clear. The advantage of an organization like this for IT management is that they get to balance the immediate needs of their business customers (whose support they need) with building the enterprise data warehouse, which is a constantly moving target with new systems coming in, old systems being retired, and new overall strategic initiatives coming into play every day. Finding a balance that takes all these considerations into account requires political savvy and a deep organizational knowledge of what will work organically within your company.
Saturday, April 5, 2008
BI Tools v. Excel
When I think of BI tools, I'm thinking of something specifically built to sit on top of a data warehouse and pull the data in a structured way, like Business Objects(R) or Cognos(R). There are among us "BI Snobs" who think that Microsoft(R) Excel(R) is bad and who would like to stamp out spreadsheet use. These are probably the same people who think Microsoft(R) Access(R) databases need to be stamped out in the corporate environment, but I would tend to agree with them on that point, which is a post for another day.
When I worked on a data warehouse team, part of my job was selling the value of the data warehouse and BI tool we were charged with implementing. But I admit that even I would get to the end of the flexibility with my BI software and resort to converting my report to Excel to complete my analysis. I was a closet Excel lover, because like many other business people, I was raised on Excel and tended to think about analysis in this very conventional, two-dimensional fashion. I'm sure I'm not the first person to write about the pros and cons of Excel v. BI tools, but please indulge me for a minute, since that's what I'm thinking about this morning.
See, I don't believe that there are two types of people in this world: BI (Business Object, Cognos, or whatever) people and spreadsheet people. I believe it is possible to use both technologies as long as you are smart and aware. The big issue with spreadsheets is that they are very prone to error, accidental or otherwise. Spreadsheets can be used to "spin" the truth and make the results seem different than they are, to one's benefit. When using your BI tool, however, you are pulling from your warehouse calculations that have been (presumably) previously vetted by a team of business and technical resources and have been agreed upon by everyone as the "one version of the truth." But BI tools will never be as flexible as spreadsheets and spreadsheets will never be as powerful as BI tools at handling massive amounts of data and drilling down.
The other problem with spreadsheet use is that once you convert your data out of your BI tool, you lose the ability to drill down to the detail, which eliminates much of the value of your data warehouse. The best thing I can suggest for this is maintaining documentation of the BI report from which your spreadsheet originated, so you can track back to the underlying data when necessary.
Also, as I mentioned before, Excel encourages thinking about analysis in a conventional, two-dimensional approach, where a dimensionally-modeled data warehouse has your data organized in a much more efficient, multi-dimensional way. So you may be shortchanging yourself and adding steps by forcing the data into a two-dimensional format. If that doesn't make sense, just ask me and I'll try to explain it more clearly.
I may be stating the obvious, but I don't believe the BI tools will ever be successful at stamping out spreadsheets, nor should they be. As for the accidental errors that are easy to commit in spreadsheets, the best thing to do is create a lot of checking calculations (usually calculations that should equal zero) and be thorough at going back over your formulas and making sure they are accurate. Run BI reports to compare to your spreadsheets to make sure they pass the "sniff" test. And once you feel good about the integrity of your spreadsheet, lock out the cells that should not be manually entered. I also visually indicate which cells have manual entries, in order to make very short work of figuring out where errors may be, once I have built my model. Use both technologies, but be smart and thorough about how you use them and there will still be a place for spreadsheets in the world of BI.
When I worked on a data warehouse team, part of my job was selling the value of the data warehouse and BI tool we were charged with implementing. But I admit that even I would get to the end of the flexibility with my BI software and resort to converting my report to Excel to complete my analysis. I was a closet Excel lover, because like many other business people, I was raised on Excel and tended to think about analysis in this very conventional, two-dimensional fashion. I'm sure I'm not the first person to write about the pros and cons of Excel v. BI tools, but please indulge me for a minute, since that's what I'm thinking about this morning.
See, I don't believe that there are two types of people in this world: BI (Business Object, Cognos, or whatever) people and spreadsheet people. I believe it is possible to use both technologies as long as you are smart and aware. The big issue with spreadsheets is that they are very prone to error, accidental or otherwise. Spreadsheets can be used to "spin" the truth and make the results seem different than they are, to one's benefit. When using your BI tool, however, you are pulling from your warehouse calculations that have been (presumably) previously vetted by a team of business and technical resources and have been agreed upon by everyone as the "one version of the truth." But BI tools will never be as flexible as spreadsheets and spreadsheets will never be as powerful as BI tools at handling massive amounts of data and drilling down.
The other problem with spreadsheet use is that once you convert your data out of your BI tool, you lose the ability to drill down to the detail, which eliminates much of the value of your data warehouse. The best thing I can suggest for this is maintaining documentation of the BI report from which your spreadsheet originated, so you can track back to the underlying data when necessary.
Also, as I mentioned before, Excel encourages thinking about analysis in a conventional, two-dimensional approach, where a dimensionally-modeled data warehouse has your data organized in a much more efficient, multi-dimensional way. So you may be shortchanging yourself and adding steps by forcing the data into a two-dimensional format. If that doesn't make sense, just ask me and I'll try to explain it more clearly.
I may be stating the obvious, but I don't believe the BI tools will ever be successful at stamping out spreadsheets, nor should they be. As for the accidental errors that are easy to commit in spreadsheets, the best thing to do is create a lot of checking calculations (usually calculations that should equal zero) and be thorough at going back over your formulas and making sure they are accurate. Run BI reports to compare to your spreadsheets to make sure they pass the "sniff" test. And once you feel good about the integrity of your spreadsheet, lock out the cells that should not be manually entered. I also visually indicate which cells have manual entries, in order to make very short work of figuring out where errors may be, once I have built my model. Use both technologies, but be smart and thorough about how you use them and there will still be a place for spreadsheets in the world of BI.
Friday, April 4, 2008
Who Needs an Executive Dashboard?
First, allow me to say that I think the Executive Dashboard is misnamed. It should be named the Everyone Dashboard, or at least the Analyst and Manager Dashboard. Because those are the people who are sent digging in the corporate archives for the history of the company's performance.
One of the biggest values of a dashboard isn't the end-user interface (what you and I see) itself, with all its snazzy graphics and the latest bells-and-whistles marketed by your particular BI vendor, but the underlying store of information that supports it. Most companies do not have anywhere they keep summary level data for posterity's sake, unless it's financial data. How many sales did we have last month v. the same month last year? How many customers dropped off? The best possible scenario is to have this data available, along with key supporting information, such as the ability to drill down to a greater degree of detail, historical information that could impact results (for example: there were 5 Mondays in this month and Mondays are our biggest sales day) and metadata that explains where the data came from and what calculations were applied to it.
For the forecaster, this information is invaluable. Forecasting is a formidable responsibility. You get it wrong and people lose their jobs, either because the company overstaffed based on bad information or because the forecaster put up a goal that the company couldn't achieve and Wall Street is very unforgiving of this kind of mistake. But a forecaster is only as good as her information and one of the key pieces of information is historical data. This historical data must be given some context with which to interpret it in order to be useful. Otherwise it's like trying to speak that Inuit language that only one living person knew, but she died and the knowledge went with her.
One of the biggest values of a dashboard isn't the end-user interface (what you and I see) itself, with all its snazzy graphics and the latest bells-and-whistles marketed by your particular BI vendor, but the underlying store of information that supports it. Most companies do not have anywhere they keep summary level data for posterity's sake, unless it's financial data. How many sales did we have last month v. the same month last year? How many customers dropped off? The best possible scenario is to have this data available, along with key supporting information, such as the ability to drill down to a greater degree of detail, historical information that could impact results (for example: there were 5 Mondays in this month and Mondays are our biggest sales day) and metadata that explains where the data came from and what calculations were applied to it.
For the forecaster, this information is invaluable. Forecasting is a formidable responsibility. You get it wrong and people lose their jobs, either because the company overstaffed based on bad information or because the forecaster put up a goal that the company couldn't achieve and Wall Street is very unforgiving of this kind of mistake. But a forecaster is only as good as her information and one of the key pieces of information is historical data. This historical data must be given some context with which to interpret it in order to be useful. Otherwise it's like trying to speak that Inuit language that only one living person knew, but she died and the knowledge went with her.
Thursday, April 3, 2008
Truthtellers to the Rescue
Most managers live and die by their numbers. Whether you are a manager or an analyst who supports one, chances are you inherited a set of metrics by which to judge the performance of your department/project. Maybe you even have a bonus structure based on these metrics.
But ask yourself this question: do you really understand where your metrics come from? Because most people just accept their reports as gospel (after all, they are very official-looking) without understanding the underlying sources, calculations, transformations and definitions that result in one little, innocuous-looking number.
For example, when you’re counting “sales,” are sales from a new client different from sales generated by an existing client? If so, how do you define a new client? (DM Review’s article "Do Your Metrics Lie?" gives a great example of just such a scenario.) Do you count sales differently from how the finance department counts sales? (Careful: the answer to that question could impact your bonus!) When your boss is comparing sales from your department to sales from your peers’ departments, are they all being counted the same? Or maybe you have different systems that were developed as different markets emerged and each one counts sales differently.
I know you want to report the Truth, because the Truth will set you free. The Truth will show the world that you’re the best, ass-kickin’-est manager ever. The Truth will MAKE YOU the best, because it will enable you to manage what is really going on in the world. So if there’s one thing you’re going to be at the end of the day, it’s a Truthteller.
So what’s a Truthteller to do? First, find out who’s generating those reports. There is underlying code somewhere that defines what ends up on your (and your boss’) desktop every morning. Second, figure out whether it says what you think it says. Maybe you can read the code and figure that out. I’m guessing not. Does your company keep current metadata? If so, that’s an excellent place to start. If not (and many companies don’t) you’ll have to find a sympathetic developer to unravel the spaghetti and figure out where your numbers are coming from.
A better solution? If your company has a data warehouse and business intelligence tools, maybe you can easily create a report yourself and see if it matches the canned report your boss is looking at. If not, use your research as a jumping off point to get your metrics more in line with what’s really going on in your business.
But ask yourself this question: do you really understand where your metrics come from? Because most people just accept their reports as gospel (after all, they are very official-looking) without understanding the underlying sources, calculations, transformations and definitions that result in one little, innocuous-looking number.
For example, when you’re counting “sales,” are sales from a new client different from sales generated by an existing client? If so, how do you define a new client? (DM Review’s article "Do Your Metrics Lie?" gives a great example of just such a scenario.) Do you count sales differently from how the finance department counts sales? (Careful: the answer to that question could impact your bonus!) When your boss is comparing sales from your department to sales from your peers’ departments, are they all being counted the same? Or maybe you have different systems that were developed as different markets emerged and each one counts sales differently.
I know you want to report the Truth, because the Truth will set you free. The Truth will show the world that you’re the best, ass-kickin’-est manager ever. The Truth will MAKE YOU the best, because it will enable you to manage what is really going on in the world. So if there’s one thing you’re going to be at the end of the day, it’s a Truthteller.
So what’s a Truthteller to do? First, find out who’s generating those reports. There is underlying code somewhere that defines what ends up on your (and your boss’) desktop every morning. Second, figure out whether it says what you think it says. Maybe you can read the code and figure that out. I’m guessing not. Does your company keep current metadata? If so, that’s an excellent place to start. If not (and many companies don’t) you’ll have to find a sympathetic developer to unravel the spaghetti and figure out where your numbers are coming from.
A better solution? If your company has a data warehouse and business intelligence tools, maybe you can easily create a report yourself and see if it matches the canned report your boss is looking at. If not, use your research as a jumping off point to get your metrics more in line with what’s really going on in your business.
Data Warehousing: ETL--Part III
Enter the data warehousing team. What they do is magically make the information match and put it where you can get to it, thus solving the two major problems outlined in parts I and II. This process is known as Extraction/Transformation/Loading or ETL. Sometimes this is done with straight up coding and sometimes a tool is used, such as Informatica, but either way, it's a lot of work.
The first step is extraction, which is much like what your analyst does when they pull data from an individual system.
The second step is transformation. This is where much of the value is added by the data warehousing team. When they get the data, it is one holy mess: in one system female is denoted as F, in the next, it's female, in the third, it's 02. There are different data types, calculations, holes in the data, and other horrors we don't even want to hear about.
And it's your DW team's job to get all that data squeaky clean, integrated and moved, every night, without fail, so you can get your reports when you walk in the door in the morning. This final step is loading, which organizes the data so that you can get to it easily and quickly.
Okay, so that's just the beginning. It doesn't sound like much, but once you get to this point in your data warehouse project, you are well on your way to intelligent results!
Thanks for reading!
The first step is extraction, which is much like what your analyst does when they pull data from an individual system.
The second step is transformation. This is where much of the value is added by the data warehousing team. When they get the data, it is one holy mess: in one system female is denoted as F, in the next, it's female, in the third, it's 02. There are different data types, calculations, holes in the data, and other horrors we don't even want to hear about.
And it's your DW team's job to get all that data squeaky clean, integrated and moved, every night, without fail, so you can get your reports when you walk in the door in the morning. This final step is loading, which organizes the data so that you can get to it easily and quickly.
Okay, so that's just the beginning. It doesn't sound like much, but once you get to this point in your data warehouse project, you are well on your way to intelligent results!
Thanks for reading!
Data Warehousing: Siloed Data--Part II
This continues our definition of a data warehouse and our discussion of why you need one.
The second characteristic of your operational systems is that the data is typically "siloed." This is lingo meant to illustrate that the data in each system is independent of other systems. For example, the data in your sales system might be totally separate from the data in your financial system. But in real life, nothing could be further from the truth. This data is inextricably interdependent and you may have an analyst whose full time job is pulling revenue and cost data to analyze your most profitable market segments.
The problem with siloed data, aside from the fact that it often needs to be queried from multiple systems and "massaged" to produce meaningful analysis, is that often the same information is manually entered in multiple locations, inviting inconsistencies and inaccuracies. But, even more importantly, the data isn't always defined the same way in different systems, which leads to discrepancies in reporting. Sometimes you may not even be aware these differences exist without intensive investigation, but often you will be met with situations (as an example) where you feel you met an important goal, but your boss thinks you fell short. See "Not My Incentive!" for further instructions.
(to be continued)
The second characteristic of your operational systems is that the data is typically "siloed." This is lingo meant to illustrate that the data in each system is independent of other systems. For example, the data in your sales system might be totally separate from the data in your financial system. But in real life, nothing could be further from the truth. This data is inextricably interdependent and you may have an analyst whose full time job is pulling revenue and cost data to analyze your most profitable market segments.
The problem with siloed data, aside from the fact that it often needs to be queried from multiple systems and "massaged" to produce meaningful analysis, is that often the same information is manually entered in multiple locations, inviting inconsistencies and inaccuracies. But, even more importantly, the data isn't always defined the same way in different systems, which leads to discrepancies in reporting. Sometimes you may not even be aware these differences exist without intensive investigation, but often you will be met with situations (as an example) where you feel you met an important goal, but your boss thinks you fell short. See "Not My Incentive!" for further instructions.
(to be continued)
Data Warehousing: Why You Need It--Part I
My plan is to focus on issues that are critical to business leveraging data for a competitive advantage in the data warehouse/business intelligence environment. It occurred to me that in order to do that, I need to assume that there are some people out there who are moving from an organization that did not have a data warehouse effort to one that does (perhaps moving from a mid-size company to a larger one.) For that reason, I thought it would be useful to have a short primer on data warehousing to refer back to. To that end, I have posted three "chapters" on the subject.
I have to admit, the first time I heard the term "data warehousing," I pictured little Oompa Loompas inside our computers, shuttling our data around on little forklifts. It's not like that.
You're probably a little more savvy than I was and have a better grasp of the concept of data warehousing. But in the interest of being thorough, I'm going to pretend you're as clueless as I was.
If your company is like most, you have a number of operational systems that may or many not "talk" to each other. Maybe you have a sales system, a call-routing system, a customer service tracking system, and a financial system, to name a few. There are two key points to note about these systems. First, they are "transactional" in nature. That is to say, the data is optimized in such a way that if Bill Clinton calls in to inquire about his order of "First Husbands for Dummies," your system will very quickly retrieve the information about that one particular transaction.
What's different about a data warehouse is that it's optimized for analysis rather than transactions. Initially, the analyst (or manager) wants to know how many books have been sold, by which authors, to which stores, etc. They are looking for many transactions that share one or more characteristics so it's kind of like looking at a leaf (a transaction) through a microscope v. looking at a landscape through a camera lens. You need different tools for these very different tasks.
The big problems you face when trying to gather large amounts of data from a transactional system are: 1) performance of your query, which is likely to take a very long time since that is not what the transactional system is designed to do; 2) pulling data from a transactional system requires some degree of technical expertise, which puts you at the mercy of the IT department or your analyst who may not be available at the very moment you need your answers. And 3) performance of the transactional system, since running a sizable query in the midst of the work day is sure to have a deleterious effect on your sales order system, causing a one minute customer service call to take two minutes, your 50- person customer service team to grow to 100 people, and your profits to go down the proverbial toilet. The other alternative is to hire vampires to run your queries at night, but that solution comes with its own inherent problems. Not the least of which is that when your VP is breathing down your neck and wants his numbers "now!" tomorrow is rarely good enough.
(to be continued)
I have to admit, the first time I heard the term "data warehousing," I pictured little Oompa Loompas inside our computers, shuttling our data around on little forklifts. It's not like that.
You're probably a little more savvy than I was and have a better grasp of the concept of data warehousing. But in the interest of being thorough, I'm going to pretend you're as clueless as I was.
If your company is like most, you have a number of operational systems that may or many not "talk" to each other. Maybe you have a sales system, a call-routing system, a customer service tracking system, and a financial system, to name a few. There are two key points to note about these systems. First, they are "transactional" in nature. That is to say, the data is optimized in such a way that if Bill Clinton calls in to inquire about his order of "First Husbands for Dummies," your system will very quickly retrieve the information about that one particular transaction.
What's different about a data warehouse is that it's optimized for analysis rather than transactions. Initially, the analyst (or manager) wants to know how many books have been sold, by which authors, to which stores, etc. They are looking for many transactions that share one or more characteristics so it's kind of like looking at a leaf (a transaction) through a microscope v. looking at a landscape through a camera lens. You need different tools for these very different tasks.
The big problems you face when trying to gather large amounts of data from a transactional system are: 1) performance of your query, which is likely to take a very long time since that is not what the transactional system is designed to do; 2) pulling data from a transactional system requires some degree of technical expertise, which puts you at the mercy of the IT department or your analyst who may not be available at the very moment you need your answers. And 3) performance of the transactional system, since running a sizable query in the midst of the work day is sure to have a deleterious effect on your sales order system, causing a one minute customer service call to take two minutes, your 50- person customer service team to grow to 100 people, and your profits to go down the proverbial toilet. The other alternative is to hire vampires to run your queries at night, but that solution comes with its own inherent problems. Not the least of which is that when your VP is breathing down your neck and wants his numbers "now!" tomorrow is rarely good enough.
(to be continued)
Intro: Why Should You Care About BI?
What I want to talk about is Business Intelligence: wringing the data out of your systems and processes to improve decision-making, and ultimately, performance.
There's lots of BI talk out there for our technical partners (written by technical people), but very little to educate us as business people about why BI matters now, how our competitors are using it to gain an advantage over us, or what we have to bring to the table to make BI successful...because it won't happen without us.
Now, before we break into rounds of Kumbaya, let me tell you what to expect (and what not to expect) from this blog.
I won't be reviewing software, weighing in on architecture, or gossipping about the goings-on in Silicon Valley. The first two issues are important, but that is not what we're called to do. The value we add is keeping the focus on business results. Because that's the point, right? I mean, that may seem pretty elementary to some people, but in the tech world, I think it's too easy to get wrapped up in the technology for technology's sake.
That said, the world of competitive BI is probably going to require you to learn some things they didn't teach you in business school, like what is a data warehouse, why you need metadata and how to leverage business intelligence tools to raise your group's level of performance. Don't let your brain shut down if you encounter unfamiliar terms. For you to be competitive in the New World Order of data, this all needs to be part of your vocabulary.
I'm telling you this because no one ever bothered to tell me. I had to wade through a lot of technical noise to get to the bottom line: how will this help me do a better job at the job I already do?
One last point: your feedback is welcome and necessary to the relevance of this blog. Let me know what issues you are facing in your organization and I will try to address as many as I can. Thanks for reading.
There's lots of BI talk out there for our technical partners (written by technical people), but very little to educate us as business people about why BI matters now, how our competitors are using it to gain an advantage over us, or what we have to bring to the table to make BI successful...because it won't happen without us.
Now, before we break into rounds of Kumbaya, let me tell you what to expect (and what not to expect) from this blog.
I won't be reviewing software, weighing in on architecture, or gossipping about the goings-on in Silicon Valley. The first two issues are important, but that is not what we're called to do. The value we add is keeping the focus on business results. Because that's the point, right? I mean, that may seem pretty elementary to some people, but in the tech world, I think it's too easy to get wrapped up in the technology for technology's sake.
That said, the world of competitive BI is probably going to require you to learn some things they didn't teach you in business school, like what is a data warehouse, why you need metadata and how to leverage business intelligence tools to raise your group's level of performance. Don't let your brain shut down if you encounter unfamiliar terms. For you to be competitive in the New World Order of data, this all needs to be part of your vocabulary.
I'm telling you this because no one ever bothered to tell me. I had to wade through a lot of technical noise to get to the bottom line: how will this help me do a better job at the job I already do?
One last point: your feedback is welcome and necessary to the relevance of this blog. Let me know what issues you are facing in your organization and I will try to address as many as I can. Thanks for reading.
Subscribe to:
Posts (Atom)