Yosemite Analytics Viewpoint
Integrated BI Data Model
Historically, business reporting started with the information technology (IT) department building a data warehouse. Business intelligence (BI) reporting tools were added later. Some reporting tools recognized the need for additional data modeling. Now companies interested in business reporting have two options: a data warehouse with separate reporting tools, or a BI data model tightly integrated with the reporting tools. The curious thing is that the data warehouses built by IT do not look like the BI data models built to satisfy specific reporting needs. We explore some differences with examples.
Example 1. Salesperson Performance — A key metric in evaluating salesperson performance is how much they discount the product. Good salespeople communicate the value proposition while less skilled salespeople offer a discount. Oddly, all (or most) enterprise resource planning (ERP) systems only store a product’s current list price. This list price changes over time. The list price at the time of the sale is not stored with the sales transaction, only the actual sale price. It is thus not possible to see which salespeople are discounting products and by how much. A BI data model can add the current list price to the sales transaction as part of the daily extraction from the source system.
Example 2. Sales Orders Change Over Time — All (or most) ERP systems only store the latest version of a sales order. If a client orders 1000 widgets in May and adds 500 more in July, the ERP sales orders only shows that 1500 widgets were ordered and the due date is July. Some companies want to track these changes. A BI data model can detect these changes as part of the daily extraction and maintain the fact that 1000 widgets were ordered in May and 500 more were added in July.
Example 3. Past and Projected Sales — A basic reporting need is to view past and projected sales on a single trend chart (where time is the x-axis). We have seen ERP systems that maintain actual sales using invoice tables. Projected sales are stored separately as orders. The two have different table structures and would not normally be combined in a data warehouse. A BI data model can merge similar properties (such as invoice date and order due date) and measures (actual and expected revenue) in the same fields, thereby enabling reporting using a single trend chart.
Example 4. Shipping Charges — A simple reporting requirement is to identify the total cost attributed to a product or group of products. This includes shipping costs. Shipping costs are very significant to hippopotamus distributors, as well as other industries. ERP systems do not store shipping costs with the sales order. Shipping costs are maintained separately, by carton. A data warehouse would maintain this separation. A BI data model can purposely allocate the cost of each carton by product weight and integrate the allocated cost into the sales fact table.
Example 5. Redundant Fact Tables. — A sales fact table can contain many measures. 17 are shown in the graphic below. End users can select one or all measures from a palette to create an ad hoc report. However, suppose there was a dashboard page which showed a trend chart, a geomap, and other visualizations for one of the core measures: quantity, revenue, or margin. Further, suppose users needed to switch the entire page between core measures. This is not possible with the fact table represented by the graphic. Typical BI dashboards only allow end users to filter (select) using dimension members. To support this requirement, a second fact table must be created with a single measure, named Core Measure, and an additional dimension with 3 members: quantity, revenue, and margin. Thus, redundant fact tables are required. The need for redundant fact tables arises directly from reporting requirements and would never be captured in a data warehouse.
The BI data model addresses other aspects that a data warehouse does not, such as data-level security. Salespeople can only view sales for their region. Department managers can only view expenses for their department. Operational personnel can view quantities and time-based metrics but not finances. In addition, a BI data model often integrates data that is not maintained in ERP systems, such as commercially available competitive data. Further, some BI products allow end users to integrate personal data, such a spreadsheet of sales targets, with governed data.
The conclusion is that reporting requirements affect the BI data model in ways that cannot be anticipated from a pure data-centric perspective. The diagram below summarizes the situation. The data warehouse design is based on IT’s perspective of the dominant database tables and is often maintained as normalized or partially normalized relational structures. This is the way IT thinks and breathes. The company’s analytic requirements results in a BI data model based on business subject areas (sales, orders, invoices, etc.) and the data is coalesced into a set of star schemas with shared dimensions. Star schemas are easier for end users to understand and navigate as well as better for reporting. Star schemas readily support interactive end user roll ups (aggregations), drill down, slicing and dicing. The BI data model, being driven by prioritized business needs, is usually smaller in scope than the data warehouse. The BI data model also includes information that the data warehouse does not, in terms of both additional data and data transformations.
So why is this such a big deal? Many (most) enterprises have IT build a data warehouse purely based on data structures. BI is viewed solely as a reporting function performed by business analysts. Even Gartner evaluates the two capabilities separately. Some enterprises use sophisticated federated database products rather than a BI product that supports data modeling. This is partly due to a reluctance to learn how to use BI data modeling products in favor of traditional SQL solutions. Nevertheless, none of the examples described above are supported by the pure data warehousing approach. This is the wrong approach, in general, and is an especially bad approach for small to mid-sized enterprises.
Yosemite Analytics is powered by Birst because Birst has a 2-tier architecture that integrates the BI data model with advanced analytics. Birst data modeling is performed using a graphical user interface (GUI) supported by a SQL-like programming language. This programming language has extensions providing capabilities focused on reporting, such as flattening an organizational hierarchy. Birst uniquely supports “networked BI”, a capability where parts of a governed data model can be shared with other data models, functionally equivalent to federated databases. Birst also allows end users to integrate their personal spreadsheets with governed data. All Birst functionality is available in the cloud. Birst scales from individuals to the enterprise in a manner that is smarter, more connected, and more scalable than any previous analytics and business intelligence platform. You can learn more about Birst at www.birst.com.