Data warehousing and business intelligence are both important parts of the broad field we call data science. This article defines both concepts and explains how they are interconnected and how they can, and should, be used together.
It is especially important to understand these concepts well, since business intelligence is a growing field that is crucial to today’s enterprises and to businesses in general.
The data warehouse, in turn, stores and provides the information that makes BI analysis possible.
So, let’s dig into these concepts!
Business Intelligence, also known as BI, is a set of methods and processes that, combined with different technologies and architectures, provide analysis, interpretation and presentation of information. The strategies for consuming the data can vary considerably depending on the objective of the analysis.
Therefore, based on what we want to know about a given data set, there are some important choices about architecture and technologies that must be made in order to optimize the results we are aiming for.
These analyses can cover current data, historical data, or even predictions of what will happen based on specific models. The methods that provide these capabilities are:
- Data Mining
- Predictive Analytics
- Process Mining
- Text Mining
All these kinds of processes are applied to huge sets of data or, as it is commonly called, big data, and provide distinct kinds of results and analysis capabilities.
The focus of Business Intelligence is therefore to provide tools that support decision making and help define specific goals, objectives and choices based on concrete analysis of all the available data. These decisions are backed by the investigation techniques that were applied, and they can concern a wide variety of subjects, from stock analysis to product pricing.
Another goal of BI is forecasting, across several different topics, based on predictive procedures and historical data. One example is forecasting the demand for a specific good within a store.
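As a rough illustration of demand forecasting from historical data, the sketch below uses a simple moving average, which is only one of many possible predictive procedures and is chosen here for brevity. The function name `moving_average_forecast` and the weekly sales figures are hypothetical and serve only as an example.

```python
from statistics import mean


def moving_average_forecast(history, window=3):
    """Forecast the next period's demand as the mean of the last `window` periods."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(history) < window:
        raise ValueError("not enough history for the chosen window")
    return mean(history[-window:])


# Hypothetical weekly units sold for one good in one store (illustrative data).
weekly_units_sold = [120, 135, 128, 140, 150, 145]

forecast = moving_average_forecast(weekly_units_sold, window=3)
print(f"Forecast for next week: {forecast:.1f} units")
```

A real BI tool would typically use richer models that account for trend and seasonality, but the principle is the same: past observations stored in the warehouse feed a model that estimates future demand.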
In order to deliver the best results and analysis, a business intelligence application must maintain and improve the following characteristics of its information:
- Quantity of data – Without a relevant amount of information, a BI technique can’t produce its analysis with the best accuracy.
- Standardization of data – All data sources should be normalized to a common format so that the information can be consumed properly.
- Clean data – Erroneous data and information that is irrelevant to the topic should be removed from the analysis set. This is called data cleaning.
- Quality of the data – The combination of standardization, cleaning and consistency of information increases the quality of the data and is a large part of a successful analysis.
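The standardization and cleaning steps listed above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the field names (`id`, `customer_id`, `age`, `city`), the validity rule for ages and the sample records are all hypothetical, and a real pipeline would have many more rules.

```python
def standardize(record):
    """Map source-specific field names and formats onto one common shape."""
    raw_id = record.get("customer_id") or record.get("id") or ""
    return {
        "customer_id": str(raw_id).strip(),
        "age": record.get("age"),
        "city": (record.get("city") or "").strip().title(),
    }


def is_clean(record):
    """Keep only records with an identifier and a plausible age (assumed rule)."""
    age = record["age"]
    return bool(record["customer_id"]) and isinstance(age, int) and 0 < age < 120


# Hypothetical records arriving from two differently shaped sources.
raw_records = [
    {"id": " 17 ", "age": 21, "city": "lisbon"},         # source A uses "id"
    {"customer_id": "18", "age": -4, "city": "Porto"},   # erroneous age
    {"customer_id": "", "age": 30, "city": "Braga"},     # missing identifier
    {"customer_id": 19, "age": 45, "city": " porto "},   # numeric id, untidy city
]

standardized = [standardize(r) for r in raw_records]
clean_records = [r for r in standardized if is_clean(r)]

for record in clean_records:
    print(record)
```

Standardizing before cleaning keeps the cleaning rules simple, because they only need to handle one record shape.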
Given these requirements, it is clear that a good data storage system is critical to the success of a Business Intelligence application.
A data warehouse isn’t a strict requirement for business intelligence. However, taking advantage of its characteristics and specifications can be very useful and important for a BI system.
These applications consume and analyse data stored and consolidated in data warehouses or data marts, depending on the scope of the investigation. In this article we describe in detail the difference between a data warehouse and a data mart.
Basically, business intelligence is the analytical tool, based on methods, processes and components for manipulating data. How the information is consumed depends on the architecture defined by the BI application. The aim is to convert the raw information into a data product that can inform business decisions.
The data warehouse, on the other hand, stores and consolidates all the data that will later be used by the BI process. The following image shows, in a simple way, the link between the data warehousing and business intelligence concepts discussed here.
In the figure, we see information from different data sources consolidated into a data warehouse, which is consumed by the BI application and its methods. The results are then made available to users.
To further illustrate the symbiosis between these concepts, here is a simple yet practical example.
Imagine a small data warehouse with information about an online store. It stores three sets of information: data about consumers (age, job, address, etc.), all the products in the store, and a table with the number of items sold to each consumer.
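To make this concrete, the small warehouse described here could be sketched with Python’s built-in `sqlite3` module and an in-memory database. The table names, column names and sample rows are assumptions made for illustration; a production warehouse would normally run on a dedicated platform with a more elaborate schema.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE consumers (
    consumer_id INTEGER PRIMARY KEY,
    age         INTEGER,
    job         TEXT,
    address     TEXT
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT
);
CREATE TABLE sales (
    consumer_id INTEGER REFERENCES consumers(consumer_id),
    product_id  INTEGER REFERENCES products(product_id),
    items_sold  INTEGER
);
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO consumers VALUES (?, ?, ?, ?)",
    [(1, 20, "student", "Campus Road 1"), (2, 41, "engineer", "Main Street 7")],
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, "notebook"), (2, "headphones")],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 1, 3), (1, 2, 1), (2, 2, 2)],
)
conn.commit()

total_items = conn.execute("SELECT SUM(items_sold) FROM sales").fetchone()[0]
print("Total items sold:", total_items)
```

The sales table links consumers to products, which is what later allows a BI query to relate consumer attributes such as age to the goods they buy.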
So far, we have been in the data warehouse layer. Now let’s get into the BI part of the example.
The owners of the store want to open a new store in a region near several university campuses. Consumers of student age should be the main customers of that store.
In order to optimize sales at the new store, they want to know which goods sell best among that kind of consumer.
To achieve that, they use a business intelligence application which segments the consumers into age ranges and correlates them with the goods they buy. With this, they are able to build reports and dashboards to analyse and present that information and support the decision about the new store.
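The segmentation step can be sketched as follows. This is a simplified stand-in for what a BI application would do: it groups consumers into age ranges and counts the items sold per product in each range, rather than computing a statistical correlation. The age boundaries, consumer ages and sales rows are hypothetical.

```python
from collections import Counter, defaultdict

# Assumed age ranges; a real analysis would choose these to fit the question.
AGE_RANGES = [(0, 17), (18, 25), (26, 40), (41, 120)]


def age_range(age):
    """Return the label of the range an age falls into."""
    for low, high in AGE_RANGES:
        if low <= age <= high:
            return f"{low}-{high}"
    return "unknown"


# Hypothetical data: consumer_id -> age, and (consumer_id, product, items_sold).
consumers = {1: 19, 2: 23, 3: 34, 4: 52}
sales = [
    (1, "notebook", 3),
    (2, "notebook", 2),
    (2, "headphones", 1),
    (3, "coffee maker", 1),
    (4, "headphones", 1),
    (1, "headphones", 2),
]

# Sum the items sold per product within each age range.
items_by_range = defaultdict(Counter)
for consumer_id, product, items_sold in sales:
    items_by_range[age_range(consumers[consumer_id])][product] += items_sold

# Best-selling goods among consumers of university age (assumed to be 18-25).
top_for_students = items_by_range["18-25"].most_common(2)
print(top_for_students)
```

The per-range totals are exactly the kind of result that a report or dashboard would then present to the store owners.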
This is a simple example of the many applications of these two concepts together.