Let’s jump right away to the learning process!
In the next chapters, we’ll be describing each topic in a complete, yet simple way. Before going any further on these topics, we show you a simple infographic comparing the two approaches:
Online analytical processing is a computer technology term referring to systems focused on analysing data in a specific database. This kind of systems are characterized for their analytical capabilities, addressing multi-dimensional or one dimension data, processing all the information. The standard applications of OLAP are bussiness intelligence, data writing and reporting, throught data mining processes.
OLAP operations and databases
On the database level, these systems operation is defined by a low level of transactions, dealing with archived and historical information. This data is seldom updated, identifying the SELECT database operation as the key feature of the system. Therefore, this kind of databases are based on READ operations, aggregating all available information.
Databases that work as data warehouses apply this methodology, optimizing reading and aggregation operations of its multidimensional data model. Thus providing a great support for data analysis and reporting operations, critical in these kind of databases.
The main component of these systems is a OLAP cube. A cube consists in combining data warehouse’s structures like facts and dimensions. Those are organized as schemas: star schema, snowflake schema and fact constellation. The merging of all the cubes creates a multidimensional data warehouse.
There are many types of OLAP systems, depending on it’s structure characteristics. The most common ones are: MOLAP, ROLAP and HOLAP.
The most important real world applications of these systems are: bussiness management and reporting, financial reporting, marketing, research and another data related issues. These processes are growing faster on these days, making them absolutely critical in a world that is becoming dependent of data. In the next paragraph we have a real world example of what we described before.
Real World Example: In a hospital there is 20 years of very complete patient information stored. Someone on the administration wants a detailed report of the most common deseases and sucess rate of treatment. To achieve that, the application of OLAP operations to the hospital’s data warehouse is the key. Usually complex queries on the historical data compile that report, providing the means to the board analyse the information.
Online Transaction Processing is a information system type that prioritizes transaction processing, dealing with operational data. This kind of computer systems are identified by the large number of transactions they support. Thus making them the best to address online application. The main applications of this method are all kind of transactional systems like databases, commercial, hospital applications and so on.
In a simple way, these systems gather input information and store them on a database, in a large scale. Most of today’s applications are based on this interaction methodology, with implementations of centralized or descentralized systems.
OLTP database and operations
On the database level, these transactional systems base their operation on multi-access, fast and effective querys to the database. The most used operations are INSERT, UPDATE and DELETE, since they are directly modifying the data, providing new information on new trasactions. So, in these systems, data is frequently updated, requiring a effective write operations support.
One special characteristic of those databases is the standarizarion of it’s data. This happens because data normalization provides a faster and more effective way to perform database writes. The main concern is the atomicity of the trasanctions and ensuring that concurrent accesses don’t damage data. Addicionally, it is also important that it don’t degradate system’s performance.
OLTP is not only about databases, but also other types of interaction mecanisms. All client-server architectures are examples of these processes, benefiting of the fast transaction and concurrent models. Descentralized systems are also online transaction processing, as all broker programs and web servervices are transaction oriented.
Real World Example: A banking transaction system is a classic example. There are many users executing operations into their accounts and the system must guarantee the completeness of those actions. In this case, there are several concurrent transactions at the same time. Therefore, data coherence and efficient operations are the main goal.
Both are related to information databases, which provide the means and support for these two types of functioning.
Each one of the methods creates a different branch on data management system, with it’s own ideas and processes, but they complement themselves.
To analyse and compare them we’ve built this resource!
Basically, OLAP and OLTP are very different approaches to the use of databases, but not only. In one hand, online analytical processing is more focused on data analysis and reporting. On the other hand online trasaction processing target a transaction optimized system, with a lot of data changes.
For someone learning about data sciences, related to IT methods, it is important to know the difference between these two approaches to information. This is the base idea to systems like business intelligence, data mining, data warehousing, data modelling, etl processes and big data.
Regarding the previous descriptions of the systems, we can compare them in a lot of distinct categories.
The review is detailed in the next table. Then we have a further discussion on each compared item which could evoke some doubts, to ensure you understood.
|Category||Online Transactional Processing||Online Analytical Processing|
|Data Source||Operational information of the application. Generally this processes are the source of the data.||Historical and archive data.|
|Focus||Updating data.||Reporting and retrieval of information.|
|Applications||Management, operational, web services, client-server.||Management Systems. Reporting, decision.|
|Users||Common person. Staff.||Managers, executives, data scientists, marketers.|
|Data task||Operational tasks. Business tasks.||Reporting and data analysis.|
|Data Refresh||Insert, update and delete operations. They are performed fast. Immediate results.||Refreshing of data with huge data sets. Takes time and is sporadic.|
|Data Model||Entity-relantionship on databases.||One or multi-dimensional data.|
|Schema||Normalized schemas. Many tables and relationships.||Star, snowflake and constellation schema. Fewer tables not normalized.|
|Backup||Regular backups with full, incremental and archives. This data is critical and can't be loss.||Simple backup or realoading of the mecanisms that support data insertion.|
|Horizon||Day-to-day, weeks, months.||Long time data.|
|Queries||Simple, returning the results expected for the system activity.||Complex queries of data in order to aggregate information for reporting.|
|Speed||Fast. Requires some indexes on large tables.||Slow. Depending on the amount of data. Requires more indexes.|
|Space||Operational data stored, tipically small.||Large data sets of historical information. Large storage needed.|
The first obvious difference is the focus of each. OLTP is best suited to update existing data, becoming the logical choice to operational systems which work with every day actions. In opposite, OLAP provides great analysis habilities of stored information, making this kind of systems perfect for reporting purposes. They usually base their work in historical data, which is best to analyse, getting more accurate reports.
These systems have very different purposes.
To achiveve the better performance on each one, there are some architectures that suit best operational actions, and others that boost reporting and analysis. Then OLTP database schemas are usually on a normalized form, providing a better performance for the usual queries. On the other hand, OLAP databases have specific data warehouse organization schemas.
Based on this first and critical differences, we can infer some other contrasts.
In OLTP, where the common work is operational, the queries on the system should be much more simpler than on OLAP, where complex digging on data is performed. Therefore, OLTP systems tend to be much faster than OLAP ones.
One of the subjects that hasn’t been discussed yet is data refresh.
In the OLTP system information is constantly changing, so this refresh rate is immediate after each actions. In the other hand, on OLAP systems, the refresh is a pre-defined processing job that stores large sets of data simultaneously. This take long time because of the data size and the process of normalizing the information.
Another term we didn’t discuss is the backup situation.
Since data on OLTP systems is absolutely critical, it needs a complex backup system. Full backups of the data combined with incremental backups are required. OLAP only needs a backup from time to time, since it’s data is not critical and doesn’t keep the system running.
Space requirements on both sides is, obviously dependent on the size of information stored. However is safe to assume that historical data will need more storage space, since a lot more information is stored.