Data Analysis Tools: What Is My Favorite?

Posted by arlene

Data analysis tools are used to perform complex analysis of data. They will normally have a rich set of analytic functions, which allow sophisticated analysis of the data. These tools are designed for business analysis, and will generally understand the common business metrics, such as market share, churn and profitability.

Data analysis tools can be subdivided into two categories: MOLAP tools and ROLAP tools. MOLAP and ROLAP are terms that sprang up to distinguish two different approaches to analyzing data. They came into existence because the term OLAP has become an industry buzzword. To understand the difference between them, one first has to answer three questions: what is OLAP, what is relational technology, and what is multidimensional analysis?

Let us take these questions one at a time. First, what is OLAP? The term OLAP is an acronym for online analytical processing. Much has been written about the subject in the computer literature, and for a detailed discussion you should consult some of that work. For our purposes it is sufficient to have a basic understanding of the term.

OLAP is primarily about being able to access live data online and analyze it rapidly, and about the methods, structures and tools required to perform this analysis. OLAP tools are designed to allow reasonably large quantities of data to be analyzed online. An OLAP tool will allow a user to quickly perform standard analytical functions on the data and to represent both data and results graphically. The idea is to allow the user to easily manipulate and visualize the data.

Relational technology has been around for many years, and is fairly well understood these days. Again, there is a large body of literature on the subject. For our purposes it is sufficient to say that basically the relational model works by allowing data to be normalized into relations, usually referred to as tables. This normalization minimizes data duplication, making the data more manageable, while still allowing data to be efficiently manipulated. The power of relational technology is in the relational operations that allow data to be joined, unioned, intersected and so on. This, along with the standardization on a common SQL language, has made relational databases the norm in the marketplace today.
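To make the relational operations mentioned above a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables and columns (customers, orders and so on) are invented for the illustration and are not taken from any particular product.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, product TEXT);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO orders VALUES (1, 'widget'), (1, 'gadget'), (2, 'widget');
""")

# Join: combine rows from two normalized tables on a shared key.
joined = conn.execute("""
    SELECT c.name, o.product
    FROM customers c JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.product
""").fetchall()

# Union: customers who bought a widget or a gadget (duplicates removed).
either = conn.execute("""
    SELECT customer_id FROM orders WHERE product = 'widget'
    UNION
    SELECT customer_id FROM orders WHERE product = 'gadget'
    ORDER BY customer_id
""").fetchall()

# Intersect: customers who bought both a widget and a gadget.
both = conn.execute("""
    SELECT customer_id FROM orders WHERE product = 'widget'
    INTERSECT
    SELECT customer_id FROM orders WHERE product = 'gadget'
""").fetchall()

print(joined, either, both)
```

Because the customer name is stored only once in the customers table, the join is what brings it back together with each order when it is needed.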

Multidimensional analysis is a technique whereby data can be analyzed in many dimensions at once. The term dimension in this context means an attribute such as cost, duration, or name. These attributes will generally be equivalent to a column in a relational table. The idea is that instead of analyzing the data in a two-dimensional table, the data is loaded into a multidimensional hypercube to be analyzed.

Using matrix arithmetic and sparse matrix optimizations, the hypercube can be stored space-efficiently and analyzed very rapidly along the loaded dimensions. Some tools also allow the use of multiple separate but related hypercubes; this reduces the sparsity of the matrices and makes the dimension calculations more efficient.
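A rough way to picture a sparse hypercube is a dictionary keyed by the cell coordinates, where empty cells are simply not stored. The sketch below is only an illustration of that idea in plain Python; it does not use matrix arithmetic, real MOLAP engines use far more elaborate storage, and the dimension names and figures are made up.

```python
from collections import defaultdict

# Dimension names for a hypothetical three-dimensional sales cube.
DIMENSIONS = ("region", "product", "month")

# Sparse storage: only cells that actually hold a value are kept,
# keyed by their coordinates along each dimension.
cube = {
    ("north", "widget", "jan"): 120.0,
    ("north", "gadget", "jan"): 80.0,
    ("south", "widget", "feb"): 200.0,
}

def roll_up(cube, dimension):
    """Sum all cell values grouped by one dimension."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(float)
    for coords, value in cube.items():
        totals[coords[axis]] += value
    return dict(totals)

by_region = roll_up(cube, "region")
by_product = roll_up(cube, "product")

# With 2 regions, 2 products and 2 months a dense cube has 8 cells;
# the sparse form stores only the 3 that are populated.
dense_cells = 2 * 2 * 2
stored_cells = len(cube)

print(by_region, by_product, dense_cells, stored_cells)
```

The saving grows quickly with the number of dimensions, since most combinations of dimension values never occur in real data.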

The acronyms MOLAP and ROLAP stand for multidimensional OLAP and relational OLAP respectively. They are terms that have come into use to differentiate multidimensional tools from relational tools. The distinction is somewhat artificial, because many of the MOLAP tools have an SQL interface that allows them to extract data from a relational database. That said, the SQL interface generates its queries automatically, and the generated SQL is not necessarily efficient.

The advantage of using a multidimensional tool is that on a predefined set of data it gives user-friendly, fast access to powerful analytical and statistical functions. Multidimensional tools suffer a heavy performance hit when uploading a cube, but once the data is in memory they can carry out certain operations on that data far more efficiently than a relational tool can. Operations such as time series analysis and top-ten or bottom-ten selection can be performed extremely efficiently. These operations, while possible in a relational tool, are difficult to program in SQL, and will not perform efficiently. The other thing that multidimensional tools do very well is dimensional slicing. If data has been loaded into the cube dimensioned by office, region, sales_quantity, sales_value and product, the tool will be able to switch almost instantly from displaying data by region to displaying data by product or by sales value.
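Here is a small sketch of the two operations just described: switching the view from one dimension to another, and picking the top or bottom entries. It works on a handful of made-up in-memory records rather than a real cube, and the helper names (totals_by, top_n, bottom_n) are my own, not those of any tool.

```python
import heapq
from collections import defaultdict

# Hypothetical in-memory records, dimensioned roughly as in the text above.
records = [
    {"office": "Leeds", "region": "north", "product": "widget", "sales_value": 120.0},
    {"office": "York", "region": "north", "product": "gadget", "sales_value": 80.0},
    {"office": "Bath", "region": "south", "product": "widget", "sales_value": 200.0},
    {"office": "Dover", "region": "south", "product": "gizmo", "sales_value": 50.0},
]

def totals_by(records, dimension):
    """Total sales_value for each distinct value of the chosen dimension."""
    totals = defaultdict(float)
    for row in records:
        totals[row[dimension]] += row["sales_value"]
    return dict(totals)

def top_n(totals, n):
    """The n entries with the largest totals, largest first."""
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])

def bottom_n(totals, n):
    """The n entries with the smallest totals, smallest first."""
    return heapq.nsmallest(n, totals.items(), key=lambda item: item[1])

by_region = totals_by(records, "region")    # view the data by region
by_product = totals_by(records, "product")  # switch to a view by product
best = top_n(by_product, 2)
worst = bottom_n(by_product, 1)

print(by_region, by_product, best, worst)
```

Once the data is held in memory, changing the dimension is just another pass over it, which is why the switch feels almost instant to the user.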

MOLAP tools are good to use for analyzing aggregated data in conjunction with its dimension data. They are not so good if you need to drill down to detailed data at the fact level, or if you need to query very large quantities of base data. If you are going to use a MOLAP tool against the data warehouse it will be better to help the SQL performance by creating aggregations that will allow the commonly accessed cubes to be quickly built, or even prebuilt. This will allow the MOLAP tool to get up and running very quickly. It will also prevent inefficient generated SQL from trying to build data sets at the correct aggregated level for the desired cube. In effect, these aggregations are data marts designed specifically for the MOLAP tool.
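As a sketch of what such an aggregation might look like, the example below builds a monthly summary table from a detailed fact table using Python's sqlite3 module. The table names (sales_fact, sales_monthly_agg), the columns and the figures are all assumptions made for the illustration.

```python
import sqlite3

# In-memory database; every table, column and value here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (
        sale_day TEXT, region TEXT, product TEXT, sales_value REAL
    );
    INSERT INTO sales_fact VALUES
        ('2024-01-03', 'north', 'widget', 100.0),
        ('2024-01-17', 'north', 'widget', 20.0),
        ('2024-01-09', 'south', 'gadget', 75.0),
        ('2024-02-02', 'south', 'gadget', 25.0);

    -- Pre-built aggregation at the month / region / product level.
    CREATE TABLE sales_monthly_agg AS
    SELECT substr(sale_day, 1, 7) AS sale_month,
           region,
           product,
           SUM(sales_value) AS sales_value,
           COUNT(*) AS sale_count
    FROM sales_fact
    GROUP BY substr(sale_day, 1, 7), region, product;
""")

# A cube load can now read the small summary table instead of the fact table.
agg_rows = conn.execute("""
    SELECT sale_month, region, product, sales_value, sale_count
    FROM sales_monthly_agg
    ORDER BY sale_month, region, product
""").fetchall()

print(agg_rows)
```

The tool then only has to issue a simple select against the summary table, so even poorly generated SQL has little work left to do.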

ROLAP tools are the traditional SQL-oriented tools that have tight integration with the relational model. These tools have been around a long time, but are changing all the time. The current generation of ROLAP tools is powerful and easy to use. They use metadata to isolate the user from the underlying complexities of the data warehouse, and to present a business perspective of the data.
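One way to picture that metadata layer is a lookup from business terms to the SQL behind them. The sketch below is a deliberately simplified assumption of how such a mapping could work; the METADATA structure, the build_query helper and the table and column names are all hypothetical and do not describe any real product.

```python
# Hypothetical metadata layer: business terms mapped to SQL fragments.
METADATA = {
    "table": "sales_fact",
    "dimensions": {"Region": "region", "Product": "product"},
    "measures": {
        "Revenue": "SUM(sales_value)",
        "Units Sold": "SUM(sales_quantity)",
    },
}

def build_query(dimension, measure, metadata=METADATA):
    """Translate a business-level request into a SQL string."""
    column = metadata["dimensions"][dimension]
    expression = metadata["measures"][measure]
    return (
        f'SELECT {column} AS "{dimension}", {expression} AS "{measure}" '
        f'FROM {metadata["table"]} GROUP BY {column}'
    )

# The user asks for "Revenue by Region" and never sees the column names.
sql = build_query("Region", "Revenue")
print(sql)
```

The user works only with terms like Revenue and Region, while the tool takes care of the table and column names underneath.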

ROLAP tools can be distinguished from the data dippers by their range and depth of analytical functionality. As with the MOLAP tools these are business-aware tools that understand business terminology. Being relational they can be used as data browsers, and will have good drill-down capability from aggregation to detailed data.
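Drill-down is easy to sketch in SQL terms: start from an aggregated row, then fetch the detail rows behind it. The example below does this with Python's sqlite3 module, using a made-up sales_fact table.

```python
import sqlite3

# In-memory database; the table and its contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (region TEXT, product TEXT, sales_value REAL);
    INSERT INTO sales_fact VALUES
        ('north', 'widget', 100.0),
        ('north', 'gadget', 40.0),
        ('south', 'widget', 60.0);
""")

# Aggregated view: one row per region.
summary = conn.execute(
    "SELECT region, SUM(sales_value) FROM sales_fact "
    "GROUP BY region ORDER BY region"
).fetchall()

# Drill down: the detail rows behind the 'north' summary row.
detail = conn.execute(
    "SELECT product, sales_value FROM sales_fact "
    "WHERE region = ? ORDER BY product",
    ("north",),
).fetchall()

print(summary, detail)
```

Because both queries run against the same relational tables, the detail is always available without loading anything extra.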


