Practical work on olap technology in excel. Understand real-time analytical processing (OLAP). OLAP functions in Excel

The first interface of pivot tables, also called pivot reports, was included in Excel back in 1993 (version of Excel 5.0). Despite its many useful features, it is hardly used by most Excel users. Even experienced users often mean by the term “summary report” something built using complex formulas. Let's try to popularize the use of pivot tables in the daily work of economists. The article discusses the theoretical foundations of creating summary reports, provides practical recommendations for their use, and also provides an example of data access based on multiple tables.

Multivariate data analysis terms

Most economists have heard the terms "multidimensional data", "virtual cube", "OLAP technologies" and so on. But with a detailed conversation, it usually turns out that almost everyone does not really know what it is about. That is, people mean something complex and usually not relevant to their daily activities. In fact, this is not the case.

Multidimensional data, measurements

It is safe to say that economists are almost constantly faced with multidimensional data, but try to represent it in a predefined form using spreadsheets. Multidimensionality here means the ability to enter, view or analyze the same information with a change in appearance, the use of different groupings and data sorts. For example, a sales plan can be analyzed according to the following criteria:

types or groups of goods;
brands or product categories;
periods (month, quarter, year);
buyers or groups of buyers;
sales regions
etc.

Each of the above criteria in terms of multivariate data analysis is called "dimension". We can say that a measurement characterizes information for a certain set of values. A special type of measurement of multidimensional information is "data". In our example, the sales plan data can be:

volume of sales;
Selling price;
individual discount
etc.

In theory, data can also be a standard dimension of multidimensional information (for example, you can group data by sales price), but usually data is still a special type of value.

Thus, we can say that in practical work economists use two types of information: multidimensional data ( actual and planned numbers, having many characteristics) and reference books (characteristics or measurements of data).

OLAP

The abbreviation OLAP (online analytical processing) is literally translated as "real-time analytical processing." The definition is not very specific, almost any report of any software product can be summed up under it. In the meaning of OLAP, it means a technology for working with special reports, including software, for obtaining and analyzing multidimensional structured data. One of the popular software products that implement OLAP technologies is SQL Server Analysis Server. Some even mistakenly consider it to be the only representative of the software implementation of this concept.

Virtual data cube

"Virtual cube" (multidimensional cube, OLAP cube) is a special term coined by some specialized software vendors. OLAP systems usually prepare and store data in their own structures, and special analysis interfaces (for example, Excel pivot reports) access data from these virtual cubes. At the same time, the use of such a dedicated storage is not at all necessary for processing multidimensional information. In general, virtual cube- this is an array of specially optimized multidimensional data that is used to create summary reports. It can be obtained both through specialized software tools and through simple access to database tables or any other source, such as an Excel table.

Pivot table

A Pivot Table is a user interface for displaying multidimensional data. Using this interface, you can group, sort, filter and change the location of data in order to obtain various analytical samples. The report is refreshed by simple means of the user interface, the data is automatically aggregated according to the specified rules, and no additional or repeated input of any information is required. The Excel PivotTable interface is perhaps the most popular software product for working with multidimensional data. It supports both external data sources (OLAP cubes and relational databases) and internal spreadsheet ranges as a data source. Beginning with version 2000 (9.0), Excel also supports a graphical form of displaying multidimensional data - a pivot chart (Pivot Chart).

Excel's PivotTable interface allows you to position dimensions of multidimensional data in an area of a worksheet. For simplicity, you can think of a pivot table as a report lying on top of a range of cells (in fact, there is a certain binding of cell formats to the fields of the pivot table). An Excel PivotTable has four areas for displaying information: filter, columns, rows, and data. Data dimensions are named pivot table fields... These fields have their own properties and display format.

Once again, I would like to draw your attention to the fact that the Excel pivot table is intended solely for data analysis without the ability to edit information. Closer in meaning would be the ubiquitous use of the term "Pivot Report", and this is how this interface was called until 2000. But for some reason, in subsequent versions, the developers abandoned it.

Editing pivot tables

By its definition, OLAP technology, in principle, does not imply the ability to change the source data when working with reports. Nevertheless, a whole class of software systems has formed on the market that implement the capabilities of both analysis and direct editing of data in multidimensional tables. Basically, such systems are focused on solving budgeting problems.

Using built-in Excel automation tools, you can solve many non-standard tasks. An example of implementing editing for Excel pivot tables based on worksheet data can be found on our website.

Preparing multidimensional data

Let's get to the practical use of pivot tables. Let's try to analyze sales data in different directions. File pivottableexample.xls consists of several sheets. Sheet Example contains basic information about sales for a certain period. For simplicity of the example, we will analyze the only numerical indicator - the sales volume in kg. There are the following key data dimensions: product, customer, and carrier (shipping company). In addition, there are several additional data dimensions that are characteristic of a product: type, brand, category, supplier, and customer: type. These data are collected on the References sheet. In practice, there can be many more such measurements.

Sheet Example contains a standard data analysis tool - AutoFilter. Looking at the example of filling out the table, it is obvious that the data on sales by dates (they are arranged in columns) lend themselves to normal analysis. In addition, using the autofilter, you can try to summarize the data by combinations of one or more key criteria. There is absolutely no information about brands, categories and types. It is not possible to group data with automatic summarization by a specific key (for example, by customers). In addition, the set of dates is fixed, and it will not be possible to view the summary information for a certain period, for example, 3 days, automatically.

In general, the presence of a predefined date location in this example is the main drawback of the table. By arranging the dates in columns, we kind of predefined the dimension of this table, thus depriving ourselves of the ability to use analysis using pivot tables.

First, you need to get rid of this drawback - i.e. remove the predefined location of one of the dimensions of the original data. Example of a valid table - sheet Sales.

The table has the form of an information input log. Here, date is a peer dimension of the data. It should also be noted that for the subsequent analysis in pivot tables, the relative position of the rows relative to each other (in other words, sorting) is completely indifferent. Records in relational databases have these properties. It is on the analysis of large volumes of databases that the interface of pivot tables is primarily focused. Therefore, you must adhere to these rules when working with a data source in the form of cell ranges. At the same time, no one prohibits the use of Excel interface tools in the work - pivot tables analyze only data, and formatting, filters, grouping and sorting of source cells can be arbitrary.

From auto filter to summary report

Theoretically, on the data of the Sales sheet, it is already possible to analyze in three dimensions: goods, customers and carriers. There is no data on the properties of products and buyers on this sheet, which, accordingly, will not allow them to be shown in the pivot table. In normal PivotTable creation mode for source data, Excel does not allow data from multiple tables to be linked by specific fields. You can work around this limitation programmatically - see an example supplement to this article on our website. In order not to resort to software methods of information processing (especially since they are not universal), additional characteristics should be added directly to the journal entry form - see the SalesAnalysis sheet.

The use of VLOOKUP functions makes it easy to supplement the original data with missing characteristics. Now, using AutoFilter, you can analyze data in different dimensions. But the problem of groupings remains unresolved. For example, tracking the amount only by brands for certain dates is quite problematic. If you limit yourself to Excel formulas, then you need to build additional selections using the SUMIF function.

Now let's see what possibilities the pivot table interface provides. On the sheet CodeAnalysis several reports were built based on a range of cells with sheet data SalesAnalysis.

The first analysis table is built through the Excel 2007 interface Ribbon \ Insert \ Pivot Table(in Excel 2000-2003 the menu Data \ Pivot Table).

The second and third tables were created through copying and subsequent customization. The data source for all tables is the same. You can check this by changing the original data, then you need to update the data of the summary reports.

From our point of view, the advantages in information clarity are obvious. You can swap filters, columns and rows, and hide certain groups of values for any dimension, use manual drag and drop, and auto sort.

Properties and formatting

In addition to the direct display of data, there is a large set of options for displaying the appearance of pivot tables. Extra data can be hidden using filters. For a single element or field, it is easier to use the context menu item Delete(in version 2000-2003 Hide).

It is also advisable to set the display of other elements of the pivot table not through formatting the cell, but through setting the field or element of the pivot table. To do this, move the mouse pointer to the desired element, wait for the appearance of a special cursor shape (in the form of an arrow), then select the selected element through a single click. After selection, you can change the view through the ribbon, context menu, or call the standard cell format dialog:

In addition, Excel 2007 introduces many predefined PivotTable display styles:

Notice that the control filters and drag areas are active in the chart.

Access to external data

As already noted, perhaps the greatest effect of using pivot tables can be obtained when accessing data from external sources - OLAP cubes and database queries. Such sources usually store large amounts of information and also have a predefined relational structure that makes it easy to define dimensions of multidimensional data (pivot table fields).

Excel supports many types of external data sources:

The greatest effect from the use of external sources of information can be achieved by using automation tools (VBA programs) both for obtaining data and for their preprocessing in pivot tables.

Analytical Analytical Processing (OLAP) is a technology that is used to organize big business databases and support business intelligence. OLAP databases are divided into one or more cubes, and each cubes are organized by the cube administrator to match how the data is retrieved and analyzed to make it easier to create and use the PivotTable and PivotChart reports that you need.

What is Business Intelligence?

A business analyst often wants to get a bigger picture of a business in order to view broader trends based on aggregated data, as well as view trends broken down into any number of variables. Business intelligence is the process of extracting data from an OLAP database and analyzing that data to provide information that can be used to make informed business decisions and take action. For example, with OLAP and Business Intelligence, you can answer the following questions about business data.

How does the total sales of all products in 2007 compare to sales from 2006?

How does this compare to a date and time for a favorable period over the past five years?

How much money did customers spend on 35 in the past year, and how has this behavior changed over time?

How many products were sold in two specific countries / regions this month versus the same month last year?

For each customer age group What is the breakdown of ROI (both margin percentage and total) by product category?

Search for top and bottom sellers, distributors, suppliers, customers, partners and clients.

What is Online Analytical Processing (OLAP)?

Online Analytical Processing (OLAP) databases simplify business intelligence queries. OLAP is a database technology optimized for queries and reports, not for transaction processing. The data source for OLAP is online transaction processing (OLTP) databases, which are typically stored in data warehouses. OLAP data is extracted from this historical data and combined into structures that allow for complex analysis. OLAP data is also organized hierarchically and stored in cubes rather than tables. It is a sophisticated technology that uses multidimensional structures to provide fast access to data for analysis. In this organization, for a PivotTable or PivotChart report, you can easily display high-level summary data, such as total sales for an entire country or region, and display information about sites where sales are particularly high or weak.

OLAP databases are designed to speed up data loading. Because the OLAP Server, and not Microsoft Office Excel, calculates the summarized values, less data needs to be sent to Excel when you create or modify a report. This approach allows you to work with a larger amount of raw data than if the data was organized in a traditional database, where Excel retrieves all individual records and calculates the summarized values.

OLAP databases contain two main types of data: measures, which are numerical data, quantities and averages, which are used to make informed business decisions, and dimensions, which are categories used to organize those measures. OLAP databases help you organize your data with many levels of detail, using the same categories you know for analyzing data.

The following sections describe each component in detail below.

Cubic A data structure that aggregates measures by level and hierarchy for each dimension that you want to analyze. Cubes combine multiple dimensions such as time, geography, and product lines with totals such as sales and inventory. Cubes are not "Cubes" in the strict mathematical sense, as they do not necessarily have the same sides. However, they represent Apt's metaphor for a complex concept.

Measurements A set of values in a cube based on a column in the cube fact table, which is usually a numeric value. Measures are central values in Cube that are preprocessed, processed and analyzed. The most common examples are sales, revenues, revenues, and costs.

Member An item in a hierarchy that represents one or more occurrences of data. An item can be either unique or non-unique. For example, 2007 and 2008 represent unique members at the year level of the time dimension, while January represents non-unique members at the month level because there is more than one January in the time dimension because it contains data for more than one year.

Calculated member A dimension member whose value is calculated at run time using an expression. Calculated member values can be derived from other member values. For example, a calculated item "profit" can be determined by subtracting the item value as well as the cost from the item value, sales.

dimension A collection of one or more ordered cube level hierarchies that the user understands and uses as the basis for data analysis. For example, a geographic dimension might include country / region, state / district, and city levels. In addition, a time dimension can include a hierarchy with the year, quarter, month, and day levels. In a PivotTable report or PivotChart report, each hierarchy becomes a set of fields that you can expand and collapse to show lower or higher levels.

Hierarchy A logical tree structure that orders the members of a dimension such that each member has one parent and zero or more children. A child is a member of an earlier group in the hierarchy that is directly related to the current member. For example, in a time hierarchy containing the quarter, month, and day levels, January is a child of Qtr1. A parent is a member at a lower level in the hierarchy that is directly related to the current member. The parent value is usually the consolidation of the values of all children. For example, in a time hierarchy containing the levels "quarter", "month", and "day", Qtr1 is the parent of January.

Level In a hierarchy, data can be organized into lower and higher levels of detail, such as years, quarters, months, and daily levels in the time hierarchy.

OLAP functions in Excel

Retrieving OLAP data You can connect to OLAP data sources in the same way as you can connect to other external data sources. You can work with databases created with Microsoft SQL Server OLAP 7.0, Microsoft SQL Server Analysis Services 2000 and Microsoft SQL Server Analysis Services 2005, and Microsoft OLAP server products. Excel can also work with third-party OLAP products that are compatible with OLE-DB for OLAP.

OLAP data can only be displayed as a PivotTable report or PivotChart report, or as a worksheet function converted from a PivotTable report, but not as an external data range. You can save OLAP PivotTable and PivotChart reports in report templates, and you can create Office Data Connection (ODC) files to connect to OLAP databases for OLAP queries. When you open the .odc file in Excel, a blank PivotTable report is displayed, ready to be hosted.

Creating cube files for offline use You can create a standalone cube file (. Cub) with a subset of data from an OLAP server database. Offline cube files are used to work with OLAP data when you are not connected to a network. By using a cube, you can work with more data in a PivotTable report or PivotChart report than you would otherwise, and speed up data retrieval. You can only create Cube files if you are using an OLAP provider such as Microsoft SQL Analysis Services 2005 that supports this feature.

Server actions A server action is an optional function that an OLAP cube administrator can define on a server that uses a cube item or measure as a parameter in a query to retrieve information in the cube, or to launch another application, such as a browser. Excel supports URLs, reports, rowsets, drill down and drill down on verbose server actions, but does not support its own native operator and dataset.

KPI A KPI is a special calculated measure, defined on the server, that tracks “KPIs”, including status (the current value is a specific number). and trend (values over time). When displayed, the server can send appropriate icons, similar to the new Excel icon, to line up above or below status levels (for example, for a stop icon), and scroll the value up or down (for example, a directional arrow icon).

Server formatting Cube administrators can create measures and calculated members using color formatting, font formatting, and conditional formatting rules, which can be assigned as a corporate standard business rule. For example, the server-side format for revenue might be a numeric currency format, the cell color is green if the value is greater than or equal to 30,000, and red if the value is less than 30,000, and the font style is bold if the value is less than 30,000, and if the value is positive. - normal. is greater than or equal to 30,000. More information can be found.

Office interface language A cube administrator can define translations for data and errors on the server for users who need to view PivotTable information in a different language. This function is defined as a file connection property, and the user's computer locale and country must match the interface language.

Software components required to access OLAP data sources

OLAP provider To set up OLAP data sources for Excel, you need one of the following OLAP providers.

Microsoft OLAP Provider Excel includes a data source driver and client software for accessing databases created with Microsoft SQL Server olap version 7.0, Microsoft SQL Server olap version 2000 (8.0), and Microsoft SQL Server Analysis services version 2005 (9 , 0).

Third party OLAP providers Other OLAP products require additional drivers and client software to be installed. To use Excel's OLAP data capabilities, the third-party product must comply with the OLE-DB for OLAP standard and be compatible with Microsoft Office. For information on installing and using a third-party OLAP provider, contact your system administrator or OLAP product provider.

Server Databases and Cube Files Excel OLAP client software supports connections to two types of OLAP databases. If the database on the OLAP server is online, you can get raw data from it directly from it. If you have an offline cube file that contains OLAP data or a cube definition file, you can connect to that file and get raw data from it.

Data sources The data source provides access to all data in an OLAP database or offline cube file. Once you create an OLAP data source, you can base reports on it and return OLAP data to Excel as a PivotTable report or PivotChart report, and as a worksheet function converted from a PivotTable report.

Microsoft Query With Query, you can retrieve data from an external database such as Microsoft SQL or Microsoft Access. You do not need to use a query to retrieve data from an OLAP PivotTable associated with a cube file. Additional information .

Differences in OLAP and Non-OLAP Features of Source Data

If you are working with PivotTable reports and PivotCharts from OLAP source data and other types of source data, you will see some differences in functionality.

Retrieving data OLAP Server returns new data to Excel whenever the report layout changes. With other types of external data sources, you query all the source data in one go, or you can specify parameters for the query only when displaying different elements of the report filter fields. In addition, you have several more options for updating the report.

In reports based on OLAP raw data, the report filter field options are not available, the background query is not available, and the memory optimization option is not available.

Note: The memory optimization option is also not available for OLEDB data sources and PivotTable reports based on a range of cells.

Field types OLAP source data. dimension fields can only be used as rows (rows), columns (category), or page fields. Measure fields can only be used as value fields. For other types of source data, all fields can be used anywhere in the report.

Access to detailed data For OLAP source data, the server determines the available levels of detail and calculates summary values, so the detail records that make up the summary values may not be available. However, the server can provide property fields that you can display. Other types of source data do not have property fields, but you can display basic information for data field and member values, and display non-data members.

OLAP report filter fields may not have all elements, and the command Show report filter pages unavailable.

Initial sort order For OLAP source data, items are first displayed in the order in which they are returned by the OLAP server. You can sort or manually change the order of the items. For other types of source data, the new report items are first sorted by item name in ascending order.

Nimi OLAP servers provide summary values directly for the report, so you cannot change the summary functions for value fields. For other types of source data, you can change the aggregate function for the value field and use multiple summary functions for the same value field. You cannot create calculated fields and calculated members in reports with OLAP source data.

Subtotals In reports with OLAP raw data, you cannot change the summary function for subtotals. With other types of source data, you can change the subtotal functions for subtotals and show or hide subtotals for all row and column fields.

For OLAP source data, you can include or exclude hidden items when calculating subtotals and grand totals. For other types of source data, you can include hidden items in the report filter fields in the subtotals, but hidden items in other fields will be excluded by default.

In a standard pivot table, the source data is stored on the local hard drive. This way, you can always manage and reorganize them, even without having access to the network. But this does not in any way apply to OLAP Pivot Tables. In OLAP PivotTables, the cache is never stored on the local hard drive. Therefore, immediately after disconnecting from the local network, your pivot table will become unusable. You cannot move any fields in it.

If you still need to analyze OLAP data after disconnecting from the network, create an offline data cube. An offline data cube is a separate file that is a pivot table cache and stores OLAP data that is viewed after disconnecting from the local network. OLAP data copied to the pivot table can be printed, on the website http://everest.ua it is described in detail.

To create an offline data cube, first create an OLAP PivotTable. Position the cursor within the PivotTable and click the OLAP Tools button on the Tools contextual tab, which is part of the PivotTable Tools contextual tab group. Select the Offline OLAP command (Figure 9.8).

Rice. 9.8. Create an offline data cube

The Offline OLAP Data Cube Settings dialog box appears. Click on the Create Offline Data File button. You have launched the New Data Cube File Wizard. Click the Next button to continue the procedure.

First, you need to specify the dimensions and levels that will be included in the data cube. In the dialog box, you must select the data to be imported from the OLAP database. The idea is to specify only those dimensions that will be needed after disconnecting the computer from the local network. The more dimensions you specify, the larger the offline data cube will be.

Click the Next button to proceed to the next dialog box of the wizard. It gives you the ability to specify members or data items that will not be included in the cube. In particular, you do not need the Internet Sales-Extended Amount measure, so the check box for it will be cleared in the list. A cleared check box indicates that the specified item will not be imported and take up extra space on the local hard drive.

In the last step, provide the location and name of the data cube. In our case, the cube file will be named MyOfflineCube.cub and will be located in the Work folder.

Data cube files have the extension .cub

After some time, Excel will save the offline data cube in the specified folder. To test it, double-click on the file, which will automatically generate an Excel workbook that contains a pivot table associated with the selected data cube. Once created, you can distribute the offline data cube to all interested users who are working in offline LAN mode.

Once connected to the local network, you can open the offline data cube file and update it, as well as the corresponding data table. The main principle is that a standalone data cube is used only to work when the local network is disconnected, but it must be updated after the connection is restored. Attempting to update the offline data cube after disconnecting the connection will result in a failure.

Working with offline cube files

An offline cube file (. cub) stores data in the form of an OLAP (Online Analytical Processing) cube. This data can represent part of an OLAP database on an OLAP server, or it can be generated independently of an OLAP database. Use an offline cube file to continue working with PivotTable and PivotChart reports when the server is unavailable or when you are offline.

Safety note: Be careful when using or distributing an offline cube file that contains sensitive or personal data. It is recommended that you save the data in a workbook instead of a cube file so that you can control access to the data using Rights Management. For more information, see Information Rights Management in Office.

When working with a PivotTable or PivotChart report based on OLAP server source data, you can use the Offline Cube Wizard to copy the source data into a separate offline cube file on your computer. These offline files require an OLAP Data Provider to support this capability, such as MSOLAP from Microsoft SQL Server Analysis Services installed on the computer.

Note: The creation and use of offline cube files from Microsoft SQL Server Analysis Services is governed by the terms and licensing of the Microsoft SQL Server installation. Review the related SQL Server version licensing information.

Working with the Standalone Cube Wizard

To create an offline cube file, you can select a subset of data in an OLAP database using the offline cube wizard and then save that subset. The report does not need to include all of the fields included in the file, nor do you need to select any of them and the data fields available in the OLAP database. To keep the file at a minimum, you can only include the data that you want to appear in the report. You can omit all dimensions, and for most types of dimensions, you can also exclude lower level information and top level items that you do not want to display. For all items that you include, the property fields available in the database for those items are also saved in an offline file.

Taking data offline and reconnecting it

To do this, you first need to create a PivotTable or PivotChart report based on the server database, and then create an offline cube file from the report. After that, you can switch the report between the server database and the offline file at any time. For example, if you are using a laptop for home and video trips, then reconnect the computer to the network.

The following are the basic steps to take to work offline with your data and then bring your data back online.

Create or open a PivotTable or PivotChart report based on OLAP data that you want to access offline.

Create an offline cube file on your computer. In chapter Create an offline cube file from an OLAP server database(later in this article).

Disconnect from the network and work with the offline cube file.

Go online and reconnect the cube file offline. Check out the section Reconnecting the offline cube file to the OLAP server database(below in this article).

Updating the offline cube file with new data and re-creating the offline cube file. Check out the section updating and re-creating the offline cube file(later in this article).

BLOG

Only high quality posts

What are Excel Pivot Tables and OLAP Cubes

Watch the video for the article:

OLAP Is English. online analytical processing, analytical technology for processing data in real time. In simple terms - a storage with multidimensional data (Cube), even simpler - just a database from which you can get data in Excel and analyze using the Excel tool - Pivot Tables.

Pivot tables Is a user interface for displaying multidimensional data. In other words, a special kind of tables that can be used to make almost any report.

To make it clear, let's compare the "Normal Table" with the "Pivot Table"

Normal table:

Summary table:

The main difference Pivot tables Is the presence of a window " PivotTable Field List", From which you can select the required fields and get any table automatically!

How to use

Open the Excel file that is connected to the OLAP cube, for example "BIWEB":

Now, what does this mean and how do you use it?

Drag the required fields to get, for example, a table like this:

« Plus signs»Allow you to drill down into the report. In this example, "Brand" is detailed to "Abbreviated names", and "Quarter" to "Month", ie. So:

Analytical functions in Excel (cubes functions)

Microsoft is constantly adding new capabilities to Excel in terms of data analysis and visualization. Working with information in Excel can be represented as relatively independent three layers:

"Correctly" organized source data
mathematics (logic) of data processing
data presentation

Rice. 1. Data analysis in Excel: a) source data, b) measure in Power Pivot, c) dashboard; to enlarge the image, right-click on it and select Open picture in a new tab

Download note in Word or pdf format, examples in Excel format

Cube functions and pivot tables

Pivot tables are the simplest and at the same time very powerful data presentation tool. They can be built on the basis of data contained: a) in an Excel worksheet, b) an OLAP cube, or c) a Power Pivot data model. In the last two cases, in addition to the pivot table, you can use analytical functions (cubes functions) to generate a report on an Excel sheet. Pivot tables are simpler. The functions of cubes are more complex, but they provide more flexibility, especially in the design of reports, so they are widely used in dashboards.

What follows is for cube formulas and pivot tables based on the Power Pivot model and, in a few cases, on OLAP cubes.

An easy way to get the functions of cubes

When (if) you started learning VBA code, you learned that the easiest way to get the code is by recording a macro. Then you can edit the code, add loops, checks, etc. Similarly, the easiest way to get a set of cubes' functions is to transform the pivot table (Fig. 2). Stand on any cell of the pivot table, go to the tab Analysis, click on the button Funds OLAP, and press Convert to Formulas.

Rice. 2. Converting a pivot table to a set of cube functions

The numbers will be preserved, not values, but formulas that pull data from the Power Pivot Data Model (Figure 3). You can format the resulting table. Including, you can delete and insert rows and columns inside the table. The slice remains, and it affects the data in the table. When the original data is updated, the numbers in the table are also updated.

Rice. 3. Table based on cube formulas

CUBEVALUE () function

This is perhaps the main function of cubes. It is the equivalent of the area The values pivot table. CUBEVALUE retrieves data from a Power Pivot cube or model, and displays it outside of the PivotTable. This means that you are not limited to the limits of the PivotTable and can create reports with countless possibilities.

Writing a formula from scratch

You don't need to convert the finished PivotTable. You can write any cube formula from scratch. For example, the following formula is entered in cell C10 (Fig. 4):

Rice. 4. The CUBEVALUE () function in cell C10 returns bike sales for all years, just like in the pivot table

A little trick. To make it easier to read the cube formulas, it is desirable that only one argument is placed on each line. You can shrink the Excel window. To do this, click on the icon Minimize to window located in the upper right corner of the screen. And then adjust the size of the window horizontally. An alternative option is to force the formula text to wrap to a new line. To do this, in the formula bar, place the cursor in the place where you want to transfer and press Alt + Enter.

Rice. 5. Minimize the window

Function syntax CUBEVALUE ()

Excel Help is absolutely accurate and completely useless for beginners:

CUBEVALUE (connection; [member_expression1]; [member_expression2]; ...)

Connection- required argument; a text string representing the name of the connection to the cube.

Member_expression- optional argument; a text string representing an MDX expression that returns an item or tuple in a cube. In addition, "member_expression" can be a set, defined using the CUBESET function. Use member_expression as a slice to define the portion of the cube for which you want to return an aggregated value. If no measure is specified in member_expression, the default measure for that cube will be used.

Before moving on to explaining the syntax of the CUBEVALUE function, a few words about cubes, data models, and the cryptic tuple.

Some Information About OLAP Cubes and Power Pivot Data Models

OLAP data cubes ( O n l ine A nalytical P rocessing - operational data analysis) were developed specifically for analytical processing and rapid extraction of data from them. Imagine a three-dimensional space where time periods, cities and goods are plotted along the axes (Figure 5a). In the nodes of such a coordinate grid, the values of various measures are located: sales volume, profit, costs, number of units sold, etc. Now imagine that there are tens, or even hundreds of dimensions ... and there are also a lot of measures. This will be the OLAP multidimensional cube. Creating, configuring and keeping OLAP cubes up to date is the business of IT professionals.

Rice. 5a. OLAP 3D Cube

Excel analytical formulas (cube formulas) extract the axis names (for example, Time), names of elements on these axes (August, September), values of measures at the intersection of coordinates. It is this structure that allows cube-based pivot tables and cube formulas to be so flexible and adaptable to user needs. Excel worksheet-based PivotTables do not use measures, so they are not as flexible for data analysis purposes.

Power Pivot is a relatively new feature of Microsoft. It is a built-in Excel and somewhat independent environment with a familiar interface. Power Pivot is vastly superior to standard PivotTables. However, developing cubes in Power Pivot is relatively easy and, most importantly, does not require an IT professional. Microsoft is implementing its slogan: "Business Intelligence - to the masses!". Although Power Pivot models are not 100% cubes, they can also be referred to as cubes (see Mark Moore's introductory course. Power Pivot and the larger edition by Rob Colley. DAX Formulas for Power Pivot for more information).

The main components of a cube are dimensions, hierarchies, levels, members (or members) and measures. Measurement - the main characteristic of the analyzed data. For example, product category, time period, sales geography. Dimension is what we can place on one of the axes of the pivot table. Each dimension, in addition to unique values, includes a member that aggregates all members of that dimension.

Measurements are plotted based on hierarchies... For example, a product category can be broken down into subcategories, then into models, and finally into product names (Figure 5b). The hierarchy allows you to create summary data and analyze it at different levels of the structure. In our example, the hierarchy Category includes 4 Level.

The elements(individual members) are present at all levels. For example, at the Category level, there are four elements: Accessories, Bikes, Clothing, Components. Other levels have their own elements.

Measures Are calculated values such as sales. Cubed measures are stored in their own dimension called (see Figure 9 below). Measures have no hierarchies. Each measure calculates and stores a value for all dimensions and all members, and is sliced based on which dimension members we place on the axis. They also say which coordinates will be set, or which filter context will be set. For example, in Fig. 5а in each small cube the same measure is calculated - Profit. And the value returned by the measure depends on the coordinates. On the right in Figure 5a it is shown that Profit (in three coordinates) in Moscow in October on apples = 63,000 r. The measure can be interpreted as one of the dimensions. For example, in Fig. 5a instead of the axis Goods, place axis Measures with elements Volume of sales, Profit, Units Sold... Then each cell will be some value, for example, Moscow, September, sales volume.

Tuple- several elements of different dimensions, specifying coordinates along the axes of the cube, in which we calculate the measure. For example, in Fig. 5a Tuple= Moscow, October, apples. Also a valid tuple is Perm, apples. Another one is apples, August. Dimensions not included in a tuple are implicitly present in a tuple, and are represented by a default member. Thus, a cell in a multidimensional space is always defined by a complete set of coordinates, even if some of them are omitted in the tuple. You cannot include two members of the same dimension in a tuple, the syntax will not allow. For example, an invalid motorcade Moscow and Perm, apples. To implement such an MDX you need a set of two tuples: Moscow and apples + Perm and apples.

Set of elements- several elements of the same dimension. For example, apples and pears. A set of tuples- multiple tuples, each of which consists of the same dimensions in the same sequence. For example, a set of two tuples: Moscow, apples and Perm, bananas.

Auto-completion to help

Let's go back to the syntax of the CUBEVALUE function. Let's use auto-completion. Start typing the formula in the cell:

Excel will suggest all connections available in the Excel workbook:

Rice. 6. The connection to the Power Pivot Data Model is always called ThisWorkbookDataModel

Rice. 7. Connections to cubes

Let's continue entering the formula (in our case, for the data model):

Autocompletion will suggest all available tables and measures of the data model:

Rice. 8. Available first-level elements - table names and set of measures (highlighted)

Select the icon Measures... Put a point:

CUBEVALUE ("ThisWorkbookDataModel"; ".

Auto-completion will suggest all available measures:

Rice. 9. Available Second Level Items in the Measure Set

Choose a measure. Add quotation marks, closing parenthesis, press Enter.

CUBEVALUE ("ThisWorkbookDataModel"; ".")

Rice. 10. Formula CUBEVALUE in an Excel cell

Similarly, you can add a third argument to your formula:

VBA in Excel Excel.PivotTable Object and Working with Pivot Tables and OLAP Cubes in Excel

10.8 Working with pivot tables (PivotTable object)

Excel.PivotTable object, programmatically working with pivot tables and OLAP cubes in Excel using VBA, PivotCache object, creating a pivot table layout

In the course of the operation of most enterprises, so-called raw data about activities are accumulated. For example, for a merchant, data on the sales of goods can be accumulated - for each purchase separately, for cellular enterprises - statistics of the load on base stations, etc. Very often, the management of an enterprise needs analytical information that is generated on the basis of raw information - for example, to calculate the contribution of each type of product to the company's income or the quality of service in the area of a given station. It is very difficult to extract such information from raw information: you need to execute very complex SQL queries that take a long time and often interfere with current work. Therefore, nowadays, more and more often, raw data is consolidated first in the data warehouse - the Data Warehouse, and then - in OLAP cubes, which are very convenient for interactive analysis. The easiest way to think of OLAP cubes is as multidimensional tables, in which instead of the standard two dimensions (columns and rows, as in ordinary tables), there can be a lot of dimensions. The term “cutaway” is commonly used to describe dimensions in a cube. For example, the marketing department may need information in terms of time, in regional terms, in terms of product types, in terms of sales channels, etc. Using cubes (as opposed to standard SQL queries) it is very easy to get answers to questions like “how many goods of this type were sold in the fourth quarter of last year in the Northwest region through regional distributors.

Of course, such cubes cannot be created in ordinary databases. OLAP cubes require specialized software. SQL Server ships with an OLAP database from Microsoft called Analysis Services. There are OLAP solutions from Oracle, IBM, Sybase, etc.

To work with such cubes, a special client is built into Excel. In Russian it is called Pivot table(on the graphical screen, it is accessible via the menu Data -> Pivot table), and in English - Pivot table... Accordingly, the object that this client represents is called a PivotTable. It should be noted that he knows how to work not only with OLAP cubes, but also with ordinary data in Excel tables or databases, but many possibilities are lost.

The PivotTable and PivotTable object are Panorama Software products that were acquired by Microsoft and integrated into Excel. Therefore, working with the PivotTable object is somewhat different from working with other Excel objects. Guessing what needs to be done is often difficult. Therefore, it is recommended to actively use the macro recorder to get hints. At the same time, when working with pivot tables, users often have to perform the same repetitive operations, so automation is necessary in many situations.

What does it look like to programmatically work with a pivot table?

The first thing we need to do is create a PivotCache object that will represent the set of records retrieved from the OLAP source. Very conventionally, this PivotCache object can be compared to a QueryTable. Only one PivotCache object can be used for each PivotTable. The PivotCache object is created using the Add () method of the PivotCaches collection:

Dim PC1 As PivotCache

Set PC1 = ActiveWorkbook.PivotCaches.Add (xlExternal)

PivotCaches is a standard collection, and of the methods that deserve detailed consideration, only the Add () method can be named in it. This method takes two parameters:

SourceType- required, specifies the type of data source for the pivot table. You can specify to create a PivotTable based on a range in Excel, data from a database, in an external data source, another PivotTable, and so on. In practice, it usually makes sense to use OLAP only when there is a lot of data - accordingly, you need a specialized external storage (for example, Microsoft Analysis Services). In this situation, xlExternal is selected.
SourceData- mandatory in all cases, except those when the value of the first parameter is xlExternal. Strictly speaking, it defines the data range on the basis of which the PivotTable will be created. Usually takes a Range object.

The next task is to configure the parameters of the PivotCache object. As already mentioned, this object is very similar to QueryTable, and has a very similar set of properties and methods. Some of the most important properties and methods:

ADOConnection- the ability to return an ADO Connection object that is automatically created to connect to an external data source. Used for additional configuration of connection properties.
Connection- works in exactly the same way as the property of the same name of the QueryTable object. It can accept a connection string, a ready-made Recordset object, a text file, a Web request. file Microsoft Query. Most often, when working with OLAP, a connection string is written directly (since it makes little sense to receive a Recordset object, for example, to change data - OLAP data sources are almost always read-only). For example, setting this property to connect to a Foodmart database (Analysis Services sample database) on a LONDON server might look like this:

PC1.Connection = "OLEDB; Provider = MSOLAP.2; Data Source = LONDON1; Initial Catalog = FoodMart 2000"

properties CommandType and CommandText similarly describe the type of command that is sent to the database server and the text of the command itself. For example, to access the Sales cube and get it entirely into the cache on the client, you can use code like

property LocalConnection allows you to connect to a local cube (* .cub file) created by Excel. Of course, it is not recommended to use such files for working with "production" volumes of data - only for the purpose of creating layouts, etc.
property MemoryUsed returns the amount of RAM used by the PivotCache. Returns 0 if a PivotTable based on this PivotCache has not yet been created or opened. Can be used for checks if your application will run on weak clients.
property OLAP returns True if PivotCache is connected to an OLAP server.
OptimizeCache- the ability to optimize the cache structure. Initial loading of data will take longer, but then the speed may increase. For OLE DB sources does not work.

The rest of the properties of the PivotCache object are the same as those of the QueryTable object, and therefore will not be discussed here.

The main method of the PivotCache object is the CreatePivotTable () method. Using this method, the next stage is performed - the creation of a pivot table (PivotTable object). This method takes four parameters:

TableDestination is the only required parameter. Takes in a Range object, at the top-left corner of which the pivot table will be placed.
TableName- the name of the pivot table. If not specified, a name of the "PivotTable1" type will be automatically generated.
ReadData- if set to True, then the entire content of the cube will be automatically placed in the cache. You need to be very careful with this parameter, as its incorrect use can dramatically increase the load on the client.
DefaultVersion- this property is usually not specified. Determines the version of the pivot table that is being created. The most recent version is used by default.

Creating a pivot table in the first cell of the first sheet of the workbook might look like this:

PC1.CreatePivotTable Range ("A1")

We have created a pivot table, but immediately after creation it is empty. It provides four areas in which you can place fields from the source (on the graphical screen, all this can be configured either using the window PivotTable Field List- it opens automatically, or using the button Layout on the last screen of the Create PivotTable Wizard):

column area- it contains those dimensions ("section" in which the data will be analyzed), the members of which are fewer;
line area- those dimensions with more members;
page area- those measurements for which you only need to filter (for example, show data only for a certain region or only for a certain year);
data area- Strictly speaking, the central part of the table. Those numerical data (for example, the amount of sales), which we analyze.

Relying on the user to correctly place items in all four areas is difficult. In addition, it may take some time. Therefore, it is often necessary to programmatically arrange the data in the pivot table. This operation is performed using the CubeField object. The main property of this object is Orientation, it determines where this or that field will be located. For example, let's place the Customers dimension in the column area:

PT1.CubeFields ("") .Orientation = xlColumnField

Then - measuring Time into the area of strings:

PT1.CubeFields ("") .Orientation = xlRowField

Then - measuring Product into the page area:

PT1.CubeFields ("") .Orientation = xlPageField

And finally, the measure (numerical data for analysis) Unit Sales:

PT1.CubeFields ("."). Orientation = xlDataField

Working with an OLAP cube in MS Excel
1. Get permission to access the SQL Server Analysis Services OLAP cube (SSAS)
2. MS Excel 2016/2013/2010 must be installed on your computer (MS Excel 2007 is also possible, but it is not convenient to work in it, and the functionality of MS Excel 2003 is quite poor)
3. Open MS Excel, launch the wizard for configuring the connection to the analytical service:

3.1 Specify the name or IP-address of the current OLAP server (sometimes you need to specify the open port number, for example, 192.25.25.102:80); domain authentication is used:

3.2 Select a multidimensional database and an analytical cube (if you have access rights to the cube):

3.3 The settings for connecting to the analytical service will be saved in an odc file on your computer:

3.4 Select the type of report (pivot table / graph) and indicate the location for its placement:

If a connection has already been created in the Excel workbook, then it can be used again: main menu "Data" -> "Existing connections" -> select the connection in this workbook -> insert the pivot table into the specified cell.
4. Successfully connected to the cube, you can start interactive data analysis:

Before starting interactive data analysis, it is necessary to determine which of the fields will participate in the formation of rows, columns and filters (pages) of the pivot table. In general, the pivot table is three-dimensional, and we can assume that the third dimension is located perpendicular to the screen, and we observe sections parallel to the plane of the screen and determined by which "page" is selected for display. Filtering can be done by dragging and dropping the corresponding dimension attributes into the filter area of the report. Filtering limits cube space, reducing the load on the OLAP server, so it is preferable to install the necessary filters first... Then you place the dimension attributes in the row, column, and measure areas in the data area of the PivotTable.

Each time the PivotTable changes, an MDX statement is automatically sent to the OLAP Server and returns the data. The larger and more complex the amount of processed data, the calculated indicators, the longer the query execution time. You can cancel the execution of the request by pressing the key Escape... The last performed operations can be undone (Ctrl + Z) or redo (Ctrl + Y).

Typically, for the most commonly used combinations of dimension attributes, the cube stores pre-calculated aggregated data, so the response time for such queries is several seconds. However, it is impossible to calculate all possible combinations of aggregations, since this can take a lot of time and storage space. Executing massive queries to data at the granularity level can require significant server computing resources, so their execution time can be long. After reading data from disk drives, the server places it in the RAM cache, which allows subsequent such requests to be executed instantly, since the data will be fetched from the cache.

If you think that your request will be used frequently and the execution time is unsatisfactory, you can contact the analytical development support service to optimize the execution of the request.

After placing the hierarchy in the row / column area, it is possible to hide individual levels:

Key attributes(less often - for attributes higher in the hierarchy) dimensions can have properties - descriptive characteristics that can be displayed both in tooltips and in the form of fields:

If you want to display several field properties at once, you can use the corresponding dialog list:

User-defined kits

Excel 2010 introduces the ability to interactively create your own (user-defined) sets from dimension members:

Unlike sets created and stored centrally on the cube side, custom sets are stored locally in the Excel workbook and can be used later:

Advanced users can create sets using MDX constructs:

Setting pivot table properties

Through the item "Pivot table parameters ..." of the context menu (right-clicking within the pivot table), you can customize the pivot table, for example:
- tab "Output", parameter "Classic layout of the pivot table" - the pivot table becomes interactive, you can drag the fields (Drag & Drop);
- "Output" tab, "Show items without data in rows" parameter - empty rows will be displayed in the pivot table that do not contain any indicator values for the corresponding dimension items;
- "Layout and Format" tab, "Preserve cell formatting when updating" parameter - in the pivot table, you can override and preserve the format of cells when updating data;

Create pivot charts

For an existing OLAP pivot table, you can create a pivot chart - pie, bar, bar, graph, scatter and other types of charts:

In this case, the pivot chart will be synchronized with the pivot table - when you change the composition of indicators, filters, dimensions in the pivot table, the pivot table is also updated.

Creation of dashboards

Select the original pivot table, copy it to the clipboard (Ctrl + C) and paste a copy of it (Ctrl + V), in which we will change the composition of indicators:

To manage multiple pivot tables at the same time, insert a slice (new functionality available since MS Excel 2010). Let's connect our Slicer to pivot tables - right-click within the slicer, select "Pivot table connections ..." from the context menu. It should be noted that there can be multiple pivot panels that can serve pivot tables on different sheets at the same time, which allows for the creation of coordinated dashboards.

Slicer panels can be customized: you need to select the panel, then see the items "Size and properties ...", "Slicer settings", "Assign a macro" in the context menu, activated by the right mouse click, or the "Options" item of the main menu. So, it is possible to set the number of columns for the elements (buttons) of the slice, the sizes of the slice buttons and the panel, to define the color scheme and the design style from the existing set for the slice (or create your own style), to define your own panel title, to assign a program macro through which you can expand panel functionality.

Executing an MDX query from Excel

First of all, you need to perform the DRILLTHROUGH operation on some key, i.e. go down to the detailed data (the detailed data is displayed on a separate sheet), and open the list of connections;
Open connection properties, go to the "Definition" tab;
Select the default command type, and place the prepared one in the command text field. MDX inquiry;
When the button is pressed after checking the correct syntax of the query and the availability of the appropriate access rights, the query will be executed on the server, and the result will be presented in the current sheet in the form of a regular flat table.
You can view the text of the MDX query generated by Excel by installing a free add-on that also provides other additional functionality.

Translation into other languages

The analytical cube supports localization into Russian and English (if necessary, localization into other languages is possible). Translations apply to the names of dimensions, hierarchies, attributes, folders, measures, as well as elements of individual hierarchies if there are translations for them on the side of accounting systems / data storage. To change the language, you need to open the connection properties and add the following option to the connection string:
Extended Properties = "Locale = 1033"
where 1033 is English localization
1049 - localization into Russian

Additional Excel extensions for Microsoft OLAP

The ability to work with Microsoft OLAP cubes will increase if you use additional extensions, for example, OLAP PivotTable Extensions, thanks to which you can use a quick search by dimension: