About Sparse structures
  • 13 Jun 2024



A Cube can store a value for any combination (cell) of members of its dimensions. For example, a Cube that has three dimensions in its structure includes a cell for every combination of members of those three dimensions.

The virtual size of a Cube is the maximum number of cells it can have. For example, if a Cube has the dimensions Month, Product, and Customer in its structure and they contain, respectively, 24 months, 500 products, and 1,000 customers, the virtual size of the Cube is 24 x 500 x 1,000 = 12,000,000 cells.
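As a quick check of the arithmetic above, the virtual size can be computed as the product of the member counts. This is only an illustrative sketch; the `virtual_size` function and the `dimension_sizes` dictionary are names made up for this example and are not part of Board.

```python
from math import prod

# Member counts per dimension, taken from the example above.
dimension_sizes = {"Month": 24, "Product": 500, "Customer": 1_000}


def virtual_size(sizes):
    """Return the maximum number of cells: the product of all member counts."""
    return prod(sizes.values())


print(virtual_size(dimension_sizes))  # 12000000
```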

Normally, after data is loaded into the Cube, only a small fraction of its cells actually contain data. The ratio between the number of cells containing data and the total number of cells of the Cube (obtained by multiplying together the number of members of each dimension) is the Cube density.
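The density ratio can be sketched in a few lines. The figure of 60,000 populated cells below is a hypothetical value chosen for illustration, and `cube_density` is not a Board function.

```python
from math import prod


def cube_density(populated_cells, dimension_sizes):
    """Ratio of cells that contain data to the total (virtual) number of cells."""
    total_cells = prod(dimension_sizes)
    if total_cells == 0:
        raise ValueError("every dimension needs at least one member")
    return populated_cells / total_cells


# Hypothetical: 60,000 populated cells in the 24 x 500 x 1,000 Cube.
density = cube_density(60_000, [24, 500, 1_000])
print(f"{density:.2%}")  # 0.50%
```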

Sparse management

Board does not create a cell for every possible combination of members of a Cube's dimensions. It uses several compression methods, the most efficient of which is sparse management.

To better understand how it works, let's imagine a food and beverages company that sells a large number of products to a large number of customers, such as small retail shops, restaurants, hotels, catering companies, hospitals, schools, and the like.
In this scenario, a typical customer would probably order only a small subset of the products in stock, and that subset may vary depending on the type of customer: a hospital might buy different products from a hotel or a school. If not every customer buys every product (and, conversely, not every product is bought by every customer), then we say that the Customer and Product Entities are sparse. If customer C1 buys product P1, then the C1-P1 combination is called a "sparse combination".

A sparse structure is a combination of 2 or more hierarchically unrelated Entities for which the number of distinct combinations of existing values is much smaller than the total number of potential combinations.

[Figure: Dense - Sparse structure]

Sparse combinations are created when data is loaded into Cubes and sparse structures are defined when creating Cube versions. When a sparse structure is defined, disk space is allocated only for the sparse combinations created during the loading process, so disk space overhead is minimal.
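The idea of allocating space only for combinations that appear during loading can be illustrated with a small sketch. This is not Board's actual storage format: the class name, the dictionaries, and the choice to return 0.0 for an empty cell are all assumptions made for illustration.

```python
class SparseCubeSketch:
    """Toy Cube with a dense Month dimension and a sparse Customer-Product structure."""

    def __init__(self):
        # (customer, product) -> index; an entry is created only when data is loaded.
        self.combinations = {}
        # (month, combination index) -> value; only populated cells are stored.
        self.cells = {}

    def load(self, month, customer, product, value):
        """Store a value, creating the sparse combination if it is new."""
        key = (customer, product)
        index = self.combinations.setdefault(key, len(self.combinations))
        self.cells[(month, index)] = value

    def read(self, month, customer, product):
        """Return the stored value, or 0.0 (an assumption) for an empty cell."""
        index = self.combinations.get((customer, product))
        if index is None:
            return 0.0
        return self.cells.get((month, index), 0.0)


cube = SparseCubeSketch()
cube.load("2024-01", "C1", "P1", 120.0)
cube.load("2024-02", "C1", "P1", 80.0)
cube.load("2024-01", "C2", "P3", 15.0)

print(len(cube.combinations))  # 2
```

Only two sparse combinations (C1-P1 and C2-P3) exist after loading, regardless of how many customers and products are defined.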

[Figure: sparse structure optimization]

To manage sparsity efficiently, administrators can monitor the structure of a Cube from the Cubes section. After selecting the appropriate Cube, click on the sparsity ANALYSIS tab to monitor sparse structures that contain very large numbers of combinations. Here you can spot which Cubes use large sparse structures and delete the unused sparse combinations in them, which optimizes hard disk space usage and speeds up interaction with the affected Cubes.

Sparse structures are shared among all Cubes: when a sparse structure is defined, it will also be used for any new Cube that has the same Entities as dimensions.

Time Entities cannot be included in a sparse structure.

Some degree of disk space compression also occurs on dimensions that are not part of a sparse structure.

By default, Board automatically puts all Entities in sparse mode as long as the product of Max item numbers stays below the 64-bit limit.

When Max item numbers are set to automatic, Board treats each affected Entity as having a Max item number greater than its current one, while still keeping the product below the 64-bit limit.

If the 64-bit pointer is not sufficient, Board will scale up to a 128-bit pointer. However, there are no such limits for Entities set as dense.
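The check described above can be sketched as follows. This assumes that "the 64-bit limit" means the product of Max item numbers must stay below 2^64; the article does not state the exact threshold, and the function names are hypothetical.

```python
from math import prod

# Assumption: the 64-bit limit corresponds to 2**64 addressable combinations.
LIMIT_64_BIT = 2 ** 64


def fits_in_64_bit(max_item_numbers):
    """True when the product of the Max item numbers stays below the assumed limit."""
    return prod(max_item_numbers) < LIMIT_64_BIT


def pointer_bits(max_item_numbers):
    """Pointer width suggested by the text: 64 bits if sufficient, otherwise 128."""
    return 64 if fits_in_64_bit(max_item_numbers) else 128


print(pointer_bits([1_000_000, 500_000, 10_000]))  # 64
print(pointer_bits([10**7, 10**7, 10**7]))         # 128
```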

Sparse definition guidelines

When creating a Cube version with 2 or more dimensions, use the following guidelines to define a sparse structure:

  • Ignore the time dimension

  • Define the dimensions of the Cube version without setting sparse Entities

  • Identify the 2 largest Entities in terms of number of members and ask the following question: "Will every possible combination of both Entities correspond to a meaningful value?" If not, then the combination of those 2 Entities should be defined as a sparse structure.
    Subsequently, identify the next largest Entity and go through the same reasoning, treating the sparse structure as a single Entity. If the combination of the new Entity and the 2 Entities mentioned before is sparse, then add the new Entity to the sparse structure. Repeat this process for the other dimensions

  • In general, you should define a sparse structure when 2 Entities each have more than 1,000 members, or when one Entity has several thousand members and the other has a few hundred

  • It's always advisable to set a sparse structure, even if it is not strictly needed: when in doubt, set the Entities as sparse to ensure better performance and to reduce the load on the hardware

  • There is a dimensional limitation in defining sparse structures. If you have exceeded the limit, a warning message appears on the Cube icon at the top of the corresponding version. For a Cube version that includes a sparse structure, the product of the Max item numbers of dense Entities has no limit, while the product of the Max item numbers of sparse Entities must be less than 7.8x10^28 (please note that this figure is an approximation)
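The guidelines above can be sketched as a small procedure. This is only an illustration of the reasoning, not a Board feature: the function names are made up, and `is_sparse` is a hypothetical callback standing in for the human judgment "not every combination of these Entities will hold a meaningful value". The limit constant is the approximate figure quoted above.

```python
from math import prod

# Approximate limit quoted in the guidelines for the product of sparse Max item numbers.
SPARSE_PRODUCT_LIMIT = 7.8e28


def suggest_sparse_structure(max_items, time_entities, is_sparse):
    """Skip time Entities, start from the two largest Entities, then try to add
    the remaining ones from largest to smallest.

    max_items: dict mapping Entity name -> Max item number
    time_entities: set of Entity names to ignore
    is_sparse: hypothetical callback; takes a list of Entity names and returns
               True when their combination is expected to be sparse
    """
    candidates = sorted(
        (name for name in max_items if name not in time_entities),
        key=lambda name: max_items[name],
        reverse=True,
    )
    if len(candidates) < 2 or not is_sparse(candidates[:2]):
        return []
    structure = candidates[:2]
    for name in candidates[2:]:
        if is_sparse(structure + [name]):
            structure.append(name)
    return structure


def within_sparse_limit(max_items, structure):
    """True when the product of the sparse Max item numbers is below the limit."""
    return prod(max_items[name] for name in structure) < SPARSE_PRODUCT_LIMIT


# Hypothetical Entities and a hypothetical judgment: Channel combines densely.
max_items = {"Month": 24, "Customer": 1_000, "Product": 500, "Channel": 3}
structure = suggest_sparse_structure(
    max_items, {"Month"}, lambda names: "Channel" not in names
)
print(structure)                                  # ['Customer', 'Product']
print(within_sparse_limit(max_items, structure))  # True
```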

