Concepts

Multi-dimensional datasets

In the context of DBnomics, a multi-dimensional dataset is a group of time series where each one is categorized using dimensions.

Let’s start from the following hypothetical CSV file named product_prices.csv that tracks the evolution of the price of different products in different countries:

sku	country	year	price
111	FR	2000	12
111	FR	2001	13
111	FR	2002	11
111	DE	2001	9
111	DE	2002	11
111	DE	2003	14
222	FR	2000	87
222	FR	2001	88
222	FR	2002	90
222	FR	2003	79
333	FR	2000	23
333	FR	2001	22
333	FR	2002	23
333	FR	2003	21

This CSV file can be turned into a multi-dimensional dataset with the code PRODUCT_PRICES.

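From the sku and country columns we can infer 2 dimensions: SKU={111,222,333} and COUNTRY={DE,FR}. The year column provides the period of each observation and the price column provides its value.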
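The inference can be sketched in a few lines of Python. This is only an illustration, not part of the DBnomics toolbox: the function name `infer_dimensions`, the constant names, and the choice of a comma-separated layout for product_prices.csv are all assumptions made for the example.

```python
import csv
import io

# Content of the hypothetical product_prices.csv, assumed comma-separated.
CSV_TEXT = """\
sku,country,year,price
111,FR,2000,12
111,FR,2001,13
111,FR,2002,11
111,DE,2001,9
111,DE,2002,11
111,DE,2003,14
222,FR,2000,87
222,FR,2001,88
222,FR,2002,90
222,FR,2003,79
333,FR,2000,23
333,FR,2001,22
333,FR,2002,23
333,FR,2003,21
"""

# Columns treated as dimensions (an assumption of this sketch);
# "year" holds the period and "price" holds the observed value.
DIMENSION_COLUMNS = ["sku", "country"]


def infer_dimensions(csv_text, dimension_columns):
    """Return {column: sorted distinct values} for each dimension column."""
    values = {column: set() for column in dimension_columns}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in dimension_columns:
            values[column].add(row[column])
    return {column: sorted(found) for column, found in values.items()}


dimensions = infer_dimensions(CSV_TEXT, DIMENSION_COLUMNS)
print(dimensions)
# {'sku': ['111', '222', '333'], 'country': ['DE', 'FR']}
```

The values are kept as strings because dimension values act as codes, not as numbers.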

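The dataset is composed of 4 time series. Each series relates to a single product and a single country, because a series is identified by exactly one value for each dimension. Only the combinations that actually appear in the data produce a series: of the 6 possible (SKU, COUNTRY) pairs, 222.DE and 333.DE have no observations.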

Series 111.FR:

period	value
2000	12
2001	13
2002	11

Series 111.DE:

period	value
2001	9
2002	11
2003	14

Series 222.FR:

period	value
2000	87
2001	88
2002	90
2003	79

Series 333.FR:

period	value
2000	23
2001	22
2002	23
2003	21

Note: because the dimensions of a dataset are ordered, we can infer the series codes by concatenating the codes of the values of the dimensions, separated by a . character. For example, the series for SKU=111 and COUNTRY=FR has the code 111.FR.
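As a minimal sketch of the two ideas above, the following Python groups the rows of the example into time series and builds each series code by joining the dimension value codes with a `.` character. The helper names `series_code` and `split_into_series` are hypothetical and do not come from the DBnomics toolbox; the rows are the ones from product_prices.csv.

```python
from collections import defaultdict

# Rows of the hypothetical product_prices.csv: (sku, country, year, price).
ROWS = [
    ("111", "FR", 2000, 12), ("111", "FR", 2001, 13), ("111", "FR", 2002, 11),
    ("111", "DE", 2001, 9), ("111", "DE", 2002, 11), ("111", "DE", 2003, 14),
    ("222", "FR", 2000, 87), ("222", "FR", 2001, 88), ("222", "FR", 2002, 90),
    ("222", "FR", 2003, 79),
    ("333", "FR", 2000, 23), ("333", "FR", 2001, 22), ("333", "FR", 2002, 23),
    ("333", "FR", 2003, 21),
]


def series_code(dimension_values):
    """Join dimension value codes, given in dimension order, with '.'."""
    return ".".join(dimension_values)


def split_into_series(rows):
    """Group observations by series code; each series is a list of (period, value)."""
    series = defaultdict(list)
    for sku, country, year, price in rows:
        # Dimension order is SKU first, then COUNTRY.
        series[series_code([sku, country])].append((year, price))
    return dict(series)


series = split_into_series(ROWS)
for code in sorted(series):
    print(code, series[code])
```

Because the code depends on the order of the dimensions, swapping SKU and COUNTRY would yield codes such as FR.111 instead of 111.FR.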
