Data model in DataLens
Data in a dataset is represented as fields.
data source
One or more tables are used as the data source.
Note
There is a limit on displaying the first 1,000 tables from a source in a dataset. If the required tables are not on the list, currently, you can only add them manually using an SQL query.
If there are multiple tables in the source, you can join them with a JOIN
When the tables are joined, a link is created between them. When you create a link, you specify the fields from the source table and merged table.
Tables are linked automatically by the first match in the field name and field data type.
In this case, you can:
- Edit fields in the link.
- Add new links or delete existing links.
- Change the type of the
JOINoperator in the link (INNER,LEFT,RIGHT, orFULL). - Manage link optimization.
JOIN is used if a query made from a chart uses fields from two or more dataset tables.
JOIN is not used if:
- The dataset contains one table.
- The dataset contains multiple tables; however, the query uses fields from only one of those tables (with link optimization enabled).
To manage the link behavior when joining data from multiple tables, use the Optimize link option in the link settings. The option is enabled by default for all links in the dataset: the JOIN operator is applied if a query uses fields from two or more linked tables. You can disable the option for each individual link to make such a link a required one. In which case the JOIN operation will be performed even if fields from a single table are selected.
Note
If you disable optimization, it may take more time to run a query.
For more information about optimization when using JOIN, see Optimizing data when joining tables.
Data fields
The fields define the structure and format of the dataset. The following types of fields are available:
- Dimension. Contains values that define data parameters, such as a city, date of purchase, or product category. The aggregation function is not applied to fields with a dimension; otherwise, the field becomes a measure. In the interface, dimensions are displayed in green.
- Measure. Contains numeric values the aggregation functions (information) apply to, such as the amount of clicks and the number of click-throughs. If you remove the aggregation function from this field, it will become a dimension. In the interface, measures are displayed in blue.
In the dataset creation interface and wizard, you can duplicate fields, create fields, and use aggregation functions.
Warning
The maximum number of fields in a single dataset is 1,200.
DataLens allows you to create calculable fields using formulas. To write formulas, you can use existing dataset fields, constants, and functions. For a full list of functions, see the Function reference.
For more information about calculable fields, see Calculated fields.
Aggregating data
The following aggregation functions are available for fields with different data types:
| Function | Description | Supported types |
|---|---|---|
| No | Without aggregation | All types |
| Average | Arithmetic mean value | Fractional numberInteger |
| Amount | Number of records | StringDateDate and timeFractional numberInteger |
| Number of unique | Number of unique records | StringDateDate and timeFractional numberInteger |
| Maximum | Maximum value | DateDate and timeFractional numberInteger |
| Minimum | Minimum value | DateDate and timeFractional numberInteger |
| Amount | Sum of values | Fractional numberInteger |
Additional aggregation functions are available in calculated fields.
Note
For some sources, aggregation functions are unavailable.
The sources you can use aggregation functions for are listed under Data source support on the aggregation function page in the reference.
To learn more about data types, see Yandex DataLens data types.