Zing Modeling Layer

Data modeling for trusted results

Maintaining consistency in how metrics are defined and consumed across an organization is challenging. Fully ad-hoc data analysis is great for exploration but loses the consistency that growing organizations need for a ‘single source of truth’.

Zing Data’s latest feature - a robust semantic layer - addresses this challenge head-on by providing a centralized way to define, manage, and govern metrics easily. And most importantly, its optional unlike many other BI tools…so you can add the structure when you need it.

What is a Semantic Layer?

A semantic layer acts as a translation layer between your raw data and business definitions. It provides a business-friendly interface to your data warehouse, allowing you to define metrics, relationships, and access controls in a single place. Think of it as your organization’s single source of truth for metric definitions.

While Zing already offers capabilities to define metrics examples, create aliases, and define join relationships, a semantic layer goes beyond this to allow you to specify:

Which fields you want to expose from your tables / views
Metrics calculations and descriptions
Dimensions which must co-occur with metrics (for instance, only show order_count by city but never by customer_name)
Table / view join relationships
Field and table aliases
Row level security requirements

Zing’s Implementation

We’ve taken a straightforward approach to semantic layer implementation, offering a JSON-based configuration that makes it easy to define your metrics and their relationships.

We heard from many users that although there are multiple semantic models that exist out in the market (such as dbt), they can complex and require additional software. We’ve made it simple to define calculations, enforce relationships, and bring consistency and governance to your data users.

Importantly, Zing does not require you to have a data model to work, and a semantic model can be added later as your organization’s needs for consistency in metrics, guardrails, and governance increase.

Columns, Metrics and Relationships

A data model is defined at the data source level.

Columns and Tables

You can choose which tables/views, columns, and metrics are used to answer questions by defining a model. This allows you to put guardrails on natural language questions that users may ask within Zing, and ensures data users have a consistent, governed experience.

Specifying tables/views and columns is not required. If no tables/views are specified in your modeling layer, then by default, all tables/views within data source will be made available to answer questions.If no columns are specified, then all columns within a given table/view will be made available.

Metrics

Metrics are defined by a:

A description describing the metric (and used as context to map user questions to this metric)
An expression which is the calculation for the metric
A Group bys section which lets you indicate is a metric can be aggregated by various columns (‘allowed’), must be aggregated by one or more columns (‘required’), or cannot be aggregated by one or more columns (‘forbidden’)
- Allowed
- Required
- Forbidden
- Note: that these group by attributes are optional and if they are not specified then a metric can be aggregated by any column in the columns part of the date model

A sample data model file for Zing is shown below:

{
  "tables": [
    {
    "name": "Orders",
      "columns": [
        {
          "name": "OrderID",
          "description": "Unique identifier for the order"
        },
        {
          "name": "CustomerID",
          "description": "Unique identifier for the customer"
        },
        {
          "name": "Item_Amount",
          "description": "Total revenue excluding taxes of all items in the order"
        },
        {
          "name": "Tax_Amount",
          "description": "Tax amount paid on the order"
        },
        {
          "name": "Total_Amount",
          "description": "Total order amount including tax"
        }
      ]
    },
    {
      "name": "Customers",
      "columns": [
        {
          "name": "CustomerID",
          "description": "Unique identifier for the customer"
        },
        {
          "name": "Name",
          "description": "Name of the customer"
        },
        {
          "name": "City",
          "description": "City of the customer"
        },
        {
          "name": "State",
          "description": "State of the customer"
        }
      ]
      }
  ],
  "metrics": [
    {
      "description": "Tax_Percentage",
      "expr": "sum(Tax_Amount)/sum(Item_Amount)",
      "groupBy": {
        "allowed": [
          "City",
          "State"
        ],
        "required": [],
        "forbidden":[]
      }
    },
    {
      "description": "Order_Amount",
      "expr": "sum(Total_Amount)",
      "groupBy": {
        "allowed": [
        ],
        "required": [],
        "forbidden": ["CustomerID", "Name"]
      }
    }
  ]
}

This would ensure that:

Order amount can’t be shown by Customer Name or Customer ID
Tax rate is only shown at a City or State level
Only the Orders and Customers tables/views are available to answer questions, and only with the specific columns outlined.

Getting Started

Log in to the Zing Data Console, then click on ‘data sources’ and select the data source you’d like to create (or update) a modeling layer for
Click the ‘semantic layer’ tab
Upload a semantic model .json file in the format outlined above
Any questions that a user asks with natural language, will adhere to the modeling layer you have uploaded. Questions asked with the visual editor, or using the SQL editor do not adhere to the data modeling layer you’ve uploaded because those interfaces are typically used for more customized or ad-hoc analysis.
Note that users with an editor or admin role can optionally ask natural language questions which are not bound to the semantic layer by clicking on the options drop-down within the natural language composer and selecting ‘Expert mode’ instead of ‘Semantic model mode’