Zoomdata Version

Creating Calculations and Derived Fields

Custom calculations serve as additional metrics to use with your charts to help you analyze your data. After you have created a data source, you can begin creating custom calculations for it. Calculations are available for all supported data sources. Derived fields are supported by certain connectors that come out-of-the-box in Zoomdata. To see what connectors support derived fields and what functions are available, see Support for Calculations.

You can create calculations by aggregating data in different ways, whether from an entire data source or from a selected subset and, as needed, applying arithmetic operators to them. After you have created a custom calculation, it becomes available as a metric in your charts and dashboards, so you can use it like any other metric.

Aggregating Data

Scope of data refers to the domain of data included in a calculation - whether a whole column from the data source, or just part of the column. Zoomdata always excludes data from a calculation if it is excluded from the chart by a filter even if that data would otherwise be included in the scope of the calculation. You can further control what data is included by choosing metric functions that match the scope you want. Data can be aggregated on a column-wide scope, a table scope, or a window scope. Each scope is explained below.

For a table of all supported aggregation functions, see Support for Calculations.

Column Aggregation Functions

Column metric functions calculate using all the data displayed on a chart. They group results in the same way that the chart itself groups its data. That is, if your chart shows data grouped by gender, male and female, then column aggregation functions return results grouped by male and female. Metric functions, for instance, SUM( ), are column-wide unless the they are named Table- or Window- such as TableSUM( ).

Imagine the following data set.

name gender city earned spent
Alan M Rockville $10 $2
Bob M Rockville $8 $3
Carol F Rockville $5 $5
Darlene F Reston $4 $6
Ed M Reston $2 $8

Using this data set, to calculate metric called Leftover, that is, a group's leftover money, use the following formula.

SUM( earned ) - SUM( spent )

If you used this Leftover metric on a chart grouping by gender using the data above, you would get the results shown below.

Males have $7, derived from (10+8+2) - (2+3+8). Females have -$2 left over, derived from (5+4) - (5+6). If you used the same metric on a chart grouping by city, you would see Rockville having $13, from (10+8+5) - (2+3+5), and Reston having -$8, from (4+2) - (6+8).

For a table of all supported aggregation functions, see Support for Calculations.

Table Functions

Table metric functions are broader in scope than column aggregation functions. Table functions use all data from a field and produces a single, ungrouped value. You typically do not use table functions directly on a chart, since it is ungrouped. That is, if you have sales records grouped by gender (say, male and female), a TableSUM( ) calculation returns the total sales of all records one value regardless of the group in consideration. On a chart, the females and males would both seem to have the same sales whether they did or didn't. Table functions are typically used to calculate percentages of a whole or average values.

For example, to calculate a group's earnings as percentage of total earnings, use the following formula.

SUM( earned ) / TableSUM( earned ) * 100

in which:

  • SUM( earned ) calculates the sum earnings for each particular group.
  • TableSUM( earned ) calculates the total earnings of all records.
  • The quotient of SUM( earned ) / TableSUM( earned ) is multipled by 100 to convert it into a percentage.

Shown on a table using the example data above, results would look like the following.

For a table of all supported aggregation functions, see Support for Calculations.

Window Functions

Window metric functions are a middle case between columns and tables. They provide a snapshot or window into a subset of data, depending on the groupings used by the chart. Each window function such as WindowSUM or WindowAVG requires a numeric field ID to aggregate followed by a list of one or more attributes. The function aggregates the data and groups the results based on these attributes if the attributes are present in the chart. Attributes absent from the chart are ignored from the aggregation. For example, an aggregation WindowAVG( profits, gender, city ) returns the average profits for the data set, grouped by gender and city if gender and city are represented in the chart. If gender happens to be absent from the chart, then it is dropped from the aggregation. Effectively, the average profits would be grouped only by city in that case.

For example, to calculate a group's contribution to just gender, rather than to the whole, use the following formula.

SUM( earned ) / WindowSUM( earned, gender ) * 100

If you use this custom calculation in a pivot table with the example data set shown above, you derive results like the ones show below.

In the example pivot table, each city's total earnings (SUM(earned)) is shown as a percentage of each gender's total earnings. If gender had been absent from the chart, the cities' total earnings would have been shown as totals of the whole, rather than of each gender.

Zoomdata provides metric functions for summing, averaging, identifying minimums and maximums in a data set, counting the number of values in a set, and counting the number of distinct values in a set.

For a table of all supported aggregation functions, see Support for Calculations.

Derived Fields

A derived field is an in-memory column for your data table that is populated with results from calculations performed on data already in your table. You can create derived fields using expressions to use as data filters, attributes, or for use in aggregations. Expressions are built on operators and functions.

These calculations are performed at the level of a row, that is, a record, and do not include other data from your table that is outside of that particular row. If a source supports derived fields, then you can use them as arguments for aggregate functions when creating other calculations.

Consider the following examples:

Your data source has records that list the revenue generated and the term of employment but does not have an average of the two. You can use a derived field to create an average of the two for each record. Use the following formula:

(revenue/lengthofemployment)

Your data source continues values that have been brought in as text strings. In order to cross-reference this data with the time values, you need to change the text to a numeric value. Use the following formula as a base:

TEXT_TO_NUM (LTRIM (Field_A, '$')) SUM(TEXT_TO_NUM(SUBSTRING("$124456.00", 2, 10)))

Your data source contains records that list the start of employment and termination of employment for your company. You want to find the differences between these time values to average out the length of employment. Use the following formula as a base:

TIME_DIFF (timePart, startTime : Time, endTime : Time) : Numeric

Zoomdata support the following functions for derived fields:

  • Arithmetic
  • Logical
  • Relational
  • Numerical
  • Text
  • Time
  • Conditional

For a list of all available functions, see Support for Calculations.

Performing Math in Calculations

The SUM, AVG, MIN, and MAX functions require a single parameter, a field ID, that must be numeric. They return numeric values.

The COUNT and COUNTD functions can operate on any field ID. These functions count records. They return numeric values.

You can also use basic arithmetic operators to build your formulas from aggregated metrics, and from constant values. In this way, you can create calculations of percentages, differences, averages, and the like.

Zoomdata supports addition ( + ), subtraction ( - ), multiplication ( * ), and division ( / ).

ABS() can be applied to a metric value or to a calculation to return an absolute value.

Applying Filters in Calculations

Filters applied to a chart are automatically applied to any calculations used with that chart. In addition, you can add filters into calculations to control what data is included when the calculation is used. To filter, use the SQL-like expression WHERE. For example:

AVG(revenue) WHERE state = 'Florida'

String literals like Florida or Electronic Goods must be enclosed in single quotes.

The following can be used as filters in calculations.

  • Attribute options can be used as conditions for a filter by comparing to values using =, !=, IN, and NOT IN. Use = and != to compare to single values. Use IN and NOT IN to compare to a list of values enclosed in parenthesis and separated by commas. For example:
    AVG(revenue) WHERE state IN ('Florida', 'Maryland', 'New York')

  • Numeric fields and references to existing calculations using the comparisons >, <, =, and !=

You can also filter using dates and times. For more information, see Using Dates and Time below.

SQL-Like Expressions

Zoomdata's calculations support the following SQL-like expressions:

  • WHERE - to filter by a condition

    COUNT(accidentnumber) WHERE airportcode IN ('LAX', 'ORD', 'IAD')

    Establishes a filter's condition. Data will only be included in the calculation if the condition that follows is true.

  • AND, OR - to form a conjunctive or a disjunctive condition

    SUM(totalfatalinjuries) WHERE broadphaseofflight = 'LANDING' OR airportcode = 'LAX'

    Calculates the sum of totalfatalinjuries, including only data that has a broadphaseofflight of LANDING and an airportcode of LAX.

    SUM(totalfatalinjuries) WHERE broadphaseofflight = 'LANDING' OR airportcode = 'LAX'

    Calculates the sum of totalfatalinjuries, including only data that has a broadphaseofflight of LANDING or an airportcode of LAX.

  • BETWEEN... AND - to filter using a range of values

    COUNTD(accidentnumber) WHERE totalminorinjuries BETWEEN 2 AND 10

    Counts the number of distinct records for accidentnumber that have totalminorinjuries between 2 and 10.

  • IN, NOT IN - to filter using a set of values

    SUM(totalfatalinjuries) WHERE airportcode IN ('LAX', 'ORD', 'IAD')

    Calculates the sum of totalfatalinjuries at the airports, including only those with the airportcode of LAX, ORD, or IAD.

    SUM(totalfatalinjuries) WHERE airportcode NOT IN ('LAX', 'ORD', 'IAD')

    Calculates the sum of totalfatalinjuries at the airports, excluding those with the airportcode of LAX, ORD, or IAD.

  • TRANSFORM - to filter based on a derived date. To derive a date with TRANSFORM, you must already have a time attribute configured in your data source.

    SUM(totalfatalinjuries) TRANSFORM eventdate = PreviousPeriod()

    Calculates the SUM of totalfatalinjuries, supposing the eventdate attribute were for the previous period. For example, if the chart is examining two weeks of data for totalfatalinjuries, the calculation above will provide data about the two weeks prior to that. To work correctly, data must be available for the periods of time considered.

Using Dates and Time

To filter or examine your data using date or time, you must already have a time attribute configured in your data source. These can only be used after WHERE in your calculation.

Date field options use common time formats such as YTD , MMDDYYYY , and YoY.

The following functions are available.


Supported Date and Time Functions
Function Type Description
DATE()    
TIME()    

DateADD( time_period, interval, date )

date formats

DateSUB( time_period, interval, date )

date formats

PreviousPeriod( offset, numPeriods )

time granularity

Parameters are optional.

Without parameters, the function returns the period immediately before the currently represented data and of the same length. For example, if you are currently viewing data for March, the function returns data from February.

With parameters, the period returned is of the same length as the currently represented period, but not immediately prior to it. Instead, it counts back in numPeriods periods of time, measured in units named by offset such as 'weeks' or 'months'.

The following time offset values are supported:

  • 'Years'
  • 'Quarters'
  • 'Months'
  • 'Weeks'
  • 'Days'
  • 'Hours'

Example Calculations

The following commonly used custom calculations demonstrate calculation syntax. These example calculations work with your data source only if your data source contains fields of the same name and type. You can rename the field IDs to make them work with your data source.

Calculations Using Date

Sum of a metric between a specific date and the current date

SUM(totalfatalinjuries) WHERE eventdate BETWEEN '2015-01-01' AND DATE()

in which:

  • totalfatalinjuries can be replaced by your own metric
  • eventdate can be replaced by your own date attribute
  • 2015-01-01 can be replaced by a specific date

Metric from previous period

SUM(totalfatalinjuries) TRANSFORM eventdate = PreviousPeriod()

in which:

  • totalfatalinjuries can be replaced by your own metric
  • eventdate can be replaced by your own date attribute

Change in metric from previous period

SUM(totalfatalinjuries) - (SUM(totalfatalinjuries) TRANSFORM eventdate = PreviousPeriod())

in which:

  • totalfatalinjuries can be replaced by your own metric

Calculation Editor

Zoomdata provides a calculations editor to help you make and test calculations for a data source. To make calculations for a data source, you must have access to the data source. You can access the calculator from a data source configuration or from the metric selection label on any chart built from the data source.

The calculations editor is shown below.

The numbered regions are:

  1. Calculation name
  2. Functions, Derived Fields, and Attributes/Metrics list
  3. Editing space
  4. Preview space

You create calculations by defining a formula composed of constants, metrics, and metrics with functions applied to them. You can also include existing calculations in your new calculation's formula, just as if it were a metric.

Creating Calculations

Use the editor to create calculations.

To create an calculation:

  1. Enter a name for the calculation in the space labeled Untitled Calculation.
  2. Enter the expression into the editing space. Expressions should follow standard mathematical and logical syntax and are resolved using the standard order of operations. You can manually key in expressions or you can select the elements of your expressions from the menus at the left. The expression should not be assigned to a variable because the resolved value of the expression is assigned to the calculation, which serves as the 'variable' to which the value is assigned. That is, enter a / b rather than x = a / b.
  3. To test your calculation, select Run Calculation.
  4. When you are finished with your calculation, select Save.

Was this topic helpful?