Using the Zoomdata Scheduler
The Zoomdata Scheduler is a component within the Server that runs jobs asynchronously to update the metadata for a data source. The Scheduler is integrated with the data connectors and supports the following types of jobs:
- Refreshing the data sources that are connected to Zoomdata (in other words, refreshing metadata and clearing the cache)
- Refreshing the specific fields for the data source.
The Zoomdata Administrator and users with admin privileges can access the Scheduler, which is available from the Refresh page for a specific data source (Figure 1).
Administrators can view the status of scheduled jobs in the Zoomdata Console, which is available from the Settings menu (as shown in Figure 2).
The following topics are covered in this article:
- How the Zoomdata Scheduler Works
- Setting Up Data Source Refresh Job
- Refreshing the Data Source
- The Zoomdata Console
When you initially connect Zoomdata to your data source, the following activities run automatically:
- A sampling of the dataset is executed to determine distinct values for all field types set to Attribute and min/max values for all field types set to ‘Number’, ‘Integer’ and ‘Money’ (used by a chart's Filter controls).
You can also refresh them manually by clicking Refresh near the corresponding field on the data source's Fields page. This action kicks off an asynchronous job to determine the distinct values and min/max range based on the entire dataset rather than the sample.
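The statistics described above can be illustrated with a minimal Python sketch. It is not Zoomdata's implementation: the function name `field_statistics`, the row format (a list of dictionaries), and the type labels are assumptions made for illustration only. It collects distinct values for fields typed as Attribute and the min/max range for fields typed as Number, Integer, or Money.

```python
from typing import Any, Dict, Iterable

# Hypothetical type labels, mirroring the field types named in the text.
ATTRIBUTE = "Attribute"
NUMERIC_TYPES = {"Number", "Integer", "Money"}


def field_statistics(
    rows: Iterable[Dict[str, Any]], field_types: Dict[str, str]
) -> Dict[str, Dict[str, Any]]:
    """Collect distinct values for attribute fields and min/max for numeric fields."""
    materialized = list(rows)
    stats: Dict[str, Dict[str, Any]] = {}
    for name, field_type in field_types.items():
        # Ignore missing or null values when computing statistics.
        values = [row[name] for row in materialized if row.get(name) is not None]
        if field_type == ATTRIBUTE:
            stats[name] = {"distinct": sorted(set(values))}
        elif field_type in NUMERIC_TYPES:
            if values:
                stats[name] = {"min": min(values), "max": max(values)}
            else:
                stats[name] = {"min": None, "max": None}
    return stats


sample = [
    {"region": "East", "sales": 120.0},
    {"region": "West", "sales": 75.5},
    {"region": "East", "sales": 310.0},
]
stats = field_statistics(sample, {"region": "Attribute", "sales": "Money"})
```

Running the same function over a sample and over the entire dataset mirrors the difference between the initial sampling and a manual refresh: the logic is identical, only the rows passed in change.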
Besides these initial activities, administrators can set the Scheduler to perform jobs related to the data sources. Table 1 identifies the jobs that are supported currently, the triggers for these jobs, and the activities that occur when the job is run.
|Data Source Refresh||
|Data Source Fields Refresh||By manual selection of the Refresh button on the Fields page||
By default, when you are in the process of connecting your data source to Zoomdata, the Refresh page is set to the No Schedule option (as shown in Figure 3). This means that the Scheduler runs an initial data source refresh job after the source has been successfully created and saved.
To enable the Scheduler to run at predetermined points in time, perform the following steps (Figure 4):
- Select the Periodically option.
- In the Start on field, specify the start date and time.
Zoomdata uses the UTC time zone.
- Select the time interval for the job to be run from the Runs list (which includes monthly, weekly, daily, or hourly). Depending on the option that you select in this list, corresponding options will be available in the Run every section:
Monthly - specify the time interval (in months) for the job to be run.
The job runs every M months, counting from January (inclusive), where M is the value specified in the Run every field.
For example, suppose your job starts on March 10, 2016 and is scheduled to run every 3 months. Counting from January, the scheduled months are January, April, July, and October, so after the start date the job runs in April, July, and October, then the following January, and so on, at the specified time.
Weekly - select the days of the week for the job to be run.
Daily - specify the time interval (in days from 1 to 31) for the job to be run.
The job runs every D days, counting from the first day of the month (inclusive), where D is the value specified in the Run every field. The first job runs on the date and time you specified in the Start on field. For example, you set the job to start on March 10, 2016 at 5:00 AM and to run every five days. Counting every fifth day from the first of the month gives days 1, 6, 11, 16, 21, 26, and 31, so the next job runs on March 11 at 5:00 AM and subsequent jobs run every fifth day at the specified time until the end of the month.
Hourly - specify the time interval in hours (1-23) and minutes (1-59) for the job to be run.
You can set the specific hour and minute for the initial job to run (in the Start on field). Then set the time interval between jobs in hours and minutes (in the Run every field). For example, you can set your job to start on March 10, 2016 at 5:00 AM and to run every 3 hours and 20 minutes. The next job then runs at 8:20 AM, and so on.
- Your configuration summary is displayed in the Summary section.
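The Monthly, Daily, and Hourly rules described in the steps above can be checked with a short Python sketch. The function names are hypothetical and are not part of Zoomdata; the sketch assumes that monthly schedules count from January, daily schedules count from the first day of the month, and hourly schedules add the interval to the previous run, as the examples in the text suggest.

```python
import calendar
from datetime import datetime, timedelta
from typing import List


def monthly_run_months(every_m: int) -> List[int]:
    """Months (1-12) in which a job runs: every M months, counting from January."""
    return list(range(1, 13, every_m))


def daily_run_days(every_d: int, year: int, month: int) -> List[int]:
    """Days of the given month on which a job runs: every D days from day 1."""
    last_day = calendar.monthrange(year, month)[1]
    return list(range(1, last_day + 1, every_d))


def next_hourly_run(previous: datetime, hours: int, minutes: int) -> datetime:
    """Next run time: the previous run plus the configured interval."""
    return previous + timedelta(hours=hours, minutes=minutes)


print(monthly_run_months(3))       # [1, 4, 7, 10]
print(daily_run_days(5, 2016, 3))  # [1, 6, 11, 16, 21, 26, 31]
print(next_hourly_run(datetime(2016, 3, 10, 5, 0), 3, 20))  # 2016-03-10 08:20:00
```

The printed values match the worked examples: a job started on March 10 with a 3-month interval next runs in April, a 5-day interval next runs on March 11, and a 3 hour 20 minute interval next runs at 8:20 AM.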
For more complicated update schedules, use the Advanced option to set Cron expressions (as shown in Figure 10).
A Cron expression sets a schedule using a string of six fields separated by blank spaces. The format for a Cron expression is:
<seconds> <minutes> <hours> <day of the month> <month> <day of the week>
The standard values supported for each field (and by Zoomdata’s Scheduler) are:
|Field||Allowed Values||Additional Characters|
|Seconds||0-59||, - * /|
|Minutes||0-59||, - * /|
|Hours||0-23||, - * /|
|Day of the month||1-31||, - * / ? L W|
|Month||1-12 or Jan-Dec||, - * /|
|Day of the week||1-7 or Sun-Sat||, - * / ? L W #|
When creating a Cron expression, keep the following requirements in mind:
- Either ‘Day of the month’ or ‘Day of the week’ is needed, but not both; insert a question mark (?) as a placeholder for the one not specified.
- Names of the ‘Month’ and ‘Day of the week’ are not case sensitive; for example, ‘FRI’ and ‘fri’ are both acceptable formats.
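As an illustration of these requirements, the following Python sketch splits an expression into its six fields and checks the question mark rule. The function name `split_cron` and the field keys are assumptions for this example; Zoomdata performs its own validation, which this sketch does not reproduce.

```python
from typing import Dict

FIELD_NAMES = ["seconds", "minutes", "hours", "day_of_month", "month", "day_of_week"]


def split_cron(expression: str) -> Dict[str, str]:
    """Split a six-field Cron expression and apply the two rules listed above."""
    parts = expression.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    fields = dict(zip(FIELD_NAMES, parts))

    # Exactly one of the two day fields must be the '?' placeholder.
    dom_unset = fields["day_of_month"] == "?"
    dow_unset = fields["day_of_week"] == "?"
    if dom_unset == dow_unset:
        raise ValueError("exactly one of day_of_month and day_of_week must be '?'")

    # Month and day names are not case sensitive, so normalize them.
    fields["month"] = fields["month"].upper()
    fields["day_of_week"] = fields["day_of_week"].upper()
    return fields


fields = split_cron("0 45 20 ? * Mon,Wed,Fri")
```

Here `fields["day_of_week"]` holds the normalized value `MON,WED,FRI`, while an expression such as `0 0 12 * * *` is rejected because neither day field uses the placeholder.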
|Special Characters||What It Means|
|*||All values. Represents all the values within the specified field. For example, when used in the ‘Minutes’ field, a job runs every minute. Example: 0 * * * * ?|
|?||No specific value. Used as a placeholder when no value is needed in the field. For example, if specifying a ‘Day of the month’ value, you would enter ‘?’ for the ‘Day of the week’ field. Example: 0 0 0 1 6 ?|
|-||Range. Enter a range for the field using this symbol. For example, 3-6 in the ‘Hours’ field means a job runs at 3:00, 4:00, 5:00 and 6:00 am. Example: 0 0 3-6 * * ?|
|,||Comma. When a series of values is needed, use the comma to list all the values for the field. For example, Wed,Thu,Fri in the ‘Day of the week’ field means a job runs on Wednesdays, Thursdays and Fridays. Example: 0 0 0 ? * Wed,Thu,Fri|
|/||Forward slash. Specifies the starting value and the increment. For example, 0/5 in the ‘Minutes’ field means a starting point of 0 and running a job every 5 minutes. Example: 0 0/5 * * * ?|
|L||Last. Used in two fields only - ‘Day of the month’ and ‘Day of the week’. For example, 5L in the ‘Day of the week’ field means the last Thursday of the month. Example: 0 0 0 ? * 5L|
|W||Weekday. Used in two fields only - ‘Day of the month’ and ‘Day of the week’. Identifies the weekday closest to the given day. For example, 15W means the closest weekday to the 15th of the month. The following results are possible: if the 15th is a Saturday, the job runs on Friday the 14th; if the 15th is a Sunday, the job runs on Monday the 16th; if the 15th is a weekday, the job runs on the 15th.|
|#||Number sign. Used only with the ‘Day of the week’ field; identifies a specific occurrence of that day within the month. For example, both Wed#2 and 4#2 identify the second Wednesday of the month. Example: 0 0 0 ? * Wed#2|
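The Weekday and Number sign rules can be worked through with a small Python sketch. The helper names are hypothetical, and the sketch assumes that the closest weekday never falls outside the given month, which the text does not state. Note that Python numbers weekdays from Monday = 0, unlike the 1-7 (Sun-Sat) numbering used in Cron fields.

```python
import calendar
from datetime import date


def nearest_weekday(year: int, month: int, day: int) -> date:
    """Weekday (Mon-Fri) closest to the given day, staying inside the month."""
    last_day = calendar.monthrange(year, month)[1]
    target = date(year, month, day)
    weekday = target.weekday()  # Monday = 0 ... Sunday = 6
    if weekday == 5:  # Saturday: use Friday, unless that leaves the month
        return target.replace(day=day - 1) if day > 1 else target.replace(day=day + 2)
    if weekday == 6:  # Sunday: use Monday, unless that leaves the month
        return target.replace(day=day + 1) if day < last_day else target.replace(day=day - 2)
    return target


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """The n-th occurrence of a weekday in a month (weekday: Monday = 0)."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    # date() raises ValueError if the month has no such occurrence.
    return date(year, month, 1 + offset + 7 * (n - 1))


print(nearest_weekday(2016, 10, 15))  # 2016-10-14 (the 15th is a Saturday)
print(nearest_weekday(2016, 5, 15))   # 2016-05-16 (the 15th is a Sunday)
print(nth_weekday(2016, 3, 2, 2))     # 2016-03-09 (second Wednesday of March)
```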
Examples of Cron Expressions
|0 0 12 * * ?||Noon every day|
|0 30 20 ? * *||8:30pm every night|
|0 0/10 17 * * ?||Every 10 minutes starting at 5pm and ending at 5:50pm, every day|
|0 15-30 20 * * ?||Every minute starting at 8:15pm and ending at 8:30pm, every day|
|0 45 20 ? * Mon,Wed,Fri||8:45pm every Monday, Wednesday and Friday|
|0 0 20 3/3 * ?||8pm every 3 days in every month, starting on the third day of the month|
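To see how the numeric parts of these expressions expand, the following Python sketch evaluates a single field. It is only an illustration under stated assumptions: `expand_field` is a hypothetical helper, it handles just the comma, hyphen, asterisk, and forward slash characters, and it does not cover month or day names or the ?, L, W, and # characters.

```python
from typing import List


def expand_field(spec: str, low: int, high: int) -> List[int]:
    """Expand one numeric Cron field using ',', '-', '*' and '/'."""
    values = set()
    for part in spec.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # "start/step" runs to the top of the range; a bare value is just itself.
            end = high if step_text else start
        if not (low <= start <= end <= high) or step < 1:
            raise ValueError(f"invalid field part: {part!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("0/10", 0, 59))  # [0, 10, 20, 30, 40, 50]
print(expand_field("3/3", 1, 31))   # [3, 6, 9, 12, 15, 18, 21, 24, 27, 30]
```

The first result corresponds to the minutes of the "every 10 minutes starting at 5pm" example, and the second to the days of the month in the "8pm every 3 days" example.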
You can select the fields from your data source to be refreshed on the Configuration page.
All the fields from your data source are listed in the Refresh Fields Metadata section. By default, only the fields of type Time are selected. If you want to refresh all the fields from your data source, click Select All. Otherwise, select the checkboxes for the specific fields in the Refreshable column.
To update the entire list of fields from the data source, access the Fields page and click Refresh Fields (Figure 12). This option differs from the functionality in the Refresh page because this is focused on a manual refresh of the fields contained in the data source.
You can also refresh specific fields from your data source. Click the Refresh button in the Statistics column for the field. The job begins immediately and its status is shown in that cell.
Administrators can monitor jobs using the Console (which is located in the Settings menu, as shown in Figure 12). The Console automatically refreshes every 15 seconds.
The jobs (that is, the Job Names) are identified in the Console by the Source Name (as shown in Figure 13).
If you have scheduled many jobs, you can quickly filter by a specific job status:
. To return to the comprehensive list of all jobs, click
The Console provides the following details for jobs:
- Data Source - the name of the data source for which the job was created
- Status - the status of the job
- Last Finished - date and time of the most recently executed job
- Next Run - the next scheduled run for the job
- Job History - opens a new pop-up window showing all jobs that have been run for the data source
You can also sort the Jobs table by the following column headers:
- Data Source
- Job Type
- Last Finished
However, keep in mind that the sorting automatically resets to the default state every time the table is refreshed.
The Source Refresh window provides a historical view of the jobs that have been run for the selected data source. The information provided includes the job start time, the job finish time, and the job execution status (as shown in Figure 14). Use the quick filters to view the jobs in the In Progress or Finished status.
For the Status column, three conditions are used to identify the status of the most recently run job:
- COMPLETE: The job was successfully completed.
- The job has only been partially completed. For example, the min/max values were successfully refreshed, but the distinct values were not refreshed.
- The job could not run or could not be completed due to an error in the system. For example, Zoomdata may be experiencing connection issues with the data source. Click the arrow to view the details on the issues that occurred while executing the job.