Commit b1f6f664 authored by Alessio Netti's avatar Alessio Netti

Analytics: update Readme

parent 37a44e0d
@@ -12,12 +12,15 @@
     1. [Configuration Syntax](#configSyntax)
     2. [Instantiating Units](#instantiatingUnits)
     3. [MQTT Topics](#mqttTopics)
+    4. [Pipelining Analyzers](#pipelining)
+    5. [Job Analyzers](#jobanalyzers)
 3. [Rest API](#restApi)
     1. [List of ressources](#listOfRessources)
     2. [Examples](#restExamples)
 3. [Plugins](#plugins)
-    1. [Average Plugin](#averagePlugin)
-    2. [Writing Plugins](#writingPlugins)
+    1. [Aggregator Plugin](#averagePlugin)
+    2. [Job Aggregator Plugin](#jobaveragePlugin)
+    3. [Writing Plugins](#writingPlugins)
# Introduction <a name="introduction"></a>
In this Readme we describe the DCDB Data Analytics framework and all of the data abstractions associated with it.
@@ -234,15 +237,15 @@ mqttPart /avg1
 output {
-sensor "<bottomup, filter cpu01>sum" {
+sensor "<bottomup, filter cpu00>sum" {
 mqttsuffix /sum
-sensor "<bottomup, filter cpu01>max" {
+sensor "<bottomup, filter cpu00>max" {
 mqttsuffix /max
-sensor "<bottomup, filter cpu01>avg" {
+sensor "<bottomup, filter cpu00>avg" {
 mqttsuffix /avg
@@ -306,6 +309,9 @@ rack02.node03.cpu00. {
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; In order to get units that operate at the _node_ level, the output sensors in the
configuration discussed here should have a unit block in the form of < bottomup 1, filter cpu00 >.
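As a sketch of such a configuration, an output block using node-level units could look like the following, reusing the syntax shown earlier (the sensor name is illustrative):

```
output {
    sensor "<bottomup 1, filter cpu00>avg" {
        mqttsuffix /avg
    }
}
```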
#### MQTT Topics <a name="mqttTopics"></a>
The MQTT topics associated with the output sensors of a given analyzer are constructed in different ways depending
on the unit they belong to:
@@ -326,6 +332,27 @@ the framework cannot infer the correct order of initialization so as to resolve
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; This feature is not supported when using analyzers in _on demand_ mode.
#### Job Analyzers <a name="jobanalyzers"></a>
_Job Analyzers_ are a class of analyzers that operate on job-specific data. Such data is organized in _job units_: these
work like ordinary units, with the difference that they are arranged hierarchically as follows:

* The top unit is associated with the job itself and contains all of the required output sensors;
* One sub-unit is allocated for each node on which the job was running. Each of these sub-units contains all of the input
sensors that are required at configuration time.

The computation algorithms driving job analyzers can then freely navigate this hierarchical unit structure according to
their specific needs. Since all of the sub-units are tied to the nodes on which the job was running, the output sensors
specified in the configuration must also be at the node level in the unit system, so that unit resolution is performed correctly.
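For illustration, the job unit of a hypothetical job with ID 1234 that ran on two nodes could be arranged as follows (node names are illustrative):

```
job1234                  <- top unit, contains the output sensors
├── rack02.node03.       <- sub-unit for the first node, contains its input sensors
└── rack02.node04.       <- sub-unit for the second node, contains its input sensors
```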
Job analyzers also support the _streaming_ and _on demand_ modes, which behave as follows:

* In **streaming** mode, the job analyzer retrieves the list of jobs that were running in the time interval between the
last computation and the present; it then builds one job unit for each of them, and subsequently performs the computation;
* In **on demand** mode, users can query a specific job ID, for which a job unit is built and the computation is performed.
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The _duplicate_ setting does not affect job analyzers.
## Rest API <a name="restApi"></a>
DCDBAnalytics provides a REST API that can be used to perform various management operations on the framework. The
API is functionally identical to that of DCDBPusher, and is hosted at the same address. All requests that are targeted
......@@ -601,6 +628,12 @@ Additionally, output sensors in analyzers of the Aggregator plugin accept the fo
| operation | Operation to be performed over the input sensors. Can be "sum", "average", "maximum", "minimum", "std" or "percentiles". |
| percentile | Specific percentile to be computed when using the "percentiles" operation. Can be an integer in the (0,100) range. |
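As a sketch, a percentile output sensor for an Aggregator analyzer could then be declared as follows, reusing the output block syntax shown earlier (the sensor name and values are illustrative):

```
output {
    sensor "<bottomup, filter cpu00>perc90" {
        mqttsuffix /perc90
        operation percentiles
        percentile 90
    }
}
```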
## Job Aggregator Plugin <a name="jobaveragePlugin"></a>
The _Job Aggregator_ plugin offers the same functionality as the _Aggregator_ plugin, but on a per-job basis. As such,
it performs aggregation of the specified input sensors across all nodes on which each job is running. Please refer
to the corresponding [section](#jobanalyzers) for more details.
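Since the plugin shares the Aggregator's parameters, a sketch of an output sensor summing a metric across a job's nodes could look like this (names are illustrative; as noted in the job analyzer section, output sensors must be at the node level):

```
output {
    sensor "<bottomup 1, filter cpu00>sum" {
        mqttsuffix /sum
        operation sum
    }
}
```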
## Writing DCDBAnalytics Plugins <a name="writingPlugins"></a>
Generating a DCDBAnalytics plugin requires implementing an _Analyzer_ and a _Configurator_ class, which contain all of the logic
tied to the specific plugin. Such classes should be derived from _AnalyzerTemplate_ and _AnalyzerConfiguratorTemplate_