Commit 980f3038 authored by Alessio Netti

Analytics: README updated

- Reflects new hierarchical design
- Updated job_aggregator example configuration
    4. [Operator Configuration](#operatorConfiguration)
        1. [Configuration Syntax](#configSyntax)
        2. [Instantiating Units](#instantiatingUnits)
        3. [Instantiating Hierarchical Units](#instantiatingHUnits)
        4. [Job Operators](#joboperators)
        5. [MQTT Topics](#mqttTopics)
        6. [Pipelining Operators](#pipelining)
3. [Rest API](#restApi)
    1. [List of resources](#listOfRessources)
    2. [Examples](#restExamples)
This approach is preferable for models that require data from multiple sources
(e.g., clustering-based anomaly detection), or for models that are deployed in [on-demand](#operatorConfiguration) mode.
## Global Configuration <a name="globalConfiguration"></a>
Wintermute shares the same configuration structure as DCDB Pusher and Collect Agent, and it can be enabled via the
respective (i.e., _dcdbpusher.conf_ or _collectagent.conf_) configuration file. All output sensors of the frameworks
are therefore affected by configuration parameters described in the global Readme. Additional parameters specific to
this framework are the following:
| Value | Explanation |
|:----- |:----------- |
| **analytics** | Wrapper structure for the data analytics-specific values.
| hierarchy | Space-separated sequence of regular expressions used to infer the local (DCDB Pusher) or global (DCDB Collect Agent) sensor hierarchy. This parameter should be wrapped in quotes to ensure proper parsing. See the Sensor Tree [section](#sensorTree) for more details.
| filter | Regular expression used to filter the set of sensors in the sensor tree. Everything that matches is included, the rest is discarded.
| jobFilter | Regular expression used to filter the jobs processed by job operators. The expression is applied to the first node of the job's nodelist. If a match is found the job is processed, otherwise it is discarded.
| **operatorPlugins** | Block containing the specification of all data analytics plugins to be instantiated.
| plugin _name_ | The plugin name is used to build the corresponding lib-name (e.g. average --> libdcdboperator_average.1.0)
| path | Specifies the path where the plugin (the shared library) is located. If left empty, DCDB will look in the default library directories (/usr/lib and friends) for the plugin file.
| | |
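To illustrate, the following is a minimal sketch of how these parameters could look in the configuration file (the
values, the nesting and the plugin name are illustrative assumptions):

```
analytics {
filter /rack00.*
jobFilter node05
}
operatorPlugins {
plugin aggregator {
path /usr/lib
}
}
```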
## Operators <a name="operators"></a>
Operators are the basic building blocks in Wintermute. An Operator is instantiated within a plugin, performs a specific
task and acts on sets of inputs and outputs called _units_. Operators are functionally equivalent to _sensor groups_
in DCDB Pusher, but instead of sampling data, they process such data and output new sensors. Some high-level examples
of operators are the following:
* An operator that performs time-series regression on a particular input sensor, and outputs its prediction;
* An operator that aggregates a series of input sensors, builds feature vectors, and performs machine
learning-based tasks using a supervised model;
* An operator that performs clustering-based anomaly detection by using different sets of inputs associated to different
compute nodes;
### The Sensor Tree <a name="sensorTree"></a>
Before diving into the configuration and instantiation of operators, we introduce the concept of _sensor tree_. A sensor
tree is simply a data structure expressing the hierarchy of sensors that are being sampled; internal nodes express
hierarchical entities (e.g., clusters, racks, nodes, cpus), whereas leaf nodes express actual sensors. In DCDB Pusher,
a sensor tree refers only to the local hierarchy, while in the Collect Agent it can capture the hierarchy of the entire
system being sampled.
A sensor tree is built at initialization time of DCDB Wintermute, and is implemented in the _SensorNavigator_ class.
By default, if no hierarchy string has been specified in the configuration, the tree is built automatically by assuming
that each forward slash-separated part of the sensor name expresses a level in the hierarchy. The total depth of the
tree is thus determined at runtime as well. This is, in most cases, the preferable configuration, as it complies with
the MQTT topic standard and interprets each sensor name as if it were a path in a file system.
For special cases, the _hierarchy_ global configuration parameter can be used to enforce a specific hierarchy with
a fixed number of levels. More generally, the following could be a set of forward slash-separated sensor names, from
which we can construct a sensor tree with three levels, corresponding to racks, nodes and CPUs respectively:
```
/rack00/status
/rack00/node05/MemFree
/rack00/node05/energy
/rack00/node05/temp
/rack00/node05/cpu00/col_user
/rack00/node05/cpu00/instr
/rack00/node05/cpu00/branch-misses
/rack00/node05/cpu01/col_user
/rack00/node05/cpu01/instr
/rack00/node05/cpu01/branch-misses
/rack02/status
/rack02/node03/MemFree
/rack02/node03/energy
/rack02/node03/temp
/rack02/node03/cpu00/col_user
/rack02/node03/cpu00/instr
/rack02/node03/cpu00/branch-misses
/rack02/node03/cpu01/col_user
/rack02/node03/cpu01/instr
/rack02/node03/cpu01/branch-misses
```
Each sensor name is interpreted as a path within the sensor tree. Therefore, the _instr_ and _branch-misses_ sensors
above are leaf nodes whose ancestors are the respective cpu, node and rack entities.
The generated sensor tree can then be used to navigate the sensor hierarchy, and to perform actions such as _retrieving
all sensors belonging to a certain node, to a neighbor of a certain node, or to the rack a certain node belongs to_.
Please refer to the documentation of the _SensorNavigator_ class for more details.
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; In the example above, if sensor names followed a different format (e.g.,
rackXX.nodeXX.cpuXX.sensorName), a hierarchy string would have to be defined explicitly in order to generate the sensor
tree correctly. In this case, such a string would be "_rack\d{2}. node\d{2}. cpu\d{2}._".
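For reference, such a hierarchy string would be set in the global configuration roughly as follows (a sketch; note the
quotes, which ensure that the space-separated expressions are parsed properly):

```
analytics {
hierarchy "rack\d{2}. node\d{2}. cpu\d{2}."
}
```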
> NOTE 2 &ensp;&ensp;&ensp; Sensor trees are always built from the names of sensors _as they are published_. Therefore,
please make sure to use the _-a_ option in DCDB Pusher appropriately, to build sensor names that express the desired hierarchy.
Each operator operates on one or more _units_. A unit represents an abstract (or physical) entity in the system that
is the target of analysis. A unit could be, for example, a rack, a node within a rack, a CPU within a node or an entire HPC system.
Units are identified by three components:
* **Name**: The name of this unit, that corresponds to the entity it represents. For example, _/rack02/node03/_ or _/rack00/node05/cpu01/_ could be unit names. A unit must always correspond to an existing internal node in the current sensor tree;
* **Input**: The set of sensors that constitute the input for analysis conducted on this unit. The sensors must share a hierarchical relationship with the unit: that is, they can either belong to the node represented by this unit, to its subtree, or to one of its ancestors;
* **Output**: The set of output sensors that are produced from any analysis conducted on this unit. The output sensors are always directly associated with the node represented by the unit.
Operators can operate in two different modes:
* **Streaming**: streaming operators perform data analytics online and autonomously, processing incoming sensor data at regular intervals.
The units of streaming operators are completely resolved and instantiated at configuration time. The type of output of streaming
operators is identical to that of _sensors_ in DCDB Pusher, which are pushed to a Collect Agent and finally to the Cassandra datastore,
resulting in a time-series representation;
* **On-demand**: on-demand operators do not perform data analytics autonomously, but only when queried by users. Unlike
streaming operators, the units of on-demand operators are not instantiated at configuration time, but only when a query is performed. When
such an event occurs, the operator verifies that the queried unit belongs to its _unit domain_, and then instantiates it,
resolving its inputs and outputs. The unit is then stored in a local cache for future re-use. The outputs of an on-demand
operator are exposed through the REST API, and are never pushed to the Cassandra database.
Use of streaming operators is advised when a time-series-like output is required, whereas on-demand operators are effective
when data is required at specific times and for specific purposes, and when the unit domain's size makes the use of streaming
operators unfeasible.
The following is instead a list of configuration parameters that are available to all operators:
| Value | Explanation |
|:----- |:----------- |
| default | Name of the template that must be used to configure this operator.
| interval | Specifies how often (in milliseconds) the operator will be invoked to perform computations, and thus the sampling interval of its output sensors. Only used for operators in _streaming_ mode.
| relaxed | If set to _true_, the units of this operator will be instantiated even if some of the respective input sensors do not exist.
| delay | Delay in milliseconds to be applied to the interval of the operator. This parameter can be used to tune how operator pipelines work, ensuring that the next computation stage is started only after the previous one has finished.
| unitCacheLimit | Defines the maximum size of the unit cache that is used in the on-demand and job modes. Default is 1000.
| minValues | Minimum number of readings that need to be stored in output sensors before these are pushed as MQTT messages. Only used for operators in _streaming_ mode.
| mqttPart | Part of the MQTT topic associated to this operator. Only used for the _root_ unit or when the _enforceTopics_ flag is set to true (see this [section](#mqttTopics)).
| enforceTopics | If set to _true_, mqttPart will be forcibly pre-pended to the MQTT topics of all output sensors in the operator (see this [section](#mqttTopics)).
| sync | If set to _true_, computation will be performed at time intervals synchronized with sensor readings.
| disabled | If set to _true_, the operator will be instantiated but will not be started and will not be available for queries.
| duplicate | If set to _false_, only one operator object will be instantiated. Such an operator will perform computation over all units that are instantiated, at every interval, sequentially. If set to _true_, the operator object will be duplicated such that each copy has one unit associated to it. This makes it possible to exploit parallelism between units, but results in separate models in order to avoid race conditions.
| streaming | If set to _true_, the operator will operate in _streaming_ mode, pushing output sensors regularly. If set to _false_, the operator will instead operate in _on-demand_ mode.
| unitInput | Block of input sensors that must be used to instantiate the units of this operator. These can both be a list of strings, or fully-qualified _Sensor_ blocks containing specific attributes (see DCDB Pusher Readme).
| unitOutput | Block of output sensors that will be associated to this operator. These must be _Sensor_ blocks containing valid MQTT suffixes. Note that the number of output sensors is usually fixed depending on the type of operator.
| globalOutput | Block for _global_ output sensors that are not associated with a specific unit. If this is defined, all units described by the _unitInput_ and _unitOutput_ blocks will be grouped under a hierarchical _root_ unit that contains the output sensors described here.
| | |
#### Configuration Syntax <a name="configSyntax"></a>
In the following we show a sample configuration block for the _Aggregator_ plugin. For the full version, please refer to the
default configuration file in the _config_ directory:
```
global {
mqttprefix /analytics
}
template_aggregator def1 {
interval 1000
minValues 3
duplicate false
streaming true
}
aggregator avg1 {
default def1
mqttPart /avg1
unitInput {
sensor col_user
sensor MemFree
}
unitOutput {
sensor sum {
operation sum
mqttsuffix /sum
}
sensor max {
operation maximum
mqttsuffix /max
}
sensor avg {
operation average
mqttsuffix /avg
}
}
}
```
The configuration shown above uses a template _def1_ for some configuration parameters, which are then applied to the
_avg1_ operator. This operator takes the _col_user_ and _MemFree_ sensors as input (which must be available under these names),
and outputs _sum_, _max_, and _avg_ sensors. In this configuration, the Unit System and sensor hierarchy are not used,
and therefore only one generic unit (called the _root_ unit) will be instantiated.
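In this case, the MQTT topics of the output sensors are simply the concatenation of the global MQTT prefix, the
operator's _mqttPart_ and each sensor's suffix:

```
/analytics/avg1/sum
/analytics/avg1/max
/analytics/avg1/avg
```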
#### Instantiating Units <a name="instantiatingUnits"></a>
Here we propose once again the configuration discussed above, this time making use of the Unit System to abstract from
the specific system being used and simplify configuration. The adjusted configuration block is the following:
```
global {
mqttprefix /analytics
}
template_aggregator def1 {
interval 1000
minValues 3
duplicate false
streaming true
}
aggregator avg1 {
default def1
mqttPart /avg1
unitInput {
sensor "<bottomup>col_user"
sensor "<bottomup 1>MemFree"
}
unitOutput {
sensor "<bottomup, filter cpu00>sum" {
operation sum
mqttsuffix /sum
}
sensor "<bottomup, filter cpu00>max" {
operation maximum
mqttsuffix /max
}
sensor "<bottomup, filter cpu00>avg" {
operation average
mqttsuffix /avg
}
}
}
```

For each unit, the framework will navigate to the corresponding sensor node according to its _< >_ block, which
identifies its position in the sensor tree. Each unit, once its inputs and outputs are defined, is then added to the operator.
According to the sensor tree built in the previous [section](#sensorTree), the configuration above would result in
an operator with the following set of _flat_ units:
```
/rack00/node05/cpu00/ {
Inputs {
/rack00/node05/cpu00/col_user
/rack00/node05/MemFree
}
Outputs {
/rack00/node05/cpu00/sum
/rack00/node05/cpu00/max
/rack00/node05/cpu00/avg
}
}
/rack02/node03/cpu00/ {
Inputs {
/rack02/node03/cpu00/col_user
/rack02/node03/MemFree
}
Outputs {
/rack02/node03/cpu00/sum
/rack02/node03/cpu00/max
/rack02/node03/cpu00/avg
}
}
```
#### Instantiating Hierarchical Units <a name="instantiatingHUnits"></a>
A second level of aggregation beyond ordinary units can be obtained by defining sensors in the _globalOutput_ block. In
this case, a series of units will be created like in the previous example, and they will be added as _sub-units_ of a
top-level _root_ unit, which will have as outputs the sensors defined in the _globalOutput_ block. This type of unit
is called a _hierarchical_ unit.
Computation for a hierarchical unit is always performed starting from the top-level unit. This means that all sub-units
will be processed sequentially in the same computation interval, and that they cannot be split across separate operator
instances. However, both the top-level unit and the respective sub-units are exposed to the outside, and their sensors can be queried. Please
refer to the plugins' documentation to see whether hierarchical units are supported or not.
Recalling the previous example, a hierarchical unit can be constructed with the following configuration:
```
global {
mqttprefix /analytics
}
template_aggregator def1 {
interval 1000
minValues 3
duplicate false
streaming true
}
aggregator avg1 {
default def1
mqttPart /avg1
unitInput {
sensor "<bottomup>col_user"
sensor "<bottomup 1>MemFree"
}
unitOutput {
sensor "<bottomup, filter cpu00>sum" {
operation sum
mqttsuffix /sum
}
sensor "<bottomup, filter cpu00>max" {
operation maximum
mqttsuffix /max
}
sensor "<bottomup, filter cpu00>avg" {
operation average
mqttsuffix /avg
}
}
globalOutput {
sensor globalSum {
operation sum
mqttsuffix /globalSum
}
}
}
```
Note that hierarchical units can only have global outputs, but not global inputs, as they are meant to perform aggregation
of the results obtained on single sub-units. Such a configuration would result in the following unit structure:
```
__root__ {
Outputs {
/analytics/avg1/globalSum
}
Sub-units {
/rack00/node05/cpu00/ {
Inputs {
/rack00/node05/cpu00/col_user
/rack00/node05/MemFree
}
Outputs {
/rack00/node05/cpu00/sum
/rack00/node05/cpu00/max
/rack00/node05/cpu00/avg
}
}
/rack02/node03/cpu00/ {
Inputs {
/rack02/node03/cpu00/col_user
/rack02/node03/MemFree
}
Outputs {
/rack02/node03/cpu00/sum
/rack02/node03/cpu00/max
/rack02/node03/cpu00/avg
}
}
}
}
```
> NOTE&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp; The _duplicate_ setting has no effect when hierarchical units are used
(i.e., _globalOutput_ is defined).
> NOTE 2 &ensp;&ensp;&ensp;&ensp;&ensp; As of now the _aggregator_ plugin does not support hierarchical units. The
above is only meant as an example for how hierarchical units can be created in general.
#### Job Operators <a name="joboperators"></a>
_Job Operators_ are a class of operators which act on job-specific data. Such data is structured in _job units_. These
units are _hierarchical_, and work as described previously (see this [section](#instantiatingHUnits)). In particular,
they are arranged as follows:
* The top unit is associated to the job itself and contains all of the required output sensors (_globalOutput_ block);
* One sub-unit is allocated for each node on which the job was running. Each of these sub-units contains all of the input
sensors that are required at configuration time (_unitInput_ block), along with output sensors at the compute node level (_unitOutput_ block).
The computation algorithms driving job operators can then freely navigate this hierarchical unit structure according to
their specific needs. Job-level sensors in the top unit do not require Unit System syntax (see this [section](#mqttTopics));
sensors that are defined in sub-units, if supported by the plugin, need however to be at the compute node level in the
current sensor tree, since all of the sub-units are tied to the nodes on which the job was running. This way,
unit resolution is performed correctly by the Unit System.
Job operators also support the _streaming_ and _on-demand_ modes, which work as follows:
* In **streaming** mode, the job operator will retrieve the list of jobs that were running in the time interval starting
from the last computation to the present; it will then build one job unit for each of them, and subsequently perform computation;
* In **on-demand** mode, users can query a specific job id, for which a job unit is built and computation is performed.
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The _duplicate_ setting does not affect job operators.
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; In order to get units that operate at the _node_ level, the output sensors in the
configuration discussed earlier should have a unit block in the form of < bottomup 1 >.
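As an illustrative sketch, a job operator configuration could look as follows (the _jobaggregator_ plugin and all
sensor names are hypothetical): job-level outputs are placed in the _globalOutput_ block, while per-node outputs use
_< bottomup 1 >_ unit blocks:

```
jobaggregator jobavg1 {
default def1
streaming true
unitInput {
sensor "<bottomup>col_user"
}
unitOutput {
sensor "<bottomup 1>nodeSum" {
operation sum
mqttsuffix /nodeSum
}
}
globalOutput {
sensor jobSum {
operation sum
mqttsuffix /jobSum
}
}
}
```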
#### MQTT Topics <a name="mqttTopics"></a>
The MQTT topics associated to output sensors of a certain operator are constructed in different ways depending
on the unit they belong to:
* **Root unit**: if the output sensors belong to the _root_ unit, that is, they do not belong to any level in the sensor
hierarchy and are uniquely defined, the respective topics are constructed like in DCDB Pusher sensors, by concatenating
the MQTT prefix, operator part and sensor suffix that are defined. The same happens for sensors defined in the _globalOutput_
block, which belong to the top level of a hierarchical unit and thus also to the _root_ unit;
* **Job unit**: if the output sensors belong to a _job_ unit in a job operator (see below), the MQTT topic is constructed
by concatenating the MQTT prefix, the operator part, a job suffix (e.g., /job1334) and finally the sensor suffix;
* **Any other unit**: if the output sensor belongs to any other unit in the sensor tree, its MQTT topic is constructed
by concatenating the MQTT prefix associated to the unit (which is defined as _the portion of the MQTT topic shared by all sensors
belonging to such unit_) and the sensor suffix.
Even for units belonging to the last category, we can enforce arbitrary MQTT topics by enabling the _enforceTopics_ flag.
In this case, the MQTT prefix and operator part are pre-pended to the unit name and sensor suffix. This also applies
to the sub-units of hierarchical units (e.g., for job operators). Recalling the example above, this would lead to the following result:
```
MQTT Prefix /analytics
MQTT Part /avg1
Unit /rack02/node03/cpu00/
Without enforceTopics:
/rack02/node03/cpu00/sum
With enforceTopics:
/analytics/avg1/rack02/node03/cpu00/sum
```
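Topics for _job_ units follow the same pattern, with the job suffix inserted between the operator part and the sensor
suffix; reusing the values above, a hypothetical job with id 1334 would yield:

```
MQTT Prefix /analytics
MQTT Part /avg1
Job suffix /job1334

Resulting topic:
/analytics/avg1/job1334/sum
```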
#### Pipelining Operators <a name="pipelining"></a>
The inputs and outputs of streaming operators can be chained so as to form a processing pipeline. To enable this, users
need to specify the output sensors of one operator as the inputs of another. Note that operators should be defined in an
appropriate order, as the framework cannot infer the correct order of initialization so as to resolve such dependencies.
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; This feature is not supported when using operators in _on-demand_ mode.
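The following is a sketch of such a pipeline (operator and sensor names are illustrative): the _stage2_ operator
consumes the _avg_ sensors produced by _stage1_, and the _delay_ parameter described earlier is used to ensure that
_stage2_ computes only after _stage1_ has finished:

```
aggregator stage1 {
default def1
unitInput {
sensor "<bottomup>col_user"
}
unitOutput {
sensor "<bottomup>avg" {
operation average
mqttsuffix /avg
}
}
}
aggregator stage2 {
default def1
delay 100
unitInput {
sensor "<bottomup>avg"
}
unitOutput {
sensor "<bottomup>avgMax" {
operation maximum
mqttsuffix /avgMax
}
}
}
```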
## Rest API <a name="restApi"></a>
Wintermute provides a REST API that can be used to perform various management operations on the framework. The
API is functionally identical to that of DCDB Pusher, and is hosted at the same address. All requests that are targeted
at the data analytics framework use the _/analytics_ prefix in their resource paths.
> NOTE&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp; Opt. = Optional
> NOTE 2 &ensp;&ensp;&ensp;&ensp;&ensp; The value of operator output sensors can be retrieved with the _compute_ resource, or with the _/average_ resource defined in the DCDB Pusher REST API.
> NOTE 3 &ensp;&ensp;&ensp;&ensp;&ensp; Developers can integrate their custom REST API resources that are plugin-specific, by implementing the _REST_ method in _OperatorTemplate_. To know more about plugin-specific resources, please refer to the respective documentation.
> NOTE 4 &ensp;&ensp;&ensp;&ensp;&ensp; When operators employ a _root_ unit (e.g., when the Unit System is not used or a _globalOutput_ block is defined in regular operators) the _unit_ query can be omitted when performing a _/compute_ action.
### Rest Examples <a name="restExamples"></a>
The following are some examples of REST requests over HTTPS:
* Listing the units associated to the _avgOperator1_ operator in the _aggregator_ plugin:
```bash
GET https://localhost:8000/analytics/units?plugin=aggregator;operator=avgOperator1
```
* Listing the output sensors associated to all operators in the _aggregator_ plugin:
```bash
GET https://localhost:8000/analytics/sensors?plugin=aggregator;
```
* Reloading the _aggregator_ plugin:
```bash
PUT https://localhost:8000/analytics/reload?plugin=aggregator
```
* Stopping the _avgOperator1_ operator in the _aggregator_ plugin:
```bash
PUT https://localhost:8000/analytics/stop?plugin=aggregator;operator=avgOperator1
```
* Performing a query for unit _/node00/cpu03/_ to the _avgOperator1_ operator in the _aggregator_ plugin:
```bash
PUT https://localhost:8000/analytics/compute?plugin=aggregator;operator=avgOperator1;unit=/node00/cpu03/
```
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The analytics REST API requires authentication credentials as well.
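For instance, a query using curl with basic authentication could look as follows (user name and password are
placeholders; _-k_ skips certificate verification for self-signed setups):

```bash
curl -k -u admin:mypassword "https://localhost:8000/analytics/sensors?plugin=aggregator"
```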
The following are the configuration parameters available for the _Regressor_ plugin:
> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; When the _duplicate_ option is enabled, the _outputPath_ field is ignored to avoid file collisions from multiple regressors.
> NOTE 2 &ensp;&ensp;&ensp;&ensp;&ensp; When loading the model from a file and _getImportances_ is set to true, importance values will be printed only if the original model had this feature enabled upon training.
Additionally, input sensors in operators of the Regressor plugin accept the following parameter:
Finally, the Regressor plugin supports the following additional REST API action:

| Action | Explanation |
|:----- |:----------- |
| importances | Returns the sorted importance values for the input features, together with the respective labels, if available.
## Tester Plugin <a name="testerPlugin"></a>
The _Tester_ plugin can be used to test the functionality and performance of the query engine, as well as of the Unit System. It will perform a specified number of queries over the set of input sensors for each unit, and then output as a sensor the total number of retrieved readings. The following are the configuration parameters for operators in the _Tester_ plugin:
| Value | Explanation |
|:----- |:----------- |
Here we describe available plugins in Wintermute that are devoted to the output of data.
## File Sink Plugin <a name="filesinkPlugin"></a>
The _File Sink_ plugin allows writing the output of any other sensor to the local file system. As such, it does not produce output sensors by itself, and only reads from input sensors.
The input sensors can either be fully qualified, or can be described through the Unit System. In this case, multiple input sensors can be generated automatically, and the respective output paths need to be adjusted by enabling the _autoName_ attribute described below, to prevent multiple sensors from being written to the same file. The file sink operators (named sinks) support the following attributes:
| Value | Explanation |
|:----- |:----------- |
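The following is a sketch of a possible sink configuration, assuming a per-sensor _path_ attribute alongside the
_autoName_ attribute mentioned above (the block and attribute names other than _autoName_ are illustrative):

```
filesink sink1 {
default def1
autoName true
unitInput {
sensor "<bottomup>col_user" {
path /tmp/col_user.out
}
}
}
```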
From the _job_aggregator_ example configuration:

aggregator avg1 {
default def1
window 2000
unitInput {
sensor "<bottomup>col_user"
}
globalOutput {
sensor sum {
mqttsuffix /sum
operation sum
}