# Wintermute, the DCDB Data Analytics Framework

### Table of contents
1. [Introduction](#introduction)
2. [DCDB Wintermute](#dcdbanalytics)
    1. [Global Configuration](#globalConfiguration)
    2. [Operators](#operators)
        1. [The Sensor Tree](#sensorTree)
        2. [The Unit System](#unitSystem)
        3. [Operational Modes](#opModes)
        4. [Operator Configuration](#operatorConfiguration)
            1. [Configuration Syntax](#configSyntax)
            2. [Instantiating Units](#instantiatingUnits)
            3. [Instantiating Hierarchical Units](#instantiatingHUnits)
            4. [Job Operators](#joboperators)
            5. [MQTT Topics](#mqttTopics)
            6. [Pipelining Operators](#pipelining)
    3. [Rest API](#restApi)
        1. [List of resources](#listOfResources)
        2. [Examples](#restExamples)
3. [Plugins](#plugins)
    1. [Aggregator Plugin](#averagePlugin)
    2. [Job Aggregator Plugin](#jobaveragePlugin)
    3. [Smoothing Plugin](#smoothingPlugin)
    4. [Regressor Plugin](#regressorPlugin)
    5. [Classifier Plugin](#classifierPlugin)
    6. [Clustering Plugin](#clusteringPlugin)
    7. [CS Signatures Plugin](#csPlugin)
    8. [Tester Plugin](#testerPlugin)
4. [Sink Plugins](#sinkplugins)
    1. [File Sink Plugin](#filesinkPlugin)
5. [Writing Plugins](#writingPlugins)

### Additional Resources

* An **end-to-end usage example** of the Wintermute framework, performing regression, can be found in _operators/regressor_.

# Introduction <a name="introduction"></a>
In this Readme we describe Wintermute, the DCDB Data Analytics framework, and all data abstractions that are associated with it.

# DCDB Wintermute <a name="dcdbanalytics"></a>
The Wintermute framework is built on top of DCDB, and enables data analytics on sensor data
in a variety of ways. Wintermute can be deployed both in DCDB Pusher and Collect Agent, with some minor
differences:

* **DCDB Pusher**: only sensor data that is sampled locally and that is contained within the sensor cache can be used for
data analytics. However, this is the preferable way to deploy simple models at large scale, as all computation is
performed within compute nodes, dramatically increasing scalability;
* **DCDB Collect Agent**: all available sensor data, in the local cache and in the Cassandra database, can be used for data
analytics. This approach is preferable for models that require data from multiple sources at once
(e.g., clustering-based anomaly detection), or for models that are deployed in [on-demand](#operatorConfiguration) mode.

## Global Configuration <a name="globalConfiguration"></a>
Wintermute shares the same configuration structure as DCDB Pusher and Collect Agent, and it can be enabled via the
respective (i.e., _dcdbpusher.conf_ or _collectagent.conf_) configuration file. All output sensors of the framework
are therefore affected by the configuration parameters described in the global Readme. Additional parameters specific to
this framework are the following:

| Value | Explanation |
|:----- |:----------- |
| **analytics** | Wrapper structure for the data analytics-specific values.
| hierarchy | Space-separated sequence of regular expressions used to infer the local (DCDB Pusher) or global (DCDB Collect Agent) sensor hierarchy. This parameter should be wrapped in quotes to ensure proper parsing. See the Sensor Tree [section](#sensorTree) for more details.
| filter | Regular expression used to filter the set of sensors in the sensor tree. Everything that matches is included, the rest is discarded.
| jobFilter | Regular expression used to filter the jobs processed by job operators. The expression is applied to all nodes of the job's nodelist to extract certain information (e.g., rack or island).
| jobMatch | String against which the node names filtered through the _jobFilter_ are checked, to determine if a job is to be processed (see this [section](#joboperators)).
| jobIdFilter | Like the jobFilter, this is a regular expression used to filter out jobs that do not match it. In this case, the job ID is checked against the regex and the job is discarded if a match is not found.
| **operatorPlugins** | Block containing the specification of all data analytics plugins to be instantiated.
| plugin _name_ | The plugin name is used to build the corresponding lib-name (e.g. average --> libdcdboperator_average.1.0)
| path | Specify the path where the plugin (the shared library) is located. If left empty, DCDB will look in the default lib-directories (usr/lib and friends) for the plugin file.
| config | One can specify a separate config-file (including path to it) for the plugin to use. If not specified, DCDB will look up pluginName.conf (e.g. average.conf) in the same directory where global.conf is located.
| | |

## Operators <a name="operators"></a>
Operators are the basic building blocks in Wintermute. An operator is instantiated within a plugin, performs a specific
task and acts on sets of inputs and outputs called _units_. Operators are functionally equivalent to _sensor groups_
in DCDB Pusher, but instead of sampling data, they process such data and output new sensors. Some high-level examples
of operators are the following:

* An operator that performs time-series regression on a particular input sensor, and outputs its prediction;
* An operator that aggregates a series of input sensors, builds feature vectors, and performs machine
learning-based tasks using a supervised model;
* An operator that performs clustering-based anomaly detection by using different sets of inputs associated with different
compute nodes;
* An operator that outputs statistical features related to the time series of a certain input sensor.

### The Sensor Tree <a name="sensorTree"></a>
Before diving into the configuration and instantiation of operators, we introduce the concept of _sensor tree_. A sensor
tree is simply a data structure expressing the hierarchy of sensors that are being sampled; internal nodes express
hierarchical entities (e.g., clusters, racks, nodes, cpus), whereas leaf nodes express actual sensors. In DCDB Pusher,
a sensor tree refers only to the local hierarchy, while in the Collect Agent it can capture the hierarchy of the entire
system being sampled.

A sensor tree is built at initialization time of DCDB Wintermute, and is implemented in the _SensorNavigator_ class.
By default, if no hierarchy string has been specified in the configuration, the tree is built automatically by assuming
that each forward slash-separated part of the sensor name expresses a level in the hierarchy. The total depth of the
tree is thus determined at runtime as well. This is, in most cases, the preferable configuration, as it complies with
the MQTT topic standard, and interprets each sensor name as if it were a path in a file system.

For special cases, the _hierarchy_ global configuration parameter can be used to enforce a specific hierarchy, with
a fixed number of levels. As an example of the default behavior, consider the following set of forward slash-separated
sensor names, from which we can construct a sensor tree with three levels corresponding to racks, nodes and CPUs respectively:

```
/rack00/status
/rack00/node05/MemFree
/rack00/node05/energy
/rack00/node05/temp
/rack00/node05/cpu00/col_user
/rack00/node05/cpu00/instr
/rack00/node05/cpu00/branch-misses
/rack00/node05/cpu01/col_user
/rack00/node05/cpu01/instr
/rack00/node05/cpu01/branch-misses
/rack02/status
/rack02/node03/MemFree
/rack02/node03/energy
/rack02/node03/temp
/rack02/node03/cpu00/col_user
/rack02/node03/cpu00/instr
/rack02/node03/cpu00/branch-misses
/rack02/node03/cpu01/col_user
/rack02/node03/cpu01/instr
/rack02/node03/cpu01/branch-misses
``` 

Each sensor name is interpreted as a path within the sensor tree. Therefore, the _instr_ and _branch-misses_ sensors
will be placed as leaf nodes in the deepest level of the tree, as children of the respective cpu node they belong to.
Such cpu nodes will be in turn children of the nodes they belong to, and so on.

The generated sensor tree can then be used to navigate the sensor hierarchy, and perform actions such as _retrieving
all sensors belonging to a certain node, to a neighbor of a certain node, or to the rack a certain node belongs to_.
Please refer to the documentation of the _SensorNavigator_ class for more details.
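
As an illustration of the default behavior, the following Python sketch builds a tree from forward slash-separated sensor names and retrieves the sensors attached to a given node. It is not the _SensorNavigator_ implementation: the nested-dictionary layout and the `build_sensor_tree` and `sensors_of` helpers are assumptions made only for this example.

```python
def build_sensor_tree(sensor_names):
    """Build a nested-dict tree: internal nodes are dicts, sensors are leaves (None)."""
    tree = {}
    for name in sensor_names:
        parts = [p for p in name.split("/") if p]
        node = tree
        for part in parts[:-1]:      # every part but the last is a hierarchy level
            node = node.setdefault(part, {})
        node[parts[-1]] = None       # the last part is the sensor itself
    return tree


def sensors_of(tree, node_path):
    """Return the sensors attached directly to the node at node_path."""
    node = tree
    for part in (p for p in node_path.split("/") if p):
        node = node[part]
    return sorted(key for key, value in node.items() if value is None)


names = [
    "/rack00/status",
    "/rack00/node05/MemFree",
    "/rack00/node05/energy",
    "/rack00/node05/cpu00/instr",
    "/rack00/node05/cpu00/branch-misses",
    "/rack00/node05/cpu01/instr",
]
tree = build_sensor_tree(names)
print(sensors_of(tree, "/rack00/node05"))        # ['MemFree', 'energy']
print(sensors_of(tree, "/rack00/node05/cpu00"))  # ['branch-misses', 'instr']
```

The depth of the tree is not fixed in advance: it follows from the longest sensor name, which mirrors the statement that the total depth is determined at runtime.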

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; In the example above, if sensor names used a different format (e.g.,
rackXX.nodeXX.cpuXX.sensorName), we would have to define a hierarchy string explicitly in order to generate the sensor
tree correctly. In this case, such a string would be "_rack\d{2}. node\d{2}. cpu\d{2}._".

> NOTE 2 &ensp;&ensp;&ensp; Sensor trees are always built from the names of sensors _as they are published_. Therefore,
please make sure to use the _-a_ option in DCDB Pusher appropriately, to build sensor names that express the desired hierarchy.
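
To make the role of the hierarchy string in the first note more concrete, the Python sketch below splits a dot-separated sensor name into levels by applying one regular expression per level, in order. How the _SensorNavigator_ class actually applies these expressions is not described here, so the matching strategy and the `split_by_hierarchy` helper are assumptions.

```python
import re


def split_by_hierarchy(sensor_name, hierarchy):
    """Split a sensor name into tree levels using one regex per level."""
    levels, rest = [], sensor_name
    for pattern in hierarchy.split():
        match = re.match(pattern, rest)
        if match is None:
            break                    # the sensor sits above this level
        levels.append(match.group(0))
        rest = rest[match.end():]
    return levels, rest              # rest is the sensor name proper


hierarchy = r"rack\d{2}. node\d{2}. cpu\d{2}."
print(split_by_hierarchy("rack00.node05.cpu01.instr", hierarchy))
# (['rack00.', 'node05.', 'cpu01.'], 'instr')
print(split_by_hierarchy("rack00.node05.MemFree", hierarchy))
# (['rack00.', 'node05.'], 'MemFree')
```

Note that the trailing `.` in each expression is a regex wildcard, so it matches the separator character whatever it is.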


### The Unit System <a name="unitSystem"></a>
Each operator operates on one or more _units_. A unit represents an abstract (or physical) entity in the current system that
is the target of analysis. A unit could be, for example, a rack, a node within a rack, a CPU within a node or an entire HPC system.
Units are identified by three components:

* **Name**: The name of this unit, which corresponds to the entity it represents. For example, _/rack02/node03/_ or _/rack00/node05/cpu01/_ could be unit names. A unit must always correspond to an existing internal node in the current sensor tree;
* **Input**: The set of sensors that constitute the input for analysis conducted on this unit. The sensors must share a hierarchical relationship with the unit: that is, they can either belong to the node represented by this unit, to its subtree, or to one of its ancestors;
* **Output**: The set of output sensors that are produced from any analysis conducted on this unit. The output sensors are always directly associated with the node represented by the unit.

Units are a way to define _patterns_ in the sensor tree and retrieve sensors that are associated with each other by a
hierarchical relationship. See the configuration [section](#instantiatingUnits) for more details on how to create
templates in order to define units suitable for operators.

### Operational Modes <a name="opModes"></a>
Operators can operate in two different modes:

* **Streaming**: streaming operators perform data analytics online and autonomously, processing incoming sensor data at regular intervals.
The units of streaming operators are completely resolved and instantiated at configuration time. The type of output of streaming
operators is identical to that of _sensors_ in DCDB Pusher, which are pushed to a Collect Agent and finally to the Cassandra datastore,
resulting in a time-series representation;
* **On-demand**: on-demand operators do not perform data analytics autonomously, but only when queried by users. Unlike
for streaming operators, the units of on-demand operators are not instantiated at configuration time, but only when a query is performed. When
such an event occurs, the operator verifies that the queried unit belongs to its _unit domain_, and then instantiates it,
resolving its inputs and outputs. Then, the unit is stored in a local cache for future re-use. The outputs of an on-demand
operator are exposed through the REST API, and are never pushed to the Cassandra database.

Use of streaming operators is advised when a time-series-like output is required, whereas on-demand operators are effective
when data is required at specific times and for specific purposes, and when the unit domain's size makes the use of streaming
operators infeasible.
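
The on-demand workflow described above (check the unit domain, instantiate the unit, cache it) can be sketched in Python as follows. This is only a conceptual model: the `OnDemandOperator` class, the dictionary used to represent a unit, and the first-in-first-out eviction policy are assumptions, and the real cache behavior may differ.

```python
from collections import OrderedDict


class OnDemandOperator:
    """Toy model of an on-demand operator with a bounded unit cache."""

    def __init__(self, unit_domain, cache_limit=1000):
        self.unit_domain = set(unit_domain)
        self.cache_limit = cache_limit
        self.cache = OrderedDict()

    def query(self, unit_name):
        # 1. The queried unit must belong to the operator's unit domain.
        if unit_name not in self.unit_domain:
            raise KeyError(f"{unit_name} is not in the unit domain")
        # 2. Re-use the unit if it was instantiated by an earlier query.
        unit = self.cache.get(unit_name)
        if unit is None:
            # 3. Otherwise instantiate it (input/output resolution omitted here).
            unit = {"name": unit_name, "inputs": [], "outputs": []}
            self.cache[unit_name] = unit
            if len(self.cache) > self.cache_limit:
                self.cache.popitem(last=False)   # assumed policy: evict the oldest unit
        return unit


op = OnDemandOperator(["/rack00/node05/", "/rack02/node03/"], cache_limit=1)
first = op.query("/rack00/node05/")
print(op.query("/rack00/node05/") is first)   # True: served from the cache
op.query("/rack02/node03/")                    # exceeds the limit, oldest unit is evicted
print(list(op.cache))                          # ['/rack02/node03/']
```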

### Operator Configuration <a name="operatorConfiguration"></a>
Here we describe how to configure and instantiate operators in Wintermute. The configuration scheme is very similar
to that of _sensor groups_ in DCDB Pusher, and a _global_ configuration block can be defined in each plugin configuration
file. The following is instead a list of configuration parameters that are available for the operators themselves:

| Value | Explanation |
|:----- |:----------- |
| default | Name of the template that must be used to configure this operator.
| interval | Specifies how often (in milliseconds) the operator will be invoked to perform computations, and thus the sampling interval of its output sensors. Only used for operators in _streaming_ mode.
| queueSize | Maximum number of readings to queue. Default is 1024.
| relaxed | If set to _true_, the units of this operator will be instantiated even if some of the respective input sensors do not exist.
| delay | Delay in milliseconds to be applied to the interval of the operator. This parameter can be used to tune how operator pipelines work, ensuring that the next computation stage is started only after the previous one has finished.
| unitCacheLimit | Defines the maximum size of the unit cache that is used in the on-demand and job modes. Default is 1000.
| minValues | Minimum number of readings that need to be stored in output sensors before these are pushed as MQTT messages. Only used for operators in _streaming_ mode.
| mqttPart | Part of the MQTT topic associated with this operator. Only used for the _root_ unit or when the _enforceTopics_ flag is set to true (see this [section](#mqttTopics)).
| enforceTopics | If set to _true_, mqttPart will be forcibly prepended to the MQTT topics of all output sensors in the operator (see this [section](#mqttTopics)).
| sync | If set to _true_, computation will be performed at time intervals synchronized with sensor readings.
| disabled | If set to _true_, the operator will be instantiated but will not be started and will not be available for queries.
| duplicate | If set to _false_, only one operator object will be instantiated. Such an operator will perform computation over all units that are instantiated, at every interval, sequentially. If set to _true_, the operator object will be duplicated such that each copy will have one unit associated with it. This makes it possible to exploit parallelism between units, but results in separate models to avoid race conditions.
| streaming | If set to _true_, the operator will operate in _streaming_ mode, pushing output sensors regularly. If set to _false_, the operator will instead operate in _on-demand_ mode.
| unitInput | Block of input sensors that must be used to instantiate the units of this operator. These can be either a list of strings or fully-qualified _Sensor_ blocks containing specific attributes (see DCDB Pusher Readme).
| unitOutput | Block of output sensors that will be associated with this operator. These must be _Sensor_ blocks containing valid MQTT suffixes. Note that the number of output sensors is usually fixed depending on the type of operator.
| globalOutput | Block for _global_ output sensors that are not associated with a specific unit. If this is defined, all units described by the _unitInput_ and _unitOutput_ blocks will be grouped under a hierarchical _root_ unit that contains the output sensors described here.
| | |

#### Configuration Syntax <a name="configSyntax"></a>
In the following we show a sample configuration block for the _Aggregator_ plugin. For the full version, please refer to the
default configuration file in the _config_ directory:

```
global {
	mqttprefix /analytics
}

template_aggregator def1 {
	interval	1000
	minValues	3
	duplicate 	false
	streaming	true
}

aggregator avg1 {
	default     def1
	mqttPart    /avg1

	unitInput {
		sensor col_user
		sensor MemFree
	}

	unitOutput {
		sensor sum {
			operation 	sum
			mqttsuffix  /sum
		}

		sensor max {
			operation	maximum
			mqttsuffix  /max
		}

		sensor avg {
			operation	average
			mqttsuffix  /avg
		}
	}
}
``` 

The configuration shown above uses a template _def1_ for some configuration parameters, which are then applied to the
_avg1_ operator. This operator takes the _col_user_ and _MemFree_ sensors as input (which must be available under this name),
and outputs _sum_, _max_, and _avg_ sensors. In this configuration, the Unit System and sensor hierarchy are not used,
and therefore only one generic unit (called the _root_ unit) will be instantiated.

#### Instantiating Units <a name="instantiatingUnits"></a>
Here we revisit the configuration discussed above, this time making use of the Unit System to abstract from
the specific system being used and simplify configuration. The adjusted configuration block is the following:

```
global {
	mqttprefix /analytics
}

template_aggregator def1 {
	interval	1000
	minValues	3
	duplicate 	false
	streaming	true
}

aggregator avg1 {
	default     def1
	mqttPart    /avg1

	unitInput {
		sensor "<bottomup>col_user"
		sensor "<bottomup 1>MemFree"
	}

	unitOutput {
		sensor "<bottomup, filter cpu00>sum" {
			operation 	sum
			mqttsuffix  /sum
		}

		sensor "<bottomup, filter cpu00>max" {
			operation 	maximum
			mqttsuffix  /max
		}

		sensor "<bottomup, filter cpu00>avg" {
			operation 	average
			mqttsuffix  /avg
		}
	}
}
``` 

In each sensor declaration, the _< >_ block is a placeholder that will be replaced with the name of the units that will
be associated with the operator, thus resolving the sensor names. This block makes it possible to navigate the current sensor tree,
and select nodes that will constitute the units. Its syntax is the following:

```
< bottomup|topdown X, filter Y >SENSORNAME 
``` 

The first section specifies the _level_ in the sensor tree at which nodes must be selected. _bottomup X_ and _topdown X_
respectively mean _"search X levels up from the deepest level in the sensor tree"_, and _"search X levels down from the
topmost level in the sensor tree"_. The _X_ numerical value can be omitted as well.

The second section, on the other hand, allows searching the sensor tree _horizontally_. Within the level specified in the
first section of the configuration block, only the nodes whose names match the regular expression Y will be selected.
This way, we can navigate the current sensor tree both vertically and horizontally, and easily instantiate units starting
from nodes in the tree. The set of nodes in the current sensor tree that match the specified configuration block is
defined as the _unit domain_ of the operator.
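
As a rough illustration of this syntax, the Python sketch below splits a sensor declaration into its vertical section, horizontal section and sensor name. The regular expression and the `parse_unit_expression` helper are hypothetical and are not taken from the Wintermute sources; treating an omitted _X_ as 0 is an assumption that is consistent with the examples in this section.

```python
import re

UNIT_EXPR = re.compile(
    r"^<\s*(?P<direction>bottomup|topdown)"       # vertical search direction
    r"(?:\s+(?P<level>\d+))?"                     # optional number of levels X
    r"(?:\s*,\s*filter\s+(?P<filter>[^>]+?))?"    # optional horizontal filter Y
    r"\s*>(?P<sensor>.+)$"                        # sensor name after the block
)


def parse_unit_expression(text):
    """Return the parts of a '<...>SENSORNAME' declaration, or None if there is no block."""
    match = UNIT_EXPR.match(text.strip())
    if match is None:
        return None                               # plain sensor name, e.g. "col_user"
    return {
        "direction": match.group("direction"),
        "level": int(match.group("level") or 0),  # assumption: omitted X means 0
        "filter": match.group("filter"),
        "sensor": match.group("sensor").strip(),
    }


print(parse_unit_expression("<bottomup 1>MemFree"))
print(parse_unit_expression("<bottomup, filter cpu00>sum"))
print(parse_unit_expression("col_user"))
```

This simplified pattern does not support filters that themselves contain a `>` character.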

The configuration algorithm then works in two steps:

1. The _output_ block of the operator is read, and its unit domain is determined; this implies that all sensors in the
output block must share the same _< >_ block, and therefore match the same unit domain;
2. For each unit in the domain, its input sensors are identified. We start from the _unit_ node in the sensor tree, and
navigate to the corresponding sensor node according to its _< >_ block, which identifies its level in the tree. Each
unit, once its inputs and outputs are defined, is then added to the operator.
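
The two steps above can be sketched in Python as follows. This is a simplified model rather than the actual configuration algorithm: it only supports _bottomup_ blocks, it only resolves inputs located at the unit's own level or at one of its ancestors, and it applies the filter to the full node path, all of which are assumptions made for this example. The `resolve_units` helper and its parameters are hypothetical.

```python
import re


def internal_nodes(sensors):
    """Map each internal node (tuple of path parts) to its depth (0 = topmost level)."""
    nodes = {}
    for name in sensors:
        parts = name.strip("/").split("/")[:-1]   # drop the sensor (leaf) itself
        for depth in range(len(parts)):
            nodes[tuple(parts[:depth + 1])] = depth
    return nodes


def path(parts):
    return "/" + "/".join(parts) + "/"


def resolve_units(sensors, inputs, output_names, out_level, out_filter=None):
    """inputs: (bottomup_level, sensor_name) pairs; outputs share one < > block."""
    nodes = internal_nodes(sensors)
    deepest = max(nodes.values())
    available = set(sensors)
    # Step 1: the output block determines the unit domain.
    domain = sorted(
        node for node, depth in nodes.items()
        if depth == deepest - out_level
        and (out_filter is None or re.search(out_filter, path(node)))
    )
    units = {}
    for node in domain:
        # Step 2: navigate from the unit node to the level of each input sensor.
        resolved = []
        for in_level, sensor in inputs:
            if in_level < out_level:
                raise ValueError("inputs below the unit are not handled in this sketch")
            resolved.append(path(node[:deepest - in_level + 1]) + sensor)
        if all(name in available for name in resolved):
            units[path(node)] = {
                "inputs": resolved,
                "outputs": [path(node) + name for name in output_names],
            }
    return units


sensors = [
    "/rack00/status", "/rack00/node05/MemFree",
    "/rack00/node05/cpu00/col_user", "/rack00/node05/cpu01/col_user",
    "/rack02/status", "/rack02/node03/MemFree",
    "/rack02/node03/cpu00/col_user", "/rack02/node03/cpu01/col_user",
]
units = resolve_units(sensors, [(0, "col_user"), (1, "MemFree")],
                      ["sum", "max", "avg"], out_level=0, out_filter="cpu00")
for name, unit in units.items():
    print(name, unit["inputs"], unit["outputs"])
```

With the inputs and outputs of the _avg1_ operator, the sketch yields one unit for each _cpu00_ node.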

According to the sensor tree built in the previous [section](#sensorTree), the configuration above would result in
an operator with the following set of _flat_ units:

```
/rack00/node05/cpu00/ {
	Inputs {
		/rack00/node05/cpu00/col_user
		/rack00/node05/MemFree
	}

	Outputs {
		/rack00/node05/cpu00/sum
		/rack00/node05/cpu00/max
		/rack00/node05/cpu00/avg
	}
}

/rack02/node03/cpu00/ {
	Inputs {
		/rack02/node03/cpu00/col_user
		/rack02/node03/MemFree
	}

	Outputs {
		/rack02/node03/cpu00/sum
		/rack02/node03/cpu00/max
		/rack02/node03/cpu00/avg
	}
}
``` 

#### Instantiating Hierarchical Units <a name="instantiatingHUnits"></a>
A second level of aggregation beyond ordinary units can be obtained by defining sensors in the _globalOutput_ block. In
this case, a series of units will be created like in the previous example, and they will be added as _sub-units_ of a 
top-level _root_ unit, which will have as outputs the sensors defined in the _globalOutput_ block. This type of unit
is called a _hierarchical_ unit.
 
 
Computation for a hierarchical unit is always performed starting from the top-level unit. This means that all sub-units
will be processed sequentially in the same computation interval, and that they cannot be split. However, both the
top-level unit and the respective sub-units are exposed to the outside, and their sensors can be queried. Please
refer to the plugins' documentation to see whether hierarchical units are supported or not.

Recalling the previous example, a hierarchical unit can be constructed with the following configuration:

```
global {
	mqttprefix /analytics
}

template_aggregator def1 {
	interval	1000
	minValues	3
	duplicate 	false
	streaming	true
}

aggregator avg1 {
	default     def1
	mqttPart    /avg1

	unitInput {
		sensor "<bottomup>col_user"
		sensor "<bottomup 1>MemFree"
	}

	unitOutput {
		sensor "<bottomup, filter cpu00>sum" {
			operation 	sum
			mqttsuffix  /sum
		}

		sensor "<bottomup, filter cpu00>max" {
			operation 	maximum
			mqttsuffix  /max
		}

		sensor "<bottomup, filter cpu00>avg" {
			operation 	average
			mqttsuffix  /avg
		}
	}
	
	globalOutput {
		sensor globalSum {
			operation 	sum
			mqttsuffix  /globalSum
		}
	}
}
``` 

Note that hierarchical units can only have global outputs, but not global inputs, as they are meant to perform aggregation
of the results obtained on single sub-units. Such a configuration would result in the following unit structure:

```
__root__ {
	Outputs {
		/analytics/avg1/globalSum
	}
	
	Sub-units {
		/rack00/node05/cpu00/ {
			Inputs {
				/rack00/node05/cpu00/col_user
				/rack00/node05/MemFree
			}
			
			Outputs {
				/rack00/node05/cpu00/sum
				/rack00/node05/cpu00/max
				/rack00/node05/cpu00/avg
			}
		}
		
		/rack02/node03/cpu00/ {
			Inputs {
				/rack02/node03/cpu00/col_user
				/rack02/node03/MemFree
			}
								
			Outputs {
				/rack02/node03/cpu00/sum
				/rack02/node03/cpu00/max
				/rack02/node03/cpu00/avg
			}
		}
	}
}
``` 

> NOTE&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp; The _duplicate_ setting has no effect when hierarchical units are used 
(i.e., _globalOutput_ is defined).

> NOTE 2 &ensp;&ensp;&ensp;&ensp;&ensp; As of now the _aggregator_ plugin does not support hierarchical units. The
above is only meant as an example for how hierarchical units can be created in general.

#### Job Operators <a name="joboperators"></a>

_Job Operators_ are a class of operators which act on job-specific data. Such data is structured in _job units_. These
units are _hierarchical_, and work as described previously (see this [section](#instantiatingHUnits)). In particular,
 they are arranged as follows:

* The top unit is associated to the job itself and contains all of the required output sensors (_globalOutput_ block);
* One sub-unit for each node on which the job was running is allocated. Each of these sub-units contains all of the input
sensors that are required at configuration time (_unitInput_ block), along with output sensors at the compute node level (_unitOutput_ block).

The computation algorithms driving job operators can then freely navigate this hierarchical unit design according to
their specific needs. Job-level sensors in the top unit do not require Unit System syntax (see this [section](#mqttTopics));
sensors that are defined in sub-units, if supported by the plugin, must however be at the compute node level in the
current sensor tree, since all of the sub-units are tied to the nodes on which the job was running. This way,
unit resolution is performed correctly by the Unit System.

Job operators also support the _streaming_ and _on-demand_ modes, which work as follows:

* In **streaming** mode, the job operator will retrieve the list of jobs that were running in the time interval starting
from the last computation to the present; it will then build one job unit for each of them, and subsequently perform computation;
* In **on-demand** mode, users can query a specific job id, for which a job unit is built and computation is performed.

A filtering mechanism can also be applied to select which jobs an operator should process. The default filtering policy uses
two parameters: a job _filter_ regular expression and a job _match_ string. When a job first appears in the system, the
job filter regex is applied to all of the node names in its nodelist. This regex could extract, for example, the portion
of the node name that encodes a certain _rack_ or _island_ in an HPC system. Then, the frequency of each filtered
node name is computed, and the mode (i.e., the most frequent value) is determined. If the mode corresponds to the job
_match_ string, the job is assigned to the operator. This policy can be overridden and changed on a per-plugin basis.
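
A minimal Python sketch of this default policy is shown below. It assumes that "filtering" a node name means keeping the substring matched by the job filter regex; the `job_matches` helper and the sample node names are hypothetical.

```python
import re
from collections import Counter


def job_matches(nodelist, job_filter, job_match):
    """Return True if the most frequent filtered node name equals job_match."""
    filtered = []
    for node in nodelist:
        match = re.search(job_filter, node)
        if match:
            filtered.append(match.group(0))   # e.g. the rack portion of the name
    if not filtered:
        return False
    mode, _count = Counter(filtered).most_common(1)[0]
    return mode == job_match


nodes = ["/rack00/node01", "/rack00/node02", "/rack02/node07"]
print(job_matches(nodes, r"rack\d{2}", "rack00"))  # True
print(job_matches(nodes, r"rack\d{2}", "rack02"))  # False
```

In this sketch a tie between two equally frequent values is resolved in favor of the one encountered first, which is a property of `Counter.most_common` and not necessarily of the framework.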

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The _duplicate_ setting does not affect job operators.

> NOTE 2 &ensp;&ensp;&ensp; In order to get units that operate at the _node_ level, the output sensors in the
configuration discussed earlier should have a unit block in the form of `< bottomup 1 >`.

#### MQTT Topics <a name="mqttTopics"></a>
The MQTT topics associated with the output sensors of a certain operator are constructed in different ways depending
on the unit they belong to:

* **Root unit**: if the output sensors belong to the _root_ unit, that is, they do not belong to any level in the sensor
hierarchy and are uniquely defined, the respective topics are constructed like in DCDB Pusher sensors, by concatenating
the MQTT prefix, operator part and sensor suffix that are defined. The same happens for sensors defined in the _globalOutput_
block, which are part of the top level in a hierarchical unit, which also corresponds to the _root_ unit;
* **Job unit**: if the output sensors belong to a _job_ unit in a job operator (see this [section](#joboperators)), the MQTT topic is constructed
by concatenating the MQTT prefix, the operator part, a job suffix (e.g., /job1334) and finally the sensor suffix;
* **Any other unit**: if the output sensor belongs to any other unit in the sensor tree, its MQTT topic is constructed
by concatenating the MQTT prefix associated with the unit (which is defined as _the portion of the MQTT topic shared by all sensors
belonging to such unit_) and the sensor suffix.

Even for units belonging to the last category, we can enforce arbitrary MQTT topics by enabling the _enforceTopics_ flag.
When it is set, the MQTT prefix and operator part are prepended to the unit name and sensor suffix. This is also enforced
in the sub-units of hierarchical units (e.g., for job operators). Recalling the example above, this would lead to the following result:

``` 
MQTT Prefix	/analytics
MQTT Part	/avg1
Unit 		/rack02/node03/cpu00/

Without enforceTopics:
	/rack02/node03/cpu00/sum
With enforceTopics:
	/analytics/avg1/rack02/node03/cpu00/sum
``` 
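
The rules above can be summarized with the following Python sketch. The `output_topic` function is hypothetical, and it approximates the MQTT prefix of a unit with the unit name itself, which holds for the examples in this Readme but is an assumption in general.

```python
def output_topic(prefix, part, suffix, unit=None, job_id=None, enforce_topics=False):
    """Build the MQTT topic of an operator output sensor (simplified model)."""
    if job_id is not None:                 # job unit: prefix + part + job suffix + suffix
        return f"{prefix}{part}/job{job_id}{suffix}"
    if unit is None:                       # root unit: prefix + part + suffix
        return f"{prefix}{part}{suffix}"
    unit = unit.rstrip("/")
    if enforce_topics:                     # prefix and part prepended to the unit name
        return f"{prefix}{part}{unit}{suffix}"
    return f"{unit}{suffix}"               # any other unit: unit prefix + suffix


unit = "/rack02/node03/cpu00/"
print(output_topic("/analytics", "/avg1", "/sum", unit=unit))
# /rack02/node03/cpu00/sum
print(output_topic("/analytics", "/avg1", "/sum", unit=unit, enforce_topics=True))
# /analytics/avg1/rack02/node03/cpu00/sum
print(output_topic("/analytics", "/avg1", "/globalSum"))
# /analytics/avg1/globalSum
print(output_topic("/analytics", "/avg1", "/sum", job_id=1334))
# /analytics/avg1/job1334/sum
```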

Like ordinary sensors in DCDB Pusher, operator output sensors can also be published via the _auto-publish_ feature, and metadata can be specified for them. For more details, refer to the README of DCDB Pusher.

#### Pipelining Operators <a name="pipelining"></a>

The inputs and outputs of streaming operators can be chained so as to form a processing pipeline. To enable this, users
need to configure operators by enabling the _relaxed_ configuration parameter, and by selecting as input the output sensors
of other operators. Enabling _relaxed_ is necessary because the operators are instantiated sequentially at startup, and
the framework cannot infer the correct order of initialization so as to resolve all dependencies transparently.

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; This feature is not supported when using operators in _on demand_ mode.

## Rest API <a name="restApi"></a>
Wintermute provides a REST API that can be used to perform various management operations on the framework. The 
API is functionally identical to that of DCDB Pusher, and is hosted at the same address. All requests that are targeted
at the data analytics framework must have a resource path starting with _/analytics_.

### List of resources <a name="listOfResources"></a>

The `/analytics` prefix is omitted from the resource paths listed below.

<table>
  <tr>
    <td colspan="2"><b>Resource</b></td>
    <td colspan="2">Description</td>
  </tr>
  <tr>
  	<td>Query</td>
  	<td>Value</td>
  	<td>Opt.</td>
  	<td>Description</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /help</b></td>
    <td colspan="2">Return a cheatsheet of possible analytics REST API endpoints.</td>
  </tr>
  <tr>
  	<td colspan="4">No queries.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /plugins</b></td>
    <td colspan="2">List all currently loaded data analytic plugins.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /sensors</b></td>
    <td colspan="2">List all sensors of a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All operator plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>operator</td>
  	<td>All operators of a plugin.</td>
  	<td>Yes</td>
  	<td>Restrict sensor list to an operator.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /units</b></td>
    <td colspan="2">List all units of a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All operator plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>operator</td>
  	<td>All operators of a plugin.</td>
  	<td>Yes</td>
  	<td>Restrict unit list to an operator.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /operators</b></td>
    <td colspan="2">List all operators of a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All operator plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /reload</b></td>
    <td colspan="2">Reload configuration and initialization of all or only a specific operator plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>Yes</td>
  	<td>Reload only the specified plugin.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /navigator</b></td>
    <td colspan="2">Rebuild the Sensor Navigator used for instantiating operators.</td>
  </tr>
  <tr>
  	<td colspan="4">No queries.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /compute</b></td>
    <td colspan="2">Query the given operator for a certain input unit. Intended for "on-demand" operators, but works with "streaming" operators as well.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>operator</td>
  	<td>All operator names of a plugin.</td>
  	<td>No</td>
  	<td>Specify the operator within the plugin.</td>
  </tr>
  <tr>
  	<td>unit</td>
  	<td>All units of a plugin.</td>
  	<td>Yes</td>
  	<td>Select the target unit. Defaults to the root unit if not specified.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /operator</b></td>
    <td colspan="2">Perform a custom REST PUT action defined at operator level. See the operator plugin documentation for such actions.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>action</td>
  	<td>See operator plugin documentation.</td>
  	<td>No</td>
  	<td>Select custom action.</td>
  </tr>
  <tr>
  	<td>operator</td>
  	<td>All operators of a plugin.</td>
  	<td>Yes</td>
  	<td>Specify the operator within the plugin.</td>
  </tr>
  <tr>
  	<td colspan="4">Custom action may require or allow for more queries!</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>POST /start</b></td>
    <td colspan="2">Start all or only a specific plugin. Or only start a specific streaming operator within a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>Yes</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>operator</td>
  	<td>All operator names of a plugin.</td>
  	<td>Yes</td>
  	<td>Only start the specified operator. Requires a plugin to be specified. Limited to streaming operators.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>POST /stop</b></td>
    <td colspan="2">Stop all or only a specific plugin. Or only stop a specific streaming operator within a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>Yes</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>operator</td>
  	<td>All operator names of a plugin.</td>
  	<td>Yes</td>
  	<td>Only stop the specified operator. Requires a plugin to be specified. Limited to streaming operators.</td>
  </tr>
</table>

> NOTE&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp; Opt. = Optional

> NOTE 2 &ensp;&ensp;&ensp;&ensp;&ensp; The value of operator output sensors can be retrieved with the _compute_ resource, or with the _/average_ resource defined in the DCDB Pusher REST API.

> NOTE 3 &ensp;&ensp;&ensp;&ensp;&ensp; Developers can integrate custom, plugin-specific REST API resources by implementing the _REST_ method in _OperatorTemplate_. To learn more about plugin-specific resources, please refer to the respective documentation.

> NOTE 4 &ensp;&ensp;&ensp;&ensp;&ensp; When operators employ a _root_ unit (e.g., when the Unit System is not used or a _globalOutput_ block is defined in regular operators) the _unit_ query can be omitted when performing a _/compute_ action.

### Rest Examples <a name="restExamples"></a>
The following are some examples of REST requests over HTTPS:

* Listing the units associated with the _avgOperator1_ operator in the _aggregator_ plugin:
```bash
GET https://localhost:8000/analytics/units?plugin=aggregator;operator=avgOperator1
```
* Listing the output sensors associated with all operators in the _aggregator_ plugin:
```bash
GET https://localhost:8000/analytics/sensors?plugin=aggregator;
```
* Reloading the _aggregator_ plugin:
```bash
PUT https://localhost:8000/analytics/reload?plugin=aggregator
```
* Stopping the _avgOperator1_ operator in the _aggregator_ plugin:
```bash
POST https://localhost:8000/analytics/stop?plugin=aggregator;operator=avgOperator1
```
* Performing a query for unit _/node00/cpu03/_ to the _avgOperator1_ operator in the _aggregator_ plugin:
```bash
PUT https://localhost:8000/analytics/compute?plugin=aggregator;operator=avgOperator1;unit=/node00/cpu03/
```
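
Requests like the ones above can also be assembled programmatically. The following Python sketch builds (but does not send) such a request using only the standard library. The `analytics_request` helper is hypothetical, and the use of HTTP Basic authentication is an assumption: please check the DCDB Pusher documentation for the authentication scheme that is actually in use.

```python
import base64
from urllib.parse import quote
from urllib.request import Request


def analytics_request(method, resource, queries=None, host="localhost",
                      port=8000, user=None, password=None):
    """Build a request for the analytics REST API (illustrative only)."""
    # Queries are separated by ";" as in the examples of this section.
    query = ";".join(f"{quote(key)}={quote(str(value), safe='/')}"
                     for key, value in (queries or {}).items())
    url = f"https://{host}:{port}/analytics/{resource.lstrip('/')}"
    if query:
        url += "?" + query
    request = Request(url, method=method)
    if user is not None:
        # Assumption: credentials are sent via HTTP Basic authentication.
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    return request


req = analytics_request("PUT", "compute",
                        {"plugin": "aggregator", "operator": "avgOperator1",
                         "unit": "/node00/cpu03/"},
                        user="admin", password="secret")
print(req.get_method(), req.full_url)
```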

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The analytics REST API requires authentication credentials as well.

# Plugins <a name="plugins"></a>
Here we describe available plugins in Wintermute, and how to configure them.

## Aggregator Plugin <a name="averagePlugin"></a>
The _Aggregator_ plugin implements simple data processing algorithms. Specifically, this plugin performs basic
aggregation operations over a set of input sensors, whose results are then written as output.
The configuration parameters specific to the _Aggregator_ plugin are the following:

| Value | Explanation |
|:----- |:----------- |
| window | Length in milliseconds of the time window that is used to retrieve recent readings for the input sensors, starting from the latest one.

Additionally, output sensors in operators of the _Aggregator_ plugin accept the following parameters:

| Value | Explanation |
|:----- |:----------- |
| operation | Operation to be performed over the input sensors. Can be "sum", "average", "maximum", "minimum", "std", "percentiles" or "observations".
| percentile |  Specific percentile to be computed when using the "percentiles" operation. Can be an integer in the (0,100) range.
| relative | If true, the _relative_ query mode will be used. Otherwise the _absolute_ mode is used.
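
To make the meaning of the _window_ and _operation_ parameters more concrete, the following Python sketch applies each operation to a small set of readings. It is an illustration under assumptions and not the plugin's code: the function names are hypothetical, readings are modelled as `(timestamp_ms, value)` pairs, and the choice of population standard deviation and nearest-rank percentiles is ours.

```python
import math
import statistics


def select_window(readings, window_ms):
    """Keep the values within window_ms of the latest reading."""
    if not readings:
        return []
    latest = max(ts for ts, _ in readings)
    return [value for ts, value in readings if ts >= latest - window_ms]


def aggregate(values, operation, percentile=50):
    """Apply one of the Aggregator operations to a list of values."""
    if operation == "observations":
        return len(values)
    if not values:
        raise ValueError("no readings available in the window")
    if operation == "sum":
        return sum(values)
    if operation == "average":
        return sum(values) / len(values)
    if operation == "maximum":
        return max(values)
    if operation == "minimum":
        return min(values)
    if operation == "std":
        # Assumption: population standard deviation.
        return statistics.pstdev(values)
    if operation == "percentiles":
        # Assumption: nearest-rank percentile.
        ordered = sorted(values)
        rank = max(1, math.ceil(percentile / 100 * len(ordered)))
        return ordered[rank - 1]
    raise ValueError(f"unknown operation: {operation}")


readings = [(0, 10), (1000, 20), (2000, 30), (3000, 40)]
values = select_window(readings, window_ms=2000)
print(values, aggregate(values, "average"))
```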


## Job Aggregator Plugin <a name="jobaveragePlugin"></a>

The _Job Aggregator_ plugin offers the same functionality as the _Aggregator_ plugin, but on a per-job basis. As such,
it performs aggregation of the specified input sensors across all nodes on which each job is running. Please refer
to the corresponding [section](#joboperators) for more details.

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The Job Aggregator plugin does not support the _relative_ option supported by the Aggregator plugin, and always uses the _absolute_ sensor query mode.

## Smoothing Plugin <a name="smoothingPlugin"></a>
The _Smoothing_ plugin performs functions similar to those of the _Aggregator_ plugin, but is optimized for computing running averages. It uses a single accumulator to compute approximate running averages, improving memory usage and reducing the amount of queried data.

Units in the _Smoothing_ plugin are also instantiated differently compared to other plugins. As it is meant to compute running averages on most or all sensors present in a system, there is no need to specify the input sensors of an operator: the plugin will automatically fetch all instantiated sensors in the system, and create separate units for them, each with its own average output sensors. 
For this reason, pattern expressions specified on output sensors (e.g., _<bottomup 1, filter cpu>avg60_) are ignored and only the MQTT suffixes of the output sensors are used to construct units. If required, users can still specify input sensors using the Unit System syntax, to select subsets of the available sensors in the system. Operators of the _Smoothing_ plugin accept the following configuration parameters: 

| Value | Explanation |
|:----- |:----------- |
| separator | Character used to separate the MQTT prefix of output sensors (which is the input sensor's name) from the suffix (which is the average's identifier). Default is "#". 
| exclude | String containing a regular expression, defining which sensors must be excluded from the smoothing process.

Additionally, output sensors in operators of the _Smoothing_ plugin accept the following parameters:

| Value | Explanation |
|:----- |:----------- |
| range | Range in milliseconds of the average to be stored in this output sensor.
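
The idea of a single-accumulator running average can be sketched in Python as follows. The update rule used by the plugin is not specified in this document, so the time-weighted exponential update below is purely an assumption, as are the `RunningAverage` and `output_name` names. The second function shows how an output sensor name is composed from the input sensor's name, the _separator_ and the suffix.

```python
class RunningAverage:
    """Approximate running average kept in a single accumulator."""

    def __init__(self, range_ms):
        self.range_ms = range_ms
        self.value = None
        self.last_ts = None

    def update(self, ts_ms, reading):
        if self.value is None:
            # The first reading initializes the accumulator.
            self.value = float(reading)
        else:
            # Assumption: weight new readings by elapsed time over the range.
            weight = min(1.0, (ts_ms - self.last_ts) / self.range_ms)
            self.value += weight * (reading - self.value)
        self.last_ts = ts_ms
        return self.value


def output_name(input_sensor, suffix, separator="#"):
    """Compose an output sensor name: input name + separator + suffix."""
    return f"{input_sensor}{separator}{suffix.lstrip('/')}"


avg60 = RunningAverage(range_ms=60000)
avg60.update(0, 100)
print(avg60.update(30000, 200))
print(output_name("/rack02/node03/cpu00/power", "avg60"))
```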

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The _Smoothing_ plugin will automatically update the metadata table of the input sensors in the units to expose the computed averages. This behavior can be altered by setting the _isOperation_ metadata attribute to _false_ for each output sensor. This way, the sensors associated to the averages will be published as independent entries.

## Regressor Plugin <a name="regressorPlugin"></a>

The _Regressor_ plugin is able to perform regression of sensors, by using _random forest_ machine learning predictors. The algorithmic backend is provided by the OpenCV library.
For each input sensor in a certain unit, statistical features from recent values of its time series are computed. These are the average, standard deviation, sum of differences, 25th quantile and 75th quantile. These statistical features are then combined together in a single _feature vector_. 
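
The construction of a feature vector can be illustrated with a short Python sketch. It is not the plugin's implementation: the function names are hypothetical, and the population standard deviation, the nearest-rank quantiles and the reading of "sum of differences" as the sum of differences between consecutive values are all assumptions.

```python
import math
import statistics


def sensor_features(values):
    """Compute the five statistical features for one sensor's window."""
    ordered = sorted(values)

    def quantile(q):
        # Assumption: nearest-rank quantile.
        rank = max(1, math.ceil(q * len(ordered)))
        return ordered[rank - 1]

    average = sum(values) / len(values)
    std = statistics.pstdev(values)
    # Assumption: differences between consecutive readings, summed up.
    sum_diffs = sum(b - a for a, b in zip(values, values[1:]))
    return [average, std, sum_diffs, quantile(0.25), quantile(0.75)]


def feature_vector(windows):
    """Concatenate the features of all input sensors of a unit."""
    vector = []
    for values in windows:
        vector.extend(sensor_features(values))
    return vector


print(feature_vector([[1, 2, 3, 4], [10, 10, 10, 10]]))
```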

In order to operate correctly, the models used by the regressor plugin need to be trained: this procedure is performed automatically online when using the _streaming_ mode, and can be triggered arbitrarily over the REST API. In _on demand_ mode, automatic training cannot be performed, and as such a pre-trained model must be loaded from a file.
The following are the configuration parameters available for the _Regressor_ plugin:

| Value | Explanation |
|:----- |:----------- |
| window | Length in milliseconds of the time window that is used to retrieve recent readings for the input sensors, starting from the latest one.
| trainingSamples | Number of samples necessary to perform training of the current model.
| targetDistance | Temporal distance (in terms of lags) of the sample that is to be predicted.
| inputPath | Path of a file from which a pre-trained random forest model must be loaded.
| outputPath | Path of a file to which the random forest model trained at runtime must be saved.
| getImportances | If true, the random forest will also compute feature importance values when trained, which are printed.
| rawMode | If true, only the average is used as feature for each of the sensor inputs.

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; When the _duplicate_ option is enabled, the _outputPath_ field is ignored to avoid file collisions from multiple regressors.

> NOTE 2 &ensp;&ensp;&ensp;&ensp;&ensp; When loading the model from a file and _getImportances_ is set to true, importance values will be printed only if the original model had this feature enabled upon training.

Additionally, input sensors in operators of the Regressor plugin accept the following parameter:

| Value | Explanation |
|:----- |:----------- |
| target | Boolean value. If true, this sensor represents the target for regression. Every unit in operators of the regressor plugin must have exactly one target sensor.

Finally, the Regressor plugin supports the following additional REST API actions:

| Action | Explanation |
|:----- |:----------- |
| train | Triggers a new training phase for the random forest model. Feature vectors are temporarily collected in-memory until _trainingSamples_ vectors are obtained. Until this moment, the old random forest model is still used to perform prediction.
| importances | Returns the sorted importance values for the input features, together with the respective labels, if available.
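
Plugin-specific actions such as these are issued through the _PUT /operator_ resource of the REST API. The Python sketch below composes the corresponding URL. The `operator_action_url` helper is hypothetical, and so are the plugin name `regressor` and the operator name `reg1` used in the example, which depend on the actual configuration.

```python
from urllib.parse import quote


def operator_action_url(plugin, action, operator=None,
                        base="https://localhost:8000"):
    """Compose the URL of a custom PUT /operator action (illustrative only)."""
    queries = [("plugin", plugin), ("action", action)]
    if operator is not None:
        # The operator query is optional for the /operator resource.
        queries.append(("operator", operator))
    query = ";".join(f"{key}={quote(value, safe='')}" for key, value in queries)
    return f"{base}/analytics/operator?{query}"


# To be sent as a PUT request, with authentication credentials.
print(operator_action_url("regressor", "train", "reg1"))
```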

## Classifier Plugin <a name="classifierPlugin"></a>

The _Classifier_ plugin, as the name implies, performs machine learning classification. It is based on the Regressor plugin, and as such it also uses OpenCV random forest models. The plugin supplies the same options and has the same behavior as the Regressor plugin, with the following two exceptions:

* The _target_ parameter here indicates a sensor which stores the labels (as numerical integer identifiers) to be used for training and on which classification will be based. The mapping from the integer labels to their text equivalent is left to the users. Moreover, unlike in the
Regressor plugin, the target sensor is always excluded from the feature vectors.
* The _targetDistance_ parameter has a default value of 0. It can be set to higher values to perform predictive classification.

## Clustering Plugin <a name="clusteringPlugin"></a>

The _Clustering_ plugin implements a gaussian mixture model for performance variation analysis and outlier detection. The plugin is based on the OpenCV library, similarly to the _Regressor_ plugin.
The input sensors of operators define the axes of the clustering space and hence the number of dimensions of the associated gaussian components. The units of the operator, instead, define the number of points in the clustering space: for this reason, 
the Clustering plugin employs hierarchical units, so that the clustering is performed once for all present sub-units, at each computation interval. When prediction is performed (after training of the GMM model, depending on the _reuseModel_ attribute) the
label of the gaussian component to which each sub-unit belongs is stored in the only output sensor of that sub-unit. Operators of the Clustering plugin support the following configuration parameters:

| Value | Explanation |
|:----- |:----------- |
| window | Length in milliseconds of the time window that is used to retrieve recent readings for the input sensors, starting from the latest one.
| lookbackWindow | Enables a lookback window (whose length is expressed in milliseconds) to collect additional points from previous time windows in order to perform model training. The additional points will be accumulated in memory until training can be performed. Disabled by default.
| numComponents | Number of gaussian components in the mixture model.
| reuseModel | Boolean value. If false, the GMM model is trained at each computation, otherwise only once or when the "train" REST API action is used. Default is false.
| outlierCut | Threshold used to determine outliers when performing the Mahalanobis distance. 
| inputPath | Path of a file from which a pre-trained gaussian mixture model must be loaded.
| outputPath | Path of a file to which the gaussian mixture model trained at runtime must be saved.

The Clustering plugin does not provide any additional configuration parameters for its input and output sensors. 
However, it supports the following additional REST API actions:

| Action | Explanation |
|:----- |:----------- |
| train | Triggers a new training phase for the gaussian mixture model. At the next computation interval, the feature vectors of all units of the operator are combined to perform training, after which predicted labels are given as output.
| means | Returns the means of the generated gaussian components in the trained mixture model.
| covs | Returns the covariance matrices of the generated gaussian components in the trained mixture model.
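
The role of the _outlierCut_ threshold can be illustrated with a simplified Python sketch. It does not reflect the plugin's OpenCV-based implementation: the function names are hypothetical, the gaussian components are restricted to diagonal covariances to keep the example short, and points are assigned to the component with the smallest Mahalanobis distance, which is an assumption.

```python
import math


def mahalanobis_diag(point, mean, variances):
    """Mahalanobis distance for a component with diagonal covariance."""
    return math.sqrt(sum((x - m) ** 2 / v
                         for x, m, v in zip(point, mean, variances)))


def classify(point, components, outlier_cut):
    """Return (label, is_outlier) for a point in the clustering space."""
    distances = [mahalanobis_diag(point, mean, variances)
                 for mean, variances in components]
    # Assumption: the label is the index of the closest component.
    label = min(range(len(distances)), key=distances.__getitem__)
    return label, distances[label] > outlier_cut


# Two components, each given as (mean, per-axis variances).
components = [((0.0, 0.0), (1.0, 1.0)), ((10.0, 10.0), (4.0, 4.0))]
print(classify((1.0, 0.0), components, outlier_cut=2.0))
print(classify((5.0, 5.0), components, outlier_cut=2.0))
```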

## CS Signatures Plugin <a name="csPlugin"></a>

The _CS Signatures_ plugin computes signatures from sensor data, which aggregate data both in time and across sensors, and are composed of a specified number of complex-valued _blocks_. 
Each of the blocks is then stored in two separate sensors, which contain respectively the real and imaginary part of the block. Like in the _Regressor_ and _Classifier_ plugins, the CS algorithm is trained using a specified number of samples, which are accumulated in memory, subsequently learning the correlations between sensors.
Operators in this plugin support the following configuration parameters:

| Value | Explanation |
|:----- |:----------- |
| window | Length in milliseconds of the time window that is used to retrieve recent readings for the input sensors, starting from the latest one, that are then aggregated in the signatures.
| trainingSamples | Number of samples for the sensors that are to be used to train the CS algorithm.
| numBlocks | Desired number of blocks in the signatures.
| scalingFactor | Scaling factor (and upper bound) used to compute the blocks. Default is 1000000.
| inputPath | Path of a file in which the order of the sensors and their upper/lower bounds are stored.
| outputPath | Path of a file to which the order of the sensors and their upper/lower bounds must be saved.

Additionally, the output sensors of the CS Signatures plugin support the following parameters:

| Value | Explanation |
|:----- |:----------- |
| imag | Boolean value. Specifies whether the sensor should store the imaginary or real part of a block.

The output sensors are automatically duplicated according to the specified number of blocks, and a numerical identifier is appended to their MQTT topics. If no sensor with the _imag_ parameter set to true is specified, the signatures will contain only their real parts.
Finally, the plugin supports the following REST API actions:

| Action | Explanation |
|:----- |:----------- |
| train | Triggers a new training phase for the CS algorithm. For practical reasons, only the sensor data from the first unit of the operator is used for training.

## Tester Plugin <a name="testerPlugin"></a>
The _Tester_ plugin can be used to test the functionality and performance of the query engine, as well as of the Unit System. It will perform a specified number of queries over the set of input sensors for each unit, and then output as a sensor the total number of retrieved readings. The following are the configuration parameters for operators in the _Tester_ plugin:

| Value | Explanation |
|:----- |:----------- |
| window | Length in milliseconds of the time window that is used to retrieve recent readings for the input sensors, starting from the latest one.
| queries | Number of queries to be performed at each computation interval. If more than the number of input sensors per unit, these will be looped over multiple times.
| relative | If true, the _relative_ query mode will be used. Otherwise the _absolute_ mode is used.
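
The way in which the _queries_ parameter loops over the input sensors of a unit can be sketched in Python as follows. The `run_tester` function and the `query_fn` callback are hypothetical stand-ins for the plugin's logic and for the query engine.

```python
def run_tester(unit_inputs, queries, query_fn):
    """Perform `queries` queries over a unit's inputs and count the readings."""
    total = 0
    for i in range(queries):
        # Wrap around when there are more queries than input sensors.
        sensor = unit_inputs[i % len(unit_inputs)]
        total += len(query_fn(sensor))
    return total


# Stand-in for the query engine: readings available per sensor.
stored = {"/node00/power": [1, 2, 3], "/node00/temp": [4]}
print(run_tester(list(stored), queries=5, query_fn=stored.get))
```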

# Sink Plugins <a name="sinkplugins"></a>
Here we describe available plugins in Wintermute that are devoted to the output of sensor data (_sinks_), and that do not perform any analysis.

## File Sink Plugin <a name="filesinkPlugin"></a>
The _File Sink_ plugin writes the output of any other sensor to the local file system. As such, it does not produce output sensors by itself, and only reads from input sensors.
The input sensors can either be fully qualified, or can be described through the Unit System. In the latter case, multiple input sensors can be generated automatically, and the respective output paths need to be adjusted by enabling the _autoName_ attribute described below, to prevent multiple sensors from being written to the same file. The file sink operators (named sinks) support the following attributes:

| Value | Explanation |
|:----- |:----------- |
| autoName | Boolean. If false, the output paths associated with sensors are interpreted literally, and a file is opened for them. If true, only the part in the path describing the current directory is used, while the file itself is named according to the MQTT topic of the specific sensor.

Additionally, input sensors in sinks accept the following parameters:

| Value | Explanation |
|:----- |:----------- |
| path | The path to which the sensor's readings should be written. It is interpreted as described above for the _autoName_ attribute.
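
The interplay between _path_ and _autoName_ can be sketched in Python as follows. The `sink_file_path` function is hypothetical, and the way the MQTT topic is turned into a file name (here, by replacing slashes with dots) is an assumption, since the exact naming scheme is not specified in this document.

```python
import posixpath


def sink_file_path(path, mqtt_topic, auto_name):
    """Resolve the file to which a sensor's readings are written."""
    if not auto_name:
        # The configured path is interpreted literally.
        return path
    # Only the directory part of the path is kept.
    directory = posixpath.dirname(path)
    # Assumption: the file name is derived from the topic, "/" becomes ".".
    file_name = mqtt_topic.strip("/").replace("/", ".")
    return posixpath.join(directory, file_name)


print(sink_file_path("/tmp/sink/out.log", "/rack02/node03/power", False))
print(sink_file_path("/tmp/sink/out.log", "/rack02/node03/power", True))
```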

# Writing Wintermute Plugins <a name="writingPlugins"></a>
Generating a DCDB Wintermute plugin requires implementing an _Operator_ and a _Configurator_ class, which contain all logic
tied to the specific plugin. Such classes should be derived from _OperatorTemplate_ and _OperatorConfiguratorTemplate_
respectively, which contain all plugin-agnostic configuration and runtime features. Please refer to the documentation 
of the _Aggregator_ plugin for an overview of how a basic plugin can be implemented.