# DCDB Data Analytics Framework

### Table of contents
1. [Introduction](#introduction)
2. [DCDBAnalytics](#dcdbanalytics)
    1. [Global Configuration](#globalConfiguration)
    2. [Analyzers](#analyzers)
        1. [The Sensor Tree](#sensorTree)
        2. [The Unit System](#unitSystem)
        3. [Operational Modes](#opModes)
        4. [Analyzer Configuration](#analyzerConfiguration)
            1. [Configuration Syntax](#configSyntax)
            2. [Instantiating Units](#instantiatingUnits)
            3. [MQTT Topics](#mqttTopics)
    3. [REST API](#restApi)
        1. [List of resources](#listOfRessources)
        2. [Examples](#restExamples)
3. [Plugins](#plugins)
	1. [Aggregator Plugin](#averagePlugin)
	2. [Writing Plugins](#writingPlugins)

# Introduction <a name="introduction"></a>
In this Readme we describe the DCDB Data Analytics framework, and all data abstractions that are associated with it. 

# DCDBAnalytics <a name="dcdbanalytics"></a>
The DCDBAnalytics framework is built on top of DCDB and makes it possible to perform data analytics on sensor data
in a variety of ways. DCDBAnalytics can be deployed both in DCDBPusher and in DCDBCollectAgent, with some minor
differences:

* **DCDBPusher**: only sensor data that is sampled locally and that is contained within the sensor cache can be used for
data analytics. However, this is the preferable way to deploy simple models on a large scale, as all computation is
performed within the compute nodes themselves, which greatly improves scalability;
* **DCDBCollectAgent**: all available sensor data, in the local cache and in the Cassandra database, can be used for data
analytics. This approach is preferable for models that require data from multiple sources at once 
(e.g., clustering-based anomaly detection), or for models that are deployed in [on-demand](#analyzerConfiguration) mode.

## Global Configuration <a name="globalConfiguration"></a>
DCDBAnalytics shares the same configuration structure as DCDBPusher and DCDBCollectAgent, using a global.conf configuration file. 
All output sensors of the framework are therefore affected by the configuration parameters described in the global Readme. 
Additional parameters specific to this framework are the following:

| Value | Explanation |
|:----- |:----------- |
| analytics | Wrapper structure for the data analytics-specific values.
| hierarchy | Space-separated sequence of regular expressions used to infer the local (DCDBPusher) or global (DCDBCollectAgent) sensor hierarchy. This parameter should be wrapped in quotes to ensure proper parsing. See the Sensor Tree [section](#sensorTree) for more details.
| analyzerPlugins | Block containing the specification of all data analytics plugins to be instantiated.
| plugin _name_ | The plugin name is used to build the corresponding library name (e.g. average --> libdcdbanalyzer_average.1.0).
| path | Path where the plugin (the shared library) is located. If left empty, DCDB will look in the default library directories (/usr/lib and friends) for the plugin file.
| config | Path to a separate configuration file for the plugin. If not specified, DCDB will look for pluginName.conf (e.g. average.conf) in the same directory where global.conf is located.
| | |

## Analyzers <a name="analyzers"></a>
Analyzers are the basic building block in DCDBAnalytics. An analyzer is instantiated within a plugin, performs a specific
task and acts on sets of inputs and outputs called _units_. Analyzers are functionally equivalent to _sensor groups_
in DCDBPusher, but instead of sampling data, they process such data and output new sensors. Some high-level examples
of analyzers are the following:

* An analyzer that performs time series regression on a particular input sensor, and outputs its prediction;
* An analyzer that aggregates a series of input sensors, builds feature vectors, and performs machine 
learning-based tasks using a supervised model;
* An analyzer that performs clustering-based anomaly detection by using different sets of inputs associated to different
compute nodes;
* An analyzer that outputs statistical features related to the time series of a certain input sensor.

### The Sensor Tree <a name="sensorTree"></a>
Before diving into the configuration and instantiation of analyzers, we introduce the concept of _sensor tree_. A  sensor
tree is simply a data structure expressing the hierarchy of sensors that are being sampled; internal nodes express
hierarchical entities (e.g. clusters, racks, nodes, cpus), whereas leaf nodes express actual sensors. In DCDBPusher, 
a sensor tree refers only to the local hierarchy, while in DCDBCollectAgent it can capture the hierarchy of the entire
system being sampled.

A sensor tree is built at initialization time of DCDBAnalytics, and is implemented in the _SensorNavigator_ class. 
Its construction is regulated by the _hierarchy_ global configuration parameter, which can be for example the following:

```
rack\d{2}. node\d{2}. cpu\d{2}. 
``` 

Given the above example hierarchy string, we are enforcing the target sensor tree to have three levels, the topmost of
which expresses the racks to which sensors belong, and the lowest one the CPU core (if any). Such a string could be used,
for example, to build a sensor tree starting from the following set of sensor names:

```
rack00.status
rack00.node05.MemFree
rack00.node05.energy
rack00.node05.temp
rack00.node05.cpu00.col_user
rack00.node05.cpu00.instr
rack00.node05.cpu00.branch-misses
rack00.node05.cpu01.col_user
rack00.node05.cpu01.instr
rack00.node05.cpu01.branch-misses
rack02.status
rack02.node03.MemFree
rack02.node03.energy
rack02.node03.temp
rack02.node03.cpu00.col_user
rack02.node03.cpu00.instr
rack02.node03.cpu00.branch-misses
rack02.node03.cpu01.col_user
rack02.node03.cpu01.instr
rack02.node03.cpu01.branch-misses
``` 

Each sensor name is interpreted as a path within the sensor tree. Therefore, the _instr_ and _branch-misses_ sensors
will be placed as leaf nodes in the deepest level of the tree, as children of the respective cpu node they belong to.
Such cpu nodes will be in turn children of the nodes they belong to, and so on.

The generated sensor tree can then be used to navigate the sensor hierarchy, and perform actions such as _retrieving
all sensors belonging to a certain node, to a neighbor of a certain node, or to the rack a certain node belongs to_.
Please refer to the documentation of the _SensorNavigator_ class for more details.
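
To make the construction more concrete, here is a minimal Python sketch of how such a tree could be assembled from sensor names. It is not the _SensorNavigator_ implementation: the `build_tree` and `new_node` helpers and the nested-dictionary layout are hypothetical, and the sketch assumes that each regular expression in the hierarchy string is matched, in order, against the beginning of the remaining sensor name.

```python
import re


def new_node():
    """Create an empty tree node: child nodes by name plus the sensors attached here."""
    return {"children": {}, "sensors": []}


def build_tree(sensor_names, hierarchy):
    """Build a nested-dict sensor tree from published sensor names.

    hierarchy is a space-separated string of regular expressions, one per tree level.
    """
    level_patterns = [re.compile(expr) for expr in hierarchy.split()]
    root = new_node()
    for name in sensor_names:
        node, rest = root, name
        for pattern in level_patterns:
            match = pattern.match(rest)
            if match is None:
                break  # no deeper level: what is left is the sensor itself
            segment = match.group(0)
            node = node["children"].setdefault(segment, new_node())
            rest = rest[len(segment):]
        node["sensors"].append(rest)
    return root


names = [
    "rack00.status",
    "rack00.node05.MemFree",
    "rack00.node05.cpu00.instr",
    "rack00.node05.cpu01.instr",
    "rack00.node05.cpu01.branch-misses",
]
tree = build_tree(names, r"rack\d{2}. node\d{2}. cpu\d{2}.")
node05 = tree["children"]["rack00."]["children"]["node05."]
print(sorted(node05["children"]))               # the cpu nodes below node05
print(node05["children"]["cpu01."]["sensors"])  # leaf sensors of cpu01
```

In this sketch the sensors of a node are simply the entries stored in its `sensors` list, and the sensors of a whole subtree can be collected by walking `children` recursively.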

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; If no hierarchy string has been specified in the configuration, the tree is built automatically by assuming that each
dot-separated part of the sensor name expresses a level in the hierarchy. The total depth of the tree is thus determined
at runtime as well.

> NOTE 2 &ensp;&ensp;&ensp; Sensor trees are always built from the names of sensors _as they are published_. Therefore,
please make sure to use the _-a_ option in DCDBPusher appropriately, to build sensor names that express the desired hierarchy.


### The Unit System <a name="unitSystem"></a>
Each analyzer operates on one or more _units_. A unit represents an abstract (or physical) entity in the current system that
is the target of analysis. A unit could be, for example, a rack, a node within a rack, a CPU within a node or an entire HPC system.
Units are identified by three components:

* **Name**: The name of this unit, that corresponds to the entity it represents. For example, _rack02.node03._ or _rack00.node05.cpu01._ could be unit names. A unit must always correspond to an existing internal node in the current sensor tree;
* **Input**: The set of sensors that constitute the input for analysis conducted on this unit. The sensors must share a hierarchical relationship with the unit: that is, they can either belong to the node represented by this unit, to its subtree, or to one of its ancestors; 
* **Output**: The set of output sensors that are produced from any analysis conducted on this unit. The output sensors are always directly associated with the node represented by the unit.

Units are a way to define _patterns_ in the sensor tree and retrieve sensors that are associated to each other by a 
hierarchical relationship. See the configuration [section](#instantiatingUnits) for more details on how to create
templates in order to define units suitable for analyzers.
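
The three components of a unit can be pictured with the following Python sketch. The `Unit` class and the `owning_node` and `is_valid_input` helpers are hypothetical and only illustrate the hierarchical constraint on inputs described above; the sketch assumes that node names are dot-terminated and that sensor names themselves contain no dots.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A unit: an internal node of the sensor tree plus its resolved sensors."""
    name: str                                    # e.g. "rack00.node05.cpu01."
    inputs: list = field(default_factory=list)   # fully resolved input sensor names
    outputs: list = field(default_factory=list)  # always attached to the unit's own node


def owning_node(sensor_name):
    """Return the dot-terminated path of the node a sensor is attached to."""
    return sensor_name[: sensor_name.rfind(".") + 1]


def is_valid_input(unit_name, sensor_name):
    """True if the sensor sits on the unit's node, in its subtree, or on an ancestor."""
    node = owning_node(sensor_name)
    return node.startswith(unit_name) or unit_name.startswith(node)


unit = Unit(name="rack00.node05.cpu01.")
for candidate in ("rack00.node05.cpu01.instr", "rack00.node05.MemFree", "rack02.node03.temp"):
    if is_valid_input(unit.name, candidate):
        unit.inputs.append(candidate)
unit.outputs.append(unit.name + "avg")
print(unit)
```

Here the sensor from _rack02.node03._ is rejected because it shares no hierarchical relationship with the unit.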

### Operational Modes <a name="opModes"></a>
Analyzers can operate in two different modes:

* **Streaming**: streaming analyzers perform data analytics online and autonomously, processing incoming sensor data at regular intervals.
The units of streaming analyzers are completely resolved and instantiated at configuration time. The type of output of streaming
analyzers is identical to that of _sensors_ in DCDBPusher, which are pushed to DCDBCollectAgent and finally to the Cassandra database,
resulting in a time series representation;
* **On-demand**: on-demand analyzers do not perform data analytics autonomously, but only when queried by users. Unlike
for streaming analyzers, the units of on-demand analyzers are not instantiated at configuration time, but only when a query is performed. When 
such an event occurs, the analyzer verifies that the queried unit belongs to its _unit domain_, and then instantiates it,
resolving its inputs and outputs. Then, the unit is stored in a local cache for future re-use. The outputs of an on-demand
analyzer are exposed through the REST API, and are never pushed to the Cassandra database.

Use of streaming analyzers is advised when a time series-like output is required, whereas on-demand analyzers are effective
when data is required at specific times and for specific purposes, and when the unit domain's size makes the use of streaming
analyzers unfeasible.
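
The on-demand workflow (domain check, lazy instantiation, caching) can be sketched in Python as follows. The `OnDemandAnalyzer` class and its callables are hypothetical stand-ins and do not mirror the actual DCDBAnalytics classes.

```python
class OnDemandAnalyzer:
    """Toy on-demand analyzer: units are resolved lazily and cached."""

    def __init__(self, unit_domain, resolve_unit, compute):
        self.unit_domain = set(unit_domain)  # names of the units that may be queried
        self.resolve_unit = resolve_unit     # callable: unit name -> resolved unit
        self.compute = compute               # callable: resolved unit -> result
        self.cache = {}

    def query(self, unit_name):
        if unit_name not in self.unit_domain:
            raise KeyError(f"unit {unit_name!r} is outside the unit domain")
        if unit_name not in self.cache:
            # First query for this unit: resolve inputs/outputs and keep the result.
            self.cache[unit_name] = self.resolve_unit(unit_name)
        return self.compute(self.cache[unit_name])


resolved = []  # records which units were actually instantiated


def resolve(name):
    resolved.append(name)
    return {"name": name, "inputs": [name + "col_user"]}


analyzer = OnDemandAnalyzer(
    unit_domain=["rack00.node05.cpu00.", "rack00.node05.cpu01."],
    resolve_unit=resolve,
    compute=lambda unit: len(unit["inputs"]),
)
analyzer.query("rack00.node05.cpu01.")
analyzer.query("rack00.node05.cpu01.")  # served from the cache, no second resolution
print(resolved)
```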

### Analyzer Configuration <a name="analyzerConfiguration"></a>
Here we describe how to configure and instantiate analyzers in DCDBAnalytics. The configuration scheme is very similar
to that of _sensor groups_ in DCDBPusher, and a _global_ configuration block can be defined in each plugin configuration
file. The following is instead a list of configuration parameters that are available for the analyzers themselves:

| Value | Explanation |
|:----- |:----------- |
| default | Name of the template that must be used to configure this analyzer.
| interval | Specifies how often the analyzer will be invoked to perform computations, and thus the sampling interval of its output sensors. Only used for analyzers in _streaming_ mode.
| delay | Delay in milliseconds to be applied to the start of the analyzer. This parameter only applies to streaming analyzers. It can be used to allow for input sensor caches to be populated before the analyzer is started.
| minValues |   Minimum number of readings that need to be stored in output sensors before these are pushed as MQTT messages. Only used for analyzers in _streaming_ mode.
| mqttPart |    Part of the MQTT topic associated to this analyzer. Only used when the Unit system is not employed (see this [section](#mqttTopics)).
| sync | If set to _true_, computation will be performed at time intervals synchronized with sensor readings.
| duplicate | If set to _false_, only one analyzer object will be instantiated. Such an analyzer will perform computation over all units that are instantiated, at every interval, sequentially. If set to _true_, the analyzer object will be duplicated such that each copy will have one unit associated to it. This makes it possible to exploit parallelism between units, but results in separate models, one per copy, to avoid race conditions.
| streaming |	If set to _true_, the analyzer will operate in _streaming_ mode, pushing output sensors regularly. If set to _false_, the analyzer will instead operate in _on-demand_ mode.
| input | Block of input sensors that must be used to instantiate the units of this analyzer. These can be either a list of strings or fully-qualified _Sensor_ blocks containing specific attributes (see DCDBPusher Readme).
| output | Block of output sensors that will be associated to this analyzer. These must be _Sensor_ blocks containing valid MQTT suffixes. Note that the number of output sensors is usually fixed depending on the type of analyzer.
| | |

#### Configuration Syntax <a name="configSyntax"></a>
In the following we show a sample configuration block for the _Average_ plugin. For the full version, please refer to the
default configuration file in the _config_ directory:

```
template_average def1 {
interval	1000
minValues	3
duplicate 	false
streaming	true
}

average avg1 {
default     def1
mqttPart    FF0

	input {
		sensor col_user
		sensor MemFree
	}

	output {
		sensor sum {
			mqttsuffix  76
		}

		sensor max {
			mqttsuffix  77
		}

		sensor avg {
			mqttsuffix  78
		}
	}
}
``` 

The configuration shown above uses a template _def1_ for some configuration parameters, which are then applied to the
_avg1_ analyzer. This analyzer takes the _col_user_ and _MemFree_ sensors as input (which must be available under this name),
 and outputs _sum_, _max_, and _avg_ sensors. In this configuration, the Unit system and sensor hierarchy are not used, 
 and therefore only one generic unit (called the _root_ unit) will be instantiated.

#### Instantiating Units <a name="instantiatingUnits"></a>
Here we propose once again the configuration discussed above, this time making use of the Unit system to abstract from
the specific system being used and simplify configuration. The adjusted configuration block is the following: 

```
template_average def1 {
interval	1000
minValues	3
duplicate 	false
streaming	true
}

average avg1 {
default     def1
mqttPart    FF0

	input {
		sensor "<bottomup>col_user"
		sensor "<bottomup 1>MemFree"
	}

	output {
		sensor "<bottomup, filter cpu00>sum" {
			mqttsuffix  76
		}

		sensor "<bottomup, filter cpu00>max" {
			mqttsuffix  77
		}

		sensor "<bottomup, filter cpu00>avg" {
			mqttsuffix  78
		}
	}
}
``` 

In each sensor declaration, the _< >_ block is a placeholder that will be replaced with the name of the units that will
be associated to the analyzer, thus resolving the sensor names. Such a block makes it possible to navigate the current sensor tree
and select the nodes that will constitute the units. Its syntax is the following:

```
< bottomup|topdown X, filter Y >SENSORNAME 
``` 

The first section specifies the _level_ in the sensor tree at which nodes must be selected. _bottomup X_ and _topdown X_
respectively mean _"search X levels up from the deepest level in the sensor tree"_, and _"search X levels down from the 
topmost level in the sensor tree"_. The _X_ numerical value can be omitted as well.

The second section, on the other hand, makes it possible to search the sensor tree _horizontally_. Within the level specified in the
first section of the configuration block, only the nodes whose names match the regular expression Y will be selected.
This way, we can navigate the current sensor tree both vertically and horizontally, and easily instantiate units starting 
from nodes in the tree. The set of nodes in the current sensor tree that match the specified configuration block is
defined as the _unit domain_ of the analyzer.
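
As an illustration of the syntax, the following Python sketch parses a sensor declaration into its parts. It is not the parser used by DCDBAnalytics: the `PLACEHOLDER` expression, the `SensorPattern` type and both functions are hypothetical. The sketch assumes that an omitted _X_ is equivalent to 0, that levels are counted from 0 at the topmost level, and that the filter expression does not contain a `>` character.

```python
import re
from typing import NamedTuple, Optional

PLACEHOLDER = re.compile(
    r"^<\s*(?P<direction>bottomup|topdown)"     # vertical search direction
    r"(?:\s+(?P<depth>\d+))?"                   # optional X
    r"\s*(?:,\s*filter\s+(?P<filter>[^>]+?))?"  # optional ", filter Y"
    r"\s*>(?P<sensor>.+)$"                      # sensor name after the block
)


class SensorPattern(NamedTuple):
    direction: str
    depth: int
    node_filter: Optional[str]
    sensor: str


def parse_sensor_pattern(text):
    """Split a declaration such as '<bottomup 1>MemFree' into its components."""
    match = PLACEHOLDER.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid unit placeholder: {text!r}")
    return SensorPattern(
        direction=match["direction"],
        depth=int(match["depth"] or 0),  # assumption: an omitted X means 0
        node_filter=match["filter"],
        sensor=match["sensor"],
    )


def target_level(pattern, tree_depth):
    """Zero-based level (0 = topmost) selected in a tree with tree_depth levels."""
    if pattern.direction == "topdown":
        return pattern.depth
    return tree_depth - 1 - pattern.depth


for text in ("<bottomup>col_user", "<bottomup 1>MemFree", "<bottomup, filter cpu00>sum"):
    pattern = parse_sensor_pattern(text)
    print(pattern, "-> level", target_level(pattern, tree_depth=3))
```

In a three-level tree (racks, nodes, cpus), `<bottomup>` then selects the cpu level and `<bottomup 1>` the node level.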

The configuration algorithm then works in two steps:

1. The _output_ block of the analyzer is read, and its unit domain is determined; this implies that all sensors in the 
output block must share the same _< >_ block, and therefore match the same unit domain;
2. For each unit in the domain, its input sensors are identified. We start from the _unit_ node in the sensor tree, and 
navigate to the corresponding sensor node according to its _< >_ block, which identifies its level in the tree. Each 
unit, once its inputs and outputs are defined, is then added to the analyzer.

According to the sensor tree built in the previous [section](#sensorTree), the configuration above would result in
an analyzer with the following set of units:

```
rack00.node05.cpu00. {
	Inputs {
		rack00.node05.cpu00.col_user
		rack00.node05.MemFree
	}
	
	Outputs {
		rack00.node05.cpu00.sum
		rack00.node05.cpu00.max
		rack00.node05.cpu00.avg
	}
}

rack02.node03.cpu00. {
	Inputs {
		rack02.node03.cpu00.col_user
		rack02.node03.MemFree
	}
                     	
	Outputs {
		rack02.node03.cpu00.sum
		rack02.node03.cpu00.max
		rack02.node03.cpu00.avg
	}
}
``` 
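
The two-step algorithm can be approximated with the Python sketch below. It is a simplification under stated assumptions, not the DCDBAnalytics implementation: `resolve_units` is a hypothetical helper, the tree is given as a flat list of dot-terminated node paths, levels are zero-based from the top, the filter is matched against the last path segment only, and inputs located in the subtree of a unit are not handled.

```python
import re


def resolve_units(node_paths, unit_level, unit_filter, inputs, outputs):
    """Instantiate units from a flat list of dot-terminated internal node paths.

    unit_level  -- zero-based tree level (0 = topmost) taken from the output block
    unit_filter -- regular expression on the node name, or None for no filter
    inputs      -- list of (level, sensor_name) pairs; level must not exceed unit_level
    outputs     -- list of output sensor names
    """
    units = {}
    for path in node_paths:
        segments = path.rstrip(".").split(".")
        if len(segments) - 1 != unit_level:
            continue  # node is on a different level of the tree
        if unit_filter is not None and not re.search(unit_filter, segments[-1]):
            continue  # node name rejected by the horizontal filter
        resolved_inputs = []
        for level, sensor in inputs:
            ancestor = ".".join(segments[: level + 1]) + "."  # the unit itself or an ancestor
            resolved_inputs.append(ancestor + sensor)
        units[path] = {
            "inputs": resolved_inputs,
            "outputs": [path + sensor for sensor in outputs],
        }
    return units


nodes = [
    "rack00.", "rack00.node05.", "rack00.node05.cpu00.", "rack00.node05.cpu01.",
    "rack02.", "rack02.node03.", "rack02.node03.cpu00.", "rack02.node03.cpu01.",
]
units = resolve_units(
    nodes,
    unit_level=2,          # deepest level of a three-level tree
    unit_filter="cpu00",
    inputs=[(2, "col_user"), (1, "MemFree")],
    outputs=["sum", "max", "avg"],
)
for name, unit in units.items():
    print(name, unit)
```

With the filter set to `cpu00`, the sketch instantiates one unit per node whose name matches, and resolves _MemFree_ one level above each unit.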

#### MQTT Topics <a name="mqttTopics"></a>
The MQTT topics associated to output sensors of a certain analyzer are constructed in different ways depending
on the unit they belong to:

* **Root unit**: if the output sensors belong to the _root_ unit, that is, they do not belong to any level in the sensor
hierarchy and are uniquely defined, the respective topics are constructed like in DCDBPusher sensors, by concatenating
the MQTT prefix, analyzer part and sensor suffix that are defined;
* **Any other unit**: if the output sensor belongs to any other unit in the sensor tree, its MQTT topic is constructed
by concatenating the MQTT prefix associated to the unit (which is defined as _the portion of the MQTT topic shared by all sensors
belonging to such unit_) and the sensor suffix. The middle part of the topic is padded accordingly to ensure a fixed length.
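
The two cases can be sketched in Python as shown below. This is only an illustration: the topic strings are made up, and the `topic_length` and `pad_char` parameters are assumptions, since the actual topic length and padding scheme are not specified in this section.

```python
import os


def root_unit_topic(mqtt_prefix, mqtt_part, mqtt_suffix):
    """Topic of an output sensor of the root unit: plain concatenation."""
    return mqtt_prefix + mqtt_part + mqtt_suffix


def unit_topic(unit_sensor_topics, mqtt_suffix, topic_length, pad_char="0"):
    """Topic of an output sensor belonging to a unit in the sensor tree.

    topic_length and pad_char are illustrative assumptions, not DCDB constants.
    """
    unit_prefix = os.path.commonprefix(unit_sensor_topics)  # part shared by all unit sensors
    padding = topic_length - len(unit_prefix) - len(mqtt_suffix)
    if padding < 0:
        raise ValueError("prefix and suffix do not fit in the requested topic length")
    return unit_prefix + pad_char * padding + mqtt_suffix


print(root_unit_topic("/0011AA", "FF0", "76"))
print(unit_topic(["/0011AA10", "/0011AA2F"], "76", topic_length=11))
```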


## REST API <a name="restApi"></a>
DCDBAnalytics provides a REST API that can be used to perform various management operations on the framework. The 
API is functionally identical to that of DCDBPusher, and is hosted at the same address. All requests that are targeted
at the data analytics framework must have a resource path starting with _/analytics_.

### List of resources <a name="listOfRessources"></a>

The `/analytics` prefix is omitted from the resource paths listed below.

<table>
  <tr>
    <td colspan="2"><b>Resource</b></td>
    <td colspan="2">Description</td>
  </tr>
  <tr>
  	<td>Query</td>
  	<td>Value</td>
  	<td>Opt.</td>
  	<td>Description</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /help</b></td>
    <td colspan="2">Return a cheatsheet of possible analytics REST API endpoints.</td>
  </tr>
  <tr>
  	<td colspan="4">No queries.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /plugins</b></td>
    <td colspan="2">List all currently loaded data analytic plugins.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /sensors</b></td>
    <td colspan="2">List all sensors of a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All analyzer plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>analyzer</td>
  	<td>All analyzers of a plugin.</td>
  	<td>Yes</td>
  	<td>Restrict sensor list to an analyzer.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /units</b></td>
    <td colspan="2">List all units of a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All analyzer plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>analyzer</td>
  	<td>All analyzers of a plugin.</td>
  	<td>Yes</td>
  	<td>Restrict unit list to an analyzer.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>GET /analyzers</b></td>
    <td colspan="2">List all analyzers of a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All analyzer plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /start</b></td>
    <td colspan="2">Start all or only a specific plugin. Or only start a specific streaming analyzer within a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>Yes</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>analyzer</td>
  	<td>All analyzer names of a plugin.</td>
  	<td>Yes</td>
  	<td>Only start the specified analyzer. Requires a plugin to be specified. Limited to streaming analyzers.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /stop</b></td>
    <td colspan="2">Stop all or only a specific plugin. Or only stop a specific streaming analyzer within a specific plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>Yes</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>analyzer</td>
  	<td>All analyzer names of a plugin.</td>
  	<td>Yes</td>
  	<td>Only stop the specified analyzer. Requires a plugin to be specified. Limited to streaming analyzers.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /reload</b></td>
    <td colspan="2">Reload configuration and initialization of all or only a specific analytics plugin.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>Yes</td>
  	<td>Reload only the specified plugin.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /compute</b></td>
    <td colspan="2">Query the given analyzer for a certain input unit. Intended for "on-demand" analyzers, but works with "streaming" analyzers as well.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>analyzer</td>
  	<td>All analyzer names of a plugin.</td>
  	<td>No</td>
  	<td>Specify the analyzer within the plugin.</td>
  </tr>
  <tr>
  	<td>unit</td>
  	<td>All units of a plugin.</td>
  	<td>Yes</td>
  	<td>Select the target unit. Defaults to the root unit if not specified.</td>
  </tr>
  <tr>
  	<td>json</td>
  	<td>"true"</td>
  	<td>Yes</td>
  	<td>Format response as json.</td>
  </tr>
</table>

<table>
  <tr>
    <td colspan="2"><b>PUT /analyzer</b></td>
    <td colspan="2">Perform a custom REST PUT action defined at analyzer level. See analyzer plugin documentation for such actions.</td>
  </tr>
  <tr>
  	<td>plugin</td>
  	<td>All plugin names.</td>
  	<td>No</td>
  	<td>Specify the plugin.</td>
  </tr>
  <tr>
  	<td>action</td>
  	<td>See analyzer plugin documentation.</td>
  	<td>No</td>
  	<td>Select custom action.</td>
  </tr>
  <tr>
  	<td>analyzer</td>
  	<td>All analyzers of a plugin.</td>
  	<td>Yes</td>
  	<td>Specify the analyzer within the plugin.</td>
  </tr>
  <tr>
  	<td colspan="4">Custom action may require or allow for more queries!</td>
  </tr>
</table>

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; Opt. = Optional

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The value of analyzer output sensors can be retrieved with the _compute_ resource, or with the [plugin]/[sensor]/avg resource defined in the DCDBPusher REST API.

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; Developers can integrate their custom REST API resources that are plugin-specific, by implementing the _REST_ method in _AnalyzerTemplate_. To know more about plugin-specific resources, please refer to the respective documentation. 

### Rest Examples <a name="restExamples"></a>
The following are some examples of REST requests over HTTPS:

> NOTE &ensp;&ensp;&ensp;&ensp;&ensp; The analytics REST API requires authentication credentials as well.

* Listing the units associated to the _avgAnalyzer1_ analyzer in the _average_ plugin:
```bash
GET https://localhost:8000/analytics/units?plugin=average;analyzer=avgAnalyzer1
```
* Listing the output sensors associated to all analyzers in the _average_ plugin:
```bash
GET https://localhost:8000/analytics/sensors?plugin=average;
```
* Reloading the _average_ plugin:
```bash
PUT https://localhost:8000/analytics/reload?plugin=average
```
* Stopping the _avgAnalyzer1_ analyzer in the _average_ plugin:
```bash
PUT https://localhost:8000/analytics/stop?plugin=average;analyzer=avgAnalyzer1
```
* Performing a query for unit _node00.cpu03._ to the _avgAnalyzer1_ analyzer in the _average_ plugin:
```bash
PUT https://localhost:8000/analytics/compute?plugin=average;analyzer=avgAnalyzer1;unit=node00.cpu03
```
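
For scripted access, request URLs like the ones above can be assembled programmatically. The following Python sketch only builds and prepares requests without sending them; the `analytics_url` and `analytics_request` helpers are hypothetical, and authentication is left out because the credential mechanism is not described in this section.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://localhost:8000"  # host and port taken from the examples above


def analytics_url(resource, **queries):
    """Build the URL of an analytics resource, separating queries with ';'."""
    url = f"{BASE_URL}/analytics/{resource}"
    query = ";".join(f"{quote(key)}={quote(str(value))}" for key, value in queries.items())
    return f"{url}?{query}" if query else url


def analytics_request(method, resource, **queries):
    """Prepare (but do not send) an HTTP request for the given resource."""
    return Request(analytics_url(resource, **queries), method=method)


request = analytics_request("PUT", "stop", plugin="average", analyzer="avgAnalyzer1")
print(request.get_method(), request.full_url)
```

A prepared request could then be sent with `urllib.request.urlopen` once the required credentials have been attached.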

# Plugins <a name="plugins"></a>
Here we describe available plugins in DCDBAnalytics, and how to configure them.

## Aggregator Plugin <a name="averagePlugin"></a>
The _Aggregator_ plugin implements simple data processing algorithms. Specifically, this plugin performs basic
aggregation operations over a set of input sensors, whose result is written as output to one sensor per analyzer.
The configuration parameters specific to the _Aggregator_ plugin are the following:

| Value | Explanation |
|:----- |:----------- |
| window | Length in milliseconds of the time window that is used to retrieve recent readings for the input sensors, starting from the latest one.
| operation | Operation to be performed over the input sensors. Can be "sum", "average", "maximum" or "minimum".
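
The interplay of the _window_ and _operation_ parameters can be illustrated with the following Python sketch. It is not the plugin's code: the `aggregate` function is hypothetical, and it assumes that the window is measured backwards from the latest reading of each input sensor and that all readings inside the window, across all inputs, contribute to a single result.

```python
def aggregate(readings_by_sensor, window_ms, operation):
    """Aggregate recent readings of several input sensors into one value.

    readings_by_sensor maps a sensor name to a list of (timestamp_ms, value)
    pairs sorted by timestamp.
    """
    operations = {
        "sum": sum,
        "average": lambda values: sum(values) / len(values),
        "maximum": max,
        "minimum": min,
    }
    if operation not in operations:
        raise ValueError(f"unsupported operation: {operation!r}")
    values = []
    for readings in readings_by_sensor.values():
        if not readings:
            continue
        latest = readings[-1][0]
        # Keep the readings at most window_ms older than the sensor's latest one.
        values.extend(value for ts, value in readings if latest - ts <= window_ms)
    if not values:
        raise ValueError("no readings available in the requested window")
    return operations[operation](values)


readings = {
    "col_user": [(0, 1.0), (500, 2.0), (1000, 3.0)],
    "MemFree": [(1000, 5.0)],
}
for op in ("sum", "average", "maximum", "minimum"):
    print(op, aggregate(readings, window_ms=600, operation=op))
```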

## Writing DCDBAnalytics Plugins <a name="writingPlugins"></a>
Generating a DCDBAnalytics plugin requires implementing an _Analyzer_ and a _Configurator_ class, which contain all logic
tied to the specific plugin. Such classes should be derived from _AnalyzerTemplate_ and _AnalyzerConfiguratorTemplate_
respectively, which contain all plugin-agnostic configuration and runtime features. Please refer to the documentation 
of the _Aggregator_ plugin for an overview of how a basic plugin can be implemented.