# Table of contents
1. [Introduction](#introduction)
2. [dcdbpusher](#dcdbpusher)
    1. [Global Configuration](#globalConfiguration)
    2. [REST API](#restApi)
        1. [Table of queries](#tableOfQueries)
        2. [Resources for GET request](#ressourcesGET)
        3. [Resources for PUT request](#ressourcesPUT)
        4. [Examples](#restExamples)
    3. [MQTT topic](#mqttTopic)
3. [Plugins](#plugins)
    1. [IPMI](#ipmi)
    2. [Perf-event](#perf)
        1. [type and config](#perfTypeConfig)
        2. [Footnotes](#perfFootnotes)
    3. [SNMP](#snmp)
    4. [SysFS](#sysfs)
    5. [PDU](#pdu)
    6. [BACnet](#bacnet)
    7. [OPA](#opa)
        1. [counterData](#opaCounterData)
    8. [ProcFS](#procfs)
    9. [Writing own plugins](#writingOwnPlugins)
4. [TODOS](#todos)

## Introduction <a name="introduction"></a>
DCDB (DataCenter DataBase) is a database that collects various (sensor) values of a datacenter for further analysis.
Harvesting the data is the task of dcdbpusher.

# dcdbpusher <a name="dcdbpusher"></a>

This is a general MQTT pusher which sends the values of various sensors to the DCDB database.
It ships with plugins for BACnet, IPMI, PDU (a proprietary Power Delivery Unit; the plugin could also be used as a generic XML plugin), perf-event, SNMP and SysFS.

Build it by simply running
```bash
make
```
or alternatively use
```bash
make debug
```
within the `dcdbpusher` directory to build a version that prints additional debug information at runtime.

The logic for the various sensors is encapsulated in plugins (shared dynamic libraries; the makefile takes care of compiling them for you). dcdbpusher dynamically opens the libraries that are specified in the [global configuration](#globalConfiguration) file. Conversely, if a particular sensor functionality (e.g. SysFS) is not specified, the corresponding shared library (`libdcdbplugin_sysfs.so`) does not have to be present.

You can run dcdbpusher by executing
```bash
./dcdbpusher path/to/configfile/
```
or run
```bash
./dcdbpusher -h
```
to print the help section of dcdbpusher.

dcdbpusher checks the given path for the global configuration file, which has to be named `global.conf`.

### Global Configuration  <a name="globalConfiguration"></a>

The global configuration specifies general settings for dcdbpusher, e.g. which plugins should be loaded.
Have a look at the provided `config/global.conf` example to get familiar with the file scheme. The example is also a good starting point for writing a custom `global.conf`. The different sections and values are explained in the following table:

| Value | Explanation |
|:----- |:----------- |
| global | Wrapper structure for the global values.
| mqttBroker | Address and port of the MQTT broker which collects the messages (sensor values) sent by dcdbpusher.
| mqttprefix | A common prefix for the MQTT topics of all sensors, so that the full MQTT topic does not have to be written out for every sensor.
| sensorpattern | Pattern used to perform automatic sensor name publishing. See the corresponding [section](#autopublish) for more information.
| threads | Number of threads that are created to handle the sensors asynchronously. Default is 1. Note that the MQTTPusher always starts an extra thread, so the actual number of started threads is always one more than defined here. Specifying too few threads can delay the reading of some sensors.
| maxMsgNum | Maximum number of values that are published in one turn, to avoid publishing too many MQTT messages at once. After reaching this limit the MQTTPusher is forced to sleep for a short time before continuing.
| maxInflightMsgNum | Maximum number of messages that can be "inflight". This is an MQTT term and should match the broker's setting. Set to 0 for unlimited.
| maxQueuedMsgNum | Maximum number of MQTT messages (including "inflight") that should be queued. This limits the amount of memory that is used for buffering. Set to 0 for unlimited.
| verbosity | Level of detail in the logfile (dcdb.log). Set to a value between 5 (all log messages, default) and 0 (only fatal messages). NOTE: the verbosity of the command-line log can be set independently via the -v flag when invoking dcdbpusher.
| daemonize | Set to 'true' if dcdbpusher should run detached as a daemon. Default is false.
| tempdir | A writable directory where dcdbpusher can write its temporary and logging files. Default is the current (`./`) directory.
| cacheInterval | Time interval in seconds. The sensor readings from within the last such interval are kept in the cache. This value can be overwritten by plugins.
| | |
| restAPI | Bundles all values related to the REST API. See the corresponding [section](#restApi) for more information on the supported functionality.
| address | Address and port on which the REST API server should run.
| certificate | (Path and) file which the HTTPS server should use as certificate.
| privateKey | (Path and) file which should be used as the private key corresponding to the certificate. If private key and certificate are stored in the same file, the path to the certificate file must nevertheless be provided here again.
| dhFile | (Path and) file where the Diffie-Hellman parameters for the key exchange are stored.
| authkey | This struct defines authentication key tokens for the REST API. Within the struct, define which operations over the REST API are allowed for the token (e.g. PUTReq or GETReq). Each token must be unique.
| | |
| plugins | In this section one can specify the plugins which should be used.
| plugin _name_ | The plugin name is used to build the corresponding library name (e.g. sysfs --> libdcdbplugin_sysfs.1.0).
| path | Path where the plugin (the shared library) is located. If left empty, dcdbpusher looks for the plugin file in the default library directories (/usr/lib and friends).
| config | A separate config file (including the path to it) for the plugin to use. If not specified, dcdbpusher looks up pluginName.conf (e.g. sysfs.conf) in the directory where global.conf is located.
| | |

The formats of the other sensor-specific config files are explained in the corresponding [subsections](#ipmi). Example configuration files can be found in the `config/` directory.


## REST API <a name="restApi"></a>

dcdbpusher runs an HTTPS server which allows some of its functionality to be controlled over a RESTful API. By default the API is hosted at port 8000 on localhost, but the address can be changed in the [`global.conf`](#globalConfiguration).

An HTTPS request to dcdbpusher should have the following format: `[GET|PUT] host:port[resource]?[queries]`.
Tables with the allowed resources, sorted by REST method, can be found below. A query is a key-value pair of the format `key=value`. Every request must include at least the authentication token as a query. Multiple queries are separated by semicolons (';').
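
The request format described above can be sketched in a few lines of Python. This is only an illustration: the helper name `build_request_url` is hypothetical, no percent-encoding is applied, and the `https://` scheme is assumed from the fact that dcdbpusher runs an HTTPS server.

```python
def build_request_url(host, port, resource, queries):
    """Assemble the URL of a dcdbpusher REST request (illustrative helper).

    `queries` is a dict of key/value pairs. Following the format above,
    the pairs are written as key=value and joined with ';'.
    """
    if "authkey" not in queries:
        raise ValueError("every request needs an authkey query")
    if not resource.startswith("/"):
        resource = "/" + resource
    query_string = ";".join(f"{key}={value}" for key, value in queries.items())
    return f"https://{host}:{port}{resource}?{query_string}"


url = build_request_url("localhost", 8000, "/sysfs/freq1/avg",
                        {"authkey": "myToken", "interval": 15})
print(url)
```

The returned string is only the URL part; the HTTP method (GET or PUT) is chosen by whatever client sends the request.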

### Table of queries <a name="tableOfQueries"></a>

| Key | Value | Explanation |
|:--- |:----- |:----------- |
| authkey | Your authentication token | The authentication token is required to verify that you are allowed to make this particular request. The authkey query must be included in every request.
| json | | Format replies to plugin and sensor requests as JSON instead of plain text.
| interval | Time value in [s] | Only for GET requests of the sensor reading average. One can (optionally) specify a custom time interval over which the average of the sensor readings is calculated. By default, every sensor reading in the cache is used to calculate the average.


### Resources for GET request <a name="ressourcesGET"></a>

| Resource | Explanation |
|:-------- |:----------- |
| /help | Responds with a small cheatsheet which lists all currently supported resources.
| /plugins | (Discovery) Returns a list of all currently loaded plugins.
| /[plugin]/sensors | (Discovery) Returns a list of all sensors which belong to [plugin]. To find out which plugins are available, request the `/plugins` resource.
| /[plugin]/[sensor]/avg | Calculates and returns the average of the last sensor readings. Can be combined with the interval query.
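
The behaviour of the `avg` resource together with the optional `interval` query can be pictured with the following Python sketch. It is not taken from the dcdbpusher sources: the function name, the representation of the cache as `(timestamp, value)` pairs and the handling of an empty selection are assumptions made for illustration.

```python
def average_reading(cache, now, interval=None):
    """Average cached readings given as (timestamp_in_s, value) pairs.

    If `interval` (in seconds) is given, only readings not older than
    `now - interval` are used; otherwise the whole cache is averaged.
    """
    if interval is not None:
        cache = [(ts, value) for ts, value in cache if ts >= now - interval]
    if not cache:
        raise ValueError("no readings in the requested interval")
    return sum(value for _, value in cache) / len(cache)


readings = [(100, 10.0), (110, 20.0), (120, 30.0)]
print(average_reading(readings, now=120))               # uses all three readings
print(average_reading(readings, now=120, interval=15))  # uses the last two readings
```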


### Resources for PUT request <a name="ressourcesPUT"></a>

| Resource | Explanation |
|:-------- |:----------- |
| /[plugin]/[action] | Requests an action on the specified plugin. Currently supported actions are `start`, `stop` and `reload`. `start` and `stop` start or stop the polling of the plugin's sensors, while `reload` triggers a reload of the plugin configuration. If a configuration reload is requested, all sensors of the plugin are stopped first. Then new sensors are created and started according to the configuration.


### Examples <a name="restExamples"></a>

Two examples for HTTPS requests:

```bash
GET https://localhost:8000/sysfs/freq1/avg?authkey=myToken;interval=15
```
```bash
PUT https://localhost:8000/bacnet/stop?authkey=myToken
```

## MQTT topic <a name="mqttTopic"></a>

For communication between the different DCDB components (database, dcdbpusher) the [MQTT protocol](https://mqtt.org/) is used. To identify each sensor, each one must be assigned a unique MQTT topic. An MQTT topic for DCDB consists of exactly 112 bits (= 28 hex characters), not including the '/' separators. The topic for a sensor is built by concatenating up to 4 parts:

1. mqttprefix    (e.g. /00112233445566778899AA)
2. mqttpart of entity (if supported by plugin, e.g. /BB)
3. mqttpart of group    (e.g. /1122)
4. mqttsuffix    (e.g. /3344)

The topic for this sensor is then `/00112233445566778899AA/BB/1122/3344`.
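
As a rough illustration of how the parts listed above are put together, consider the following Python sketch. The helper `build_topic` is hypothetical; it assumes that the parts are simply joined with '/' separators, as in the example topic, and that parts which are not used (such as the entity part) are skipped.

```python
import string


def build_topic(*parts):
    """Join the non-empty topic parts with '/' separators (illustrative)."""
    cleaned = [part.strip("/") for part in parts if part and part.strip("/")]
    for part in cleaned:
        # Every part is expected to consist of hex characters only.
        if any(char not in string.hexdigits for char in part):
            raise ValueError(f"not a hex string: {part!r}")
    return "/" + "/".join(cleaned)


# prefix, entity part, group part, sensor suffix
print(build_topic("/00112233445566778899AA", "/BB", "/1122", "/3344"))
# without an entity part
print(build_topic("/00112233445566778899AA", None, "/1122", "/3344"))
```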

### Automatic sensor name publishing <a name="autopublish"></a>

In order to perform queries through the *dcdbquery* tool for a certain sensor, a mapping from its MQTT topic to the desired displayed name must be supplied through the *dcdbconfig* tool, which stores this information in the underlying Cassandra datastore.
However, dcdbpusher can also handle this task automatically, and perform the publishing of all configured sensors. To enable this feature, users must use the *sensorpattern* global configuration parameter, which can also be supplied through the -a command line option.

Such a sensor pattern is a string defining the naming scheme to be used by dcdbpusher: it is composed of a fixed part, which will be present in the names of all sensors, and of several wildcards, which dcdbpusher automatically replaces with the information of the specific sensor. Currently supported wildcards are:

* \<sensor\>: the sensor's name as set in the configuration file of the corresponding plugin;
* \<group\>: the name of the sensor group to which the single sensor belongs;
* \<plugin\>: the name of the plugin managing the sensor group.

Note that the \<sensor\> wildcard *must* be supplied in order for the sensor pattern to be considered valid. If not, the sensor pattern will be discarded and automatic sensor name publishing will not be performed. An example of a valid sensor pattern is the following:

```
myHostname.<group>.<sensor>
```

This sensor pattern will include the hostname (in this case, *myHostname*) to distinguish between the sensors of different hosts, plus the sensor group and name. For the *MemFree* and *nr_alloc_batch* sensors defined in the default [ProcFS config file](config/procfs.conf), this pattern will produce the following names:

```
myHostname.meminfo.MemFree
myHostname.vmstat.nr_alloc_batch
```

It is advised to always include the group name together with the sensor name in the pattern, as dcdbpusher does not perform any checks on the uniqueness of sensor names.
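
The wildcard substitution described in this section can be sketched as follows. The function `expand_sensor_pattern` is a hypothetical helper for illustration, not part of dcdbpusher; returning `None` for an invalid pattern stands in for "automatic publishing is not performed".

```python
def expand_sensor_pattern(pattern, sensor, group, plugin):
    """Replace the supported wildcards in a sensor pattern (illustrative)."""
    if "<sensor>" not in pattern:
        # The <sensor> wildcard is mandatory; such a pattern is discarded.
        return None
    return (pattern.replace("<sensor>", sensor)
                   .replace("<group>", group)
                   .replace("<plugin>", plugin))


pattern = "myHostname.<group>.<sensor>"
print(expand_sensor_pattern(pattern, "MemFree", "meminfo", "procfs"))
print(expand_sensor_pattern(pattern, "nr_alloc_batch", "vmstat", "procfs"))
```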

# Plugins <a name ="plugins"></a>

The core of dcdbpusher is responsible for collecting all the values read by the sensors and sending them to the database. However, the main functionality of the sensors comes from the various plugins. Every plugin corresponds to a specific sensor functionality.
All plugins share some general principles regarding the sensor structure and configuration. These principles should also be followed when [writing own plugins](#writingOwnPlugins).

1. There are three hierarchical levels (from bottom up):
    1. Sensors
    2. Groups
    3. Entities (optional)
2. There are no standalone sensors. Every sensor belongs to a group.
3. Multiple groups may or may not be aggregated by an entity. Entities can optionally be used by the plugin developer to aggregate groups which belong together, e.g. because they all query the same host.
4. Every hierarchical level is associated with some attributes. The following hints help to decide (when developing own plugins) which attributes belong to which level. For every level the common base attributes, which are specified independently of the plugin, are listed with an explanation:
    1. Entities (if present) hold all attributes which are required to query the represented entity or which all its associated groups have in common. Common entity attributes:
        * __default__     (Name of a template entity (see below) whose values and groups should be used as default)
        * Other entity attributes could be: mqttPart, protocol version, host address and port.
    2. Groups hold all attributes which the sensors belonging to them have in common. Common group attributes:
        * __interval__    (Time in [ms] between two consecutive sensor reads. Default is 1000[ms] = 1[s])
        * __minValues__   (Minimum number of readings the sensors in a group should gather before they are sent together to the database. Useful to reduce MQTT overhead. Default is 1 (every sensor value is sent on its own))
        * __mqttPart__    (Part of the [mqtt-topic](#mqttTopic) which all sensors in this group have in common)
        * __default__     (Name of a template group (see below) whose values and sensors should be used as default)
    3. Sensors hold only those attributes which are necessary to uniquely identify the target sensor. Common base attributes:
        * __mqttsuffix__  (to make its [mqtt-topic](#mqttTopic) unique)
        * __delta__ (identifies a monotonic sensor. If set to "on", the differences between successive readings are collected)
        * __sink__ (path to a file to which sensor readings should be written; disabled by default)
        * __subSampling__ (subsampling factor S. If > 1, only one out of every S readings is sent over MQTT, and the others are kept locally)
5. Be aware that the naming of sensor/group/entity is not fixed. A plugin developer can name them as they like, e.g. counter/multicounter/host.
6. It is possible to define template groups or entities in the config file, but not template sensors (as a sensor should only consist of attributes which make it unique, this would not be too useful). To specify a template group/entity simply prefix its definition with `template_` (see the example below). You can reference it later by using the `default` attribute. A template entity can consist of groups and these in turn can consist of sensors. When using a template, all of its attribute values are copied to the actual entity/group. Copied attributes can be overwritten in the actual entity/group (some of them even should be overwritten, e.g. the mqttPart). However, groups/sensors associated with a template are copied to the actual entity/group and can NOT be overwritten. One can specify further groups/sensors which are then added to those copied from the template. Template entities/groups themselves and the sensors within them are never used in live operation of the plugin. They exist purely for convenient configuration.
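
To make the `delta` and `subSampling` sensor attributes more concrete, here is a small Python sketch of their described effect on a series of readings. The function names are hypothetical, and one detail is an assumption: the text does not say which of every S readings is sent, so the sketch simply picks the first of each block.

```python
def apply_delta(readings):
    """Turn the readings of a monotonic counter into successive differences."""
    return [current - previous for previous, current in zip(readings, readings[1:])]


def subsample(readings, factor):
    """Select the readings that would be sent over MQTT.

    Assumption: the first reading of every block of `factor` readings is
    sent; the remaining ones would only be kept locally.
    """
    if factor <= 1:
        return list(readings)
    return list(readings[::factor])


raw = [100, 130, 170, 220, 280, 350]   # monotonic counter values
deltas = apply_delta(raw)              # differences between successive reads
sent = subsample(deltas, 2)            # with subSampling = 2
print(deltas, sent)
```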
 
Below, two abstract config files are shown to visualize the structure, one without the optional entity level and one with it. A real example configuration file for every plugin should be provided in the `config/` directory. Use them as a starting point for writing your own configuration files.
```
 Without entity:
------------------------------------------------

global {
	mqttprefix /00112233445566778899AABB0000
	cacheInterval 120
	...
}

template_group temp1 {			;template group named temp1 (is not used in live operation)
	interval	1000			;While it is possible to define entities/groups/sensors without
	minValues	3				;a name, this is strongly discouraged. Naming entities/groups/sensors
	mqttPart	AA				;simplifies debugging and in particular makes it possible to reference
								;templates later on. Names should also always be unique.
	sensor s1 {
		mqttsuffix		01
		...						;usually the sensor would require additional attributes
	}

	sensor s2 {
		mqttsuffix		02
		...
	}
}

group g1 {
	default		temp1			;use temp1 as template group
	mqttPart	BB				;overwrite the mqttPart from temp1, to avoid identical
								;mqtt-topics if another group uses the same template
	sensor s3 {					;g1 has now 3 sensors: s1, s2 (both taken over from temp1)
		mqttsuffix		03		;and s3
		...
	}
}

group g2 {						;g2 consists of only one sensor (s21) and uses
 	sensor s21 {				;for every attribute the default value
		mqttsuffix	0000		;by using a longer mqttsuffix we do not need a
		...						;group mqtt-part
	}
}

...
```
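
The way group `g1` picks up its values from the template `temp1` can be pictured with the following Python sketch. It models groups as plain dictionaries and is not how dcdbpusher is implemented; the function name `resolve_group` and the choice to raise an error when a group redefines a template sensor are assumptions for illustration.

```python
import copy


def resolve_group(group, templates):
    """Apply the template named by `default` to a group (illustrative)."""
    template_name = group.get("default")
    if template_name is None:
        return copy.deepcopy(group)
    # Start from a copy of the template: attributes and sensors are copied.
    resolved = copy.deepcopy(templates[template_name])
    sensors = resolved.setdefault("sensors", {})
    for key, value in group.items():
        if key == "default":
            continue
        if key == "sensors":
            # Sensors of the group are added to those from the template.
            for name, sensor in value.items():
                if name in sensors:
                    raise ValueError(f"sensor {name!r} comes from the template")
                sensors[name] = copy.deepcopy(sensor)
        else:
            # Attributes set in the group overwrite the copied values.
            resolved[key] = value
    return resolved


templates = {"temp1": {"interval": 1000, "minValues": 3, "mqttPart": "AA",
                       "sensors": {"s1": {"mqttsuffix": "01"},
                                   "s2": {"mqttsuffix": "02"}}}}
g1 = {"default": "temp1", "mqttPart": "BB",
      "sensors": {"s3": {"mqttsuffix": "03"}}}
print(resolve_group(g1, templates))
```

With these inputs the resolved group keeps `interval` and `minValues` from the template, uses `BB` as its `mqttPart`, and contains the sensors `s1`, `s2` and `s3`.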

```
 With entity:
------------------------------------------------

global {
	...
}

template_entity temp1 {				;template entity which is not used in live operation
	...								;here go entity attributes

	group g1 {						
		interval	1000
		minValues	3
		mqttPart	AA
		
		sensor s1 {
			mqttsuffix		01
			...						;usually the sensor would require additional attributes
		}
	
		sensor s2 {
			mqttsuffix		02
			...
		}
	}
}

entity ent1 {
	default		temp1				;use temp1 as template entity
	
	group g2 {						;ent1 has now two groups (g1 and g2) with a total of
	 	sensor s21 {				;3 sensors (s1, s2, s21)
			mqttsuffix	0000
			...
		}
	}
}

...
```

You may have noticed the `global` section in the examples, which was not mentioned before. In this section the user can (but is not obliged to) overwrite values from `global.conf` for this plugin or specify other settings which are global to this plugin.

## IPMI <a name="ipmi"></a>

The [IPMI](https://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface) plugin enables dcdbpusher to collect sensor values offered by a baseboard management controller (BMC).

Explanation of the values specific to the IPMI plugin:

| Value | Explanation |
|:----- |:----------- |
| sessiontimeout | Session timeout value for the IPMI connection.
| retransmissiontimeout | Retransmission timeout value for the IPMI connection.
| username | User name for the remote IPMI connection (login credentials are required).
| password | Password for the remote IPMI connection (login credentials are required).
| ipmiversion | IPMI version to use for LAN connections (1 or 2).
| cipher | Cipher to use for IPMI 2.0 LAN connections (currently supported: 0, 1, 2, 3, 6, 7, 8, 11, 12).
| cmd | A raw IPMI command (in hex notation) to be sent. In this case the `lsb` and `msb` fields for the response also have to be defined. Alternatively, one can define the record ID of the sensor (see below).
| lsb | Offset of the least significant byte of the wanted return value within the response to a raw IPMI command<sup>[1](#ipmifn1)</sup>
| msb | Offset of the most significant byte of the wanted return value within the response to a raw IPMI command<sup>[1](#ipmifn1)</sup>
| recordId | Record ID of the sensor to be read. One can look up the record IDs of all sensors with the "ipmi-sensors" command line tool (ships with the freeipmi library). Alternatively, one can define a raw IPMI command (see above).
| factor | A factor to scale the read value before it is stored in the database (to adjust precision).

#### Footnotes <a name="ipmiFootnotes"></a>

<a name="ipmifn1">**1**</a>: &ensp; Use lsb > msb values if response is little-endian (LSB first), use lsb < msb values if response is big-endian (MSB first). Maximum length is 8 bytes.  
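
How a value could be assembled from a raw IPMI response using the `lsb` and `msb` offsets is sketched below in Python. This is an illustration based only on the meaning of the two fields (offset of the least and of the most significant byte); the function name is hypothetical and the actual plugin code may differ.

```python
def extract_value(response, lsb, msb):
    """Assemble an unsigned integer from response[lsb] .. response[msb].

    The byte at offset `lsb` gets the lowest weight and the byte at offset
    `msb` the highest, whichever of the two offsets is larger.
    """
    length = abs(msb - lsb) + 1
    if length > 8:
        raise ValueError("at most 8 bytes are supported")
    step = 1 if msb >= lsb else -1
    value = 0
    for weight, offset in enumerate(range(lsb, msb + step, step)):
        value |= response[offset] << (8 * weight)
    return value


response = bytes([0x00, 0x34, 0x12, 0xFF])
print(hex(extract_value(response, lsb=1, msb=2)))
print(hex(extract_value(response, lsb=2, msb=1)))
```
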
## Perf-event <a name="perf"></a>

The perf-event plugin collects data from the various performance counters (PMUs) of the CPUs.

> NOTE &ensp;&ensp;&ensp; The perf-event plugin measures PMUs for all processes running on a specific CPU. Therefore a value of less than 1 is required in `/proc/sys/kernel/perf_event_paranoid`. Other values (>=1) restrict the access to PMUs. See this [footnote](#fn1) for additional information.
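
Whether the current system satisfies the requirement from the note above can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names; it only looks at the `perf_event_paranoid` value and does not consider process capabilities.

```python
from pathlib import Path


def read_paranoid_level(path="/proc/sys/kernel/perf_event_paranoid"):
    """Read the current perf_event_paranoid level as an integer."""
    return int(Path(path).read_text().strip())


def cpu_wide_monitoring_allowed(paranoid_level):
    """A level of less than 1 is required to measure all processes on a CPU."""
    return paranoid_level < 1


print(cpu_wide_monitoring_allowed(2))   # restricted
print(cpu_wide_monitoring_allowed(0))   # sufficient for the perf-event plugin
```

On a Linux machine, `cpu_wide_monitoring_allowed(read_paranoid_level())` combines the two helpers.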

Explanation of the values specific to the perf-event plugin:

| Value | Explanation |
|:----- |:----------- |
| type | Type of the counter. Each type determines different possible values for the config field. Possible type values are described below.
| config | Together with the type field, config determines which performance counter should be read. Possible values and what they measure are listed below.
| cpus | A comma-separated list of CPU numbers (value ranges can also be specified, e.g. 2-4 equals 2,3,4). The hardware counter will then only be opened on the specified CPUs.
| htVal | Multiplier for CPU aggregation. All CPUs for which (CPU-number % htVal) has the same result are aggregated together. Only CPUs which are included in the "cpus" field (or all CPUs if the "cpus" field is not present) are aggregated. Background: to reduce the amount of pushed sensor data, it is possible to aggregate CPU readings. This feature is specifically aimed at processors with hyper-threading enabled, but can also come in handy for other use cases. Only the values pushed via the MQTTPusher are aggregated. There still exist sensors for each CPU and they store unaggregated readings in their local caches.
| mqttsuffix | In the context of the perf-event plugin the mqttsuffix requires a placeholder ('x') for the CPU id. Sensors are duplicated in order to open a hardware counter for each CPU. Therefore the mqttsuffix should contain a placeholder consisting of 'x' which is replaced by the CPU id to make the suffix unique.
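
The `cpus` list syntax and the grouping rule of `htVal` can be illustrated with the following Python sketch. The function names are hypothetical, and it is an assumption that aggregation means summing the readings of the CPUs in one group; the table above only states which CPUs are grouped together.

```python
from collections import defaultdict


def parse_cpu_list(spec):
    """Expand a list such as "0,2-4" into [0, 2, 3, 4]."""
    cpus = set()
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            first, last = (int(bound) for bound in item.split("-"))
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(item))
    return sorted(cpus)


def aggregate_by_ht(readings, ht_val):
    """Combine per-CPU readings of all CPUs with the same (cpu % ht_val).

    Assumption: readings of one group are summed.
    """
    totals = defaultdict(int)
    for cpu, value in readings.items():
        totals[cpu % ht_val] += value
    return dict(totals)


print(parse_cpu_list("0,2-4"))
print(aggregate_by_ht({0: 10, 1: 20, 2: 30, 3: 40}, ht_val=2))
```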

> NOTE &ensp;&ensp;&ensp; As perf-event counters are usually monotonic, the delta attribute is enabled by default for all sensors. One has to explicitly set delta to "off" for a sensor to override this behaviour.


### type and config <a name="perfTypeConfig"></a>

(see the [perf_event_open man-page](http://man7.org/linux/man-pages/man2/perf_event_open.2.html) for more detailed explanations)

| Type | Config | Explanation |
|:----:|:------ |:----------- |
| PERF_TYPE_HARDWARE | | generalized hardware CPU events
| " | PERF_COUNT_HW_CPU_CYCLES | total cycles (affected by frequency scaling)
| " | PERF_COUNT_HW_INSTRUCTIONS | retired instructions
| " | PERF_COUNT_HW_CACHE_REFERENCES | cache accesses (usually last level)
| " | PERF_COUNT_HW_CACHE_MISSES | cache misses (usually last level)
| " | PERF_COUNT_HW_BRANCH_INSTRUCTIONS | retired branch instructions
| " | PERF_COUNT_HW_BRANCH_MISSES | mispredicted branch instructions
| " | PERF_COUNT_HW_BUS_CYCLES | bus cycles
| " | PERF_COUNT_HW_STALLED_CYCLES_FRONTEND | stalled cycles during issue
| " | PERF_COUNT_HW_STALLED_CYCLES_BACKEND  | stalled cycles during retirement
| " | PERF_COUNT_HW_REF_CPU_CYCLES | total cycles (unaffected by frequency scaling)
| | | |
| PERF_TYPE_SOFTWARE | | software events provided by the kernel
| " | PERF_COUNT_SW_CPU_CLOCK | reports CPU clock
| " | PERF_COUNT_SW_TASK_CLOCK | clock count specific to the running task
| " | PERF_COUNT_SW_PAGE_FAULTS | number of page faults
| " | PERF_COUNT_SW_CONTEXT_SWITCHES | count of context switches
| " | PERF_COUNT_SW_CPU_MIGRATIONS | times the process has migrated to a new CPU
| " | PERF_COUNT_SW_PAGE_FAULTS_MIN | number of minor page faults (no disk-I/O)
| " | PERF_COUNT_SW_PAGE_FAULTS_MAJ | number of major page faults (disk-I/O was required)
| " | PERF_COUNT_SW_ALIGNMENT_FAULTS | alignment faults when accessing unaligned memory
| " | PERF_COUNT_SW_EMULATION_FAULTS | number of unimplemented instructions which had to be emulated
| " | PERF_COUNT_SW_DUMMY | placeholder which counts nothing
| | | |
| PERF_TYPE_TRACEPOINT | | not yet implemented
| PERF_TYPE_HW_CACHE | | not yet implemented
| | | |
| PERF_TYPE_RAW | | user can define architecture-specific raw events here.
| " | *XXXX* | Config must be a raw event config value, see <sup>[2](#fn2)</sup>
| | | |
| PERF_TYPE_BREAKPOINT | --- | config not required, any values will be ignored. However config must still be specified (even if empty)
| \<Custom\> | \<Custom\> | dynamic PMU event, see <sup>[3](#fn3)</sup>

#### Footnotes <a name="perfFootnotes"></a>

Taken from the [perf_event_open man-page](http://man7.org/linux/man-pages/man2/perf_event_open.2.html):

<a name="fn1">**1**</a>: &ensp; The pid and cpu arguments allow specifying which process and CPU to monitor:  
[...]  
pid == -1 and cpu >= 0  
This measures all processes/threads on the specified CPU. This requires CAP_SYS_ADMIN capability or a /proc/sys/kernel/perf_event_paranoid value of less than 1.

[...]

The perf_event_paranoid file can be set to restrict access to the performance counters.

| Value | Restriction |
|:-----:|:----------- |
| 2 | allow only user-space measurements (default since Linux 4.6) |
| 1 | allow both kernel and user measurements (default before Linux 4.6) |
| 0 | allow access to CPU-specific data but not raw trace-point samples |
| -1 | no restrictions |
	
The existence of the perf_event_paranoid file is the official method for determining if a kernel supports perf_event_open()

<a name="fn2">**2**</a>: &ensp; If type is *PERF_TYPE_RAW*, then a custom "raw" config value is needed. Most CPUs support events that are not covered by the "generalized" events. These are implementation defined; see your CPU manual (for example the Intel Volume 3B documentation or the AMD BIOS and Kernel Developer Guide). The libpfm4 library can be used to translate from the name in the architectural manual to the raw hex value perf_event_open() expects in this field.

<a name="fn3">**3**</a>: &ensp; Custom type and config values can be specified to use the PMU of a specific device. The necessary configuration parameters can be obtained from the type and config files in the respective `/sys/devices/<device>` tree.
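
As a small illustration of footnote 3, the following Python sketch reads the PMU type number from the `type` file of a device directory. The function name is hypothetical and the directory passed in is up to the user; the sketch only assumes that the `type` file contains a single integer.

```python
from pathlib import Path


def read_pmu_type(device_dir):
    """Return the integer stored in the `type` file of a PMU device directory."""
    return int((Path(device_dir) / "type").read_text().strip())
```

A call such as `read_pmu_type("/sys/devices/<device>")`, with `<device>` replaced by the name of an actual device directory, yields the value to use in the type field.
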
## snmp <a name="snmp"></a>
Micha Mueller's avatar
Micha Mueller committed
398

Micha Mueller's avatar
Micha Mueller committed
399
The SNMP plugin enables dcdbpusher to talk with devices which have an SNMP agent running and query requests from them. A SNMP sensor corresponds to a single value as identified by the unique OID. Sensors are aggregated by connections. See the exemplary snmp.conf file in the `config/` directory.
400
> NOTE &ensp;&ensp;&ensp; In the SNMP context the word privacy is used synonymously for encryption.
Micha Mueller's avatar
Micha Mueller committed
401

Micha Mueller's avatar
Micha Mueller committed
402
403
Explanation of the values specific for the SNMP plugin:

| Value | Explanation |
|:----- |:----------- |
| connection | An aggregating connection.
| Type | Type of the SNMP application running on the device queried by the connection. Currently only the type Agent is supported.
| Host | Host name of the device to be queried.
| Port | SNMP port of the device. This is usually 161 and should not need to be changed.
| OIDPrefix | OID prefix that is used for all following sensors.
| |
| Version | Which SNMP version to use (either 2 (maps to 2c) or 3).
| Community | Which SNMP community to use (only required for version 2).
| Username | Username to authenticate with (only required for version 3).
| SecLevel | The security level to be used (only required for version 3). Can be `noAuthNoPriv` for neither authentication nor privacy ("privacy" is SNMP's synonym for encryption), `authNoPriv` for authentication only, or `authPriv` for authentication and privacy.
| AuthProto | Which protocol to use for authentication (only required for version 3 and if SecLevel != noAuthNoPriv). Can be MD5 or SHA1.
| AuthKey | The passphrase for authentication (only required for version 3 and if SecLevel != noAuthNoPriv). Must be at least 8 characters long.
| PrivProto | Which protocol to use for privacy (only required for version 3 and if SecLevel = authPriv). Can be DES or AES.
| PrivKey | The passphrase for privacy encryption (only required for version 3 and if SecLevel = authPriv). Must be at least 8 characters long.
| mqttPart | Connection-specific MQTT part which is appended to the MQTT prefix and followed by the sensor-specific suffix.
| |
| OID | OID suffix which, together with the OIDPrefix, forms the unique OID identifying the value to query.
| passphrase | Has to be at least 8 characters long.
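
The dependencies between Version, SecLevel and the authentication/privacy settings in the table above can be summarised as a small check. The following Python sketch is purely illustrative: the dictionary representation of a connection and the function `check_snmp_connection` are assumptions made for this example and do not reflect how dcdbpusher parses `snmp.conf`.

```python
def check_snmp_connection(conf):
    """Return a list of problems in a connection's settings (empty if none).

    `conf` is a hypothetical dict using the key names from the table above.
    """
    version = conf.get("Version")
    if version == 2:
        required = ["Community"]
    elif version == 3:
        required = ["Username", "SecLevel"]
        sec_level = conf.get("SecLevel")
        # Authentication settings are needed unless SecLevel is noAuthNoPriv.
        if sec_level in ("authNoPriv", "authPriv"):
            required += ["AuthProto", "AuthKey"]
        # Privacy (encryption) settings are only needed for authPriv.
        if sec_level == "authPriv":
            required += ["PrivProto", "PrivKey"]
    else:
        return ["Version must be 2 or 3"]

    problems = ["missing " + key for key in required if key not in conf]
    # Passphrases must be at least 8 characters long.
    for key in ("AuthKey", "PrivKey"):
        if key in required and key in conf and len(conf[key]) < 8:
            problems.append(key + " must be at least 8 characters long")
    return problems


if __name__ == "__main__":
    print(check_snmp_connection({"Version": 2, "Community": "public"}))
```

For a version 3 connection with `SecLevel` set to `authPriv`, all of Username, AuthProto, AuthKey, PrivProto and PrivKey are expected, whereas a version 2 connection only needs a Community.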

## sysFS <a name="sysfs"></a>

SysFS sensors read data from sysFS files. The configuration file of the plugin corresponds to the generic plugin configuration with standalone sensors. In addition, the following parameters are mandatory or optional for a sysFS sensor:

Explanation of the values specific for the sysFS plugin:

| Value | Explanation |
|:----- |:----------- |
| path | Path to the sysFS file the sensor should read from. This parameter is mandatory.
| filter | An optional filter can be defined if the sysFS file contains more than just the sensor value. Please note the following points for filters: <br> 1. The filter supports substitutions, written in sed syntax (`s/.../.../`). Extended regular expressions (ERE) are used as regex syntax. ERE is closest to the Basic RE (BRE) that sed actually uses, but requires less escaping. <br> 2. If a `\` (backslash) is needed in the regex (for escaping), always use `\\` (double backslash), as the regex is read in as a string and strings also escape with a backslash. <br> 3. Whitespace is used as value separator in the config files. If your filter requires whitespace, either use `[[:space:]]` in the regex or put the filter in quotation marks (`""`). <br> 4. To be able to reference parts of the match (for substitution), use groups. Groups are created with parentheses. <br> 5. If using character classes like `[[:digit:]]`, always make sure to use double brackets (`[[` and `]]`) or they will not be recognized. <br> See [ERE-syntax](https://www.gnu.org/software/sed/manual/html_node/ERE-syntax.html#ERE-syntax) <br> See [substitution syntax](http://www.boost.org/doc/libs/1_65_1/libs/regex/doc/html/boost_regex/format/sed_format.html)
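
To get a feeling for what such a filter expresses, the following Python sketch applies a sed-style substitution to the content of a file. It is a rough emulation only: it uses Python's `re` module instead of the regex engine used by the plugin, translates just a few POSIX character classes, and assumes that `/` does not occur inside the pattern or the replacement. The function name and the sample file content are invented for this example.

```python
import re

# Python's re module has no POSIX classes, so translate the common ones.
POSIX_CLASSES = {
    "[:digit:]": "0-9",
    "[:space:]": r"\s",
    "[:alpha:]": "a-zA-Z",
}


def apply_sed_filter(expr, text):
    """Apply a filter of the form s/pattern/replacement/ to text."""
    # Naive split: assumes '/' is only used as the sed delimiter.
    command, pattern, replacement, _flags = expr.split("/")
    if command != "s":
        raise ValueError("only substitutions (s/.../.../) are supported")
    for posix, python_class in POSIX_CLASSES.items():
        pattern = pattern.replace(posix, python_class)
    # Groups in the pattern can be referenced as \1, \2, ... in the replacement.
    return re.sub(pattern, replacement, text, count=1)


if __name__ == "__main__":
    content = "fan=1200 temp=42000 mV"  # made-up file content
    print(apply_sed_filter(r"s/.*temp=([[:digit:]]+).*/\1/", content))
```

The group around `[[:digit:]]+` captures the sensor value, and `\1` in the replacement keeps only that group, which is the typical use of a filter. Keep in mind that in the actual config file every backslash has to be doubled, as explained in point 2.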

## PDU <a name="pdu"></a>

The Power Delivery Unit (PDU) plugin is in charge of sending a network-request to the PDUs and gathering specified sensor data from the XML-file response.

Explanation of the values specific for the PDU plugin:

| Value | Explanation |
|:----- |:----------- |
| host | Hostname and (optional) port from which to fetch the XML file with sensor data. If no port is specified, 443 is used. The plugin requests the file via HTTPS.
| TTL | To avoid requesting a fresh XML file every time a sensor reads its value, a time to live (TTL) can be defined for the file here. A new XML file is requested at the earliest once the TTL has expired. Default value is 1000 ms.
| request | The request to be sent to the host via HTTPS, given as a string. The request should be put in quotation marks (`"`) to allow whitespace within it. Special characters within the request should be escaped (`"` --> `\"`; `\` --> `\\`; newline --> `\n`; ...).
| path | A dot-separated path to the value to be read in the XML file. Attribute values that a node has to fulfil can be specified in brackets after the node. Multiple (comma-separated) attributes can be given, but no whitespace should be used, as it is not filtered out and would therefore be treated as part of the attribute's name.
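
The effect of the TTL setting can be pictured as a small cache in front of the HTTPS request. The Python sketch below is only meant to illustrate the idea and is not the plugin's implementation; the class name `CachedResponse` and its parameters are hypothetical.

```python
import time


class CachedResponse:
    """Keep the last response and re-fetch it only after the TTL expired."""

    def __init__(self, fetch, ttl_ms=1000, clock=time.monotonic):
        self._fetch = fetch          # callable performing the actual request
        self._ttl_s = ttl_ms / 1000.0
        self._clock = clock          # injectable clock, handy for testing
        self._fetched_at = None
        self._body = None

    def get(self):
        now = self._clock()
        expired = (self._fetched_at is None
                   or now - self._fetched_at >= self._ttl_s)
        if expired:
            # Only now is a new XML file requested from the host.
            self._body = self._fetch()
            self._fetched_at = now
        return self._body


if __name__ == "__main__":
    cache = CachedResponse(lambda: "<xml/>", ttl_ms=1000)
    print(cache.get(), cache.get())  # the second call is served from the cache
```

With many sensors reading from the same PDU, all reads that fall within one TTL window share a single request, which keeps the load on the device low.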

## BACnet <a name="bacnet"></a>

The BACnet plugin enables dcdbpusher to communicate with and request data from devices which use the BACnet protocol. A so-called "Read Property" request is sent by the plugin to the BACnet devices as configured in the config file. The response value is then stored in the database. Usually one is only interested in collecting the current reading of a BACnet device (property PROP_PRESENT_VALUE, ID 85). However, reading other properties is also supported.
> NOTE &ensp;&ensp;&ensp; On startup the BACnet plugin performs no device discovery. Instead it relies on the user providing a file with the addresses of all required BACnet devices. Such an address file can be generated, for example, with the `bacwi` demo tool provided by the BACnet-Stack.

Explanation of the values specific for the BACnet plugin:

| Value | Explanation |
|:----- |:----------- |
| address_cache | (Path to and) filename of the address cache file where the addresses of BACnet devices are stored (as noted above).
| interface | Network interface (IPv4) to be used by the plugin to send its "Read Property" requests.
| port | Port to use on the interface.
| timeout | Time in microseconds to wait for a response packet.
| apdu_timeout | Time in microseconds before sending a request times out.
| apdu_retries | How often sending a request should be retried.
| templates | Template properties can be defined in this section for convenience.
| factor | Described in the section for the [IPMI-plugin](#ipmi).
| devices | Starts the part of the config file where the actual BACnet devices are configured. A BACnet device consists of multiple nested parts: device > objects > properties.
| instance (device) | Instance of the BACnet device.
| type | Type of the object within the device.
| instance (object) | Instance of the object within the device.
| id | ID of the property to be read from the BACnet device object. Numbers are assigned to properties according to the enum defined in `bacenum.h`.

## Opa (Intel Omni-Path Architecture) <a name="opa"></a>

The Opa plugin enables dcdbpusher to query various counters from Omni-Path interconnects.

Explanation of the values specific for the Opa plugin:

| Value | Explanation |
|:----- |:----------- |
| hfiNum | Number of the Omni-Path Host Fabric Interface (HFI) to query (starting with 1).
| portNum | Number of the Omni-Path port to query (starting with 1).
| cntData | Name of the data counter to query. A list of possible values can be found below.


> NOTE &ensp;&ensp;&ensp; As Opa counters are usually monotonic, the delta attribute is set to true by default for all sensors. One has to explicitly set delta to "off" for a sensor to override this behaviour.

### counterData <a name="opaCounterData"></a>

Possible values for cntData:
* portXmitData
* portRcvData
* portXmitPkts
* portRcvPkts
* portMulticastXmitPkts
* portMulticastRcvPkts
* localLinkIntegrityErrors
* fmConfigErrors
* portRcvErrors
* excessiveBufferOverruns
* portRcvConstraintErrors
* portRcvSwitchRelayErrors
* portXmitDiscards
* portXmitConstraintErrors
* portRcvRemotePhysicalErrors
* swPortCongestion
* portXmitWait
* portRcvFECN
* portRcvBECN
* portXmitTimeCong
* portXmitWastedBW
* portXmitWaitData
* portRcvBubble
* portMarkFECN
* linkErrorRecovery
* linkDowned
* uncorrectableErrors

## ProcFS (/proc filesystem) <a name="procfs"></a>

The ProcFS plugin enables dcdbpusher to sample resource usage metrics from a variety of files in the /proc virtual filesystem generated by the Linux kernel. Each defined sensor group is assigned to a specific file, which is periodically parsed. Currently supported files for sampling are:
* /proc/vmstat: contains virtual memory-related usage metrics;
* /proc/meminfo: contains RAM memory-related usage metrics (note that some of the metrics overlap with /proc/vmstat);
* /proc/stat: contains CPU usage-related metrics, both at system and core levels.

Note that the ProcFS plugin can operate in two distinct modes, with respect to MQTT topics:
* Automatic: if no sensors are specified, all metrics discovered in the underlying parsed file are acquired; sensors and MQTT topics are generated for them. Please be careful when configuring the plugin so that its MQTT topics do not overlap with those of other plugins.
* Manual: If at least one sensor is specified, only the corresponding metrics are acquired, and all other metrics in the parsed file are discarded. MQTT topics are assigned accordingly, using the mqttPrefix, mqttPart and mqttSuffix fields.
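
The difference between the two modes boils down to which metrics of the parsed file end up as sensors. A minimal Python sketch of that selection, with a made-up function name and example metric names, could look as follows; the real plugin additionally assigns MQTT topics, which is left out here.

```python
def select_metrics(available, configured):
    """Return the metric names for which sensors are created.

    available:  metric names found in the parsed file
    configured: metric names of the sensors given in the config (may be empty)
    """
    if not configured:
        # Automatic mode: every discovered metric is acquired.
        return list(available)
    # Manual mode: only configured metrics are kept; configured sensors
    # that do not match any metric in the file are discarded.
    return [name for name in configured if name in available]


if __name__ == "__main__":
    found = ["nr_free_pages", "pgfault", "pgmajfault"]
    print(select_metrics(found, []))
    print(select_metrics(found, ["pgfault", "no_such_metric"]))
```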

Explanation of the values specific for the ProcFS plugin:

| Value | Explanation |
|:----- |:----------- |
| type | The type of the file parsed by the sensor group. Can be "vmstat", "meminfo", "procstat" or "sar".
| path | Path of the file, if different from the default path in the /proc filesystem.
| mqttPart | The mqttPart can be used as a placeholder. For sensors associated with core-specific metrics (e.g. some of those in /proc/stat), the mqttPart is replaced with the CPU id. For all other, system-wide metrics, the mqttPart is used as it is.
| mqttStart | Base MQTT suffix that is automatically incremented to generate topics for sensors associated with metrics in the same file. Note that this parameter is only used for automatic MQTT topic generation, i.e. when no sensors are explicitly defined.
| cpus | Defines the set of CPU cores for which metrics must be collected. Only affects extraction of core-specific metrics (e.g. those in /proc/stat), whereas system-level metrics are acquired regardless of this setting. If no set of CPU cores is defined, metrics for all available CPU cores will be collected. This parameter follows the same syntax as in the Perf-event plugin.

Additionally, sensors in the ProcFS plugin (defined with the "metric" keyword) support the following additional values:

| Value | Explanation |
|:----- |:----------- |
| type | The type of the specific metric associated to the sensor. This field must match the exact name of a metric in the underlying parsed file. If such a match does not exist, the sensor is discarded.
| perCpu | Boolean. If set to "on", the metric will be collected for each CPU core specified with the "cpus" sensor group parameter, or for all CPU cores if none is specified. Otherwise, the metric will be collected only at system level. This parameter has no effect on metrics that are not acquired at CPU core level (e.g. those in /proc/vmstat and /proc/meminfo).

The "type" field can be inferred for each sensor by simply checking the underlying file parsed by the sensor group. For /proc/stat files, on the other hand, CPU core-related metrics are collected in separate columns, which adopt the following naming scheme that can be used to define sensors: 
* col_user 
* col_nice 
* col_system 
* col_idle 
* col_iowait
* col_irq
* col_softirq
* col_steal
* col_guest
* col_guest_nice
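
The column names above follow the order of the values in a `cpu` line of /proc/stat. As an illustration, the following Python sketch maps such a line onto these names; the function name and the sample line are made up, and the plugin's own parser may work differently.

```python
# Column names in the order in which the values appear in a cpu line.
STAT_COLUMNS = [
    "col_user", "col_nice", "col_system", "col_idle", "col_iowait",
    "col_irq", "col_softirq", "col_steal", "col_guest", "col_guest_nice",
]


def parse_cpu_line(line):
    """Split a cpu line of /proc/stat into (cpu name, {column: value})."""
    fields = line.split()
    name, values = fields[0], fields[1:]
    # zip() stops at the shorter sequence, so additional columns that a
    # newer kernel might append are ignored.
    return name, dict(zip(STAT_COLUMNS, (int(v) for v in values)))


if __name__ == "__main__":
    sample = "cpu0 4705 150 1120 16250 520 0 17 0 0 0"  # made-up values
    print(parse_cpu_line(sample))
```

A sensor with type `col_idle` and perCpu set to "on" would then correspond to the fourth value of each selected `cpuN` line.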

Additional CPU-related metrics (that may be introduced in future versions of the Linux kernel) are not supported by the DCDB ProcFS plugin.
Note that for /proc/meminfo instances, an additional synthetic sensor of type "MemUsed" can be defined. This sensor will automatically extract the amount of used memory from the MemTotal and MemFree values present in meminfo files.
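
As a rough illustration of the "MemUsed" sensor, the sketch below derives used memory from a meminfo-style text. It assumes that MemUsed is simply MemTotal minus MemFree, which is the most direct reading of the description above but is not confirmed here; the function name and sample values are invented.

```python
def mem_used_kb(meminfo_text):
    """Compute MemTotal - MemFree (in kB) from the content of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token after the colon is the numeric value.
            values[key.strip()] = int(parts[0])
    return values["MemTotal"] - values["MemFree"]


if __name__ == "__main__":
    sample = "MemTotal:       16384000 kB\nMemFree:         4096000 kB\n"
    print(mem_used_kb(sample))
```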

## Writing own plugins <a name="writingOwnPlugins"></a>
First make sure you read the [plugins](#plugins) section.
Try out the `pluginGenerator/generatePlugin.sh` script!
TODO!

#### TODOS <a name="todos"></a>
* explain special cases in the code for every plugin?
* more about mqtt-topic