35 Commits

Author SHA1 Message Date
Brett T. Warden 9bb9288153 Link telemetry post daemon against json-c
also link its test
2025-03-10 12:06:12 -07:00
Alex Jaramillo f4f012af7b fix first record missing from journal
This change fixes the condition when the first record after telempostd
starts is not inserted in the telemetry journal.

Notice: the logic has changed. Previously, records were inserted into
the journal as soon as they were processed. After this change, records
are inserted in the journal only when successfully delivered or when
record_server_delivery_enabled is set to false.
2020-02-24 13:27:32 -08:00
avjarami 622fcfca10 pr comments and actions update 2020-01-22 16:08:15 -08:00
avjarami d30eb663b3 fixing memory leaks in tests 2020-01-22 16:08:15 -08:00
avjarami 885e325e1e valgrind check for tests 2020-01-22 16:08:15 -08:00
Juro Bystricky 7a3082eb59 Merge BERT probe with klogscanner
New Linux kernels detect/interpret/display BERT errors in klog.
This makes bertprobe unnecessary, as all the information can
be simply grabbed from klog. This also removes the need to
encode the binary data into HEX/ASCII, so all the encoding
code can be removed as well, including the test suite.

The change was implemented by adding a new pattern for BERT
in the oops_parser.c with some additional minor changes.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
2019-12-13 15:16:20 -08:00
Alex Jaramillo 5cbd31cbf2 Require explicit telemctl opt-in
telemetrics-client starts when the package is installed. This change
makes sure that starting telemetry for the first time requires two
steps: (1) telemctl opt-in and (2) telemctl start

Signed-off-by: Alex Jaramillo <alex.jch@gmail.com>
2019-10-14 11:02:20 -07:00
Juro Bystricky 672e741e5d Fix build for logtype=systemd
Build is broken for multiple binaries when configured for logging
to systemd journal:

$ ./configure --enable-logtype=systemd
$ make

All binaries that use the routine "telem_log" must link to additional
libraries when logging to systemd journal.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
2019-07-29 09:57:46 -07:00
Juro Bystricky c264cd044c check_probes.c: fix for compiler warning [-Wstringop-overflow=]
Replaced strncpy with memcpy and added some buffer overflow checks in
order to avoid GCC9 compiler warning:
warning: ‘__builtin___strncpy_chk’ specified bound depends on the length of the source argument [-Wstringop-overflow=]

While in there, removed one unused global variable and declared the remaining
global variables as static.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
2019-06-05 10:56:18 -07:00
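The strncpy-to-memcpy replacement described in the commit above can be illustrated with a small sketch. This is not the code from check_probes.c; the helper name copy_bounded and its return convention are hypothetical. The idea is to measure the source once, check it against the destination capacity, and only then copy with memcpy, so the copy bound no longer depends on the source length in the way GCC 9 warns about.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the NUL-terminated string src into dst, which
 * has room for dst_size bytes. Returns 0 on success. Returns -1 if src
 * (including its terminator) does not fit; dst is then left as an empty
 * string when it has any room at all.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
        size_t len;

        assert(dst != NULL && src != NULL);

        if (dst_size == 0) {
                return -1;
        }

        len = strlen(src);
        if (len >= dst_size) {
                /* Overflow check: refuse instead of truncating silently. */
                dst[0] = '\0';
                return -1;
        }

        /* len + 1 also copies the terminating NUL byte. */
        memcpy(dst, src, len + 1);
        return 0;
}
```

A caller would pass sizeof(buffer) as dst_size and treat -1 as "input too long".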
Juro Bystricky 2639e4d042 daemons: name space separation
The telempostd and telemprobd daemons both used identical
name "initialize_daemon" for daemon initialization.
Although the name was identical, the code for each daemon initialization
was different, leading to some potential confusion.
Also moved telemprobd specific code "stage_record" from iorecord.c to
telemdaemon.c
Modified local.mk accordingly.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
2019-04-22 19:45:40 -07:00
California Sullivan d873c4e067 configuration: rework to allow layered configuration
Have a default configuration set in the binary, with conf files only
being used to change from the defaults. This allows making simpler
configs which only toggle specific options, which may be useful for user
configuration or testing. See src/data/example.2.conf for an example of
a new simplified config.

By default the telemetrics-client is configured with
https://clr.telemetry.intel.com/v2/collector as the built-in server
location. This is the normal Clear Linux telemetrics backend. It can
be changed via the configure flag --with-backendserveraddr=URI. This
allows anyone else using this project to easily specify their own
default backend without patching. As per usual, users can specify a
different server location in a configuration file.

Signed-off-by: California Sullivan <california.l.sullivan@intel.com>
2019-04-22 10:29:59 -07:00
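The layered configuration described in the commit above can be sketched roughly as follows. This is an illustration only, not the project's configuration code: the struct config_sketch, both function names, the line parsing, and the default values for spool_process_time and record_retention_enabled are assumptions made for the example. Only the key names and the default server URL come from this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Built-in default server, as stated in the commit message above. */
#define DEFAULT_SERVER "https://clr.telemetry.intel.com/v2/collector"
/* Placeholder value chosen for this sketch only. */
#define DEFAULT_SPOOL_PROCESS_TIME 120

/* Hypothetical, reduced configuration with three of the known keys. */
struct config_sketch {
        char server[256];
        long spool_process_time;
        bool record_retention_enabled;
};

/* Layer 1: defaults compiled into the binary. */
void config_set_defaults(struct config_sketch *c)
{
        assert(c != NULL);
        snprintf(c->server, sizeof(c->server), "%s", DEFAULT_SERVER);
        c->spool_process_time = DEFAULT_SPOOL_PROCESS_TIME;
        c->record_retention_enabled = false;
}

/*
 * Layer 2: apply one "key = value" line from a conf file on top of the
 * current values. Returns true if the key was recognized and applied.
 * Keys that never appear in the file keep their built-in defaults.
 */
bool config_apply_line(struct config_sketch *c, const char *line)
{
        char key[64];
        char value[256];

        assert(c != NULL && line != NULL);

        if (sscanf(line, " %63[^= \t] = %255[^\n]", key, value) != 2) {
                return false;
        }

        if (strcmp(key, "server") == 0) {
                snprintf(c->server, sizeof(c->server), "%s", value);
        } else if (strcmp(key, "spool_process_time") == 0) {
                c->spool_process_time = strtol(value, NULL, 10);
        } else if (strcmp(key, "record_retention_enabled") == 0) {
                c->record_retention_enabled = (strcmp(value, "true") == 0);
        } else {
                return false;
        }
        return true;
}
```

With this shape, a conf file containing a single "server = ..." line changes only the server and leaves every other option at its default.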
Juro Bystricky 297e59fe5b Improve support of non-default configuration files.
Most probes allow passing of non-default configuration file
via command line using the "-f" switch.
For example:

$ hprobe -f custom_cfg_file.conf

If the file "custom_cfg_file.conf" contains

server = http://<my backend server>

one would expect the hprobe payload will be sent to the server
http://<my backend server>. However, this is not the case, as the
various daemons delivering the payload to the backend are blissfully
unaware of the config file hprobe wanted to use.

The solution is to include the absolute path of the config file
specified on the command line as part of the payload. Once the
payload is about to be sent via the routine "post_record_http",
the routine checks if a non-default config file was requested.
If so, configuration is re-initialized with the file.
Upon exit, the routine re-initializes the original (default) configuration.

The non-default file may not exist at the send time anymore,
for example when sending some spooled records. In that case we
intentionally don't send anything to the backend.

If the record does not contain the optional configuration file
information, it's business as usual.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
2019-04-22 09:29:30 -07:00
California Sullivan e20e572fad telemprobd,telempostd: greatly reduce timeouts
We don't need to stay resident for a static two hours. Instead, exit
cleanly after five minutes idle. This also involves reducing the default
and maximum values of spool_process_time, as it was previously set to 30
minutes.

telempostd required several changes with timers, whereas telemprobd only
required changing the values of spool_process_time and
TM_DAEMON_EXIT_TIME, and refreshing the timeout when handling a client.

Signed-off-by: California Sullivan <california.l.sullivan@intel.com>
2019-01-20 21:21:50 -08:00
Juro Bystricky 6b44aa21c7 klogscanner: merge functionality with oopsprobe
The previous sequence of operations was:
1. klogscanner waits for/monitors kernel buffer for messages.
2. If they come, it creates a "raw" file in /var/cache/telemetry/oops.
3. Once this file is placed into /var/cache/telemetry/oops, it is detected by oopsprobe.
4. oopsprobe parses the "raw" file, creates new payload and sends it to the backend.

This commit merges the operations:
1. klogscanner waits for/monitors kernel buffer for messages.
2. If they come, it parses the klog buffer, creates new payload and sends it to the backend.

While at it, also declare routines as "static" when possible.
Fixed a few misplaced memory frees.
Simplify the main loop operations, move allocate/free buffer outside
of the loop.

Return SUCCESS if terminated by the signal SIGTERM. This is a clean
way to stop the service. This prevents spamming the journal (and sending
journal/error telemetry data via journalprobe) each time telemetry is
restarted.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
2019-01-20 21:18:58 -08:00
Alex Jaramillo 839ee5c126 Retry logic for post failure on boot
This change alters the logic for posting telemetry messages to the
backend to handle cases where the network or name resolution service is
not available when the daemon starts. Currently, a message that is not
delivered is spooled and has to wait 900 seconds (the default value)
for a future loop to process it. With this change, when the daemon
starts and is unable to deliver telemetry messages, it attempts to
re-send them using a delay of t*t seconds for t = 1 up to t = 7
(that is, 1, 4, 9, 16, 25, 36, 49 seconds).

Signed-off-by: Alex Jaramillo <alex.v.jaramillo@intel.com>
2018-09-24 15:18:12 -07:00
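The t*t retry schedule from the commit above can be sketched as below. This is a minimal illustration, not the daemon's code: the deliver_fn callback type, the function names, and the choice to make one immediate attempt before the first delay are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* From the commit message: t runs from 1 up to 7. */
#define MAX_RETRY_STEP 7

/* Delay in seconds before retry number t: t squared. */
unsigned int retry_delay_seconds(unsigned int t)
{
        assert(t >= 1 && t <= MAX_RETRY_STEP);
        return t * t;
}

/* Hypothetical delivery callback: returns true once the record is posted. */
typedef bool (*deliver_fn)(void *ctx);

/*
 * Try to deliver once, then retry after 1, 4, 9, 16, 25, 36 and 49
 * seconds. Returns true as soon as one attempt succeeds, or false after
 * the last retry fails (140 seconds of waiting in total).
 */
bool deliver_with_backoff(deliver_fn deliver, void *ctx)
{
        unsigned int t;

        assert(deliver != NULL);

        if (deliver(ctx)) {
                return true;
        }

        for (t = 1; t <= MAX_RETRY_STEP; t++) {
                sleep(retry_delay_seconds(t));
                if (deliver(ctx)) {
                        return true;
                }
        }
        return false;
}
```

The worst case here waits about 140 seconds in total, compared with the 900-second default spool interval mentioned in the commit message.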
Alex Jaramillo f81e4c7fba Telemetry refactoring
This change adds a daemon that handles telemetry record retention,
record reporting, and record limiting policies. This new daemon
receives records in a folder that is monitored by inotify.

Signed-off-by: Alex Jaramillo <alex.v.jaramillo@intel.com>
2018-09-06 09:57:01 -07:00
Alex Jaramillo 98b36d5609 Release v1.16.2
This release contains changes to add source file references as a
workaround for failing libcheck tests. These changes allow the
telemetry client to be built using gcc8. This has to be fixed properly
in the future.

Signed-off-by: Alex Jaramillo <alex.v.jaramillo@intel.com>
2018-07-27 14:19:11 -07:00
Alex Jaramillo e301dad48e Fix for journal tmp directory
This change fixes a bug where journal unit tests fail if the system
where the code is built has telemetry running.

Signed-off-by: Alex Jaramillo <alex.v.jaramillo@intel.com>
2018-05-14 07:22:39 -07:00
avjarami d1f11980be Record retention feature
This change contains:

* New configuration keys: record_retention_enabled and
  record_server_delivery_enabled. These keys are needed to control
  remote delivery of records and record retention. These keys are
  optional to preserve backward compatibility with existing custom
  configurations.

* Record copy implementation. This change allows copies of records to
  be saved locally when the feature is enabled in the configuration.
  This operation is independent of record spooling and record reporting
  to the remote server.

* New telem_journal argument to print the record payload from the
  local copy (if it exists).

Signed-off-by: avjarami <alex.v.jaramillo@intel.com>
2018-03-23 10:07:47 -07:00
avjarami 8edd75df21 Running uncrustify
Fixing syntax and indentation inconsistencies.

Signed-off-by: avjarami <alex.v.jaramillo@intel.com>
2018-03-12 14:30:11 -07:00
avjarami dc4f4e67e4 Addressing review
Addressing comments from first code review and fixing travis-ci
check_journal error in prune test.

Signed-off-by: avjarami <alex.v.jaramillo@intel.com>
2018-03-09 13:16:34 -08:00
avjarami 41e9090dd8 Refactoring tests to use fixture
Using fixture for event id tests instead of creating a new record for
every event id test.

Signed-off-by: avjarami <alex.v.jaramillo@intel.com>
2018-03-09 13:16:34 -08:00
avjarami cbcc5e24d5 Tests for journal feature
Adding tests for journal exported functions.

Signed-off-by: avjarami <alex.v.jaramillo@intel.com>
2018-03-09 13:16:34 -08:00
avjarami 8a9dbc7528 Adding event_id header
* An event_id header is needed to group records when multiple records
  are generated by the same event.

* Adding a new parameter to telem_record_gen, making it possible for
  this utility to tag multiple records with the same event_id.

Signed-off-by: avjarami <alex.v.jaramillo@intel.com>
2018-02-28 21:54:46 -08:00
avjarami 4b10dbec86 Base 64 encoding implementation
Providing a minimal function to encode data in base64 to allow the
use of binary data as printable (http transportable) characters.
2018-01-02 13:44:31 -08:00
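For readers unfamiliar with the encoding mentioned in the commit above, a minimal base64 encoder can be sketched like this. It is not the function added by that commit; the name base64_encode_sketch and the malloc-and-return interface are assumptions for the example. It uses the standard alphabet with '=' padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static const char b64_alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode len bytes from in as a NUL-terminated base64 string.
 * Returns a malloc'd buffer the caller must free, or NULL on
 * allocation failure.
 */
char *base64_encode_sketch(const unsigned char *in, size_t len)
{
        size_t out_len = 4 * ((len + 2) / 3);
        size_t i = 0;
        size_t o = 0;
        char *out = malloc(out_len + 1);

        if (out == NULL) {
                return NULL;
        }

        /* Full groups: 3 input bytes become 4 output characters. */
        while (len - i >= 3) {
                unsigned int v = ((unsigned int)in[i] << 16) |
                                 ((unsigned int)in[i + 1] << 8) |
                                 (unsigned int)in[i + 2];
                out[o++] = b64_alphabet[(v >> 18) & 0x3f];
                out[o++] = b64_alphabet[(v >> 12) & 0x3f];
                out[o++] = b64_alphabet[(v >> 6) & 0x3f];
                out[o++] = b64_alphabet[v & 0x3f];
                i += 3;
        }

        /* Trailing 1 or 2 bytes are padded with '=' to a full group. */
        if (len - i == 1) {
                unsigned int v = (unsigned int)in[i] << 16;
                out[o++] = b64_alphabet[(v >> 18) & 0x3f];
                out[o++] = b64_alphabet[(v >> 12) & 0x3f];
                out[o++] = '=';
                out[o++] = '=';
        } else if (len - i == 2) {
                unsigned int v = ((unsigned int)in[i] << 16) |
                                 ((unsigned int)in[i + 1] << 8);
                out[o++] = b64_alphabet[(v >> 18) & 0x3f];
                out[o++] = b64_alphabet[(v >> 12) & 0x3f];
                out[o++] = b64_alphabet[(v >> 6) & 0x3f];
                out[o++] = '=';
        }

        assert(o == out_len);
        out[o] = '\0';
        return out;
}
```

The output is always a multiple of four characters, so binary payloads of any length map to printable text.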
avjarami d0a3e6bafa Including additional host metadata in headers
Additional headers added: board_name, cpu_model, and bios_version.

    * Board name is a combination of board_name and board_vendor from
    dmi file system.

    * CPU model is read from /proc/cpuinfo.

    * BIOS version is taken from dmi file system.
2017-09-08 07:26:01 -07:00
Patrick McCarty bccd669c91 Remove buildtime checks for glib; update README
Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
2017-06-09 14:01:51 -07:00
Patrick McCarty eedcb88cb7 Run uncrustify
Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
2017-05-10 12:07:44 -07:00
Patrick McCarty 787102c4fb Clean up makefile and header references for consistency
Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
2017-05-10 11:58:55 -07:00
Arjan van de Ven e02e22f192 replace glib string operations with nica string operations 2017-05-10 11:27:52 -07:00
Patrick McCarty 26fc7b69bb Fix make distcheck
I forgot to add the new test file to EXTRA_DIST, so add it here.

Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
2017-03-22 10:48:54 -07:00
Patrick McCarty a403d884bc tests: add test case for new oops format
Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
2017-03-21 22:55:03 -07:00
Patrick McCarty 003928f52e Preserve unreliable frames for kernel oopses
When frame addresses detected during the stack scan were not previously
found by unwinding, the string "? " is added as a prefix to the function
name.

However, the current oops parsing code strips "? " if encountered, so
the backtraces from kernel oopses are missing vital information; the
presence of "? " provides a hint for debugging a stack trace and
indicates that the frame info is "unreliable".

This commit removes the "? " strip code, ensuring the prefix is retained
by the function name, and updates unit tests that check for oops lines
that should contain the prefix.

Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
2017-02-02 12:22:22 -08:00
Loic Poulain 36f4854967 Make daemon recycling configurable
The user may want to disable recycling (daemon auto exit).
This can be the case if telemd daemon is not configured
to respawn or if service is automatically restarted by
the system after package update.

This patch introduces the daemon_recycling_enabled config
entry.
2016-08-17 21:08:24 +00:00
Robert Nesius 89c62aae4a First Commit 2016-07-01 20:16:54 +00:00