Skip to content

Commit

Permalink
Introduce Sampler Advertisement
Browse files Browse the repository at this point in the history
This commit introduces a new feature that simplifies aggregator
configuration.

* Previously, admins needed to manually specify hostnames for all
  samplers in the aggregator configuration using the `prdcr_add`
  command.
* This change enables samplers to advertise themselves to an aggregator.
  Admins specifies the aggregator hostname and listening port in sampler
  configuration via the `advertiser_add` command and start the
  advertisement with the `advertiser_start` command. The samplers now
  advertise their hostname to the aggregator.
* On the aggregator, admins may specify a regular expression to be
  matched with sampler hostname or an IP range in the CIDR format using
  the `prdcr_listen_add` command. The `prdcr_listen_start` command is used
  to tell the aggregator to automatically add producers corresponding to a
  sampler of which the hostname matches the regular expressions or the
  IP is in the subnet mask given at the `prdcr_listen_add` line. If
  neither of the regular expression or the IP range is given, LDMSD
  creates a producer when it receives an advertisement.

This feature eliminates the need for manual configuration of sampler
hostnames in the aggregator configuration file. The aggregator can now
automatically discover samplers and add them as metric set producers.
  • Loading branch information
nichamon committed Oct 8, 2024
1 parent 0840a25 commit b9a0ff4
Show file tree
Hide file tree
Showing 13 changed files with 2,354 additions and 209 deletions.
250 changes: 250 additions & 0 deletions ldms/man/ldmsd_sampler_advertisement.man
Original file line number Diff line number Diff line change
@@ -0,0 +1,250 @@
\" Manpage for ldmsd_sampler_advertisement
.TH man 7 "27 March 2024" "v5" "LDMSD Sampler Advertisement man page"

.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""/.
.SH NAME
ldmsd_sampler_advertisement - Manual for LDMSD Sampler Advertisement

.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""/.
.SH SYNOPSIS

**Sampler side Commands**

.IP \fBadvertiser_add
.RI "name=" NAME " xprt=" XPRT " host=" HOST " port=" PORT " reconnect=" RECONNECT
.RI "[auth=" AUTH_DOMAIN "]"

.IP \fBadvertiser_start
.RI "name=" NAME
.RI "[xprt=" XPRT " host=" HOST " port=" PORT " auth=" AUTH_DOMAIN " reconnect=" RECONNECT "]"

.IP \fBadvertiser_stop
.RI "name=" NAME

.IP \fBadvertiser_del
.RI "name=" NAME

.IP \fBadvertiser_status
.RI "[name=" NAME "]"

.PP
**Aggregator Side Commands**

.IP \fBprdcr_listen_add
.RI "name=" NAME "
.RI "[disable_start=" TURE|FALSE "] [regex=" REGEX "] [ip=" CIDR "]"

.IP \fBprdcr_listen_start
.RI "name=" NAME

.IP \fBprdcr_listen_stop
.RI "name=" NAME

.IP \fBprdcr_listen_del
.RI "name=" NAME

.IP \fBprdcr_listen_status

.SH DESCRIPTION

LDMSD Sampler Discovery is a capability that enables LDMSD automatically add
producers that its hostname matches a given regular expression. The feature
eliminates the need for manual configuration of sampler hostname in the
aggregator configuration file.

Admins specify the aggregator hostname and the listening port in sampler
configuration via the \fBadvertiser_add\fR command and start the advertisement
with the \fBadvertiser_start\fR command. The samplers now advertise their
hostname to the aggregator. On the aggregator, admins specify a regular
expression to be matched with sampler hostname via the \fBprdcr_listen_add\fR
command. The \fBprdcr_listen_start\fR command is used to tell the aggregator to
automatically add producers corresponding to a sampler of which the hostname
matches the regular expression.

The auto-generated producers is of the ‘advertised’ type. The producer name is
the same as the name given at the \fBadvertiser_add\fR line in the sampler
configuration file. LDMSD automatically starts them; however, admins need to
stop them manually by using the command \fBprdcr_stop\fR or
\fBprdcr_stop_regex\fR. They can be restarted by using the command
\fBprdcr_start\fR or \fBprdcr_start_regex\fR.

The description for each command and its parameters are as follows.

**Sampler Side Commands**

\fBadvertiser_add\fR adds a new advertisement. The parameters are:
.RS
.IP \fBname\fR=\fINAME
String of the advertisement name. The aggregator uses the string as the producer name as well.
.IP \fBhost\fR=\fIHOST
Aggregator hostname
.IP \fBxprt\fR=\fIXPRT
Transport to connect to the aggregator
.IP \fBport\fR=\fIPORT
Listen port of the aggregator
.IP \fBreconnect\fR=\fIINTERVAL
Reconnect interval
.IP \fB[auth\fR=\fIAUTH_DOMAIN\fB]
The authentication domain to be used to connect to the aggregator
.RE

\fBadvertiser_start\fR starts an advertisement. If the advertiser does not exist, LDMSD will create the advertiser. In this case, the mandatory attributes for \fBadvertiser_add\fB must be given. The parameters are:
.RS
.IP \fBname\fR=\fINAME
The advertisement name to be started
.IP \fB[host\fR=\fIHOST\fB]
Aggregator hostname
.IP \fB[xprt\fR=\fIXPRT\fB]
Transport to connect to the aggregator
.IP \fB[port\fR=\fIPORT\fB]
Listen port of the aggregator
.IP \fB[reconnect\fR=\fIINTERVAL\fB]
Reconnect interval
.IP \fB[auth\fR=\fIAUTH_DOMAIN\fB]
The authentication domain to be used to connect to the aggregator
.RE

\fBadvertiser_stop\fR stops an advertisement. The parameters are:
.RS
.IP \fBname\fR=\fINAME
The advertisement name to be stopped
.RE

\fBadvertiser_del\fR deletes an advertisement. The parameters are:
.RS
.IP \fBname\fR=\fINAME
The advertisement name to be deleted
.RE

\fBadvertiser_status reports the status of each advertisement. An optional parameter is:
.RS
.IP \fB[name\fR=\fINAME\fB]
Advertisement name
.RE

.PP
**Aggregator Side commands**

\fBprdcr_listen_add\fR adds a regular expression to match sampler advertisements. The parameters are:
.RS
.IP \fBname\fR=\fINAME
String of the prdcr_listen name.
.IP \fB[disable_start\fR=\fITRUE|FALSE\fB]
True to tell LDMSD not to start producers automatically
.IP \fB[regex\fR=\fIREGEX\fB]
Regular expression to match with hostnames in sampler advertisements
.IP \fBip\fR=\fICIDR\fB]
IP Range in the CIDR format either in IPV4 or IPV6
.RE

\fBprdcr_listen_start\fR starts accepting sampler advertisement with matches hostnames. The parameters are:
.RS
.IP \fBname\fR=\fINAME
Name of prdcr_listen to be started
.RE

\fBprdcr_listen_stop\fR stops accepting sampler advertisement with matches hostnames. The parameters are:
.RS
.IP \fBname\fR=\fINAME
Name of prdcr_listen to be stopped
.RE

\fBprdcr_listen_del\fR deletes a regular expression to match hostnames in sampler advertisements. The parameters are:
.RS
.IP \fBname\fR=\fINAME
Name of prdcr_listen to be deleted
.RE

\fBprdcr_listen_status\fR report the status of each prdcr_listen object. There is no parameter.

.SH EXAMPLE

In this example, there are three LDMS daemons running on \fBnode-1\fR,
\fBnode-2\fR, and \fBnode03\fR. LDMSD running on \fBnode-1\fR and \fBnode-2\fR
are sampler daemons, namely \fBsamplerd-1\fR and \fBsamplerd-2\fR. The
aggregator (\fBagg\fR) runs on \fBnode-3\fR. All LDMSD listen on port 411.

The sampler daemons collect the \fBmeminfo\fR set, and they are configured to
advertise themselves and connect to the aggregator using sock on host
\fBnode-3\fR at port 411. The following are the configuration files of the
\fBsamplerd-1\fR and \fBsamplerd-2\fR.

.EX
.B
> cat samplerd-1.conf
.RS 4
# Add and start an advertisement
advertiser_add name=samplerd-1 xprt=sock host=node-3 port=411 reconnect=10s
advertiser_start name=samplerd-1
# Load, configure, and start the meminfo plugin
load name=meminfo
config name=meminfo producer=samplerd-1 instance=samplerd-1/meminfo
start name=meminfo interval=1s
.RE
.B
> cat samplerd-2.conf
.RS 4
# Add and start an advertisement using only the advertiser_start command
advertiser_start name=samplerd-2 host=node-3 port=411 reconnect=10s
# Load, configure, and start the meminfo plugin
load name=meminfo
config name=meminfo producer=samplerd-2 instance=samplerd-2/meminfo
start name=meminfo interval=1s
.RE
.EE

The aggregator is configured to accept advertisements from the sampler daemons
that the hostnames match the regular expressions \fBnode0[1-2]\fR. The
auto-added producers will check for an establish connection with the samplers
every 10 seconds if the connection becomes disconnected. An updater is added to
update the sets of all producers on the aggregators every 10 seconds at the 100
milliseconds offset.

.EX
.B
> cat agg.conf
.RS 4
# Accept advertisements sent from LDMSD running on hostnames matched node-[1-2]
prdcr_listen_add name=computes regex=node-[1-2]
prdcr_listen_start name=computes
# Add and start an updater
updtr_add name=all_sets interval=1s offset=100ms
updtr_prdcr_add name=all_sets regex=.*
updtr_start name=all
.RE
.EE

LDMSD provides the command \fBadvertiser_status\fR to report the status of
advertisement of a sampler daemon.

.EX
.B
> ldmsd_controller -x sock -p 10001 -h node-1
Welcome to the LDMSD control processor
sock:node-1:10001> advertiser_status
Name Aggregator Host Aggregator Port Transport Reconnect (us) State
---------------- ---------------- --------------- ------------ --------------- ------------
samplerd-1 node-3 10001 sock 10000000 CONNECTED
sock:node-1:10001>
.EE

Similarly, LDMSD provides the command \fBprdcr_listen_status\fR to report the
status of all prdcr_listen objects on an aggregator. The command also reports
the list of auto-added producers corresponding to each prdcr_listen object.

.EX
.B
> ldmsd_controller -x sock -p 10001 -h node-3
Welcome to the LDMSD control processor
sock:node-3:10001> prdcr_listen_status
Name State Regex IP Range
-------------------- ---------- --------------- ------------------------------
computes running node-[1-2] -
Producers: samplerd-1, samplerd-2
sock:node-3:10001>
.EE

.SH SEE ALSO
.BR ldmsd (8)
.BR ldmsd_controller (8)
Loading

0 comments on commit b9a0ff4

Please sign in to comment.