Telemetry demo #150

Draft: wants to merge 8 commits into base: main
3 changes: 3 additions & 0 deletions .gitignore
@@ -3,3 +3,6 @@
.DS_Store
site
venv/
clab-om
custom_anta_catalogs
intended
Binary file added docs/_media/topology.drawio.light.png
Binary file added docs/_media/topology.drawio.png
118 changes: 118 additions & 0 deletions docs/telemetry/adapters/gnmic/prometheus-grafana-demo/index.md
@@ -0,0 +1,118 @@
---
layout: default
title: "Prometheus and Grafana Demo"
date: 2024-09-21 20:00:00 +0100
categories:
---

## Introduction

Prometheus is an open-source monitoring and alerting toolkit designed primarily for cloud-native
environments, including Kubernetes. Developed by SoundCloud in 2012, it has gained popularity due
to its ability to collect and store metrics as time-series data, which includes timestamps and
optional key-value pairs known as labels. (1) // --> TODO add citation

Prometheus operates on a pull-based model, where it scrapes metrics from HTTP endpoints exposed
by monitored services, storing this data in a time-series database. Users can query this data
using PromQL, a powerful query language, to generate alerts and visualize metrics through tools
like Grafana. (2) // --> TODO add citation
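
As a rough illustration of querying that stored data, the sketch below wraps the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`) in a small shell function. The `PROM_URL` variable, its `localhost:9090` default, and the function name `prom_query` are assumptions made for illustration; `up` is a metric Prometheus records for every scrape target.

```shell
# Base URL of the Prometheus server (assumed default; override as needed).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run a PromQL instant query and print the JSON response.
# curl -G sends the URL-encoded expression as a query-string parameter.
prom_query() {
  curl -fsS -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (requires a reachable Prometheus instance):
#   prom_query 'up'
```

The response is a JSON document; piping it through a tool such as `jq` makes the returned samples easier to read.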

Since joining the Cloud Native Computing Foundation in 2016, Prometheus has become a cornerstone
in the monitoring landscape, particularly suited for dynamic service-oriented architectures and
microservices. (3) // --> TODO add citation

## Prerequisites

- [Containerlab](https://containerlab.dev/)
- [Docker](https://www.docker.com/)
- [cEOS](https://containerlab.dev/manual/kinds/ceos/)

The cEOS-lab image must be downloaded from the Arista software downloads page
and imported into Docker with a tag such as `ceosimage:4.32.2F`.
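
The import step could look like the sketch below. The archive name `cEOS64-lab-4.32.2F.tar.xz` and the helper names are assumptions; the actual file name depends on the release downloaded. The `ceosimage` repository name matches the image referenced by the lab topology.

```shell
# Derive the Docker tag from a cEOS-lab archive name, for example
# cEOS64-lab-4.32.2F.tar.xz -> ceosimage:4.32.2F (hypothetical helper).
ceos_tag_from_archive() {
  base="${1##*/}"             # strip any leading directory
  version="${base#*-lab-}"    # drop everything up to and including "-lab-"
  version="${version%%.tar*}" # drop the archive extension
  echo "ceosimage:${version}"
}

# Import the archive into Docker under the derived tag.
import_ceos() {
  docker import "$1" "$(ceos_tag_from_archive "$1")"
}

# Usage (requires Docker and the downloaded archive):
#   import_ceos ./cEOS64-lab-4.32.2F.tar.xz
#   docker images ceosimage
```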

## Environment

![topology](../../../../_media/topology.drawio.png)

The Containerlab topology file:

<details><summary>Reveal output</summary>
<p>

```yaml
--8<-- "src/gnmic-prometheus/topology.yaml"
```

</p>
</details>

The `gnmic.yml` file:

<details><summary>Reveal output</summary>
<p>

```yaml
--8<-- "src/gnmic-prometheus/gnmic.yml"
```

As the file shows, `gnmic` subscribes to several OpenConfig and EOS native paths
and writes the data to Prometheus, either unmodified or after transformation
by [processors](https://gnmic.openconfig.net/user_guide/event_processors/intro/).
Processors are needed because Prometheus only accepts numerical values, so
non-numerical values must be converted first.

</p>
</details>
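
The idea behind such a conversion can be sketched in plain shell, independent of gnmic's own processor syntax. The function name, the state names, and the chosen numbers are assumptions for illustration only; the real mapping is whatever `gnmic.yml` defines.

```shell
# Map an interface oper-status string to a number, similar in spirit to what
# a gnmic event processor does before a value is handed to Prometheus.
oper_status_to_number() {
  case "$1" in
    UP)   echo 1 ;;
    DOWN) echo 0 ;;
    *)    echo -1 ;; # hypothetical fallback for any other state
  esac
}

# Usage:
#   oper_status_to_number UP
```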

### Running the lab

```bash
cd src/gnmic-prometheus/
containerlab -t topology.yaml deploy
```

On subsequent runs, after making modifications, redeploy with `containerlab -t topology.yaml deploy --reconfigure`.
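
The choice between a first deployment and a redeployment could be scripted roughly as follows. This sketch assumes that the lab directory is named `clab-om` (Containerlab creates `clab-<lab name>` and this lab is named `om`) and that its presence means the lab was deployed before; the function name is hypothetical.

```shell
# Print the deploy command to run, adding --reconfigure when a lab
# directory from an earlier deployment is already present.
clab_deploy_cmd() {
  topo="${1:-topology.yaml}"
  if [ -d "clab-om" ]; then
    echo "containerlab -t ${topo} deploy --reconfigure"
  else
    echo "containerlab -t ${topo} deploy"
  fi
}

# Usage, from src/gnmic-prometheus/:
#   eval "$(clab_deploy_cmd)"
```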

The environment should look like the following:

```shell
+----+--------------------+--------------+-------------------------------------------------------------+-------+---------+--------------------+------------------------+
| # | Name | Container ID | Image | Kind | State | IPv4 Address | IPv6 Address |
+----+--------------------+--------------+-------------------------------------------------------------+-------+---------+--------------------+------------------------+
| 1 | clab-om-avd | 2b71ef8fe868 | ghcr.io/aristanetworks/avd/universal:python3.12-avd-v4.10.2 | linux | running | 172.144.100.230/24 | 2001:172:144:100::7/80 |
| 2 | clab-om-client1 | 5d6f06a162d3 | alpine-host | linux | running | 172.144.100.8/24 | 2001:172:144:100::8/80 |
| 3 | clab-om-client2 | 95642c587a14 | alpine-host | linux | running | 172.144.100.9/24 | 2001:172:144:100::9/80 |
| 4 | clab-om-client3 | d4fd040c251b | alpine-host | linux | running | 172.144.100.10/24 | 2001:172:144:100::a/80 |
| 5 | clab-om-client4 | f98b0a992d42 | alpine-host | linux | running | 172.144.100.11/24 | 2001:172:144:100::c/80 |
| 6 | clab-om-gnmic | 7676f355ade9 | ghcr.io/openconfig/gnmic:0.38.2 | linux | running | 172.144.100.200/24 | 2001:172:144:100::2/80 |
| 7 | clab-om-grafana | 0fa1af12aac9 | grafana/grafana:11.2.0 | linux | running | 172.144.100.220/24 | 2001:172:144:100::d/80 |
| 8 | clab-om-om-pe11 | bd4888d56a1a | ceosimage:4.32.2F | ceos | running | 172.144.100.4/24 | 2001:172:144:100::4/80 |
| 9 | clab-om-om-pe12 | 51fe187893c7 | ceosimage:4.32.2F | ceos | running | 172.144.100.5/24 | 2001:172:144:100::b/80 |
| 10 | clab-om-om-pe21 | b9ed639155cb | ceosimage:4.32.2F | ceos | running | 172.144.100.6/24 | 2001:172:144:100::3/80 |
| 11 | clab-om-om-pe22 | 2b0061a2aec0 | ceosimage:4.32.2F | ceos | running | 172.144.100.7/24 | 2001:172:144:100::f/80 |
| 12 | clab-om-om-spine1 | 582e33ddbdb6 | ceosimage:4.32.2F | ceos | running | 172.144.100.2/24 | 2001:172:144:100::5/80 |
| 13 | clab-om-om-spine2 | a5f28f53582e | ceosimage:4.32.2F | ceos | running | 172.144.100.3/24 | 2001:172:144:100::6/80 |
| 14 | clab-om-prometheus | 04cdbdd65795 | prom/prometheus:v2.54.1 | linux | running | 172.144.100.210/24 | 2001:172:144:100::e/80 |
+----+--------------------+--------------+-------------------------------------------------------------+-------+---------+--------------------+------------------------+
```

Grafana is now available at http://myserver:3001 (username `arista`, password `arista`); replace `myserver` with the address of the Containerlab host.
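
Before opening the browser, a quick reachability check can help. The sketch below assumes the published ports of this lab (3001 for Grafana, 9090 for Prometheus) and uses Grafana's `/api/health` and Prometheus's `/-/ready` endpoints; the `LAB_HOST` variable and the function names are hypothetical.

```shell
# Host running the lab (placeholder taken from the text above).
LAB_HOST="${LAB_HOST:-myserver}"

grafana_health_url()   { echo "http://${1:-$LAB_HOST}:3001/api/health"; }
prometheus_ready_url() { echo "http://${1:-$LAB_HOST}:9090/-/ready"; }

# Probe both services and report which ones answer successfully.
check_stack() {
  for url in "$(grafana_health_url "$1")" "$(prometheus_ready_url "$1")"; do
    if curl -fsS -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}

# Usage (requires the lab to be running):
#   check_stack myserver
```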

To configure the switches, for example with EVPN, use the `clab-om-avd`
container and run the Ansible playbook inside it:

```shell
docker exec -it clab-om-avd zsh
cd project
ansible-playbook playbooks/fabric-deploy-config.yaml -i inventory.yaml
```

> NOTE: You might need to create the `avd` user on the host if it doesn't exist; otherwise the
> container won't be able to create files.

```shell
useradd avd
usermod -aG wheel avd
chown -R avd:avd ./
```
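
To avoid an error when the user already exists, the creation step can be guarded with a check. This is a minimal sketch; the function names are hypothetical, and the use of `sudo` is an assumption about how privileges are obtained on the host.

```shell
# Succeed if the named user exists on this host.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

# Create the avd user only when it is missing.
ensure_avd_user() {
  if user_exists avd; then
    echo "user avd already exists"
  else
    sudo useradd avd
  fi
}

# Usage:
#   ensure_avd_user
```

The `wheel` group used above is not present on every distribution, so the `usermod` step may need a different group name on some hosts.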
2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -80,6 +80,8 @@ nav:
- ygot: examples/ygot/index.md
- WiFi: examples/WiFi/index.md
- Telemetry:
- gnmic:
- gnmic-prometheus: telemetry/adapters/gnmic/prometheus-grafana-demo/index.md
- gNMIReverse: telemetry/adapters/gnmireverse/index.md
- kafka-telegraf: telemetry/adapters/kafka/index.md
- Models: models/index.md
121 changes: 121 additions & 0 deletions src/gnmic-prometheus/.topology.yaml.bak
@@ -0,0 +1,121 @@
name: om

topology:
  kinds:
    ceos:
      startup-config: ./ceos.cfg.tpl
      image: ceosimage:4.32.2F
      exec:
        - sleep 10
        - FastCli -p 15 -c 'security pki key generate rsa 4096 eAPI.key'
        - FastCli -p 15 -c 'security pki certificate generate self-signed eAPI.crt key eAPI.key generate rsa 4096 validity 30000 parameters common-name eAPI'
    linux:
      image: alpine-host
  defaults:
    kind: ceos
  nodes:
    om-spine1:
      mgmt-ipv4: 172.144.100.2
      binds:
        - ./sn/spine1.txt:/mnt/flash/ceos-config:ro
    om-spine2:
      mgmt-ipv4: 172.144.100.3
      binds:
        - ./sn/spine2.txt:/mnt/flash/ceos-config:ro
    om-pe11:
      mgmt-ipv4: 172.144.100.4
      binds:
        - ./sn/pe11.txt:/mnt/flash/ceos-config:ro
    om-pe12:
      mgmt-ipv4: 172.144.100.5
      binds:
        - ./sn/pe12.txt:/mnt/flash/ceos-config:ro
    om-pe21:
      mgmt-ipv4: 172.144.100.6
      binds:
        - ./sn/pe21.txt:/mnt/flash/ceos-config:ro
    om-pe22:
      mgmt-ipv4: 172.144.100.7
      binds:
        - ./sn/pe22.txt:/mnt/flash/ceos-config:ro
    client1:
      kind: linux
      mgmt-ipv4: 172.144.100.8
      env:
        TMODE: lacp
    client2:
      kind: linux
      mgmt-ipv4: 172.144.100.9
      env:
        TMODE: lacp
    client3:
      kind: linux
      mgmt-ipv4: 172.144.100.10
      env:
        TMODE: lacp
    client4:
      kind: linux
      mgmt-ipv4: 172.144.100.11
      env:
        TMODE: lacp
    # Telemetry stack
    gnmic:
      kind: linux
      mgmt-ipv4: 172.144.100.200
      image: ghcr.io/openconfig/gnmic:0.38.2
      binds:
        - ./gnmic.yml:/gnmic.yml:ro
      cmd: --config gnmic.yml --log subscribe
      group: gpromg
    prometheus:
      kind: linux
      mgmt-ipv4: 172.144.100.210
      image: prom/prometheus:v2.54.1
      binds:
        - prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      cmd: --config.file=/etc/prometheus/prometheus.yml
      ports:
        - 9090:9090
      group: gpromg
    grafana:
      kind: linux
      mgmt-ipv4: 172.144.100.220
      env:
        GF_SECURITY_ADMIN_USER: arista
        GF_SECURITY_ADMIN_PASSWORD: arista
      ports:
        - '3001:3000'
      image: grafana/grafana:11.2.0
      binds:
        - ./grafana/provisioning/:/etc/grafana/provisioning/
      group: gpromg
    avd:
      kind: linux
      mgmt-ipv4: 172.144.100.230
      image: ghcr.io/aristanetworks/avd/universal:python3.12-avd-v4.10.2
      binds:
        - ./:/project
      group: gpromg

  links:
    - endpoints: ["om-pe11:eth1", "om-spine1:eth1"]
    - endpoints: ["om-pe12:eth1", "om-spine1:eth2"]
    - endpoints: ["om-pe21:eth1", "om-spine1:eth3"]
    - endpoints: ["om-pe22:eth1", "om-spine1:eth4"]
    - endpoints: ["om-pe11:eth2", "om-spine2:eth1"]
    - endpoints: ["om-pe12:eth2", "om-spine2:eth2"]
    - endpoints: ["om-pe21:eth2", "om-spine2:eth3"]
    - endpoints: ["om-pe22:eth2", "om-spine2:eth4"]
    - endpoints: ["om-pe11:eth3", "client1:eth1"]
    - endpoints: ["om-pe12:eth3", "client1:eth2"]
    - endpoints: ["om-pe11:eth4", "client2:eth1"]
    - endpoints: ["om-pe12:eth4", "client2:eth2"]
    - endpoints: ["om-pe21:eth3", "client3:eth1"]
    - endpoints: ["om-pe22:eth3", "client3:eth2"]
    - endpoints: ["om-pe21:eth4", "client4:eth1"]
    - endpoints: ["om-pe22:eth4", "client4:eth2"]

mgmt:
  network: om_clab
  ipv4-subnet: 172.144.100.0/24
  ipv6-subnet: 2001:172:144:100::/80
21 changes: 21 additions & 0 deletions src/gnmic-prometheus/Makefile
@@ -0,0 +1,21 @@
.PHONY: help
help: ## Display help message
	@grep -E '^[0-9a-zA-Z_-]+\.*[0-9a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}'

.PHONY: deploy
deploy: ## Complete AVD & cEOS-Lab Deployment
	@echo -e "\n############### \e[1;30;42mStarting cEOS-Lab topology\e[0m ###############\n"
	@sudo containerlab deploy -t topology.yaml
	@echo -e "\n############### \e[1;30;42mGenerating and deploying switch configuration\e[0m ###############\n"
	@ansible-playbook playbooks/fabric-deploy-config.yaml --flush-cache
	@echo -e "\n############### \e[1;30;42mConfiguring client nodes\e[0m ###############\n"
	@bash host_l3_config/l3_build.sh
	@echo -e "\n############### \e[1;30;42mcEOS-Lab Topology\e[0m ###############\n"
	@sudo containerlab inspect -t topology.yaml
	@echo -e "\n############### \e[1;30;42mcEOS-Lab Deployment Complete\e[0m ###############\n"

.PHONY: destroy
destroy: ## Delete cEOS-Lab Deployment and AVD generated config and documentation
	@echo -e "\n############### \e[1;30;42mWiping nodes and deleting AVD configuration\e[0m ###############\n"
	@sudo containerlab destroy -t topology.yaml --cleanup
	@rm -rf .topology.yaml.bak config_backup/ snapshots/ reports/ documentation/ intended/
15 changes: 15 additions & 0 deletions src/gnmic-prometheus/ansible.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
[defaults]
host_key_checking = False
inventory=./inventory.yaml
gathering=explicit
retry_files_enabled = False
collections_paths = ../ansible-cvp:../ansible-avd:~/.ansible/collections:/usr/share/ansible/collections
jinja2_extensions = jinja2.ext.loopcontrols,jinja2.ext.do,jinja2.ext.i18n
duplicate_dict_key=error
stdout_callback = yaml
bin_ansible_callbacks = True
deprecation_warnings=False

[persistent_connection]
connect_timeout = 300
command_timeout = 300
41 changes: 41 additions & 0 deletions src/gnmic-prometheus/ceos.cfg.tpl
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
hostname {{ .ShortName }}
username admin privilege 15 secret admin
!
service routing protocols model multi-agent
!
vrf instance MGMT
!
interface Management0
   description oob_management
   vrf MGMT
{{ if .MgmtIPv4Address }}   ip address {{ .MgmtIPv4Address }}/{{ .MgmtIPv4PrefixLength }}{{end}}
{{ if .MgmtIPv6Address }}   ipv6 address {{ .MgmtIPv6Address }}/{{ .MgmtIPv6PrefixLength }}{{end}}
!
{{ if .MgmtIPv4Gateway }}ip route vrf MGMT 0.0.0.0/0 {{ .MgmtIPv4Gateway }}{{end}}
{{ if .MgmtIPv6Gateway }}ipv6 route vrf MGMT ::0/0 {{ .MgmtIPv6Gateway }}{{end}}
no lldp receive
no lldp transmit
!
management security
   ssl profile eAPI
      cipher-list HIGH:!eNULL:!aNULL:!MD5:!ADH:!ANULL
      certificate eAPI.crt key eAPI.key
!
management api http-commands
   protocol https ssl profile eAPI
   no shutdown
   !
   vrf MGMT
      no shutdown
!
management api gnmi
   transport grpc default
      notification timestamp send-time
      no shutdown
   transport grpc MGMT
      vrf MGMT
      notification timestamp send-time
      no shutdown
   provider eos-native
!
end