Skip to content

Commit

Permalink
Merge pull request #838 from Peetz0r/stats
Browse files Browse the repository at this point in the history
Prometheus and Grafana on stats.<domain>
  • Loading branch information
spantaleev committed Feb 12, 2021
2 parents 11b310d + 890e4ad commit 2ac2b02
Show file tree
Hide file tree
Showing 34 changed files with 971 additions and 7 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,12 @@
# 2021-02-12

## Monitoring/metrics support using Prometheus and Grafana

Thanks to [@Peetz0r](https://github.com/Peetz0r), the playbook can now install a bunch of tools for monitoring your Matrix server: the [Prometheus](https://prometheus.io) time-series database server, the Prometheus [node-exporter](https://prometheus.io/docs/guides/node-exporter/) host metrics exporter, and the [Grafana](https://grafana.com/) web UI.

To get get these installed, follow our [Enabling metrics and graphs (Prometheus, Grafana) for your Matrix server](docs/configuring-playbook-prometheus-grafana.md) docs page.


# 2021-01-31

## Etherpad support
Expand Down
18 changes: 12 additions & 6 deletions docs/configuring-dns.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,21 +15,25 @@ As we discuss in [Server Delegation](howto-server-delegation.md), there are 2 di
This playbook mostly discusses the well-known file method, because it's easier to manage with regard to certificates.
If you decide to go with the alternative method ([Server Delegation via a DNS SRV record (advanced)](howto-server-delegation.md#server-delegation-via-a-dns-srv-record-advanced)), please be aware that the general flow that this playbook guides you through may not match what you need to do.


## General outline of DNS settings you need to do
## Required DNS settings for services enabled by default

| Type | Host | Priority | Weight | Port | Target |
| ----- | ---------------------------- | -------- | ------ | ---- | ---------------------- |
| A | `matrix` | - | - | - | `matrix-server-IP` |
| CNAME | `element` | - | - | - | `matrix.<your-domain>` |
| CNAME | `dimension` (*) | - | - | - | `matrix.<your-domain>` |
| CNAME | `jitsi` (*) | - | - | - | `matrix.<your-domain>` |
| SRV | `_matrix-identity._tcp` | 10 | 0 | 443 | `matrix.<your-domain>` |

Be mindful as to how long it will take for the DNS records to propagate.

DNS records marked with `(*)` above are optional. They refer to services that will not be installed by default (see the section below). If you won't be installing these services, feel free to skip creating these DNS records. Also be mindful as to how long it will take for the DNS records to propagate.
If you are using Cloudflare DNS, make sure to disable the proxy and set all records to `DNS only`. Otherwise, fetching certificates will fail.

> If you are using Cloudflare DNS, make sure to disable the proxy and set all records to `DNS only`. Otherwise, fetching certificates will fail.
## Required DNS settings for optional services

| Type | Host | Priority | Weight | Port | Target |
| ----- | ---------------------------- | -------- | ------ | ---- | ---------------------- |
| CNAME | `dimension` (*) | - | - | - | `matrix.<your-domain>` |
| CNAME | `jitsi` (*) | - | - | - | `matrix.<your-domain>` |
| CNAME | `stats` (*) | - | - | - | `matrix.<your-domain>` |

## Subdomains setup

Expand All @@ -42,6 +46,8 @@ The `dimension.<your-domain>` subdomain may be necessary, because this playbook

The `jitsi.<your-domain>` subdomain may be necessary, because this playbook could install the [Jitsi video-conferencing platform](https://jitsi.org/) for you. Jitsi installation is disabled by default, because it may be heavy and is not a core required component. To learn how to install it, see our [Jitsi](configuring-playbook-jitsi.md) guide. If you do not wish to set up Jitsi, feel free to skip the `jitsi.<your-domain>` DNS record.

The `stats.<your-domain>` subdomain may be necessary, because this playbook could install [Grafana](https://grafana.com/) and setup performance metrics for you. Grafana installation is disabled by default, it is not a core required component. To learn how to install it, see our [metrics and graphs guide](configuring-playbook-prometheus-grafana.md). If you do not wish to set up Grafana, feel free to skip the `stats.<your-domain>` DNS record. It is possible to install Prometheus without installing Grafana, this would also not require the `stats.<your-domain>` subdomain.


## `_matrix-identity._tcp` SRV record setup

Expand Down
66 changes: 66 additions & 0 deletions docs/configuring-playbook-prometheus-grafana.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# Enabling metrics and graphs for your Matrix server (optional)

It can be useful to have some (visual) insight into the performance of your homeserver.

You can enable this with the following settings in your configuration file (`inventory/host_vars/matrix.<your-domain>/vars.yml`):

```yaml
matrix_prometheus_enabled: true

matrix_prometheus_node_exporter_enabled: true

matrix_grafana_enabled: true

matrix_grafana_anonymous_access: false

# This has no relation to your Matrix user id. It can be any username you'd like.
# Changing the username subsequently won't work.
matrix_grafana_default_admin_user: some_username_chosen_by_you

# Passwords containing special characters may be troublesome.
# Changing the password subsequently won't work.
matrix_grafana_default_admin_password: some_strong_password_chosen_by_you
```
By default, a [Grafana](https://grafana.com/) web user-interface will be available at `https://stats.<your-domain>`.


## What does it do?

Name | Description
-----|----------
`matrix_prometheus_enabled`|[Prometheus](https://prometheus.io) is a time series database. It holds all the data we're going to talk about.
`matrix_prometheus_node_exporter_enabled`|[Node Exporter](https://prometheus.io/docs/guides/node-exporter/) is an addon of sorts to Prometheus that collects generic system information such as CPU, memory, filesystem, and even system temperatures
`matrix_grafana_enabled`|[Grafana](https://grafana.com/) is the visual component. It shows (on the `stats.<your-domain>` subdomain) the dashboards with the graphs that we're interested in
`matrix_grafana_anonymous_access`|By default you need to log in to see graphs. If you want to publicly share your graphs (e.g. when asking for help in [`#synapse:matrix.org`](https://matrix.to/#/#synapse:matrix.org?via=matrix.org&via=privacytools.io&via=mozilla.org)) you'll want to enable this option.
`matrix_grafana_default_admin_user`<br>`matrix_grafana_default_admin_password`|By default Grafana creates a user with `admin` as the username and password. If you feel this is insecure and you want to change it beforehand, you can do that here


## Security and privacy

Metrics and resulting graphs can contain a lot of information. This includes system specs but also usage patterns. This applies especially to small personal/family scale homeservers. Someone might be able to figure out when you wake up and go to sleep by looking at the graphs over time. Think about this before enabling anonymous access. And you should really not forget to change your Grafana password.

Most of our docker containers run with limited system access, but the `prometheus-node-exporter` has access to the host network stack and (readonly) root filesystem. This is required to report on them. If you don't like that, you can set `matrix_prometheus_node_exporter_enabled: false` (which is actually the default). You will still get Synapse metrics with this container disabled. Both of the dashboards will always be enabled, so you can still look at historical data after disabling either source.


## Collecting metrics to an external Prometheus server

If you wish, you could expose homeserver metrics without enabling (installing) Prometheus and Grafana via the playbook. This may be useful for hooking Matrix services to an external Prometheus/Grafana installation.

To do this, you may be interested in the following variables:

Name | Description
-----|----------
`matrix_synapse_metrics_enabled`|Set this to `true` to make Synapse expose metrics (locally, on the container network)
`matrix_nginx_proxy_proxy_synapse_metrics`|Set this to `true` to make matrix-nginx-proxy expose the Synapse metrics at `https://matrix.DOMAIN/_synapse/metrics`
`matrix_nginx_proxy_proxy_synapse_metrics_basic_auth_enabled`|Set this to `true` to password-protect (using HTTP Basic Auth) `https://matrix.DOMAIN/_synapse/metrics` (the username is always `prometheus`, the password is defined in `matrix_nginx_proxy_proxy_synapse_metrics_basic_auth_key`)
`matrix_nginx_proxy_proxy_synapse_metrics_basic_auth_key`|Set this to a password to use for HTTP Basic Auth for protecting `https://matrix.DOMAIN/_synapse/metrics` (the username is always `prometheus` - it's not configurable)


## More inforation

- [Understanding Synapse Performance Issues Through Grafana Graphs](https://github.com/matrix-org/synapse/wiki/Understanding-Synapse-Performance-Issues-Through-Grafana-Graphs) at the Synapse Github Wiki
- [The Prometheus scraping rules](https://github.com/matrix-org/synapse/tree/master/contrib/prometheus) (we use v2)
- [The Synapse Grafana dashboard](https://github.com/matrix-org/synapse/tree/master/contrib/grafana)
- [The Node Exporter dashboard](https://github.com/rfrail3/grafana-dashboards) (for generic non-synapse performance graphs)

2 changes: 2 additions & 0 deletions docs/configuring-playbook.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,8 @@ When you're done with all the configuration you'd like to do, continue with [Ins

- [Setting up Dynamic DNS](configuring-playbook-dynamic-dns.md) (optional)

- [Enabling metrics and graphs (Prometheus, Grafana) for your Matrix server](configuring-playbook-prometheus-grafana.md) (optional)

### Core service adjustments

- [Configuring Synapse](configuring-playbook-synapse.md) (optional)
Expand Down
6 changes: 6 additions & 0 deletions docs/container-images.md
Original file line number Diff line number Diff line change
Expand Up @@ -85,3 +85,9 @@ These services are not part of our default installation, but can be enabled by [
- [anoa/matrix-reminder-bot](https://hub.docker.com/r/anoa/matrix-reminder-bot) - the [matrix-reminder-bot](https://github.com/anoadragon453/matrix-reminder-bot) bot for one-off & recurring reminders and alarms (optional)

- [awesometechnologies/synapse-admin](https://hub.docker.com/r/awesometechnologies/synapse-admin) - the [synapse-admin](https://github.com/Awesome-Technologies/synapse-admin) web UI tool for administrating users and rooms on your Matrix server (optional)

- [prom/prometheus](https://hub.docker.com/r/prom/prometheus/) - [Prometheus](https://github.com/prometheus/prometheus/) is a systems and service monitoring system

- [prom/node-exporter](https://hub.docker.com/r/prom/node-exporter/) - [Prometheus Node Exporter](https://github.com/prometheus/node_exporter/) is an addon for Prometheus that gathers standard system metrics

- [grafana/grafana](https://hub.docker.com/r/grafana/grafana/) - [Grafana](https://github.com/grafana/grafana/) is a graphing tool that works well with the above two images. Our playbook also adds two dashboards for [Synapse](https://github.com/matrix-org/synapse/tree/master/contrib/grafana) and [Node Exporter](https://github.com/rfrail3/grafana-dashboards)
80 changes: 79 additions & 1 deletion group_vars/matrix_servers
Original file line number Diff line number Diff line change
Expand Up @@ -974,6 +974,7 @@ matrix_nginx_proxy_proxy_matrix_enabled: true
matrix_nginx_proxy_proxy_element_enabled: "{{ matrix_client_element_enabled }}"
matrix_nginx_proxy_proxy_dimension_enabled: "{{ matrix_dimension_enabled }}"
matrix_nginx_proxy_proxy_jitsi_enabled: "{{ matrix_jitsi_enabled }}"
matrix_nginx_proxy_proxy_grafana_enabled: "{{ matrix_grafana_enabled }}"

matrix_nginx_proxy_proxy_matrix_corporal_api_enabled: "{{ matrix_corporal_enabled and matrix_corporal_http_api_enabled }}"
matrix_nginx_proxy_proxy_matrix_corporal_api_addr_with_container: "matrix-corporal:41081"
Expand All @@ -991,7 +992,10 @@ matrix_nginx_proxy_proxy_matrix_federation_api_addr_sans_container: "127.0.0.1:8

matrix_nginx_proxy_container_federation_host_bind_port: "{{ matrix_federation_public_port }}"

matrix_nginx_proxy_proxy_synapse_metrics: "{{ matrix_synapse_metrics_enabled }}"
# This used to be hooked to `matrix_synapse_metrics_enabled`, but we don't do it anymore.
# The fact that someone wishes to enable Synapse metrics does not necessarily mean they want to make them public.
# A local Prometheus can consume them over the container network.
matrix_nginx_proxy_proxy_synapse_metrics: false
matrix_nginx_proxy_proxy_synapse_metrics_addr_with_container: "matrix-synapse:{{ matrix_synapse_metrics_port }}"
matrix_nginx_proxy_proxy_synapse_metrics_addr_sans_container: "127.0.0.1:{{ matrix_synapse_metrics_port }}"

Expand Down Expand Up @@ -1024,6 +1028,8 @@ matrix_ssl_domains_to_obtain_certificates_for: |
+
([matrix_server_fqn_jitsi] if matrix_jitsi_enabled else [])
+
([matrix_server_fqn_grafana] if matrix_grafana_enabled else [])
+
([matrix_domain] if matrix_nginx_proxy_base_domain_serving_enabled else [])
}}

Expand Down Expand Up @@ -1297,6 +1303,9 @@ matrix_synapse_tls_private_key_path: ~

matrix_synapse_federation_port_openid_resource_required: "{{ not matrix_synapse_federation_enabled and (matrix_dimension_enabled or matrix_ma1sd_enabled) }}"

# If someone instals Prometheus via the playbook, they most likely wish to monitor Synapse.
matrix_synapse_metrics_enabled: "{{ matrix_prometheus_enabled }}"

matrix_synapse_email_enabled: "{{ matrix_mailer_enabled }}"
matrix_synapse_email_smtp_host: "matrix-mailer"
matrix_synapse_email_smtp_port: 8025
Expand Down Expand Up @@ -1368,6 +1377,75 @@ matrix_synapse_admin_container_self_build: "{{ matrix_architecture != 'amd64' }}



######################################################################
#
# matrix-prometheus-node-exporter
#
######################################################################

matrix_prometheus_node_exporter_enabled: false

# Normally, matrix-nginx-proxy is enabled and nginx can reach Prometheus Node Exporter over the container network.
# If matrix-nginx-proxy is not enabled, or you otherwise have a need for it, you can expose
# Prometheus' HTTP port to the local host.
matrix_prometheus_node_exporter_container_http_host_bind_port: "{{ '' if matrix_nginx_proxy_enabled else '127.0.0.1:9100' }}"

######################################################################
#
# /matrix-prometheus-node-exporter
#
######################################################################



######################################################################
#
# matrix-prometheus
#
######################################################################

matrix_prometheus_enabled: false

# Normally, matrix-nginx-proxy is enabled and nginx can reach Prometheus over the container network.
# If matrix-nginx-proxy is not enabled, or you otherwise have a need for it, you can expose
# Prometheus' HTTP port to the local host.
matrix_prometheus_container_http_host_bind_port: "{{ '' if matrix_nginx_proxy_enabled else '127.0.0.1:9090' }}"

matrix_prometheus_scraper_synapse_enabled: "{{ matrix_synapse_enabled and matrix_synapse_metrics_enabled }}"
matrix_prometheus_scraper_synapse_targets: ['matrix-synapse:{{ matrix_synapse_metrics_port }}']
matrix_prometheus_scraper_synapse_rules_synapse_tag: "{{ matrix_synapse_docker_image_tag }}"

matrix_prometheus_scraper_node_enabled: "{{ matrix_prometheus_node_exporter_enabled }}"

######################################################################
#
# /matrix-prometheus
#
######################################################################



######################################################################
#
# matrix-grafana
#
######################################################################

matrix_grafana_enabled: false

# Normally, matrix-nginx-proxy is enabled and nginx can reach Grafana over the container network.
# If matrix-nginx-proxy is not enabled, or you otherwise have a need for it, you can expose
# Grafana's HTTP port to the local host.
matrix_grafana_container_http_host_bind_port: "{{ '' if matrix_nginx_proxy_enabled else '127.0.0.1:3000' }}"

######################################################################
#
# /matrix-grafana
#
######################################################################



######################################################################
#
# matrix-registration
Expand Down
3 changes: 3 additions & 0 deletions roles/matrix-base/defaults/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,9 @@ matrix_server_fqn_dimension: "dimension.{{ matrix_domain }}"
# This is where you access Jitsi.
matrix_server_fqn_jitsi: "jitsi.{{ matrix_domain }}"

# This is where you access Grafana.
matrix_server_fqn_grafana: "stats.{{ matrix_domain }}"

matrix_federation_public_port: 8448

# The architecture that your server runs.
Expand Down
47 changes: 47 additions & 0 deletions roles/matrix-grafana/defaults/main.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# matrix-grafana is open source visualization and analytics software
# See: https://github.com/matrix-org/synapse/blob/master/docs/metrics-howto.md

matrix_grafana_enabled: false

matrix_grafana_docker_image: "docker.io/grafana/grafana:7.4.0"
matrix_grafana_docker_image_force_pull: "{{ matrix_grafana_docker_image.endswith(':latest') }}"

# Not conditional, because when someone disables metrics
# they might still want to look at the old existing data.
# So it would be silly to delete the dashboard in such case.
matrix_grafana_dashboard_download_urls:
- "https://raw.githubusercontent.com/matrix-org/synapse/master/contrib/grafana/synapse.json"
- "https://raw.githubusercontent.com/rfrail3/grafana-dashboards/master/prometheus/node-exporter-full.json"

matrix_grafana_base_path: "{{ matrix_base_data_path }}/grafana"
matrix_grafana_config_path: "{{ matrix_grafana_base_path }}/config"
matrix_grafana_data_path: "{{ matrix_grafana_base_path }}/data"

# Allow viewing Grafana without logging in
matrix_grafana_anonymous_access: false

# specify organization name that should be used for unauthenticated users
# if you change this in the Grafana admin panel, this needs to be updated
# to match to keep anonymous logins working
matrix_grafana_anonymous_access_org_name: 'Main Org.'


# default admin credentials, you are asked to change these on first login
matrix_grafana_default_admin_user: admin
matrix_grafana_default_admin_password: admin

# A list of extra arguments to pass to the container
matrix_grafana_container_extra_arguments: []

# List of systemd services that matrix-grafana.service depends on
matrix_grafana_systemd_required_services_list: ['docker.service']

# List of systemd services that matrix-grafana.service wants
matrix_grafana_systemd_wanted_services_list: []

# Controls whether the matrix-grafana container exposes its HTTP port (tcp/3000 in the container).
#
# Takes an "<ip>:<port>" or "<port>" value (e.g. "127.0.0.1:3000"), or empty string to not expose.
matrix_grafana_container_http_host_bind_port: ''


5 changes: 5 additions & 0 deletions roles/matrix-grafana/tasks/init.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
- set_fact:
matrix_systemd_services_list: "{{ matrix_systemd_services_list + ['matrix-grafana.service'] }}"
when: matrix_grafana_enabled|bool


14 changes: 14 additions & 0 deletions roles/matrix-grafana/tasks/main.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
- import_tasks: "{{ role_path }}/tasks/init.yml"
tags:
- always

- import_tasks: "{{ role_path }}/tasks/validate_config.yml"
when: run_setup|bool
tags:
- setup-all
- setup-grafana

- import_tasks: "{{ role_path }}/tasks/setup.yml"
tags:
- setup-all
- setup-grafana
Loading

0 comments on commit 2ac2b02

Please sign in to comment.