tools: Introduce PSI Visualizer for memory PSI analysis.
This commit adds a new tool, `psi-visualizer`, designed to visualize PSI
data, which is essential for understanding memory pressure patterns over
time. The tool uses Python with Plotly to create interactive plots from
PSI logs generated by the PSI Collector.

The visualizer includes a Makefile for environment setup, a
`requirements.txt` for dependencies, and a `visualize.py` script that
reads PSI data and generates plots.

A README is also included to guide users through setup and usage, making
it easier to analyze memory pressure visually.

Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>
OhmSpectator committed Aug 14, 2024
1 parent 4ef964b commit aba25f0
Showing 5 changed files with 169 additions and 0 deletions.
1 change: 1 addition & 0 deletions tools/psi-visualizer/.gitignore
@@ -0,0 +1 @@
venv
18 changes: 18 additions & 0 deletions tools/psi-visualizer/Makefile
@@ -0,0 +1,18 @@

.PHONY: check-requirements prepare-env activate-env

check-requirements:
	@command -v python3 >/dev/null 2>&1 || { echo >&2 "Python3 is required but it's not installed. Aborting."; exit 1; }
	@command -v pip >/dev/null 2>&1 || { echo >&2 "Pip is required but it's not installed. Aborting."; exit 1; }

prepare-env: check-requirements
	@if [ ! -d "venv" ]; then \
		python3 -m venv venv; \
		. venv/bin/activate; \
		pip install -r requirements.txt; \
	fi

# Print a help message explaining how to activate and use the environment.
# printf is used instead of `echo -e`, which is not portable across /bin/sh implementations.
activate-env: prepare-env
	@printf 'To activate the environment, run:\n\tsource venv/bin/activate\n'
	@printf 'To run the visualizer, run:\n\tpython3 visualize.py <path-to-psi-file>\n'
	@printf 'To deactivate the environment later, run:\n\tdeactivate\n'

52 changes: 52 additions & 0 deletions tools/psi-visualizer/README.md
@@ -0,0 +1,52 @@
# PSI (Pressure Stall Information) Visualizer

This tool visualizes PSI (Pressure Stall Information) data collected by the Linux
kernel. PSI is a feature introduced in Linux 4.20 that reports how much time
tasks spend stalled waiting for CPU, memory, or I/O.
For more information about PSI, see the
[kernel documentation](https://www.kernel.org/doc/Documentation/accounting/psi.rst).

The tool creates an interactive plot that shows the memory pressure statistics
over time.

It can be used to understand the dynamics of memory pressure in the system and
to pinpoint when pressure builds up, which helps narrow down the workloads
causing it.
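
For orientation, the kernel exposes memory pressure in `/proc/pressure/memory`
as a `some` line and a `full` line, each carrying `avg10`, `avg60`, `avg300`,
and `total` fields. The sketch below parses that format with the standard
library only. The function name `parse_memory_pressure` and the sample text
are illustrative assumptions and are not part of this tool; the columns in the
collector's log presumably correspond to these fields.

```python
def parse_memory_pressure(text):
    """Parse text in the /proc/pressure/memory format into a nested dict.

    Hypothetical helper for illustration; not part of psi-visualizer.
    """
    result = {}
    for line in text.strip().splitlines():
        # Each line looks like: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
        kind, *fields = line.split()
        values = dict(field.split("=", 1) for field in fields)
        result[kind] = {
            "avg10": float(values["avg10"]),
            "avg60": float(values["avg60"]),
            "avg300": float(values["avg300"]),
            "total": int(values["total"]),  # cumulative stall time in microseconds
        }
    return result


# Made-up sample text, not real measurements.
SAMPLE = (
    "some avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)
```

On a kernel with PSI enabled, the same function could be fed the result of
reading `/proc/pressure/memory` instead of `SAMPLE`.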

## Grabbing PSI data

To collect PSI data that can be fed to the visualizer, you need to run the
`psi-collector` tool. The tool is available in the
[psi-collector](../../pkg/pillar/agentlog/cmd/psi-collector) directory.
Documentation on how to use the tool is available in the tool's
[README](../../pkg/pillar/agentlog/cmd/psi-collector/README.md).
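
To try the visualizer without a real capture, a small stand-in log can be
generated. This is a minimal sketch: the header and the `%Y-%m-%d %H:%M:%S`
timestamp layout are taken from `visualize.py`, while the helper name
`write_sample_log` and every value in `SAMPLE_ROWS` are made up for
illustration and do not come from `psi-collector`.

```python
# Header that visualize.py expects on the first line of the log.
HEADER = ("date time someAvg10 someAvg60 someAvg300 someTotal "
          "fullAvg10 fullAvg60 fullAvg300 fullTotal")


def write_sample_log(path, rows):
    """Write a whitespace-separated log file with the expected header.

    Hypothetical helper for illustration; not part of psi-collector.
    """
    with open(path, "w", encoding="utf-8") as f:
        f.write(HEADER + "\n")
        for row in rows:
            f.write(" ".join(str(value) for value in row) + "\n")


# Made-up rows: date, time, three 'some' averages, someTotal,
# three 'full' averages, fullTotal.
SAMPLE_ROWS = [
    ("2024-08-14", "12:00:00", 0.00, 0.00, 0.00, 0, 0.00, 0.00, 0.00, 0),
    ("2024-08-14", "12:00:01", 1.25, 0.40, 0.10, 15000, 0.50, 0.10, 0.02, 4000),
]
```

Calling `write_sample_log("sample.log", SAMPLE_ROWS)` produces a file that can
be passed to the visualizer in place of a real capture.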

## Preparing the environment

To set up the PSI visualizer, you need the following prerequisites
installed:

* Python 3
* pip

To install the Python dependencies, run:

```sh
make prepare-env
```

This creates a virtual environment in the `venv` directory and installs the
required dependencies into it.

Then activate the virtual environment:

```sh
source venv/bin/activate
```

## Running the PSI visualizer

To run the PSI visualizer, run:

```sh
python3 visualize.py <path-to-psi-file>
```
2 changes: 2 additions & 0 deletions tools/psi-visualizer/requirements.txt
@@ -0,0 +1,2 @@
pandas
plotly
96 changes: 96 additions & 0 deletions tools/psi-visualizer/visualize.py
@@ -0,0 +1,96 @@
"""
SPDX-License-Identifier: Apache-2.0
Copyright (c) 2024 Zededa, Inc.
This script reads the log file generated by the statistics collector and visualizes the memory
pressure over time. The script uses Plotly to create interactive plots that can be viewed in a
web browser.
"""

import sys
import os

import pandas as pd
import plotly.graph_objects as go

EXPECTED_HEADER = 'date time someAvg10 someAvg60 someAvg300 someTotal ' \
'fullAvg10 fullAvg60 fullAvg300 fullTotal'


def visualize_memory_pressure(log_file):
    """
    Visualizes the memory pressure over time using an interactive plot.
    :param log_file: Path to the log file generated by the statistics collector.
    :return: None
    """
    # Read the log file into a DataFrame
    dataframe = pd.read_csv(log_file, sep=r'\s+')

    # Combine 'date' and 'time' columns into a single 'Timestamp' column
    dataframe['Timestamp'] = pd.to_datetime(dataframe['date'] + ' ' + dataframe['time'],
                                            format='%Y-%m-%d %H:%M:%S')

    # Drop the now redundant 'date' and 'time' columns
    dataframe.drop(columns=['date', 'time'], inplace=True)

    # Create interactive plots using Plotly
    fig = go.Figure()

    # Adding traces for 'some' values
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['someAvg10'], mode='lines',
                             name='someAvg10', yaxis="y1"))
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['someAvg60'], mode='lines',
                             name='someAvg60', yaxis="y1"))
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['someAvg300'], mode='lines',
                             name='someAvg300', yaxis="y1"))

    # Adding traces for 'full' values
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['fullAvg10'], mode='lines',
                             name='fullAvg10', yaxis="y1"))
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['fullAvg60'], mode='lines',
                             name='fullAvg60', yaxis="y1"))
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['fullAvg300'], mode='lines',
                             name='fullAvg300', yaxis="y1"))

    # Adding cumulative area plots for total values
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['someTotal'], mode='lines',
                             name='someTotal', line={"width": 0.5, "color": 'rgb(131, 90, 241)'},
                             stackgroup='one', yaxis="y2"))  # for area plot
    fig.add_trace(go.Scatter(x=dataframe['Timestamp'], y=dataframe['fullTotal'], mode='lines',
                             name='fullTotal', line={"width": 0.5, "color": 'rgb(255, 50, 50)'},
                             stackgroup='two', yaxis="y2"))  # for area plot

    # Update layout for better readability
    fig.update_layout(
        title="Memory Pressure Over Time",
        xaxis_title="Timestamp",
        yaxis_title="Values",
        yaxis={"range": [0, 100], "title": "Pressure Averages"},
        yaxis2={"title": "Total Values", "overlaying": "y", "side": "right"},
        legend_title="Metrics",
        hovermode="x unified"
    )

    # Show the interactive plot
    fig.show()


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python visualize.py <log_file>")
        sys.exit(1)

    # Check if the log file exists
    if not os.path.exists(sys.argv[1]):
        print(f"Error: Log file '{sys.argv[1]}' not found!")
        sys.exit(1)

    # Check the header of the log file
    with open(sys.argv[1], encoding='utf-8') as f:
        header = f.readline().strip()
        if header != EXPECTED_HEADER:
            print(f"Error: Invalid log file '{sys.argv[1]}'!")
            sys.exit(1)

    LOG_FILE_ARG = sys.argv[1]
    visualize_memory_pressure(LOG_FILE_ARG)
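
The header check above runs inline in the `__main__` block. One possible
refactoring, shown here only as a sketch, moves it into a small helper so it
can be exercised without opening a plot. The name `has_expected_header` is
hypothetical and is not part of this commit; `EXPECTED_HEADER` is repeated so
the snippet is self-contained.

```python
EXPECTED_HEADER = ('date time someAvg10 someAvg60 someAvg300 someTotal '
                   'fullAvg10 fullAvg60 fullAvg300 fullTotal')


def has_expected_header(path):
    """Return True if the first line of the file matches EXPECTED_HEADER.

    Hypothetical helper; the commit performs this comparison inline.
    """
    with open(path, encoding='utf-8') as f:
        # Strip the trailing newline and surrounding whitespace before comparing
        return f.readline().strip() == EXPECTED_HEADER
```

With such a helper, the main block could report an invalid log file whenever
`has_expected_header(sys.argv[1])` returns `False`.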
