memprofile: add http debug test #7

christoph-zededa · 2024-08-15T10:06:15Z

Summary by CodeRabbit

New Features
- Introduced improved error handling for server operations.
- Added a new test for memory pressure monitoring functionality.
Bug Fixes
- Enhanced error distinction during server shutdown to improve logging accuracy.
Refactor
- Renamed listenDebug to ListenDebug to increase its accessibility.
- Changed PressureMemoryFile from a constant to a variable for runtime modifications.

eve-kernel-amd64-v6.1.38-generic 8859f43ee8c9: eve_defconfig: Enable PSI (Pressure Stall Information). eve-kernel-amd64-v6.1.38-rt 37bc37278441: eve_defconfig: Enable PSI (Pressure Stall Information). eve-kernel-arm64-v5.10.104-nvidia 7b9a55e0b659: defconfig: Enable PSI in default configuration. eve-kernel-arm64-v6.1.38-generic acbfadebeb76: eve_defconfig: Enable PSI in default configuration. bb8ed98284d6: eve_defconfig: Remove ST33HTPM support. The second commit in the 5.10 kernel update is not logically a part of this patch set, but a leftover from a previous one. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>

This commit introduces a Pressure Stall Information (PSI) collector to the AgentLog component in Pillar. The collector periodically gathers memory pressure data from the system, logging it in a managed file. The code also includes functionality for handling the file size to ensure that only recent data is kept. This collector is designed to be shared between the AgentLog component and a future standalone tool for enhanced memory pressure monitoring. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>

This commit introduces new constants for the memory-monitor directories within the Pillar codebase. Specifically, it adds paths for the memory-monitor output directory and the PSI log file. This change ensures that Pillar components are aware of and can correctly reference the locations used by the memory-monitor. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>

This commit adds functionality to the Pillar internal debug API, allowing the start and stop of the Memory PSI collector via HTTP POST requests. New endpoints `/memory-monitor/psi-collector/start` and `/memory-monitor/psi-collector/stop` have been introduced, enabling control over the PSI collector process. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>

…ript. This commit extends the `eve` script with new commands `psi-collector start|stop`. These commands interact with the internal debug API to start and stop the Memory PSI collector, respectively. This addition makes it convenient to control the PSI collector directly from the terminal, with the results stored in `/persist/memory-monitor/output/psi.txt`. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>

This commit adds a standalone version of the PSI Collector, allowing it to be built and used independently on older versions of EVE where the collector is not integrated into the root filesystem. The new component includes a Makefile for building the binary, a README with instructions, and a basic main.go implementation that handles the PSI collection process. This ensures compatibility and provides memory pressure monitoring capabilities for legacy systems. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>

This commit adds a new tool, `psi-visualizer`, designed to visualize PSI data, which is essential for understanding memory pressure patterns over time. The tool uses Python with Plotly to create interactive plots from PSI logs generated by the PSI Collector. The visualizer includes a Makefile for environment setup, a `requirements.txt` for dependencies, and a `visualize.py` script that reads PSI data and generates plots. A README is also included to guide users through setup and usage, making it easier to analyze memory pressure visually. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>

coderabbitai · 2024-08-15T10:06:27Z

Walkthrough

The changes enhance the agentlog module by improving error handling and accessibility. A key function was renamed to be exported, allowing external use, while its error management was refined to differentiate between normal server shutdowns and other errors. Additionally, a new test was introduced for memory pressure monitoring, expanding the module's testing capabilities. Lastly, a constant was converted to a variable, permitting dynamic assignment of the memory pressure file path.

Changes

Files	Change Summary
`pkg/pillar/agentlog/agentlog.go`	Renamed `listenDebug` to `ListenDebug` for export; improved error handling for server shutdown.
`pkg/pillar/agentlog/agentlog_test.go`	Added `TestMemoryPressure` to validate memory pressure functionality; checks file creation and updates.
`pkg/pillar/agentlog/memprofile.go`	Changed `PressureMemoryFile` from a constant to a variable to allow runtime modifications.

Poem

🐇 In the code, we twist and twine,
Functions gleam and brightly shine.
Memory's pressure, we now can test,
Errors handled, we give our best!
With hops of joy, let’s celebrate,
Code so fine, it feels just great! ✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

OhmSpectator · 2024-08-15T10:10:04Z

@coderabbitai review

coderabbitai · 2024-08-15T10:10:09Z

Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

Outside diff range, codebase verification and nitpick comments (2)

pkg/pillar/agentlog/agentlog_test.go (1)
248-249: Consider making the HTTP server address configurable.

Hardcoding the server address can lead to conflicts in shared environments.
- go agentlog.ListenDebug(log, "stackdump", "memdump")
+ serverAddress := "localhost:6543"
+ go agentlog.ListenDebug(log, serverAddress, "stackdump", "memdump")
pkg/pillar/agentlog/memprofile.go (1)

44-44: Document the intended use cases for modifying PressureMemoryFile.

Changing this from a constant to a variable allows runtime modifications, which should be controlled and documented to prevent unintended side effects.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between aba25f0 and 9e9c3df.

Files selected for processing (3)

pkg/pillar/agentlog/agentlog.go (4 hunks)
pkg/pillar/agentlog/agentlog_test.go (2 hunks)
pkg/pillar/agentlog/memprofile.go (2 hunks)

Additional comments not posted (2)

pkg/pillar/agentlog/agentlog.go (2)
8-8: Good use of the errors package for improved error handling.

The addition of the errors package allows for more precise error checking using errors.Is.

Line range hint 237-342:
Exporting ListenDebug and refining error handling enhances functionality.

The changes improve accessibility and robustness. Ensure that the function is used appropriately in the codebase.

Verification successful

Exporting ListenDebug is currently contained within the package.

The function ListenDebug is used internally within the package and is tested in agentlog_test.go. There are no external usages impacting other parts of the codebase.

agentlog_test.go: Tests ListenDebug by starting and stopping the HTTP service.

agentlog.go: Internal usage of ListenDebug.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify new usages of `ListenDebug` in the codebase.

# Test: Search for usages of `ListenDebug`. Expect: Occurrences of the new function name.
rg --type go -A 5 $'ListenDebug'
Length of output: 1169

coderabbitai · 2024-08-15T10:12:17Z

pkg/pillar/agentlog/agentlog_test.go

+	mpfh, err := os.CreateTemp("", "memory-pressure")
+	if err != nil {
+		panic(err)


Use t.Fatalf instead of panic for error handling in tests.

Using t.Fatalf provides better test output and integration with testing tools.

- if err != nil { - panic(err) - } + if err != nil { + t.Fatalf("unexpected error: %v", err) + }

Also applies to: 239-241, 255-259, 261-264, 268-271, 273-276, 293-296

Do we really need these "better test output and integration with testing tools" in the scope of this test?

@OhmSpectator, using t.Fatalf instead of panic in tests is generally recommended because it provides better integration with the Go testing framework. It allows for more informative test output and ensures that the test is properly marked as failed, which can be particularly beneficial in automated testing environments or CI/CD pipelines. However, if these benefits are not crucial for your current testing scope, you might choose to keep the existing implementation. Ultimately, the decision depends on your project's specific needs and constraints.

I think it makes sense to use both panic and t.Fatalf as this way, we can distinguish between the errors caused by generic system failure (testing env is broken - use panic) and real test failure (use t.Fatalf for that). Or are there better approaches?

@OhmSpectator, your approach to using both panic and t.Fatalf to differentiate between system failures and test failures is understandable. However, using panic in tests can lead to abrupt exits, which might skip other tests and make it harder to gather comprehensive results.

A more structured approach could involve using custom error handling or logging to categorize errors. You could log system-related errors with a specific tag or message, making it clear they are not test failures. Additionally, using t.Skip or t.Skipf for known environmental issues can help manage test execution without resorting to panic. This way, you maintain the clarity of test results while avoiding the potential downsides of using panic.

Give me an example of such handling...

@OhmSpectator, here's an example of how you can handle errors in tests using structured logging and custom error handling instead of panic. This approach helps distinguish between system-related issues and actual test failures:

package mypackage import ( "log" "os" "testing" ) func TestExample(t *testing.T) { // Simulate an environmental check if !isEnvironmentReady() { log.Println("Environment not ready, skipping test") t.Skip("Skipping test due to environmental issues") } // Simulate a test operation file, err := os.Open("testfile.txt") if err != nil { log.Printf("System error: %v", err) t.Fatalf("Failed to open test file: %v", err) } defer file.Close() // Further test logic... } // isEnvironmentReady simulates an environmental check func isEnvironmentReady() bool { // Add logic to verify if the environment is ready // Return false if not ready return true }

In this example, log.Println and log.Printf are used to log messages with specific information about the error type. t.Skip is used to skip the test if the environment is not ready, while t.Fatalf handles actual test failures. This way, you maintain clear and structured test results without using panic.

OhmSpectator · 2024-08-15T12:24:41Z

pkg/pillar/agentlog/memprofile.go

@@ -42,6 +40,9 @@ const (
 	StatLineSize = 75
 )

+// PressureMemoryFile is the memory pressure file
+var PressureMemoryFile = "/proc/pressure/memory"


@christoph-zededa, I'm not too fond of making such a value a variable. It's a constant. Is there a way to mock a constant in Go? Or maybe we can introduce conditional builds with different values for test env and prod env?

We could f.e. make the path a parameter to listenDebug and when we invoke it from the tests, we can give it a different path than in the real system.

Yeah... It looks like a proper solution...

On the other hand, as I consider adding parameters to all the functions, I start thinking about the possibility of collecting PSI stats for not just memory but also CPU and IO. In that case, we could make the PSI collector more generic, move it to another file, or even a different component. However, I then realize that we don't actually need all of that right now, and I hesitate to redo all the code now.

So, can we mock a constant at this point?

OhmSpectator · 2024-08-15T12:54:56Z

pkg/pillar/agentlog/agentlog_test.go

+		panic(err)
+	}
+
+	err = os.MkdirAll(types.MemoryMonitorOutputDir, 0755)


It will fail to create a directory at /persist on a host system...

Or is it supposed to run only in a container? How do you start the tests? Not with go test?

os.MkdirAll is like mkdir -p it won't fail if the directory already exists.

I start the test either from an ubuntu container lf-edge/eve@master...christoph-zededa:eve:rr or from the eve-build container or the pillar container itself.

make test runs all tests in a container - so it takes a while ...

Aha... In the pillar dir, I can run make build-docker-env enter-docker-env, and the tests work there. It's not very convenient, but it works.

OhmSpectator · 2024-08-15T12:55:43Z

I find the approach useful, but we still have to integrate it better...

OhmSpectator · 2024-08-15T13:19:53Z

@christoph-zededa, I wantd to push an extra commit on top of the PR, but got this

ERROR: Permission to christoph-zededa/eve.git denied to OhmSpectator.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

christoph-zededa · 2024-08-15T13:21:37Z

@christoph-zededa, I wantd to push an extra commit on top of the PR, but got this

ERROR: Permission to christoph-zededa/eve.git denied to OhmSpectator.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I guess you either have to create a PR into my branch which is a PR into your branch.
Or you merge/cherry-pick my branch into your branch and put a commit on top.

OhmSpectator · 2024-08-15T13:26:03Z

I guess you either have to create a PR into my branch which is a PR into your branch.
Or you merge/cherry-pick my branch into your branch and put a commit on top.

I wanted to draft the approach in this PR.
To push changes into the PR, I should have collaborator permissions for your EVE fork, as I did for you in my fork (so you can push changes into my PRs).

OhmSpectator and others added 8 commits August 9, 2024 18:56

memprofile: add http debug test

9e9c3df

coderabbitai bot reviewed Aug 15, 2024

View reviewed changes

OhmSpectator reviewed Aug 15, 2024

View reviewed changes

OhmSpectator force-pushed the feature/add-mem-pressure-statistics-collector branch 7 times, most recently from 71c2091 to 19f2306 Compare August 16, 2024 23:58

OhmSpectator force-pushed the feature/add-mem-pressure-statistics-collector branch 3 times, most recently from 0969b2b to 122406e Compare August 26, 2024 22:42

OhmSpectator closed this Sep 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

memprofile: add http debug test #7

memprofile: add http debug test #7

christoph-zededa commented Aug 15, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 15, 2024 •

edited

Loading

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

OhmSpectator commented Aug 15, 2024

coderabbitai bot commented Aug 15, 2024

coderabbitai bot left a comment

coderabbitai bot Aug 15, 2024 •

edited

Loading

OhmSpectator Aug 15, 2024

coderabbitai bot Aug 15, 2024

OhmSpectator Aug 15, 2024

coderabbitai bot Aug 15, 2024

OhmSpectator Aug 15, 2024

coderabbitai bot Aug 15, 2024

OhmSpectator Aug 15, 2024

christoph-zededa Aug 15, 2024 •

edited

Loading

OhmSpectator Aug 15, 2024

OhmSpectator Aug 15, 2024

OhmSpectator Aug 15, 2024

OhmSpectator Aug 15, 2024

christoph-zededa Aug 15, 2024

OhmSpectator Aug 15, 2024

OhmSpectator commented Aug 15, 2024

OhmSpectator commented Aug 15, 2024

christoph-zededa commented Aug 15, 2024

OhmSpectator commented Aug 15, 2024

memprofile: add http debug test #7

memprofile: add http debug test #7

Conversation

christoph-zededa commented Aug 15, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Aug 15, 2024 • edited Loading

Walkthrough

Changes

Poem

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

OhmSpectator commented Aug 15, 2024

coderabbitai bot commented Aug 15, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Aug 15, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot Aug 15, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot Aug 15, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot Aug 15, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

christoph-zededa Aug 15, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

OhmSpectator commented Aug 15, 2024

OhmSpectator commented Aug 15, 2024

christoph-zededa commented Aug 15, 2024

OhmSpectator commented Aug 15, 2024

christoph-zededa commented Aug 15, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 15, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

coderabbitai bot Aug 15, 2024 •

edited

Loading

christoph-zededa Aug 15, 2024 •

edited

Loading