Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[profiling] Fuzz profiler API #449

Merged
merged 54 commits into from
Jun 10, 2024
Merged

[profiling] Fuzz profiler API #449

merged 54 commits into from
Jun 10, 2024

Conversation

danielsn
Copy link
Contributor

What does this PR do?

Adds fuzzable versions of the data structures, and uses them to fuzz the profiler API.

Motivation

R&D week

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Describe here in detail how the change can be validated.

@github-actions github-actions bot added the profiling Relates to the profiling* modules. label May 23, 2024
@codecov-commenter
Copy link

codecov-commenter commented May 23, 2024

Codecov Report

Attention: Patch coverage is 94.24779% with 26 lines in your changes are missing coverage. Please review.

Project coverage is 69.36%. Comparing base (4709a67) to head (db18e70).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #449      +/-   ##
==========================================
+ Coverage   68.93%   69.36%   +0.42%     
==========================================
  Files         193      194       +1     
  Lines       25574    26006     +432     
==========================================
+ Hits        17630    18039     +409     
- Misses       7944     7967      +23     
Components Coverage Δ
crashtracker 19.34% <ø> (ø)
datadog-alloc 98.76% <ø> (ø)
data-pipeline 51.45% <ø> (ø)
data-pipeline-ffi 0.00% <ø> (ø)
ddcommon 85.16% <ø> (ø)
ddcommon-ffi 74.15% <ø> (ø)
ddtelemetry 56.09% <ø> (ø)
ipc 81.69% <ø> (ø)
profiling 79.18% <94.24%> (+1.20%) ⬆️
profiling-ffi 60.00% <ø> (ø)
serverless 0.00% <ø> (ø)
sidecar 37.27% <ø> (ø)
sidecar-ffi 0.00% <ø> (ø)
spawn-worker 54.98% <ø> (ø)
trace-mini-agent 69.12% <ø> (ø)
trace-normalization 97.79% <ø> (ø)
trace-obfuscation 95.74% <ø> (ø)
trace-protobuf 33.33% <ø> (ø)
trace-utils 92.25% <ø> (ø)

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented May 28, 2024

Software Composition Analysis

✅ No library vulnerabilities found (compared 09ff22a against b462829).

@taegyunkim taegyunkim marked this pull request as ready for review May 29, 2024 17:22
@taegyunkim taegyunkim requested a review from a team as a code owner May 29, 2024 17:22
profiling/src/internal/profile.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile/mod.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Outdated Show resolved Hide resolved
}
}

/// PartialEq and Hash work on just the key to avoid getting duplicate
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using Eq to compare labels in the tests? Will this miss issues where the values don't agree?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::hash::Hash doc says this property must hold when implementing Eq and Hash, k1 == k2 -> hash(k1) == hash(k2). In our case, values can disagree even if the hashes are the same.

This was introduced mainly to use Labels as keys for HashMap in https://github.com/DataDog/libdatadog/blob/dsn/fuzz-profile-api/profiling/src/internal/profile/fuzz_tests.rs#L369

profiling/src/internal/profile/fuzz_tests.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Show resolved Hide resolved
if let Some(expected_sample) = expected_timestamped_samples.next() {
assert_eq!(owned_locations, expected_sample.locations);
assert_eq!(sample.values, expected_sample.values);
assert_eq!(owned_labels, expected_sample.labels);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should potentially be sorted: there is no requirement that labels remain in the same order

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locally I'm sorting the labels and got a new failure:

======================== Test Failure ========================

Input: 
(
    [],
    [
        (
            None,
            Sample {
                locations: [],
                values: [],
                labels: [
                    Label {
                        key: "local root span id",
                        str: None,
                        num: 281474976710656,
                        num_unit: None,
                    },
                    Label {
                        key: "",
                        str: Some(
                            "",
                        ),
                        num: 0,
                        num_unit: Some(
                            "",
                        ),
                    },
                ],
            },
        ),
    ],
)

Error: 
panicked at profiling/src/internal/profile/fuzz_tests.rs:402:18:
Value not found for an aggregated sample
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

==============================================================

Line 402 is this expect:

            let key = (&owned_locations, &owned_labels);
            let expected_values = samples_without_timestamps
                .get(&key)
                .expect("Value not found for an aggregated sample");
            assert_eq!(&sample.values, expected_values);

I think this happens because Vec<Label> and such are used as keys in the hash, and then they don't work out because of order?

Copy link
Contributor

@morrisonlevi morrisonlevi May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed this up with function fuzz_failure_001 to reproduce.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Levi for capturing this test case!

profiling/src/internal/profile/fuzz_tests.rs Outdated Show resolved Hide resolved
profiling/src/internal/profile/fuzz_tests.rs Show resolved Hide resolved
@taegyunkim taegyunkim force-pushed the dsn/fuzz-profile-api branch 2 times, most recently from 9cfe3b5 to 9ad1b94 Compare June 4, 2024 23:47
Copy link
Contributor

@r1viollet r1viollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I don't know much about what a bolero_generator is, but the changes look fine 👍

@r1viollet
Copy link
Contributor

I am guessing we have some amount of work to fix the tests before we can merge this ?

test internal::profile::fuzz_tests::fuzz_failure_001 ... FAILED

@taegyunkim
Copy link
Contributor

taegyunkim commented Jun 10, 2024

I am guessing we have some amount of work to fix the tests before we can merge this ?

test internal::profile::fuzz_tests::fuzz_failure_001 ... FAILED

Yes, the test fails because input labels are not ordered in an expected way.

The label with 'local root span id' should follow other labels, looking at this function.

self.get_label_set(sample.labels)?
.iter()
.map(|l| self.get_label(*l).copied())
.chain(self.get_endpoint_for_labels(sample.labels).transpose())
.chain(timestamp.map(|ts| Ok(Label::num(self.timestamp_key, ts.get(), None))))

This failing test case has been introduced with these code lines.

let mut labels = std::collections::HashSet::<Label>::gen_with().generate(driver)?;
// Ensure that the label has "local root span id" key
// Generate non-zero num value for the label
let num = i64::gen().generate(driver)?;
// zero num is considered as invalid, see Profile::validate_sample_labels
if num == 0 {
return None;
}
labels.insert(Label {
key: "local root span id".into(),
str: None,
num,
num_unit: None,
});
let labels = labels.into_iter().collect();

I wanted to have bolero generated Samples to have a label with key 'local root span id', but didn't make sure that the orders are as expected.

@taegyunkim taegyunkim merged commit 2b45d24 into main Jun 10, 2024
26 checks passed
@taegyunkim taegyunkim deleted the dsn/fuzz-profile-api branch June 10, 2024 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Relates to the profiling* modules.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants