#478 Parallelization of Lindemann Index Calculation #479

Merged: 1 commit, Jun 11, 2024

@N720720 commented Jun 11, 2024

Description:
The current Lindemann index calculation in the lindemann package runs mostly serially, which limits its performance and scalability, especially on the large datasets typically processed in high-performance computing (HPC) environments. The goal is to optimize the existing algorithm by parallelizing it with Numba's just-in-time (JIT) compilation and its parallel loop construct (numba.prange).

Proposed Solution:
Speed up the Lindemann index calculation by implementing a parallel version of the algorithm with Numba. The key steps are:

  1. Implementation of Parallel Variance Calculation:

    • Using Welford's algorithm to compute the mean and variance in a single pass and in a numerically stable way.
    • Compiling the variance calculation with Numba.
  2. Chunk-wise Processing:

    • Splitting the frames into multiple chunks so that they can be processed in parallel (prange).
    • Calculating the mean and variance for each chunk independently.
  3. Combining Results:

    • Aggregating the results from all chunks to compute the final Lindemann index.
    • Using a reduction to combine the per-chunk results efficiently.
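
As a rough illustration of steps 1 to 3, the sketch below uses plain Python rather than Numba, so it shows the arithmetic and not the speed-up. The function names (welford, merge, chunked_variance) and the sample distances are hypothetical and are not taken from parallel_trj.py. The merge step uses the standard pairwise formula from Chan et al.; this PR does not show which combination formula the package actually uses.

```python
import statistics


def welford(values):
    """Return (count, mean, M2) for a sequence using Welford's one-pass update."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # M2 accumulates the sum of squared deviations
    return count, mean, m2


def merge(a, b):
    """Combine two (count, mean, M2) partial results into one."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def chunked_variance(values, n_chunks):
    """Split values into contiguous chunks, reduce each chunk, then merge."""
    if not values:
        raise ValueError("values must be non-empty")
    size = -(-len(values) // n_chunks)  # ceiling division
    # Each chunk is independent, so this is the part a prange loop could run in parallel.
    partials = [welford(values[i:i + size]) for i in range(0, len(values), size)]
    total = (0, 0.0, 0.0)
    for partial in partials:  # reduction over the per-chunk results
        total = merge(total, partial)
    count, mean, m2 = total
    return mean, m2 / count  # population variance


# Made-up distances of one atom pair over ten frames.
distances = [2.9, 3.1, 3.0, 3.4, 2.8, 3.2, 3.05, 2.95, 3.3, 3.15]
total_mean, total_var = chunked_variance(distances, n_chunks=3)
print(total_mean, statistics.fmean(distances))
print(total_var, statistics.pvariance(distances))
```

The two printed pairs should agree to within floating-point rounding, which is the property that makes chunk-wise processing safe: the merged result does not depend on how the frames were split.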

Code Implementation:
The following functions were developed in parallel_trj.py to achieve the parallelization:

  • parallel_variance: Computes the variance in parallel using Welford's algorithm.
  • calculate_chunk: Processes a chunk of frames to compute the mean and variance of the interatomic distances.
  • calculate: Divides the data into chunks and uses parallel processing to compute the Lindemann index.
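
For orientation, the Lindemann index that these functions work toward is commonly defined over all atom pairs as shown below. Here N is the number of atoms, r_ij is the distance between atoms i and j, and angle brackets denote an average over frames. These are the textbook symbols, not names taken from parallel_trj.py, and the exact normalization used in the package is not shown in this PR.

```math
q = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{\sqrt{\langle r_{ij}^{2} \rangle - \langle r_{ij} \rangle^{2}}}{\langle r_{ij} \rangle}
```

The numerator is the standard deviation of r_ij over the frames and the denominator is its mean, so a per-pair mean and variance are the only quantities that need to be carried from each chunk into the final reduction.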

Benefits:

  • Significant reduction in computation time for large datasets.
  • Efficient utilization of multi-core processors.
  • Enhanced scalability and performance, making it suitable for HPC environments.

Next Steps:

  • Parallelization of the -ot flag.

@N720720 linked an issue Jun 11, 2024 that may be closed by this pull request
@N720720 self-assigned this Jun 11, 2024
@N720720 added the enhancement label (New feature or request) Jun 11, 2024
@N720720 merged commit 03162fd into master Jun 11, 2024
21 checks passed