DeerLab models incur heavy overhead while fitting #100

luisfabib · 2021-03-18T11:54:34Z

While running tests on fitsignal I noticed a slowdown in performance.

=================================================== test session starts 
platform win32 -- Python 3.8.0, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: D:\lufa\projects\DeerLab\DeerLab
collected 44 items                                                                                                                                                                                                                                                                                                  

test\test_fitsignal.py ............................................                                                                                                                                                                                                                                          [100%] 

==================================================slowest 10 durations 
24.30s call     test/test_fitsignal.py::test_ridme5
11.39s call     test/test_fitsignal.py::test_cost_value
10.31s call     test/test_fitsignal.py::test_ovl4pdeer
9.51s call     test/test_fitsignal.py::test_ridme3
9.00s call     test/test_fitsignal.py::test_5pdeer
6.87s call     test/test_fitsignal.py::test_7pdeer
6.12s call     test/test_fitsignal.py::test_global_4pdeer
4.63s call     test/test_fitsignal.py::test_global_full_parametric
4.31s call     test/test_fitsignal.py::test_global_scale_4pdeer
3.83s call     test/test_fitsignal.py::test_boundaries_adjust_ex

A simple profiling of fitsignal

import deerlab as dl 
import numpy as np 
import cProfile

@profile(sort_by='cumulative', lines_to_print=20, strip_dirs=True)
def fitfcn():
    t = np.linspace(0,5,300)
    r = np.linspace(2,6,250)
    P = dl.dd_gauss(r,[4.5, 0.25])
    Bmodel = lambda t: dl.bg_exp(t,0.4)
    K = dl.dipolarkernel(t,r,0.4,Bmodel)
    V = K@P
    fit = dl.fitsignal(V,t,r,'P',dl.bg_exp,dl.ex_4pdeer,uqanalysis=False)
fitfcn()

revealed a very slow performance for such a simple fit.


import deerlab as dl ...
         15067173 function calls (15039888 primitive calls) in 9.770 seconds

   Ordered by: cumulative time
   List reduced from 625 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    9.777    9.777 <ipython-input-7-7f1b715129a0>:5(fitfcn)
        1    0.000    0.000    9.635    9.635 fitsignal.py:16(fitsignal)
        1    0.000    0.000    9.605    9.605 fitsignal.py:582(separable_nonlinear_lsq_analysis)
        1    0.000    0.000    9.488    9.488 snlls.py:18(snlls)
        1    0.000    0.000    9.228    9.228 least_squares.py:240(least_squares)
       43    0.009    0.000    9.204    0.214 snlls.py:287(ResidualsFcn)
        1    0.000    0.000    8.507    8.507 trf.py:112(trf)
        1    0.001    0.001    8.507    8.507 trf.py:205(trf_bounds)
       46    0.001    0.000    6.206    0.135 fitsignal.py:320(multiPathwayModel)
       45    0.000    0.000    6.091    0.135 utils.py:59(<lambda>)
       45    0.000    0.000    6.090    0.135 fitsignal.py:593(<lambda>)
      188    0.000    0.000    6.023    0.032 inspect.py:1512(stack)
      188    0.019    0.000    6.023    0.032 inspect.py:1484(getouterframes)
      188    0.005    0.000    5.999    0.032 bg_models.py:314(bg_exp)    <-------------
    10063    0.054    0.000    5.995    0.001 inspect.py:1445(getframeinfo)
      187    0.003    0.000    5.993    0.032 bg_models.py:12(_parsargs)    <-------------
       93    0.003    0.000    5.983    0.064 dipolarbackground.py:10(dipolarbackground)
      186    0.000    0.000    5.970    0.032 dipolarbackground.py:79(<lambda>)
      184    0.000    0.000    5.882    0.032 fitsignal.py:339(<lambda>)
       14    0.001    0.000    4.755    0.340 least_squares.py:882(jac_wrapped)

While there are other factors that contribute to the slower performance, bg_models.py is strongly unexpected. These functions are just called to perform a simple mathematical operation and return a vector.

After closer inspection, I performed a different benchmark to see if there was a strong overhead for calling models in bg_models.py

@profile(sort_by='cumulative', lines_to_print=5, strip_dirs=True)
def Bmodel():
    for _ in range(500):
        t = np.linspace(0,5,100)
        B = dl.bg_exp(t,0.4)
Bmodel()

@profile(sort_by='cumulative', lines_to_print=5, strip_dirs=True)
def Bmath():
    for _ in range(500):
        t = np.linspace(0,5,100)
        B = np.exp(-t*0.4)
Bmath()

which reveals a clear overhead for calling these models

       36717502 function calls (36716002 primitive calls) in 14.316 seconds

   Ordered by: cumulative time
   List reduced from 83 to 5 due to restriction <5>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003   14.325   14.325 <ipython-input-8-2bded830471e>:1(Bmodel)
      500    0.012    0.000   14.278    0.029 bg_models.py:314(bg_exp)
      500    0.007    0.000   14.264    0.029 bg_models.py:12(_parsargs)
      500    0.001    0.000   14.244    0.028 inspect.py:1512(stack)
      500    0.036    0.000   14.243    0.028 inspect.py:1484(getouterframes)

---------------------------------------------------------------------------------------

         16002 function calls (14502 primitive calls) in 0.022 seconds

   Ordered by: cumulative time
   List reduced from 29 to 5 due to restriction <5>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.022    0.022 <ipython-input-8-2bded830471e>:7(Bmath)
      500    0.000    0.000    0.020    0.000 <__array_function__ internals>:2(linspace)
 2000/500    0.003    0.000    0.020    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
      500    0.007    0.000    0.019    0.000 function_base.py:23(linspace)
      500    0.000    0.000    0.004    0.000 <__array_function__ internals>:2(any)

I can reproduce these results for both dd_models.py and ex_models.py.

While this overhead is negligible for most operations within DeerLab, fitsignal relies heavily on calling these models each time a residual vector is computed during optimization. This propagates strongly into the performance of the fit as seen in the profile above.

The text was updated successfully, but these errors were encountered:

luisfabib · 2021-03-18T12:48:17Z

My current approach has been to implement the model functions is such a way, that they return a function handle model containing the basic math as part of the info dictionary, for example

info = dict(
    Parameters = ['Intercept','1st-order coefficient','2nd-order coefficient','3rd-order coefficient'],
    Units = ['','μs⁻¹','μs⁻²','μs⁻³'],
    Start = np.asarray([ 1,  -1  , -1,   -1  ]),
    Lower = np.asarray([ 0,  -200, -200, -200]),
    Upper = np.asarray([200,  200,  200,  200]),
    ModelFcn = model
     )
    return info

This way, the basic function containing the math is requested once from the model in fitsignal and then can be directly called without the need to access one of the modules bgmodels.py, ex_models.py, or dd_models.py and the corresponding model inside there, avoiding the call overhead entirely.

Taking the same benchmark example as before:

@profile(sort_by='cumulative', lines_to_print=20, strip_dirs=True)
def fitfcn():
    t = np.linspace(0,5,300)
    r = np.linspace(2,6,250)
    P = dl.dd_gauss(r,[4.5, 0.25])
    Bmodel = lambda t: dl.bg_exp(t,0.4)
    K = dl.dipolarkernel(t,r,0.4,Bmodel)
    V = K@P
    fit = dl.fitsignal(V,t,r,'P',dl.bg_exp,dl.ex_4pdeer,uqanalysis=False)
fitfcn()

Before:

15067173 function calls (15039888 primitive calls) in 9.770 seconds

Proposed change:

1509554 function calls (1470428 primitive calls) in 4.565 seconds

So this change would already reduce the runtime of fitsignal by half.

luisfabib added the performance label Mar 18, 2021

luisfabib self-assigned this Mar 18, 2021

luisfabib added this to the 0.13.0 milestone Mar 18, 2021

luisfabib mentioned this issue Mar 18, 2021

Acceleration of fitting by avoiding model call overhead #101

Merged

stestoll closed this as completed in #101 Mar 18, 2021

luisfabib mentioned this issue Apr 7, 2021

Implement models metadata as function attributes #143

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DeerLab models incur heavy overhead while fitting #100

DeerLab models incur heavy overhead while fitting #100

luisfabib commented Mar 18, 2021 •

edited

Loading

luisfabib commented Mar 18, 2021

DeerLab models incur heavy overhead while fitting #100

DeerLab models incur heavy overhead while fitting #100

Comments

luisfabib commented Mar 18, 2021 • edited Loading

luisfabib commented Mar 18, 2021

luisfabib commented Mar 18, 2021 •

edited

Loading