
Size of saved model file changes dramatically after different calls of predict #1415

Closed
ziqin8 opened this issue Sep 7, 2023 · 5 comments · Fixed by #1425
@ziqin8
Contributor

ziqin8 commented Sep 7, 2023

Describe the bug

The time and disk space needed to save a model change dramatically after different calls of predict. This may be partly expected, as some information about the forecasted dataset might be stored on the NeuralProphet object.

However, giving users an option to save only the necessary parts of the model (enough for future predictions) would be a huge plus.

To Reproduce

save the model directly after training

import os

import pandas as pd
from neuralprophet import NeuralProphet, set_log_level, df_utils
from neuralprophet import save as npsave

data_location = "https://raw.githubusercontent.com/ourownstory/neuralprophet-data/main/datasets/yosemite_temps.csv"
df = pd.read_csv(data_location)

m = NeuralProphet()
_ = m.fit(df, epochs=2, learning_rate=0.01, progress=None)

# save the model directly after training
fn = "test1.np"
%time npsave(m, fn)
print(f"np file size in MB: {os.path.getsize(fn) / 1e6}")
CPU times: user 3.86 s, sys: 29.8 ms, total: 3.89 s
Wall time: 3.9 s
np file size in MB: 13.682805

save model after predict on 2 rows

m = NeuralProphet()
_ = m.fit(df, epochs=2, learning_rate=0.01, progress=None)
# run forecast with 2 rows
forecast = m.predict(df.iloc[-2:])
# save model after predict on 2 rows
fn = "test2.np"
%time npsave(m, fn)
print(f"np file size in MB: {os.path.getsize(fn) / 1e6}")
CPU times: user 4.89 ms, sys: 602 µs, total: 5.5 ms
Wall time: 5.56 ms
np file size in MB: 0.045449

save model after predict on 1000 rows

m = NeuralProphet()
_ = m.fit(df, epochs=2, learning_rate=0.01, progress=None)
# run forecast with 1000 rows
forecast = m.predict(df.iloc[-1000:])
# save model after predict on 1000 rows
fn = "test3.np"
%time npsave(m, fn)
print(f"np file size in MB: {os.path.getsize(fn) / 1e6}")
CPU times: user 172 ms, sys: 1.9 ms, total: 174 ms
Wall time: 174 ms
np file size in MB: 0.761993

Expected behavior

  • Similar size of model files should be generated, OR
  • adding an option in save function to only save the minimal model

Additional context
The TimeNet object (m.model) is causing the difference, but I did not dig deep enough to find the root cause.
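One way to dig for the root cause is to measure the pickled size of each attribute on the model object and see which one dominates. The sketch below illustrates the idea on a toy class (ToyModel and its cached_df attribute are hypothetical stand-ins, not NeuralProphet internals); the same attribute_sizes helper could be pointed at m.model before and after predict.

```python
import pickle


class ToyModel:
    """Toy stand-in for TimeNet: one small config attribute and one
    large cached attribute (simulating a stored data frame)."""

    def __init__(self):
        self.config = {"lr": 0.01}
        self.cached_df = list(range(200_000))  # large object left by predict


def attribute_sizes(obj):
    """Return the pickled size in bytes of each instance attribute."""
    return {name: len(pickle.dumps(value)) for name, value in vars(obj).items()}


m = ToyModel()
sizes = attribute_sizes(m)
largest = max(sizes, key=sizes.get)
print(largest)  # the attribute dominating the saved file
```

Running this against the real m.model after a small vs. a large predict call would show which attribute grows with the size of the forecasted data frame.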

@SimonWittner
Collaborator

Hi,
thanks for raising this! Do you know of any package or existing solution where you can choose to save a minimal model? It seems to me that PyTorch does not include such an option in torch.save.

@ourownstory ourownstory self-assigned this Sep 18, 2023
@ziqin8
Contributor Author

ziqin8 commented Sep 18, 2023

@SimonWittner thanks for the reply - yes, I think torch.save does not have an option for this.

I'm not very familiar with TimeNet, but it looks like some attribute / component / layer on TimeNet changes its size after TimeNet.forward. So, to add an option to save only a "minimal" model, we could potentially remove that attribute before calling torch.save?
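A minimal sketch of that idea, using a toy class and plain pickle in place of TimeNet and torch.save (the attribute name cached_df is hypothetical): save a shallow copy of the model with the offending attribute deleted, so the original object stays usable.

```python
import copy
import os
import pickle
import tempfile


class ToyModel:
    """Toy stand-in for TimeNet: small weights plus a large cached
    attribute (hypothetical name) left behind by a predict call."""

    def __init__(self):
        self.weights = [0.1, 0.2, 0.3]
        self.cached_df = list(range(100_000))


def save_minimal(model, path, drop=("cached_df",)):
    """Pickle a shallow copy of the model with the listed attributes
    removed; the original object is left untouched."""
    slim = copy.copy(model)
    for name in drop:
        if hasattr(slim, name):
            delattr(slim, name)
    with open(path, "wb") as f:
        pickle.dump(slim, f)


m = ToyModel()
tmp = tempfile.mkdtemp()
full_path = os.path.join(tmp, "full.np")
min_path = os.path.join(tmp, "minimal.np")
with open(full_path, "wb") as f:
    pickle.dump(m, f)  # full save, cached attribute included
save_minimal(m, min_path)  # minimal save, cached attribute dropped
print(os.path.getsize(full_path), os.path.getsize(min_path))
```

Copying before deleting matters: it avoids mutating a model the user may still want to call predict on.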

@ourownstory
Owner

@c3-ziqin Thank you for raising this to our awareness and for providing clear examples.
As you suspected, this is not how we intend save to work.
It appears that the training / prediction data frame ends up in the scope of what torch.save considers part of the model. This is not needed for the model to work and should therefore be avoided - we intend to save only the "minimal" model.

If you can find out how to prevent the data frame from being indexed and included by torch.save, we would be happy to merge a PR addressing this!

@leoniewgnr
Collaborator

@c3-ziqin Yes, I think this is the way to go. I just dug deeper into the forecaster object, but I'm unsure which attributes to remove. Probably something like this:
non_essential_attrs = ["metrics_train", "metrics_val", "_optimizer", "_scheduler", "_state_dict_hooks"]

But I'm not entirely sure whether the actual data is stored in these.
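One design option for applying such a list, sketched below with plain pickle on a toy class (Forecaster and its attributes are hypothetical stand-ins): temporarily remove the non-essential attributes, save, and restore them afterwards, so the in-memory model is unchanged.

```python
import pickle
from contextlib import contextmanager


@contextmanager
def without_attrs(obj, names):
    """Temporarily delete the named attributes, restoring them on exit,
    so the caller can pickle (or torch.save) a slimmed-down object."""
    stash = {}
    try:
        for name in names:
            if hasattr(obj, name):
                stash[name] = getattr(obj, name)
                delattr(obj, name)
        yield obj
    finally:
        for name, value in stash.items():
            setattr(obj, name, value)


class Forecaster:
    """Toy stand-in; attribute names mirror the guess above."""

    def __init__(self):
        self.model = [1.0, 2.0]
        self.metrics_train = list(range(50_000))  # large, non-essential


non_essential_attrs = ["metrics_train", "metrics_val", "_optimizer"]
m = Forecaster()
with without_attrs(m, non_essential_attrs) as slim:
    blob = pickle.dumps(slim)  # slim snapshot, metrics excluded
print(len(blob), len(pickle.dumps(m)))  # slim size vs. full size
```

The try/finally guarantees the attributes come back even if serialization fails midway.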

@leoniewgnr
Collaborator

hi @c3-ziqin, I think I fixed it. Can you check if this works for you? #1425

@ourownstory ourownstory linked a pull request Sep 20, 2023 that will close this issue