How to load this model directly to generate data after saving it #361
Comments
Hi @RedBlue01, nice to meet you. The error message seems to indicate that the synthesizer you are loading was never fitted, so it is not possible to sample from it. Did you create the original synthesizer (saved as
BTW, instead of using the CTGAN library directly, I would highly recommend moving to the SDV library. You can access the CTGAN synthesizer via SDV. Doing so lets you use additional features, such as better data pre-processing, customizations such as constraints, and conditional sampling. Here is a tutorial that uses CTGAN via the SDV library.
Hi @npatki, thank you very much for responding to this question, and I'm sorry for the late reply. Here is my code:
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import CTGANSynthesizer

data = pd.read_csv('/home/visitor/Huang/Analytical-Method/column_123after.csv', usecols=[0, 2])
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(data)
python_dict = metadata.to_dict()
print(data)
print(python_dict)
synthesizer = CTGANSynthesizer(
    metadata,  # required
    enforce_rounding=True,
    epochs=200,
    verbose=True
)
synthesizer.save(
    filepath='/home/visitor/Huang/Analytical-Method/GAN/my_synthesizer_e200NEW.pkl'
)
synthesizer.fit(data)
synthesizer.get_loss_values()
synthetic_data = synthesizer.sample(num_rows=10)
print(synthetic_data)
print('Done')
Thank you so much for what you have done. I have already installed SDV with "pip install sdv", and it is amazing work.
Hi @RedBlue01, thanks for sharing your code. The problem is that you are saving your synthesizer before fitting it. I would recommend saving the synthesizer after you call the fit function. The fitting process is where the machine learning happens, and you want its result included in the saved file, so saving should happen after fitting.
synthesizer.fit(data)
synthesizer.save(
    filepath='/home/visitor/Huang/Analytical-Method/GAN/my_synthesizer_e200NEW.pkl'
)
Keep in mind that when you call
Hi @npatki, thank you so much for your help. I have finally solved this problem, which had troubled me for a long time. Indeed, I never thought that it was a problem with the order of
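The fix discussed in this thread (fit first, then save, then load and sample) can be sketched as two small helpers. This is only an illustration, not part of SDV: the helper names `fit_and_save` and `load_and_sample` are hypothetical, and they assume the object passed in exposes the `fit`, `save(filepath=...)`, `load(filepath=...)` and `sample(num_rows=...)` methods used in the code above.

```python
def fit_and_save(synthesizer, data, filepath):
    """Fit first, then save, so the saved file contains the trained model."""
    synthesizer.fit(data)                # the learning happens here
    synthesizer.save(filepath=filepath)  # snapshot taken after fitting
    return synthesizer


def load_and_sample(synthesizer_class, filepath, num_rows):
    """Load a previously fitted synthesizer and draw rows from it."""
    synthesizer = synthesizer_class.load(filepath=filepath)
    return synthesizer.sample(num_rows=num_rows)
```

With the objects from this thread, the first helper would be called with the `CTGANSynthesizer(metadata, ...)` instance, `data`, and the `.pkl` path, and the second with the `CTGANSynthesizer` class, the same path, and `num_rows=10`.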
Environment details
22.04) 12.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #14~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Nov 20 18:15:30 UTC 2
Problem description
This is my code. I want to train for a certain number of epochs, save the model, and then load that model directly to generate data. The attempt failed and an error was reported.
`from sdv.single_table import CTGANSynthesizer
synthesizer = CTGANSynthesizer.load(
    filepath='/home/visitor/Huang/Analytical-Method/GAN/my_synthesizer_mini_e200NEW.pkl'
)
synthetic_data = synthesizer.sample(num_rows=10)
synthetic_data.to_csv('/home/visitor/Huang/Analytical-Method/GAN/synthetic_data.csv', index=False)
print(synthetic_data)
print('Done')`
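One way to see why a loaded synthesizer may refuse to sample: a saved file only holds whatever state the object had at the moment it was saved. The toy class below is a hypothetical stand-in written with the standard library only. It is not SDV's implementation, and `ToySynthesizer` and `sample_from_saved` are made-up names for illustration.

```python
import os
import pickle
import tempfile


class ToySynthesizer:
    """Hypothetical stand-in for a synthesizer; not SDV's implementation."""

    def __init__(self):
        self.mean = None  # learned during fit(); None means "not fitted"

    def fit(self, values):
        self.mean = sum(values) / len(values)

    def sample(self, num_rows):
        if self.mean is None:
            raise RuntimeError("synthesizer was never fitted")
        return [self.mean] * num_rows

    def save(self, filepath):
        # Persist only the current state, whatever it is right now.
        with open(filepath, "wb") as f:
            pickle.dump({"mean": self.mean}, f)

    @classmethod
    def load(cls, filepath):
        with open(filepath, "rb") as f:
            state = pickle.load(f)
        obj = cls()
        obj.mean = state["mean"]
        return obj


def sample_from_saved(save_before_fit):
    """Save either before or after fit(), reload, and try to sample."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.pkl")
        synth = ToySynthesizer()
        if save_before_fit:
            synth.save(path)            # file holds the unfitted state
            synth.fit([1.0, 2.0, 3.0])  # fitting no longer affects the file
        else:
            synth.fit([1.0, 2.0, 3.0])
            synth.save(path)            # file holds the learned state
        return ToySynthesizer.load(path).sample(num_rows=2)
```

Calling `sample_from_saved(save_before_fit=True)` raises the "never fitted" error in this toy, while `save_before_fit=False` returns samples, because only then does the file contain the fitted state.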
What I already tried
I looked at the anaconda3/envs/AM/lib/python3.10/site-packages/sdv/data_processing/data_processor.py file, but with my limited experience I could not work out how to solve the problem.
The following is my current situation.