
EMAModel class bug: "from_pretrained" method bug #9764


Closed
wangyanhui666 opened this issue Oct 24, 2024 · 3 comments · Fixed by #9779
Labels
bug Something isn't working

Comments

@wangyanhui666

wangyanhui666 commented Oct 24, 2024

Describe the bug

@classmethod
def from_pretrained(cls, path, model_cls, foreach=False) -> "EMAModel":
    _, ema_kwargs = model_cls.load_config(path, return_unused_kwargs=True)
    model = model_cls.from_pretrained(path)

    ema_model = cls(model.parameters(), model_cls=model_cls, model_config=model.config, foreach=foreach)

    ema_model.load_state_dict(ema_kwargs)
    return ema_model

This is the from_pretrained method of the EMAModel class.
Its first line, "_, ema_kwargs = model_cls.load_config(path, return_unused_kwargs=True)", always returns an empty dict for "ema_kwargs".
I think this line should be "_, ema_kwargs = model_cls.from_config(path, return_unused_kwargs=True)", which returns the ema_kwargs correctly.
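For reference, here is a sketch of the method with that one-line change applied (everything else is unchanged from the snippet above); this is my proposed fix, not merged code:

@classmethod
def from_pretrained(cls, path, model_cls, foreach=False) -> "EMAModel":
    # from_config splits the EMA-specific entries (decay, inv_gamma, optimization_step, ...)
    # into the unused-kwargs dict; load_config leaves them inside the returned config
    # and hands back an empty dict instead.
    _, ema_kwargs = model_cls.from_config(path, return_unused_kwargs=True)
    model = model_cls.from_pretrained(path)

    ema_model = cls(model.parameters(), model_cls=model_cls, model_config=model.config, foreach=foreach)

    ema_model.load_state_dict(ema_kwargs)
    return ema_model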

Reproduction

You can use the official example that trains with EMA:
https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/train_unconditional.py
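
Alternatively, a minimal sketch that shows the symptom directly; the checkpoint directory "ema_ckpt" is hypothetical and stands for a folder written by EMAModel.save_pretrained:

from diffusers import DiTTransformer2DModel
from diffusers.training_utils import EMAModel

# "ema_ckpt" is a hypothetical directory produced by EMAModel.save_pretrained
ema = EMAModel.from_pretrained("ema_ckpt", DiTTransformer2DModel)

# Because load_config hands back an empty unused-kwargs dict, the EMA state stays
# at its defaults (e.g. optimization_step is 0) instead of the saved values.
print(ema.optimization_step, ema.decay, ema.inv_gamma)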

Logs

kwargs=model_cls.load_config(path, return_unused_kwargs=True)
print(kwargs)

({'_class_name': 'DiTTransformer2DModel', '_diffusers_version': '0.30.2', 'activation_fn': 'gelu-approximate', 'attention_bias': True, 'attention_head_dim': 72, 'decay': 0.9999, 'dropout': 0.0, 'in_channels': 4, 'inv_gamma': 1.0, 'min_decay': 0.0, 'norm_elementwise_affine': False, 'norm_eps': 1e-05, 'norm_num_groups': 32, 'norm_type': 'ada_norm_zero', 'num_attention_heads': 16, 'num_embeds_ada_norm': 1000, 'num_layers': 28, 'optimization_step': 280000, 'out_channels': 4, ...}, {})


As you can see, the second dict in kwargs is always empty; the EMA-related config ('optimization_step': 280000, 'inv_gamma': 1.0, 'min_decay': 0.0, 'decay': 0.9999) ends up in the first dict instead.


However, if I use
kwargs=model_cls.from_config(path, return_unused_kwargs=True)
print(kwargs)
(<DiTTransformer2DModel>, {'decay': 0.9999, 'inv_gamma': 1.0, 'min_decay': 0.0, 'optimization_step': 280000, 'power': 0.75, 'update_after_step': 0, 'use_ema_warmup': True, '_class_name': 'DiTTransformer2DModel', '_diffusers_version': '0.30.2'})

The second dict now contains the EMA config, which is correct.
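
Until this is fixed, a workaround sketch that pulls the EMA entries out of the saved config by hand; the directory name "ema_ckpt" and the key list are assumptions based on the config printed above:

from diffusers import DiTTransformer2DModel
from diffusers.training_utils import EMAModel

path = "ema_ckpt"  # hypothetical checkpoint directory saved by EMAModel.save_pretrained

# load_config returns the full saved config, including the EMA entries
config = DiTTransformer2DModel.load_config(path)
ema_keys = ("decay", "min_decay", "optimization_step", "update_after_step",
            "use_ema_warmup", "inv_gamma", "power")
ema_kwargs = {k: config[k] for k in ema_keys if k in config}

# Rebuild the EMA wrapper around the saved (already EMA-averaged) weights
# and restore the bookkeeping state.
model = DiTTransformer2DModel.from_pretrained(path)
ema = EMAModel(model.parameters(), model_cls=DiTTransformer2DModel, model_config=model.config)
ema.load_state_dict(ema_kwargs)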

System Info

diffusers 0.30.2

Who can help?

@DN6

wangyanhui666 added the bug (Something isn't working) label on Oct 24, 2024
@wangyanhui666
Author

@DN6 can you help me?

@SahilCarterr
Contributor

I think you can look at discussion #8802 for the solution. @wangyanhui666

@wangyanhui666
Author

I think you can look at discussion #8802 for the solution. @wangyanhui666

Thanks. I took a look, and it was helpful to me. This approach does indeed solve the issue.

I believe it’s a bug in the EMAModel.from_pretrained method; it should use model_cls.from_config() instead of model_cls.load_config().
