Support batch transform job for Model entity with container definitions which use ModelDataSource attribute (especially autogluon foundation jump start model) #5160


Open
MarekSadowski-Alvaria opened this issue May 7, 2025 · 0 comments


Describe the feature you'd like
I would like batch transform jobs to support Model entities whose container definitions use the ModelDataSource attribute. I am particularly interested in the JumpStart foundation AutoGluon Chronos model, but the problem appears to be broader and to affect all models whose container definitions use the ModelDataSource attribute (and do not define the ModelDataUrl attribute).

How would this feature be used? Please describe.
I would expect the code below to correctly create a batch transform job for the AutoGluon Chronos model.

from sagemaker.jumpstart.model import JumpStartModel
model = JumpStartModel(model_id='autogluon-forecasting-chronos-t5-base')
transformer = model.transformer(instance_type='ml.m5.large', instance_count=1)
transformer.transform('s3://some-bucket/')

Currently the above code returns the error below, which suggests that models whose container definitions use ModelDataSource (batch transform jobs apparently expect ModelDataUrl) are not supported by batch transform jobs.

ClientError: An error occurred (ValidationException) when calling the CreateTransformJob operation: SageMaker Batch
currently doesn't support Model entity with container definitions which use ModelDataSource attribute

Describe alternatives you've considered
Nothing

Additional context
I found a workaround: gzip the AutoGluon forecasting Chronos model artifacts and define the model by providing model_data in compressed form. The container's ModelDataUrl attribute is then defined and the batch transform job can be created properly. For some reason, when model_data is not provided in gzipped form, the ModelDataUrl container attribute is not defined, and there does not seem to be any way to set this attribute manually.
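The workaround above can be sketched as follows. This is a minimal sketch, assuming the uncompressed model artifacts have already been downloaded locally; the image URI, role, and S3 paths are placeholders, not values from the original report.

```python
import os
import tarfile


def repack_as_targz(model_dir: str, output_path: str) -> str:
    """Pack an uncompressed model directory into a model.tar.gz archive.

    Providing model_data as a gzipped tarball is what causes the container's
    ModelDataUrl attribute to be populated, which batch transform requires.
    """
    with tarfile.open(output_path, "w:gz") as tar:
        for name in os.listdir(model_dir):
            # arcname keeps the archive flat, as SageMaker expects
            tar.add(os.path.join(model_dir, name), arcname=name)
    return output_path


# After uploading the archive to S3, define the model from the compressed
# artifact so ModelDataUrl is set (hypothetical image/role/bucket values):
#
# from sagemaker.model import Model
# model = Model(
#     image_uri="<inference-image-uri>",
#     model_data="s3://some-bucket/chronos/model.tar.gz",
#     role="<execution-role-arn>",
# )
# transformer = model.transformer(instance_type="ml.m5.large", instance_count=1)
# transformer.transform("s3://some-bucket/input/")
```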

I don't know where exactly the problem lies: in the fact that for non-gzipped models the ModelDataUrl container attribute is undefined, or in the fact that the batch transform job cannot copy uncompressed model files for some reason. Either way, it seems it could be easily resolved on the AWS side.
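To see which of the two cases applies to a given model, one can inspect its primary container definition (e.g. the `PrimaryContainer` dict returned by boto3's `describe_model`). A small helper, sketched here over a plain dict so the shape is clear; the container values below are placeholders modeled on what JumpStart produces for an uncompressed model:

```python
def artifact_reference(container: dict) -> str:
    """Report how a SageMaker container definition references its artifacts."""
    if "ModelDataUrl" in container:
        # gzipped tarball in S3; batch transform accepts this
        return "ModelDataUrl"
    if "ModelDataSource" in container:
        # uncompressed S3 prefix; CreateTransformJob currently rejects this
        return "ModelDataSource"
    return "none"


# Example container definition (placeholder values):
container = {
    "Image": "<inference-image-uri>",
    "ModelDataSource": {
        "S3DataSource": {
            "S3Uri": "s3://jumpstart-bucket/chronos/",
            "S3DataType": "S3Prefix",
            "CompressionType": "None",
        }
    },
}
print(artifact_reference(container))  # → ModelDataSource
```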

The easiest fix would be to prepare all JumpStart models in gzipped form, but it would probably be better to solve the problem more broadly so that batch transform jobs can also support models that have not been gzipped.

This is related to issue #4777.
