Skip to content

Memory usage optimization on Serializer class #7250

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Anton-Shutik opened this issue Mar 30, 2020 · 5 comments
Closed

Memory usage optimization on Serializer class #7250

Anton-Shutik opened this issue Mar 30, 2020 · 5 comments

Comments

@Anton-Shutik
Copy link
Contributor

Anton-Shutik commented Mar 30, 2020

Steps to reproduce

Lets say we have model like this:

class User(models.Model):
    first_name = models.CharField()
    last_name = models.CharField()

and serializer like this:

class UserSerializer(serializers.ModelSerializer):

    class Meta:
        model = User
        fields = ('first_name', 'last_name')

and whenever we serialize python obj, the data property has reference to serializer:

serializer = UserSerializer(instance=User.objects.first())
serializer.data.serializer is serializer # gives true

It works fine on small objects like in the example, but it starts consuming more memory with larger querysets and nested Serializers, just because of that data.serializer reference. Python reference counter cannot delete the data.serializer object, because data property references on that, and such a way we have to wait until GC collects it.

The data.serializer reference is only used (at least it is single usage I found) in HTMLFormRenderer and it looks like overhead. Also I'm pretty sure there are lots of projects the do not use the renderer at all, but have to "suffer" from that extra memory usage.

If I change our serializer like below, it consumes less memory:

class UserSerializer(serializers.ModelSerializer):

    @property
    def data(self):
        ret = super(Serializer, self).data
        #return ReturnDict(ret, serializer=self)    <----original code
        return ReturnDict(ret)

So, my point is to make the serializer reference optional, depending if HTMLFormRenderer is used or not, or pass it via renderer context.

class GenericAPIView(views.APIView):

    def get_serializer_context(self):
        context = super().get_serializer_context()
        context['needs_serializer_ref'] = HTMLFormRenderer in self.renderer_classes
        return context

and then in serializer do:

class Serializer(BaseSerializer):

    @property
    def data(self):
        ret = super(Serializer, self).data
        if self.context.get('needs_serializer_ref'):
            return ReturnDict(ret, serializer=self)
        return ret

Just an idea.

Thoughts ?

PS.
Ready to submit a PR

@Anton-Shutik
Copy link
Contributor Author

@tomchristie Do you have any thoughts on this ?

@tomchristie
Copy link
Member

Thanks for looking into this - certainly interesting.

Okay, so this is required if BrowsableAPIRenderer in self.renderer_classes, since the browsable API is where that form renderer is used.

Given that the browsable API is on by default, I think this would be quite a big change for us to make, and I'd rather we just let the garbage collector do what garbage collectors do.

If there was a way to achieve this that had minimal impact on existing projects or existing behavior then perhaps there'd be something to consider.

@rpkilby
Copy link
Member

rpkilby commented Apr 7, 2020

Okay, so this is required if BrowsableAPIRenderer in self.renderer_classes, since the browsable API is where that form renderer is used.

Took a brief look yesterday - the BrowsableAPIRenderer will create its own instance of the serializer, passing the result to an internal instance of HTMLFormRenderer. So, this really should only affect users who have specifically added HTMLFormRenderer (BrowsableAPIRenderer doesn't inherit HTMLFormRenderer).

That all said, I'm kind of inclined to just get rid of ReturnDict as the the BrowsableAPIRenderer reconstructs the serializer anyway. ¯\_(ツ)_/¯

@tomchristie
Copy link
Member

That all said, I'm kind of inclined to just get rid of ReturnDict as the the BrowsableAPIRenderer reconstructs the serializer anyway. ¯_(ツ)_/¯

If there's an easy route onto refactoring the browsable API renderer so that it's not required, then sure, that sounds like a good plan.

@Anton-Shutik
Copy link
Contributor Author

Hi @tomchristie ! I've made a PR that fixes the issue, what do you think about it ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants