-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
ENH: Support groupby.ewm operations #37878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @mroeschke! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-11-17 21:58:52 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial scan looks good, can you add some asv's for this
can you show a timeit result of the new groupby.ewm vs groupby.apply(.....ewm) for posterity in the top of this PR |
Posted the timings. I think there is a slight slowdown using the cython engine with the convenience wrapper since we have to reconstruct the EWM object in the apply loop. |
what is the timing for master (before this PR for the groupby apply) i would have expected the numba one to be way after here |
Updated the snippet above; the numba timing including a caching run which skewed the benchmark. The groupby apply timing is similar on master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question and request for tests
], | ||
names=["A", None], | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also compare to groupby('A').apply(lambda x: getattr(x.ewm(com=1.0), method))
?
tm.assert_frame_equal(result, expected) | ||
|
||
expected = df.groupby("A").apply(lambda x: getattr(x.ewm(com=1.0), method)()) | ||
# There may be a bug in the above statement; not returning the correct index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hah i think its expected actually :->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or if you think its a bug, then open an issue :->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right ewm acts as a transform so the index should be the same.
thanks @mroeschke really nice |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Additionally enables a Numba engine for
groupby(...).ewm(...).mean(...)