Add DetectAndRemoveBadChannelsRecording
and DetectAndInterpolateBadChannelsRecording
classes
#3685
Conversation
This looks good to me, few suggestions / questions. Really love this as it solves a big problem for me!
src/spikeinterface/preprocessing/tests/test_detect_bad_channels.py
```python
# make sure they are removed
assert len(set(new_rec._kwargs["bad_channel_ids"]).intersection(new_rec.channel_ids)) == 0
# and that the kwarg is propagated to the kwargs of new_rec.
assert new_rec._kwargs["noisy_channel_threshold"] == 0
```
Is there anything to worry about with channel ordering here? I guess not (and this is a question for `ChannelSliceRecording` tests anyway). But as I have no understanding of how the ordering works I thought it worth asking 😆
The ordering of the channel ids?
erm, like the order of the channels on the recording itself (I'm not sure how exactly this is represented 😅). But like the default order when you do `plot_traces` without `order_channel_by_depth`.
src/spikeinterface/preprocessing/tests/test_detect_bad_channels.py
```python
updated_detect_bad_channels_kwargs = {k: v.default for k, v in sig.parameters.items() if k != "recording"}
updated_detect_bad_channels_kwargs.update(detect_bad_channels_kwargs)

bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
```
Suggested change:
```diff
-bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
+bad_channel_ids = detect_bad_channels_kwargs.get("bad_channel_ids")
+if bad_channel_ids is None:
+    bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
```
We don't have to rerun bad channel detection if bad channels have already been computed. Otherwise, if we parallelize this then each process/thread will rerun detection.

We should also keep the `ProcessingPipeline` in mind. In case we have a processing pipeline, we should differentiate somehow between the lazy and non-lazy computation steps.

An option could be to extend the `self._kwargs` mechanism and add a `self._precomputed_kwargs` (in this case this would hold the `bad_channel_ids`).

When loading an object, we could then specify whether we want to use or ignore the precomputed kwargs. When applying the pipeline to another object, you would ignore the precomputed kwargs, so that channel detection will be re-performed on the new recording.
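A minimal sketch of that load-time choice in plain Python (the helper name and the kwargs/precomputed split are hypothetical names from this discussion, not the actual SpikeInterface API):

```python
def build_init_kwargs(saved_kwargs, saved_precomputed, use_precomputed=True):
    """Merge precomputed kwargs (e.g. bad_channel_ids) back into the
    constructor kwargs on load, or drop them so the expensive detection
    step is re-performed on a new recording."""
    kwargs = dict(saved_kwargs)
    if use_precomputed:
        kwargs.update(saved_precomputed)
    return kwargs

saved = {"noisy_channel_threshold": 1.0}
precomputed = {"bad_channel_ids": ["ch3", "ch7"]}

# Reloading the same recording: reuse the cached detection result.
fast = build_init_kwargs(saved, precomputed)
# Applying the pipeline to a different recording: ignore the cache.
fresh = build_init_kwargs(saved, precomputed, use_precomputed=False)
```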
Does it make sense?
Oh you're very right, great point. For the `ProcessingPipeline` I was thinking that the main distinction is whether the parameter is recording-specific or not. If the kwarg (e.g. `bad_channel_ids` or `whitening_matrix`; these might be the only examples??) depends on which recording it's applied to, it shouldn't be used by the pipeline. I reckon this is the exact same set of kwargs that could be precomputed, since you'll always need a recording to do a precomputation?
`_kwargs` are handled by the subclasses, so would we need to add `_precomputed_kwargs` to all the preprocessing subclasses?
This could be done in the base object. It will need to be there because we will have to handle it upon loading and give the option to add them to the main _kwargs (default), or skip them. @samuelgarcia what do you think?
Oh no, we could use `hasattr` to check if a preprocessing class has possible `_precomputed_kwargs`, nice.
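That `hasattr` idea could look roughly like this sketch (class and attribute names follow the discussion; the real implementation may differ):

```python
class BasePreprocessorLike:
    """Stand-in for the base class; collects precomputed kwargs only from
    subclasses that declare _precomputable_kwarg_names."""

    def get_precomputed_kwargs(self):
        if not hasattr(self, "_precomputable_kwarg_names"):
            return {}
        return {
            name: self._kwargs[name]
            for name in self._precomputable_kwarg_names
            if self._kwargs.get(name) is not None
        }

class DetectAndRemoveLike(BasePreprocessorLike):
    _precomputable_kwarg_names = ["bad_channel_ids"]

    def __init__(self, bad_channel_ids=None, noisy_channel_threshold=1.0):
        self._kwargs = {
            "bad_channel_ids": bad_channel_ids,
            "noisy_channel_threshold": noisy_channel_threshold,
        }

rec = DetectAndRemoveLike(bad_channel_ids=["ch2"])
print(rec.get_precomputed_kwargs())  # {'bad_channel_ids': ['ch2']}
```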
Edit: sorry I think I am confusing two cases and these are two separate issues:

1. Might passing a list of channels to remove in `detect_bad_channel_kwargs` be a bit confusing? Can this be a separate argument? Must pass either `bad_channel_ids` or `detect_bad_channel_kwargs`?
2. If I understand correctly, the idea would be to store the bad channel ids in `self._kwargs` and not re-detect in case they were already computed. I meant change the public function kwargs, sorry. Would this case cause a problem:

```python
chan_removed_rec = remove_bad_channels(recording, noisy_channel_threshold=5)
# do some plotting or something to check the removed channels, and decide we want to remove a few more
chan_removed_rec = remove_bad_channels(chan_removed_rec, noisy_channel_threshold=1)
```
Yeah, having `bad_channel_ids` as a separate argument sounds ok to me.

I think your example wouldn't cause a problem because they are two independent `RemoveBadChannelsRecording` classes, each with their own kwargs. Will check...
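To illustrate why repeated application shouldn't clash, a toy sketch (simplified stand-in classes, not the real recording classes): each call wraps the previous object and stores its own kwargs.

```python
class RemoveBadChannelsLike:
    """Toy stand-in: each preprocessing step wraps its parent recording
    and keeps its own independent kwargs dict."""

    def __init__(self, parent_recording, noisy_channel_threshold):
        self.parent_recording = parent_recording
        self._kwargs = {"noisy_channel_threshold": noisy_channel_threshold}

raw = object()  # stand-in for the original recording
first = RemoveBadChannelsLike(raw, noisy_channel_threshold=5)
second = RemoveBadChannelsLike(first, noisy_channel_threshold=1)

# Two independent wrapper objects, each with their own kwargs:
print(first._kwargs, second._kwargs)
```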
To summarise the problem (and clarify it in my head...):

The difficult bit is that there are two things we want. First, if you load a recording from the `recording.json` file, it should be able to quickly reconstruct the preprocessed recording. For this to work and be fast, we need to save the `bad_channel_ids` in the kwargs and apply them when we load the recording, rather than re-computing.

On the other hand, thinking about the PreprocessingPipeline, we want people to be able to share their pipelines and apply a pipeline from another lab to their own data. To do this, we need to be able to share which kwargs were used to detect the bad channels, but not the bad channels themselves, since different recordings have different bad channels.

So we need to mark that out as a special kwarg: apply it when you load the recording, but don't add it to a PreprocessingPipeline. Does that make sense?
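A sketch of that split (helper names are hypothetical, not the SpikeInterface API): the serialized form used for fast reload keeps everything, while the shareable pipeline description strips the recording-specific entries.

```python
PRECOMPUTABLE = {"bad_channel_ids"}  # recording-specific results

def kwargs_for_reload(kwargs):
    # Fast reload of the *same* recording: keep the detected ids.
    return dict(kwargs)

def kwargs_for_pipeline(kwargs):
    # Shareable pipeline: drop recording-specific results so detection
    # is re-run on the new lab's data.
    return {k: v for k, v in kwargs.items() if k not in PRECOMPUTABLE}

saved = {"bad_channel_ids": ["ch1"], "noisy_channel_threshold": 1.0}
print(kwargs_for_pipeline(saved))  # {'noisy_channel_threshold': 1.0}
```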
Yes that makes total sense
Hello, I think this is sorted now. We've added `_precomputable_kwarg_names = ["bad_channel_ids"]` as a class property, and we'll use this on load or when making a pipeline to include/exclude the bad channel ids down the line.

Note: when you load a saved file from dict, it makes an e.g. `DetectAndRemoveBadChannelsRecording` but skips the detection stage, since it knows the `bad_channel_ids`. Hence you can load this recording very fast!
Let's figure out how to deal with precomputed kwargs before merging this.

We need a strong refactoring to separate kwargs into 2 parts: the parametrisation, and the caching of some heavy computation.
+1 on discussing this. Sam mentioned it at the last meeting. Would be good to bring it back.
Just a quick thought on this: there is the `interpolate_bad_channels` class, but that takes a list of channel ids. Given the function in this PR (and in general) you might expect it to detect-and-interpolate bad channels. I wonder if it makes sense to change that to …
Yes it makes total sense @JoeZiminski. What about having two classes:

They are a bit verbose, but it binds the … @chrishalcrow what do you think?
I think the verbosity makes it very clear that the function is doing two things, which is great! Thanks for the suggestion @JoeZiminski. So you'd use …
Hello @JoeZiminski and @alejoe91 - I've added a …
Nice! That looks great @chrishalcrow, cheers, this will be super useful. I had a quick look at the code and it looks good. I've got a slight preference to refactor the shared code into a new function on …
I've refactored (and moved) the gross …
Hello, I think this is ready @alejoe91
```python
else:
    channel_labels = None

self._main_ids = recording.get_channel_ids()
```
Why is this needed? The `InterpolateBadChannelsRecording` should call the `BasePreprocessor` init, which does that automatically.
Thanks - you're right! It is needed when you `remove_bad_channels`, because `ChannelSliceRecording` needs the list of "good" channels. To make these, we get the difference between all channels (generated by `self.channel_ids`, a method which basically just returns `self._main_ids`) and the bad channels.

But yeah, not needed for interpolate.
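That "good channels" computation is essentially an order-preserving set difference; a sketch (not the actual SpikeInterface code):

```python
def good_channel_ids(all_channel_ids, bad_channel_ids):
    # Keep the original channel order; just drop the bad ones.
    bad = set(bad_channel_ids)
    return [ch for ch in all_channel_ids if ch not in bad]

print(good_channel_ids(["ch0", "ch1", "ch2", "ch3"], ["ch1", "ch3"]))
# ['ch0', 'ch2']
```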
Part of plan to `class`-ify all preprocessing steps.

This PR introduces the very verbose `detect_and_interpolate_bad_channels` function, which will detect then interpolate bad channels, and `detect_and_remove_bad_channels`, which will detect then remove bad channels. The idea is that these can be used in a standard pipeline => can be chained like any other preprocessing step.

Users can now use

and can chain with other steps:

The user can specify the bad channels, to skip the `detect` step. The idea is that this argument can be used when you load from dict. NOTE: when you load a saved recording from dict, it will not recompute the bad channels. This is a good feature.

The most difficult thing was the kwargs. `RemoveBadChannelsRecording` shares much of the same signature as `detect_bad_channels`, and I didn't want to duplicate the default parameters, but did want to save them in `RemoveBadChannelsRecording._kwargs`. Ended up using `signature` from `inspect`, as is also done in the motion module. Happy to discuss different implementations.
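The `inspect` trick looks roughly like this (the `detect_bad_channels` signature below is abbreviated for illustration; the real function has many more parameters):

```python
import inspect

def detect_bad_channels(recording, method="coherence+psd",
                        noisy_channel_threshold=1.0):
    """Abbreviated stand-in for the real detect_bad_channels."""
    ...

# Harvest the default parameters without duplicating them by hand.
sig = inspect.signature(detect_bad_channels)
defaults = {k: v.default for k, v in sig.parameters.items() if k != "recording"}
print(defaults)  # {'method': 'coherence+psd', 'noisy_channel_threshold': 1.0}
```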