Skip to content

Add DetectAndRemoveBadChannelsRecording and DetectAndInterpolateBadChannelsRecording classes #3685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

chrishalcrow
Copy link
Member

@chrishalcrow chrishalcrow commented Feb 12, 2025

Part of plan to class-ifty all preprocessing steps.

This PR introduces the very verbose detect_and_interpolate_bad_channels function which will detect then interpolate bad channels, and detect_and_remove_bad_channels which will detect then remove bad channels.

The idea is that these can be used in a standard pipeline => can be chained like any other preprocessing step.

Users can now use

recording_with_bad_channels_removed = detect_and_remove_bad_channels(raw_recording)

and can chain with other steps:

preprocessed_recording = whiten(detect_and_interpolate_bad_channels(bandpass_filter(raw_recording)))

The user can specify the bad channels, to skip the detect step. Idea is that this argument can be used when you load from dict. NOTE: When you load a saved recording from dict, it will not recompute the bad channels. This is a good feature.

Most difficult thing was the kwargs. RemoveBadChannelsRecording shares much of the same signature as detect_bad_channels, and I didn't want to duplicate the default parameters, but did want to save them in the RemoveBadChannelsRecording._kwargs. Ended up using inspect from signature, as is also done in the motion module. Happy to discuss different implementations.

@chrishalcrow chrishalcrow added enhancement New feature or request preprocessing Related to preprocessing module labels Feb 12, 2025
Copy link
Collaborator

@JoeZiminski JoeZiminski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, few suggestions / questions. Really love this as it solves a big problem for me!

# make sure they are removed
assert len(set(new_rec._kwargs["bad_channel_ids"]).intersection(new_rec.channel_ids)) == 0
# and that the kwarg is propogatged to the kwargs of new_rec.
assert new_rec._kwargs["noisy_channel_threshold"] == 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything to worry about channel ordering here? I guess not (and this is a question for ChannelSliceRecording tests anyways. But as I have no understanding of how the ordering works I thought worth asking 😆

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering of the channel ids?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erm like order of the channels on the recording itself (I'm not sure how exactly this is represented 😅 ). But like the default order when you do plot_traces without order_channel_by_depth

updated_detect_bad_channels_kwargs = {k: v.default for k, v in sig.parameters.items() if k != "recording"}
updated_detect_bad_channels_kwargs.update(detect_bad_channels_kwargs)

bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
bad_channel_ids = detect_bad_channels_kwargs.get("bad_channel_ids")
if bad_channel_ids is None:
bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)

We don't have to rerun bad channel detecion if bad channels have already been computed. Otherwise, if we parallelize this then each process/thread will rerun detection.

We should also keep the ProcessingPipeline in mind. In case we have a processing pipeline, we should differentiate somehow the lazy/non-lazy computation steps.
An option could be to extend the self._kwargs mechanism, and add a self._precomputed_kwargs (in this case this would have the bad_channel_ids).
When loading an object, we could then specify whether we want to use or ignore the precomputed kwargs. When applying the pipeline to another object, you would ignore the precomputed, so that channel detection will be repreformed on the new recording.

Does it make sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you're very right, great point. For the ProcessingPipeline I was thinking that the main distinction is if the parameter is recording-specific or not. If the kwarg (e.g. bad_channel_ids or whitening_matrix - these might be the only examples??) depends on which recording it's applied to, it shouldn't be used by the pipeline. I reckon this is the exact same set of kwargs that could be precomputed, since you'll always need a recording to do a precomputation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_kwargs are handled by the subclasses, so would we need to add _precomputed_kwargs to all the preprocessing subclasses?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be done in the base object. It will need to be there because we will have to handle it upon loading and give the option to add them to the main _kwargs (default), or skip them. @samuelgarcia what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, we could use hasattr to check if a preprocessing class has possible _precomputed_kwargs, nice.

Copy link
Collaborator

@JoeZiminski JoeZiminski Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: sorry I think I am confusing two cases and these are two separate issues:

  1. Might passing a last of channels to remove in detect_bad_channel_kwargs be a bit confusing? Can this be a separate argument? must pass either bad_channel_ids or detect_bad_channel_kwargs?

  2. If I understand correctly the idea would be to store the bad channel ids in self._kwargs and not re-detect in case they were already computed. I meant change the public function kwargs sorry, would this case cause a problem:

chan_removed_rec = remove_bad_channels(recording, noisy_channel_threshold=5)

# do some plotting or something to check the remove channels, and decide want to remove a few more

chan_removed_rec = remove_bad_channels(chan_removed_rec, noisy_channel_threshold=1)

Copy link
Member Author

@chrishalcrow chrishalcrow Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, having bad_channel_ids as a separate argument sounds ok to me.

I think your example wouldn't cause a problem because they are two independent RemoveBadChannelRecording classes each with their own kwargs. Will check...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarise the problem (and clarify in my head...)

The difficult bit is that there are two things we want. First, if you load a recording from the recording.json file, it should quickly be able to reconstruct the preprocessed recording. For this to work and be fast, we need to save the bad_channel_ids in the kwargs and apply them when we load the recording, rather than re-computing.

On the other hand, thinking about the PreprocessingPipeline, we want people to be able to share their pipelines and apply a pipeline from another lab to their own data. To do this, we need to be able to share which kwargs were used to detect the bad channels, but not the bad channels themselves; since different recordings have different bad channels.

So we need to mark out that as a special kwarg: apply it when you load the recording, but don't add it to a PreprocessingPipeline. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that makes total sense

Copy link
Member Author

@chrishalcrow chrishalcrow Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, I think this is sorted now. We've added _precomputable_kwarg_names = ["bad_channel_ids"] as a class property, and we'll use this on load or when making a pipeline to include/not include the bad channel ids down the line.

Note: when you load a saved file from dict, it makes an e.g. DetectAndRemoveBadChannelsRecording but skips the detection stage, since it knows the bad_channel_ids. Hence you can load this recording very fast!

Copy link
Member

@alejoe91 alejoe91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's figure out how to deal with precomputed kwargs before merging this

@samuelgarcia
Copy link
Member

We need a strong refactoring to separate kwargs in 2 parts, the parametrsation and caching some heavy computation.
This is on my todo list since 3 years now...
Lets do it before.

@h-mayorquin
Copy link
Collaborator

We need a strong refactoring to separate kwargs in 2 parts, the parametrsation and caching some heavy computation. This is on my todo list since 3 years now... Lets do it before.

+1 on discussing this.

Sam mentioned it on the last meeting. Would be good to bring it back.

@JoeZiminski
Copy link
Collaborator

Just a quick thought on this, there is the interpolate_bad_channels class, but that takes a list of channel ids. Given the function in this PR (and in general) you might expect it to detect-and-interpolate bad channels. I wonder if it makes sense to change that to interpolate_channels() (similar to recording.remove_channels() and then have remove_bad_channels and interpolate_bad_channels as convenience functions to detect-and-remove. However, this might be a confusing change re: backwards compatibility.

@alejoe91
Copy link
Member

Just a quick thought on this, there is the interpolate_bad_channels class, but that takes a list of channel ids. Given the function in this PR (and in general) you might expect it to detect-and-interpolate bad channels. I wonder if it makes sense to change that to interpolate_channels() (similar to recording.remove_channels() and then have remove_bad_channels and interpolate_bad_channels as convenience functions to detect-and-remove. However, this might be a confusing change re: backwards compatibility.

Yes it makes total sense @JoeZiminski

What about having two classes:

  • DetectAndRemoveBadChannels (function detect_and_remove_bad_channels)
  • DetectAndInterpolateBadChannel (function: detect_and_interpolate_bad_channels)

They are a bit verbose, but it binds the bad_channel_ids of the remove/interpolate part to the detection.

@chrishalcrow what do yo think?

@chrishalcrow
Copy link
Member Author

Just a quick thought on this, there is the interpolate_bad_channels class, but that takes a list of channel ids. Given the function in this PR (and in general) you might expect it to detect-and-interpolate bad channels. I wonder if it makes sense to change that to interpolate_channels() (similar to recording.remove_channels() and then have remove_bad_channels and interpolate_bad_channels as convenience functions to detect-and-remove. However, this might be a confusing change re: backwards compatibility.

Yes it makes total sense @JoeZiminski

What about having two classes:

  • DetectAndRemoveBadChannels (function detect_and_remove_bad_channels)
  • DetectAndInterpolateBadChannel (function: detect_and_interpolate_bad_channels)

They are a bit verbose, but it binds the bad_channel_ids of the remove/interpolate part to the detection.

@chrishalcrow what do yo think?

I think the verboseness makes it very clear that the function is dong two things, which is great! Thanks for the suggestion @JoeZiminski

So you'd use detect_bad_channel when you're playing with parameters and getting things going, then use detect_and_remove_bad_channels once your pipeline is a bit more settled. I think it's a nice workflow.

@chrishalcrow
Copy link
Member Author

Hello @JoeZiminski and @alejoe91 - I've added a detect_and_interpolate_bad_channels function. At the moment, this does a very similar thing to detect_and_remove_bad_channels but calls the InterpolateBadChannels class rather than ChannelSlice.

@JoeZiminski
Copy link
Collaborator

Nice! That looks great @chrishalcrow cheers this will be super useful. I had a quick look at the code and looks good, I've got a slight preference to refactor the shared code:


        # get the default parameters from `detect_bad_channels`, and update with any user-specified parameters.
        sig = signature(detect_bad_channels)
        updated_detect_bad_channels_kwargs = {k: v.default for k, v in sig.parameters.items() if k != "recording"}
        updated_detect_bad_channels_kwargs.update(detect_bad_channels_kwargs)

        if bad_channel_ids is None:
            bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
        else:
            channel_labels = None

        self._main_ids = recording.get_channel_ids()

into a new function on detect_bad_channels.py which is shared between both, but not very strong on that.

@chrishalcrow
Copy link
Member Author

Nice! That looks great @chrishalcrow cheers this will be super useful. I had a quick look at the code and looks good, I've got a slight preference to refactor the shared code:


        # get the default parameters from `detect_bad_channels`, and update with any user-specified parameters.
        sig = signature(detect_bad_channels)
        updated_detect_bad_channels_kwargs = {k: v.default for k, v in sig.parameters.items() if k != "recording"}
        updated_detect_bad_channels_kwargs.update(detect_bad_channels_kwargs)

        if bad_channel_ids is None:
            bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
        else:
            channel_labels = None

        self._main_ids = recording.get_channel_ids()

into a new function on detect_bad_channels.py which is shared between both, but not very strong on that.

I've refactored (and moved) the gross updated_detect_bad_channels_kwargs code. I think the other bits are helpful to keep for understandability.

@chrishalcrow chrishalcrow changed the title Add RemoveBadChannelsRecording class and remove_bad_channels Add DetectAndRemoveBadChannelsRecording and DetectAndInterpolateBadChannelsRecording classes Apr 8, 2025
@chrishalcrow
Copy link
Member Author

Hello, I think this is ready @alejoe91

else:
channel_labels = None

self._main_ids = recording.get_channel_ids()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? The InterpolateBadChannelsRecording should call the BasePreprocessor init which does that automatically

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - you're right! It is needed when you remove_bad_channels, because ChannelSliceRecording needs the list of "good" channels. To make these we get the difference between all units (generated by self.channel_ids which is a method which basically just returns self._main_ids) and the bad units.

But yeah, not needed for interpolate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request preprocessing Related to preprocessing module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants