Show progress bar while fitting to training data #1606

Merged on Nov 14, 2022 (10 commits)

Conversation

aron-bram
Collaborator

@aron-bram aron-bram commented Nov 9, 2022

Closes #1599

Motivation:
Building an AutoML model and calling the fit() function gives no indication of how much time is left in the optimization process. This can be confusing, especially for new users who are not familiar with the API.

Solution
A new class called ProgressBar was added under autosklearn/utils. It is a subclass of Thread that starts itself on instantiation and displays a tqdm progress bar in the console.
ProgressBar is instantiated at the very beginning of the AutoML fit() method and stopped in the finally block of the outermost try block, so only minimal changes to the existing functions were needed.
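
For readers who want a feel for this approach, here is a minimal sketch of a self-starting progress-bar thread. It is not the code merged in this PR: the constructor arguments (total, update_interval, disable), the ticks counter, and the stop() method are assumptions made for illustration, and the sketch falls back to a silent counter when tqdm is not installed.

```python
import threading
import time

try:
    from tqdm import tqdm
except ImportError:  # assumption: keep the sketch runnable without tqdm
    tqdm = None


class ProgressBar(threading.Thread):
    """Hypothetical sketch: advance a bar once per interval until the budget is spent."""

    def __init__(self, total: int, update_interval: float = 1.0, disable: bool = False):
        super().__init__(name="progressbar", daemon=True)
        self.total = total                      # number of ticks in the time budget
        self.update_interval = update_interval  # seconds between ticks
        self.disable = disable
        self.ticks = 0
        self._stop_requested = threading.Event()
        if not disable:
            self.start()  # the thread starts itself on instantiation

    def run(self) -> None:
        bar = tqdm(total=self.total, desc="Fitting") if tqdm is not None else None
        try:
            while self.ticks < self.total:
                # Event.wait() acts as an interruptible sleep; it returns True
                # as soon as stop() has been called.
                if self._stop_requested.wait(self.update_interval):
                    break
                self.ticks += 1
                if bar is not None:
                    bar.update(1)
        finally:
            if bar is not None:
                bar.close()

    def stop(self) -> None:
        if self.disable:
            return  # the thread was never started, so there is nothing to join
        self._stop_requested.set()
        self.join()


# Usage mirrors the description above: start first, stop in a finally block.
bar = ProgressBar(total=3, update_interval=0.01)
try:
    time.sleep(0.1)  # stand-in for the real fit() work
finally:
    bar.stop()
```

Because the bar is driven by wall-clock time rather than by completed work, it needs no hooks into the optimizer, which is consistent with the small footprint described above.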

Controversial
This feature is on by default, meaning that the console will show a progress bar unless it is turned off by setting the newly introduced disable_progress_bar parameter of AutoSklearnEstimator to True.
I believe it is more sensible to have it on by default, since the feature mainly aims to benefit new users and, in my experience, it greatly improves the user experience.

Potential introduced bugs
I noticed that the fit() method doesn't always stop after hitting "stop" a single time within my IDE (PyCharm). It requires the "stop" button to be clicked two times to terminate properly.
Nonetheless, I believe this is how autosklearn used to work before I introduced my feature.
This needs further investigation in the future.

Dependency
The only new dependency required by this feature is tqdm, which is a lightweight library.

Test
This feature was only tested manually. No additional tests were written for it, because I couldn't think of a reasonable way to test it.

Side note
An alternative solution that I tried was a context manager callback class called ProgressBar, which subclassed IncorporateRunResultCallback. In the __call__() method of this class I updated the progress bar manually by the amount of time it took SMAC to train a sampled model. On entering a with block, my class returned an instance of itself, and I passed this callback instance to SMAC's trials_callback parameter.
I found this alternative hard to manage and inelegant, since not only the time SMAC spends training estimators but also the time spent in other processes has to be accumulated.
This is why I went with the solution above instead.
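
To illustrate why the callback route falls short, here is a rough, dependency-free sketch of that design. It is not the discarded code: CallbackProgress is a hypothetical name, the __call__ signature and the result.time attribute are assumptions modelled loosely on SMAC's callback interface, and a SimpleNamespace stands in for the run result.

```python
from types import SimpleNamespace
from typing import Optional


class CallbackProgress:
    """Hypothetical sketch: progress driven only by per-run training time."""

    def __init__(self, total_seconds: float):
        self.total_seconds = total_seconds
        self.elapsed = 0.0

    def __enter__(self) -> "CallbackProgress":
        self.elapsed = 0.0
        return self  # the instance itself is handed to the optimizer

    def __exit__(self, exc_type, exc, tb) -> None:
        # A real progress bar would be closed here.
        return None

    def __call__(self, smbo, run_info, result, time_left) -> Optional[bool]:
        # Only the time reported for this run is visible to the callback.
        # Time spent outside of model training is never added, so the bar
        # lags behind the real wall-clock budget.
        self.elapsed = min(self.total_seconds, self.elapsed + result.time)
        return None

    @property
    def fraction(self) -> float:
        return self.elapsed / self.total_seconds


with CallbackProgress(total_seconds=60.0) as progress:
    # Stand-in for the optimizer invoking the callback after each run.
    for run_seconds in (10.0, 20.0):
        progress(smbo=None, run_info=None,
                 result=SimpleNamespace(time=run_seconds), time_left=0.0)
```

Any overhead between runs would have to be measured and added separately, which is the bookkeeping burden the paragraph above describes.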

@aron-bram aron-bram added the enhancement A new improvement or feature label Nov 9, 2022
@aron-bram aron-bram self-assigned this Nov 9, 2022
@eddiebergman
Contributor

eddiebergman commented Nov 10, 2022

> I noticed that the fit() method doesn't always stop after hitting "stop" a single time within my IDE (PyCharm). It requires the "stop" button to be clicked two times to terminate properly.
> Nonetheless, I believe this is how autosklearn used to work before I introduced my feature.
> This needs further investigation in the future.

For extra context, the stop button in PyCharm is equivalent to Ctrl+C: it raises a KeyboardInterrupt, which is not handled properly somewhere. I suspect the culprit is SMAC.

@eddiebergman eddiebergman left a comment

Tests are still broken:

  • For the test workflows, we discussed that installing pytest-forked worked for you, and it seems we're not really using pytest-xdist. We can remove the dependency on pytest-xdist and replace it with pytest-forked. You should run it locally with the same arguments and check that it works:
    $PYTHON -m pytest --forked --durations=20 --timeout=600 --timeout-method=thread -s test
    We can later do a PR to utilize more cores if we need.
  • You can ignore the doc error for now until we update SMAC.

@codecov

codecov bot commented Nov 12, 2022

Codecov Report

Merging #1606 (143b7c4) into development (ac745c0) will decrease coverage by 1.59%.
The diff coverage is 89.65%.

Additional details and impacted files
@@               Coverage Diff               @@
##           development    #1606      +/-   ##
===============================================
- Coverage        84.91%   83.31%   -1.60%     
===============================================
  Files              155      156       +1     
  Lines            11898    11927      +29     
  Branches          2058     1896     -162     
===============================================
- Hits             10103     9937     -166     
- Misses            1250     1424     +174     
- Partials           545      566      +21     

@eddiebergman eddiebergman merged commit 6a97f72 into development Nov 14, 2022
@aron-bram aron-bram deleted the progress_bar branch November 16, 2022 09:13
@Proryanator

Proryanator commented Nov 11, 2023

Was this update released into autosklearn? I'm using version 0.15.0 and I'm not seeing any 'progressbar' related code in autosklearn/automl.py for example.

Edit: it looks like it's in 0.16.0dev actually, I can give that a try.

Labels
enhancement A new improvement or feature
Development

Successfully merging this pull request may close these issues.

Indicate progress during optimization
3 participants