pmf-pymc example is broken #830

Closed
fonnesbeck opened this issue Sep 19, 2015 · 7 comments

Comments

@fonnesbeck
Member

The referenced data for this example seems not to exist, and print statements are legacy Python.

@twiecki
Member

twiecki commented Sep 24, 2015

ping @macks22

@macks22
Contributor

macks22 commented Sep 24, 2015

I assume you're referring to the "jokes" directory? The csv file, the train/test split and the pmf-map-d5 directory are all present. As for "legacy" print statements, would you like them converted to Python 3.X function-style print statements? If so, will a v4 notebook be compatible or are only < v4 notebooks supported?

@twiecki
Member

twiecki commented Sep 24, 2015

@macks22 That's how I had remembered it. @fonnesbeck why weren't you able to run the files? Perhaps we aren't referencing the correct directory.

Changing to print() is 2 and 3 compatible so that one is an easy choice.

I think ipython nb v4 is fine.
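For reference, a minimal sketch of the print() point: a call with a single parenthesized argument behaves the same on Python 2 and Python 3, so formatting the message into one string first avoids any difference. The helper name `format_score` and the value below are hypothetical and not taken from the notebook. For calls with several arguments, adding `from __future__ import print_function` at the top of the module is the more general route.

```python
def format_score(label, value):
    """Build the whole message as one string (hypothetical helper)."""
    return "%s: %.5f" % (label, value)


# One parenthesized argument: valid and identical on Python 2 and 3.
print(format_score("RMSE", 4.5))
```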

@ColCarroll
Member

This is still broken -- looks like the files can be found here: http://eigentaste.berkeley.edu/dataset/

They are ~97kb. I will add them back in a separate PR from #1444, and update print statements.

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-6-46c88078b60d> in <module>()
      7 # Let's see for ourselves. Load the jokes.
      8 joke_dir = '../data/jokes'
----> 9 files = [os.path.join(joke_dir, fname) for fname in os.listdir(joke_dir)]
     10 jokes = [fname for fname in files if fname.endswith('txt')]
     11 nums = [filter(lambda c: c.isdigit(), fname) for fname in jokes]

FileNotFoundError: [Errno 2] No such file or directory: '../data/jokes'
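A possible Python 2/3 safe version of the failing cell, as a sketch only. The function name `list_joke_files` is hypothetical, and it assumes the files are named `<number>.txt`. It checks for the directory up front, and it avoids the bare `filter(...)` call from line 11 of the traceback, which returns a lazy iterator rather than a string on Python 3.

```python
import os


def list_joke_files(joke_dir):
    """Return {joke_number: path} for every '<number>.txt' file in joke_dir."""
    if not os.path.isdir(joke_dir):
        # IOError exists on Python 2 and is an alias of OSError on Python 3.
        raise IOError("joke directory not found: %r" % joke_dir)
    jokes = {}
    for fname in os.listdir(joke_dir):
        stem, ext = os.path.splitext(fname)
        # Keep only files such as '17.txt'; skip anything else.
        if ext == '.txt' and stem.isdigit():
            jokes[int(stem)] = os.path.join(joke_dir, fname)
    return jokes
```

Calling `list_joke_files('../data/jokes')` would then either return the mapping or fail with a message that names the missing directory.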

@twiecki
Member

twiecki commented Oct 13, 2016

Can we download them on the fly if they don't exist?

@ColCarroll
Member

Looking closer, the files are used in one place, and only to access the
best and worst rated jokes. I see five choices:

  1. It could be removed, but I think it adds more character to the example
    (if we're rating jokes, we should get to see the jokes)
  2. It could be hard coded. This would be sad for anyone who wanted to
    interact with the notebook.
  3. We download on the fly. I can't figure out how to do that in a way that
    is python 2/3 compatible, other than adding the requests library. Also
    there's some light post-processing (the files are available as html, but we
    might want to display them as text).
  4. We add 100 files to pymc3/examples/data/jokes, 1.txt through 100.txt.
    I've already done this, and it runs as expected.
  5. We combine all 100 into jokes.csv, with line number corresponding to
    joke number. This is a <50kb file.
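For choice 3, one way to download on the fly without the requests library might be a guarded import of `urlretrieve`, which lives in `urllib.request` on Python 3 and in `urllib` on Python 2. This is a sketch under assumptions: the helper name `fetch_if_missing` is hypothetical, the per-file URLs are not given in this thread and would have to be filled in, and the html-to-text post-processing is not covered.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2


def fetch_if_missing(url, dest, retrieve=urlretrieve):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    The retrieve argument exists only so the download step can be swapped out.
    """
    if os.path.exists(dest):
        return False
    dest_dir = os.path.dirname(dest)
    if dest_dir and not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    retrieve(url, dest)
    return True


# Example call (placeholder URL, not a real file location):
# fetch_if_missing('http://example.invalid/joke1.html', '../data/jokes/1.html')
```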

I like choice 5 the best.
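Reading such a combined file could look roughly like the sketch below. It assumes one joke per line with no embedded newlines and UTF-8 encoding, so the file is read as plain lines rather than through the csv module; the function name `load_jokes` is hypothetical.

```python
import io


def load_jokes(path):
    """Map 1-based line number (the joke number) to the joke text."""
    # io.open gives the same text-mode behaviour on Python 2 and 3.
    with io.open(path, encoding='utf-8') as handle:
        return {num: line.rstrip('\n')
                for num, line in enumerate(handle, start=1)}
```

Looking up the best or worst rated joke is then a plain dictionary access by joke number.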

Another problem is that pmf.draw_samples(5000) appears to take hours to
run (didn't complete last night). I'm going to lower the number of samples
to just make sure it runs, and maybe someone else can work on the
performance.

@twiecki
Member

twiecki commented Oct 13, 2016

Closed by #1447.

@twiecki twiecki closed this as completed Oct 13, 2016