-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Return dictionaries instead of tuples #959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
For reference, tuples vs dicts was discussed briefly (with no immediate resolution) in this thread: #784 (comment) |
PD: Since what I am proposing is a breaking change... I also think that reducing the number of parameters in a function is good (i.e.: |
My preference would be to use keyword arguments instead, eg:
|
In my experience maintaining pvlib, it's been easier to deal with things that require order (positional args and returned tuples) than things that require labels (keyword args and returned dicts or DataFrames). I do use a lot of dicts, Series, and DataFrames in my work, so I certainly appreciate their general value. It just seems they're harder for a more general purpose library. As I mentioned in the linked issue, I'd really like to read a blog post about this kind of API design question in scientific computing. It's also not clear to me that you'd actually need to switch to keyword arguments to achieve the input simplification you're after. Examples showing that it can be done with positional arguments: In [1]: def f(a, b):
...: print(a, b)
...:
In [2]: f(b=2, a=1)
1 2
In [3]: d = dict(a=1, b=2)
In [4]: f(**d)
1 2
In [5]: t = (1, 2)
In [6]: f(*t)
1 2 The test simplification could work with a tuple too: assert (1., 5.) == pytest.approx((1., 5.)) |
Will you explained what I was trying to say, but much better, thanks! |
@mikofski I have no strong opinion on packing/unpacking parameters. 😊 @wholmgren Yeah, tests simplification can be done either way, I should not have brought it to this issue. 👍 @wholmgren In my opinion positional args should be enforced then. That is possible in Python 3.8, but not before. Also, positional arguments must come before keyword arguments, and that may require further API modifications. In example: # Assuming `cec_params` is a tuple...
cec_params = fit_sdm_cec_sam(...)
# This works fine
calcparams_cec(
1000,
25,
0.004,
2.5,
*cec_params,
...,
)
# This does not
calcparams_cec(
effective_irradiance=1000,
temp_cell=25,
alpha_sc=0.004,
a_ref=2.5,
*cec_params,
...,
) You can't just use keyword parameters before the To summarize, I think returning a dict is better and I do not mind that much about |
A worthy discussion. |
ICYMI: @kanderso-nrel and @CameronTStark - could be useful to get your opinions on this, if you care to weigh in. |
On the subject of passing groups of values into a function as a unit vs individually -- if the values are coherent enough to interact as a group in several parts of the API, I wonder if they would be better grouped as a simple object like scipy's OptimizeResult instead of a dictionary. For positional-only arguments, I don't think it's a good idea at this time. For one, it would require a big bump in minimum python version, and for two I don't think pvlib satisfies any of the use cases listed in PEP 570. For returning dicts vs tuples... a 6-element tuple is a bit unwieldy. On the other hand, returning a dictionary for two values would seem silly. I notice
|
@wholmgren 's description of the situation is good although I would prefer to have a dictionary on the output if there are more than two values being returned. This makes it possible to enable easy transfer of the output of one function into another without careful positioning. In [1]: def outputs_dictionary():
...: return dict(this="foo", that="bar", other_thing="baz")
In [2]: def uses_output_dictionary(first_arg, that, other_thing, this, last_arg):
...: print(this, that, other_thing)
In [3]: example_dictionary = outputs_dictionary()
In [4]: example_dictionary
Out[4]: {'this': 'foo', 'that': 'bar', 'other_thing': 'baz'}
In [5]: uses_output_dictionary(42, **example_dictionary, last_arg=1337)
foo bar baz
In [6]: uses_output_dictionary(last_arg=1337, **example_dictionary, first_arg=42)
foo bar baz This is useful in interactive data analysis as well as in programmatic scripting to unify the API. Having sensible defaults for these arguments, when possible, makes this even easier and is also useful in enabling new users a path forward when they'd otherwise have no idea what value to insert. |
I don't think we're going to get any farther arguing in the abstract. Can the proponents of an API change provide a list of functions for which they think it should change? Are we only talking about the single diode functions? While we're at it, we might also consider functions that return labeled data (either as DataFrames or dicts) that could simply be ordered instead.
I'm not aware of any pvlib functions that require input for which there exists reasonable defaults, but they exist then we should address them in separate issues. |
I'm late to this conversation, but I lean against the suggestion to return dicts instead of admittedly long tuples. Two reasons, which have already been mentioned:
I do strongly favor changing the API so that positional arguments are in a consistent order - looking at the single diode functions in particular. |
It is expected that pvlib's key names do not match other packages'. Not a huge hassle in my opinion to handle it with dicts vs. tuples (I assume other packages will have different order in their parameters as well, of course): # Returning dicts
d = pvlib_function()
non_pvlib_function(x=d['bar'], y=d['foo'], z=d['baz'])
# Returning tuples
foo, bar, baz = pvlib_function()
non_pvlib_function(x=bar, y=foo, z=baz) On the other hand, as an end-user, I would be much more scared if I had to rely on the order. Of course I would assume that pvlib would take care of updating other pvlib functions if the order changed in the future, so this will always work: t = pvlib_function()
another_pvlib_function(*t, ...) But any change in the order in the future, for any reason, will break my code without raising an exception if tuples are used instead of dicts: foo, bar, baz = pvlib_function() # Now this should be "bar, baz, foo" instead
non_pvlib_function(x=bar, y=foo, z=baz) # Silently yields wrong results So, as an end user, I would vote against tuples. I would prefer my code to crash blatantly if the API changed in the future. On the other hand, if what you are proposing is to freeze the API once this is decided (i.e.: the order will never change no matter what), then I take everything I just said back. I just don't know how stable you consider pvlib's API to be. 😊 |
I second @wholmgren on this:
|
My proposal was about the SDM functions in particular. Since my pvlib's API knowledge is very limited, I do not know if it should extend to other parts of the API, so I would feel unable to provide a list of functions. I hope this issue ends up not causing more noise than it helps. ❤️ |
@Peque please take a look at the changes in #708, and let me know how we can improve the For the record, I regret that I did not first reorganize |
@cwhanse 👍 In my opinion this could be improved:
Minor suggestions:
|
If all the SDM related functions become compatible with each other with just a few changes, I'm in favor. As much as I like to see a layer of functions for everything at the base of pvlib (and as much as I like to limit myself to using only that layer) I do share the view of @kanderso-nrel that classes could provide a nice solution here. |
I don't really have a strong opinion here, and there are already a lot of voices, but I will just mention this: beware of classes. They can be useful if used thoughtfully, but they
As usually I may be completely wrong about this but I have seen otherwise. PVMismatch is a case in point. With cells as a "class" we have to do a lot of secret magic to keep the memory load down. This might be less of an issue if the cells' info was all in one giant NumPy array or other efficient memory container. |
I found it a bit inconvenient to have
fit_sdm_cec_sam()
return a tuple. I think returning a dictionary with['I_L_ref', 'I_o_ref', 'R_sh_ref', 'R_s', 'a_ref', 'Adjust']
keys would be much more convenient and safe.The same goes for
calcparams_cec()
, which could return a dictionary with['photocurrent', 'saturation_current', 'resistance_series', 'resistance_shunt', 'nNsVth']
keys instead.I think that is more Pythonic (explicit is better than implicit and readability counts).
Also, more convenient API-wise:
The tests could be simplified as well. They currently look like:
They could look like:
The text was updated successfully, but these errors were encountered: