ENH: Function to_gbq(): Dissociate the project ID to run the BQ job from the table project #321
Comments
You can actually even do it right now by attaching the project_id in front of the |
Hi @ShantanuKumar, many thanks for your suggestion. I actually tried this, but got an error because the destination table must be exactly in the form |
You need to put backticks around the name, so pass something like destination_table = "`project_id.dataset.tablename`"
Hi @ShantanuKumar, many thanks for your comment.
And got this error message:
Thus, it seems that
Please let me know if you need further information. |
I think you are right. It only works when running query using
It doesn't work when you try to write to a table using |
Dear @ShantanuKumar, thank you very much for your nice clarification. |
Dear @tswast, is there any update on this issue? In my personal opinion, keeping issues open without any progress for ~5 months does not make much sense. I would proceed to close this issue if there is no progress in the next few weeks. Many thanks for your understanding. Best regards.
We are open to pull requests to add a project ID parameter to |
Dear @tswast, many thanks for your answer. I will take a look to see if I can implement it myself. Could you maybe suggest how you would approach this task? In other words, which files should be changed in order to implement this fix? Is there anything important I should take into consideration? Many thanks and best regards.
This is the problematic line: It needs to be changed to accept a project ID. There will need to be some refactoring to add a project ID parameter to this function. I recommend using TableReference.from_string, where |
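As a rough illustration of the idea above, here is a minimal stand-alone sketch of splitting an optional project prefix off the destination table. It is not the pandas-gbq code: split_destination is a hypothetical helper, written in plain Python to mimic what TableReference.from_string with a default_project argument does in google-cloud-bigquery, and the project, dataset and table names are made up.

```python
def split_destination(destination_table, default_project):
    """Split 'dataset.table' or 'project.dataset.table' into its parts.

    Hypothetical helper for illustration only. When no project prefix is
    given, the table is assumed to live in `default_project` (the project
    that also runs the job). Project IDs that themselves contain dots are
    not handled in this sketch.
    """
    parts = destination_table.split(".")
    if len(parts) == 3:
        # Fully qualified: the table project may differ from the job project.
        project, dataset, table = parts
    elif len(parts) == 2:
        # Current behaviour: fall back to the single project_id.
        project = default_project
        dataset, table = parts
    else:
        raise ValueError(
            "destination_table must be 'dataset.tablename' or "
            "'project.dataset.tablename', got: {!r}".format(destination_table)
        )
    return project, dataset, table


# The job (billing) project stays separate from the table project.
job_project = "billing-project"
table_project, dataset_id, table_id = split_destination(
    "data-project.my_dataset.my_table", default_project=job_project
)
```

With this kind of split, a two-part name keeps today's behaviour, while a three-part name lets the table live in a different project from the one passed as project_id.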
Dear @tswast, cool, many thanks for your advice. I will take a closer look and see what I can do 😄 Best regards 👍
I think it would be very helpful in certain situations to differentiate between the project where the BQ job is run and the project in which the table is located (especially because of cost billing).
As far as I understand, this is currently not possible when calling the function to_gbq (link), because it takes only one project_id parameter as input, and the destination_table is in the form dataset.tablename.
For instance, in the python-bigquery API from Google, you can do this by using the method load_table_from_dataframe (link), which receives as input a TableReference object (link) (with a project_id, a dataset_id and a table_id), while being called from a client running jobs in a separate project_id.
I think this enhancement would be very useful in situations where splitting the costs among projects is important, while maintaining the availability of the data among projects and/or departments (which is the case in certain organizations).
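To make the python-bigquery approach concrete, here is a minimal sketch under a few assumptions: load_to_other_project is a hypothetical wrapper, the project, dataset and table names are placeholders, and the client is expected to be a google.cloud.bigquery.Client created for the project that should run (and be billed for) the job. It relies on load_table_from_dataframe also accepting the destination as a "project.dataset.table" string.

```python
def load_to_other_project(client, dataframe, table_project, dataset_id, table_id):
    """Load `dataframe` into a table that may live outside the client's project.

    Hypothetical wrapper for illustration. `client` is assumed to be a
    google.cloud.bigquery.Client whose own project runs the load job,
    while `table_project` is the project that owns the destination table.
    """
    # Fully qualified destination, independent of the client's project.
    destination = "{}.{}.{}".format(table_project, dataset_id, table_id)
    job = client.load_table_from_dataframe(dataframe, destination)
    # Block until the load job finishes and return its result.
    return job.result()


# Usage (requires google-cloud-bigquery and valid credentials):
#
#   from google.cloud import bigquery
#   client = bigquery.Client(project="billing-project")
#   load_to_other_project(client, df, "data-project", "my_dataset", "my_table")
```

Passing the client in as an argument keeps the billing project and the table project visibly separate, which is the distinction requested for to_gbq.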
Many thanks in advance for considering this issue, and please excuse me if I have misunderstood something about the functionality of this method.
Best regards.