GenericGBQException may be raised when listing dataset #81

Closed
parthea opened this issue Aug 4, 2017 · 4 comments
Labels
type: process A process-related concern. May include testing, release, or the like.

Comments

@parthea
Contributor

parthea commented Aug 4, 2017

I saw the following error today in a Travis-CI build log when listing the datasets in a project: 'GenericGBQException: Reason: notFound, Message: Not found: Token pandas_gbq_xxx'

@tswast also experienced this in #39 (comment)

Since the failure is intermittent, we may be able to catch the 404 error from BigQuery on the first attempt and retry the list request. I think we should still raise GenericGBQException if the second attempt also fails, though, and monitor for unit test failures.
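
For illustration, the retry-once idea could look roughly like the sketch below. It is not the pandas-gbq implementation: `NotFoundError`, `call_with_one_retry`, and `flaky_list` are hypothetical names, and `NotFoundError` stands in for the 404 `HttpError` shown in the traceback.

```python
class NotFoundError(Exception):
    """Hypothetical stand-in for the 404 HttpError seen in the traceback."""


def call_with_one_retry(request, retryable=(NotFoundError,)):
    """Call request(); on a retryable error, call it exactly once more.

    A failure on the second attempt propagates to the caller unchanged,
    so a persistent 404 still surfaces as an error.
    """
    try:
        return request()
    except retryable:
        return request()


# Usage: a fake listing call that fails on the first attempt only.
attempts = []


def flaky_list():
    attempts.append(1)
    if len(attempts) == 1:
        raise NotFoundError("Not found: Token pandas_gbq_xxx")
    return ["dataset_a", "dataset_b"]


result = call_with_one_retry(flaky_list)
```

Here the second call succeeds, so `result` holds the two dataset names. A request that fails twice would raise on the second attempt, which matches the proposal to give up after one retry.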

==================================== ERRORS ====================================
 ERROR at teardown of TestToGBQIntegrationWithServiceAccountKeyPath.test_dataset_exists 

self = <pandas_gbq.gbq._Dataset object at 0x7f358b3736d8>

    def datasets(self):
        """ Return a list of datasets in Google BigQuery
    
            Parameters
            ----------
            None
    
            Returns
            -------
            list
                List of datasets under the specific project
            """
    
        dataset_list = []
        next_page_token = None
        first_query = True
    
        while first_query or next_page_token:
            first_query = False
    
            try:
                list_dataset_response = self.service.datasets().list(
                    projectId=self.project_id,
>                   pageToken=next_page_token).execute()

pandas_gbq/gbq.py:1247: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<googleapiclient.http.HttpRequest object at 0x7f358b352588>,)
kwargs = {}

    @functools.wraps(wrapped)
    def positional_wrapper(*args, **kwargs):
        if len(args) > max_positional_args:
            plural_s = ''
            if max_positional_args != 1:
                plural_s = 's'
            message = ('{function}() takes at most {args_max} positional '
                       'argument{plural} ({args_given} given)'.format(
                           function=wrapped.__name__,
                           args_max=max_positional_args,
                           args_given=len(args),
                           plural=plural_s))
            if positional_parameters_enforcement == POSITIONAL_EXCEPTION:
                raise TypeError(message)
            elif positional_parameters_enforcement == POSITIONAL_WARNING:
                logger.warning(message)
>       return wrapped(*args, **kwargs)

../../../miniconda/envs/test-environment/lib/python3.6/site-packages/oauth2client/_helpers.py:133: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <googleapiclient.http.HttpRequest object at 0x7f358b352588>
http = <google_auth_httplib2.AuthorizedHttp object at 0x7f358b378128>
num_retries = 0

    @util.positional(1)
    def execute(self, http=None, num_retries=0):
      """Execute the request.
    
        Args:
          http: httplib2.Http, an http object to be used in place of the
                one the HttpRequest request object was constructed with.
          num_retries: Integer, number of times to retry with randomized
                exponential backoff. If all retries fail, the raised HttpError
                represents the last request. If zero (default), we attempt the
                request only once.
    
        Returns:
          A deserialized object model of the response body as determined
          by the postproc.
    
        Raises:
          googleapiclient.errors.HttpError if the response was not a 2xx.
          httplib2.HttpLib2Error if a transport error has occured.
        """
      if http is None:
        http = self.http
    
      if self.resumable:
        body = None
        while body is None:
          _, body = self.next_chunk(http=http, num_retries=num_retries)
        return body
    
      # Non-resumable case.
    
      if 'content-length' not in self.headers:
        self.headers['content-length'] = str(self.body_size)
      # If the request URI is too long then turn it into a POST request.
      if len(self.uri) > MAX_URI_LENGTH and self.method == 'GET':
        self.method = 'POST'
        self.headers['x-http-method-override'] = 'GET'
        self.headers['content-type'] = 'application/x-www-form-urlencoded'
        parsed = urlparse(self.uri)
        self.uri = urlunparse(
            (parsed.scheme, parsed.netloc, parsed.path, parsed.params, None,
             None)
            )
        self.body = parsed.query
        self.headers['content-length'] = str(len(self.body))
    
      # Handle retries for server-side errors.
      resp, content = _retry_request(
            http, num_retries, 'request', self._sleep, self._rand, str(self.uri),
            method=str(self.method), body=self.body, headers=self.headers)
    
      for callback in self.response_callbacks:
        callback(resp)
      if resp.status >= 300:
>       raise HttpError(resp, content, uri=self.uri)
E       googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/[secure]/datasets?pageToken=pandas_gbq_923881&alt=json returned "Not found: Token pandas_gbq_923881">

../../../miniconda/envs/test-environment/lib/python3.6/site-packages/googleapiclient/http.py:840: HttpError

During handling of the above exception, another exception occurred:

self = <pandas_gbq.tests.test_gbq.TestToGBQIntegrationWithServiceAccountKeyPath object at 0x7f358b4293c8>
method = <bound method TestToGBQIntegrationWithServiceAccountKeyPath.test_dataset_exists of <pandas_gbq.tests.test_gbq.TestToGBQIntegrationWithServiceAccountKeyPath object at 0x7f358b4293c8>>

    def teardown_method(self, method):
        # - PER-TEST FIXTURES -
        # put here any instructions you want to be run *AFTER* *EVERY* test is
        # executed.
>       clean_gbq_environment(self.dataset_prefix, _get_private_key_path())

pandas_gbq/tests/test_gbq.py:949: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas_gbq/tests/test_gbq.py:116: in clean_gbq_environment
    all_datasets = dataset.datasets()
pandas_gbq/gbq.py:1263: in datasets
    self.process_http_error(ex)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ex = <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/[secure]/datasets?pageToken=pandas_gbq_923881&alt=json returned "Not found: Token pandas_gbq_923881">

    @staticmethod
    def process_http_error(ex):
        # See `BigQuery Troubleshooting Errors
        # <https://cloud.google.com/bigquery/troubleshooting-errors>`__
    
        status = json.loads(bytes_to_str(ex.content))['error']
        errors = status.get('errors', None)
    
        if errors:
            for error in errors:
                reason = error['reason']
                message = error['message']
    
                raise GenericGBQException(
>                   "Reason: {0}, Message: {1}".format(reason, message))
E               pandas_gbq.gbq.GenericGBQException: Reason: notFound, Message: Not found: Token pandas_gbq_923881

pandas_gbq/gbq.py:450: GenericGBQException
@parthea parthea added the type: process A process-related concern. May include testing, release, or the like. label Aug 4, 2017
@tswast
Collaborator

tswast commented Aug 4, 2017

This is a backend issue. The backend issues a token for the next page, which happens to be the name of the next dataset to list. When you give the token back and that dataset does not exist, the list command returns a 404 (I consider this a bug).

Since table and dataset listing is eventually consistent, this error could occur even if you aren't listing and deleting datasets in parallel.
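
To make the failure mode concrete, here is a toy in-memory model of the behavior described above. It is only an assumption-laden sketch, not BigQuery's actual implementation: `ToyBackend`, `TokenNotFound`, and the page size are all made up for illustration.

```python
class TokenNotFound(Exception):
    """Hypothetical stand-in for the backend's 404 response."""


class ToyBackend:
    """Toy listing backend whose page token is the next dataset's name."""

    def __init__(self, names, page_size=2):
        self.names = sorted(names)
        self.page_size = page_size

    def list_page(self, page_token=None):
        if page_token is None:
            start = 0
        elif page_token in self.names:
            start = self.names.index(page_token)
        else:
            # The token names a dataset that is not there: report a 404.
            raise TokenNotFound("Not found: Token {0}".format(page_token))
        end = start + self.page_size
        page = self.names[start:end]
        next_token = self.names[end] if end < len(self.names) else None
        return page, next_token

    def delete(self, name):
        self.names.remove(name)


backend = ToyBackend(["pandas_gbq_1", "pandas_gbq_2", "pandas_gbq_3"])
first_page, token = backend.list_page()
backend.delete(token)  # the dataset named by the token disappears between pages

try:
    backend.list_page(token)
    error_message = None
except TokenNotFound as exc:
    error_message = str(exc)
```

In this model the second page request fails with a "Not found: Token ..." message of the same shape as the one in the build log.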

@parthea
Contributor Author

parthea commented Aug 4, 2017

Thanks for the explanation, @tswast! That makes sense. My initial thought is that we should work around this problem in pandas-gbq by retrying the listing command. What do you think?

@tswast
Collaborator

tswast commented Aug 4, 2017

Yeah, we can retry. Per the BigQuery SLA, we should wait 1 second before the first retry and use exponential backoff after that.
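
A minimal sketch of that schedule, assuming a 1 second initial delay that doubles after each failed attempt. The helper name `retry_with_backoff`, the `NotFoundError` class, and the default of four attempts are assumptions for illustration, not part of pandas-gbq or the BigQuery client.

```python
import time


class NotFoundError(Exception):
    """Hypothetical stand-in for the 404 HttpError from the listing call."""


def retry_with_backoff(request, max_attempts=4, initial_delay=1.0,
                       sleep=time.sleep, retryable=(NotFoundError,)):
    """Call request(), sleeping 1s, 2s, 4s, ... between failed attempts.

    The last failure is re-raised once max_attempts is reached.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except retryable:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay *= 2


# Usage: record the delays instead of actually sleeping.
delays = []
calls = []


def fails_twice():
    calls.append(1)
    if len(calls) <= 2:
        raise NotFoundError("Not found: Token pandas_gbq_xxx")
    return ["dataset_a"]


datasets = retry_with_backoff(fails_twice, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waits. A production version would likely also add random jitter to the delay.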

@parthea
Contributor Author

parthea commented Apr 21, 2020

Closing as obsolete.

@parthea parthea closed this as completed Apr 21, 2020

2 participants