Failure on large TIGER requests #126

Open
ronnie-llamado opened this issue Apr 11, 2021 · 2 comments
Labels
API Induced A change in the API, either revising existing products or moving to new products, caused this.

Comments

@ronnie-llamado
Member

Failure

In test_functional_products.py:

aus = dectest.from_msa("Austin, TX", level="block", variables=["^P003", "P001001"])

Fails and returns:

KeyError: 'Response from API is malformed. You may have submitted too many queries,
formatted the request incorrectly, or experienced significant network connectivity issues. 
Check to make sure that your inputs, like placenames, are spelled correctly, and that 
your geographies match the level at which you intend to query. The original error from 
the Census is:\\n(API ERROR 500:Error performing query operation([]))'

The last recorded pass of this test appears to be on 21 Jan 2021 (see build: #1463.1).

Diagnosis

Error performing query operation

According to Esri's support, a map service returns "Error performing query operation" when an extremely large response fails (Source). The article states that the default maximum response size is 64 MB.

We're able to adjust the number of returned results using the MapServer's resultRecordCount, so I adjusted it until the request failed.

| Metropolitan Statistical Area (MSA) | Total Features | Features Before Failure | Size Before Failure (MB) |
|---|---|---|---|
| Austin, TX | 42159 | 22000 | 31.3 |
| Los Angeles-Long Beach-Anaheim, CA | 169626 | 36000 | 31.8 |
| Carson City, NV | 2354 | N/A | N/A |

My limited tests point to a 32MB limit instead of Esri's stated 64MB default, so there may have been an update server-side. This failure might require a little more rework if it's confirmed.
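
For anyone who wants to repeat the probing, here is a minimal sketch of the loop. It is not the exact script behind the table above. `succeeds` is a hypothetical callable that sends one query with the given resultRecordCount and reports whether it came back without the 500 error; the step of 1000 is also an assumption.

```python
from typing import Callable, Optional


def features_before_failure(
    succeeds: Callable[[int], bool], total: int, step: int = 1000
) -> Optional[int]:
    """Raise the record count in fixed steps until a request fails.

    Returns the last count that succeeded, or None if no request failed
    (the "N/A" case, where the whole result fits in one response).
    """
    last_ok = 0
    for count in range(step, total + step, step):
        count = min(count, total)  # never ask for more than exists
        if not succeeds(count):
            return last_ok
        last_ok = count
    return None
```

Swapping the fixed step for a bisection would need fewer requests, at the cost of slightly more logic.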

@ljwolf
Member

ljwolf commented Apr 20, 2021

Yeah, this is definitely an update server-side. I think we'll have to figure out a chunked way to get the features now, to power this kind of thing :/

I'll merge the fix in #125 anyway and start thinking about a large-query fix. What we do in the "data" api is to split the query at 50 columns, then just make repeated requests (with a small delay). In this case, we'd need to (1) grab the records within the envelope and (2) split those into chunks based on an estimated request size (which... not sure what the heuristic should be) and then (3) request those in serial.
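
One possible heuristic for step (2), purely as a sketch: fetch a small sample, measure its size per feature, and pick a chunk size that stays under a budget. The 32 MB budget comes from the limited tests above and is unconfirmed, the 0.5 safety factor is arbitrary, and `records_per_request` is a hypothetical name, not an existing cenpy function.

```python
def records_per_request(
    sample_bytes: float,
    sample_count: int,
    budget_bytes: float = 32 * 1024**2,  # assumed server limit, unconfirmed
    safety: float = 0.5,                 # arbitrary headroom factor
) -> int:
    """Estimate how many features fit in one response under the budget."""
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    bytes_per_feature = sample_bytes / sample_count
    # Always request at least one record, even for very large features.
    return max(1, int(budget_bytes * safety / bytes_per_feature))
```

Feature size varies a lot between areas (the two failing rows in the table work out to roughly 0.9 KB and 1.4 KB per feature), so the sample should come from the same query it is meant to size.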

@ronnie-llamado ronnie-llamado added the API Induced A change in the API, either revising existing products or moving to new products, caused this. label May 4, 2021
@ronnie-llamado
Member Author

Esri's map service query exposes some parameters that would apply here: returnCountOnly, resultOffset and resultRecordCount (source: Esri Documentation).

A rough version to pull large queries:

  1. Query number of records with returnCountOnly
  2. Query n records at a time until complete with resultOffset and resultRecordCount

That still doesn't address estimating the response size (I'm still unsure about that), but it simplifies the logic compared with splitting envelopes.
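
The two steps above could be sketched roughly as follows. This is an illustration under assumptions, not tested against the TIGER service: `query` is a hypothetical callable that sends one request with the given parameters and returns the parsed JSON, and `page_params` and `fetch_all` are made-up names. It also assumes the layer supports pagination and returns records in a stable order between requests.

```python
import time
from typing import Any, Callable, Dict, Iterator, List


def page_params(total: int, page_size: int) -> Iterator[Dict[str, int]]:
    """Yield resultOffset/resultRecordCount pairs that cover `total` records."""
    for offset in range(0, total, page_size):
        yield {"resultOffset": offset, "resultRecordCount": page_size}


def fetch_all(
    query: Callable[[Dict[str, Any]], Dict[str, Any]],
    base_params: Dict[str, Any],
    page_size: int = 2000,
    delay: float = 0.0,
) -> List[Any]:
    """Count the matching records, then request them one page at a time."""
    # Step 1: ask only for the number of matching records.
    total = query(dict(base_params, returnCountOnly="true"))["count"]

    # Step 2: request the pages in serial, pausing between requests.
    features: List[Any] = []
    for page in page_params(total, page_size):
        features.extend(query(dict(base_params, **page))["features"])
        if delay:
            time.sleep(delay)
    return features
```

The `delay` argument mirrors the small pause the "data" api uses between repeated requests; `page_size` would ideally come from whatever size estimate we settle on.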
