Improved API for document search and export #4792

Open
tpluscode opened this issue May 5, 2024 · 0 comments
Please excuse the single ticket for multiple related problems. I'll create separate ones if you'd prefer that.

Is your feature request related to a problem? Please describe.

I enabled API access to programmatically export documents. I found that I can retrieve project documents and then export annotations in the desired format. However, there are some shortcomings:

  1. The /api/aero/v1/projects/{projectId}/documents endpoint returns all documents, so filtering has to happen on the client side, for example to keep only documents with state=CURATION-COMPLETE.
  2. Documents can only be exported one by one with the /api/aero/v1/projects/{projectId}/documents/{documentId} endpoint.
  3. When exporting, the response is always Content-Type: application/octet-stream. Additionally, sending an Accept header causes status 406 Not Acceptable, even if a matching media type is requested. I consider both of these to be bugs.
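
For reference, here is a rough sketch of the client-side workaround for points 1 and 2. It rests on assumptions that are not confirmed above: that the document list arrives as JSON under a `body` key, that each entry carries `id` and `state` fields, and that the export format is chosen with a `format` query parameter. The helper names are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_PREFIX = "/api/aero/v1"


def documents_url(base_url, project_id):
    """URL of the document list endpoint for one project."""
    return f"{base_url}{API_PREFIX}/projects/{project_id}/documents"


def export_url(base_url, project_id, document_id, fmt):
    """URL for exporting a single document.

    Assumption: the export format is selected with a `format` query parameter.
    """
    query = urlencode({"format": fmt})
    return f"{documents_url(base_url, project_id)}/{document_id}?{query}"


def filter_by_state(documents, state):
    """Client-side filtering, needed because the endpoint returns every document."""
    return [doc for doc in documents if doc.get("state") == state]


def list_documents(base_url, project_id, auth_header):
    """Fetch the document list.

    Assumption: the response is JSON with the documents under a `body` key.
    """
    request = urllib.request.Request(
        documents_url(base_url, project_id),
        headers={"Authorization": auth_header},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["body"]
```

With these helpers, exporting all curated documents means one list request, a call to `filter_by_state(docs, "CURATION-COMPLETE")`, and then one export request per remaining document.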

Describe the solution you'd like

Ideally, it would be possible to export all matching documents directly, without doing a search first. Something like:

GET /api/aero/v1/projects/{projectId}/documents{?format,state}

The response could be a ZIP archive containing each matching document exported in the chosen format.
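
To make the idea more concrete, here is a minimal sketch of how the URI template above could expand and what the ZIP payload might contain. This is only an illustration of the proposal, not an existing API: `bulk_export_url` and `bundle_as_zip` are hypothetical names, and the entry names inside the archive are my own guess.

```python
import io
import zipfile
from urllib.parse import urlencode


def bulk_export_url(base_url, project_id, fmt=None, state=None):
    """Expand the proposed {?format,state} template, skipping unset parameters."""
    params = {}
    if fmt is not None:
        params["format"] = fmt
    if state is not None:
        params["state"] = state
    url = f"{base_url}/api/aero/v1/projects/{project_id}/documents"
    return f"{url}?{urlencode(params)}" if params else url


def bundle_as_zip(exports):
    """Pack a mapping of {entry name: exported bytes} into an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in exports.items():
            archive.writestr(name, payload)
    return buffer.getvalue()
```

For example, `bulk_export_url(base, 1, fmt="conllu", state="CURATION-COMPLETE")` would yield a URL ending in `/documents?format=conllu&state=CURATION-COMPLETE`.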

Describe alternatives you've considered

If a new endpoint is not feasible, it would be nice to introduce some improvements to the existing ones:

  1. Add a ?state query parameter to the document search endpoint.
  2. Respond with a content type that matches the export format, such as rdfcas => text/turtle, conllu => text/plain, jsoncas => application/json, etc.
  3. (Optionally) Allow content negotiation of RDF formats. For example, a request with Accept: application/n-triples should be honoured with a response in N-Triples format.
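
Points 2 and 3 could look roughly like the sketch below. The format-to-media-type table only contains the three pairs named above; the fallback to application/octet-stream and all function names are my assumptions. The negotiation is deliberately simplified: it takes the highest matching q-value and ignores the range-specificity rules of full HTTP content negotiation.

```python
# Mapping taken from the examples above; anything else is an assumption.
FORMAT_MEDIA_TYPES = {
    "rdfcas": "text/turtle",
    "conllu": "text/plain",
    "jsoncas": "application/json",
}

# RDF serializations a server might offer for an RDF export (illustrative).
RDF_MEDIA_TYPES = ["text/turtle", "application/n-triples"]


def content_type_for(fmt):
    """Media type for an export format, falling back to a generic binary type."""
    return FORMAT_MEDIA_TYPES.get(fmt, "application/octet-stream")


def parse_accept(accept_header):
    """Split an Accept header into (media range, q-value) pairs."""
    ranges = []
    for part in accept_header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        quality = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranges.append((media_range, quality))
    return ranges


def range_matches(media_range, media_type):
    """True if a media range such as text/* or */* covers the media type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept_header, offered):
    """Pick the offered type with the highest q-value, or None (meaning 406)."""
    accepted = parse_accept(accept_header)
    best, best_quality = None, 0.0
    for media_type in offered:
        quality = max(
            (q for media_range, q in accepted if range_matches(media_range, media_type)),
            default=0.0,
        )
        if quality > best_quality:
            best, best_quality = media_type, quality
    return best
```

In this sketch, `negotiate("application/n-triples", RDF_MEDIA_TYPES)` selects application/n-triples, and only a header that matches none of the offered types would lead to a 406.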

@reckart added this to the ⭐️ Feature backlog milestone on May 7, 2024