Entities CSV ETag doesn't change after entity deletion #599

Closed
matthew-white opened this issue Feb 22, 2024 · 6 comments · Fixed by getodk/central-backend#1106
Labels: backend, behavior verified, bug, entities

Comments

@matthew-white
Member

Problem description

Backend returns the entities CSV with an ETag based on the timestamp of the latest entity creation or update. That's true for the CSV download to OpenRosa clients (the form attachment endpoint), as well as the download from the entities Data page. However, that strategy doesn't seem to account for entity deletion: when an entity is deleted, I would expect the ETag of the CSV response to change.

I think @dbemke ran into this issue at #584 (comment).

Steps to reproduce the problem

  1. Create an entity list with at least 2 entities.
  2. Download the entities CSV.
  3. Delete an entity (but not the one most recently created or updated).
  4. Download the entities CSV again.
  5. Observe that the CSV is unchanged and still includes the deleted entity.
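The steps above can be modeled in memory. This is a minimal sketch, assuming the ETag is a hash of the newest created/updated timestamp among the entities still in the list; `timestamp_etag`, the field names, and the choice of MD5 are hypothetical and not taken from Backend's code.

```python
import hashlib
from datetime import datetime, timezone


def timestamp_etag(entities):
    """Hypothetical ETag: hash of the newest created/updated timestamp."""
    latest = max(e["updated_at"] or e["created_at"] for e in entities)
    return hashlib.md5(latest.isoformat().encode()).hexdigest()


# Step 1: an entity list with two entities (illustrative data).
entities = [
    {"label": "A", "created_at": datetime(2024, 2, 1, tzinfo=timezone.utc), "updated_at": None},
    {"label": "B", "created_at": datetime(2024, 2, 2, tzinfo=timezone.utc), "updated_at": None},
]

# Step 2: first download.
etag_before = timestamp_etag(entities)

# Step 3: delete "A", which is NOT the most recently created or updated entity.
remaining = [e for e in entities if e["label"] != "A"]

# Step 4: second download. The newest timestamp still belongs to "B",
# so the ETag is identical and a caching client keeps its old CSV.
etag_after = timestamp_etag(remaining)
etag_is_stale = etag_before == etag_after
```

In this model the deletion is invisible to the ETag unless the deleted entity happened to carry the newest timestamp.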

Expected behavior

I would expect that when I download the CSV again, the deleted entity is not included.

Central version shown in version.txt

I think this has been an issue since we introduced entity deletion.

@matthew-white added the bug, backend, needs testing, and entities labels on Feb 22, 2024
@github-project-automation bot moved this to 🕒 backlog in ODK Central on Feb 22, 2024
@ktuite self-assigned this on Feb 27, 2024
@lognaturel
Member

> when an entity is deleted, I would expect the ETag of the CSV response to change

Definitely.

> I would expect that when I download the CSV again, the deleted entity is not included.

We have some decisions to make about how clients will learn about deleted entities that they may have locally. This is something @seadowg and I are going to be exploring very soon. For now excluding deleted entities seems like the best thing to do.

@lognaturel
Member

lognaturel commented Mar 5, 2024

We might be able to use deleted_at as part of the ETag computation to address this. That means purging will change the ETag and clients would redownload.
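To illustrate the `deleted_at` idea, here is a rough sketch under stated assumptions: rows are soft-deleted by setting `deleted_at`, the CSV only lists rows where it is unset, and the ETag hashes the newest of all three timestamps across every row that still exists. The helper names and the row shape are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def utc(day):
    return datetime(2024, 3, day, tzinfo=timezone.utc)


def etag_with_deletes(rows):
    """Hypothetical ETag over created_at, updated_at AND deleted_at."""
    stamps = [
        t
        for r in rows
        for t in (r["created_at"], r["updated_at"], r["deleted_at"])
        if t is not None
    ]
    latest = max(stamps).isoformat() if stamps else ""
    return hashlib.sha256(latest.encode()).hexdigest()


rows = [
    {"label": "A", "created_at": utc(1), "updated_at": None, "deleted_at": None},
    {"label": "B", "created_at": utc(2), "updated_at": None, "deleted_at": None},
]
etag_initial = etag_with_deletes(rows)

# Soft-delete "A": the row stays in the table, but deleted_at is now the newest stamp.
rows[0]["deleted_at"] = utc(5)
etag_soft_deleted = etag_with_deletes(rows)
visible = [r["label"] for r in rows if r["deleted_at"] is None]

# Purge "A": the row disappears, so the newest stamp moves again.
rows = [r for r in rows if r["label"] != "A"]
etag_purged = etag_with_deletes(rows)
```

Here the ETag changes on soft deletion and again on purge, which matches the observation that purging would make clients redownload.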

Could we only use the audit table and look for the latest change on the actee (filtered by verb)?

There's a somewhat related issue around properties being added to an entity list. Currently the ETag doesn't change in that case even though the shape of the entity list has changed. If Collect and Enketo return a blank string when looking up a property that doesn't exist (I'm pretty sure that's what it does), then this would be fine. If Collect or Enketo crashes, we do need another option.

@lognaturel moved this from 🕒 backlog to ✏️ in progress in ODK Central on Mar 22, 2024
@github-project-automation bot moved this from ✏️ in progress to ✅ done in ODK Central on Apr 5, 2024
@ktuite
Member

ktuite commented Apr 5, 2024

Note for @getodk/testers: this bug of the deleted entity still being present in the entity list should also be tested in Collect.

The issue was that Central backend was using the latest updated-at timestamp of any entity in the entity list to compute the ETag, which tells a client like Collect (or the browser) whether to use its cached dataset CSV or fetch a new one. That meant that if an entity was deleted, Backend was still only checking the updated timestamps of the entities remaining in the dataset, so clients didn't know to re-download the entities. Now it looks at the last action on a dataset, including deletion.
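The "last action on a dataset" approach might look roughly like the sketch below. It is not the actual fix: the `Audit` shape, the action strings in `RELEVANT_ACTIONS`, and both function names are made up for illustration and may not match Central's audit log.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Audit:
    action: str          # illustrative verb, e.g. "entity.delete"
    actee: str           # id of the dataset the action applies to
    logged_at: datetime


# Hypothetical set of verbs that should invalidate a cached CSV.
RELEVANT_ACTIONS = {"entity.create", "entity.update", "entity.delete", "dataset.update"}


def audit_etag(audits: List[Audit], dataset_id: str) -> str:
    """ETag derived from the newest relevant audit entry for one dataset."""
    stamps = [
        a.logged_at
        for a in audits
        if a.actee == dataset_id and a.action in RELEVANT_ACTIONS
    ]
    latest = max(stamps).isoformat() if stamps else ""
    return '"' + hashlib.sha256(latest.encode()).hexdigest() + '"'


def status_for(if_none_match: Optional[str], current_etag: str) -> int:
    """304 if the client's cached ETag still matches, otherwise 200."""
    return 304 if if_none_match == current_etag else 200


def at(day):
    return datetime(2024, 4, day, tzinfo=timezone.utc)


audits = [
    Audit("entity.create", "trees", at(1)),
    Audit("entity.create", "trees", at(2)),
]
cached = audit_etag(audits, "trees")

# Deleting any entity logs a new audit entry, so the ETag moves.
audits.append(Audit("entity.delete", "trees", at(3)))
current = audit_etag(audits, "trees")
```

With these values, `status_for(cached, current)` returns 200, so the client fetches the CSV again, while a client already holding `current` gets 304.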

@matthew-white
Member Author

@getodk/testers, this fix is present on the staging server.

@matthew-white
Member Author

> There's a somewhat related issue around properties being added to an entity list. Currently the ETag doesn't change in that case even though the shape of the entity list has changed.

@ktuite's fix looks at the audits table, so the ETag should now change after a property is added. @getodk/testers, you can see that by following these steps:

  • Download the entities CSV.
  • Add a new property to the entity list. Don't modify any existing entities, just add the property to the entity list.
  • Download the entities CSV again. You should see that it includes the new property.
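One way to check the last step without eyeballing the files is to compare the header rows of the two downloads. This is a small sketch using only the standard library; the column names and sample rows are invented and do not reflect the real column layout of the entities CSV.

```python
import csv
import io


def header(csv_text):
    """Return the first row of a CSV document, or [] if it is empty."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def added_columns(before_csv, after_csv):
    """Columns present in the second download but not in the first."""
    old = set(header(before_csv))
    return [col for col in header(after_csv) if col not in old]


# Invented sample downloads: "species" was added as a new property.
first_download = "name,label,height\na1,Oak,12\n"
second_download = "name,label,height,species\na1,Oak,12,\n"

new_props = added_columns(first_download, second_download)
```

Existing entities keep a blank value for the new property in this sample, since only the entity list itself was changed.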

@dbemke

dbemke commented Apr 12, 2024

Tested with success!

@dbemke added the behavior verified label and removed the needs testing label on Apr 12, 2024