data removal results in arki-check errors #296

Closed
brancomat opened this issue Jan 12, 2023 · 3 comments

@brancomat
Member

$ # prepare the metadata for deletion
$ arki-query 'reftime:=2022-05-18 0:00' cosmo/ > qualcosa.md
$ # run the deletion
$ arki-check --fix --remove qualcosa.md cosmo/
$ # run the repack
$ arki-check -f -r cosmo/
cosmo:2022/05-18.grib: index knows of this segment but contains no data for it
cosmo:2022/05-18.grib: segment old enough to be archived
Traceback (most recent call last):
  File "/usr/bin/arki-check", line 11, in <module>
    main()
  File "/usr/bin/arki-check", line 7, in main
    sys.exit(Check.main())
  File "/usr/lib/python3.10/site-packages/arkimet/cmdline/base.py", line 83, in main
    return cmd.run()
  File "/usr/lib/python3.10/site-packages/arkimet/cmdline/check.py", line 125, in run
    arki_check.repack()
RuntimeError: cannot archive /arkitest/cosmo/2022/05-18.grib to /arkitest/cosmo/.archive/last/2022/05-18.grib because the destination already exists

test case:
arkitest.tar.gz

(this might be an edge case since I'm using the --remove option with specific metadata to remove all present data)

@spanezz
Contributor

spanezz commented Jan 25, 2023

It looks like something like this happened:

  1. attempted deletion of data not present in the online part of the dataset
  2. searching for the data somehow created parts of a segment (how is still to be understood)
  3. repack tries to archive that, and becomes disappointed because the archive already has a segment for that time period

@spanezz
Contributor

spanezz commented Jan 25, 2023

Reproduced in the test suite. Indeed, deleting nonexistent data creates an empty version of the affected segment.
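To illustrate the sequence, here is a minimal toy model in plain Python. It is not arkimet code: `writer_style_remove`, `archive_segment` and `reproduce` are hypothetical helpers, and the directory layout only mimics the paths from the traceback above. It assumes the removal step opens the segment in a way that creates it when missing, and that archiving refuses to overwrite an existing destination.

```python
import tempfile
from pathlib import Path


def writer_style_remove(dataset: Path, relpath: str) -> None:
    """Toy removal (hypothetical): creates the segment file if it is missing."""
    segment = dataset / relpath
    segment.parent.mkdir(parents=True, exist_ok=True)
    # Append mode creates an empty file when none exists.
    with open(segment, "ab"):
        pass  # nothing to remove: the data only lives in the archive


def archive_segment(dataset: Path, relpath: str) -> None:
    """Toy archival step (hypothetical): never overwrites an archived segment."""
    src = dataset / relpath
    dst = dataset / ".archive" / "last" / relpath
    if dst.exists():
        raise RuntimeError(
            f"cannot archive {src} to {dst} because the destination already exists"
        )
    dst.parent.mkdir(parents=True, exist_ok=True)
    src.rename(dst)


def reproduce() -> str:
    """Run the three steps and return the resulting error message."""
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp) / "cosmo"
        relpath = "2022/05-18.grib"
        # The data for that day is already archived, not online.
        archived = dataset / ".archive" / "last" / relpath
        archived.parent.mkdir(parents=True)
        archived.write_bytes(b"GRIB")
        # Deleting data that is not online leaves an empty online segment.
        writer_style_remove(dataset, relpath)
        assert (dataset / relpath).stat().st_size == 0
        # Repack then tries to archive the empty segment and fails.
        try:
            archive_segment(dataset, relpath)
        except RuntimeError as exc:
            return str(exc)
        return "no error"


if __name__ == "__main__":
    print(reproduce())
```

In this sketch the failure only appears because the removal step leaves a file behind; without that side effect there would be nothing for the archival step to move.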

@spanezz
Contributor

spanezz commented Jan 25, 2023

Reason: deletion happens internally via a dataset writer instead of a checker, despite one needing to run arki-check for it. The writer, which is usually used to append data to a dataset, will always create a segment if it is missing.

A possible way forward is to move the remove method from the writer infrastructure to the checker infrastructure, which is probably something that should have happened long ago.
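A rough sketch of the difference between the two approaches, using an in-memory stand-in rather than arkimet's real classes. `ToyDataset`, `writer_remove` and `checker_remove` are hypothetical names made up for illustration, and the assumption is simply that a checker-style removal looks a segment up without ever creating it.

```python
class ToyDataset:
    """In-memory stand-in for a dataset: maps segment paths to lists of items."""

    def __init__(self):
        self.segments = {}

    def writer_remove(self, relpath, item):
        """Writer-style removal (hypothetical): creates the segment as a side effect."""
        # setdefault mirrors a writer that always creates a missing segment.
        segment = self.segments.setdefault(relpath, [])
        if item in segment:
            segment.remove(item)
            return True
        return False

    def checker_remove(self, relpath, item):
        """Checker-style removal (hypothetical): never creates a segment."""
        segment = self.segments.get(relpath)
        if segment is None:
            return False  # nothing online for this period, so nothing to do
        if item in segment:
            segment.remove(item)
            return True
        return False


if __name__ == "__main__":
    ds = ToyDataset()
    ds.checker_remove("2022/05-18.grib", "t2m")
    print(sorted(ds.segments))  # no segment was created
    ds.writer_remove("2022/05-18.grib", "t2m")
    print(sorted(ds.segments))  # an empty segment now exists
```

With the checker-style variant, removing data that is not in the online part of the dataset leaves no empty segment behind, so a later repack has nothing to archive for that period.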
