Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Added an alert cache mechanism to avoid loosing information without internet #61

Merged
merged 10 commits into from
Sep 19, 2021

Conversation

frgfm
Copy link
Member

@frgfm frgfm commented Sep 12, 2021

This PR introduces the following modifications:

  • unified alert uploading: any detected alert is put in a cache (with a max size), and the object tries to empty this cache by uploading the alerts in it at every prediction it does. This means that if the connection is lost during N detections and comes back after this, it will dutifully upload the media and send the alerts for those in cache (limited to the cache size obviously)
  • added an automatic cache backup system: this cache is backup to disk every hour by default. Everytime the engine is instantiated it tries to restore the cache from disk. Which means that when we reboot the device to get back the connection, it will load back everything that was saved to disk.
  • updates requirements
  • updates unittests

One point to pay attention at: this can be checked in another PR but I prefer to point it out now. This PR handles the process as if the disk is persistent through reboot (if I save a file in my_path.jpg, then reboot, I'll find it in the same place). However, since we are using Docker and I'm not too sure about whether we bind volumes in production, this might be tricky. Anyone got more information about this? (I set the back path to data/ which seems to be the volume bind according to https://github.com/pyronear/pyro-engine/blob/master/server/docker-compose.yml#L14)

Any feedback is welcome!

@frgfm frgfm added type: enhancement New feature or request topic: build Related to build, installation & CI route: inference labels Sep 12, 2021
@frgfm frgfm added this to the 0.2.0 milestone Sep 12, 2021
@frgfm frgfm self-assigned this Sep 12, 2021
@codecov
Copy link

codecov bot commented Sep 12, 2021

Codecov Report

Merging #61 (4f8fc83) into master (201234c) will decrease coverage by 2.37%.
The diff coverage is 52.57%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
- Coverage   63.46%   61.08%   -2.38%     
==========================================
  Files          10       10              
  Lines         208      257      +49     
==========================================
+ Hits          132      157      +25     
- Misses         76      100      +24     
Flag Coverage Δ
unittests 61.08% <52.57%> (-2.38%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyroengine/engine/engine.py 52.45% <52.08%> (-0.97%) ⬇️
pyroengine/engine/predictor.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 201234c...4f8fc83. Read the comment docs.

Copy link
Contributor

@blenzi blenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @frgfm , thanks a lot for this PR. It looks good to me and I don't have more information about the binding of the volumes. I guess it all needs to be checked manually, right?

Copy link
Collaborator

@Akilditu Akilditu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi thanks a lot for this PR !

If I understand correctly, your concern about the reboot is the ensure that the data volume remains persistent after a reboot of the main rpi which rebuilds docker ?
A dump of cache outside of docker could then be a solution ?

Another question here, could we estimate on avg how many 'lost connection time' we can recover having the 100 frames max storage in cache within a full alert time-lapse ?

pyroengine/engine/engine.py Show resolved Hide resolved
pyroengine/engine/engine.py Show resolved Hide resolved
@frgfm
Copy link
Member Author

frgfm commented Sep 13, 2021

If I understand correctly, your concern about the reboot is the ensure that the data volume remains persistent after a reboot of the main rpi which rebuilds docker ?

Correct

A dump of cache outside of docker could then be a solution ?

Yes, that's why I put this in data/ which is the default volume location

Another question here, could we estimate on avg how many 'lost connection time' we can recover having the 100 frames max storage in cache within a full alert time-lapse ?

Rather easy yes :
each raspberry can at maximum detect fire each 30sec, say we have N of these:
which makes 2 * N frames per minute
So with 100 frames, the lower bound (in case of continuous wildfire detection at every analysis) we could cover is 100 / (2 * N) minutes. So with 4 active cameras, this covers ~12 minutes

But this would be the very worst case scenario (all devices won't be pointing towards the same location, and hopefully we won't have lost connection for all of those) :)

@frgfm frgfm requested a review from Akilditu September 14, 2021 09:12
@Akilditu
Copy link
Collaborator

Akilditu commented Sep 14, 2021

Yes, that's why I put this in data/ which is the default volume location

Isn't data/ gonna be overridden if the docker re-builds ?

Copy link
Member

@fe51 fe51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, thanks a lot for this PR :)
So far, it looks good to me. I would like to set up test devices to approve, if I can find the time.

About volume binding, as far as I understand, the volume source is ./data so this is a path, hence implicitly the volume is of type "bind" . (Would be a volume either if there is no source or a source as a named volume).
So, seems OK for our use case 👍

(but @Akilditu, to me, if I properly understand the doc If you don’t explicitly create it, a volume is created the first time it is mounted into a container. When that container stops or is removed, the volume still exists. Multiple containers can mount the same volume simultaneously, either read-write or read-only. Volumes are only removed when you explicitly remove them.
Hence, it should be OK with a volume as well )

pyroengine/engine/engine.py Outdated Show resolved Hide resolved
@frgfm
Copy link
Member Author

frgfm commented Sep 18, 2021

Yes, that's why I put this in data/ which is the default volume location

Isn't data/ gonna be overridden if the docker re-builds ?

So I've checked on my own, having a docker with a volume bind, creating a file in it, and then rebuild the image and check whether the file is still there and it works!

With this docker-compose:

version: '3.7'

services:
  web:
    image: python:3.8.1-slim
    command: >
        bash -c "ls /tmp
        && touch /tmp/tmp.txt
        && python -c 'print(1)'
        && ls /tmp
        "
    ports:
      - 8050:8000
    volumes:
      - /tmp:/home/fg/Downloads/tmp

I run it the first time to create the file in the volume:

➜  docker_check docker-compose up --build

Recreating docker_check_web_1 ... done
Attaching to docker_check_web_1
web_1  | 1
web_1  | tmp.txt
docker_check_web_1 exited with code 0

then a second time to see if it can access the created file:

➜  docker_check docker-compose up --build

Starting docker_check_web_1 ... done
Attaching to docker_check_web_1
web_1  | tmp.txt
web_1  | 1
web_1  | tmp.txt
docker_check_web_1 exited with code 0

@frgfm frgfm requested a review from fe51 September 18, 2021 13:58
Copy link
Member

@fe51 fe51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the check FG ! So, let's merge 😃

@frgfm frgfm merged commit fa2cf54 into master Sep 19, 2021
@frgfm frgfm deleted the cache branch September 19, 2021 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
route: inference topic: build Related to build, installation & CI type: enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants