Firestore listener stops receiving changes after a long time but without throwing an error #18
Comments
Just noticed this, which seems like the same issue: firebase/firebase-admin-python#294. It concerns me that it has been open since May last year - isn't this considered a core feature? |
Btw. I always start two processes running in the background. Both listen to the same news but handle it differently. In the end both processes stopped receiving news at the same time. |
@mr-bjerre About how frequent are the updates before you stop receiving them? Also, could you try enabling logging in your script and report the output:
import logging
logging.basicConfig(filename='python-firestore-18.log', level=logging.DEBUG) |
It seems to be very random when I stop receiving them. It has happened after 20 minutes sometimes and also after 7 hours sometimes. I've created a log like you suggested but I don't see any grpc related entries? |
Oh wait I did find something in my logs where it broke (FYI I have four processes running - not entirely sure if each process writes to the log :-S)
and then the log just continues like that |
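For anyone collecting logs from several processes at once, here is a minimal sketch of a logging setup that gives each process its own file and stamps every record with the process id, so it is clear which process wrote what. The helper name `configure_debug_logging` and the file naming scheme are made up for illustration. The `GRPC_VERBOSITY` and `GRPC_TRACE` environment variables are read by gRPC's C core; the tracer names used below are an assumption and may differ between grpcio versions, and that output goes to stderr rather than into the Python log file, which may be why no grpc entries show up in the file.

```python
import logging
import os


def configure_debug_logging(log_dir=".", prefix="python-firestore-18",
                            grpc_trace="connectivity_state,http"):
    """Send DEBUG records to a per-process log file and return its path."""
    # Ask gRPC's C core for verbose output. These should be set before grpc
    # is imported. The tracer list is an assumption; adjust it to your
    # grpcio version. This output is written to stderr, not to the file.
    os.environ.setdefault("GRPC_VERBOSITY", "DEBUG")
    os.environ.setdefault("GRPC_TRACE", grpc_trace)

    # One file per process, so parallel listeners do not interleave output.
    path = os.path.join(log_dir, f"{prefix}-{os.getpid()}.log")
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s pid=%(process)d %(name)s %(levelname)s %(message)s"))

    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.addHandler(handler)
    return path
```

Calling `configure_debug_logging()` once at the top of each listener script, before the Firestore client is created, should be enough; redirecting stderr to a file as well would capture the gRPC core output.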
I am experiencing the same issue! Wasn't able to solve it yet. |
I was directed to this thread by a google employee through my own Firestore IssueTracker question #152867838. Re: |
Same issue here, still looking for a solution |
@MaticConradi @r-hoeve @juancruzmartino could you share your library versions, python version, and where the code experiencing the failure is executing? |
Python version: Snapshots are created with:
app = App()
app.init_listeners()

def init_listeners(self):
    database.collection("accounts").on_snapshot(self.update_account_properties)
    # and 5 other snapshots

def update_account_properties(self, settings, changes, timestamp):
    for change in changes:
        # processing changes

All code executes just fine for a while, but at some point the |
I'm experiencing the same problem with Python 3.8.3 and google-cloud-firestore 1.6.1. The callback method passed to on_snapshot() stops being called, without any exception thrown or error message, after an arbitrary amount of time, usually a few hours. |
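Since the listener goes quiet without raising anything, one possible workaround (not a fix, and not something the library provides) is a watchdog that remembers when the callback last fired and resubscribes after too long a silence. The sketch below is hypothetical: `ResubscribingListener`, `subscribe` and `max_silence_seconds` are names invented here. It only relies on `on_snapshot()` returning an object with an `unsubscribe()` method and calling back with three arguments. Silence can also simply mean that nothing was written, so this would need a heartbeat document that is updated regularly, and the callback has to tolerate seeing the initial snapshot again after each resubscribe.

```python
import threading
import time


class ResubscribingListener:
    """Wrap a snapshot listener and re-create it after a long silence."""

    def __init__(self, subscribe, callback, max_silence_seconds,
                 clock=time.monotonic):
        # subscribe(cb) must start a listener and return an object that has
        # an unsubscribe() method, for example:
        #   lambda cb: database.collection("accounts").on_snapshot(cb)
        self._subscribe = subscribe
        self._callback = callback
        self._max_silence = max_silence_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._last_event = clock()
        self._watch = subscribe(self._on_snapshot)

    def _on_snapshot(self, docs, changes, read_time):
        # Record that the listener is still alive, then delegate.
        with self._lock:
            self._last_event = self._clock()
        self._callback(docs, changes, read_time)

    def check(self):
        """Resubscribe if the listener has been silent too long.

        Returns True when a resubscribe happened, False otherwise.
        """
        with self._lock:
            silent_for = self._clock() - self._last_event
        if silent_for <= self._max_silence:
            return False
        self._watch.unsubscribe()
        self._watch = self._subscribe(self._on_snapshot)
        with self._lock:
            self._last_event = self._clock()
        return True
```

Calling `check()` from the script's keep-alive loop every minute or so would be one way to use it. Logging every resubscribe would also give a rough record of when the listener actually died.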
I'm working this week on a reproducer app for this bug in Python: it will run on "Cloud Run", with a listener process and several different "mutator" processes, making writes at different intervals. |
@mr-bjerre Working on a reproducer script, I've come across a couple of questions:
|
I haven’t noticed any correlation with anything, really. But I don’t think that I'm qualified to really answer that - I’m not exactly sure what to look for. Maybe some of the other users here can comment on that. Regarding the architecture - you are asking how many processes write to the collection that I listen to? If that’s the case then the answer is a few |
Thanks! Doing the "bisect" hunt would be helpful once we think we have the issue fixed, given that we don't believe it necessarily ever worked prior to |
@AntonioLule What version of @ikendra, @MaticConradi, @juancruzmartino Can you please try updating to |
Running |
Thanks @tseaver. I just upgraded to |
Hi @tseaver, I unfortunately still see the same issue also with I checked all I could/knew to check to rule out a mistake on my side:
I have now added some extra logging to try to understand better what's going on. |
The same issue with |
I've had my application running for 4 days and the listener is still working (no way of telling if it was broken at some point though). I'm on |
@ikendra Can you post the output from running $ venv/bin/pip list
Package Version Location
------------------------ --------- -------------------------------------------------------------
cachetools 4.1.1
certifi 2020.6.20
chardet 3.0.4
google-api-core 1.22.1
google-auth 1.20.1
google-cloud-core 1.4.1
google-cloud-firestore 2.0.0.dev1 /path/to/python-firestore
googleapis-common-protos 1.52.0
grpcio 1.31.0
idna 2.10
libcst 0.3.10
mypy-extensions 0.4.3
pip 20.2.2
proto-plus 1.7.1
protobuf 3.13.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pytz 2020.1
PyYAML 5.3.1
requests 2.24.0
rsa 4.6
setuptools 49.6.0
six 1.15.0
typing-extensions 3.7.4.2
typing-inspect 0.6.0
urllib3 1.25.10
wheel 0.35.1 |
I'm afraid my listener is broken now as well. Survived 4 days but not 5 |
@mr-bjerre Can you provide output of |
|
Yup my listener also breaks, though it does take longer before it happens. |
Pip list:
|
May be a tall ask, but do you have logs to match the failure? That has been the sticking point for fixing this: we know something is causing failures for users, but aren't aware of what that failure is. |
@crwilcox the general consensus I believe has been that there are no logs produced by this issue whatsoever. I have all function code enclosed inside try except blocks with Google Error reporting integrated, and there just isn't anything related to it coming up, not even network related errors, nothing. |
It also looks like the issue has been coming up more frequently for me lately, but this could be because of the higher number of changes the listeners have had to handle on my end. |
@tseaver is your repro WIP PR capturing the failure at this point? |
@MaticConradi if you have the ability could you try The underlying protobuffer code is generated differently also so it is possible it has an effect. |
@crwilcox sure, I can look into migrating to v2 and seeing if that fixes it. Is there a list of known issues (if any) with existing functionality that I should be on the lookout for? |
I am not aware of issues, but there are some changes that are breaking. Particularly the way params are taken to some methods to protect against future API changes causing usability issues/breaks. We have some scripts you can run on a codebase to move to this as well: https://github.com/googleapis/python-firestore/tree/master/scripts Depending on what paths you use this could vary in the amount of change. |
Doesn't look like the issue is fixed in |
Btw. I imagine that this is related; googleapis/nodejs-firestore#1023 (comment) |
Nope. |
@BenWhitehead, @crwilcox I'm not sure what is to be done here, beyond constructing the scripts to (attempt to) reproduce this issue in #158. |
@tseaver I also don't know what can be done to get a more firm repro. Anyone who is experiencing this, if you could try installing the candidate builds for v2.0.0 it is possible the work done to overhaul things a bit may have squashed this bug. Also, if anyone has suggestions on how to reproduce the failure we would be interested to try to track this down. |
Closing this issue because it looks like v2.0.0 (released Nov 10, 2020) may have fixed it. If anyone experiences this behavior with v2.0 or above going forward, please open a new issue. Thanks! |
Environment details
google-cloud-firestore version: 1.6.1
Steps to reproduce
I have a python script that sets up a listener to new documents in a firestore collection using on_snapshot on a query. To keep the script alive I have an infinite while loop in the end. Everything works fine but after several hours it stops receiving new documents (or the callback is not executed at least) - yesterday it was after ~7 hours.
In an earlier version of google-cloud-firestore (1.5.0) there was a related issue where the listener got disconnected after 1 hour. I made my own ugly hack back then but it worked at least:
Code example (sample)
Since the 1 hour issue was fixed I now do
and then execute a script like
Stack trace
Unfortunately no stack trace since the listener just stops receiving messages.
I see this as the core feature of Firestore so hopefully it'll get the right attention right away (some of our core business is to trigger python applications on news from firestore so we are extremely dependent on this).
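For readers without the original snippets, the setup described above (an on_snapshot listener on a query plus a loop that keeps the script alive) might look roughly like the sketch below. This is not the original code from the report: `listen_until_stopped`, `handle_document` and `stop_event` are hypothetical names, and `query` stands for any Firestore query or collection reference. The only library behaviour assumed is that `on_snapshot()` calls back with `(doc_snapshots, changes, read_time)`, that each change has `type` and `document` attributes, and that the returned watch has an `unsubscribe()` method.

```python
import threading


def listen_until_stopped(query, handle_document, stop_event,
                         poll_seconds=1.0):
    """Listen for newly added documents until stop_event is set."""

    def on_snapshot(doc_snapshots, changes, read_time):
        # The library invokes this on a background thread.
        for change in changes:
            if change.type.name == "ADDED":
                handle_document(change.document)

    watch = query.on_snapshot(on_snapshot)
    try:
        # Keep the main thread alive; an Event avoids a busy loop and lets
        # the script shut down cleanly.
        while not stop_event.wait(poll_seconds):
            pass
    finally:
        watch.unsubscribe()
```

A script would call it with something like `listen_until_stopped(db.collection("news"), print, threading.Event())`, where the collection name is only an example.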