No more weird caching issues. #2

brycedrennan · 2018-09-10T17:46:07Z

Our pull request to the parent library has been ignored. Let's deploy to our private repo.

This is already deployed to pypi.cu and in use now.
http://pypi.cu/root/circleup/tldextract/3.0.0.circleup

Handles concurrency issues.

jessevogt

I am not comfortable with forking at this time since we would either lose access to new fixes from upstream OR would need to take on the work of manually keeping in sync. I would rather investigate implementing these fixes on our side external to tldextract.

brycedrennan · 2018-10-09T15:45:45Z

I do wish I had explored that route more, as that would have been easier to maintain. There is some reason to think it's not possible, though, because of how the library is designed.

I still think we should fork, as we have done in other cases, because:

1. This solves a problem that has come up repeatedly for multiple people on our team and is very confusing when it happens. The problem is that whether you choose to include private domains is cached "forever". If the next call to the library specifies something different than the first call, the new setting is silently ignored and incorrect results are returned. This is totally unexpected and hard to debug. It also totally precludes different parts of the code using different lists for different purposes.
2. I think it's unlikely we'll need the upstream changes. The library is an overcomplicated way of downloading a text file of regular expressions. It's already doing what we need.
3. It's not clear there will even be upstream changes. There have been no commits for 9 months. No response to this PR.
4. These changes have been live for a month.

jessevogt

If backing out these changes and fixing via the method I mentioned in my initial comment is not an option, please update the README for this repo to mention how and when this fork happened, plus any relevant background on the changes we introduced. We should also either remove or update the badges, since they still point to the upstream version.

jessevogt · 2018-10-16T19:03:23Z

tldextract/utils.py

+                if not os.path.isfile(cache_path):
+                    result = func(*args, **kwargs)
+                    with open(cache_path, 'w') as cache_file:
+                        json.dump(result, cache_file)

            with open(cache_path) as cache_file:
                return json.load(cache_file)


Shouldn't the read also be protected by the lock?

Good catch, will fix.
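
For reference, a minimal sketch of what holding the lock across both the write and the read could look like. The lock implementation used in this PR is not shown in the excerpt, so the sketch assumes a POSIX advisory lock via the standard library's `fcntl.flock`; `file_lock` and `cached_json` are hypothetical names, not the functions in `tldextract/utils.py`.

```python
import fcntl
import json
import os
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path (POSIX only)."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def cached_json(cache_path, func, *args, **kwargs):
    """Return the cached result, computing and storing it on a cache miss."""
    with file_lock(cache_path + ".lock"):
        if not os.path.isfile(cache_path):
            result = func(*args, **kwargs)
            with open(cache_path, "w") as cache_file:
                json.dump(result, cache_file)
        # The read stays inside the lock, so a reader cannot observe
        # a cache file that another process is still writing.
        with open(cache_path) as cache_file:
            return json.load(cache_file)
```

Keeping the read inside the `with file_lock(...)` block is the whole point of the review comment: if the read sat outside it, a second process could open the cache file between the writer's `open(..., 'w')` and the end of `json.dump` and load truncated JSON.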

jessevogt · 2018-10-16T19:08:01Z

Please also update permissions on this repo. I am currently seeing the following message:

brycedrennan · 2018-10-16T19:08:31Z

good feedback. will do.

brycedrennan · 2018-10-22T23:40:18Z

@jessevogt It locks for reads now and the team has access to the repo.

brycedrennan requested a review from jessevogt September 10, 2018 17:48

brycedrennan added 2 commits September 10, 2018 13:19

Add lockfile to cache.

8551ff7

Handles concurrency issues.

Prep for deploy to our private pypi server

c20317a

brycedrennan force-pushed the better-cache branch from 307f46a to c20317a Compare September 10, 2018 20:22

Update utils.py

d14fc76

jessevogt suggested changes Oct 8, 2018


jessevogt suggested changes Oct 16, 2018


Lock when reading as well

d39f2dc

brycedrennan added 2 commits October 26, 2018 10:40

Update setup.py

756fbc9

update docs

3b44f44

jessevogt approved these changes Oct 26, 2018


brycedrennan merged commit f337143 into master Oct 26, 2018

brycedrennan deleted the better-cache branch October 26, 2018 18:50
