Skip to content
This repository has been archived by the owner on Jul 13, 2023. It is now read-only.

feat: skip timestamped messages instead of deleting #720

Merged
merged 1 commit into from
Oct 27, 2016
Merged

Conversation

bbangert
Copy link
Member

Alleviates heavy writes to remove messages by instead timestamping
messages and skipping through them. The first message record
retains the timestamp that was read up to for later use.

Closes #661

@codecov-io
Copy link

codecov-io commented Oct 25, 2016

Current coverage is 100% (diff: 100%)

Merging #720 into master will not change coverage

@@           master    #720   diff @@
=====================================
  Files          48      48          
  Lines        9947   10136   +189   
  Methods         0       0          
  Messages        0       0          
  Branches        0       0          
=====================================
+ Hits         9947   10136   +189   
  Misses          0       0          
  Partials        0       0          

Powered by Codecov. Last update 001129b...f46f8d7

@bbangert bbangert force-pushed the feat/issue-661 branch 5 times, most recently from ca18830 to 04ebcf7 Compare October 26, 2016 06:23
Copy link
Member

@jrconlin jrconlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nits and questions in code. I've not seen anything critical, but I'm going to spend a bit more time looking over the rollover code in websocket.

:type uaid: uuid.UUID
:type limit: int
:returns: A tuple of the last timestamp to read for timestamped
messages and the list of non-timestamped messages.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check the doc builder. I've noticed that it can have problems with multi-line return descriptions.


# First extract the position if applicable, slightly higher than 01:
# to ensure we don't load any 01 remainders that didn't get deleted
# yet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm. if it's a string, why not just pick a different prefix, kind of like we did with " "? We could use something like "#1", "#2" etc. You can then query on ">= '0'" Talked with Ben, he believes the timecode filtering is robust enough that it's not an issue.

from autopush.utils import (
generate_hash,
normalize_id,
WebPushNotification,
)

# Typing
T = TypeVar('T') # flake8: noqa
U = TypeVar('U') # flake8: noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is 'U' being used? I don't see references for it in the patch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, meant to remove it, only needed on generic var.

"""Update the route tables current_message_month

Note that we also update the last_connect at this point since webpush
users when connecting will always call this once that month.
users when connecting will always call this once that month. The
cuurrent_timestamp is also reset as a new month has no last read
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo cuurrent

expr_values = self.encode({":curmonth": month,
":last_connect": generate_last_connect()
":last_connect": generate_last_connect(),
":timestamp": 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it paranoid to set this to the TS of midnight of the roll-over month, rather than 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, this isn't supposed to be here at all anymore, since the message record tracks the timestamp, not router. Leftover from first attempt at this.

@@ -524,4 +587,10 @@ def websocket_format(self):
def ms_time():
# type: () -> int
"""Return current time.time call as ms and a Python int"""
return int(time.time() * 1000)
return int(time.time() * pow(10, 3))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the static numeric rather than calling a function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was getting tired of counting zero's between 3 of them and 6. I've seen typo's creep in as well where instead of 6 there'll be 3. It is a pain to call them constantly though, so I'll refactor that into a const.

@bbangert bbangert force-pushed the feat/issue-661 branch 4 times, most recently from 44e77c1 to 7852f8d Compare October 26, 2016 18:37
Copy link
Member

@jrconlin jrconlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r+

from autopush.utils import (
generate_hash,
normalize_id,
WebPushNotification,
)

# Typing
T = TypeVar('T') # flake8: noqa
U = TypeVar('U') # flake8: noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

U isn't used

@@ -363,6 +379,7 @@ def fetch_notifications(self, uaid):

@track_provisioned
def save_notification(self, uaid, chid, version):
# type: (str, str, Optional[int]) -> bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is version actually Optional here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is, it str()'s it, so if it was None, then the later eq match would fail to compare afaik.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I take that back, it must be, since the delete has quite a bit of logic allowing an optional version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete's handling of that looks like a special case for unregistering a channel.. but I don't get why it would even call delete like that in that case

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will be happy when we can purge the simplepush legacy nonsense from the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is simplepush legacy cruft. "Version" here is an identifier that prevented delete from accidentally deleting the wrong "version" of the notification (e.g. if a remote client got a message, then an update occurred while in process, the client wouldn't nuke the newer message.) Likewise if we got a notification "out of order" we wouldn't overwrite the newer notification with an older one.

Technically, "version" isn't optional, at least in the older code. It's an int that is coverted to a string because that's a handcrafted, artisan build that never got properly replaced by the dynamodb serializer.

"Version" could well be considered "optional" with newer code since we don't really care anymore about message order and inflight confusion.

@@ -480,6 +501,7 @@ def all_channels(self, uaid):

@track_provisioned
def save_channels(self, uaid, channels):
# type: (str, Set[str]) -> bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> None

@@ -403,30 +454,44 @@ def from_message_table(cls, uaid, item):
key_info = cls.parse_sort_key(item["chidmessageid"])
if key_info.get("topic"):
key_info["message_id"] = item["updateid"]
else:
key_info["message_id"] = item["updateid"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh?

update_id=item.get("updateid"),
timestamp=item.get("timestamp"),
)
# Load the sortkey_timestamp if present
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should just be sortkey_timestamp=key_info.get("sortkey_timestamp") above

# Delete non-timestamped messages
self.force_retry(self.ps.message.delete_message, notif)

# The line above runs, but coverage is confused on this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'continue' statement gets flagged by coverage as not being covered. No idea why, it has to run logically given that the lines below are fine.

if msg.sortkey_timestamp:
# Is this the last un-acked message we're waiting for?
last_unacked = sum(
len(self.ps.updates_sent[x]) for x in self.ps.updates_sent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len(sent) for sent in self.ps.updates_sent.itervalues()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is simply iterating through updates_sent values so it should be an itervalues() call

@pjenvey
Copy link
Member

pjenvey commented Oct 26, 2016

another thing that probably deserves its own issue:

we're assuming ints in many places where boto2 is giving us Decimals

e.g. fetch_messages' last_position result

Alleviates heavy writes to remove messages by instead timestamping
messages and skipping through them. The first message record
retains the timestamp that was read up to for later use.

This also removes an existing edge case where the connection node
would not fetch more messages if all the ones in a batch it fetched
had their TTL expired.

Closes #661
yield client.connect()
yield client.hello()
result = yield client.get_notification()
ok_(result != {})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all the ok_(result != {}) in this file should be ok_(result is not None) like you are doing in the other new tests

)

# Ensure we generate the sort-key properly for legacy messges
if key_info["api_ver"] == "00":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could go into the constructor call too

legacy=key_info["api_ver"] == "00"

@bbangert bbangert merged commit c241810 into master Oct 27, 2016
@bbangert bbangert deleted the feat/issue-661 branch October 27, 2016 21:40
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants