Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modeled Coverage - Phase 3 #559

Closed
wants to merge 51 commits into from
Closed

Modeled Coverage - Phase 3 #559

wants to merge 51 commits into from

Conversation

maplant
Copy link
Contributor

@maplant maplant commented Jun 28, 2023

This is mostly complete, there are two remaining tasks that should be relatively quick to complete:

  • Convert unit tests in the reward_share module over
  • Add a cache to the heartbeat deamon to fetch coverage objects so we can invalidate heartbeats that use incorrect coverage objects


fn try_from(v: CoverageObjectIngestReportV1) -> Result<Self> {
Ok(Self {
received_timestamp: v.received_timestamp.to_timestamp_millis()?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason not to implement the MsgTimestamp trait to handle these timestamps here and in subsequent msgs below ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, just didn't see a reason to implement it to only be used one other place

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its a matter of convention and ensuring consistency.

Cargo.toml Outdated Show resolved Hide resolved
mobile_verifier/migrations/13_modeled_coverage.sql Outdated Show resolved Hide resolved
mobile_verifier/src/coverage.rs Outdated Show resolved Hide resolved
mobile_verifier/src/coverage.rs Outdated Show resolved Hide resolved
file_store/src/coverage.rs Outdated Show resolved Hide resolved
}

impl PocShares {
pub async fn aggregate(
impl CoveragePoints {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but points are really just shares. i had the same issue with subscriber rewards. personally i think its better to stick to the shares naming convention

mobile_verifier/src/coverage.rs Outdated Show resolved Hide resolved
// Guaranteed that points contains the given hotspot.
*coverage_points
.get_mut(&hotspot)
.unwrap()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the olde unwrap usage

Copy link
Contributor

@andymck andymck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks to be doing the right thing, will give it another pass when conflicts and tests are fixed

indoor,
// It's pretty weird that we have to convert these back and forth, but it
// shouldn't be too inefficient
coverage: coverage.clone().into_iter().map(Into::into).collect(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following on from the above point, i think this is where the usefulness of having a full suite of conversions in the filestore comes into play ( allowing you to go from internal struct to proto and any child structs such as in this case the coverage struct)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for these coverage objects do we actually need to have an internal type that mirrors the proto type of can we just use the proto type? it looks to me like the only reason we have the internal type is to convert the hex string to an h3o::CellIndex and that's only used to cast to a u64 here for inserting into the db. i think you can avoid this conversion by just having the internal coverage object contain a vec of proto::RadioSignalLevel, do the full string -> CellIndex -> u64 here where you need it and avoid this iter/map/collect

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my take is that we should have an internal type and especially where we are deriving the source data from the DB and then subsequently manipulating that data. I would then argue if we then have to convert the internal type to the proto type that should be handled in the filestore where we can have conversions centralised and have a standard set of conventions ( such as timestamp traits ). Generating protos on the fly leaves more opportunity for mishaps such as mis-setting a timestamp ( secs insteads of millis for example )

@@ -97,6 +97,7 @@ itertools = "*"
data-credits = {git = "https://github.com/helium/helium-program-library.git", tag = "v0.1.0"}
helium-sub-daos = {git = "https://github.com/helium/helium-program-library.git", tag = "v0.1.0"}
price-oracle = {git = "https://github.com/helium/helium-program-library.git", tag = "v0.1.0"}
uuid = {version = "1.3.4", features = ["v4", "serde"]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe not a concern depending on the expected volume of coverage objects, but ulid-rs may be good to keep in mind for "lexicographically sortable" IDs that are UUIDv4 compatible to prevent too much fragmentation in the database

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would absolutely prefer ulids, but we are given the uuids. we don't decide them


fn try_from(v: RadioHexSignalLevelProto) -> Result<Self> {
Ok(Self {
signal_level: SignalLevel::from_i32(v.signal_level).ok_or_else(|| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use v.signal_level() or are you explicitly avoiding the proto behavior of using the default (lowest/first) element of the signal level enum if the oracle doing the try_from() has an older version of the proto code that doesn't account for the variant in a newer message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just happens to be the pattern I see everywhere else in the file store

file_store/src/error.rs Outdated Show resolved Hide resolved
ingest/src/settings.rs Outdated Show resolved Hide resolved
mobile_verifier/migrations/13_modeled_coverage.sql Outdated Show resolved Hide resolved
mobile_verifier/src/coverage.rs Outdated Show resolved Hide resolved
#[derive(Clone, FromRow)]
pub struct HexCoverage {
pub uuid: Uuid,
pub hex: i64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this i64 useful anywhere or would it be better to custom impl FromRow and just do the conversion into an h3o::CellIndex directly out of the db?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not, but having a custom FromRow impl seems excessive

}) = covered_hexes.next().await.transpose()?
{
self.hexes
.entry(CellIndex::try_from(hex as u64).unwrap())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't a guaranteed safe operation unless we're sure we only inserted valid values into the db; h3ron didn't do validation so we've been bitten by trusting registered hexes before once we switched over to h3o

Comment on lines +149 to +152
FROM heartbeats t1
WHERE t1.latest_timestamp = (
SELECT MAX(t2.latest_timestamp)
FROM heartbeats t2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be done with a group-by on cbsd_id to get the latest timestamp per radio instead of having a triple-nested select statement? or have the single sub-select order results by timestamp and only take the first (highest) one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really not sure. We need to fetch the max latest timestamp and the min first timestamp of a group of at least 12 of cbsd_id

mobile_verifier/src/heartbeats.rs Show resolved Hide resolved
maplant and others added 2 commits July 7, 2023 17:50
Co-authored-by: Jeff Grunewald <jeff.grunewald@gmail.com>
mobile_verifier/src/coverage.rs Outdated Show resolved Hide resolved
}

impl CachedCoverage {
pub fn max_distance_km(&self, latlng: LatLng) -> f64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be more accurate or more efficient to do this distance calculation with the native h3 grid_distance functions (grid_disk_distances_fast/safe())? could we confidently say due to RF propagation that coverage has to be within N range of the k-ring from the "source", get the (CellIndex, u32) iterator and return the furthest of those that's also a member of the coverage vec?

@@ -136,11 +187,31 @@ impl HeartbeatReward {
) -> impl Stream<Item = Result<HeartbeatReward, sqlx::Error>> + 'a {
sqlx::query_as::<_, HeartbeatKey>(
r#"
SELECT hotspot_key, cbsd_id, cell_type
WITH coverage_objs AS (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for this query to pull a new coverage_object that is attached to a heartbeat that came after the rewardable period?

.await?;

// Is this a valid delete?
sqlx::query("DELETE FROM hex_coverage WHERE cbsd_id = $1 AND uuid != $2 AND coverage_claim_time < $3")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this query delete a new coverage_object where the coverage_claim_time is in the past?

.map(|poc_rewards_per_share| {
let start_period = epoch.start.encode_timestamp();
let end_period = epoch.end.encode_timestamp();
self.coverage_points
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There has to be a way to make this more readable

@maplant maplant changed the title Modeled Coverage Modeled Coverage - Phase 3 Jul 31, 2023
}

impl CellHeartbeat {
pub fn coverage_object(&self) -> Option<Uuid> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn coverage_object(&self) -> Option<Uuid> {
pub fn coverage_object_id(&self) -> Option<Uuid> {

cbsd_id: &'a str,
coverage_obj: &'a Uuid,
period_end: DateTime<Utc>,
) -> Result<BoxStream<'a, Result<HexCoverage, sqlx::Error>>, sqlx::Error>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these errors overly prescriptive for a trait? why have the trait at all if it always assumes a postgres/sql source?

)
.bind(cbsd_id)
.bind(period_end)
.fetch_one(self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is always guaranteed to be one? what happens when we first activate the feature and it's populating the new record types?

@@ -21,6 +21,7 @@ CREATE TABLE seniority (
seniority_ts TIMESTAMPTZ NOT NULL,
last_heartbeat TIMESTAMPTZ NOT NULL,
uuid UUID NOT NULL,
update_reason INT NOT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is update_reason an int and not an enum?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because then we can have it be the same type as the proto enum

@maplant maplant closed this Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants