Skip to content
This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

Add weekly sync meeting notes 2020-01-20 #52

Merged
merged 2 commits into from
Jan 29, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 66 additions & 0 deletions meeting-notes/2020/2020-01-20--ipld-sync.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# 🖧 IPLD Weekly Sync 🙌🏽 2020-01-20

- **Lead:** @vmx
- **Notetaker:**
- **Attendees:**
- @vmx
- @warfork
- @rvagg
- @mikeal
- @stebalien
- @riba
- **Recording:** https://www.youtube.com/watch?v=ZtPGrwamYxg


## Agenda

- General
- Start recording
- Start live stream
- Find a notetaker
- Ask everyone to put their name into the list of attendees
- Ask everyone to put their items that they've been working on the past two weeks (should be done prior to the meeting)
- Ask for last minute agenda items
- This meeting
- _add your agenda item_


## Weekly Update

@vmx
- Catching up after vacations

@warpfork

- slogging away on implementation details of codegen'd maps (and corresponding improved `ipldfree.Node` maps).
- not much that's worth high-level report report -- didn't get as far as the benchmarks i wanted that would've made a nice summary.
- found a need for key&value child assemblers to need a pointer back to the map assembler -- irritated. increases the complexity of recursion somewhat and adds another wrapper that adds another vtable. it's all viable, but i looked for an alternate way around... and didn't find one. biting the bullet and continuing; will hope for a clever idea to strike later.
- one technical detail worth reporting up to the group: we've debated about whether maps maintain order? can update the considerations to note: it's not high cost to do so in go-ipld-prime (which we might've previously suspected).
- turned out "free"... or more precisely, swallowed up by another cost unambiguously worth paying: the slices used for amortizing a bunch of allocations and preventing 'convT' costs can also be used for ordering.

@rvagg
- Mostly Filecoin focused, CommP - understanding, digging in Go & Rust and extracting utilities to perform CommP calculation outside of Filecoin, including a Lambda service to do it for S3 resources. Leaning heavily on https://github.com/filecoin-project/go-fil-markets/ for the algorithm so we don't have to own much of it.

@mikeal
- reviews, okrs, and other managey stuff
- resolved some threads on car files and data processing, unblocked now.


## Notes

<!-- After each call, the notetaker submits a PR to https://github.com/ipld/team-mgmt to store the notes on the meeting-notes folder -->

- Ordering of map keys
- Discussion regarding map key ordering, coming out of https://github.com/ipld/specs/pull/230 discussion re deterministic selectors - how to be deterministic over maps? We need consistent key ordering rules.
- @mikeal: JavaScript perserves the order the map was written in
- @warpfork: (sorry for dropping out, ~~technical~~ consent-based-software difficulties)
- @warpfork: this is interesting in go-ipld-prime because most of the Node stuff is written *codec-agnostic*. So my *nodes* will maintain order... because they might be used with two different codecs, and maybe those two codecs have different ideas about sorting requirements. /2c.
- Conclusion:
- Dag* formats (like DagCBOR): sort in a certain way in write time, on write time they just read in how it was written => ordered
- Validation should be applied on our strict formats (DAG*) on read - it's an error if the keys appear out of order
- ( riba ) When writing spec make abs. sure **unicode normal form** is specified
- dissent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"dissent" - I meant to question this during the meeting - who added this, @warpfork? can you expand on this, it's a bit of a landmine sitting in here like this.
Or was this just a note that there was dissent but it wasn't significant enough to extrapolate on?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even notice this: definitely worth clarifying before merge...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sorry that was me.

I think we SHOULD make remarks about UTF-8 + NFC normalization being the desired format of strings.

I also think we should use the words "SHOULD" and not "MUST".

Copy link
Contributor

@warpfork warpfork Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Everything MUST be defined to work in a defined way on byte comparison, regardless -- or our system would contain undefined behavior, so this is truly nonoptional
  • It is the "fortunate" (well-designed) case that sorting UTF-8 strings bytewise is correct anyway
  • If the above two weren't enough, it's worth noting that we already have good company in Canonical CBOR's spec, which demands bytewise sorting
  • I had a conversation at a conference in which I said to someone "I'm dealing with unicode normalization issues" and they said "can you just... not" and I was frankly unable to uphold my position against such a simple argument.

I know -- I was in support for aggressive normalization and rule enforcement when this was asked around this time last year. I've updated my position. I'm no longer convinced that there are good things coming of trying to make strings be more than bytes.

Copy link
Contributor

@warpfork warpfork Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have read the https://www.unicode.org/reports/tr15/#Problem_Sequence_Table, I have seen the , and I no longer believe its our place to police strings in this project nor do I want to be the one to do it.

Reliably round-tripping the user's bytes is a more viable strategy in every way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I previously did a roundup issue on this at ipld/specs#67 and that is now also a record frozen at https://github.com/ipld/specs/blob/master/design/history/exploration-reports/2018.07-merkle-segments.md . We should probably update those (if only to say the original author no longer stands by half that) -- or if we really wanna explore this again, start drafting a new exploration report and take that through a comment cycle... and then, yeah, ideally get this into spec somewhere.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is going to get lost and needs to be recorded somewhere more permanent. High quality rant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make sure that this conversation isn't lost. I'll open an issue/PR on the specs repo once ipld/specs#236 is merged. I leave this PR open until everything is merged so that we can link it from the notes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The promised PR: ipld/specs#350


- Discussion about rejecting things on the network level
- @riba: Do it there
- @stebalien: Always do it at the same place, i.e. not one time at parsing time, and one point at networking time