feat: introduce support for generating observations for circumvention nettests #48

DecFox · 2023-12-07T05:45:25Z

This diff introduces support for generating observations for the circumvention nettests psiphon and vanilla_tor, using a new CircumventionToolObservation observations class.

hellais · 2023-12-07T09:49:05Z

oonidata/models/observations.py

+
+@add_slots
+@dataclass
+class CircumventionToolObservation(MeasurementMeta):


While this looks really good, I think we should invest a bit of time in defining this Observation data model so that it's flexible enough for all future circumvention tools.

Ideally it would be something that can adapt nicely to new circumvention tools as we put them out and it's probably worth checking with @ainghazal what his thoughts are on the topic.

Some of the considerations to keep in mind are the following:

Schema migrations are a pain, so the fewer we do the better

It's easier to add new columns than it is to change an existing column

We should plan for schema evolution so that the model is as future-proof as possible, and where we do anticipate changes, they happen through the addition of new columns rather than changes to existing ones
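
The additive-only rule above could be enforced mechanically. The following is a minimal sketch, assuming a schema can be described as a mapping from column name to type; the function name, table name, and column names are hypothetical and not taken from oonidata.

```python
from typing import Dict, List


def additive_migration(table: str, old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Return ALTER TABLE statements for the columns that `new` adds to `old`.

    Raises ValueError when an existing column is removed or changes type,
    which are the kinds of migration the considerations above try to avoid.
    """
    for name, col_type in old.items():
        if name not in new:
            raise ValueError(f"column {name!r} was removed")
        if new[name] != col_type:
            raise ValueError(f"column {name!r} changed type: {col_type} -> {new[name]}")
    return [
        f"ALTER TABLE {table} ADD COLUMN {name} {col_type}"
        for name, col_type in new.items()
        if name not in old
    ]


# Hypothetical schemas, for illustration only.
old_schema = {"measurement_uid": "String", "bootstrap_time": "Float64"}
new_schema = {**old_schema, "tool_family": "String"}
statements = additive_migration("obs_circumvention", old_schema, new_schema)
```

A check like this would accept a new column but reject a rename or a type change, so incompatible changes surface at review time rather than at migration time.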

ainghazal · 2023-12-07

I lack broader context about data design for observations, but in principle, I think I'd go for a generic circumvention observation (with perhaps method family and optional flavor or configuration parameters) rather than a flat observation table that tries to accommodate all of them.
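
As a rough illustration of the generic shape suggested here, the sketch below uses a single dataclass with a family, an optional flavor, and free-form configuration parameters. Every field name is an assumption made for the example; none of them is taken from the CircumventionToolObservation class in this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GenericCircumventionObservation:
    # All field names are hypothetical and only illustrate the idea.
    tool_family: str                       # e.g. "psiphon", "tor"
    tool_flavor: Optional[str] = None      # e.g. "vanilla"
    success: Optional[bool] = None
    bootstrap_time: Optional[float] = None
    failure: Optional[str] = None
    # Tool-specific parameters live here, so a new tool needs no new columns.
    config: Dict[str, str] = field(default_factory=dict)


# Illustrative values only, not real measurement data.
psiphon_obs = GenericCircumventionObservation(
    tool_family="psiphon", success=True, bootstrap_time=3.2
)
vanilla_tor_obs = GenericCircumventionObservation(
    tool_family="tor", tool_flavor="vanilla", success=False, failure="timeout"
)
```

The trade-off is that anything stored in `config` is harder to query than a dedicated column, so fields that are common to all tools would likely still deserve columns of their own.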

A couple of quick thoughts though:

If schema migrations are painful, wouldn't it be a good idea to spend some effort on a solution that automates them? (I'm thinking of an equivalent of Django's South.) I guess basically we'd need a version and a way to convert semantically equivalent data for each field, plus the ability to mark a change as backwards incompatible (NA before a version cut).
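
To make the idea concrete, here is a minimal sketch of versioned row converters, assuming rows are plain dictionaries tagged with an integer schema version. The registry, the decorator, and the field names are all hypothetical; this is not how Django's South works internally, only an illustration of "a version plus a converter per change".

```python
from typing import Any, Callable, Dict, Tuple

Row = Dict[str, Any]

# Maps a source schema version to (target version, converter function).
MIGRATIONS: Dict[int, Tuple[int, Callable[[Row], Row]]] = {}


def migration(from_version: int, to_version: int):
    """Register a converter that upgrades a row from one version to the next."""
    def register(fn: Callable[[Row], Row]) -> Callable[[Row], Row]:
        MIGRATIONS[from_version] = (to_version, fn)
        return fn
    return register


@migration(1, 2)
def bootstrap_ms_to_seconds(row: Row) -> Row:
    # Semantically equivalent conversion: milliseconds become seconds.
    out = dict(row)
    out["bootstrap_time"] = out.pop("bootstrap_time_ms") / 1000.0
    return out


@migration(2, 3)
def add_tool_flavor(row: Row) -> Row:
    # Backwards-incompatible change: rows from before the version cut
    # cannot know this value, so it is marked as missing (None).
    out = dict(row)
    out["tool_flavor"] = None
    return out


def upgrade(row: Row, version: int, target: int) -> Row:
    """Apply registered converters in order until `target` is reached."""
    while version < target:
        if version not in MIGRATIONS:
            raise ValueError(f"no migration registered from version {version}")
        version, convert = MIGRATIONS[version]
        row = convert(row)
    return row


upgraded = upgrade({"bootstrap_time_ms": 1500}, version=1, target=3)
```

Chaining small converters keeps each change reviewable on its own, and the explicit `None` makes the "NA before a version cut" case visible in the data.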

One thought I've been entertaining is to draw a "family tree" of circumvention tools (proxy, VPN, onion routing) that captures at least the broad aspects of the protocols, and then allows specifying the parameters that change (for example, Tor over an Obfs4 bridge, with endpoint E, where obfs4 has a version that allows us to compare breaking changes, etc.). Same for VPN: is_vpn=True, but proto=wireguard && transport=tcp && obfuscation=foo.
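
One possible reading of the "family tree" idea, sketched under the assumption that each tool is a node with a single parent and that the changing parameters are plain key/value pairs. The class names and the tree layout are hypothetical; the parameter values mirror the examples in the comment above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ToolNode:
    """A node in a hypothetical family tree of circumvention tools."""
    name: str
    parent: Optional["ToolNode"] = None

    def lineage(self) -> List[str]:
        # Walk up to the root, then return the names from root to leaf.
        node: Optional[ToolNode] = self
        path: List[str] = []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))

    def is_a(self, family: str) -> bool:
        return family in self.lineage()


ROOT = ToolNode("circumvention")
PROXY = ToolNode("proxy", ROOT)
VPN = ToolNode("vpn", ROOT)
ONION = ToolNode("onion_routing", ROOT)
TOR = ToolNode("tor", ONION)


@dataclass
class ToolConfig:
    tool: ToolNode
    # Parameters that vary between deployments of the same tool.
    params: Dict[str, str] = field(default_factory=dict)


# "Tor over an obfs4 bridge, with endpoint E"; the obfs4 version is left out
# because the discussion does not give one.
tor_obfs4 = ToolConfig(TOR, {"bridge": "obfs4", "endpoint": "E"})
vpn_wg = ToolConfig(VPN, {"proto": "wireguard", "transport": "tcp", "obfuscation": "foo"})
```

With this layout, a flag such as is_vpn would not need its own column: it can be derived as `vpn_wg.tool.is_a("vpn")`.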

hellais · 2023-12-07T09:51:01Z

This looks really good, thanks for working on it!

Could you perhaps split this PR into two, keeping the changes that add new Observation data models (the ones for circumvention tools) separate, so that we can land the facebook_messenger observation generation sooner?

For the new Observation tables for circumvention tools, I think we should take a bit more time to consider whether this approach is solid (which it might very well be), and I would not want that to stall landing the facebook_messenger transformer, which is much more straightforward.

DecFox added 4 commits November 27, 2023 18:18

feat: add support for generating psiphon and facebook_messenger observations

a2d4f02

fix: import issues and typos

5dfe693

feat: add CircumventionToolObservation and vanilla_tor

cf0e70f

Merge branch 'main' into issue-32-psiphon

33e6c3f

hellais reviewed Dec 7, 2023

refactor: remove facebook_messenger diff

89e39e2
