First Party Data Module #6099

Closed
bretg opened this issue Dec 8, 2020 · 11 comments
Labels
pinned won't be closed by stalebot taxonomy

Comments

@bretg
Collaborator

bretg commented Dec 8, 2020

Type of issue

Module proposal

Description

If the First Party Data (FPD) revision in #5795 is approved, we propose adding modules that perform a number of FPD additions and validations.

In order to provide a consistent set of first party data, we propose two new modules that perform the following general functions:

  • verify OpenRTB datatypes, and remove or warn about any that are likely to choke downstream readers
  • verify that certain OpenRTB attributes are not specified: just imp for now
  • optionally suppress user FPD based on a TBD opt-out signal (_pubcid_optout)
  • populate available data into the ortb2 object: referer, meta-keywords, cur

Enrichment Actions:

  1. merge HTML meta-keywords tag into global ortb2.site.keywords. Datablocks gives an example of how to do this:
     datablocksBidAdapter.js:    let keywords = document.getElementsByTagName('meta')['keywords'];

  2. merge page referrer into global ortb2.site.ref from getRefererInfo().referer, obsoleting the need to read the bidRequest.refererInfo. Existing values take precedence.

  3. merge page into global ortb2.site.page from getRefererInfo().canonicalUrl. Existing values take precedence.

  4. parse domain out of getRefererInfo().canonicalUrl and merge into global ortb2.site.domain. Existing values take precedence.

  5. set ortb2.device.h and ortb2.device.w from the physical device's height and width if they're not already specified

  6. REMOVED FROM INITIAL RELEASE - set ortb2.cur from getConfig({currency.adServerCurrency}) if it's not already specified
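
A minimal sketch of the "existing values take precedence" rule for the enrichments above. The names `setIfAbsent`, `enrichOrtb2`, and the `env` argument are hypothetical and not part of Prebid.js. `refererInfo` only mimics the `{ referer, canonicalUrl }` shape this issue attributes to getRefererInfo(), and "merge" is assumed here to mean "fill in only when missing".

```javascript
// Hypothetical helper: write obj[key] only if the publisher left it unset
// and there is a usable value to write.
function setIfAbsent(obj, key, value) {
  if (obj[key] === undefined && value !== undefined && value !== null && value !== '') {
    obj[key] = value;
  }
}

// Hypothetical enrichment function. `env` stands in for browser lookups
// (meta keywords, width, height) so the sketch also runs outside a page.
function enrichOrtb2(ortb2, refererInfo, env) {
  // Shallow copies so the caller's objects are not mutated.
  const out = Object.assign({}, ortb2);
  out.site = Object.assign({}, ortb2.site);
  out.device = Object.assign({}, ortb2.device);

  setIfAbsent(out.site, 'keywords', env.metaKeywords);
  setIfAbsent(out.site, 'ref', refererInfo.referer);
  setIfAbsent(out.site, 'page', refererInfo.canonicalUrl);

  if (refererInfo.canonicalUrl) {
    try {
      // URL is a standard global in browsers and in Node.js.
      setIfAbsent(out.site, 'domain', new URL(refererInfo.canonicalUrl).hostname);
    } catch (e) {
      // Unparseable canonical URL: leave site.domain untouched.
    }
  }

  setIfAbsent(out.device, 'w', env.width);
  setIfAbsent(out.device, 'h', env.height);
  return out;
}

const enriched = enrichOrtb2(
  { site: { page: 'https://pub.example/custom' } },
  { referer: 'https://search.example/', canonicalUrl: 'https://www.pub.example/article' },
  { metaKeywords: 'news,sports', width: 1280, height: 720 }
);
// enriched.site.page keeps the publisher's value;
// enriched.site.domain is filled in from the canonical URL.
```

Passing the page-derived values in as arguments keeps the precedence logic testable without a DOM. In a page, `env` would be filled from the meta tag and the chosen size source.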

Validation Actions

  1. Parse ortb2 and verify there aren't any imp[] objects specified. If there are, remove them and log a warning.

  2. Parse the ortb2.site.content.data and ortb2.user.data arrays for validation. This should include the global ortb2 as well as ortb2 specified in setBidderConfig.

  • 'data' must be an array
  • loop through 'data' array
  • there must be a 'name' attr under each element of data
  • if there's an 'ext' attr, it must be an object // maybe more validations here?
  • there must be a 'segment' attr under each element of data and it must be an array
  • loop through 'segment' array: each object must have an 'id' attribute with a string value.
  3. No validations are done on AdUnit-specific FPD.

  4. The following OpenRTB field validations should be performed on global and per-bidder ortb2. If any of them fails, remove the field from the ortb2 data and log a console error.

  • site.{name,domain,page,ref,keywords,search} must be strings
  • site.{cat, sectioncat, pagecat} must be arrays of strings
  • site.{content,publisher} must be objects
  • user.yob must be an integer
  • user.{gender, keywords} must be strings
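
The validation rules above can be sketched roughly as follows. Every name here (`FIELD_RULES`, `isValidDataEntry`, `validateDataArray`, `validateOrtb2`) is hypothetical rather than the module's actual code. Dropping malformed `data` entries is an assumption, since the list above does not say what happens to them, and a single `warn` callback stands in for the separate warning and error logging.

```javascript
const isString = (v) => typeof v === 'string';
const isPlainObject = (v) => typeof v === 'object' && v !== null && !Array.isArray(v);
const isStringArray = (v) => Array.isArray(v) && v.every(isString);

// Type rules copied from the top-level site/user checks listed above.
const FIELD_RULES = {
  site: {
    name: isString, domain: isString, page: isString, ref: isString,
    keywords: isString, search: isString,
    cat: isStringArray, sectioncat: isStringArray, pagecat: isStringArray,
    content: isPlainObject, publisher: isPlainObject
  },
  user: { yob: Number.isInteger, gender: isString, keywords: isString }
};

// A data entry needs a name, an object ext (if present), and a segment
// array whose elements each carry a string id.
function isValidDataEntry(entry) {
  return isPlainObject(entry) &&
    'name' in entry &&
    (!('ext' in entry) || isPlainObject(entry.ext)) &&
    Array.isArray(entry.segment) &&
    entry.segment.every((seg) => isPlainObject(seg) && isString(seg.id));
}

// Assumption: a non-array `data` is removed, and malformed entries are dropped.
function validateDataArray(parent, path, warn) {
  if (!isPlainObject(parent) || !('data' in parent)) return;
  if (!Array.isArray(parent.data)) {
    warn(`${path}.data must be an array; removing it`);
    delete parent.data;
    return;
  }
  parent.data = parent.data.filter((entry, i) => {
    const ok = isValidDataEntry(entry);
    if (!ok) warn(`${path}.data[${i}] is malformed; removing it`);
    return ok;
  });
}

// Hypothetical entry point: mutates and returns `ortb2`.
function validateOrtb2(ortb2, warn = console.warn) {
  if ('imp' in ortb2) {
    warn('ortb2.imp is not allowed; removing it');
    delete ortb2.imp;
  }
  for (const [section, rules] of Object.entries(FIELD_RULES)) {
    const obj = ortb2[section];
    if (!isPlainObject(obj)) continue;
    for (const [field, check] of Object.entries(rules)) {
      if (field in obj && !check(obj[field])) {
        warn(`ortb2.${section}.${field} has the wrong type; removing it`);
        delete obj[field];
      }
    }
  }
  validateDataArray(ortb2.site && ortb2.site.content, 'ortb2.site.content', warn);
  validateDataArray(ortb2.user, 'ortb2.user', warn);
  return ortb2;
}
```

The same function could be applied to the global ortb2 and to each ortb2 object supplied through setBidderConfig, since both are meant to be validated the same way.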

Other Actions

  1. REMOVED FROM INITIAL RELEASE - if the _pubcid_optout localstorage or cookie are defined, then the module should remove the entire ortb2.user object.

  2. It would be good if the module supported an optional config to turn off the validations and/or enrichments:

pbjs.setConfig({
    firstPartyData: {
        skipValidations: true, // default to false
        skipEnrichments: true // default to false
    }
});
  3. The module should provide a function to force a refresh of enrichments and validations in case the publisher's page can change them after the first auction.

The proposed integration model is that this module runs at only two times:

  • On the first AUCTION_INIT -- it reads ortb2, currency, fpdModule, bidderConfig ortb2, etc. and does its work.
  • On forced refreshes it does the same thing.
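
A minimal sketch of that integration model, combined with the skip flags from the config above. `createFpdRunner` and its injected `getConfig`, `enrich`, and `validate` functions are hypothetical stand-ins so the example runs without Prebid.js. How the real module hooks AUCTION_INIT is not shown here.

```javascript
// Hypothetical factory: runs once on the first AUCTION_INIT and again on
// every forced refresh, honoring skipValidations / skipEnrichments.
function createFpdRunner({ getConfig, enrich, validate }) {
  let hasRun = false;

  function run() {
    // Both flags default to false when the config is absent.
    const cfg = getConfig('firstPartyData') || {};
    if (!cfg.skipEnrichments) enrich();
    if (!cfg.skipValidations) validate();
    hasRun = true;
  }

  return {
    // Call from an AUCTION_INIT handler; only the first call does any work.
    onAuctionInit() {
      if (!hasRun) run();
    },
    // Publisher-facing forced refresh; always re-runs.
    refresh() {
      run();
    }
  };
}

const calls = [];
const runner = createFpdRunner({
  getConfig: () => ({ skipValidations: true }),
  enrich: () => calls.push('enrich'),
  validate: () => calls.push('validate')
});
runner.onAuctionInit(); // enrichment runs, validation is skipped
runner.onAuctionInit(); // no-op: the first auction already triggered the work
runner.refresh();       // enrichment runs again
```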
@stale

stale bot commented Dec 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@jdwieland8282
Member

Will the output of this module follow the spec described in #5795 ?

ortb2.site...
ortb2.user...

@bretg
Collaborator Author

bretg commented Mar 30, 2021

@mmoschovas - a suggestion came in today that rather than trigger this behavior on AUCTION_INIT, perhaps it makes more sense to trigger on setConfig({ortb2}) -- this way RTD module additions that come after the first auction would be verified as well.

Maybe we wouldn't need the refresh function at all if we can do that.

@bretg
Collaborator Author

bretg commented Mar 31, 2021

Just added req 13 covering openrtb field validations. At this point I've chosen to only validate the top level site and user attributes. If we go to the second level (content, publisher), there would be an additional 30-something type validations. Before we go that far, want community confirmation that bloating this module with so many ortb checks is worthwhile.

@bretg
Collaborator Author

bretg commented Apr 1, 2021

Discussed in the PBJS committee meeting:

  • device.w/h are ok as viewport size
  • proposed ortb2 validations are ok. We can add more as needed in future revisions

As for AUCTION_INIT vs setConfig, I'm going to propose we stick with AUCTION_INIT: the purpose of the validations is to validate pub-entered data. We do not expect RTD modules to require validation -- they will be coded correctly or fixed.

@patmmccann
Collaborator

patmmccann commented Apr 8, 2021

In general this is a really exciting development. I think we can add validations or pre-populations in future pull requests so that this can move forward with a minimum viable state. For example, it seems floc might be appropriate to pull (see #6510 ) and put in the ortb2 object using this module.

This appears to be a drastic measure:

"If the _pubcid_optout localstorage or cookie are defined, then the module should remove the entire ortb2.user object."

User segments are potentially out of scope of the opt-out, or the opt-out might be quite specific. Also, FLoC is designed to be privacy preserving; it and other privacy-preserving IDs will likely be here.

"set ortb2.cur from getConfig({currency.adServerCurrency}) if it's not already specified"

I am worried this one could break some things. If publishers suddenly start receiving bids in a new currency from a partner, it might require them to unwind some logic. For example, perhaps they are setting a floor in dollars and the ad server currency is yen. Is this potentially breaking?

Ad unit specific logic:

Perhaps a topic for another PR, but adding interstitial or outstream to ad units could be really helpful, as could validating video ad unit OpenRTB params.

@bretg
Collaborator Author

bretg commented Apr 9, 2021

@mmoschovas - based on Patrick's feedback, please update #6452 to not set cur or support optout for now. I've added flags to requirements 6 and 9 above to take them out of the initial release.

@patmmccann
Collaborator

patmmccann commented Apr 16, 2021

Occasionally it seems people want country, for example #6516 and in the PBS meeting today.

I propose we resolve device.geo.country by mapping Intl.DateTimeFormat().resolvedOptions().timeZone to an ISO 3166-1 alpha-2 code using the table at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones. We can then use this table to match it to the ISO 3166-1 alpha-3 code: https://www.iban.com/country-codes.
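
A rough sketch of that lookup, assuming two hand-maintained tables. The three entries below are illustrative only, and a real implementation would need the complete tables from the links above. `countryFromTimeZone` and `enrichGeoCountry` are hypothetical names. The device time zone is only a proxy for the user's country, so an existing value is left alone.

```javascript
// Illustrative subsets only, NOT complete tables.
const TZ_TO_ALPHA2 = new Map([
  ['America/New_York', 'US'],
  ['Europe/Berlin', 'DE'],
  ['Asia/Tokyo', 'JP']
]);
const ALPHA2_TO_ALPHA3 = new Map([
  ['US', 'USA'],
  ['DE', 'DEU'],
  ['JP', 'JPN']
]);

// Returns the ISO 3166-1 alpha-3 code, or undefined for an unknown zone.
function countryFromTimeZone(timeZone) {
  const alpha2 = TZ_TO_ALPHA2.get(timeZone);
  return alpha2 === undefined ? undefined : ALPHA2_TO_ALPHA3.get(alpha2);
}

// Hypothetical enrichment: fills device.geo.country only when it is unset.
function enrichGeoCountry(ortb2, timeZone = Intl.DateTimeFormat().resolvedOptions().timeZone) {
  const country = countryFromTimeZone(timeZone);
  if (country === undefined) return ortb2;
  ortb2.device = ortb2.device || {};
  ortb2.device.geo = ortb2.device.geo || {};
  if (ortb2.device.geo.country === undefined) {
    ortb2.device.geo.country = country;
  }
  return ortb2;
}
```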

@bretg
Collaborator Author

bretg commented Apr 27, 2021

We've discussed this in the Taxonomy committee and propose splitting this module into its component functionalities:

  • First Party Data Enrichment module
  • First Party Data Validation module

The validation functionality consumes about 1KB of code but may not be something pubs need to run in production. Some pubs may prefer to use the validation module only in a test mode.

@bretg
Collaborator Author

bretg commented May 3, 2021

Updated description to keep up with the changes made in the PR to make enrichment and validation two separate modules.

@bretg
Collaborator Author

bretg commented Jun 3, 2021

These have been released.

@bretg bretg closed this as completed Jun 3, 2021