Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backcompat child of mf2 root #11

Closed
sknebel opened this issue Jul 30, 2017 · 7 comments
Closed

backcompat child of mf2 root #11

sknebel opened this issue Jul 30, 2017 · 7 comments

Comments

@sknebel
Copy link
Member

sknebel commented Jul 30, 2017

This question is to clarify what happens when a backcompat class appears as a child of an mf2 root, since both parser behavior and discussions have varying interpretations.

This markup is derived from what I today encountered on https://indieweb.org/events/2017-04-26-homebrew-website-club,

<span class="h-card vcard">
<a href="http://cherryreds.com">
  <span class="p-name fn p-org org">Cherry Red's</span>
</a>, 
<span class="adr">
  <span class="street-address p-street-address">88-92 John Bright St</span>, 
  <span class="p-locality locality">Birmingham</span>, 
  <abbr class="p-country-name country-name">UK</abbr>
</span></span>

spec reading

from the parsing specification:

  • parse a child element for microformats (recurse)
    • if that child element itself has a microformat ("h-*" or backcompat roots) and is a property element, add it into the array of values for that property as a { } structure, add to that { } structure:
      • value:
        • if it's a p-* property element, use the first p-name of the h-* child
        • else if it's an e-* property element, re-use its { } structure with existing value: inside.
        • else if it's a u-* property element and the h-* child has a u-url, use the first such u-url
        • else use the parsed property value per p-*, u-*, dt-* parsing respectively
  • else add found elements that are microformats to the "children" array

While the section for properties explicitly calls out microformats as "h-*" or backcompat roots, the one for children does not, but given the wording I'd still interpret it as meaning the same.

Also, the note backward compatibility details section says that backcompat classes are ignored in mf2 and mf2 classes in mf1 are ignored

without an intervening root class name

meaning a backcompat root can change processing to backcompat, and this is not explicitly limited to properties

parser results

Of the 4 main parsers supporting backcompat, there are 3 different results (unrelated bugs have been removed/ignored):

1A: child object

In what would also be my interpretation, mf2py and microformats-ruby recognize the adr class as a backcompat object and, since it doesn't have a property name, turn it into a child of the surrounding h-card:

{
  "items": [
    {
      "type": ["h-card"],
      "properties": {
        "name": ["Cherry Red's"],
        "org": ["Cherry Red's"],
        "url": ["http://cherryreds.com"]
      },
      "children": [
        {
		"type": ["h-adr"],
          "properties": {
            "street-address": ["88-92 John Bright St"],
            "locality": ["Birmingham"],
            "country-name": ["UK"]
          }
        }
      ]}]}

1B: ignore adr

php-mf2 ignores the adr and consequently parses the mf2 properties inside it as properties of the surrounding h-card (which is in this case the result the original author probably intended)

{
  "items": [
    {
      "type": ["h-card"],
      "properties": {
        "name": ["Cherry Red's"],
        "org": ["Cherry Red's"],
        "url": ["http://cherryreds.com"],
        "street-address": ["88-92 John Bright St"],
        "locality": ["Birmingham"],
        "country-name": ["UK"]
        }}]}

1C: apply backcompat rules =>as if "p-adr h-adr"

microformats-shiv (or more specifically, microformats-node based on it) seems to apply the backcompat rules for vcard.adr, turning the adr into the equivalent of p-adr h-adr. This IMHO clearly shouldn't be applied to a MF2 root like h-card, but there is an example in the test suite requiring this behavior(?!).

{
  "items": [
    {
	  "type": ["h-card"],
	  "properties": {
	    "name": ["Cherry Red's"],
		"org": ["Cherry Red's"],
		"url": ["http://cherryreds.com"],
		"adr": [
		  {
		    "value": "88-92 John Bright St, \n   Birmingham, \n   UK", 
			"type": ["h-adr"],
			"properties": {
			  "street-address": ["88-92 John Bright St"],
			  "locality": ["Birmingham"], "country-name":["UK"]
		    }
		  }]
		}}]}

(Side note: If the markup is changed from class="adr" to class="p-adr adr" (backcompat property value instead of child), microformats-shiv, mf2py and microformats-ruby produce the same result as 1C, whereas mf2-php flattens the "adr" property to a string)

Related material

In this issue, @aaronpk says:

Based on the mf2 wiki, I'm not totally clear whether this should be ignored or not: http://microformats.org/wiki/microformats2-parsing#note_backward_compatibility_details However, it was surprising to me that it's there, and I don't want it to show up in the mf2 parsed version.

and @gRegorLove seems to say that nested mf1 should be ignored.

@Zegnat
Copy link
Member

Zegnat commented Jul 30, 2017

My interpretation matched that of sknebel so I would lean towards 1A.

1B seems clearly wrong as the adr class is indicative of the old adr and has backwards compatibility parsing defined by h-adr.

I get where 1C is coming from but I feel its definition is confusing. If this rule should exist because adr was often used as a direct child of a different object without property definition (as this distinction wasn’t made in mf1) I would love to see the specification making this more explicit. Something like:

If a backwards compatible adr object is found as a child (not property) of an h-card or h-event, do not add it to children and instead add it as p-adr.


Also see the discussion between @sknebel and myself on IRC.

@gRegorLove
Copy link
Member

adr (and geo) is a confusing one I always have to spend some time wrapping my head around when it comes to backcompat. My understanding is that it is always a root mf1 class, and sometimes a property of vcard

So adr appearing inside an h-card should be ignored as an mf1 property, but it should be backcompat parsed as an mf1 root, per http://microformats.org/wiki/h-adr#Backward_Compatibility, which would make it a child object.

So I think 1A is correct as the spec is worded now. 1B appealed to me at first since it's closer to what the publisher intended, but I think it's reasonable for publishers that are adding/updating to mf2 to be expected to update vcard => h-card and adr => p-adr h-adr so their intent is clear. I'd prefer that over the suggested backcompat exception.

Other discussions:
https://chat.indieweb.org/microformats/2017-05-01#t1493664172920000
https://chat.indieweb.org/microformats/2017-03-06#t1488831469687000

@sknebel
Copy link
Member Author

sknebel commented Dec 17, 2017

We just encountered this difference in handling between mf2py and php-mf2 while trying to help debug Bridgy on a WordPress site with the following structure:

<body class="h-entry">
  <div id="page" class="hfeed site wrap">
    <h1 class="entry-title"><span class='p-name'>title</span></h1>
    other content
    <div class="entry-content">
      <div class="e-content">this is a test for indieweb post </div> <span class="syn-text">Also on:</span>
<!--syndication links -->
    </div>
  </div>
</body>

Again, completely ignoring the backcompat hfeed leads to the result from an mf2 parser the author intended (so 1B above), while mf2py turned the hfeed into a backcompat root (it handled it's properties wrongly, but that is an unrelated issue), following 1A and what IMHO is closest to the wording here. In mf2py the h-entry thus had no properties (since they were inside the hfeed) and the parser fell back to implied-property handling, causing the entire page content, menus and all, to appear in the name property and being published to Twitter.

1A results in something like (note no properties on the feed, due to there being no mf1 properties, something mf2py currently does wrong)

{"items": [
{"type": ["h-entry"],
"properties": {
  "name": ["titleother content\nthis is a test for indieweb post Also on:"]
},
"children": [
  {
    "type": ["h-feed"],
    "properties": {}

1B results in

"items": [
 {
   "type": ["h-entry"],
   "properties": {
      "name": ["title"],
      "content": [
        {
          "html": "this is a test for indieweb post ",
          "value": "this is a test for indieweb post"
        }
      ]

EDIT: added examples for outputs.

@sknebel
Copy link
Member Author

sknebel commented Dec 17, 2017

Again, 1B (ignoring mf1 inside mf2 roots completely) leads to the more usable result.

But as laid out in the first post does not seem to match the spec language, which presumably is there for a reason: Can anyone involved in writing the backcompat rules clarify the thoughts behind allowing mf1 nested in mf2? I could not find anything on the old issues pages. I guess that's @tantek @glennjones?

I guess this turns into 2 questions:

  1. confirm that the current spec means A?
  2. if it does, is that what we want despite scenarios where B produced the "better" output (counter-examples?), or should the spec adjust to follow php-mf2 here?

@kartikprabhu
Copy link
Member

The new mf2py https://github.com/kartikprabhu/mf2py/ should consistently do 1A according to current spec. Checked on above examples.

@gRegorLove
Copy link
Member

Confirming php-mf2 master branch does 1A now as well. Note the second example doesn't imply the h-entry name due to #6

{
    "items": [
        {
            "type": [
                "h-card"
            ],
            "properties": {
                "name": [
                    "Cherry Red's"
                ],
                "org": [
                    "Cherry Red's"
                ],
                "url": [
                    "http://cherryreds.com"
                ]
            },
            "children": [
                {
                    "type": [
                        "h-adr"
                    ],
                    "properties": {
                        "street-address": [
                            "88-92 John Bright St"
                        ],
                        "locality": [
                            "Birmingham"
                        ],
                        "country-name": [
                            "UK"
                        ]
                    }
                }
            ]
        }
    ],
    "rels": {},
    "rel-urls": {},
    "debug": {
        "package": "https://packagist.org/packages/mf2/mf2",
        "version": "dev-master",
        "note": [
            "This output was generated from the php-mf2 library available at https://github.com/indieweb/php-mf2",
            "Please file any issues with the parser at https://github.com/indieweb/php-mf2/issues"
        ]
    }
}
{
    "items": [
        {
            "type": [
                "h-entry"
            ],
            "properties": [],
            "children": [
                {
                    "type": [
                        "h-feed"
                    ],
                    "properties": []
                }
            ]
        }
    ],
    "rels": {},
    "rel-urls": {},
    "debug": {
        "package": "https://packagist.org/packages/mf2/mf2",
        "version": "dev-master",
        "note": [
            "This output was generated from the php-mf2 library available at https://github.com/indieweb/php-mf2",
            "Please file any issues with the parser at https://github.com/indieweb/php-mf2/issues"
        ]
    }
}

@sknebel
Copy link
Member Author

sknebel commented Mar 12, 2018

Thanks for the updates @gRegorLove and @kartikprabhu!

Since everyone seems on the same page about what the spec says, we have active parser implementers confirming that their parsers match, the wrong test case has been fixed, and a bug against the JS parser has been filed, I'll close this as is.

Thanks everyone!

@sknebel sknebel closed this as completed Mar 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants