Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move processing to output function #1227

Merged
merged 5 commits into from
Apr 18, 2018
Merged

move processing to output function #1227

merged 5 commits into from
Apr 18, 2018

Conversation

UziTech
Copy link
Member

@UziTech UziTech commented Apr 17, 2018

Marked version: 4711f6b

Description

Contributor

  • Test(s) exist to ensure functionality and minimize regression (if no tests added, list tests covering this PR); or,
  • no tests required for this PR.
  • If submitting new feature, it has been documented in the appropriate places.

Committer

In most cases, this should be a different person than the contributor.

  • Draft GitHub release notes have been updated.
  • CI is green (no forced merge required).
  • Merge PR

@UziTech UziTech requested a review from davisjam April 17, 2018 02:55
lib/marked.js Outdated

if (link) {
href = link[1];
title = link[2].trim().slice(1, -1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change the regex to ...\s+['"(](.*)['")] then you wouldn't need this slice.

lib/marked.js Outdated
href = href[0] === '<' ? href.substring(1, href.length - 1) : href;
title = cap[3] ? cap[3].substring(1, cap[3].length - 1) : cap[3];
if (this.options.pedantic) {
link = /^([^'"(]*[^\s])\s+(['"(].*['")])/.exec(href);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added ( and ) to the list of valid ways to denote a title based on the CommonMark spec. They were not in the original regex. Do we have a test for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we actually want that since this is pedantic. I'm not sure where the pedantic specs are.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closest I've been able to find is from the Daring Fireball site:

https://duckduckgo.com/?q=site%3Adaringfireball.net+markdown&ia=web
https://daringfireball.net/projects/markdown/ - Download the Perl (I've never tried to read Perl)
https://daringfireball.net/projects/markdown/syntax

I was going to work on that after #1219 gets merged in.

We have the tests and assessment for the CommonMark and GFM specs. Next ones for me were going to be (still are), to finish migrating the old and the Perl.

@@ -762,8 +703,18 @@ InlineLexer.prototype.output = function(src) {
src = src.substring(cap[0].length);
this.inLink = true;
href = cap[2];
href = href[0] === '<' ? href.substring(1, href.length - 1) : href;
title = cap[3] ? cap[3].substring(1, cap[3].length - 1) : cap[3];
if (this.options.pedantic) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't test for pedantic much, and I don't see it well documented anywhere. The only docs I could find are in the man page.

Can you explain when I should test for pedantic in the future?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this code right, without pedantic we will set href but not title. If my understanding is correct, why is this behavior desirable?

Copy link
Member

@styfle styfle Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pedantic flag means follow the original 2004 spec from John Gruber (Daring Fireball).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The regexes are grouped based on the options. The link regex that you fixed was in the pedantic group

    inline.pedantic = merge({}, inline.normal, {
  2. The title is set in the else clause. The regex for a non-pedantic link is /^!?\[(label)\]\(href(?:\s+(title))?\s*\)/ which is extremely complex when constructed and probably also error prone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is the actual non-pedantic link regex:

/^!?\[((?:\[[^\[\]]*\]|\\[\[\]]?|`[^`]*`|[^\[\]\\])*?)\]\(\s*(<(?:\\[<>]?|[^\s<>\\])*>|(?:\\[()]?|\([^\s\x00-\x1f()\\]*\)|[^\s\x00-\x1f()\\])*?)(?:\s+("(?:\\"?|[^"\\])*"|'(?:\\'?|[^'\\])*'|\((?:\\\)?|[^)\\])*\)))?\s*\)/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link regex is checking for a title. If no title is found then the href is already set to the whole string inside the parentheses and there is no title.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex that you used before when there was no title (/^(<?[\s\S]*>?)/) literally matches any string and since href is already set to the whole string there is no need to change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further down we then set the title field in the returned object based on the (undefined?) value of title. Do we want to set a default value for title, e.g. ''?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My two cents would be an empty value of expected type. Avoid null and undefined checks and possibilities.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you are saying. Yes title should probably be defined as null or ''

looks like commonmark doesn't differentiate between the title being an empty string or no title: demo

so I will set it to an empty string.

@UziTech
Copy link
Member Author

UziTech commented Apr 17, 2018

I removed the () for title since the original markdown spec didn't allow them

(['"]) # quote char = $5
(.*?) # title = $6
\5 # matching quote

Copy link
Contributor

@davisjam davisjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

zhenalexfan pushed a commit to zhenalexfan/MarkdownHan that referenced this pull request Nov 8, 2021
move processing to output function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants