Split attribute block content parsing to a dedicated method
The new method takes the closed range `[first_ix, last_ix]` rather than
the half-open range `[begin_ix, end_ix)`,
because attribute blocks will often be detected by searching backward.

Attribute blocks are usually appended to standard markdown items
(for example `## H2 {.class}` and `![alt](uri){.class}`, although the
latter is not yet supported). Sometimes they are easier to parse backward,
and sometimes easier to parse forward.
In such situations, the asymmetric `[begin_ix, end_ix)` can be error-prone,
because the indices have to be shifted across two different kinds of
boundary (one inclusive, one exclusive).
(For example, `[rend_ix+1, rbegin_ix+1)` is the same range as
`[begin_ix, end_ix)`, but that is not easy to see at a glance.)

The range `[first_ix, last_ix]`, with symmetric boundaries, is easier to
construct even when the attribute block is detected by a backward
search, and is also easier to read and understand correctly.
lo48576 committed Mar 31, 2021
1 parent a13d1b1 commit dbd6836
Showing 1 changed file with 48 additions and 28 deletions.
76 changes: 48 additions & 28 deletions src/firstpass.rs
@@ -406,20 +406,12 @@ impl<'a, 'b> FirstPass<'a, 'b> {
         if let Some(attr_block_open) = header_text.rfind('{').map(|i| header_start + i) {
             let attr_block_end = header_text_end - 1;
             header_text_end = attr_block_open;
-            let attr_block = &self.text[(attr_block_open + 1)..attr_block_end];
-            for attr in attr_block.split_ascii_whitespace() {
-                // iterator returned by `str::split_ascii_whitespace` never emits empty
-                // strings, so `[0]` is always available.
-                match attr.as_bytes()[0] {
-                    b'#' => {
-                        id = Some(CowStr::Borrowed(&attr[1..]));
-                    }
-                    b'.' => {
-                        classes.push(CowStr::Borrowed(&attr[1..]));
-                    }
-                    _ => {}
-                }
-            }
+            self.parse_inside_attribute_block(
+                attr_block_open + 1,
+                attr_block_end - 1,
+                &mut id,
+                &mut classes
+            );
         }
     }
 }
@@ -1062,20 +1054,12 @@ impl<'a, 'b> FirstPass<'a, 'b> {
         if let Some(attr_block_open) = header_text.rfind('{').map(|i| header_start + i) {
             let attr_block_end = ix - 1;
             ix = attr_block_open;
-            let attr_block = &self.text[(attr_block_open + 1)..attr_block_end];
-            for attr in attr_block.split_ascii_whitespace() {
-                // iterator returned by `str::split_ascii_whitespace` never emits empty
-                // strings, so `[0]` is always available.
-                match attr.as_bytes()[0] {
-                    b'#' => {
-                        id = Some(CowStr::Borrowed(&attr[1..]));
-                    }
-                    b'.' => {
-                        classes.push(CowStr::Borrowed(&attr[1..]));
-                    }
-                    _ => {}
-                }
-            }
+            self.parse_inside_attribute_block(
+                attr_block_open + 1,
+                attr_block_end - 1,
+                &mut id,
+                &mut classes
+            );
         }
     }
 }
@@ -1251,6 +1235,42 @@ impl<'a, 'b> FirstPass<'a, 'b> {
             None
         }
     }
+
+    /// Parses the content of an attribute block, such as `.class1 #id .class2`.
+    ///
+    /// Stores the parsed ID in `id` and appends the parsed classes to `classes`.
+    ///
+    /// It is the caller's responsibility to find the opening and closing characters of the
+    /// attribute block.
+    ///
+    /// If `last_ix` is less than `first_ix`, the attribute block is considered empty.
+    fn parse_inside_attribute_block(
+        &self,
+        first_ix: usize,
+        last_ix: usize,
+        id: &mut Option<CowStr<'a>>,
+        classes: &mut Vec<CowStr<'a>>,
+    ) {
+        // Return early when the attribute block is empty.
+        if first_ix > last_ix {
+            return;
+        }
+
+        let attr_block = &self.text[first_ix..=last_ix];
+        for attr in attr_block.split_ascii_whitespace() {
+            // iterator returned by `str::split_ascii_whitespace` never emits empty
+            // strings, so `[0]` is always available.
+            match attr.as_bytes()[0] {
+                b'#' => {
+                    *id = Some(CowStr::Borrowed(&attr[1..]));
+                }
+                b'.' => {
+                    classes.push(CowStr::Borrowed(&attr[1..]));
+                }
+                _ => {}
+            }
+        }
+    }
 }

/// Scanning modes for `Parser`'s `parse_line` method.
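
For readers who want to try the extracted logic in isolation, the following
is a rough standalone sketch, not the code from this commit. It assumes a free
function that takes the text as a parameter and uses plain `&str` in place of
`CowStr`; both choices are simplifications made only for illustration.

```rust
/// Standalone sketch of the helper added in this commit (simplified:
/// free function, `&str` instead of `CowStr`).
///
/// Parses `text[first_ix..=last_ix]`, e.g. `.class1 #id .class2`.
/// If `first_ix > last_ix`, the block is treated as empty.
fn parse_inside_attribute_block<'a>(
    text: &'a str,
    first_ix: usize,
    last_ix: usize,
    id: &mut Option<&'a str>,
    classes: &mut Vec<&'a str>,
) {
    // Return early when the attribute block is empty.
    if first_ix > last_ix {
        return;
    }

    for attr in text[first_ix..=last_ix].split_ascii_whitespace() {
        // `split_ascii_whitespace` never yields empty strings,
        // so the first byte always exists.
        match attr.as_bytes()[0] {
            b'#' => *id = Some(&attr[1..]),
            b'.' => classes.push(&attr[1..]),
            // Anything else is ignored, as in the original loop.
            _ => {}
        }
    }
}
```

A caller would pass the index just after `{` as `first_ix` and the index just
before `}` as `last_ix`, mirroring the `attr_block_open + 1` and
`attr_block_end - 1` arguments in the hunks above.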