Issue with Table Conversion - Last Row Not Converted #558

Closed
syntaxsurge opened this issue Dec 22, 2023 · 1 comment · Fixed by #560

@syntaxsurge

Hello markdown2 team,

I've encountered an issue with the markdown-to-HTML conversion process, specifically concerning tables. The last row of a markdown table isn't being converted correctly into HTML. Below, I've provided a sample of the markdown input and the resulting HTML output for your reference.

Markdown Input:

<p markdown="1"><strong>OpenAI's Growth Trajectory:</strong>
| Version | Parameters | Abilities                       |
|---------|------------|---------------------------------|
| GPT     | 117M       | Basic understanding of language |
| GPT-2   | 1.5B       | More nuanced language processing|
| GPT-3   | 175B       | Highly advanced AI capabilities |</p>

Expected HTML Output:

<p><strong>OpenAI's Growth Trajectory:</strong>
<table>
<thead>
<tr>
  <th>Version</th>
  <th>Parameters</th>
  <th>Abilities</th>
</tr>
</thead>
<tbody>
<tr>
  <td>GPT</td>
  <td>117M</td>
  <td>Basic understanding of language</td>
</tr>
<tr>
  <td>GPT-2</td>
  <td>1.5B</td>
  <td>More nuanced language processing</td>
</tr>
<tr>
  <td>GPT-3</td>
  <td>175B</td>
  <td>Highly advanced AI capabilities</td>
</tr>
</tbody>
</table></p>

Actual HTML Output:

<p><strong>OpenAI's Growth Trajectory:</strong>
<table>
<thead>
<tr>
  <th>Version</th>
  <th>Parameters</th>
  <th>Abilities</th>
</tr>
</thead>
<tbody>
<tr>
  <td>GPT</td>
  <td>117M</td>
  <td>Basic understanding of language</td>
</tr>
<tr>
  <td>GPT-2</td>
  <td>1.5B</td>
  <td>More nuanced language processing</td>
</tr>
</tbody>
</table>

| GPT-3   | 175B       | Highly advanced AI capabilities |</p>

As you can see in the actual output, the last row of the table (pertaining to GPT-3) is not being converted into HTML format and remains in markdown.

I am using the following options for conversion: extras=['tables', 'footnotes', 'markdown-in-html', 'cuddled-lists'].
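For anyone who wants to reproduce this, a minimal sketch follows. It assumes markdown2 is installed and uses the `markdown2.markdown(text, extras=...)` entry point with the extras listed above. The helper `unconverted_table_rows` is a hypothetical name introduced here only to flag lines that still look like raw markdown table rows; it is not part of markdown2.

```python
import re

# Input copied from the report above.
MARKDOWN_INPUT = """<p markdown="1"><strong>OpenAI's Growth Trajectory:</strong>
| Version | Parameters | Abilities                       |
|---------|------------|---------------------------------|
| GPT     | 117M       | Basic understanding of language |
| GPT-2   | 1.5B       | More nuanced language processing|
| GPT-3   | 175B       | Highly advanced AI capabilities |</p>"""

# Extras copied from the report above.
EXTRAS = ["tables", "footnotes", "markdown-in-html", "cuddled-lists"]


def unconverted_table_rows(html):
    """Return lines of `html` that still look like raw markdown table rows.

    Hypothetical helper for this report only: a line counts as "raw" when it
    starts with a pipe (after optional whitespace) and contains another pipe.
    """
    return [line for line in html.splitlines() if re.match(r"^\s*\|.*\|", line)]


if __name__ == "__main__":
    try:
        import markdown2
    except ImportError:
        print("markdown2 is not installed; skipping the live reproduction")
    else:
        html = markdown2.markdown(MARKDOWN_INPUT, extras=EXTRAS)
        print(html)
        print("Unconverted rows:", unconverted_table_rows(html))
```

On an affected version, the last print should list the GPT-3 row; on a version that includes the fix, the list should be empty.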

Could you please look into this issue?

Thank you for your assistance.

@Crozzers
Contributor

This seems to be a continuation of #546. The linked PR (#547) implemented a solution for when markdown-in-html tags are on the same line as the markdown itself. The problem is that this solution is only applied when the snippet is fewer than 3 lines long.

The assumption here is that the snippet would look like this:

<div markdown="1">Some **text**
</div>

This completely breaks down on longer snippets such as:

<div markdown="1">Some **text**
Followed by more **text**
</div>

The fix here would be to either implement a better check for HTML on the same lines as markdown, or to not check at all and always attempt to split it up.
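To illustrate the second option (always attempt the split, regardless of snippet length), here is a rough sketch. It is not markdown2's actual code and not necessarily what the fix does; `split_markdown_in_html` and `_OPEN_TAG` are hypothetical names, and the sketch assumes the snippet begins with a single `markdown="1"` opening tag and ends with the matching closing tag.

```python
import re

# Hypothetical pattern: an opening tag carrying markdown="1" at the very
# start of the snippet. Group 1 is the whole tag, group 2 is the tag name.
_OPEN_TAG = re.compile(
    r'^(<(\w+)\b[^>]*\bmarkdown=["\']?1["\']?[^>]*>)', re.IGNORECASE
)


def split_markdown_in_html(snippet):
    """Split a snippet into (opening_tag, inner_markdown, closing_tag).

    Works for any number of lines. Returns None when the snippet does not
    start with a markdown="1" tag or does not end with the matching close.
    """
    snippet = snippet.strip()
    match = _OPEN_TAG.match(snippet)
    if not match:
        return None
    opening, tag_name = match.group(1), match.group(2)
    closing = "</%s>" % tag_name
    if not snippet.lower().endswith(closing.lower()):
        return None
    # Everything between the two tags is treated as markdown, even when it
    # shares a line with the opening or closing tag.
    inner = snippet[len(opening):len(snippet) - len(closing)]
    return opening, inner.strip("\n"), closing
```

With this approach the three-line `<div>` snippet above and the six-line table from the report are handled the same way as the two-line case, because no line count is checked before splitting.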
