Commit

Backport PR jupyterlab#1094: Continue to allow $ symbols to delimit inline math in human messages (jupyterlab#1095)

Co-authored-by: david qiu <david@qiu.dev>
meeseeksmachine and dlqqq authored Nov 7, 2024
1 parent 0016945 commit 11425e1
Showing 2 changed files with 5 additions and 61 deletions.
8 changes: 4 additions & 4 deletions packages/jupyter-ai-magics/jupyter_ai_magics/providers.py
@@ -57,10 +57,10 @@
 You may use Markdown to format your response.
 If your response includes code, they must be enclosed in Markdown fenced code blocks (with triple backticks before and after).
 If your response includes mathematical notation, they must be expressed in LaTeX markup and enclosed in LaTeX delimiters.
-- Single dollar signs ($) should never be used as delimiters for inline math.
-- Valid inline math: `\\( \\infty \\)`
-- Valid display math: `\\[ \\infty \\]`
-- Invalid inline math: `$\\infty$`
+All dollar quantities (of USD) must be formatted in LaTeX, with the `$` symbol escaped by a single backslash `\\`.
+- Example prompt: `If I have \\\\$100 and spend \\\\$20, how much money do I have left?`
+- **Correct** response: `You have \\(\\$80\\) remaining.`
+- **Incorrect** response: `You have $80 remaining.`
 If you do not know the answer to a question, answer truthfully by responding that you do not know.
 The following is a friendly conversation between you and a human.
 """.strip()
58 changes: 1 addition & 57 deletions packages/jupyter-ai/src/components/rendermime-markdown.tsx
@@ -39,61 +39,6 @@ function escapeLatexDelimiters(text: string) {
     .replace(/\\\]/g, '\\\\]');
 }
 
-/**
- * Type predicate function that determines whether a given DOM Node is a Text
- * node.
- */
-function isTextNode(node: Node | null): node is Text {
-  return node?.nodeType === Node.TEXT_NODE;
-}
-
-/**
- * Escapes all `$` symbols present in an HTML element except those within the
- * following elements: `pre`, `code`, `samp`, `kbd`.
- *
- * This prevents `$` symbols from being used as inline math delimiters, allowing
- * `$` symbols to be used literally to denote quantities of USD. This does not
- * escape literal `$` within elements that display their contents literally,
- * like code elements. This overrides JupyterLab's default rendering of MarkDown
- * w/ LaTeX.
- *
- * The Jupyter AI system prompt should explicitly request that the LLM not use
- * `$` as an inline math delimiter. This is the default behavior.
- */
-function escapeDollarSymbols(el: HTMLElement) {
-  // Get all text nodes that are not within pre, code, samp, or kbd elements
-  const walker = document.createTreeWalker(el, NodeFilter.SHOW_TEXT, {
-    acceptNode: node => {
-      const isInSkippedElements = node.parentElement?.closest(
-        'pre, code, samp, kbd'
-      );
-      return isInSkippedElements
-        ? NodeFilter.FILTER_SKIP
-        : NodeFilter.FILTER_ACCEPT;
-    }
-  });
-
-  // Collect all valid text nodes in an array.
-  const textNodes: Text[] = [];
-  let currentNode: Node | null;
-  while ((currentNode = walker.nextNode())) {
-    if (isTextNode(currentNode)) {
-      textNodes.push(currentNode);
-    }
-  }
-
-  // Replace each `$` symbol with `\$` for each text node, unless there is
-  // another `$` symbol adjacent or it is already escaped. Examples:
-  // - `$10 - $5` => `\$10 - \$5` (escaped)
-  // - `$$ \infty $$` => `$$ \infty $$` (unchanged)
-  // - `\$10` => `\$10` (unchanged, already escaped)
-  textNodes.forEach(node => {
-    if (node.textContent) {
-      node.textContent = node.textContent.replace(/(?<![$\\])\$(?!\$)/g, '\\$');
-    }
-  });
-}
-
 function RendermimeMarkdownBase(props: RendermimeMarkdownProps): JSX.Element {
   // create a single renderer object at component mount
   const [renderer] = useState(() => {
@@ -131,8 +76,7 @@ function RendermimeMarkdownBase(props: RendermimeMarkdownProps): JSX.Element {
       );
     }
 
-    // step 2: render LaTeX via MathJax, while escaping single dollar symbols.
-    escapeDollarSymbols(renderer.node);
+    // step 2: render LaTeX via MathJax
     props.rmRegistry.latexTypesetter?.typeset(renderer.node);
 
     // insert the rendering into renderingContainer if not yet inserted
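
The removed `escapeDollarSymbols` helper used a single regular expression to decide which `$` symbols to escape. The sketch below applies that same expression to plain strings instead of DOM text nodes. `escapeSingleDollars` is a hypothetical name introduced only for illustration and is not part of the commit. It reproduces the three cases listed in the removed comment and adds a fourth, `$x$`, which suggests why `$`-delimited inline math could not render while the helper ran before MathJax typesetting.

```typescript
// Hypothetical helper (not in the commit): the regex from the removed
// escapeDollarSymbols(), applied to a string rather than DOM text nodes.
function escapeSingleDollars(text: string): string {
  // Match a `$` that is not preceded by `$` or `\` and not followed by `$`,
  // then prefix it with a backslash.
  return text.replace(/(?<![$\\])\$(?!\$)/g, '\\$');
}

// Cases from the removed comment.
console.log(escapeSingleDollars('$10 - $5'));      // \$10 - \$5
console.log(escapeSingleDollars('$$ \\infty $$')); // $$ \infty $$ (unchanged)
console.log(escapeSingleDollars('\\$10'));         // \$10 (unchanged)

// Additional case: both delimiters of `$`-delimited inline math are escaped.
console.log(escapeSingleDollars('$x$'));           // \$x\$
```

With both delimiters escaped, the typesetter would presumably treat `$x$` as literal text, which is consistent with the commit title's goal of letting `$` delimit inline math in human messages again.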