fix: yet another hack to manage max token limit
engineervix committed Jul 11, 2023
1 parent 1445e18 commit 28008a1
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion app/core/summarization/backends/openai.py
@@ -1,4 +1,5 @@
 import logging
+import math
 import textwrap
 
 from langchain import OpenAI, PromptTemplate
@@ -25,7 +26,9 @@ def summarize(content: str, title: str) -> str:
     # Trim the content if it exceeds the available tokens
     # TODO: Instead of truncating the content, split it
     # see <https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/split_by_token>
-    max_chars = max_prompt_tokens * 4  # Assuming 1 token ≈ 4 chars
+    chars = int(max_prompt_tokens * 3.75)  # Assuming 1 token ≈ 4 chars
+    # round down max_chars to the nearest 100
+    max_chars = math.floor(chars / 100) * 100
     if len(content) > max_chars:
         content = textwrap.shorten(content, width=max_chars, placeholder=" ...")

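For reference, here is a minimal standalone sketch of the truncation heuristic this commit introduces, using a hypothetical max_prompt_tokens value (the real value is computed elsewhere in openai.py and is not part of this diff):

import math
import textwrap

max_prompt_tokens = 3000  # hypothetical budget; the real value is computed elsewhere in the module

# Estimate the character budget from the token budget. Multiplying by 3.75 is a
# slightly conservative take on the usual "1 token ~= 4 chars" rule of thumb.
chars = int(max_prompt_tokens * 3.75)

# Round the character budget down to the nearest 100.
max_chars = math.floor(chars / 100) * 100  # 11250 -> 11200 for a 3000-token budget

content = "some long article body " * 2000  # stand-in for the real content being summarized
if len(content) > max_chars:
    # textwrap.shorten collapses runs of whitespace and truncates on a word boundary
    content = textwrap.shorten(content, width=max_chars, placeholder=" ...")

With a 3000-token budget this caps the content at 11,200 characters; the conservative ratio and the rounding presumably leave a little headroom so the final prompt stays under the model's token limit, pending the token-aware splitting mentioned in the TODO.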
