markovify's make_sentence_with_start() doesn't seem to work properly #181
Comments
Hi @nezetimesthree, and thanks for your interest in |
of course. here's the code and text file.

from transformers import pipeline
import random
import markovify

model_link = "IProject-10/bert-base-uncased-finetuned-squad2"
question_answerer = pipeline("question-answering", model=model_link)

with open('mayakovsky.txt', 'r') as file:
    f = file.readlines()

poems = []
poem = ''
dataset = ''
for line in f:
    dataset += line.strip() + '. '
    if line != '\n':
        poem += line.strip() + ' '
    else:
        poems.append(poem)
        poem = ''

context = random.choice(poems)
question = input()
answer = question_answerer(question=question, context=context)['answer']
print(answer, '->', ' '.join(answer.split()[-2:]))

text_model = markovify.Text(' '.join(poems))
if len(answer.split()) > 1:
    print(text_model.make_sentence_with_start(' '.join(answer.split()[-2:]), strict=False, tries=100), end='\n')
else:
    print(text_model.make_sentence_with_start(answer, strict=False, tries=100), end='\n')

for i in range(5):
    print(text_model.make_short_sentence(200, min_length=100, tries=100), end='\n')
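For anyone reproducing this, the poem-splitting loop above can be pulled out into a small standalone helper that needs neither the model download nor the text file. This is only a sketch: `split_poems` is a hypothetical name, not part of the original script or of markovify. It also differs from the loop in two ways: it keeps the final poem when the file does not end with a blank line (the loop above only appends a poem when it reaches a blank line), and it skips empty poems produced by consecutive blank lines.

```python
def split_poems(lines):
    """Group lines into poems, treating blank lines as separators.

    Hypothetical helper for illustration; not part of markovify.
    """
    poems, current = [], []
    for line in lines:
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line closes the poem collected so far.
            poems.append(" ".join(current))
            current = []
    if current:
        # Keep the last poem even without a trailing blank line.
        poems.append(" ".join(current))
    return poems


print(split_poems(["a\n", "b\n", "\n", "c\n"]))  # ['a b', 'c']
```

The returned list can then be passed on the same way as `poems` in the script above.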
Thanks for sharing this, @nezetimesthree. It seems that you're passing to If I've misunderstood, could you share a simpler code example that doesn't depend on other libraries, yet still reproduces the problem? In this example, the logic that uses |
thanks for taking a look, @jsvine. but you're misunderstanding this: LLM gives answers only from the given context, which, in this case, is one of the poems from the file. i've checked the errors in poem dataset, and the words were there always. for some reason, NewlineText didn't see them as a start for sentences. maybe it's because some of the lines consist only of one word? could this be the issue? |
Thank you for the helpful clarification, @nezetimesthree. Could you share a start that the code fails on but that is definitely a start in the corpus? |
hello again, @jsvine. sorry i didn't answer yesterday, but here's the example, the error, and the proof that it's clearly there. |
Thanks; can you share that as copy-pasteable text? |
Great, thanks; that's what I was looking for, indeed. |
Thanks again for the helpful example. Taking a closer look, the issue seems not to be with

import markovify

with open("mayakovsky.txt", "r") as file:
    model = markovify.Text(file.read())

def test_presence(fragment):
    return any(
        any(fragment == token for token in sentence)
        for sentence in model.parsed_sentences
    )

print(test_presence("Послушайте!"))
print(test_presence("слажен"))

Prints:

The default

Using

import markovify

with open("mayakovsky.txt", "r") as file:
    model = markovify.Text(file.read(), well_formed=False)

print(model.make_sentence_with_start("ладно слажен,"))

Prints:
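A related check that may help when a start word looks present in the text but is not matched: `test_presence` above compares whole tokens, so a word followed by punctuation only matches when the punctuation is included (note the trailing comma in the start string used above). The sketch below lists every token that equals a fragment once surrounding punctuation is stripped. `find_token_variants` is a hypothetical helper, not a markovify function, and `sample` is a hand-made stand-in for `model.parsed_sentences`, assumed here to be a list of token lists as in the snippet above.

```python
import string


def find_token_variants(parsed_sentences, fragment):
    """Return the distinct tokens equal to `fragment` after stripping
    leading and trailing punctuation.

    Hypothetical diagnostic helper; not part of markovify.
    """
    strip_chars = string.punctuation + "«»…"
    variants = set()
    for sentence in parsed_sentences:
        for token in sentence:
            if token.strip(strip_chars) == fragment:
                variants.add(token)
    return variants


# Hand-made stand-in for model.parsed_sentences, built from tokens
# quoted in this thread rather than from the real corpus.
sample = [["Послушайте!"], ["ладно", "слажен,"]]

print(find_token_variants(sample, "слажен"))  # {'слажен,'}
```

With a real model, passing `model.parsed_sentences` instead of `sample` would show which exact token forms to use in a start string.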
thank you very much, @jsvine. i will test it and return with the result next week. sorry for making you wait for it, but i just won't have a chance this week. thank you again, and we'll see if this works. |
heya @jsvine. i'm writing some quite simple code with markovify, and i keep running into a couple of issues.
it works in some cases as it's expected to, but in my tests the issues happen a lot more often. i can give you the code if you need it.