[PROPOSAL] Markov sample separator #90

Open
adrienaury opened this issue Mar 1, 2022 · 0 comments

Problem

The Markov mask can be used on different kinds of samples:

  • lists of words
  • paragraphs

For a list of words, we would want to read the file line by line (examples: nameFR, pokemons, etc.).
For entire paragraphs, or text that can be spread over multiple lines, reading line by line would cut a single sample into several pieces.
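
To make the difference concrete, here is a minimal sketch of the two ways of reading a sample file. The function names are hypothetical, and treating a blank line as the paragraph boundary is an assumption; neither comes from the existing mask.

```python
import re


def read_list_samples(text: str) -> list[str]:
    """One sample per non-empty line (e.g. a list of names)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_paragraph_samples(text: str) -> list[str]:
    """One sample per paragraph.

    Assumption: paragraphs are separated by at least one blank line.
    Line breaks inside a paragraph are folded into single spaces.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    return [" ".join(block.split()) for block in blocks if block.strip()]
```

With the first reader, a paragraph wrapped over two lines would become two unrelated samples; with the second, it stays a single sample.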

Proposal

In addition to the separator parameter, which determines how we split the text (word by word, character by character, etc.), we would want a parameter that helps the mask understand the structure of the text:

  • is it a list?
  • is it paragraphs?
  • is it something else?

In any case, the Markov mask should have a default configuration so that it remains usable without any extra settings.

Originally posted by @baguettte in #81 (comment)
