Update sinrvec_fr.ipynb (#116)
adding min_freq=0 for documents preprocessing
aberanger authored Nov 29, 2024
1 parent ae6413d commit 6756b0b
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions notebooks/sinrvec_fr.ipynb
@@ -835,14 +835,14 @@
 " './train_#DOC#.txt'),\n",
 " \".\", n_jobs=8)\n",
 "vrt_maker.do_txt_to_vrt(separator=separator)\n",
-"docs_train = ppcs.extract_text('./train_#DOC#.vrt', min_length_doc=-1)\n",
+"docs_train = ppcs.extract_text('./train_#DOC#.vrt', min_freq=0, min_length_doc=-1)\n",
 "\n",
 "vrt_maker = ppcs.VRTMaker(ppcs.Corpus(ppcs.Corpus.REGISTER_WEB,\n",
 " ppcs.Corpus.LANGUAGE_EN,\n",
 " './test_#DOC#.txt'),\n",
 " \".\", n_jobs=8)\n",
 "vrt_maker.do_txt_to_vrt(separator=separator)\n",
-"docs_test = ppcs.extract_text('./test_#DOC#.vrt', min_length_doc=-1)"
+"docs_test = ppcs.extract_text('./test_#DOC#.vrt', min_freq=0, min_length_doc=-1)"
 ]
 },
 {
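
Note on the change: the diff does not show how ppcs.extract_text uses min_freq, so the following is only a sketch under an assumption, namely that min_freq is a token-frequency threshold below which tokens are dropped. The helper filter_by_min_freq and the sample documents are hypothetical and are not part of the library. Under that assumption, min_freq=0 keeps every token, which would explain why it is set explicitly for the train and test documents.

```python
from collections import Counter


def filter_by_min_freq(docs, min_freq):
    """Hypothetical helper (not the library's code): keep only tokens whose
    corpus-wide count is at least min_freq."""
    # Count every token across all documents.
    counts = Counter(tok for doc in docs for tok in doc)
    # Rebuild each document, dropping tokens below the threshold.
    return [[tok for tok in doc if counts[tok] >= min_freq] for doc in docs]


# Toy tokenised documents, invented for illustration.
docs = [["le", "chat", "dort"], ["le", "chien", "aboie"]]

kept_all = filter_by_min_freq(docs, min_freq=0)       # threshold 0: nothing is removed
kept_frequent = filter_by_min_freq(docs, min_freq=2)  # only tokens seen at least twice survive
```

With a threshold of 0 the documents come back unchanged, while a threshold of 2 leaves only "le" in each document, so short documents can lose most of their content when rare tokens are filtered.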