Leaking log statements #93

Closed
tvanyo opened this issue Jul 16, 2022 · 2 comments

Comments

@tvanyo
Contributor

tvanyo commented Jul 16, 2022

I'm using SimplePDFViewer to scrape a PDF in an app that has its own logging configured, and I've discovered that doing so generates unexpected log statements on stdout.

I created a slightly-larger-than-minimal example to illustrate the problem. The attached zip contains the Python script testMVP.py and a test PDF, testPartial.pdf.

To see the problem, extract both files to the same directory and run the program twice, once with the --pdf flag and once without:

python testMVP.py runs without executing the code on lines 23 & 24, so SimplePDFViewer is not run. The expected result is:

Be patient - Extracting text & strings from testPartial.pdf
PDF scrapping complete!
Generated file test…

Note that I'm using logging instead of print to generate text output to the console.

python testMVP.py --pdf runs and executes lines 23 & 24, so SimplePDFViewer is run. The unexpected result is:

Be patient - Extracting text & strings from testPartial.pdf
PDF scrapping complete!
INFO:logTest:PDF scrapping complete!
DEBUG:logTest:doing somethings else…
Generated file test…
INFO:logTest:Generated file test…

You can see that three extra log lines are included, and their format clearly differs from both the stream formatter I set up on line 52 and the file formatter I set up on line 64 of testMVP.py.

Version Information:

  • pdfreader: 0.1.11
  • python: 3.10.2

testMVP.zip

@maxpmaxp
Owner

@tvanyo pdfreader uses the default logger; you may want to do something like

logging.basicConfig(format="%(asctime)s : %(levelname)s : %(message)s", datefmt="%d.%m.%Y %I:%M:%S %p")

before using it.

Feel free to contribute if you notice any logging issues in the project.
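
One mechanism consistent with the reported output can be sketched as follows. It rests on an assumption the thread does not confirm: that the library calls module-level functions such as logging.debug(). In the standard library, those functions call logging.basicConfig() when the root logger has no handlers, which installs a stderr handler using the format %(levelname)s:%(name)s:%(message)s. Records from the application's own logger then propagate to that handler as well. The function names and messages below are hypothetical stand-ins, not pdfreader code.

```python
import contextlib
import io
import logging


def library_call():
    # Hypothetical stand-in for library code that logs through the root logger.
    # logging.debug() calls logging.basicConfig() when the root logger has no
    # handlers; the DEBUG record itself is then dropped because the root
    # logger's level is WARNING.
    logging.debug("internal library detail")


def demo():
    root = logging.getLogger()
    saved_handlers, saved_level = root.handlers[:], root.level
    root.handlers.clear()           # simulate an app that never configured root
    root.setLevel(logging.WARNING)  # the root logger's default level

    # Application logger with its own handler and format.
    app_stream = io.StringIO()
    app_handler = logging.StreamHandler(app_stream)
    app_handler.setFormatter(logging.Formatter("%(message)s"))
    app_log = logging.getLogger("logTest")
    saved_app_level = app_log.level
    app_log.setLevel(logging.DEBUG)
    app_log.addHandler(app_handler)

    leaked = io.StringIO()  # stands in for the console's stderr
    try:
        app_log.info("before library call")
        with contextlib.redirect_stderr(leaked):
            library_call()  # implicitly installs a handler on the root logger
            app_log.info("after library call")
        return app_stream.getvalue(), leaked.getvalue()
    finally:
        # Restore the global logging state touched by the demo.
        root.handlers[:] = saved_handlers
        root.setLevel(saved_level)
        app_log.removeHandler(app_handler)
        app_log.setLevel(saved_app_level)


if __name__ == "__main__":
    app_out, leaked_out = demo()
    print(app_out, end="")
    print(leaked_out, end="")
```

Under this assumption, calling logging.basicConfig(...) up front changes the format of the extra lines, while setting propagate = False on the application logger would keep its records away from the root handler entirely. Neither is confirmed here as the project's fix.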

@tvanyo
Contributor Author

tvanyo commented Jul 17, 2022

All the logging works as expected when I don't use SimplePDFViewer, but as soon as I call SimplePDFViewer I see log messages that, for some reason, repeat the last message passed to a log call. I don't even see the log messages that appear in your code.

I have logging formatters for stdout and for a file handler, and neither matches the output that appears only when I call SimplePDFViewer.

As a test, I modified the logging in all pdfreader files (17 files had import logging) to use a logger named after the module, and the unexpected log messages no longer appear.

Submitting as a pull request…
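
The module-name logger approach described in this comment can be sketched roughly as below. This is a minimal illustration of the general pattern, not the contents of the pull request; the function names and the message are hypothetical. The point is that a logger obtained with logging.getLogger(__name__) never triggers the implicit logging.basicConfig() call, so the root logger stays untouched.

```python
import logging

# Library-side pattern: a logger named after the module rather than the root.
lib_log = logging.getLogger(__name__)
# A NullHandler keeps the library silent unless the application opts in.
lib_log.addHandler(logging.NullHandler())


def library_work():
    # Logger methods never call logging.basicConfig(); only the module-level
    # functions such as logging.warning() do that.
    lib_log.warning("hypothetical library message")


def run_demo():
    root = logging.getLogger()
    saved_handlers = root.handlers[:]
    root.handlers.clear()  # simulate an app that never configured root
    try:
        library_work()
        return len(root.handlers)  # handlers installed on root by the call
    finally:
        root.handlers[:] = saved_handlers


if __name__ == "__main__":
    print(run_demo())
```

With this pattern an application that does want the library's messages can attach its own handler to the library's named logger, or configure the root logger explicitly.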
