Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loguru doesn't remove uncompressed files after compressing them #309

Closed
dstlmrk opened this issue Aug 19, 2020 · 15 comments · Fixed by #325
Closed

Loguru doesn't remove uncompressed files after compressing them #309

dstlmrk opened this issue Aug 19, 2020 · 15 comments · Fixed by #325
Labels
bug Something isn't working

Comments

@dstlmrk
Copy link
Contributor

dstlmrk commented Aug 19, 2020

Hello,

I use this definition for compressing but I expected different behaviour:

handlers.append({
    "rotation": "03:00",
    "compression": "gz",
    "retention": 10 days",
    ...
})

All my raw log files are compressed every night. It's working. But I don't need them (uncompressed files) if they are already compressed. Loguru doesn't remove them and I don't know if it's wanted behaviour. Because now it's not saving any space on disc.

Should I use my custom callable which removes all old uncompromised files?

@Delgan
Copy link
Owner

Delgan commented Aug 19, 2020

Hum... That's not the expected behavior. The .log file should be removed once compression is done.

May I ask you which loguru version you're using please? Are you seeing any error message in your logs (I'm thinking to permission issues for example)? You may try to create a minimal script with rotation="10 kB" and a basic for loop to see if you can reproduce the problem consistently and hence investigate it. 😕

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Sep 11, 2020

Hi! I will close this issue in next days after test probably. I was getting this traceback in uwsgi log:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/loguru/_handler.py", line 176, in emit
    self._sink.write(str_record)
  File "/usr/local/lib/python3.8/site-packages/loguru/_file_sink.py", line 174, in write
    self._terminate_file(is_rotating=True)
  File "/usr/local/lib/python3.8/site-packages/loguru/_file_sink.py", line 210, in _terminate_file
    self._compression_function(old_path)
  File "/usr/local/lib/python3.8/site-packages/loguru/_file_sink.py", line 66, in compression
    compress_function(path_in, path_out)
  File "/usr/local/lib/python3.8/site-packages/loguru/_file_sink.py", line 53, in copy_compress
    with open(path_in, "rb") as f_in:
FileNotFoundError: [Errno 2] No such file or directory: '/var/log/app/app_2020-09-07_12-34-24_908563.log'

Then I found the issue #220 and there is info about enqueue param. Because my app is multiprocessing it makes sense I need the enqueue=True :)

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Sep 11, 2020

Ok that is bigger problem than I thought :( If I want have only one logging file for whole app instead of file for each process then I need enqueue=True. But there is problem with killing processes. For this example I created simple web server with Flask and uwsgi:

# main.py

import os
from loguru import logger

print("pid=%s, configure logger" % os.getpid())
logger.remove()
logger.add("file.log", enqueue=True)
logger.info("Logger is configured")

from flask import Flask
application = Flask(__name__)

@application.route('/')
def hello_world():
    pid = os.getpid()
    print(pid)
    logger.info(pid)
    return str(pid)

I can run ti with:

uwsgi --http :9090 --wsgi-file main.py --master --processes 2 --threads 1

Now there are two options:

First option: enqueue=False

It works normally (but it creates logging file for every process). I can choose random process (pstree -p | grep uwsgi) and kill one of them. In uwsgi log I can see:

DAMN ! worker 1 (pid: 81233) died :( trying respawn ...
Respawned uWSGI worker 1 (new pid: 81356)

...and web server is still working. In log files I see logs.

Second option: enqueue=True

I'm trying to kill one of processes. But after killing one process the whole app is not responding and logging is absolutely dead :( I tried this code, but it's not working too:

import atexit

def wait_for_loguru():
    # called in exiting child
    logger.complete()

atexit.register(wait_for_loguru)

@Delgan
Copy link
Owner

Delgan commented Sep 11, 2020

Oh, great you figured out it was related with enqueue and multiprocessing. That's one step forward!

What do you mean by "killing the whole app"?

The code you shared helped me a lot to be able to understand your issue, thanks a lot for that. However, I'm not able to reproduce the bug you're facing. If I start the server with enqueue=True and then kill one of the spawned child process, then it restarted successfully and logs continue to be written. If I kill the parent process, then of course the app can no longer deliver any route.

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Sep 13, 2020

Sorry I wanted to say "After killing one process the whole app is not responding" :)

I use these versions:

  • loguru 0.5.2
  • Flask 1.1.2
  • uwsgi 2.0.18
  • python 3.8.2
  • OS: Ubuntu 20.04.1 LTS

There is obvious problem with killing uwsgi workers, because if I press Ctrl+C during running uwsgi I see timeout at the end of log (it takes more than 30 seconds).

*** Starting uWSGI 2.0.18 (64bit) on [Sun Sep 13 18:55:38 2020] ***
compiled with version: 9.2.1 20191008 on 17 March 2020 11:53:04
os: Linux-5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020
nodename: enterprise
machine: x86_64
clock source: unix
detected number of CPU cores: 8
current working directory: /home/dstlmrk/projekty/dev/loguru-rotation
detected binary path: /home/dstlmrk/.local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
your processes number limit is 62429
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on :9090 fd 4
uwsgi socket 0 bound to TCP address 127.0.0.1:42139 (port auto-assigned) fd 3
Python version: 3.8.2 (default, Jul 16 2020, 14:00:26)  [GCC 9.3.0]
Python main interpreter initialized at 0x55b22d3c4a50
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 218760 bytes (213 KB) for 2 cores
*** Operational MODE: preforking ***
pid=80749, configure logger
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x55b22d3c4a50 pid: 80749 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 80749)
spawned uWSGI worker 1 (pid: 80752, cores: 1)
spawned uWSGI worker 2 (pid: 80753, cores: 1)
spawned uWSGI http 1 (pid: 80754)
^CSIGINT/SIGQUIT received...killing workers...
gateway "uWSGI http 1" has been buried (pid: 80754)
Sun Sep 13 18:56:40 2020 - worker 1 (pid: 80752) is taking too much time to die...NO MERCY !!!
Sun Sep 13 18:56:40 2020 - worker 2 (pid: 80753) is taking too much time to die...NO MERCY !!!
worker 1 buried after 1 seconds
worker 2 buried after 1 seconds
goodbye to uWSGI.

There is no waiting with enqueue=False at the end:

*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 74611)
spawned uWSGI worker 1 (pid: 74613, cores: 1)
spawned uWSGI worker 2 (pid: 74614, cores: 1)
spawned uWSGI http 1 (pid: 74615)
^CSIGINT/SIGQUIT received...killing workers...
gateway "uWSGI http 1" has been buried (pid: 74615)
worker 1 buried after 1 seconds
worker 2 buried after 1 seconds
goodbye to uWSGI.

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Sep 13, 2020

One of my colleagues has the same problem on Mac

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Sep 13, 2020

Maybe I could create Dockerfile + main.py with this bug if you want :)

@Delgan
Copy link
Owner

Delgan commented Sep 13, 2020

Thank you for details! I can indeed reproduce the problem, the program freezes when I press Ctrl+C.

I'm not sure what causes the problem exactly, but if you configure the logger only if __name__ == "__main__", that seems to fix the issue.

if __name__ == "__main__":
    logger.remove()
    logger.add("file.log", enqueue=True)
    logger.info("Logger is configured")

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Sep 14, 2020

It seems good but it's not. This code is not executed as expected. Then the behaviour is the same like default loguru settings (no logging to file).

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Sep 14, 2020

I believe I know where is problem. I created pull request #325. I hope it will helps you :)

Delgan added a commit that referenced this issue Sep 20, 2020
@Delgan Delgan added the bug Something isn't working label Sep 26, 2020
@Delgan
Copy link
Owner

Delgan commented Oct 1, 2020

Hi again @dstlmrk!

Quick question about the snippet you shared a few comments above:

# main.py

import os
from loguru import logger

print("pid=%s, configure logger" % os.getpid())
logger.remove()
logger.add("file.log", enqueue=True)
logger.info("Logger is configured")

from flask import Flask
application = Flask(__name__)

Any reason for importing flask after logger.add() / logger.info() and not at the top of the file? I noticed the program sometimes hangs if the import is done earlier. Have you noticed the same thing and if so, have you any idea why it's behaving that strangely?

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Oct 8, 2020

Hello @Delgan. I spent a few hours but probably I found the problem. It's combination forking and locking the SimpleQueue.

When you are pushing/putting messages to/from SimpleQueue it involves acquiring a lock. And because you can't control the execution of forking in uwsgi there is possibility the lock is copied in an acquired state.

The solution is to don't call logger.info() in master process or don't use enqueue=True probably. I will try more because our production application has still problem with logging.

Read more here: https://pythonspeed.com/articles/python-multiprocessing/

You can try this code. It includes SImpleQueue and forking (because it does uwsgi). You have to find the moment when the lock is copied in an acquired state (the child is completely dead).

import os
import time
import multiprocessing
from threading import Thread
from os import fork, getpid


def queued_writer():
    while True:
        try:
            print(f"[PID={os.getpid()}] queue.get()")
            message = queue.get()
        except Exception:
            print("exception")
            continue

        if message is None:
            break

        if message is True:
            continue

        print(f"[PID={os.getpid()}] write '{message}'")


queue = multiprocessing.SimpleQueue()
thread = Thread(target=queued_writer, daemon=True, name="writer")
thread.start()

queue.put(
    "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."
    "Mauris tincidunt sem sed arcu. Fusce wisi. Aliquam erat volutpat."
    "Praesent id justo in neque elementum ultrices."
    "Morbi scelerisque luctus velit. Mauris metus."
    "Suspendisse sagittis ultrices augue. Pellentesque sapien."
)

if fork():
    while True:
        print("parent")
        time.sleep(1)
else:
    while True:
        print("child")
        time.sleep(1)

@Delgan
Copy link
Owner

Delgan commented Oct 10, 2020

Thank you for taking the time to investigate, it's much more what I was asking for!

However, I am somewhat surprised because the problem you describe was fixed a few months ago! This led to the creation of the _lock_machinery.py file which makes use of os.register_at_fork() to acquire lock before forking and release them once fork is done, preventing possible deadlock in child process.

The article you shared describes the difficulties of mixing multithreading with multiprocessing and why it's discouraged. It was definitely a problem with enqueue=True and internal threading locks used by loguru, as the state is copied into the child process. However, in this case, the SimpleQueue uses a mutiprocessing.Lock and not a threading.Lock. Its state is not copied but shared. This means its state while forking is therefore of little importance, as long as the parent releases it at some point.

To be more specific about the snippet you just shared. By running it multiple times, it indeed happens that the child seems blocked. Here is the twist: it happens because of print() and not because of queue.get()!

As I said, this is a problem experimented and fixed a few months ago: #231 (comment)
If the lock used by print() is acquired during forking, the child will deadlock while writing to stdout.

Maybe all this is closely or remotely associated with the compatibility issues between loguru and uwsgi. But I'm not sure sure, changing the place where import flask is done is enough to make the application buggy! 😕

@dstlmrk
Copy link
Contributor Author

dstlmrk commented Oct 12, 2020

Yeah but os.register_at_fork() doesn't work in this case. I'm pretty sure the fork is created by C libraries because uwsgi is written in C. You can try it but registered methods are not executed at all. Gunicorn and others uwsgi servers have the same problem. I don't believe it has some solution.

Btw the problem is not print() in child process but queue.get() in queued_writer(). The moment when thread is getting data from a queue and fork is happening is a problem for the app. You can solve it by putting time.sleep(0.1) before fork(). But it helps only once at start because uwsgi usually kills and spawns by forking child processes in time period because of cleaning memory.

@Delgan
Copy link
Owner

Delgan commented Oct 12, 2020

Yes, indeed. When I said that the problem came from print(), I was only referring to the snippet you shared. Removing the print() solves the issue. However, in a context including uwsgi, I'm not sure how it's going. As I already faced some surprises with current_process(), there are maybe some problem with queue.get() too. Too bad uwsgi is not operating well with multiprocessing.

jhamon pushed a commit to pinecone-io/pinecone-python-client that referenced this issue Oct 11, 2023
Bumps [loguru](https://github.com/Delgan/loguru) from 0.5.0 to 0.5.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Delgan/loguru/releases">loguru's
releases</a>.</em></p>
<blockquote>
<h2>0.5.3</h2>
<ul>
<li>Fix child process possibly hanging at exit while combining
<code>enqueue=True</code> with third party library like
<code>uwsgi</code> (<a
href="https://github.com/Delgan/loguru/issues/309">#309</a>,
thanks <a
href="https://github.com/dstlmrk"><code>@​dstlmrk</code></a>).</li>
<li>Fix possible exception during formatting of non-string messages (<a
href="https://github.com/Delgan/loguru/issues/331">#331</a>).</li>
</ul>
<h2>0.5.2</h2>
<ul>
<li>Fix <code>AttributeError</code> within handlers using
<code>serialize=True</code> when calling <code>logger.exception()</code>
outside of the context of an exception (<a
href="https://github.com/Delgan/loguru/issues/296">#296</a>).</li>
<li>Fix error while logging an exception containing a non-picklable
<code>value</code> to a handler with <code>enqueue=True</code> (<a
href="https://github.com/Delgan/loguru/issues/298">#298</a>).</li>
<li>Add support for async callable classes (with <code>__call__</code>
method) used as sinks (<a
href="https://github.com/Delgan/loguru/issues/294">#294</a>,
thanks <a
href="https://github.com/jessekrubin"><code>@​jessekrubin</code></a>).</li>
</ul>
<h2>0.5.1</h2>
<ul>
<li>Modify the way the <code>extra</code> dict is used by
<code>LogRecord</code> in order to prevent possible
<code>KeyError</code> with standard <code>logging</code> handlers (<a
href="https://github.com/Delgan/loguru/issues/271">#271</a>).</li>
<li>Add a new <code>default</code> optional argument to
<code>logger.catch()</code>, it should be the returned value by the
decorated function in case an error occurred (<a
href="https://github.com/Delgan/loguru/issues/272">#272</a>).</li>
<li>Fix <code>ValueError</code> when using <code>serialize=True</code>
in combination with <code>logger.catch()</code> or
<code>logger.opt(record=True)</code> due to circular reference of the
<code>record</code> dict (<a
href="https://github.com/Delgan/loguru/issues/286">#286</a>).</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Delgan/loguru/blob/master/CHANGELOG.rst">loguru's
changelog</a>.</em></p>
<blockquote>
<h1><code>0.5.3</code>_ (2020-09-20)</h1>
<ul>
<li>Fix child process possibly hanging at exit while combining
<code>enqueue=True</code> with third party library like
<code>uwsgi</code>
(<code>[#309](Delgan/loguru#309)
&lt;https://github.com/Delgan/loguru/issues/309&gt;</code><em>, thanks
<code>@dstlmrk &lt;https://github.com/dstlmrk&gt;</code></em>).</li>
<li>Fix possible exception during formatting of non-string messages
(<code>[#331](Delgan/loguru#331)
&lt;https://github.com/Delgan/loguru/issues/331&gt;</code>_).</li>
</ul>
<h1><code>0.5.2</code>_ (2020-09-06)</h1>
<ul>
<li>Fix <code>AttributeError</code> within handlers using
<code>serialize=True</code> when calling <code>logger.exception()</code>
outside of the context of an exception
(<code>[#296](Delgan/loguru#296)
&lt;https://github.com/Delgan/loguru/issues/296&gt;</code>_).</li>
<li>Fix error while logging an exception containing a non-picklable
<code>value</code> to a handler with <code>enqueue=True</code>
(<code>[#298](Delgan/loguru#298)
&lt;https://github.com/Delgan/loguru/issues/298&gt;</code>_).</li>
<li>Add support for async callable classes (with <code>__call__</code>
method) used as sinks
(<code>[#294](Delgan/loguru#294)
&lt;https://github.com/Delgan/loguru/pull/294&gt;</code><em>, thanks
<code>@jessekrubin
&lt;https://github.com/jessekrubin&gt;</code></em>).</li>
</ul>
<h1><code>0.5.1</code>_ (2020-06-12)</h1>
<ul>
<li>Modify the way the <code>extra</code> dict is used by
<code>LogRecord</code> in order to prevent possible
<code>KeyError</code> with standard <code>logging</code> handlers
(<code>[#271](Delgan/loguru#271)
&lt;https://github.com/Delgan/loguru/issues/271&gt;</code>_).</li>
<li>Add a new <code>default</code> optional argument to
<code>logger.catch()</code>, it should be the returned value by the
decorated function in case an error occurred
(<code>[#272](Delgan/loguru#272)
&lt;https://github.com/Delgan/loguru/issues/272&gt;</code>_).</li>
<li>Fix <code>ValueError</code> when using <code>serialize=True</code>
in combination with <code>logger.catch()</code> or
<code>logger.opt(record=True)</code> due to circular reference of the
<code>record</code> dict
(<code>[#286](Delgan/loguru#286)
&lt;https://github.com/Delgan/loguru/issues/286&gt;</code>_).</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/Delgan/loguru/commit/f31e97142adc1156693a26ecaf47208d3765a6e3"><code>f31e971</code></a>
Bump version to 0.5.3</li>
<li><a
href="https://github.com/Delgan/loguru/commit/3394fdbd04efa7f5ec9c9625ec4ec34ca72b16e7"><code>3394fdb</code></a>
Update Changelog to reference <a
href="https://github.com/Delgan/loguru/issues/309">#309</a>
fix</li>
<li><a
href="https://github.com/Delgan/loguru/commit/44f677172ccf10ec60cbc46e4ac77394429211a7"><code>44f6771</code></a>
Fix removing simple queue in child processes (<a
href="https://github.com/Delgan/loguru/issues/325">#325</a>)</li>
<li><a
href="https://github.com/Delgan/loguru/commit/296635c33cbc735c3367728f24f07545dbfe19c5"><code>296635c</code></a>
Fix possible exception while formatting non-string message (<a
href="https://github.com/Delgan/loguru/issues/331">#331</a>)</li>
<li><a
href="https://github.com/Delgan/loguru/commit/b77f4fd23ac380fa2521dd2039c961af1218d1d1"><code>b77f4fd</code></a>
Correct simple typo in docs, propogates -&gt; propagates (<a
href="https://github.com/Delgan/loguru/issues/323">#323</a>)</li>
<li><a
href="https://github.com/Delgan/loguru/commit/e48f3e49228ada2f6495dd7eedc4ecd03530a1f6"><code>e48f3e4</code></a>
Bump version to 0.5.2</li>
<li><a
href="https://github.com/Delgan/loguru/commit/d93c19f6c1c53cc14e9afcb28b770f8aa43c28cd"><code>d93c19f</code></a>
Flesh out the few remaining incomplete annotations (<a
href="https://github.com/Delgan/loguru/issues/316">#316</a>)</li>
<li><a
href="https://github.com/Delgan/loguru/commit/084dffd16211e8494419ed5cf9c6859de8e76123"><code>084dffd</code></a>
Fix tests fails (since pytest upgrade) due to standard logging
misuse</li>
<li><a
href="https://github.com/Delgan/loguru/commit/9a40e58c74e786149b6a4c6097e24d50e0b40459"><code>9a40e58</code></a>
Fix error with &quot;enqueue=True&quot; and non-picklable exception (<a
href="https://github.com/Delgan/loguru/issues/298">#298</a>)</li>
<li><a
href="https://github.com/Delgan/loguru/commit/6f86f487590fe8f264a3bb26c722bf4be354687b"><code>6f86f48</code></a>
Refactor usage of &quot;inspect&quot; functions for simplification</li>
<li>Additional commits viewable in <a
href="https://github.com/Delgan/loguru/compare/0.5.0...0.5.3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=loguru&package-manager=pip&previous-version=0.5.0&new-version=0.5.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/pinecone-io/pinecone-python-client/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants