Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compression & rotation? #229

Closed
algaspar opened this issue Mar 26, 2020 · 14 comments
Closed

Compression & rotation? #229

algaspar opened this issue Mar 26, 2020 · 14 comments
Labels
enhancement Improvement to an already existing feature

Comments

@algaspar
Copy link

I am having a little trouble with getting my log files to compress when they're done with, and I'm presuming it has something to do with the settings that I have.

I want my logs to start with a new file at midnight every day and I want them to be compressed when they are closed. I get my settings from a configuration file, which currently has these settings:

log_file = ../logs/SC--{time:ddd--MMM-DD}.log
log_format = {time:MM-DD-YY at HH:mm} | {level} | {file}:{line} | {function} | {message}
log_rotation = 00:00
log_retention = 1 month
log_compression = zip
log_level = INFO
log_queue = True

All of these settings except compression appear to work. I read these settings in from the configuration file and assign them like this:

logger.add(log_file, format=log_format, rotation=log_rotation,
retention=log_retention, compression=log_compression,
enqueue=log_queue, level=log_level)

The documentation says, "The compression happens at rotation or at sink stop if rotation is None." I presumed then that when my new log file is created at midnight, my previous day's log file would be zipped. That isn't happening. The new file is created, but the old one stays there unzipped. For example, yesterday had a file named SC--Tue--Mar-24.log. Today's log was created as SC--Wed--Mar-25.log, but when it was created SC--Tue--Mar-24.log remained as is, unzipped

Can someone tell me what I might be doing wrong?

Thanks--

Al

@Delgan
Copy link
Owner

Delgan commented Mar 26, 2020

Hi @algaspar.

I just tested (with a different rotation hour) and it worked fine on my computer. After the rotation time, I could see "SC--Thu--Mar-26.log" (written to) and "SC--Thu--Mar-26.log.zip" (compressed).

Which version of Loguru are you using? You can display it with pip freeze or print(loguru.__version__). Also, are you using Linux or Windows?

@algaspar
Copy link
Author

Thanks for your quick reply :-)

I just checked. I am running 0.4.1 on Windows Server 2016. I'm wondering if it has anything to do with my naming. Early on in my testing I hadn't started using anything but 'sc.log' for the file name. I had two of those still in my test folder that had been zipped and renamed sc.2020-03-06_21-00-05_340212.log.zip and sc.2020-03-13_23-00-02_171550.log.zip. Another one, though, was just named sc.log.zip; so I'm not sure how it decides to add a time stamp...

This morning I have a new SC--Thu--Mar-26.log, but the two earlier ones for Tuesday and Wednesday remain unzipped. I'll throw together a test script that just logs to a sink that rotates quickly (every minute) and see if that zips...

Thanks--

Al

@algaspar
Copy link
Author

algaspar commented Mar 26, 2020

OK. I was able to get it to compress, and I was also able to duplicate the issue that I was having. If my log file retains the same name, the zip works. If my log file was named sc--{time:ddd--HH}.log with rotation set to 1 minute, the file got zipped. However, if I named the file sc--{time:ddd--mm}.log, loguru created a new log on the next minute, but it did not zip the previous log. (On a side note while HH and mm worked by themselves, neither HH:mm or mm:HH worked; I got an empty file with the name truncated at whichever (mm or HH) appeared before the ':'. The .log extension was not there.). I've uploaded a copy of the little test file that I used.
loguru_test.txt

Any ideas how to get around this?

Thanks for your help--

Al

@Delgan
Copy link
Owner

Delgan commented Mar 27, 2020

Thanks for the test file! I understand the difficulties you're going through.

You see, when you set the file name to be "sc--{time:ddd--mm}.log", each time you start your application, Loguru will generate the file name based on current time and try to open the corresponding file.

So, supposing you start your application à 11:44:20 and stop it immediately. Created file is "sc--Fri--44.log". The application ends in less than one minute, so no rotation occurs. For this reason, the file is not compressed. Now, supposing you re-start your application at 11:45:30, a new file "sc--Fri--45.log" is created. Unfortunately, Loguru is not aware of the previously created file at 11:44:50 because you re-started the application at a different time. It's like a fresh start, there is no rotation so no compression.
In case the filename is "sc--{time:ddd--HH}", when you re-start the application the second time, the same filename is generated: "sc--Fri-11.log". So, Loguru will try to open the existing file, and it will realize than more than 1 minute has elapsed, so it will rotate the file and at the same time compress it.

Is your application supposed to run continuously, without being halted and restarted? In such a case, configuring with rotation="00:00" and compression="zip" should work fine. 😕

@algaspar
Copy link
Author

OK. That makes sense. I added a third sink to my test file, which is identical except for the name. I named it just 'sc.log' without any time parameters. I ran the my test script over a three minute period and got the following three files:

sc3.2020-03-27_10-10-40_978726.log.zip
sc3.log
sc3.log.zip

sc.log is the newest, and the one with the timestamp is the oldest. Subsequent timestamped files use the same timestamp and add a number (i.e., sc3.2020-03-27_10-10-40_978726.1.log.zip, etc.).

My "application" consists of several scripts that run daily at different times on a prearranged schedule. They all write to the same log file. I wanted to be able to keep the logs restricted to one days worth of data, and I thought it would be convenient to have the day and date as part of the file name. I was also hoping to zip the logs that were no longer active. As it essentially runs continuously setting rotation="00:00" and compression="zip" works, except I would like more control over the timestamp.

I'm thinking that I could do something like that with a flat filename (like sc.log) and have compression set to a callable, which renamed the rotating file to something like sc-Thu--Mar-23.log and zipped it? That way I could easily see which file was active and the days and dates for the others. Would rotation also need to be a callable? Would retention recognize that it needed to remove these custom compressed files after the retention period, or would I need to build that into my compression callable, have retention be a callable as well, or build a 'sweep' app?

Thanks--

Al

@algaspar
Copy link
Author

algaspar commented Mar 29, 2020

It seemed like a callable for correction would be the answer. I imported os, time, datetime, and zipfile, and created this callable:

> def mycompress(file):
>     """This callable handles daily compression of the log file
>     and renaming it with the day and date it was created"""
>     try:
>         file_stats = os.stat(file)
>     except:
>         return None
>    file_time_struct = time.localtime(file_stats[9])  # ctime
>     # file_wday = file_time_struct[6]
>     # now_wday = datetime.now().timetuple()[6]
>     # if now_wday != file_wday:
>     file_min = file_time_struct[4]
>     now_min = datetime.now().timetuple()[4]
>     if now_min > file_min + 1:
>         # file for zipping:  SC--Day--Mon-##.log
>         newfile = f'SC--{time.strftime("%a--%b-%d", file_time_struct)}.log'
>         os.rename(f'{file}', f'{newfile}')
>         zipfile.ZipFile(f'{newfile}{now_wday}.zip',
>                         'w', zipfile.ZIP_DEFLATED).write(f'{newfile}')
>         os.remove(newfile)

Since I want the file to change nightly at midnight, I figured that I just needed to check if the day of the week in the two structs was different. However, I didn't want to wait 24 hours between tests; so I used minute (item 4 in the time struct) rather than weekday (item 6 commented out above), and had my check be if now_min > file_min + 1:. I am using this logger.add:

> logger.add('new.log', format=log_format, rotation='2 minutes',
>                compression=mycompress('new.log'), level=log_level)

It works, but the first time it runs (and on rotation) I get a new.log file with my log entries and an empty file with a time stamp like this:

-a---- 3/28/2020 8:31 PM 0 new.2020-03-28_20-26-12_945722.log

Subsequent runs of my test file correctly write to new.log and zip the file on rotation, but when it does rotate, I get another empty new.time_stamp.log file in addition to my correctly named zip file.

Should I be doing this with a rotation callable instead or a combination? Will I need a callable for retention to remove my custom zip files that are older than my retention date? Bottom line, what am I getting wrong here ;-)?

Thanks--

Al

@Delgan
Copy link
Owner

Delgan commented Mar 29, 2020

Hey @algaspar, that's an interesting use case, I think you indeed need to use a custom compression function.

However, you should not have to handle rotation times in your mycompress function. You can add your handler with logger.add("sc.log", rotation="00:00", compression=mycompress). This is first part of what you requested: current logs are written to "sc.log", every new day the mycompress function will be called, then a new "sc.log" will be created for the current day. So, you only have to manage file renaming to add the date and zip compression. I foresee something like that:

from loguru import logger
from datetime import datetime
from zipfile import ZipFile
import os

def mycompress(filepath):
    archive_path = "sc.{:%Y-%m-%d}.log.zip".format(datetime.now())
    with ZipFile(archive_path, "w") as archive:
        archive.write(filepath, arcname="sc.log")
    os.remove(filepath)

logger.add("sc.log", rotation="00:00", compression=mycompress)

A few more notes about your previous comments:

  • I'm quite surprised that Loguru adds a number at the end of your logs ("sc3.2020-03-27_10-10-40_978726.1.log"), it only happens on Windows if the rotation is too fast (normally the filename include a new date, so no conflict)
  • You should not use compression=mycompress('new.log') but compression=mycompress (Loguru will call mycompress at rotation time with the log filepath)
  • As you are mixing rotation and compression this may explain the weird behavior your are facing with your mycompress function
  • In Retention fix #196 I restricted the files matched by retention so it only remove them if the name entirely corresponds to one generated by Loguru (with the full date like "2020-03-27_10-10-40_978726"). This means if you rename your log in a custom compression to something like "sc.2020-03-27.log.zip" it will not be considered by the retention process. This is unfortunate, in the next release I will probably restrain the rule so that it matches all files looking like "base.*.log.*". In the meantime, you can use a custom retention function but you will have to call os.listdir() in order to find files to remove.
  • I realized that if an exception occurs in a custom compression function, this entirely breaks the handler as Loguru will repeatedly try to write to a closed file. I will need to fix that too.

@algaspar
Copy link
Author

Thank you so much for your patience and your detailed explanation. I modified my compress callable as you suggested. It works great; the odd zero length file is gone :-). Since the log file I am archiving contains data from the day before, I used yesterday's date. Here's what I came up with:

> def mycompress(filepath):
>     """This callable handles daily compression of the log file
>     and naming it with the day and date it logged"""
>     yesterday = datetime.datetime.now() - datetime.timedelta(days=1)
>     archive_file = "SC--{:%a--%b-%d}.log.zip".format(yesterday)
>     with zipfile.ZipFile(f'{archive_file}', 'w', zipfile.ZIP_DEFLATED) as archive:
>         archive.write(filepath, arcname='SC.log')
>     os.remove(filepath)

I look forward to your retention change, but in the meantime, I'll go work on a retention callable using os.listdir() as you suggest.

Thank you very much for an awesome library. Loguru really makes logging simple and even pleasurable. By the way, is it "log guru" or does it rhyme with "kangaroo"? :-)

--Al

@Delgan
Copy link
Owner

Delgan commented Mar 30, 2020

Great, glad this solved your problem! And thanks for the kind words, I appreciate it!

It's supposed to be pronounced as "guru", but honestly, I'm not even sure I pronounce it correctly myself. 😅

@Delgan Delgan added the enhancement Improvement to an already existing feature label Mar 30, 2020
@algaspar
Copy link
Author

algaspar commented Apr 1, 2020

😃

Quick retention question. The documentation says that the retention callable "should accept the list of log files as argument", but above you say that I need to use os.listdir() to find my files. What does loguru pass to a retention callable?

Thanks--

Al

@Delgan
Copy link
Owner

Delgan commented Apr 1, 2020

Loguru will pass a list of file paths to the retention function. However, to create this list, Loguru applies a filter to all files in the logs folder. This is done in order to avoid to unintentionally remove unrelated files (like any .py file for example). Currently, this filter is quite strict: it will only accept files looking like "sc.log" or "sc.2020-03-06_21-00-05_340212.log". This means that in your custom retention function, the passed list will not contain your renamed file "sc--Apr-Wed-01.log" as it does not match the expected format. For this reason, I suggest you to call os.listdir() and find files to delete by yourself.

@algaspar
Copy link
Author

algaspar commented Apr 2, 2020

Ah, gotcha. That makes sense. Will do.

Thanks!

--Al

@Delgan
Copy link
Owner

Delgan commented Apr 7, 2020

I have improved the logs gathering. The files collected for the retention process should match the pattern "basename(.*).ext(.*)" (which means you it will nor work if you rename sc.log to sc--XXX.log in your custom compression because basenames are different).

The fix will be available in the next release. 👍

@Delgan Delgan closed this as completed Apr 7, 2020
@algaspar
Copy link
Author

I just wanted to say thank you 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvement to an already existing feature
Projects
None yet
Development

No branches or pull requests

2 participants