pdfFile: Too many open files #63

Closed
tagatac opened this issue Sep 9, 2024 · 17 comments · Fixed by #64 or #66
Assignees
Labels
bug Something isn't working

Comments

@tagatac
Owner

tagatac commented Sep 9, 2024

All the convos exported perfectly except for one, which shows squares with question marks instead of text. Emojis show up fine, but all the text is question marks (see attached pic).

The program in Terminal ends with “Done” and then “Exit with code 1 due to network error: ContentAccessDenied.”

[Attached screenshot: Screen Shot 2024-09-05 at 6 38 43 PM]

Originally posted by @sav5000 in #49 (comment)

@tagatac
Owner Author

tagatac commented Sep 9, 2024

@sav5000 does this error happen every time you run bagoup, or is it intermittent?

@sav5000

sav5000 commented Sep 9, 2024

Every time, and just with one person

@tagatac
Owner Author

tagatac commented Sep 9, 2024

Okay, this is the problem, and can explain the nondeterminism in #62. Have you given your terminal emulator full disk access? https://github.com/tagatac/bagoup#option-1-required-for-attachments-give-your-terminal-emulator-full-disk-access

The full error message would help a lot. Feel free to redact any names, phone numbers, or email addresses.

@sav5000

sav5000 commented Sep 10, 2024

Yes, I made sure to do that before running bagoup. Terminal has full disk access. Would it help to remove the full disk access and then re-add it? ("Have you tried turning it off and on again?")

OK, I scrolled up in Terminal and do see the completion log, which is printed before the other errors (which is why I didn't see it before). The other errors are a series of the one I posted before, plus variations of:

Error opening /Users/sav5000/Library/Messages/Attachments/8b/11/012F6C54-AE38-4D40-852A-552C283946F6/76F9C3E6-9D08-404A-958D-F8F8B841B7BD.pluginPayloadAttachment: Too many open files

BAGOUP RESULTS:
bagoup version: 2.4.6 Darwin/x86_64
Export folder: "messages-export-pdf"
Export files written: 12
Chats exported: 19
Valid messages exported: 4154
Invalid messages exported (see warnings above): 2
Attachments copied: 0
Attachments referenced or embedded: 1134
application/octet-stream: 401
image/png: 66
text/vcard: 7
image/webp: 1
image/jpeg: 599
video/quicktime: 13
application/paprikarecipes: 1
image/heic: 2
image/gif: 44
Attachments embedded: 1075
image/jpeg: 597
application/octet-stream: 367
image/png: 66
image/gif: 44
image/webp: 1
Attachments missing (see warnings above): 35
HEIC conversions completed: 500
HEIC conversions failed (see warnings above): 0
Time elapsed: 14m6.93722669s

@tagatac
Owner Author

tagatac commented Sep 10, 2024

I think you've hit on a real bug. I fixed this in #14, but somewhere along the line, I swapped the order of flushing the PDF file to disk and increasing the open file limit. I think I know how to fix it:

	imgCount, err := outFile.Flush()
	if err != nil {
		return errors.Wrapf(err, "flush chat file %q to disk", outFile.Name())
	}
	if openFilesLimit := cfg.OS.GetOpenFilesLimit(); imgCount*2 > openFilesLimit {
		if err := cfg.OS.SetOpenFilesLimit(imgCount * 2); err != nil {
			return errors.Wrapf(err, "chat file %q - increase the open file limit from %d to %d to support %d embedded images", outFile.Name(), openFilesLimit, imgCount*2, imgCount)
		}
	}

In the meantime, as a workaround, you can try increasing the file limit before running bagoup by running ulimit -n 2048 in the same shell (without sudo, since ulimit only affects the shell it runs in), or keep doubling that number until it works.
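For reference, the in-process counterpart of the ulimit workaround above might look roughly like the following. This is a minimal sketch, not bagoup's actual code: the helper names `neededLimit` and `raiseOpenFilesLimit` are made up for illustration, and only the `imgCount*2` heuristic is taken from the snippet above. It uses the standard library's `syscall.Getrlimit` and `syscall.Setrlimit` on a Unix-like system.

```go
package main

import (
	"fmt"
	"syscall"
)

// neededLimit (hypothetical helper) returns the soft limit to request for a
// chat with imgCount embedded images, using the imgCount*2 heuristic from the
// snippet above. It returns 0 when the current limit is already high enough.
func neededLimit(imgCount int, current uint64) uint64 {
	want := uint64(imgCount) * 2
	if want <= current {
		return 0
	}
	return want
}

// raiseOpenFilesLimit (hypothetical helper) raises the soft RLIMIT_NOFILE to
// want, capped at the hard limit. It never lowers the current soft limit.
func raiseOpenFilesLimit(want uint64) error {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return fmt.Errorf("get open file limit: %w", err)
	}
	if want > uint64(lim.Max) {
		want = uint64(lim.Max)
	}
	if want <= uint64(lim.Cur) {
		return nil // already high enough
	}
	lim.Cur = want
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return fmt.Errorf("set open file limit to %d: %w", want, err)
	}
	return nil
}

func main() {
	if err := raiseOpenFilesLimit(2048); err != nil {
		fmt.Println("could not raise limit:", err)
		return
	}
	fmt.Println("soft open file limit is at least 2048, or capped at the hard limit")
}
```

Raising the soft limit up to the hard limit does not normally require root, which is why the sketch caps the request at `lim.Max` instead of failing.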

tagatac self-assigned this on Sep 10, 2024
tagatac changed the title from "wkhtmltopdf: network error: ContentAccessDenied" to "pdfFile: flushing to disk before increasing the open file limit" on Sep 10, 2024
tagatac added the bug (Something isn't working) label on Sep 10, 2024
@tagatac
Owner Author

tagatac commented Sep 10, 2024

@sav5000 does the error message start with flush chat file ... to disk? If so, do you mind pasting the full error message, redacting any private info?

tagatac added a commit that referenced this issue Sep 10, 2024
**What is changing**: Check the OS-level open file limit (and adjust it
if necessary) prior to writing PDF files.

**Why this change is being made**: There is the potential for
wkhtmltopdf to need many open files (see
wkhtmltopdf/wkhtmltopdf#3081). This was fixed
in #14, where it depended on the order of events:
1. Stage the PDF file, getting the number of images
2. Check and adjust the open file limit
3. Flush the PDF file to disk

Number 3 was achieved via the call `defer outFile.Close()`:
https://github.com/tagatac/bagoup/blob/86f9b32870d2127f3fd3e196cecfc24265cf8d87/write.go#L29

However, #29 removed `Close()` from the `OutFile` interface, collapsing
`Stage()` and `Flush()` into a single function `Flush()` run prior to
checking and adjusting the open file limit.

**Related issue(s)**: Fixes #63 

**Follow-up changes needed**: None AFAIK

**Is the change completely covered by unit tests? If not, why not?**:
Yes
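
The three-step ordering described in the commit message (stage, adjust the limit, flush) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `stagedFile` and `fileLimiter` interfaces, `writeChat`, and the `recorder` fake are hypothetical names, not bagoup's real types. The point is only that the limit check sits between staging and flushing.

```go
package main

import (
	"fmt"
	"strings"
)

// stagedFile is a hypothetical two-phase output file: Stage prepares the
// document and reports how many images it embeds; Flush renders it to disk.
type stagedFile interface {
	Stage() (imgCount int, err error)
	Flush() error
}

// fileLimiter is a hypothetical wrapper around the OS open file limit.
type fileLimiter interface {
	Get() int
	Set(n int) error
}

// writeChat stages the file, raises the limit if needed, and only then flushes.
func writeChat(f stagedFile, l fileLimiter) error {
	imgCount, err := f.Stage()
	if err != nil {
		return fmt.Errorf("stage chat file: %w", err)
	}
	if need := imgCount * 2; need > l.Get() {
		if err := l.Set(need); err != nil {
			return fmt.Errorf("raise open file limit to %d: %w", need, err)
		}
	}
	if err := f.Flush(); err != nil {
		return fmt.Errorf("flush chat file: %w", err)
	}
	return nil
}

// recorder is a fake that implements both interfaces and logs the call order.
type recorder struct {
	imgCount, limit int
	calls           []string
}

func (r *recorder) Stage() (int, error) { r.calls = append(r.calls, "stage"); return r.imgCount, nil }
func (r *recorder) Flush() error        { r.calls = append(r.calls, "flush"); return nil }
func (r *recorder) Get() int            { return r.limit }
func (r *recorder) Set(n int) error {
	r.calls = append(r.calls, fmt.Sprintf("set:%d", n))
	r.limit = n
	return nil
}
func (r *recorder) order() string { return strings.Join(r.calls, ",") }

func main() {
	r := &recorder{imgCount: 300, limit: 256}
	if err := writeChat(r, r); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.order()) // prints "stage,set:600,flush"
}
```

Collapsing `Stage` and `Flush` into one call removes the gap in which the limit can be adjusted, which is the regression the commit message describes.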
@tagatac
Owner Author

tagatac commented Sep 10, 2024

@sav5000 thanks for reporting this! This should be fixed in v2.4.7. When you get a chance, please upgrade with homebrew and let me know if your issue is resolved 🙏 I suspect this may resolve #62 as well.

@sav5000

sav5000 commented Sep 11, 2024

I updated and am unfortunately having all the same problems

BAGOUP RESULTS:
bagoup version: 2.4.8 Darwin/x86_64
Export folder: "messages-export-pdf"
Export files written: 12
Chats exported: 19
Valid messages exported: 4185
Invalid messages exported (see warnings above): 2
Attachments copied: 0
Attachments referenced or embedded: 1139
image/jpeg: 600
application/octet-stream: 403
image/png: 66
image/gif: 46
application/paprikarecipes: 1
image/webp: 1
text/vcard: 7
video/quicktime: 13
image/heic: 2
Attachments embedded: 1080
image/png: 66
image/gif: 46
image/webp: 1
image/jpeg: 598
application/octet-stream: 369
Attachments missing (see warnings above): 35
HEIC conversions completed: 500
HEIC conversions failed (see warnings above): 0
Time elapsed: 14m22.034617322s

@tagatac
Owner Author

tagatac commented Sep 12, 2024

Hmm, can you copy and paste the full error message, starting immediately after the results output you pasted above, and ending at the terminal prompt? It may be several lines, like this:

2024/09/11 17:52:14 ERROR: export chats: flush chat file "2.4.6test/XXX/iMessage;+;chat742138261817041884.pdf" to disk: write out PDF: Loading pages (1/6)
Error: Failed to load file:///var/folders/_h/799vmfrx6x1_ssd0pwcj838m0000gn/T/bagoup404745032/httpssoe-wbe-pilot.wl.r.appspot.comcharts#page=overview.jpeg, with network status code 203 and http status code 0 - Error opening /var/folders/_h/799vmfrx6x1_ssd0pwcj838m0000gn/T/bagoup404745032/httpssoe-wbe-pilot.wl.r.appspot.comcharts: No such file or directory
Warning: Failed to load file:///Users/tag/Library/Messages/Attachments/74/04/51FF56BE-E25D-46CB-9EF8-1C2821B752E7/Screenshot 2024-08-25 at 9.46.17/u202fAM.jpeg (ignore)
Warning: Failed to load file:///Users/tag/Library/Messages/Attachments/22/02/0176FB70-383C-470F-AD61-BF9C7876932A/Screenshot 2024-08-29 at 4.44.13/u202fPM.jpeg (ignore)
Counting pages (2/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done
Exit with code 1 due to network error: ContentNotFoundError

exit status 1

Note that I replaced the chat group name with XXX.

tagatac reopened this on Sep 12, 2024
Repository owner deleted a comment from sav5000 Sep 12, 2024
@tagatac
Owner Author

tagatac commented Sep 12, 2024

I got it and will take a look, thanks!

tagatac changed the title from "pdfFile: flushing to disk before increasing the open file limit" to "pdfFile: Too many open files" on Sep 12, 2024
@tagatac
Owner Author

tagatac commented Sep 12, 2024

Just adding back an anonymized snippet from the error message for future reference:

2024/09/11 16:47:40 ERROR: export chats: flush chat file "messages-export-pdf/+19999999999/iMessage;-;+19999999999;;;SMS;-;+19999999999.pdf" to disk: Loading pages (1/6)
2024-09-11 16:44:55.868 wkhtmltopdf[53507:2528132] CoreText: System LastResort not available, using built-in copy.
Error: Failed to load file:///Users/sav5000/Library/Messages/Attachments/34/04/3D0B288A-D024-453F-9E7D-B190D6528F00/D6683EE2-D5FF-49EC-8066-CA1680E4963A.pluginPayloadAttachment, with network status code 201 and http status code 0 - Error opening /Users/sav5000/Library/Messages/Attachments/34/04/3D0B288A-D024-453F-9E7D-B190D6528F00/D6683EE2-D5FF-49EC-8066-CA1680E4963A.pluginPayloadAttachment: Too many open files

From this error message, it does appear that we are still hitting the open file limit.

Would you please paste the output of the following command?

launchctl limit maxfiles

@tagatac
Owner Author

tagatac commented Sep 12, 2024

Never mind, I am able to reproduce this issue using the --include-ppa flag. I'll figure out what's going on here. In the meantime, you will probably have more luck if you leave out that flag.

@tagatac
Owner Author

tagatac commented Sep 12, 2024

It seems like Go 1.19+ is reporting incorrect soft limits from the getrlimit syscall.

We can use ulimit to get the real value. I should be able to fix this within the week.
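
One way to compare the two readings mentioned above is to ask both the Go process and a child shell for the soft open file limit. The following is a minimal sketch under stated assumptions: it assumes a POSIX `sh` on the PATH, the helper names (`parseUlimit`, `shellSoftLimit`, `processSoftLimit`) are hypothetical, and whether the two numbers actually differ depends on the Go version and platform.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
	"syscall"
)

// parseUlimit converts the output of `ulimit -n` to a number.
// It reports false for "unlimited" or for output that is not a number.
func parseUlimit(out string) (uint64, bool) {
	s := strings.TrimSpace(out)
	if s == "unlimited" {
		return 0, false
	}
	n, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}

// shellSoftLimit asks a child shell for its soft open file limit.
func shellSoftLimit() (uint64, bool, error) {
	out, err := exec.Command("sh", "-c", "ulimit -n").Output()
	if err != nil {
		return 0, false, fmt.Errorf("run ulimit: %w", err)
	}
	n, ok := parseUlimit(string(out))
	return n, ok, nil
}

// processSoftLimit reads the soft limit of the current Go process.
func processSoftLimit() (uint64, error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, fmt.Errorf("getrlimit: %w", err)
	}
	return uint64(lim.Cur), nil
}

func main() {
	inProc, err := processSoftLimit()
	if err != nil {
		fmt.Println(err)
		return
	}
	fromShell, ok, err := shellSoftLimit()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("getrlimit soft limit:", inProc)
	if ok {
		fmt.Println("ulimit -n in child shell:", fromShell)
	} else {
		fmt.Println("ulimit -n in child shell: unlimited or unparsable")
	}
}
```

If the two values disagree, the shell's value is the one a child process such as wkhtmltopdf is more likely to see, which would be consistent with the observation above.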

@tagatac
Owner Author

tagatac commented Sep 12, 2024

@sav5000 I think we are in business. I was getting the same error as you with v2.4.8, but it runs to completion successfully with v2.4.10. When you get a chance, please upgrade to v2.4.10, and let me know if you're still getting errors. Thanks again for the report!

@sav5000

sav5000 commented Sep 14, 2024

OK, it finally worked! There were still a few errors, but the one file with all the question-mark boxes was resolved, and all my other conversations showed up as well when they didn't before. Here is the log, with an extra error at the bottom. Other than that, there were not nearly as many errors as before; most of them had to do with missing file names this time, which is fine.

I am not sure why I originally decided to include "--include-ppa" and "--preserve-paths" in the command, since that could have been messing things up if the paths weren't there in the first place. I'm not sure how it works or what those flags even do, but I wonder if some more errors would resolve if I removed them. Anyway, this was a big relief to get working and download my messages again. Thank you!!

BAGOUP RESULTS:
bagoup version: 2.4.10 Darwin/x86_64
Invocation: bagoup --export-path messages-export-pdf --pdf --include-ppa --preserve-paths
Export folder: "messages-export-pdf"
Export files written: 62
Chats exported: 70
Valid messages exported: 7760
Invalid messages exported (see warnings above): 2
Attachments copied: 0
Attachments referenced or embedded: 1382
image/webp: 1
image/heic: 4
text/vcard: 7
image/gif: 61
video/quicktime: 21
application/paprikarecipes: 1
image/jpeg: 808
application/octet-stream: 413
image/png: 66
Attachments embedded: 1313
image/png: 66
image/gif: 61
image/webp: 1
image/jpeg: 806
application/octet-stream: 379
Attachments missing (see warnings above): 37
HEIC conversions completed: 669
HEIC conversions failed (see warnings above): 0
Time elapsed: 30m16.685923602s
2024/09/13 09:01:33 ERROR: write out tilde expansion file: open messages-export-pdf/bagoup-attachments/.tildeexpansion: no such file or directory

@tagatac
Owner Author

tagatac commented Sep 14, 2024

Awesome, I'm glad it worked! Thanks for the update.

Yeah, the --preserve-paths flag doesn't make any sense without the --copy-attachments flag, and the error above can be ignored if you are not copying attachments. I'll see if I can disable that combination of flags in the next release.
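
Rejecting that flag combination up front could look something like the sketch below. The `options` struct, `validate` function, and error text are hypothetical and only illustrate the idea; the two flag names come from this thread, but how bagoup actually parses and stores them is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical subset of the command-line options.
type options struct {
	CopyAttachments bool // --copy-attachments
	PreservePaths   bool // --preserve-paths
}

// errPreservePathsWithoutCopy is returned when --preserve-paths is given
// without --copy-attachments.
var errPreservePathsWithoutCopy = errors.New("--preserve-paths requires --copy-attachments")

// validate rejects flag combinations that cannot have any effect.
func validate(o options) error {
	if o.PreservePaths && !o.CopyAttachments {
		return errPreservePathsWithoutCopy
	}
	return nil
}

func main() {
	// The invocation in the log above set --preserve-paths without
	// --copy-attachments, which this check would reject.
	if err := validate(options{PreservePaths: true}); err != nil {
		fmt.Println("invalid flags:", err)
	}
}
```

Failing fast at startup avoids a 30-minute export that ends with an error about a file that was never going to be written.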

@sav5000

sav5000 commented Sep 16, 2024

Sounds good. Thank you so much again 🙏😁

2 participants