High memory use without release #205

Closed
mihalybaci opened this issue Jul 28, 2021 · 10 comments

@mihalybaci

I am trying to write a bulk image renaming function with outputs based on JPEG EXIF entries, but I am running into a memory issue. Here is the crux of the problem:

using ImageMagick

# Starting memory reported by `top`: 2941 MiB
for i = 1:15
    field_info = magickinfo(testim, "date:modify")  # `testim` is 5.7 MiB image on my computer
end
# Ending memory: 5264 MiB
# After two for-loop runs: 8553 MiB

Here I have just used a loop to repeat the function call for the MWE, but this also happens when cycling through different images. Two problems seem to arise.

First, the image is only 5.7 MiB. After reading it 15 times I would naively expect a memory use of about 15*6 = 90 MiB if the memory were never cleared, but after the for loop memory usage has gone up by more than 2000 MiB.

Second, after several minutes, the memory still doesn't clear. While writing this post, the memory dropped back into the 3100 MiB neighborhood, but that was only after 10-15 minutes of Julia idle time.
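For anyone trying to reproduce the numbers above from inside Julia rather than from `top`, here is a minimal sketch. The helper name `peak_rss_growth_mib` is made up for illustration; it relies only on `Sys.maxrss()`, which reports the peak resident set size of the process, so it can show growth but not a later release.

```julia
# Hypothetical helper: run `f` and report how much the process's peak
# resident set size grew while it ran, in MiB.
function peak_rss_growth_mib(f)
    before = Sys.maxrss()          # peak RSS so far, in bytes
    f()
    after = Sys.maxrss()
    return (Int(after) - Int(before)) / 2^20
end

# Naive upper bound if every read kept one full copy of the file:
naive_mib = 15 * 5.7               # about 85.5 MiB

# Usage with the loop from this report (needs ImageMagick and `testim`):
# peak_rss_growth_mib() do
#     for i = 1:15
#         magickinfo(testim, "date:modify")
#     end
# end
```

Because the peak never decreases, the result is always non-negative; watching `top` is still needed to see whether memory is eventually returned.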

Is there a bug here?

My info:

julia> versioninfo()
Julia Version 1.6.2
Commit 1b93d53fc4 (2021-07-14 15:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, haswell)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

and via pkg> st: [6218d12a] ImageMagick v1.2.1.

@johnnychen94
Member

I wouldn't be surprised by this at all. Generally, the effort should go into the libjpeg wrapper, JuliaImages/Images.jl#960.

There's a newly initialized repo, https://github.com/stevengj/JpegTurbo.jl. @stevengj, I'm not sure what your plan for it is; I could join forces on the development if you think this is a good idea.

@mihalybaci
Author

FWIW, I just copied the code example from JpegTurbo and it does seem to avoid the memory issue.

function image_mem(filename)
    cinfo = LibJpeg.jpeg_decompress_struct()
    jerr = Ref{LibJpeg.jpeg_error_mgr}()
    cinfo.err = LibJpeg.jpeg_std_error(jerr)
    LibJpeg.jpeg_create_decompress(cinfo)
    infile = ccall(:fopen, Libc.FILE, (Cstring, Cstring), filename, "rb")
    LibJpeg.jpeg_stdio_src(cinfo, infile)
    LibJpeg.jpeg_read_header(cinfo, true)
    LibJpeg.jpeg_start_decompress(cinfo)
    w = Int(cinfo.output_width)  # image width in pixels
    h = Int(cinfo.output_height) # image height in pixels
    LibJpeg.jpeg_destroy_decompress(cinfo)
    ccall(:fclose, Cint, (Ptr{Libc.FILE},), infile)
    return w, h
end

for i = 1:100
    image_mem(testim)
end
# Ending memory usage = starting memory usage

So this would be a decent workaround for my case if I can figure out where the created/modified dates are buried.

@stevengj
Member

@johnnychen94, I have no immediate plans to work on JpegTurbo — I mainly put it there as a starting point for later work. I would be happy to transfer it to JuliaIO or JuliaImages if desired, and/or to add collaborators. See also the discussion on discourse.

@IanButterworth also has a repo (https://github.com/IanButterworth/ImageIODevelopment.jl) with a similar Clang-generated wrapper for libjpeg, so it would be good to check with him on the best way forward.

@IanButterworth
Member

I don't think my dev repo got much past copying a C example.

I would've thought the thing to do would be to build on JpegTurbo.jl (though I'd rename it JPEGFiles.jl perhaps) and move it to JuliaIO when ready.

I can't offer much time, but I'm happy to review.

@stevengj
Member

I don't think libjpeg has anything specifically for EXIF data, which is embedded in a single segment of a JPEG file. You can use libjpeg to extract that segment, I think, but you will have to parse the EXIF data yourself (or wrap another library like libexif). See https://dev.exiv2.org/projects/exiv2/wiki/The_Metadata_in_JPEG_files
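To illustrate the "single segment" point, here is a rough sketch that locates the EXIF segment by walking the JPEG markers in plain Julia instead of going through libjpeg. The function name `exif_segment` is invented for this example, and the parsing is deliberately simplified: it assumes every segment before the image data carries a length field and that there are no fill bytes between segments. It returns the raw bytes after the `Exif\0\0` header, which would still need to be parsed separately.

```julia
# Hypothetical helper: return the EXIF payload (the bytes after the
# "Exif\0\0" header of an APP1 segment), or `nothing` if none is found.
function exif_segment(bytes::AbstractVector{UInt8})
    # Every JPEG starts with the SOI marker 0xFFD8.
    (length(bytes) >= 4 && bytes[1] == 0xff && bytes[2] == 0xd8) || return nothing
    header = UInt8[0x45, 0x78, 0x69, 0x66, 0x00, 0x00]  # "Exif\0\0"
    pos = 3
    while pos + 3 <= length(bytes)
        bytes[pos] == 0xff || return nothing   # not at a marker: give up
        marker = bytes[pos + 1]
        # SOS (0xDA) starts the compressed image data; EOI (0xD9) ends the file.
        (marker == 0xda || marker == 0xd9) && return nothing
        # The segment length is big-endian and includes its own two bytes.
        seglen = (Int(bytes[pos + 2]) << 8) | Int(bytes[pos + 3])
        seglen >= 2 || return nothing
        stop = pos + 1 + seglen                # index of the last payload byte
        stop <= length(bytes) || return nothing
        payload = @view bytes[pos + 4:stop]
        # APP1 is marker 0xE1; EXIF data is an APP1 payload starting with the header.
        if marker == 0xe1 && length(payload) >= 6 && payload[1:6] == header
            return bytes[pos + 10:stop]
        end
        pos = stop + 1
    end
    return nothing
end

# Usage (hypothetical file name): exif_segment(read("photo.jpg"))
```

Only the segment headers are inspected, so the compressed image data itself is never decoded.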

@kodintent

Hi. I encountered this issue too, with a set of images larger than my RAM.
Either of the magickinfo calls below, run in a for loop, led to my Julia script being killed for running out of RAM.

for path_image in array_path_images
    array_keys = magickinfo(path_image)
    #OR
    dict_key_value = magickinfo(path_image, (key1, key2))
end

The behavior is as if each loaded image is kept in RAM and not purged. When passing in keys, I noticed one or two memory releases before the crash, but never when just getting the keys array. Using @threads with the for loop makes no difference to the end result.
At one stage I tried it with the apt ImageMagick called from a bash script, and there were no memory issues.
Luckily I just needed height and width, so I was able to use JuliaImages instead. But if I wanted to get other EXIF keys, I would have to use the apt ImageMagick.

@yakir12

yakir12 commented May 19, 2022

I encountered the exact same issue as @kodintent.
I "solved" it by calling

Base.GC.gc()

after every call to magickinfo...
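A minimal sketch of that workaround as a reusable loop. The helper name `map_with_gc` and its `every` keyword are invented here; the only real calls are `GC.gc()` and, in the usage comment, `magickinfo` as used earlier in this thread. Presumably this helps because the memory is held by the C library and only freed when Julia's collector finalizes the wrapper objects, but that explanation is an assumption, not something confirmed in this issue.

```julia
# Hypothetical helper: apply `f` to each path and force a full garbage
# collection after every `every`-th call so that finalizers get a chance to run.
function map_with_gc(f, paths; every::Integer=1)
    results = Vector{Any}(undef, length(paths))
    for (i, path) in enumerate(paths)
        results[i] = f(path)
        i % every == 0 && GC.gc()
    end
    return results
end

# Usage (needs ImageMagick and a vector of image paths):
# infos = map_with_gc(p -> magickinfo(p, "date:modify"), array_path_images)
```

A full collection is slow, so raising `every` trades peak memory for speed.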

@johnnychen94
Member

JpegTurbo (https://github.com/JuliaIO/JpegTurbo.jl) is available now and bundled together with ImageIO.

@ashwani-rathee is working on an EXIF wrapper for GSoC'22: JuliaImages/Images.jl#1000

@yakir12

yakir12 commented May 19, 2022

That is super promising. Thank you. I'll try to see how I can get date and time created from the exif data.

@yakir12

yakir12 commented May 19, 2022

Yeah, using @ashwani-rathee's code example in that link works flawlessly:

using Dates, libexif_jll
include("LibExif.jl")
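function readtag(filepath, tag)
  # Load and parse the file's EXIF data; libexif allocates the returned ExifData.
  ed_ptr = LibExif.exif_data_new_from_file(filepath)
  ed = unsafe_load(ed_ptr)
  content_ptr = ed.ifd[1]  # first IFD (IFD 0 in libexif's zero-based numbering)
  make_ptr = LibExif.exif_content_get_entry(content_ptr, tag)
  # Zero-filled buffer, so everything after the written text is '\0'.
  str = zeros(Cuchar, 1024)
  LibExif.exif_entry_get_value(make_ptr, str, length(str))
  # Note: `ed_ptr` is never released here; libexif's C API provides
  # exif_data_unref for that (assuming the wrapper exposes it).
  return rstrip(String(str), '\0')
end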

file2dt(file) = DateTime(readtag(file, LibExif.EXIF_TAG_DATE_TIME), "yyyy:mm:dd HH:MM:SS")

Thank you both (and all the rest working on this)!
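Since the original goal of this issue was bulk renaming, here is a small sketch of turning an EXIF timestamp string into a file name. It uses only the standard `Dates` library; the helper name `dated_name` and the output pattern are made up for illustration, and it assumes the tag value has already been read as a string, for example by `readtag` above.

```julia
using Dates

# EXIF timestamps look like "2021:07:28 15:36:00" (colons in the date part too).
const EXIF_DATE_FORMAT = dateformat"yyyy:mm:dd HH:MM:SS"

# Hypothetical helper: build a new file name from an EXIF timestamp string.
function dated_name(exif_timestamp::AbstractString, ext::AbstractString=".jpg")
    dt = DateTime(exif_timestamp, EXIF_DATE_FORMAT)
    return Dates.format(dt, "yyyy-mm-dd_HHMMSS") * ext
end

# Usage: dated_name("2021:07:28 15:36:00")  # "2021-07-28_153600.jpg"
```

Files without a usable timestamp would make `DateTime` throw, so a real renaming script would need to handle that case.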
