Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed two references to the head of the times seq which caused a memory leak #10

Closed
wants to merge 1 commit into from

Conversation

schmfr
Copy link

@schmfr schmfr commented Oct 30, 2015

I use your nice library and observed a memory leak in my programs. A huge number of org.joda.time.DateTime objects was found in the old gen space of my program. After a deeper inspection of the code I have seen that there are two references to the head element of the times-seq.

The first one is implicit in the destructuring of [next-time & more-times] in the go-loop, see https://groups.google.com/forum/#!topic/clojure/YNcqmAwWkqA. I fixed it by removing the destructuring and use instead first and rest in the body.

The second one is the function itself. The reify returns an object that holds (amongst others) a reference to the function arguments, in particular to times. I fixed it by converting the functions to macros.

Unfortunately you still have to be careful in using the lib.
For instance
(let [times (periodic/periodic-seq (time/now) (time/millis 1000))] (chime-ch times))
still results in a leak, while
(chime-ch (periodic/periodic-seq (time/now) (time/millis 1000)))
works fine.

It would be great if you would accept my proposal and adapt the use-cases in the readme.

…mory leak

I use your nice library and observed a memory leak in my programs. 
A huge number of org.joda.time.DateTime objects was found in the old gen space of my program.
After a deeper inspection of the code I have seen that there are two references to the head element of the "times"-seq.

The first one is implicit in the destructuring of [next-time & more-times] in the go-loop, see https://groups.google.com/forum/#!topic/clojure/YNcqmAwWkqA .
I fixed it by removing the destructuring and use instead first and rest in the body.

The second one is the function itself. The reify returns an object that holds (amongst others) a reference to the function arguments, in particular to times.
I fixed it by converting the functions to macros.

Unfortunately you still have to be careful in using the lib.
For instance
(let [times (periodic/periodic-seq (time/now) (time/millis 1000))]
  (chime-ch times))
still produces a leak, while
(chime-ch (periodic/periodic-seq (time/now) (time/millis 1000)))
works fine.

It would be great if you would accept my proposal and adapt the use-cases in the readme.
@jarohen
Copy link
Owner

jarohen commented Nov 3, 2015

Hi @schmfr - thanks for submitting this PR :)

Will take a look, and hopefully get it merged.

James

@jarohen
Copy link
Owner

jarohen commented Nov 5, 2015

The first one is implicit in the destructuring of [next-time & more-times] in the go-loop, see https://groups.google.com/forum/#!topic/clojure/YNcqmAwWkqA. I fixed it by removing the destructuring and use instead first and rest in the body.

Ah, I didn't know that - thanks!

The second one is the function itself. The reify returns an object that holds (amongst others) a reference to the function arguments, in particular to times. I fixed it by converting the functions to macros.

This is a pretty significant change! Can this be solved instead by extracting the reify out into another function, make-ch, say?

James

@jarohen jarohen self-assigned this Nov 5, 2015
@schmfr
Copy link
Author

schmfr commented Nov 11, 2015

I will investigate your idea within the next week. Sorry for my late response.

Frank

@schmfr
Copy link
Author

schmfr commented Nov 17, 2015

Unfortunately it didnt work for me.

In my program I use
(-> (periodic/periodic-seq (time/now) (time/millis 120)) (chime-ch {:ch (async/chan (async/dropping-buffer 1))}))
to execute a perodic task.

I did some "long runs" and activated the class tracker for class "org.joda.time.DateTime" in JProfiler.

Version 0.1.6. and the version where the reify is extracted out into another function I got this:
chime_0_1_6
There where some full garbage collects (FGC) (which you cannot see in this image, sorry) but they didnt reduce the number of DateTime objects.

For the version with macros it results in the following two images.
chime_0_1_7_no3
chime_0_1_7_no2
After the first FGC @650min the JVM changed its behavior! The DateTime objects are now moved to the old gen space. For me this doesnt seem to be an inprovement at all but the JVM might have some good reasons for it. Anyway, the next two FCG @760min and @1070min remove the unneeded DateTime objects from memory.

@jarohen
Copy link
Owner

jarohen commented Nov 20, 2015

Thanks for this - I'll hopefully get a moment to look over it in more detail over the weekend.

@schmfr
Copy link
Author

schmfr commented Dec 14, 2015

Did you already look over it?

We had to release a fixed 0.1.6. version to our local artifactory. Due to a lack of time we will not merge it with your further releases which is a pity. Therefore, a fix in your master branch is highly appreciated.
Do you need any further help to investigate the problem/fix?

@jarohen
Copy link
Owner

jarohen commented Dec 15, 2015

Hi there,

I'm afraid I'd really rather not merge a patch that meant that extracting one of a function call's parameters into a 'let' results in different behaviour - this is quite contrary to what a Clojure developer could reasonably expect.

Apologies for the delay with this - I've been really busy at work in the run up to Christmas - but I think it'd be preferable that we continue to look for a solution that solves this without resorting to a macro.

James

jarohen added a commit that referenced this pull request Dec 15, 2015
@jarohen
Copy link
Owner

jarohen commented Dec 15, 2015

I've been having a quick look around at other people facing the same problems - it seems like http://dev.clojure.org/jira/browse/ASYNC-138 might be the root cause?

With that in mind - I've pushed a commit with the workaround from http://dev.clojure.org/jira/browse/ASYNC-32 - does that work for you?

(thanks @Bronsa :) )

jarohen added a commit that referenced this pull request Jan 2, 2016
@jarohen
Copy link
Owner

jarohen commented Jan 2, 2016

Hi @schmfr / @Bronsa,

I've merged the workaround from http://dev.clojure.org/jira/browse/ASYNC-32 into release 0.1.8, so will close this PR - thanks for all your help in tracking this one down!

James

@jarohen jarohen closed this Jan 2, 2016
@jarohen
Copy link
Owner

jarohen commented Jan 2, 2016

Make that 0.1.9 - Clojars part-failed to accept 0.1.8, then didn't let me retry :(

@schmfr
Copy link
Author

schmfr commented Jan 5, 2016

Nice solution! Thanks for your deeper investigations.

Frank

@jconti
Copy link

jconti commented Jul 6, 2017

I think I am still seeing this leak. Though it could be a red-herring introduced by my use of rx. I clearly was filling heap with the date-times from a long running periodic-seq. So it looks to me like the head of the periodic-seq is being retained (using [jarohen/chime "0.2.2"]). My code for the offending observable is as follows (admittedly very hard to follow if one is not familiar with rxjava and the beicon clojure wrapping of rxjava):

(defn ts-from-seq
  "Return an observable that emits a UTC long timestamp on a schedule defined
   by dt-seq. If :prime is true (the default) an initial timestamp is sent."
  [dt-seq & opts]
  (let [{:keys [prime] :or {prime true}} opts]
    (rx/create (fn [sink]
                 (let [ch (chime/chime-ch dt-seq)]
                   (go-loop []
                     (when (<! ch)
                       (sink (tm/ts-now))
                       (recur)))
                   (if prime (sink (tm/ts-now)))
                   (fn []
                     (a/close! ch)))))))

@jarohen
Copy link
Owner

jarohen commented Jul 6, 2017

Hi @jconti, thanks for the report. Without diving into rx I can't really tell what's causing the leak - could you see if you can reproduce the issue without that?

One possible cause could be that the fn closes over the dt-seq, which is holding on to dt-seq for as long as the fn is in scope. Would it be possible to move the ch binding out of the fn without altering the required behaviour?

(defn ts-from-seq
  "Return an observable that emits a UTC long timestamp on a schedule defined
   by dt-seq. If :prime is true (the default) an initial timestamp is sent."
  [dt-seq & opts]
  (let [{:keys [prime] :or {prime true}} opts
        ch (chime/chime-ch dt-seq)]
    (rx/create (fn [sink]
                 (go-loop []
                   (when (<! ch)
                     (sink (tm/ts-now))
                     (recur)))
                 (if prime (sink (tm/ts-now)))
                 (fn []
                   (a/close! ch))))))

(@Bronsa might have more of an insight here?)

Cheers,

James

@jconti
Copy link

jconti commented Jul 21, 2017

Sorry, didn't see your note. I will try the experiment you suggest.

Though my interpretation is that I have not closed over dt-seq, but rather ch. I say this not out of knowledge of the clojure eval environment, however. My opinion is just that there is no actual deref of the dt-seq var within the anonymous fn's form.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants