Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for Classloader leak issue #465

Merged
merged 5 commits into from
Sep 20, 2017
Merged

Conversation

peterbae
Copy link
Contributor

Fixes issue #314

I put a comment as to why I took this approach for solving this problem in that issue as well.

@msftclas
Copy link

@peterbae,
Thanks for your contribution as a Microsoft full-time employee or intern. You do not need to sign a CLA.
Thanks,
Microsoft Pull Request Bot

@codecov-io
Copy link

codecov-io commented Aug 31, 2017

Codecov Report

Merging #465 into dev will decrease coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff              @@
##                dev     #465      +/-   ##
============================================
- Coverage     46.27%   46.23%   -0.05%     
- Complexity     2201     2206       +5     
============================================
  Files           108      108              
  Lines         25260    25239      -21     
  Branches       4176     4175       -1     
============================================
- Hits          11688    11668      -20     
- Misses        11547    11556       +9     
+ Partials       2025     2015      -10
Flag Coverage Δ Complexity Δ
#JDBC41 45.95% <100%> (-0.18%) 2193 <5> (ø)
#JDBC42 46.13% <100%> (-0.02%) 2204 <5> (+9)
Impacted Files Coverage Δ Complexity Δ
.../microsoft/sqlserver/jdbc/SQLServerConnection.java 45.84% <100%> (+0.04%) 274 <0> (ø) ⬇️
...m/microsoft/sqlserver/jdbc/ActivityCorrelator.java 91.66% <100%> (+1.66%) 7 <5> (+3) ⬆️
...java/com/microsoft/sqlserver/jdbc/StringUtils.java 25% <0%> (-12.5%) 3% <0%> (-1%)
...m/microsoft/sqlserver/jdbc/SQLServerException.java 75.2% <0%> (-2.03%) 28% <0%> (-1%)
...n/java/com/microsoft/sqlserver/jdbc/Parameter.java 61.37% <0%> (-1.29%) 63% <0%> (-1%)
...va/com/microsoft/sqlserver/jdbc/StreamColInfo.java 78.26% <0%> (-0.91%) 6% <0%> (ø)
...va/com/microsoft/sqlserver/jdbc/StreamTabName.java 85% <0%> (-0.72%) 7% <0%> (ø)
...oft/sqlserver/jdbc/SQLServerBulkCSVFileRecord.java 41.63% <0%> (-0.5%) 28% <0%> (ø)
...rc/main/java/com/microsoft/sqlserver/jdbc/dtv.java 62.69% <0%> (-0.43%) 0% <0%> (ø)
...m/microsoft/sqlserver/jdbc/SQLServerResultSet.java 32.61% <0%> (-0.37%) 241% <0%> (-1%)
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5597d4d...4acd905. Read the comment docs.

@msftclas
Copy link

@peterbae, thanks for signing the contribution license agreement. We will now validate the agreement and then the pull request.

Thanks, Microsoft Pull Request Bot

@marschall
Copy link
Contributor

Don't you need proper synchronization when accessing a shared object (the map)?
I can't find where the values are removed from the map again. If that doesn't happen, then you end up with a memory leak.

@peterbae peterbae changed the title Classloader leak 314 Fix for Classloader leak issue Aug 31, 2017
@peterbae
Copy link
Contributor Author

Hi @marschall,

I thought about making the map synchronized - it's true that the map is shared by multiple threads, but since each entry in the map is only going to be modified by one thread (the thread can only access the entry in the map that shares the same key as its thread ID), I do not believe any two threads will ever try to access/modify the same entry in the map, and therefore this operation should be thread safe.

As for the second point, I agree. I was thinking the map would get removed when the app server gets undeployed, but it may never get undeployed, in which case the map would keep holding entries that will never be used again. I've made changes so that a thread will remove its own entry in the map when its connection gets closed (which should be thread safe for the same reason above).

@marschall
Copy link
Contributor

marschall commented Sep 1, 2017

I thought about making the map synchronized - it's true that the map is shared by multiple threads, but since each entry in the map is only going to be modified by one thread (the thread can only access the entry in the map that shares the same key as its thread ID), I do not believe any two threads will ever try to access/modify the same entry in the map, and therefore this operation should be thread safe.

That's not how thread safety in Java works at all. And it's not true since the same entry is modified in case of a hash collision. In addition the size, the modification count, the backing array and the pointer to the backing array are modified by all threads.

For a quick introduction into the JMM I recommend the following articles

https://shipilev.net/blog/2014/safe-public-construction/
https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/

As for the second point, I agree. I was thinking the map would get removed when the app server gets undeployed, but it may never get undeployed, in which case the map would keep holding entries that will never be used again.

You mean the application gets undeployed? That doesn't fix your issue because the database driver should not be part of the application deployment and therefore it does not become eligible for garbage collection.

I've made changes so that a thread will remove its own entry in the map when its connection gets closed (which should be thread safe for the same reason above).

Not in an application server with a connection pool. The worker threads calling methods on on driver objects and the threads opening and closing the connection can be different ones.

@peterbae
Copy link
Contributor Author

peterbae commented Sep 5, 2017

Thanks @marschall for your insight. It was an oversight on my end to disregard the hash collision and misusing HashMap here. I've replaced the HashMap with a ConcurrentHashMap.

For the issue where the static map that holds ActivityId may not be removing the unused entries due to the scenario where the threads that are using the driver object and closing the connection being different, I believe we need a way to track the threads that are alive and periodically remove entries from the map. However, this will likely introduce performance degradation and will need further input from the team, and consider if the potential memory being held by the map is worth the performance gain. Meanwhile, since I believe the classloader leak issue itself is fixed, I think it's still good to merge this in unless more problems are found with this solution.

@peterbae
Copy link
Contributor Author

peterbae commented Sep 7, 2017

Thanks @rPraml, I've made the refactoring changes and also made some commenting changes.

@rPraml
Copy link
Contributor

rPraml commented Sep 7, 2017

@peterbae I do not see any major problems in your fix, It's true that you have a memory leak when you start & stop a lot of threads AND Util.IsActivityTraceOn() = true. But I think this behaviour is better than the classloading issue.

I also tried to find out what ActivityId is good for.
It seems to me, that ActivitiId & ActivitiCorrelator is used for logging purpose and ONLY used if "com.microsoft.sqlserver.jdbc.traceactivity" is set to "on".

I also found often this code block:

if (loggerExternal.isLoggable(Level.FINER) && Util.IsActivityTraceOn()) {
  loggerExternal.finer(toString() + " ActivityId: " + ActivityCorrelator.getNext().toString());
}

If I interpret this right, the logger statement calls ActivityCorrelator.getNext() - which will increment the ID. This means, the ActivityId counts different if log level is FINER or not. This looks like a bug/mistake...
(Maybe ,getCurrent() instead of .getNext() is correct here)

Maybe some other people from DEV can take a look at the ActivityCorrelator. If this is really only for debugging (https://msdn.microsoft.com/en-us/library/hh269638.aspx) maybe you can use a static AtomicInteger and remove some complexity?

@peterbae
Copy link
Contributor Author

peterbae commented Sep 7, 2017

@rPraml We call ActivityCorrelator.getNext so that we increment the sequence and then log it. If we were to call getCurrent instead here, we would always be logging the same sequence since we're never incrementing it (it also seems like we're only concerned with logging when the log level is FINER), so getNext is fine here.

As for the ActivityCorrelator, it's also used in other parts of the driver to store if a message was sent to the server (in TDSWriter, for example), so it's not only used for debugging.

@rPraml
Copy link
Contributor

rPraml commented Sep 8, 2017

@peterbae thanks for clarifying.
It really seems that incrementing the ID is only important when logging is enabled as there is only one place in SQLServerConnection.Prelogin where .getNext() is called. (This means, the potential memory leak is there)

@peterbae
Copy link
Contributor Author

peterbae commented Sep 8, 2017

@rPraml yes. As discussed above, the issue with this solution is that the ActivityId objects (which are relatively small) are potentially being held in the static map in ActivityCorrelator in an application server with a connection pool, but in order to clean up the ActivityId objects in such scenario, the driver would see some performance degradation. I would still like to merge this fix as an improvement.

@peterbae
Copy link
Contributor Author

@marschall @rPraml please let me know if there's other issues with this fix, otherwise I will merge it in. Thanks.

@peterbae peterbae merged commit e6f66e9 into microsoft:dev Sep 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants