Service Stability: Long lived objects eventually cause repeated stop-the-world gc. #2013

Closed
kcondon opened this issue Apr 14, 2015 · 4 comments
Labels
Type: Bug (a defect)


@kcondon
Contributor

kcondon commented Apr 14, 2015

The recent instability in service today seems to be mostly due to continuous stop-the-world GC.

I know this because each time there was a failure I checked:
jstat -gcutil and
jmap -heap

Increasing the heap size helped, but it's still happening.

Suggested next steps are to review the access log and run a memory profiler against the app, looking for large object hierarchies that are created but never freed on the most common pages, such as the homepage.
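As an aside, the heap pressure that jstat/jmap report from outside the process can also be sampled in-process through the standard java.lang.management API. A minimal sketch (HeapCheck and heapOccupancy are illustrative names, not Dataverse code):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapCheck {
    /** Heap occupancy as a fraction of the configured maximum (-Xmx). */
    static double heapOccupancy() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return (double) heap.getUsed() / heap.getMax();
    }

    public static void main(String[] args) {
        System.out.printf("Heap occupancy: %.1f%%%n", heapOccupancy() * 100);
        // Per-collector counts and cumulative times, roughly the FGC/FGCT
        // columns that jstat -gcutil reports.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

An occupancy that climbs back toward 1.0 right after full collections is the in-process signature of the repeated stop-the-world GC described above.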

kcondon added this to the Dataverse 4.0: Release Patch milestone Apr 14, 2015
pdurbin added a commit that referenced this issue Apr 16, 2015
We're trying this because if we download via Apache going through AJP we
are seeing this for large files (>2 GB):

[2015-04-16T10:31:52.532-0400] [glassfish 4.1] [WARNING] []
[org.glassfish.grizzly.filterchain.DefaultFilterChain] [tid:
_ThreadID=251 _ThreadName=jk-connector(3)] [timeMillis: 1429194712532]
[levelValue: 900] [[ GRIZZLY0013: Exception during FilterChain execution
java.lang.OutOfMemoryError: Java heap space ]]
@mercecrosas
Member

FYI - we're still getting the occasional Service Temporarily Unavailable:

PROBLEM: HTTPS-dvn is CRITICAL on host dataverse.harvard.edu

Service: HTTPS-dvn
Host: dataverse.harvard.edu
Alias: Production installation of Dataverse at Harvard
Address: dataverse.harvard.edu
Host Group Hierarchy: Opsview > DVN > DVN_Production
State: CRITICAL
Date & Time: Sun Apr 19 04:54:14 EDT 2015

Additional Information:

HTTP CRITICAL: HTTP/1.1 503 Service Temporarily Unavailable - 582 bytes in 30.697 second response time

@pdurbin
Member

pdurbin commented Apr 22, 2015

Adding the Content-Length header in 2c81bea 5 days ago seems to have helped with large file downloads, but there is more complexity to figure out having to do with the Transfer-Encoding: chunked header for large files.
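For illustration, the Content-Length vs. chunked distinction is easy to see with the JDK's built-in com.sun.net.httpserver: passing the exact body length to sendResponseHeaders makes the server emit a Content-Length header, while passing 0 switches the response to Transfer-Encoding: chunked. A minimal sketch (class and path names are illustrative and unrelated to the 2c81bea commit):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class FixedLengthDownload {
    /** Serve payload at /download with an explicit Content-Length header. */
    static HttpServer serve(byte[] payload) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/download", exchange -> {
            // A positive length here makes the JDK server emit Content-Length;
            // a length of 0 would switch to Transfer-Encoding: chunked.
            exchange.sendResponseHeaders(200, payload.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(payload);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = serve(new byte[1 << 20]); // 1 MiB stand-in datafile
        System.out.println("Listening on port " + server.getAddress().getPort());
        server.stop(0);
    }
}
```

With a declared Content-Length, a proxy in front (Apache over AJP here) can stream the response through instead of buffering it, which is presumably why the header helped with >2 GB downloads.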

@scolapasta
Contributor

Passing to QA.

@kcondon
Contributor Author

kcondon commented Apr 26, 2015

Tested the original test case: downloading a specific large datafile (>2GB), which previously caused an immediate 503 error. This no longer happens. Used a direct-to-Glassfish connection on port 8181, with Apache rewrite rules to bypass Apache for downloading. Closing.
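For reference, a bypass along these lines might look like the following mod_rewrite fragment, which redirects download requests to Glassfish's HTTPS listener on 8181 instead of proxying them over AJP. The path pattern is a hypothetical example; the actual rules used in production are not shown in this thread:

```apache
# Hypothetical sketch: redirect file downloads straight to the Glassfish
# HTTPS listener so large responses never pass through Apache/AJP.
RewriteEngine On
RewriteRule ^/api/access/datafile/(.*)$ https://%{HTTP_HOST}:8181/api/access/datafile/$1 [R,L]
```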
