java.lang.OutOfMemoryError: Java heap space I met this error #54

WuDiDaBinGe · 2021-07-25T09:11:02Z

java.lang.OutOfMemoryError: Java heap space
	at com.carrotsearch.hppc.Internals.newArray(Internals.java:37)
	at com.carrotsearch.hppc.IntObjectOpenHashMap.allocateBuffers(IntObjectOpenHashMap.java:364)
	at com.carrotsearch.hppc.IntObjectOpenHashMap.expandAndPut(IntObjectOpenHashMap.java:318)
	at com.carrotsearch.hppc.IntObjectOpenHashMap.put(IntObjectOpenHashMap.java:194)
	at org.aksw.palmetto.corpus.lucene.WindowSupportingLuceneCorpusAdapter.requestDocumentsWithWord(WindowSupportingLuceneCorpusAdapter.java:124)
	at org.aksw.palmetto.corpus.lucene.WindowSupportingLuceneCorpusAdapter.requestWordPositionsInDocuments(WindowSupportingLuceneCorpusAdapter.java:102)
	at org.aksw.palmetto.prob.window.BooleanSlidingWindowFrequencyDeterminer.determineCounts(BooleanSlidingWindowFrequencyDeterminer.java:54)
	at org.aksw.palmetto.prob.window.BooleanSlidingWindowFrequencyDeterminer.determineCounts(BooleanSlidingWindowFrequencyDeterminer.java:45)
	at org.aksw.palmetto.prob.AbstractProbabilitySupplier.getProbabilities(AbstractProbabilitySupplier.java:37)
	at org.aksw.palmetto.DirectConfirmationBasedCoherence.calculateCoherences(DirectConfirmationBasedCoherence.java:87)
	at org.aksw.palmetto.webapp.PalmettoApplication.calculate(PalmettoApplication.java:198)
	at org.aksw.palmetto.webapp.PalmettoApplication.npmiService(PalmettoApplication.java:111)
	at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
	at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:440)
	at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:428)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:933)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:867)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:951)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:842)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:827)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:106)

I ran into this issue when using multiple threads to compute topic coherence.
My machine has 16 GB of RAM and an Intel i9 CPU.

MichaelRoeder · 2021-07-26T11:13:17Z

In general, this behavior is expected if you try to use many threads that evaluate different topics in parallel.

The problem is that window-based coherence measures need to know the positions of the single words within documents. If you have words that occur often, the program has to handle many positions at the same time. If you do that in parallel with different topics that have different words, it is not very surprising that the program runs out of memory 😉
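To make the memory argument concrete, here is a toy sketch of a boolean sliding-window count. It is not Palmetto's implementation; the function names `word_positions` and `cooccurrence_windows` and the list-of-tokens corpus format are assumptions made for illustration. The point is that the position index holds one entry per occurrence of a word, so frequent words produce large structures, and several threads each build their own.

```python
from collections import defaultdict


def word_positions(docs):
    """Map each word to {doc_id: [token positions]}.

    This is the structure that grows with word frequency: one integer
    per occurrence of the word in the corpus.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, tokens in enumerate(docs):
        for pos, token in enumerate(tokens):
            index[token][doc_id].append(pos)
    return index


def cooccurrence_windows(index, docs, w1, w2, window):
    """Count sliding windows of `window` tokens that contain both words."""
    count = 0
    # Only documents that contain both words can contribute.
    shared_docs = index[w1].keys() & index[w2].keys()
    for doc_id in shared_docs:
        p1 = set(index[w1][doc_id])
        p2 = set(index[w2][doc_id])
        n_windows = max(len(docs[doc_id]) - window + 1, 1)
        for start in range(n_windows):
            span = range(start, start + window)
            # "Boolean" window: each window counts at most once.
            if any(p in p1 for p in span) and any(p in p2 for p in span):
                count += 1
    return count
```

With a real corpus the inner lists for common words can hold millions of positions, which is why evaluating several topics at once multiplies the peak heap usage.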

It is hard to give you a hint without more information.

How have you parallelized the workflow (i.e., what is the task of a single thread)?
How many threads do you use?
How many topics do you try to evaluate?
How many top words does one of your topics have?

WuDiDaBinGe · 2021-07-28T14:00:08Z

Thanks for your reply.
I use three threads to calculate c_a, c_p and npmi respectively, and I send the same data to all three threads. There are 100 topics, and each topic has its top 10 words to evaluate. topic_words is a topic-word matrix; in my case, its shape is (100, 10).

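import threading

# `palmetto` is the python-Palmetto client; `topic_words` and the shared dict `ret` are created elsewhere.
def calculate_coherence(word_list, ret, coherence_type):
    # Score every topic with one coherence measure and store the scores under that measure's name.
    result = []
    for words in word_list:
        result.append(palmetto.get_coherence(words, coherence_type=coherence_type))
    ret[coherence_type] = result

# One thread per coherence measure; all three receive the same topic-word matrix.
th_ca   = threading.Thread(target=calculate_coherence, args=[topic_words, ret, 'ca'], name='th_ca')
th_cp   = threading.Thread(target=calculate_coherence, args=[topic_words, ret, 'cp'], name='th_cp')
th_npmi = threading.Thread(target=calculate_coherence, args=[topic_words, ret, 'npmi'], name='th_npmi')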

I have mitigated this problem by running "export CATALINA_OPTS="-Xms512m -Xmx3072m -XX:-UseGCOverheadLimit"" before "mvn org.apache.tomcat.maven:tomcat7-maven-plugin:2.2:run -Dmaven.tomcat.port=7777". This works when the number of topics is 75, but with 100 topics I often get the error "Aborted (core dumped)".

MichaelRoeder · 2021-07-30T10:40:42Z

Your setup looks good and should work. I am just wondering why you have -Xmx3072m in the options, as it limits the server's Java heap to at most 3 GB. You may want to increase it and try again.
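As a rough sketch of what raising the limit could look like when the server is launched from a script: the helpers `jvm_opts` and `server_env` below are hypothetical, and the JVM flags and Maven command are the ones quoted earlier in this thread. The thread does not confirm whether the Maven-launched Tomcat reads CATALINA_OPTS or MAVEN_OPTS, so the sketch sets both; treat that as an assumption.

```python
import os


def jvm_opts(max_heap_mb, min_heap_mb=512):
    """Build the JVM option string from the thread with a different -Xmx value."""
    if max_heap_mb < min_heap_mb:
        raise ValueError("max_heap_mb must not be smaller than min_heap_mb")
    return f"-Xms{min_heap_mb}m -Xmx{max_heap_mb}m -XX:-UseGCOverheadLimit"


def server_env(max_heap_mb, base_env=None):
    """Return a copy of the environment with the heap options set."""
    env = dict(os.environ if base_env is None else base_env)
    opts = jvm_opts(max_heap_mb)
    env["CATALINA_OPTS"] = opts
    # Assumption: set MAVEN_OPTS too, in case Tomcat runs inside the Maven JVM.
    env["MAVEN_OPTS"] = opts
    return env


# Command taken verbatim from the thread.
MVN_COMMAND = [
    "mvn",
    "org.apache.tomcat.maven:tomcat7-maven-plugin:2.2:run",
    "-Dmaven.tomcat.port=7777",
]
```

The server could then be started with `subprocess.run(MVN_COMMAND, env=server_env(8192))`. The value 8192 is only an illustration for a machine with 16 GB of RAM, not a recommendation from the maintainers.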

Another workaround would be to split up the list of documents and restart the server in-between. But that is a very bad solution 😉
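A minimal sketch of that workaround, assuming the list being split is the list of topics: `chunked` and `score_in_batches` are hypothetical helpers, `score_fn` stands in for whatever call computes one coherence value (for example a wrapper around the client's `get_coherence`), and `between_batches` is an optional hook where a server restart could be triggered.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def score_in_batches(topics, score_fn, batch_size=25, between_batches=None):
    """Score topics one batch at a time, in order.

    `between_batches` is called after every batch except the last one,
    e.g. to restart the server and release its memory.
    """
    results = []
    batches = list(chunked(topics, batch_size))
    for i, batch in enumerate(batches):
        results.extend(score_fn(words) for words in batch)
        if between_batches is not None and i < len(batches) - 1:
            between_batches()
    return results
```

Processing the batches sequentially keeps only one request in flight at a time, which trades speed for a lower peak heap on the server.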

We are aware of the problem that the web service sometimes has issues in budgeting its memory. Until now, it is unclear which part of the server creates the problem since the Palmetto library runs without memory issues if it is executed as a plain Java program.

WuDiDaBinGe · 2021-07-30T11:11:06Z

OK, I will increase "-Xmx" again. I use python-Palmetto, so I have not tried the Palmetto Java library. Maybe I will try it next time. Thanks.
