Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PythonScopeJpyImpl: Disable Conversion Cache #5123

Merged
merged 18 commits into from
Feb 12, 2024

Conversation

nbauernfeind
Copy link
Member

@nbauernfeind nbauernfeind commented Feb 8, 2024

This fixes a memory leak discovered in #5112 via repeated runs of:

import time, jpy
from deephaven import empty_table, garbage_collect

row_count = 10_000_000

begin_time = time.perf_counter_ns()
source = empty_table(row_count).update(["X = repeat(ii % 250, 100)"]) 
empty_table(1).select('Y = 1')
print('Rows / Sec:', row_count / ((time.perf_counter_ns() - begin_time) / 1_000_000_000))

Runtime = jpy.get_type('java.lang.Runtime')
print('Gigs Used Before GC:',
        (Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()) / 1024 / 1024 / 1024)

del source;
del row_count;
del begin_time
for i in range(10):
    garbage_collect()

print('Gigs Used After GC:',
        (Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()) / 1024 / 1024 / 1024)
del Runtime

Nightlies running here: https://github.com/nbauernfeind/deephaven-core/actions/runs/7833860264/

@nbauernfeind nbauernfeind added bug Something isn't working core Core development tasks python NoDocumentationNeeded ReleaseNotesNeeded Release notes are needed labels Feb 8, 2024
@nbauernfeind nbauernfeind self-assigned this Feb 8, 2024
@nbauernfeind nbauernfeind changed the title PythonScopeJpyImpl: Use a WeakHashMap to Allow Converted Values to be Garbage Collected PythonScopeJpyImpl: Only Cache the JNI Lookup Feb 8, 2024
jmao-denver
jmao-denver previously approved these changes Feb 9, 2024
Copy link
Contributor

@jmao-denver jmao-denver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. weak-values cache makes sense for the main global scope
  2. like the decoupling of QueryScope from QLP

Comment on lines +152 to +154
default Map<String, Object> toMap() {
return toMap(entry -> true);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this kind of a bad implementation when we could just copy?

@@ -98,8 +94,9 @@ public synchronized void init(TableDefinition tableDefinition) {

try {
final QueryScope queryScope = ExecutionContext.getContext().getQueryScope();
final Map<String, Object> possibleParams = queryScope.toMap();
for (Map.Entry<String, Object> param : possibleParams.entrySet()) {
final Map<String, Object> queryScopeVariables = queryScope.toMap(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me like the script session query scope should apply this predicate internally, as the standalone scope guards its input to avoid introducing any bad names. If we did that, then I think we wouldn't need the predicate in toMap().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm willing to do this if this is your preference. I was considering the other usages. They both filter to Table instances -- unfortunately they also need to unwrap first, and probably we would prefer to unwrap once.

final String columnSuffix = DhFormulaColumn.COLUMN_SUFFIX;
final Class<?> vectorType = DhFormulaColumn.getVectorType(columnDefinition.getDataType());

possibleVariables.put(columnDefinition.getName() + columnSuffix, vectorType);
columnVariables.add(columnDefinition.getName() + columnSuffix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Order is opposite from condition filter. Is that bad? It could get confusing if we had column names ending in underscores, regardless.

I think in a later PR, we should combine this phase from ACF and FA into one utility method.

nbauernfeind and others added 2 commits February 9, 2024 20:17
…/QueryLanguageParser.java

Co-authored-by: Ryan Caudy <rcaudy@gmail.com>
…/QueryLanguageParser.java

Co-authored-by: Ryan Caudy <rcaudy@gmail.com>
Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Comment on lines 50 to 51
// forcefully load this helper class which imports python modules
PyCallableWrapperJpyImpl.init();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried this is some kind of band-aid allowing us to mask a deeper issue.

import io.deephaven.engine.util.PyCallableWrapperJpyImpl;
import io.deephaven.integrations.python.PythonObjectWrapper;

public abstract class PythonImportInitializer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like this pattern could be replaced with service loader.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem w/ service loader is that the classes are on the classpath regardless of whether we are running python or groovy. In this current pattern, it's only invoked when we are coming from known-python script session.

@nbauernfeind nbauernfeind changed the title PythonScopeJpyImpl: Only Cache the JNI Lookup PythonScopeJpyImpl: Disable Conversion Cache Feb 12, 2024
@nbauernfeind nbauernfeind enabled auto-merge (squash) February 12, 2024 21:25
@nbauernfeind nbauernfeind merged commit 9fcaba7 into deephaven:main Feb 12, 2024
18 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Feb 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working core Core development tasks NoDocumentationNeeded python ReleaseNotesNeeded Release notes are needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants