Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with runtime type and default initialization #15929

Open
mppf opened this issue Jun 24, 2020 · 13 comments
Open

Problem with runtime type and default initialization #15929

mppf opened this issue Jun 24, 2020 · 13 comments

Comments

@mppf
Copy link
Member

mppf commented Jun 24, 2020

After PR #15869, a problem came up in nightly testing that can be reproduced by the following code:

record R {
  type t;
  var x: int;
  proc init(type t, in arg: int = 1) {
    this.t = t;
    this.x = arg;
  }
}

proc main() {
  var A:[1..3] int;
  var rr: R(A.type);

  var B:rr.t;
  writeln(B);
}

What happens is that the access rr.t ends up pointing to uninitialized memory. This is because the compiler computes the type expression R(A.type) separately from initializing rr. In particular, in createGenericRecordVarDefaultInitCall, it creates an uninitialized temporary storing the runtime type (see the code creating a variable named default_runtime_temp and the related comment).

A workaround is to use e.g. var rr = new R(A.type).

I can see two ways we might be able to fix the bug:

  1. Associate runtime types of fields with runtime types of aggregates. For example, the code could have been type t = R(A.type); var rr: t and in that event the only way to get the value A.type at runtime into rr is if the t variable stores the runtime type for its field. This might be relatively straightforward for records but might become untenable or impossible to do with classes. (But maybe that is a restriction we could document). This has been a long-term todo - see handle record and class types with runtime types #11549.
  2. Adjust the compiler so that it can handle the runtime types in the declaration var rr: R(A.type); more directly. In particular, this might involve adjusting PRIM_DEFAULT_INIT_VAR to accept extra arguments for runtime types of fields.
daviditen pushed a commit to daviditen/chapel that referenced this issue Jun 24, 2020
`Map.map` ran into issue chapel-lang#15929 . As a workaround, we can avoid the default
initializer and initialize with new instead.

Also, fix an typo where a method was called without the receiver argument.
@daviditen
Copy link
Member

Unfortunately the workaround isn't working for maps of classes. After doing the workaround on the map.table field, I'm seeing internal compiler errors for this code:

use Map;

class C {
  var i: int;
}

var m = new map(int, owned C?);
m.add(1, new owned C?(1));
$CHPL_HOME/modules/standard/Map.chpl:74: internal error: invalid attempt to get reference type [AST/primitive.cpp:235]

With this diff from master:

diff --git a/modules/standard/Map.chpl b/modules/standard/Map.chpl
index 9f1a67136e..29b4676628 100644
--- a/modules/standard/Map.chpl
+++ b/modules/standard/Map.chpl
    param parSafe = false;

    pragma "no doc"
-   var table: chpl__hashtable(keyType, valType);
+   var table = new chpl__hashtable(keyType, valType);

    pragma "no doc"
    var _lock$ = if parSafe then new _LockWrapper() else none;

@mppf
Copy link
Member Author

mppf commented Jun 25, 2020

@daviditen - I'd suspect that is a different issue but I'm not seeing this problem with revision 27f25b4 and your patch applied on my system.

@daviditen
Copy link
Member

Strange. I'm currently fully updated and see it on my Mac laptop, chapcs11, and the travis build of mason for PR #15932 .

@mppf
Copy link
Member Author

mppf commented Jun 26, 2020

@daviditen - I've checked out your branch on chapcs11 and I'm not seeing the same issue. Is it possible that there is an error in the test program pasted above? Works for me on my Mac too.

and the travis build of mason for PR #15932 .

I do see the error when building mason. Here it looks like the problem is that the compiler is trying to resolve PRIM_ADDR_OF with a chpl__hashtable variable that has generic type. I'm pretty sure that is a separate error and I would appreciate it if you are able to make a small reproducer and issue.

I went ahead and tried a different workaround in Map. I applied this patch (to your branch):

diff --git a/modules/standard/Map.chpl b/modules/standard/Map.chpl
index 9f1a67136e..343469ca6e 100644
--- a/modules/standard/Map.chpl
+++ b/modules/standard/Map.chpl
@@ -71,7 +71,7 @@ module Map {
     param parSafe = false;
 
     pragma "no doc"
-    var table = new chpl__hashtable(keyType, valType);
+    var table: chpl__hashtable(keyType, valType);
 
     pragma "no doc"
     var _lock$ = if parSafe then new _LockWrapper() else none;
@@ -117,6 +117,7 @@ module Map {
       this.keyType = keyType;
       this.valType = valType;
       this.parSafe = parSafe;
+      this.table = new chpl__hashtable(keyType, valType);
     }
 
     proc init(type keyType, type valType, param parSafe=false)
@@ -143,6 +144,7 @@ module Map {
       this.keyType = keyType;
       this.valType = valType;
       this.parSafe = parSafe;
+      this.table = new chpl__hashtable(keyType, valType);
     }
 
     /*
@@ -164,6 +166,7 @@ module Map {
       this.keyType = kt;
       this.valType = vt;
       this.parSafe = ps;
+      this.table = new chpl__hashtable(keyType, valType);
 
       this.complete();
 

and after that, mason builds for me.

@daviditen
Copy link
Member

Huh. Somehow the code from my comment is passing for me too now. I guess I copied the wrong thing. Sorry.

I would appreciate it if you are able to make a small reproducer and issue.

test/library/standard/Map/testBorrowedMap.chpl is about the best I've been able to do so far. I tried to tease it out of map, but without luck. I opened issue #15960.

and after that, mason builds for me.

But then we're back into the problem this issue is about and optimizations/remoteValueForwarding/serialization/globals.chpl and optimizations/remoteValueForwarding/serialization/staticSize.chpl fail again.

daviditen pushed a commit to daviditen/chapel that referenced this issue Jul 7, 2020
Reformat .future files to look like:
bug: problem with x of arrays
chapel-lang#15929
bmcdonald3 added a commit that referenced this issue Oct 27, 2021
Retire old map future and add new future to replace it

[ reviewed by @daviditen ]

PR #18523 added a new in the initializer for the chpl__hashtable in map, which caused this future test that was failing due to #15929 to be resolved, even though the underlying issue wasn't fixed, which is a known issue with default initializing a record with type that isn’t known until runtime. I spoke with David and decided to retire the old future and add a new one to keep tracking the underlying issue. The setOfArray future is unrelated and is failing due to #16033, a compiler issue.
bradcray added a commit that referenced this issue Oct 29, 2021
Fix some problems in sets / hash tables of arrays

[reviewed by @bmcdonald3, helped along by @mppf]

This fixes support for sets of arrays by fixing a `==`-based key comparison
and using an explicit `new` call to create the underlying hash table rather
than relying on default initialization (contributed by @mppf).

Specifically, we were using `key == key` to look for existing keys before, but
for an array type, this will generate an array of bools rather than a single bool
as we require here.  So this PR pushes the test into a helper routine to compute
it, which is sufficient to get the setOfArray.chpl future working.

I then extended setOfArray.chpl to see if we could actually work with the set,
and it turned out that we couldn't due to #15929 until we applied the workaround
in that issue to ensure that the set's element type's runtime type was established
properly  (thanks to @mppf).

While working on this, I added a pair of futures that demonstrate that a generic
record instantiated with an array doesn't work: `genericRecordOfArray*.chpl`
which seem related to:
* #15929
* #13808 and/or
* #11549
but felt simpler than any futures I quickly found for those cases.

Resolves #16033.
@bradcray
Copy link
Member

bradcray commented Nov 1, 2021

@mppf: I just slammed back into this while trying to write a new test to lock something in and took an embarrassingly long time to debug it given that I've been working around it for the past few workdays (though I guess it is a Monday). Anyway, it made me wonder what we might be able to do to catch and complain about such cases rather than having them go through and result in unexpected behavior. But so far all I've done is wonder rather than actually think about it, so I'm putting this here to see what it jogs for you or remind myself to think about it more.

@mppf
Copy link
Member Author

mppf commented Nov 1, 2021

I wouldn't expect it to be hard to emit an error or warning for it. The things that seem hard to me are to get it working comprehensively in all of these cases.

@mppf
Copy link
Member Author

mppf commented Nov 1, 2021

@bradcray - here is a patch that adds a warning for the whole block of code that is problematic. We see the warning for the test in this issue but not for hello world. A reasonable next step would be to run testing with it to see if we have cases relying on it today. (I suspect that there used to be, especially within the array implementation, but that these have been gradually migrated away from this pattern).


diff --git a/compiler/resolution/functionResolution.cpp b/compiler/resolution/functionResolution.cpp
index cc01b21d7d..ac26be7a31 100644
--- a/compiler/resolution/functionResolution.cpp
+++ b/compiler/resolution/functionResolution.cpp
@@ -11409,6 +11409,7 @@ static CallExpr* createGenericRecordVarDefaultInitCall(Symbol* val,
       appendExpr = new SymExpr(value);
     } else if (field->hasFlag(FLAG_TYPE_VARIABLE)) {
       if (value->getValType()->symbol->hasFlag(FLAG_HAS_RUNTIME_TYPE)) {
+        USR_WARN(call, "Problematic default init of runtime type");
         // BHARSH 2018-11-02: This technically generates code that would
         // crash at runtime because aggregate types don't contain the runtime
         // type information for their fields, so this temporary will go

@bradcray
Copy link
Member

bradcray commented Nov 1, 2021

OK, thanks! I suspect I can run a paratest against this this afternoon if you don't get to it first.

@bradcray
Copy link
Member

bradcray commented Nov 2, 2021

The result was 29 failures, the vast majority of which had the output:

> $CHPL_HOME/modules/internal/ChapelBase.chpl:897: warning: Problematic default initialization of runtime type
> $CHPL_HOME/modules/internal/ChapelBase.chpl:903: warning: Problematic default initialization of runtime type
> $CHPL_HOME/modules/internal/DefaultRectangular.chpl:1504: warning: Problematic default initialization of runtime type
> $CHPL_HOME/modules/internal/DefaultRectangular.chpl:967: warning: Problematic default initialization of runtime type

(where the main other cases were tests that created sync/single of array or the like—not currently supported, but generate the error).

I haven't dug into this at all yet, but with this quick look wonder whether—if we were to replace the immediate warning with an insertion of a compilerError() (or warning) would determine that the code paths in question are never used. If so, we could either stick with the compilerError() or we could just squash the warning for any non-user modules.

@mppf
Copy link
Member Author

mppf commented Nov 2, 2021

I suppose that those errors are indicating that there's probably code you could write with array-of-record-containing-array that would lead to memory errors. Perhaps we could/should silence the warning for internal modules in the near term though.

@dlongnecke-cray
Copy link
Contributor

dlongnecke-cray commented Jan 26, 2022

Just ran into this again in #19118.

For the pattern var obj: foo(domain(3)), I think we are constructing the runtime type, it's just that we discard it when we resolve foo(domain(3)) by replacing that expression with a SymExpr pointing to a type. So I think this might be a case of discarding info too early when an actual in the type construction call is a runtime type.

This doesn't happen in new foo(domain(3)) because we correctly pass the constructed runtime type into the init call (and then resolve that init call all at once, I'd assume).

For the code:

record foo {
  type T;
  proc doSomething(in val: T) {
    writeln(val);
  }
}

proc test() {
  var obj: foo(domain(3));
  obj.doSomething({0..1, 2..3, 4..5});
}
test();

Coming at you with DavidL levels of excessive detail, now, because I need input on my idea.

The logged AST before resolve looks like this:

unknown obj[200288] "dead at end of block" "insert auto destroy"
unknown call_tmp[723085] "expr temp" "maybe param" "maybe type" "temp"

// Here's the runtime type we construct for `domain(3)`
(723088 'move' call_tmp[723085](200284 call chpl__buildDomainRuntimeType defaultDist[92798] 3))
unknown call_tmp[723090] "expr temp" "maybe param" "maybe type" "temp"

// The inner expression in the move is the foo(domain(3)) part of the record declaration in 'var obj: foo(domain(3))'
(723093 'move' call_tmp[723090](200287 call foo[200269](?) call_tmp[723085]))
(555989 'default init var' obj[200288] call_tmp[723090])

When the compiler goes to resolve (200287 call foo[200269](?) call_tmp[723085]), the foo(domain(3)) type specifier, it opts to replace that entire call with a SymExpr pointing to the instantiated record type.

This is what causes the runtime type to be lost.

I think we could opt to not postfold these type instantiation calls at all, or conditionally not fold them if they contain a runtime type. Either way would work, I think?

When resolving the default-init we can consult the call and compute the type symbol later if we need to in the same fashion as postfold.

This would allow the default-init to just use the runtime type as it appears in the code instead of constructing an empty temp.

Thoughts?

Another option would be to walk backwards from the default-init and look for a specific pattern, but I am less confident about that idea.

@dlongnecke-cray
Copy link
Contributor

dlongnecke-cray commented Jan 27, 2022

So I ran with my idea a bit and came up with something that actually works for my test case above, but day22a-listOfDomain.chpl is exploding due to problems in pass parallel.

As I was working on this, I kept asking "why do we replace type constructor calls with SymExprs to the type at all"? It seems like the situation would be handled naturally if we just didn't do that, because for foo(domain(3)) you can still get the fully resolved type from the baseExpr of the call. So I'm increasingly feeling this is a situation where we just replace nodes a bit too eagerly when resolving (it's not actually the normalizer's fault this time!).

(Well, it's also kind of the normalizer's fault, because the situation would be even better if we could just see (PRIM_DEFAULT_INIT_VAR obj (call foo domain(3))) without any temps in there...)

dlongnecke-cray added a commit that referenced this issue Aug 22, 2022
…ic-record-rtt

Fix generic record initialization when substitution is a runtime type

Attempt to recover runtime types used in the type construction of
default initialized records. This fixes some cases where a
default initialized record containing an array or domain field
would halt at runtime when said field was used.

Resolves #19118.
Resolves #20167.

The parent issue for this effort is: #15929.

Also consider: #19292, #20424

This PR does not fix all bugs related to default initialization of
aggregates that contain runtime types. The bug persists when
a type constructor containing a runtime type is obfuscated by
indirection (e.g, it is returned by a function, or uses 2+ levels of
type alias). See the added future `RuntimeTypeEscapes.chpl` for an
example.

Reviewed by @vasslitvinov. Thanks!

TESTING

- [x] `ALL` on `linux64`, `standard`

Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants