Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

array of owned class cannot have its domain resized smaller #14362

Open
BryantLam opened this issue Oct 30, 2019 · 12 comments
Open

array of owned class cannot have its domain resized smaller #14362

BryantLam opened this issue Oct 30, 2019 · 12 comments

Comments

@BryantLam
Copy link

BryantLam commented Oct 30, 2019

Related #14361, #14273

Bug. Nilability. Resizing to a smaller domain when an associated array is of managed non-nilable class type will attempt to default-assign nil to the disappearing elements.

class MyInt {
    var x: int;
}

var D: domain(string);
var table: [D] owned MyInt;

proc add(key: string, val: int) {
    D += key;
    table[key] = new MyInt(val);
}

add("1", 1);
add("2", 2);
writeln(table);

proc remove(key: string) {
    // table[key].clear();
    D -= key;
}

remove("1");
writeln(table);
# chpl version 1.21.0 pre-release (8b3bca92)
{x = 1} {x = 2}
nilable-string-domain.chpl:19: error: halt reached - assigning nil to non-nilable owned
@bradcray
Copy link
Member

As in issue #14631, this behavior seems specific to the owned type. Specifically, changing the array element type to shared MyInt seems to work. But as with that issue and #14637, I suspect that there are still likely to be underlying problems.

@bradcray
Copy link
Member

This issue is becoming my brainteaser of the week. I've got a simple patch that avoids sticking the nil into the array and generates the "right" result, but it also doesn't transfer ownership (so the owned object persists). I could transfer the ownership out of the array's slot and drop it on the floor, but then what to put into that slot? I feel stuck in a mental dead-end.

Maybe collections of non-nilable like this one should give the appearance of storing non-nilable values, but should actually store nilable and then just assert non-nil at any interface boundaries that return the value back to the user?

Bryant, in filing this do you have a different/better solution in mind?

@BryantLam
Copy link
Author

BryantLam commented Oct 30, 2019

For this issue where the domain is removing an element, one approach is to have an unsafe nilability-ignoring function such as <management intent>.clear() to drop the object implicitly (instead of explicitly setting it to nil). The documentation for .clear() could indicate that .clear() is a nilability-ignoring/bypassing function. However, this is papering over the issue and introduces this rules-breaking function. Maybe that's okay (it's at least grep-able), but it certainly doesn't juxtapose well with object creation (e.g., when the domain is sized up).

A better solution would be some variation of #14273 where an annotated "unsafe" block can temporarily treat the object as nilable to do whatever, so long as the end result when closing the block keeps the non-nilable invariant. It's on the user to make sure the invariant holds, so the compiler can continue to make optimizing assumptions. Maybe this is too laborious, but it's certainly easier to debug / blame when you know the full scope of when nilable behaviors can happen to non-nilable types.

class MyVector {
  type eltType; // where isClass(eltType)
  var buffer: /* some non-nilable 0-based array treated as raw memory */;
  var size: uint = 0; // current size of this vector.

  /* Resize this container to smaller than its current size, otherwise nop. */
  proc truncate(newSmallerSize: uint) {
    if newSmallerSize < this.size {
      AssumeAsNilable(buffer) {
//    ^^^^^^^^^^^^^^^^^^^^^^^
        if isUnamangedClassType(eltType) {
          /* explicitly delete instead of implicitly dropping it */
        } else {
          buffer[newSmallerSize..(this.size - 1)] = nil;
        }
      }
      this.size = newSmallerSize;
    }
  }

  // The rest of the methods in this container
  // don't need to assume any nil objects because
  // there aren't any in the buffer up to this.size.
}

Somewhat a contrived example, but it shows that this approach allows a user to manually manage an array with all the dangers that come with it (the user manages their own invariants).

@bradcray
Copy link
Member

one approach is to have an unsafe nilability-ignoring function such as .clear() to drop the object implicitly

I don't think I understand this description. Is your thought that the owned object would have its ownership transferred and dropped on the floor (de-initing the object), but that the object pointer that is stored in memory would continue to point to the stale object rather than being reassigned to nil?

A better solution would be some variation of #14273 where an annotated "unsafe" block can temporarily treat the object as nilable to do whatever, so long as the end result when closing the block keeps the non-nilable invariant.

To me, this feels equivalent to having the collection store the nilable variant of the user-visible type and then to ensure that only non-nilable values are returned at the public interfaces. Specifically, I'm noting that your proposal still uses nil as a placeholder value and relies on the inherent semantics of the data structure to ensure that nil is never returned (relying on language changes when an escape is needed). If instead the collection were just to store collections of C? internally to implement a collection of C, we wouldn't need to change the language, and the collection's developer would still have to worry about similar concerns when implementing it (e.g., making sure that they returned the right types at boundaries and didn't dereference nils in the meantime).

Your comment, The rest of the methods in this container don't need to assume any nil objects becausesuggests that storingCand providing an escape clause when anilneeds to be stored into an element might permit most of the code to live in aCworld and lean on the existing nilability machinery, but I think that by defining the right core helper functions to deal withC?and returnC`, a collection author could still get those benefits without language changes.

Do you see any major difference between the two?

Note that in the case of the OP (an associative array), the default implementation today is for Chapel to use an open-addressing hash table in which empty (nil) and full slots are intermingled arbitrarily, so there isn't as clean a boundary between nil and non-nil elements as in your vector example. I don't think this changes anything much (since in both cases, the datatype has to semantically understand which elements are valid vs. not to interact with them safely), but I'm noting it because I could imagine some solutions that might work for the vector case, yet not for an open addressing hash table.

@BryantLam
Copy link
Author

BryantLam commented Oct 31, 2019

one approach is to have an unsafe nilability-ignoring function such as .clear() to drop the object implicitly

My thought is that, today, .clear() will drop the object, set it to nil, and ignore all nilability. This is a bug, ... but it could also just be the unsafe behavior that .clear() does (i.e., make it a feature and not a bug). There would be two coding guidelines: for nilable objects, explicitly assign it to nil to drop it; for non-nilable objects, call .clear() to violate nilability and drop it knowing full well that you're doing something dangerous. ([[1]]: most performant, least type-safe.)

To me, this feels equivalent to having the collection store the nilable variant of the user-visible type and then to ensure that only non-nilable values are returned at the public interfaces.

Agreed.

Do you see any major difference between the two?

Yeah, I admit it was a contrived example. For low-level data structures with only a single container, it is reasonable to use a nilable container to represent the non-nilable interface. For higher-level data structures where some construction using multiple, dependent arrays is required, this gets trickier.

I know this issue is for destruction, but if there's a simpler way to do construction than via an annotated code block, I haven't thought of one. Expecting the user to wrap all their complicated cases into nilable arrays and present a non-nilable interface is likely too much work, especially if after construction, the array is non-nilable for the rest of its (long) life.

I could imagine some solutions that might work for the vector case, yet not for an open addressing hash table.

This trickiness is why many of Rust's low-level data structures are implemented using unsafe blocks. A library developer is allowed to side-step some stringent compiler checks (in cases where the compiler cannot definitely prove what is being done is truly safe) so long as they still present the safe interface to the users.

Chapel certainly shouldn't go to this extreme conclusion if it doesn't have to, but there's a tradeoff between performance and safety [[1]]. The most type-safe thing to do is to force users to define a default value for their type and do the construction/destruction by creating this default value and swapping it into the effected value. If creating this default value is non-trivial work, there could be a huge performance penalty.

@BryantLam

This comment has been minimized.

@bradcray

This comment has been minimized.

@BryantLam

This comment has been minimized.

@bradcray

This comment has been minimized.

@bradcray

This comment has been minimized.

bradcray added a commit that referenced this issue Nov 11, 2019
Control ddata initialization via a param

[reviewed by @mppf and @dlongnecke-cray]

At present, we always know at compile-time whether or not a ddata
wants to be initialized or not, so change the flag that indicates this
into a param such that the compiler will fold out the initialization
code.  Failure to do so causes a problem when trying to create a ddata
of types that don't have an initial value, like non-nilable classes.

Add a test to lock this behavior in based on the case in #14362 (comment) which inspired this
fix.

In the future, if/when we have a case that needs this not to be a param,
we can presumably add a non-param overload and have the list continue
to use the param overload as it is today (?).
@mppf
Copy link
Member

mppf commented Aug 18, 2020

When I run the example from the top of the issue today, I get:

./bryant.chpl:6: error: Cannot default initialize associative array because element type owned MyInt is a non-nilable class

It's possible to disable that error but we still have problems with the pattern

    D += key;
    table[key] = new MyInt(val);

because the D += key line will cause the array element for key to be default initialized. I'm sure we've discussed this somewhere but one idea is to write this sequence differently, e.g. as a straw-person addKeyAndSetValue(D, key, A, new MyInt(val));.

Edit: #15703 also proposes that such a D += key would be OK if all the arrays declared over the domain are currently in a noinit state.

@bradcray
Copy link
Member

I'm sure we've discussed this somewhere but one idea is to write this sequence differently, e.g. as a straw-person addKeyAndSetValue(D, key, A, new MyInt(val));.

A challenge to generalizing this idea is that it'd need to be a varargs routine to take any non-default-initable arrays defined over D.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants