Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inflight handler caching allow shared mutable state between parallel function invocations #5549

Closed
Chriscbr opened this issue Jan 26, 2024 · 8 comments · Fixed by #5867
Closed
Labels
🐛 bug Something isn't working 🎨 sdk SDK 🕹️ simulator Related to the `sim` compilation target

Comments

@Chriscbr
Copy link
Contributor

Chriscbr commented Jan 26, 2024

I tried this:

bring cloud;
bring util;

class Foo {
  inflight var n: num;
  inflight new() {
    this.n = 99;
  }

  pub inflight inc(): num {
    this.n += 1;
    return this.n;
  }

  pub inflight get(): num {
    return this.n;
  }
}

let foo = new Foo();

// a function that takes at least three seconds to run
let fn = new cloud.Function(inflight () => {
  let n = foo.inc();
  util.sleep(3s);
  assert(n == foo.get());
});

test "Foo state is not shared between function invocations" {
  // start two invocations of fn, staggering them by 1 second
  fn.invokeAsync("");
  util.sleep(1s);
  fn.invoke("");
}

This happened:

The test fails

I expected this:

In clouds, FaaS invocations receive their own runtime environments where code is loaded into and executed. After an invocation is finished, it's possible the same loaded code will be reused -- but it's not possible for a single runtime environment to be used by two parallel invocations.

The Wing local simulator is violating this expectation, making it appear as if you can seamlessly share mutable state between concurrent cloud.Function invocations.

Is there a workaround?

No response

Anything else?

Regression from #5478

Wing Version

0.54.63

Node.js Version

20.9.0

Platform(s)

MacOS

Community Notes

  • Please vote by adding a 👍 reaction to the issue to help us prioritize.
  • If you are interested to work on this issue, please leave a comment.
@Chriscbr Chriscbr added 🐛 bug Something isn't working 🎨 sdk SDK 🕹️ simulator Related to the `sim` compilation target labels Jan 26, 2024
@monadabot monadabot added this to Wing Jan 26, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New - not properly defined in Wing Jan 26, 2024
@Chriscbr Chriscbr changed the title Inflight handler caching allows shared state to be mutated between function invocations Inflight handler caching allow shared mutable state between parallel function invocations Jan 26, 2024
@eladb
Copy link
Contributor

eladb commented Jan 28, 2024

I am not 100% sure I understand the distinction between two parallel invocations and two sequential invocations going into the same FaaS host. The current behavior in AWS Lambda for example does allow you to share mutable state across two invocations that end up in the same host.

I might be missing something

@staycoolcall911 staycoolcall911 moved this from 🆕 New - not properly defined to 🤝 Backlog - handoff to owners in Wing Jan 29, 2024
@Chriscbr
Copy link
Contributor Author

@eladb If two FaaS invocations are actively executing and getting billed at the same time (i.e. they are running in parallel), then we know for sure they are not running on the same container instance in AWS Lambda. That is, they will have their own JavaScript VM state. Hence, in the example above, the Foo class must have been instantiated twice, as the second FaaS invocation happened while the first one was still ongoing.

@Chriscbr
Copy link
Contributor Author

Chriscbr commented Jan 29, 2024

Here's an extended example that illustrates how two cloud.Function invocations running at the same time are able to communicate with each other through the inflight fields of a class:

bring cloud;
bring util;

class Foo {
  inflight var id: num = 0;
  inflight var fn1: (inflight (str): void)?;
  inflight var fn2: (inflight (str): void)?;

  inflight new() {
    this.id = 0;
    this.fn1 = nil;
    this.fn2 = nil;
  }

  pub inflight register(): num {
    this.id += 1;
    return this.id;
  }

  pub inflight setFn(id: num, fn: inflight (str): void) {
    if id == 1 {
      this.fn1 = fn;
    } elif id == 2 {
      this.fn2 = fn;
    }
  }

  pub inflight sendTo(id: num, msg: str) {
    if id == 1 {
      if let fn = this.fn1 {
        fn(msg);
      }
    } elif id == 2 {
      if let fn = this.fn2 {
        fn(msg);
      }
    }
  }
}

let foo = new Foo();

let fn = new cloud.Function(inflight () => {
  // produce a unique ID for this invocation
  let id = foo.register();

  // store a handler so this invocation can receive messages
  foo.setFn(id, inflight (msg: str) => {
    log("invocation {id} received " + msg);
  });

  util.sleep(1s);

  // send a message to the other invocation
  if id == 1 {
    log("sending foo from invocation {id} to invocation 2");
    foo.sendTo(2, "foo");
  } elif id == 2 {
    log("sending bar from invocation {id} to invocation 1");
    foo.sendTo(1, "bar");
  }
  util.sleep(1s);
});


let invoker = new cloud.Function(inflight () => {
  fn.invokeAsync("fn1");
  fn.invokeAsync("fn2");
}) as "invoker";

When I cal "invoker" through the Wing Console it logs

sending foo from invocation 1 to invocation 2
invocation 2 received foo
sending bar from invocation 2 to invocation 1
invocation 1 received bar

When I run call "invoker" once on AWS, the only logs I see in CloudWatch are

sending foo from invocation 1 to invocation 2

@eladb
Copy link
Contributor

eladb commented Jan 30, 2024

Thanks for the example. This is something that can actually happen in AWS Lambda as well through global variables.

So I am not sure the problem is that it is possible. I think the problem is that our language gives the impression that it won't happen.

I think this goes back to the quirks around inflight fields and constructors...

@Chriscbr
Copy link
Contributor Author

I reckon the optimization currently performed in the simulator which reuses inflight clients (even among parallel invocations) would be safe if it was guaranteed that it's always static data or pure functions being initialized. But in practice I don't think it's possible to guarantee this safely since extern JavaScript functions can always introduce mutable state.

@Chriscbr
Copy link
Contributor Author

I chatted with @eladb offline about this -- after some digging we found a page in AWS Lambda docs where it makes a brief mention of how "During [a container's] entire process, this execution environment is busy and cannot process other requests." (link)

But we decided that caching clients (like static data, DB connections, SDK clients) is important for a lot of use cases, and better simulates the cold/warm aspect of lambda functions -- so for now supporting this caching behavior is our first priority.

Eventually we can choose to model this in the simulator in higher fidelity by managing a queue of invocation requests (and optionally a pool of workers for handling requests), ensuring worker(s) only process one request at a time. That should give us the best of both worlds, preventing the shared mutable state and and even allowing us to simulate properties like FaaS concurrency.

@Chriscbr Chriscbr removed their assignment Jan 30, 2024
@eladb
Copy link
Contributor

eladb commented Jan 30, 2024

Thanks for updating

@mergify mergify bot closed this as completed in #5867 Mar 11, 2024
@mergify mergify bot closed this as completed in 62513f9 Mar 11, 2024
@github-project-automation github-project-automation bot moved this from 🤝 Backlog - handoff to owners to ✅ Done in Wing Mar 11, 2024
@monadabot
Copy link
Contributor

Congrats! 🚀 This was released in Wing 0.59.48.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 bug Something isn't working 🎨 sdk SDK 🕹️ simulator Related to the `sim` compilation target
Projects
Archived in project
3 participants