Compile runtime modules #2

djgrant · 2023-10-02T16:37:45Z

djgrant
Oct 2, 2023
Maintainer

Context

The Notation SDK provides modules for building both infrastructure and runtime code to be deployed to that infrastructure. Infrastructure and runtime code exist in the same type space.
Notation's compiler produces both an infrastructure plan and packages containing the runtime code. To produce the infrastructure plan, infrastructure modules are evaluated after transpilation producing a topologically sorted graph.
The compiler has to ensure that when it evaluates the infrastructure code that it does not accidentally evaluate runtime code.
Runtime code, particularly in the case of a serverless function, may contain code that exists outside the handler scope i.e. code that is evaluated on the initialisation of the serverless function but not on subsequent invocations.

Approaches

Approach 1: Runtime code can be in the same module as infrastructure

import { handler, apiRoute } from "@notation/aws";

const getNum = handler(() => Math.random());

apiRoute({ route: "/num", handler: getNum });

Approach 2: Runtime code must be declared in a separate module

// get-num.fn.ts
import { handler } from "@notation/aws";

export const getNum = handler(() => Math.random());

// api.ts
import { apiRoute } from "@notation/aws";

export const numRoute = apiRoute({ route: "/num", handler: getNum });

The runtime module could be identified by its path e.g. get-num.fn.ts or runtime/get-num.ts.

Challenges

Challenge 1: Handling side effects

A serverless function may declare code that is run only on initialisation. This is typically declared in the outer scope:

import { handler } from "@notation/aws";

// side effect: this code should never be run by Notation's build tools
const num = await fetch("http://api.com/num").then((res) => res.json());

export const getNum = handler(() => num);

It is crucial that, while forming the orchestration graph, the orchestration compiler never evaluates user code, which may contain side effects.

Challenge 2: Identifying runtime code

In Approach 2, runtime code can be identified by virtue of a contract with the developer, namely that their runtime code is placed within a special module. By and large, the code within these modules could be ignored by the orchestration compiler.

In Approach 1, developers would be required to wrap their runtime code within a higher-order handler function. Using module replacement, the handler function could be swapped for a no-op function, ensuring the runtime code in its callback is never called. The limitation of this approach is that it would not allow developers to write setup code outside the handler scope.

Challenge 3: Extracting infrastructure

Both approaches require a way of exposing the related infrastructure concerns of runtime modules to the orchestration compiler. For both approaches handler should return an infrastructure resource. This can be its default behaviour or achieved with module replacement.

For Approach 2, the module cannot be evaluated because it may contain side effects. This could be worked around by introducing an exports API, in which the key infrastructure information is extracted at the transpilation stage. For example, a named export config would be identified as the infrastructure configuration for the serverless function, and the other named exports would identified as handlers. A new module could then be created by the transpiler containing only infrastructure declarations.

Challenge 4: Avoiding uncanny valley

A potential advantage of Approach 2 is that it clearly demarcates what is infrastructure and what is runtime. While there is potentially an easiness to writing runtime code and infrastructure code in the same module, this could potentially place a cognitive overhead on the developer, or at worse, a foot gun.

Challenge 5: Nano functions ergonomics

Nano functions, characterised by their instant startup times and access to shared memory, are gaining popularity. They do not necessarily prescribe that their functions contain only a few lines of code, but they are certainly better suited for workflows that are broken into smaller units. As such, codebases with nano functions will be better organised if multiple small functions can exist in the same module.

Proposal

File system API

Runtime code must be place in modules with a filename in the format *.fn.ts
All other modules are treated as infrastructure-only

src/
├─ get-num.fn.ts
├─ api.ts

Export API

export const config: FnConfig: A config object describing how the serverless function should be configured
export const [handlerName]: FnHandler: All other exports, including the default export, represent a serverless function handler
Multiple handlers may be exported
The config object defines the target runtime

// src/get-num.fn.ts
import { handler, FnConfig } from "@notation/aws";

const num = await fetch("http://api.com/num").then((res) => res.json());

export const getNum = handler(() => num);
export const getDoubleNum = handler(() => num * 2);

export const config: FnConfig = {
  memory: 64,
  timeout: 5,
  environment: "node:16",
};

Runtime compilation

The config export should be removed
The code should be down-compiled to the target runtime defined in the config

// out/runtime/get-num.fn.ts
import { handler, FnConfig } from "@notation/aws";

const num = await fetch("http://api.com/num").then((res) => res.json());

export const getNum = handler(() => num);
export const getDoubleNum = handler(() => num * 2);

Orchestration compilation

The module is replaced with a new module exporting one serverless function resource for each handler
Each serverless function resource is configured with the exported config object
Each serverless function references the associated handler

// out/infra/get-num.ts
import { fn } from "@notation/aws";

export const getNum = fn({
  memory: 64,
  timeout: 5,
  environment: "node:16",
  handler: "getNum",
});

export const getDoubleNum = fn({
  memory: 64,
  timeout: 5,
  environment: "node:16",
  handler: "getDoubleNum",
});

djgrant · 2023-10-04T15:48:09Z

djgrant
Oct 4, 2023
Maintainer Author

Draft implementation in #3

0 replies

Happy0 · 2023-10-06T15:35:28Z

Happy0
Oct 6, 2023
Maintainer

General Comments

Although notation has the aim of allowing infrastructure setup code and 'runtime code' to live together, I think it is sensible to impose some guard rails to avoid the user accidentally running code at 'infrastructure deployment' time that should only be executed at runtime.

Equally, the infrastructure setup code may also have to do its own side effects that should rightfully only be executed at deployment time.

I think having a 'filesystem API' is a very good approach.

Some suggestions

I apologise if these comments are out of scope of your RFC and really I should be sticking to commenting on the division of infrastructure
code vs runtme code.

FnConfig export

I wonder if it would lead to less foot guns to just require the user to explicitly pass the configuration
rather than having defaults that apply to any function 'in scope' of the file / module.

I think this makes sense for the following reasons:

Functions are likely going to need their own specific config anyway (for things like permissions, environment variables)
Unclear in this case how 'default' and 'explicit' config for function specific config gets merged, may lead to foot guns
It'd be easy enough for the user to create a config and pass it in to each function.
I think it's in fitting of the 'building a dependency graph of infrastructure' feel of Notation - having it 'dangling' and not
attached to anything feels like 'magic' rather than the mental model that is trying to be communicated.

Function Resources

Further to the above, I wonder if there should be an intermediate 'function resource' infrastructure type between the 'apiRoute' resource and its 'handler' parameter
(the runtime code.)

Afterall, a 'lambda function' is a resource too. The function resource can be configured with the following things:

IAM permissions allowing it to access other resources (dynamodb, SQS, etc). These will probably be created at infrastructure deployment type,
where their 'resource identifiers' will become available and used to form the IAM permission resource's parameters.
The environment variables that are available to it at runtime when executed (sometimes these are dynamically created at infrastructure
deployment time - from looking up SSM parameters, for example.)

Example of what this might look like:

src/get-num.fn.ts

import { Handler } from "aws-lambda";

const num = fetch("http://api.com/num").then((res) => res.json());

export const getNum: Handler = async (event, context) => {
    return await num
}

src/types.ts

import { Handler } from "aws-lambda"

type EnvironmentVariables = Record<string, string>

export type FnConfig = {
    // The lambda name - so we don't deploy a fresh lambda on each infrastructure deployment
    name: string,
    description: string,
    // The memory allocated to the lambda
    memorySize: string,
    environmentVariables: EnvironmentVariables,
    // These would point to other 'instrastructure' types
    permissions: any,

    // The runtime code itself,
    handler: Handler
}

export type ApiRouteConfig = {
    route: string,
    lambdaFunction: FnConfig
}

src/api.ts

import { apiRoute } from "./apiGateway"
import { getNum } from "./get-num.fn"

export const numRoute = apiRoute(
    {
        route: "/num",
        lambdaFunction: {
            name: "getNum",
            description: "Fetches the very important number",
            // Very memory heavy lambda
            memorySize: "8096mb",
            // Would reference 'IAM' type resources
            permissions: {

            },
            environmentVariables: {
                // Might reference the output of other resources that will be evaluated at infrastructure compile time,
                // such as SSM parameters
            },
            handler: getNum,
        }
    }
)

src/get-num.fn.ts

import { Handler } from "aws-lambda";

const num = fetch("http://api.com/num").then((res) => res.json());

export const getNum: Handler = async (event, context) => {
    return await num
}

Note: I think there's still room for some helpful implicit behaviour here around implicitly setting up permissions for the lambda to be called
by the API gateway, etc, by virtue of the infrastructure graph.

0 replies

djgrant · 2023-10-12T10:44:58Z

djgrant
Oct 12, 2023
Maintainer Author

Thanks @Happy0! Great comments.

Am I right in thinking that the fundamental change you are suggesting is to move the config from the runtime code to the infrastructure code?

I'll leave my thoughts on why I didn't initially go with this approach. Would be great to hear a counter argument if you have one. Also totally fine to revise these arguments as we get clearer insights through testing one approach.

My reluctance for taking this approach was that the config is inherently coupled to the function code, and to have it defined in an infra module would move it too far away from the function, creating a strong coupling with low locality. In real terms, when a developer writes a function, they would then also have to head over to the infrastructure side of the code base and declare its config.

You make a good point though that some of the function config is coupled to other infrastructure components – for example an IAM role, or envs that originate from a cloud service. As you say, in some cases this can be calculated by virtue of the infrastructure graph. For example, a lambda connected to an API gateway needs IAM roles, a policy attachment, a lambda permission, an API integration. Similarly, a lamdba that calls Dynamo will probably need to be placed in a VPC and connected via a VPC endpoint. This is often predictable boilerplate that Notation can calculate based on best practices, and save the developer deciphering verbose documentation. Furthermore, if moving to another cloud, the underlying infrastructure will need to change and a different set of dependencies will be required to satisfy the relationship between the primary cloud resources. By abstracting away the infra boilerplate, Notation helps ease such a migration.

The question then is whether there is any configuration related to other infra resources that can't be calculated by Notation, and whether this substantiates the argument to move the config into the infra module space.

One other point to make about the config export: from an architectural PoV, I see these modules as something analogous to containers, with the default config export being analogous to a dockerfile. You can put multiple function in these modules, but they'll all be run in the same environment. So, it is partly by design that you can't have different runtime configurations of the same function (which is possible with your proposal).

One specific point:

It'd be easy enough for the user to create a config and pass it in to each function

If the config is defined in an infra module, then yes. If it is in a function module, the config needs to be extracted statically, therefore allowing it to be defined with anything other than primitive types (e.g. imports) vastly increases the complexity of the extraction.

3 replies

Happy0 Oct 31, 2023
Maintainer

... This is often predictable boilerplate that Notation can calculate based on best practices, and save the developer deciphering verbose documentation

This is certainly understandable for me for cases like integrating lambda with API gateway to handle routes and do the requisite wiring of permissions and integrations, etc. I wonder though if this is just easy to calculate by virtue of the infrastructure resource graph being explicit - e.g. the lambda function being is attached to the API gateway resource type in the configuration.

It's not clear to me how this would be achieved for cases like "a lambda handler writes to a DynamoDB table with name 'xyz'." Notation can't really know the lambda is writing to the DynamoDB table xyz as it's not really expressed in the resource graph as an input source or something like that. It could perhaps do some code analysis to work out if the lambda code makes an outgoing call to dynamoDB and look at the table parameter. However, this quickly gets intractable: for example the table name could be received dynamically (via SSM or via an environment variable) or the code dealing with DynamoDB could be buried in a separate npm package. If there was an infra module representation of a function, then you could configure it to allow it access to table XYZ (with perhaps a nicer abstraction over IAM.)

I suppose an option that would allow you to retain the principle that there shouldn't be a function resource type while being able to deal with dynamic situations like this would be to have some separate Permission type resource that cites the function and resource by their names or pointers. I do wonder though if setting up permissions on this basis is so central that it belongs attached to a Function resource type as a way to handhold the user into the right behaviour through type hints, etc.

djgrant Oct 31, 2023
Maintainer Author

I like the example and the distinction between static (explicit) relations and runtime (implicit) relations.

Generally, I'm inclined towards being able to import a table resource into the serverless function as a module and perform operations off that object i.e. make the relation statically analysable.

This approach doesn't solve the scenarios you describe such as using a third party library or having a dynamic table name, but it might cover a significant number of use cases.

I think we can come up with reasonable escape hatches for more dynamic use cases.

djgrant Nov 1, 2023
Maintainer Author

@Happy0, just thinking about this use case and the spectrum of possible solutions:

Most explicit: the developer imports a DynamoDB table resource and attaches it to the lambda config

import { FnConfig, handle } from "@notation/aws/lambda";
import { userTable } from "./tables";

// this is extract from the module using static analysis
export const config: FnConfig = { 
  service: "aws/lambda",
  memory: 64,
  accessRoles: [userTable.getItemAccessRole]
};

// now do whatever you want with your table
const client = thirdPartyDDBLib(userTable.name);

Semi-explicit: a DynamoDB client only allows operations for which access has been enabled

import { FnConfig, handle } from "@notation/aws/lambda";
import { dynamoClient } from "@notation/aws/dynamo";
import { userTable } from "./tables";

export const config: FnConfig = { 
  service: "aws/lambda",
  memory: 64
};

// this is also extracted and interpreted at compile time (maybe as a macro)
// when we create the orchestration graph, these access roles are added to the lambda
const users = dynamoClient({ 
  table: userTable, 
  assumeAccessRoles: ["getItem"] 
});

export const getUser = handler(() => {
  // get item is the only TypeScript type available
  return users.getItem();
}

Implicit: The access role is attached to the lambda based on the code the developers write

import { FnConfig, handle } from "@notation/aws/lambda";
import { dynamoClient } from "@notation/aws/dynamo";
import { userTable } from "./tables";

export const config: FnConfig = { 
  service: "aws/lambda",
  memory: 64
};

export const getUser = handler(() => {
  // using static analysis, we track the DynamoDB resource and see how it is being used
  return userTable.client.getItem();
}

Do you think that covers the range of possibilities?

p-maxwell · 2023-10-12T22:16:20Z

p-maxwell
Oct 12, 2023
Collaborator

Approach 2 definitely seems best to me, and I am glad to see it reflected in the direction being taken here.

I suspect that in reality configuration details like memory allocation and retries will frequently be the same from one function to the next. This for me is an argument for boosting it a level to the infrastructure, where it will be easier to see where a default template is being applied and where some variation has been made. In fact for developers unfamiliar with the cloud it may well be advantageous just to give them reasonable defaults and the option to override - it's very much preferable to be worrying about this during the "fine tuning" step rather than the "getting something to work" step.

0 replies

djgrant · 2023-10-24T12:41:23Z

djgrant
Oct 24, 2023
Maintainer Author

There are now integration tests on the main branch with a sample app that can be played around with to test the draft compiler.

0 replies

p-maxwell · 2023-11-02T10:22:40Z

p-maxwell
Nov 2, 2023
Collaborator

It strikes me that this is the old conflict between power and usability. There is no world where importing arbitrary code just works. Any library the user pulls in could end up making run time requests to some external entity which require permissions to be set up. The user will need to understand this at the outset. So then the question becomes whether to have the developer explicitly set up access on each and every occasion it is needed, or to abstract it. If it is abstracted, it may well be better to use the model discussed previously where the user deals with higher level abstractions around storage or databases and lets Notation convert these into detail. Having special handling and static analysis around Dynamo, Redis, Mongo etc strikes me as potentially being a lot of work. But then an abstraction layer is also more work than just giving the developer the tools to do things explicitly. I guess it depends whether this is more targeted at helping developers who know what they are doing work faster or helping developers who don't know what they're doing make anything work at all.

…

On Wed, 1 Nov 2023, 16:20 Daniel Grant, ***@***.***> wrote: @Happy0 <https://github.com/Happy0>, just thinking about this use case and the spectrum of possible solutions: 1. Most explicit: the developer imports a DynamoDB table resource and attaches it to the lambda config import { FnConfig, handle } from ***@***.***/aws/lambda";import { userTable } from "./tables"; // this is extract from the module using static analysisexport const config: FnConfig = { service: "aws/lambda", memory: 64, accessRoles: [userTable.getItemAccessRole]}; // now do whatever you want with your tableconst client = thirdPartyDDBLib(userTable.name); 1. Semi-explicit: a DynamoDB client only allows operations for which access has been enabled import { FnConfig, handle } from ***@***.***/aws/lambda";import { dynamoClient } from ***@***.***/aws/dynamo";import { userTable } from "./tables"; export const config: FnConfig = { service: "aws/lambda", memory: 64}; // this is also extracted and interpreted at compile time (maybe as a macro)// when we create the orchestration graph, these access roles are added to the lambdaconst users = dynamoClient({ table: userTable, access: ["getItem"] }); export const getUser = handler(() => { // get item is the only TypeScript type available return users.getItem();} 1. Implicit: The access role is attached to the lambda based on the code the developers write import { FnConfig, handle } from ***@***.***/aws/lambda";import { dynamoClient } from ***@***.***/aws/dynamo";import { userTable } from "./tables"; export const config: FnConfig = { service: "aws/lambda", memory: 64}; export const getUser = handler(() => { // using static analysis, we track the DynamoDB resource and see how it is being used return userTable.client.getItem();} Do you think that covers the range of possibilities? — Reply to this email directly, view it on GitHub <#2 (reply in thread)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ALRLOOKGVI3S45PWLLZ7X33YCJSELAVCNFSM6AAAAAA5PWEN7CVHI2DSMVQWIX3LMV43SRDJONRXK43TNFXW4Q3PNVWWK3TUHM3TINBWHEYTI> . You are receiving this because you commented. Message ID: <notationhq/notation/repo-discussions/2/comments/7446914@ github.com>

3 replies

djgrant Nov 3, 2023
Maintainer Author

We were all once (and still are) developers who don't know what we're doing so I'm not sure I feel that particular dichotomy! Notation is ultimately an acknowledgement that it's the state of the tools in this space that makes building cloud software hard. The question for me is more about whether the model helps developers to easily intuit how to write good code – and what's more intuitive is often also more productive.

Coming back to @Happy0's thread (which Github seems to have disconnected us from due to email replies), I think the high level question is: how do we surface and model the principle of least privilege?

We could require developers to explicitly declare access for cases like writing to DynamoDB because we feel this is just good practice (and non-negotiable if running third party code). In that case the challenge is to make the process frictionless and intuitive (example 1 in my previous comment would be less onerous than the status quo, while not going so far as to actually abstract away the PoLP itself).

Alternatively, there's the view that if it's obvious from the user's code that they need certain access granted, that Notation infers it e.g. example 3. I think I would prefer some variant of example 2 that keeps the principle explicit but ties the granting of access policies into the type system i.e. operations only work after roles have been granted.

On a practical note, I am adverse to implementing more static analysis than necessary, mostly due to the performance penalty, but remain open to it if it feels absolutely the right solution.

p-maxwell Nov 3, 2023
Collaborator

On the contrary, I view the power/usability tension as absolutely key to every API. Are you manually managing your memory? Dealing with TCP/IP directly? We exist atop a layer of abstractions that hide away many powerful things we can do in order to make simple the things we want to do.

I think the "developers who don't know what they're doing" was perhaps overly terse from me - I mean lowering the bar to get something working for developers who do not have detailed domain knowledge but are focused on delivery. But in truth, even those who understand the domain may work faster at a higher level of abstraction.

Explicitly declaring access pushes toward the power end. There may be use cases where some patchwork of permissions need to be set up that would be difficult to manage solely by inferring their need from the code. But this also makes development a little more tedious in probably the majority of cases where the explicit access is largely boilerplate.

Implicitly inferring access and implementing best practice automatically makes the developer's life easier, at the cost of more work going into the framework that does so and the possibility that in some cases the wrong thing may be inferred. But the latter can possibly be resolved by providing a way for the developer to overwrite inferred defaults.

(Edited to add: This said, the danger with inference is that it becomes a more leaky abstraction if the user wants to import some arbitrary framework which may have difficult to infer security requirements. It needs to be clear what can be inferred and what not.)

djgrant Nov 3, 2023
Maintainer Author

Sounds like we're all a bit queasy about too much inference!

It was the latter – learning devs vs. experienced devs – that I don't see presenting a strong trade off. Notation should be good for both.

Agree on the rest wrt PoLP – the use cases convince me that access control of certain resources has to be a concern of the developer.

Taking it as a given then that user needs to think about access policies, I do think that tools can also become more usable with better DX (e.g. Deno's permission model surfaces to the developer the permissions they're missing), and by removing the power to do things that are nonsensical or foot guns. My preference is to aim for that rule-of-least-power sweet spot, rather than framework magic (which seems to always come around and bite my productivity later). Give devs a lot of power to do the things that are relevant, and remove the power to do things that are not.

To ground it, example 3 feels in the magic category and prone to cause confusion later ("why does Notation only work when I use their DynamoDB library?"), whereas example 2, has more power but is limited to just adding DDB access, causes type errors when is not configured correctly, is relatively easy to grok, and makes explicit how it is by adding the access policies that the DDB client works.

Does that seem like a reasonable framework for thinking about this?

djgrant · 2023-12-13T14:31:29Z

djgrant
Dec 13, 2023
Maintainer Author

For handling the DynamoDB access use case, #10 introduces a compile step that allows infrastructure modules to be imported into function modules.

The compiler removes all code from the function module except that which is definitely safe to evaluate, enabling some really interesting use cases that would require dynamically attaching additional infrastructure resources to a function resource.

1 reply

djgrant Dec 13, 2023
Maintainer Author

One consequence of this is that we need to know definitely what is an infrastructure module. I proposed a clean separation of a runtime folder and an infra folder. This could later be configured in a notation.config.json file.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compile runtime modules #2

{{title}}

Replies: 7 comments 7 replies

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

Select a reply

Compile runtime modules #2

djgrant Oct 2, 2023 Maintainer

Context

Approaches

Approach 1: Runtime code can be in the same module as infrastructure

Approach 2: Runtime code must be declared in a separate module

Challenges

Challenge 1: Handling side effects

Challenge 2: Identifying runtime code

Challenge 3: Extracting infrastructure

Challenge 4: Avoiding uncanny valley

Challenge 5: Nano functions ergonomics

Proposal

File system API

Export API

Runtime compilation

Orchestration compilation

Replies: 7 comments · 7 replies

djgrant Oct 4, 2023 Maintainer Author

Happy0 Oct 6, 2023 Maintainer

General Comments

Some suggestions

FnConfig export

Function Resources

djgrant Oct 12, 2023 Maintainer Author

Happy0 Oct 31, 2023 Maintainer

djgrant Oct 31, 2023 Maintainer Author

djgrant Nov 1, 2023 Maintainer Author

p-maxwell Oct 12, 2023 Collaborator

djgrant Oct 24, 2023 Maintainer Author

p-maxwell Nov 2, 2023 Collaborator

djgrant Nov 3, 2023 Maintainer Author

p-maxwell Nov 3, 2023 Collaborator

djgrant Nov 3, 2023 Maintainer Author

djgrant Dec 13, 2023 Maintainer Author

djgrant Dec 13, 2023 Maintainer Author

djgrant
Oct 2, 2023
Maintainer

Replies: 7 comments 7 replies

djgrant
Oct 4, 2023
Maintainer Author

Happy0
Oct 6, 2023
Maintainer

djgrant
Oct 12, 2023
Maintainer Author

Happy0 Oct 31, 2023
Maintainer

djgrant Oct 31, 2023
Maintainer Author

djgrant Nov 1, 2023
Maintainer Author

p-maxwell
Oct 12, 2023
Collaborator

djgrant
Oct 24, 2023
Maintainer Author

p-maxwell
Nov 2, 2023
Collaborator

djgrant Nov 3, 2023
Maintainer Author

p-maxwell Nov 3, 2023
Collaborator

djgrant Nov 3, 2023
Maintainer Author

djgrant
Dec 13, 2023
Maintainer Author

djgrant Dec 13, 2023
Maintainer Author