Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decentralize the Input Validation Domain #532

Closed
3 tasks done
CMCDragonkai opened this issue Jul 10, 2023 · 9 comments · Fixed by #573
Closed
3 tasks done

Decentralize the Input Validation Domain #532

CMCDragonkai opened this issue Jul 10, 2023 · 9 comments · Fixed by #573
Assignees
Labels
procedure Action that must be executed r&d:polykey:supporting activity Supporting core activity

Comments

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Jul 10, 2023

According to https://github.com/MatrixAI/MatrixAI-Graph/issues/44 the validation domain currently is a single point of failure because of:

It's important to realise that the validation domain is becoming a Single Point of Failure (#481 (comment)).

Because if the validation domain starts being used for more structural parsers, then domains may be depending on the validation domain, while the validation domain depends on all other domains (due to the usage of all the decoding utility functions). This many to many nexus point creates a crux/keystone/centre of gravity. This results in a loss in robustness. In comparison to ids or utils, this where all domains depend on, they don't have the same problem because those domains are self-contained. They don't depend on other domains. Domains depend on it. But validation is a problem, and we need loose coupling in our module dependencies to avoid a case breaking changes in one area affects another place in the code thus preventing us from being able to iterate on small parts of the codebase.

The solution to this is to decentralise the validation domain. The validation parser functions can be spread around the domains, while they can be re-exported from the validation domain to be used. The validation domain itself can have a self contained utilities that contains just the generic validateSync and related functions.

The src/validation domain contains:

»» ~/Projects/Polykey/src
 ♖ tree validation                                                                                                            (staging) pts/4 12:50:45
validation
├── errors.ts
├── index.ts
└── utils.ts

0 directories, 3 files

The index.ts and errors.ts is fine, as they both contain the "generic" validation procedures. We can move the functions inside index.ts into utils.ts and then re-export it (unscoped) inside index.ts. That means validate, and validateSync.

The existing functions inside utils.ts are all the parseX functions as per the list in https://github.com/MatrixAI/MatrixAI-Graph/issues/44

All of these parseX functions should be moved into their respective domain utilities rather than being in validation/utils.ts. That's all of this:

export {
  parseInteger,
  parseNumber,
  parseNodeId,
  parseGestaltId,
  parseClaimId,
  parseVaultId,
  parseProviderId,
  parseIdentityId,
  parsePublicKey,
  parsePrivateKey,
  parseRecoveryCode,
  parseGestaltAction,
  parseVaultAction,
  parseHost,
  parseHostname,
  parseHostOrHostname,
  parsePort,
  parseNetwork,
  parseSeedNodes,
};

However these parse functions use the validationErrors.ErrorParse. That means validation domain can be imported by the other domains. This is actually already done by src/claims/utils.ts. Notice it imports import * as validationErrors from '../validation/errors'; and uses it for its assertX functions.

This does mean that technically these domains aren't entirely self-contained since they at least depend on a validationErrors.ErrorParse exception. The ErrorParse could later be factored out to js-validation if we want to to share this structure across projects, like in js-mdns which also uses the parseX/generateX structure in its DNS packets.

Ok final issue is all the places in the code that uses validationUtils.parseX. All of this has to be changed. In order to fully decentralise, they should be importing their respective domainUtils.parseX instead of relying on validationUtils.parseX. We could "re-export" all of these parseX functions, but that defeats the point of decentralization. We need to just change our import paths for these usages. That shouldn't be too difficult, can do a find all in vscode.

Tasks

  • 1. Move all parseX inside src/validation/utils.ts into their respective domains.
  • 2. In each domain/utils.ts they should import the src/validation/errors.ts as validationErrors, see src/claims/utils.ts to see how to do this.
  • 3. All users of src/validation/utils.ts should now be using domainUtils.parseX instead avoiding a centralised import on validation domain as a "many to many" nexus in our codebase.
  • [ ] 4. We can move the src/validation/index.ts functions into src/validation/utils.ts and re-export it unscoped inside src/validation, allowing one to import { validate, validateSync } from '../validation';. - not going to do this, it's fine the way it is

This allows us to in the future extract out the entire validation domain as js-validation library (although I think it should be called something else, just to avoid confusion with general input validation). This is not settled though, it may not be necessary.

@CMCDragonkai CMCDragonkai added procedure Action that must be executed r&d:polykey:supporting activity Supporting core activity labels Jul 10, 2023
@CMCDragonkai
Copy link
Member Author

Avoiding single points of failure in Polykey is quite critical. It always slows down our development whenever we need to test some changes in a feature branch but changes in one domain breaks another domain.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Jul 11, 2023

As an example, I've visualised this many to many nexus in codesee. One of the first useful usecases I found with codesee maps.

image

Notice that validation/utils.ts ends up mutually importing/exporting to all these different domains.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Jul 11, 2023

Expanding the main users of validation/utils is the service handlers from client and agent:

image

You can see here the that handlers end up importing utilities.

if we were to change one of the domains, let's say claims, it would end up impacting everything that imports the validation/utils.ts. So this is a high-change-impact factor.

@CMCDragonkai
Copy link
Member Author

Would be cool if it was possible to auto-detect high-impact factor points in the code, and find places that have too much mutual recursion.

@CMCDragonkai
Copy link
Member Author

@tegefaulkes this is a loose-end, might do well to tie this up as we continue development.

@tegefaulkes
Copy link
Contributor

I'd prefer to do it all at once. It's a simple change and better to have a clean commit for it.

@CMCDragonkai
Copy link
Member Author

Note that all ID external parsers, encoders and decoders relating to Ids should probably all be inside ids domain.

However toX and fromX is supposed to be used for internal types, and these can be put into x/utils.ts.

@CMCDragonkai
Copy link
Member Author

I've noticed that you're using the parse prefix in some functions that are not correct according to our validation standard.

For example nodes/utils.ts has parseRemoteCertsChain, it should not be called parse.... This is some transformation function you're doing... All parseX functions must take data: any and return the relevant X type, and then throw validationErrors.ErrorParse.

@CMCDragonkai
Copy link
Member Author

An older example is:

/**
 * Parse an address string into host and port
 */
function parseAddress(address: string): [Host, Port] {
  const url = new URL(`pk://${address}`);
  const dstHostMatch = url.hostname.match(/\[(.+)\]|(.+)/)!;
  const dstHost = dstHostMatch[1] ?? dstHostMatch[2];
  const dstPort = url.port === '' ? 80 : parseInt(url.port);
  return [dstHost as Host, dstPort as Port];
}

Which is in network/utils.ts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
procedure Action that must be executed r&d:polykey:supporting activity Supporting core activity
Development

Successfully merging a pull request may close this issue.

2 participants