Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(semantic): add Symbol #4372

Closed
wants to merge 3 commits into from
Closed

feat(semantic): add Symbol #4372

wants to merge 3 commits into from

Conversation

DonIsaac
Copy link
Contributor

What This PR Does

Adds Symbol, which is a thin abstraction over Semantic for interacting with a specific symbol. It basically just wraps a SymbolId, making it a little nicer to interact with the symbol table.

@DonIsaac DonIsaac added C-enhancement Category - New feature or request A-semantic Area - Semantic labels Jul 20, 2024
Copy link

graphite-app bot commented Jul 20, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

Copy link

codspeed-hq bot commented Jul 20, 2024

CodSpeed Performance Report

Merging #4372 will not alter performance

Comparing don/feat/symbol (bea11ec) with main (abfccbd)

Summary

✅ 32 untouched benchmarks

@Boshen
Copy link
Member

Boshen commented Jul 20, 2024

This indirection feels redundant to me - you'll need to get familiarized yourself of data-orientated-design and our SoA strucutre.

@overlookmotel
Copy link
Contributor

overlookmotel commented Jul 20, 2024

Personally, I do like the thrust of this PR - I think the storage medium (SoA) can be thought of an implementation detail, and it's appropriate to provide an abstraction with a more "normal" API.

However:

  1. I'm not sure if the specific implementation of that abstraction in this PR is ideal. It may be that Symbol borrowing &'s Semantic<'a> is a perf regression, and also may get us in trouble with borrow-checker when it comes to mutating symbols.
  2. As I think has been mentioned elsewhere, I'm going to be doing an "in the round" review of Semantic over the next month, and producing an RFC. My feeling is that at this point, we'd be better off deciding an overall end goal for Semantic and plotting a roadmap for how to get there, rather than continuing with incremental peacemeal changes. So for now, it be my preference if we could call a halt to modifying Semantic's APIs, as it may be wasted effort if my RFC proposes ripping it all up and starting again!

FYI, I've also stopped doing perf work on Semantic now. There was some low-hanging fruit which seemed worthwhile to pluck asap (e.g. #4367). But in my view, we've now got most of the perf gains that we can get with ease, and at this point we'd be better off putting our efforts into other areas until RFC is ready. Hope this makes sense.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 20, 2024

Not directly related to this PR, But maybe we should put some effort toward researching any existing solution like MultiArrayList in zig or implement one ourselves. It is a really handy data structure that makes any struct into a SoA without drastic changes to the API for the users.

https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig

(have I mentioned this before or am I having a Dejavu moment?)

@DonIsaac
Copy link
Contributor Author

I agree that the various SoA structures should be an implementation detail. I've found that interacting with them can be quite verbose (particularly in lint rules), and this seems like a cleaner way of doing so.

  1. I'm not sure if the specific implementation of that abstraction in this PR is ideal. It may be that Symbol borrowing &'s Semantic<'a> is a perf regression, and also may get us in trouble with borrow-checker when it comes to mutating symbols.

This could be a perf regression, but if it was, it would only affect consumers using an owned Semantic. In most cases, we pass around an Rc<Semantic>, which requires a single pointer dereference, the same way this does. Additionally, passing around a Symbol does not incur reference counting overhead. And, since all methods are inlined, we shouldn't incur any overhead for function calls. In most cases (e.g. lint rules), the generated assembly should be almost exactly the same. The one potential caveat is the inclusion of SymbolFlags in Symbol - use cases that don't use them may see a slight regression. Because of this, I've marked Symbol as #[non_exhaustive] so we can remove it later without breaking its API.

Regarding updates to the symbol table, Symbol would not be able to do these.

  1. In the case of SemanticBuilder, internal _mut() methods should continue to be used
  2. In the case of the Transformer, Minifier, and ID, most changes are on the AST, which is out of scope of Symbol
  3. Mangling will not break Symbol as it only changes identifier names, meaning Symbol::name() would continue to return the correct identifier name. SymbolIds do not change.
  1. As I think has been mentioned elsewhere, I'm going to be doing an "in the round" review of Semantic over the next month, and producing an RFC. My feeling is that at this point, we'd be better off deciding an overall end goal for Semantic and plotting a roadmap for how to get there, rather than continuing with incremental peacemeal changes. So for now, it be my preference if we could call a halt to modifying Semantic's APIs, as it may be wasted effort if my RFC proposes ripping it all up and starting again!

In my opinion, having a wrapper gives us the flexibility to refactor Semantic while reducing the effects on downstream code. Right now, APIs for Semantic, SymbolTable, etc are directly tied to the internal SoA format. We can't change internal formats without breaking our contract. With a wrapper, we could refactor how internal structures are organized, then just change the implementation of Symbol's methods.

@overlookmotel
Copy link
Contributor

overlookmotel commented Jul 20, 2024

I'm sorry @DonIsaac but I don't think we should move forwards with this right now. Boshen has voiced opposition (and I don't want to pull him away from his holiday to argue the case with him). And I've said that, while I do like the thrust of what you're doing here, I'm also opposed to making changes to Semantic's APIs right now.

This may be a boringly practical argument, but I really want to get on with doing the research for this RFC, and I (like everyone) have limited bandwidth. So regardless of whether proposed changes are good or not, it makes it harder to hold the whole thing in your head if its an ever-evolving moving target, so it would be a big help if we could take a pause.

I've added a note on oxc-project/backlog#11 to come back to this later on, so your work won't be wasted.

So... TLDR... I'm not saying "no", but I am saying "not now please". I hope you can understand my reasoning.

@DonIsaac
Copy link
Contributor Author

Understandable, let's move forward on Semantic first. I'll close this PR.

@DonIsaac DonIsaac closed this Jul 20, 2024
@overlookmotel
Copy link
Contributor

Thanks for understanding Don. It's never easy saying "no", and rarely feels great receiving that answer either. So I appreciate you accepting my rationale, even though it's no doubt not what you wanted to hear.

@Boshen Boshen deleted the don/feat/symbol branch November 21, 2024 13:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-semantic Area - Semantic C-enhancement Category - New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants