Proposal: add several fields to `SymbolInformation` #154

olafurpg · 2023-05-15T16:21:12Z

This issue is an umbrella for several proposed additions to SCIP, based on discussions with @donsbot on Mastodon .

`SymbolInformation.display_name`

This would be the name of the symbol, which is both helpful for local variables and avoids parsing the name from the symbol. The field could be name instead of display_name, we use display_name in SemanticDB to emphasize that this name is meant to be displayed (and should therefore not have special encoding for non-ASCII characters like emojis.

`SymbolInformation.owner`

Alternative name parent. The thinking with this field is that it avoids parsing the owner from the symbol, and it allows us to emit an owner for local symbols.

`SymbolInformation.kind`

An enum that specifies what kind of symbol this is (enum/interface/method/...). Currently, Descriptor.Suffix doesn't encode enough fine-grained information (and it's intentionally named "suffix" to emphasize that it's primarily related to the syntax of the symbol.

`SymbolInformation.signature_documentation`

A string-formatted rendering of the signature. Currently, indexers emit this information in the documentation field as markdown-formatted code blocks. Having a separate field makes it cleaner to extract only the signature. I propose we reserve the field SymbolInformation.signature for fully typed/structured signatures (not string-formatted signatures).

The text was updated successfully, but these errors were encountered:

donsbot · 2023-05-23T04:48:20Z

SymbolInformation.kind

For symbol kinds, a variant of the lsif/vscode spec. E.g. currently used by Glean : https://github.com/facebookincubator/Glean/blob/main/glean/schema/source/codemarkup.types.angle#L38 ; these are used for search and outline views: to e.g. constraint a search to methods or classes only, or when searching by name, to show the kind of each result.

SymbolInformation.signature_documentation

this is somewhat important, as the scip index conflates "Documentation". e.g. in rust-analyzer the type signature and the documentation markdown are combined. They should ideally be separate "tables" in the output, keyed by scip.Symbol, as depending on the scenario you might only want to show the signature (e.g. a search page), or docs (an API page).

SymbolInformation.display_name

This is another search related application. Imagine taking the scip data and efficiently listing all symbols called 'vec'. We'd need a way to extrac the local name. Another scenario: search for foo::bar::vec , where 'bar' might be a subclass of something. In each case we need to know the precise local name and parent name ("qualified name"). These can be extracted by parsing the scip.Symbol but that's something the protocol could already do,

olafurpg · 2023-05-23T05:07:36Z

@donsbot Agreed! We have these pieces of information in SemanticDB for similar reasons. If we add these fields to SCIP, would you be interested in contributing the changes to rust-analyzer? I estimate it will be easy to update scip-java to emit the new fields. We can create a tracking issue for the remaining languages.

donsbot · 2023-05-23T05:21:50Z

Yep we would likely want to extend rust-analyzer immediately to support these.

This is the first step towards closing #154. Previously, the only way to determine the type of a symbol was through `SymbolDescriptor.Suffix`. This solution was not ideal since `Suffix` had very coarse grained types, and it was intentionally named "suffix" to indicate it was primarily used to determine the syntax of the symbol (not semantics). This PR adds a new enum `SymbolInformation.Kind` to add first-class support to determine the type of a symbol.

olafurpg · 2023-05-23T06:17:06Z

I opened a PR adding SymbolInformation.kind as a starting point. Will follow up with separate PRs adding the remaining fields.

Towards #154 Previously, there was no reliable way to get the name of a symbol. You could parse it from the symbol descriptor but that only worked for global symbols (not locals) and was error-prone when the indexer chooses to a special encoding in the symbol for some reason (for example, all-lowercase because symbols are case-insensitive). Now, clients can use `SymbolInformation.display_name` to reliably get the name of a symbol to render in API docs or some other application.

olafurpg · 2023-05-25T05:57:15Z

@donsbot can you elaborate on your motivation for SymbolInformation.owner? I am OK with adding this field on the condition it should only be used for locals. My concern with adding this field for global symbols is that indexers will start to emit malformed global symbols. I think it's a very desirable property that clients can implement a lot of functionality with an array of symbols (string[]) without having to load a global symbol table (record<string, SymbolInformation>). If we limit SymbolInformation.owner to locals then this problem is alleviated since it's cheap to load a local-only symbol table.

I think it should be unavoidable that clients need to implement a symbol parser to walk up the symbol hierarchy. The Java logic for the symbol parser is ~150 lines of code and we could aim to provide similar APIs for all language bindings.

donsbot · 2023-05-25T06:02:27Z

I was imagining most indexers usually know the "containing" symbol.

method -> class/interface
param -> method
class -> module (e.g. python)

which lets us build navigation by containing scope.

There's an equivalent relationship for parent by "inheritance" (e.g. A extends B).

So In this case I was just interested in the relationship between a symbol and its parent (at the scip.Symbol level, not the textual names).

Obviously if we have something like a/b/c() , we can figure out that a/b might be the containing parent. But then I need to search for scip.Symbol that match? or is this existent relationship present somewhere else?

donsbot · 2023-05-25T06:03:47Z

The specific product use case is not just symbol outlines on. page, but API listings. E.g. generating cargo-like docs for a language: E.g. looking up "class C" to see its API:

Class C

- method 1
- method 2
- method 3

Inherits:
- A::method4
- B::method5

* Add `SymbolInformation.display_name` Towards #154 Previously, there was no reliable way to get the name of a symbol. You could parse it from the symbol descriptor but that only worked for global symbols (not locals) and was error-prone when the indexer chooses to a special encoding in the symbol for some reason (for example, all-lowercase because symbols are case-insensitive). Now, clients can use `SymbolInformation.display_name` to reliably get the name of a symbol to render in API docs or some other application. * Add example for `display_name`.

Towards #154 Previously, SCIP clients had to parse markdown from the `SymbolInformation.documentation` field to display only the signature information of a symbol (for example, method parameters and return type). With the new `SymbolInformation.signature_documentation` field, indexers can now emit more structured information about the signature of a class/method including optional hyperlinks to the referenced symbols.

olafurpg · 2023-05-25T09:53:29Z

@donsbot You can parse the owner from the symbol fields to support those use-cases (assuming they're global symbols). The Java symbol parser is 150 lines of code and we have had a decent experience porting this logic to other languages with ChatGPT https://github.com/sourcegraph/scip-java/blob/main/scip-semanticdb/src/main/java/com/sourcegraph/scip_semanticdb/SymbolDescriptor.java#L21

I am open to add SymbolInformation.owner (or SymbolInformation.enclosing_symbol) to support the same functionality for local symbols.

Would that be a satisfiable solution? I'm not 100% against allowing owner/enclosing_symbol for global symbols but I am concerned it increases the risk that indexers stop emitting well-structured symbols as I mentioned in my last comment.

donsbot · 2023-05-26T02:31:30Z

Ok sounds good. Yes I will need it for locals (and can derive it for globals).

* Add `SymbolInformation.signature_documentation` Towards #154 Previously, SCIP clients had to parse markdown from the `SymbolInformation.documentation` field to display only the signature information of a symbol (for example, method parameters and return type). With the new `SymbolInformation.signature_documentation` field, indexers can now emit more structured information about the signature of a class/method including optional hyperlinks to the referenced symbols. * Made `signature_documentation` non-repeated

Towards #154 Previously, it was not possible to determine the enclosing symbol of a local because the syntax for local symbols has no hierarchy (unlike global symbols). This PR closes the gap by adding a field `SymbolInformation.enclosing_symbol` to allow documenting the "parent" or "owner" of a local symbol.

* Add `SymbolInformation.enclosing_symbol` Towards #154 Previously, it was not possible to determine the enclosing symbol of a local because the syntax for local symbols has no hierarchy (unlike global symbols). This PR closes the gap by adding a field `SymbolInformation.enclosing_symbol` to allow documenting the "parent" or "owner" of a local symbol. * Add example for enclosing_symbol * Regenerate bindings, and gitignore Haskell output

olafurpg · 2023-06-04T10:48:59Z

Closing this as fixed after #164 since all the proposed fields have been added to the spec now. However(!), none of the existing SCIP indexers have been updated yet to emit the new information. Feel free to open an issue in this repository or each SCIP indexer repository if you'd like the indexer to add support for these new fields.

varungandhi-src changed the title ~~Proposall: add several fields to SymbolInformation~~ Proposal: add several fields to SymbolInformation May 17, 2023

varungandhi-src mentioned this issue May 18, 2023

Surface signature information in hover docs sourcegraph/scip-clang#327

Closed

olafurpg mentioned this issue May 23, 2023

Add SymbolInformation.kind #156

Merged

olafurpg mentioned this issue May 25, 2023

Add SymbolInformation.display_name #158

Merged

olafurpg mentioned this issue May 25, 2023

Add SymbolInformation.signature_documentation #159

Merged

olafurpg mentioned this issue May 30, 2023

Add SymbolInformation.enclosing_symbol #164

Merged

olafurpg closed this as completed Jun 4, 2023

This was referenced Nov 17, 2023

Implement new SymbolInformation fields in the SCIP backend rust-lang/rust-analyzer#15919

Closed

Implement new scip SymbolInformation fields sourcegraph/scip-java#666

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: add several fields to `SymbolInformation` #154

Proposal: add several fields to `SymbolInformation` #154

olafurpg commented May 15, 2023

donsbot commented May 23, 2023 •

edited

Loading

olafurpg commented May 23, 2023 •

edited

Loading

donsbot commented May 23, 2023

olafurpg commented May 23, 2023

olafurpg commented May 25, 2023

donsbot commented May 25, 2023

donsbot commented May 25, 2023

olafurpg commented May 25, 2023

donsbot commented May 26, 2023

olafurpg commented Jun 4, 2023

Proposal: add several fields to SymbolInformation #154

Proposal: add several fields to SymbolInformation #154

Comments

olafurpg commented May 15, 2023

SymbolInformation.display_name

SymbolInformation.owner

SymbolInformation.kind

SymbolInformation.signature_documentation

donsbot commented May 23, 2023 • edited Loading

SymbolInformation.kind

SymbolInformation.signature_documentation

SymbolInformation.display_name

olafurpg commented May 23, 2023 • edited Loading

donsbot commented May 23, 2023

olafurpg commented May 23, 2023

olafurpg commented May 25, 2023

donsbot commented May 25, 2023

donsbot commented May 25, 2023

olafurpg commented May 25, 2023

donsbot commented May 26, 2023

olafurpg commented Jun 4, 2023

Proposal: add several fields to `SymbolInformation` #154

Proposal: add several fields to `SymbolInformation` #154

`SymbolInformation.display_name`

`SymbolInformation.owner`

`SymbolInformation.kind`

`SymbolInformation.signature_documentation`

donsbot commented May 23, 2023 •

edited

Loading

olafurpg commented May 23, 2023 •

edited

Loading