Goldziher/tree-sitter-language-pack

Tree Sitter Language Pack

This package bundles a comprehensive collection of tree-sitter languages as both a source distribution and pre-built wheels. It is compatible with tree-sitter v0.22.0 and above, and it is strongly typed.

Notes:

  • This package is a maintained and updated fork of tree-sitter-languages by Grant Jenks, and it incorporates code contributed by ObserverOfTime (see this PR).
  • This package is MIT licensed, while the original package it was forked from is licensed under Apache 2.0. Both licenses are available in the LICENSE file.
  • All languages bundled by this package are licensed under permissive open-source licenses (MIT, Apache 2.0, etc.) only; no GPL-licensed languages are included.

Installation

pip install tree-sitter-language-pack

Usage

This library exposes three functions: get_binding, get_language, and get_parser.

from tree_sitter_language_pack import get_binding, get_language, get_parser

python_binding = get_binding('python')  # this is an int pointing to the C binding
python_lang = get_language('python')  # this is an instance of tree_sitter.Language
python_parser = get_parser('python')  # this is an instance of tree_sitter.Parser

See the list of available languages below to get the name of the language you want to use.
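As a minimal sketch of what can be done with the parser returned by get_parser, the example below parses a small Python snippet and counts its function definitions. The helper name count_function_definitions is hypothetical and not part of this package; the node type name "function_definition" comes from the Python grammar. The import is done inside the function so that the snippet can be loaded even when the package is not installed.

```python
def count_function_definitions(source: bytes) -> int:
    """Count function definitions in Python source (hypothetical helper)."""
    # Imported lazily; requires `pip install tree-sitter-language-pack`.
    from tree_sitter_language_pack import get_parser

    parser = get_parser("python")  # an instance of tree_sitter.Parser
    tree = parser.parse(source)    # tree-sitter parses bytes, not str

    def walk(node) -> int:
        # Count this node if it is a function definition, then recurse.
        own = 1 if node.type == "function_definition" else 0
        return own + sum(walk(child) for child in node.children)

    return walk(tree.root_node)
```

Calling count_function_definitions(b"def a():\n    pass\n") walks the whole syntax tree, so nested functions are counted as well.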

Available Languages:

Each language below is identified by the key used to retrieve it from the get_language and get_parser functions.

Contributing

This library is open to and welcomes contributions.

Setup

  1. Fork the repository.
  2. Make sure to have PDM installed on your machine.
  3. You will also need the clang toolchain installed on your machine and available on your PATH. Consult the pertinent documentation for your operating system.
  4. Install and build locally by running pdm install -v.

Adding a new language

Install via PDM

Some bindings are installed via PDM and are added to the package dependencies in the pyproject.toml file. To add an installed package, follow these steps:

  1. Install the bindings with pdm add <bindings_package_name> --no-sync.
  2. Install the dev dependencies with pdm install -v --no-self.
  3. Execute the cloning script with pdm run scripts/clone_vendors.py.
  4. Update both the literal type InstalledBindings and the installed_bindings_map dictionary in the __init__.py file.
  5. Build the bindings by executing: pdm install -v.
  6. Execute the tests with pdm run test.
  7. If the tests pass, commit your changes and open a pull request.

Adding a Binary Wheel Language

  1. Add the language to the sources/language_definitions.json file at the repository's root.

This file contains a mapping of language names to their respective repositories.

{
  "name": {
    "repo": "https://github.com/...",
    "branch": "master",  // not mandatory
    "directory": "sub-dir/something",  // not mandatory
    "generate": true  // not mandatory
  }
}

That is, each object must have a repo key and may optionally have branch, directory, and generate keys.

  • repo is the URL of the tree-sitter repository. This value is mandatory.
  • branch is the branch of the repository to check out. Specify this only when the branch is not called main (i.e., for master or any other name).
  • directory is the directory that contains the src folder. Specify this only when the src folder is not immediately under the repository root.
  • generate is a flag that dictates whether the tree-sitter-cli generate command should be executed in the given repository / directory combination. Specify this only if the binding needs to be built in the repository.
  2. Update the SupportedLanguage literal type in the __init__.py file.
  3. Install the dev dependencies with pdm install -v --no-self.
  4. Execute the cloning script with pdm run scripts/clone_vendors.py.
  5. Build the bindings by executing: pdm install -v.
  6. Execute the tests with pdm run test.
  7. If the tests pass, commit your changes and open a pull request.
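
The entry format described for sources/language_definitions.json can be sanity-checked before running the cloning script. The following is a rough sketch only: check_definitions is a hypothetical helper that does not exist in this repository, and treating any key other than repo, branch, directory, and generate as an error is an assumption based on the description above. Note that real JSON does not allow // comments, so the sample omits them.

```python
import json

# Keys described above and the types they are assumed to hold.
ALLOWED_KEYS = {"repo": str, "branch": str, "directory": str, "generate": bool}


def check_definitions(definitions: dict) -> list[str]:
    """Return a list of human-readable problems (hypothetical helper)."""
    problems = []
    for name, entry in definitions.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry must be an object")
            continue
        if "repo" not in entry:
            problems.append(f"{name}: missing mandatory key 'repo'")
        for key, value in entry.items():
            expected = ALLOWED_KEYS.get(key)
            if expected is None:
                # Assumption: keys outside the documented set are mistakes.
                problems.append(f"{name}: unknown key '{key}'")
            elif not isinstance(value, expected):
                problems.append(f"{name}: '{key}' must be of type {expected.__name__}")
    return problems


# Same shape as the example entry above, with the URL left elided.
sample = json.loads("""
{
  "name": {
    "repo": "https://github.com/...",
    "branch": "master",
    "directory": "sub-dir/something",
    "generate": true
  }
}
""")

for problem in check_definitions(sample):
    print(problem)
```

To check the real file, the sample could be replaced with the result of json.load on sources/language_definitions.json opened from the repository root.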