This is an Emacs Lisp binding for tree-sitter, an incremental parsing library. It requires Emacs 25.1 or above, built with dynamic module support.
It aims to be the foundation for a new breed of Emacs packages that understand code structurally. For example:
- Faster, fine-grained code highlighting.
- More flexible code folding.
- Structural editing (like Paredit, or even better) for non-Lisp code.
- More informative indexing for imenu.
The author of tree-sitter articulated its merits a lot better in this Strange Loop talk.
At this stage of the project, there are few end-user-visible features, but you can already install it to play around with the APIs.
- Check that Emacs was built with module support:
(functionp 'module-load). - Add the tree-sitter ELPA to
package-archives(remember to runpackage-refresh-contentsafterwards):(add-to-list 'package-archives '("ublt" . "https://elpa.ubolonton.org/packages/"))
- Install
tree-sitterandtree-sitter-langslike other normal packages.
The package is not yet on MELPA, because it currently doesn't have a convenient way to distribute packages with pre-compiled binaries.
If you want to hack on emacs-tree-sitter itself, see the section Setup for Development instead.
- Enable the
tree-sitterminor mode in a supported major mode (defined intree-sitter-major-mode-language-alist):(require 'tree-sitter) (require 'tree-sitter-langs) (global-tree-sitter-mode)
- Enable tree-based syntax highlighting for supported languages:
(require 'tree-sitter-hl) (add-hook 'tree-sitter-after-on-hook #'tree-sitter-hl-mode)
- Show the debug view of a buffer's parse tree
(require 'tree-sitter-debug) (tree-sitter-debug-mode)
- Get names of all functions in a Rust file:
(with-current-buffer "types.rs" (seq-map (lambda (capture) (pcase-let ((`(_ . ,node) capture)) (ts-node-text node))) (tree-sitter-debug-query [(function_item (identifier) @name)])))
- Write a simple extension to
expand-region:(defun tree-sitter-mark-next-bigger-node () (interactive) (let* ((p (point)) (m (if mark-active (mark) p)) (beg (min p m)) (end (max p m)) (root (ts-root-node tree-sitter-tree)) (node (ts-get-named-descendant-for-position-range root beg end)) (node-beg (ts-node-start-position node)) (node-end (ts-node-end-position node))) ;; Already marking current node. Try its parent node instead. (when (and (= beg node-beg) (= end node-end)) (when-let ((node (ts-get-parent node))) (setq node-beg (ts-node-start-position node) node-end (ts-node-end-position node)))) (set-mark node-end) (goto-char node-beg)))
- Parse a string:
(let ((parser (ts-make-parser))) (ts-set-language parser (tree-sitter-require 'rust)) (ts-parse-string parser "fn foo() {}"))
- The tree-sitter doc is a good read to understand its concepts, and how to use the parsers in general.
- Functions in this package are named differently, to be more Lisp-idiomatic. The overall parsing flow stays the same.
- Documentation for individual functions can be viewed with
C-h f(describe-function), as usual. - A
symbolin the C API is actually the ID of a type, so it's calledtype-idin this package.
language,parser,tree,node,cursor,query: corresponding tree-sitter types, embedded inuser-ptrobjects.point: a pair of(LINE-NUMBER . BYTE-COLUMN).LINE-NUMBERis the absolute line number returned byline-number-at-pos, counting from 1.BYTE-COLUMNcounts from 0, likecurrent-column. However, unlike that function, it counts bytes, instead of displayed glyphs.
range: a vector in the form of[START-BYTEPOS END-BYTEPOS START-POINT END-POINT].
These types are understood only by this package. They are not recognized by type-of, but have corresponding type-checking predicates, which are useful for debugging: ts-language-p, ts-tree-p, ts-node-p...
For consistency with Emacs's conventions, this binding has some differences compared to the tree-sitter's C/Rust APIs:
- It uses 1-based byte positions, not 0-based byte offsets.
- It uses 1-based line numbers, not 0-based row coordinates.
- Language:
tree-sitter-require: likerequire, for tree-sitter languages.
- Parser:
ts-make-parser: create a new parser.ts-set-language: set a parser's active language.ts-parse-string: parse a single string of source code.ts-parse-chunks: parse source code chunks generated by an input function.ts-set-included-ranges: set sub-ranges when parsing multi-language text.
- Tree:
ts-root-node: get the tree's root node.ts-edit-tree: prepare a tree for incremental parsing.ts-changed-ranges: compare 2 trees for changes.ts-tree-to-sexp: debug utility.
- Cursor:
ts-make-cursor: obtain a new cursor from either a tree or a node.ts-goto-functions: move to a different node.ts-current-functions: get the current field/node.
- Node:
ts-node-functions: node's properties and predicates.ts-get-functions: get related nodes (parent, siblings, children, descendants).ts-count-functions: count child nodes.ts-mapc-children: loops through child nodes.ts-node-to-sexp: debug utility.
- Query:
ts-make-query: create a new query.ts-make-query-cursor: create a new query cursor.ts-query-matches,ts-query-captures: execute a query, returning matches/captures.
Clone this repo and add its lisp and langs directories to load-path.
Install cask and run cask install to install dev dependencies.
If you want to hack on the high-level features (in Lisp) only:
- Evaluate this (once) to download the necessary binaries:
(require 'tree-sitter-langs-build) ;; Download pre-compiled `tree-sitter-dyn'. (tree-sitter-download-dyn-module)
- Make changes to the
.elfiles. - Add tests to
tree-sitter-tests.eland run them with./bin/test(.\bin\teston Windows).
If you want to build additional (or all) grammars from source, or work on the core dynamic module, see the next 2 sections.
- Install tree-sitter CLI tool (if you don't use NodeJS, you can download the binary directly from GitHub):
# For yarn user yarn global add tree-sitter-cli # For npm user npm install -g tree-sitter-cli
- Run:
# macOS/Linux make ensure/rust# Windows .\bin\ensure-lang rust
- You can modify
tree-sitter-langs-reposif the language you need is not declared there.
- Install the Rust toolchain.
- Install
clang, to generate the raw Rust binding foremacs-module.h. - Build:
# macOS/Linux make build# Windows .\bin\build - Test:
# macOS/Linux make test
# Windows .\bin\test - Continuously rebuild and test on change (requires cargo-watch):
# macOS/Linux make watch# Windows .\bin\test watch
To test against a different version of Emacs, set the environment variable EMACS (e.g. EMACS=/snap/bin/emacs make test).
Targeting lib authors:
- Write a guide on using the tree-sitter APIs.
Targeting end users:
- Pick a language, make a "killer" minor mode that extends its major mode in multiple ways.
- Make minor modes for most common languages.
- Extract common patterns from the language minor modes into helper language-diagnostic minor modes.
- Get a language major mode to use tree-sitter for optional features.
Binding through C instead of Rust: https://github.com/karlotness/tree-sitter.el
Contributions are welcomed. Please take a look at the issue list for ideas, or create a new issue to describe any idea you have for improvement.
Show respect and empathy towards others. Both technical empathy and general empathy are highly valued.