Tree-sitter in Emacs 29+: Setting Up Syntax Highlighting for PHP and Web Languages

Tree-sitter is a parser generator and incremental parsing library that gives text editors access to a concrete syntax tree of the code being edited. Emacs 29 shipped with built-in tree-sitter support, and Emacs 30 continues to expand it. For PHP developers, tree-sitter means faster, more accurate syntax highlighting, structural navigation, and the foundation for features like combobulate and future indentation improvements.

This guide covers everything needed to get tree-sitter working for PHP, JavaScript, CSS, HTML, TypeScript, and JSON — the typical web development language stack.


Prerequisites

First, verify that your Emacs was compiled with tree-sitter support:

C-h v system-configuration-features
;; The value should include "TREE_SITTER"

;; Or check programmatically:
(treesit-available-p) ; returns t if available

If your distribution ships Emacs without tree-sitter support (some older distro packages do), compile Emacs from source with the --with-tree-sitter configure flag, or find a package built that way. On macOS with Homebrew, the emacs-plus tap provides Emacs 29+ builds; check its README for the current formula name and for whether tree-sitter needs an explicit option. On Arch Linux, the emacs package includes tree-sitter by default.
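The check above can also live in your init file so that a build without tree-sitter is flagged at startup. This is a minimal sketch; the warning type `init` and the message wording are arbitrary choices, not anything Emacs requires.

```elisp
;; Warn at startup when this Emacs cannot use tree-sitter.
;; The fboundp guard keeps the form safe on Emacs 28 and older,
;; where `treesit-available-p' does not exist.
(if (and (fboundp 'treesit-available-p)
         (treesit-available-p))
    (message "Tree-sitter support is available in this Emacs")
  (display-warning
   'init
   "This Emacs was built without tree-sitter; -ts-mode major modes will not work"
   :warning))
```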

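You also need the tree-sitter C library. The development package is required when building Emacs from source; a packaged Emacs normally pulls in the runtime library as a dependency: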

# Debian/Ubuntu
sudo apt install libtree-sitter-dev

# Arch Linux
sudo pacman -S tree-sitter

# macOS
brew install tree-sitter

Installing Language Grammars

Tree-sitter grammars are language-specific shared libraries (.so or .dylib files). They are not included with Emacs — you download and compile them separately.

Method 1: Built-in Grammar Installation (Recommended)

Configure grammar source repositories and use treesit-install-language-grammar:

(setq treesit-language-source-alist
      '((php        "https://github.com/tree-sitter/tree-sitter-php"        "master" "php/src")
        (phpdoc     "https://github.com/claytonrcarter/tree-sitter-phpdoc")
        (javascript "https://github.com/tree-sitter/tree-sitter-javascript")
        (typescript "https://github.com/tree-sitter/tree-sitter-typescript" "master" "typescript/src")
        (tsx        "https://github.com/tree-sitter/tree-sitter-typescript" "master" "tsx/src")
        (html       "https://github.com/tree-sitter/tree-sitter-html")
        (css        "https://github.com/tree-sitter/tree-sitter-css")
        (json       "https://github.com/tree-sitter/tree-sitter-json")
        (yaml       "https://github.com/ikatyang/tree-sitter-yaml")
        (bash       "https://github.com/tree-sitter/tree-sitter-bash")))

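Each entry has the form (LANG URL REVISION SOURCE-DIR). REVISION is a branch or tag, and SOURCE-DIR is the path within the repository where the grammar source lives; both are optional and default to the repository's default branch and src. For PHP, the grammar is in the php/src/ subdirectory of the tree-sitter-php repository.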
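To see how Emacs will interpret a given entry, you can pull it apart by position. The snippet below is only an illustrative sketch and assumes treesit-language-source-alist has already been set as shown above.

```elisp
(require 'treesit)

;; Entry layout: (LANG URL REVISION SOURCE-DIR)
;; REVISION and SOURCE-DIR may be omitted; Emacs then uses the
;; repository's default branch and the "src" directory.
(let ((entry (assq 'php treesit-language-source-alist)))
  (if entry
      (message "php grammar: url=%s revision=%s source-dir=%s"
               (nth 1 entry)
               (or (nth 2 entry) "default branch")
               (or (nth 3 entry) "src"))
    (message "No php entry in treesit-language-source-alist")))
```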

Install all grammars at once:

(defun my/install-all-treesit-grammars ()
  "Install all configured tree-sitter grammars."
  (interactive)
  (mapc #'treesit-install-language-grammar
        (mapcar #'car treesit-language-source-alist)))

;; Run once: M-x my/install-all-treesit-grammars

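Grammars are compiled into the tree-sitter/ subdirectory of user-emacs-directory (~/.emacs.d/tree-sitter/ in a default setup). This requires git and a C compiler (gcc or clang) on your PATH; grammars that ship a C++ scanner also need a C++ compiler.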
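Re-running the bulk installer recompiles every grammar, including the ones you already have. A variant that skips installed grammars might look like the following sketch; the function name is hypothetical.

```elisp
(require 'treesit)

(defun my/install-missing-treesit-grammars ()
  "Install each grammar in `treesit-language-source-alist' that is not yet available."
  (interactive)
  (dolist (entry treesit-language-source-alist)
    (let ((lang (car entry)))
      ;; Skip grammars Emacs can already load.
      (unless (treesit-language-available-p lang)
        (message "Installing tree-sitter grammar for %s..." lang)
        (treesit-install-language-grammar lang)))))
```

Run it with M-x my/install-missing-treesit-grammars after adding a new entry to the source list.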

Method 2: treesit-auto Package

The treesit-auto package automates grammar installation and mode remapping:

(use-package treesit-auto
  :ensure t
  :config
  (global-treesit-auto-mode))

This automatically remaps any language that has a tree-sitter grammar available to its -ts-mode variant. Run M-x treesit-auto-install-all to install all known grammars. It is the lower-friction option but gives you less control over which grammars are installed and how they are compiled.
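If you want some of that control back, treesit-auto exposes a few user options. The sketch below uses treesit-auto-install and treesit-auto-langs as described in the package README; treat both names and the language symbols as assumptions to verify against the version you install.

```elisp
(use-package treesit-auto
  :ensure t
  :custom
  ;; Ask before cloning and compiling a missing grammar.
  (treesit-auto-install 'prompt)
  :config
  ;; Limit treesit-auto to the web languages used in this guide.
  ;; Add php here only if your treesit-auto version ships a recipe for it;
  ;; otherwise keep installing the PHP grammar with Method 1.
  (setq treesit-auto-langs '(javascript typescript tsx html css json yaml bash))
  (global-treesit-auto-mode))
```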


Activating Tree-sitter Major Modes

Tree-sitter modes are separate major modes named <language>-ts-mode. The cleanest way to activate them is via major-mode-remap-alist, which maps the traditional mode to its tree-sitter counterpart:

(setq major-mode-remap-alist
      '((php-mode         . php-ts-mode)
        (js-mode          . js-ts-mode)
        (javascript-mode  . js-ts-mode)
        (typescript-mode  . typescript-ts-mode)
        (html-mode        . html-ts-mode)   ; html-ts-mode requires Emacs 30
        (mhtml-mode       . html-ts-mode)   ; default mode for .html files
        (css-mode         . css-ts-mode)
        (json-mode        . json-ts-mode)   ; third-party json-mode
        (js-json-mode     . json-ts-mode)   ; built-in JSON mode
        (yaml-mode        . yaml-ts-mode)
        (sh-mode          . bash-ts-mode)))

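After this, opening a .php file that would otherwise start php-mode activates php-ts-mode instead, provided the php grammar is installed. Note that php-ts-mode is a separate major mode with its own php-ts-mode-hook and php-ts-mode-map: hooks and keybindings defined for php-mode do not carry over, so add them to the -ts-mode hook and keymap as well.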
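A remap entry that points at a mode whose grammar is missing, or at a mode that does not exist in your Emacs version, leads to errors when the file is opened. One way to avoid that is to add each remap only when both the mode and its grammar are present. This is a sketch, and the list of modes is just an example.

```elisp
;; Each entry is (OLD-MODE TS-MODE GRAMMAR).
(when (and (fboundp 'treesit-available-p) (treesit-available-p))
  (dolist (spec '((css-mode        css-ts-mode  css)
                  (js-mode         js-ts-mode   javascript)
                  (javascript-mode js-ts-mode   javascript)
                  (php-mode        php-ts-mode  php)
                  (sh-mode         bash-ts-mode bash)))
    (let ((old  (nth 0 spec))
          (new  (nth 1 spec))
          (lang (nth 2 spec)))
      ;; Remap only when the -ts-mode function exists and its grammar loads.
      (when (and (fboundp new)
                 (treesit-language-available-p lang))
        (add-to-list 'major-mode-remap-alist (cons old new))))))
```

With this approach, a machine that lacks a grammar simply falls back to the traditional mode.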

Verify that tree-sitter is active in a buffer:

M-x treesit-inspect-mode   ; show the node at point in the mode line
M-: (treesit-language-at (point))  ; returns the language at cursor
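For a quick yes-or-no answer, a small command can report which parsers are attached to the current buffer. The command name below is hypothetical; treesit-parser-list and treesit-parser-language are part of the built-in treesit API.

```elisp
(defun my/treesit-describe-buffer ()
  "Report the major mode and the tree-sitter parsers active in this buffer."
  (interactive)
  (let ((langs (mapcar #'treesit-parser-language (treesit-parser-list))))
    (if langs
        (message "%s is using tree-sitter parsers: %s" major-mode langs)
      (message "%s has no tree-sitter parser in this buffer" major-mode))))
```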

PHP-Specific Configuration

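php-ts-mode is built into Emacs 30. On Emacs 29 you need a separately installed php-ts-mode package; check that it is actually present in NonGNU ELPA or another archive you have configured before relying on :ensure: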

(use-package php-ts-mode
  :ensure t  ; only needed on Emacs 29
  :mode "\\.php\\'"
  :hook (php-ts-mode . eglot-ensure)
  :config
  ;; Adjust indentation — Drupal uses 2-space indentation
  (setq php-ts-mode-indent-offset 2))

For Drupal development, php-ts-mode parses the full PHP 8.x grammar, including attributes (#[Attribute]), enums, match expressions, and first-class callable syntax. Regex-based highlighting tends to handle these newer constructs less reliably.

Fontification Level

Tree-sitter font-lock operates at configurable levels from 1 (minimal) to 4 (maximum detail). Level 3 is the default and a good balance for PHP development; level 4 adds further detail that some people find visually noisy:

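;; Set globally (3 is the default), before any -ts-mode buffer is created
(setq treesit-font-lock-level 3)

;; Or per-mode via hook; the features must be recomputed for the
;; buffer-local value to take effect
(add-hook 'php-ts-mode-hook
          (lambda ()
            (setq-local treesit-font-lock-level 4)
            (treesit-font-lock-recompute-features)))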

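The features enabled at each level are defined by the major mode in treesit-font-lock-feature-list, so they vary by mode and Emacs version. The usual convention is:

  • Level 1: comments and definitions
  • Level 2: keywords, strings, types
  • Level 3: assignments, constants, numbers, literals, attributes, escape sequences
  • Level 4: everything else, such as operators, brackets, delimiters, function calls, variables, properties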
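To see the exact grouping your installed mode uses, you can print its buffer-local feature list. The command below is a small sketch with a hypothetical name; run it from a php-ts-mode buffer.

```elisp
(defun my/treesit-show-font-lock-features ()
  "Display the font-lock features of the current tree-sitter mode, by level."
  (interactive)
  ;; Capture the buffer-local value before switching output buffers.
  (let ((feature-list (bound-and-true-p treesit-font-lock-feature-list))
        (level 0))
    (if (null feature-list)
        (message "No tree-sitter font-lock features in this buffer")
      (with-output-to-temp-buffer "*treesit-features*"
        (dolist (features feature-list)
          (setq level (1+ level))
          (princ (format "Level %d: %S\n" level features)))))))
```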

Structural Navigation

Tree-sitter powers several structural navigation commands in Emacs 29+:

;; Navigate by syntax node
M-x treesit-beginning-of-defun  ; Move to the start of the current function/method
M-x treesit-end-of-defun        ; Move to its end

;; Tree-sitter major modes normally wire these into C-M-a and C-M-e already.
;; If a mode does not, remap them explicitly once the mode is loaded:
(with-eval-after-load 'php-ts-mode
  (define-key php-ts-mode-map [remap beginning-of-defun] #'treesit-beginning-of-defun)
  (define-key php-ts-mode-map [remap end-of-defun]       #'treesit-end-of-defun))
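The same defun information is available to your own commands. As an illustration, the sketch below narrows the buffer to the function or method at point using treesit-defun-at-point; the command name is hypothetical, and it only works in modes that define what counts as a defun.

```elisp
(require 'treesit)

(defun my/treesit-narrow-to-defun ()
  "Narrow the buffer to the tree-sitter defun at point."
  (interactive)
  (let ((node (treesit-defun-at-point)))
    (if node
        (narrow-to-region (treesit-node-start node)
                          (treesit-node-end node))
      (user-error "No defun found at point"))))
```

Use C-x n w (widen) to restore the full buffer afterwards.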

Imenu Integration

Tree-sitter modes generate accurate imenu indexes — the list of functions, methods, and classes navigable with M-x imenu or via Vertico/Consult:

;; Use consult-imenu for fuzzy imenu navigation
(use-package consult
  :ensure t
  :bind ("M-g i" . consult-imenu))

In a large Drupal module with many methods and hooks, consult-imenu with tree-sitter-generated indexes is typically more accurate than the regex-based imenu of traditional php-mode, because entries come from parsed definitions rather than pattern matches.


Combobulate: Structural Editing Powered by Tree-sitter

Combobulate is a third-party package that uses the tree-sitter syntax tree for structural editing: moving code blocks, swapping arguments, expanding selections to syntax boundaries, and navigating by AST node rather than by text pattern:

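;; Combobulate may not be in your package archives; if :ensure fails,
;; install it from the project's repository instead. Language coverage
;; varies by release, so confirm PHP support in the project README.
(use-package combobulate
  :ensure t
  :hook ((php-ts-mode       . combobulate-mode)
         (js-ts-mode        . combobulate-mode)
         (typescript-ts-mode . combobulate-mode)
         (html-ts-mode      . combobulate-mode)
         (css-ts-mode       . combobulate-mode))
  ;; Command names can change between releases; confirm them with
  ;; M-x apropos-command RET combobulate RET.
  :bind (:map combobulate-key-map
         ("C-c o u" . combobulate-navigate-up)
         ("C-c o d" . combobulate-navigate-down)
         ("C-c o n" . combobulate-navigate-next)
         ("C-c o p" . combobulate-navigate-previous)
         ("C-c o t" . combobulate-transpose-sexps)))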

The most useful combobulate operations for PHP work:

  • Navigate to next/previous sibling node — jump between arguments in a function call, or between methods in a class.
  • Expand selection to enclosing node — select from a variable outward to the expression, then the statement, then the block.
  • Splice — remove a wrapping construct while keeping its contents (e.g., remove an if-block wrapper while keeping the body).
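To make the idea of expanding a selection to the enclosing node concrete, here is a rough sketch written against the built-in treesit API only. It is not how Combobulate implements the feature, and the command name is hypothetical.

```elisp
(require 'treesit)

(defun my/treesit-mark-enclosing-node ()
  "Select the smallest syntax node strictly larger than the region or point.
Repeated calls grow the selection outward one node at a time."
  (interactive)
  (unless (treesit-parser-list)
    (user-error "No tree-sitter parser in this buffer"))
  (let* ((beg (if (use-region-p) (region-beginning) (point)))
         (end (if (use-region-p) (region-end) (point)))
         (node (treesit-node-on beg end)))
    ;; Climb while the node covers exactly the current selection.
    (while (and node
                (= (treesit-node-start node) beg)
                (= (treesit-node-end node) end))
      (setq node (treesit-node-parent node)))
    (if node
        (progn
          (goto-char (treesit-node-start node))
          ;; Set and activate the mark at the end of the node.
          (push-mark (treesit-node-end node) t t))
      (user-error "No enclosing node"))))
```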

Checking Grammar Installation and Debugging

;; Check if a grammar is installed
(treesit-language-available-p 'php)       ; t or nil
(treesit-language-available-p 'javascript) ; t or nil

;; Report availability for each grammar configured above
(mapcar (lambda (lang) (cons lang (treesit-language-available-p lang)))
        '(php javascript typescript html css json yaml bash))

;; Inspect the tree of the current buffer
M-x treesit-explore-mode     ; opens a live syntax tree viewer

;; Find the node at point
M-: (treesit-node-at (point))

;; Find the parent of the node at point
M-: (treesit-node-parent (treesit-node-at (point)))
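When a grammar compiles but refuses to load, the cause is often a mismatch between the grammar's ABI version and the range supported by the installed tree-sitter library. The following sketch reports both, along with the failure detail Emacs records; the command name is hypothetical.

```elisp
(defun my/treesit-report-abi (lang)
  "Report availability and ABI versions for the grammar LANG."
  (interactive (list (intern (read-string "Language: " "php"))))
  ;; With a non-nil second argument the result is (AVAILABLE . DETAIL).
  (let ((status (treesit-language-available-p lang t)))
    (if (car status)
        (message "%s grammar ABI %s; library supports ABI %s to %s"
                 lang
                 (treesit-language-abi-version lang)
                 (treesit-library-abi-version t)
                 (treesit-library-abi-version))
      (message "%s grammar is not available: %S" lang (cdr status)))))
```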

The treesit-explore-mode buffer is invaluable when writing custom tree-sitter queries or debugging why a particular construct is not being highlighted correctly. It shows the exact node type at point and its position in the parse tree.
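Once you know the node types, you can try a query directly with treesit-query-capture. The sketch below lists function and method names in a PHP buffer. The node and field names (function_definition, method_declaration, name) are assumptions about the PHP grammar that you should confirm in the explorer, and the command name is hypothetical.

```elisp
(require 'treesit)

(defun my/php-ts-list-functions ()
  "Show the names of functions and methods defined in the current PHP buffer."
  (interactive)
  (unless (treesit-language-available-p 'php)
    (user-error "The php grammar is not installed"))
  (let* ((root (treesit-buffer-root-node 'php))
         ;; Each capture is a cons cell (CAPTURE-NAME . NODE).
         (captures (treesit-query-capture
                    root
                    '((function_definition name: (name) @fn)
                      (method_declaration name: (name) @fn)))))
    (if captures
        (message "Definitions: %s"
                 (mapconcat (lambda (capture)
                              (treesit-node-text (cdr capture) t))
                            captures ", "))
      (message "No function or method definitions found"))))
```

If the query signals an error, compare the node types in the pattern with what treesit-explore-mode shows for the same code.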


Complete Configuration

;;; Tree-sitter setup for PHP and web development

;; 1. Grammar sources
(setq treesit-language-source-alist
      '((php        "https://github.com/tree-sitter/tree-sitter-php"        "master" "php/src")
        (javascript "https://github.com/tree-sitter/tree-sitter-javascript")
        (typescript "https://github.com/tree-sitter/tree-sitter-typescript" "master" "typescript/src")
        (tsx        "https://github.com/tree-sitter/tree-sitter-typescript" "master" "tsx/src")
        (html       "https://github.com/tree-sitter/tree-sitter-html")
        (css        "https://github.com/tree-sitter/tree-sitter-css")
        (json       "https://github.com/tree-sitter/tree-sitter-json")
        (yaml       "https://github.com/ikatyang/tree-sitter-yaml")
        (bash       "https://github.com/tree-sitter/tree-sitter-bash")))

;; 2. Fontification level
(setq treesit-font-lock-level 3)

;; 3. Mode remapping
(setq major-mode-remap-alist
      '((php-mode        . php-ts-mode)
        (js-mode         . js-ts-mode)
        (javascript-mode . js-ts-mode)
        (typescript-mode . typescript-ts-mode)
        (html-mode       . html-ts-mode)   ; html-ts-mode requires Emacs 30
        (mhtml-mode      . html-ts-mode)
        (css-mode        . css-ts-mode)
        (json-mode       . json-ts-mode)
        (js-json-mode    . json-ts-mode)
        (yaml-mode       . yaml-ts-mode)
        (sh-mode         . bash-ts-mode)))

;; 4. PHP tree-sitter mode (built in on Emacs 30; separate package on Emacs 29)
(use-package php-ts-mode
  :ensure t
  :mode "\\.php\\'"
  :hook (php-ts-mode . eglot-ensure)
  :config
  (setq php-ts-mode-indent-offset 2))

;; 5. Combobulate for structural editing
;;    (if :ensure cannot find it in your archives, install it from its repository)
(use-package combobulate
  :ensure t
  :hook ((php-ts-mode        . combobulate-mode)
         (js-ts-mode         . combobulate-mode)
         (typescript-ts-mode . combobulate-mode)
         (html-ts-mode       . combobulate-mode)))

Summary Checklist

  • Verify Emacs was compiled with tree-sitter: (treesit-available-p) must return t.
  • Configure treesit-language-source-alist with the PHP grammar's php/src subdirectory path.
  • Run M-x treesit-install-language-grammar for each language, or use treesit-auto for bulk install.
  • Use major-mode-remap-alist to activate -ts-mode variants automatically.
  • On Emacs 30+ php-ts-mode is built in; on Emacs 29, install a php-ts-mode package separately after confirming which archive provides it.
  • Set treesit-font-lock-level to 3 or 4 for comprehensive PHP highlighting including attributes and enums.
  • Use treesit-explore-mode to inspect the syntax tree — essential for debugging highlighting issues.
  • Add combobulate-mode to PHP and JS hooks for structural editing powered by the syntax tree.
  • Pair with Eglot + Intelephense for a complete PHP IDE experience: tree-sitter handles highlighting, LSP handles completions and diagnostics.
