Posts tagged "tree-sitter":
Using the golang mode shipped with Emacs
A few weeks ago I wanted to try out tree-sitter and switched a few of the modes
I use for coding to their -ts-mode variants. Based on the excellent How to Get
Started with Tree-Sitter I added bits like this to the setup I have for coding
modes:
(use-package X-mode
  :init
  (add-to-list 'treesit-language-source-alist '(X "https://github.com/tree-sitter/tree-sitter-X"))
  ;; (treesit-install-language-grammar 'X)
  (add-to-list 'major-mode-remap-alist '(X-mode . X-ts-mode))
  ;; ...
  )
I then manually evaluated the expression that's commented out to download and
compile the tree-sitter grammar. It's a rather small change, it works, and I can
switch over language by language. I swapped a couple of languages to the
tree-sitter modes like this, including golang. The only mode where I noticed a
change was golang: adding gofmt-before-save to before-save-hook had stopped
having any effect.
What I hadn't realised was that the go-mode I was using didn't ship with Emacs
and that when I switched to go-ts-mode I switched to one that does. It turns
out that gofmt-before-save is hard-wired to work only in go-mode, something
others have noticed.
I don't feel like waiting for go-mode to fix that though, especially not when
there's a perfectly fine golang mode shipping with Emacs now, and not when
emacs-reformatter makes it so easy to define formatters (as I've written about
before).
My golang setup, sans keybindings, now looks like this:
(use-package go-ts-mode
  :hook
  (go-ts-mode . lsp-deferred)
  (go-ts-mode . go-format-on-save-mode)
  :init
  (add-to-list 'treesit-language-source-alist '(go "https://github.com/tree-sitter/tree-sitter-go"))
  (add-to-list 'treesit-language-source-alist '(gomod "https://github.com/camdencheek/tree-sitter-go-mod"))
  ;; (dolist (lang '(go gomod)) (treesit-install-language-grammar lang))
  (add-to-list 'auto-mode-alist '("\\.go\\'" . go-ts-mode))
  (add-to-list 'auto-mode-alist '("/go\\.mod\\'" . go-mod-ts-mode))
  :config
  (reformatter-define go-format
    :program "goimports"
    :args '("/dev/stdin"))
  :general
  ;; ...
  )
So far I'm happy with the built-in go-ts-mode and I've got to say that using a
minor mode for the format-on-save functionality is more elegant than adding a
function to before-save-hook (something that go-mode may get through this PR).
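To illustrate why a minor mode is a nice fit, here is a minimal sketch of a buffer-local format-on-save mode using only built-ins. It is not what reformatter-define generates. The names my-gofmt-buffer and my-gofmt-on-save-mode are made up, and the sketch assumes gofmt is on the PATH and formats whatever it receives on stdin.

```elisp
;; Hypothetical sketch: `my-gofmt-buffer' and `my-gofmt-on-save-mode' are
;; made-up names, and gofmt is assumed to be on the PATH.

(defun my-gofmt-buffer ()
  "Replace the buffer text with gofmt's output, if gofmt succeeds."
  (when (executable-find "gofmt")
    (let ((output (generate-new-buffer " *my-gofmt*")))
      (unwind-protect
          ;; With no file arguments gofmt formats stdin and prints to stdout.
          ;; A non-zero exit status (e.g. a syntax error) leaves the buffer as is.
          (when (eq 0 (call-process-region (point-min) (point-max)
                                           "gofmt" nil output nil))
            (replace-buffer-contents output))
        (kill-buffer output)))))

(define-minor-mode my-gofmt-on-save-mode
  "Run `my-gofmt-buffer' before saving, in this buffer only."
  :lighter " GoFmt"
  (if my-gofmt-on-save-mode
      ;; The final t makes the hook buffer-local, so no major-mode check is needed.
      (add-hook 'before-save-hook #'my-gofmt-buffer nil t)
    (remove-hook 'before-save-hook #'my-gofmt-buffer t)))
```

Because the hook is installed per buffer, the mode works in whichever major mode enables it, for instance via (add-hook 'go-ts-mode-hook #'my-gofmt-on-save-mode).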
More on tree-sitter and consult
Here are a few things that I've gotten help with figuring out during the last few days. Both are related to my playing with tree-sitter, which I've written about earlier, here and here.
You might also be interested in the two repositories where the full code is. (I've linked to the specific commits as of this writing.)
Anonymous nodes and matching in tree-sitter
In the grammar for Cabal I have a rule for sections that looks like this
sections: $ => repeat1(choice(
  $.benchmark,
  $.common,
  $.executable,
  $.flag,
  $.library,
  $.source_repository,
  $.test_suite,
)),
where each section followed this pattern
benchmark: $ => seq(
  repeat($.comment),
  'benchmark',
  field('name', $.section_name),
  field('properties', $.property_block),
),
This made it a little bit difficult to capture the relevant parts of each
section to implement consult-cabal. I thought a pattern like this ought to
work
(cabal (sections (_ _ @type name: (section_name)? @name)))
but it didn't; I got way too many things captured in type. Clearly I had
misunderstood something about the wildcards, or the query syntax. I attempted to
add a field name to the anonymous node, i.e. change the section rules like this
benchmark: $ => seq(
  repeat($.comment),
  field('type', 'benchmark'),
  field('name', $.section_name),
  field('properties', $.property_block),
),
It was accepted by tree-sitter generate, but the field type was nowhere to
be found in the parse tree.
Then I changed the query to list the anonymous nodes explicitly, like this
(cabal (sections (_ ["benchmark" "common" "executable" ...] @type name: (section_name)? @name)))
That worked, but listing all the sections like that in the query didn't sit right with me.
Luckily there's a discussions area in tree-sitter's GitHub, so a fairly short
discussion later I had answers to why my query behaved like it did and a
solution that would allow me to not list all the section types in the query. The
trick is to wrap the string in a call to alias to make it a named node. After
that it works to add a field name to it as well, of course. The section rules
now look like this
benchmark: $ => seq(
  repeat($.comment),
  field('type', alias('benchmark', $.section_type)),
  field('name', $.section_name),
  field('properties', $.property_block),
),
and the final query looks like this
(cabal (sections (_ type: (section_type) @type name: (section_name)? @name)))
With that in place I could improve on the function that collects all the items
for consult-cabal so it now shows the section's type and name instead of the
string representation of the tree-sitter node.
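As far as I understand it, treesit-query-capture returns one flat list of (capture . node) pairs for the whole query, so the type and name captures of each section have to be paired up again before they can be turned into labels. The following is a rough sketch of one way to do that, not the code from the repositories. All the my-cabal- names are made up, and it assumes that captures arrive in buffer order with each @type preceding the @name of the same section.

```elisp
;; Hypothetical helpers; none of these names come from the post's repositories.

(defun my-cabal--group-captures (captures)
  "Group flat CAPTURES, an alist of (NAME . VALUE), into (TYPE . NAME) pairs.
A section without a name capture gets nil as its NAME."
  (let (result)
    (dolist (capture captures)
      (pcase (car capture)
        ;; A `type' capture starts a new section entry.
        ('type (push (cons (cdr capture) nil) result))
        ;; A `name' capture belongs to the most recently started section.
        ('name (when result (setcdr (car result) (cdr capture))))))
    (nreverse result)))

(defun my-cabal--section-label (type name)
  "Build a display string from the strings TYPE and NAME (NAME may be nil)."
  (if name (format "%s %s" type name) type))

(defun my-cabal-section-labels ()
  "Return labels like \"executable foo\" for all sections in the buffer.
Assumes a `cabal' parser with the aliased `section_type' node is available."
  (mapcar (lambda (pair)
            (my-cabal--section-label
             (treesit-node-text (car pair) t)
             (and (cdr pair) (treesit-node-text (cdr pair) t))))
          (my-cabal--group-captures
           (treesit-query-capture
            (treesit-buffer-root-node 'cabal)
            "(cabal (sections (_ type: (section_type) @type name: (section_name)? @name)))"))))
```

The grouping function only looks at capture names, so it can be tried out with plain strings in place of nodes.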
State in a consult source for preview of lines in a buffer
I was struggling with figuring out how to make a good state function in order
to preview the items in consult-cabal. The GitHub repo for consult doesn't
have discussions enabled, but after a discussion in an issue I'd arrived at a
state function that works very well.
The state function makes use of functions in consult and looks like this
(defun consult-cabal--state ()
  "Create a state function for previewing sections."
  (let ((state (consult--jump-state)))
    (lambda (action cand)
      (when cand
        (let ((pos (get-text-property 0 'section-pos cand)))
          (funcall state action pos))))))
The trick here was to figure out how the function returned by
consult--jump-state actually works. On the surface it looks like it takes an
action and a candidate, (lambda (action cand) ...). However, the argument
cand shouldn't be the currently selected item, but rather a position (ideally a
marker), so I had to attach another text property to the items (section-pos,
which is fetched in the inner lambda). This position is then what's passed to
the function returned by consult--jump-state.
In hindsight it seems so easy, but I was struggling with this for an entire evening before finally asking the question the morning after.
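A small sketch of the item side of this, i.e. attaching a marker under the section-pos property so the state function has something to hand to consult--jump-state. The helper names are made up for illustration and the real item function in the repository may well look different.

```elisp
;; Hypothetical helpers; only the property name `section-pos' is from the post.

(defun my-cabal--section-item (label pos)
  "Return the string LABEL carrying a marker for buffer position POS.
The marker is stored in the `section-pos' text property and points into
the buffer that is current when this function is called."
  (propertize label 'section-pos (copy-marker pos)))

(defun my-cabal--item-position (item)
  "Return the marker stored on ITEM, or nil if there is none."
  (get-text-property 0 'section-pos item))
```

A marker rather than a plain integer means the stored position follows the text if the buffer is edited between collecting the items and jumping to one.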
Cabal, tree-sitter, and consult
After my last post I thought I'd move on to implement the rest of the functions
in haskell-mode's major mode for Cabal, functions like
haskell-cabal-goto-library-section and haskell-cabal-goto-executable-section.
Then I realised that what I really want is a way to quickly jump to any
section, that is, I want consult-cabal!
What follows is very much a work-in-progress, but hopefully it'll show enough promise.
Listing the sections
As I have a tree-sitter parse tree to hand it is fairly easy to fetch all the
nodes corresponding to sections. Since the last post I've made some
improvements to the parser and now the parse tree looks like this (I can
recommend the function treesit-explore-mode to explore the parse tree, I've
found it invaluable ever since I realised it existed)
(cabal
 ...
 (properties ...)
 (sections
  (common common (section_name) ...)
  (library library ...)
  (executable executable (section_name) ...)
  ...))
That is, all the sections are children of the node called sections.
The function to use for fetching all the nodes is treesit-query-capture. It
needs a node to start on, which in this case should be the root of the full
parse tree, i.e. (treesit-buffer-root-node 'cabal), and a query string. Given
the structure of the parse tree, and that I want to capture all children of
sections, a query string like this one works
"(cabal (sections (_)* @section))"
Finally, by default treesit-query-capture returns a list of tuples of the form
(<capture> . <node>), but in this case I only want the list of nodes, so the
full call will look like this
(treesit-query-capture (treesit-buffer-root-node 'cabal) "(cabal (sections (_)* @section))" nil nil t)
Hooking it up to consult
As I envision adding more things to jump to in the future, I decided to make use
of consult--multi. That in turn means I need to define a "source" for the
sections. After a bit of digging and rummaging in the consult source I put
together this
(defvar consult-cabal--source-section
  `(:name "Sections"
    :category location
    :action ,#'consult-cabal--section-action
    :items ,#'consult-cabal--section-items)
  "Definition of source for Cabal sections.")
which means I need two functions, consult-cabal--section-action and
consult-cabal--section-items. I started with the latter.
Getting section nodes as items for consult
It took me a while to understand how this would ever be able to work. The
function that :items points to must return a list of strings, but how would I
ever be able to use just a string to jump to the correct location?
The solution is in a comment in the documentation of consult--multi:
:items - List of strings to select from or function returning list of strings. Note that the strings can use text properties to carry metadata, which is then available to the :annotate, :action and :state functions.
I'd never come across text properties in Emacs before, so at first I
completely missed those two words. Once I'd looked up the concept in the
documentation everything fell into place. The function
consult-cabal--section-items would simply attach the relevant node as a text
property to the strings in the list.
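For anyone else who hasn't met text properties before, here is a tiny stand-alone sketch of the concept using only built-in functions. The property name my-payload and the two helper names are invented for the illustration.

```elisp
;; Illustration only; `my-payload', `my-make-item' and `my-item-payload'
;; are made-up names.

(defun my-make-item (label payload)
  "Return a copy of the string LABEL with PAYLOAD attached as a text property."
  (propertize label 'my-payload payload))

(defun my-item-payload (item)
  "Return the payload stored on the first character of ITEM."
  (get-text-property 0 'my-payload item))

;; The payload can be any Lisp object, e.g. a plist, a marker or a node:
;;   (my-item-payload (my-make-item "library" '(:line 12)))
;; The string itself still compares as plain text, since `equal' ignores
;; text properties:
;;   (equal (my-make-item "library" '(:line 12)) "library")
```

This is why a completion UI can show and match on the string while the :action function still gets at the attached object.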
My current version, obviously a work-in-progress, takes a list of nodes and turns them naïvely into a string and attaches the node. I split it into two functions, like this
(defun consult-cabal--section-to-string (section)
  "Convert a single SECTION node to a string."
  (propertize (format "%S" section) :treesit-node section))

(defun consult-cabal--section-items ()
  "Fetch all sections as a list of strings."
  (let ((section-nodes (treesit-query-capture (treesit-buffer-root-node 'cabal)
                                              "(cabal (sections (_)* @section))"
                                              nil nil t)))
    (mapcar #'consult-cabal--section-to-string section-nodes)))
Implementing the action
The action function is called with the selected item, i.e. with the string and
its properties. That means, to jump to the selected section the function needs
to extract the node property, :treesit-node, and jump to the start of it. The
function to use is get-text-property, and as all characters in the string will
have the property I just picked the first one. The jumping itself I copied from
the navigation functions I'd written before.
(defun consult-cabal--section-action (item)
  "Go to the section referenced by ITEM."
  (when-let* ((node (get-text-property 0 :treesit-node item))
              (new-pos (treesit-node-start node)))
    (goto-char new-pos)))
Tying it together with consult--multi
The final function, consult-cabal, looks like this
(defun consult-cabal ()
  "Choose a Cabal construct and jump to it."
  (interactive)
  (consult--multi '(consult-cabal--source-section)
                  :sort nil))
Conclusions and where to find the code
The end result works as intended, but it's very rough. I'll try to improve it a bit more. In particular I want
- better strings - (format "%S" node) is all right to start with, but in the long run I want strings that describe the sections, and
- preview as I navigate between items - AFAIU this is what the :state field is for, but I still haven't looked into how it works.
The source can be found here.
Making an Emacs major mode for Cabal using tree-sitter
A few days ago I posted on r/haskell that I'm attempting to put together a Cabal grammar for tree-sitter. Some things are still missing, but it covers enough to start doing what I initially intended: experiment with writing an alternative Emacs major mode for Cabal.
The documentation for the tree-sitter integration is very nice, and several of
the major modes already have tree-sitter variants, called X-ts-mode where X
is e.g. python, so putting together the beginning of a major mode wasn't too
much work.
Configuring Emacs
First off I had to make sure the parser for Cabal was installed. The snippet for that looks like this[1]
(use-package treesit
  :straight nil
  :ensure nil
  :commands (treesit-install-language-grammar)
  :init
  (setq treesit-language-source-alist
        '((cabal . ("https://gitlab.com/magus/tree-sitter-cabal.git")))))
With that in place the parser is installed using M-x
treesit-install-language-grammar and choosing cabal.
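If the manual step gets tedious, something along these lines could go in the init file instead. It is only a sketch: it assumes treesit-language-source-alist already contains the cabal entry shown earlier, and that git and a C compiler are available for treesit-install-language-grammar to use.

```elisp
;; Sketch of an install-if-missing check. Assumes the `cabal' entry in
;; `treesit-language-source-alist' has already been set up.
(when (and (fboundp 'treesit-available-p)
           (treesit-available-p)
           (not (treesit-language-available-p 'cabal)))
  ;; Clones the grammar repository and compiles it on first use.
  (treesit-install-language-grammar 'cabal))
```

On later startups treesit-language-available-p finds the compiled grammar and nothing is downloaded.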
After that I removed my configuration for haskell-mode and added the following
snippet to get my own major mode into my setup.
(use-package my-cabal-mode
  :straight (:type git
             :repo "git@gitlab.com:magus/my-emacs-pkgs.git"
             :branch "main"
             :files (:defaults "my-cabal-mode/*el")))
The major mode and font-locking
The built-in elisp documentation actually has a section on writing a major mode with tree-sitter, so it was easy to get started. Setting up the font-locking took a bit of trial-and-error, but once I had comments looking the way I wanted it was easy to add to the setup. Oh, and yes, there's a section on font-locking with tree-sitter in the documentation too. At the moment it looks like this
(defvar cabal--treesit-font-lock-setting
  (treesit-font-lock-rules
   :feature 'comment
   :language 'cabal
   '((comment) @font-lock-comment-face)

   :feature 'cabal-version
   :language 'cabal
   '((cabal_version _) @font-lock-constant-face)

   :feature 'field-name
   :language 'cabal
   '((field_name) @font-lock-keyword-face)

   :feature 'section-name
   :language 'cabal
   '((section_name) @font-lock-variable-name-face))
  "Tree-sitter font-lock settings.")

;;;###autoload
(define-derived-mode my-cabal-mode fundamental-mode "My Cabal"
  "My mode for Cabal files"
  (when (treesit-ready-p 'cabal)
    (treesit-parser-create 'cabal)
    ;; set up treesit
    (setq-local treesit-font-lock-feature-list
                '((comment field-name section-name)
                  (cabal-version)
                  ()
                  ()))
    (setq-local treesit-font-lock-settings cabal--treesit-font-lock-setting)
    (treesit-major-mode-setup)))

;;;###autoload
(add-to-list 'auto-mode-alist '("\\.cabal\\'" . my-cabal-mode))
Navigation
One of the reasons I want to experiment with tree-sitter is to use it for code
navigation. My first attempt is to translate haskell-cabal-section-beginning
(in haskell-mode, the source) to using tree-sitter. First a convenience
function to recognise if a node is a section or not
(defun cabal--node-is-section-p (n)
  "Predicate to check if treesit node N is a Cabal section."
  (member (treesit-node-type n)
          '("benchmark" "common" "executable" "flag" "library" "test_suite")))
That makes it possible to use treesit-parent-until to traverse the nodes until
hitting a section node
(defun cabal-goto-beginning-of-section ()
  "Go to the beginning of the current section."
  (interactive)
  (when-let* ((node-at-point (treesit-node-at (point)))
              (section-node (treesit-parent-until node-at-point #'cabal--node-is-section-p))
              (start-pos (treesit-node-start section-node)))
    (goto-char start-pos)))
And the companion function, to go to the end of a section, is very similar
(defun cabal-goto-end-of-section ()
  "Go to the end of the current section."
  (interactive)
  (when-let* ((node-at-point (treesit-node-at (point)))
              (section-node (treesit-parent-until node-at-point #'cabal--node-is-section-p))
              (end-pos (treesit-node-end section-node)))
    (goto-char end-pos)))
Footnotes:
[1] I'm using straight.el and use-package in my setup, but hopefully the
snippets can easily be converted to other ways of configuring Emacs.