Trying eglot, again
I've been using lsp-mode since I switched to Emacs several years ago. When eglot
made it into Emacs core I tried it very briefly but quickly switched back. Mainly I
found eglot a bit too bare-bones; I liked some of the bells and whistles of
lsp-ui. Fast-forward a few years and I've grown a bit tired of those bells and
whistles. Specifically, it's difficult to make lsp-ui-sideline and
lsp-ui-doc work well together. The lsp-ui-sideline text is shown on the right side,
which is good, but combining it with lsp-ui-doc leads to situations where the
popup covers the sideline. What I've done so far is recentre the line to bring the
sideline text back into view. I also played a little with making the setting of
lsp-ui-doc-position change depending on the position of point.
It didn't work that well though, so I decided to find a simpler setup.
Instead of simplifying my lsp-mode setup I thought I'd give eglot
another shot.
Basic setup
I removed the statements pulling in lsp-mode, lsp-ui, and all
language-specific packages like lsp-haskell. Then I added this to configure
eglot
(use-package eglot
  :ensure nil
  :custom
  (eglot-autoshutdown t)
  (eglot-confirm-server-edits '((eglot-rename . nil)
                                (t . diff))))
The rest was mainly just switching lsp-mode functions for eglot functions.
| lsp-mode function | eglot function |
|---|---|
| lsp-deferred | eglot-ensure |
| lsp-describe-thing-at-point | eldoc |
| lsp-execute-code-action | eglot-code-actions |
| lsp-find-type-definition | eglot-find-typeDefinition |
| lsp-format-buffer | eglot-format-buffer |
| lsp-format-region | eglot-format |
| lsp-organize-imports | eglot-code-action-organize-imports |
| lsp-rename | eglot-rename |
| lsp-workspace-restart | eglot-reconnect |
| lsp-workspace-shutdown | eglot-shutdown |
I haven't verified that the list is fully correct yet, but it looks good so far.
The one thing I might miss is lenses, and using lsp-avy-lens. However,
everything that I use lenses for can be done using actions, and to be honest I
don't think I'll miss the huge lens texts from missing type annotations in
Haskell.
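In my config the switch was mostly a matter of swapping the functions bound to
keys. A minimal sketch of what that looks like; the keybindings below are made
up for illustration, but eglot-mode-map is where eglot keeps its buffer-local
bindings:

(with-eval-after-load 'eglot
  ;; illustrative keys only, not necessarily sensible choices
  (define-key eglot-mode-map (kbd "C-c l a") #'eglot-code-actions)
  (define-key eglot-mode-map (kbd "C-c l r") #'eglot-rename)
  (define-key eglot-mode-map (kbd "C-c l f") #'eglot-format-buffer))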
Configuration
One good thing about lsp-mode's use of language-specific packages is that
configuration of the various servers is done through custom variables. This makes
it easy to discover what options are available, though it also means that not
every server option may be exposed. In eglot, configuration is less organised: I
have to find out what options each language server accepts and put them into
eglot-workspace-configuration myself. It's not always easy to track down what
options are available, and I've found no easy way to verify the settings. For
instance, with lsp-mode I configured HLS like this
(lsp-haskell-formatting-provider "fourmolu")
(lsp-haskell-plugin-stan-global-on nil)
which translates to this for eglot
(setq-default eglot-workspace-configuration
              (plist-put eglot-workspace-configuration
                         :haskell
                         '(:formattingProvider "fourmolu"
                           :plugin (:stan (:globalOn :json-false)))))
and I can verify that this configuration has taken effect because I know enough about the Haskell tools.
I do some development in Python and I used to configure pylsp like this
(lsp-pylsp-plugins-mypy-enabled t)
(lsp-pylsp-plugins-ruff-enabled t)
which I think translates to this for eglot
(setq-default eglot-workspace-configuration
              (plist-put eglot-workspace-configuration
                         :pylsp
                         '(:plugins (:ruff (:enabled t)
                                     :mypy (:enabled t)))))
but I don't know of any convenient way of verifying these settings. I'm simply not
familiar enough with the Python tools. I can check the value of
eglot-workspace-configuration by inspecting it or by calling
eglot-show-workspace-configuration, but is there really no way of asking the
language server for its active configuration?
Closing remark
The last time I gave up on eglot very quickly, probably too quickly to be
honest. I made these changes to my configuration over the weekend, so the real
test of eglot starts when I'm back in the office. I have a feeling I'll stick
with it longer this time.
Validation of data in a servant server
I've been playing around with adding more validation of data received by an HTTP
endpoint in a servant server. Defining a type with a FromJSON instance is very
easy: just derive a Generic instance and it works. Here's a simple
example
data Person = Person
    { name :: Text
    , age :: Int
    , occupation :: Occupation
    }
    deriving (Generic, Show)
    deriving (FromJSON, ToJSON) via (Generically Person)

data Occupation = UnderAge | Student | Unemployed | SelfEmployed | Retired | Occupation Text
    deriving (Eq, Generic, Ord, Show)
    deriving (FromJSON, ToJSON) via (Generically Occupation)
However, the validation is rather limited; basically it just checks that
each field is present and has the correct type. For the type above I'd like to
enforce some constraints on the combination of age and occupation.
The steps I thought of are
- Hide the default constructor and define a smart one. (This is the standard suggestion for placing extra constraints on values.)
- Manually define the FromJSON instance, using the Generic instance to limit the amount of code and the smart constructor to enforce the constraints.
The smart constructor
I give the constructor the result type Either String Person to make sure it's
usable both in ordinary code and when defining parseJSON.
mkPerson :: Text -> Int -> Occupation -> Either String Person
mkPerson name age occupation = do
    guardE mustBeUnderAge
    guardE notUnderAge
    guardE tooOldToBeStudent
    guardE mustBeRetired
    pure $ Person name age occupation
  where
    guardE (pred, err) = when pred $ Left err
    mustBeUnderAge = (age < 8 && occupation > UnderAge, "too young for occupation")
    notUnderAge = (age > 15 && occupation == UnderAge, "too old to be under age")
    tooOldToBeStudent = (age > 45 && occupation == Student, "too old to be a student")
    mustBeRetired = (age > 65 && occupation /= Retired, "too old to not be retired")
Here I'm making use of Either e being a Monad, using when to apply the
constraints and to make sure the reason for a failure is passed to the caller.
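A quick illustration of the constructor's behaviour in GHCi (the output is
indicative, relying on the derived Show instance):

λ> mkPerson "Ann" 70 Student
Left "too old to be a student"
λ> mkPerson "Ann" 70 Retired
Right (Person {name = "Ann", age = 70, occupation = Retired})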
The FromJSON instance
When defining the instance I take advantage of the Generic instance to make
the implementation short and simple.
instance FromJSON Person where
    parseJSON v = do
        Person{name, age, occupation} <- genericParseJSON defaultOptions v
        either fail pure $ mkPerson name age occupation
If there are many more fields in the type I'd consider using RecordWildCards.
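A minimal sketch of that variant, assuming the RecordWildCards extension is
enabled; the instance is otherwise unchanged:

instance FromJSON Person where
    parseJSON v = do
        -- bring all of Person's fields into scope at once
        Person{..} <- genericParseJSON defaultOptions v
        either fail pure $ mkPerson name age occupation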
Conclusion
No, it's nothing ground-breaking, but I think it's a fairly nice example of how things can fit together in Haskell.
Making a theme based on modus
In modus-themes 5.0.0 Prot introduced a structured way to build a theme based
on modus. Just a few days ago he released version 5.1.0 with some improvements
in this area.
The official documentation of how to build on top of the Modus themes is very good. It's focused on how to make sure your theme fits in with the rest of the "modus universe". However, after reading it I still didn't have a good idea of how to get started with my own theme. In case others feel the same way I thought I'd write down how I ended up getting started.
The resulting theme, modus-catppuccin, can be found here.
A little background
I read about how to create a catppuccin-mocha theme using modus-vivendi through
modus' mechanism of overrides. On Reddit someone pointed out that Prot had been
working on basing themes on modus and when I checked the state of it he'd just
released version 5.0.0. Since I'm using catppuccin themes for pretty much all
software with a GUI I thought it could be interesting to see if I could make a
modus-based catppuccin theme to replace my use of catppuccin-theme.
I'm writing the rest as if it was a straight and easy journey. It wasn't! I made a few false starts, each time realising something new about the structure and starting over with a better idea.
Finding a starting point
When reading what Prot had written about modus-themes in general, and about
how to create themes based on it in particular, I found that he's ported both
standard-themes and ef-themes so they are now based on modus. Instead of
just using them for inspiration I decided that since standard-themes is so
small I might as well use it as my starting point.
Starting
I copied all files of standard-themes to an empty git repository, then I
- deleted all but one of the theme files
- copied the remaining theme file so I had four in total (one for each of the catppuccin flavours)
- renamed constants, variables, and functions so they would match the theme and its flavours
- put the colours into each catppuccin-<flavour>-palette
- emptied the common palette mappings, modus-catppuccin-common-palette-mappings
- made sure that my use of modus-themes-theme was reasonable, in particular the base palette (I based the light flavour on modus-operandi and the three dark flavours on modus-vivendi)
The result can be seen here.
At this point the four theme flavours contained no relevant mappings of their
own, so what I had was in practice modus-operandi under a new name and
modus-vivendi under three new names.
Adding mappings for catppuccin
By organising the theme flavours in the way outlined above I only need to add
mappings to modus-catppuccin-common-palette-mappings because
- each flavour-specific palette defines its colours under the same names (that's how catppuccin organises its colours too, as seen here)
- each flavour-specific mapping is combined with the common one
- any missing mapping is picked up by the underlying theme, modus-operandi or modus-vivendi, so there will be (somewhat) nice colours for everything
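To make that concrete, here's a rough sketch of how a flavour palette and the
common mappings relate. The entries follow the (name value) shape modus uses,
but the exact entries here are illustrative (with colour values from catppuccin
mocha) rather than copied from the theme:

;; each flavour palette defines the same colour names
(defconst catppuccin-mocha-palette
  '((base "#1e1e2e")
    (text "#cdd6f4")
    (mauve "#cba6f7")))

;; the common mappings refer to those names, so one set of mappings
;; works for all four flavours
(defconst modus-catppuccin-common-palette-mappings
  '((bg-main base)
    (fg-main text)
    (cursor mauve)))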
I started out with the mappings in the dark standard theme, but then I realised
that's not the complete list of available mappings, so I started looking at the
themes in modus-themes itself.
Current state of modus-catppuccin
So far I've defined enough mappings to make it look sufficiently like catppuccin
for my use. There are a lot of possible mappings, so my plan is to add them over
time, using catppuccin-theme for inspiration.
Listing buffers by tab using consult and bufferlo
I've gotten into the habit of using tabs, via tab-bar, to organise my buffers
when I have multiple projects open at once. Each project has its own tab.
There's nothing fancy here (yet), I simply open a new tab manually before
opening a new project.
A while ago I added bufferlo to my config to help with getting consult-buffer
to organise buffers (somewhat) by tab. I copied the configuration from the
bufferlo README and started using it. It took me a little while to notice that
the behaviour wasn't quite what I wanted. It seemed like one buffer "leaked"
from another tab.
In the image above all files in ~/.emacs.d should be listed under Other
Buffers, but one has been brought over into the tab for the Sider project.
After a bit of experimenting I realised that
- the buffer that leaks is the one I'm in when creating the new tab, and
- my function for creating a new tab doesn't work the way I thought.
My function for creating a new tab looked like this
(lambda ()
  (interactive)
  (tab-new)
  (dashboard-open))
and it turns out that tab-new shows the current buffer in the new tab, which in
turn caused bufferlo to associate it with the wrong tab. From what I can see
there's no way to tell tab-new to open a specific buffer in the newly created
tab. I tried the following
(lambda ()
  (interactive)
  (with-current-buffer dashboard-buffer-name
    (tab-new)))
hoping that the dashboard would open in the new tab. It didn't; it was still the active buffer that popped up in the new tab.
In the end I resorted to using bufferlo-remove to simply remove the current
buffer from the new tab.
(lambda ()
  (interactive)
  (tab-new)
  (bufferlo-remove (current-buffer))
  (dashboard-open))
No more leakage and consult-buffer works like I wanted it to.
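As an aside, something I haven't tried: the variable tab-bar-new-tab-choice
controls what a new tab shows, so setting it might remove the need for the
bufferlo-remove dance. The buffer name below assumes dashboard's default:

;; untested: make new tabs start out showing the dashboard buffer
(setq tab-bar-new-tab-choice "*dashboard*")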
Reading Redis responses
When I began experimenting with writing a new Redis client package I decided to use lazy bytestrings, because:
- aeson seems to prefer it – the main encoding and decoding functions use lazy bytestrings, though there are strict variants too.
- the Builder type in bytestring produces lazy bytestrings.
At the time I was happy to see that attoparsec seemed to support strict and lazy bytestrings equally well.
To get on with things I also wrote the simplest function I could come up with
for sending and receiving data over the network – I used send and recv from
Network.Socket.ByteString.Lazy in network. The function was really simple
import Network.Socket.ByteString.Lazy qualified as SB

sendCmd :: Conn -> Command r -> IO (Result r)
sendCmd (Conn p) (Command k cmd) = withResource p $ \sock -> do
    _ <- SB.send sock $ toWireCmd cmd
    resp <- SB.recv sock 4096
    case decode resp of
        Left err -> pure $ Left $ RespError "decode" (TL.pack err)
        Right r -> pure $ k <$> fromWireResp cmd r
with decode defined like this
decode :: ByteString -> Either String Resp
decode = parseOnly resp
I knew I'd have to revisit this function; it was naïve to believe that a call to
recv would always result in a single complete response. It was however good
enough to get going. When I got to improving sendCmd I was a little surprised
to find that I'd also have to switch to using strict bytestrings in the parser.
Interlude on the Redis serialisation protocol (RESP3)
The Redis protocol has some defining attributes
- It's somewhat of a binary protocol. If you stick to keys and values that fall within the set of ASCII strings, then the protocol is human-readable and you can rather easily use netcat or telnet as a client. However, you aren't limited to storing only readable strings.
- It's somewhat of a request-response protocol. A notable exception is the publish-subscribe subset, but it's rather small and I reckon most Redis users don't use it.
- It's somewhat of a type-length-value style protocol. Some of the data types include their length in bytes, e.g. bulk strings and verbatim strings. Other types include the number of elements, e.g. arrays and maps. A large number of them have no length at all, e.g. simple strings, integers, and doubles.
I suspect there are good reasons for this design; I gather a lot of it has to do with speed. It does however cause one issue when writing a client: it's not possible to read a whole response without parsing it.
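To make the last point concrete, here's roughly what a few replies look like on the wire; these examples are mine, written from the RESP3 spec:

+OK\r\n                    simple string, no length
:42\r\n                    integer, no length
$5\r\nhello\r\n            bulk string, length in bytes
*2\r\n:1\r\n:2\r\n         array, length in elements
%1\r\n+key\r\n+val\r\n     map, length in entries

For the array and the map the only way to find where they end is to parse the elements one by one.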
Rewriting sendCmd
With that extra information about the RESP3 protocol the naïve implementation above falls short in a few ways
- The read buffer may contain more than one full message, and given the definition of decode above any remaining bytes are simply dropped.1
- The read buffer may contain less than one full message, and then decode will return an error.2
Surely this must be solvable, because in my mind running the parser results in one of three things:
- Parsing is done and the result is returned, together with any input that wasn't consumed.
- The parsing is not done due to lack of input; this is typically encoded as a continuation.
- The parsing failed so the error is returned, together with input that wasn't consumed.
So, I started looking in the documentation for the module
Data.Attoparsec.ByteString.Lazy in attoparsec. I was a little surprised to find
that the Result type lacked a way to feed more input to a parser – it only
has two constructors, Done and Fail:
data Result r
    = Fail ByteString [String] String
    | Done ByteString r
I'm guessing the idea is that the function producing the lazy bytestring in the
first place should be able to produce more chunks of data on demand. That's
likely what the lazy variant of recv does, but at the same time it also
requires choosing a maximum length and that doesn't rhyme with RESP3. The lazy
recv isn't quite lazy in the way I needed it to be.
When looking at the parser for strict bytestrings I calmed down. Its Result
type follows what I've learned about parsers (it's not defined exactly like
this; it's parameterised over its input, but for the sake of simplicity I show
it with ByteString as input):
data Result r
    = Fail ByteString [String] String
    | Partial (ByteString -> Result r)
    | Done ByteString r
Then to my delight I found that there's already a function for handling exactly my problem
parseWith :: Monad m => m ByteString -> Parser a -> ByteString -> m (Result a)
I only needed to rewrite the existing parser to work with strict bytestrings and
work out how to write a function using recv (for strict bytestrings) that
fulfils the requirements to be used as the first argument to parseWith. The
first part wasn't very difficult due to the similarity between attoparsec's
APIs for lazy and strict bytestrings. The second only had one complication. It
turns out recv is blocking, but of course that doesn't work well with
parseWith. I wrapped it in timeout based on the idea that timing out means
there's no more data and the parser should be given an empty string so it
finishes. I also decided to pass the parser as an argument, so I could use the
same function for receiving responses for individual commands as well as for
pipelines. The full receiving function is
import Data.Attoparsec.ByteString (IResult (..), Parser, parseWith)
import Data.ByteString qualified as BS
import Data.Text (Text)
import Data.Text qualified as T
import Network.Socket qualified as S
import Network.Socket.ByteString qualified as SB
import System.Timeout (timeout)

-- The initial bytestring is any input left over from a previous parse.
recvParse :: S.Socket -> BS.ByteString -> Parser r -> IO (Either Text (BS.ByteString, r))
recvParse sock initial parser =
    parseWith receive parser initial >>= \case
        Fail _ [] err -> pure $ Left (T.pack err)
        Fail _ ctxs err -> pure $ Left $ T.intercalate " > " (T.pack <$> ctxs) <> ": " <> T.pack err
        Partial _ -> pure $ Left "impossible error"
        Done rem result -> pure $ Right (rem, result)
  where
    -- a timeout is interpreted as "no more data": the parser is fed an
    -- empty string so it finishes
    receive =
        timeout 100_000 (SB.recv sock 4096) >>= \case
            Nothing -> pure BS.empty
            Just bs -> pure bs
Then I only needed to rewrite sendCmd, and I wanted to do it in such a way that
any remaining input data could be used by the next call to sendCmd.3 I
settled for modifying the Conn type to hold an IORef ByteString together
with the socket, and the function ended up looking like this
sendCmd :: Conn -> Command r -> IO (Result r)
sendCmd (Conn p) (Command k cmd) = withResource p $ \(sock, remRef) -> do
    _ <- SBL.send sock $ toWireCmd cmd
    rem <- readIORef remRef
    recvParse sock rem resp >>= \case
        Left err -> pure $ Left $ RespError "recv/parse" err
        Right (newRem, r) -> do
            writeIORef remRef newRem
            pure $ k <$> fromWireResp cmd r
What's next?
I've started looking into pub/sub, and basically all of the work described in this post is a prerequisite for that. It's not very difficult on the protocol level, but I think it's difficult to come up with a design that allows maximal flexibility. I'm not even sure it's worth the complexity.
Footnotes:
1. This isn't that much of a problem when sticking to the request-response commands, I think. It most certainly becomes a problem with pub/sub though.
2. I'm sure that whatever size of buffer I choose to use there'll be someone out there who's storing values that are larger. Then there's pipelining, which makes it even more of an issue.
3. To be honest I'm not totally convinced there'll ever be any remaining input. Unless a single Conn is used by several threads – which would lead to much pain with the current implementation – or pub/sub is used – which isn't supported yet.