Remove lsp4j on v1 branch #12

mainej · 2022-07-14T21:57:35Z

Replaces lsp4j with a Clojure implementation. The full list of dependencies is now clojure, core.async, cheshire, and camel-snake-kebab.

Also merges the lsp4clj-server and lsp4clj-protocol projects into one top-level project, lsp4clj.

clojure-lsp has an accompanying PR.

@Cyrik, as stated in the CHANGELOG, other users (you, I think) of lsp4clj are encouraged to upgrade. You can follow that clojure-lsp PR as guidance on the changes that'll need to be made. They're fairly extensive, but I believe will make lsp4clj a more stable platform for LSP development.

Notable changes:

As described in Replace lsp4j with internal communication logic #8, servers can choose which requests and notifications they respond to by implementing the lsp4clj.server/receive-request and lsp4clj.server/receive-notification multimethods. (In contrast to the protocols based approach we had before.) This gives servers much more flexibility. They can implement new parts of the spec at their leisure, or according to the needs of their language. They can also choose when and how to implement asynchronicity. The tradeoff is that much of the spec checking and logging moves back to clojure-lsp and other servers. The servers also have to know and use the names of the LSP requests/notifications. Personally, I find clojure-lsp.server code to be more readable now, even though it has a few more responsibilities.
Input maps are all coerced to kebab-case. The server receives whatever the client sent, which should follow the LSP spec.
Output specs still exist, but do only very lightly conformation. Of what conformation remains, most of it converts keywords to enum values. There are a few places where more complex conformation happens. Honestly, I'd like to remove almost all conformation. I think it will be easier if you can look up what to output based on the LSP spec, not the LSP spec, plus a bunch of reasoning about coercion.
It is much less code to initialize a server. None of it is Java interop.
We have a path to working on Support for more communication channels #1. We need code that converts a socket into a pair of input and output channels. Otherwise, much of the code can stay the same.

Fixes #8.

This implements a stdio server, which reads JSON-RPC messages off its :in, processes them, and sends responses/notifications to :out. There's still a lot of exception handling to do, and it needs a careful comparison to the spec.

Clients have two ways to shutdown. They can send the 'shutdown' request, or they can close their end of the server input. The shutdown is mostly for the language server to do cleanup, so now we do nothing. On exit, we simulate the server input closing, which blocks waits for messages to finish being processed, then closes the output, then unblocks.

…messages

I've been having a problem with the new code that wasn't happening in lsp4j. Here's what I _think_ is going on in lsp4j. (Java code is so hard for me to read.) First we createLauncher. This instantiates several objects: 1. A RemoteEndpoint. This is the thing that will processes messages. In ring terms it is the handler. 2. A MessageConsumer. This is a list of things that will all 'consume' messages. Each one does something with a message, then passes the message to the next consumer. Technically, the RemoteEndpoint is also a MessageConsumer. It's the first in the list. The last one writes JSON-RPC to the wire. Optionally, in between, you can have tracers or other consumers. In ring terms, this is a stack of middleware that handles a request and transforms the response in several ways. 3. A StreamMessageProducer. This is the thing that will read JSON-RPC off the wire, convert it to Java obects, and call the MessageConsumer list. In ring terms, it's middleware that transforms the request before calling the handler. 4. A ConcurrentMessageProcessor. This is the thing that will connect the producer to the consumer list. Then we ask the launcher to startListening. This starts a thread in which the ConcurrentMessageProcessor will: 1. Ask the producer to start a loop that reads messages. 2. Pipe those messages to the consumer stack. Despite using newCachedThreadPool, (and despite the name "Concurrent") only [one thread is created](https://github.com/eclipse/lsp4j/blob/6e20087ec9e6201a6cdabf746c7233fe3c87abbc/org.eclipse.lsp4j.jsonrpc/src/main/java/org/eclipse/lsp4j/jsonrpc/json/ConcurrentMessageProcessor.java#L105-L108) to read messages. So, one background thread reads messages from the client, and forwards them to the main thread. In the main thread, lsp4j receives requests and notifications from the client, processes them (by forwarding them to clojure-lsp) and replies. On the clojure-lsp side, we start many threads and so it's possible for many of them to ask lsp4j to send messages to the client—for example, by calling publishDiagnostics—all at the same time. publishDiagnostics doesn't block waiting for the client to respond. But calling that method will result in an outbound message. All outbound messages, whether responses to client requests or server-generated requests and notifications, all go to a single object. (There's some messiness with Service objects and reflection in here that I don't really understand.) In any case, as it's writing JSON output, that object [synchronizes](https://github.com/eclipse/lsp4j/blob/6e20087ec9e6201a6cdabf746c7233fe3c87abbc/org.eclipse.lsp4j.jsonrpc/src/main/java/org/eclipse/lsp4j/jsonrpc/json/StreamMessageConsumer.java#L66) its writes across threads. I'm not sure how big the buffer is between the server's output and the client's input, but if the client is slow to read messages, the buffer will fill up and the server won't be able to write. So, effectively you can only write messages as fast as lsp4j can put them on the output stream. If the server-side writer can't keep up or the client-side reader is slow, publishDiagnostics will block waiting for other threads to finish writing. In the new architecture, output is modeled a bit differently. There's a channel which gets messages put on it. A go loop is reading messages off the channel, one-by-one, serializing them to JSON and writing them to the output stream. Now we're in a situation where that go block, like the synchronized lsp4j writer before it, can only write messages as fast as the client is willing to read them. If clojure-lsp creates thousands of messages to `put!` on the chan, and the go block doesn't drain the put queue fast enough, you end up with the exception about having too many pending puts. Unfortunately, I've already seen this exception in the wild. When metabase starts up, we publish thousands of diagnostics messages, one for each of the files in that very large project. The server-side writer is slow (possibly because the client-side reader is slow) and so we get the pending puts exception. I think we are _just_ over the limit. The put queue can only contain 1024 messages, but metabase has 1084 Clojure files. After some experimentation I have one way to fix it, which is this patch. In lsp4clj, in `send-notification`, I can use `>!!` instead of `put!` so that if the go block gets backed up, we block further notifications. (I tried `(go (>! output msg))` but that causes the same pending puts exception, which the core.async wiki [suggests will be the case](https://github.com/clojure/core.async/wiki/Go-Block-Best-Practices#general-advice).) After changing `send-notification` to use `>!!`, `spawn-async-tasks` can no longer use `go-loop` (because you aren't supposed to use blocking operations inside go blocks for fear of depleting all the threads that process the go blocks). So, on the clojure-lsp side, `spawn-async-tasks` has to be switched to `(thread (loop []))`. Maybe that's fine. It dedicates 4 threads to the 4 async tasks, but that's not terrible. This works, and I'd be happy with it as a long-term solution. It may even be correct... we are, after all, doing blocking io. Perhaps in a sense, we've essentially discovered a subtle bug. All along we've been doing blocking io inside a go block. We probably haven't ever depleted all the go threads, but it's been possible... and now, it's fixed! But maybe there's a way where it can all be solved on the lsp4clj side, without changing `spawn-async-tasks`. I'll think about that more.

Clients generally expect that their messages will be received and processed in order. If lsp4clj proccesses messsages in parallel, then that ordering gets lost. The client might send `textDocument/definition` followed by `textDocument/rename`, but the server could receive those messages in the opposite order. There is some flexibility in the spec about message ordering [spec-ordering]. Servers can introduce their own parallelism, but they then have to take responsibility for the consequences. For example clojure-lsp processes `textDocument/didChange` asynchronously, and then needs `process-after-changes` to restore ordering. But, lsp4clj should not force parallelism on every request, for every server. [spec-ordering]: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#messageOrdering)

To avoid corrupting the output stream.

mainej · 2022-07-14T23:02:29Z

And @Cyrik—I meant to mention... when you're upgrading if you'd like a little direction, I'd be happy to chat. Perhaps we can turn our discussion into a Usage section in the README.

mainej · 2022-07-15T15:49:43Z

Sorry for the noise, but I forgot our plan for how to deploy this. I'm going to merge this into a v1 branch in lsp2clj, to get it out of my personal fork, and so that the clojure-lsp branch can depend on a :git/sha until we have a release. I'll open another PR (essentially a duplicate of this one) asking to merge v1 into master. We can do that merge after testing.

Cyrik · 2022-07-30T03:39:28Z

Thanks for the great work @mainej. I've been on a business trip but I'll try to take a look at this next week.

mainej added 30 commits July 14, 2022 12:21

First pass at removing lsp4j

607a2fe

This implements a stdio server, which reads JSON-RPC messages off its :in, processes them, and sends responses/notifications to :out. There's still a lot of exception handling to do, and it needs a careful comparison to the spec.

Share liveness probe

1dc0dac

Properly handle character encoding

7101970

Extract JSON-RPC message helpers

548fd5e

Thread context through handlers

f51c577

Allow nil responses

229def7

Trace method

82b34b0

Avoid accidental conversion to map

c6cbc76

Simplify client usage of error codes

f94eceb

Content-Length is in bytes not chars

cbf1610

Rename

4d34f0f

Record plans

4b122fe

Make requests cancellable

b20d0a5

Docs and comments

6521c7f

Be more careful to only cancel request once

ac6579b

Only block 10 seconds for close server to finish processing previous …

9567155

…messages

Preserve case of keys that are already strings

a037776

Don't add empty data to error messages

dcd6e42

Remove alpha namespaces

6d8929a

Remove deps on lsp4j

e9d18c0

Revert arg order

170a651

Language servers don't have to require endpoint protocol

ce960f9

Rename and improve docs

63a8ce0

Include id in tracing

be9d336

Hold more data on pending-request object

bd18fcf

Close input on exit

1eb7279

Read and write in threads, to avoid blocking go threads

01b25d1

Improve trace messages

61a2dc7

mainej added 17 commits July 14, 2022 12:24

Reduce knowledge of log channel format

9b46f29

Move the logger protocol back to clojure-lsp

f0f0b58

Merge server and protocols subprojects

bef2b0f

Conform message only once

0534c09

Handle parse error and invalid request the same

4f1d4a4

Cutomize input keywordization function

bfe7cc1

Close owned channels

7f2be73

Fix comment

c206597

Remove unused protocol method

255f1d2

Clean up and fix tests

9865b1b

Discard stdout while running server code

ac2712d

To avoid corrupting the output stream.

Clean up tests

e48f55d

Simplify let

0d97376

Add comment

38779d3

Prep for 1.0.0 release

5d1c339

Changelog

772f669

mainej requested review from snoe and ericdallo July 14, 2022 21:57

mainej mentioned this pull request Jul 14, 2022

Remove dependency on lsp4j clojure-lsp/clojure-lsp#1117

Merged

2 tasks

Fix typo

786837e

mainej added 2 commits July 14, 2022 17:21

Document usage in README

423a8a4

Ignore $/cancelRequest notifications

9aefcf6

mainej changed the base branch from master to v1 July 15, 2022 15:41

mainej merged commit b6038ce into clojure-lsp:v1 Jul 15, 2022

mainej deleted the lsp2clj branch July 15, 2022 15:51

mainej changed the title ~~Remove lsp4j~~ Remove lsp4j on v1 branch Aug 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove lsp4j on v1 branch #12

Remove lsp4j on v1 branch #12

mainej commented Jul 14, 2022 •

edited

Loading

mainej commented Jul 14, 2022

mainej commented Jul 15, 2022

Cyrik commented Jul 30, 2022

Remove lsp4j on v1 branch #12

Remove lsp4j on v1 branch #12

Conversation

mainej commented Jul 14, 2022 • edited Loading

mainej commented Jul 14, 2022

mainej commented Jul 15, 2022

Cyrik commented Jul 30, 2022

mainej commented Jul 14, 2022 •

edited

Loading