Why not use HTTP directly? #86
Comments
A server implementation can still use HTTP; the protocol does not prevent HTTP from being used as the transport. However, explicitly limiting the transport to HTTP would prevent servers from using transports such as sockets or named pipes, which are the ones most commonly used by implementations today. |
HTTP could still be used over named pipes and sockets, FWIW |
+1 for this. When I had the opportunity to ask the VSCode team about it, they said that the main reason for not using HTTP is performance (which is a major concern for a language server). Opening up to HTTP(S) would allow language servers to be provided as SaaS that clients could connect to. Although this is not very relevant for some operations such as completion, I see a huge opportunity for remote "build servers" to run builds or other complex analysis and report back diagnostics and remediation after some delay. However, I agree with Gorkem that the protocol shouldn't be restricted to one transport and should still allow stdio, sockets and named pipes, as those are the ones that work best for immediate local integration. It should also open the door to HTTP, to more explicitly allow SaaS and user authentication. |
The language server protocol is basically independent of the transport, so from the protocol's point of view there is nothing that limits it to a particular transport. In reality, the json-rpc library we use already has two different kinds of transports (https://github.com/Microsoft/vscode-languageserver-node/blob/master/jsonrpc/src/messageReader.ts#L248). The first one is over streams using the described base protocol. The second is Node IPC, which doesn't have any header information; it simply sends and receives JSON objects. Expanding this to HTTP should be straightforward, including the client library. What needs to be done is to update the documentation. Now that I think of it, we should move the base protocol to https://github.com/Microsoft/vscode-languageserver-node and simply say here that different transports are possible. |
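For readers unfamiliar with the stream-based base protocol mentioned above, here is a minimal Python sketch of the framing: a `Content-Length` header, a blank line, then the JSON body. The function names `encode_message` and `read_message` are made up for this illustration and are not taken from vscode-languageserver-node; the sketch also ignores any header other than `Content-Length`.

```python
import json
from typing import Optional


def encode_message(payload: dict) -> bytes:
    """Frame one JSON-RPC message with a Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    # The length counts bytes of the encoded body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def read_message(stream) -> Optional[dict]:
    """Read one framed message from a binary stream, or None at end of stream."""
    content_length = None
    while True:
        line = stream.readline()
        if not line:
            return None  # stream closed before a complete header part arrived
        if line == b"\r\n":
            break  # the blank line ends the header part
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    body = stream.read(content_length)
    return json.loads(body.decode("utf-8"))
```

Because `read_message` only needs `readline()` and `read(n)`, it should work on anything that behaves like a binary file object, for example `sys.stdin.buffer` or a socket wrapped with `makefile("rb")`.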
I agree with @dbaeumer here: LSP is independent of the transport, and it wouldn't be good to try to specify the transport at the LSP level, since they are two different concerns, two different layers of abstraction. If anything, the header part of the LSP ("Content-Length") seems like it could be simplified and removed. Just sending the JSON-RPC objects through the stream should be enough, and from what you say, you're already doing this in an internal case for VSCode. |
I would also tend to think that removing the 'Content-Length' requirement is actually the better solution, if that is possible. In that case, we would be specifying just JSON-RPC with a configurable transport layer, so the whole:
Can be removed? |
I don't think this makes much sense. HTTP always executes an HTTP method on a resource. What would these be? |
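For context, one common convention for JSON-RPC over HTTP (not something this thread settles on) is to POST every request to a single endpoint and return the JSON-RPC response in the HTTP response body. The Python sketch below assumes that convention. The `/lsp` path, the `JsonRpcHandler` class, the `call` helper, and the echo behaviour are all hypothetical, and the sketch does not address server-initiated messages.

```python
import http.client
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class JsonRpcHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: every JSON-RPC request is a POST to one resource."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length).decode("utf-8"))
        # Placeholder behaviour for the sketch: echo the method name as the result.
        response = {"jsonrpc": "2.0", "id": request.get("id"), "result": request["method"]}
        body = json.dumps(response).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call(host: str, port: int, method: str, request_id: int = 1) -> dict:
    """Send one JSON-RPC request as an HTTP POST and return the decoded response."""
    payload = json.dumps({"jsonrpc": "2.0", "id": request_id, "method": method}).encode("utf-8")
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", "/lsp", body=payload, headers={"Content-Type": "application/json"})
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()
```

To try it, start `HTTPServer(("127.0.0.1", 0), JsonRpcHandler)` in a background thread and pass its `server_port` to `call`.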
@bruno-medeiros If you remove that header, how would you know how much of the stream to consume for the next message? JSON parsing the stream on the fly would be complicated, fault-intolerant, and would prevent concurrent parsing/processing of messages. The protocol doesn't need all the metadata overhead of HTTP, but it is useful to have a content length header, as is the case with many protocols. |
@masaeedu huh? Any JSON parsing library worth its salt should be able to parse a JSON object from a stream of characters/bytes (and do it again on the same stream). It should not be complicated, but rather trivial, at least for languages that have a built-in concept of a stream of bytes/characters. As for fault tolerance, so far the protocol assumes the connection is made through a stream of bytes (either TCP, or stdin/stdout) that guarantees sequential delivery of bytes until the connection is closed or dropped abnormally. But if that happens the connection is dead anyway; you can't recover other than by reconnecting, and it makes no difference whatsoever whether you're using a Content-Length header or not. Also, as for "preventing concurrent parsing/processing of messages": only the JSON parsing can no longer be concurrent with this change, true. But the actual method execution can still be concurrent. In any case, I'm not advocating either way that the Content-Length header be dropped, I just pointed out it could be done. |
That is so totally not the case. In what language is it trivial to parse a stream of JSON? Not even in JavaScript. A Content-Length header makes a lot of sense. You cannot use |
@bruno-medeiros Any JSON library worth its salt can parse a single JSON object from a stream of bytes, provided an encoding. You can't parse a stream of undelimited JSON objects, especially not when you are expected to ignore malformed data and can have different encodings for each object. It is not impossible to do, but many JSON libraries won't have it out of the box.
Fault tolerance isn't just about dealing with dropped connections; you need to be able to deal with invalid data without crashing the language server or flushing all the other messages in your input stream. If I send
Glad we agree re: third point. With the volume of data you can expect in a language server, I think parallelizing deserialization is going to be important for performance. |
Java with GSON library. I'm sure there is GSON for Go as well. With Rust you have https://github.com/netvl/xml-rs (streaming API), and I'd bet there is something similar for D. I don't know much about dynamic languages like Javascript cause I don't like dynamic typing, however a quick google search pointed out this library for Javascript: http://oboejs.com/ , so there you go. |
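As a point of reference for the "stream of undelimited JSON objects" question, Python's standard `json` module can decode back-to-back values with `JSONDecoder.raw_decode`. This is only a sketch under a big assumption: the text is already fully buffered in memory. It raises on the first malformed or incomplete value, so it illustrates the mechanism rather than settling the fault-tolerance argument. The function name `split_concatenated_json` is invented for the example.

```python
import json


def split_concatenated_json(text: str) -> list:
    """Decode back-to-back JSON values from one string, with no delimiter required."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the value and the index where that value ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values
```

For example, `split_concatenated_json('{"id": 1}{"id": 2} [3]')` yields two dicts and a list, while a truncated value such as `'{"id": 1}{"id"'` raises `json.JSONDecodeError` instead of waiting for more input.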
JSON is JavaScript Object Notation and has seen most of its adoption because it is built-in into JavaScript with |
But it also isn't very useful to use streaming here. You cannot handle the response anyway until you have the whole object. |
You can't have different encoding for different objects ATM, only UTF8. See https://github.com/Microsoft/language-server-protocol/blob/master/protocol.md#content-part
If your client (or server) can't even send valid JSON, you're in big trouble already. Also, if the endpoint is so buggy it can't even guarantee to send valid JSON, who is to say it will guarantee the header fields will be valid as well? The Content-Length could be invalid as well... |
wuuut...? 😲 @felixfbecker Do you think most LSP clients and servers are going to be using JavaScript at all? Not to be offensive, but that is laughable. Nearly every server is being written in the language it serves (Java LS in Java, Rust LS in Rust, Go LS in Go, etc. No JavaScript). As for clients, it's only VS Code that is using JavaScript, as far as I know. Eclipse Che and Eclipse desktop are using a Java JSON library (GSON, actually). IntelliJ is likely to do the same, if it isn't already. Other clients like Vim, Emacs, XCode, Sublime, Visual Studio (not Code) don't use JavaScript AFAIK, but rather C/C++, Objective-C, Swift, LISP/Scheme, Python and whatnot, and are likely to use existing, well-established JSON parsing libraries. |
It doesn't matter whether this is the client's fault, the ISP's fault, or the network card's fault. Fault tolerance is about increasing the number of scenarios where you can recover from failure. If you have a header, you can deal with invalid message bodies, which improves fault tolerance.
If the content length header is invalid, you can tell immediately, because you will not be able to parse out the subsequent header in the middle of a message body. If the body is invalid (and there is no header), there is no way to tell, and the parser is going to continue muddling on until hopefully you get a parse error. Until you do encounter a parse error, the server is going to appear to be stuck. The worst possible outcome would be that you don't receive any error, and the server simply does something incorrect and carries on. This is especially likely to be a problem over connectionless transports. Either way, we're down to only one of the three reasons I provided that you still object to, so I think there is a strong case for keeping the header. |
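The recovery argument above can be sketched in Python as follows. The reader reports an invalid body and carries on with the next message, because the header already said where the body ends. It assumes each message carries exactly one header line (`Content-Length`) and that the stated length is correct; `read_messages` is a made-up name for this illustration.

```python
import json


def read_messages(stream):
    """Yield ("ok", message) or ("error", reason) for each framed message."""
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of stream
        name, _, value = header.partition(b":")
        try:
            if name.strip().lower() != b"content-length":
                raise ValueError("expected a Content-Length header")
            length = int(value.decode("ascii").strip())
        except ValueError as exc:
            # Framing is lost at this point; this sketch simply gives up.
            yield ("error", f"bad header: {exc}")
            return
        stream.readline()  # consume the blank line that ends the header part
        body = stream.read(length)
        try:
            yield ("ok", json.loads(body.decode("utf-8")))
        except ValueError as exc:
            # The header says where this body ends, so the next message is intact.
            yield ("error", f"bad body: {exc}")
```

Feeding it three framed bodies where the middle one is truncated JSON should produce an "ok", an "error", and another "ok", rather than a parser that stalls or swallows the third message.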
This whole argument is just pointless. There are so many things that speak against removing the |
I don't object to anything. In a comment of mine up ahead I already said: "In any case, I'm not advocating either way that the Content-Length header be dropped, I just pointed out it could be done."
Well, I do agree by now the argument is pointless, because now we are getting into territory of subjectiveness and personal preference. Let's just say I don't like dynamic languages, and specifically don't like |
IME, the Content-Length header is a responsibility of the transport layer, and not the protocol itself. There will likely be something similar at the transport layer, but it does not need to be directly part of the protocol itself. If it turns out to be the most efficient way of transporting data, then by all means give it a name (LSTP) and use it; I just vote for not enforcing it at the language server protocol level. |
No, the transport layer is, for example, TCP. The Content-Length header is part of the application layer, just like in HTTP. https://en.wikipedia.org/wiki/OSI_model |
Sorry about the miswording. Essentially, I think the application layer should not be enforced by ms-language-serverprotocol. If one wanted to use HTTP/1.1, for example, they should be able to. In the current documentation they cannot, because the header is strictly specified. I think the focus should be on the RPC protocol; leave the application layer up to the implementer. |
I would propose the following
Besides specifying the protocol, we should agree on a set of transports. I see: stdio, socket/pipes, ... The minimal combination a server has to support is [stdio, LSP-RPC]. @aeschli FYI. |
To summarise, please do correct me if I am wrong: The protocol is not bound to the transport. One could even implement the protocol using UDP. Did you intentionally lay out the points in the above comment in that order, @dbaeumer? I feel this is becoming a bit confusing at the moment. If people are not sure they should ask questions, e.g., #8 WebSocket is not TCP. |
One important aspect: On Windows, STDIO is always blocking. For this reason the PHP LS will connect to a TCP server, when passed the argument |
Hum, what do you mean by this, exactly? |
@bruno-medeiros In PHP on Windows, a See discussion |
(Ah, single-threading, now I get it. Yeah, it would be a problem in that case.) |
Yes! The simplest possible transport is to send JSON in JSON object per line format (that is, to require no internal Pro:
Contra:
|
If the stream is UTF-8 encoded, you could also use an out-of-band byte like 0xff (which will never appear in a UTF-8 stream) as your delimiter, so you don't need to worry about escaping anything in your payload. |
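A small Python sketch of the out-of-band delimiter idea, assuming every payload is valid UTF-8 (the byte 0xff cannot occur in valid UTF-8, so no escaping is needed). The names `DELIMITER`, `join_messages` and `split_messages` are invented for the example, and it works on a complete buffer rather than an incremental stream.

```python
import json

DELIMITER = b"\xff"  # this byte never occurs in valid UTF-8


def join_messages(messages) -> bytes:
    """Encode each message as UTF-8 JSON and terminate it with the delimiter byte."""
    return b"".join(
        json.dumps(message, ensure_ascii=False).encode("utf-8") + DELIMITER
        for message in messages
    )


def split_messages(data: bytes) -> list:
    """Split a buffer on the delimiter byte and decode each non-empty chunk."""
    return [
        json.loads(chunk.decode("utf-8"))
        for chunk in data.split(DELIMITER)
        if chunk
    ]
```

Note that the character `ÿ` (U+00FF) is not a problem: in UTF-8 it is encoded as the two bytes 0xC3 0xBF, so it cannot be confused with the delimiter.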
Open item is to specify which minimal command line flags should be supported. PR still welcome. |
If a raw TCP socket is preferable, then why not use something like Google protobuf or FlatBuffers (lazy source interop language codegen) and then perform simple message-length delimiting with its raw buffer? |
It actually has a specification: http://ndjson.org/ :) |
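For illustration, a minimal Python sketch of the one-JSON-object-per-line format. It relies on the fact that `json.dumps` without `indent` produces a single line and escapes newlines inside strings as `\n`. The names `write_ndjson` and `read_ndjson` are made up here, and the streams are assumed to be text-mode file objects.

```python
import json


def write_ndjson(stream, message) -> None:
    """Write one message as a single line of JSON followed by a newline."""
    # json.dumps emits no raw newlines; newlines inside strings are escaped.
    stream.write(json.dumps(message) + "\n")


def read_ndjson(stream):
    """Yield one decoded message per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)
```

A message whose string values contain line breaks still occupies exactly one line on the wire, so the reader can frame messages by splitting on newlines alone.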
Btw, was it ever considered to use WebSockets instead? That's probably available in every language too, and would even allow language servers on the web. |
I don't think LSP really says anything about what you should/could use to establish a connection that can send bytes back and forth. You can use stdin/stdout, TCP sockets, WebSockets, or whatever else you want. The protocol only talks about what gets sent over the wire once you have already established a bi-directional channel between client and server. |
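To make the "any bi-directional byte channel" point concrete, here is a Python sketch in which the same two functions run unchanged over an OS pipe and over a socket pair. The helper names (`send`, `receive`, `roundtrip_over_pipe`, `roundtrip_over_socketpair`) are hypothetical, the framing assumes a single `Content-Length` header, and error handling is omitted.

```python
import json
import os
import socket


def send(writer, message: dict) -> None:
    """Write one Content-Length framed message to a binary file-like object."""
    body = json.dumps(message).encode("utf-8")
    writer.write(b"Content-Length: %d\r\n\r\n" % len(body) + body)
    writer.flush()


def receive(reader) -> dict:
    """Read one Content-Length framed message from a binary file-like object."""
    length = 0
    # Header lines end at the first blank line (or at end of stream).
    while (line := reader.readline()) not in (b"\r\n", b""):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.decode("ascii").strip())
    return json.loads(reader.read(length).decode("utf-8"))


def roundtrip_over_pipe(message: dict) -> dict:
    """Send and receive one message through an anonymous OS pipe."""
    read_fd, write_fd = os.pipe()
    with os.fdopen(write_fd, "wb") as writer, os.fdopen(read_fd, "rb") as reader:
        send(writer, message)
        return receive(reader)


def roundtrip_over_socketpair(message: dict) -> dict:
    """Send and receive one message through a connected pair of sockets."""
    left, right = socket.socketpair()
    with left, right, left.makefile("wb") as writer, right.makefile("rb") as reader:
        send(writer, message)
        return receive(reader)
```

Neither `send` nor `receive` knows which kind of channel it is talking to; only the two `roundtrip_*` helpers differ.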
We should still specify the minimal command line flags |
I documented some standard command line arguments for 3.15 and will close the issue. The specification itself still allows running LSP over HTTP. |
Why not go the full route and support HTTP directly?
Benefits
language-server-protocol
parser. HTTP has plenty of robust, performant parsers already; we could leverage these.
Downfalls
More info
http://json-rpc.org/wiki/specification#a2.2JSON-RPCoverHTTP