Intellisense for notebooks (old way)
This page describes the old way that notebook intellisense worked. It's used if the python.pylanceLspNotebooksEnabled setting is false.
Once this experiment is pushed out to 100%, the following code could be eliminated:
- src/standalone/intellisense/fileBasedCancellationStrategy.node.ts - only used for custom pylance servers
- src/standalone/intellisense/fileBasedCancellationStrategy.node.ts - sets up the custom middleware pieces
- src/standalone/intellisense/languageServer.node.ts - starts the custom pylance servers
- vscode-python/src/client/activation/languageClientMiddleware.ts - creates a middleware that 'hides' requests for notebook cells.
Concatenation of the notebook cells
Intellisense in VS Code works by sending LSP requests to a separate process (well, in most cases; see this for more info).
Something like so:
Intellisense for notebooks works pretty much the same way but with each cell of a notebook being a text document:
This poses a problem for the language server (Pylance) because code from one cell can be referenced in another.
Example:
In that example, the pandas import crosses the cell boundary.
This means pylance cannot just analyze each cell individually.
The solution was to concatenate the cells in order.
This changes the original architecture to something like so:
Concatenation is mostly just a raw concat of all of the contents of each cell on top of each other. Then the concat document has functions to map back and forth between the original cells and the concatenated contents.
Code for this can be found here
Here's an example of using it:
public async provideReferences(
    document: vscode.TextDocument,
    position: vscode.Position,
    options: {
        includeDeclaration: boolean;
    },
    token: vscode.CancellationToken,
    _next: protocol.ProvideReferencesSignature
) {
    const client = this.getClient();
    if (this.shouldProvideIntellisense(document.uri) && client) {
        const documentId = this.asTextDocumentIdentifier(document);
        const newDoc = this.converter.toConcatDocument(documentId);
        const newPos = this.converter.toConcatPosition(documentId, position);
        const params: protocol.ReferenceParams = {
            textDocument: newDoc,
            position: newPos,
            context: {
                includeDeclaration: options.includeDeclaration
            }
        };
        const result = await client.sendRequest(protocolNode.ReferencesRequest.type, params, token);
        const notebookResults = this.converter.toNotebookLocations(result);
        return client.protocol2CodeConverter.asReferences(notebookResults);
    }
}
That is the handler for the references LSP request.
It is
- translating the incoming cell uri into a concat document
- translating the incoming cell position into a concat position
- sending the request using the concat data
- translating the results back into a cell uri
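The mapping between cell positions and concatenated positions can be illustrated with a minimal sketch. This is not the extension's converter: `ConcatSketch`, `CellSketch`, `toCellPosition` and the `cell-1`/`cell-2` URIs are hypothetical names made up for illustration, and the sketch assumes cells are simply joined with newlines and that only line numbers need shifting.

```typescript
// Hypothetical, simplified model of a concatenated notebook document.
// It is an illustration only, not the converter used by the extension.
interface PositionSketch {
    line: number;
    character: number;
}

interface CellSketch {
    uri: string;
    text: string;
}

class ConcatSketch {
    // Line at which each cell starts inside the concatenated text.
    private readonly startLines: number[] = [];
    public readonly text: string;

    constructor(private readonly cells: CellSketch[]) {
        let line = 0;
        for (const cell of cells) {
            this.startLines.push(line);
            line += cell.text.split('\n').length;
        }
        // Raw concat: each cell's contents stacked on top of the next.
        this.text = cells.map((c) => c.text).join('\n');
    }

    // Cell position -> position in the concatenated text.
    toConcatPosition(cellUri: string, pos: PositionSketch): PositionSketch {
        const index = this.cells.findIndex((c) => c.uri === cellUri);
        if (index < 0) {
            throw new Error(`Unknown cell: ${cellUri}`);
        }
        return { line: this.startLines[index] + pos.line, character: pos.character };
    }

    // Position in the concatenated text -> owning cell and position inside it.
    toCellPosition(pos: PositionSketch): { uri: string; position: PositionSketch } {
        // Pick the last cell whose start line is at or before the requested line.
        let index = 0;
        for (let i = 0; i < this.startLines.length; i++) {
            if (this.startLines[i] <= pos.line) {
                index = i;
            }
        }
        return {
            uri: this.cells[index].uri,
            position: { line: pos.line - this.startLines[index], character: pos.character }
        };
    }
}

const concatDoc = new ConcatSketch([
    { uri: 'cell-1', text: 'import pandas as pd' },
    { uri: 'cell-2', text: 'df = pd.DataFrame()\ndf.head()' }
]);
const concatPos = concatDoc.toConcatPosition('cell-2', { line: 1, character: 3 });
const roundTrip = concatDoc.toCellPosition(concatPos);
```

Here the second line of the second cell lands on the third line of the concatenated text, and mapping that position back recovers the original cell and line.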
When pylance starts up, it is passed an interpreter that defines what modules are installed. In this example, pylance is running with a Python 3.10 environment that is missing scikit-learn:
For python files, this interpreter is set at the bottom left of VS Code:
That interpreter is used by pylance to determine where it will find all of the modules it checks. So in this example, the window's 3.10 64-bit environment does not have the module 'scikit-learn'.
Notebooks don't have a 'global' interpreter, but rather a 'kernel' that is used to run the code. This kernel is almost always associated with a python interpreter.
This interpreter is what we need to pass to pylance so it can find the correct modules.
This complicates how pylance is started.
For a normal python file, this is how things are started:
For a notebook, we can't use the global interpreter. Instead, we start a pylance server per kernel in use:
This is necessary because each pylance needs to have a separate 'interpreter' to use to search for modules.
This means there are now 4 pylance servers running.
- 1 for the python extension to handle python files
- 3 for the notebooks, one for each notebook that is opened with a different kernel
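As a rough sketch of the "one server per kernel" bookkeeping, the snippet below keys servers by interpreter path. `KernelServerPool`, `ServerSketch` and the interpreter paths are hypothetical, and reusing a server when two notebooks share an interpreter is an assumption drawn from the phrase "per kernel in use", not something confirmed by the extension's code.

```typescript
// Hypothetical stand-in for a running pylance server process.
interface ServerSketch {
    readonly interpreterPath: string;
}

class KernelServerPool {
    private readonly servers = new Map<string, ServerSketch>();

    // Returns the server for an interpreter, creating it on first use.
    getOrStart(interpreterPath: string): ServerSketch {
        let server = this.servers.get(interpreterPath);
        if (!server) {
            server = { interpreterPath };
            this.servers.set(interpreterPath, server);
        }
        return server;
    }

    get size(): number {
        return this.servers.size;
    }
}

const jupyterPool = new KernelServerPool();
// Three notebooks, each using a kernel backed by a different interpreter (made-up paths).
const notebookKernels = ['/envs/py310/bin/python', '/envs/py39/bin/python', '/envs/ml/bin/python'];
for (const kernel of notebookKernels) {
    jupyterPool.getOrStart(kernel);
}

// The Python extension's own server for plain python files is counted separately.
const totalServers = 1 + jupyterPool.size;
```

With three distinct kernels this gives the four servers described above.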
Having multiple language servers running would usually mean each server was assigned to a specific document selector, otherwise you'd end up with duplicate results for say hover or completion.
However that's not the case. That's because of limitations in how selectors are specified.
- They can specify a scheme, a language, or a pattern match
- They cannot run logic (they're static)
- They cannot exclude things
So if the python extension's selector is basically "language": "python" and the jupyter extension's selector is basically "scheme": "vscode-notebook-cell", both match a python notebook cell. How do we resolve the duplicates?
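To make the overlap concrete, here is a small sketch of static selector matching. `SelectorSketch`, `DocSketch` and `selectorMatches` are hypothetical simplifications (pattern matching is left out), and the URIs are made up.

```typescript
// Hypothetical, simplified selector: static fields only, no logic and no exclusions.
interface SelectorSketch {
    language?: string;
    scheme?: string;
}

interface DocSketch {
    uri: string;
    language: string;
}

function schemeOf(uri: string): string {
    return uri.slice(0, uri.indexOf(':'));
}

// A document matches when every field the selector specifies agrees with it.
function selectorMatches(selector: SelectorSketch, doc: DocSketch): boolean {
    if (selector.language !== undefined && selector.language !== doc.language) {
        return false;
    }
    if (selector.scheme !== undefined && selector.scheme !== schemeOf(doc.uri)) {
        return false;
    }
    return true;
}

const pythonSelector: SelectorSketch = { language: 'python' };
const jupyterSelector: SelectorSketch = { scheme: 'vscode-notebook-cell' };

const pythonFile: DocSketch = { uri: 'file:///work/app.py', language: 'python' };
const notebookCell: DocSketch = { uri: 'vscode-notebook-cell:/work/nb.ipynb#cell1', language: 'python' };

// A plain python file matches only the python selector.
const fileMatches = [pythonSelector, jupyterSelector].filter((s) => selectorMatches(s, pythonFile)).length;
// A python notebook cell matches both selectors, which is where duplicates come from.
const cellMatches = [pythonSelector, jupyterSelector].filter((s) => selectorMatches(s, notebookCell)).length;
```

Because a selector cannot say "python, but not in a notebook cell", the overlap cannot be removed at the selector level.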
Both extensions use something called middleware.
The VS Code language client npm module is a library for talking to LSP enabled language servers. Both the Python Extension and the Jupyter extension use it in order to send messages to pylance. The library allows for the creation of a 'Middleware' object that can listen to any LSP request before it is sent to the server.
This provides an opportunity to filter messages based on the outbound document URI. Meaning we can eliminate duplicates in the example above.
- Python extension lets all non notebook requests go through normally and swallows notebook requests (handling the negative case that selectors can't handle)
- Jupyter extension has one middleware started per kernel. Each middleware piece swallows all requests not notebook related and checks if the request matches the kernel on a server. (handling the 'function' check for a selector)
This diagram shows a request for a specific notebook cell:
The middleware that makes these decisions can be found
Jupyter's multiplexing code for picking which pylance server to use can be found here.
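The filtering rules for the two extensions can be sketched as plain predicates that decide whether a request is forwarded to a server. This is only a model of the decision, not the real middleware: `pythonExtensionForwards`, `jupyterForwards`, the kernel lookup table and all URIs and paths are hypothetical.

```typescript
const CELL_SCHEME = 'vscode-notebook-cell';

// A decision function: true means "forward this request to my server".
type ForwardDecision = (uri: string) => boolean;

function isNotebookCellUri(uri: string): boolean {
    return uri.startsWith(`${CELL_SCHEME}:`);
}

// Python extension side: everything passes except notebook cells, which are swallowed.
const pythonExtensionForwards: ForwardDecision = (uri) => !isNotebookCellUri(uri);

// Jupyter side: one decision function per kernel. It swallows anything that is not a
// notebook cell and anything belonging to a notebook that runs on another interpreter.
function jupyterForwards(
    serverInterpreter: string,
    interpreterForCell: (uri: string) => string | undefined
): ForwardDecision {
    return (uri) => isNotebookCellUri(uri) && interpreterForCell(uri) === serverInterpreter;
}

// Made-up lookup table from cell URI to the interpreter of its notebook's kernel.
const kernelByCell = new Map<string, string>([
    ['vscode-notebook-cell:/work/a.ipynb#c1', '/envs/py310/bin/python'],
    ['vscode-notebook-cell:/work/b.ipynb#c1', '/envs/ml/bin/python']
]);
const lookupInterpreter = (uri: string): string | undefined => kernelByCell.get(uri);

const allMiddleware: ForwardDecision[] = [
    pythonExtensionForwards,
    jupyterForwards('/envs/py310/bin/python', lookupInterpreter),
    jupyterForwards('/envs/ml/bin/python', lookupInterpreter)
];

// Counts how many servers would receive a request for the given document.
function forwardCount(uri: string): number {
    return allMiddleware.filter((decide) => decide(uri)).length;
}
```

In this model, `forwardCount` is 1 for both a plain python file and a cell of either notebook, so each request reaches exactly one server and no duplicate results are produced.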
Having 4 pylance servers running at the same time is rather redundant and a waste of CPU so we'd like to eliminate this need. In order to do that, Pylance would have to support a custom message indicating that certain URIs have different interpreters.
If that were to happen, we wouldn't need any middleware layers at all. Pylance would just handle all requests for all python files, and the jupyter extension would just need to pass a message indicating certain cells use a different interpreter.