rfc: runtime API #3036

ematipico · 2022-08-09T12:59:28Z

Summary

This is a follow up of closed PR: #3014

@MichaReiser raised some questions, which I am going to answer below

Some questions after a quick look at the example from the summary:
* Does `new Rome()` automatically load the rome configuration? If not, how can I accomplish that?

In a sense. It's actually the Rust Workspace that loads the configuration. The runtime might be able to override the defaults.

* `format` and `check` accept the file content as parameters. Does that mean this is an **alternative** API to @leops _Workspace_ API ([wire the playground and the workspace #3012](https://github.com/rome/tools/issues/3012))? If not, will it be necessary to call `file_changed` with an updated content but it's still necessary to pass the content to `format` and `check`?

No, this is not an alternative API, it's a problem that is already been solved in the playground, for example. This will reflect a future command, where the CLI will read from stdin.

* How about passing the `format` and `check` options as optional arguments to `format` and `check` but use the `Rome` instance's default configuration if they are not supplied. Note: I saw in the description that this is possible. I recommend making this clearer in the example.

Sure, thank you.

* I would prefer exposing the `configuration` on the `Rome` instance rather than the `formatterConfiguration` to avoid having to call n-methods if I want to use n-Rome methods.

Yes, it's a good call. I used this feedback and applied it to the RFC.

* I think it would be useful for `new Rome()` to accept an optional configuration to use over the default configuration.

Yes, done in the RFC.

* In my view, `format` should return the printed string because it doesn't allow to specify a format range.

Yes, format is too broad. I believe we should have specific APIs for each case. In this case a formatRange API.

* It may be necessary to provide additional data to a `format` call, e.g. the cursor position, that isn't covered by `Rome`'s settings. Do you intend to have different `format`, `format_range` and `format_with_cursor` (`format_with_cursor_and_range?`) methods?

I think it's preferred to have specific APIs for each specific usage. We don't have cursor functionality in the Workspace, so I didn't consider it for the proposal. I wouldn't make much sense the propose if I don't know if we're able to support it and if it helps adoption.

* How does your API compare to Prettier's API?

Prettier's API are not in sync with its CLI. With the CLI you can format files and folders. With the APIs the user is in charge reading the content of those files and pass the content the APIs.

I don't know the reasons of this design, but I believe I/O operation should be delegated to our workspace (at least give some API where Rome can do that), and I also believe that the APIs should resemble the CLI too.

Test Plan

No test involved

cloudflare-workers-and-pages · 2022-08-09T13:17:34Z

Deploying with Cloudflare Pages

Latest commit:	`f08d0e5`
Status:	✅ Deploy successful!
Preview URL:	https://375ed51a.tools-8rn.pages.dev
Branch Preview URL:	https://rfc-apis-rfc.tools-8rn.pages.dev

View logs

leops · 2022-08-10T07:39:36Z

Comments from my initial reading:

Most of the methods on the Rome class should be async, since the backend Rust code may run in a different Worker thread / OS process entirely
It may be preferable to provide the path of files passed to formatContent or lintContent in order to display it in the emitted diagnostics. This could replace the FileType argument as that would let us use the same file path to file type matching logic as the rest of the toolchain instead of relying on users of the API to re-implement it (drawback: this will require users to "guess" a fake file path that matches the type of the content they're trying to format / lint when working on virtual or generated files)

ematipico · 2022-08-10T11:29:27Z

Comments from my initial reading:

* Most of the methods on the `Rome` class should be async, since the backend Rust code may run in a different Worker thread / OS process entirely

I left that out on purpose because I didn't know the specifics of the architecture. Is it something that you can confirm now or is it too soon? For example esbuild provides synchronous and asynchronous APIs.

* It may be preferable to provide the path of files passed to `formatContent` or `lintContent` in order to display it in the emitted diagnostics. This could replace the `FileType` argument as that would let us use the same file path to file type matching logic as the rest of the toolchain instead of relying on users of the API to re-implement it (drawback: this will require users to "guess" a fake file path that matches the type of the content they're trying to format / lint when working on virtual or generated files)

Yeah, that's true, although providing a path for some content might seem less intuitive, especially for less experienced developers. Also, if we compare this decision against other tools (prettier or eslint), we need to be very good to explain this option that other tools might not have.

MichaReiser

thank you @ematipico for this awesome write-up.

I very much like this API, particularly splitting it into ..Files, ..Content methods.

I added a few inline questions with clarification questions. Two overarching that I have at the moment are:

Exposing lintFiles and lintDirectories requires that Rome is limited to run in environments where it's possible to read files. Is Rome's API different in environment that doesn't support IO (e.g. browser, if we're ignoring virtual filesystems)
It may be interesting to add a few example use cases for the different API's and how they would be used in that case. For example, how would a Jest pretty-printer integration look like? Are there any other immediate use cases that you are aware of that we could use to verify our API, by providing a few sample implementations?

MichaReiser · 2022-08-10T20:39:37Z

rfcs/002-runtime-api.md

+
+const rome = new Rome();
+const content = "function f()  { return   {}}";
+const result = rome.formatContent(content, { fileType:  FileType.Js });


I very much like the idea of having different methods to format a file, directory, or some content (to e.g. integrate into Jest).

You mentioned that Rome automatically infers the configuration. How does that work with the fileContent API. Does it resolve the configuration relative to the cwd?

I might be wrong, and this is only a theoretic guess: if the script is run from where the package.json is, then the configuration is automatically picked up, because that's how the CLI works too.

If the premise is correct, then yes, we will need of a cwd as base directory, and from there we can pick the configuration. A script can be run from anywhere, even inside a Rome project but not from the working directory.

Thank you raising it, I completely missed this scenario.

MichaReiser · 2022-08-10T20:41:36Z

rfcs/002-runtime-api.md

+import { Rome } from "rome";
+
+const content = "function f() { return {} }";
+const rome = new Rome({


You outline that Rome either starts a new process or runs using WebAssembly. What's your proposal for managing the lifetime of the webassembly module (GC?) or the spawned subprocess? Does the Rome object need a shutdown/stop method?

I don't an answer to this question, the design of the backend is still in the works and it might change. Maybe @leops can chime in, he might have some answer

As with async methods, the design of the underlying backend isn't entirely set down yet so I would include a shutdown method just in case. To give more detail on this:

For WASM the Module and Instance objects are managed by the runtime GC, but the Workspace bridge object is not: creating a workspace allocates native memory in the WASM instance that needs to be reclaimed by calling the free method automatically generated by wasm-bindgen

For a native child process we'd either need to send a shutdown signal if the native process is running in "attached" mode, or cleanly close the IPC connection if connected to a daemon server

MichaReiser · 2022-08-10T20:42:29Z

rfcs/002-runtime-api.md

+const ast = rome.parse(content); // not part of this paragraph
+const new_content = rome.format(content); // not part of this paragraph
+const diagnostics = rome.check(content);  // not part of this paragraph


Is this section outdated? It seems that the APIs are formatContent or lintContent?

It's outdated, yes. But it was mostly to show off how we can use multiple features/APIs in the same snippet.

MichaReiser · 2022-08-10T20:43:38Z

rfcs/002-runtime-api.md

+import { Rome } from "rome";
+
+const rome = new Rome();
+const result = rome.lintFiles(["./path/to/file.js"]);


What's your reasoning behind naming these methods lint vs. the CLIs terminology that uses check.

Good point. Mostly two reasons:

we're exploring untouched ground, even for Rome classic, there linter and formatter were tight together (and issues were rising up); if we want to increase adoption, we should be more flexible. A user should be able to use the tools separately if they want;

I envision a check command (like Rome classic) that does multiple checks, and a lint command that does only linting. This command is inline with the format command

Hence, why the API. We can also have a .check API, it's very straightforward.

MichaReiser · 2022-08-10T20:45:38Z

rfcs/002-runtime-api.md

+import { Rome } from "rome";
+
+const rome = new Rome();
+const result = rome.parseFiles(["./path/to/file.js", ""]);


What are the use cases for parsing multiple files? How does a user get the CST for a specific file (I imagine this method must either return an array OR object). My recommendation would be to limit the API to parsing a specific file.

This raises another question. What is the API proposing to use as file identifiers. Paths alone may not be unique enough in case the user provided path is a symlink to another file.

The reason was mainly to keep it in line with the other APIs. I don't see anything wrong limiting it to a single file. I didn't include the CST mainly because we promote the AST (see the VSCode extension for example). If we want to give access to both, we should align this choice with the VSCode extension too.

I used CST&AST as synonym. My main concern here was about what use cases there are for parsing multiple files and how the result structure would look like (one huge object where paths are keys?)

I guess it doesn't make much sense to parse multiple files. We can limit the API to a single file then.

MichaReiser · 2022-08-10T21:00:06Z

How do you envision that caching works for the content methods. Is there no caching, or can users provide an optional explicit id/cache key (which could be a virtual path: virtual://a/b/c.js)

ematipico · 2022-08-11T07:54:24Z

thank you @ematipico for this awesome write-up.

I very much like this API, particularly splitting it into ..Files, ..Content methods.

I added a few inline questions with clarification questions. Two overarching that I have at the moment are:
* Exposing `lintFiles` and `lintDirectories` requires that Rome is limited to run in environments where it's possible to read files. Is Rome's API different in environment that doesn't support IO (e.g. browser, if we're ignoring virtual filesystems)

These APIs are mostly for Node.js, but I can see the future problem.

We have options here:

release a package/library just for browsers (like esbuild does), which relies exclusively on WebAssembly and exclude those APIs from the library;
rely on MemoryFileSystem to create virtual files; document that these APIs, on a browser, will work on a virtual file system, no physical files;

Note: I don't know if both options (especially the second) are doable

* It may be interesting to add a few example use cases for the different API's and how they would be used in that case. For example, how would a Jest pretty-printer integration look like? Are there any other immediate use cases that you are aware of that we could use to verify our API, by providing a few sample implementations?

I am open to proposal, because I have no idea how to make a jest pretty-printer. An idea might be a plugin for webpack (like eslint-webpack-plugin, to catch lint errors during compilation, but I don't know how to make a plugin.

MichaReiser · 2022-08-11T12:23:01Z

I am open to proposal, because I have no idea how to make a jest pretty-printer. An idea might be a plugin for webpack (like eslint-webpack-plugin, to catch lint errors during compilation, but I don't know how to make a plugin.

You may need to spend some time familiarizing yourself. Jest uses pretty-format under the hood. The documentation has a section on how to customize formatting in jest as well as on how to write a plugin

The main challenge that I see is that many such plugin architectures don't support explicit lifetimes (beyond what a GC provides)

EDIT: It seems that Jest only uses Prettier for inline snapshots and I don't think that is configurable except by providing a node module that mimics prettier's API

Edit2: Other use cases

MichaReiser · 2022-08-15T06:37:00Z

I spent a little more time thinking about the API and have some more thoughts that may need additional exploration but can be deferred to later or we can decide that we don't want to pursue the idea.

`Files`, `File`,`Virtual`

One thing I noticed about the API is that many functionality provide methods to work on a set of files, a single file, or a virtual file.

rome.formatFiles(["./path"])
rome.formatFile("path"...)
rome.formatContent("content", ...)

An alternative to this could be to have objects that wrap a set of "files":

let files = rome.files(["./path"]);
let file = rome.file("./path");
let virtual = rome.virtual("content", { content-id });

You can then do:

files.check();
file.check();
content.check();

or if we prefer

rome.check(files)
rome.check(file)
rome.check(content)

I haven't thought through the implications of this but thought it interesting.

`Rome` lifetime

One of the main unanswered questions of this proposal is how to manage Romes lifetime, especially considering the different execution modes:

Spawn a process for every operation
Spawn / connect to a deamon
Create a WASM module

The questions I have are:

How does a client specify which execution mode they want?
What does shutdown do? Does it disconnect from a deamon or shuts it down?
Does the API need equivalents to the CLI's start / stop command and if so, what do they return?

ematipico · 2022-08-15T07:26:00Z

I spent a little more time thinking about the API and have some more thoughts that may need additional exploration but can be deferred to later or we can decide that we don't want to pursue the idea.

Files, File,Virtual

One thing I noticed about the API is that many functionality provide methods to work on a set of files, a single file, or a virtual file.
rome.formatFiles(["./path"])
rome.formatFile("path"...)
rome.formatContent("content", ...)
An alternative to this could be to have objects that wrap a set of "files":
let files = rome.files(["./path"]);
let file = rome.file("./path");
let virtual = rome.virtual("content", { content-id });
You can then do:
files.check();
file.check();
content.check();
or if we prefer
rome.check(files)
rome.check(file)
rome.check(content)
I haven't thought through the implications of this but thought it interesting.

Rome lifetime

One of the main unanswered questions of this proposal is how to manage Romes lifetime, especially considering the different execution modes:
* Spawn a process for every operation

* Spawn / connect to a deamon

* Create a WASM module
The questions I have are:
* How does a client specify which execution mode they want?

I don't think a client should be aware of anything that regards the communication with Rust Workspace. I think that exposing this information would make the APIs more complex, especially for beginners. Our internal architecture should stay internal.

We should decide which kind of communication we should have based on the API. For example, initially we decide that the API .formatContent would use the WASM. After sometime we discovered something and communicating via process is more efficient. We then change it without breaking the official API.

* What does `shutdown` do? Does it disconnect from a deamon or shuts it down?

Not part of this RFC, we can follow up here. I don't have an answer now.

* Does the API need equivalents to the CLI's `start` / `stop` command and if so, what do they return?

Not part of this RFC, we can follow up here. Probably we will reimplement the same logic we had in Rome classic.

MichaReiser · 2022-08-15T07:31:40Z

We should decide which kind of communication we should have based on the API. For example, initially we decide that the API .formatContent would use the WASM. After sometime we discovered something and communicating via process is more efficient. We then change it without breaking the official API.

I agree that we should for good defaults but I do believe that it should be possible for users to change the default. For example, it may be the most performant to spawn a new CLI for each formatContent call by default, except if these calls are done inside of a loop, in which case it will be more performant to spawn the process only once.

This is why I believe we should expose some sort of control to the user and this may influence how our API looks like significantly which is why I think it's worth to explore the idea at least a bit.

ematipico · 2022-08-15T07:46:06Z

We should decide which kind of communication we should have based on the API. For example, initially we decide that the API .formatContent would use the WASM. After sometime we discovered something and communicating via process is more efficient. We then change it without breaking the official API.

I agree that we should for good defaults but I do believe that it should be possible for users to change the default. For example, it may be the most performant to spawn a new CLI for each formatContent call by default, except if these calls are done inside of a loop, in which case it will be more performant to spawn the process only once.

This is why I believe we should expose some sort of control to the user and this may influence how our API looks like significantly which is why I think it's worth to explore the idea at least a bit.

The only non-intrusive and friendly way I see using these suggestions is by introducing a different sets o APIs, different from the ones proposed in the RFC. An example:

import { workflow } from "rome";

await workflow.withProcess(async romeInstance => { 
	romeInstance.formatContent("");
}).run()

await workflow.withDaemon(async romeInstance => { 
	romeInstance.formatContent("");
}).run()

await workflow.withWasm(async romeInstance => { 
	romeInstance.formatContent("");
}).run()

We can explore this proposal in another RFC.

MichaReiser · 2022-08-15T07:52:42Z

Using closures has the downside that this isn't working well in situations where an API exposes start and shutdown methods. For example, a webpack loader exposes a function that gets called for every file and ideally, the rome instance would be re-used across files.

An alternative could be

let rome = await Rome.default(); // Creates a new instance using the default scheduling
let rome = await Rome.start(); // Creates a new deamon or connects to an existing one

Or @leops proposed an API where it's possible to pass an optional RomeDeamon API to the format* methods.

let deamon = await RomeDeamon.start()
rome.formatContent(..., deamon)

...
deamon.disconnect() 
// or 
deamon.stop()

Or the deamon could be passed in the constructor?

let deamon = await RomeDeamon.start();

let rome = new Rome(deamon);

One thing we may need to consider with async constructor methods is that not all plugin APIs allow for calling async methods.

ematipico · 2022-08-16T13:07:33Z

RFC updated:

removed fileType in favour of filePath;
.parseContent only is exposed, but a .parseFile can be implemented if we want;

Not yet confirmed but evaluated:

APIs can all be async;
possibility to choose a back-end when creating a Rome instance;

To evaluate and implement (they require changes in the Rust code):

resolve the configuration from a path that is not the working directory (the path from where the command is run)

ematipico requested a review from a team August 9, 2022 12:59

ematipico force-pushed the rfc/apis-rfc branch from 4c15a7e to d889e9c Compare August 9, 2022 13:17

rfc: runtime API

a383069

ematipico force-pushed the rfc/apis-rfc branch from d889e9c to a383069 Compare August 9, 2022 13:20

MichaReiser reviewed Aug 10, 2022

View reviewed changes

MichaReiser approved these changes Aug 16, 2022

View reviewed changes

chore: PR review

f08d0e5

ematipico merged commit 4942e79 into main Aug 16, 2022

ematipico deleted the rfc/apis-rfc branch August 16, 2022 13:53

ematipico mentioned this pull request Aug 17, 2022

Implement a simple facade of Node.js APIs #3073

Closed

IWANABETHATGUY pushed a commit to IWANABETHATGUY/tools that referenced this pull request Aug 22, 2022

rfc: runtime API (rome#3036)

b57cd08

rfc: runtime API #3036

rfc: runtime API #3036

Uh oh!

Conversation

ematipico commented Aug 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Plan

Uh oh!

cloudflare-workers-and-pages bot commented Aug 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying with Cloudflare Pages

Uh oh!

leops commented Aug 10, 2022

Uh oh!

ematipico commented Aug 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MichaReiser left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ematipico Aug 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MichaReiser commented Aug 10, 2022

Uh oh!

ematipico commented Aug 11, 2022

Uh oh!

MichaReiser commented Aug 11, 2022 • edited by ematipico Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MichaReiser commented Aug 15, 2022

Files, File,Virtual

Rome lifetime

Uh oh!

ematipico commented Aug 15, 2022

Files, File,Virtual

Rome lifetime

Uh oh!

MichaReiser commented Aug 15, 2022

Uh oh!

ematipico commented Aug 15, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MichaReiser commented Aug 15, 2022

Uh oh!

ematipico commented Aug 16, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

ematipico commented Aug 9, 2022 •

edited

Loading

cloudflare-workers-and-pages bot commented Aug 9, 2022 •

edited

Loading

ematipico commented Aug 10, 2022 •

edited

Loading

ematipico Aug 11, 2022 •

edited

Loading

MichaReiser commented Aug 11, 2022 •

edited by ematipico

Loading

`Files`, `File`,`Virtual`

`Rome` lifetime

`Files`, `File`,`Virtual`

`Rome` lifetime

ematipico commented Aug 15, 2022 •

edited

Loading