+
Skip to content

userdiff: add builtin diff driver for TypeScript language #1746

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

matthewhughes934
Copy link
Contributor

@matthewhughes934 matthewhughes934 commented Jul 14, 2024

cc: Johannes Sixt j6t@kdbg.org

Copy link

Welcome to GitGitGadget

Hi @matthewhughes934, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that either:

  • Your Pull Request has a good description, if it consists of multiple commits, as it will be used as cover letter.
  • Your Pull Request description is empty, if it consists of a single commit, as the commit message should be descriptive enough by itself.

You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <revi.ewer@example.com>, Ill Takalook <ill.takalook@example.net>

NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description,
because it will result in a malformed CC list on the mailing list. See
example.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

Copy link

There are issues in commit bfe2820:
userdiff: add builtin diff driver for TypeScript language
Commit not signed off

@matthewhughes934 matthewhughes934 force-pushed the add-typescript-diff-driver branch from bfe2820 to c6c1b82 Compare July 14, 2024 19:34
@dscho
Copy link
Member

dscho commented Jul 14, 2024

/allow

Copy link

User matthewhughes934 is now allowed to use GitGitGadget.

WARNING: matthewhughes934 has no public email address set on GitHub;
GitGitGadget needs an email address to Cc: you on your contribution, so that you receive any feedback on the Git mailing list. Go to https://github.com/settings/profile to make your preferred email public to let GitGitGadget know which email address to use.

@matthewhughes934
Copy link
Contributor Author

/preview

Copy link

Preview email sent as pull.1746.git.git.1720989699838.gitgitgadget@gmail.com

@matthewhughes934 matthewhughes934 force-pushed the add-typescript-diff-driver branch from c6c1b82 to 8fb1183 Compare July 14, 2024 20:45
@matthewhughes934 matthewhughes934 force-pushed the add-typescript-diff-driver branch 3 times, most recently from 0c7ce73 to 0efa26b Compare July 15, 2024 12:05
TypeScript[1] is an open-source programming language that builds on
JavaScript. This patch adds builtin diff driver support for this
language. As far as I can tell there is no official syntax specification
for the language (see[2] for some discussion) so this patch is based off
some existing work[3]. The docs[4] probably provide the best
reference as to what this driver should satisfy. See[5] for
discussion/motivation for this change from the TypeScript language team.

This is my first time developing a diff driver, so as such the
implementation borrows quite a bit from existing drivers. The funcname
attribute matches function and class definitions, the list of keywords
used to define functions was take from[3], I could not find an
exhaustive list for these. The word-regex borrows much from other
existing diff engines, with the addition of the rather unique
right-shifting operators (>>> and >>>=) available in JavaScript (and
hence Typescript)[6]

[1] https://www.typescriptlang.org/
[2] microsoft/TypeScript#15711
[3] git#859
[4] https://www.typescriptlang.org/docs/
[5] microsoft/TypeScript#36185
[6] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Unsigned_right_shift

Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
@matthewhughes934 matthewhughes934 force-pushed the add-typescript-diff-driver branch from 0efa26b to 63b6a46 Compare July 15, 2024 13:52
@matthewhughes934
Copy link
Contributor Author

/preview

Copy link

Preview email sent as pull.1746.git.git.1721060847706.gitgitgadget@gmail.com

@matthewhughes934
Copy link
Contributor Author

/submit

Copy link

Submitted as pull.1746.git.git.1721061218993.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1746/matthewhughes934/add-typescript-diff-driver-v1

To fetch this version to local tag pr-git-1746/matthewhughes934/add-typescript-diff-driver-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1746/matthewhughes934/add-typescript-diff-driver-v1

Copy link

This patch series was integrated into seen via 0ca8316.

Copy link

This patch series was integrated into seen via ac7da8b.

Copy link

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Mon, Jul 15, 2024 at 04:33:38PM +0000, Matthew Hughes via GitGitGadget wrote:
> diff --git a/userdiff.c b/userdiff.c
> index c4ebb9ff734..7247d351cde 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -333,6 +333,17 @@ PATTERNS("scheme",
>  	 "|([^][)(}{[ \t])+"),
>  PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
>  	 "\\\\[a-zA-Z@]+|\\\\.|([a-zA-Z0-9]|[^\x01-\x7f])+"),
> +PATTERNS("typescript",
> +	"^[\t ]*((class|constructor|public|private|protected|function|interface)[ \t].*)$\n"
> +	// arrow funcs
> +	"^[\t ]*((const|let|var)?[^()]*)=[\t ]*\\([^()]*\\)[\t ]*.*=>.*$",
> +	/* -- */
> +	"[a-zA-Z_][a-zA-Z0-9_]*"
> +	// numeric constants
> +	"|[-+0-9.e]+|0[xX]?[0-9a-fA-F]"
> +	// operators
> +	"|[-+*/<>%&^|=!]"
> +	"|--|\\+\\+|//=?|<<=?|>>?=?"),
>  { .name = "default", .binary = -1 },
>  };
>  #undef PATTERNS
> 
> base-commit: a7dae3bdc8b516d36f630b12bb01e853a667e0d9
> -- 
> gitgitgadget

This needs some updates. For the arrow function, definitions can cover multiple
lines e.g.:

    const bar = (
        name: string
    ) => console.log(name)

The funcname pattern should also consider the `export` keyword, since both of
the following are valid:

    export const bar = (
        name: string
    ) => console.log(name)

    export function foo() {}

Some docs: https://www.typescriptlang.org/docs/handbook/modules/reference.html#module-syntax

Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

Matthew Hughes <matthewhughes934@gmail.com> writes:

> This needs some updates.

What does it mean?

The patterns that were posted were so broken that they are unusable
and harm the users by giving misleading information?

Or do the patterns work just fine in basic or tutorial cases, but
with more advanced or realistic uses of the language construct, they
highlight wrong lines as the function header and/or split at wrong
word boundaries that are obviously much less optimal than ideal that
any human users would find questionable?

In the latter case, how far from the ideal are the decisions done by
the current patterns, and what's the rough percentage of usual code
we see in the real world, for which the current patterns do not work
well?

What I am trying to gauge is if it is so broken that it should not
exist (in other words, you regret sending the patch to the list
before doing these updates), or is "already serviceable, but not
perfect yet".  Waiting for perfection takes forever.  If the latter,
letting the general public to use it to gather feedbacks by waiting
for the dust to settle before making such updates is often better.

Copy link

On the Git mailing list, Matthew Hughes wrote (reply to this):

Oe Tue, Jul 16, 2024 at 08:45:08AM -0700, Junio C Hamano wrote:
> What does it mean?
> 
> The patterns that were posted were so broken that they are unusable
> and harm the users by giving misleading information?

I think this would be a good summary. It's sufficient for some simpler cases
considered, and does even give some benefits e.g. for function headers for
nested functions. However, the cases where it fails can be significant, e.g.
hundreds of lines away from the correct function header for files with multiple
consecutive multi-line arrow functions.

> In the latter case, how far from the ideal are the decisions done by
> the current patterns, and what's the rough percentage of usual code
> we see in the real world, for which the current patterns do not work
> well?

I think just the missing `export` keyword handling would be equivalent to
missing all public functions in other programming languages, so that alone
would be a decent percentage.

> What I am trying to gauge is if it is so broken that it should not
> exist (in other words, you regret sending the patch to the list
> before doing these updates), or is "already serviceable, but not
> perfect yet".  Waiting for perfection takes forever.  If the latter,
> letting the general public to use it to gather feedbacks by waiting
> for the dust to settle before making such updates is often better.

I'm leaning towards the former case: that this patch was premature. I think
it's far enough from perfect that it would greatly benefit from me more
actively reaching out to the TypeScript language team and asking some devs
there try out the changes and gather some more input (and identify some more
missing cases, of which I now expect that are many) before getting
something out to general users.

Copy link

On the Git mailing list, Johannes Sixt wrote (reply to this):

We have had a submission for typescript just recently:

https://lore.kernel.org/git/20240404163827.5855-1-utsavp0213@gmail.com/

And two for Javascript

https://lore.kernel.org/git/20240301074048.188835-1-sergiusnyah@gmail.com/
https://lore.kernel.org/git/20220403132508.28196-1-a97410985new@gmail.com/

Please review these earlier submissions. If you think you can improve on
them, you are very welcome to do so. But you can also just resend one of
these series with a note that they are sufficiently mature and that you
support the submitted version.

I may very well mistaken, but I think that Typescript is a syntactical
superset of Javascript, so we need just one language driver and mention
in the documentation that it can be used for both languages.

Thanks,
-- Hannes

Copy link

User Johannes Sixt <j6t@kdbg.org> has been added to the cc: list.

Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

Matthew Hughes <matthewhughes934@gmail.com> writes:

> I'm leaning towards the former case: that this patch was premature.

OK.  Then let's take sufficient time.  After all, we are never in a
hurry ;-)

Thanks for giving an honest assessment.  Will keep the topic in
'seen' without marking it for 'next' (at least until it gets
replaced with a version that is more suitable to the public).

Copy link

This branch is now known as mh/userdiff-typescript.

Copy link

There was a status update in the "Discarded" section about the branch mh/userdiff-typescript on the Git mailing list:

Retracted for now.
cf. <20240716193344.bjb62zsfnrfw3ngf@archP14s>
source: <pull.1746.git.git.1721061218993.gitgitgadget@gmail.com>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载