这是indexloc提供的服务,不要输入任何密码
Skip to content

Meta: add infrastructure to generate summary.json #208

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 28, 2023
Merged

Conversation

annevk
Copy link
Contributor

@annevk annevk commented Jun 20, 2023

The setup is as follows:

  1. We collect all the issue data from the repository through the GitHub API and store it in a JSON file. Storing the input allows us to monitor changes in it for bugs. We'll likely want to change some issues so they are more suitable for consumption by the script.
  2. The script takes this data and produces a JSON file with a summarized view of each issue. The bulk of the logic is in processing the body of the issue (the top comment) which is split between before and after the YAML change.

The idea is that the JSON file with the summarized view can be consumed and eventually displayed on webkit.org once we're happy with it.

It currently has some flaws due to the raw data, but it seems worthwhile to check this in and then improve the raw data to validate the entire process around updating the issue data.

The setup is as follows:

1. We collect all the issue data from the repository through the GitHub API and store it in a JSON file. Storing the input allows us to monitor changes in it for bugs. We'll likely want to change some issues so they are more suitable for consumption by the script.
2. The script takes this data and produces a JSON file with a summarized view of each issue. The bulk of the logic is in processing the body of the issue (the top comment) which is split between before and after the YAML change.

The idea is that the JSON file with the summarized view can be consumed and eventually displayed on webkit.org once we're happy with it.

It currently has some flaws due to the raw data, but it seems worthwhile to check this in and then improve the raw data to validate the entire process around updating the issue data.
@annevk annevk added the meta For issues about this repo label Jun 20, 2023
Copy link
Member

@hober hober left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me; one nit

Copy link
Member

@gsnedders gsnedders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a lot of comments.

You might want to run black on this to reformat this to meet expected code style, and you may also want to run flake8 on this and see what errors it finds?

@@ -0,0 +1,145 @@
#!/usr/bin/env python
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python or python3? I'd hope the latter!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really best practice to put the version number there? It works fine either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's assume this is python 3, and we can set a requirements after.

page += 1
continue
break
write_json("summary-data.json", data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to be using summary-data.json in the current working directory, or summary-data.json adjacent to the script?

Do we even want to add an option to change what file it uses? Maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to account for different files. Supporting invocation from different directories seems like something we could support as a follow-up, though I doubt we'll need it.

Copy link
Member

@karlcow karlcow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the code resides in the root directory.
It would probably be better in a separate directory.
Screenshot 2023-06-23 at 14 49 39

Specifically when adding a couple of things for managing the tests, etc.

Could we move all of that in a tooling or something.

Also should the edits on summary.py be separated from summary.json and summary-data.json

I would propose we do.

.github/
data/
   summary-data.json
   summary.json
tooling/
    summary.py
    tests/

I will be working on testing later. Once the first PR is merged.

@karlcow
Copy link
Member

karlcow commented Jun 23, 2023

Also should the edits on summary.py be separated from summary.json and summary-data.json

It could even be part of a GitHub actions. So that we have a clear separation in between editing the python factory and the data generations.

@gsnedders
Copy link
Member

It could even be part of a GitHub actions.

One could reasonably imagine the summary.json getting pushed to gh-pages, so it could be used by others.

@annevk
Copy link
Contributor Author

annevk commented Jun 26, 2023

I like the idea of more automation, but changing summary.json through PRs is important for review purposes in my opinion. We want to be somewhat careful about what ends up on the website. I could see running the update process in a more automated fashion, but for now that seems more wasteful than useful to me.

Perhaps enforcing black and tests could be something useful for GitHub Actions though. I suggest that once it gets more complicated we make the directory changes. Seems nicer to keep it simple for now as we only have four files in total.

@karlcow
Copy link
Member

karlcow commented Jun 27, 2023

I like the idea of more automation, but changing summary.json through PRs is important for review purposes in my opinion. We want to be somewhat careful about what ends up on the website. I could see running the update process in a more automated fashion, but for now that seems more wasteful than useful to me.

At least would it be possible to do different PRs for python code and data file generation. 🙏

@annevk
Copy link
Contributor Author

annevk commented Jun 27, 2023

I doubt we're going to change the code a whole lot after this lands and a reviewer probably wants to see the changes doing the right thing?

Edit: clarified with Karl that all is in order.

@annevk annevk merged commit 631dfca into main Jun 28, 2023
@annevk annevk deleted the annevk/summary branch June 28, 2023 09:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta For issues about this repo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants