Meta: add infrastructure to generate summary.json #208

annevk · 2023-06-20T15:44:42Z

The setup is as follows:

1. We collect all the issue data from the repository through the GitHub API and store it in a JSON file. Storing the input allows us to monitor changes in it for bugs. We'll likely want to change some issues so they are more suitable for consumption by the script.
2. The script takes this data and produces a JSON file with a summarized view of each issue. The bulk of the logic is in processing the body of the issue (the top comment), which is split between before and after the YAML change.

The idea is that the JSON file with the summarized view can be consumed and eventually displayed on webkit.org once we're happy with it.

It currently has some flaws due to the raw data, but it seems worthwhile to check this in and then improve the raw data to validate the entire process around updating the issue data.
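
A minimal sketch of the two steps described above, not the code in this PR. It assumes a hypothetical `fetch_page` callable standing in for the GitHub API request, and the summary fields (`id`, `title`) are illustrative only. `write_json` is named in the diff, but its body here is an assumption.

```python
import json
from typing import Callable, List


def collect_issues(fetch_page: Callable[[int], List[dict]]) -> List[dict]:
    """Step 1: collect raw issues page by page until an empty page comes back.

    fetch_page is a hypothetical callable that returns the list of issue
    objects for a given page number (for example, one GitHub API request).
    """
    issues = []
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        issues.extend(batch)
        page += 1
    return issues


def summarize(issues: List[dict]) -> List[dict]:
    """Step 2: reduce each raw issue to a small summary record.

    The output field names are illustrative, not the PR's actual schema.
    """
    return [{"id": issue["number"], "title": issue["title"]} for issue in issues]


def write_json(filename: str, data) -> None:
    """Write data as indented JSON so changes are easy to review in diffs."""
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
        f.write("\n")
```

Keeping the raw input in its own file means a diff of that file shows what changed upstream, separately from changes in the summarizing logic.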

hober

looks good to me; one nit

summary.py

gsnedders

Here are a lot of comments.

You might want to run black on this to reformat it to the expected code style, and you may also want to run flake8 on it to see what errors it finds.

summary.py

gsnedders · 2023-06-21T15:05:15Z

summary.py

@@ -0,0 +1,145 @@
+#!/usr/bin/env python


python or python3? I'd hope the latter!

annevk · 2023-06-26

Is it really best practice to put the version number there? It works fine either way.

karlcow · 2023-06-26

Let's assume this is Python 3, and we can set requirements afterwards.
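
One way to make the Python 3 assumption explicit, as a sketch only: use a `python3` shebang and fail early on older interpreters. The helper name `is_python3` and the error message are hypothetical and not part of the PR.

```python
#!/usr/bin/env python3
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the interpreter's major version is 3 or later."""
    return version_info[0] >= 3


# Exit with a clear message instead of failing later on Python 3-only syntax.
if not is_python3():
    sys.exit("summary.py requires Python 3")
```

With the `python3` shebang, running the script directly picks a Python 3 interpreter on systems where `python` still points to Python 2 or does not exist.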

summary.py

gsnedders · 2023-06-21T17:51:11Z

summary.py

+                page += 1
+                continue
+            break
+        write_json("summary-data.json", data)


Do we want to be using summary-data.json in the current working directory, or summary-data.json adjacent to the script?

Do we even want to add an option to change what file it uses? Maybe?

annevk · 2023-06-26

I don't think we need to account for different files. Supporting invocation from different directories seems like something we could support as a follow-up, though I doubt we'll need it.
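
To illustrate the two options in the question above, here is a small sketch using `pathlib`. The helper `data_path` and its `script_file` parameter are hypothetical; the PR as discussed simply passes the bare filename, which resolves against the current working directory.

```python
from pathlib import Path


def data_path(filename, script_file=None):
    """Return the path to filename.

    With script_file=None the path is relative to the current working
    directory. Otherwise it is placed next to script_file, which a script
    would typically pass as __file__.
    """
    if script_file is None:
        return Path.cwd() / filename
    return Path(script_file).resolve().parent / filename
```

A script could call `data_path("summary-data.json", __file__)` to support invocation from any directory, which is the possible follow-up mentioned in the reply.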

summary.py

karlcow

Currently the code resides in the root directory. It would probably be better in a separate directory, especially once we add a couple of things for managing the tests, etc.

Could we move all of that into a tooling directory or something?

Also should the edits on summary.py be separated from summary.json and summary-data.json

I would propose we do.

.github/
data/
    summary-data.json
    summary.json
tooling/
    summary.py
    tests/

I will be working on testing later, once the first PR is merged.

karlcow · 2023-06-23T05:59:47Z

> Also should the edits on summary.py be separated from summary.json and summary-data.json

It could even be part of a GitHub Actions workflow, so that we have a clear separation between editing the Python code and generating the data.

gsnedders · 2023-06-23T14:09:45Z

> It could even be part of a GitHub Actions workflow.

One could reasonably imagine the summary.json getting pushed to gh-pages, so it could be used by others.

annevk · 2023-06-26T13:02:44Z

I like the idea of more automation, but changing summary.json through PRs is important for review purposes in my opinion. We want to be somewhat careful about what ends up on the website. I could see running the update process in a more automated fashion, but for now that seems more wasteful than useful to me.

Perhaps enforcing black and tests could be something useful for GitHub Actions though. I suggest that once it gets more complicated we make the directory changes. Seems nicer to keep it simple for now as we only have four files in total.

karlcow · 2023-06-27T02:44:21Z

> I like the idea of more automation, but changing summary.json through PRs is important for review purposes in my opinion. We want to be somewhat careful about what ends up on the website. I could see running the update process in a more automated fashion, but for now that seems more wasteful than useful to me.

Would it at least be possible to do separate PRs for the Python code and the data file generation? 🙏

annevk · 2023-06-27T11:09:56Z

I doubt we're going to change the code a whole lot after this lands and a reviewer probably wants to see the changes doing the right thing?

Edit: clarified with Karl that all is in order.

annevk assigned hober and gsnedders Jun 20, 2023

annevk added the "meta" label (For issues about this repo) Jun 20, 2023

hober approved these changes Jun 20, 2023

Add support for timeout and blocked

389b8e3

rik reviewed Jun 21, 2023

annevk added 2 commits June 21, 2023 13:43

triaged the first 51 issues for source errors

243468a

first 100 triaged

122c2bf

gsnedders reviewed Jun 21, 2023

karlcow requested changes Jun 23, 2023

annevk added 2 commits June 26, 2023 15:05

review

451871f

reviewed and cleaned up to and including 140

a961e0b

annevk added 4 commits June 27, 2023 13:59

reviewed and cleaned up all

f9e8495

also collect venues and concerns

fe31a44

remove duplicates

c56a5ff

address nit + sync

aa798ea

annevk merged commit 631dfca into main Jun 28, 2023

annevk deleted the annevk/summary branch June 28, 2023 09:33
