joelochlann commented Aug 1, 2025

How did we make this PR?

This was all generated by Copilot Agent mode with Claude Sonnet 4 during a focused group session with the Newsroom AI. Initial prompt was:

Currently searching for specific metadata fields is handled by something called chips, which require the user to type something specific in the search box. I'd like to turn these into explicit filters that are visible in the UI.

I'd like these to have the Kibana functionality where they show a list of top values from the actual data, e.g. for "country" you might see "UK", "USA", etc, based on the number of hits from a group by.

As a first step, I'd like to add a backend endpoint that gets the top n values with counts for the "country" field.

Does it work?

Yes! Copilot was actually able to re-use some existing (but possibly unused?) functionality, and created a test which exercises the new size param which it added. The test does not actually exercise the new endpoint itself, just the function used by the endpoint.

You can also verify it yourself with:

https://api.media.local.dev-gutools.co.uk/images/aggregations/metadata/keywords?size=5
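
For scripted checks, the same URL can be assembled in code. This is a minimal Python sketch, not part of the PR: the helper name `build_aggregation_url` is hypothetical, and the host is simply the one from the URL above. It only builds the URL; fetching it still needs an authenticated session for that domain, which the sketch does not handle.

```python
from urllib.parse import quote, urlencode

# Host taken from the verification URL above; adjust for other environments.
BASE_URL = "https://api.media.local.dev-gutools.co.uk"


def build_aggregation_url(field, size=None, q=None, base_url=BASE_URL):
    """Return the URL for the top values of one metadata field (hypothetical helper)."""
    params = {}
    if size is not None:
        params["size"] = size
    if q:
        params["q"] = q
    # Percent-encode the field so it cannot break out of its path segment.
    url = f"{base_url}/images/aggregations/metadata/{quote(field, safe='')}"
    return f"{url}?{urlencode(params)}" if params else url


print(build_aggregation_url("keywords", size=5))
```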

Is it perfect?

No! Flaws we know about:

  • It will allow requests to non-existent metadata fields, and instead of erroring, just give a zero count for values. (Maybe this is desired behaviour? Not sure)
  • It will not work for fields that do not have an associated keyword mapping type. This is an Elasticsearch limitation.
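
If rejecting such requests turns out to be the desired behaviour, one option is an explicit allow-list checked before the aggregation runs. The following is a language-agnostic sketch written in Python, not the PR's code: `AGGREGATABLE_FIELDS`, `UnsupportedFieldError` and `require_aggregatable` are hypothetical names, and the set contains only `keywords` because that is the one field this description shows working.

```python
# Hypothetical allow-list of fields known to have a keyword mapping.
# In practice the set would have to be derived from the index mapping.
AGGREGATABLE_FIELDS = frozenset({"keywords"})


class UnsupportedFieldError(ValueError):
    """Raised when a field is unknown or cannot be aggregated."""


def require_aggregatable(field, allowed=AGGREGATABLE_FIELDS):
    """Return the field unchanged, or raise if it is not on the allow-list."""
    if field not in allowed:
        raise UnsupportedFieldError(
            f"Cannot aggregate on metadata field {field!r}"
        )
    return field


require_aggregatable("keywords")  # passes silently
try:
    require_aggregatable("country")
except UnsupportedFieldError as err:
    print(err)
```

The endpoint could translate such an error into a 4xx response instead of returning empty counts, if that is preferred.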

Copilot documentation

On our request, Copilot produced the following documentation.

It has a few mistakes, namely:

  • localhost curl requests won't work because of auth (you need *.media.local.dev-gutools.co.uk)
  • country will not work, because it doesn't have a keyword field type

Overview

This new endpoint provides the top N values with counts for metadata fields, enabling Kibana-style filtering functionality. It's the first step towards replacing the chips-based search with explicit UI filters.

Endpoint

GET /images/aggregations/metadata/{field}?size={size}&q={query}

Parameters

  • field (required): The metadata field to aggregate on (e.g., "country", "city", "credit", etc.)
  • size (optional, default: 10): Maximum number of top values to return
  • q (optional): Additional query to filter the data before aggregation

Example Usage

Get top 5 countries from all images:

curl "http://localhost:9001/images/aggregations/metadata/keyword?size=5"

Get top countries from images with "london" in any field:

curl "http://localhost:9001/images/aggregations/metadata/keyword?q=london&size=5"

Example Response

{
  "data": [
    {
      "key": "UK",
      "count": 1250
    },
    {
      "key": "USA",
      "count": 890
    },
    {
      "key": "France",
      "count": 456
    },
    {
      "key": "Germany",
      "count": 234
    },
    {
      "key": "Spain",
      "count": 123
    }
  ],
  "offset": 0,
  "total": 5
}
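
A consumer of this response could decode it along these lines. This is a minimal Python sketch that assumes only the `data`, `key` and `count` names shown in the example above; `ValueCount` and `parse_top_values` are hypothetical names, and the sample body is a shortened, made-up payload in the same shape.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueCount:
    """One metadata value and the number of images that carry it."""
    key: str
    count: int


def parse_top_values(body):
    """Decode a JSON response body into a list of ValueCount entries."""
    payload = json.loads(body)
    return [
        ValueCount(key=entry["key"], count=entry["count"])
        for entry in payload["data"]
    ]


# Shortened sample in the same shape as the example response.
sample = '{"data": [{"key": "UK", "count": 1250}, {"key": "USA", "count": 890}], "offset": 0, "total": 2}'
for item in parse_top_values(sample):
    print(f"{item.key}: {item.count}")
```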

Supported Fields

All metadata fields are supported, including:

  • country, city, state, subLocation
  • credit, source, supplier
  • byline, photographer
  • keywords, subjects
  • title, description
  • imageType
  • And more...

Technical Implementation

  • Uses Elasticsearch terms aggregation for efficient counting
  • Leverages existing metadataSearch functionality in ElasticSearch class
  • Supports structured query filtering via the q parameter
  • Returns results in the standard Grid API format
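
For readers unfamiliar with terms aggregations, the request and response shapes look roughly like this. It is a sketch of the general Elasticsearch DSL, not of what `metadataSearch` actually sends: the aggregation name `top_values` and the field path `metadata.keywords` are assumptions, and the optional `query` stands in for whatever the `q` parameter is translated into.

```python
def terms_aggregation_body(field_path, size=10, query=None):
    """Build an Elasticsearch search body that returns only top-value buckets."""
    body = {
        "size": 0,  # no hits needed, only the aggregation buckets
        "aggs": {
            "top_values": {"terms": {"field": field_path, "size": size}}
        },
    }
    if query is not None:
        body["query"] = query  # restricts which documents are counted
    return body


def buckets_to_counts(response):
    """Map Elasticsearch buckets (key, doc_count) to key/count pairs."""
    buckets = response["aggregations"]["top_values"]["buckets"]
    return [{"key": b["key"], "count": b["doc_count"]} for b in buckets]


# "metadata.keywords" is an assumed field path, used for illustration only.
print(terms_aggregation_body("metadata.keywords", size=5))
```

A terms aggregation needs a field it can bucket on, which is why text-only fields without a keyword mapping fail.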

Next Steps

This endpoint will be used to build explicit filter UI components that show:

  1. Available filter values with counts
  2. Dynamic filtering based on current search context
  3. Multi-select filtering capabilities
  4. Clear visual indication of applied filters

This will replace the current chips-based system where users need to know specific syntax like country:UK.
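
As a rough illustration of how a filter component might consume the endpoint's `data` array, here is a small Python sketch. Nothing in it exists in the PR: `filter_options` and the `value`/`label`/`selected` row shape are hypothetical, chosen to show values with counts and to mark which filters are currently applied.

```python
def filter_options(top_values, selected=frozenset()):
    """Turn key/count entries into display rows for a filter list."""
    return [
        {
            "value": entry["key"],
            "label": f'{entry["key"]} ({entry["count"]})',
            "selected": entry["key"] in selected,  # marks applied filters
        }
        for entry in top_values
    ]


rows = filter_options(
    [{"key": "UK", "count": 1250}, {"key": "USA", "count": 890}],
    selected={"UK"},
)
for row in rows:
    marker = "[x]" if row["selected"] else "[ ]"
    print(marker, row["label"])
```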
