rfc (wip): parameterized queries

# Parameterized Queries 

## Motivation

While graphql-engine's auto-generated API is reasonably expressive, it doesn't
really compete with the expressivity of a query language that a database
provides. You can express a lot more queries through SQL on Postgres, use a lot
more operators on Postgres than through graphql-engine's Postgres API.

We want to provide our users an ability to fallback to a databases's query
language when they run into cases where graphql-engine's API is not adequate.
Currently graphql-engine addresses this by exposing GraphQL API over
[views](https://hasura.io/docs/latest/graphql/core/guides/postgres/views.html)
and
[functions](https://hasura.io/docs/latest/graphql/core/guides/postgres/functions.html).
For example, a view can be defined as follows:

```sql
create view page_items as
  select title, content from articles
  union all
  select title, description from alerts
```

and graphql-engine provides this API:

```graphql
query {
  page_items(where: {..} limit: .. offset: ..) {
    ...
  }
}
```

and similarly for the following function

```sql
CREATE FUNCTION search_articles(search text)
RETURNS SETOF articles AS $$
  SELECT *
  FROM articles
  WHERE
    title ilike ('%' || search || '%')
    OR content ilike ('%' || search || '%')
$$ LANGUAGE sql STABLE;
```

graphql-engine provides this API:

```graphql
query {
  search_articles(args: {search: "something"} where: .. limit: .. offset: ..) {
    ...
  }
}
```

While the above approach works, it has a couple of shortcomings:

### 1. Requirement of DDL privileges

Our users need DDL privileges to create these new functions or views. This
privilege is not easily granted, particularly in an enterprise setup. The less
we require of a database, the easier it would be for our users to add
graphql-engine in their stack.

### 2. Restrictions on the function return type

The functions that graphql-engine can support currently need to return a
'table' or 'setof table' that's already tracked. There are couple of issues
here:

1. Does not work for databases (such as SQL Server) that do not have
   the notion of functions returning named table types.

1. Does not support functions that do not return a table. For example, if the
   above `search_articles` function only returns title and content,
   graphql-engine cannot track it.

   ```sql
   CREATE FUNCTION search_page_items(search text)
   RETURNS SETOF TABLE (title text, content text) AS $$
     SELECT title, content
     FROM articles
     WHERE
       title ilike ('%' || search || '%')
       OR content ilike ('%' || search || '%')
   $$ LANGUAGE sql STABLE;
   ```

   To work around this, users go through this elaborate process:

   1. Create a table with the schema of the function return type
   1. Track the table
   1. Change the definition of the function to return the newly created table
   1. Track the function

   which is terrible DX

To summarize the motivation is this:

> We want our users to use the full query language ability of a database
to extend graphql-engine's auto-generated API while addressing the shortcomings
of the existing features (GraphQL API views and functions).

## Parameterized queries

We want to provide a means to define views and functions at graphql-engine's
layer.

1. A view's definition is a database select query so instead of creating a
   view, we can just accept the select query which backs the view. For example,
   we want to allow adding `page_items` root field as follows:

   ```yaml
   root_field_name: page_items
   sql_query: |
     select title, content from articles
     union all
     select title, description from alerts
   ```

   This should create the same GraphQL API for `page_items` field as it did
   when we created the view `page_items` and tracked it:

   ```graphql
   type query_root {
     ...
     page_items(distint_on: .. where: .. limit: .. offset: ..)
     page_items_aggregate(where: .. limit: .. offset: ..)
   }
   type subscription_root {
     ...
     page_items(distint_on: .. where: .. limit: .. offset: ..)
     page_items_aggregate(where: .. limit: .. offset: ..)
   }
   ```

1. A function's body may or may not be a simple query, but let's only consider
   the cases where a function's body is a query such as `search_articles`. In
   case of `search_articles`, we want to allow `search_articles` root field
   as follows:

   ```yaml
   root_field_name: search_articles
   arguments:
   - $1: search
     type: String
   sql_query: |
     SELECT *
     FROM articles
     WHERE
       title ilike ('%' || $1 || '%')
       OR content ilike ('%' || $1 || '%')
   ```

   (Note how the `search` parameter used in `search_articles` function's body
   is replaced by `$1` in the equivalent parameterized query)

   This should create the same GraphQL API for `search_articles` field as it
   did when we created the `search_articles` function and tracked it.

   ```graphql
   type query_root {
     ...
     search_articles(args: {search: ..} where: .. limit: .. offset: ..)
     search_articles_aggregate(args: {search: ..} where: .. limit: .. offset: ..)
   }
   type subscription_root {
     ...
     search_articles(args: {search: ..} where: .. limit: .. offset: ..)
     search_articles_aggregate(args: {search: ..} where: .. limit: .. offset: ..)
   }
   ```

> In short, users can specify 'parameterized' database queries instead of
defining functions and views on the database to add custom root fields

Note that the restrictions on function types that are mentioned in the
motivation section can be addressed by extending our functions support. We can
let our users specify the name of the table that a function returns (currently
we get it from the function's type signature on the database) or a custom
schema when they don't return an existing table. However, addressing the 'no
ddl' requirement is more important and hence we are going forward with
parameterized queries.

## Product requirements

1. 'Parameterized queries' should be a superset of our current functions support:

   1. If the body of a function is a database query, one can use a parameterized
      query (like above).

   1. If the body of a function is procedural and is not a single database query,
      one can use a parameterized query as follows:
      ```sql
      select * from function($1, ... $n);
      ```

   1. Support for subscriptions and aggregate fields like we currently have for
      immutable functions

1. Should allow mutations too - one should be able to mark a templated query as
   a query or a mutation so that the field can be added to query root or
   mutation root appropriately.

1. Should be a superset of our current views support:

   1. Parameterized queries instead of defining views

      ```sql
      select * from articles where is_verified = true
      ```

   1. Parameterized queries to use existing views (`verified_articles` is a
      view on the database):

      ```sql
      select * from verified_articles;
      ```

   1. Support for subscriptions and aggregate fields like we currently have on views

1. Support for relationships

   1. *From* a parameterized query's response:

      ```graphql
      query {
        search_articles(args: {search: "something"}) {
          title
          content
          # author is a relationship from search_articles to author
          author {
            name
          }
        }
      }
      ```

   1. *To* a parametrized query:

      ```graphql
      query {
        author {
          name
          # verified_articles is a relationship from author table to
          # 'verified_articles' paramaterized query
          verified_articles {
            title
            content
          }
        }
      }
      ```

1. Should allow defining queries which do not return tables or views:

   ```sql
   select title, content from article;
   ```

   In such cases, we expect the user to define the schema of this query's return type:

   ```yaml
   title:
     type: text
     nullable: true
   content:
     type: text
     nullable: true
   ```

1. Support for permissions:

   1. We want to allow similar permissions like on tables/views.

      1. The columns that can be selected
      1. The rows that can be selected
      1. Whether the `_aggregate` should be generated

   1. Support for presetting parameters in a query to either static values or
      values from session variables (like in insert and update permissions).

## Product Specification

(See the Discussion section for other approaches considered, the following is
based off suggestions made by Joe and Phil on the PR)

The biggest challenge is (5) - letting parameterized queries return table like
data whose schema is not captured by any of the existing tables/views that are
defined in the database. To address this, we introduce the notion of 'virtual
tables' which lets one define the schema of table data at graphql-engine's
layer. For example,

```yaml
version:3
sources:
- name: default
  kind: mssql
  configuration: ..
  tables: ...
  virtual_tables:
  - name: average_rating
    fields:
    - name: num_articles
      type: int
    - name: average_rating
      type: float
```

defines a virtual table `average_rating` with fields `name` and `average_rating`.

Parameterized queries are defined per source. The return type of a
parameterized query can be one or more rows of a database table or a virtual
table. For example,

```yaml
version: 3
sources:
- name: default
  kind: mssql
  configuration: ...
  tables:
  - table: article
  parameterized_queries:
  - name: get_article
    description: Get an article with an id
    query: |
      select * from articles where id = $1
    parameters:
    - param: $1
      name: article_id
      type: uuid
      optional: false
    response:
      return_type:
        table: article
        cardinality: one
```

defines a parameterized query `get_article` which returns a single row of type
`article`.

The following example defines a parameterized query `author_ratings` which
returns zero or many rows of the virtual table `average_rating`.

```yaml
version: 3
sources:
- name: source_name
  kind: mssql
  configuration: ...
  virtual_tables:
  - name: average_rating
    fields:
    - name: num_articles
      type: int
    - name: average_rating
      type: float
  parameterized_queries:
  - name: author_ratings
    description: Ratings of articles of various authors
    query: |
      select author_id, avg(rating) as average_rating
      from article group_by author_id
    parameters: []
    response:
      return_type:
        virtual_table: average_rating
        cardinality: many
```

### Virtual tables metadata

Virtual tables are defined under a field called `virtual_tables` under each
source.

```yaml
virtual_tables:
  # name of the virtual table, should be unique in the source
  # this will also be used for the GraphQL type name
- name: name_of_the_virtual_table

  # fields of the virtual table
  fields:
    # name of the field
  - name: f1
    # the database type of the field
    type: database_scalar_type
    # whether a query which returns this type would return nulls for this
    # field
    nullable: true
```

### Parameterized queries metadata

Parameterized queres are defined under a field called `parameterized_queries`
under each source.

```yaml
version: 3
sources:
- name: default
  kind: postgres
  configuration: ...
  parameterized_queries:
  - ..
  - ..
```

The schema of each paramaterized query is defined as follows:

```yaml

# The name of the parameterized query, which will also be the name
# of the field name that exposes this query in GraphQL API

name: author_ratings

# A description of this field, that will be published in the
# GraphQL API

comment: Ratings of articles of various authors

# A parameterized SQL query where parameters start with ':'
# See https://docs.sqlalchemy.org/en/13/core/tutorial.html#using-textual-sql
query: |
  select * from x where c1 = :c1

# additional information about the parameters that are defined in the query
parameters:

  # name of the parameter
- name: c1

  # the type of the parameter: database scalar type
  type: database_scalar_type

  # (optional) whether the parameter is required or not - shows up in the
  # GraphQL schema for the argument. `true` by default
  required: true_or_false

  # (optional) a default value for the parameter if not provided in the query
  default: default_value_for_the_scalar

# return type of the parameterized query
return_type:

  # can be one of virtual_table or table
  table: name_of_the_table
  virtual_table: name_of_the_virtual_table

  # whether the query returns a zero or one row
  # or whether it returns zero or many rows
  # maybe we can come up with a better name than cardinality?
  cardinality: one_or_many
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

rfc (wip): parameterized queries #7473

Parameterized Queries

Motivation

1. Requirement of DDL privileges

2. Restrictions on the function return type

Parameterized queries

Product requirements

Product Specification

Virtual tables metadata

Parameterized queries metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

rfc (wip): parameterized queries #7473

Description

Parameterized Queries

Motivation

1. Requirement of DDL privileges

2. Restrictions on the function return type

Parameterized queries

Product requirements

Product Specification

Virtual tables metadata

Parameterized queries metadata

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions