这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@tolgaozen
Copy link
Member

@tolgaozen tolgaozen commented Oct 24, 2025

…on to v1.4.7

Summary by CodeRabbit

  • Documentation

    • Updated API reference and OpenAPI docs to v1.4.7.
  • Chores

    • Version bumped to v1.4.7.
  • Tests

    • Added a race-condition integration test suite with Docker Compose setup and runnable scripts to exercise concurrent permission scenarios.

@coderabbitai
Copy link

coderabbitai bot commented Oct 24, 2025

Walkthrough

This PR bumps version metadata to v1.4.7, adds a race-condition integration-test suite with Docker Compose and test scripts, and refactors PostgreSQL storage to propagate and encode snapshot information with RepeatableRead isolation.

Changes

Cohort / File(s) Summary
Version Updates (docs & proto)
docs/api-reference/apidocs.swagger.json, docs/api-reference/openapiv2/apidocs.swagger.json, proto/base/v1/openapi.proto
Bumped OpenAPI/Swagger version from v1.4.6 → v1.4.7.
Version Update (code)
internal/info.go
Bumped Version constant from "v1.4.6" → "v1.4.7".
Race Condition Integration Tests
integration-test/race_condition/README.md, integration-test/race_condition/docker-compose-test.yml, integration-test/race_condition/schema.perm, integration-test/race_condition/test_race_condition.sh, integration-test/race_condition/test_race_condition_parallel.sh
Added Docker Compose test setup, schema for user/doc entities and relations, and bash scripts to exercise concurrent permission deletions and checks.
Snapshot Token & Tests
internal/storage/postgres/snapshot/token.go, internal/storage/postgres/snapshot/token_test.go
Added Snapshot field to Token; NewToken now accepts snapshot string; Encode/Decode support legacy binary xid and new base64 "xid:snapshot" format; tests updated for both formats.
Postgres Data Reader
internal/storage/postgres/data_reader.go
Switched transaction isolation to REPEATABLE READ; propagate snapshot token through read paths; HeadSnapshot now returns id and snapshot.
Postgres Data Writer
internal/storage/postgres/data_writer.go
Transaction retrieval now scans xid and snapshotValue; return paths pass snapshot to NewToken.
Postgres Snapshot Utilities & Tests
internal/storage/postgres/utils/common.go, internal/storage/postgres/utils/common_test.go
TransactionTemplate returns id, snapshot; added createFinalSnapshot; SnapshotQuery signature gains snapshotValue and uses snapshot-aware visibility logic; tests extended for legacy and snapshot modes.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Reader as DataReader
    participant DB as PostgreSQL
    participant Token

    rect rgba(220,240,255,0.35)
    Note over Reader,DB: Snapshot-aware read (new)
    end

    Client->>Reader: Read request
    activate Reader
    Reader->>DB: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ
    activate DB
    DB-->>Reader: transaction started (xid, snapshot)
    Reader->>Token: NewToken(xid, snapshot)
    Token-->>Reader: Encoded snapshot token (base64 "xid:snapshot" or legacy)
    Reader->>DB: SnapshotQuery(..., finalSnapshot)
    DB-->>Reader: Visible rows per snapshot
    Reader-->>Client: Response + snapshot token
    deactivate Reader
    deactivate DB
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas needing extra attention:

  • snapshot token Encode/Decode compatibility and base64 edge cases (internal/storage/postgres/snapshot/token.go, tests)
  • visibility logic in SnapshotQuery and createFinalSnapshot (internal/storage/postgres/utils/common.go)
  • transaction isolation change impact in data_reader.go (deadlocks/consistency)
  • end-to-end propagation of snapshot values across reader/writer paths

Possibly related PRs

Suggested reviewers

  • tolgaOzen

Poem

🐇 In burrows of code I softly tread,

Tokens stitched with snapshot thread,
Tests that race in Docker light,
Version hopped to v1.4.7 tonight,
May reads be stable, and sleeps be tight.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat(integration-test): add race condition tests and update API version to v1.4.7" is partially related to the changeset. It accurately captures two real aspects of the PR: the addition of new race condition integration tests (multiple test files in integration-test/race_condition/) and the version update from v1.4.6 to v1.4.7 across multiple files. However, the title does not emphasize the primary functional changes, which are significant modifications to the PostgreSQL storage layer snapshot handling (in data_reader.go, data_writer.go, snapshot/token.go, and utils/common.go) that represent the core implementation to address race conditions. While the title references a real part of the change rather than being misleading, it focuses on the testing and administrative aspects rather than the main implementation work.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/race-condition

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9021b2d and 31f5528.

📒 Files selected for processing (1)
  • internal/storage/postgres/utils/common.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Scan Container for Vulnerabilities
  • GitHub Check: Test with Coverage
  • GitHub Check: Analyze (go)
  • GitHub Check: Test with Coverage
🔇 Additional comments (2)
internal/storage/postgres/utils/common.go (2)

25-25: LGTM: Snapshot capture added to transaction creation.

The addition of snapshot to the RETURNING clause enables capturing the transaction's snapshot at creation time, which is essential for the race condition handling improvements in this PR.


39-91: Past review issues have been addressed correctly.

The function now properly:

  • Uses strings.SplitN instead of the non-existent strings.SplitSeq
  • Uses correct range syntax for _, xip := range
  • Includes the safety guard for xid <= xmin (lines 70-73)

The snapshot manipulation logic is sound and handles the edge cases appropriately for PostgreSQL MVCC visibility semantics.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/storage/postgres/data_writer.go (1)

214-224: Fix: Exec error is overwritten by Close result; can return nil, nil on failure

On Exec failure you assign err = batchResult.Close() and then return err. If Close returns nil, the function returns nil, nil. Preserve the Exec error and still close the batch.

Apply these diffs.

Write path:

- batchResult := tx.SendBatch(ctx, batch)
- for i := 0; i < batch.Len(); i++ {
-   _, err = batchResult.Exec()
-   if err != nil {
-     err = batchResult.Close()
-     if err != nil {
-       return nil, err
-     }
-     return nil, err
-   }
- }
+ batchResult := tx.SendBatch(ctx, batch)
+ for i := 0; i < batch.Len(); i++ {
+   if _, execErr := batchResult.Exec(); execErr != nil {
+     _ = batchResult.Close() // best-effort cleanup
+     return nil, execErr
+   }
+ }

RunBundle path:

- batchResult := tx.SendBatch(ctx, batch)
- for i := 0; i < batch.Len(); i++ {
-   _, err = batchResult.Exec()
-   if err != nil {
-     err = batchResult.Close()
-     if err != nil {
-       return nil, err
-     }
-     return nil, err
-   }
- }
+ batchResult := tx.SendBatch(ctx, batch)
+ for i := 0; i < batch.Len(); i++ {
+   if _, execErr := batchResult.Exec(); execErr != nil {
+     _ = batchResult.Close() // best-effort cleanup
+     return nil, execErr
+   }
+ }

Current Close-after-success block (Lines 226-229 and 371-374) can remain to propagate Close errors.

Also applies to: 359-369

♻️ Duplicate comments (1)
integration-test/race_condition/test_race_condition_parallel.sh (1)

74-98: Same output interleaving issue in remaining parallel checks.

The same output interleaving problem from lines 62-72 also affects the parallel checks after 1s wait and 5s wait. Apply similar fixes to all three sections to properly capture and display results.

🧹 Nitpick comments (13)
internal/info.go (1)

26-26: Core Version constant bumped to v1.4.7 — verified OK.

Verification confirms no leftover v1.4.6 references and public API docs correctly reflect v1.4.7. For future releases, consider driving doc versions from a single source (or CI inject) to prevent drift across documentation.

integration-test/race_condition/docker-compose-test.yml (1)

2-8: Document test-only credentials.

The hardcoded credentials (pg/pg) are appropriate for a local test environment. Consider adding a comment to clarify these are test-only and should not be used in production.

Apply this diff to add clarifying comments:

+  # Test environment only - uses simple credentials for local testing
   postgres:
     image: postgres:17-alpine
     environment:
integration-test/race_condition/README.md (1)

14-22: Consider adding expected outcomes and prerequisites.

The test instructions are clear, but could be enhanced with:

  1. Expected test results (what success looks like)
  2. Note about making scripts executable (chmod +x *.sh)
  3. Prerequisites (Docker, Docker Compose, curl, jq)

Consider adding a section like:

### Prerequisites
- Docker and Docker Compose
- `curl` and `jq` installed

### Expected Results
Both tests should demonstrate that permissions are correctly isolated after concurrent deletions. The `read` permission should remain granted while `comment` and `edit` should be denied after deletion.

**Note:** Make scripts executable before running: `chmod +x *.sh`
integration-test/race_condition/test_race_condition.sh (3)

1-7: Consider adding error handling and validation.

The script correctly uses set -e but doesn't validate API responses. Consider capturing and checking HTTP status codes or response content to ensure operations succeed.

Example enhancement:

#!/bin/bash
set -e

# Write schema
RESPONSE=$(curl -sS -w '\n%{http_code}' 'http://localhost:3477/v1/tenants/t1/schemas/write' \
     -H "Content-Type: application/json" \
     -d "$(cat "schema.perm" | jq -Rs '{schema: . }')")

HTTP_CODE=$(echo "$RESPONSE" | tail -n1)
if [ "$HTTP_CODE" -ne 200 ]; then
    echo "Schema write failed with HTTP $HTTP_CODE"
    exit 1
fi

22-33: Clarify the metadata depth parameter.

The depth: 20 parameter is used but not documented. Consider adding a comment explaining why this depth is chosen for the race condition test, or confirm if this is necessary for the test scenario.

 function check_access() {
     ACTION=$1
-
+    # depth: 20 ensures deep permission resolution across potential relation chains
     curl -sS -w '\n' 'http://localhost:3477/v1/tenants/t1/permissions/check' \

50-76: Test lacks programmatic validation.

The test outputs results but doesn't validate them programmatically. Consider adding assertions to verify:

  1. Read permission remains granted throughout
  2. Comment and edit permissions are denied after deletion
  3. Results are stable after the wait period

Example validation:

# After DELETE checks
READ_RESULT=$(check_access 'read' | jq -r '.can')
COMMENT_RESULT=$(check_access 'comment' | jq -r '.can')
EDIT_RESULT=$(check_access 'edit' | jq -r '.can')

if [ "$READ_RESULT" != "CHECK_RESULT_ALLOWED" ]; then
    echo "ERROR: Read permission should still be granted"
    exit 1
fi

if [ "$COMMENT_RESULT" = "CHECK_RESULT_ALLOWED" ]; then
    echo "ERROR: Comment permission should be denied"
    exit 1
fi
integration-test/race_condition/test_race_condition_parallel.sh (1)

1-46: Consider extracting shared code to reduce duplication.

Lines 1-46 are identical to test_race_condition.sh. Consider extracting the shared functions and setup into a common file (e.g., test_common.sh) that both scripts can source.

Example structure:

test_common.sh:

#!/bin/bash

BASE_URL="http://localhost:3477"
TENANT="t1"

function check_access() { ... }
function drop_access() { ... }
function setup_schema() { ... }
function grant_permissions() { ... }

Then in both test scripts:

#!/bin/bash
set -e
source "$(dirname "$0")/test_common.sh"

setup_schema
grant_permissions
# ... test-specific logic
internal/storage/postgres/data_reader.go (2)

36-38: RepeatableRead + snapshot-aware filters LGTM; reduce repeated type assertions

  • Isolation change to RepeatableRead is appropriate for stable reads.
  • Passing both xid and snapshot into utils.SnapshotQuery is consistent with the new API.
  • Minor: store st as a concrete snapshot.Token once to avoid repeated assertions.

Example:

- var st token.SnapToken
- st, err = snapshot.EncodedToken{Value: snap}.Decode()
+ var st token.SnapToken
+ st, err = snapshot.EncodedToken{Value: snap}.Decode()
  if err != nil { /* ... */ }
- builder = utils.SnapshotQuery(builder, st.(snapshot.Token).Value.Uint, st.(snapshot.Token).Snapshot)
+ tok := st.(snapshot.Token)
+ builder = utils.SnapshotQuery(builder, tok.Value.Uint, tok.Snapshot)

Also applies to: 60-60, 135-135, 223-223, 281-281, 369-369, 482-482


559-575: Add supporting index for HeadSnapshot query

Create the composite index to avoid seq scans and reduce latency under load:

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_transactions_tenant_id_id
ON transactions(tenant_id, id DESC);

Add a reversible migration.

Also applies to: 586-587

internal/storage/postgres/utils/common_test.go (1)

28-40: Good coverage; add case for xid > xmax

Please add a test where xid > xmax to validate createFinalSnapshot adjusts xmax and xip_list appropriately.

Example:

It("Case 4: New mode with snapshot (xid > xmax)", func() {
  sl := squirrel.Select("column").From("table")
  revision := uint64(50)
  snapshot := "30:40:35" // xid(50) > xmax(40) -> expect adjusted final snapshot
  query := utils.SnapshotQuery(sl, revision, snapshot)
  sql, args, err := query.ToSql()
  Expect(err).ShouldNot(HaveOccurred())
  Expect(sql).Should(Equal(
    "SELECT column FROM table WHERE (pg_visible_in_snapshot(created_tx_id, ?) = true OR created_tx_id = ?::xid8) AND ((pg_visible_in_snapshot(expired_tx_id, ?) = false OR expired_tx_id = ?::xid8) AND expired_tx_id <> ?::xid8)",
  ))
  Expect(args[0]).ShouldNot(Equal(snapshot)) // final snapshot should change
  Expect(args).Should(HaveLen(5))
})

Also applies to: 42-55, 56-69

internal/storage/postgres/snapshot/token_test.go (1)

41-54: Extend tests: comparator with snapshots and invalid new-format

  • Verify Eg/Gt/Lt ignore Snapshot by adding cases where Snapshot differs but Value equals.
  • Add a failing decode case for base64-valid but invalid format (no colon), e.g., base64("123") -> expect error.

Happy to draft these tests if you want.

Also applies to: 116-127, 136-151, 153-165

internal/storage/postgres/snapshot/token.go (1)

37-55: Token format handling looks solid; tighten errors and doc comment

  • Return a typed sentinel error for invalid token formats to simplify caller handling, e.g., var ErrInvalidTokenFormat = errors.New("invalid token format"), and use it instead of fmt.Errorf.
  • Fix the comment on EncodedToken.String() (it returns the encoded value, doesn’t decode).
  • Optional: consider prefixing new formats (e.g., "v2:xid:snapshot") to future‑proof beyond the len==8 heuristic.

Also applies to: 75-119, 121-124

internal/storage/postgres/utils/common.go (1)

121-121: Cast snapshot parameter to pg_snapshot for type safety.

Passing a raw text parameter relies on implicit casts. Be explicit to avoid version/driver quirks.

Apply:

-    squirrel.Expr("pg_visible_in_snapshot(created_tx_id, ?) = true", finalSnapshot),
+    squirrel.Expr("pg_visible_in_snapshot(created_tx_id, ?::pg_snapshot) = true", finalSnapshot),
@@
-      squirrel.Expr("pg_visible_in_snapshot(expired_tx_id, ?) = false", finalSnapshot),
+      squirrel.Expr("pg_visible_in_snapshot(expired_tx_id, ?::pg_snapshot) = false", finalSnapshot),

Also applies to: 128-128

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00bea4c and 9021b2d.

⛔ Files ignored due to path filters (1)
  • pkg/pb/base/v1/openapi.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (15)
  • docs/api-reference/apidocs.swagger.json (1 hunks)
  • docs/api-reference/openapiv2/apidocs.swagger.json (1 hunks)
  • integration-test/race_condition/README.md (1 hunks)
  • integration-test/race_condition/docker-compose-test.yml (1 hunks)
  • integration-test/race_condition/schema.perm (1 hunks)
  • integration-test/race_condition/test_race_condition.sh (1 hunks)
  • integration-test/race_condition/test_race_condition_parallel.sh (1 hunks)
  • internal/info.go (1 hunks)
  • internal/storage/postgres/data_reader.go (9 hunks)
  • internal/storage/postgres/data_writer.go (6 hunks)
  • internal/storage/postgres/snapshot/token.go (4 hunks)
  • internal/storage/postgres/snapshot/token_test.go (7 hunks)
  • internal/storage/postgres/utils/common.go (2 hunks)
  • internal/storage/postgres/utils/common_test.go (1 hunks)
  • proto/base/v1/openapi.proto (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
internal/storage/postgres/utils/common_test.go (1)
internal/storage/postgres/utils/common.go (2)
  • SnapshotQuery (91-136)
  • ActiveRecordTxnID (32-32)
internal/storage/postgres/data_reader.go (2)
internal/storage/postgres/utils/common.go (1)
  • SnapshotQuery (91-136)
internal/storage/postgres/snapshot/token.go (1)
  • Token (18-21)
internal/storage/postgres/snapshot/token.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/snapshot/token_test.go (2)
internal/storage/postgres/snapshot/token.go (3)
  • NewToken (29-34)
  • EncodedToken (23-25)
  • Token (18-21)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/data_writer.go (3)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/utils/common.go (1)
  • TransactionTemplate (25-25)
internal/storage/postgres/snapshot/token.go (1)
  • NewToken (29-34)
🪛 Checkov (3.2.334)
integration-test/race_condition/docker-compose-test.yml

[medium] 14-15: Basic Auth Credentials

(CKV_SECRET_4)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test with Coverage
  • GitHub Check: Analyze (go)
  • GitHub Check: Scan Container for Vulnerabilities
  • GitHub Check: Test with Coverage
🔇 Additional comments (9)
docs/api-reference/openapiv2/apidocs.swagger.json (1)

6-6: Version bump looks good.

OpenAPI info.version updated to v1.4.7; no other changes in this doc block.

docs/api-reference/apidocs.swagger.json (1)

6-6: Version bump confirmed.

Doc version is now v1.4.7; consistent with the PR intent.

proto/base/v1/openapi.proto (1)

12-12: Proto OpenAPI info.version updated correctly.

Keeps docs and generated artifacts aligned for v1.4.7.

integration-test/race_condition/docker-compose-test.yml (1)

10-17: LGTM - Test environment configuration is appropriate.

The configuration correctly sets up the Permify service with PostgreSQL backend. The static analysis warning about credentials in the URI is a false positive in this context—embedding credentials in connection strings is standard practice for containerized test environments where the database is not exposed externally.

integration-test/race_condition/schema.perm (1)

1-11: LGTM - Clean test schema.

The schema is appropriately simple for testing race conditions. The direct mapping between relations and permissions makes it easy to verify the behavior of concurrent deletions.

integration-test/race_condition/test_race_condition.sh (1)

9-19: LGTM - Permission grants are correct.

The initial permission setup properly grants all three permissions (read, comment, edit) that will be tested in subsequent operations.

internal/storage/postgres/data_writer.go (1)

182-186: Snapshot propagation LGTM

Scanning xid+snapshot and returning snapshot-aware tokens looks correct and consistent with utils.TransactionTemplate and snapshot.NewToken signature.

Also applies to: 237-237, 257-261, 313-313, 335-339, 380-380

internal/storage/postgres/utils/common.go (2)

91-115: Review comment is incorrect—all SnapshotQuery call sites already pass the required third argument.

All 6 production call sites in data_reader.go and all 3 test call sites in common_test.go have been updated to pass snapshotValue. No compilation failures or missing arguments exist.

Likely an incorrect or invalid review comment.


25-25: All TransactionTemplate call sites already correctly handle the two-column return.

All three usages in data_writer.go (lines 185, 260, 338) properly scan into both &xid and &snapshotValue, matching the RETURNING id, snapshot clause. No updates needed.

@tolgaozen tolgaozen merged commit 3b88322 into master Oct 25, 2025
13 of 15 checks passed
@tolgaozen tolgaozen deleted the feature/race-condition branch October 25, 2025 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants