
[BUG]: GitLab data connector doesn't work for self-hosted instances #2315

@blazeyo

How are you running AnythingLLM?

All versions

What happened?

Adding a GitLab data connector for a self-hosted GitLab instance fails with the following error:

TypeError: branches.map is not a function
    at /home/blazej/llm/tools/anything-llm/collector/utils/extensions/RepoLoader/GitlabRepo/RepoLoader/index.js:183:27
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async GitLabRepoLoader.getRepoBranches (/home/blazej/llm/tools/anything-llm/collector/utils/extensions/RepoLoader/GitlabRepo/RepoLoader/index.js:171:23)
    at async /home/blazej/llm/tools/anything-llm/collector/extensions/index.js:65:29
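
The message `branches.map is not a function` indicates that the value held in `branches` was not an array when `.map` was called. A minimal sketch of a defensive check follows. It assumes the parsed response is either an array of branch objects with a `name` field or an error object carrying a `message` field. `toBranchNames` is a hypothetical helper, not a function from the AnythingLLM code base.

```javascript
// Hypothetical helper (not the project's actual code): convert a parsed
// branches response into a list of branch names, failing loudly otherwise.
function toBranchNames(payload) {
  if (!Array.isArray(payload)) {
    // Assumed error shape: an object with a "message" field.
    const detail =
      payload && typeof payload === "object" && "message" in payload
        ? String(payload.message)
        : JSON.stringify(payload);
    throw new Error(`Unexpected branches response: ${detail}`);
  }
  return payload.map((branch) => branch.name);
}
```

With a check like this, the failure would surface as a readable error about the response rather than a TypeError deep inside the loader.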

The problem is in the UrlPattern matching.

GitLab.com

The rule works for gitlab.com projects because the domain is given explicitly in the pattern:

new UrlPattern("https\\://gitlab.com/(:projectId(*))", {
  segmentValueCharset: "a-zA-Z0-9-._~%/+",
}),

The resulting loader state is correct:

GitLabRepoLoader {
  ready: false,
  repo: 'https://gitlab.com/gitlab-com/Product',
  ...
  projectId: 'gitlab-com%2FProduct',
  apiBase: 'https://gitlab.com',
  author: 'gitlab-com',
  project: 'Product',
  branches: []
}

Self-hosted

However, for self-hosted instances the segmentValueCharset is too wide. Because it includes /, the hostname segment matches the hostname along with the repository owner, so only the project name is left for projectId:

new UrlPattern(
  "(:protocol(http|https))\\://(:hostname*)/(:projectId(*))",
  {
    segmentValueCharset: "a-zA-Z0-9-._~%/+",
  }
),

The resulting loader state is missing the owner:

GitLabRepoLoader {
  ready: false,
  repo: 'https://my.selfhostedinstance.com/me/my-project',
  ...
  projectId: 'my-project',
  apiBase: 'https://my.selfhostedinstance.com',
  author: 'my-project',
  project: undefined,
  branches: []
}
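
The over-matching can be illustrated with a plain regular expression. This is only an analogue of the self-hosted rule, written under the assumption that the pattern's segments match greedily in a comparable way. It does not use the url-pattern library, and `matchWide` is a hypothetical name.

```javascript
// Regex analogue of the self-hosted rule; NOT the url-pattern library itself.
// Same characters as the charset in the issue, including "/".
const WIDE_CHARSET = "a-zA-Z0-9._~%/+-";
const widePattern = new RegExp(
  `^(https?)://([${WIDE_CHARSET}]+)/([${WIDE_CHARSET}]+)$`
);

// Hypothetical helper: name the capture groups like the pattern's segments.
function matchWide(repoUrl) {
  const m = widePattern.exec(repoUrl);
  if (!m) return null;
  return { protocol: m[1], hostname: m[2], projectId: m[3] };
}

const result = matchWide("https://my.selfhostedinstance.com/me/my-project");
console.log(result.hostname);  // my.selfhostedinstance.com/me
console.log(result.projectId); // my-project
```

Because "/" is an allowed character, the greedy hostname group extends to the last "/" in the URL and takes the owner with it.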

As a result, the branches request omits the owner from the project path and returns a 404:

  • sent: https://my.selfhostedinstance.com/api/v4/projects/my-project/repository/branches
  • expected: https://my.selfhostedinstance.com/api/v4/projects/me%2Fmy-project/repository/branches
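
The difference between the two URLs comes down to how the project path is encoded. A minimal sketch, assuming the endpoint layout shown above. `branchesUrl` is a hypothetical helper, not part of the loader.

```javascript
// Hypothetical helper: build the branches endpoint for a namespaced project.
function branchesUrl(apiBase, projectPath) {
  // The full "owner/project" path is URL-encoded, so "/" becomes "%2F".
  const projectId = encodeURIComponent(projectPath);
  return `${apiBase}/api/v4/projects/${projectId}/repository/branches`;
}

// Full path, as expected by the API:
console.log(branchesUrl("https://my.selfhostedinstance.com", "me/my-project"));
// Project name only, as produced by the faulty match:
console.log(branchesUrl("https://my.selfhostedinstance.com", "my-project"));
```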

Proposed resolution

  • remove / from segmentValueCharset
  • match the author and project explicitly
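
One way to match the author and project explicitly is sketched below. It swaps in the standard `URL` class instead of UrlPattern, so it illustrates the idea rather than patching the existing rule. `parseGitlabRepoUrl` is a hypothetical name, and treating every path segment before the last one as the author is an assumption.

```javascript
// Hypothetical parser using the standard URL class instead of UrlPattern.
function parseGitlabRepoUrl(repoUrl) {
  const url = new URL(repoUrl); // throws TypeError on malformed input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  // Trim surrounding slashes and an optional ".git" suffix from the path.
  const path = url.pathname.replace(/^\/+|\/+$/g, "").replace(/\.git$/, "");
  const segments = path.split("/").filter(Boolean);
  if (segments.length < 2) {
    throw new Error(`Expected an owner and a project in: ${repoUrl}`);
  }
  // Assumption: the last segment is the project, everything before it the author.
  const project = segments[segments.length - 1];
  const author = segments.slice(0, -1).join("/");
  return {
    apiBase: url.origin, // hostname comes from the URL parser, never from the path
    author,
    project,
    projectId: encodeURIComponent(`${author}/${project}`),
  };
}

console.log(parseGitlabRepoUrl("https://my.selfhostedinstance.com/me/my-project"));
```

Since the hostname is taken from the parsed URL, it cannot absorb the owner, and gitlab.com and self-hosted URLs go through the same code path.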

Are there known steps to reproduce?

Add any self-hosted GitLab project to the GitLab data connector.

Labels: possible bug (Bug was reported but is not confirmed or is unable to be replicated.)
