Description
How are you running AnythingLLM?
Docker (local)
What happened?
When using the GitLab RepoLoader, there is potential for at least one infinite loop. The same warning is logged over and over:
...
Unexpected response format for /api/v4/projects/developer%2Fdata-source/repository/branches: {message: '404 Project Not Found' }
Unexpected response format for /api/v4/projects/developer%2Fdata-source/repository/branches: {message: '404 Project Not Found' }
Unexpected response format for /api/v4/projects/developer%2Fdata-source/repository/branches: {message: '404 Project Not Found' }
...
I noticed the issue in getRepoBranches(), but it could possibly apply to fetchFilesRecursive() and fetchIssues() as well under certain conditions.
/**
 * Retrieves all branches for the repository.
 * @returns {Promise<string[]>} An array of branch names.
 */
async getRepoBranches() {
  if (!this.#validGitlabUrl() || !this.projectId) return [];
  await this.#validateAccessToken();
  this.branches = [];
  const branchesRequestData = {
    endpoint: `/api/v4/projects/${this.projectId}/repository/branches`,
  };
  let branchesPage = [];
  while ((branchesPage = await this.fetchNextPage(branchesRequestData))) {
    this.branches.push(...branchesPage.map((branch) => branch.name));
  }
  return this.#branchPrefSort(this.branches);
}
This while loop keeps running for as long as fetchNextPage() returns a truthy value:
while ((branchesPage = await this.fetchNextPage(branchesRequestData))) {
  this.branches.push(...branchesPage.map((branch) => branch.name));
}
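For illustration, here is a minimal sketch of a more defensive way to drain a paginated source. `collectAllPages`, `fetchPage`, and `maxPages` are hypothetical names that do not exist in AnythingLLM; the idea is only that the loop should stop on an empty page as well as on `null`, with a hard cap as a safety net.

```javascript
// Hypothetical helper, not part of AnythingLLM. `fetchPage` stands in for
// a call such as () => this.fetchNextPage(branchesRequestData).
async function collectAllPages(fetchPage, maxPages = 1000) {
  const items = [];
  for (let pageCount = 0; pageCount < maxPages; pageCount++) {
    const page = await fetchPage();
    // Stop on null/undefined AND on an empty array: [] is truthy, so a
    // plain `while ((page = await fetchPage()))` would not stop on it.
    if (!Array.isArray(page) || page.length === 0) break;
    items.push(...page);
  }
  return items;
}
```

With something like this, getRepoBranches() could map the collected records to branch names instead of relying on the truthiness of each page. The cap of 1000 pages is an arbitrary assumption and would need tuning for large repositories.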
/**
 * Fetches the next page of data from the API.
 * @param {Object} requestData - The request data.
 * @returns {Promise<Array<Object>|null>} The next page of data, or null if no more pages.
 */
async fetchNextPage(requestData) {
  try {
    if (requestData.page === -1) return null;
    if (!requestData.page) requestData.page = 1;
    const { endpoint, perPage = 100, queryParams = {} } = requestData;
    const params = new URLSearchParams({
      ...queryParams,
      per_page: perPage,
      page: requestData.page,
    });
    const url = `${this.apiBase}${endpoint}?${params.toString()}`;
    const response = await fetch(url, {
      method: "GET",
      headers: this.accessToken ? { "PRIVATE-TOKEN": this.accessToken } : {},
    });
    // Rate limits get hit very often if no PAT is provided
    if (response.status === 401) {
      console.warn(`Rate limit hit for ${endpoint}. Skipping.`);
      return null;
    }
    const totalPages = Number(response.headers.get("x-total-pages"));
    const data = await response.json();
    if (!Array.isArray(data)) {
      console.warn(`Unexpected response format for ${endpoint}:`, data);
      return [];
    }
    console.log(
      `Gitlab RepoLoader: fetched ${endpoint} page ${requestData.page}/${totalPages} with ${data.length} records.`
    );
    if (totalPages === requestData.page) {
      requestData.page = -1;
    } else {
      requestData.page = Number(response.headers.get("x-next-page"));
    }
    return data;
  } catch (e) {
    console.error(`RepoLoader.fetchNextPage`, e);
    return null;
  }
}
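When the response body is not an array (for example, the { message: '404 Project Not Found' } object in the log above), the function returns an empty array [], which is truthy in JavaScript. The while loop therefore never exits, and because requestData.page is not updated on this path, the same failing request is repeated on every iteration: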
if (!Array.isArray(data)) {
  console.warn(`Unexpected response format for ${endpoint}:`, data);
  return [];
}
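One possible fix, sketched here under assumptions rather than taken from the project, is to treat a non-array body as the end of pagination and return `null` instead of `[]`. `normalizePage` is a hypothetical standalone helper used only to keep the example self-contained; in the real code the same two lines could live directly inside fetchNextPage().

```javascript
// Hypothetical helper, not part of AnythingLLM: decides what a paginated
// fetch should hand back to its caller for a given parsed response body.
function normalizePage(data, requestData, endpoint) {
  if (!Array.isArray(data)) {
    console.warn(`Unexpected response format for ${endpoint}:`, data);
    requestData.page = -1; // mark pagination as finished (same sentinel as above)
    return null; // falsy, so `while ((page = await ...))` exits
  }
  return data;
}

// An empty array is truthy, which is why returning it keeps the loop alive.
console.log(Boolean([])); // true
console.log(Boolean(null)); // false
```

Returning `null` matches the documented contract of fetchNextPage() ("null if no more pages"), so callers such as getRepoBranches() would not need to change.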
Note: This tends to happen immediately, because the modal tries to fetch branches as soon as the link is entered into the text box. If you also have an access token to add, the request has already failed by the time you enter it. You can work around this by entering the access token first.
Are there known steps to reproduce?
Have a non-public GitLab repo
Paste the link to that repo into the GitLab connector