[FEAT]: Adjusted url links in AnythingLLM citations for chat responses

### How are you running AnythingLLM?

Docker (remote machine)

### What happened?

Hi!

I am building a private chatbot prototype for clients in the education/welfare sector.

**My goal:**
When the chatbot recommends courses (e.g. language courses), I would like it to provide the URL of the website with the course information.

**Problem:**
AnythingLLM substantially alters the uploaded URL in the documents section. This happens both with the bulk link scraper and when uploading manually through the API upload_link() function. In both cases the original URL is changed, and the chatbot recommends a broken link.

**Example** for a course in German:
* **Broken** link as shown in the AnythingLLM documents section:
www_vhs muehldorf.de-programmberuf-karrierekursIHK-Fachkraft-Rechnungswesen-Steuerrechtliche-GrundlagenA20000.html
* **Real** link from the website, as I uploaded it:
https://www.vhs-muehldorf.de/programm/beruf-karriere/kurs/IHK-Fachkraft-Rechnungswesen-Steuerrechtliche-Grundlagen/A20000

**Short-term workaround:**
I give the bot some examples of correct links in the system prompt, but that does not always work, and it consumes input tokens.

I would be happy to see a solution for this in AnythingLLM! As a software engineer, I could also collaborate or help. Thanks and blessings!

**Helpful information:**
In my loaded sources, the chatbot takes the link information from the document metadata, specifically the sourceDocument field. For example:
sourceDocument: www_vhs-lingen.de-programmkursPrivate-Kochkurse-nur-fuer-Euch2024H92000.html published: 10/7/2024, 7:57:11 AM </document_metadata>

Inside the citations dropdown, however, the links are correct, so I suppose they are stored correctly somewhere. If the original URL were also saved in the metadata, that might solve the problem. Usable links may also be available under chunkSource.
![citations_links](https://github.com/user-attachments/assets/411ccd3c-d921-49a1-abc8-0d74bb697a24)
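
If the correct URL really is available under chunkSource, a client could recover it from the chat response without relying on the mangled sourceDocument name. The following is only a minimal sketch under assumptions I have not verified: that each entry in the response's sources list is a dict with a `chunkSource` key, and that scraped links carry a `link://` prefix. The helper name `citation_urls` is hypothetical.

```python
from typing import Any

# Assumption: chunkSource values for scraped links look like "link://https://...".
LINK_PREFIX = "link://"


def citation_urls(sources: list[dict[str, Any]]) -> list[str]:
    """Collect the original URLs from a list of citation/source entries.

    Entries without a usable chunkSource are skipped; duplicates are dropped
    while keeping the order of first appearance.
    """
    urls: list[str] = []
    for source in sources:
        chunk_source = source.get("chunkSource") or ""
        if chunk_source.startswith(LINK_PREFIX):
            url = chunk_source[len(LINK_PREFIX):]
        elif chunk_source.startswith(("http://", "https://")):
            url = chunk_source
        else:
            continue  # no recognizable URL in this entry
        if url not in urls:
            urls.append(url)
    return urls
```

The returned list could then be appended to the chatbot's answer as plain links, so the model never has to reproduce the URL itself.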

I could possibly set the link in the metadata myself with the API function (raw-text). I still need to test that.
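
A rough sketch of what that raw-text idea might look like. Everything about the request shape here is an assumption to check against the instance's own API documentation: the `textContent` and `metadata` field names, the `title` and `url` metadata keys, and the helper name `build_raw_text_payload`, which is hypothetical. The sketch only builds the request body and does not send it.

```python
import json


def build_raw_text_payload(text: str, title: str, url: str) -> dict:
    """Build a request body for a raw-text document upload.

    The original URL is stored twice: in the metadata (assumed key "url")
    and as the first line of the text, so it survives into the chunks even
    if the metadata is not passed on to the model.
    """
    return {
        "textContent": f"Source URL: {url}\n\n{text}",
        "metadata": {"title": title, "url": url},
    }


payload = build_raw_text_payload(
    text="IHK-Fachkraft Rechnungswesen: Steuerrechtliche Grundlagen ...",
    title="IHK-Fachkraft Rechnungswesen",
    url=(
        "https://www.vhs-muehldorf.de/programm/beruf-karriere/kurs/"
        "IHK-Fachkraft-Rechnungswesen-Steuerrechtliche-Grundlagen/A20000"
    ),
)
body = json.dumps(payload)  # JSON string to send as the POST body
```

Prepending the URL to the text costs a few tokens per chunk, but far fewer than keeping example links in the system prompt.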


### Are there known steps to reproduce?

_No response_
