Added ragas to compute string metrics for evaluation. #1039
Conversation
A couple of small refactoring requests.
if not result.result:
    console.print("[yellow] No query execution result available, skipping..", style="italic")

# Evaluate string metrics
if not query.expected:
this function is large enough now that it might make sense to separate the specific metric types into separate methods (doc_retrieval metrics, generated_answer metrics)
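A minimal sketch of the kind of split being suggested (the function names and the metrics container below are illustrative, not the actual driver code):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metrics:
    # Illustrative container; the real driver defines its own metrics model.
    doc_retrieval_precision: Optional[float] = None
    doc_retrieval_recall: Optional[float] = None
    generated_answer_similarity: Optional[float] = None


def compute_doc_retrieval_metrics(retrieved_docs: list[str], expected_docs: list[str], metrics: Metrics) -> Metrics:
    # Set-based precision/recall over retrieved vs. expected document ids.
    retrieved, expected = set(retrieved_docs), set(expected_docs)
    if retrieved:
        metrics.doc_retrieval_precision = len(retrieved & expected) / len(retrieved)
    if expected:
        metrics.doc_retrieval_recall = len(retrieved & expected) / len(expected)
    return metrics


def compute_generated_answer_metrics(response: str, reference: str, metrics: Metrics) -> Metrics:
    # String-similarity scoring of the generated answer (e.g. via Ragas) would live here.
    metrics.generated_answer_similarity = float(response.strip() == reference.strip())  # placeholder
    return metrics
```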
I have made these changes.
apps/query-eval/queryeval/driver.py (outdated)
    response=result.result,
    reference=query.expected,
)
scores = asyncio.run(
Do we really need to make compute_string_metrics an async method? Can we keep it sync and have it wait for the processing internally?
This adds a little complexity and I'm not sure why it's needed.
I looked into this for a while as well. The Ragas method that computes the score is async, which is why we have to do this. Do you have any suggestions?
You can do a variant of https://chatgpt.com/share/67466a08-1488-8007-b4fa-2d2538c74071
Basically, do the waiting for async completion inside the compute_string_metrics method.
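A minimal sketch of that suggestion, assuming the Ragas 0.2-style SingleTurnSample / single_turn_ascore API; the metric choice and the function signature here are illustrative, not the PR's actual code:

```python
import asyncio

from ragas.dataset_schema import SingleTurnSample
from ragas.metrics import RougeScore  # illustrative string metric


def compute_string_metrics(response: str, reference: str) -> float:
    # Synchronous from the caller's point of view: the async Ragas scoring
    # call is driven to completion here with asyncio.run, so do_eval never
    # has to deal with the event loop.
    sample = SingleTurnSample(response=response, reference=reference)
    metric = RougeScore()
    return asyncio.run(metric.single_turn_ascore(sample))
```

One caveat: asyncio.run raises if it is called from inside an already-running event loop, so this pattern only works while do_eval itself stays synchronous.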
There will be no waiting as such, because do_eval runs one sample at a time. The Ragas authors made it an async function to handle large workloads.
Regardless, I have made the changes.
Right, I just mean in terms of the function APIs.
apps/query-eval/queryeval/driver.py (outdated)
    metrics.doc_retrieval_precision = len(retrieved_doc_set & expected_doc_set) / len(retrieved_doc_set)
    return metrics

def get_string_metrics(
nit: generated_answer_metrics