Sentiment Evaluation Metric (F1 vs. Accuracy) #464

The [documentation](https://fengshenbang-doc.readthedocs.io/zh/latest/docs/%E4%BA%8C%E9%83%8E%E7%A5%9E%E7%B3%BB%E5%88%97/Erlangshen-MegatronBert-1.3B-Sentiment.html) and [Hugging Face model cards](https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment) for the Erlangshen sentiment analysis models report the following results on the ASAP and ChnSentiCorp benchmarks, without naming the metric.

| Model | ASAP-SENT | ASAP-ASPECT | ChnSentiCorp |
|---|---|---|---|
| Erlangshen-Roberta-110M-Sentiment | 97.77 | 97.31 | 96.61 |
| Erlangshen-Roberta-330M-Sentiment | 97.90 | 97.51 | 96.66 |
| Erlangshen-MegatronBert-1.3B-Sentiment | 98.10 | 97.80 | 97.00 |

What metric is being reported? Is it macro F1, accuracy, or some other metric?
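
To illustrate why the distinction matters, here is a minimal sketch of how accuracy and macro F1 can diverge on the same predictions. The labels below are made-up toy data, not taken from ASAP or ChnSentiCorp, and the helper names `accuracy` and `macro_f1` are my own; I am not assuming this is how the Erlangshen evaluation was actually done.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical, imbalanced toy labels (1 = positive, 0 = negative)
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 8 + [1, 0]  # one of the two negatives is misclassified

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

On this toy example a single error on the minority class leaves accuracy at 0.90 while macro F1 drops to about 0.80, so on imbalanced test sets the reported numbers are hard to compare with other work unless the metric is known.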
