Updated short description
README.md CHANGED
@@ -11,8 +11,12 @@ tags:
 - evaluate
 - metric
 description: >-
-  HaRiM+ is reference-less metric for summary quality evaluation which hurls the
-
+  HaRiM+ is a reference-free metric for summary-quality evaluation which
+  harnesses the power of a summarization model to estimate the quality of a
+  summary-article pair. <br /> Note that this metric is reference-free and does
+  not require training: it is ready to use without reference text to compare
+  the generation against, and without any model training for scoring.
+short_description: HaRiM+ is a reference-free summary faithfulness measure.
 ---

@@ -83,4 +87,4 @@ Please cite as follows
 pages = "895--924",
 abstract = "One of the challenges of developing a summarization model arises from the difficulty in measuring the factual inconsistency of the generated text. In this study, we reinterpret the decoder overconfidence-regularizing objective suggested in (Miao et al., 2021) as a hallucination risk measurement to better estimate the quality of generated summaries. We propose a reference-free metric, HaRiM+, which only requires an off-the-shelf summarization model to compute the hallucination risk based on token likelihoods. Deploying it requires no additional training of models or ad-hoc modules, which usually need alignment to human judgments. For summary-quality estimation, HaRiM+ records state-of-the-art correlation to human judgment on three summary-quality annotation sets: FRANK, QAGS, and SummEval. We hope that our work, which merits the use of summarization models, facilitates the progress of both automated evaluation and generation of summary.",
 }
-```
+```
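Since the frontmatter tags the Space as an `evaluate` metric, it can presumably be loaded through the Hugging Face `evaluate` library. A minimal usage sketch, not an official example: the Space id `NCSOFT/harim_plus` and the `predictions`/`references` argument names are assumptions here, and "reference-free" means the source articles, not gold summaries, are what get passed alongside the predictions.

```python
import evaluate

# Assumption: the Space id hosting this metric; substitute the actual path.
harim = evaluate.load("NCSOFT/harim_plus")

articles = ["The festival drew 10,000 visitors over three days in Busan."]
summaries = ["The Busan festival attracted 10,000 visitors across three days."]

# Reference-free: source articles stand in for `references`; no gold summaries
# and no extra model training are required.
scores = harim.compute(predictions=summaries, references=articles)
print(scores)
```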
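To make the abstract's "hallucination risk based on token likelihoods" concrete, below is a schematic sketch of scoring a summary by the per-token likelihoods an off-the-shelf summarization model assigns to it. The model name is an illustrative assumption, and the mean log-likelihood aggregation is a crude stand-in, not the actual HaRiM+ formula.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumption: any off-the-shelf summarizer works; BART-large-CNN is one choice.
name = "facebook/bart-large-cnn"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name).eval()

def token_likelihoods(article: str, summary: str) -> torch.Tensor:
    """Per-token probabilities p(y_t | y_<t, article) for the summary tokens."""
    enc = tok(article, return_tensors="pt", truncation=True)
    labels = tok(summary, return_tensors="pt", truncation=True).input_ids
    with torch.no_grad():
        logits = model(**enc, labels=labels).logits  # (1, T, vocab)
    probs = logits.softmax(dim=-1)
    return probs.gather(-1, labels.unsqueeze(-1)).squeeze()

article = "The festival drew 10,000 visitors over three days in Busan."
summary = "The Busan festival attracted 10,000 visitors across three days."
p = token_likelihoods(article, summary)
# Crude proxy only: higher mean log-likelihood ~ lower hallucination risk.
print(float(p.log().mean()))
```

Per the abstract, the actual metric reinterprets a decoder overconfidence-regularizing objective rather than averaging raw likelihoods, so this sketch only illustrates the inputs involved, not the published scoring rule.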