Hi!
I am experimenting with the performance of UQ methods. As the code below shows, I call estimate_uncertainty once per UQ method and collect the score from each call. Will the output generated by the LLM be the same for every call in the loop? In my experiments so far it has been the same, but I could not figure out whether that is a coincidence or guaranteed by the implementation. If the generated texts can differ between calls, comparing methods in this setting would not make sense.
import torch
from lm_polygraph.utils.manager import estimate_uncertainty

def estimate_uncertainties(model, ue_methods, input_text):
    scores = []
    for method_name in ue_methods.keys():
        # Score the same input with each UQ method in turn
        ue = estimate_uncertainty(model, ue_methods[method_name], input_text=input_text)
        scores.append(
            {
                'method': method_name,
                'uncertainty': ue.uncertainty
            }
        )
        torch.cuda.empty_cache()
    # Note: only the generation text from the last call is returned
    return scores, ue.generation_text
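
For reference, this is roughly how the outputs could be compared across calls. It is only a sketch: `check_same_generation` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in for a wrapper that would call estimate_uncertainty and return `ue.generation_text`.

```python
from typing import Callable, Dict, List, Tuple


def check_same_generation(
    generate: Callable[[str, str], str],
    method_names: List[str],
    input_text: str,
) -> Tuple[bool, Dict[str, str]]:
    """Collect the generated text per method and report whether all of them match."""
    texts = {name: generate(name, input_text) for name in method_names}
    # All texts are identical when the set of distinct values has at most one element
    all_same = len(set(texts.values())) <= 1
    return all_same, texts


# Hypothetical stand-in: a real version would wrap estimate_uncertainty(...)
# and return ue.generation_text for the given method.
def fake_generate(method_name: str, input_text: str) -> str:
    return input_text.upper()


same, texts = check_same_generation(fake_generate, ["method_a", "method_b"], "hello")
print(same)
```

A check like this only shows that the texts matched on the runs I tried; it does not tell me whether that is guaranteed.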
Thank you so much for your time!