
Update helm_prompt_settings.jsonl to allow for evaluation of all tasks #41

Open
JoelNiklaus opened this issue Sep 24, 2024 · 0 comments

Comments

@JoelNiklaus
Contributor

HELM currently evaluates only 5 LegalBench tasks. Ideally, we would like to be able to run the evaluation on all tasks.

I quickly analyzed the structure of the tasks and their prompts. I found that all tasks contain a base_prompt.txt file, a train.tsv file, and a README.md file that could be used to automatically construct a complete helm_prompt_settings.jsonl file.
I saw that the prompts in the helm_prompt_settings.jsonl file are modified versions of base_prompt.txt. Writing the jsonl file for all tasks manually would be a lot of work. Therefore, I would suggest the following:

  1. Extract the first line from base_prompt.txt as general instructions.
  2. Use the train.tsv file to get the possible answer options and provide those as a second line of the instructions. One question here: does the train.tsv file contain all possible answer options for a task at least once?
  3. Use the Data column names field from the README.md to build the field_ordering, label_keys and output_nouns (see the sketch after this list).
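
A minimal sketch of what such a script could look like, assuming the directory layout described above. The "answer" column name, the way the Data column names line is parsed out of README.md, and the exact keys written to helm_prompt_settings.jsonl (field_ordering, label_keys, output_nouns) are assumptions and would need to be checked against the existing file:

```python
# Hypothetical sketch: build helm_prompt_settings.jsonl entries from the files each
# LegalBench task directory ships (base_prompt.txt, train.tsv, README.md).
import csv
import json
import re
from pathlib import Path


def build_entry(task_dir: Path) -> dict:
    # 1. First line of base_prompt.txt as the general instructions.
    instructions = (task_dir / "base_prompt.txt").read_text(encoding="utf-8").splitlines()[0].strip()

    # 2. Collect the answer options seen in train.tsv (assumes an "answer" column
    # and that every possible option appears at least once).
    with (task_dir / "train.tsv").open(encoding="utf-8", newline="") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    options = sorted({row["answer"].strip() for row in rows if row.get("answer")})
    if options:
        instructions += " Answer with one of: " + ", ".join(options) + "."

    # 3. Take the data column names from the README.md "Data column names" line
    # (format assumed) to build field_ordering / label_keys / output_nouns.
    readme = (task_dir / "README.md").read_text(encoding="utf-8")
    match = re.search(r"Data column names.*?:\s*(.+)", readme)
    columns = [c.strip(" `") for c in match.group(1).split(",")] if match else []
    input_fields = [c for c in columns if c != "answer"]

    return {
        "name": task_dir.name,
        "instructions": instructions,
        "field_ordering": input_fields,
        "label_keys": ["answer"],
        "output_nouns": "Answer",
    }


def main(tasks_root: str = "tasks", out_path: str = "helm_prompt_settings.jsonl") -> None:
    with open(out_path, "w", encoding="utf-8") as out:
        for task_dir in sorted(Path(tasks_root).iterdir()):
            if (task_dir / "base_prompt.txt").exists():
                out.write(json.dumps(build_entry(task_dir)) + "\n")


if __name__ == "__main__":
    main()
```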

What do you think? Happy to create a PR for that.
