
Commit

Update
samholt committed Apr 25, 2024
1 parent 8f798cf commit b3445db
Showing 3 changed files with 6 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -139,6 +139,9 @@ To cite [L2MAC](https://openreview.net/forum?id=EhrzQwsV4K) in publications, ple
2. We further evaluated L2MAC on the standard **HumanEval benchmark** and observe that it achieves a state-of-the-art score of [90.2% Pass@1](https://paperswithcode.com/sota/code-generation-on-humaneval).
3. L2MAC also works for general-purpose extensive text-based tasks, such as writing an [entire book from a single prompt](https://samholt.github.io/L2MAC/guide/use_cases/gallery.html#entire-book-italian-pasta-recipe-book).

![HumanEval](docs/public/images/human_eval.png)
<p align="center">LLM-Automatic Computer (L2MAC) achieves strong performance on HumanEval coding benchmark and is currently ranked the <b>3rd best AI coding agent in the world</b> on the global coding <a href="https://paperswithcode.com/sota/code-generation-on-humaneval">industry-standard leaderboard of HumanEval</a>.</p>

### In depth-comparison to AutoGPT and GPT-4

#### Can L2MAC correctly perform task-oriented context management?
3 changes: 3 additions & 0 deletions docs/guide/get_started/comparison_to_autogpt.md
@@ -7,6 +7,9 @@
2. We further evaluated L2MAC on the standard **HumanEval benchmark** and observe that it achieves a state-of-the-art score of [90.2% Pass@1](https://paperswithcode.com/sota/code-generation-on-humaneval).
3. L2MAC also works for general-purpose extensive text-based tasks, such as writing an [entire book from a single prompt](https://samholt.github.io/L2MAC/guide/use_cases/gallery.html#entire-book-italian-pasta-recipe-book).

![HumanEval](/images/human_eval.png)
<p align="center">LLM-Automatic Computer (L2MAC) achieves strong performance on HumanEval coding benchmark and is currently ranked the <b>3rd best AI coding agent in the world</b> on the global coding <a href="https://paperswithcode.com/sota/code-generation-on-humaneval">industry-standard leaderboard of HumanEval</a>.</p>

# In depth-comparison to AutoGPT and GPT-4

## Can L2MAC correctly perform task-oriented context management?
Binary file added docs/public/images/human_eval.png
