Better edit syntax for the improve command #869

Closed
ATheorell opened this issue Nov 25, 2023 · 0 comments
Labels
enhancement New feature or request


Feature description

gpt-engineer supports improving existing code with natural language prompts using the --improve flag. On a technical level, gpt-engineer improves code by sending the improve prompt and the code to the LLM and asking the LLM to provide improvements as edit blocks in the format of the following example:

some/dir/example_1.py
<<<<<<< HEAD
    def mul(a,b):
=======
    def add(a,b):
>>>>>>> updated

This idea is largely borrowed from aider.
However, this approach is rather error-prone. It requires either that the LLM reproduce the HEAD part of the edit block identically to a part of the existing code, which quite often is not the case, or that some intelligent heuristics handle the case when no exact match is found. Problems with the current implementation are reported in #721, #814 and #841.
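
To illustrate the failure mode, here is a minimal sketch of exact-match edit-block application. It is not gpt-engineer's actual implementation; the function name `apply_edit_block` and its signature are hypothetical, chosen only for this example.

```python
def apply_edit_block(source: str, head: str, updated: str) -> str:
    """Replace the first exact occurrence of `head` in `source` with `updated`.

    Hypothetical helper for illustration; not part of gpt-engineer.
    """
    if head not in source:
        # Without heuristics, any deviation in the HEAD part ends here.
        raise ValueError("HEAD block does not match any part of the file")
    return source.replace(head, updated, 1)


source = "def mul(a,b):\n    return a * b\n"

# An exact reproduction of the existing code applies cleanly.
print(apply_edit_block(source, "def mul(a,b):", "def add(a,b):"))

# A HEAD that differs only by one space is already rejected.
try:
    apply_edit_block(source, "def mul(a, b):", "def add(a, b):")
except ValueError as err:
    print(err)
```

The second call fails even though the intent of the edit is obvious, which is the kind of mismatch the heuristics mentioned above would have to absorb.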

An alternative way to make edits is to prompt the LLM to provide edits in the classic diff syntax:

28 -    workspace = FileRepository(eval_ob["project_root"])
28 +    workspace = OnDiskRepository(eval_ob["project_root"])

This has the advantage that the file name plus the line number uniquely defines where to apply the edit, enabling edits even if the existing code is not reproduced perfectly. Line numbers are currently not stored and not provided to the LLM. The easiest way to do this is probably to implement a method in the Code class that equips a code object with line numbers, and to update the to_chat method to provide the line numbers exactly the way they look in the diff syntax. The same line numbers should then be reused when parsing edits from the LLM back into the code.
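
A rough sketch of the idea, under stated assumptions: the functions `number_lines` and `apply_numbered_edits` are hypothetical and stand in for whatever would be added to the Code class and its to_chat method. The sketch also assumes each edit line has the form `<line number> <+ or -><line content>`, with the content taken verbatim after the sign, and that a `+` line is inserted at the position of the original line with that number.

```python
import re

# Assumed edit format: "<number> <sign><content>", e.g. "28 -    foo()".
EDIT_RE = re.compile(r"^(\d+) ([+-])(.*)$")


def number_lines(code: str) -> str:
    """Prefix every line with its 1-based line number (what the LLM would see)."""
    return "\n".join(
        f"{i} {line}" for i, line in enumerate(code.splitlines(), start=1)
    )


def apply_numbered_edits(code: str, edits: str) -> str:
    """Apply line-numbered edits; removals are located by number only."""
    removals = set()
    additions = {}
    for raw in edits.splitlines():
        match = EDIT_RE.match(raw)
        if match is None:
            continue  # ignore anything that is not an edit line
        number, sign, text = int(match.group(1)), match.group(2), match.group(3)
        if sign == "-":
            # The removed text is not compared with the file content.
            removals.add(number)
        else:
            additions.setdefault(number, []).append(text)

    result = []
    for number, line in enumerate(code.splitlines(), start=1):
        result.extend(additions.pop(number, []))
        if number not in removals:
            result.append(line)
    # Additions numbered past the end of the file are appended in order.
    for number in sorted(additions):
        result.extend(additions[number])
    return "\n".join(result)


code = "a = 1\nworkspace = FileRepository(root)\nb = 2"
print(number_lines(code))

# The "-" line misquotes the original, but the line number still locates it.
edits = "2 -workspace = FileRepo(root)\n2 +workspace = OnDiskRepository(root)"
print(apply_numbered_edits(code, edits))
```

Because a removal is keyed on the line number alone, an imperfect reproduction of the old line does not block the edit. The trade-off is that a wrong line number silently edits the wrong line, so a real implementation might still want to compare the removed text against the file and warn on large mismatches.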

Of course, to make the LLM understand what is going on, it is also necessary to modify the corresponding preprompt.

Motivation/Application

Making the --improve workflow 10x more reliable!

@ATheorell added the enhancement and triage labels and removed the triage label on Nov 25, 2023.