
Transformer-based Code Completion

A proof-of-concept source code completion model, based on the Universal Transformer architecture and trained on the FunCom dataset of 2.1 million Java methods.

Check out the live demo!

Technical Details

In contrast to the originally proposed Transformer, which is made up of a stack of layers, each with its own parameters, the Universal Transformer applies a single shared layer repeatedly. This improves performance on many tasks, particularly those of an algorithmic nature (such as processing source code, as opposed to natural language).
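To make the weight sharing concrete, here is a minimal PyTorch sketch (not this repository's code, and omitting the Universal Transformer's per-step timing signal and adaptive computation time): a standard Transformer stacks several distinct encoder layers, while the Universal Transformer re-applies one shared layer.

```python
import torch
import torch.nn as nn

d_model, n_heads, n_steps = 256, 8, 4

# Standard Transformer encoder: n_steps distinct layers, each with its own parameters.
standard = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, n_heads), num_layers=n_steps
)

# Universal Transformer (simplified): one layer whose parameters are reused at every step.
shared_layer = nn.TransformerEncoderLayer(d_model, n_heads)

def universal_encode(x: torch.Tensor) -> torch.Tensor:
    for _ in range(n_steps):
        x = shared_layer(x)  # same weights applied at every step
    return x

x = torch.randn(10, 1, d_model)   # (sequence length, batch, d_model)
print(universal_encode(x).shape)  # torch.Size([10, 1, d_model])
```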

In tasks such as completion, where we are not translating between languages, only the "Encoder" portion of the Transformer is used.

Given an input prompt, the next token is predicted as follows:

  1. The prompt is tokenized using a SentencePiece model
  2. The input tokens are processed by an embedding layer, which turns them into vectors
  3. A Transformer Encoder Layer is applied n=4 times
  4. Finally, a Dense layer is used to produce next-token probabilities

This process is repeated until the end-of-sentence token is produced or a specified maximum length is exceeded, as sketched below.
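Putting the four steps together, a hypothetical greedy-decoding loop might look like the following sketch. The identifiers (sp, embedding, shared_layer, to_vocab, tokenizer.model) are illustrative, not this repository's actual names, and positional encoding and attention masking are omitted for brevity.

```python
import torch
import torch.nn as nn
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="tokenizer.model")  # assumed tokenizer path
vocab_size, d_model, n_steps, max_len = sp.vocab_size(), 256, 4, 64

embedding = nn.Embedding(vocab_size, d_model)
shared_layer = nn.TransformerEncoderLayer(d_model, nhead=8)
to_vocab = nn.Linear(d_model, vocab_size)           # dense layer over the vocabulary

def complete(prompt: str) -> str:
    tokens = sp.encode(prompt)                      # 1. tokenize with the SentencePiece model
    with torch.no_grad():
        while len(tokens) < max_len:
            x = embedding(torch.tensor(tokens))     # 2. embed the tokens into vectors
            x = x.unsqueeze(1)                      # shape (sequence, batch=1, d_model)
            for _ in range(n_steps):                # 3. apply the shared encoder layer n=4 times
                x = shared_layer(x)
            logits = to_vocab(x[-1, 0])             # 4. next-token scores from the last position
            next_id = int(logits.argmax())          # greedy choice; sampling also works
            if next_id == sp.eos_id():              # stop at the end-of-sentence token
                break
            tokens.append(next_id)
    return sp.decode(tokens)
```

With untrained weights this produces arbitrary tokens; the point is the control flow, which mirrors the four steps listed above.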

Unlike my Code Summarization Transformer, this project is implemented in PyTorch.

Running Locally

You'll need Python 3 with the torch, sentencepiece, and Keras-Preprocessing packages. Run run.py to start an interactive demo.

You can train a new model by editing the parameters in train.py.

