Decoder-Only-LLM with PyTorch ✨

Don't forget to star the repo if you find it useful!

This repository features a custom-built decoder-only language model (LLM) with 8 decoder blocks and 37 million parameters in total. I developed it from scratch as a personal project to explore NLP techniques; you can pretrain it to fit whatever task you like.
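For intuition, a decoder-only model is essentially a stack of masked self-attention blocks. Below is a minimal PyTorch sketch of one such block; the hyperparameters (d_model, n_heads, d_ff) are illustrative assumptions, not necessarily the values this repo uses.

    import torch
    import torch.nn as nn

    class DecoderBlock(nn.Module):
        """One pre-norm decoder block: masked self-attention followed by a feed-forward layer."""
        def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
            super().__init__()
            self.ln1 = nn.LayerNorm(d_model)
            self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
            self.ln2 = nn.LayerNorm(d_model)
            self.ff = nn.Sequential(
                nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model), nn.Dropout(dropout)
            )

        def forward(self, x):
            # Causal mask: position i may only attend to positions <= i
            t = x.size(1)
            mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
            h = self.ln1(x)
            attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
            x = x + attn_out
            return x + self.ff(self.ln2(x))

Stacking 8 of these blocks, plus token and position embeddings and a final projection to the vocabulary, gives a model of the kind described above.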

📖 Notebook Experiments

If you wish to run the code sequentially to see how everything works, check out the Decoder-Only Transformer notebook in the repo, where I train the model to ask a question from a given context.

🔧 Setup

  • Clone the repository
git clone git@github.com:logic-ot/decoder-only-llm.git
cd decoder-only-llm
  • Configuration file

    • Set variables such as the dataset path in config.py.

      Structure of config.py:

    import torch
    from transformers import AutoTokenizer

    config = {
        "data_path": "sample_data.json",  # path to the training dataset
        "tokenizer": AutoTokenizer.from_pretrained("bert-base-uncased"),
        "device": torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu"),
        "model_weights": None,  # path to saved model weights, or None
        "learning_rate": 0.0001,
        "num_batches": 10,
    }
  • Training

    • After configuration, run the training script like so:

    python training.py

    The model should begin training automatically.

  • Data Structure

    • Your training data must be in the following format (a loading sketch appears at the end of this section):

      [{"conversation":[{"from":"human","value":"{some text}"},{"from":"gpt","value":"{some text}"}]}]

      See sample_data.json in the repo for an example.
  • Inferencing

    • The inference script takes in a text and queries the model for a response (a decoding sketch appears below). To run inference:

    python inference.py
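For intuition about what inference.py does conceptually, here is a minimal sketch of greedy decoding with a decoder-only model. The model call signature and the use of [SEP] as a stop token are assumptions for illustration, not the repo's actual API.

    import torch

    @torch.no_grad()
    def greedy_decode(model, tokenizer, prompt, max_new_tokens=50, device="cpu"):
        # Encode the prompt, then repeatedly append the argmax next token
        ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
        for _ in range(max_new_tokens):
            logits = model(ids)  # assumed output shape: (1, seq_len, vocab_size)
            next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
            ids = torch.cat([ids, next_id], dim=1)
            if next_id.item() == tokenizer.sep_token_id:  # [SEP] acts as an end marker for the BERT tokenizer
                break
        return tokenizer.decode(ids[0], skip_special_tokens=True)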
    

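To sanity-check your dataset against the format described in Data Structure above, here is a minimal loading-and-tokenization sketch; load_conversations is a hypothetical helper written for illustration, not a function from this repo.

    import json
    from config import config  # the config dict described above

    def load_conversations(path):
        # Hypothetical helper: read records shaped like
        # [{"conversation": [{"from": "human", ...}, {"from": "gpt", ...}]}, ...]
        with open(path) as f:
            records = json.load(f)
        pairs = []
        for record in records:
            turns = record["conversation"]
            # Pair each human turn with the gpt turn that follows it
            for human, gpt in zip(turns[::2], turns[1::2]):
                pairs.append((human["value"], gpt["value"]))
        return pairs

    pairs = load_conversations(config["data_path"])
    input_ids = config["tokenizer"](pairs[0][0], return_tensors="pt")["input_ids"]
    print(len(pairs), input_ids.shape)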
🪄 Tips

  • I have a what-to-expect.pdf file in the repo that details some observations I made while training the model for the question-asking task. This information applies to any other model you decide to pretrain.
  • Here is an excellent video that explains transformers: 3Blue1Brown
