Skip to content

Implement the "Attention Is All You Need" paper from scratch using PyTorch, focusing on building a sequence-to-sequence transformer architecture for translating text from English to Italian

License

Notifications You must be signed in to change notification settings

IsmaelMousa/transformer-pytorch-scratch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

English-to-Italian Translation Transformer

Implement the Attention is All You Need paper from scratch using PyTorch, focusing on building a sequence-to-sequence transformer architecture for translating text from English to Italian.

Overview

The transformer architecture, introduced in the paper Attention is All You Need, revolutionized sequence-to-sequence tasks with its attention-based mechanism, eliminating the need for recurrent networks. This project leverages this architecture to translate English sentences into Italian.

Data

The model was trained and evaluated on the g8a9 / europarl_en-it dataset, the dataset consists of English and Italian sentence pairs used for training and validation.

The intention was to train the model on the entire dataset (the entire dataset consist of 1.9M sentence pairs) but because of the limited hardware capabilities I just used 100K for training, and 20K for evaluating.

Tokenizers from Hugging Face were employed solely for preprocessing the data, other components of the model were implemented from scratch.

Results

Epochs Loss(train) Loss(validation) F1(train) F1(validation) Time(minutes) Device
5 0.116378 0.091345 0.262342 0.273207 170 1 x T4 GPU

Acknowledgements

This project follows the architecture and configurations from the Attention is All You Need paper. The implementation is guided by the principles and methods outlined in the paper.

About

Implement the "Attention Is All You Need" paper from scratch using PyTorch, focusing on building a sequence-to-sequence transformer architecture for translating text from English to Italian

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published