Transflate

architecture = {
    'src_vocab_len' : len(vocab_src),
    'tgt_vocab_len' : len(vocab_tgt),
    'N' : 6,            # number of stacked encoder / decoder layers
    'd_model' : 512,    # embedding / hidden-state dimension
    'd_ff' : 2048,      # inner dimension of the position-wise feed-forward
    'h' : 8,            # attention heads
    'p_dropout' : 0.1 }
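These values map one-to-one onto a model builder. A minimal sketch, assuming a make_model() helper in the style of the Annotated Transformer (the helper name and its signature are assumptions, not part of these notes):

model = make_model(
    src_vocab=architecture['src_vocab_len'],
    tgt_vocab=architecture['tgt_vocab_len'],
    N=architecture['N'],
    d_model=architecture['d_model'],
    d_ff=architecture['d_ff'],
    h=architecture['h'],
    dropout=architecture['p_dropout'],
)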

collate_fn(
    batch=batch,
    src_pipeline=lambda x : tokenize(x, spacy_de),
    tgt_pipeline=lambda x : tokenize(x, spacy_en),
    src_vocab=vocab_src,
    tgt_vocab=vocab_tgt,
    device=None,
    max_padding=data_setup['max_padding'],
    pad_id=vocab_src.get_stoi()["<blank>"],
)
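A sketch of what collate_fn might do, assuming torchtext-style vocabs (callable on a token list, indexable on a token string) and <s> / </s> / <blank> special tokens; the exact implementation is not shown in these notes:

import torch
from torch.nn.functional import pad

def collate_fn(batch, src_pipeline, tgt_pipeline, src_vocab, tgt_vocab,
               device, max_padding, pad_id):
    src_list, tgt_list = [], []
    for src_sample, tgt_sample in batch:
        # tokenize, numericalize, and wrap each sentence with <s> ... </s>
        src_ids = torch.tensor(
            [src_vocab["<s>"]] + src_vocab(src_pipeline(src_sample)) + [src_vocab["</s>"]],
            dtype=torch.int64)
        tgt_ids = torch.tensor(
            [tgt_vocab["<s>"]] + tgt_vocab(tgt_pipeline(tgt_sample)) + [tgt_vocab["</s>"]],
            dtype=torch.int64)
        # right-pad every sequence with pad_id up to max_padding tokens
        # (assumes no sentence is longer than max_padding)
        src_list.append(pad(src_ids, (0, max_padding - len(src_ids)), value=pad_id))
        tgt_list.append(pad(tgt_ids, (0, max_padding - len(tgt_ids)), value=pad_id))
    src, tgt = torch.stack(src_list), torch.stack(tgt_list)
    return (src, tgt) if device is None else (src.to(device), tgt.to(device))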

DataLoader(dataset, collate_fn=collate_fn) : torch.utils.data.DataLoader batches the dataset and applies the collate function above to every batch
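A sketch of how the loader could be wired up; the dataset name, batch size, and use of functools.partial are assumptions made here for illustration:

from functools import partial
from torch.utils.data import DataLoader

train_dataloader = DataLoader(
    train_dataset,            # assumed: a dataset of (German, English) sentence pairs
    batch_size=32,            # assumed value
    shuffle=True,
    collate_fn=partial(
        collate_fn,
        src_pipeline=lambda x : tokenize(x, spacy_de),
        tgt_pipeline=lambda x : tokenize(x, spacy_en),
        src_vocab=vocab_src,
        tgt_vocab=vocab_tgt,
        device=None,
        max_padding=data_setup['max_padding'],
        pad_id=vocab_src.get_stoi()["<blank>"],
    ),
)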

class EncoderDecoder

Src_emb : nn.Embedding

Tgt_emb : nn.Embedding
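In the Annotated Transformer the source and target embeddings are a thin wrapper around nn.Embedding that scales the looked-up vectors by sqrt(d_model), as in the original paper; a sketch (the class name follows that reference, not these notes):

import math
import torch.nn as nn

class Embeddings(nn.Module):
    def __init__(self, d_model, vocab_len):
        super().__init__()
        self.lut = nn.Embedding(vocab_len, d_model)   # lookup table: token id -> vector
        self.d_model = d_model

    def forward(self, x):
        # scale by sqrt(d_model) before positional encoding is added
        return self.lut(x) * math.sqrt(self.d_model)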

Encoder : converts an input sequence of tokens into a sequence of embedding vectors (a.k.a. the hidden state)
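A sketch of the encoder as a stack of N identical EncoderLayers followed by a final LayerNorm (using the standard nn.LayerNorm rather than the custom norm of the reference implementation); the hidden state keeps shape [batch, seq_len, d_model] throughout:

import copy
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, layer, N, d_model=512):
        super().__init__()
        # N deep copies of the same EncoderLayer, each with its own weights
        self.layers = nn.ModuleList([copy.deepcopy(layer) for _ in range(N)])
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x, mask):
        for layer in self.layers:
            x = layer(x, mask)
        return self.norm(x)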

EncoderLayer

Attention : self-attention -> scaled dot-product attention (multi-head, h = 8); see the sketch after this list

FeedForward : position-wise FFN, d_model -> d_ff -> d_model

N = 6 : the EncoderLayer is stacked 6 times
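A sketch of the two sub-blocks named above, following the standard formulation: scaled dot-product attention computes softmax(Q K^T / sqrt(d_k)) V, and the position-wise feed-forward network is Linear(d_model -> d_ff) -> ReLU -> Linear(d_ff -> d_model). In the full model each sub-block is wrapped with a residual connection, layer norm, and dropout.

import math
import torch
import torch.nn as nn

def attention(query, key, value, mask=None):
    d_k = query.size(-1)
    # similarity of every query with every key, scaled by sqrt(d_k)
    scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, -1e9)   # hide padding / future positions
    p_attn = scores.softmax(dim=-1)
    return torch.matmul(p_attn, value), p_attn

class PositionwiseFeedForward(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, dropout=0.1):
        super().__init__()
        self.w_1 = nn.Linear(d_model, d_ff)
        self.w_2 = nn.Linear(d_ff, d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        return self.w_2(self.dropout(self.w_1(x).relu()))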

Decoder : uses the encoder's hidden state to iteratively generate the output sequence of tokens, one at a time, until EOS.
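A sketch of that generation loop as greedy decoding; it assumes an EncoderDecoder model exposing encode() / decode() / generator and a subsequent_mask() helper as in the Annotated Transformer (all of these names are assumptions, not taken from these notes):

import torch

def greedy_decode(model, src, src_mask, max_len, start_symbol, eos_id):
    memory = model.encode(src, src_mask)                     # encoder hidden states
    ys = torch.full((1, 1), start_symbol, dtype=src.dtype)   # running target sequence
    for _ in range(max_len - 1):
        out = model.decode(memory, src_mask, ys,
                           subsequent_mask(ys.size(1)).type_as(src))
        prob = model.generator(out[:, -1])                   # distribution over tgt vocab
        next_word = prob.argmax(dim=1).item()                # greedy: pick the top token
        ys = torch.cat([ys, torch.full((1, 1), next_word, dtype=src.dtype)], dim=1)
        if next_word == eos_id:                              # stop at end-of-sequence
            break
    return ys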