Transflate
architecture = {
    'src_vocab_len' : len(vocab_src),   # source vocabulary size
    'tgt_vocab_len' : len(vocab_tgt),   # target vocabulary size
    'N'             : 6,                # number of stacked encoder/decoder layers
    'd_model'       : 512,              # model / embedding dimension
    'd_ff'          : 2048,             # inner dimension of the feed-forward sublayer
    'h'             : 8,                # number of attention heads
    'p_dropout'     : 0.1,              # dropout probability
}
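A sketch of how this dict might feed a model builder. make_model and its signature are assumptions here (modeled on The Annotated Transformer's make_model), not defined in these notes:

# make_model is assumed: builds the full EncoderDecoder from these hyperparameters
model = make_model(
    architecture['src_vocab_len'],
    architecture['tgt_vocab_len'],
    N=architecture['N'],
    d_model=architecture['d_model'],
    d_ff=architecture['d_ff'],
    h=architecture['h'],
    dropout=architecture['p_dropout'],
)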
collate_fn(
    batch=batch,
    src_pipeline=lambda x: tokenize(x, spacy_de),
    tgt_pipeline=lambda x: tokenize(x, spacy_en),
    src_vocab=vocab_src,
    tgt_vocab=vocab_tgt,
    device=None,
    max_padding=data_setup['max_padding'],
    pad_id=vocab_src.get_stoi()["<blank>"],
)
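What collate_fn does, as a minimal sketch: tokenize both sides, map tokens to ids, wrap each sequence in <s>/</s>, and right-pad everything to max_padding with pad_id. The <s>=0 / </s>=1 ids and the torchtext-style callable vocab (token list in, id list out) are assumptions, not taken from these notes:

import torch
from torch.nn.functional import pad

def collate_fn(batch, src_pipeline, tgt_pipeline, src_vocab, tgt_vocab,
               device=None, max_padding=128, pad_id=2):
    # Assumed convention: <s> has id 0, </s> has id 1; sequences fit within max_padding.
    bos = torch.tensor([0], dtype=torch.int64, device=device)
    eos = torch.tensor([1], dtype=torch.int64, device=device)
    src_list, tgt_list = [], []
    for src_sample, tgt_sample in batch:
        # tokenize, numericalize through the vocab, add sentence boundary markers
        src = torch.cat([bos, torch.tensor(src_vocab(src_pipeline(src_sample)),
                                           dtype=torch.int64, device=device), eos])
        tgt = torch.cat([bos, torch.tensor(tgt_vocab(tgt_pipeline(tgt_sample)),
                                           dtype=torch.int64, device=device), eos])
        # right-pad with <blank> so the batch stacks into one (B, max_padding) tensor
        src_list.append(pad(src, (0, max_padding - len(src)), value=pad_id))
        tgt_list.append(pad(tgt, (0, max_padding - len(tgt)), value=pad_id))
    return torch.stack(src_list), torch.stack(tgt_list)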
torch.utils.data.DataLoader(dataset, batch_size=..., collate_fn=collate_fn)   # instantiated, not subclassed
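DataLoader calls its collate function with a single batch argument, so the extra arguments above have to be bound first. A sketch using functools.partial (train_iter and batch_size=32 are placeholder assumptions):

from functools import partial
from torch.utils.data import DataLoader

# Bind everything except `batch`, which DataLoader supplies per step.
collate = partial(
    collate_fn,
    src_pipeline=lambda x: tokenize(x, spacy_de),
    tgt_pipeline=lambda x: tokenize(x, spacy_en),
    src_vocab=vocab_src,
    tgt_vocab=vocab_tgt,
    device=None,
    max_padding=data_setup['max_padding'],
    pad_id=vocab_src.get_stoi()["<blank>"],
)
train_dataloader = DataLoader(train_iter, batch_size=32, collate_fn=collate)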
class EncoderDecoder
Src_emb : nn.Embedding
Tgt_emb : nn.Embedding
Encoder : converts an input sequence of tokens into a sequence of embedding vectors (the hidden state)
EncoderLayer
Attention : self-attention, computed as scaled dot-product attention (see the sketch after this outline)
FeedForward
N = 6
Decoder : uses the encoder's hidden state to generate the output sequence autoregressively, one token at a time, until EOS (see the greedy-decoding sketch below)
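The scaled dot-product attention named in the outline, as a minimal sketch. The -1e9 masking trick follows the common "Attention Is All You Need" formulation; the function name and signature are my own:

import math
import torch

def scaled_dot_product_attention(query, key, value, mask=None):
    # scores: (batch, heads, seq_q, seq_k) similarity of every query to every key,
    # scaled by sqrt(d_k) to keep softmax gradients well-behaved
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # blocked positions get a large negative score so softmax gives them ~0 weight
        scores = scores.masked_fill(mask == 0, -1e9)
    p_attn = scores.softmax(dim=-1)
    return p_attn @ value, p_attn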
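And the decoder loop from the last item, sketched as greedy decoding. model.encode, model.decode, and model.generator are assumed to follow the EncoderDecoder interface from The Annotated Transformer; start/EOS ids and max_len are caller-supplied:

import torch

def greedy_decode(model, src, src_mask, max_len, start_symbol, eos_id):
    # Encode the source once; the resulting memory is reused at every decode step.
    memory = model.encode(src, src_mask)
    ys = torch.full((1, 1), start_symbol, dtype=src.dtype)
    for _ in range(max_len - 1):
        # Causal mask: each position may attend only to itself and earlier tokens.
        tgt_mask = torch.tril(torch.ones(1, ys.size(1), ys.size(1))).bool()
        out = model.decode(memory, src_mask, ys, tgt_mask)
        # Take the highest-probability next token from the last position.
        prob = model.generator(out[:, -1])
        next_word = prob.argmax(dim=1).item()
        ys = torch.cat([ys, torch.full((1, 1), next_word, dtype=src.dtype)], dim=1)
        if next_word == eos_id:
            break
    return ys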