How does transformer work and what will change if we make changes in encoder decoder blocks?

Question

Anonymous · Accepted Answer

Encoder blocks primarily affect language understanding and contextual representation. Increasing encoder depth, attention heads, or FFN size generally improves semantic understanding but increases latency and cost. Decoder blocks primarily affect text generation quality and reasoning. Increasing decoder depth improves generation capabilities, while removing cross-attention converts a sequence-to-sequence model into a decoder-only model like GPT. The overall impact is a trade-off among model quality, latency, memory consumption, and training cost.

Wipro

Wipro interview question

Interview Answer

Want the inside scoop on your own company?

Bowls

Followed companies

Job searches