After the HR team contacted me about the opportunity, the interview was scheduled and conducted online. This was the first round of the selection process: a one-on-one interaction with the interviewer that focused on assessing foundational knowledge and relevant experience.
Interview questions [1]
Question 1
1. Why is single-head self-attention not as good as multi-head attention? What advantages does multi-head attention have?
2. How do transformers resolve the issue of exploding or vanishing gradients in beam search?
3. What are Sentence Transformers?
4. Why is the positional embedding not trained in the Transformer?
5. How is rotary positional embedding (RoPE) better than other positional embeddings?
Decoder:
6. What does the encoder pass as input to the decoder in a Transformer? Which tokens does the decoder output?
7. Beam search in the decoder.
8. Describe working with multiple GPUs.
9. Details of RAG: chunking, retrieval, etc.
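
For anyone preparing for question 9, here is a minimal sketch of the two steps it names, chunking and retrieval. It is only an illustration under stated assumptions, not what the interviewer expected as an answer: chunks are fixed-size word windows with overlap, and a bag-of-words cosine similarity stands in for a learned embedding model. The function names (`chunk_words`, `cosine`, `retrieve`) are hypothetical.

```python
import math
from collections import Counter


def chunk_words(text, size=8, overlap=2):
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query.

    Assumption: word counts replace the dense embeddings a real RAG system would use.
    """
    q = Counter(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]
```

In a real pipeline the retrieved chunks would then be placed in the prompt of the generator model. The overlap between chunks is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk.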