Abstract. In this talk we review the GPT-1 paper, "Improving Language Understanding by Generative Pre-Training" (Radford et al.). To set the stage, we begin with a brief review of the Transformer architecture.
Note: We didn't have time to cover GPT-2 in this talk, but some slides on the topic made it into the deck.
😴 Lazy blog: just a link to the talk's 📋 PDF.