
Comparison between attention and lambda layers. (Left) An example of 3 queries and their local contexts within a global context. (Middle) The attention operation associates each query with an attention distribution over its context. (Right) The lambda layer transforms each context into a linear function, called a lambda, which is then applied to the corresponding query. The image is taken from the paper.

Introduction & Overview

Lambda Layers Vs Attention Layers
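
To make the contrast in the figure concrete, here is a minimal NumPy sketch of the two operations for a single head. It implements only the content lambda (the positional lambdas from the paper are omitted), and the names used here (`Wq`, `lam`, and so on) are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

# Illustrative shapes: n = context length, d = model dim, k = query/key dim, v = value dim
n, d, k, v = 16, 32, 8, 8
rng = np.random.default_rng(0)

x = rng.standard_normal((n, d))   # inputs that produce the queries
c = rng.standard_normal((n, d))   # context elements

# Projections (randomly initialized here; learned in practice)
Wq = rng.standard_normal((d, k))
Wk = rng.standard_normal((d, k))
Wv = rng.standard_normal((d, v))

q = x @ Wq   # (n, k) queries
K = c @ Wk   # (n, k) keys
V = c @ Wv   # (n, v) values

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Attention: each query gets a distribution over the n context elements,
# then takes a weighted average of the values. The (n, n) attention map
# makes the cost quadratic in the context size.
attn_out = softmax(q @ K.T / np.sqrt(k)) @ V   # (n, v)

# Lambda layer (content lambda only): the context is summarized once into a
# single small linear map, the "lambda", which is then applied to every
# query. No per-query attention map is ever materialized.
lam = softmax(K, axis=0).T @ V   # (k, v) linear function built from the context
lambda_out = q @ lam             # (n, v)
```

The key difference: attention materializes an n × n map of query-to-context weights, while the lambda layer first compresses the context into one k × v matrix and applies that same linear function to every query, avoiding the quadratic attention map.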

Nakshatra Singh

A Machine Learning, Deep Learning, and Natural Language Processing enthusiast, making SOTA research papers easy for beginners to read. 🤞❤️
