About AlexaTM 20B
AlexaTM 20B is a large-scale multilingual sequence-to-sequence (seq2seq) model developed by Amazon Science. It is pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, which makes it a more efficient few-shot learner than decoder-only models across a variety of tasks.
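To make the two pre-training objectives concrete, here is a minimal sketch of how training pairs for each could be constructed. The single-span masking, the `<mask>` sentinel, and the exact span lengths are illustrative assumptions, not the published recipe; the `[CLM]` marker reflects the paper's use of a special token to switch the model into causal language modeling mode.

```python
SENTINEL = "<mask>"  # hypothetical sentinel token for the denoising objective

def make_denoising_pair(tokens, span_start, span_len):
    """Mask a contiguous span; the model must reconstruct it (seq2seq denoising)."""
    source = tokens[:span_start] + [SENTINEL] + tokens[span_start + span_len:]
    target = tokens[span_start:span_start + span_len]
    return " ".join(source), " ".join(target)

def make_clm_pair(tokens, prefix_len):
    """Keep a prefix; the model must continue it (causal language modeling)."""
    source = ["[CLM]"] + tokens[:prefix_len]  # special marker selects CLM mode
    target = tokens[prefix_len:]
    return " ".join(source), " ".join(target)

tokens = "the quick brown fox jumps over the lazy dog".split()
print(make_denoising_pair(tokens, span_start=2, span_len=3))
# ('the quick <mask> over the lazy dog', 'brown fox jumps')
print(make_clm_pair(tokens, prefix_len=4))
# ('[CLM] the quick brown fox', 'jumps over the lazy dog')
```

Training on both objectives is what gives the model its dual character: the denoising task preserves the encoder-decoder strengths of seq2seq models, while the CLM task teaches the open-ended continuation behavior that in-context few-shot prompting relies on.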
Three key features of AlexaTM 20B include:
- State-of-the-art Performance: AlexaTM 20B achieves state-of-the-art results on 1-shot summarization tasks, outperforming even much larger models such as the 540B-parameter decoder-only PaLM model.
- Multilingual Capabilities: The model supports a wide range of languages (Arabic, English, French, German, Hindi, Italian, Japanese, Marathi, Portuguese, Spanish, Tamil, and Telugu) and has shown superior performance in 1-shot machine translation tasks, especially for low-resource languages.
- Superior Few-Shot and Zero-Shot Learning: In the zero-shot setting, AlexaTM 20B outperforms GPT-3 (175B) on the SuperGLUE and SQuADv2 datasets, and it delivers state-of-the-art performance on multilingual tasks such as XNLI, XCOPA, PAWS-X, and XWinograd. A short prompting sketch follows this list.
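To ground the few-shot claims above, the sketch below shows 1-shot prompting against a deployed copy of the model: a single demonstration pair is placed in the prompt before the query. AlexaTM 20B has been made available through Amazon SageMaker, so the example uses the standard `boto3` SageMaker runtime client; the endpoint name and the JSON request/response schema are assumptions for illustration, not a documented contract.

```python
import json
import boto3

client = boto3.client("sagemaker-runtime")

# One-shot machine translation: one demonstration, then the actual query.
prompt = (
    "English: How are you? French: Comment allez-vous? "
    "English: Where is the station? French:"
)

response = client.invoke_endpoint(
    EndpointName="alexatm-20b-endpoint",  # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"text_inputs": prompt, "max_length": 50}),  # assumed schema
)
print(json.loads(response["Body"].read()))
```

The same pattern extends to the other tasks mentioned above: for 1-shot summarization, the demonstration would be a document/summary pair followed by the document to summarize; for zero-shot tasks, the prompt contains only the query itself.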