Palmyra Base is a model pre-trained primarily on English text, although it does contain a trace amount of non-English data from CommonCrawl. Like GPT-3, it was trained with a causal language modeling (CLM) objective and uses a decoder-only architecture, and it was evaluated using the experimental setup and prompts from GPT-3.
Features of Palmyra
- Model Description: A decoder-only, GPT-3-style model pre-trained mainly on English text with a causal language modeling (CLM) objective, and evaluated with the prompts and experimental setup from GPT-3.
- Use Case: The model is both powerful and fast, excelling at nuanced tasks like sentiment classification and summarization.
- Training Data: Palmyra Base was trained on Writer’s custom dataset.
- Intended Use and Limitations: The model learns an inner representation of the English language that can be used to extract features for downstream tasks; however, it is best suited to what it was pre-trained for, namely generating text from a prompt.
- How to Use: The model can be loaded with the AutoModelForCausalLM class from the transformers library.
- Limitations and Biases: The core functionality of Palmyra Base is to predict the next token in a string of text. Keep in mind that the statistically most likely next token does not always produce the most accurate text. The model was trained on Writer’s custom data, which carries its own potential biases and limitations.
- Evaluation Results: The Palmyra-base model was evaluated on the SuperGLUE benchmark, with various tasks and metrics listed.
- Citation: The model can be cited using the provided citation details, which credits the Writer Engineering team and provides a link to the Writer’s official website.
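As noted above, the model loads through the standard transformers causal-LM interface. The sketch below assumes the Hugging Face model id `Writer/palmyra-base` and default generation settings; adjust both for your environment (the heavy download and generation step is guarded so the helper can be imported without running it).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Writer/palmyra-base"  # assumed Hugging Face model id

def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Load Palmyra Base and greedily continue the prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Downloads the model weights on first run.
    print(generate("Summarize: Palmyra Base is a decoder-only language model."))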
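Because the model is a plain next-token predictor, tasks such as the sentiment classification mentioned above are typically framed as few-shot prompts whose completion is the label. The prompt template and example labels below are illustrative assumptions, not part of the model card:

```python
def build_sentiment_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    """Assemble a few-shot prompt ending right before the label to predict."""
    blocks = [f"Review: {review}\nSentiment: {label}" for review, label in examples]
    blocks.append(f"Review: {text}\nSentiment:")
    return "\n\n".join(blocks)

# Hypothetical few-shot examples; replace with ones from your own domain.
FEW_SHOT = [
    ("The plot dragged and the acting was flat.", "negative"),
    ("A delightful, sharply written film.", "positive"),
]

prompt = build_sentiment_prompt("I couldn't stop smiling the whole time.", FEW_SHOT)
```

The completion the model produces after the final "Sentiment:" (e.g. " positive") is then taken as the predicted class.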