monolyth.ai
2606:4700:3036::ac43:9237

URL: https://monolyth.ai/
Submission: On June 11 via api from US — Scanned from DE


MONOLYTH

Chat · Analytics · Keys · Docs · Models · Discord · Settings · Sign in



BUILD MODEL-AGNOSTIC AI APPS

Start Building

Models · Analytics

Most Popular


LYNN: SOLILOQUY LLAMA 3 V2

970K

Soliloquy-L3 is a fast, highly capable roleplaying model designed for immersive,
dynamic experiences. Trained on over 250 million tokens of roleplaying data,
Soliloquy-L3 has a vast knowledge base, rich literary expression, and support
for up to 24k context length. It outperforms existing ~13B models, delivering
enhanced roleplaying capabilities. Usage of this model is subject to [Meta's
Acceptable Use Policy](https://ai.meta.com/llama/use-policy/). *This provider
may log and train based on your prompt.*

by lynn

24K context

$0.05/M input

$0.05/M output

soliloquy-l3


META: LLAMA 3 70B INSTRUCT

527K

Meta developed and released the Meta Llama 3 family of large language models
(LLMs), a collection of pretrained and instruction-tuned generative text models
in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for
dialogue use cases and outperform many of the available open-source chat models
on common industry benchmarks.

by meta

8K context

$0.59/M input

$0.79/M output

llama-3-70b-instruct


SNOWFLAKE: ARCTIC INSTRUCT

246K

Arctic is a dense-MoE Hybrid transformer architecture pre-trained from scratch
by the Snowflake AI Research Team. **Efficient Intelligence**: Arctic
outperforms similar open source models in enterprise tasks like SQL generation
and coding, setting a new standard for cost-effective AI training for Snowflake
customers. **True Openness**: Licensed under Apache 2.0, Arctic offers full
access to its code and weights, and openly shares all data recipes and research
insights.

by snowflake

4K context

$2.40/M input

$2.40/M output

snowflake-arctic-instruct


LLAVA V1.6 YI 34B

244K

LLaVA 1.6 is a vision-language model that accepts both image and text inputs.
LLaVA is an open-source chatbot trained by fine-tuning an LLM on multimodal
instruction-following data.

by fireworks

4K context

$0.90/M input

$0.90/M output

llava-yi-34b


WIZARDLM-2 8X22B

220K

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates
highly competitive performance compared to leading proprietary models.

by microsoft

65K context

$1.08/M input

$1.08/M output

wizardlm-2-8x22b


ANTHROPIC: CLAUDE 3 OPUS

199K

Claude 3 Opus is Anthropic's most advanced model, designed for highly complex
tasks. It excels in performance, intelligence, fluency, and comprehension.

by anthropic

200K context

$15.00/M input

$75.00/M output

claude-3-opus


PHI 3 MINI INSTRUCT PREVIEW

189K

Phi-3 Mini is a 3.8B-parameter, lightweight, state-of-the-art open model
trained on the Phi-3 datasets, which include both synthetic data and filtered
publicly available website data with a focus on high-quality, reasoning-dense
properties. The model underwent a post-training process that incorporates both
supervised fine-tuning and direct preference optimization to ensure precise
instruction adherence and robust safety measures. When assessed against
benchmarks testing common sense, language understanding, math, code, long
context, and logical reasoning, Phi-3 Mini-4K-Instruct showed robust,
state-of-the-art performance among models with fewer than 13 billion
parameters. *This model and its API are experimental.*

by microsoft

128K context

$0.10/M input

$0.10/M output

phi-3-mini-128k-instruct


OPENAI: GPT-4O

167K

GPT-4o ("o" for "omni") is OpenAI's newest AI model, designed to handle both
text and image inputs with text outputs. It retains the intelligence level of
GPT-4 Turbo while being twice as fast and 50% more cost-efficient. Additionally,
GPT-4o excels in processing non-English languages and offers superior visual
capabilities. During benchmarking against other models, it was temporarily named
"im-also-a-good-gpt2-chatbot."

by openai

128K context

$5.00/M input

$15.00/M output

gpt-4o
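The listed prices are per million tokens, so a request's cost is simply tokens times rate. A minimal sketch of the arithmetic, using gpt-4o's listed rates (the `request_cost` helper is hypothetical, for illustration only, not part of any monolyth.ai SDK):

```python
# Per-million-token pricing arithmetic (illustrative helper, not an official
# API). gpt-4o's listed rates: $5.00/M input, $15.00/M output.
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Dollar cost of one request given per-million-token rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000

# A 2,000-token prompt with a 500-token completion:
print(f"${request_cost(2_000, 500, 5.00, 15.00):.4f}")  # $0.0175
```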


OPENAI: GPT-4 TURBO

155K

GPT-4 is a large multimodal model (accepting text or image inputs and outputting
text) that can solve difficult problems with greater accuracy than any of
OpenAI's previous models, thanks to its broader general knowledge and advanced
reasoning capabilities.

by openai

128K context

$10.00/M input

$30.00/M output

gpt-4-turbo


GEMINI 1.5 FLASH PREVIEW

122K

Gemini 1.5 Flash is Google's fastest multimodal model with exceptional speed and
efficiency for quick, high-frequency tasks. Currently available in preview.

by google

1M context

$0.35/M input

$0.53/M output

gemini-1.5-flash


MISTRAL: 8X22B INSTRUCT

118K

Mixtral 8x22B is Mistral AI's latest open model. It sets a new standard for
performance and efficiency within the AI community. It is a sparse
Mixture-of-Experts (SMoE) model that uses only 39B active parameters out of
141B, offering unparalleled cost efficiency for its size.

by mistral-ai

65K context

$0.65/M input

$0.65/M output

mixtral-8x22b-instruct


NOUS HERMES 2 MIXTRAL 8X7B DPO

117K

Nous Hermes 2 Mixtral 8x7B DPO is the new flagship Nous Research model trained
over the Mixtral 8x7B MoE LLM.

by nousresearch

32K context

$0.50/M input

$0.50/M output

nous-hermes-2-mixtral-8x7b-dpo


META: LLAMA 3 8B INSTRUCT

104K

Meta developed and released the Meta Llama 3 family of large language models
(LLMs), a collection of pretrained and instruction-tuned generative text models
in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for
dialogue use cases and outperform many of the available open-source chat models
on common industry benchmarks.

by meta

8K context

$0.10/M input

$0.10/M output

llama-3-8b-instruct


PERPLEXITY: LLAMA SONAR LARGE 32K CHAT

98K

Perplexity has announced the launch of its new Perplexity models:
llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their
search-enabled versions, llama-3-sonar-small-32k-online and
llama-3-sonar-large-32k-online. These models are reported to exceed the
performance of their predecessors (sonar-small, sonar-medium). This is the
large variant.

by perplexity-ai

32K context

$1.00/M input

$1.00/M output

llama-3-sonar-large-32k-chat


ANTHROPIC: CLAUDE 3 HAIKU

85K

The Claude 3 Haiku, created by Anthropic, is their fastest and most efficient
model to date, designed for near-instant response. It offers rapid and accurate
performance for specific tasks.

by anthropic

200K context

$0.25/M input

$1.25/M output

claude-3-haiku


PERPLEXITY: SONAR MEDIUM CHAT [REPLACED]

77K

Sonar represents the newest model family from Perplexity, offering improvements
over previous models in terms of cost-efficiency, speed, and performance. This
model has been replaced with a [newer
variant](https://monolyth.ai/models/llama-3-sonar-large-32k-chat).

by perplexity-ai

16K context

$0.60/M input

$0.60/M output

sonar-medium-chat


ANTHROPIC: CLAUDE 3 SONNET

76K

The Claude 3 Sonnet offers an optimal mix of intelligence and speed suitable for
enterprise tasks. It provides high utility at a reduced cost, is reliable, and
well-suited for large-scale deployments.

by anthropic

200K context

$3.00/M input

$15.00/M output

claude-3-sonnet


LLAMA 3 LUMIMAID 8B

73K

Llama 3 was fine-tuned using the NeverSleep team's curated datasets, focusing
on a balanced mix of ERP and RP data while maintaining appropriate seriousness
and selective uncensoring. Additionally, the model incorporates the new Luminae
dataset from Ikari and roughly 40% non-roleplay data, enhancing its overall
intelligence and conversational abilities. This combination ensures breadth of
knowledge while prioritizing roleplay expertise.

by neversleep

24K context

$0.23/M input

$2.25/M output

llama-3-lumimaid-8b


TOPPY M 7B

69K

An affordable 7B-parameter model that combines multiple models using the new
task_arithmetic merge method from MergeKit.

by undi95

4K context

$0.20/M input

$0.20/M output

toppy-m-7b


GOOGLE: GEMINI 1.5 PRO PREVIEW

66K

Gemini 1.5 by Google delivers dramatically enhanced performance with a more
efficient architecture. This model is intended for early testing.

by google

1M context

$7.00/M input

$21.00/M output

gemini-1.5-pro


MISTRAL: LARGE

58K

Mistral AI's premier model. Built on a closed-source prototype, it is highly
capable in reasoning, coding, handling JSON, chatting, and more. For more
details, refer to the launch announcement. The model is proficient in English,
French, Spanish, German, and Italian, delivering high grammatical precision.
With a context window of 32K tokens, it effectively recalls detailed information
from extensive documents.

by mistral-ai

32K context

$8.00/M input

$24.00/M output

mistral-large


UPSTAGE: SOLAR 1 MINI CHAT

48K

A compact LLM offering superior performance to GPT-3.5, with robust multilingual
capabilities for both English and Korean, delivering high efficiency in a
smaller package.

by upstage

16K context

$0.50/M input

$0.50/M output

solar-1-mini-chat


ZEPHYR 141B A35B

41K

Zephyr 141B-A35B is an instruction-finetuned version of Mixtral-8x22B,
fine-tuned by Hugging Face on a mix of publicly available and synthetic
datasets. It achieves strong performance on chat benchmarks.

by huggingface

65K context

$0.65/M input

$0.65/M output

zephyr-8x22b-instruct


MYTHOMAX L2 13B

37K

MythoMax L2 13B is a large language model created by Gryphe that specializes in
storytelling and advanced roleplaying. Built on the Llama 2 architecture, it is
an optimized version of the MythoMix model, incorporating a tensor merge
strategy for increased coherency and performance. MythoMax L2 13B has been
praised for its ability to portray characters and create engaging roleplaying
experiences.

by gryphe

4K context

$0.13/M input

$0.13/M output

mythomax-l2-13b


MISTRAL: TINY

36K

Mistral-Tiny is the smallest and most cost-effective model in the lineup offered
by Mistral AI, designed to provide essential language modeling capabilities in a
resource-efficient package.

by mistral-ai

32K context

$0.25/M input

$0.25/M output

mistral-tiny


PERPLEXITY: LLAMA SONAR LARGE 32K ONLINE

35K

Perplexity has announced the launch of its new Perplexity models:
llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their
search-enabled versions, llama-3-sonar-small-32k-online and
llama-3-sonar-large-32k-online. These models are reported to exceed the
performance of their predecessors (sonar-small, sonar-medium). This is the
large, online variant.

by perplexity-ai

28K context

$1.00/M input

$1.00/M output

$0.005/request

llama-3-sonar-large-32k-online


GOOGLE: GEMMA 7B

27K

Gemma is a family of lightweight, state-of-the-art open models from Google,
built from the same research and technology used to create the Gemini models.
They are text-to-text, decoder-only large language models, available in English,
with open weights, pre-trained variants, and instruction-tuned variants. Gemma
models are well-suited for a variety of text generation tasks, including
question answering, summarization, and reasoning. Their relatively small size
makes it possible to deploy them in environments with limited resources such as
a laptop, desktop or your own cloud infrastructure, democratizing access to
state of the art AI models and helping foster innovation for everyone.

by google

8K context

$0.20/M input

$0.20/M output

gemma-7b


OPEN CHAT 3.5

25K

OpenChat is a library of open-source language models that have been fine-tuned
with C-RLFT, a strategy inspired by offline reinforcement learning. These models
can learn from mixed-quality data without preference labels and have achieved
exceptional performance comparable to ChatGPT.

by openchat

8K context

$0.13/M input

$0.13/M output

openchat-7b


PERPLEXITY: LLAMA SONAR SMALL 32K CHAT

24K

Perplexity has announced the launch of its new Perplexity models:
llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their
search-enabled versions, llama-3-sonar-small-32k-online and
llama-3-sonar-large-32k-online. These models are reported to exceed the
performance of their predecessors (sonar-small, sonar-medium).

by perplexity-ai

32K context

$0.20/M input

$0.20/M output

llama-3-sonar-small-32k-chat


PERPLEXITY: SONAR MEDIUM ONLINE [REPLACED]

24K

Sonar represents the newest model family from Perplexity, offering improvements
over previous models in terms of cost-efficiency, speed, and performance. For
*-online* models, in addition to the token charges, a flat $5 is charged per
thousand requests (or half a cent per request). This model has been replaced
with a [newer variant](https://monolyth.ai/models/llama-3-sonar-large-32k-online).

by perplexity-ai

12K context

$0.60/M input

$0.60/M output

$0.005/request

sonar-medium-online
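The *-online surcharge described above is arithmetic worth spelling out: $5 per thousand requests is $0.005 per request, charged on top of the token rates. A sketch (the `online_cost` helper is hypothetical; it folds input and output tokens together, which is valid here because sonar-medium-online charges the same $0.60/M in both directions):

```python
# Illustrative cost math for *-online models: token charges plus a flat
# $0.005-per-request fee (i.e., $5 per thousand requests). Assumes a model
# whose input and output token rates are equal, like sonar-medium-online.
def online_cost(input_tokens: int, output_tokens: int,
                rate_per_m: float, per_request: float = 0.005) -> float:
    """Dollar cost of one request: tokens at rate_per_m plus the flat fee."""
    return (input_tokens + output_tokens) * rate_per_m / 1_000_000 + per_request

# 1,000 requests of 1,000 tokens each (800 in, 200 out) at $0.60/M:
total = 1_000 * online_cost(800, 200, 0.60)
print(f"${total:.2f}")  # $5.60: $0.60 of tokens plus $5.00 of request fees
```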


QWEN 1.5 110B

24K

The latest release, Qwen1.5-110B, is part of the Qwen1.5 series and features the
same Transformer decoder architecture, including grouped query attention (GQA)
for efficient model serving. It supports a context length of 32K tokens and
remains multilingual, accommodating numerous languages such as English, Chinese,
French, and more. In evaluations, Qwen1.5-110B shows comparable results to
Meta-Llama3-70B in base model performance and excels in chat evaluations,
including MT-Bench and AlpacaEval 2.0.

by alibaba

32K context

$1.80/M input

$1.80/M output

qwen-1.5-110b-chat


MISTRAL: MEDIUM

24K

Mistral Medium is a closed-source model developed by Mistral AI. It operates on
proprietary model weights and excels in reasoning, coding, handling JSON,
chatting, and various other applications. It performs comparably to many
flagship models from different companies in benchmarks.

by mistral-ai

32K context

$2.70/M input

$8.10/M output

mistral-medium


QWEN 72B CHAT

23K

Qwen-72B (通义千问-72B) is a 72-billion-parameter model in the Qwen large-model
series developed by Alibaba Cloud. It is a Transformer-based large language
model trained on super-large-scale pre-training data. The pre-training data are
of various types and cover a wide range, including large amounts of web text,
professional books, code, and more.

by alibaba

4K context

$0.90/M input

$0.90/M output

qwen-72b-chat


OPENAI: GPT-3.5 TURBO

19K

The latest GPT-3.5 Turbo model with higher accuracy at responding in requested
formats and a fix for a bug which caused a text encoding issue for non-English
language function calls. Returns a maximum of 4,096 output tokens.

by openai

16K context

$0.50/M input

$1.50/M output

gpt-3.5-turbo


MISTRAL: SMALL

18K

Mistral-Small is a mid-tier model in the Mistral AI suite, offering a balance
between performance and affordability.

by mistral-ai

32K context

$2.00/M input

$6.00/M output

mistral-small


WIZARDLM-2 7B

16K

WizardLM-2 7B is the smaller variant of Microsoft AI's latest Wizard model. It
is the fastest, and it achieves performance comparable to leading open-source
models 10x its size.

by microsoft

32K context

$0.10/M input

$0.10/M output

wizardlm-2-7b


UPSTAGE: SOLAR 1 MINI TRANSLATE EN-KO

15K

Context-aware English-Korean translation that leverages previous dialogues to
ensure unmatched coherence and continuity in your conversations.

by upstage

32K context

$1.00/M input

$1.00/M output

solar-1-mini-translate-enko


COHERE: COMMAND

14K

Command is a conversational model designed for following instructions and
executing language tasks with superior quality, greater reliability, and a
broader context compared to Cohere's standard generative models. Utilization of this
model is governed by Cohere’s Acceptable Use Policy.

by cohere

4K context

$1.00/M input

$2.00/M output

command


REKA CORE

11K

Reka Core is a frontier-class multimodal language model developed by Reka AI.

by reka-ai

128K context

$10.00/M input

$25.00/M output

reka-core


PERPLEXITY: SONAR SMALL CHAT [REPLACED]

11K

Sonar represents the newest model family from Perplexity, offering improvements
over previous models in terms of cost-efficiency, speed, and performance. This
model has been replaced with a [newer
variant](https://monolyth.ai/models/llama-3-sonar-small-32k-chat).

by perplexity-ai

16K context

$0.20/M input

$0.20/M output

sonar-small-chat


PHIND: CODELLAMA 34B

11K

Phind-CodeLlama-34B-v2 is an open-source language model, fine-tuned on 1.5B
tokens from high-quality programming-related data, and proficient in languages
like Python, C/C++, TypeScript, and Java. It achieved a 73.8% pass rate on
HumanEval and is instruction-tuned using Alpaca/Vicuna formats for better
usability and steerability. Trained on proprietary instruction-answer pairs, it
generates a single completion per prompt.

by phind

4K context

$0.60/M input

$0.60/M output

phind-codellama-34b-v2


FIRELLAVA 13B

10K

A LLaVA vision-language model trained on instruction-following data generated
by OSS LLMs. One image is counted as 576 prompt tokens.

by fireworks

4K context

$0.20/M input

$0.20/M output

firellava-13b


PERPLEXITY: LLAMA SONAR SMALL 32K ONLINE

9K

Perplexity has announced the launch of its new Perplexity models:
llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their
search-enabled versions, llama-3-sonar-small-32k-online and
llama-3-sonar-large-32k-online. These models are reported to exceed the
performance of their predecessors (sonar-small, sonar-medium). This is the
online variant.

by perplexity-ai

28K context

$0.20/M input

$0.20/M output

$0.005/request

llama-3-sonar-small-32k-online


LZLV 70B

8K

A Mythomax/MLewd_13B-style merge of selected 70B models: a multi-model merge of
several LLaMA2 70B fine-tunes for roleplaying and creative work. The goal was
to create a model that combines creativity with intelligence for an enhanced
experience.

by lizpreciatior

4K context

$0.70/M input

$0.90/M output

lzlv-70b-fp16-hf


QWEN 1.5 1.8B

6K

Qwen1.5 is the improved version of Qwen, the large language model series
developed by Qwen team, Alibaba Cloud.

by alibaba

32K context

$0.10/M input

$0.10/M output

qwen-1.5-1.8b-chat


AI21: JAMBA INSTRUCT PREVIEW

6K

An instruction-tuned version of AI21's hybrid SSM-Transformer Jamba model,
Jamba-Instruct is built for reliable commercial use, with best-in-class quality
and performance.

by ai21

256K context

$0.50/M input

$0.70/M output

jamba-instruct-preview


CHRONOS HERMES 13B

6K

This model is a 75/25 merge of the Chronos (13B) and Nous Hermes (13B) models,
giving it a great ability to produce evocative storywriting and follow a
narrative.

by austism

2K context

$0.30/M input

$0.30/M output

chronos-hermes-13b


QWEN 1.5 32B

5K

Qwen1.5 is the improved version of Qwen, the large language model series
developed by Qwen team, Alibaba Cloud.

by alibaba

32K context

$0.80/M input

$0.80/M output

qwen-1.5-32b-chat


REKA FLASH

5K

Reka Flash is a state-of-the-art 21B model trained entirely from scratch and
pushed to its absolute limits. It serves as the “turbo-class” offering in Reka's
lineup of models. Reka Flash rivals the performance of many significantly larger
models, making it an excellent choice for fast workloads that require high
quality. On a myriad of language and vision benchmarks, it is competitive with
Gemini Pro and GPT-3.5.

by reka-ai

8K context

$0.80/M input

$2.00/M output

reka-flash


QWEN 1.5 72B

4K

Qwen1.5 is the improved version of Qwen, the large language model series
developed by Qwen team, Alibaba Cloud.

by alibaba

32K context

$0.90/M input

$0.90/M output

qwen-1.5-72b-chat


REKA EDGE

3K

A lightweight, 7B-equivalent model for local (i.e., on-hardware) or
latency-sensitive applications.

by reka-ai

8K context

$0.40/M input

$1.00/M output

reka-edge


NOUS CAPYBARA 7B 1.9

3K

The Capybara series is the first Nous collection of datasets and models, made
by fine-tuning mostly on data created by Nous in-house.

by nousresearch

4K context

$0.20/M input

$0.20/M output

nous-capybara-7b


META: LLAMA 2 70B CHAT

3K

Llama 2 is a collection of pretrained and fine-tuned generative text models
ranging in scale from 7 billion to 70 billion parameters. This is the 70B
fine-tuned model, optimized for dialogue use cases and converted to the Hugging
Face Transformers format.

by meta

4K context

$0.90/M input

$0.90/M output

llama-2-70b-chat


AIROBOROS L2 70B

3K

Airoboros is a fairly general purpose model, but focuses heavily on instruction
following, rather than casual chat/roleplay.

by jondurbin

4K context

$0.70/M input

$0.90/M output

airoboros-70b


UPSTAGE: SOLAR 1 MINI TRANSLATE KO-EN

2K

Context-aware Korean-English translation that leverages previous dialogues to
ensure unmatched coherence and continuity in your conversations.

by upstage

32K context

$1.00/M input

$1.00/M output

solar-1-mini-translate-koen


MISTRAL: 7B INSTRUCT V0.2

2K

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an enhanced version
of the Mistral-7B-v0.2 generative text model, fine-tuned for instruction-based
tasks using numerous publicly accessible conversation datasets.

by mistral-ai

32K context

$0.13/M input

$0.13/M output

mistral-7b-instruct-v0.2


HERMES 2 PRO MISTRAL 7B

2K

Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of
an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly
introduced Function Calling and JSON Mode dataset developed in-house.

by nousresearch

8K context

$0.20/M input

$0.20/M output

hermes-2-pro-mistral-7b


YI 34B CHAT

1K

Yi-34B is a large language model (LLM) developed by the AI startup 01.AI. It is
a bilingual (English and Chinese) base model trained with 34 billion parameters.
Yi-34B has shown impressive performance on various natural language processing
tasks.

by 01-ai

4K context

$0.80/M input

$0.80/M output

yi-34b-chat


DBRX INSTRUCT

1K

DBRX Instruct is a mixture-of-experts (MoE) large language model trained from
scratch by Databricks. DBRX Instruct specializes in few-turn interactions.

by databricks

32K context

$1.20/M input

$1.20/M output

dbrx-instruct


JAPANESE STABLE LM INSTRUCT GAMMA 7B

1K

This is a 7B-parameter decoder-only Japanese language model fine-tuned on
instruction-following datasets, built on top of the base model Japanese Stable
LM Base Gamma 7B.

by stability-ai

8K context

$0.20/M input

$0.20/M output

japanese-stablelm-instruct-gamma-7b


OPEN HERMES 2 MISTRAL 7B

1K

OpenHermes 2 Mistral 7B is a state-of-the-art Mistral fine-tune.

by nousresearch

8K context

$0.20/M input

$0.20/M output

openhermes-2-mistral-7b


QWEN 1.5 14B

1K

Qwen1.5 is the improved version of Qwen, the large language model series
developed by Qwen team, Alibaba Cloud.

by alibaba

32K context

$0.30/M input

$0.30/M output

qwen-1.5-14b-chat


MISTRAL: MIXTRAL 8X7B INSTRUCT

797

Mixtral is a mixture-of-experts (MoE) large language model from Mistral AI.
This state-of-the-art model uses a mixture of eight 7B expert models. During
inference, two experts are selected. This architecture allows large models to
be fast and cheap at inference. Mixtral-8x7B outperforms Llama 2 70B on most
benchmarks.

by mistral-ai

32K context

$0.24/M input

$0.24/M output

mixtral-8x7b-instruct


UPSTAGE: SOLAR 10.7B INSTRUCT

778

SOLAR-10.7B is a large language model with 10.7 billion parameters, showing
superior performance across various natural language processing tasks,
outperforming other models with up to 30 billion parameters. This model employs
depth up-scaling for enhancement, integrating architectural changes and
continuing pretraining with Mistral 7B weights. It excels in robustness and
adaptability, making it ideal for fine-tuning applications, and consistently
surpasses the Mixtral 8X7B model in benchmarks.

by upstage

4K context

$0.30/M input

$0.30/M output

solar-10.7b-instruct


NOUS HERMES 2 - YI 34B

701

Nous Hermes 2 - Yi-34B is a state-of-the-art Yi fine-tune. Nous Hermes 2 Yi 34B
was trained on 1,000,000 entries of primarily GPT-4 generated data.

by nousresearch

4K context

$0.80/M input

$0.80/M output

nous-hermes-2-yi-34b


QWEN 1.5 4B

332

Qwen1.5 is the improved version of Qwen, the large language model series
developed by Qwen team, Alibaba Cloud.

by alibaba

32K context

$0.10/M input

$0.10/M output

qwen-1.5-4b-chat


QWEN 1.5 7B

304

Qwen1.5 is the improved version of Qwen, the large language model series
developed by Qwen team, Alibaba Cloud.

by alibaba

32K context

$0.20/M input

$0.20/M output

qwen-1.5-7b-chat


PERPLEXITY: SONAR SMALL ONLINE [REPLACED]

212

Sonar represents the newest model family from Perplexity, offering improvements
over previous models in terms of cost-efficiency, speed, and performance. For
*-online* models, in addition to the token charges, a flat $5 is charged per
thousand requests (or half a cent per request). This model has been replaced
with [newer variant](https://monolyth.ai/models/llama-3-sonar-small-32k-online)

by perplexity-ai

12K context

$0.20/M input

$0.20/M output

$0.005/request

sonar-small-online


JAPANESE STABLE LM INSTRUCT BETA 70B

187

japanese-stablelm-instruct-beta-70b is a 70B-parameter decoder-only language model
based on Llama-2-70b that has been fine-tuned on a diverse collection of
Japanese data, with the intent of maximizing downstream performance on Japanese
language tasks.

by stability-ai

8K context

$0.90/M input

$0.90/M output

japanese-stablelm-instruct-beta-70b


MISTRAL EMBED

12

A model that converts text into 1024-dimensional numerical embedding vectors.
Embedding models enable retrieval and retrieval-augmented generation
applications. It achieves a retrieval score of 55.26 on MTEB.

by mistral-ai

$0.10/M input

mistral-embed
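The retrieval use case these embedding models enable can be sketched with plain cosine similarity: embed the documents and the query, then rank by angle. The vectors below are made-up 4-dimensional stand-ins for the 1024 dimensions mistral-embed produces, and `cosine_sim` is an illustrative helper, not part of any API:

```python
# Sketch of embedding-based retrieval: rank documents by cosine similarity
# between fixed-size vectors. Real mistral-embed vectors have 1024 dimensions;
# the 4-dim vectors here are invented purely for illustration.
import math

def cosine_sim(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

docs = {
    "invoice": [0.9, 0.1, 0.0, 0.1],
    "recipe":  [0.0, 0.8, 0.6, 0.0],
}
query = [0.8, 0.2, 0.0, 0.2]  # in practice, the embedding of the user's query

best = max(docs, key=lambda name: cosine_sim(query, docs[name]))
print(best)  # invoice
```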


OPENAI: TEXT EMBEDDING ADA 2

9

text-embedding-ada-002 outperforms all the old embedding models on text search,
code search, and sentence similarity tasks and gets comparable performance on
text classification.

by openai

$0.10/M input

text-embedding-ada-002


OPENAI: TEXT EMBEDDING 3 LARGE

9

text-embedding-3-large is OpenAI's next-generation large embedding model,
creating embeddings with up to 3072 dimensions.

by openai

$0.13/M input

text-embedding-3-large


OPENAI: TEXT EMBEDDING 3 SMALL

9

The Text Embedding 3 Small model is a highly efficient upgrade from the December
2022 release, Text-Embedding-ADA-002. It demonstrates improved performance on
the MIRACL benchmark for multi-language retrieval, increasing from 31.4% to
44.0%, and on the MTEB benchmark for English tasks, improving from 61.0% to
62.3%.

by openai

$0.02/M input

text-embedding-3-small


AUTO

When your model slug is unknown, your prompts will be processed by
[llama-3-70b-instruct](https://monolyth.ai/models/llama-3-70b-instruct).

by monolyth

N/A context

auto
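The documented fallback amounts to a slug lookup with a default. A sketch of that behavior (the `KNOWN_SLUGS` set is a tiny made-up subset of the catalog, and this is an illustration of the routing rule, not the platform's actual code):

```python
# Sketch of the documented AUTO behavior: an unrecognized model slug falls
# back to llama-3-70b-instruct. KNOWN_SLUGS is an illustrative subset only.
KNOWN_SLUGS = {"gpt-4o", "claude-3-opus", "llama-3-70b-instruct"}
FALLBACK = "llama-3-70b-instruct"

def resolve_model(slug: str) -> str:
    """Return the slug to route to: the requested one if known, else FALLBACK."""
    return slug if slug in KNOWN_SLUGS else FALLBACK

print(resolve_model("gpt-4o"))       # gpt-4o
print(resolve_model("not-a-model"))  # llama-3-70b-instruct
```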


OPEN HERMES 2.5 MISTRAL 7B

OpenHermes 2.5 Mistral 7B is a state-of-the-art Mistral fine-tune and a
continuation of the OpenHermes 2 model, trained on additional code datasets.

by nousresearch

8K context

$0.20/M input

$0.20/M output

openhermes-2.5-mistral-7b


OLMO 7B INSTRUCT

OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the
science of language models. The OLMo base models are trained on the
[Dolma](https://huggingface.co/datasets/allenai/dolma) dataset. The adapted
versions are trained on the [Tulu SFT
mixture](https://huggingface.co/datasets/allenai/tulu-v2-sft-mixture) and, for
the Instruct version, a [cleaned version of the UltraFeedback
dataset](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned).
OLMo 7B Instruct and OLMo SFT are two adapted versions of these models trained
for better question answering. They show the performance gain that OLMo base
models can achieve with existing fine-tuning techniques.

by allenai

2K context

$0.20/M input

$0.20/M output

olmo-7b-instruct


QWEN 1.5 0.5B

Qwen1.5 is the improved version of Qwen, the large language model series
developed by Qwen team, Alibaba Cloud.

by alibaba

32K context

$0.10/M input

$0.10/M output

qwen-1.5-0.5b-chat
Request a model


Discord · Pricing · Privacy · Terms
MONOLYTH by Empty Canvas Inc.