monolyth.ai
2606:4700:3036::ac43:9237
Public Scan
URL: https://monolyth.ai/
Submission: On June 11 via api from US — Scanned from DE
Form analysis: 0 forms found in the DOM

Text Content
MONOLYTH
BUILD MODEL-AGNOSTIC AI APPS

Models (Most Popular)

LYNN: SOLILOQUY LLAMA 3 V2 (970K)
Soliloquy-L3 is a fast, highly capable roleplaying model designed for immersive, dynamic experiences. Trained on over 250 million tokens of roleplaying data, Soliloquy-L3 has a vast knowledge base, rich literary expression, and support for up to 24k context length. It outperforms existing ~13B models, delivering enhanced roleplaying capabilities. Usage of this model is subject to [Meta's Acceptable Use Policy](https://ai.meta.com/llama/use-policy/). *This provider may log and train based on your prompt.*
by lynn · 24K context · $0.05/M input · $0.05/M output · soliloquy-l3

META: LLAMA 3 70B INSTRUCT (527K)
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
by meta · 8K context · $0.59/M input · $0.79/M output · llama-3-70b-instruct

SNOWFLAKE: ARCTIC INSTRUCT (246K)
Arctic is a dense-MoE hybrid transformer architecture pre-trained from scratch by the Snowflake AI Research Team. **Efficient Intelligence**: Arctic outperforms similar open-source models in enterprise tasks like SQL generation and coding, setting a new standard for cost-effective AI training for Snowflake customers. **True Openness**: Licensed under Apache 2.0, Arctic offers full access to its code and weights, and openly shares all data recipes and research insights.
by snowflake · 4K context · $2.40/M input · $2.40/M output · snowflake-arctic-instruct

LLAVA V1.6 YI 34B (244K)
LLaVA 1.6 is a vision-language model that accepts both image and text inputs. LLaVA is an open-source chatbot trained by fine-tuning an LLM on multimodal instruction-following data.
by fireworks · 4K context · $0.90/M input · $0.90/M output · llava-yi-34b

WIZARDLM-2 8X22B (220K)
WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models.
by microsoft · 65K context · $1.08/M input · $1.08/M output · wizardlm-2-8x22b

ANTHROPIC: CLAUDE 3 OPUS (199K)
Claude 3 Opus from Anthropic is their most advanced model, designed for highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
by anthropic · 200K context · $15.00/M input · $75.00/M output · claude-3-opus

PHI 3 MINI INSTRUCT PREVIEW (189K)
Phi-3 Mini is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high quality and reasoning-dense properties. The model has undergone a post-training process that incorporates both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3 Mini-4K-Instruct showcased robust, state-of-the-art performance among models with fewer than 13 billion parameters. *This model and the API are experimental.*
by microsoft · 128K context · $0.10/M input · $0.10/M output · phi-3-mini-128k-instruct
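Every model in this catalog is addressed by the slug shown at the end of its entry. The page itself does not show the API, so the following is a minimal sketch of what a model-agnostic call could look like, assuming an OpenAI-compatible chat-completions route; the base URL, path, and auth header are assumptions, and the real interface is under the Docs link.

```python
import requests

# Hypothetical endpoint and key: the base URL, route, and auth scheme are
# assumptions (this page links to Docs but does not show the API itself).
API_BASE = "https://monolyth.ai/api/v1"   # assumed, not confirmed by the page
API_KEY = "YOUR_KEY"                      # presumably issued under the Keys tab

def chat(model_slug: str, prompt: str) -> str:
    """Send one chat turn to a model chosen by its catalog slug."""
    resp = requests.post(
        f"{API_BASE}/chat/completions",   # assumed OpenAI-compatible route
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": model_slug,          # e.g. "llama-3-70b-instruct"
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Swapping models is a one-string change, which is the "model-agnostic" pitch.
print(chat("llama-3-70b-instruct", "Say hello."))
```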
OPENAI: GPT-4O (167K)
GPT-4o ("o" for "omni") is OpenAI's newest AI model, designed to handle both text and image inputs with text outputs. It retains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-efficient. Additionally, GPT-4o excels at processing non-English languages and offers superior visual capabilities. During benchmarking against other models, it was temporarily named "im-also-a-good-gpt2-chatbot."
by openai · 128K context · $5.00/M input · $15.00/M output · gpt-4o

OPENAI: GPT-4 TURBO (155K)
GPT-4 is a large multimodal model (accepting text or image inputs and outputting text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models, thanks to its broader general knowledge and advanced reasoning capabilities.
by openai · 128K context · $10.00/M input · $30.00/M output · gpt-4-turbo

GEMINI 1.5 FLASH PREVIEW (122K)
Gemini 1.5 Flash is Google's fastest multimodal model, with exceptional speed and efficiency for quick, high-frequency tasks. Currently available in preview.
by google · 1M context · $0.35/M input · $0.53/M output · gemini-1.5-flash

MISTRAL: 8X22B INSTRUCT (118K)
Mixtral 8x22B is Mistral's latest open model. It sets a new standard for performance and efficiency within the AI community. It is a sparse mixture-of-experts (SMoE) model that uses only 39B active parameters out of 141B, offering unparalleled cost efficiency for its size.
by mistral-ai · 65K context · $0.65/M input · $0.65/M output · mixtral-8x22b-instruct

NOUS HERMES 2 MIXTRAL 8X7B DPO (117K)
Nous Hermes 2 Mixtral 8x7B DPO is the new flagship Nous Research model, trained over the Mixtral 8x7B MoE LLM.
by nousresearch · 32K context · $0.50/M input · $0.50/M output · nous-hermes-2-mixtral-8x7b-dpo

META: LLAMA 3 8B INSTRUCT (104K)
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
by meta · 8K context · $0.10/M input · $0.10/M output · llama-3-8b-instruct

PERPLEXITY: LLAMA SONAR LARGE 32K CHAT (98K)
Perplexity has announced the launch of its new Perplexity models, llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their search-enabled versions, llama-3-sonar-small-32k-online and llama-3-sonar-large-32k-online. These models are reported to exceed the performance of their predecessors (sonar-small, sonar-medium). This is the large variant.
by perplexity-ai · 32K context · $1.00/M input · $1.00/M output · llama-3-sonar-large-32k-chat

ANTHROPIC: CLAUDE 3 HAIKU (85K)
Claude 3 Haiku, created by Anthropic, is their fastest and most efficient model to date, designed for near-instant responses. It offers rapid and accurate performance for specific tasks.
by anthropic · 200K context · $0.25/M input · $1.25/M output · claude-3-haiku
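Prices throughout this catalog are quoted in dollars per million tokens, with separate input and output rates. A quick worked example of per-call cost, using rates copied from the entries above (the token counts are invented for illustration):

```python
# Cost per call = input_tokens * input_rate/1e6 + output_tokens * output_rate/1e6.
# Rates are the ($/M input, $/M output) figures listed on this page.
RATES = {
    "gpt-4o":               (5.00, 15.00),
    "claude-3-haiku":       (0.25, 1.25),
    "llama-3-70b-instruct": (0.59, 0.79),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    rate_in, rate_out = RATES[model]
    return input_tokens * rate_in / 1e6 + output_tokens * rate_out / 1e6

# 2,000 input tokens and 500 output tokens per call (made-up numbers):
for model in RATES:
    print(f"{model}: ${request_cost(model, 2_000, 500):.4f} per call")
# gpt-4o: 2000*5/1e6 + 500*15/1e6 = $0.0175
```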
PERPLEXITY: SONAR MEDIUM CHAT [REPLACED] (77K)
Sonar represents the newest model family from Perplexity, offering improvements over previous models in terms of cost-efficiency, speed, and performance. This model has been replaced with a [newer variant](https://monolyth.ai/models/llama-3-sonar-large-32k-chat).
by perplexity-ai · 16K context · $0.60/M input · $0.60/M output · sonar-medium-chat

ANTHROPIC: CLAUDE 3 SONNET (76K)
Claude 3 Sonnet offers an optimal mix of intelligence and speed suitable for enterprise tasks. It provides high utility at a reduced cost, is reliable, and is well-suited for large-scale deployments.
by anthropic · 200K context · $3.00/M input · $15.00/M output · claude-3-sonnet

LLAMA 3 LUMIMAID 8B (73K)
Llama 3 was fine-tuned on the NeverSleep team's curated datasets, focusing on balanced ERP and RP data while maintaining appropriate seriousness and selective uncensorship. Additionally, the model incorporates the new Luminae dataset from Ikari and roughly 40% non-roleplay data, enhancing its overall intelligence and conversational abilities. This combination ensures breadth of knowledge while prioritizing roleplay expertise.
by neversleep · 24K context · $0.23/M input · $2.25/M output · llama-3-lumimaid-8b

TOPPY M 7B (69K)
An affordable 7B-parameter model that combines multiple models using the new task_arithmetic merge method from MergeKit.
by undi95 · 4K context · $0.20/M input · $0.20/M output · toppy-m-7b

GOOGLE: GEMINI 1.5 PRO PREVIEW (66K)
Gemini 1.5 by Google delivers dramatically enhanced performance with a more efficient architecture. This model is intended for early testing.
by google · 1M context · $7.00/M input · $21.00/M output · gemini-1.5-pro

MISTRAL: LARGE (58K)
Mistral AI's premier model. Built on a closed-source prototype, it is highly capable in reasoning, coding, handling JSON, chatting, and more. For more details, refer to the launch announcement. The model is proficient in English, French, Spanish, German, and Italian, delivering high grammatical precision. With a context window of 32K tokens, it effectively recalls detailed information from extensive documents.
by mistral-ai · 32K context · $8.00/M input · $24.00/M output · mistral-large

UPSTAGE: SOLAR 1 MINI CHAT (48K)
A compact LLM offering superior performance to GPT-3.5, with robust multilingual capabilities for both English and Korean, delivering high efficiency in a smaller package.
by upstage · 16K context · $0.50/M input · $0.50/M output · solar-1-mini-chat

ZEPHYR 141B A35B (41K)
Zephyr 141B-A35B is an instruction-finetuned version of Mixtral-8x22B. It was fine-tuned by Hugging Face on a mix of publicly available synthetic datasets. It achieves strong performance on chat benchmarks.
by huggingface · 65K context · $0.65/M input · $0.65/M output · zephyr-8x22b-instruct

MYTHOMAX L2 13B (37K)
MythoMax L2 13B is a large language model created by Gryphe that specializes in storytelling and advanced roleplaying. It is built on the Llama 2 architecture and is an optimized version of the MythoMix model, incorporating a tensor-merger strategy for increased coherency and performance. MythoMax L2 13B has been praised for its ability to bring characters to life and create engaging roleplaying experiences.
by gryphe · 4K context · $0.13/M input · $0.13/M output · mythomax-l2-13b

MISTRAL: TINY (36K)
Mistral-Tiny is the smallest and most cost-effective model in the lineup offered by Mistral AI, designed to provide essential language-modeling capabilities in a resource-efficient package.
by mistral-ai · 32K context · $0.25/M input · $0.25/M output · mistral-tiny
PERPLEXITY: LLAMA SONAR LARGE 32K ONLINE (35K)
Perplexity has announced the launch of its new Perplexity models, llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their search-enabled versions, llama-3-sonar-small-32k-online and llama-3-sonar-large-32k-online. These models are reported to exceed the performance of their predecessors (sonar-small, sonar-medium). This is the large, online variant.
by perplexity-ai · 28K context · $1.00/M input · $1.00/M output · $0.005/request · llama-3-sonar-large-32k-online

GOOGLE: GEMMA 7B (27K)
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text-generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.
by google · 8K context · $0.20/M input · $0.20/M output · gemma-7b

OPEN CHAT 3.5 (25K)
OpenChat is a library of open-source language models that have been fine-tuned with C-RLFT, a strategy inspired by offline reinforcement learning. These models can learn from mixed-quality data without preference labels and have achieved performance comparable to ChatGPT.
by openchat · 8K context · $0.13/M input · $0.13/M output · openchat-7b

PERPLEXITY: LLAMA SONAR SMALL 32K CHAT (24K)
Perplexity has announced the launch of its new Perplexity models, llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their search-enabled versions, llama-3-sonar-small-32k-online and llama-3-sonar-large-32k-online. These models are reported to exceed the performance of their predecessors (sonar-small, sonar-medium).
by perplexity-ai · 32K context · $0.20/M input · $0.20/M output · llama-3-sonar-small-32k-chat

PERPLEXITY: SONAR MEDIUM ONLINE [REPLACED] (24K)
Sonar represents the newest model family from Perplexity, offering improvements over previous models in terms of cost-efficiency, speed, and performance. For *-online* models, in addition to the token charges, a flat $5 is charged per thousand requests (half a cent per request). This model has been replaced with a [newer variant](https://monolyth.ai/models/llama-3-sonar-large-32k-online).
by perplexity-ai · 12K context · $0.60/M input · $0.60/M output · $0.005/request · sonar-medium-online

QWEN 1.5 110B (24K)
The latest release, Qwen1.5-110B, is part of the Qwen1.5 series and features the same Transformer decoder architecture, including grouped-query attention (GQA) for efficient model serving. It supports a context length of 32K tokens and remains multilingual, accommodating numerous languages such as English, Chinese, French, and more. In evaluations, Qwen1.5-110B shows comparable results to Meta-Llama3-70B in base-model performance and excels in chat evaluations, including MT-Bench and AlpacaEval 2.0.
by alibaba · 32K context · $1.80/M input · $1.80/M output · qwen-1.5-110b-chat
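As the sonar-medium-online entry notes, *-online* models add a flat $5 per thousand requests ($0.005/request) on top of token charges. A small sketch of how that surcharge shifts total cost, using the rate from the llama-3-sonar-large-32k-online entry (request and token counts are invented):

```python
# Online Sonar models add a flat per-request fee on top of token charges:
# $5 per thousand requests = $0.005/request, as stated in the listing above.
TOKEN_RATE = 1.00 / 1e6   # llama-3-sonar-large-32k-online: $1.00/M both ways,
                          # so input and output tokens can be counted together
PER_REQUEST = 0.005

def online_cost(n_requests: int, tokens_per_request: int) -> float:
    token_cost = n_requests * tokens_per_request * TOKEN_RATE
    return token_cost + n_requests * PER_REQUEST

# 1,000 requests of 3,000 tokens each: $3.00 in tokens + $5.00 in request fees.
print(f"${online_cost(1_000, 3_000):.2f}")  # prints $8.00
```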
MISTRAL: MEDIUM (24K)
Mistral Medium is a closed-source model developed by Mistral AI. It operates on proprietary model weights and excels in reasoning, coding, handling JSON, chatting, and various other applications. It performs comparably to many flagship models from different companies on benchmarks.
by mistral-ai · 32K context · $2.70/M input · $8.10/M output · mistral-medium

QWEN 72B CHAT (23K)
Qwen-72B is a 72-billion-parameter model in the Qwen large-model series developed by Alibaba Cloud. It is a Transformer-based large language model trained on very large-scale pre-training data of diverse types and broad coverage, including large amounts of web text, professional books, code, and more.
by alibaba · 4K context · $0.90/M input · $0.90/M output · qwen-72b-chat

OPENAI: GPT-3.5 TURBO (19K)
The latest GPT-3.5 Turbo model, with higher accuracy at responding in requested formats and a fix for a bug that caused a text-encoding issue for non-English-language function calls. Returns a maximum of 4,096 output tokens.
by openai · 16K context · $0.50/M input · $1.50/M output · gpt-3.5-turbo

MISTRAL: SMALL (18K)
Mistral-Small is a mid-tier model in the Mistral AI suite, offering a balance between performance and affordability.
by mistral-ai · 32K context · $2.00/M input · $6.00/M output · mistral-small

WIZARDLM 7B (16K)
WizardLM-2 7B is the smaller variant of Microsoft AI's latest Wizard model. It is the fastest and achieves performance comparable to leading open-source models 10x its size.
by microsoft · 32K context · $0.10/M input · $0.10/M output · wizardlm-2-7b

UPSTAGE: SOLAR 1 MINI TRANSLATE EN-KO (15K)
Context-aware English-Korean translation that leverages previous dialogue to ensure unmatched coherence and continuity in your conversations.
by upstage · 32K context · $1.00/M input · $1.00/M output · solar-1-mini-translate-enko

COHERE: COMMAND (14K)
Command is a conversational model designed for following instructions and executing language tasks with superior quality, greater reliability, and a broader context compared to Cohere's standard generative models. Use of this model is governed by Cohere's Acceptable Use Policy.
by cohere · 4K context · $1.00/M input · $2.00/M output · command

REKA CORE (11K)
Reka Core is a frontier-class multimodal language model developed by Reka AI.
by reka-ai · 128K context · $10.00/M input · $25.00/M output · reka-core

PERPLEXITY: SONAR SMALL CHAT [REPLACED] (11K)
Sonar represents the newest model family from Perplexity, offering improvements over previous models in terms of cost-efficiency, speed, and performance. This model has been replaced with a [newer variant](https://monolyth.ai/models/llama-3-sonar-small-32k-chat).
by perplexity-ai · 16K context · $0.20/M input · $0.20/M output · sonar-small-chat

PHIND: CODELLAMA 34B (11K)
Phind-CodeLlama-34B-v2 is an open-source language model fine-tuned on 1.5B tokens of high-quality programming-related data, proficient in languages like Python, C/C++, TypeScript, and Java. It achieved a 73.8% pass rate on HumanEval and is instruction-tuned using Alpaca/Vicuna formats for better usability and steerability. Trained on proprietary instruction-answer pairs, it generates a single completion per prompt.
by phind · 4K context · $0.60/M input · $0.60/M output · phind-codellama-34b-v2

FIRELLAVA 13B (10K)
A LLaVA vision-language model trained on OSS-LLM-generated instruction-following data. One image is counted as 576 prompt tokens.
by fireworks · 4K context · $0.20/M input · $0.20/M output · firellava-13b
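FireLLaVA bills each image as 576 prompt tokens at its $0.20/M input rate, so image cost folds into ordinary token arithmetic (the image and text counts below are invented for illustration):

```python
# FireLLaVA counts each image as 576 prompt tokens (per the listing above),
# billed at $0.20 per million input tokens.
IMAGE_TOKENS = 576
RATE = 0.20 / 1e6

def firellava_prompt_cost(n_images: int, text_tokens: int) -> float:
    return (n_images * IMAGE_TOKENS + text_tokens) * RATE

# One image plus a 200-token question: (576 + 200) tokens * $0.20/M = $0.000155
print(f"${firellava_prompt_cost(1, 200):.6f}")
```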
PERPLEXITY: LLAMA SONAR SMALL 32K ONLINE (9K)
Perplexity has announced the launch of its new Perplexity models, llama-3-sonar-small-32k-chat and llama-3-sonar-large-32k-chat, as well as their search-enabled versions, llama-3-sonar-small-32k-online and llama-3-sonar-large-32k-online. These models are reported to exceed the performance of their predecessors (sonar-small, sonar-medium). This is the online variant.
by perplexity-ai · 28K context · $0.20/M input · $0.20/M output · $0.005/request · llama-3-sonar-small-32k-online

LZLV 70B (8K)
A MythoMax/MLewd_13B-style multi-model merge of several LLaMA2 70B finetunes for roleplaying and creative work. The goal was to create a model that combines creativity with intelligence for an enhanced experience.
by lizpreciatior · 4K context · $0.70/M input · $0.90/M output · lzlv-70b-fp16-hf

QWEN 1.5 1.8B (6K)
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.
by alibaba · 32K context · $0.10/M input · $0.10/M output · qwen-1.5-1.8b-chat

AI21: JAMBA INSTRUCT PREVIEW (6K)
An instruction-tuned version of AI21's hybrid SSM-Transformer Jamba model, Jamba-Instruct is built for reliable commercial use, with best-in-class quality and performance.
by ai21 · 256K context · $0.50/M input · $0.70/M output · jamba-instruct-preview

CHRONOS HERMES 13B (6K)
This model is a 75/25 merge of the Chronos (13B) and Nous Hermes (13B) models, giving it a strong ability to produce evocative storywriting and follow a narrative.
by austism · 2K context · $0.30/M input · $0.30/M output · chronos-hermes-13b

QWEN 1.5 32B (5K)
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.
by alibaba · 32K context · $0.80/M input · $0.80/M output · qwen-1.5-32b-chat

REKA FLASH (5K)
Reka Flash is a state-of-the-art 21B model trained entirely from scratch and pushed to its absolute limits. It serves as the "turbo-class" offering in Reka's lineup of models. Reka Flash rivals the performance of many significantly larger models, making it an excellent choice for fast workloads that require high quality. On a myriad of language and vision benchmarks, it is competitive with Gemini Pro and GPT-3.5.
by reka-ai · 8K context · $0.80/M input · $2.00/M output · reka-flash

QWEN 1.5 72B (4K)
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.
by alibaba · 32K context · $0.90/M input · $0.90/M output · qwen-1.5-72b-chat

REKA EDGE (3K)
A lightweight, 7B-equivalent model for local (i.e., on-hardware) or latency-sensitive applications.
by reka-ai · 8K context · $0.40/M input · $1.00/M output · reka-edge

NOUS CAPYBARA 7B 1.9 (3K)
The Capybara series is the first Nous collection of datasets and models made by fine-tuning mostly on data created in-house by Nous.
by nousresearch · 4K context · $0.20/M input · $0.20/M output · nous-capybara-7b

META: LLAMA 2 70B CHAT (3K)
Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the 70B fine-tuned model, optimized for dialogue use cases and converted to the Hugging Face Transformers format.
by meta · 4K context · $0.90/M input · $0.90/M output · llama-2-70b-chat

AIROBOROS L2 70B (3K)
Airoboros is a fairly general-purpose model, but it focuses heavily on instruction following rather than casual chat/roleplay.
by jondurbin · 4K context · $0.70/M input · $0.90/M output · airoboros-70b
UPSTAGE: SOLAR 1 MINI TRANSLATE KO-EN (2K)
Context-aware Korean-English translation that leverages previous dialogue to ensure unmatched coherence and continuity in your conversations.
by upstage · 32K context · $1.00/M input · $1.00/M output · solar-1-mini-translate-koen

MISTRAL: 7B INSTRUCT V0.2 (2K)
The Mistral-7B-Instruct-v0.2 large language model (LLM) is an enhanced version of the Mistral-7B-v0.2 generative text model, fine-tuned for instruction-based tasks using numerous publicly accessible conversation datasets.
by mistral-ai · 32K context · $0.13/M input · $0.13/M output · mistral-7b-instruct-v0.2

HERMES 2 PRO MISTRAL 7B (2K)
Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house.
by nousresearch · 8K context · $0.20/M input · $0.20/M output · hermes-2-pro-mistral-7b

YI 34B CHAT (1K)
Yi-34B is a large language model (LLM) developed by the AI startup 01.AI. It is a bilingual (English and Chinese) base model trained with 34 billion parameters. Yi-34B has shown impressive performance on various natural language processing tasks.
by 01-ai · 4K context · $0.80/M input · $0.80/M output · yi-34b-chat

DBRX INSTRUCT (1K)
DBRX Instruct is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. DBRX Instruct specializes in few-turn interactions.
by databricks · 32K context · $1.20/M input · $1.20/M output · dbrx-instruct

JAPANESE STABLE LM INSTRUCT GAMMA 7B (1K)
A 7B-parameter decoder-only Japanese language model fine-tuned on instruction-following datasets, built on top of the base model Japanese Stable LM Base Gamma 7B.
by stability-ai · 8K context · $0.20/M input · $0.20/M output · japanese-stablelm-instruct-gamma-7b

OPEN HERMES 2 MISTRAL 7B (1K)
OpenHermes 2 Mistral 7B is a state-of-the-art Mistral fine-tune.
by nousresearch · 8K context · $0.20/M input · $0.20/M output · openhermes-2-mistral-7b

QWEN 1.5 14B (1K)
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.
by alibaba · 32K context · $0.30/M input · $0.30/M output · qwen-1.5-14b-chat

MISTRAL: MIXTRAL 8X7B INSTRUCT (797)
Mixtral is a mixture-of-experts large language model (LLM) from Mistral AI: a state-of-the-art model composed of eight 7B expert models (MoE). During inference, two experts are selected; this architecture allows large models to be fast and cheap at inference. Mixtral-8x7B outperforms Llama 2 70B on most benchmarks.
by mistral-ai · 32K context · $0.24/M input · $0.24/M output · mixtral-8x7b-instruct

UPSTAGE: SOLAR 10.7B INSTRUCT (778)
SOLAR-10.7B is a large language model with 10.7 billion parameters, showing superior performance across various natural language processing tasks and outperforming other models with up to 30 billion parameters. This model employs depth up-scaling for enhancement, integrating architectural changes and continued pretraining with Mistral 7B weights. It excels in robustness and adaptability, making it ideal for fine-tuning applications, and consistently surpasses the Mixtral 8x7B model in benchmarks.
by upstage · 4K context · $0.30/M input · $0.30/M output · solar-10.7b-instruct
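Hermes 2 Pro, listed above, is explicitly trained on a function-calling and JSON-mode dataset. Whether Monolyth forwards an OpenAI-style `tools` array is not shown on this page, so the sketch below layers that assumption on the earlier hypothetical endpoint; the `get_weather` tool is invented for illustration.

```python
import json
import requests

API_BASE = "https://monolyth.ai/api/v1"  # assumed, as in the earlier sketch
API_KEY = "YOUR_KEY"

# A hypothetical tool definition in the common OpenAI-style schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",           # illustrative tool, not a real API
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "hermes-2-pro-mistral-7b",
        "messages": [{"role": "user", "content": "Weather in Berlin?"}],
        "tools": tools,                  # assumes the gateway forwards tools
    },
    timeout=60,
)
resp.raise_for_status()
message = resp.json()["choices"][0]["message"]

# If the model chose to call the tool, print its name and parsed arguments.
for call in message.get("tool_calls", []):
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))
```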
NOUS HERMES 2 - YI 34B (701)
Nous Hermes 2 - Yi-34B is a state-of-the-art Yi fine-tune. Nous Hermes 2 Yi 34B was trained on 1,000,000 entries of primarily GPT-4-generated data.
by nousresearch · 4K context · $0.80/M input · $0.80/M output · nous-hermes-2-yi-34b

QWEN 1.5 4B (332)
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.
by alibaba · 32K context · $0.10/M input · $0.10/M output · qwen-1.5-4b-chat

QWEN 1.5 7B (304)
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.
by alibaba · 32K context · $0.20/M input · $0.20/M output · qwen-1.5-7b-chat

PERPLEXITY: SONAR SMALL ONLINE [REPLACED] (212)
Sonar represents the newest model family from Perplexity, offering improvements over previous models in terms of cost-efficiency, speed, and performance. For *-online* models, in addition to the token charges, a flat $5 is charged per thousand requests (half a cent per request). This model has been replaced with a [newer variant](https://monolyth.ai/models/llama-3-sonar-small-32k-online).
by perplexity-ai · 12K context · $0.20/M input · $0.20/M output · $0.005/request · sonar-small-online

JAPANESE STABLE LM INSTRUCT BETA 70B (187)
japanese-stablelm-base-beta-70b is a 70B-parameter decoder-only language model based on Llama-2-70b that has been fine-tuned on a diverse collection of Japanese data, with the intent of maximizing downstream performance on Japanese language tasks.
by stability-ai · 8K context · $0.90/M input · $0.90/M output · japanese-stablelm-instruct-beta-70b

MISTRAL EMBED (12)
A model that converts text into 1024-dimensional embedding vectors. Embedding models enable retrieval and retrieval-augmented-generation applications. It achieves a retrieval score of 55.26 on MTEB.
by mistral-ai · $0.10/M input · mistral-embed

OPENAI: TEXT EMBEDDING ADA 2 (9)
text-embedding-ada-002 outperforms all the old embedding models on text search, code search, and sentence-similarity tasks and gets comparable performance on text classification.
by openai · $0.10/M input · text-embedding-ada-002

OPENAI: TEXT EMBEDDING 3 LARGE (9)
text-embedding-3-large is OpenAI's next-generation larger embedding model, creating embeddings with up to 3072 dimensions.
by openai · $0.13/M input · text-embedding-3-large

OPENAI: TEXT EMBEDDING 3 SMALL (9)
The Text Embedding 3 Small model is a highly efficient upgrade from the December 2022 release, text-embedding-ada-002. It demonstrates improved performance on the MIRACL benchmark for multi-language retrieval, increasing from 31.4% to 44.0%, and on the MTEB benchmark for English tasks, improving from 61.0% to 62.3%.
by openai · $0.02/M input · text-embedding-3-small
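The embedding models above are priced on input tokens only and return vectors rather than text. A minimal retrieval sketch using cosine similarity, again assuming the hypothetical OpenAI-compatible endpoint from the earlier examples (the `/embeddings` route is an assumption):

```python
import math
import requests

API_BASE = "https://monolyth.ai/api/v1"  # assumed, as in the earlier sketches
API_KEY = "YOUR_KEY"

def embed(texts: list[str], model: str = "text-embedding-3-small") -> list[list[float]]:
    """Embed a batch of strings with one of the catalog's embedding models."""
    resp = requests.post(
        f"{API_BASE}/embeddings",        # assumed OpenAI-compatible route
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "input": texts},
        timeout=60,
    )
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Embed two documents and a query in one batch, then pick the closest document.
docs = ["Mixtral is a sparse MoE model.", "Claude 3 Opus targets complex tasks."]
vectors = embed(docs + ["Which model is a mixture of experts?"])
query = vectors[-1]
best = max(range(len(docs)), key=lambda i: cosine(vectors[i], query))
print(docs[best])
```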
AUTO
When your model slug is unknown, your prompts will be processed by [llama-3-70b-instruct](https://monolyth.ai/models/llama-3-70b-instruct).
by monolyth · N/A context · auto

OPEN HERMES 2.5 MISTRAL 7B
OpenHermes 2.5 Mistral 7B is a state-of-the-art Mistral fine-tune and a continuation of the OpenHermes 2 model, trained on additional code datasets.
by nousresearch · 8K context · $0.20/M input · $0.20/M output · openhermes-2.5-mistral-7b

OLMO 7B INSTRUCT
OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models. The OLMo base models are trained on the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset. The adapted versions are trained on the [Tulu SFT mixture](https://huggingface.co/datasets/allenai/tulu-v2-sft-mixture) and, for the Instruct version, a [cleaned version of the UltraFeedback dataset](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned). OLMo 7B Instruct and OLMo SFT are two adapted versions of these models trained for better question answering. They show the performance gains that OLMo base models can achieve with existing fine-tuning techniques.
by allenai · 2K context · $0.20/M input · $0.20/M output · olmo-7b-instruct

QWEN 1.5 0.5B
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.
by alibaba · 32K context · $0.10/M input · $0.10/M output · qwen-1.5-0.5b-chat

Request a model · Discord · Pricing · Privacy · Terms
MONOLYTH by Empty Canvas Inc.