huggingface.co Open in urlscan Pro
2600:9000:275b:1200:17:b174:6d00:93a1 Public Scan

URL:
https://huggingface.co/AnatoliiPotapov/T-lite-instruct-0.1
Submission: On August 13 via manual (August 13th 2024, 9:49:27 am UTC) from RU — Scanned from DE

ANATOLIIPOTAPOV
/
T-LITE-INSTRUCT-0.1


Tags: Text Generation, Transformers, Safetensors, Russian, llama, conversational, text-generation-inference, Inference Endpoints


T-LITE-INSTRUCT-0.1

🚨 T-lite is designed for further fine-tuning and is not intended as a
ready-to-use conversational assistant. Users are advised to exercise caution and
are responsible for any additional training and oversight required to ensure the
model's responses meet acceptable ethical and safety standards. The
responsibility for incorporating this model into industrial or commercial
solutions lies entirely with those who choose to deploy it.


DESCRIPTION

T-lite-instruct-0.1 is an instruction-tuned version of the T-lite-0.1 model.

T-lite-instruct-0.1 was trained in bf16.


📚 DATASET

CONTEXTS

For the instruction dataset, the contexts are obtained from:

 * Open-source English-language datasets (such as UltraFeedback, HelpSteer, and
   SHP)
 * Machine translations of English-language datasets
 * Synthetic grounded QA contexts generated from pre-training datasets

The translated contexts are filtered using classifiers.

SFT

The responses to the contexts are generated by a strong model, and training is
carried out exclusively on these responses. This avoids training the model on
poor-quality translations.
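
Training exclusively on responses is commonly implemented by masking the context tokens out of the loss. The following is a minimal sketch of that idea, assuming the PyTorch convention that label `-100` is ignored (the default `ignore_index` of `torch.nn.CrossEntropyLoss`). The helper name `build_labels` and the token IDs are hypothetical; the model card does not describe the authors' actual implementation.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def build_labels(context_ids, response_ids):
    """Return (input_ids, labels) where only response tokens contribute to the loss."""
    input_ids = list(context_ids) + list(response_ids)
    # Context (prompt) positions are masked; response positions keep their token IDs.
    labels = [IGNORE_INDEX] * len(context_ids) + list(response_ids)
    return input_ids, labels


# Hypothetical token IDs, for illustration only.
input_ids, labels = build_labels([11, 12, 13], [21, 22])
print(input_ids)  # [11, 12, 13, 21, 22]
print(labels)     # [-100, -100, -100, 21, 22]
```

With labels built this way, a possibly poor machine-translated context still conditions the model but is never used as a training target.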

REWARD MODELING

The reward model (RM) is trained on the following preference pairs:

 * Strong Model > Our Model
 * Stronger Model > Weaker Model
 * Chosen Translated Response > Rejected Translated Response
 * Pairs from original English datasets

The translated preference data are pre-filtered by an ensemble of RMs.
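
Reward models are commonly trained on preference pairs like these with a pairwise (Bradley-Terry) loss, which pushes the score of the preferred response above that of the rejected one. The sketch below assumes that formulation; the model card does not state which loss was used, and `pairwise_rm_loss` and the example scores are hypothetical.

```python
import math


def pairwise_rm_loss(score_chosen: float, score_rejected: float) -> float:
    """Bradley-Terry loss: -log(sigmoid(score_chosen - score_rejected))."""
    margin = score_chosen - score_rejected
    # log1p(exp(-margin)) equals -log(sigmoid(margin)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical RM scores: the loss is small when the chosen response scores higher.
print(pairwise_rm_loss(2.0, 0.5))
print(pairwise_rm_loss(0.5, 2.0))
```

When both responses receive the same score, the loss equals log 2, which corresponds to the model being undecided between them.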

PREFERENCE TUNING

Two stages were used in preference tuning:

 * Stage 1: SPIN on the responses of the teacher model (Strong Model > Our
   Model)
 * Stage 2: SLiC-HF using our RM
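
The two stages can be sketched as follows, under stated assumptions. Stage 1 pairs a teacher response (preferred) with the model's own response (rejected), as the list above describes. Stage 2 is illustrated with the rank-calibration hinge loss from the SLiC-HF formulation as we understand it; the full method also includes a regularization term that is omitted here. All function names, the best-versus-worst pairing rule, the margin `delta`, and the toy reward function are hypothetical and not taken from the model card.

```python
def spin_pair(prompt, teacher_response, own_response):
    """Stage 1 style pair: the teacher response is preferred over the model's own."""
    return {"prompt": prompt, "chosen": teacher_response, "rejected": own_response}


def rm_ranked_pair(prompt, candidates, reward_fn):
    """Stage 2 style pair: rank sampled candidates with a reward function."""
    ranked = sorted(candidates, key=lambda c: reward_fn(prompt, c), reverse=True)
    # Assumption: take the best- and worst-scored candidates as the pair.
    return {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}


def slic_hinge_loss(logp_chosen, logp_rejected, delta=1.0):
    """Hinge loss on sequence log-probabilities: max(0, delta - logp_chosen + logp_rejected)."""
    return max(0.0, delta - logp_chosen + logp_rejected)


# Toy reward that prefers longer answers, purely for illustration.
pair = rm_ranked_pair("Q", ["short", "a longer answer"], lambda p, c: len(c))
print(pair["chosen"])               # a longer answer
print(slic_hinge_loss(-1.0, -3.0))  # 0.0 (chosen already ahead by more than delta)
print(slic_hinge_loss(-3.0, -1.0))  # 3.0 (rejected is more likely, so it is penalized)
```

The hinge form stops penalizing a pair once the chosen response is more likely than the rejected one by at least `delta`.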


📊 BENCHMARKS

Here we present the results of T-lite-instruct-0.1 on automatic benchmarks.


🏆 MT-BENCH

This benchmark was carefully translated into Russian and evaluated with the LLM
Judge codebase, using gpt-4-1106-preview as the judge.

| MT-Bench | Total | Turn_1 | Turn_2 | coding | humanities | math | reasoning | roleplay | stem | writing |
|---|---|---|---|---|---|---|---|---|---|---|
| T-lite-instruct-0.1 | 6.458 | 6.833 | 6.078 | 4.136 | 8.45 | 4.25 | 4.5 | 7.667 | 7.7 | 7.706 |
| gpt3.5-turbo-0125 | 6.373 | 6.423 | 6.320 | 6.519 | 7.474 | 4.75 | 4.15 | 6.333 | 6.7 | 7.588 |
| suzume-llama-3-8B-multilingual-orpo-borda-half | 6.051 | 6.577 | 5.526 | 4.318 | 8.0 | 4.0 | 3.6 | 7.056 | 6.7 | 7.889 |
| Qwen2-7b-Instruct | 6.026 | 6.449 | 5.603 | 5.0 | 6.95 | 5.8 | 4.15 | 7.167 | 5.85 | 7.278 |
| Llama-3-8b-Instruct | 5.948 | 6.662 | 5.224 | 4.727 | 7.8 | 3.9 | 2.8 | 7.333 | 6.053 | 7.0 |
| suzume-llama-3-8B-multilingual | 5.808 | 6.167 | 5.449 | 5.409 | 6.4 | 5.05 | 3.8 | 6.556 | 5.0 | 7.056 |
| saiga_llama3_8b | 5.471 | 5.896 | 5.039 | 3.0 | 7.4 | 3.55 | 3.5 | 6.444 | 5.15 | 7.812 |
| Mistral-7B-Instruct-v0.3 | 5.135 | 5.679 | 4.584 | 4.045 | 6.35 | 3.15 | 3.2 | 5.765 | 5.2 | 7.333 |


🏟️ ARENA

We used the Russian version of the Arena benchmark from Vikhrmodels and the
Arena Hard Auto codebase for evaluation. The baseline model was
gpt3.5-turbo-0125 and the judge was gpt-4-1106-preview.

| Arena General | Score | 95% CI | Average Tokens |
|---|---|---|---|
| T-lite-instruct-0.1 | 57.26 | -2.9/2 | 870 |
| gpt3.5-turbo-0125 | 50 | 0/0 | 254 |
| suzume-llama-3-8B-multilingual-orpo-borda-half | 47.17 | -2.6/2.4 | 735 |
| Llama-3-8b-Instruct | 42.16 | -2.1/2.1 | 455 |
| saiga_llama3_8b | 39.88 | -2.3/2.5 | 616 |
| suzume-llama-3-8B-multilingual | 38.25 | -1.7/1.7 | 625 |
| Qwen2-7b-Instruct | 33.42 | -1.9/2.2 | 365 |
| Mistral-7B-Instruct-v0.3 | 28.11 | -2/2.2 | 570 |
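
The 95% CI column in the Arena results above appears to list lower and upper offsets relative to the score, which is how Arena Hard Auto style leaderboards usually report intervals. Assuming that reading, a small helper (hypothetical name `ci_bounds`) converts a row into absolute bounds:

```python
def ci_bounds(score: float, ci: str) -> tuple[float, float]:
    """Turn a score and a 'lower/upper' offset string into absolute bounds."""
    lower_offset, upper_offset = (float(part) for part in ci.split("/"))
    return round(score + lower_offset, 2), round(score + upper_offset, 2)


print(ci_bounds(57.26, "-2.9/2"))    # T-lite-instruct-0.1 -> (54.36, 59.26)
print(ci_bounds(47.17, "-2.6/2.4"))  # suzume-...-orpo-borda-half -> (44.57, 49.57)
```

Under this interpretation, the interval for T-lite-instruct-0.1 lies entirely above the gpt3.5-turbo-0125 baseline score of 50 and does not overlap the interval of the next-best open model.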


👨‍💻 EXAMPLES OF USAGE

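from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Fix the random seed so that any sampled generations are reproducible.
torch.manual_seed(42)

model_name = "t-bank-ai/T-lite-instruct-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" places the weights on the available device(s); it requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    # Prompt (Russian): "Write a recipe for a great pizza!"
    {"role": "user", "content": "Напиши рецепт классной пиццы!"},
]

# Render the conversation with the model's chat template and append the assistant turn header.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop at either the regular EOS token or the end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
)

# Note: outputs[0] holds the prompt tokens followed by the newly generated tokens.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))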


Output:

Конечно, вот рецепт для вкусной домашней пиццы, который можно адаптировать под разные вкусы и предпочтения. Важно, чтобы тесто было мягким и воздушным, а начинка — сочной и ароматной.

### Ингредиенты для теста:
- 500 г муки (лучше использовать смесь пшеничной и цельнозерновой)
- 1 ч. л. сухих дрожжей (или 7 г свежих)
- 1 ч. л. сахара
- 1 ч. л. соли
- 1 ст. л. оливкового масла
- 300 мл тёплой воды
- 1 яйцо (для смазки)

### Ингредиенты для начинки (примерный набор):
- 200 г томатного соуса (можно сделать самому из свежих помидоров или использовать готовый)
- 200 г моцареллы, нарезанной ломтиками
- 100 г сыра пармезан (тертый)
- 100 г ветчины или колбасы
- 100 г грибов (шампин
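
For decoder-only models, `generate` returns the prompt tokens followed by the new tokens, so decoding `outputs[0]` directly also reproduces the prompt. A minimal sketch of keeping only the completion is shown below; the helper name `new_tokens_only` is hypothetical, and stand-in token IDs are used so the snippet runs without downloading the model. The sample output above ends mid-word, which is consistent with generation being cut off by the `max_new_tokens=256` limit.

```python
def new_tokens_only(output_ids, prompt_length):
    """Drop the first `prompt_length` tokens, keeping only the generated continuation."""
    return output_ids[prompt_length:]


# With the usage example above, this would be applied roughly as:
#   completion = new_tokens_only(outputs[0], input_ids.shape[-1])
#   print(tokenizer.decode(completion, skip_special_tokens=True))

# Stand-in token IDs, for illustration only.
prompt_ids = [101, 102, 103]
generated = prompt_ids + [7, 8, 9]
print(new_tokens_only(generated, len(prompt_ids)))  # [7, 8, 9]
```

The same slicing works on a 1-D tensor, since tensors support the same indexing as lists here.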


Model size: 8.03B params (Safetensors)
Tensor type: BF16