MORITZLAURER/DEBERTA-V3-LARGE-ZEROSHOT-V2.0

Tags: Zero-Shot Classification, Transformers, ONNX, Safetensors, English, deberta-v2, text-classification, Inference Endpoints
arxiv: 2312.17543
License: MIT


 * Model description: deberta-v3-large-zeroshot-v2.0
   * zeroshot-v2.0 series of models
   * Training data
   * How to use the models
   * Metrics
   * When to use which model
   * Reproduction
   * Limitations and bias
   * License
   * Citation
     * Ideas for cooperation or questions?
     * Flexible usage and "prompting"




MODEL DESCRIPTION: DEBERTA-V3-LARGE-ZEROSHOT-V2.0


ZEROSHOT-V2.0 SERIES OF MODELS

Models in this series are designed for efficient zeroshot classification with
the Hugging Face pipeline. These models can do classification without training
data and run on both GPUs and CPUs. An overview of the latest zeroshot
classifiers is available in my Zeroshot Classifier Collection.

The main update of this zeroshot-v2.0 series of models is that several models
are trained on fully commercially-friendly data for users with strict license
requirements.

These models can do one universal classification task: determine whether a
hypothesis is "true" or "not true" given a text (entailment vs. not_entailment).
This task format is based on the Natural Language Inference task (NLI). The task
is so universal that any classification task can be reformulated into this task
by the Hugging Face pipeline.
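
To make the NLI reformulation concrete, here is a minimal sketch of what happens under the hood: each candidate class is verbalized into a hypothesis and scored against the text as an entailment problem. The hypothesis template and candidate classes are illustrative, and the entailment label index is read from the model config rather than hard-coded.

# Minimal sketch of the NLI reformulation behind zero-shot classification.
# The hypothesis template and candidate classes are illustrative examples.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_classes = ["politics", "economy", "entertainment", "environment"]
hypothesis_template = "This text is about {}"

# Look up the entailment label index in the model config instead of assuming a fixed position.
entail_idx = next(i for i, label in model.config.id2label.items() if label.lower().startswith("entail"))

scores = {}
for cls in candidate_classes:
    hypothesis = hypothesis_template.format(cls)
    inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    # Probability that the hypothesis is entailed by the text.
    scores[cls] = torch.softmax(logits, dim=-1)[entail_idx].item()

print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

The zero-shot pipeline shown in the usage section below wraps essentially this verbalize-and-score loop and, with multi_label=False, additionally normalizes the entailment scores across the candidate classes.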


TRAINING DATA

Models with a "-c" in the name are trained on two types of fully
commercially-friendly data:

 1. Synthetic data generated with Mixtral-8x7B-Instruct-v0.1. I first created a
    list of 500+ diverse text classification tasks for 25 professions in
    conversations with Mistral-large. The data was manually curated. I then used
    this as seed data to generate several hundred thousand texts for these tasks
    with Mixtral-8x7B-Instruct-v0.1. The final dataset used is available in the
    synthetic_zeroshot_mixtral_v0.1 dataset in the subset
    mixtral_written_text_for_tasks_v4 (see the loading sketch after this list).
    Data curation was done in multiple iterations and will be improved in future
    iterations.
 2. Two commercially-friendly NLI datasets: MNLI and FEVER-NLI. These datasets
    were added to increase generalization.
 3. Models without a "-c" in the name were additionally trained on a broader
    mix of data with a wider variety of licenses: ANLI, WANLI, LingNLI, and all
    datasets in this list where used_in_v1.1==True.
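
For reference, the synthetic dataset mentioned in point 1 can be loaded for inspection with the datasets library. A minimal sketch, assuming the dataset lives on the Hub under the author's namespace as MoritzLaurer/synthetic_zeroshot_mixtral_v0.1 (adjust the repository id if it differs):

from datasets import load_dataset

# Assumed Hub repository id for the synthetic dataset described above;
# the subset name comes from the model card, the namespace is an assumption.
dataset = load_dataset(
    "MoritzLaurer/synthetic_zeroshot_mixtral_v0.1",
    name="mixtral_written_text_for_tasks_v4",
)
print(dataset)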


HOW TO USE THE MODELS

#!pip install transformers[sentencepiece]
from transformers import pipeline
text = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")  # change the model identifier here
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)


multi_label=False forces the model to decide on only one class. multi_label=True
enables the model to choose multiple classes.
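
For illustration, the same classifier object from the example above can be reused with multi_label=True. The pipeline returns a dict with "sequence", "labels", and "scores", sorted from the highest to the lowest score.

# Reusing zeroshot_classifier from above: with multi_label=True every class is
# scored independently, so several classes can receive a high score at once.
text = "Angela Merkel is a politician in Germany and leader of the CDU"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template="This text is about {}", multi_label=True)
print(output["labels"])  # classes sorted from highest to lowest score
print(output["scores"])  # one independent score per class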


METRICS

The models were evaluated on 28 different text classification tasks with the
f1_macro metric. The main reference point is facebook/bart-large-mnli which is,
at the time of writing (03.04.24), the most used commercially-friendly 0-shot
classifier.



| dataset | facebook/bart-large-mnli | roberta-base-zeroshot-v2.0-c | roberta-large-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0 (fewshot) | deberta-v3-large-zeroshot-v2.0-c | deberta-v3-large-zeroshot-v2.0 (fewshot) | bge-m3-zeroshot-v2.0-c | bge-m3-zeroshot-v2.0 (fewshot) |
|---|---|---|---|---|---|---|---|---|---|
| all datasets mean | 0.497 | 0.587 | 0.622 | 0.619 | 0.643 (0.834) | 0.676 | 0.673 (0.846) | | 0.59 (0.803) |
| amazonpolarity (2) | 0.937 | 0.924 | 0.951 | 0.937 | 0.943 (0.961) | 0.952 | 0.956 (0.968) | | 0.942 (0.951) |
| imdb (2) | 0.892 | 0.871 | 0.904 | 0.893 | 0.899 (0.936) | 0.923 | 0.918 (0.958) | | 0.873 (0.917) |
| appreviews (2) | 0.934 | 0.913 | 0.937 | 0.938 | 0.945 (0.948) | 0.943 | 0.949 (0.962) | | 0.932 (0.954) |
| yelpreviews (2) | 0.948 | 0.953 | 0.977 | 0.979 | 0.975 (0.989) | 0.988 | 0.985 (0.994) | | 0.973 (0.978) |
| rottentomatoes (2) | 0.83 | 0.802 | 0.841 | 0.84 | 0.86 (0.902) | 0.869 | 0.868 (0.908) | | 0.813 (0.866) |
| emotiondair (6) | 0.455 | 0.482 | 0.486 | 0.459 | 0.495 (0.748) | 0.499 | 0.484 (0.688) | | 0.453 (0.697) |
| emocontext (4) | 0.497 | 0.555 | 0.63 | 0.59 | 0.592 (0.799) | 0.699 | 0.676 (0.81) | | 0.61 (0.798) |
| empathetic (32) | 0.371 | 0.374 | 0.404 | 0.378 | 0.405 (0.53) | 0.447 | 0.478 (0.555) | | 0.387 (0.455) |
| financialphrasebank (3) | 0.465 | 0.562 | 0.455 | 0.714 | 0.669 (0.906) | 0.691 | 0.582 (0.913) | | 0.504 (0.895) |
| banking77 (72) | 0.312 | 0.124 | 0.29 | 0.421 | 0.446 (0.751) | 0.513 | 0.567 (0.766) | | 0.387 (0.715) |
| massive (59) | 0.43 | 0.428 | 0.543 | 0.512 | 0.52 (0.755) | 0.526 | 0.518 (0.789) | | 0.414 (0.692) |
| wikitoxic_toxicaggreg (2) | 0.547 | 0.751 | 0.766 | 0.751 | 0.769 (0.904) | 0.741 | 0.787 (0.911) | | 0.736 (0.9) |
| wikitoxic_obscene (2) | 0.713 | 0.817 | 0.854 | 0.853 | 0.869 (0.922) | 0.883 | 0.893 (0.933) | | 0.783 (0.914) |
| wikitoxic_threat (2) | 0.295 | 0.71 | 0.817 | 0.813 | 0.87 (0.946) | 0.827 | 0.879 (0.952) | | 0.68 (0.947) |
| wikitoxic_insult (2) | 0.372 | 0.724 | 0.798 | 0.759 | 0.811 (0.912) | 0.77 | 0.779 (0.924) | | 0.783 (0.915) |
| wikitoxic_identityhate (2) | 0.473 | 0.774 | 0.798 | 0.774 | 0.765 (0.938) | 0.797 | 0.806 (0.948) | | 0.761 (0.931) |
| hateoffensive (3) | 0.161 | 0.352 | 0.29 | 0.315 | 0.371 (0.862) | 0.47 | 0.461 (0.847) | | 0.291 (0.823) |
| hatexplain (3) | 0.239 | 0.396 | 0.314 | 0.376 | 0.369 (0.765) | 0.378 | 0.389 (0.764) | | 0.29 (0.729) |
| biasframes_offensive (2) | 0.336 | 0.571 | 0.583 | 0.544 | 0.601 (0.867) | 0.644 | 0.656 (0.883) | | 0.541 (0.855) |
| biasframes_sex (2) | 0.263 | 0.617 | 0.835 | 0.741 | 0.809 (0.922) | 0.846 | 0.815 (0.946) | | 0.748 (0.905) |
| biasframes_intent (2) | 0.616 | 0.531 | 0.635 | 0.554 | 0.61 (0.881) | 0.696 | 0.687 (0.891) | | 0.467 (0.868) |
| agnews (4) | 0.703 | 0.758 | 0.745 | 0.68 | 0.742 (0.898) | 0.819 | 0.771 (0.898) | | 0.687 (0.892) |
| yahootopics (10) | 0.299 | 0.543 | 0.62 | 0.578 | 0.564 (0.722) | 0.621 | 0.613 (0.738) | | 0.587 (0.711) |
| trueteacher (2) | 0.491 | 0.469 | 0.402 | 0.431 | 0.479 (0.82) | 0.459 | 0.538 (0.846) | | 0.471 (0.518) |
| spam (2) | 0.505 | 0.528 | 0.504 | 0.507 | 0.464 (0.973) | 0.74 | 0.597 (0.983) | | 0.441 (0.978) |
| wellformedquery (2) | 0.407 | 0.333 | 0.333 | 0.335 | 0.491 (0.769) | 0.334 | 0.429 (0.815) | | 0.361 (0.718) |
| manifesto (56) | 0.084 | 0.102 | 0.182 | 0.17 | 0.187 (0.376) | 0.258 | 0.256 (0.408) | | 0.147 (0.331) |
| capsotu (21) | 0.34 | 0.479 | 0.523 | 0.502 | 0.477 (0.664) | 0.603 | 0.502 (0.686) | | 0.472 (0.644) |

These numbers indicate zeroshot performance, as no data from these datasets was
added to the training mix. Note that models without a "-c" in the title were
evaluated twice: once without any data from these 28 datasets to test pure
zeroshot performance (the first number in the respective column) and once
including up to 500 training data points per class from each of the 28
datasets (the second number, in brackets, "fewshot"). No model was trained on
test data.
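
As a rough illustration of the metric (not the benchmarking code itself, which is linked under Reproduction below), a zeroshot run over one labeled dataset could be scored with scikit-learn's macro F1 like this; the texts, gold labels, and class names are placeholders:

# Sketch of scoring zeroshot predictions with f1_macro; texts, gold labels, and
# class names are placeholders, not one of the 28 benchmark datasets.
from sklearn.metrics import f1_score
from transformers import pipeline

zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")

texts = ["I loved this movie", "Worst purchase I ever made"]
gold_labels = ["positive", "negative"]
class_names = ["positive", "negative"]

predictions = [
    zeroshot_classifier(text, class_names, hypothesis_template="This text is about {}")["labels"][0]
    for text in texts
]
print("f1_macro:", f1_score(gold_labels, predictions, average="macro"))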

Details on the different datasets are available here:
https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/v1_human_data/datasets_overview.csv


WHEN TO USE WHICH MODEL

 * deberta-v3-zeroshot vs. roberta-zeroshot: deberta-v3 performs clearly better
   than roberta, but it is a bit slower. roberta is directly compatible with
   Hugging Face's production inference TEI containers and flash attention. These
   containers are a good choice for production use-cases. tl;dr: For accuracy,
   use a deberta-v3 model. If production inference speed is a concern, you can
   consider a roberta model (e.g. in a TEI container and HF Inference
   Endpoints).
 * commercial use-cases: models with "-c" in the title are guaranteed to be
   trained on only commercially-friendly data. Models without a "-c" were
   trained on more data and perform better, but include data with non-commercial
   licenses. Legal opinions diverge on whether this training data affects the
   license of the trained model. For users with strict legal requirements, the
   models with "-c" in the title are recommended.
 * Multilingual/non-English use-cases: use bge-m3-zeroshot-v2.0 or
   bge-m3-zeroshot-v2.0-c. Note that multilingual models perform worse than
   English-only models. You can therefore also first machine translate your
   texts to English with libraries like EasyNMT and then apply any English-only
   model to the translated data (see the sketch after this list). Machine
   translation also facilitates validation in case your team does not speak all
   languages in the data.
 * context window: The bge-m3 models can process up to 8192 tokens. The other
   models can process up to 512 tokens. Note that longer text inputs both make
   the model slower and decrease performance, so if you're only working with
   texts of up to ~400 words / 1 page, use e.g. a deberta model for better
   performance.
 * The latest updates on new models are always available in the Zeroshot
   Classifier Collection.
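
A minimal sketch of the translate-then-classify workflow mentioned in the multilingual bullet above, assuming EasyNMT with its opus-mt models; the German example text and the classes are illustrative:

#!pip install easynmt
# Machine translate non-English texts to English, then apply an English-only zeroshot model.
from easynmt import EasyNMT
from transformers import pipeline

translator = EasyNMT("opus-mt")  # assumed EasyNMT model choice; other EasyNMT models work too
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")

texts_de = ["Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"]
texts_en = translator.translate(texts_de, target_lang="en")

classes_verbalized = ["politics", "economy", "entertainment", "environment"]
for text in texts_en:
    print(zeroshot_classifier(text, classes_verbalized, hypothesis_template="This text is about {}"))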


REPRODUCTION

Reproduction code is available in the v2_synthetic_data directory here:
https://github.com/MoritzLaurer/zeroshot-classifier/tree/main


LIMITATIONS AND BIAS

The model can only do text classification tasks.

Biases can come from the underlying foundation model, the human NLI training
data and the synthetic data generated by Mixtral.


LICENSE

The foundation model was published under the MIT license. The licenses of the
training data vary depending on the model, see above.


CITATION

This model is an extension of the research described in this paper.

If you use this model academically, please cite:

@misc{laurer_building_2023,
    title = {Building {Efficient} {Universal} {Classifiers} with {Natural} {Language} {Inference}},
    url = {http://arxiv.org/abs/2312.17543},
    doi = {10.48550/arXiv.2312.17543},
    abstract = {Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allow them to do any text classification task without requiring fine-tuning (zeroshot classification) or to learn new tasks with only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows similar principles as instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier that is trained on 33 datasets with 389 diverse classes. Parts of the code we share has been used to train our older zeroshot classifiers that have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4\%.},
    urldate = {2024-01-05},
    publisher = {arXiv},
    author = {Laurer, Moritz and van Atteveldt, Wouter and Casas, Andreu and Welbers, Kasper},
    month = dec,
    year = {2023},
    note = {arXiv:2312.17543 [cs]},
    keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language},
}



IDEAS FOR COOPERATION OR QUESTIONS?

If you have questions or ideas for cooperation, contact me at
moritz{at}huggingface{dot}co or LinkedIn


FLEXIBLE USAGE AND "PROMPTING"

You can formulate your own hypotheses by changing the hypothesis_template of the
zeroshot pipeline. Similar to "prompt engineering" for LLMs, you can test
different formulations of your hypothesis_template and verbalized classes to
improve performance.

from transformers import pipeline
text = "Angela Merkel is a politician in Germany and leader of the CDU"
# formulation 1
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
# formulation 2 depending on your use-case
hypothesis_template = "The topic of this text is {}"
classes_verbalized = ["political activities", "economic policy", "entertainment or music", "environmental protection"]
# test different formulations
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")  # change the model identifier here
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)
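
Note that in the snippet above the second assignment of hypothesis_template and classes_verbalized overwrites the first, so only formulation 2 is actually run. To compare the two formulations, you can call the pipeline once per formulation and inspect the top prediction and its score, reusing text and zeroshot_classifier from above:

# Quick comparison of the two formulations: run the pipeline once per
# formulation and print the top predicted class with its score.
formulations = [
    ("This text is about {}", ["politics", "economy", "entertainment", "environment"]),
    ("The topic of this text is {}", ["political activities", "economic policy", "entertainment or music", "environmental protection"]),
]
for template, classes in formulations:
    output = zeroshot_classifier(text, classes, hypothesis_template=template, multi_label=False)
    print(template, "->", output["labels"][0], round(output["scores"][0], 3))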


Downloads last month: 61,519
Model size (Safetensors): 435M params
Tensor type: FP16




MODEL TREE FOR MORITZLAURER/DEBERTA-V3-LARGE-ZEROSHOT-V2.0

Base model: microsoft/deberta-v3-large
Quantized: 3 models
Adapters: 2 models
Finetunes: 2 models



SPACES USING MORITZLAURER/DEBERTA-V3-LARGE-ZEROSHOT-V2.0 (4)

 * AISimplyExplained/deberta_api
 * sarangs/students-feedback-analysis
 * sitammeur/ClassyText
 * cnealex/demo



COLLECTION INCLUDING MORITZLAURER/DEBERTA-V3-LARGE-ZEROSHOT-V2.0

ZEROSHOT CLASSIFIERS (Collection, 11 items, updated Apr 3)
These are my current best zeroshot classifiers. Some of my older models are
downloaded more often, but the models in this collection are newer/better.


