
GLUE



The General Language Understanding Evaluation (GLUE) benchmark is a collection
of resources for training, evaluating, and analyzing natural language
understanding systems. GLUE consists of:
 * A benchmark of nine sentence- or sentence-pair language understanding tasks
   built on established existing datasets and selected to cover a diverse range
   of dataset sizes, text genres, and degrees of difficulty (see the
   sketch below this list),
 * A diagnostic dataset designed to evaluate and analyze model performance with
   respect to a wide range of linguistic phenomena found in natural language,
   and
 * A public leaderboard for tracking performance on the benchmark and a
   dashboard for visualizing the performance of models on the diagnostic set.
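
For reference, the nine task configurations and the diagnostic set can be
enumerated programmatically. The sketch below assumes the third-party
Hugging Face datasets library, which redistributes GLUE; the library and
its get_dataset_config_names helper are assumptions of this example, not
anything this page prescribes.

    # A minimal sketch, assuming the Hugging Face `datasets` library
    # (pip install datasets); GLUE itself does not mandate any loader.
    from datasets import get_dataset_config_names

    # Prints the GLUE configurations: the nine tasks (CoLA, SST-2, MRPC,
    # QQP, STS-B, MNLI, QNLI, RTE, WNLI) along with extra entries such
    # as the "ax" diagnostic set and the matched/mismatched MNLI splits.
    for name in get_dataset_config_names("glue"):
        print(name)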

The format of the GLUE benchmark is model-agnostic, so any system capable of
processing sentences and sentence pairs and producing corresponding predictions
is eligible to participate. The benchmark tasks are selected so as to favor
models that share information across tasks using parameter sharing or other
transfer learning techniques. The ultimate goal of GLUE is to drive research in
the development of general and robust natural language understanding systems.
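
To make the model-agnostic format concrete, the sketch below maps each test
example of one task to a prediction and writes the results out. It assumes
the third-party Hugging Face datasets library and a tab-separated file with
"index" and "prediction" columns; both are assumptions of this example, so
consult the benchmark's own submission instructions for the authoritative
format.

    # A minimal sketch, assuming the Hugging Face `datasets` library and
    # a TSV "index"/"prediction" submission layout (both assumptions).
    import csv

    from datasets import load_dataset

    # MRPC is one of the nine tasks: a sentence-pair paraphrase task.
    mrpc = load_dataset("glue", "mrpc")

    def predict(example):
        # Placeholder predictor that always outputs label 1 ("equivalent").
        # Any system that maps sentences or sentence pairs to predictions
        # can be substituted here; that is what "model-agnostic" means.
        return 1

    # Emit one prediction per hidden-label test example.
    with open("MRPC.tsv", "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["index", "prediction"])
        for example in mrpc["test"]:
            writer.writerow([example["idx"], predict(example)])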

