gluebenchmark.com
2a06:98c1:3120::3
Public Scan
URL: https://gluebenchmark.com/
Submission: On April 17 via api from CH — Scanned from NL
Form analysis: 0 forms found in the DOM

Text Content
GLUE

The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems. GLUE consists of:

* A benchmark of nine sentence- or sentence-pair language understanding tasks built on established existing datasets, selected to cover a diverse range of dataset sizes, text genres, and degrees of difficulty;
* A diagnostic dataset designed to evaluate and analyze model performance with respect to a wide range of linguistic phenomena found in natural language; and
* A public leaderboard for tracking performance on the benchmark, along with a dashboard for visualizing model performance on the diagnostic set.

The format of the GLUE benchmark is model-agnostic, so any system capable of processing sentences and sentence pairs and producing corresponding predictions is eligible to participate. The benchmark tasks are selected to favor models that share information across tasks through parameter sharing or other transfer-learning techniques. The ultimate goal of GLUE is to drive research in the development of general and robust natural language understanding systems.

Paper | Starter Code | Group | Diagnostics
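Because the format is model-agnostic, any pipeline that maps a sentence or sentence pair to a label prediction can be evaluated. Below is a minimal sketch of what that looks like in practice, assuming the Hugging Face `datasets` library (not mentioned on the page itself, but a common way to access GLUE tasks); `my_model` is a hypothetical stand-in for any prediction function.

```python
# A minimal sketch, assuming the Hugging Face `datasets` package is
# installed (pip install datasets); GLUE tasks are available there
# under the "glue" dataset name.
from datasets import load_dataset

# MRPC is one of the nine GLUE tasks: sentence-pair paraphrase detection.
mrpc = load_dataset("glue", "mrpc")

def my_model(sentence1: str, sentence2: str) -> int:
    """Hypothetical stand-in: any function mapping a sentence pair
    to a label index (0 = not a paraphrase, 1 = paraphrase)."""
    return 1

# Produce one prediction per validation example; an actual leaderboard
# submission would instead cover the held-out test split.
predictions = [
    my_model(ex["sentence1"], ex["sentence2"])
    for ex in mrpc["validation"]
]
print(len(predictions), "predictions")
```

Single-sentence tasks such as SST-2 follow the same pattern with one text field per example, which is what makes a single prediction interface sufficient across the benchmark.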