# MMLU

MMLU (Massive Multitask Language Understanding) is a benchmark for language understanding. It is a multi-task benchmark of four-option multiple-choice questions covering 57 subjects, grouped into four broad areas:

· STEM (for example mathematics, physics, chemistry, computer science)

· Humanities (for example history, philosophy, law)

· Social sciences (for example economics, psychology, sociology)

· Other (for example business, health, medicine)

MMLU is designed to evaluate the general language understanding capabilities of models, and it is a more challenging benchmark than earlier ones such as GLUE and SuperGLUE. This is because MMLU questions require broad world knowledge and problem-solving ability across academic and professional subjects, not linguistic skill alone.

MMLU is also designed to be broad, with questions that span many domains and difficulty levels, from elementary to professional. This is important because it allows models to be evaluated on a wider range of subjects than a single-task dataset would. The original benchmark is in English.

A leading reported result on MMLU is GPT-4 (few-shot, k=5), which scores 86.4% according to its technical report; random guessing on four-option questions would score about 25%. This shows that large language models are making progress in the area of general language understanding.
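To make "few-shot, k=5" and the percentage score concrete, here is a minimal sketch of how a k-shot prompt can be assembled and how accuracy is computed, assuming MMLU's four-option multiple-choice format. The `Question` class, the function names, and the exact prompt wording are illustrative assumptions modeled loosely on common evaluation harnesses, not the official evaluation code.

```python
from dataclasses import dataclass

CHOICE_LABELS = ["A", "B", "C", "D"]


@dataclass
class Question:
    """Hypothetical container for one four-option question."""
    text: str
    choices: list[str]  # exactly four answer options
    answer: str         # gold label: "A", "B", "C", or "D"


def format_question(q: Question, include_answer: bool) -> str:
    """Render a question with lettered options and an 'Answer:' line."""
    lines = [q.text]
    lines += [f"{label}. {choice}" for label, choice in zip(CHOICE_LABELS, q.choices)]
    lines.append(f"Answer: {q.answer}" if include_answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(subject: str, examples: list[Question],
                          target: Question, k: int = 5) -> str:
    """Prepend up to k solved examples, then the unanswered target question."""
    header = f"The following are multiple choice questions (with answers) about {subject}."
    blocks = [format_question(ex, include_answer=True) for ex in examples[:k]]
    blocks.append(format_question(target, include_answer=False))
    return header + "\n\n" + "\n\n".join(blocks)


def accuracy(predictions: list[str], questions: list[Question]) -> float:
    """Fraction of predicted letters that match the gold labels."""
    if not questions:
        raise ValueError("no questions to score")
    correct = sum(p.strip().upper() == q.answer for p, q in zip(predictions, questions))
    return correct / len(questions)
```

In this sketch, the model's completion after the final `Answer:` is taken as its predicted letter, and the reported score is `accuracy(...)` expressed as a percentage.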

MMLU is a valuable resource for the NLP community, and it is helping to drive research in the area of language understanding.

You have likely worked with people who brainstorm and throw out different ideas. Often no single person is correct, but the sum of all the ideas tends to be better than any single contribution: the ideas mesh together and create something better. In business this is referred to as a 'mastermind', where two or more people come together around a shared goal and exchange ideas, which can lead to breakthroughs and a better understanding of how to continue. It is the same concept with these AI agents.

