Ai Tools Research

Datasets List from Dr. Alan Thompson

Dr Alan D. Thompson – LifeArchitect.ai

Datasets: https://docs.google.com/spreadsheets/d/1O5KVQW1Hx5ZAkcg8AIRjbQLQzx2wVaLl0SqUu-ir9Fs/edit#gid=484905095

Models Table: https://lifearchitect.ai/models-table/

Last updated 1 year ago