📊 Dataset Generation

Create Comprehensive Test Datasets

Generate diverse, high-quality test cases for evaluation

🎯 Generation Strategies

  1. Template-Based: Fill predefined templates to produce test cases
  2. Synthetic: Generate test cases algorithmically
  3. Capability-Driven: Generate test cases that match the required capabilities
  4. Stratified Sampling: Sample across categories for balanced coverage
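To make the first and last strategies concrete, here is a minimal standalone sketch in plain Python. It does not use the framework's API: the `TEMPLATES` dictionary, the `generate_from_templates` and `stratified_sample` helpers, and the test-case fields (`id`, `difficulty`, `prompt`, `expected`) are all hypothetical names chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical templates for illustration; not part of the framework.
TEMPLATES = {
    "easy": "What is {a} + {b}?",
    "medium": "What is {a} * {b} + {c}?",
}


def generate_from_templates(count, seed=0):
    """Template-based generation: fill each template with random operands."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    cases = []
    for i in range(count):
        difficulty = rng.choice(sorted(TEMPLATES))
        a, b, c = (rng.randint(1, 20) for _ in range(3))
        # str.format ignores keyword arguments the template does not use.
        prompt = TEMPLATES[difficulty].format(a=a, b=b, c=c)
        expected = a + b if difficulty == "easy" else a * b + c
        cases.append(
            {"id": i, "difficulty": difficulty, "prompt": prompt, "expected": expected}
        )
    return cases


def stratified_sample(cases, key, per_stratum, seed=0):
    """Stratified sampling: draw up to per_stratum cases from each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[case[key]].append(case)
    sample = []
    for name in sorted(strata):
        group = strata[name]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample
```

Calling `stratified_sample(generate_from_templates(200), "difficulty", 10)` would return an equal number of cases per difficulty level, as long as each level has at least ten candidates in the pool.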

💻 Quick Start

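from llm_evaluation_framework.test_dataset_generator import TestDatasetGenerator

# Create a generator with its default settings.
generator = TestDatasetGenerator()

# Describe the use case the test cases should target.
use_case = {
    "domain": "mathematics",
    "required_capabilities": ["reasoning"],
    "difficulty": "medium"
}

# Request 100 test cases for this use case.
test_cases = generator.generate_test_cases(use_case, count=100)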

📁 Export Formats

  • JSON
  • CSV
  • JSONL
  • Parquet
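The framework's own export interface is not shown on this page, so the following is only a sketch of what the three text formats look like, using the Python standard library. It assumes each test case is a flat dictionary; the helper names `to_json`, `to_jsonl`, and `to_csv` are hypothetical. Parquet is left out because it typically requires a third-party library.

```python
import csv
import io
import json


def to_json(cases):
    """Serialize all cases as a single JSON array."""
    return json.dumps(cases, indent=2)


def to_jsonl(cases):
    """Serialize one JSON object per line (JSON Lines)."""
    return "".join(json.dumps(case) + "\n" for case in cases)


def to_csv(cases):
    """Serialize cases as CSV, using the union of all keys as columns."""
    fieldnames = sorted({key for case in cases for key in case})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(cases)  # keys missing from a case become empty cells
    return buffer.getvalue()
```

JSON and JSONL preserve value types such as integers, while CSV stores every value as text, so numeric fields need to be converted back after reading a CSV export.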