BALROG

Getting Started

  • Installation
  • Evaluation
    • ⚡ Quickstart
    • Evaluate using local vLLM server
    • Evaluate using API
    • 🖼️ VLM mode
    • ▶️ Resume an evaluation
    • ⚙️ Configuring Eval
  • Agents
    • Pre-built agents
    • 🤖 Creating Custom Agents
      • Simple Planning Agent
  • Contributions

Environments

  • Baby AI
    • BabyAI-Text
    • BabyAI Results
      • LLM results
      • VLM results
    • Observations
  • Crafter
    • Crafter Results
    • LLM results
    • VLM results
    • Observations
  • TextWorld
    • Tasks
      • Treasure Hunter
      • The Cooking Game
      • Coin Collector
    • TextWorld Results
    • Observations
  • Baba Is AI
    • Baba Is AI Language Wrapper
    • Baba Is AI Results
    • LLM results
    • VLM results
    • Observations
  • MiniHack
    • LLM results
    • VLM results
    • Observations
  • NetHack Learning Environment
    • NetHack Language Wrapper
    • New NetHack Progression System
    • NetHack Results
    • LLM results
    • VLM results
    • Observation

API

  • iclbench
    • iclbench package
      • Subpackages
        • iclbench.agents package
        • iclbench.environments package
        • iclbench.prompt_builder package
      • Submodules
      • iclbench.client module
        • ClaudeWrapper
        • GoogleGenerativeAIWrapper
        • LLMClientWrapper
        • LLMResponse
        • OpenAIWrapper
        • ReplicateWrapper
        • create_llm_client()
        • process_image_claude()
        • process_image_openai()
      • iclbench.dataset module
        • InContextDataset
        • choice_excluding()
        • natural_sort_key()
      • iclbench.evaluator module
        • Evaluator
      • iclbench.utils module
        • load_secrets()
        • setup_environment()
        • summarize_env_progressions()
        • wandb_save_artifact()
      • Module contents

© Copyright 2024, BALROG Team.
