Structured Output
Structured output is a technique that constrains an LLM to respond in a predefined format — typically JSON or XML — enabling reliable programmatic parsing of model responses rather than free-form text.
Understanding Structured Output
LLMs naturally generate free-form text, which is powerful for conversation but problematic for applications that need to parse and act on model output. If an application needs to extract a task title, due date, and priority from a model's response, unstructured text forces fragile regex parsing that breaks whenever the model varies its phrasing or format.

Structured output solves this by constraining the model's output to a specific schema. OpenAI, Anthropic, and Google all offer native structured output modes that constrain responses to a provided JSON schema. The model still reasons freely; structured output only constrains how it expresses that reasoning.

Structured output is essential for reliable AI application development. It enables reliable extraction of specific fields from model responses, validation that required fields are present and correctly typed, consistent integration with downstream systems, and easier debugging when something goes wrong.

Pydantic (in Python) and Zod (in TypeScript) are popular schema definition libraries that pair well with structured output APIs, providing type-safe parsing and validation of model responses.
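The task-extraction scenario above can be sketched with Pydantic. The `TaskExtraction` model and the sample response string are illustrative assumptions, not part of any provider's API; the point is that a schema-validated parse either succeeds with typed fields or fails loudly, with no regex involved.

```python
from datetime import date
from typing import Literal

from pydantic import BaseModel, ValidationError


class TaskExtraction(BaseModel):
    """Schema the model's response must conform to (hypothetical)."""
    title: str
    due_date: date
    priority: Literal["low", "medium", "high"]


# A response an LLM might return when constrained to this schema.
raw = '{"title": "File quarterly report", "due_date": "2025-07-01", "priority": "high"}'

task = TaskExtraction.model_validate_json(raw)
print(task.title, task.due_date.isoformat(), task.priority)

# Malformed output raises instead of silently corrupting downstream state.
try:
    TaskExtraction.model_validate_json('{"title": "missing the other fields"}')
except ValidationError:
    print("validation failed")
```

Because validation happens at the application boundary, downstream code can trust `task.due_date` to be a real `date` rather than re-checking strings everywhere.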
How GAIA Uses Structured Output
GAIA uses structured output extensively to reliably extract information from LLM responses. When parsing emails for tasks, extracting calendar event details, or determining action priority, GAIA constrains the model to structured JSON schemas validated by Pydantic. This ensures reliable downstream processing without fragile text parsing.
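As a sketch of how a schema like GAIA's calendar-event extraction might be wired up: a Pydantic model can emit the JSON schema that gets passed to a provider's structured output mode. The `CalendarEvent` model and its field names are hypothetical, not GAIA's actual schema.

```python
from typing import Literal

from pydantic import BaseModel


class CalendarEvent(BaseModel):
    """Illustrative event-extraction schema (field names are assumptions)."""
    summary: str
    start_time: str  # ISO 8601 string; a real schema might use datetime
    duration_minutes: int
    priority: Literal["low", "medium", "high"]


# This JSON schema is what would be sent alongside the request when asking
# a provider to constrain its response to the model above.
schema = CalendarEvent.model_json_schema()
print(sorted(schema["properties"]))
print(sorted(schema["required"]))
```

One schema definition thus serves double duty: it constrains the model at request time and validates the response at parse time, so the two can never drift apart.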
Related Concepts
Function Calling
Function calling is a feature of AI models that allows them to generate structured, machine-readable invocations of predefined functions, enabling AI systems to reliably call external APIs and tools with the correct arguments.
Tool Use
Tool use is the capability of AI agents to invoke external functions, APIs, databases, and services to retrieve information or take actions in the real world beyond generating text.
Prompt Engineering
Prompt engineering is the practice of designing and refining inputs to AI language models to reliably elicit desired outputs, shaping model behavior without modifying the underlying weights.
Large Language Model (LLM)
A Large Language Model (LLM) is a deep learning model trained on massive text datasets that can understand, generate, and reason about human language across a wide range of tasks.
Agent Loop
An agent loop is the iterative execution cycle of an AI agent in which it reasons about the current state, selects and executes an action (often a tool call), observes the result, and repeats until the task is complete or a stopping condition is reached.


