Text Generation, Image and Vision, Audio and Speech, Structured Output, Function Calling, Using GPT-5, and Migrating to the Responses API.
Text Generation: prompt engineering sample code
from openai import OpenAI
client = OpenAI()
response = client.responses.create(
model="gpt-5",
reasoning={"effort": "low"},
instructions="Talk like a pirate.", # was realized by input array before, there are role of developer, user and assistant with different levels of priority given
input="Write a one-sentence bedtime story about a unicorn."
)
print(response.output_text)
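For comparison, here is a minimal sketch of the same request written with the input message array mentioned in the comment above, assuming the Responses API's developer and user roles:
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "low"},
    input=[
        {"role": "developer", "content": "Talk like a pirate."},
        {"role": "user", "content": "Write a one-sentence bedtime story about a unicorn."},
    ],
)
print(response.output_text)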
Structured Output: Structured Outputs is the evolution of JSON mode. While both ensure valid JSON is produced, only Structured Outputs ensures schema adherence. An example of getting a structured output:
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)
event = completion.choices[0].message.parsed
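The parsed result is a plain Pydantic object, so its fields can be used directly; for example:
print(event.name)          # e.g. "Science Fair"
print(event.date)          # e.g. "Friday"
print(event.participants)  # e.g. ["Alice", "Bob"]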
Below are four examples that showcase how powerful Structured Outputs can be!
First, chain of thought: ask the model to output its answer in a structured, step-by-step way that guides the user through the solution. The second example extracts key fields from research papers.
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()
class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

completion = client.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"}
    ],
    response_format=MathReasoning,
)
math_reasoning = completion.choices[0].message.parsed
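Before moving on to the second example, here is a small sketch of how the parsed chain-of-thought result might be consumed, e.g., to walk the user through each step:
for i, step in enumerate(math_reasoning.steps, start=1):
    print(f"Step {i}: {step.explanation}")
    print(f"  => {step.output}")
print(f"Final answer: {math_reasoning.final_answer}")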
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()
class ResearchPaperExtraction(BaseModel):
    title: str
    authors: list[str]
    abstract: str
    keywords: list[str]

completion = client.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are an expert at structured data extraction. You will be given unstructured text from a research paper and should convert it into the given structure."},
        {"role": "user", "content": "..."}
    ],
    response_format=ResearchPaperExtraction,
)
research_paper = completion.choices[0].message.parsed
# Example output:
{
    "title": "Application of Quantum Algorithms in Interstellar Navigation: A New Frontier",
    "authors": [
        "Dr. Stella Voyager",
        "Dr. Nova Star",
        "Dr. Lyra Hunter"
    ],
    "abstract": "This paper investigates the utilization of quantum algorithms to improve interstellar navigation systems. By leveraging quantum superposition and entanglement, our proposed navigation system can calculate optimal travel paths through space-time anomalies more efficiently than classical methods. Experimental simulations suggest a significant reduction in travel time and fuel consumption for interstellar missions.",
    "keywords": [
        "Quantum algorithms",
        "interstellar navigation",
        "space-time anomalies",
        "quantum superposition",
        "quantum entanglement",
        "space travel"
    ]
}
The third example is UI generation, i.e., turning the model into an expert that helps you generate UIs:
from enum import Enum
from typing import List
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()
class UIType(str, Enum):
div = "div"
button = "button"
header = "header"
section = "section"
field = "field"
form = "form"
class Attribute(BaseModel):
name: str
value: str
class UI(BaseModel):
type: UIType
label: str
children: List["UI"]
attributes: List[Attribute]
UI.model_rebuild() # This is required to enable recursive types
class Response(BaseModel):
ui: UI
completion = client.chat.completions.parse(
model="gpt-4o-2024-08-06",
messages=[
{"role": "system", "content": "You are a UI generator AI. Convert the user input into a UI."},
{"role": "user", "content": "Make a User Profile Form"}
],
response_format=Response,
)
ui = completion.choices[0].message.parsed
print(ui)
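Because the UI model is recursive, the parsed tree can be rendered with a simple recursive walk. The render_ui helper below is hypothetical (not part of the SDK), just to illustrate consuming the nested structure:
def render_ui(node: UI, indent: int = 0) -> None:
    # Hypothetical helper: print the generated UI tree as indented pseudo-HTML.
    attrs = " ".join(f'{a.name}="{a.value}"' for a in node.attributes)
    print(" " * indent + f"<{node.type.value} {attrs}> {node.label}")
    for child in node.children:
        render_ui(child, indent + 2)

render_ui(ui.ui)  # ui is the parsed Response object; ui.ui is the root UI node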
The last example classifies inputs into multiple categories, which is a common way of doing moderation.
from enum import Enum
from typing import Optional
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()
class Category(str, Enum):
violence = "violence"
sexual = "sexual"
self_harm = "self_harm"
class ContentCompliance(BaseModel):
is_violating: bool
category: Optional[Category]
explanation_if_violating: Optional[str]
completion = client.chat.completions.parse(
model="gpt-4o-2024-08-06",
messages=[
{"role": "system", "content": "Determine if the user input violates specific guidelines and explain if they do."},
{"role": "user", "content": "How do I prepare for a job interview?"}
],
response_format=ContentCompliance,
)
compliance = completion.choices[0].message.parsed
# Example output:
{
    "is_violating": false,
    "category": null,
    "explanation_if_violating": null
}
The model will not always be able to fulfill the structured output request; when it refuses, the refusal property appears on the output object. You might present the refusal in your UI, or include conditional logic in the code that consumes the response to handle the case of a refused request.
To prevent your JSON Schema and the corresponding types in your programming language from diverging, we strongly recommend using the SDK's native Pydantic (Python) / Zod (JavaScript) support.
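A minimal sketch of that conditional logic, using the refusal field the parse helper exposes alongside parsed (shown here with the moderation example above):
message = completion.choices[0].message
if message.refusal:
    # The model refused; surface the refusal text instead of the structured result.
    print(message.refusal)
else:
    compliance = message.parsed
    print(compliance.is_violating)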
Finally, here is a great coding example of how to leverage Structured Outputs to build more robust multi-agent systems.
Let's first set up our 4-agent system:
- Triaging agent: Decides which agent(s) to call
- Data pre-processing Agent: Prepares data for analysis – for example by cleaning it up
- Data Analysis Agent: Performs analysis on the data
- Data Visualization Agent: Visualizes the output of the analysis to extract insights
Details on how to make this simple multi-agent data analysis tool work are to be continued!
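As a preview, here is a hypothetical sketch of what the triaging agent's structured output could look like (the names and prompts below are illustrative assumptions, not the original cookbook code):
from enum import Enum
from typing import List
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()

class AgentName(str, Enum):
    preprocessing = "data_preprocessing"
    analysis = "data_analysis"
    visualization = "data_visualization"

class TriageDecision(BaseModel):
    agents_to_call: List[AgentName]  # which downstream agent(s) to route the request to
    reasoning: str                   # why the triaging agent chose them

completion = client.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a triaging agent. Decide which agent(s) should handle the user's request."},
        {"role": "user", "content": "Clean up this sales dataset and plot monthly revenue."}
    ],
    response_format=TriageDecision,
)
decision = completion.choices[0].message.parsed
print(decision.agents_to_call, decision.reasoning)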