OpenAI Chat Completion

The most common use case for the OpenAI API in Python is generating chat completions. This involves sending a list of messages to a specified model and receiving a generated response. The API supports both synchronous (blocking) and asynchronous (non-blocking) calls, and responses can be received as a single object or streamed as chunks.
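
Every request is built from a list of message dictionaries, each with a role ("system", "user", or "assistant") and content. A minimal sketch of how a multi-turn conversation is encoded (the prompts here are illustrative):

# Roles: "system" sets behavior, "user" carries input,
# "assistant" holds the model's earlier replies.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And of Germany?"},
]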

Synchronous Chat Completion (Non-Streaming)

This example demonstrates how to send a single prompt to the OpenAI API and receive the complete response in one go.

from openai import OpenAI

# Initialize the OpenAI client.
# The API key is automatically loaded from the OPENAI_API_KEY environment variable.
# Alternatively, you can pass it directly: client = OpenAI(api_key="your_api_key_here")
client = OpenAI()

try:
    # Create a chat completion
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "What is the capital of France?",
            }
        ],
        model="gpt-4o",  # Specify the model to use
    )

    # Access the content of the generated message
    response_content = chat_completion.choices[0].message.content
    print(f"Non-streaming response:\n{response_content}")

except Exception as e:
    print(f"An error occurred: {e}")

Synchronous Chat Completion (Streaming)

This example shows how to receive the model's response incrementally as it is generated, which is useful for applications that display output in real time.

from openai import OpenAI

client = OpenAI()

print("Streaming response:")
try:
    # Create a chat completion with streaming enabled
    stream = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Write a short, uplifting poem about the beauty of nature.",
            }
        ],
        model="gpt-4o",  # Specify the model to use
        stream=True,  # Enable streaming
    )

    # Iterate over the stream and print each chunk as it arrives.
    # Each chunk carries a delta object with partial content; fall back to ""
    # because delta.content can be None (for example, in the final chunk).
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")
    print("\n")  # Add a newline at the end for clean output

except Exception as e:
    print(f"An error occurred: {e}")

Asynchronous Chat Completion (Non-Streaming)

For asynchronous applications, the AsyncOpenAI client can be used with await.

import asyncio
from openai import AsyncOpenAI

async def async_non_streaming_completion():
    # Initialize the asynchronous OpenAI client
    client = AsyncOpenAI()

    try:
        # Create an asynchronous chat completion
        chat_completion = await client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": "Summarize the main idea of quantum physics in one sentence.",
                }
            ],
            model="gpt-4o",
        )

        response_content = chat_completion.choices[0].message.content
        print(f"Async non-streaming response:\n{response_content}")

    except Exception as e:
        print(f"An error occurred: {e}")

# Run the asynchronous function
asyncio.run(async_non_streaming_completion())
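
The real payoff of the async client is concurrency: several completions can run in parallel under asyncio.gather instead of one after another. A sketch (the prompts are illustrative):

import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def ask(prompt: str) -> str:
    completion = await client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="gpt-4o",
    )
    return completion.choices[0].message.content

async def main():
    # All three requests are in flight at the same time
    answers = await asyncio.gather(
        ask("Capital of France?"),
        ask("Capital of Japan?"),
        ask("Capital of Brazil?"),
    )
    for answer in answers:
        print(answer)

asyncio.run(main())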

Asynchronous Chat Completion (Streaming)

Streaming responses are also available asynchronously, allowing for efficient real-time updates in async applications.

import asyncio
from openai import AsyncOpenAI

async def async_streaming_completion():
    client = AsyncOpenAI()

    print("Async streaming response:")
    try:
        # Create an asynchronous streaming chat completion
        stream = await client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": "Tell me a brief story about a brave knight and a dragon.",
                }
            ],
            model="gpt-4o",
            stream=True,
        )

        # Iterate over the stream asynchronously, printing chunks as they arrive
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="")
        print("\n")

    except Exception as e:
        print(f"An error occurred: {e}")

# Run the asynchronous function
asyncio.run(async_streaming_completion())
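
Token usage can also be recovered from a stream: passing stream_options={"include_usage": True} makes the API send one final chunk whose choices list is empty and whose usage field holds the totals, so the loop should guard against empty choices. A sketch under that assumption:

import asyncio
from openai import AsyncOpenAI

async def stream_with_usage():
    client = AsyncOpenAI()
    stream = await client.chat.completions.create(
        messages=[{"role": "user", "content": "Name three primary colors."}],
        model="gpt-4o",
        stream=True,
        stream_options={"include_usage": True},
    )
    async for chunk in stream:
        if chunk.choices:
            # Normal content chunk
            print(chunk.choices[0].delta.content or "", end="")
        elif chunk.usage:
            # Final chunk: no choices, only the usage totals
            print(f"\n\nTokens used: {chunk.usage.total_tokens}")

asyncio.run(stream_with_usage())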