Conversational AI

This document outlines how to create and manage conversational sessions with the google.genai library in Python. It covers initializing chat sessions, sending messages, overriding parameters for individual requests, and, most importantly, how to persistently update parameters such as temperature and max_output_tokens while preserving the system_instruction and the accumulated history.

Prerequisites

To run the examples, ensure you have the google-genai package (which provides the google.genai module) installed. You will also need an API key for the Gemini API or appropriate Vertex AI credentials. For the Gemini API, you can set the GOOGLE_API_KEY environment variable or pass the key directly to the genai.Client constructor.

pip install google-genai
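
To confirm the installation, you can print the SDK's version string (assuming your installed release exposes __version__, as recent ones do):

python -c "from google import genai; print(genai.__version__)"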

1. Initialize the Client

First, initialize the Client object. This is the entry point for interacting with the Google Generative AI services.

from google import genai
from google.genai import types

# Configure your API key or Vertex AI credentials
# Option 1: Set GOOGLE_API_KEY environment variable
# export GOOGLE_API_KEY="YOUR_API_KEY"
# client = genai.Client()

# Option 2: Pass API key directly (for Gemini API)
client = genai.Client(api_key="YOUR_API_KEY")

# Option 3: For Vertex AI (requires project and location)
# client = genai.Client(vertexai=True, project="your-gcp-project-id", location="us-central1")

2. Create a Conversation with Initial Parameters

You can start a conversation (chat session) with an existing history and specify initial generation parameters such as temperature, max_output_tokens, top_p, top_k, and a system_instruction.

The client.chats.create() method is used to initiate a Chat object.

  • model: Specifies the generative model to use (e.g., "gemini-2.0-flash").
  • history: An optional list of types.Content objects representing previous turns in the conversation. Use types.UserContent() and types.ModelContent() for cleaner history definition.
  • config: An optional types.GenerateContentConfig object to set initial generation parameters for the chat session. This config will serve as the default for all subsequent messages in this session.
# Define an optional initial conversation history
initial_history = [
    types.UserContent("Hello!"),
    types.ModelContent("Hi there! How can I assist you today?"),
    types.UserContent("I'm interested in learning about large language models."),
]

# Define initial generation configuration parameters
initial_generation_config = types.GenerateContentConfig(
    temperature=0.7,        # Controls randomness: 0.0 (less random) to 1.0 (more random)
    max_output_tokens=150,  # Maximum number of tokens to generate in the response
    top_p=0.9,              # Tokens are sampled until their probabilities sum to this value
    top_k=40,               # At each step, consider the top_k most probable tokens
    system_instruction=types.Content(  # Use types.Content for system_instruction
        parts=[types.Part(text="You are an expert in AI and machine learning. Provide detailed and informative answers.")]
    )
)

# Create the chat session
chat = client.chats.create(
    model="gemini-2.0-flash",  # Use an appropriate model available to your account
    history=initial_history,
    config=initial_generation_config
)

print("--- Initial Chat Session Configuration ---")
print(f"Model: {chat._model}")
# Note: Accessing `_config` directly (e.g., `chat._config`) is for demonstration of internal state.
# The `send_message` method internally uses this config as default.
print(f"Default Temperature: {chat._config.temperature}")
print(f"Default Max Output Tokens: {chat._config.max_output_tokens}")
print(f"Default System Instruction: {chat._config.system_instruction.parts[0].text if chat._config.system_instruction else 'None'}")

print("\n--- Initial Chat History (curated) ---")
for i, content in enumerate(chat.get_history(curated=True)):
    role = content.role
    # content.parts may be missing, and a part's text may be None for non-text parts
    text_content = content.parts[0].text if content.parts and content.parts[0].text else "[Non-text content]"
    print(f"Turn {i+1} ({role}): {text_content}")

3. Sending Messages and Per-Call Parameter Overrides

After a conversation is initiated, you can send new messages. The send_message() method (or send_message_stream() for streaming responses) allows you to override generation parameters for that specific call only.

You can modify parameters like temperature and max_output_tokens for individual send_message calls by passing a new types.GenerateContentConfig object to the config argument. Any parameter set in this per-call config will take precedence over the chat session's default for that particular message, but will not change the session's default for subsequent messages.

import time

# Send a message, overriding temperature and max_output_tokens for this turn
print("\n--- Sending Message 1 (overriding config for this call) ---")
message_1_config = types.GenerateContentConfig(
    temperature=0.2,      # Lower temperature for less randomness on this call
    max_output_tokens=80  # Generate a shorter response for this call
)
response1 = chat.send_message("What are some applications of LLMs?", config=message_1_config)
print(f"User: What are some applications of LLMs?")
print(f"Model: {response1.text}")
print(f"Config for Message 1 (per-call override): Temperature={message_1_config.temperature}, Max Output Tokens={message_1_config.max_output_tokens}")

# The chat's internal default config remains unchanged by the above per-call override:
print(f"Chat's default temperature is still: {chat._config.temperature}")  # 0.7, not 0.2

# Send another message, overriding with different parameters
print("\n--- Sending Message 2 (overriding config again for this call) ---")
message_2_config = types.GenerateContentConfig(
    temperature=0.9,       # Higher temperature for a more creative response on this call
    max_output_tokens=120  # Allow a longer response on this call
)
response2 = chat.send_message("Give me a creative example.", config=message_2_config)
print(f"User: Give me a creative example.")
print(f"Model: {response2.text}")
print(f"Config for Message 2 (per-call override): Temperature={message_2_config.temperature}, Max Output Tokens={message_2_config.max_output_tokens}")

# Send a message without overriding config, which will use the initial_generation_config
print("\n--- Sending Message 3 (using initial default config) ---")
response3 = chat.send_message("Summarize our conversation so far.")
print(f"User: Summarize our conversation so far.")
print(f"Model: {response3.text}")
# This call uses the temperature and max_output_tokens from `initial_generation_config`
print(f"Config for Message 3 (default from chat session): Temperature={initial_generation_config.temperature}, Max Output Tokens={initial_generation_config.max_output_tokens}")

# Demonstrate streaming response
print("\n--- Sending Message 4 (streaming, overriding config for this call) ---")
message_4_config = types.GenerateContentConfig(
    temperature=0.5,
    max_output_tokens=60
)
print(f"User: Tell me a very short story about a brave knight.")
print("Model (streaming): ", end="")
for chunk in chat.send_message_stream("Tell me a very short story about a brave knight.", config=message_4_config):
    print(chunk.text or "", end="", flush=True)  # chunk.text can be None for non-text chunks
    time.sleep(0.05)  # Slow printing slightly so the streaming is visible
print("\n")
print(f"Config for Message 4 (per-call override): Temperature={message_4_config.temperature}, Max Output Tokens={message_4_config.max_output_tokens}")

4. Updating Persistent Chat Parameters (Preserving State)

The core configuration of a Chat object, including its system_instruction and default generation parameters, is set during its creation and is not directly modifiable through public methods. To make changes to temperature and max_output_tokens that persist for all future messages in a conversation (while retaining the system_instruction and accumulated chat history), you must recreate the Chat instance.

This process involves:

  1. Retrieving Current State: Get the existing model name, the current GenerateContentConfig (which includes system_instruction), and the full conversation history from the active Chat instance.
  2. Gathering New Parameters: Obtain the new temperature and max_output_tokens values.
  3. Merging Configurations: Combine the existing GenerateContentConfig values with the new parameter values into a single, updated configuration.
  4. Recreating the Chat Instance: Create a new Chat object using client.chats.create(), passing the original model, the newly merged configuration, and the preserved chat history.
  5. Replacing the Old Instance: Update your program's reference to the Chat object to point to the new instance.

This ensures a seamless transition to the new parameters without losing any conversational context or other predefined settings.

# Function to get new parameters from the user (can be integrated into your application's UI)
def get_user_param_overrides():
    """Prompts the user for temperature and max_output_tokens; empty input keeps the current value."""
    print("\n--- Enter New Model Parameters (Leave empty to keep current) ---")
    temp_input = input("New Temperature (float, e.g., 0.7): ")
    max_tokens_input = input("New Max Output Tokens (int, e.g., 200): ")

    new_temperature = None
    if temp_input:
        try:
            new_temperature = float(temp_input)
            if not (0.0 <= new_temperature <= 1.0):
                print("Note: Temperature usually ranges from 0.0 to 1.0 for typical use cases.")
        except ValueError:
            print("Invalid input for temperature. Keeping current value.")

    new_max_output_tokens = None
    if max_tokens_input:
        try:
            new_max_output_tokens = int(max_tokens_input)
            if new_max_output_tokens <= 0:
                print("Note: Max output tokens must be a positive integer. Keeping current value.")
                new_max_output_tokens = None
        except ValueError:
            print("Invalid input for max_output_tokens. Keeping current value.")

    return new_temperature, new_max_output_tokens

def recreate_chat_with_updates(current_chat: genai.chats.Chat) -> genai.chats.Chat:
    """
    Retrieves the current chat state, merges in new parameters, and recreates the chat.
    Returns the new Chat instance.
    """
    print("\n--- Updating Chat Parameters Permanently ---")

    # 1. Retrieve the current model name, configuration, and history
    current_model_name = current_chat._model
    # Accessing _config directly is a common pattern for inspecting internal state
    # when no public getter is provided for the full config object.
    current_gen_config: types.GenerateContentConfig = current_chat._config
    current_history = current_chat.get_history(curated=False)  # Retrieve the comprehensive history

    # 2. Get new parameter values from the user or another source
    new_temperature, new_max_output_tokens = get_user_param_overrides()

    # 3. Merge configurations
    # Start from a dictionary representation of the current config to carry over all of its fields
    updated_config_dict = current_gen_config.model_dump()

    # Apply new parameters only if they were provided (not None)
    if new_temperature is not None:
        updated_config_dict['temperature'] = new_temperature
    if new_max_output_tokens is not None:
        updated_config_dict['max_output_tokens'] = new_max_output_tokens

    # Create a new GenerateContentConfig object from the merged dictionary
    final_gen_config = types.GenerateContentConfig(**updated_config_dict)

    print("\nRecreating chat session with updated configuration:")
    print(f"  Model: {current_model_name}")
    print(f"  New Temperature: {final_gen_config.temperature}")
    print(f"  New Max Output Tokens: {final_gen_config.max_output_tokens}")
    if final_gen_config.system_instruction:
        print(f"  System Instruction: '{final_gen_config.system_instruction.parts[0].text}'")
    print(f"  History length preserved: {len(current_history)} turns")

    # 4. Recreate the chat session with the merged configuration and preserved history
    updated_chat = client.chats.create(
        model=current_model_name,
        config=final_gen_config,
        history=current_history  # Pass the entire preserved conversation history
    )
    return updated_chat
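
Because GenerateContentConfig is a pydantic model, an alternative to the model_dump() round-trip above is pydantic's copy-with-update idiom. A minimal sketch under that assumption; merge_config is a hypothetical helper, not part of the SDK:

from typing import Optional

def merge_config(base: types.GenerateContentConfig,
                 temperature: Optional[float] = None,
                 max_output_tokens: Optional[int] = None) -> types.GenerateContentConfig:
    """Hypothetical helper: return a copy of `base` with only the provided fields changed."""
    overrides = {}
    if temperature is not None:
        overrides['temperature'] = temperature
    if max_output_tokens is not None:
        overrides['max_output_tokens'] = max_output_tokens
    # model_copy comes from pydantic v2, which current google-genai releases build on;
    # fields not named in `update` (including system_instruction) are carried over unchanged.
    return base.model_copy(update=overrides)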

# --- Main interactive chat loop to demonstrate persistent updates ---
# This part is for demonstration and would integrate with your application's flow.

# Define an initial chat session (as in section 2)
initial_system_instruction = types.Content(parts=[types.Part(text="You are a concise assistant. Provide short answers.")])
initial_history = [
    types.UserContent("What is Python?"),
    types.ModelContent("Python is a popular programming language.")
]
initial_gen_config = types.GenerateContentConfig(
    temperature=0.4,
    max_output_tokens=50,
    system_instruction=initial_system_instruction
)

print("--- Initializing Main Chat Session ---")
current_chat_session = client.chats.create(
    model="gemini-2.0-flash",
    config=initial_gen_config,
    history=initial_history
)
print(f"Current chat config: Temperature={current_chat_session._config.temperature}, MaxTokens={current_chat_session._config.max_output_tokens}")
print(f"Current system instruction: {current_chat_session._config.system_instruction.parts[0].text}")
print(f"Current history length: {len(current_chat_session.get_history())} turns")

print("\n--- Start Chatting ---")
print("Type 'update_params' to change temperature and max_output_tokens persistently.")
print("Type 'quit' or 'exit' to end the chat.")

while True:
    user_input = input("\nYou: ")
    if user_input.lower() in ["quit", "exit"]:
        print("Ending chat session. Goodbye!")
        break

    if user_input.lower() == "update_params":
        # Recreate the chat with updated parameters; the returned
        # new chat instance replaces the old one.
        current_chat_session = recreate_chat_with_updates(current_chat_session)
        print("\nChat session parameters updated. Continue typing messages.")
        # Skip to the next loop iteration without sending 'update_params' to the model
        continue

    try:
        response = current_chat_session.send_message(user_input)
        print(f"Model: {response.text}")
    except Exception as e:
        print(f"An error occurred during message generation: {e}")
        break

print("\n--- Final Chat History (comprehensive) ---")
# get_history() returns every turn, including the original history
# and all messages sent and received since.
for i, content in enumerate(current_chat_session.get_history()):
    role = content.role
    text_content = content.parts[0].text if content.parts and content.parts[0].text else "[Non-text content]"
    print(f"Turn {i+1} ({role}): {text_content}")

print("\n--- Final Chat History (curated) ---")
# The curated history contains only valid turns that contribute to the model's context
for i, content in enumerate(current_chat_session.get_history(curated=True)):
    role = content.role
    text_content = content.parts[0].text if content.parts and content.parts[0].text else "[Non-text content]"
    print(f"Turn {i+1} ({role}): {text_content}")