Beyond Lex: Hybrid Approaches
Explore advanced architectures that combine Amazon Lex with other technologies to create more sophisticated and powerful conversational experiences.
Learning Objectives
- Understand the limitations of single-technology approaches to conversational AI
- Design hybrid architectures that combine multiple technologies
- Integrate Amazon Lex with large language models (LLMs)
- Implement context management across different components
- Create seamless user experiences with hybrid systems
Limitations of Single-Technology Approaches
While Amazon Lex provides a powerful foundation for conversational interfaces, relying solely on one technology can limit the capabilities and flexibility of your solution. Understanding these limitations is the first step toward designing more sophisticated hybrid approaches.
Common Limitations of Lex-Only Solutions
Amazon Lex excels at structured conversations with clear intents and slots, but may face challenges with:
- Complex, Open-Ended Conversations: Handling multi-turn dialogues that don't follow predictable patterns
- Contextual Understanding: Maintaining context across multiple turns or topics
- Knowledge-Based Responses: Providing detailed information that requires access to large knowledge bases
- Nuanced Language Understanding: Recognizing subtle variations in meaning, sarcasm, or implicit requests
- Personalization: Adapting responses based on user preferences, history, or characteristics
- Creativity: Generating novel, diverse responses to similar inputs
When to Consider Hybrid Approaches
Hybrid approaches are particularly valuable when your conversational interface needs to:
- Handle Both Structured and Unstructured Interactions: Combine the predictability of intent-based systems with the flexibility of generative models
- Access Large Knowledge Bases: Provide detailed, accurate information beyond what can be encoded in intents and responses
- Maintain Complex Context: Track and use information across multiple turns and topics
- Generate Dynamic, Personalized Content: Create responses tailored to specific users or situations
- Balance Control and Flexibility: Maintain guardrails while allowing for more natural conversations
Hybrid Architecture Patterns
Several architectural patterns can be used to combine different conversational AI technologies, each with its own strengths and use cases.
Router Pattern
The Router pattern uses a central component to direct user inputs to the most appropriate technology based on the type of request:
- Input Analysis: Analyze user input to determine its type and complexity
- Routing Decision: Direct the input to the most suitable technology (e.g., Lex for transactional requests, LLM for open-ended questions)
- Response Integration: Combine and format responses from different components
- Context Management: Maintain and share context across components
This pattern is particularly useful when different types of requests clearly benefit from different technologies.
Router Pattern Implementation
// Example Lambda function implementing a router pattern
const AWS = require('aws-sdk');
const lexRuntime = new AWS.LexRuntimeV2();
const bedrock = new AWS.BedrockRuntime(); // invokeModel is exposed by the Bedrock runtime client, not AWS.Bedrock
exports.handler = async (event) => {
console.log('Received event:', JSON.stringify(event, null, 2));
// Extract user input and session data
const userInput = event.userInput;
const sessionId = event.sessionId;
const sessionState = event.sessionState || {};
// Step 1: Analyze the input to determine routing
const routingDecision = await analyzeInput(userInput, sessionState);
// Step 2: Route to appropriate technology based on the decision
let response;
switch (routingDecision.destination) {
case 'lex':
response = await routeToLex(userInput, sessionId, sessionState);
break;
case 'llm':
response = await routeToLLM(userInput, sessionState);
break;
case 'knowledge_base':
response = await routeToKnowledgeBase(userInput, sessionState);
break;
default:
response = {
message: "I'm not sure how to process that request.",
sessionState: sessionState
};
}
// Step 3: Update context with the new information
const updatedSessionState = updateContext(sessionState, userInput, response, routingDecision);
// Step 4: Format and return the final response
return {
message: response.message,
sessionState: updatedSessionState,
source: routingDecision.destination
};
};
// Analyze input to determine where to route it
async function analyzeInput(userInput, sessionState) {
// Simple heuristic-based routing for demonstration
// In a real system, this could use more sophisticated analysis
// Check for transactional or task-oriented patterns
const transactionalPatterns = [
/book/i, /schedule/i, /reserve/i, /order/i, /buy/i, /purchase/i,
/cancel/i, /change/i, /update/i, /status/i, /check/i, /find/i
];
// Check for knowledge-seeking patterns
const knowledgePatterns = [
/what is/i, /how does/i, /explain/i, /tell me about/i, /information on/i,
/details about/i, /when was/i, /where is/i, /who is/i, /why does/i
];
// Check for open-ended, conversational patterns
const conversationalPatterns = [
/think about/i, /opinion on/i, /feel about/i, /imagine/i, /creative/i,
/suggest/i, /recommend/i, /advice/i, /help me with/i, /brainstorm/i
];
// Check if we're in the middle of a Lex conversation
const inLexConversation = sessionState.lexSessionActive &&
sessionState.lexSessionState &&
sessionState.lexSessionState.dialogAction &&
sessionState.lexSessionState.dialogAction.type !== 'Close';
// If we're in an active Lex conversation, continue with Lex
if (inLexConversation) {
return { destination: 'lex', confidence: 0.9, reason: 'active_lex_session' };
}
// Check for transactional patterns
for (const pattern of transactionalPatterns) {
if (pattern.test(userInput)) {
return { destination: 'lex', confidence: 0.8, reason: 'transactional_pattern' };
}
}
// Check for knowledge patterns
for (const pattern of knowledgePatterns) {
if (pattern.test(userInput)) {
return { destination: 'knowledge_base', confidence: 0.7, reason: 'knowledge_pattern' };
}
}
// Check for conversational patterns
for (const pattern of conversationalPatterns) {
if (pattern.test(userInput)) {
return { destination: 'llm', confidence: 0.7, reason: 'conversational_pattern' };
}
}
// Default to LLM for unclassified inputs
return { destination: 'llm', confidence: 0.5, reason: 'default' };
}
// Route the request to Amazon Lex
async function routeToLex(userInput, sessionId, sessionState) {
// Extract Lex-specific session state if it exists
const lexSessionState = sessionState.lexSessionState || {};
try {
const params = {
botId: process.env.LEX_BOT_ID,
botAliasId: process.env.LEX_BOT_ALIAS_ID,
localeId: 'en_US',
sessionId: sessionId,
text: userInput,
sessionState: lexSessionState
};
const lexResponse = await lexRuntime.recognizeText(params).promise();
// Extract the message from Lex response
let message = '';
if (lexResponse.messages && lexResponse.messages.length > 0) {
message = lexResponse.messages.map(m => m.content).join(' ');
}
return {
message: message,
lexSessionState: lexResponse.sessionState,
dialogState: lexResponse.sessionState.dialogAction.type
};
} catch (error) {
console.error('Error calling Lex:', error);
return {
message: "I'm having trouble processing your request right now.",
error: error.message
};
}
}
// Route the request to a Large Language Model
async function routeToLLM(userInput, sessionState) {
// Get conversation history from session state
const conversationHistory = sessionState.conversationHistory || [];
// Prepare the prompt with conversation history
const prompt = preparePromptWithHistory(userInput, conversationHistory);
try {
// Call Amazon Bedrock with Claude model
const params = {
modelId: 'anthropic.claude-v2',
contentType: 'application/json',
accept: 'application/json',
body: JSON.stringify({
prompt: prompt,
max_tokens_to_sample: 500,
temperature: 0.7,
top_p: 0.9,
stop_sequences: ["\n\nHuman:"]
})
};
const bedrockResponse = await bedrock.invokeModel(params).promise();
const responseBody = JSON.parse(bedrockResponse.body.toString());
return {
message: responseBody.completion.trim(),
modelId: 'anthropic.claude-v2'
};
} catch (error) {
console.error('Error calling LLM:', error);
return {
message: "I'm having trouble generating a response right now.",
error: error.message
};
}
}
// Route the request to a knowledge base
async function routeToKnowledgeBase(userInput, sessionState) {
// In a real implementation, this would call Amazon Kendra or another knowledge base
// For this example, we'll simulate a knowledge base response
return {
message: `Here's what I found about "${userInput}": This would be information retrieved from a knowledge base.`,
source: 'simulated_knowledge_base'
};
}
// Prepare a prompt for the LLM that includes conversation history
function preparePromptWithHistory(userInput, conversationHistory) {
// Start with the system prompt
let prompt = "\n\nHuman: You are a helpful, harmless assistant that provides accurate and concise information. You're part of a hybrid system where some requests are handled by other components. Keep your responses friendly and conversational.\n\nAssistant: I understand. I'll provide helpful, accurate, and concise responses in a friendly tone.\n\n";
// Add conversation history
for (const exchange of conversationHistory.slice(-5)) { // Include up to 5 most recent exchanges
prompt += `Human: ${exchange.userInput}\n\nAssistant: ${exchange.assistantResponse}\n\n`;
}
// Add the current user input
prompt += `Human: ${userInput}\n\nAssistant:`;
return prompt;
}
// Update the context with new information
function updateContext(sessionState, userInput, response, routingDecision) {
const updatedState = { ...sessionState };
// Update Lex session state if applicable
if (routingDecision.destination === 'lex' && response.lexSessionState) {
updatedState.lexSessionState = response.lexSessionState;
updatedState.lexSessionActive = response.dialogState !== 'Close';
}
// Update conversation history
const conversationHistory = updatedState.conversationHistory || [];
conversationHistory.push({
timestamp: new Date().toISOString(),
userInput: userInput,
assistantResponse: response.message,
source: routingDecision.destination
});
// Keep only the last 10 exchanges to manage context size
updatedState.conversationHistory = conversationHistory.slice(-10);
// Add routing information for analytics
updatedState.lastRouting = {
destination: routingDecision.destination,
confidence: routingDecision.confidence,
reason: routingDecision.reason,
timestamp: new Date().toISOString()
};
return updatedState;
}
Fallback Pattern
The Fallback pattern uses a primary technology (typically Lex) for most interactions, but falls back to alternative technologies when the primary system cannot handle a request:
- Primary Processing: First attempt to handle the request with the primary system
- Confidence Check: Evaluate whether the primary system's response is satisfactory
- Fallback Decision: If confidence is low or no matching intent is found, route to the fallback system
- Response Selection: Choose the most appropriate response from available options
This pattern is useful when you want to maintain the predictability and control of a structured system while handling edge cases more gracefully.
Fallback Pattern Flow
- User Input: The user sends a message to the conversational interface
- Primary System (Lex): The input is matched against the defined intents and slots
- Use Lex Response: If a confident match is found, process the intent and return the structured response
- Fallback System (LLM): If no intent matches or confidence is low, process the input with a more flexible model
- Use LLM Response: Return the generated response with appropriate guardrails
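A minimal sketch of this flow in Python, assuming the Lex V2 response shape used elsewhere in this section; the 0.6 confidence threshold and the handle_with_llm() placeholder are illustrative:
# Fallback pattern sketch: try Lex first, fall back to an LLM on low confidence.
import os
import boto3

lex = boto3.client('lexv2-runtime')

def handle_with_llm(user_input):
    # Placeholder for a Bedrock call such as the ones shown later in this section
    return "LLM-generated answer for: " + user_input

def handle_message(user_input, session_id):
    # Primary processing: attempt the request with Lex
    lex_response = lex.recognize_text(
        botId=os.environ['LEX_BOT_ID'],
        botAliasId=os.environ['LEX_BOT_ALIAS_ID'],
        localeId='en_US',
        sessionId=session_id,
        text=user_input
    )

    # Confidence check on the top interpretation
    interpretations = lex_response.get('interpretations', [])
    top = interpretations[0] if interpretations else {}
    confidence = top.get('nluConfidence', {}).get('score', 0.0)
    intent = lex_response.get('sessionState', {}).get('intent', {})
    intent_name = intent.get('name', 'FallbackIntent')

    # Fallback decision: hand off when Lex has no confident match
    if intent_name == 'FallbackIntent' or confidence < 0.6:
        return handle_with_llm(user_input)

    # Otherwise use the structured Lex response
    return ' '.join(m['content'] for m in lex_response.get('messages', []))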
Orchestrator Pattern
The Orchestrator pattern uses a central component to coordinate multiple technologies that work together to handle a single request:
- Request Analysis: Break down the request into components that different technologies can handle
- Parallel Processing: Send components to appropriate technologies simultaneously
- Result Aggregation: Combine results from different components
- Response Generation: Create a unified, coherent response
This pattern is valuable for complex requests that benefit from multiple technologies working together, such as combining structured data retrieval with natural language generation.
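A minimal sketch of the orchestration idea, using asyncio to run two illustrative components in parallel and then merge their results; fetch_account_data(), fetch_recent_transactions(), and generate_summary() are placeholders for whatever data sources and generative models you orchestrate:
# Orchestrator pattern sketch: fan out to components in parallel, then aggregate.
import asyncio

async def fetch_account_data(user_id):
    # Structured lookup (e.g., DynamoDB or an internal API)
    return {'balance': '1,250.00', 'currency': 'USD'}

async def fetch_recent_transactions(user_id):
    # Second structured lookup, run concurrently with the first
    return [{'merchant': 'Coffee Shop', 'amount': '4.50'}]

async def generate_summary(user_input, account, transactions):
    # Placeholder for an LLM call that turns structured facts into prose
    return (f"Your balance is {account['currency']} {account['balance']}; "
            f"your last charge was {transactions[0]['amount']} at {transactions[0]['merchant']}.")

async def orchestrate(user_input, user_id):
    # Parallel processing: run independent components simultaneously
    account, transactions = await asyncio.gather(
        fetch_account_data(user_id),
        fetch_recent_transactions(user_id)
    )
    # Result aggregation and response generation
    message = await generate_summary(user_input, account, transactions)
    return {'message': message}

# Example usage:
# asyncio.run(orchestrate("How am I doing this month?", "user123"))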
Augmentation Pattern
The Augmentation pattern uses one technology to enhance the capabilities of another:
- Primary Processing: Handle the core request with the primary technology
- Enhancement Identification: Identify aspects that could be improved
- Augmentation: Use secondary technologies to enhance specific aspects
- Integration: Incorporate enhancements into the final response
For example, using Lex for intent recognition and slot filling, then using an LLM to make the response more natural and conversational.
Augmentation Pattern Implementation
// Example Lambda function implementing an augmentation pattern
const AWS = require('aws-sdk');
const lexRuntime = new AWS.LexRuntimeV2();
const bedrock = new AWS.BedrockRuntime(); // invokeModel is exposed by the Bedrock runtime client, not AWS.Bedrock
exports.handler = async (event) => {
console.log('Received event:', JSON.stringify(event, null, 2));
// Extract user input and session data
const userInput = event.userInput;
const sessionId = event.sessionId;
const sessionState = event.sessionState || {};
try {
// Step 1: Process with Lex to handle intent recognition and slot filling
const lexResponse = await processWithLex(userInput, sessionId, sessionState);
// Step 2: Determine if the response needs enhancement
const needsEnhancement = shouldEnhanceResponse(lexResponse);
if (needsEnhancement) {
// Step 3: Enhance the response using an LLM
const enhancedResponse = await enhanceWithLLM(userInput, lexResponse, sessionState);
// Step 4: Return the enhanced response
return {
message: enhancedResponse.message,
sessionState: {
...sessionState,
lexSessionState: lexResponse.lexSessionState,
lastEnhanced: true
},
enhanced: true
};
} else {
// Return the original Lex response if enhancement isn't needed
return {
message: lexResponse.message,
sessionState: {
...sessionState,
lexSessionState: lexResponse.lexSessionState,
lastEnhanced: false
},
enhanced: false
};
}
} catch (error) {
console.error('Error processing request:', error);
return {
message: "I'm having trouble processing your request right now.",
sessionState: sessionState,
error: error.message
};
}
};
// Process the request with Amazon Lex
async function processWithLex(userInput, sessionId, sessionState) {
// Extract Lex-specific session state if it exists
const lexSessionState = sessionState.lexSessionState || {};
try {
const params = {
botId: process.env.LEX_BOT_ID,
botAliasId: process.env.LEX_BOT_ALIAS_ID,
localeId: 'en_US',
sessionId: sessionId,
text: userInput,
sessionState: lexSessionState
};
const lexResponse = await lexRuntime.recognizeText(params).promise();
// Extract the message from Lex response
let message = '';
if (lexResponse.messages && lexResponse.messages.length > 0) {
message = lexResponse.messages.map(m => m.content).join(' ');
}
// Extract intent and slots for context
const intent = lexResponse.sessionState.intent ? lexResponse.sessionState.intent.name : 'None';
const slots = lexResponse.sessionState.intent ? lexResponse.sessionState.intent.slots : {};
return {
message: message,
lexSessionState: lexResponse.sessionState,
dialogState: lexResponse.sessionState.dialogAction.type,
intent: intent,
slots: slots,
confidence: lexResponse.interpretations && lexResponse.interpretations[0] &&
lexResponse.interpretations[0].nluConfidence ?
lexResponse.interpretations[0].nluConfidence.score : 0
};
} catch (error) {
console.error('Error calling Lex:', error);
throw error;
}
}
// Determine if the response should be enhanced
function shouldEnhanceResponse(lexResponse) {
// Criteria for enhancement:
// 1. Don't enhance if we're in the middle of slot filling
if (lexResponse.dialogState === 'ElicitSlot') {
return false;
}
// 2. Don't enhance if confidence is very low (might be misinterpreting)
if (lexResponse.confidence < 0.3) {
return false;
}
// 3. Enhance fulfilled intents to make responses more natural
if (lexResponse.dialogState === 'Close' &&
lexResponse.lexSessionState.intent &&
lexResponse.lexSessionState.intent.state === 'Fulfilled') {
return true;
}
// 4. Enhance certain intents that benefit from more natural language
const enhanceableIntents = [
'ProvideInformation',
'ExplainFeatures',
'GiveAdvice',
'AnswerFAQ'
];
if (enhanceableIntents.includes(lexResponse.intent)) {
return true;
}
// Default to not enhancing
return false;
}
// Enhance the Lex response using an LLM
async function enhanceWithLLM(userInput, lexResponse, sessionState) {
// Prepare the prompt for enhancement
const prompt = prepareEnhancementPrompt(userInput, lexResponse, sessionState);
try {
// Call Amazon Bedrock with Claude model
const params = {
modelId: 'anthropic.claude-v2',
contentType: 'application/json',
accept: 'application/json',
body: JSON.stringify({
prompt: prompt,
max_tokens_to_sample: 300,
temperature: 0.7,
top_p: 0.9,
stop_sequences: ["\n\nHuman:"]
})
};
const bedrockResponse = await bedrock.invokeModel(params).promise();
const responseBody = JSON.parse(bedrockResponse.body.toString());
return {
message: responseBody.completion.trim(),
originalMessage: lexResponse.message,
modelId: 'anthropic.claude-v2'
};
} catch (error) {
console.error('Error enhancing with LLM:', error);
// Fall back to the original Lex response if enhancement fails
return {
message: lexResponse.message,
error: error.message
};
}
}
// Prepare a prompt for enhancing the response
function prepareEnhancementPrompt(userInput, lexResponse, sessionState) {
// Get user's name from session state if available
const userName = sessionState.userProfile ? sessionState.userProfile.firstName : null;
// Start with the system prompt
let prompt = "\n\nHuman: You are an AI assistant that makes responses more conversational and natural. You'll be given a user's input and a basic response. Your job is to enhance the response to make it more engaging and natural while preserving all the factual information. Keep the enhanced response concise and focused on the user's request.\n\nAssistant: I understand. I'll enhance responses to make them more conversational and natural while preserving all factual information and keeping them concise.\n\n";
// Add context about the user if available
if (userName) {
prompt += `Human: The user's name is ${userName}. Please personalize the response appropriately.\n\nAssistant: I'll make sure to personalize the response for ${userName}.\n\n`;
}
// Add context about the intent and slots
prompt += `Human: The user's intent is "${lexResponse.intent}" and they provided these details: ${JSON.stringify(lexResponse.slots)}.\n\nAssistant: I understand the context of the conversation.\n\n`;
// Add the current exchange
prompt += `Human: The user said: "${userInput}"\n\nThe basic response is: "${lexResponse.message}"\n\nPlease enhance this response to make it more conversational and natural while preserving all the factual information.\n\nAssistant:`;
return prompt;
}
Integrating Lex with Large Language Models
Large Language Models (LLMs) like those available through Amazon Bedrock can significantly enhance the capabilities of Lex-based conversational interfaces.
LLM Capabilities and Limitations
Understanding the strengths and weaknesses of LLMs is essential for effective integration:
- Strengths:
- Generating natural, contextually appropriate responses
- Handling open-ended, creative requests
- Understanding complex or ambiguous language
- Maintaining context across multiple turns
- Adapting tone and style based on the conversation
- Limitations:
- Less predictable than rule-based systems
- Potential for hallucinations or factual errors
- Difficulty with precise, structured data collection
- Higher latency and computational cost
- Challenges with specific domain knowledge without fine-tuning
Integration Approaches
Several approaches can be used to integrate Lex with LLMs:
- LLM for Response Enhancement: Use Lex for intent recognition and slot filling, then use an LLM to make responses more natural
- LLM for Fallback Handling: Use an LLM when Lex cannot match an intent or has low confidence
- LLM for Complex Queries: Route complex or open-ended queries to an LLM while keeping transactional requests with Lex
- LLM for Context Management: Use an LLM to maintain and interpret conversation context across multiple turns
- LLM for Intent Classification: Use an LLM to classify intents before routing to Lex for structured processing
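As a sketch of the last approach, an LLM can be asked to pick a single intent label before the request is handed to Lex or another component; the candidate intent list and prompt wording below are illustrative assumptions:
# LLM-based intent classification sketch: ask the model for one label, then route on the result.
import json
import boto3

bedrock = boto3.client('bedrock-runtime')

CANDIDATE_INTENTS = ['BookFlight', 'CancelReservation', 'CheckStatus', 'OpenEndedQuestion']

def classify_intent(user_input):
    prompt = (
        "\n\nHuman: Classify the following message into exactly one of these labels: "
        f"{', '.join(CANDIDATE_INTENTS)}. Reply with the label only.\n\n"
        f"Message: \"{user_input}\"\n\nAssistant:"
    )
    response = bedrock.invoke_model(
        modelId='anthropic.claude-v2',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'prompt': prompt,
            'max_tokens_to_sample': 20,
            'temperature': 0.0,
            'stop_sequences': ["\n\nHuman:"]
        })
    )
    label = json.loads(response['body'].read())['completion'].strip()
    # Fall back to the open-ended path if the model returns an unexpected label
    return label if label in CANDIDATE_INTENTS else 'OpenEndedQuestion'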
LLM Integration Examples
- Response Enhancement: Lex produces the factual response "The weather in Seattle is 62°F with rain," and an LLM rewrites it into a warmer, more conversational reply without changing the facts.
- Fallback Handling: When Lex can only respond "I'm sorry, I didn't understand your request," the original input is routed to an LLM, which can usually return a useful answer.
- Context Management:
Bot: "It's currently 62°F with rain in Seattle."
User: "How about tomorrow?"
LLM Context Management: [Recognizes "tomorrow" refers to the weather in Seattle]
Bot: "Tomorrow in Seattle will be partly cloudy with a high of 65°F."
Using Amazon Bedrock
Amazon Bedrock provides a managed service for accessing foundation models from leading AI companies. Key considerations when using Bedrock with Lex include:
- Model Selection: Choose the appropriate model based on your requirements for performance, cost, and capabilities
- Prompt Engineering: Design effective prompts that provide necessary context and guidance
- Response Filtering: Implement mechanisms to filter or modify responses for safety and appropriateness (a minimal filtering sketch follows this list)
- Latency Management: Consider the impact of LLM processing time on the overall user experience
- Cost Optimization: Implement strategies to minimize token usage and API calls
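A minimal sketch of the response-filtering consideration above; the blocked patterns and fallback message are illustrative, and a production system would typically layer this under a more complete moderation or guardrail mechanism:
# Response filtering sketch: screen LLM output before returning it to the user.
import re

# Patterns that should never appear in a reply (illustrative examples only)
BLOCKED_PATTERNS = [
    re.compile(r'\b\d{3}-\d{2}-\d{4}\b'),                   # looks like a US SSN
    re.compile(r'\bguaranteed returns\b', re.IGNORECASE),    # financial-advice phrasing
]

def filter_response(llm_message, fallback="I'm sorry, I can't share that information."):
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(llm_message):
            return fallback
    # Trim overly long replies to keep the experience conversational
    return llm_message[:1000]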
Amazon Bedrock Integration
# Example Python code for integrating Amazon Bedrock with a Lex bot
import boto3
import json
import os
import time
# Initialize clients
bedrock = boto3.client('bedrock-runtime')
lex = boto3.client('lexv2-runtime')
def lambda_handler(event, context):
"""
Lambda function that integrates Lex with Bedrock for enhanced conversational capabilities
"""
print(f"Received event: {json.dumps(event)}")
# Extract user input and session information
user_input = event.get('inputText', '')
session_id = event.get('sessionId', f"session_{int(time.time())}")
session_state = event.get('sessionState', {})
# Step 1: Process with Lex first
lex_response = process_with_lex(user_input, session_id, session_state)
# Step 2: Determine if we need to use Bedrock based on Lex response
if should_use_bedrock(lex_response):
# Step 3: Process with Bedrock
bedrock_response = process_with_bedrock(user_input, lex_response, session_state)
# Step 4: Return the enhanced response
return {
'sessionState': update_session_state(session_state, lex_response, bedrock_response),
'messages': [{'content': bedrock_response['message'], 'contentType': 'PlainText'}],
'requestAttributes': event.get('requestAttributes', {}),
'sessionId': session_id,
'source': 'bedrock'
}
else:
# Return the original Lex response
return {
'sessionState': update_session_state(session_state, lex_response, None),
'messages': lex_response.get('messages', []),
'requestAttributes': event.get('requestAttributes', {}),
'sessionId': session_id,
'source': 'lex'
}
def process_with_lex(user_input, session_id, session_state):
"""
Process the user input with Amazon Lex
"""
try:
response = lex.recognize_text(
botId=os.environ['LEX_BOT_ID'],
botAliasId=os.environ['LEX_BOT_ALIAS_ID'],
localeId='en_US',
sessionId=session_id,
text=user_input,
sessionState=session_state
)
print(f"Lex response: {json.dumps(response)}")
return response
except Exception as e:
print(f"Error processing with Lex: {str(e)}")
return {
'messages': [{'content': "I'm having trouble understanding. Could you try again?", 'contentType': 'PlainText'}],
'sessionState': session_state,
'error': str(e)
}
def should_use_bedrock(lex_response):
"""
Determine if we should use Bedrock based on the Lex response
"""
    # Don't use Bedrock while Lex is still eliciting a slot value
    if 'sessionState' in lex_response and 'dialogAction' in lex_response['sessionState']:
        if lex_response['sessionState']['dialogAction']['type'] == 'ElicitSlot':
            return False
    # Case 1: No matching intent with high confidence
    if 'interpretations' in lex_response and len(lex_response['interpretations']) > 0:
        top_intent = lex_response['interpretations'][0]
        if 'intent' in top_intent and 'nluConfidence' in top_intent:
            # If confidence is below threshold, use Bedrock
            if top_intent['nluConfidence']['score'] < 0.6:
                return True
    # Case 2: Fallback intent was triggered
    if 'sessionState' in lex_response and 'intent' in lex_response['sessionState']:
        intent_name = lex_response['sessionState']['intent']['name']
        if intent_name == 'FallbackIntent':
            return True
    # Case 3: Specific intents that benefit from enhancement
    enhance_intents = ['ProvideInformation', 'ExplainConcept', 'AnswerQuestion']
    if 'sessionState' in lex_response and 'intent' in lex_response['sessionState']:
        intent_name = lex_response['sessionState']['intent']['name']
        if intent_name in enhance_intents:
            return True
    # Default: stick with the Lex response
    return False
def process_with_bedrock(user_input, lex_response, session_state):
"""
Process the user input with Amazon Bedrock
"""
try:
# Extract conversation history from session state
conversation_history = session_state.get('conversationHistory', [])
# Prepare the prompt
prompt = prepare_prompt(user_input, lex_response, conversation_history)
# Select the model to use
model_id = os.environ.get('BEDROCK_MODEL_ID', 'anthropic.claude-v2')
# Call Bedrock with the appropriate parameters for the selected model
if 'anthropic.claude' in model_id:
response = call_claude(prompt, model_id)
elif 'amazon.titan' in model_id:
response = call_titan(prompt, model_id)
else:
raise ValueError(f"Unsupported model: {model_id}")
return {
'message': response,
'model': model_id
}
except Exception as e:
print(f"Error processing with Bedrock: {str(e)}")
# Fall back to Lex response if available, otherwise generic message
if 'messages' in lex_response and len(lex_response['messages']) > 0:
fallback_message = lex_response['messages'][0]['content']
else:
fallback_message = "I'm having trouble generating a response right now."
return {
'message': fallback_message,
'error': str(e)
}
def prepare_prompt(user_input, lex_response, conversation_history):
"""
Prepare a prompt for the LLM based on the conversation context
"""
# Extract intent and slots from Lex response if available
intent_name = "Unknown"
slots = {}
if 'sessionState' in lex_response and 'intent' in lex_response['sessionState']:
intent_name = lex_response['sessionState']['intent']['name']
slots = lex_response['sessionState']['intent'].get('slots', {})
# Start with system instructions
system_prompt = """
You are a helpful assistant that provides accurate, concise, and friendly responses.
You are part of a hybrid system where some requests are handled by Amazon Lex and others by you.
Keep your responses conversational but focused on answering the user's question.
Do not make up information or provide financial, medical, or legal advice.
If you don't know something, it's okay to say so.
"""
# For Claude models, format the prompt according to their requirements
prompt = f"\n\nHuman: {system_prompt}\n\nAssistant: I understand my role in the hybrid system. I'll provide helpful, accurate responses while staying within my guidelines.\n\n"
# Add relevant conversation history (last 3 exchanges)
for exchange in conversation_history[-3:]:
prompt += f"Human: {exchange['user']}\n\nAssistant: {exchange['assistant']}\n\n"
# Add context about the current intent and slots
prompt += f"Human: The user's message was: \"{user_input}\"\n\n"
prompt += f"The system detected intent: \"{intent_name}\" with the following slots: {json.dumps(slots)}\n\n"
# Add specific instructions based on the intent
if intent_name == "FallbackIntent":
prompt += "The system couldn't match this to a specific intent. Please provide a helpful response to the user's query.\n\n"
elif intent_name in ["ProvideInformation", "ExplainConcept", "AnswerQuestion"]:
prompt += "Please provide a detailed but concise explanation in response to this query.\n\n"
# Add the final instruction
prompt += "Please respond to the user's message in a helpful, accurate, and conversational way.\n\nAssistant:"
return prompt
def call_claude(prompt, model_id):
"""
Call Claude model through Bedrock
"""
response = bedrock.invoke_model(
modelId=model_id,
contentType='application/json',
accept='application/json',
body=json.dumps({
'prompt': prompt,
'max_tokens_to_sample': 500,
'temperature': 0.7,
'top_p': 0.9,
'stop_sequences': ["\n\nHuman:"]
})
)
response_body = json.loads(response['body'].read())
return response_body['completion'].strip()
def call_titan(prompt, model_id):
"""
Call Titan model through Bedrock
"""
response = bedrock.invoke_model(
modelId=model_id,
contentType='application/json',
accept='application/json',
body=json.dumps({
'inputText': prompt,
'textGenerationConfig': {
'maxTokenCount': 500,
'temperature': 0.7,
'topP': 0.9
}
})
)
response_body = json.loads(response['body'].read())
return response_body['results'][0]['outputText'].strip()
def update_session_state(session_state, lex_response, bedrock_response):
"""
Update the session state with information from both Lex and Bedrock
"""
updated_state = session_state.copy() if session_state else {}
# Update with Lex session state
if 'sessionState' in lex_response:
updated_state.update(lex_response['sessionState'])
# Add conversation history
conversation_history = updated_state.get('conversationHistory', [])
# Add the current exchange to history
user_input = lex_response.get('inputTranscript', '')
if bedrock_response:
assistant_response = bedrock_response['message']
source = 'bedrock'
elif 'messages' in lex_response and len(lex_response['messages']) > 0:
assistant_response = lex_response['messages'][0]['content']
source = 'lex'
else:
assistant_response = "No response generated."
source = 'unknown'
conversation_history.append({
'timestamp': int(time.time()),
'user': user_input,
'assistant': assistant_response,
'source': source
})
# Keep only the last 10 exchanges to manage context size
updated_state['conversationHistory'] = conversation_history[-10:]
# Add processing metadata
updated_state['lastProcessed'] = {
'timestamp': int(time.time()),
'source': source
}
return updated_state
Context Management in Hybrid Systems
Effective context management is crucial for creating coherent, natural conversations in hybrid systems that combine multiple technologies.
Types of Context
Several types of context need to be managed in conversational interfaces:
- Conversation History: Previous exchanges between the user and system
- User Information: User profile, preferences, and history
- Session State: Current state of the conversation, including active intents and slots
- Environmental Context: Time, location, device, and other external factors
- Application State: State of the application or service the conversation relates to
Context Management Strategies
Several strategies can be used to manage context effectively in hybrid systems:
- Centralized Context Store: Maintain a single source of truth for context that all components can access
- Context Passing: Pass relevant context between components with each request
- Context Summarization: Create concise summaries of context for components with limited context windows (see the sketch after this list)
- Selective Context Sharing: Share only the relevant portions of context with each component
- Context Expiration: Implement policies for when context should expire or be refreshed
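A minimal sketch of the context summarization strategy: recent exchanges are kept verbatim while older ones are collapsed into a single summary entry. The summarize() callable is a placeholder (it could be another Bedrock invocation), and the history entry shape matches the conversationHistory used in the earlier examples:
# Context summarization sketch: keep recent turns verbatim and compress the rest.
def compact_history(conversation_history, keep_recent=5, summarize=None):
    if len(conversation_history) <= keep_recent:
        return conversation_history

    older = conversation_history[:-keep_recent]
    recent = conversation_history[-keep_recent:]
    text = "\n".join(f"User: {x['user']}\nAssistant: {x['assistant']}" for x in older)
    summary = summarize(text) if summarize else f"Earlier conversation covered {len(older)} exchanges."

    # A single synthetic entry stands in for the summarized turns
    return [{'user': '(summary of earlier turns)', 'assistant': summary, 'source': 'summary'}] + recent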
Context Management Example
Centralized Context Store
User Profile
{ "userId": "user123", "name": "Alex", "preferences": { "language": "en", "notifications": true }, "accountType": "premium" }
Conversation History
[ { "timestamp": "2025-05-24T10:15:30Z", "user": "I need to book a flight to London", "assistant": "I can help you book a flight to London. When would you like to travel?", "source": "lex" }, { "timestamp": "2025-05-24T10:15:45Z", "user": "Next Friday", "assistant": "Great! Would you prefer a morning or evening flight to London next Friday, May 30th?", "source": "lex" } ]
Session State
{ "activeIntent": "BookFlight", "slots": { "Destination": "London", "DepartureDate": "2025-05-30", "TimeOfDay": null }, "dialogState": "ElicitSlot" }
Implementing Context Management
Practical approaches to implementing context management include:
- Session Attributes: Use Lex session attributes to store and retrieve context
- External Storage: Use DynamoDB or other databases for persistent context storage (see the DynamoDB sketch after this list)
- Context Preprocessing: Format context appropriately for each component
- Context Postprocessing: Update context based on component outputs
- Context Validation: Ensure context is consistent and valid across components
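A minimal sketch of the external-storage approach using DynamoDB; the table name, key schema, and TTL attribute are illustrative assumptions:
# Centralized context store sketch backed by DynamoDB.
import time
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('conversation-context')

def load_context(session_id):
    item = table.get_item(Key={'sessionId': session_id}).get('Item')
    return item['context'] if item else {}

def save_context(session_id, context, ttl_seconds=3600):
    table.put_item(Item={
        'sessionId': session_id,
        'context': context,
        'expiresAt': int(time.time()) + ttl_seconds   # supports context expiration via a TTL policy
    })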
Creating Seamless User Experiences
The ultimate goal of hybrid approaches is to create seamless user experiences that leverage the strengths of different technologies while hiding the complexity from users.
Design Principles for Hybrid Interfaces
Key design principles for creating seamless hybrid conversational experiences:
- Consistency: Maintain consistent tone, personality, and capabilities across components
- Transparency: Be clear about capabilities and limitations without exposing the underlying technology
- Graceful Transitions: Ensure smooth transitions between different technologies
- Appropriate Expectations: Set and manage user expectations about what the system can do
- Error Recovery: Handle errors gracefully regardless of which component they originate from
Testing and Evaluation
Thorough testing and evaluation are essential for hybrid systems:
- Component Testing: Test each technology component individually
- Integration Testing: Test how components work together
- End-to-End Testing: Test complete conversation flows
- A/B Testing: Compare different hybrid approaches
- User Testing: Gather feedback from real users
Key metrics to evaluate include:
- Task completion rate
- Conversation length
- User satisfaction
- Error rate
- Response latency
- Seamlessness of transitions
Continuous Improvement
Hybrid systems benefit from a continuous improvement approach:
- Monitoring: Track performance and user interactions (see the metrics sketch below)
- Analysis: Identify patterns and areas for improvement
- Refinement: Adjust routing logic, prompts, and integration points
- Expansion: Gradually add new capabilities and technologies
- Feedback Loops: Incorporate user feedback into improvements
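As one concrete way to start the monitoring step, each routing decision can be published as a custom CloudWatch metric; the namespace, metric names, and dimensions below are illustrative assumptions:
# Monitoring sketch: publish each routing decision as a custom CloudWatch metric.
import boto3

cloudwatch = boto3.client('cloudwatch')

def record_routing_decision(destination, confidence):
    cloudwatch.put_metric_data(
        Namespace='HybridBot',
        MetricData=[
            {
                'MetricName': 'RoutingDecision',
                'Dimensions': [{'Name': 'Destination', 'Value': destination}],
                'Value': 1,
                'Unit': 'Count'
            },
            {
                'MetricName': 'RoutingConfidence',
                'Dimensions': [{'Name': 'Destination', 'Value': destination}],
                'Value': confidence,
                'Unit': 'None'
            }
        ]
    )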