Module 09

Beyond Lex: Hybrid Approaches

Explore advanced architectures that combine Amazon Lex with other technologies to create more sophisticated and powerful conversational experiences.

Learning Objectives

  • Understand the limitations of single-technology approaches to conversational AI
  • Design hybrid architectures that combine multiple technologies
  • Integrate Amazon Lex with large language models (LLMs)
  • Implement context management across different components
  • Create seamless user experiences with hybrid systems

Limitations of Single-Technology Approaches

While Amazon Lex provides a powerful foundation for conversational interfaces, relying solely on one technology can limit the capabilities and flexibility of your solution. Understanding these limitations is the first step toward designing more sophisticated hybrid approaches.

Common Limitations of Lex-Only Solutions

Amazon Lex excels at structured conversations with clear intents and slots, but may face challenges with:

  • Complex, Open-Ended Conversations: Handling multi-turn dialogues that don't follow predictable patterns
  • Contextual Understanding: Maintaining context across multiple turns or topics
  • Knowledge-Based Responses: Providing detailed information that requires access to large knowledge bases
  • Nuanced Language Understanding: Recognizing subtle variations in meaning, sarcasm, or implicit requests
  • Personalization: Adapting responses based on user preferences, history, or characteristics
  • Creativity: Generating novel, diverse responses to similar inputs

Technology Comparison

Capability               | Amazon Lex | Large Language Models | Knowledge Bases
------------------------ | ---------- | --------------------- | ---------------
Intent Recognition       | High       | Medium                | Low
Structured Conversations | High       | Medium                | Low
Open-Ended Dialogue      | Low        | High                  | Low
Factual Knowledge        | Low        | Medium                | High
Contextual Memory        | Medium     | High                  | Low
Response Creativity      | Low        | High                  | Low
Predictable Behavior     | High       | Low                   | High

When to Consider Hybrid Approaches

Hybrid approaches are particularly valuable when your conversational interface needs to:

  1. Handle Both Structured and Unstructured Interactions: Combine the predictability of intent-based systems with the flexibility of generative models
  2. Access Large Knowledge Bases: Provide detailed, accurate information beyond what can be encoded in intents and responses
  3. Maintain Complex Context: Track and use information across multiple turns and topics
  4. Generate Dynamic, Personalized Content: Create responses tailored to specific users or situations
  5. Balance Control and Flexibility: Maintain guardrails while allowing for more natural conversations

Hybrid Architecture Patterns

Several architectural patterns can be used to combine different conversational AI technologies, each with its own strengths and use cases.

Router Pattern

The Router pattern uses a central component to direct user inputs to the most appropriate technology based on the type of request:

  1. Input Analysis: Analyze user input to determine its type and complexity
  2. Routing Decision: Direct the input to the most suitable technology (e.g., Lex for transactional requests, LLM for open-ended questions)
  3. Response Integration: Combine and format responses from different components
  4. Context Management: Maintain and share context across components

This pattern is particularly useful when different types of requests clearly benefit from different technologies.

Router Pattern Implementation

// Example Lambda function implementing a router pattern
const AWS = require('aws-sdk');
const lexRuntime = new AWS.LexRuntimeV2();
const bedrockRuntime = new AWS.BedrockRuntime(); // runtime client exposes invokeModel

exports.handler = async (event) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    
    // Extract user input and session data
    const userInput = event.userInput;
    const sessionId = event.sessionId;
    const sessionState = event.sessionState || {};
    
    // Step 1: Analyze the input to determine routing
    const routingDecision = await analyzeInput(userInput, sessionState);
    
    // Step 2: Route to appropriate technology based on the decision
    let response;
    switch (routingDecision.destination) {
        case 'lex':
            response = await routeToLex(userInput, sessionId, sessionState);
            break;
        case 'llm':
            response = await routeToLLM(userInput, sessionState);
            break;
        case 'knowledge_base':
            response = await routeToKnowledgeBase(userInput, sessionState);
            break;
        default:
            response = {
                message: "I'm not sure how to process that request.",
                sessionState: sessionState
            };
    }
    
    // Step 3: Update context with the new information
    const updatedSessionState = updateContext(sessionState, userInput, response, routingDecision);
    
    // Step 4: Format and return the final response
    return {
        message: response.message,
        sessionState: updatedSessionState,
        source: routingDecision.destination
    };
};

// Analyze input to determine where to route it
async function analyzeInput(userInput, sessionState) {
    // Simple heuristic-based routing for demonstration
    // In a real system, this could use more sophisticated analysis
    
    // Check for transactional or task-oriented patterns
    const transactionalPatterns = [
        /book/i, /schedule/i, /reserve/i, /order/i, /buy/i, /purchase/i,
        /cancel/i, /change/i, /update/i, /status/i, /check/i, /find/i
    ];
    
    // Check for knowledge-seeking patterns
    const knowledgePatterns = [
        /what is/i, /how does/i, /explain/i, /tell me about/i, /information on/i,
        /details about/i, /when was/i, /where is/i, /who is/i, /why does/i
    ];
    
    // Check for open-ended, conversational patterns
    const conversationalPatterns = [
        /think about/i, /opinion on/i, /feel about/i, /imagine/i, /creative/i,
        /suggest/i, /recommend/i, /advice/i, /help me with/i, /brainstorm/i
    ];
    
    // Check if we're in the middle of a Lex conversation
    const inLexConversation = sessionState.lexSessionActive && 
                             sessionState.lexSessionState && 
                             sessionState.lexSessionState.dialogAction && 
                             sessionState.lexSessionState.dialogAction.type !== 'Close';
    
    // If we're in an active Lex conversation, continue with Lex
    if (inLexConversation) {
        return { destination: 'lex', confidence: 0.9, reason: 'active_lex_session' };
    }
    
    // Check for transactional patterns
    for (const pattern of transactionalPatterns) {
        if (pattern.test(userInput)) {
            return { destination: 'lex', confidence: 0.8, reason: 'transactional_pattern' };
        }
    }
    
    // Check for knowledge patterns
    for (const pattern of knowledgePatterns) {
        if (pattern.test(userInput)) {
            return { destination: 'knowledge_base', confidence: 0.7, reason: 'knowledge_pattern' };
        }
    }
    
    // Check for conversational patterns
    for (const pattern of conversationalPatterns) {
        if (pattern.test(userInput)) {
            return { destination: 'llm', confidence: 0.7, reason: 'conversational_pattern' };
        }
    }
    
    // Default to LLM for unclassified inputs
    return { destination: 'llm', confidence: 0.5, reason: 'default' };
}

// Route the request to Amazon Lex
async function routeToLex(userInput, sessionId, sessionState) {
    // Extract Lex-specific session state if it exists
    const lexSessionState = sessionState.lexSessionState || {};
    
    try {
        const params = {
            botId: process.env.LEX_BOT_ID,
            botAliasId: process.env.LEX_BOT_ALIAS_ID,
            localeId: 'en_US',
            sessionId: sessionId,
            text: userInput,
            sessionState: lexSessionState
        };
        
        const lexResponse = await lexRuntime.recognizeText(params).promise();
        
        // Extract the message from Lex response
        let message = '';
        if (lexResponse.messages && lexResponse.messages.length > 0) {
            message = lexResponse.messages.map(m => m.content).join(' ');
        }
        
        return {
            message: message,
            lexSessionState: lexResponse.sessionState,
            dialogState: lexResponse.sessionState.dialogAction.type
        };
    } catch (error) {
        console.error('Error calling Lex:', error);
        return {
            message: "I'm having trouble processing your request right now.",
            error: error.message
        };
    }
}

// Route the request to a Large Language Model
async function routeToLLM(userInput, sessionState) {
    // Get conversation history from session state
    const conversationHistory = sessionState.conversationHistory || [];
    
    // Prepare the prompt with conversation history
    const prompt = preparePromptWithHistory(userInput, conversationHistory);
    
    try {
        // Call Amazon Bedrock with Claude model
        const params = {
            modelId: 'anthropic.claude-v2',
            contentType: 'application/json',
            accept: 'application/json',
            body: JSON.stringify({
                prompt: prompt,
                max_tokens_to_sample: 500,
                temperature: 0.7,
                top_p: 0.9,
                stop_sequences: ["\n\nHuman:"]
            })
        };
        
        const bedrockResponse = await bedrockRuntime.invokeModel(params).promise();
        const responseBody = JSON.parse(bedrockResponse.body.toString());
        
        return {
            message: responseBody.completion.trim(),
            modelId: 'anthropic.claude-v2'
        };
    } catch (error) {
        console.error('Error calling LLM:', error);
        return {
            message: "I'm having trouble generating a response right now.",
            error: error.message
        };
    }
}

// Route the request to a knowledge base
async function routeToKnowledgeBase(userInput, sessionState) {
    // In a real implementation, this would call Amazon Kendra or another knowledge base
    // For this example, we'll simulate a knowledge base response
    
    return {
        message: `Here's what I found about "${userInput}": This would be information retrieved from a knowledge base.`,
        source: 'simulated_knowledge_base'
    };
}

// Prepare a prompt for the LLM that includes conversation history
function preparePromptWithHistory(userInput, conversationHistory) {
    // Start with the system prompt
    let prompt = "\n\nHuman: You are a helpful, harmless assistant that provides accurate and concise information. You're part of a hybrid system where some requests are handled by other components. Keep your responses friendly and conversational.\n\nAssistant: I understand. I'll provide helpful, accurate, and concise responses in a friendly tone.\n\n";
    
    // Add conversation history
    for (const exchange of conversationHistory.slice(-5)) { // Include up to 5 most recent exchanges
        prompt += `Human: ${exchange.userInput}\n\nAssistant: ${exchange.assistantResponse}\n\n`;
    }
    
    // Add the current user input
    prompt += `Human: ${userInput}\n\nAssistant:`;
    
    return prompt;
}

// Update the context with new information
function updateContext(sessionState, userInput, response, routingDecision) {
    const updatedState = { ...sessionState };
    
    // Update Lex session state if applicable
    if (routingDecision.destination === 'lex' && response.lexSessionState) {
        updatedState.lexSessionState = response.lexSessionState;
        updatedState.lexSessionActive = response.dialogState !== 'Close';
    }
    
    // Update conversation history
    const conversationHistory = updatedState.conversationHistory || [];
    conversationHistory.push({
        timestamp: new Date().toISOString(),
        userInput: userInput,
        assistantResponse: response.message,
        source: routingDecision.destination
    });
    
    // Keep only the last 10 exchanges to manage context size
    updatedState.conversationHistory = conversationHistory.slice(-10);
    
    // Add routing information for analytics
    updatedState.lastRouting = {
        destination: routingDecision.destination,
        confidence: routingDecision.confidence,
        reason: routingDecision.reason,
        timestamp: new Date().toISOString()
    };
    
    return updatedState;
}

Fallback Pattern

The Fallback pattern uses a primary technology (typically Lex) for most interactions, but falls back to alternative technologies when the primary system cannot handle a request:

  1. Primary Processing: First attempt to handle the request with the primary system
  2. Confidence Check: Evaluate whether the primary system's response is satisfactory
  3. Fallback Decision: If confidence is low or no matching intent is found, route to the fallback system
  4. Response Selection: Choose the most appropriate response from available options

This pattern is useful when you want to maintain the predictability and control of a structured system while handling edge cases more gracefully.

Fallback Pattern Flow

  1. User Input: The user sends a message to the conversational interface.
  2. Primary System (Lex): Attempt to match the input to defined intents and slots.
     • If an intent is matched with high confidence, use the Lex response: process the intent and return a structured response.
     • If not, hand the input to the fallback system.
  3. Fallback System (LLM): Process the input with the more flexible model and return the generated response with appropriate guardrails.
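
Fallback Pattern Implementation (Sketch)

A minimal sketch of this flow in Python, assuming the Lex V2 runtime client, hypothetical LEX_BOT_ID and LEX_BOT_ALIAS_ID environment variables, and a caller-supplied call_llm_fallback helper (for example, a Bedrock call like those shown later in this module). The 0.6 confidence threshold is illustrative, not a recommended value.

# Minimal fallback-pattern sketch. Assumes LEX_BOT_ID and LEX_BOT_ALIAS_ID
# environment variables and a caller-supplied call_llm_fallback() helper.
import os
import boto3

lex = boto3.client('lexv2-runtime')

CONFIDENCE_THRESHOLD = 0.6  # illustrative threshold

def handle_with_fallback(user_input, session_id, session_state, call_llm_fallback):
    # Step 1: try the primary system (Lex) first
    lex_response = lex.recognize_text(
        botId=os.environ['LEX_BOT_ID'],
        botAliasId=os.environ['LEX_BOT_ALIAS_ID'],
        localeId='en_US',
        sessionId=session_id,
        text=user_input,
        sessionState=session_state or {}
    )

    # Step 2: confidence check on the top interpretation
    interpretations = lex_response.get('interpretations', [])
    top = interpretations[0] if interpretations else {}
    confidence = top.get('nluConfidence', {}).get('score', 0)
    intent_name = top.get('intent', {}).get('name', 'FallbackIntent')

    # Step 3: fall back to the LLM when Lex is unsure or hit its fallback intent
    if confidence < CONFIDENCE_THRESHOLD or intent_name == 'FallbackIntent':
        return {'message': call_llm_fallback(user_input), 'source': 'llm'}

    # Step 4: otherwise return the structured Lex response
    messages = lex_response.get('messages', [])
    return {
        'message': ' '.join(m['content'] for m in messages),
        'source': 'lex',
        'sessionState': lex_response['sessionState']
    }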

Orchestrator Pattern

The Orchestrator pattern uses a central component to coordinate multiple technologies that work together to handle a single request:

  1. Request Analysis: Break down the request into components that different technologies can handle
  2. Parallel Processing: Send components to appropriate technologies simultaneously
  3. Result Aggregation: Combine results from different components
  4. Response Generation: Create a unified, coherent response

This pattern is valuable for complex requests that benefit from multiple technologies working together, such as combining structured data retrieval with natural language generation.
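
Orchestrator Pattern (Sketch)

A compact sketch of the orchestrator idea. The fetch_account_data, search_knowledge_base, and generate_summary helpers are hypothetical stand-ins for a transactional component, a knowledge base, and an LLM; the point is the shape of the pattern (decompose, fan out in parallel, aggregate) rather than any specific service call.

# Orchestrator-pattern sketch: decompose a request, fan out to components in
# parallel, then aggregate the results into one response. The three helpers
# passed in are hypothetical stand-ins for Lex, a knowledge base, and an LLM.
from concurrent.futures import ThreadPoolExecutor

def orchestrate(user_input, session_state,
                fetch_account_data, search_knowledge_base, generate_summary):
    tasks = {}
    with ThreadPoolExecutor() as pool:
        # Step 1-2: request analysis and parallel processing - submit each component call
        tasks['account'] = pool.submit(fetch_account_data, session_state.get('userId'))
        tasks['knowledge'] = pool.submit(search_knowledge_base, user_input)

        # Step 3: result aggregation - collect whatever each component returned
        results = {name: future.result() for name, future in tasks.items()}

    # Step 4: response generation - let an LLM weave the pieces into one answer
    return generate_summary(user_input, results)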

Augmentation Pattern

The Augmentation pattern uses one technology to enhance the capabilities of another:

  1. Primary Processing: Handle the core request with the primary technology
  2. Enhancement Identification: Identify aspects that could be improved
  3. Augmentation: Use secondary technologies to enhance specific aspects
  4. Integration: Incorporate enhancements into the final response

For example, using Lex for intent recognition and slot filling, then using an LLM to make the response more natural and conversational.

Augmentation Pattern Implementation

// Example Lambda function implementing an augmentation pattern
const AWS = require('aws-sdk');
const lexRuntime = new AWS.LexRuntimeV2();
const bedrockRuntime = new AWS.BedrockRuntime(); // runtime client exposes invokeModel

exports.handler = async (event) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    
    // Extract user input and session data
    const userInput = event.userInput;
    const sessionId = event.sessionId;
    const sessionState = event.sessionState || {};
    
    try {
        // Step 1: Process with Lex to handle intent recognition and slot filling
        const lexResponse = await processWithLex(userInput, sessionId, sessionState);
        
        // Step 2: Determine if the response needs enhancement
        const needsEnhancement = shouldEnhanceResponse(lexResponse);
        
        if (needsEnhancement) {
            // Step 3: Enhance the response using an LLM
            const enhancedResponse = await enhanceWithLLM(userInput, lexResponse, sessionState);
            
            // Step 4: Return the enhanced response
            return {
                message: enhancedResponse.message,
                sessionState: {
                    ...sessionState,
                    lexSessionState: lexResponse.lexSessionState,
                    lastEnhanced: true
                },
                enhanced: true
            };
        } else {
            // Return the original Lex response if enhancement isn't needed
            return {
                message: lexResponse.message,
                sessionState: {
                    ...sessionState,
                    lexSessionState: lexResponse.lexSessionState,
                    lastEnhanced: false
                },
                enhanced: false
            };
        }
    } catch (error) {
        console.error('Error processing request:', error);
        return {
            message: "I'm having trouble processing your request right now.",
            sessionState: sessionState,
            error: error.message
        };
    }
};

// Process the request with Amazon Lex
async function processWithLex(userInput, sessionId, sessionState) {
    // Extract Lex-specific session state if it exists
    const lexSessionState = sessionState.lexSessionState || {};
    
    try {
        const params = {
            botId: process.env.LEX_BOT_ID,
            botAliasId: process.env.LEX_BOT_ALIAS_ID,
            localeId: 'en_US',
            sessionId: sessionId,
            text: userInput,
            sessionState: lexSessionState
        };
        
        const lexResponse = await lexRuntime.recognizeText(params).promise();
        
        // Extract the message from Lex response
        let message = '';
        if (lexResponse.messages && lexResponse.messages.length > 0) {
            message = lexResponse.messages.map(m => m.content).join(' ');
        }
        
        // Extract intent and slots for context
        const intent = lexResponse.sessionState.intent ? lexResponse.sessionState.intent.name : 'None';
        const slots = lexResponse.sessionState.intent ? lexResponse.sessionState.intent.slots : {};
        
        return {
            message: message,
            lexSessionState: lexResponse.sessionState,
            dialogState: lexResponse.sessionState.dialogAction.type,
            intent: intent,
            slots: slots,
            confidence: lexResponse.interpretations && lexResponse.interpretations[0] &&
                        lexResponse.interpretations[0].nluConfidence ?
                        lexResponse.interpretations[0].nluConfidence.score : 0
        };
    } catch (error) {
        console.error('Error calling Lex:', error);
        throw error;
    }
}

// Determine if the response should be enhanced
function shouldEnhanceResponse(lexResponse) {
    // Criteria for enhancement:
    
    // 1. Don't enhance if we're in the middle of slot filling
    if (lexResponse.dialogState === 'ElicitSlot') {
        return false;
    }
    
    // 2. Don't enhance if confidence is very low (might be misinterpreting)
    if (lexResponse.confidence < 0.3) {
        return false;
    }
    
    // 3. Enhance fulfilled intents to make responses more natural
    if (lexResponse.dialogState === 'Close' &&
        lexResponse.lexSessionState.intent &&
        lexResponse.lexSessionState.intent.state === 'Fulfilled') {
        return true;
    }
    
    // 4. Enhance certain intents that benefit from more natural language
    const enhanceableIntents = [
        'ProvideInformation',
        'ExplainFeatures',
        'GiveAdvice',
        'AnswerFAQ'
    ];
    
    if (enhanceableIntents.includes(lexResponse.intent)) {
        return true;
    }
    
    // Default to not enhancing
    return false;
}

// Enhance the Lex response using an LLM
async function enhanceWithLLM(userInput, lexResponse, sessionState) {
    // Prepare the prompt for enhancement
    const prompt = prepareEnhancementPrompt(userInput, lexResponse, sessionState);
    
    try {
        // Call Amazon Bedrock with Claude model
        const params = {
            modelId: 'anthropic.claude-v2',
            contentType: 'application/json',
            accept: 'application/json',
            body: JSON.stringify({
                prompt: prompt,
                max_tokens_to_sample: 300,
                temperature: 0.7,
                top_p: 0.9,
                stop_sequences: ["\n\nHuman:"]
            })
        };
        
        const bedrockResponse = await bedrockRuntime.invokeModel(params).promise();
        const responseBody = JSON.parse(bedrockResponse.body.toString());
        
        return {
            message: responseBody.completion.trim(),
            originalMessage: lexResponse.message,
            modelId: 'anthropic.claude-v2'
        };
    } catch (error) {
        console.error('Error enhancing with LLM:', error);
        // Fall back to the original Lex response if enhancement fails
        return {
            message: lexResponse.message,
            error: error.message
        };
    }
}

// Prepare a prompt for enhancing the response
function prepareEnhancementPrompt(userInput, lexResponse, sessionState) {
    // Get user's name from session state if available
    const userName = sessionState.userProfile ? sessionState.userProfile.firstName : null;
    
    // Start with the system prompt
    let prompt = "\n\nHuman: You are an AI assistant that makes responses more conversational and natural. You'll be given a user's input and a basic response. Your job is to enhance the response to make it more engaging and natural while preserving all the factual information. Keep the enhanced response concise and focused on the user's request.\n\nAssistant: I understand. I'll enhance responses to make them more conversational and natural while preserving all factual information and keeping them concise.\n\n";
    
    // Add context about the user if available
    if (userName) {
        prompt += `Human: The user's name is ${userName}. Please personalize the response appropriately.\n\nAssistant: I'll make sure to personalize the response for ${userName}.\n\n`;
    }
    
    // Add context about the intent and slots
    prompt += `Human: The user's intent is "${lexResponse.intent}" and they provided these details: ${JSON.stringify(lexResponse.slots)}.\n\nAssistant: I understand the context of the conversation.\n\n`;
    
    // Add the current exchange
    prompt += `Human: The user said: "${userInput}"\n\nThe basic response is: "${lexResponse.message}"\n\nPlease enhance this response to make it more conversational and natural while preserving all the factual information.\n\nAssistant:`;
    
    return prompt;
}

Integrating Lex with Large Language Models

Large Language Models (LLMs) like those available through Amazon Bedrock can significantly enhance the capabilities of Lex-based conversational interfaces.

LLM Capabilities and Limitations

Understanding the strengths and weaknesses of LLMs is essential for effective integration:

  • Strengths:
    • Generating natural, contextually appropriate responses
    • Handling open-ended, creative requests
    • Understanding complex or ambiguous language
    • Maintaining context across multiple turns
    • Adapting tone and style based on the conversation
  • Limitations:
    • Less predictable than rule-based systems
    • Potential for hallucinations or factual errors
    • Difficulty with precise, structured data collection
    • Higher latency and computational cost
    • Challenges with specific domain knowledge without fine-tuning

Integration Approaches

Several approaches can be used to integrate Lex with LLMs:

  1. LLM for Response Enhancement: Use Lex for intent recognition and slot filling, then use an LLM to make responses more natural
  2. LLM for Fallback Handling: Use an LLM when Lex cannot match an intent or has low confidence
  3. LLM for Complex Queries: Route complex or open-ended queries to an LLM while keeping transactional requests with Lex
  4. LLM for Context Management: Use an LLM to maintain and interpret conversation context across multiple turns
  5. LLM for Intent Classification: Use an LLM to classify intents before routing to Lex for structured processing
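
Intent Classification with an LLM (Sketch)

Most of these approaches appear in the larger code samples in this module; the last one, LLM-based intent classification, can be sketched as a single Bedrock call that returns a label before anything is routed. The intent labels and prompt wording below are illustrative, and the request format follows the Claude text-completion style used elsewhere in this module.

# Sketch: using an LLM as a lightweight intent classifier before routing.
# The intent labels and prompt are illustrative.
import json
import boto3

bedrock = boto3.client('bedrock-runtime')

INTENT_LABELS = ['BookFlight', 'CheckWeather', 'GeneralQuestion']  # illustrative

def classify_intent(user_input):
    prompt = (
        "\n\nHuman: Classify the user message into exactly one of these labels: "
        + ", ".join(INTENT_LABELS)
        + ". Reply with only the label.\n\n"
        + f'Message: "{user_input}"\n\nAssistant:'
    )
    response = bedrock.invoke_model(
        modelId='anthropic.claude-v2',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'prompt': prompt,
            'max_tokens_to_sample': 10,
            'temperature': 0.0,  # we want a deterministic label, not creative text
            'stop_sequences': ["\n\nHuman:"]
        })
    )
    completion = json.loads(response['body'].read())['completion'].strip()
    # Fall back to a general label if the model returns something unexpected
    return completion if completion in INTENT_LABELS else 'GeneralQuestion'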

LLM Integration Examples

Response Enhancement

User: "What's the weather like in Seattle today?"
Lex Response: "The weather in Seattle is 62°F with rain."
Enhanced with LLM: "It looks like you'll need an umbrella in Seattle today! It's 62°F with rain showers throughout the day."

Fallback Handling

User: "Can you explain how cloud computing works?"
Lex Response: "I'm sorry, I didn't understand your request."
LLM Fallback: "Cloud computing is a technology that allows you to access computing resources (like servers, storage, and applications) over the internet instead of having them physically located on your premises. It works by distributing these resources across multiple data centers, allowing for scalability, flexibility, and often cost savings compared to traditional IT infrastructure."

Context Management

User: "What's the weather like in Seattle?"
Bot: "It's currently 62°F with rain in Seattle."
User: "How about tomorrow?"
LLM Context Management: [Recognizes "tomorrow" refers to weather in Seattle]
Bot: "Tomorrow in Seattle will be partly cloudy with a high of 65°F."

Using Amazon Bedrock

Amazon Bedrock provides a managed service for accessing foundation models from leading AI companies. Key considerations when using Bedrock with Lex include:

  1. Model Selection: Choose the appropriate model based on your requirements for performance, cost, and capabilities
  2. Prompt Engineering: Design effective prompts that provide necessary context and guidance
  3. Response Filtering: Implement mechanisms to filter or modify responses for safety and appropriateness
  4. Latency Management: Consider the impact of LLM processing time on the overall user experience
  5. Cost Optimization: Implement strategies to minimize token usage and API calls

Amazon Bedrock Integration

# Example Python code for integrating Amazon Bedrock with a Lex bot
import boto3
import json
import os
import time

# Initialize clients
bedrock = boto3.client('bedrock-runtime')
lex = boto3.client('lexv2-runtime')

def lambda_handler(event, context):
    """
    Lambda function that integrates Lex with Bedrock for enhanced conversational capabilities
    """
    print(f"Received event: {json.dumps(event)}")
    
    # Extract user input and session information
    user_input = event.get('inputText', '')
    session_id = event.get('sessionId', f"session_{int(time.time())}")
    session_state = event.get('sessionState', {})
    
    # Step 1: Process with Lex first
    lex_response = process_with_lex(user_input, session_id, session_state)
    
    # Step 2: Determine if we need to use Bedrock based on Lex response
    if should_use_bedrock(lex_response):
        # Step 3: Process with Bedrock
        bedrock_response = process_with_bedrock(user_input, lex_response, session_state)
        
        # Step 4: Return the enhanced response
        return {
            'sessionState': update_session_state(session_state, user_input, lex_response, bedrock_response),
            'messages': [{'content': bedrock_response['message'], 'contentType': 'PlainText'}],
            'requestAttributes': event.get('requestAttributes', {}),
            'sessionId': session_id,
            'source': 'bedrock'
        }
    else:
        # Return the original Lex response
        return {
            'sessionState': update_session_state(session_state, user_input, lex_response, None),
            'messages': lex_response.get('messages', []),
            'requestAttributes': event.get('requestAttributes', {}),
            'sessionId': session_id,
            'source': 'lex'
        }

def process_with_lex(user_input, session_id, session_state):
    """
    Process the user input with Amazon Lex
    """
    try:
        # Pass Lex only the session-state fields it understands; custom keys
        # such as conversationHistory stay in our own session state
        lex_fields = ('dialogAction', 'intent', 'activeContexts',
                      'sessionAttributes', 'originatingRequestId', 'runtimeHints')
        lex_session_state = {k: v for k, v in (session_state or {}).items()
                             if k in lex_fields}

        response = lex.recognize_text(
            botId=os.environ['LEX_BOT_ID'],
            botAliasId=os.environ['LEX_BOT_ALIAS_ID'],
            localeId='en_US',
            sessionId=session_id,
            text=user_input,
            sessionState=lex_session_state
        )
        
        print(f"Lex response: {json.dumps(response)}")
        return response
    except Exception as e:
        print(f"Error processing with Lex: {str(e)}")
        return {
            'messages': [{'content': "I'm having trouble understanding. Could you try again?", 'contentType': 'PlainText'}],
            'sessionState': session_state,
            'error': str(e)
        }

def should_use_bedrock(lex_response):
    """
    Determine if we should use Bedrock based on the Lex response
    """
    session_state = lex_response.get('sessionState', {})

    # Case 0: Never hand off to Bedrock while Lex is still eliciting a slot
    dialog_action = session_state.get('dialogAction', {})
    if dialog_action.get('type') == 'ElicitSlot':
        return False

    # Case 1: No matching intent with high confidence
    interpretations = lex_response.get('interpretations', [])
    if interpretations:
        top_intent = interpretations[0]
        if 'intent' in top_intent and 'nluConfidence' in top_intent:
            # If confidence is below threshold, use Bedrock
            if top_intent['nluConfidence']['score'] < 0.6:
                return True

    # Case 2: Fallback intent was triggered
    intent_name = session_state.get('intent', {}).get('name')
    if intent_name == 'FallbackIntent':
        return True

    # Case 3: Specific intents that benefit from enhancement
    enhance_intents = ['ProvideInformation', 'ExplainConcept', 'AnswerQuestion']
    if intent_name in enhance_intents:
        return True

    # Default: keep the Lex response as-is
    return False

def process_with_bedrock(user_input, lex_response, session_state):
    """
    Process the user input with Amazon Bedrock
    """
    try:
        # Extract conversation history from session state
        conversation_history = session_state.get('conversationHistory', [])
        
        # Prepare the prompt
        prompt = prepare_prompt(user_input, lex_response, conversation_history)
        
        # Select the model to use
        model_id = os.environ.get('BEDROCK_MODEL_ID', 'anthropic.claude-v2')
        
        # Call Bedrock with the appropriate parameters for the selected model
        if 'anthropic.claude' in model_id:
            response = call_claude(prompt, model_id)
        elif 'amazon.titan' in model_id:
            response = call_titan(prompt, model_id)
        else:
            raise ValueError(f"Unsupported model: {model_id}")
        
        return {
            'message': response,
            'model': model_id
        }
    except Exception as e:
        print(f"Error processing with Bedrock: {str(e)}")
        
        # Fall back to Lex response if available, otherwise generic message
        if 'messages' in lex_response and len(lex_response['messages']) > 0:
            fallback_message = lex_response['messages'][0]['content']
        else:
            fallback_message = "I'm having trouble generating a response right now."
        
        return {
            'message': fallback_message,
            'error': str(e)
        }

def prepare_prompt(user_input, lex_response, conversation_history):
    """
    Prepare a prompt for the LLM based on the conversation context
    """
    # Extract intent and slots from Lex response if available
    intent_name = "Unknown"
    slots = {}
    
    if 'sessionState' in lex_response and 'intent' in lex_response['sessionState']:
        intent_name = lex_response['sessionState']['intent']['name']
        slots = lex_response['sessionState']['intent'].get('slots', {})
    
    # Start with system instructions
    system_prompt = """
    You are a helpful assistant that provides accurate, concise, and friendly responses.
    You are part of a hybrid system where some requests are handled by Amazon Lex and others by you.
    Keep your responses conversational but focused on answering the user's question.
    Do not make up information or provide financial, medical, or legal advice.
    If you don't know something, it's okay to say so.
    """
    
    # For Claude models, format the prompt according to their requirements
    prompt = f"\n\nHuman: {system_prompt}\n\nAssistant: I understand my role in the hybrid system. I'll provide helpful, accurate responses while staying within my guidelines.\n\n"
    
    # Add relevant conversation history (last 3 exchanges)
    for exchange in conversation_history[-3:]:
        prompt += f"Human: {exchange['user']}\n\nAssistant: {exchange['assistant']}\n\n"
    
    # Add context about the current intent and slots
    prompt += f"Human: The user's message was: \"{user_input}\"\n\n"
    prompt += f"The system detected intent: \"{intent_name}\" with the following slots: {json.dumps(slots)}\n\n"
    
    # Add specific instructions based on the intent
    if intent_name == "FallbackIntent":
        prompt += "The system couldn't match this to a specific intent. Please provide a helpful response to the user's query.\n\n"
    elif intent_name in ["ProvideInformation", "ExplainConcept", "AnswerQuestion"]:
        prompt += "Please provide a detailed but concise explanation in response to this query.\n\n"
    
    # Add the final instruction
    prompt += "Please respond to the user's message in a helpful, accurate, and conversational way.\n\nAssistant:"
    
    return prompt

def call_claude(prompt, model_id):
    """
    Call Claude model through Bedrock
    """
    response = bedrock.invoke_model(
        modelId=model_id,
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'prompt': prompt,
            'max_tokens_to_sample': 500,
            'temperature': 0.7,
            'top_p': 0.9,
            'stop_sequences': ["\n\nHuman:"]
        })
    )
    
    response_body = json.loads(response['body'].read())
    return response_body['completion'].strip()

def call_titan(prompt, model_id):
    """
    Call Titan model through Bedrock
    """
    response = bedrock.invoke_model(
        modelId=model_id,
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'inputText': prompt,
            'textGenerationConfig': {
                'maxTokenCount': 500,
                'temperature': 0.7,
                'topP': 0.9
            }
        })
    )
    
    response_body = json.loads(response['body'].read())
    return response_body['results'][0]['outputText'].strip()

def update_session_state(session_state, user_input, lex_response, bedrock_response):
    """
    Update the session state with information from both Lex and Bedrock
    """
    updated_state = session_state.copy() if session_state else {}
    
    # Update with Lex session state
    if 'sessionState' in lex_response:
        updated_state.update(lex_response['sessionState'])
    
    # Add conversation history
    conversation_history = updated_state.get('conversationHistory', [])
    
    # Add the current exchange to history (user_input is passed in from the handler)
    
    if bedrock_response:
        assistant_response = bedrock_response['message']
        source = 'bedrock'
    elif 'messages' in lex_response and len(lex_response['messages']) > 0:
        assistant_response = lex_response['messages'][0]['content']
        source = 'lex'
    else:
        assistant_response = "No response generated."
        source = 'unknown'
    
    conversation_history.append({
        'timestamp': int(time.time()),
        'user': user_input,
        'assistant': assistant_response,
        'source': source
    })
    
    # Keep only the last 10 exchanges to manage context size
    updated_state['conversationHistory'] = conversation_history[-10:]
    
    # Add processing metadata
    updated_state['lastProcessed'] = {
        'timestamp': int(time.time()),
        'source': source
    }
    
    return updated_state

Context Management in Hybrid Systems

Effective context management is crucial for creating coherent, natural conversations in hybrid systems that combine multiple technologies.

Types of Context

Several types of context need to be managed in conversational interfaces:

  • Conversation History: Previous exchanges between the user and system
  • User Information: User profile, preferences, and history
  • Session State: Current state of the conversation, including active intents and slots
  • Environmental Context: Time, location, device, and other external factors
  • Application State: State of the application or service the conversation relates to

Context Management Strategies

Several strategies can be used to manage context effectively in hybrid systems:

  1. Centralized Context Store: Maintain a single source of truth for context that all components can access
  2. Context Passing: Pass relevant context between components with each request
  3. Context Summarization: Create concise summaries of context for components with limited context windows
  4. Selective Context Sharing: Share only the relevant portions of context with each component
  5. Context Expiration: Implement policies for when context should expire or be refreshed
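
Selective Context Sharing (Sketch)

As a small illustration of strategies 3 and 4, the sketch below hands each component only the slice of shared context it needs and compresses older history into a summary. The field names mirror the session-state examples that follow and are otherwise illustrative.

# Sketch of selective context sharing: each component receives only the slice
# of shared context it needs, and older history is compressed into a summary.
# Field names mirror the session-state examples below and are illustrative.

def select_context_for(component, context):
    if component == 'lex':
        # Lex only needs its own session state
        return {'lexSessionState': context.get('lexSessionState', {})}

    if component == 'llm':
        # The LLM gets user info plus a trimmed conversation history
        history = context.get('conversationHistory', [])
        return {
            'userProfile': context.get('userProfile', {}),
            'recentHistory': history[-5:],                # last 5 exchanges verbatim
            'summary': summarize_history(history[:-5]),   # older turns, compressed
        }

    # Default: share nothing rather than everything
    return {}

def summarize_history(exchanges):
    # Placeholder summarizer; in practice this could itself be an LLM call
    if not exchanges:
        return ''
    sources = ', '.join(sorted({e.get('source', 'unknown') for e in exchanges}))
    return f"{len(exchanges)} earlier exchanges handled by: {sources}"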

Context Management Example

Centralized Context Store

User Profile
{
  "userId": "user123",
  "name": "Alex",
  "preferences": {
    "language": "en",
    "notifications": true
  },
  "accountType": "premium"
}
Conversation History
[
  {
    "timestamp": "2025-05-24T10:15:30Z",
    "user": "I need to book a flight to London",
    "assistant": "I can help you book a flight to London. When would you like to travel?",
    "source": "lex"
  },
  {
    "timestamp": "2025-05-24T10:15:45Z",
    "user": "Next Friday",
    "assistant": "Great! Would you prefer a morning or evening flight to London next Friday, May 30th?",
    "source": "lex"
  }
]
Session State
{
  "activeIntent": "BookFlight",
  "slots": {
    "Destination": "London",
    "DepartureDate": "2025-05-30",
    "TimeOfDay": null
  },
  "dialogState": "ElicitSlot"
}

Context Flow Between Components

  1. User Input: "I prefer morning flights"
  2. Router: Analyzes the input and context, and decides to continue with Lex (slot filling).
  3. Lex: Fills the "TimeOfDay" slot with "morning" and responds: "I've found 3 morning flights to London on Friday, May 30th."
  4. LLM Enhancer: Receives the context and the Lex response and produces: "Perfect, Alex! I've found 3 morning flights to London next Friday, May 30th. Would you like to see the options sorted by price or departure time?"

Implementing Context Management

Practical approaches to implementing context management include:

  1. Session Attributes: Use Lex session attributes to store and retrieve context
  2. External Storage: Use DynamoDB or other databases for persistent context storage
  3. Context Preprocessing: Format context appropriately for each component
  4. Context Postprocessing: Update context based on component outputs
  5. Context Validation: Ensure context is consistent and valid across components
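
External Context Storage (Sketch)

For approach 2, a minimal DynamoDB-backed context store might look like the sketch below. The table name ConversationContext, its sessionId partition key, and the one-hour TTL are assumptions rather than a prescribed schema.

# Minimal sketch of external context storage with DynamoDB.
# The table name, partition key, and TTL are assumptions; numeric values in
# the context should be Decimal for DynamoDB in real use.
import time
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('ConversationContext')  # hypothetical table

def load_context(session_id):
    # Fetch the stored context for this session, or start fresh
    item = table.get_item(Key={'sessionId': session_id}).get('Item')
    return item['context'] if item else {}

def save_context(session_id, context, ttl_seconds=3600):
    # Persist the updated context; the TTL attribute lets DynamoDB expire
    # stale sessions automatically (context expiration)
    table.put_item(Item={
        'sessionId': session_id,
        'context': context,
        'expiresAt': int(time.time()) + ttl_seconds
    })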

Creating Seamless User Experiences

The ultimate goal of hybrid approaches is to create seamless user experiences that leverage the strengths of different technologies while hiding the complexity from users.

Design Principles for Hybrid Interfaces

Key design principles for creating seamless hybrid conversational experiences:

  1. Consistency: Maintain consistent tone, personality, and capabilities across components
  2. Transparency: Be clear about capabilities and limitations without exposing the underlying technology
  3. Graceful Transitions: Ensure smooth transitions between different technologies
  4. Appropriate Expectations: Set and manage user expectations about what the system can do
  5. Error Recovery: Handle errors gracefully regardless of which component they originate from

Hybrid Conversation Simulator

Experience a seamless conversation that uses multiple technologies behind the scenes. The label after each turn shows which component routed or produced it:

User: "I need to book a flight to Tokyo" [routed to Lex]
Bot (Lex + LLM): "I'd be happy to help you book a flight to Tokyo. When would you like to travel?"
User: "Next month, sometime in the first week" [routed to Lex]
Bot (Lex + LLM): "Great! To help you find the best flights to Tokyo in the first week of June, could you tell me which city you'll be departing from?"
User: "San Francisco. By the way, what's the weather like in Tokyo in June?" [handled by the Router]
Bot (Lex + LLM): "Thanks for letting me know you'll be departing from San Francisco. June in Tokyo is typically warm and humid, with average temperatures ranging from 19°C (66°F) to 25°C (77°F). It's also the start of the rainy season, so you can expect some precipitation. I'd recommend packing light, breathable clothing and a compact umbrella. Now, let's continue with your flight booking. Do you have specific dates within the first week of June that you prefer?"

Testing and Evaluation

Thorough testing and evaluation are essential for hybrid systems:

  1. Component Testing: Test each technology component individually
  2. Integration Testing: Test how components work together
  3. End-to-End Testing: Test complete conversation flows
  4. A/B Testing: Compare different hybrid approaches
  5. User Testing: Gather feedback from real users

Key metrics to evaluate include:

  • Task completion rate
  • Conversation length
  • User satisfaction
  • Error rate
  • Response latency
  • Seamlessness of transitions
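
End-to-End Test (Sketch)

To make these checks repeatable, a minimal end-to-end test can drive a handler that exposes the router-pattern contract shown earlier (userInput, sessionId, and sessionState in; message, sessionState, and source out). The router_lambda module name, the utterances, and the expected routing sources below are all illustrative assumptions.

# Sketch of an end-to-end conversation test for a hybrid handler.
# "router_lambda.handler" is a hypothetical handler exposing the router-pattern
# contract; the utterances and expected sources are illustrative.
from router_lambda import handler  # hypothetical module name

def test_flight_booking_flow():
    session_state = {}
    turns = [
        ("I need to book a flight to London", "lex"),            # transactional pattern
        ("Next Friday", "lex"),                                   # continues the Lex session
        ("Tell me about baggage allowances", "knowledge_base"),   # knowledge-seeking pattern
    ]
    for utterance, expected_source in turns:
        event = {
            "userInput": utterance,
            "sessionId": "test-session-001",
            "sessionState": session_state,
        }
        response = handler(event, None)
        assert response["source"] == expected_source   # routing went where we expected
        assert response["message"]                      # some response text was produced
        session_state = response["sessionState"]        # carry context into the next turn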

Continuous Improvement

Hybrid systems benefit from a continuous improvement approach:

  1. Monitoring: Track performance and user interactions
  2. Analysis: Identify patterns and areas for improvement
  3. Refinement: Adjust routing logic, prompts, and integration points
  4. Expansion: Gradually add new capabilities and technologies
  5. Feedback Loops: Incorporate user feedback into improvements

Knowledge Check: Module 9
