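I Tried Using Amazon Comprehend and Amazon Translate to Convert Messages to English Before Chatting with Amazon Q #AWSreInvent
The conversational Q&A feature of the recently announced Amazon Q is very handy for troubleshooting on AWS.
However, the Amazon Q documentation includes the following note: it can respond in multiple languages, but it gives better answers in English.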
Amazon Q can respond in multiple languages. However, Amazon Q performs optimally for English language conversations and interactions.
Only English language documents are supported for indexing.
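So in this article, I used Amazon Translate to convert messages to English beforehand, so that you can chat flexibly with Amazon Q in languages other than English!
I want to support input languages other than Japanese as well, so I use Amazon Comprehend to detect the language code and then Amazon Translate to convert the text to English.
Also, because the language has to be converted before the message reaches Amazon Q, this article uses Slack as the interface to Amazon Q.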
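As a rough picture of the flow described above, here is a minimal sketch. The three dependency types and the name askInUserLanguage are hypothetical and exist only for illustration; they stand in for the Amazon Comprehend call, the Amazon Translate call, and the Amazon Q chat call, and are not part of amazon-q-slack-gateway.

```typescript
// Hypothetical dependency types (illustration only, not from amazon-q-slack-gateway).
type DetectLanguage = (text: string) => Promise<string>;
type TranslateText = (text: string, source: string, target: string) => Promise<string>;
type AskQ = (englishPrompt: string) => Promise<string>;

// Detect the user's language, ask Amazon Q in English,
// then translate the English answer back into the user's language.
export const askInUserLanguage = async (
  userText: string,
  deps: { detect: DetectLanguage; translate: TranslateText; ask: AskQ }
): Promise<string> => {
  const languageCode = await deps.detect(userText);
  const prompt = await deps.translate(userText, languageCode, 'en');
  const englishAnswer = await deps.ask(prompt);
  return deps.translate(englishAnswer, 'en', languageCode);
};
```

Passing the three calls in as parameters is just a convenience for this sketch: it lets the flow be tried with stub functions before any AWS resources exist.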
Rather than writing the code from scratch, I build on amazon-q-slack-gateway.
This sample requires Amazon Q, a deployment environment, and a Slack App to be set up.
For this test, I add the following PDF file as a data source for the Amazon Q Application.
In the Amazon Q Application console, click Add data source.
Upload the PDF file from Upload docs.
Changing the IAM Role permissions
Add permissions for Amazon Translate and Amazon Comprehend.
role: new Role(this, `${suffix}-Role`, {
  assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
  managedPolicies: [
    ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaVPCAccessExecutionRole')
  ],
  inlinePolicies: {
    SecretManagerPolicy: new PolicyDocument({
      statements: [
        new PolicyStatement({
          actions: ['secretsmanager:GetSecretValue'],
          resources: [slackSecret.secretArn]
        })
      ]
    }),
    // Added for this article: allow calls to Amazon Translate and Amazon Comprehend
    TranslatePolicy: new PolicyDocument({
      statements: [
        new PolicyStatement({
          actions: ['translate:TranslateText', 'comprehend:DetectDominantLanguage'],
          resources: ['*']
        })
      ]
    }),
    DynamoDBPolicy: new PolicyDocument({
      statements: [
        new PolicyStatement({
          actions: ['dynamodb:DeleteItem', 'dynamodb:PutItem', 'dynamodb:GetItem'],
          resources: [dynamoCache.tableArn, messageMetadata.tableArn]
        })
      ]
    }),
    ChatPolicy: new PolicyDocument({
      statements: [
        new PolicyStatement({
          actions: ['qbusiness:ChatSync', 'qbusiness:PutFeedback'],
          // parametrized
          resources: [`arn:aws:qbusiness:*:*:application/${env.AmazonQAppId}`]
        })
      ]
    })
  }
})
comprehend.ts detects the language code. If no language code can be detected, the text is treated as Japanese.
import * as AWS from 'aws-sdk';

// Create the Amazon Comprehend client
export const client = new AWS.Comprehend();

// Detect the dominant language of the text.
// Falls back to Japanese ("ja") when no language code is returned.
export const getLanguage = async (Text: string): Promise<string> => {
  const params = { Text };
  const response = await client.detectDominantLanguage(params).promise();
  if (
    !response.Languages ||
    response.Languages.length === 0 ||
    response.Languages[0].LanguageCode == undefined
  ) {
    return 'ja';
  }
  return response.Languages[0].LanguageCode;
};
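DetectDominantLanguage also returns a Score for each candidate language, so the fallback could take confidence into account as well. The following is only a sketch of that idea: the helper name pickLanguage and the 0.5 threshold are my own assumptions, not something the code in this article uses.

```typescript
// Shape of one entry in the Languages array returned by DetectDominantLanguage.
interface DetectedLanguage {
  LanguageCode?: string;
  Score?: number;
}

// Hypothetical helper: return the best-scoring language code, or the fallback
// when nothing was detected or the best score is below minScore.
export const pickLanguage = (
  languages: DetectedLanguage[] | undefined,
  fallback = 'ja',
  minScore = 0.5 // assumed threshold, tune it for your own inputs
): string => {
  const best = (languages ?? [])
    .filter(
      (l): l is Required<DetectedLanguage> =>
        l.LanguageCode !== undefined && l.Score !== undefined
    )
    .sort((a, b) => b.Score - a.Score)[0];
  return best !== undefined && best.Score >= minScore ? best.LanguageCode : fallback;
};
```

With a helper like this, getLanguage could return pickLanguage(response.Languages) instead of reading the first element directly. Short Slack messages tend to give the detector little to work with, which is the case a threshold is meant to catch.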
translate.ts translates the text based on the language codes passed in.
import * as AWS from 'aws-sdk';

// Create the Amazon Translate client
export const client = new AWS.Translate();

export const Translate = async (text: string, source: string, target: string) => {
  // Set the translation parameters
  const params = {
    Text: text,
    SourceLanguageCode: source, // source language
    TargetLanguageCode: target // target language
  };
  // Run the translation
  const res = await client.translateText(params).promise();
  return res.TranslatedText;
};
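When the user already writes in English, the source and target languages are the same and the call to Amazon Translate adds nothing. A small wrapper could skip it. This is a sketch under my own assumptions: translateIfNeeded is a hypothetical name, and the translate function is passed in as a parameter so the example stays self-contained.

```typescript
type TranslateFn = (text: string, source: string, target: string) => Promise<string>;

// Hypothetical wrapper: only call the translator when there is something to translate.
export const translateIfNeeded = async (
  translate: TranslateFn,
  text: string,
  source: string,
  target: string
): Promise<string> => {
  // Same language on both sides, or nothing but whitespace: return the input unchanged.
  if (source === target || text.trim() === '') {
    return text;
  }
  return translate(text, source, target);
};
```

In the handler this would be called as translateIfNeeded(Translate, inputMessage, languageCode, 'en'), and the same wrapper could be used for the translation back to the user's language.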
Slack Event Handler
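Next, wire the functions created above into slack-event-handler.ts.
Also, at line 238 of that file, the English text returned by Amazon Q is translated back into the original language using the same language code.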
export const handler = async (
  event: {
    body: string;
    headers: { [key: string]: string | undefined };
  },
  _context: Context,
  _callback: Callback,
  dependencies = {
    ...chatDependencies,
    validateSlackRequest
  },
  slackEventsEnv: SlackEventsEnv = processSlackEventsEnv(process.env)
): Promise<APIGatewayProxyResult> => {
  logger.debug(`Received event: ${JSON.stringify(event)}`);
  logger.debug(`dependencies ${JSON.stringify(dependencies)}`);

  if (isEmpty(event.body)) {
    return {
      statusCode: 400,
      body: JSON.stringify({ error: 'Bad request' })
    };
  }

  // You would want to ensure that this method is always here before you start parsing the request
  // For extra safety it is recommended to have a Synthetic test (aka Canary) via AWS that will
  // Call this method with an invalid signature and verify that the status code is 403
  // You can define a CDK construct for it.
  if (!(await dependencies.validateSlackRequest(event.headers, event.body, slackEventsEnv))) {
    logger.warn(`Invalid request`);
    return {
      statusCode: 403,
      body: JSON.stringify({ error: 'Forbidden' })
    };
  }

  const body = JSON.parse(event.body);
  logger.debug(`Received message body ${JSON.stringify(body)}`);

  // Read why it is needed: https://api.slack.com/events/url_verification
  if (!isEmpty(body.challenge)) {
    return {
      statusCode: 200,
      body: body.challenge
    };
  }

  if (!isEmpty(event.headers['X-Slack-Retry-Reason'])) {
    const retry_reason = event.headers['X-Slack-Retry-Reason'];
    const retry_num = event.headers['X-Slack-Retry-Num'];
    logger.debug(
      `Ignoring retry event (avoid duplicate bot requests): Retry-Reason '${retry_reason}', Retry-Num '${retry_num}'`
    );
    return {
      statusCode: 200,
      body: JSON.stringify({
        error: `Ignoring retry event: Retry-Reason '${retry_reason}', Retry-Num '${retry_num}'`
      })
    };
  }

  // handle message and threads with app_mention
  if (!['message', 'app_mention'].includes(body.event.type) || isEmpty(body.event.client_msg_id)) {
    console.log(`Ignoring type: ${body.type}`);
    return {
      statusCode: 200,
      body: JSON.stringify({
        error: `Unsupported body type ${body.type}`
      })
    };
  }

  if (isEmpty(body.event.channel) || isEmpty(body.event.text)) {
    return {
      statusCode: 200,
      body: JSON.stringify({
        error: `No channel or text to response from`
      })
    };
  }

  const channelKey = getChannelKey(
    body.event.type,
    body.team_id,
    body.event.channel,
    body.event.event_ts,
    body.event.thread_ts
  );

  const channelMetadata = await getChannelMetadata(channelKey, dependencies, slackEventsEnv);
  logger.debug(
    `ChannelKey: ${channelKey}, Cached channel metadata: ${JSON.stringify(channelMetadata)} `
  );

  const context = {
    conversationId: channelMetadata?.conversationId,
    parentMessageId: channelMetadata?.systemMessageId
  };

  let attachments: Attachment[] = [];
  const input = [];
  const userInformationCache: Record<string, UsersInfoResponse> = {};
  const stripMentions = (text?: string) => text?.replace(/<@[A-Z0-9]+>/g, '').trim();

  // retrieve and cache user info
  if (isEmpty(userInformationCache[body.event.user])) {
    userInformationCache[body.event.user] = await dependencies.getUserInfo(
      slackEventsEnv,
      body.event.user
    );
  }

  if (isEmpty(slackEventsEnv.AMAZON_Q_USER_ID)) {
    // Use slack user email as Q UserId
    const userEmail = userInformationCache[body.event.user].user?.profile?.email;
    slackEventsEnv.AMAZON_Q_USER_ID = userEmail;
    logger.debug(
      `User's email (${userEmail}) used as Amazon Q userId, since AmazonQUserId is empty.`
    );
  }

  if (!isEmpty(body.event.thread_ts)) {
    const threadHistory = await dependencies.retrieveThreadHistory(
      slackEventsEnv,
      body.event.channel,
      body.event.thread_ts
    );

    if (threadHistory.ok && !isEmpty(threadHistory.messages)) {
      const promptConversationHistory = [];
      // The last message in the threadHistory result is also the current message, so
      // to avoid duplicating chatHistory with the current message we skip the
      // last element in threadHistory message array.
      for (const m of threadHistory.messages.slice(0, -1)) {
        if (isEmpty(m.user)) {
          continue;
        }

        if (m.text === FEEDBACK_MESSAGE) {
          continue;
        }

        if (isEmpty(userInformationCache[m.user])) {
          userInformationCache[m.user] = await dependencies.getUserInfo(slackEventsEnv, m.user);
        }

        promptConversationHistory.push({
          name: userInformationCache[m.user].user?.real_name,
          message: stripMentions(m.text),
          date: !isEmpty(m.ts) ? new Date(Number(m.ts) * 1000).toISOString() : undefined
        });

        if (!isEmpty(m.files)) {
          attachments.push(...(await attachFiles(slackEventsEnv, m.files)));
        }
      }

      if (promptConversationHistory.length > 0) {
        // We clear the history and start a new conversation because we inject the context in the prompt
        context.conversationId = undefined;
        context.parentMessageId = undefined;

        input.push(
          `Given the following conversation thread history in JSON:\n${JSON.stringify(
            promptConversationHistory
          )}`
        );
      }
    }
  }

  input.push(stripMentions(body.event.text));
  const inputMessage = input.join(`\n${'-'.repeat(10)}\n`);

  // Detect the language of the input and translate the prompt to English
  const languageCode = await getLanguage(inputMessage);
  const prompt = await Translate(inputMessage, languageCode, 'en');

  // attach files (if any) from current message
  if (!isEmpty(body.event.files)) {
    attachments.push(...(await attachFiles(slackEventsEnv, body.event.files)));
  }

  // Limit file attachments to the last MAX_FILE_ATTACHMENTS
  if (attachments.length > MAX_FILE_ATTACHMENTS) {
    logger.debug(
      `Too many attached files (${attachments.length}). Attaching the last ${MAX_FILE_ATTACHMENTS} files.`
    );
    attachments = attachments.slice(-MAX_FILE_ATTACHMENTS);
  }

  const [output, slackMessage] = await Promise.all([
    chat(prompt, attachments, dependencies, slackEventsEnv, context),
    dependencies.sendSlackMessage(
      slackEventsEnv,
      body.event.channel,
      `Processing...`,
      [getMarkdownBlock(`Processing...`)],
      body.event.type === 'app_mention' ? body.event.ts : undefined
    )
  ]);

  if (output instanceof Error) {
    const errMsgWithDetails = `${ERROR_MSG}\n_${output.message}_`;
    const blocks = [getMarkdownBlock(errMsgWithDetails)];

    await dependencies.updateSlackMessage(slackEventsEnv, slackMessage, errMsgWithDetails, blocks);

    return {
      statusCode: 200,
      body: JSON.stringify({
        chat: { context, input, output, blocks },
        error: output
      })
    };
  }

  if (!isEmpty(output.failedAttachments)) {
    // Append error message for failed attachments to systemMessage
    const fileErrorMessages = [];
    for (const f of output.failedAttachments) {
      if (f.status === 'FAILED') {
        logger.debug(`Failed attachment: File ${f.name} - ${f.error.errorMessage}`);
        fileErrorMessages.push(` \u2022 ${f.name}: ${f.error.errorMessage}`);
      }
    }
    if (!isEmpty(fileErrorMessages)) {
      output.systemMessage = `${
        output.systemMessage
      }\n\n*_Failed attachments:_*\n${fileErrorMessages.join('\n')}`;
    }
  }

  const blocks = [
    ...dependencies.getResponseAsBlocks(output),
    ...dependencies.getFeedbackBlocks(output)
  ];

  // Translate the answer back into the original language
  const systemMessage = await Translate(output.systemMessage, 'en', languageCode);
  const transOutput = {
    ...output,
    systemMessage: systemMessage
  };

  await Promise.all([
    saveChannelMetadata(
      channelKey,
      output.conversationId,
      output.systemMessageId,
      dependencies,
      slackEventsEnv
    ),
    saveMessageMetadata(output, dependencies, slackEventsEnv),
    dependencies.updateSlackMessage(
      slackEventsEnv,
      slackMessage,
      transOutput.systemMessage,
      dependencies.getResponseAsBlocks(transOutput)
    )
  ]);

  await dependencies.sendSlackMessage(
    slackEventsEnv,
    body.event.channel,
    FEEDBACK_MESSAGE,
    dependencies.getFeedbackBlocks(output),
    body.event.type === 'app_mention' ? body.event.ts : undefined
  );

  return {
    statusCode: 200,
    body: JSON.stringify({
      chat: { context, prompt, output, blocks }
    })
  };
};
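One thing to watch when translating the answer back: the reply from Amazon Q can contain inline code and links, and sending those through a translator may alter them. One possible approach is to swap such spans for placeholders before translation and restore them afterwards. The sketch below is only an illustration of that idea. The names maskProtected and unmaskProtected and the [[n]] placeholder format are hypothetical, it only covers single-line inline code and angle-bracket links, and whether Amazon Translate leaves the placeholders untouched is an assumption you would need to verify.

```typescript
interface MaskedText {
  text: string; // text with protected spans replaced by [[n]] placeholders
  tokens: string[]; // the original spans, indexed by n
}

// Inline code (`...`) and angle-bracket links (<https://...>) are kept out of translation.
const PROTECTED_SPANS = /`[^`\n]+`|<https?:\/\/[^>]+>/g;

// Replace each protected span with a numbered placeholder.
export const maskProtected = (text: string): MaskedText => {
  const tokens: string[] = [];
  const masked = text.replace(PROTECTED_SPANS, (match) => {
    tokens.push(match);
    return `[[${tokens.length - 1}]]`;
  });
  return { text: masked, tokens };
};

// Put the original spans back; unknown placeholders are left as they are.
export const unmaskProtected = (text: string, tokens: string[]): string =>
  text.replace(/\[\[(\d+)\]\]/g, (whole, index: string) => tokens[Number(index)] ?? whole);
```

In the handler, this would mean masking output.systemMessage, passing the masked text to Translate, and calling unmaskProtected on the result before updating the Slack message.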
In this article, I looked at how to use AWS services so that Amazon Q can give proper answers even when the question is asked in Japanese!