AIP-C01 AWS Certified Generative AI Developer - Professional Questions and Answers

Questions 4

A financial services company needs to build a document analysis system that uses Amazon Bedrock to process quarterly reports. The system must analyze financial data, perform sentiment analysis, and validate compliance across batches of reports. Each batch contains 5 reports. Each report requires multiple foundation model (FM) calls. The solution must finish the analysis within 10 seconds for each batch. Current sequential processing takes 45 seconds for each batch.

Which solution will meet these requirements?

Options:

Use AWS Lambda functions with provisioned concurrency to process each analysis type sequentially. Configure the Lambda function timeouts to 10 seconds. Configure automatic retries with exponential backoff.

Use AWS Step Functions with a Parallel state to invoke separate AWS Lambda functions for each analysis type simultaneously. Configure Amazon Bedrock client timeouts. Use Amazon CloudWatch metrics to track execution time and model inference latency.

Create an Amazon SQS queue to buffer analysis requests. Deploy multiple AWS Lambda functions with reserved concurrency. Configure each Lambda function to process different aspects of each report sequentially and then combine the results.

Deploy an Amazon ECS cluster that runs containers that process each report sequentially. Use a load balancer to distribute batch workloads. Configure an auto-scaling policy based on CPU utilization.

Buy Now

Answer:

Explanation:

Option B is the correct solution because it parallelizes independent foundation model inference tasks while maintaining orchestration, observability, and time-bound execution. AWS Generative AI best practices emphasize reducing end-to-end latency by parallelizing independent inference calls rather than scaling individual calls vertically.

In this scenario, each report requires multiple independent analyses such as financial extraction, sentiment analysis, and compliance validation. These tasks do not depend on each other’s output, making them ideal candidates for parallel execution. AWS Step Functions provides a Parallel state that can invoke multiple AWS Lambda functions simultaneously, drastically reducing total processing time compared to sequential execution.

By invoking Amazon Bedrock from separate Lambda functions in parallel, the system can reduce batch execution time from 45 seconds to well under the 10-second requirement, assuming each inference call remains within acceptable latency bounds. Step Functions also provide built-in error handling, retries, and state tracking, which improves reliability without increasing complexity.

CloudWatch metrics allow teams to monitor both workflow execution time and individual model inference latency, enabling performance tuning and operational visibility. Configuring client-side timeouts ensures that slow or failed model invocations do not block the entire batch.

Option A still processes tasks sequentially and therefore cannot meet the strict latency requirement. Option C introduces queuing delays and sequential processing within each report, which increases total execution time. Option D relies on container-based sequential processing and adds unnecessary operational overhead for a workload that is event-driven and latency-sensitive.

Therefore, Option B best meets the performance, scalability, and operational efficiency requirements for high-speed batch document analysis using Amazon Bedrock.

Questions 5

A university recently digitized a collection of archival documents, academic journals, and manuscripts. The university stores the digital files in an AWS Lake Formation data lake.

The university hires a GenAI developer to build a solution to allow users to search the digital files by using text queries. The solution must return journal abstracts that are semantically similar to a user's query. Users must be able to search the digitized collection based on text and metadata that is associated with the journal abstracts. The metadata of the digitized files does not contain keywords. The solution must match similar abstracts to one another based on the similarity of their text. The data lake contains fewer than 1 million files.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

Use Amazon Titan Embeddings in Amazon Bedrock to create vector representations of the digitized files. Store embeddings in the OpenSearch Neural plugin for Amazon OpenSearch Service.

Use Amazon Comprehend to extract topics from the digitized files. Store the topics and file metadata in an Amazon Aurora PostgreSQL database. Query the abstract metadata against the data in the Aurora database.

Use Amazon SageMaker AI to deploy a sentence-transformer model. Use the model to create vector representations of the digitized files. Store embeddings in an Amazon Aurora PostgreSQL database that has the pgvector extension.

Use Amazon Titan Embeddings in Amazon Bedrock to create vector representations of the digitized files. Store embeddings in an Amazon Aurora PostgreSQL Serverless database that has the pgvector extension.

Buy Now

Answer:

Explanation:

Option D is the best choice because it delivers true semantic search with the smallest operational footprint by combining a fully managed embedding service with an automatically scaling vector-capable database. The university’s requirement is explicitly semantic: the metadata has no keywords, and the system must match abstracts based on similarity of meaning. This is a direct fit for an embeddings-based approach, where each abstract is converted into a vector representation and searched using vector similarity. Amazon Titan Embeddings in Amazon Bedrock provides a managed way to generate these vectors without hosting or maintaining an ML model, eliminating the operational work of model provisioning, patching, scaling, and lifecycle management.

For storage and retrieval, Amazon Aurora PostgreSQL Serverless with the pgvector extension supports vector storage and similarity search while minimizing infrastructure operations. Aurora Serverless reduces capacity planning and scaling tasks because it can automatically adjust to changes in workload, which is valuable for a university search application with variable usage patterns. With fewer than 1 million files, a PostgreSQL-based vector store is commonly operationally simpler than running a dedicated search cluster, while still meeting the requirement to query using both text-derived similarity and associated metadata filters stored alongside the vectors.

Option A can also enable vector search, but operating an OpenSearch domain typically introduces additional concerns such as domain sizing, shard strategy, cluster scaling, and performance tuning for k-NN workloads. Option C increases operational overhead the most because it requires deploying and operating a sentence-transformer model endpoint in SageMaker AI, including scaling, monitoring, and model management. Option B does not meet the semantic similarity requirement reliably because topic extraction is not equivalent to embedding-based semantic matching, especially when the metadata lacks keywords and the system must compare abstracts by meaning.

Therefore, D best satisfies semantic search needs with the least operational overhead.

Questions 6

A company is developing a generative AI (GenAI) application that analyzes customer service calls in real time and generates suggested responses for human customer service agents. The application must process 500,000 concurrent calls during peak hours with less than 200 ms end-to-end latency for each suggestion. The company uses existing architecture to transcribe customer call audio streams. The application must not exceed a predefined monthly compute budget and must maintain auto scaling capabilities.

Which solution will meet these requirements?

Options:

Deploy a large, complex reasoning model on Amazon Bedrock. Purchase provisioned throughput and optimize for batch processing.

Deploy a low-latency, real-time optimized model on Amazon Bedrock. Purchase provisioned throughput and set up automatic scaling policies.

Deploy a large language model (LLM) on an Amazon SageMaker real-time endpoint that uses dedicated GPU instances.

Deploy a mid-sized language model on an Amazon SageMaker serverless endpoint that is optimized for batch processing.

Buy Now

Questions 7

A medical company is creating a generative AI (GenAI) system by using Amazon Bedrock. The system processes data from various sources and must maintain end-to-end data lineage. The system must also use real-time personally identifiable information (PII) filtering and audit trails to automatically report compliance.

Which solution will meet these requirements?

Options:

Use AWS Glue Data Catalog to register all data sources and track lineage. Use Amazon Bedrock Guardrails PII filters. Enable AWS CloudTrail logging for all Amazon Bedrock API calls with Amazon S3 integration. Use Amazon Macie to scan stored data for sensitive information and publish findings to Amazon CloudWatch Logs. Create CloudWatch dashboards to visualize the findings and generate automated compliance reports.

Use AWS Config to track data source configurations and changes. Use AWS WAF with custom rules to filter PII at the application layer before Amazon Bedrock processes the data. Configure Amazon EventBridge to capture and route audit events to Amazon S3. Use Amazon Comprehend Medical with scheduled AWS Lambda functions to analyze stored outputs for compliance violations.

Use AWS DataSync to replicate data sources to track lineage. Configure Amazon Macie to scan Amazon Bedrock outputs for sensitive information. Use AWS Systems Manager Session Manager to log user interactions. Deploy Amazon Textract with AWS Step Functions workflows to identify and redact PII from generated reports.

Configure Amazon Athena to query data sources to analyze and report on data lineage. Use Amazon CloudWatch custom metrics to monitor PII exposure in Amazon Bedrock responses and establish AWS X-Ray tracing to generate an audit trail. Use an Amazon Rekognition Custom Labels model to detect sensitive information in the data that Amazon Bedrock processes.

Buy Now

Answer:

Explanation:

Option A is the most comprehensive and architecturally aligned solution for meeting end-to-end data lineage, real-time PII filtering, and automated compliance reporting requirements in a medical GenAI system built on Amazon Bedrock. Each requirement maps directly to a managed AWS service that is purpose-built for governance, security, and compliance.

AWS Glue Data Catalog is designed to register datasets across multiple sources and maintain metadata that supports lineage tracking. By cataloging all inputs that flow into the Bedrock-based system, the organization can trace how data moves from ingestion through processing and storage, which is essential for regulatory audits in healthcare environments.

For real-time PII filtering, Amazon Bedrock Guardrails provide native PII detection and filtering during model inference. Guardrails operate inline with model invocation, ensuring sensitive information is blocked or redacted before responses are returned to users. This satisfies the requirement for real-time protection rather than post-processing analysis.

AWS CloudTrail delivers a complete audit trail of all Amazon Bedrock API calls, including InvokeModel requests and configuration changes. Storing these logs in Amazon S3 enables long-term retention and supports compliance audits. CloudTrail ensures traceability of who accessed the system, when, and how it was used.

To strengthen compliance monitoring, Amazon Macie continuously scans stored data for sensitive information and automatically classifies findings. Publishing Macie findings to Amazon CloudWatch Logs and visualizing them through dashboards enables near-real-time visibility into compliance posture and supports automated reporting workflows.

The other options fall short. Option B performs PII filtering at the application edge rather than at inference time and relies on scheduled analysis instead of real-time enforcement. Option C focuses on replication and document processing rather than inline GenAI governance. Option D uses services that are not designed for PII detection in text-based GenAI workflows and lacks native lineage tracking.

Therefore, A best fulfills all stated requirements using AWS-recommended governance and security capabilities.

Questions 8

A GenAI developer is evaluating Amazon Bedrock foundation models (FMs) to enhance a Europe-based company's internal business application. The company has a multi-account landing zone in AWS Control Tower. The company uses Service Control Policies (SCPs) to allow its accounts to use only the eu-north-1 and eu-west-1 Regions. All customer data must remain in private networks within the approved AWS Regions.

The GenAI developer selects an FM based on analysis and testing and hosts the model in the eu-central-1 Region and the eu-west-3 Region. The GenAI developer must enable access to the FM for the company's employees. The GenAI developer must ensure that requests to the FM are private and remain within the same Regions as the FM.

Which solution will meet these requirements?

Options:

Deploy an AWS Lambda function that is exposed by a private Amazon API Gateway REST API to a VPC in eu-north-1. Create a VPC endpoint for the selected FM in eu-central-1 and eu-west-3. Extend existing SCPs to allow employees to use the FM. Integrate the REST API with the business application.

Deploy the FM on Amazon EC2 instances in eu-north-1. Deploy a private Amazon API Gateway REST API in front of the EC2 instances. Configure an Amazon Bedrock VPC endpoint. Integrate the REST API with the business application.

Configure the FM to use cross-Region inference through a Europe-scoped endpoint. Configure an Amazon Bedrock VPC endpoint. Extend existing SCPs to allow employees to use the FM through inference profiles in Europe-based Regions where the FM is available. Use an inference profile to integrate Amazon Bedrock with the business application.

Deploy the FM in Amazon SageMaker in eu-north-1. Configure a SageMaker VPC endpoint. Extend existing SCPs to allow employees to use the SageMaker endpoint. Integrate the FM in SageMaker with the business application.

Buy Now

Questions 9

A company is building a multicloud generative AI (GenAI)-powered secret resolution application that uses Amazon Bedrock and Agent Squad. The application resolves secrets from multiple sources, including key stores and hardware security modules (HSMs). The application uses AWS Lambda functions to retrieve secrets from the sources. The application uses AWS AppConfig to implement dynamic feature gating. The application supports secret chaining and detects secret drift. The application handles short-lived and expiring secrets. The application also supports prompt flows for templated instructions. The application uses AWS Step Functions to orchestrate agents to resolve the secrets and to manage secret validation and drift detection.

The company finds multiple issues during application testing. The application does not refresh expired secrets in time for agents to use. The application sends alerts for secret drift, but agents still use stale data. Prompt flows within the application reuse outdated templates, which cause cascading failures. The company must resolve the performance issues.

Which solution will meet this requirement?

Options:

Use Step Functions Map states to run agent workflows in parallel. Pass updated secret metadata through Lambda function outputs. Use AWS AppConfig to version all prompt flows to gate and roll back faulty templates.

Use Amazon Bedrock Agents only. Configure Amazon Bedrock guardrails to restrict prompt variation. Use an inline JSON schema for a single agent’s workflow definition to chain tool calls.

Use a centralized Amazon EventBridge pipeline to invoke each agent. Store intermediate prompts in Amazon DynamoDB. Resolve agent ordering by using TTL-based backoff and retries.

Use Amazon EventBridge Pipes to invoke resolvers based on Amazon CloudWatch log patterns. Store response metadata in DynamoDB with TTL and versioned writes. Use Amazon Q Developer to dynamically generate fallback prompts.

Buy Now

Questions 10

A medical device company wants to feed reports of medical procedures that used the company’s devices into an AI assistant. To protect patient privacy, the AI assistant must expose patient personally identifiable information (PII) only to surgeons. The AI assistant must redact PII for engineers. The AI assistant must reference only medical reports that are less than 3 years old.

The company stores reports in an Amazon S3 bucket as soon as each report is published. The company has already set up an Amazon Bedrock Knowledge Bases. The AI assistant uses Amazon Cognito to authenticate users.

Which solution will meet these requirements?

Options:

Enable Amazon Macie PII detection on the S3 bucket. Use an S3 trigger to invoke an AWS Lambda function that redacts PII from the reports. Configure the Lambda function to delete outdated documents and invoke knowledge base syncing.

Invoke an AWS Lambda function to sync the S3 bucket and the knowledge base when a new report is uploaded. Use a second Lambda function with Amazon Comprehend to redact PII for engineers. Use S3 Lifecycle rules to remove reports older than 3 years.

Set up an S3 Lifecycle configuration to remove reports that are older than 3 years. Schedule an AWS Lambda function to run daily syncs between the bucket and the knowledge base. When users interact with the AI assistant, apply a guardrail configuration selected based on the user’s Cognito user group to redact PII from responses when required.

Create a second knowledge base. Use Lambda and Amazon Comprehend to redact PII before syncing to the second knowledge base. Route users to the appropriate knowledge base based on Cognito group membership.

Buy Now

Questions 11

An elevator service company has developed an AI assistant application by using Amazon Bedrock. The application generates elevator maintenance recommendations to support the company’s elevator technicians. The company uses Amazon Kinesis Data Streams to collect the elevator sensor data.

New regulatory rules require that a human technician must review all AI-generated recommendations. The company needs to establish human oversight workflows to review and approve AI recommendations. The company must store all human technician review decisions for audit purposes.

Which solution will meet these requirements?

Options:

Create a custom approval workflow by using AWS Lambda functions and Amazon SQS queues for human review of AI recommendations. Store all review decisions in Amazon DynamoDB for audit purposes.

Create an AWS Step Functions workflow that has a human approval step that uses the waitForTaskToken API to pause execution. After a human technician completes a review, use an AWS Lambda function to call the SendTaskSuccess API with the approval decision. Store all review decisions in Amazon DynamoDB.

Create an AWS Glue workflow that has a human approval step. After the human technician review, integrate the application with an AWS Lambda function that calls the SendTaskSuccess API. Store all human technician review decisions in Amazon DynamoDB.

Configure Amazon EventBridge rules with custom event patterns to route AI recommendations to human technicians for review. Create AWS Glue jobs to process human technician approval queues. Use Amazon ElastiCache to cache all human technician review decisions.

Buy Now

Questions 12

A company has deployed an AI assistant as a React application that uses AWS Amplify, an AWS AppSync GraphQL API, and Amazon Bedrock Knowledge Bases. The application uses the GraphQL API to call the Amazon Bedrock RetrieveAndGenerate API for knowledge base interactions. The company configures an AWS Lambda resolver to use the RequestResponse invocation type.

Application users report frequent timeouts and slow response times. Users report these problems more frequently for complex questions that require longer processing.

The company needs a solution to fix these performance issues and enhance the user experience.

Which solution will meet these requirements?

Options:

Use AWS Amplify AI Kit to implement streaming responses from the GraphQL API and to optimize client-side rendering.

Increase the timeout value of the Lambda resolver. Implement retry logic with exponential backoff.

Update the application to send an API request to an Amazon SQS queue. Update the AWS AppSync resolver to poll and process the queue.

Change the RetrieveAndGenerate API to the InvokeModelWithResponseStream API. Update the application to use an Amazon API Gateway WebSocket API to support the streaming response.

Buy Now

Answer:

Explanation:

Option A is the best solution because it directly addresses both observed problems: user-perceived latency and resolver timeouts that occur more frequently for complex prompts. In the current design, an AWS AppSync Lambda resolver is configured with synchronous RequestResponse behavior. That means the client receives nothing until the entire retrieval and generation workflow completes. For longer-running knowledge base queries, this increases the likelihood of hitting request time limits in the synchronous path and creates a poor user experience because the UI appears stalled.

Using AWS Amplify AI Kit to implement streaming responses allows the application to return partial output incrementally as the model produces tokens. This improves perceived responsiveness because users can see the answer forming immediately, even when the full response takes longer. Streaming also reduces the impact of variable model latency and retrieval time because the client no longer waits for a single final payload before rendering content. From a troubleshooting perspective, streaming makes it easier to distinguish “slow generation” from “no response,” and it provides faster feedback during testing of complex questions.

Option B is not sufficient because increasing timeouts and adding retries can worsen load and cost while still producing a stalled UI experience. Retries also risk duplicating requests to the knowledge base and can amplify token usage. Option C introduces an awkward polling model for GraphQL interactions and adds significant operational complexity, while not inherently improving interactivity. Option D adds major architectural changes by replacing the knowledge base RetrieveAndGenerate call path with a different streaming invocation API and introducing a WebSocket layer, which is unnecessary when the goal is primarily to fix timeouts and improve UX within the existing AppSync and Amplify design.

Therefore, streaming through Amplify AI Kit is the most effective and lowest-friction improvement.

Thought for 24s

Questions 13

An ecommerce company is developing a generative AI application that uses Amazon Bedrock with Anthropic Claude to recommend products to customers. Customers report that some recommended products are not available for sale on the website or are not relevant to the customer. Customers also report that the solution takes a long time to generate some recommendations.

The company investigates the issues and finds that most interactions between customers and the product recommendation solution are unique. The company confirms that the solution recommends products that are not in the company’s product catalog. The company must resolve these issues.

Which solution will meet this requirement?

Options:

Increase grounding within Amazon Bedrock Guardrails. Enable Automated Reasoning checks. Set up provisioned throughput.

Use prompt engineering to restrict the model responses to relevant products. Use streaming techniques such as the InvokeModelWithResponseStream action to reduce perceived latency for the customers.

Create an Amazon Bedrock knowledge base. Implement Retrieval Augmented Generation RAG. Set the PerformanceConfigLatency parameter to optimized.

Store product catalog data in Amazon OpenSearch Service. Validate the model’s product recommendations against the product catalog. Use Amazon DynamoDB to implement response caching.

Buy Now

Answer:

Explanation:

Option C best addresses both core problems: hallucinated recommendations that do not exist in the catalog and slow response times, while keeping operational overhead low. The most direct way to prevent the model from recommending unavailable products is to ground generation on authoritative product catalog data at inference time. An Amazon Bedrock knowledge base is designed for this pattern by ingesting domain data, chunking content, creating embeddings, and retrieving the most relevant catalog entries when a user asks for recommendations. Implementing Retrieval Augmented Generation ensures the foundation model receives only approved, catalog-backed context and can cite or base its output on those retrieved items. This sharply reduces the likelihood of inventing products, because the response is conditioned on retrieved catalog records rather than relying on the model’s parametric memory.

The requirement also notes that most interactions are unique. That makes response caching far less effective, because there are fewer repeated prompts to benefit from cached outputs. Instead, improving the retrieval and model invocation path is the better optimization. Using the PerformanceConfigLatency parameter set to optimized prioritizes lower latency behavior for model inference, helping meet faster recommendation generation without requiring the company to build and operate additional infrastructure.

The other options do not solve the root cause as reliably. Prompt engineering and streaming can improve perceived latency, but they do not guarantee catalog-only recommendations because the model can still hallucinate items. Guardrails can help detect or block certain undesired outputs, but without consistent catalog grounding they do not ensure every recommendation is derived from the company’s product data. Building a custom OpenSearch validation and caching layer increases operational complexity, and caching is misaligned with predominantly unique interactions.

Questions 14

A financial services company uses an AI application to process financial documents by using Amazon Bedrock. During business hours, the application handles approximately 10,000 requests each hour, which requires consistent throughput.

The company uses the CreateProvisionedModelThroughput API to purchase provisioned throughput. Amazon CloudWatch metrics show that the provisioned capacity is unused while on-demand requests are being throttled. The company finds the following code in the application:

response = bedrock_runtime.invoke_model(

modelId="anthropic.claude-v2",

body=json.dumps(payload)

)

The company needs the application to use the provisioned throughput and to resolve the throttling issues.

Which solution will meet these requirements?

Options:

Increase the number of model units (MUs) in the provisioned throughput configuration.

Replace the model ID parameter with the ARN of the provisioned model that the CreateProvisionedModelThroughput API returns.

Add exponential backoff retry logic to handle throttling exceptions during peak hours.

Modify the application to use the invokeModelWithResponseStream API instead of the invokeModel API.

Buy Now

Questions 15

A company is building a generative AI (GenAI) application that uses Amazon Bedrock APIs to process complex customer inquiries. During peak usage periods, the application experiences intermittent API timeouts that cause issues such as broken response chunks and delayed data delivery. The application struggles to ensure that prompts remain within token limits when handling complex customer inquiries of varying lengths. Users have reported truncated inputs and incomplete responses. The company has also observed foundation model (FM) invocation failures.

The company needs a retry strategy that automatically handles transient service errors and prevents overwhelming Amazon Bedrock during peak usage periods. The strategy must also adapt to changing service availability and support response streaming and token-aware request handling.

Which solution will meet these requirements?

Options:

Implement a standard retry strategy that uses a 1-second fixed delay between attempts and a 3-retry maximum for all errors. Handle streaming response timeouts by restarting streams. Cap token usage for each session.

Implement an adaptive retry strategy that uses exponential backoff with jitter and a circuit breaker pattern that temporarily disables retries when error rates exceed a predefined threshold. Implement a streaming response handler that monitors for chunk delivery timeouts. Configure the handler to buffer successfully received chunks and intelligently resume streaming from the last received chunk when connections are re-established.

Use the AWS SDK to configure a retry strategy in standard mode. Wrap Amazon Bedrock API calls in try-catch blocks that handle timeout exceptions. Return cached completions for failed streaming requests. Enforce a global token limit for all users. Add jitter-based retry logic and lightweight token trimming for each request. Resume broken streams by requesting only missing chunks from the point of failure. Maintain a small in-memory buffer of

Set Amazon Bedrock client request timeouts to 30 seconds. Implement client-side load shedding. Buffer partial results and stop new requests when application performance degrades. Set static token usage caps for all requests. Configure exponential backoff retries, dynamic chunk sizing, and context-aware token limits.

Buy Now

Answer:

Explanation:

Option B best meets all requirements because it combines AWS-recommended resiliency patterns for transient failures with streaming-aware handling and adaptive protection against cascading retries during peak load. When timeouts and throttling occur, naïve retries can amplify traffic and worsen outages. Exponential backoff with jitter is the standard AWS best practice because it spreads retry attempts over time, reduces synchronized retry storms, and lowers the probability of repeatedly colliding with service limits.

The requirement also states the strategy must “adapt to changing service availability” and “prevent overwhelming Amazon Bedrock.” A circuit breaker pattern directly addresses this by temporarily stopping or reducing retries when failure rates exceed a threshold, allowing the system to degrade gracefully instead of continually hammering the service. This is a key mechanism to prevent cascading failures during throttling events.

Because the application uses response streaming and experiences broken chunks, the retry strategy must be streaming-aware. A streaming response handler that detects chunk delivery timeouts and buffers already received chunks prevents the user from losing progress when a connection drops. Resuming from the last successfully received chunk minimizes redundant generation and reduces additional load on the model compared with restarting the entire stream. This supports better user experience and better service efficiency during intermittent failures.

Token-aware request handling is supported in this architecture because the application can apply token budgeting before invoking the model (for example, trimming or summarizing excessive context) while still preserving streaming output behavior. Option B provides the correct backbone for this by focusing on adaptive control and robust streaming recovery.

Option A is too simplistic and risks retry storms. Option C combines conflicting elements (global token limit, cached completions for streaming) and includes impractical “request only missing chunks” behavior that is not a reliable property of streamed generative output. Option D includes useful ideas (load shedding) but relies on static caps and does not provide as strong adaptive retry control as circuit breaking.

Therefore, Option B is the most correct and operationally safe strategy for peak-load Bedrock streaming workloads.

Questions 16

An enterprise application uses an Amazon Bedrock foundation model (FM) to process and analyze 50 to 200 pages of technical documents. Users are experiencing inconsistent responses and receiving truncated outputs when processing documents that exceed the FM's context window limits.

Which solution will resolve this problem?

Options:

Configure fixed-size chunking at 4,000 tokens for each chunk with 20% overlap. Use application-level logic to link multiple chunks sequentially until the FM's maximum context window of 200,000 tokens is reached before making inference calls.

Use hierarchical chunking with parent chunks of 8,000 tokens and child chunks of 2,000 tokens. Use Amazon Bedrock Knowledge Bases built-in retrieval to automatically select relevant parent chunks based on query context. Configure overlap tokens to maintain semantic continuity.

Use semantic chunking with a breakpoint percentile threshold of 95% and a buffer size of 3 sentences. Use the RetrieveAndGenerate API to dynamically select the most relevant chunks based on embedding similarity scores.

Create a pre-processing AWS Lambda function that analyzes document token count by using the FM's tokenizer. Configure the Lambda function to split documents into equal segments that fit within 80% of the context window. Configure the Lambda function to process each segment independently before aggregating the results.

Buy Now

Questions 17

python

response = bedrock_runtime.invoke_model(modelId="anthropic.claude-v2", body=json.dumps(payload))

The company needs the application to use the provisioned throughput and to resolve the throttling issues.

Which solution will meet these requirements?

Options:

Increase the number of model units (MUs) in the provisioned throughput configuration.

Replace the model ID parameter with the ARN of the provisioned model that the CreateProvisionedModelThroughput API returns.

Add exponential backoff retry logic to handle throttling exceptions during peak hours.

Modify the application to use the InvokeModelWithResponseStream API instead of the InvokeModel API.

Buy Now

Answer:

Explanation:

Option B is correct because the application is currently invoking the base foundation model identifier, which routes traffic to the on-demand capacity pool rather than the company’s purchased provisioned throughput. In Amazon Bedrock, provisioned throughput is attached to a specific provisioned resource created through the provisioned throughput APIs. To consume that reserved capacity, inference requests must target the provisioned resource identifier that represents the purchased throughput, not the generic model identifier used for on-demand inference.

The code snippet uses modelId="anthropic.claude-v2". This value selects the on-demand endpoint for that model. As a result, requests are subject to on-demand quotas and throttling behavior, while the provisioned throughput remains idle. This directly explains the CloudWatch observation: provisioned capacity metrics show unused capacity because no traffic is being directed to the provisioned resource, and the on-demand path is throttling because it is exceeding the applicable on-demand limits during peak volume.

Replacing the modelId value with the provisioned throughput ARN returned by the CreateProvisionedModelThroughput workflow ensures the runtime invocation is routed to the reserved capacity. Once traffic is directed correctly, the purchased model units provide the consistent throughput required for predictable performance during business hours, which is exactly why provisioned throughput is used.

Option A could increase capacity, but it does not fix the core issue that the application is not using the provisioned resource at all. Option C can reduce the impact of throttling temporarily, but it adds latency and does not guarantee consistent throughput; it also still wastes the provisioned capacity. Option D changes the response delivery mechanism, but throttling is a capacity routing and quota issue, not a streaming API issue.

Questions 18

A company uses AWS Lake Formation to set up a data lake that contains databases and tables for multiple business units across multiple AWS Regions. The company wants to use a foundation model (FM) through Amazon Bedrock to perform fraud detection. The FM must ingest sensitive financial data from the data lake. The data includes some customer personally identifiable information (PII).

The company must design an access control solution that prevents PII from appearing in a production environment. The FM must access only authorized data subsets that have PII redacted from specific data columns. The company must capture audit trails for all data access.

Which solution will meet these requirements?

Options:

Create a separate dataset in a separate Amazon S3 bucket for each business unit and Region combination. Configure S3 bucket policies to control access based on IAM roles that are assigned to FM training instances. Use S3 access logs to track data access.

Configure the FM to authenticate by using AWS Identity and Access Management roles and Lake Formation permissions based on LF-Tag expressions. Define business units and Regions as LF-Tags that are assigned to databases and tables. Use AWS CloudTrail to collect comprehensive audit trails of data access.

Use direct IAM principal grants on specific databases and tables in Lake Formation. Create a custom application layer that logs access requests and further filters sensitive columns before sending data to the FM.

Configure the FM to request temporary credentials from AWS Security Token Service. Access the data by using presigned S3 URLs that are generated by an API that applies business unit and Regional filters. Use AWS CloudTrail to collect comprehensive audit trails of data access.

Buy Now

Questions 19

A financial technology company is using Amazon Bedrock to build an assessment system for the company’s customer service AI assistant. The AI assistant must provide financial recommendations that are factually accurate, compliant with financial regulations, and conversationally appropriate. The company needs to combine automated quality evaluations at scale with targeted human reviews of critical interactions.

What solution will meet these requirements?

Options:

Configure a pipeline in which financial experts manually score all responses for accuracy, compliance, and conversational quality. Use Amazon SageMaker notebooks to analyze results to identify improvement areas.

Configure Amazon Bedrock evaluations that use Anthropic Claude Sonnet as a judge model to assess response accuracy and appropriateness. Configure custom Amazon Bedrock guardrails to check responses for compliance with financial policies. Add Amazon Augmented AI (Amazon A2I) human reviews for flagged critical interactions.

Create an Amazon Lex bot to manage customer service interactions. Configure AWS Lambda functions to check responses against a static compliance database. Configure intents that call the Lambda functions. Add an additional intent to collect end-user reviews.

Configure Amazon CloudWatch to monitor response patterns from the AI assistant. Configure CloudWatch alerts for potential compliance violations. Establish a team of human evaluators to review flagged interactions.

Buy Now

Answer:

Explanation:

Option B meets the requirement to combine scalable automated evaluation with targeted human oversight using managed AWS GenAI capabilities. Amazon Bedrock evaluations enable systematic, repeatable quality assessment across large volumes of interactions. Using an LLM-as-a-judge approach with a strong evaluator model such as Anthropic Claude Sonnet allows the company to automatically score outputs for dimensions like factual accuracy, conversational appropriateness, and policy alignment. This directly supports “automated quality evaluations at scale” without building custom scoring models.

However, financial recommendations add higher risk because regulatory compliance requires additional enforcement beyond general quality scoring. Amazon Bedrock guardrails provide a dedicated policy enforcement layer that can block or intervene when responses violate compliance constraints. Guardrails are particularly important for preventing disallowed financial guidance patterns and ensuring consistent behavior across deployments.

The requirement also calls for “targeted human reviews of critical interactions.” Amazon Augmented AI (A2I) is a managed human review service that supports routing specific items to human reviewers based on rules or confidence thresholds. In this design, the system can automatically send only high-risk or policy-flagged interactions to qualified financial experts for review, keeping human effort focused where it matters most while maintaining scale.

Option A is not scalable because it requires manual review of all responses. Option C relies on static rules and end-user feedback, which is insufficient for regulatory compliance and factual accuracy assurance. Option D provides monitoring but not structured quality evaluation or policy enforcement.

Therefore, Option B provides the most complete, AWS-aligned solution for scalable evaluation plus human oversight in a regulated financial context.

Questions 20

A company upgraded its Amazon Bedrock–powered foundation model (FM) that supports a multilingual customer service assistant. After the upgrade, the assistant exhibited inconsistent behavior across languages. The assistant began generating different responses in some languages when presented with identical questions.

The company needs a solution to detect and address similar problems for future updates. The evaluation must be completed within 45 minutes for all supported languages. The evaluation must process at least 15,000 test conversations in parallel. The evaluation process must be fully automated and integrated into the CI/CD pipeline. The solution must block deployment if quality thresholds are not met.

Which solution will meet these requirements?

Options:

Create a distributed traffic simulation framework that sends translation-heavy workloads to the assistant in multiple languages simultaneously. Use Amazon CloudWatch metrics to monitor latency, concurrency, and throughput. Run simulations before production releases to identify infrastructure bottlenecks.

Deploy the assistant in multiple AWS Regions with Amazon Route 53 latency-based routing and AWS Global Accelerator to improve global performance. Store multilingual conversation logs in Amazon S3. Perform weekly post-deployment audits to review consistency.

Create a pre-processing pipeline that normalizes all incoming messages into a consistent format before sending the messages to the assistant. Apply rule-based checks to flag potential hallucinations in the outputs. Focus evaluation on normalized text to simplify testing across languages.

Set up standardized multilingual test conversations with identical meaning. Run the test conversations in parallel by using Amazon Bedrock model evaluation jobs. Apply similarity and hallucination thresholds. Integrate the process into the CI/CD pipeline to block releases that fail.

Buy Now

Answer:

Explanation:

Option D is the correct solution because it directly evaluates multilingual output consistency and quality in an automated, scalable, and deployment-gating workflow. Amazon Bedrock model evaluation jobs are designed to run large-scale, repeatable evaluations against defined datasets and to produce quantitative metrics that can be used as objective release criteria.

The core issue is semantic inconsistency across languages for equivalent inputs. The most reliable way to detect this is to create standardized test conversations where each language version expresses the same intent and constraints. Running those tests through the updated model and comparing results with similarity metrics (for example, semantic similarity between expected and actual answers, or between language variants) surfaces regressions that infrastructure testing cannot detect.

Bedrock evaluation jobs support running evaluations at scale and are well suited for processing large datasets quickly. By parallelizing evaluation runs across languages and conversations, the company can meet the 45-minute requirement while executing at least 15,000 conversations. Because the process is standardized, it also allows consistent baseline comparisons across releases.

Applying hallucination thresholds ensures that answers remain grounded and do not introduce fabricated details, which is particularly important when language-specific behavior shifts after a model upgrade. Integrating evaluation jobs into the CI/CD pipeline enables fully automated execution on every model or configuration update. The pipeline can enforce a hard quality gate that blocks deployment if thresholds are not met, preventing regressions from reaching production.

Option A focuses on performance and infrastructure bottlenecks, not multilingual response quality. Option B is post-deployment and too slow to prevent regressions. Option C normalizes inputs but does not measure multilingual output equivalence or provide robust, quantitative gating.

Therefore, Option D best meets the automation, scale, timing, and deployment-blocking requirements.

Questions 21

A book publishing company wants to build a book recommendation system that uses an AI assistant. The AI assistant will use ML to generate a list of recommended books from the company's book catalog. The system must suggest books based on conversations with customers.

The company stores the text of the books, customers' and editors' reviews of the books, and extracted book metadata in Amazon S3. The system must support low-latency responses and scale efficiently to handle more than 10,000 concurrent users.

Which solution will meet these requirements?

Options:

Use Amazon Bedrock Knowledge Bases to generate embeddings. Store the embeddings as a vector store in Amazon OpenSearch Service. Create an AWS Lambda function that queries the knowledge base. Configure Amazon API Gateway to invoke the Lambda function when handling user requests.

Use Amazon Bedrock Knowledge Bases to generate embeddings. Store the embeddings as a vector store in Amazon DynamoDB. Create an AWS Lambda function that queries the knowledge base. Configure Amazon API Gateway to invoke the Lambda function when handling user requests.

Use Amazon SageMaker AI to deploy a pre-trained model to build a personalized recommendation engine for books. Deploy the model as a SageMaker AI endpoint. Invoke the model endpoint by using Amazon API Gateway.

Create an Amazon Kendra GenAI Enterprise Edition index that uses the S3 connector to index the book catalog data stored in Amazon S3. Configure built-in FAQ in the Kendra index. Develop an AWS Lambda function that queries the Kendra index based on user conversations. Deploy Amazon API Gateway to expose this functionality and invoke the Lambda function.

Buy Now

Answer:

Explanation:

Option A best meets the requirements because it directly implements a Retrieval Augmented Generation pattern for conversational recommendations using managed Amazon Bedrock capabilities and a scalable vector store. The company’s source data already resides in Amazon S3, which aligns naturally with Amazon Bedrock Knowledge Bases ingestion workflows. A knowledge base can ingest book text, reviews, and metadata, generate embeddings using a supported embedding model, and persist those vectors in a purpose-built vector backend such as Amazon OpenSearch Service. This enables semantic retrieval that is well suited to conversation-driven intent, where user prompts are often descriptive and do not map cleanly to keyword filters.

The requirement to suggest books based on conversations implies the system must interpret natural language context and retrieve relevant passages, reviews, and metadata to ground the recommendation. Knowledge Bases provide managed orchestration for embedding creation and retrieval, which reduces development effort compared to building custom embedding pipelines. OpenSearch Service provides scalable vector search and k-nearest neighbors style similarity retrieval, which supports low-latency responses when properly indexed and sized.

For scaling to more than 10,000 concurrent users, the API layer design in option A is a common AWS pattern: Amazon API Gateway provides a managed front door with throttling and request handling, while AWS Lambda scales horizontally with demand and can invoke the knowledge base retrieval operations. This separates compute scaling from the vector store scaling and helps keep latency predictable under load.

Option B is not the best choice because DynamoDB is not the standard native vector store target for Amazon Bedrock Knowledge Bases in this context and would introduce additional implementation complexity around vector indexing and similarity search behavior. Option C requires substantial ML lifecycle work, model hosting, tuning, and continuous iteration to achieve quality recommendations at scale. Option D provides strong enterprise search, but it focuses on retrieval and FAQs rather than a managed RAG recommendation workflow grounded in embeddings and conversational context for generative responses.

Questions 22

A company has a customer service application that uses Amazon Bedrock to generate personalized responses to customer inquiries. The company needs to establish a quality assurance process to evaluate prompt effectiveness and model configurations across updates. The process must automatically compare outputs from multiple prompt templates, detect response quality issues, provide quantitative metrics, and allow human reviewers to give feedback on responses. The process must prevent configurations that do not meet a predefined quality threshold from being deployed.

Which solution will meet these requirements?

Options:

Create an AWS Lambda function that sends sample customer inquiries to multiple Amazon Bedrock model configurations and stores responses in Amazon S3. Use Amazon QuickSight to visualize response patterns. Manually review outputs daily. Use AWS CodePipeline to deploy configurations that meet the quality threshold.

Use Amazon Bedrock evaluation jobs to compare model outputs by using custom prompt datasets. Configure AWS CodePipeline to run the evaluation jobs when prompt templates change. Configure CodePipeline to deploy only configurations that exceed the predefined quality threshold.

Set up Amazon CloudWatch alarms to monitor response latency and error rates from Amazon Bedrock. Use Amazon EventBridge rules to notify teams when thresholds are exceeded. Configure a manual approval workflow in AWS Systems Manager.

Use AWS Lambda functions to create an automated testing framework that samples production traffic and routes duplicate requests to the updated model version. Use Amazon Comprehend sentiment analysis to compare results. Block deployment if sentiment scores decrease.

Buy Now

Questions 23

A medical company is building a generative AI (GenAI) application that uses Retrieval Augmented Generation (RAG) to provide evidence-based medical information. The application uses Amazon OpenSearch Service to retrieve vector embeddings. Users report that searches frequently miss results that contain exact medical terms and acronyms and return too many semantically similar but irrelevant documents. The company needs to improve retrieval quality and maintain low end-user latency, even as the document collection grows to millions of documents.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

Configure hybrid search by combining vector similarity with keyword matching to improve semantic understanding and exact term and acronym matching.

Increase the dimensions of the vector embeddings from 384 to 1536. Use a post-processing AWS Lambda function to filter out irrelevant results after retrieval.

Replace OpenSearch Service with Amazon Kendra. Use query expansion to handle medical acronyms and terminology variants during pre-processing.

Implement a two-stage retrieval architecture in which initial vector search results are re-ranked by an ML model hosted on Amazon SageMaker.

Buy Now

Questions 24

A media company must use Amazon Bedrock to implement a robust governance process for AI-generated content. The company needs to manage hundreds of prompt templates. Multiple teams use the templates across multiple AWS Regions to generate content. The solution must provide version control with approval workflows that include notifications for pending reviews. The solution must also provide detailed audit trails that document prompt activities and consistent prompt parameterization to enforce quality standards.

Which solution will meet these requirements?

Options:

Configure Amazon Bedrock Studio prompt templates. Use Amazon CloudWatch dashboards to display prompt usage metrics. Store approval status in Amazon DynamoDB. Use AWS Lambda functions to enforce approvals.

Use Amazon Bedrock Prompt Management to implement version control. Configure AWS CloudTrail for audit logging. Use AWS Identity and Access Management policies to control approval permissions. Create parameterized prompt templates by specifying variables.

Use AWS Step Functions to create an approval workflow. Store prompts in Amazon S3. Use tags to implement version control. Use Amazon EventBridge to send notifications.

Deploy Amazon SageMaker Canvas with prompt templates stored in Amazon S3. Use AWS CloudFormation for version control. Use AWS Config to enforce approval policies.

Buy Now

Questions 25

A healthcare company uses Amazon Bedrock to deploy an application that generates summaries of clinical documents. The application experiences inconsistent response quality with occasional factual hallucinations. Monthly costs exceed the company’s projections by 40%. A GenAI developer must implement a near real-time monitoring solution to detect hallucinations, identify abnormal token consumption, and provide early warnings of cost anomalies. The solution must require minimal custom development work and maintenance overhead.

Which solution will meet these requirements?

Options:

Configure Amazon CloudWatch alarms to monitor InputTokenCount and OutputTokenCount metrics to detect anomalies. Store model invocation logs in an Amazon S3 bucket. Use AWS Glue and Amazon Athena to identify potential hallucinations.

Run Amazon Bedrock evaluation jobs that use LLM-based judgments to detect hallucinations. Configure Amazon CloudWatch to track token usage. Create an AWS Lambda function to process CloudWatch metrics. Configure the Lambda function to send usage pattern notifications.

Configure Amazon Bedrock to store model invocation logs in an Amazon S3 bucket. Enable text output logging. Configure Amazon Bedrock guardrails to run contextual grounding checks to detect hallucinations. Create Amazon CloudWatch anomaly detection alarms for token usage metrics.

Use AWS CloudTrail to log all Amazon Bedrock API calls. Create a custom dashboard in Amazon QuickSight to visualize token usage patterns. Use Amazon SageMaker Model Monitor to detect quality drift in generated summaries.

Buy Now

Questions 26

A pharmaceutical company is developing a Retrieval Augmented Generation (RAG) application that uses an Amazon Bedrock knowledge base. The knowledge base uses Amazon OpenSearch Service as a data source for more than 25 million scientific papers. Users report that the application produces inconsistent answers that cite irrelevant sections of papers when queries span methodology, results, and discussion sections of the papers.

The company needs to improve the knowledge base to preserve semantic context across related paragraphs on the scale of the entire corpus of data.

Which solution will meet these requirements?

Options:

Configure the knowledge base to use fixed-size chunking. Set a 300-token maximum chunk size and a 10% overlap between chunks. Use an appropriate Amazon Bedrock embedding model.

Configure the knowledge base to use hierarchical chunking. Use parent chunks that contain 1,000 tokens and child chunks that contain 200 tokens. Set a 50-token overlap between chunks.

Configure the knowledge base to use semantic chunking. Use a buffer size of 1 and a breakpoint percentile threshold of 85% to determine chunk boundaries based on content meaning.

Configure the knowledge base not to use chunking. Manually split each document into separate files before ingestion. Apply post-processing reranking during retrieval.

Buy Now

Answer:

Explanation:

Option B is the best solution because hierarchical chunking is specifically designed to preserve broader semantic context while still enabling precise retrieval at paragraph or sub-paragraph granularity. The problem described—answers citing irrelevant sections when a query spans multiple paper sections—often occurs when chunks are either too small (losing cross-paragraph context) or too “flat” (retrieving isolated snippets without their surrounding rationale).

In a scientific paper, related information is frequently distributed across methodology, results, and discussion. Flat, fixed-size chunking (Option A) can split these logically connected ideas into separate chunks, causing retrieval to surface fragments that match a term but not the full intent. Semantic chunking (Option C) improves boundary placement, but it does not inherently provide a multi-resolution structure that helps preserve section-level continuity at massive scale.

Hierarchical chunking solves this by creating parent chunks (larger context windows) that capture broader section context and child chunks (smaller units) that retain retrieval precision. When the retriever identifies relevant child chunks, it can also bring in the associated parent context so the foundation model sees the surrounding methodological or discussion framing. The defined overlaps further reduce the risk that key transitions or references are split across chunks.

This approach is well suited for a corpus of 25 million papers because it improves relevance without requiring a custom reranking model or a manual preprocessing pipeline. It remains operationally efficient because it is configured at the knowledge base level rather than implemented through custom code per document.

Option D introduces high operational complexity and inconsistent document handling at scale. Therefore, Option B best meets the requirement to preserve semantic context across related paragraphs and improve citation relevance across scientific paper sections.

Questions 27

A specialty coffee company has a mobile app that generates personalized coffee roast profiles by using Amazon Bedrock with a three-stage prompt chain. The prompt chain converts user inputs into structured metadata, retrieves relevant logs for coffee roasts, and generates a personalized roast recommendation for each customer.

Users in multiple AWS Regions report inconsistent roast recommendations for identical inputs, slow inference during the retrieval step, and unsafe recommendations such as brewing at excessively high temperatures. The company must improve the stability of outputs for repeated inputs. The company must also improve app performance and the safety of the app's outputs. The updated solution must ensure 99.5% output consistency for identical inputs and achieve inference latency of less than 1 second. The solution must also block unsafe or hallucinated recommendations by using validated safety controls.

Which solution will meet these requirements?

Options:

Deploy Amazon Bedrock with provisioned throughput to stabilize inference latency. Apply Amazon Bedrock guardrails that have semantic denial rules to block unsafe outputs. Use Amazon Bedrock Prompt Management to manage prompts by using approval workflows.

Use Amazon Bedrock Agents to manage chaining. Log model inputs and outputs to Amazon CloudWatch Logs. Use logs from Amazon CloudWatch to perform A/B testing for prompt versions.

Cache prompt results in Amazon ElastiCache. Use AWS Lambda functions to pre-process metadata and to trace end-to-end latency. Use AWS X-Ray to identify and remediate performance bottlenecks.

Use Amazon Kendra to improve roast log retrieval accuracy. Store normalized prompt metadata within Amazon DynamoDB. Use AWS Step Functions to orchestrate multi-step prompts.

Buy Now

Answer:

Explanation:

Option A best meets the combined requirements of low latency, stability, and validated safety controls by using purpose-built Amazon Bedrock features designed for production GenAI operations. The company’s latency target of under 1 second and its observation of degradation during spikes strongly indicate capacity and throughput variability. Provisioned throughput for Amazon Bedrock is intended to deliver more predictable performance by reserving inference capacity for a chosen model, reducing throttling risk and stabilizing response times under load. This directly improves operational consistency across Regions where on-demand capacity can vary.

The requirement to “block unsafe or hallucinated recommendations” is most directly addressed by Amazon Bedrock Guardrails. Guardrails provide managed safety enforcement, including sensitive information controls and configurable content policies. Using semantic denial rules enables the application to prevent unsafe guidance such as dangerous brewing temperatures or other harmful procedural instructions, enforcing safety at the model boundary rather than relying on downstream filtering.

The remaining requirement is “99.5% output consistency for identical inputs.” While generative models can be probabilistic, production systems achieve practical consistency by controlling prompt versions, inputs, and policy behavior. Amazon Bedrock Prompt Management supports controlled prompt lifecycle practices, including versioning and approval workflows, which reduce unintended drift across deployments and Regions. By ensuring the same approved prompt templates and parameters are used consistently, the company can materially improve repeatability for the same structured inputs and retrieval context, which is essential in multi-stage prompt chains.

The other options are incomplete. B improves experimentation and observability but does not enforce safety controls or stabilize latency. C can improve performance, but it does not provide validated safety enforcement at inference time. D can help retrieval relevance, but it does not address unsafe outputs or inference stability. Therefore, A is the only option that simultaneously targets predictable latency, governance of prompt behavior, and strong safety controls within Amazon Bedrock.

Questions 28

A healthcare company is using Amazon Bedrock to develop a real-time patient care AI assistant to respond to queries for separate departments that handle clinical inquiries, insurance verification, appointment scheduling, and insurance claims. The company wants to use a multi-agent architecture.

The company must ensure that the AI assistant is scalable and can onboard new features for patients. The AI assistant must be able to handle thousands of parallel patient interactions. The company must ensure that patients receive appropriate domain-specific responses to queries.

Which solution will meet these requirements?

Options:

Isolate data for each agent by using separate knowledge bases. Use IAM filtering to control access to each knowledge base. Deploy a supervisor agent to perform natural language intent classification on patient inquiries. Configure the supervisor agent to route queries to specialized collaborator agents to respond to department-specific queries. Configure each specialized collaborator agent to use Retrieval Augmented Generation (RAG) with th

Create a separate supervisor agent for each department. Configure individual collaborator agents to perform natural language intent classification for each specialty domain within each department. Integrate each collaborator agent with department-specific knowledge bases only. Implement manual handoff processes between the supervisor agents.

Isolate data for each department in separate knowledge bases. Use IAM filtering to control access to each knowledge base. Deploy a single general-purpose agent. Configure multiple action groups within the general-purpose agent to perform specific department functions. Implement rule-based routing logic within the general-purpose agent instructions.

Implement multiple independent supervisor agents that run in parallel to respond to patient inquiries for each department. Configure multiple collaborator agents for each supervisor agent. Integrate all agents with the same knowledge base. Use external routing logic to merge responses from multiple supervisor agents.

Buy Now

Answer:

Explanation:

Option A is the most appropriate design because it provides scalable multi-agent orchestration, clear domain separation, and strong governance with minimal operational complexity. A supervisor-agent pattern is a standard AWS-recommended approach for multi-agent systems: one agent performs intent classification and routing, while specialized agents handle domain-specific tasks.

Isolating data with separate knowledge bases ensures that each specialized collaborator agent retrieves only the information relevant to its department. This improves response accuracy, reduces hallucinations, and supports privacy controls because clinical content, claims content, and scheduling content can have different access policies. IAM-based filtering ensures that each agent has permission only to the knowledge base it is authorized to use.

Routing patient inquiries through a supervisor agent supports high concurrency and extensibility. New departments or features can be added by introducing new collaborator agents and knowledge bases without redesigning the entire system. Because routing is handled centrally, changes in classification logic do not require updates across many independent supervisors.

Using RAG within each collaborator agent ensures that responses are grounded in department-approved information sources, which is critical in healthcare settings to reduce unsafe or incorrect guidance. This approach also improves performance because each retrieval scope is smaller and more relevant, supporting thousands of parallel interactions.

Option B introduces manual handoffs that do not scale. Option C relies on rule-based routing inside one general agent, which becomes brittle and difficult to govern as complexity grows. Option D mixes all departments into a single knowledge base and merges responses externally, increasing risk of incorrect domain answers and operational overhead.

Therefore, Option A best meets the scalability, correctness, and multi-agent onboarding requirements.

Questions 29

A company needs a system to automatically generate study materials from multiple content sources. The content sources include document files (PDF files, PowerPoint presentations, and Word documents) and multimedia files (recorded videos). The system must process more than 10,000 content sources daily with peak loads of 500 concurrent uploads. The system must also extract key concepts from document files and multimedia files and create contextually accurate summaries. The generated study materials must support real-time collaboration with version control.

Which solution will meet these requirements?

Options:

Use Amazon Bedrock Data Automation (BDA) with AWS Lambda functions to orchestrate document file processing. Use Amazon Bedrock Knowledge Bases to process all multimedia. Store the content in Amazon DocumentDB with replication. Collaborate by using Amazon SNS topic subscriptions. Track changes by using Amazon Bedrock Agents.

Use Amazon Bedrock Data Automation (BDA) with foundation models (FMs) to process document files. Integrate BDA with Amazon Textract for PDF extraction and with Amazon Transcribe for multimedia files. Store the processed content in Amazon S3 with versioning enabled. Store the metadata in Amazon DynamoDB. Collaborate in real time by using AWS AppSync GraphQL subscriptions and DynamoDB.

Use Amazon Bedrock Data Automation (BDA) with Amazon SageMaker AI endpoints to host content extraction and summarization models. Use Amazon Bedrock Guardrails to extract content from all file types. Store document files in Amazon Neptune for time series analysis. Collaborate by using Amazon Bedrock Chat for real-time messaging.

Use Amazon Bedrock Data Automation (BDA) with AWS Lambda functions to process batches of content files. Fine-tune foundation models (FMs) in Amazon Bedrock to classify documents across all content types. Store the processed data in Amazon ElastiCache (Redis OSS) by using Cluster Mode with sharding. Use Prompt management in Amazon Bedrock for version control.

Buy Now

Answer:

Explanation:

Option B best fulfills all functional, scalability, and collaboration requirements by combining purpose-built AWS services with Amazon Bedrock capabilities. Amazon Bedrock Data Automation is designed to orchestrate large-scale, multimodal data processing pipelines and integrates naturally with foundation models for summarization and concept extraction. Using BDA to process document files ensures consistent preprocessing and model invocation at scale, which is essential for handling more than 10,000 sources per day with high concurrency.

Integrating Amazon Textract for PDFs enables accurate extraction of structured and unstructured text from scanned and digital documents, while Amazon Transcribe is the appropriate service for converting recorded videos into text for downstream semantic analysis. These services are optimized for their respective media types and feed clean, normalized inputs into Bedrock foundation models, improving the quality of contextual summaries.

Storing processed content in Amazon S3 with versioning enabled directly addresses the requirement for version control. S3 versioning provides immutable object history and rollback capabilities without additional complexity. Metadata storage in Amazon DynamoDB supports high-throughput, low-latency access patterns and scales automatically to handle peak upload concurrency.

Real-time collaboration is achieved through AWS AppSync GraphQL subscriptions combined with DynamoDB. AppSync enables real-time updates to connected clients whenever study materials are created or modified, making it well suited for collaborative editing and live synchronization. DynamoDB streams integrate seamlessly with AppSync to propagate changes efficiently.

The other options misuse services or fail to meet key requirements. Amazon SNS does not support collaborative state synchronization, Amazon DocumentDB is not optimized for versioned document storage, Amazon Neptune is unsuitable for document-centric workloads, and Amazon ElastiCache is not designed for durable storage or version control. Option B aligns with AWS best practices for scalable, multimodal generative AI systems built on Amazon Bedrock.

Questions 30

A wildlife conservation agency operates zoos globally. The agency uses various sensors, trackers, and audiovisual recorders to monitor animal behavior. The agency wants to launch a generative AI (GenAI) assistant that can ingest multimodal data to study animal behavior.

The GenAI assistant must support natural language queries, avoid speculative behavioral interpretations, and maintain audit logs for ethical research audits.

Which solution will meet these requirements?

Options:

Ingest raw videos into Amazon Rekognition to detect animal postures and expressions. Use Amazon Data Firehose to stream sensor and GPS data into Amazon S3. Prompt an Amazon Bedrock FM using basic templates stored in AWS Systems Manager Parameter Store. Use IAM for access control. Use AWS CloudTrail for audit logging.

Use Amazon SageMaker Processing and Amazon Transcribe to pre-process multimodal data. Ingest curated summaries into an Amazon Bedrock Knowledge Bases. Apply Amazon Bedrock guardrails to restrict speculative outputs. Use AWS AppConfig to manage prompt templates. Use AWS CloudTrail to log research activity for audits.

Use Amazon OpenSearch Serverless to index behavioral logs and telemetry. Use Amazon Comprehend to extract entities. Use Amazon Bedrock to answer questions over indexed data. Use IAM for access control and CloudTrail for audit logging.

Configure Amazon O Business to federate data across Amazon S3, Amazon Kinesis, and Amazon SageMaker Feature Store. Use EventBridge for ingestion orchestration. Use custom AWS Lambda functions to filter LLM outputs for ethical compliance.

Buy Now

Questions 31

A company developed a multimodal content analysis application by using Amazon Bedrock. The application routes different content types (text, images, and code) to specialized foundation models (FMs).

The application needs to handle multiple types of routing decisions. Simple routing based on file extension must have minimal latency. Complex routing based on content semantics requires analysis before FM selection. The application must provide detailed history and support fallback options when primary FMs fail.

Which solution will meet these requirements?

Options:

Configure AWS Lambda functions that call Amazon Bedrock FMs for all routing logic. Use conditional statements to determine the appropriate FM based on content type and semantics.

Create a hybrid solution. Handle simple routing based on file extensions in application code. Handle complex content-based routing by using an AWS Step Functions state machine with JSONata for content analysis and the InvokeModel API for specialized FMs.

Deploy separate AWS Step Functions workflows for each content type with routing logic in AWS Lambda functions. Use Amazon EventBridge to coordinate between workflows when fallback to alternate FMs is required.

Use Amazon SQS with different SQS queues for each content type. Configure AWS Lambda consumers that analyze content and invoke appropriate FMs based on message attributes by using Amazon Bedrock with an AWS SDK.

Buy Now

Answer:

Explanation:

Option B is the most appropriate solution because it directly aligns with AWS-recommended architectural patterns for building scalable, observable, and resilient generative AI applications on Amazon Bedrock. The requirements clearly distinguish between simple and complex routing decisions, and this option addresses both in an optimal way.

Simple routing based on file extension is latency sensitive. Handling this logic directly in the application code avoids unnecessary orchestration, state transitions, and service calls. This approach ensures that straightforward requests, such as routing images to vision-capable foundation models or text files to language models, are processed with minimal overhead and maximum performance.

For complex routing based on content semantics, AWS Step Functions is specifically designed for multi-step workflows that require analysis, branching logic, and error handling. Semantic routing often requires inspecting meaning, intent, or structure before selecting the appropriate foundation model. Step Functions enables this by orchestrating analysis steps and applying conditional logic to determine the correct model to invoke using the Amazon Bedrock InvokeModel API.

A key requirement is detailed execution history. Step Functions provides built-in execution tracing, including state inputs, outputs, and error details, which is essential for auditing, debugging, and compliance. Additionally, Step Functions supports native retry and catch mechanisms, allowing the workflow to automatically fall back to alternate foundation models if a primary model invocation fails. This directly satisfies the fallback requirement without introducing excessive custom code.

The other options lack one or more critical capabilities. Lambda-only logic lacks deep observability and structured fallback handling, SQS introduces additional latency and limited workflow visibility, and multiple coordinated workflows increase architectural complexity without added benefit.

Questions 32

A retail company has a generative AI (GenAI) product recommendation application that uses Amazon Bedrock. The application suggests products to customers based on browsing history and demographics. The company needs to implement fairness evaluation across multiple demographic groups to detect and measure bias in recommendations between two prompt approaches. The company wants to collect and monitor fairness metrics in real time. The company must receive an alert if the fairness metrics show a discrepancy of more than 15% between demographic groups. The company must receive weekly reports that compare the performance of the two prompt approaches.

Which solution will meet these requirements with the LEAST custom development effort?

Options:

Configure an Amazon CloudWatch dashboard to display default metrics from Amazon Bedrock API calls. Create custom metrics based on model outputs. Set up Amazon EventBridge rules to invoke AWS Lambda functions that perform post-processing analysis on model responses and publish custom fairness metrics.

Create the two prompt variants in Amazon Bedrock Prompt Management. Use Amazon Bedrock Flows to deploy the prompt variants with defined traffic allocation. Configure Amazon Bedrock guardrails to monitor demographic fairness. Set up Amazon CloudWatch alarms on the GuardrailContentSource dimension by using InvocationsIntervened metrics to detect recommendation discrepancy threshold violations.

Set up Amazon SageMaker Clarify to analyze model outputs. Publish fairness metrics to Amazon CloudWatch. Create CloudWatch composite alarms that combine SageMaker Clarify bias metrics with Amazon Bedrock latency metrics.

Create an Amazon Bedrock model evaluation job to compare fairness between the two prompt variants. Enable model invocation logging in Amazon CloudWatch. Set up CloudWatch alarms for InvocationsIntervened metrics with a dimension for each demographic group.

Buy Now

Exam Code: AIP-C01

Exam Name: AWS Certified Generative AI Developer - Professional

Last Update: Feb 21, 2026

Questions: 107

AIP-C01 PDF

$25.5 ~~$84.99~~

Add to Cart

AIP-C01 Testing Engine

$30 ~~$99.99~~

Add to Cart

AIP-C01 PDF + Testing Engine

$40.5 ~~$134.99~~

Add to Cart

Spring Sale 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: clap70

clapgeek logo

AIP-C01 AWS Certified Generative AI Developer - Professional Questions and Answers

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer: