
    PwC and AWS Build Responsible AI with Automated Reasoning on Amazon Bedrock
    This post presents how AWS and PwC are developing new solutions that combine PwC's deep industry expertise with Automated Reasoning checks in Amazon Bedrock Guardrails to support innovation.  ( 19 min )
    How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM
    In this post, Amazon shares how they developed a multi-node inference solution for Rufus, their generative AI shopping assistant, using AWS Trainium chips and vLLM to serve large language models at scale. The solution combines a leader/follower orchestration model, hybrid parallelism strategies, and a multi-node inference unit abstraction layer built on Amazon ECS to deploy models across multiple nodes while maintaining high performance and reliability.  ( 20 min )
    Build an intelligent financial analysis agent with LangGraph and Strands Agents
    This post describes an approach of combining three powerful technologies to illustrate an architecture that you can adapt and build upon for your specific financial analysis needs: LangGraph for workflow orchestration, Strands Agents for structured reasoning, and Model Context Protocol (MCP) for tool integration.  ( 23 min )
    Amazon Bedrock AgentCore Memory: Building context-aware agents
    In this post, we explore Amazon Bedrock AgentCore Memory, a fully managed service that enables AI agents to maintain both immediate and long-term knowledge, transforming one-off conversations into continuous, evolving relationships between users and AI agents. The service eliminates complex memory infrastructure management while providing full control over what AI agents remember, offering powerful capabilities for maintaining both short-term working memory and long-term intelligent memory across sessions.  ( 23 min )
    Build a conversational natural language interface for Amazon Athena queries using Amazon Nova
    In this post, we explore an innovative solution that uses Amazon Bedrock Agents, powered by Amazon Nova Lite, to create a conversational interface for Athena queries. We use AWS Cost and Usage Reports (AWS CUR) as an example, but this solution can be adapted for other databases you query using Athena. This approach democratizes data access while preserving the powerful analytical capabilities of Athena, so you can interact with your data using natural language.  ( 22 min )


    Train and deploy AI models at trillion-parameter scale with Amazon SageMaker HyperPod support for P6e-GB200 UltraServers
    In this post, we review the technical specifications of P6e-GB200 UltraServers, discuss their performance benefits, and highlight key use cases. We then walk through how to purchase UltraServer capacity through flexible training plans and get started using UltraServers with SageMaker HyperPod.  ( 18 min )
    How Indegene’s AI-powered social intelligence for life sciences turns social media conversations into insights
    This post explores how Indegene’s Social Intelligence Solution uses advanced AI to help life sciences companies extract valuable insights from digital healthcare conversations. Built on AWS technology, the solution addresses the growing preference of healthcare professionals (HCPs) for digital channels while overcoming the challenges of analyzing complex medical discussions at scale.  ( 26 min )
    Unlocking enhanced legal document review with Lexbe and Amazon Bedrock
    In this post, Lexbe, a legal document review software company, demonstrates how they integrated Amazon Bedrock and other AWS services to transform their document review process, enabling legal professionals to instantly query and extract insights from vast volumes of case documents using generative AI. Through collaboration with AWS, Lexbe achieved significant improvements in recall rates, reaching up to 90% by December 2024, and developed capabilities for broad human-style reporting and deep automated inference across multiple languages.  ( 19 min )
    Automate AIOps with SageMaker Unified Studio Projects, Part 2: Technical implementation
    In this post, we focus on implementing this architecture with step-by-step guidance and reference code. We provide a detailed technical walkthrough that addresses the needs of two critical personas in the AI development lifecycle: the administrator who establishes governance and infrastructure through automated templates, and the data scientist who uses SageMaker Unified Studio for model development without managing the underlying infrastructure.  ( 24 min )
    Automate AIOps with Amazon SageMaker Unified Studio projects, Part 1: Solution architecture
    This post presents architectural strategies and a scalable framework that helps organizations manage multi-tenant environments, automate consistently, and embed governance controls as they scale their AI initiatives with SageMaker Unified Studio.  ( 24 min )


    Demystifying Amazon Bedrock Pricing for a Chatbot Assistant
    In this post, we'll look at Amazon Bedrock pricing through the lens of a practical, real-world example: building a customer service chatbot. We'll break down the essential cost components, walk through capacity planning for a mid-sized call center implementation, and provide detailed pricing calculations across different foundation models.  ( 20 min )
    Fine-tune OpenAI GPT-OSS models on Amazon SageMaker AI using Hugging Face libraries
    Released on August 5, 2025, OpenAI’s GPT-OSS models, gpt-oss-20b and gpt-oss-120b, are now available on AWS through Amazon SageMaker AI and Amazon Bedrock. In this post, we walk through the process of fine-tuning a GPT-OSS model in a fully managed training environment using SageMaker AI training jobs.  ( 24 min )

    SRE Weekly Issue #489
    View on sreweekly.com A message from our sponsor, Observe, Inc.: Observe’s free Masterclass in Observability at Scale is coming on September 4th at 10am Pacific! We’ll explore how to architect for observability at scale – from streaming telemetry and open data lakes to AI agents that proactively instrument your code and surface insights. Learn more […]  ( 4 min )


    The DIVA logistics agent, powered by Amazon Bedrock
    In this post, we discuss how DTDC and ShellKode used Amazon Bedrock to build DIVA 2.0, a generative AI-powered logistics agent.  ( 21 min )
    Automate enterprise workflows by integrating Salesforce Agentforce with Amazon Bedrock Agents
    This post explores a practical collaboration, integrating Salesforce Agentforce with Amazon Bedrock Agents and Amazon Redshift, to automate enterprise workflows.  ( 25 min )
    How Amazon Bedrock powers next-generation account planning at AWS
    In this post, we share how we built Account Plan Pulse, a generative AI tool designed to streamline and enhance the account planning process, using Amazon Bedrock. Pulse reduces review time and provides actionable account plan summaries for ease of collaboration and consumption, helping AWS sales teams better serve our customers.  ( 19 min )


    Pioneering AI workflows at scale: A deep dive into Asana AI Studio and Amazon Q index collaboration
    Today, we’re excited to announce the integration of Asana AI Studio with Amazon Q index, bringing generative AI directly into your daily workflows. In this post, we explore how Asana AI Studio and Amazon Q index transform enterprise efficiency through intelligent workflow automation and enhanced data accessibility.  ( 20 min )
    Responsible AI for the payments industry – Part 1
    This post explores the unique challenges facing the payments industry in scaling AI adoption, the regulatory considerations that shape implementation decisions, and practical approaches to applying responsible AI principles. In Part 2, we provide practical implementation strategies to operationalize responsible AI within your payment systems.  ( 21 min )
    Responsible AI for the payments industry – Part 2
    In Part 1 of our series, we explored the foundational concepts of responsible AI in the payments industry. In this post, we discuss the practical implementation of responsible AI frameworks.  ( 19 min )
    Process multi-page documents with human review using Amazon Bedrock Data Automation and Amazon SageMaker AI
    In this post, we show how to process multi-page documents with a human review loop using Amazon Bedrock Data Automation and Amazon SageMaker AI.  ( 19 min )
    Build an AI assistant using Amazon Q Business with Amazon S3 clickable URLs
    In this post, we demonstrate how to build an AI assistant using Amazon Q Business that responds to user requests based on your enterprise documents stored in an S3 bucket, and how users can follow the reference URLs in the assistant's responses to view or download the referenced documents and verify the AI-generated answers, in line with responsible AI practices.  ( 25 min )


    GPT OSS models from OpenAI are now available on SageMaker JumpStart
    Today, we are excited to announce the availability of OpenAI’s new open-weight GPT OSS models, gpt-oss-120b and gpt-oss-20b, in Amazon SageMaker JumpStart. With this launch, you can now deploy OpenAI’s newest reasoning models to build, experiment, and responsibly scale your generative AI ideas on AWS. In this post, we demonstrate how to get started with these models on SageMaker JumpStart.  ( 18 min )
    Discover insights from Microsoft Exchange with the Microsoft Exchange connector for Amazon Q Business
    Amazon Q Business is a fully managed, generative AI-powered assistant that helps enterprises unlock the value of their data and knowledge. With Amazon Q Business, you can quickly find answers to questions, generate summaries and content, and complete tasks by using the information and expertise stored across your company’s various data sources and enterprise systems. […]  ( 23 min )


    AI judging AI: Scaling unstructured text analysis with Amazon Nova
    In this post, we highlight how you can deploy multiple generative AI models in Amazon Bedrock to instruct an LLM to create thematic summaries of text responses. We then show how to use multiple LLMs as a jury to review these LLM-generated summaries and assign a rating to judge the content alignment between the summary title and summary description.  ( 20 min )
    Building an AI-driven course content generation system using Amazon Bedrock
    In this post, we explore each component in detail, along with the technical implementation of the two core modules: course outline generation and course content generation.  ( 24 min )
    How Handmade.com modernizes product image and description handling with Amazon Bedrock and Amazon OpenSearch Service
    In this post, we explore how Handmade.com, a leading hand-crafts marketplace, modernized their product description handling by implementing an AI-driven pipeline using Amazon Bedrock and Amazon OpenSearch Service. The solution combines Anthropic's Claude 3.7 Sonnet LLM for generating descriptions, Amazon Titan Text Embeddings V2 for vector embedding, and semantic search capabilities to automate and enhance product descriptions across their catalog of over 60,000 items.  ( 19 min )
    Cost tracking multi-tenant model inference on Amazon Bedrock
    In this post, we demonstrate how to track and analyze multi-tenant model inference costs on Amazon Bedrock using the Converse API's requestMetadata parameter. The solution includes an ETL pipeline using AWS Glue and Amazon QuickSight dashboards to visualize usage patterns, token consumption, and cost allocation across different tenants and departments.  ( 20 min )
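    The cost-tracking post above centers on the Converse API's requestMetadata parameter. The following is a minimal sketch of tagging each invocation so the tags land in the model invocation logs, where an ETL pipeline like the post's AWS Glue and QuickSight setup can aggregate token usage per tenant; the model ID, metadata keys, and tenant values are illustrative assumptions, not taken from the post.

    ```python
    import boto3

    # Assumes model invocation logging is enabled for the account and Region.
    bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

    response = bedrock_runtime.converse(
        modelId="amazon.nova-lite-v1:0",  # illustrative model ID
        messages=[{"role": "user", "content": [{"text": "Summarize this support ticket ..."}]}],
        requestMetadata={
            # Free-form key-value pairs echoed into the invocation logs,
            # usable downstream for per-tenant cost allocation.
            "tenant_id": "tenant-042",         # hypothetical tenant identifier
            "department": "customer-support",  # hypothetical cost-center tag
        },
    )

    # Token counts for this call, used to attribute cost to the tagged tenant.
    print(response["usage"])  # {'inputTokens': ..., 'outputTokens': ..., 'totalTokens': ...}
    print(response["output"]["message"]["content"][0]["text"])
    ```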

    SRE Weekly Issue #488
    View on sreweekly.com A message from our sponsor, Observe, Inc.: Observe’s free Masterclass in Observability at Scale is coming on September 4th at 10am Pacific! We’ll explore how to architect for observability at scale – from streaming telemetry and open data lakes to AI agents that proactively instrument your code and surface insights. Learn more […]  ( 4 min )


    Introducing Amazon Bedrock AgentCore Browser Tool
    In this post, we introduce the newly announced Amazon Bedrock AgentCore Browser Tool. We explore why organizations need cloud-based browser automation and the limitations it addresses for FMs that require real-time data access. We talk about key use cases and the core capabilities of the AgentCore Browser Tool. We walk through how to get started with the tool.  ( 20 min )
    Introducing the Amazon Bedrock AgentCore Code Interpreter
    In this post, we introduce the Amazon Bedrock AgentCore Code Interpreter, a fully managed service that enables AI agents to securely execute code in isolated sandbox environments. We discuss how the AgentCore Code Interpreter helps solve challenges around security, scalability, and infrastructure management when deploying AI agents that need computational capabilities.  ( 23 min )
    Observing and evaluating AI agentic workflows with Strands Agents SDK and Arize AX
    In this post, we present how the Arize AX service can trace and evaluate AI agent tasks initiated through Strands Agents, helping validate the correctness and trustworthiness of agentic workflows.  ( 22 min )
    Building AIOps with Amazon Q Developer CLI and MCP Server
    In this post, we discuss how to implement a low-code/no-code AIOps solution that helps organizations monitor, identify, and troubleshoot operational events while maintaining their security posture. We show how these technologies work together to automate repetitive tasks, streamline incident response, and enhance operational efficiency across your organization.  ( 21 min )
    Containerize legacy Spring Boot application using Amazon Q Developer CLI and MCP server
    In this post, you’ll learn how you can use Amazon Q Developer command line interface (CLI) with Model Context Protocol (MCP) servers integration to modernize a legacy Java Spring Boot application running on premises and then migrate it to Amazon Web Services (AWS) by deploying it on Amazon Elastic Kubernetes Service (Amazon EKS).  ( 22 min )

    July 2025
    Pupdate The boys had some fantastic long walks on our trip to the Lakes (more on that in a moment). I did a separate post about Milo’s extended remission, but it’s great that he’s been able to enjoy the summer without vet visits for chemo. Lake District We returned to Graythwaite’s Dove Cottage and this […]  ( 14 min )


    Introducing AWS Batch Support for Amazon SageMaker Training jobs
    AWS Batch now seamlessly integrates with Amazon SageMaker Training jobs. In this post, we discuss the benefits of managing and prioritizing ML training jobs to use hardware efficiently for your business. We also walk you through how to get started using this new capability and share suggested best practices, including the use of SageMaker training plans.  ( 21 min )
    Structured outputs with Amazon Nova: A guide for builders
    We launched constrained decoding to provide reliability when using tools for structured outputs. Now, tools can be used with Amazon Nova foundation models (FMs) to extract data based on complex schemas, reducing tool use errors by over 95%. In this post, we explore how you can use Amazon Nova FMs for structured output use cases; a minimal sketch of the tool-use pattern appears after this group of posts.  ( 19 min )
    AI agents unifying structured and unstructured data: Transforming support analytics and beyond with Amazon Q Plugins
    Learn how to enhance Amazon Q with custom plugins to combine semantic search capabilities with precise analytics for AWS Support data. This solution enables more accurate answers to analytical questions by integrating structured data querying with RAG architecture, allowing teams to transform raw support cases and health events into actionable insights. Discover how this enhanced architecture delivers exact numerical analysis while maintaining natural language interactions for improved operational decision-making.  ( 23 min )
    Amazon Strands Agents SDK: A technical deep dive into agent architectures and observability
    In this post, we first introduce the Strands Agents SDK and its core features. Then we explore how it integrates with AWS environments for secure, scalable deployments, and how it provides rich observability for production use. Finally, we discuss practical use cases, and present a step-by-step example to illustrate Strands in action.  ( 41 min )
    Build dynamic web research agents with the Strands Agents SDK and Tavily
    In this post, we introduce how to combine Strands Agents with Tavily’s purpose-built web intelligence API, to create powerful research agents that excel at complex information gathering tasks while maintaining the security and compliance standards required for enterprise deployment.  ( 20 min )
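    Relating to the structured-outputs post above, here is a minimal sketch of the tool-use pattern it describes: the model is forced to "call" a tool whose JSON schema constrains the output. The tool name, schema, and model ID below are illustrative assumptions, not taken from the post.

    ```python
    import boto3
    import json

    client = boto3.client("bedrock-runtime", region_name="us-east-1")

    # Hypothetical extraction schema; the forced tool call constrains the model's
    # output to match this JSON schema instead of returning free-form text.
    tool_config = {
        "tools": [{
            "toolSpec": {
                "name": "extract_invoice",
                "description": "Extract structured invoice fields.",
                "inputSchema": {"json": {
                    "type": "object",
                    "properties": {
                        "vendor":   {"type": "string"},
                        "total":    {"type": "number"},
                        "currency": {"type": "string"},
                    },
                    "required": ["vendor", "total", "currency"],
                }},
            }
        }],
        "toolChoice": {"tool": {"name": "extract_invoice"}},  # force this tool to be used
    }

    response = client.converse(
        modelId="amazon.nova-lite-v1:0",  # illustrative Nova model ID
        messages=[{"role": "user", "content": [{"text": "Invoice from ACME Corp, total 1,249.00 USD."}]}],
        toolConfig=tool_config,
    )

    # The structured payload arrives as the arguments of the tool call.
    for block in response["output"]["message"]["content"]:
        if "toolUse" in block:
            print(json.dumps(block["toolUse"]["input"], indent=2))
    ```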


    Automate the creation of handout notes using Amazon Bedrock Data Automation
    In this post, we show how you can build an automated, serverless solution to transform webinar recordings into comprehensive handouts using Amazon Bedrock Data Automation for video analysis. We walk you through the implementation of Amazon Bedrock Data Automation to transcribe and detect slide changes, as well as the use of Amazon Bedrock foundation models (FMs) for transcription refinement, combined with custom AWS Lambda functions orchestrated by AWS Step Functions.  ( 21 min )
    Streamline GitHub workflows with generative AI using Amazon Bedrock and MCP
    This blog post explores how to create powerful agentic applications using the Amazon Bedrock FMs, LangGraph, and the Model Context Protocol (MCP), with a practical scenario of handling a GitHub workflow of issue analysis, code fixes, and pull request generation.  ( 21 min )


    Mistral-Small-3.2-24B-Instruct-2506 is now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart
    Today, we’re excited to announce that Mistral-Small-3.2-24B-Instruct-2506—a 24-billion-parameter large language model (LLM) from Mistral AI that’s optimized for enhanced instruction following and reduced repetition errors—is available for customers through Amazon SageMaker JumpStart and Amazon Bedrock Marketplace. Amazon Bedrock Marketplace is a capability in Amazon Bedrock that developers can use to discover, test, and use over […]  ( 23 min )
    Generate suspicious transaction report drafts for financial compliance using generative AI
    A suspicious transaction report (STR) or suspicious activity report (SAR) is a report that a financial organization must submit to its financial regulator if it has reasonable grounds to suspect a financial transaction that occurred or was attempted in the course of its activities. In this post, we explore a solution that uses FMs available in Amazon Bedrock to create a draft STR.  ( 23 min )
    Fine-tune and deploy Meta Llama 3.2 Vision for generative AI-powered web automation using AWS DLCs, Amazon EKS, and Amazon Bedrock
    In this post, we present a complete solution for fine-tuning and deploying the Llama-3.2-11B-Vision-Instruct model for web automation tasks. We demonstrate how to build a secure, scalable, and efficient infrastructure using AWS Deep Learning Containers (DLCs) on Amazon Elastic Kubernetes Service (Amazon EKS).  ( 25 min )
    How Nippon India Mutual Fund improved the accuracy of AI assistant responses using advanced RAG methods on Amazon Bedrock
    In this post, we examine a solution adopted by Nippon Life India Asset Management Limited that improves response accuracy over a regular (naive) RAG approach by rewriting user queries and aggregating and reranking the responses. The proposed solution uses enhanced RAG methods, such as reranking, to improve overall accuracy.  ( 24 min )


    Build a drug discovery research assistant using Strands Agents and Amazon Bedrock
    In this post, we demonstrate how to create a powerful research assistant for drug discovery using Strands Agents and Amazon Bedrock. This AI assistant can search multiple scientific databases simultaneously using the Model Context Protocol (MCP), synthesize its findings, and generate comprehensive reports on drug targets, disease mechanisms, and therapeutic areas.  ( 19 min )
    Amazon Nova Act SDK (preview): Path to production for browser automation agents
    In this post, we’ll walk through what makes Nova Act SDK unique, how it works, and how teams across industries are already using it to automate browser-based workflows at scale.  ( 20 min )
    Optimizing enterprise AI assistants: How Crypto.com uses LLM reasoning and feedback for enhanced efficiency
    In this post, we explore how Crypto.com used user and system feedback to continuously improve and optimize our instruction prompts. This feedback-driven approach has enabled us to create more effective prompts that adapt to various subsystems while maintaining high performance across different use cases.  ( 22 min )
    Build modern serverless solutions following best practices using Amazon Q Developer CLI and MCP
    This post explores how the AWS Serverless MCP server accelerates development throughout the serverless lifecycle, from making architectural decisions with tools like get_iac_guidance and get_lambda_guidance, to streamlining development with get_serverless_templates and sam_init, to deployment with SAM integration, webapp_deployment_help, and configure_domain. We show how this conversational AI approach transforms the entire process, from architecture design through operations, dramatically accelerating AWS serverless projects while adhering to architectural principles.  ( 25 min )

    SRE Weekly Issue #487
    View on sreweekly.com A message from our sponsor, Spacelift: IaC Experts! IaCConf Call for Presenters – August 27, 2025The upcoming IaCConf Spotlight dives into the security and governance challenges of managing infrastructure as code at scale. From embedding security in your pipelines to navigating the realities of open source risk, this event brings together practitioners […]  ( 4 min )


    Build an intelligent eDiscovery solution using Amazon Bedrock Agents
    In this post, we demonstrate how to build an intelligent eDiscovery solution using Amazon Bedrock Agents for real-time document analysis. We show how to deploy specialized agents for document classification, contract analysis, email review, and legal document processing, all working together through a multi-agent architecture. We walk through the implementation details, deployment steps, and best practices to create an extensible foundation that organizations can adapt to their specific eDiscovery requirements.  ( 20 min )
    How PerformLine uses prompt engineering on Amazon Bedrock to detect compliance violations
    PerformLine operates within the marketing compliance industry, a specialized subset of the broader compliance software market, which includes various compliance solutions like anti-money laundering (AML), know your customer (KYC), and others. In this post, PerformLine and AWS explore how PerformLine used Amazon Bedrock to accelerate compliance processes, generate actionable insights, and provide contextual data—delivering the speed and accuracy essential for large-scale oversight.  ( 21 min )


    Boost cold-start recommendations with vLLM on AWS Trainium
    In this post, we demonstrate how to use vLLM for scalable inference and use AWS Deep Learning Containers (DLC) to streamline model packaging and deployment. We’ll generate interest expansions through structured prompts, encode them into embeddings, retrieve candidates with FAISS, apply validation to keep results grounded, and frame the cold-start challenge as a scientific experiment—benchmarking LLM and encoder pairings, iterating rapidly on recommendation metrics, and showing clear ROI for each configuration.  ( 19 min )
    Benchmarking Amazon Nova: A comprehensive analysis through MT-Bench and Arena-Hard-Auto
    The repositories for MT-Bench and Arena-Hard were originally developed using OpenAI’s GPT API, primarily employing GPT-4 as the judge. Our team has expanded its functionality by integrating it with the Amazon Bedrock API to enable using Anthropic’s Claude Sonnet on Amazon Bedrock as the judge. In this post, we use both MT-Bench and Arena-Hard to benchmark Amazon Nova models by comparing them to other leading LLMs available through Amazon Bedrock.  ( 25 min )


    Customize Amazon Nova in Amazon SageMaker AI using Direct Preference Optimization
    At the AWS Summit in New York City, we introduced a comprehensive suite of model customization capabilities for Amazon Nova foundation models. Available as ready-to-use recipes on Amazon SageMaker AI, you can use them to adapt Nova Micro, Nova Lite, and Nova Pro across the model training lifecycle, including pre-training, supervised fine-tuning, and alignment. In this post, we present a streamlined approach to customize Nova Micro in SageMaker training jobs.  ( 24 min )
    Multi-tenant RAG implementation with Amazon Bedrock and Amazon OpenSearch Service for SaaS using JWT
    In this post, we introduce a solution that uses OpenSearch Service as a vector data store for multi-tenant RAG, combining JSON Web Tokens (JWT) and fine-grained access control (FGAC) to achieve strict tenant data access isolation and routing.  ( 22 min )
    Enhance generative AI solutions using Amazon Q index with Model Context Protocol – Part 1
    In this post, we explore best practices and integration patterns for combining Amazon Q index and MCP, enabling enterprises to build secure, scalable, and actionable AI search-and-retrieval architectures.  ( 19 min )


    Beyond accelerators: Lessons from building foundation models on AWS with Japan’s GENIAC program
    In 2024, the Ministry of Economy, Trade and Industry (METI) launched the Generative AI Accelerator Challenge (GENIAC)—a Japanese national program to boost generative AI by providing companies with funding, mentorship, and massive compute resources for foundation model (FM) development. AWS was selected as the cloud provider for GENIAC’s second cycle (cycle 2). It provided infrastructure and technical guidance for 12 participating organizations.  ( 32 min )
    Streamline deep learning environments with Amazon Q Developer and MCP
    In this post, we explore how to use Amazon Q Developer and Model Context Protocol (MCP) servers to streamline AWS Deep Learning Containers (DLC) workflows and automate the creation, execution, and customization of DLCs.  ( 33 min )


    Build an AI-powered automated summarization system with Amazon Bedrock and Amazon Transcribe using Terraform
    This post introduces a serverless meeting summarization system that harnesses the advanced capabilities of Amazon Bedrock and Amazon Transcribe to transform audio recordings into concise, structured, and actionable summaries. By automating this process, organizations can reclaim countless hours while making sure key insights, action items, and decisions are systematically captured and made accessible to stakeholders.  ( 37 min )
    Kyruus builds a generative AI provider matching solution on AWS
    In this post, we demonstrate how Kyruus Health uses AWS services to build Guide. We show how Amazon Bedrock, a fully managed service that provides access to foundation models (FMs) from leading AI companies and Amazon through a single API, and Amazon OpenSearch Service, a managed search and analytics service, work together to understand everyday language about health concerns and connect members with the right providers.  ( 28 min )
    Use generative AI in Amazon Bedrock for enhanced recommendation generation in equipment maintenance
    In the manufacturing world, valuable insights from service reports often remain underutilized in document storage systems. This post explores how Amazon Web Services (AWS) customers can build a solution that automates the digitisation and extraction of crucial information from many reports using generative AI.  ( 29 min )

    SRE Weekly Issue #486
    View on sreweekly.com A message from our sponsor, Spacelift: IaC Experts! IaCConf Call for Presenters – August 27, 2025 The upcoming IaCConf Spotlight dives into the security and governance challenges of managing infrastructure as code at scale. From embedding security in your pipelines to navigating the realities of open source risk, this event brings together […]  ( 4 min )


    Build real-time travel recommendations using AI agents on Amazon Bedrock
    In this post, we show how to build a generative AI solution using Amazon Bedrock that creates bespoke holiday packages by combining customer profiles and preferences with real-time pricing data. We demonstrate how to use Amazon Bedrock Knowledge Bases for travel information, Amazon Bedrock Agents for real-time flight details, and Amazon OpenSearch Serverless for efficient package search and retrieval.  ( 30 min )
    Deploy a full stack voice AI agent with Amazon Nova Sonic
    In this post, we show how to create an AI-powered call center agent for a fictional company called AnyTelco. The agent, named Telly, can handle customer inquiries about plans and services while accessing real-time customer data using custom tools implemented with the Model Context Protocol (MCP) framework.  ( 29 min )
    Manage multi-tenant Amazon Bedrock costs using application inference profiles
    This post explores how to implement a robust monitoring solution for multi-tenant AI deployments using a feature of Amazon Bedrock called application inference profiles. We demonstrate how to create a system that enables granular usage tracking, accurate cost allocation, and dynamic resource management across complex multi-tenant environments.  ( 30 min )
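    As a companion to the multi-tenant cost post above, here is a minimal sketch of the application inference profile pattern: create one tagged profile per tenant and invoke models through the profile ARN so usage and cost roll up per tenant. The profile name, tag keys, tenant values, and base model ARN are illustrative assumptions, not taken from the post.

    ```python
    import boto3

    bedrock = boto3.client("bedrock", region_name="us-east-1")

    # Create an application inference profile for one tenant, tagged for cost allocation.
    profile = bedrock.create_inference_profile(
        inferenceProfileName="tenant-042-nova-lite",  # hypothetical per-tenant profile
        modelSource={
            "copyFrom": "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-lite-v1:0"
        },
        tags=[
            {"key": "tenant",      "value": "tenant-042"},
            {"key": "cost-center", "value": "retail-banking"},
        ],
    )

    # Invoke through the profile ARN instead of a bare model ID so that metrics
    # and cost allocation tags are attributed to this tenant's profile.
    runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = runtime.converse(
        modelId=profile["inferenceProfileArn"],
        messages=[{"role": "user", "content": [{"text": "Hello"}]}],
    )
    print(response["usage"])
    ```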


    Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI
    Evaluating the performance of large language models (LLMs) goes beyond statistical metrics like perplexity or bilingual evaluation understudy (BLEU) scores. For most real-world generative AI scenarios, it’s crucial to understand whether a model is producing better outputs than a baseline or an earlier iteration. This is especially important for applications such as summarization, content generation, […]  ( 37 min )
    Building cost-effective RAG applications with Amazon Bedrock Knowledge Bases and Amazon S3 Vectors
    In this post, we demonstrate how to integrate Amazon S3 Vectors with Amazon Bedrock Knowledge Bases for RAG applications. You'll learn a practical approach to scale your knowledge bases to handle millions of documents while maintaining retrieval quality and using S3 Vectors' cost-effective storage.  ( 32 min )
    Implementing on-demand deployment with customized Amazon Nova models on Amazon Bedrock
    In this post, we walk through the custom model on-demand deployment workflow for Amazon Bedrock and provide step-by-step implementation guides using both the AWS Management Console and APIs or AWS SDKs. We also discuss best practices and considerations for deploying customized Amazon Nova models on Amazon Bedrock.  ( 30 min )
    Building enterprise-scale RAG applications with Amazon S3 Vectors and DeepSeek R1 on Amazon SageMaker AI
    Organizations are adopting large language models (LLMs), such as DeepSeek R1, to transform business processes, enhance customer experiences, and drive innovation at unprecedented speed. However, standalone LLMs have key limitations such as hallucinations, outdated knowledge, and no access to proprietary data. Retrieval Augmented Generation (RAG) addresses these gaps by combining semantic search with generative AI, […]  ( 36 min )


    Accenture scales video analysis with Amazon Nova and Amazon Bedrock Agents
    This post was written with Ilan Geller, Kamal Mannar, Debasmita Ghosh, and Nakul Aggarwal of Accenture. Video highlights offer a powerful way to boost audience engagement and extend content value for content publishers. These short, high-impact clips capture key moments that drive viewer retention, amplify reach across social media, reinforce brand identity, and open new […]  ( 31 min )
    Deploy conversational agents with Vonage and Amazon Nova Sonic
    In this post, we explore how developers can integrate Amazon Nova Sonic with the Vonage communications service to build responsive, natural-sounding voice experiences in real time. By combining the Vonage Voice API with the low-latency and expressive speech capabilities of Amazon Nova Sonic, businesses can deploy AI voice agents that deliver more human-like interactions than traditional voice interfaces. These agents can be used as customer support, virtual assistants, and more.  ( 29 min )
    Enabling customers to deliver production-ready AI agents at scale
    Today, I’m excited to share how we’re bringing this vision to life with new capabilities that address the fundamental aspects of building and deploying agents at scale. These innovations will help you move beyond experiments to production-ready agent systems that can be trusted with your most critical business processes.  ( 32 min )


    Amazon Bedrock Knowledge Bases now supports Amazon OpenSearch Service Managed Cluster as vector store
    Amazon Bedrock Knowledge Bases has extended its vector store options by enabling support for Amazon OpenSearch Service managed clusters, further strengthening its capabilities as a fully managed Retrieval Augmented Generation (RAG) solution. This enhancement builds on the core functionality of Amazon Bedrock Knowledge Bases, which is designed to seamlessly connect foundation models (FMs) with internal data sources. This post provides a comprehensive, step-by-step guide on integrating an Amazon Bedrock knowledge base with an OpenSearch Service managed cluster as its vector store.  ( 44 min )
    Monitor agents built on Amazon Bedrock with Datadog LLM Observability
    We’re excited to announce a new integration between Datadog LLM Observability and Amazon Bedrock Agents that helps monitor agentic applications built on Amazon Bedrock. In this post, we'll explore how Datadog's LLM Observability provides the visibility and control needed to successfully monitor, operate, and debug production-grade agentic applications built on Amazon Bedrock Agents.  ( 29 min )
    How PayU built a secure enterprise AI assistant using Amazon Bedrock
    PayU offers a full-stack digital financial services system that serves the financial needs of merchants, banks, and consumers through technology. In this post, we explain how we equipped the PayU team with an enterprise AI solution and democratized AI access using Amazon Bedrock, without compromising on data residency requirements.  ( 32 min )
    Supercharge generative AI workflows with NVIDIA DGX Cloud on AWS and Amazon Bedrock Custom Model Import
    This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA. DGX Cloud on Amazon Web Services (AWS) represents a significant leap forward in democratizing access to high-performance AI infrastructure. By combining NVIDIA GPU expertise with AWS scalable cloud services, organizations can accelerate their time-to-train, reduce operational complexity, and unlock […]  ( 32 min )
    Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS
    This post introduces NVIDIA Dynamo and explains how to set it up on Amazon EKS for automated scaling and streamlined Kubernetes operations. We provide a hands-on walkthrough, which uses the NVIDIA Dynamo blueprint on the AI on EKS GitHub repo by AWS Labs to provision the infrastructure, configure monitoring, and install the NVIDIA Dynamo operator.  ( 35 min )
    AWS doubles investment in AWS Generative AI Innovation Center, marking two years of customer success
    In this post, AWS announces a $100 million additional investment in its AWS Generative AI Innovation Center, marking two years of successful customer collaborations across industries from financial services to healthcare. The investment comes as AI evolves toward more autonomous, agentic systems, with the center already helping thousands of customers drive millions in productivity gains and transform customer experiences.  ( 30 min )