• Open

    How Amazon Bedrock catches AI-generated phishing
    Social engineering through phishing remains one of the most common tactics for launching cyberattacks. AI-generated phishing email messages now pose a new challenge for security teams managing email systems, significantly raising the risk because of their advanced sophistication. Modern social engineers use generative AI and open source intelligence (OSINT) to craft thousands of unique messages […]  ( 115 min )
    Best practices for multi-turn reinforcement learning in Amazon SageMaker AI
    In this post, we share best practices for reliable multi-turn RL training. We cover how to build a training environment you can trust, set up an external evaluation, design a reward aligned with the end task, manage what changes once the agent runs for multiple turns, and monitor the metrics that tell you when to iterate.  ( 120 min )

  • Open

    Run NVIDIA Nemotron and OpenAI GPT OSS models on Amazon Bedrock in AWS GovCloud (US)
    We're excited to introduce US-based frontier open-weight models in AWS GovCloud (US). With this release, Amazon Bedrock now supports OpenAI’s open-weight GPT OSS models (120B and 20B) and NVIDIA Nemotron (Nano 9B v2, Nano 12B v2, Nano 30B, Super 120B) models. In this post, we cover these models and their capabilities, the inference options for data residency, the available service tiers and how to get started.  ( 118 min )
    Building a serverless A2A gateway for agent discovery, routing, and access control
    In this post, you will learn how to build a serverless A2A gateway on AWS that hosts multiple agents behind a single domain using path-based routing (/agents/{agentId}). Standard A2A clients work without modification.  ( 115 min )
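    As a rough illustration of the path-based routing idea (/agents/{agentId}) mentioned in this summary, the sketch below resolves a request path to a backend. It is not the post's implementation: the registry, the backend URLs, and the `resolve` helper are all hypothetical names chosen for the example.

```python
import re

# Hypothetical registry mapping agent IDs to backend targets (illustrative values only).
AGENT_BACKENDS = {
    "billing": "https://billing.internal.example",
    "support": "https://support.internal.example",
}

# Matches /agents/{agentId} with an optional trailing sub-path.
_ROUTE = re.compile(r"^/agents/(?P<agent_id>[A-Za-z0-9_-]+)(?P<rest>/.*)?$")


def resolve(path):
    """Return (backend_url, remaining_path) for a request path, or None if nothing matches."""
    match = _ROUTE.match(path)
    if match is None:
        return None
    backend = AGENT_BACKENDS.get(match.group("agent_id"))
    if backend is None:
        return None
    # Forward whatever follows the agent ID; default to the backend root.
    return backend, match.group("rest") or "/"
```

    Because the agent ID is stripped before forwarding, each backend sees the same paths it would see if it were hosted on its own domain, which is one plausible reason unmodified clients can keep working.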
    Structured memory filtering with metadata in AgentCore Memory
    In this post, you will learn how metadata works across configuration, ingestion, and retrieval, explore enterprise use cases including multi-agent and multi-tenant architectures, and discover best practices for implementation.  ( 122 min )
    HippoRAG: Neurobiologically inspired RAG using Amazon Bedrock, Amazon Neptune, and personalized PageRank
    In this post, we demonstrate how to implement HippoRAG using a comprehensive AWS stack. We use Amazon Bedrock for LLM capabilities, Amazon Neptune for graph database functionality, Amazon Neptune Analytics for advanced graph algorithms including Personalized PageRank, and Amazon Titan Embeddings for vector representations. This implementation showcases how to build and deploy HippoRAG within AWS infrastructure for enterprise-scale applications.  ( 116 min )
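    For readers unfamiliar with Personalized PageRank, here is a minimal plain-Python sketch using power iteration. It only illustrates the algorithm; it is not the Amazon Neptune Analytics implementation the post uses, and the function name, graph format, and default parameters are assumptions made for this example.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank.

    graph: dict mapping each node to a list of out-neighbors (every neighbor must be a key).
    seeds: dict mapping seed nodes to positive weights; random restarts return to these nodes.
    """
    nodes = list(graph)
    total = sum(seeds.values())
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Every node first receives its share of the restart probability.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seed distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
        rank = nxt
    return rank
```

    Unlike ordinary PageRank, the restart step jumps back to the seed nodes rather than to a uniform distribution, so scores concentrate on the part of the graph reachable from the seeds (in a retrieval setting, presumably the entities found in the query).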
    How Inscribe uses Amazon Bedrock to stop document fraud in seconds
    In this post, you will learn how Inscribe developed an agentic AI system using Amazon Bedrock that reasons across documents the way an expert fraud analyst would. With this new agentic AI system, Inscribe now detects tampered, fabricated, and AI-generated financial documents in under 90 seconds. This is a 20x improvement over traditional manual review, while maintaining the accuracy and explainability required by financial services regulations.  ( 114 min )
    Simplify model selection in Amazon Bedrock with the open source Model Profiler
    The Amazon Bedrock Model Profiler is an open source tool that aggregates model metadata from multiple AWS APIs and external sources into a single, searchable interface. In this post, you’ll learn what the Model Profiler provides, the real-world scenarios it supports, and how to deploy it in your own environment in under five minutes.  ( 117 min )
    Accelerate protein design with BoltzGen on Amazon SageMaker AI
    In this post, we demonstrate how to deploy BoltzGen on SageMaker AI and run an end-to-end protein design experiment. By the end of the walkthrough, you have a working setup that scales from quick validation runs to production batch processing. The setup offers two execution modes for different stages of research and uses step-level caching to reduce compute expenses during iterative workflows.  ( 116 min )
    Safely Releasing Frontier Models to Customers
    It’s our goal for AWS to be the most secure place to run any workload, and in support of that we’ve been deeply investing in security across our services since AWS's inception more than two decades ago. Our AI services like Amazon Bedrock are built on this foundation and with the same focus.  ( 108 min )

  • Open

    Introducing Claude Sonnet 5 on AWS: Anthropic’s most capable Sonnet model
    Today, we’re excited to announce the availability of Anthropic’s most advanced Sonnet model, Claude Sonnet 5, on Amazon Bedrock and Claude Platform on AWS. Claude Sonnet 5 is the first Sonnet model of Anthropic’s latest generation and represents a meaningful step forward. It delivers top-tier intelligence at Sonnet pricing for coding, agents, and everyday professional […]  ( 110 min )
    Build generative UI for AI agents on Amazon Bedrock AgentCore with the AG-UI protocol
    This post walks through how AG-UI integrates into the Fullstack AgentCore Solution Template (FAST) to build interactive agent frontends on Amazon Bedrock AgentCore. We then show how CopilotKit extends this with generative UI, shared state, and human-in-the-loop interactions, all deployed on Amazon Bedrock AgentCore.  ( 114 min )
    Simplify multi-account access to Amazon Bedrock models with managed entitlements
    In this post, we show you how to use managed entitlements for Amazon Bedrock to subscribe once from a central account and distribute model access across your organization. This approach removes the need for AWS Marketplace permissions in workload accounts.  ( 112 min )
    Implementing resilience patterns with Amazon Bedrock and LLM gateway
    In this post, you will learn five practical patterns for building resilient generative AI applications on AWS, progressing from native Amazon Bedrock features to multi-model orchestration using an LLM gateway. These patterns address real-world challenges such as quota exhaustion during unexpected traffic surges, maximizing availability through geographic distribution of inference, and helping prevent noisy neighbor problems in multi-tenant environments.  ( 114 min )
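    The summary does not list the five patterns, so the sketch below shows one generic resilience idea only, retry with backoff and then fall back to another model, which may or may not match any of them. `ThrottledError`, `invoke_with_fallback`, and the model callables are hypothetical stand-ins, not Amazon Bedrock or gateway APIs.

```python
import time


class ThrottledError(Exception):
    """Stand-in for a provider's throttling or quota error (hypothetical)."""


def invoke_with_fallback(prompt, models, max_attempts=2, base_delay=0.5, sleep=time.sleep):
    """Try each (name, call) pair in order, retrying throttled calls with exponential backoff.

    models: list of (name, callable) pairs; each callable takes the prompt and returns a reply.
    Returns (name, reply) from the first model that succeeds.
    """
    last_error = None
    for name, call in models:
        for attempt in range(max_attempts):
            try:
                return name, call(prompt)
            except ThrottledError as err:
                last_error = err
                # Back off before retrying the same model; move on once attempts run out.
                if attempt + 1 < max_attempts:
                    sleep(base_delay * (2 ** attempt))
    raise RuntimeError("all models were throttled") from last_error
```

    Passing `sleep` as a parameter keeps the function easy to test without real delays; only throttling errors trigger a retry, so other failures surface immediately.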
    How Outpost VFX Uses AWS to Accelerate AI Model Training for Visual Effects
    In this post, we explore how Outpost VFX achieved 8x faster training speeds using AWS infrastructure to transform their face replacement workflow, the technical architecture they implemented to overcome single-GPU limitations, and the measurable results achieved through AWS multi-GPU training.  ( 111 min )
    Building bilingual NER for cargo logistics with Amazon Bedrock
    In this post, we share the technical approach using token-based distillation, lessons learned, and deployment architecture. If you face similar bilingual NER challenges, you can benefit from IBS Software’s experience with the Amazon Bedrock knowledge distillation capabilities.  ( 111 min )
    Fine-tune Amazon Nova models for accurate email data extraction
    In this post, you'll learn how fine-tuning Amazon Nova models using Amazon SageMaker AI addresses these specific issues by teaching the models to recognize your exact data patterns, distinguish between similar fields, and process information more efficiently, achieving up to 94.77% extraction accuracy while reducing costs by 50%.  ( 114 min )

  • Open

    Implement a backup strategy for Amazon Quick Sight BI assets
    In this post, we cover best practices for implementing an effective backup strategy for BI assets in Quick Sight. We start by covering the options for selecting the assets to include in your backup, then explain the high-level APIs available for that purpose, and finish with sample code to help you get started quickly.  ( 120 min )
    Pair Nova 2 Lite with Claude for cost-optimized document processing
    In this post, we show how pairing Amazon Nova 2 Lite with Anthropic’s Claude Sonnet 4.6 delivers an efficient solution for digitizing scanned documents at scale. We built a two-model pipeline on Amazon Bedrock for digitizing scanned yearbook pages. Amazon Nova 2 Lite handles native multimodal extraction in a single call: detecting photos, extracting visible names with coordinates, and returning page-level metadata. Claude Sonnet 4.6 then performs spatial reasoning to match names to faces based on page layout.  ( 116 min )
    Multi-tenant LLM analytics with row-level security: How we built a secure agent on AWS
    In this post, we show you how PAR built a production-ready multi-tenant LLM analytics system that enforces row-level security through a three-layer architecture: cryptographic request signing with AWS SigV4, semantic validation on Amazon Bedrock, and programmatic data isolation via Split-Plane SQL. We demonstrate how each layer operates independently to reduce the risk of cross-tenant data exposure, even when the LLM itself is compromised or manipulated.  ( 120 min )
    Build an agentic AI healthcare claims pipeline with Amazon Bedrock and AWS HealthLake
    In this post, we show you how to build an automated claims processing pipeline using two key Amazon Bedrock capabilities: Amazon Bedrock Data Automation for intelligent document extraction from healthcare claim forms, and Amazon Bedrock AgentCore for hosting an AI agent that validates and transforms the extracted data into FHIR (Fast Healthcare Interoperability Resources) resources in AWS HealthLake. You will learn how to combine these services to create an end-to-end workflow that reduces manual processing while maintaining accuracy through automated validation checks.  ( 112 min )
    Debugging production agents with Amazon Bedrock AgentCore Observability
    In this post, you learn how to debug production agent failures using built-in observability capabilities. We walk through common failure patterns, show how to analyze agent behavior with traces and metrics, and provide structured workflows for resolving issues such as infinite loops and tool invocation failures. This is Part 1 of a two-part series. Part 2 covers performance optimization and memory management.  ( 116 min )
  • Open

    SRE Weekly Issue #523
    View on sreweekly.com A message from our sponsor, Buildkite: More places to run, more scale to manage and maintain, usually means more blind spots; not here. Buildkite’s control plane holds the live state of every job, agent and queue, regardless of throughput size. See what’s running, what’s waiting and why with immediate insight → https://buildkite.com/platform/pipelines/ […]  ( 4 min )

  • Open

    Build interactive PDF text extraction from Amazon S3
    In this post, you’ll build a server that extracts text from PDF files in Amazon S3 in real time. This protocol-based approach provides programmatic document access. You’ll walk through the architecture, set up the server, and run interactive document queries. Along the way, you’ll compare this approach with Amazon Textract so you can decide which tool fits your workload.  ( 116 min )
    How Cara pioneers domain-specific AI for enterprise insurance brokerages with AWS
    In this post, we explore how Cara, built in cooperation with AWS, addresses these challenges. We walk through the technical design decisions and the AWS services that support the solution. We also share measurable outcomes Cara has delivered for enterprise brokerages.  ( 109 min )
    Production-grade AI agents for financial compliance: Lessons from Stripe
    In this post, you learn how Stripe built a production-grade AI agent system for financial compliance. We cover the technical architecture of Stripe’s ReAct agent framework and the infrastructure decisions behind a dedicated agent service. We also discuss the role of human oversight in maintaining accountability, and key lessons about task decomposition, orchestration patterns, and cost optimization through prompt caching. By the end, you will understand how to design agentic systems that scale compliance operations without compromising quality or auditability.  ( 117 min )

  • Open

    Retrofit, don’t rebuild: Agentic overlays for transforming legacy enterprise services
    In this technical collaboration between AWS and the authors, we present a pragmatic solution: agentic overlays. Agentic overlays are thin wrapper layers that transform traditional REST-based services into agents capable of participating in A2A interactions. They also expose REST APIs as tools compatible with the Model Context Protocol (MCP). Together, they let enterprises add A2A capabilities to existing REST services without rewriting business logic, without duplicating code, and without running parallel infrastructures. This reduces agent sprawl in the infrastructure by reusing existing services as agents. We provide reference architectures and sample code that show how to build agentic overlays.  ( 117 min )
    Optimize model training on Amazon SageMaker AI with NVIDIA Blackwell
    This post shows you how to configure training jobs on Amazon SageMaker AI to get the most out of Blackwell’s architecture on AWS. You learn how to select batch sizes and sequence lengths that take advantage of Blackwell’s expanded memory, choose the right precision format for your model size (1B to 64B parameters), and apply activation checkpointing strategically. By the end, you have a practical framework for tuning your training configuration and launching distributed training jobs on P6-B200 instances.  ( 115 min )
    Implementing super resolution by deploying SeedVR2 on Amazon SageMaker AI
    In this post, we demonstrate how to implement video upscaling using SeedVR2 on SageMaker AI. We cover the solution architecture, walk through the deployment steps, and show performance comparisons that highlight the quality improvements and processing efficiency you can achieve. By the end of this post, you’ll have the practical knowledge needed to implement this super resolution solution.  ( 114 min )
    Build self-service AWS Health analytics to find actionable health insights with AI agents powered by Amazon Bedrock
    In this post, we show you how to build Chaplin (Customer Health and Planned Lifecycle Intelligence Nexus), an open source solution that uses AI agents exposed through the Model Context Protocol (MCP) to provide self-service health event analytics.  ( 121 min )
    Building agentic AI applications with a modern data mesh strategy on AWS
    This post shows how to build a governed, serverless data mesh on AWS that provides the secure, scalable data foundation production agentic AI requires.  ( 120 min )

  • Open

    Huntington Bank: Redacting sensitive data from 400M+ documents with AWS
    In this post, we walk through how Huntington built a scalable AWS solution to detect and redact Personally Identifiable Information (PII) and Payment Card Industry (PCI) data from over 400 million documents, reducing processing time from years to just a few months while achieving 95%+ redaction accuracy.  ( 111 min )
    Build a healthcare appointment agent with Amazon Nova 2 Sonic
    In this post, you will learn how to build a voice agent that handles appointment reminder conversations using Amazon Nova 2 Sonic and Amazon Bedrock AgentCore. The agent authenticates patients by voice, manages appointments (confirm, cancel, or reschedule), collects pre-visit health information, and escalates to human staff when needed. You handle routine calls at scale, which can help reduce no-show rates. This sample focuses on the agentic side of the problem: voice conversation and tool orchestration. A browser-based interface is included for testing. To connect the agent to actual phone lines for outbound dialing, you would integrate a telephony service such as Amazon Connect Customer.  ( 114 min )
    AI-powered BI with Snowflake and Amazon Quick
    In this post, you will learn how to build an end-to-end integration between Snowflake semantic views and Amazon Quick. The sample data is user review data for a media company. You start by loading movie review data from Amazon Simple Storage Service (Amazon S3) into Snowflake, define a semantic view in SQL to add business meaning, explore it with natural-language queries through Cortex Analyst, and then generate an Amazon Quick dataset and dashboard. The dataset can be created manually or with a provided automation script. By the end, your BI team or AI team can ask natural-language questions against a governed data layer and trust that every response reflects the same business logic.  ( 115 min )
    How Loka Built a Natural, Low-Latency Voice Agent with Amazon Nova 2 Sonic
    In this post, we demonstrate the architecture and approach Loka used to solve a common frustration: robotic, slow voice assistants that cause customers to hang up, damaging brand reputation and driving up support costs.  ( 114 min )

  • Open

    Build a protein research copilot with Amazon Bedrock AgentCore
    This post shows you how to build a conversational protein research assistant that combines three capabilities: natural language query parsing to extract structured search parameters, vector similarity search over protein embeddings using a specialized language model, and AI-generated scientific summaries of search results.  ( 116 min )
    Shared infrastructure, isolated tenants: Pool model multi-tenancy with Amazon Bedrock AgentCore
    In this post, you will learn patterns for implementing production-ready multi-tenant systems using Amazon Bedrock AgentCore. You will see these patterns demonstrated through healthcare AI agents that serve multiple clinics and hospitals.  ( 116 min )

  • Open

    Building pay-per-intelligence for AI agents: How Ampersend uses Amazon Bedrock AgentCore Payments
    In this post, you will learn how Ampersend built a pay-per-intelligence routing layer on top of Amazon Bedrock AgentCore Payments. AI agents autonomously route tasks to the most effective model, pay per request, and operate within spending budgets. You will also see how the two-hop payment pattern works end-to-end and how to get started with your own implementation.  ( 111 min )
    Embed the world: Multimodal AI for searchable aerial imagery at scale
    In this post, we walk through the problem space, our architecture on Amazon Bedrock and Amazon OpenSearch Serverless, the evaluation methodology we built on OpenStreetMap ground truth, four experiments that compared embedding models, fusion strategies, captioning, and search methods, and the practical guidance you can apply when building a similar system. You’ll learn which design choices move the needle for geospatial semantic search, including why Amazon Nova Multimodal Embeddings delivered the highest F1 scores across both benchmark queries in our evaluation. The work described here evolved into Vexcel Intelligence, a searchable imagery product.  ( 123 min )
    Running ComfyUI workflows on Amazon SageMaker AI processing jobs
    In this post, we walk you through how to deploy ComfyUI workflows on Amazon SageMaker AI processing jobs to generate hundreds of high-quality images in a single batch. You learn how to set up the infrastructure using AWS Cloud Development Kit (AWS CDK), configure GPU-accelerated processing, and automate image generation at scale. You can then adapt the solution to your own ComfyUI workflows and scale your creative pipeline.  ( 114 min )
  • Open

    SRE Weekly Issue #522
    View on sreweekly.com A message from our sponsor, Bronto: What would an AI SRE choose for their observability stack? We asked AWS DevOps Agent to run a live test comparing Bronto, Grafana Loki, and Elasticsearch against the same OpenTelemetry dataset. Bronto scored highest (9.4/10) and was the only tool that didn’t return silent failures. Curious why? […]  ( 4 min )

  • Open

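    Insight is not a flash of inspiration
    Insight is not a flash of inspiration; it is ground out slowly. We often assume that "insight" is a sudden epiphany, a moment reserved for geniuses. But look back at history and at the present, and you find a harsh truth: real insight usually comes with painful self-negation, and it is a long, meticulous process of polishing. It is not endless addition but adding and then subtracting; not endless acquisition but peeling away layer by layer, discarding the false and keeping the true. 01. Zigong infers the rest from one example. In the Analects, Zigong confidently asked Confucius: "Poor without flattering, rich without arrogance: that is a fine level to reach, isn't it?" Confucius poured cold water on him: "That is only the passing mark. Better to be poor yet joyful, rich yet courteous." An ordinary person might have felt a loss of face. But Zigong immediately thought of the line in the Book of Songs: "as if cut, as if filed, as if chiseled, as if polished." He understood: moral cultivation is not a static state; like working jade, it takes constant cutting and grinding, even cutting away parts that were originally one's own, before the piece is finished. Confucius praised him as someone who, "told what has passed, knows what is to come," able to infer the rest from one example. The insight here is not inspiration out of nowhere; it comes from intense observation and thought that turn outside knowledge into one's own wisdom. Without the earlier accumulation and reflection, there would be no moment of clarity. 02. Qin Guan's choice of a word: "mountains brushed with faint cloud". The Northern Song ci poet Qin Guan wrote the enduring line "mountains brushed with faint cloud" (山抹微云). Why has the single word "brushed" (抹) lasted a thousand years? Because it is not simple description but the distillation of the poet's deep observation of, and reflection on, nature. Ordinary people look at clouds and see only that they drift; Qin Guan grasped the personified, emotionally charged interplay between mountain and cloud. The word is what remained after he weighed it again and again and threw out the mediocre alternatives. Here insight shows up as extreme sensitivity to detail and precise control of expression. 03. Whimsical's decluttering. Today I read an article (https://whimsical.com/blog/choosing-depth-over-breadth). Whimsical started in 2017 with the vision of building an all-in-one team collaboration hub, and first succeeded with focused tools such as flowcharts, wireframes, and mind maps. They were fixated on building that "all-in-one collaboration hub," but recently launched broad features such as Projects and Posts failed to get the expected attention. The wider product scope spread resources thin and threatened overall product quality, and the product team realized that users value depth in the core tools more than a single unified suite. Rather than dig in, they examined the problem the way a craftsman inspects a flawed piece, analyzing data and listening to users, and carried out a deep self-examination. In the end they made a hard decision: cut the Projects and Tasks features and refocus on their strongest core product, the whiteboard experience (Boards). Isn't this exactly "as if cut, as if filed"? Cut away the excess desire, grind off the restless breadth, and keep only the most essential value. This kind of insight is the courage to admit a mistake, and even more a step up from "wanting more" to "going deep." Whether it is Zigong's self-cultivation, Qin Guan's word-smithing, or Whimsical's product iteration, they all point to the same truth: insight is not found deep in the mountains but honed in the doing. In this noisy age, may we all have the patience to keep "chiseling and polishing," and, through continual self-reshaping, meet a clearer version of ourselves.  ( 1 min )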

  • Open

    Introducing Web Search on Amazon Bedrock AgentCore
    Web Search on Amazon Bedrock AgentCore is now generally available. In this post, we walk through what makes Web Search on Amazon Bedrock AgentCore different, why it matters, and how to wire it in with a few lines of code.  ( 113 min )
    Accelerate campaign workflow with insights from Adobe Marketing Agent for Amazon Quick
    This post shows how to enable Adobe Marketing Agent for Amazon Quick using the Model Context Protocol (MCP). We walk you through how to configure the integration, authenticate using your Adobe credentials, and get the latest insights in Amazon Quick. The sample workflow returns audience rankings, loyalty segment summaries, journey usage, and conflict recommendations.  ( 115 min )

  • Open

    Monitor and debug generative AI inference with SageMaker detailed metrics and Insights dashboard on CloudWatch
    Amazon SageMaker AI provides fully managed real-time inference hosting for machine learning models. You deploy a model to a SageMaker endpoint backed by one or more compute instances, and SageMaker handles provisioning and scaling. SageMaker supports multiple endpoint architectures. This post focuses on the two most relevant to generative AI workloads with detailed observability: Single-model endpoints (SME) and Inference component (IC) endpoints.  ( 115 min )
    Amazon Bedrock AgentCore harness is now generally available: Go from idea to production-grade agent in minutes
    Today, Amazon Bedrock AgentCore harness is generally available. Two API calls (CreateHarness to define an agent, and InvokeHarness to run it), and you have an agent running in seconds. The agent runs in its own isolated environment with a filesystem and shell, so it can read files, run commands, and write code safely. It remembers users and conversations across sessions, picks up skills you point it at (including the AWS-curated catalog), browses the web, calls your tools through gateway or MCP, and switches model providers mid-session without losing context. Every step streams back to you in real time and is automatically traced to Amazon CloudWatch. You don’t need to write orchestration code or build a container, unless you want to.  ( 120 min )

  • Open

    Amazon SageMaker AI Async Inference now supports inline request payloads
    Today, we’re announcing inline payload support for Amazon SageMaker AI Async Inference. Customers can now send inference payloads directly in the request body of the InvokeEndpointAsync API, removing the need to upload input data to Amazon Simple Storage Service (Amazon S3) before each invocation.  ( 110 min )
    Get back hours every day with autonomous agents in Amazon Quick
    Today, Quick gets even more powerful: new autonomous agents that work continuously on your behalf, an activity feed that helps you prioritize your most important work, and the ability to find insights across every data source your business runs on from a single question.  ( 110 min )
    Context intelligence for your data and AI agents at scale
    Agents are only as intelligent as the context they can reason over. Today, that context is scattered across data lakes, data warehouses, lakehouses, databases, and streams, and in institutional knowledge that has never been written down. You want to trust the decisions made by your AI agents, but that can't happen until agents have context. Imagine what becomes possible when we give agents a safe way to access the context they need to deliver trusted decisions. This is why at the AWS Summit New York City, we’re announcing a series of innovations that deliver intelligence for your data and AI agents at scale.  ( 111 min )
    New in Amazon Bedrock AgentCore: Build agents with broader knowledge and continuous learning
    Today we're introducing new capabilities on Amazon Bedrock AgentCore, the platform to build, connect, and optimize agents. In this post, we cover how these capabilities close each gap: connecting agents to organizational, web, and paid knowledge; helping teams find and fix what's going wrong in production; and enforcing controls that scale as agents grow more capable. Together, they help you build more capable agents faster, govern them with controls that scale, and improve them continuously.  ( 115 min )

  • Open

    Safeguard your agentic AI applications with the Amazon Bedrock Guardrails InvokeGuardrailChecks API
    Today, we’re announcing a new API with Amazon Bedrock Guardrails. With this API, you can apply individual safeguards, also referred to as safety checks, at any point in your agentic AI applications without creating guardrail resources. In this post, we walk through how the InvokeGuardrailChecks API works and how to use it to build safe, multi-turn agentic AI applications.  ( 115 min )
    Introducing container caching in Amazon SageMaker AI for faster model scaling
    Today, we’re excited to announce container image caching for Amazon SageMaker AI inference, the next major advancement in our faster scaling optimization journey. This speeds up end-to-end latency by up to 2x for generative AI models during scale-out events.  ( 111 min )
    Parallelize speculative decoding with P-EAGLE on Amazon SageMaker AI
    This post walks you through how to use P-EAGLE directly within Amazon SageMaker AI. It will demonstrate how to select a compatible model from the SageMaker JumpStart catalog, configure the parallel drafting specifications, and deploy a highly optimized real-time SageMaker AI endpoint to accelerate your generative AI applications.  ( 115 min )

  • Open

    Introducing Gemma 4 models on Amazon Bedrock
    Today, we are announcing the availability of the Gemma 4 family on Amazon Bedrock. Built by Google DeepMind and released under the Apache 2.0 license, Gemma 4 is a family of open-weight models designed with a focus on intelligence-per-parameter across a broad range of deployment scenarios. The family includes three instruction-tuned variants: Gemma 4 31B, Gemma 4 26B-A4B, and Gemma 4 E2B. These cover dense and mixture-of-experts (MoE) architectures, where only a fraction of the model’s parameters activate per request. The variants offer built-in reasoning, native function calling, and multimodal input across text and image.  ( 120 min )
    AI Agent Failure Detection and Root Cause Analysis with Strands Evals
    In this post, we walk you through calling the detector functions to diagnose real agent failures. You learn how to interpret their structured output: categorized failures with confidence scores, causal chains linking root causes to downstream symptoms, and fix recommendations specifying whether a change belongs in your system prompt or tool definitions. You also learn how to integrate detection into your evaluation pipeline for automated diagnosis on every test run.  ( 114 min )
    Build context-rich research agents with Deep Agents and Bedrock AgentCore
    In this post, you'll build a competitive research agent that demonstrates this pattern end to end. This walkthrough targets developers building multi-step AI workflows who need isolated execution environments for their agents. In Part 2 of the notebook, you can deploy this same agent to Bedrock AgentCore Runtime using the AgentCore CLI, so it runs as a managed, session-isolated service.  ( 113 min )
  • Open

    SRE Weekly Issue #521
    View on sreweekly.com A message from our sponsor, Bronto: Stuck with slow queries and scattered logs? What if you could easily retain all of your telemetry data in one place for a full year without sky-high bills? Now with Bronto, it’s possible. Connect the dots faster across TBs of always hot, full fidelity data. Try […]  ( 4 min )

  • Open

    Building Supercharger: How Rocket Close optimized title operations with agentic AI
    In this post, we explore how Rocket Close built a solution using Strands Agents, large language models (LLMs), Amazon Bedrock, Amazon Bedrock Knowledge Bases, and Model Context Protocol (MCP) tools. We cover solution features, the rationale for the technology stack, lessons learned, and the business impact at Rocket Close.  ( 113 min )
    Build a meeting prep and follow-up assistant with Amazon Quick and Cisco Webex MCP servers
    This post shows how to build a custom meeting prep and follow-up assistant using Amazon Quick and Cisco Webex MCP servers. From a single prompt, the agent finds an upcoming Webex meeting, reviews prior meeting summaries and transcripts, and pulls related Vidcast highlights and transcript context. It then searches Webex message threads for unresolved follow-ups and creates a concise prep brief. After the meeting, the same assistant can summarize the discussion and identify action items. It can also find related Vidcast updates and draft a follow-up message for the right Webex space.  ( 116 min )
    From PDFs to insights: Architecting an intelligent document processing pipeline with AWS generative AI services
    This post outlines the development of a cost-effective and scalable intelligent document processing pipeline on AWS, powered by Amazon Bedrock and its features. Amazon Bedrock Data Automation (BDA) is a managed service within Amazon Bedrock that automates the extraction of insights from documents. We demonstrate how BDA extracts and analyzes document content, while Strands Agents hosted on Amazon Bedrock AgentCore Runtime coordinate specialized processing tasks and Amazon Bedrock Knowledge Bases enable contextual understanding across multiple documents. By combining these capabilities within a unified architecture, organizations can transform their document processing workflows with minimal development effort.  ( 115 min )
    Built from the inside out: How AWS Professional Services became a frontier team first
    AWS Professional Services (AWS ProServe) compressed engagement timelines from months to days, not by adding artificial intelligence (AI) tools to an existing process, but by fundamentally rebuilding how we deliver from the inside out. In this post, we share how AWS ProServe became a frontier team, the practices that enabled it, and what your engineering organization can take from our experience.  ( 111 min )

  • Open

    Extract Data with On-demand and Batch Pipelines Dynamically
    This post demonstrates an intelligent document processing pipeline that offers both on-demand and batch inference options on Amazon Bedrock, giving you flexibility over document processing time and cost.  ( 114 min )
    Evaluate AI agents systematically with Agent-EvalKit
    Agent-EvalKit is an open-source toolkit (Apache 2.0) that makes this evaluation infrastructure available by integrating with AI coding assistants, including Claude Code, Kiro CLI, and Kilo Code. This post walks through how Agent-EvalKit works across its six evaluation phases, using a travel research agent built with the Strands Agents SDK and Amazon Bedrock as a running example.  ( 115 min )
    Spot trends faster, sort smarter: Unlocking Sparklines and Custom Sort in Amazon Quick
    Today, we’re excited to announce two new capabilities that make Quick Sight dashboards even more expressive and business-aligned: sparklines and custom sort for controls. In this post, we walk through both features, what they are, when to use them, and how to configure them, with real-world scenarios that bring them together in a practical, decision-ready dashboard.  ( 118 min )
    Optimize blueprint extraction accuracy in Amazon Bedrock Data Automation
    Blueprint instruction optimization is a BDA feature that automatically refines your extraction instructions to address this challenge directly. You provide three to ten example documents with expected values, and BDA refines your blueprint instructions to improve accuracy in minutes, not weeks. No separate model fine-tuning is required. By the end of this post, you can optimize your blueprints to improve accuracy, run the optimization workflow through the Amazon Bedrock console or the API, and apply best practices for selecting examples and ground truth.  ( 116 min )
    How frontier teams are reinventing AI-native development
    Frontier teams are not just using AI to code faster. They’re redesigning how software gets built. The result is 4.5x productivity gains, in some cases more than 10x.  ( 112 min )

  • Open

    Stop hand-tuning kernels: How Neuron Agentic Development accelerates AWS Trainium optimizations
    Today, we’re announcing the Neuron Agentic Development capabilities: a collection of AI agents and skills that make this possible for developers building on AWS Trainium and AWS Inferentia. In this post, we explain how the Neuron Agentic Development capabilities accelerate the kernel development workflow.  ( 114 min )
    Build an AI-Powered Equipment Repair Assistant Using Amazon Bedrock AgentCore
    In this post, you build an AI-powered equipment repair assistant using Amazon Bedrock AgentCore that helps farmers and field technicians diagnose equipment problems, identify required parts, and access manufacturer-approved repair procedures through natural language. The solution uses AgentCore Runtime with the Strands Agents SDK, Amazon Nova 2 Lite as the foundation model, Amazon Bedrock Knowledge Base for retrieval-augmented generation (RAG), and AgentCore Memory for conversation persistence.  ( 115 min )

  • Open

    Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
    In this post, we show how to train robot policies for the Unitree H1 humanoid with NVIDIA Isaac Lab on Amazon SageMaker AI across two compute options: Amazon SageMaker HyperPod and Amazon SageMaker Training Jobs.  ( 122 min )
    Hands-free first notice of loss: Using Strands Agents and Amazon Bedrock AgentCore Browser Tool for intelligent claims intake
    In this post, we demonstrate how a hands-free FNOL intake system combines agents built with the Strands Agents SDK for domain reasoning with Amazon Bedrock AgentCore Browser Tool for live portal interaction. This approach preserves human expertise while removing repetitive screen work.  ( 121 min )
    Build an agentic incident triage assistant with Amazon Quick and New Relic
    This post shows engineering teams how to apply that principle to one of the most time-sensitive workflows in engineering: incident triage. You will build a custom incident triage assistant agent using Amazon Quick that orchestrates a response with the New Relic Model Context Protocol (MCP) Server and Asana through native integrations. From a single prompt, the Amazon Quick agent investigates the incident, assembles a root cause analysis (RCA) brief with evidence links, and creates a tracked Asana task ready for handoff.  ( 113 min )

  • Open

    Unlocking AI flexibility in Europe: A guide to cross-region inference for EU data processing and model access
    With access to the latest generative AI models and high-performance accelerated compute in high global demand, AWS customers need tools to take advantage of model availability and capacity across multiple AWS Regions, while still meeting their security and privacy requirements. Cross-Region inference (CRIS) on Amazon Bedrock meets these needs by automatically routing requests across multiple […]  ( 114 min )
    It’s safe to close your laptop now: Hosting coding agents on Amazon Bedrock AgentCore
    Amazon Bedrock AgentCore Runtime gives each agent session its own isolated microVM with a persistent workspace, secure tool access through Gateway, and built-in observability—so you can run Claude Code, Codex, Kiro, and Cursor in parallel without sharing secrets, ports, or filesystems. Close the lid, go to dinner, and pick up where you left off tomorrow.  ( 122 min )
    Better decisions at scale: How mathematical optimization delivers where intuition fails
    In this post, we introduce mathematical optimization, explain how it fits within the broader AI landscape, and showcase real-world success stories where the Innovation Center has partnered with customers to deliver concrete results.  ( 111 min )
    End-to-end encrypted ML inference with Amazon SageMaker AI and FHE
    This blog has previously discussed FHE for ML inference in the post Enable fully homomorphic encryption with Amazon SageMaker endpoints for secure, real-time inferencing, but this post goes a little further. That previous post showed how to implement FHE-based inference 'from scratch' by hand-crafting a linear-regression algorithm using a low-level library called SEAL. Instead, this post shows a much more flexible and higher-level approach based on concrete-ml, a high-level library built specifically for FHE-based inference. It supports several common types of models 'out of the box' and is even API compatible with the well-known ML library scikit-learn.  ( 119 min )
    Amazon Quick ARNs: Cross-account migration and namespace permissions
    In this post, we cover the structure of Amazon Quick ARNs and provide a practical mental model for working with them. By the end, you can look at an ARN and immediately understand what it means for your migration strategy, diagnose permission issues faster, and design multi-tenant architectures with confidence.  ( 115 min )
    Evaluate your Amazon Nova Sonic voice agent at scale, no microphone required
    In this post, we walk you through the Nova Sonic Test Harness, an open source framework that we built to solve both problems. It serves as a rapid iteration tool for tuning system prompts and tool configurations (run a conversation, see results, adjust, repeat) and as a comprehensive evaluation framework for validating voice agent quality at scale. It runs complete multi-turn conversations with Amazon Nova Sonic automatically, evaluates them using LLM-as-judge techniques, and can even detect cases where the model’s audio output doesn’t match its text output (audio hallucinations). No microphone required.  ( 114 min )

  • Open

    SRE Weekly Issue #520
    View on sreweekly.com A message from our sponsor, BigPanda: Your team solved this incident last month. Why is it back? Because you fixed the symptom, not the cause. BigPanda surfaces the pattern behind repeat incidents and tells you what to fix so the next on-call doesn’t fight the same P1. Prevent incidents proactively AI Agents […]  ( 4 min )

  • Open

    Miell’s Law and Token Budgets
    TL;DR Conway’s Law tells us that organisations create systems that mirror their communication systems. Jamie Dobson coins ‘Miell’s Law’ in a post about the work of our mutual friend (and his colleague) Ian Miell in his forthcoming book ‘Follow the Money’: Organisations that design systems are constrained to produce systems that reflect the financial structures […]  ( 14 min )
    May 2026
    Pupdate It was Milo’s 5th birthday on the 12th, which meant a post about how he’s getting on. Sadly Max has also needed to visit the vets, with (we think) back ache, which might be the dreaded Intervertebral Disc Disease (IVDD). He’s been on reduced activity, so shorter walks, but thankfully seems pretty much back […]  ( 15 min )

  • Open

    NVIDIA Nemotron 3 Ultra now available on Amazon SageMaker JumpStart
    Deploy NVIDIA Nemotron 3 Ultra on Amazon SageMaker JumpStart. Get 5x faster inference and 30% lower cost for agentic AI workloads with this frontier reasoning model.  ( 109 min )

  • Open

    Automate model quota request and operational issue triage on Amazon Bedrock
    In this post, we introduce Amazon Bedrock Ops Alert, a three-layer automated monitoring solution that proactively detects operational issues, dynamically adjusts alarm thresholds, classifies alarms by category, automatically creates context-aware support cases, helps prevent duplicate cases when an unresolved case of the same alarm category is already active, and delivers contextualized notifications to AI SRE teams. We walk through the solution architecture and how you can deploy it in your own environment.  ( 120 min )
    How to build self-driving AI operations on Amazon Bedrock at scale
    In this post, we introduce Amazon Bedrock Ops Alert, a three-layer automated monitoring solution that proactively detects operational issues, dynamically adjusts alarm thresholds, classifies alarms by category, automatically creates context-aware support cases, helps prevent duplicate cases when an unresolved case of the same alarm category is already active, and delivers contextualized notifications to AI SRE teams. We walk through the solution architecture and how you can deploy it in your own environment.  ( 120 min )
2026-07-03T07:17:55.246Z osmosfeed 1.15.1