• Open

    Amazon Nova Act is now HIPAA eligible
    In this post, you will learn what Nova Act offers, how HIPAA eligibility applies to agentic AI, and how to get started.  ( 109 min )
    Intelligent radiology workflow optimization with AI agents
    Many healthcare organizations report that traditional worklist systems rely on rigid rules that ignore critical context, radiologist specialization, current workload, fatigue levels, and case complexity. This creates a persistent challenge: radiologists cherry-pick easier, higher-value cases while avoiding complex studies, leading to diagnostic delays and increased costs. Research across 62 hospitals analyzing 2.2 million studies found […]  ( 117 min )
    Integrating AWS API MCP Server with Amazon Quick using Amazon Bedrock AgentCore Runtime
    This post shows you how to use Amazon Bedrock AgentCore Runtime with Model Context Protocol (MCP) support to connect Amazon Quick with AWS services through the AWS API MCP Server, creating a conversational AI assistant that translates natural language into AWS Command Line Interface (AWS CLI) commands, without the need to switch between tools during critical moments.  ( 117 min )
    Building multi-tenant agents with Amazon Bedrock AgentCore
    This post explores design considerations for architecting multi-tenant agentic applications and the framework needed to address SaaS architecture challenges with Amazon Bedrock AgentCore.  ( 119 min )
    Break the context window barrier with Amazon Bedrock AgentCore
    In this post, you will learn how to implement Recursive Language Models (RLM) using Amazon Bedrock AgentCore Code Interpreter and the Strands Agents SDK. By the end, you will know how to process documents of varying lengths, with no upper bound on context size, use Bedrock AgentCore Code Interpreter as persistent working memory for iterative document analysis, and orchestrate sub-large language model (sub-LLM) calls from within a sandboxed Python environment to analyze specific document sections.  ( 116 min )
    Build AI agents for business intelligence with Amazon Bedrock AgentCore
    In this post, we show you how OPLOG developed three AI agents using the Strands Agents SDK, deployed them to Amazon Bedrock AgentCore, and integrated Amazon Bedrock with Anthropic’s Claude Sonnet and Amazon Bedrock Knowledge Bases for Retrieval Augmented Generation (RAG).  ( 115 min )
    Build an AI-powered recruitment assistant using Amazon Bedrock
    In this post, we demonstrate how to build an AI-powered recruitment assistant using Amazon Bedrock that brings efficiencies to candidate evaluation, generates personalized interview questions, and provides data-driven insights for human hiring decisions. This post presents a reference architecture for learning purposes — not a production-ready solution. Amazon Bedrock and the AWS services used here are general-purpose tools that customers can combine to support a wide variety of use cases, including recruitment workflows. The architecture demonstrates one possible approach; customers should adapt it to their specific requirements.  ( 117 min )
    Build AI-powered dashboard automation agents with NLP on Amazon Bedrock AgentCore
    This solution combines the power of Amazon Bedrock AgentCore, Strands Agents, and Amazon Quick transforms to deliver a secure, scalable, and intelligent system for building and operating AI agents while transforming data into actionable business insights.  ( 117 min )

  • Open

    Announcing OpenAI-compatible API support for Amazon SageMaker AI endpoints
    Today, Amazon SageMaker AI introduces OpenAI-compatible API support for real-time inference endpoints. If you use the OpenAI SDK, LangChain, or Strands Agents, you can now invoke models on SageMaker AI by changing only your endpoint URL. You don’t need a custom client, a SigV4 wrapper, or code rewrites. Overview With this launch, SageMaker AI endpoints […]  ( 115 min )
    Multimodal evaluators: MLLM-as-a-judge for image-to-text tasks in Strands Evals
    If you’re building visual shopping, image or document understanding, or chart analysis, you need a way to verify whether your model’s response is actually grounded in the source image. A text-only evaluator cannot tell you whether a caption faithfully describes an image, whether an extracted invoice total matches the document, or whether a screen summary […]  ( 114 min )
    Build real-time voice applications with Amazon SageMaker AI and vLLM
    Voice agents, live captioning, contact center analytics, and accessibility tools all depend on real-time speech-to-text, where your application streams audio in and receives transcription back simultaneously over a single persistent connection. Traditional request-response inference falls short here because transcription cannot begin until the entire audio recording has been received, adding latency that breaks the real-time […]  ( 115 min )

  • Open

    Scalable voice agent design with Amazon Nova Sonic: multi-agent, tools, and session segmentation
    In this post, you’ll learn how to use Amazon Nova Sonic, Amazon Bedrock AgentCore, and Strands BidiAgent to build scalable, maintainable voice agents that handle these challenges efficiently, resulting in more responsive and intelligent customer interactions. We’ll explore three popular architectural patterns for voice agents, highlighting their trade-offs and best practices for minimizing latency.  ( 114 min )
    Extending conversational memory in Kiro CLI using Amazon Bedrock AgentCore Memory
    In this post, we demonstrate how you can extend the conversational memory of Kiro CLI by implementing a custom Model Context Protocol (MCP) server that integrates with Amazon Bedrock AgentCore Memory. You can use Kiro CLI to interact with Kiro's AI agents directly from your terminal. Amazon Bedrock AgentCore Memory is a fully managed service that allows AI agents to retain information from past interactions, creating more intelligent and context-aware conversations. By implementing a custom MCP server, you can provide Kiro CLI with tools to store and retrieve conversation context, monitor memory usage, and manage the underlying Amazon Bedrock AgentCore Memory infrastructure.  ( 111 min )
    Accelerate ML feature pipelines with new capabilities in Amazon SageMaker Feature Store
    Today, we’re announcing three new capabilities available in SageMaker Python SDK v3.8.0. In this post, we walk through each capability with code examples you can use to get started. For complete end-to-end walkthroughs, see the accompanying notebooks for Lake Formation governance and Iceberg table properties in the SageMaker Python SDK repository.  ( 114 min )
    Implementing programmatic tool calling on Amazon Bedrock
    In this post, we show three ways to implement programmatic tool calling (PTC) on Amazon Bedrock: a self-hosted Docker sandbox on ECS for maximum control, a managed solution using Amazon Bedrock AgentCore Code Interpreter, and an Anthropic SDK-compatible path through a proxy for teams that prefer that developer experience.  ( 118 min )

  • Open

    Prompting Amazon Nova 2 for content moderation
    In this post, you learn how to prompt Amazon Nova 2 Lite for content moderation using structured and free-form approaches, grounded in the MLCommons AILuminate Assessment Standard. The prompting techniques use the AILuminate taxonomy as an example, but they work equally well with your own custom moderation policy. You can swap in your own category definitions and the prompt structure stays the same. We also benchmark the content moderation capabilities of Amazon Nova 2 Lite against several foundation models (FMs) on three public datasets.  ( 117 min )
    Aderant transforms cloud operations with Amazon Quick
    In this post, we share how Aderant used the AI-powered capabilities of Amazon Quick to unify search across six vendor systems and automate documentation workflows, achieving 90 percent faster search times and 75 percent documentation acceleration, and how others can apply these approaches to their operations.  ( 111 min )
    Integrate Atlassian Confluence Cloud with Amazon Quick
    In this post, you will learn how to set up the Confluence Cloud integration with Quick. This includes creating a knowledge base for semantic search, setting up Actions to query and manage Confluence pages, and organizing resources in Quick Spaces. Quick integrates with your current enterprise technology stack, from internal knowledge repositories and corporate intranets to business-critical applications and AWS data services.  ( 117 min )
    Build custom code-based evaluators in Amazon Bedrock AgentCore
    In this post, you will implement four Lambda-based custom code evaluators for a financial market-intelligence agent, register each with AgentCore, and run them in on-demand and online modes. You will also see how to combine custom code-based evaluators with built-in evaluators and how to call other AWS services for grounded fact-checking, PII detection, and real-time alerting.  ( 117 min )
  • Open

    SRE Weekly Issue #517
    View on sreweekly.com A message from our sponsor, BigPanda: No single team sees the full incident anymore. Today’s P1s break across services, teams, and infrastructure. Instead of chasing dashboards, waiting on tribal knowledge, or piecing together signals from siloed systems, BigPanda surfaces the complete picture to pinpoint root cause faster. See BigPanda for SREs Why […]  ( 4 min )

  • Open

    Restrict access to sensitive documents in your Amazon Quick knowledge bases for Amazon S3
    In this post, we walk through how to configure document-level ACLs for your S3 knowledge base in Amazon Quick. You will learn how to set up and verify an ACL configuration that enforces document-level permissions across chat and automated workflows.  ( 119 min )

  • Open

    Improve bot accuracy with Amazon Lex Assisted NLU
    In this post, you will learn how to implement Assisted NLU effectively. You will learn how to improve your bot design with effective intent and slot descriptions, validate your implementation using Test Workbench, and plan your transition from traditional NLU to Assisted NLU for both new and existing bots.  ( 117 min )
    Real-time voice agents with Stream Vision Agents and Amazon Nova 2 Sonic
    In this post, you learn how to combine Stream's Vision Agents open-source framework with Amazon Bedrock and Amazon Nova 2 Sonic to build real-time voice agents that can be production-ready in minutes. You'll learn how the integration works under the hood, walk through code examples, and explore advanced capabilities like function calling, automatic reconnection, and multilingual voice support.  ( 117 min )
    From siloed data to unified insights: Cross-account Athena Access for Amazon Quick
    Today, we're announcing cross-account Athena access for Amazon Quick. With this feature, customers can query Athena data in other AWS accounts using AWS Identity and Access Management (IAM) role chaining, with query costs billed to the account where the data resides.  ( 117 min )
    Control where your AI agents can browse with Chrome enterprise policies on Amazon Bedrock AgentCore
    In this post, you will configure Chrome enterprise policies to restrict a browser agent to a specific website, observe the policy enforcement through session recording, and demonstrate custom root CA certificates using a public test site. The walkthrough produces a working solution that researches Amazon Bedrock AgentCore documentation while operating under enterprise browser restrictions.  ( 116 min )

  • Open

    Build financial document processing with Pulse AI and Amazon Bedrock
    This post demonstrates how to build a document extraction and model fine-tuning pipeline that addresses challenges when processing complex financial documents. By combining Pulse AI's advanced document understanding capabilities with the powerful AI services of Amazon Bedrock, organizations can achieve enterprise-grade accuracy and extract contextually relevant financial insights at scale.  ( 118 min )
    Build real-time voice streaming applications with Amazon Nova Sonic and WebRTC
    Building end-to-end live streaming applications with real-time voice interaction presents several challenges. This post introduces a solution based on Amazon Nova 2 Sonic (Nova Sonic) and Amazon Kinesis Video Streams WebRTC (WebRTC) that addresses these challenges. In this post, we’ll walk through the solution architecture, implementation patterns, and two real-world scenario examples.  ( 112 min )
    Securing AI agents: How AWS and Cisco AI Defense scale MCP and A2A deployments
    The Cisco and AWS partnership addresses three challenges enterprises face when scaling AI agents: visibility gaps, security bottlenecks, and compliance risks. In this post, we explore how you can overcome AI security challenges through automated scanning and unified governance.  ( 112 min )
    Fine-tune LLM with Databricks Unity Catalog and Amazon SageMaker AI
    In this post, we demonstrate how to build a secure, complete LLM fine-tuning workflow that integrates Unity Catalog with Amazon SageMaker AI using Amazon EMR Serverless for preprocessing. The solution shows how to securely access governed data, maintain lineage across services, fine-tune the Ministral-3-3B-Instruct model, and register trained artifacts back into Unity Catalog. With this approach, you can continue using your existing services while preserving central governance, tracking data lineage without compromising security or compliance requirements.  ( 116 min )

  • Open

    How Amazon Finance streamlines regulatory inquiries by using generative AI on AWS
    In this post, we demonstrate how Amazon FinTech teams are using Amazon Bedrock and other AWS services to build a scalable AI application to transform how regulatory inquiries are handled. Each team using this solution creates and maintains its own dedicated knowledge base, populated with that team's specific documents and reference materials.  ( 114 min )
    Automate schema generation for intelligent document processing
    In this post, we'll show you how our multi-document discovery feature solves this problem. It serves as an automated pre-processing step, analyzing unknown documents, clustering them by type, and generating schemas ready for the IDP Accelerator. You'll learn how the new capability uses visual embeddings for automatic clustering and agents for schema generation. We'll also walk you through running the solution on your own document collections.  ( 113 min )
    Navigating EU AI Act requirements for LLM fine-tuning on Amazon SageMaker AI
    In this post, we show you how to set up FLOPs tracking during LLM fine-tuning using the open source Fine-Tuning FLOPs Meter toolkit on Amazon SageMaker AI. You learn how to determine your compliance status with a single configuration flag and generate audit-ready documentation.  ( 115 min )
  • Open

    Milo cancer diary part 23 – Five
    Milo is five today, which he’s mostly celebrating by snoozing on the office sofa behind me. He’s nearing the end of his fourth (modified) CHOP protocol, with just over 4 weeks and 3 more vet visits to go. Since going into remission at the start of the year things have proceeded mostly uneventfully, which is […]  ( 13 min )

  • Open

    Building web search-enabled agents with Strands and Exa
    In this post, you will learn how to set up the Exa integration in Strands Agents, understand the two core tools it exposes, and walk through real-world use cases that show how agents use web search to complete multi-step tasks.  ( 117 min )
    Introducing Claude Platform on AWS: Anthropic’s native platform, through your AWS account
    Today, we're excited to announce the general availability of Claude Platform on AWS. Claude Platform on AWS is a new service that gives customers direct access to Anthropic's native Claude Platform experience through their AWS account, with no separate credentials, contracts, or billing relationships required. AWS is the first cloud provider to offer access to the native Claude Platform experience. In this post, we explore how Claude Platform on AWS works and how you can start using it today.  ( 110 min )
    Manufacturing intelligence with Amazon Nova Multimodal Embeddings
    In this post, we build a multimodal retrieval system for aerospace manufacturing documents using Amazon Nova Multimodal Embeddings on Amazon Bedrock and Amazon S3 Vectors. We evaluate the system on 26 manufacturing queries and compare generation quality between a text-only pipeline and the multimodal pipeline.  ( 115 min )
    How Miro uses Amazon Bedrock to boost software bug routing accuracy and improve time-to-resolution from days to hours
    In this post, we dive deep into the architecture and techniques we used to improve Miro’s bug routing, achieving six times fewer team reassignments and five times shorter time-to-resolution powered by Amazon Bedrock.  ( 115 min )
    Amazon Quick: Accelerating the path from enterprise data to AI-powered decisions
    Amazon Quick helps turn your large enterprise data into fast and accurate AI-powered decisions. In this post, you will learn about five new capabilities of Amazon Quick that accelerate how data professionals deliver trusted AI-powered insights at enterprise scale.  ( 114 min )
  • Open

    SRE Weekly Issue #516
    View on sreweekly.com A message from our sponsor, incident.io: Paging is just 10% of your incident workflow. incident.io’s 4-step framework turns migration into a forcing function for the other 90%: cut alert noise, fix service ownership, and build the on-call program your team actually deserves. Not all index scans are equal: How we cut query […]  ( 4 min )

  • Open

    Halliburton enhances seismic workflow creation with Amazon Bedrock and Generative AI
    In this post, we'll explore how we built a proof-of-concept that converts natural language queries into executable seismic workflows while providing a question-answering capability for Halliburton's Seismic Engine tools and documentation. We'll cover the technical details of the solution, share evaluation results showing workflow acceleration of up to 95%, and discuss key learnings that can help other organizations enhance their complex technical workflows with generative AI.  ( 112 min )

  • Open

    Secure short-term GPU capacity for ML workloads with EC2 Capacity Blocks for ML and SageMaker training plans
    In this post, you will learn how to secure reserved GPU capacity for short-term workloads using Amazon Elastic Compute Cloud (Amazon EC2) Capacity Blocks for ML and Amazon SageMaker training plans. These solutions can address GPU availability challenges when you need short-term capacity for load testing, model validation, time-bound workshops, or preparing inference capacity ahead of a release.  ( 113 min )
    Overcoming reward signal challenges: Verifiable rewards-based reinforcement learning with GRPO on SageMaker AI
    In this post, you will learn how to implement reinforcement learning with verifiable rewards (RLVR) to introduce verification and transparency into reward signals to improve training performance. This approach works best when outputs can be objectively verified for correctness, such as in mathematical reasoning, code generation, or symbolic manipulation tasks. You will also learn how to layer techniques like Group Relative Policy Optimization (GRPO) and few-shot examples to further improve results. You’ll use the GSM8K dataset (Grade School Math 8K: a collection of grade school math problems) to improve math problem solving accuracy, but the techniques used here can be adapted to a wide variety of other use cases.  ( 117 min )
    Agents that transact: Introducing Amazon Bedrock AgentCore Payments, built with Coinbase and Stripe
    Today, we're announcing a preview of Amazon Bedrock AgentCore Payments, a new set of features in Amazon Bedrock AgentCore that enables AI agents to instantly access and pay for what they use. AgentCore Payments was developed in partnership with Coinbase and Stripe.  ( 111 min )

  • Open

    Cost effective deployment of vision-language models for pet behavior detection on AWS Inferentia2
    Tomofun, the Taiwan-headquartered pet-tech startup behind the Furbo Pet Camera, is redefining how pet owners interact with their pets remotely. To reduce costs and maintain accuracy, Tomofun turned to EC2 Inf2 instances powered by AWS Inferentia2, the Amazon purpose-built AI chips. In this post, we walk through the following sections in detail.  ( 111 min )

  • Open

    How Hapag-Lloyd uses Amazon Bedrock to transform customer feedback into actionable insights
    Hapag-Lloyd's Digital Customer Experience and Engineering team, distributed between Hamburg and Gdańsk, drives digital innovation by developing and maintaining customer-facing web and mobile products. In this post, we walk you through our generative AI–powered feedback analysis solution built using Amazon Bedrock, Elasticsearch, and open-source frameworks like LangChain and LangGraph.  ( 112 min )
    Streamlining generative AI development with MLflow v3.10 on Amazon SageMaker AI
    Today, we’re excited to announce that Amazon SageMaker AI MLflow Apps now support MLflow version 3.10, bringing enhanced capabilities for generative AI development and streamlined experiment tracking to your generative AI workflows. Building on the foundations established with Amazon SageMaker AI MLflow Apps, this latest version introduces powerful new features for observability, evaluation, and generative […]  ( 108 min )
    Introducing OS Level Actions in Amazon Bedrock AgentCore Browser
    We’re announcing OS Level Actions for AgentCore Browser. This new capability unblocks these scenarios by exposing direct OS control through the InvokeBrowser API, so agents can interact with content visible on the screen, not only what's accessible through the browser's web layer. By combining full-desktop screenshots with mouse and keyboard control at the OS level, agents can observe native UI, reason about it, and act on it within the same session. This post walks through how OS Level Actions work, what actions are supported, and how to get started.  ( 112 min )
    Secure AI agents with Amazon Bedrock AgentCore Identity on Amazon ECS
    AI agents in production require secure access to external services. Amazon Bedrock AgentCore Identity, available as a standalone service, secures how your AI agents access external services whether they run on compute platforms like Amazon ECS, Amazon EKS, AWS Lambda, or on-premises. This post implements Authorization Code Grant (3-legged OAuth) on Amazon ECS with secure session binding and scoped tokens.  ( 115 min )
    Intelligence-driven message defense and insights using Amazon Bedrock
    In this post, you will learn how you can use Amazon Nova Foundation Models in Amazon Bedrock to apply generative AI techniques for both business protection and enhancement. You can identify obvious and disguised attempts at direct contact while gaining valuable insights into customer sentiment and service improvement opportunities.  ( 115 min )

  • Open

    Beyond BI: How the Dataset Q&A feature of Amazon Quick powers the next generation of data decisions
    Business leaders across industries rely on operational dashboards as the shared source of truth that their teams execute against daily. But dashboards are built to answer known questions. When teams need to explore further, ad-hoc, multi-dimensional, or unforeseen questions, they hit a bottleneck. They wait hours or days for BI teams to build new views […]  ( 115 min )
    Introducing agent quality optimization in AgentCore, now in preview
    Generate recommendations from production traces, validate them with batch evaluation and A/B testing, and ship with confidence. AI agents that perform well at launch don’t stay that way. As models evolve, user behavior shifts, and prompts get reused in new contexts they were never designed for, agent quality quietly degrades. In most teams, the improvement […]  ( 109 min )
    Introducing the agent quality loop: AgentCore Optimization now in preview
    Generate recommendations from production traces, validate them with batch evaluation and A/B testing, and ship with confidence. AI agents that perform well at launch don’t stay that way. As models evolve, user behavior shifts, and prompts get reused in new contexts they were never designed for, agent quality quietly degrades. In most teams, the improvement […]  ( 109 min )
    Introducing the agent performance loop: AgentCore Optimization now in preview
    Generate recommendations from production traces, validate them with batch evaluation and A/B testing, and ship with confidence. AI agents that perform well at launch don’t stay that way. As models evolve, user behavior shifts, and prompts get reused in new contexts they were never designed for, agent quality quietly degrades. In most teams, the improvement […]  ( 109 min )
    Agent-guided workflows to accelerate model customization in Amazon SageMaker AI
    Amazon SageMaker AI now offers an agentic experience that changes this. Developers describe their use case using natural language, and the AI coding agent streamlines the entire journey, from use case definition and data preparation through technique selection, evaluation, and deployment. In this post, we walk you through the model customization lifecycle using SageMaker AI agent skills.  ( 113 min )
    Generate dashboards from natural language prompts in Amazon Quick
    Building meaningful dashboards demands hours of manual setup, even for experienced BI professionals. Amazon Quick now generates complete multi-sheet dashboards from natural language prompts, taking you from one or more datasets to a production-ready analysis in minutes. Data analysts building recurring operations reports, program managers preparing a leadership review, or engineers exploring a new dataset can […]  ( 108 min )
    From data lake to AI-ready analytics: Introducing new data source with S3 Tables in Amazon Quick
    Amazon Quick introduces Amazon S3 Tables (Apache Iceberg tables) as a new data source. With this feature, customers can directly query and visualize Apache Iceberg tables stored in an Amazon S3 table bucket without the need for intermediate data layers. In this post, we explore how Amazon Quick’s new Amazon S3 Tables data source enables near real-time analytics while streamlining modern data architectures.  ( 112 min )
    Introducing Dataset Q&A: Expanding natural language querying for structured datasets in Amazon Quick
    In this post, you learn how to get started with Dataset Q&A, explore real-world use cases with hands-on examples, and discover advanced capabilities like auto-discovery across all your data assets and multi-dataset querying in a single conversation.  ( 115 min )
    Capacity-aware inference: Automatic instance fallback for SageMaker AI endpoints
    Today, Amazon SageMaker AI introduces a capacity-aware instance pool for new and existing inference endpoints. You define a prioritized list of instance types, and SageMaker AI automatically works through your list whenever capacity is constrained at creation, during scale-out, and during scale-in. Your endpoint provisions on available AI infrastructure without manual intervention. This capability is available for Single Model Endpoints, Inference Component-based endpoints, and Asynchronous Inference endpoints.  ( 114 min )
  • Open

    VLANs on Sodola managed switches
    TL;DR There’s an aspect to the web user interface of Sodola switches that’s far from obvious, and not documented :/ When setting up trunk ports it’s necessary to select all the VLANs that will be carried, and the default is only to select VLAN 1. Background I’m starting to accumulate a small selection of things […]  ( 14 min )
  • Open

    SRE Weekly Issue #515
    View on sreweekly.com A message from our sponsor, atscaleconference.com: Building scalable, high-performance infrastructure for AI is one of today’s toughest challenges. Join @Scale: Systems & Reliability on June 25 in Bellevue, WA to learn how leading engineers are solving it. Secure your seat today! The Silent Failure of Reliability Metrics at Scale: Lessons Learned from […]  ( 4 min )

  • Open

    AWS Transform now automates BI migration to Amazon Quick in days
    In this post, we walk through the full journey, from setting up your migration workspace in AWS Transform to subscribing to partner agents through AWS Marketplace to unlocking Amazon Quick capabilities that change how your organization consumes data.  ( 112 min )
  • Open

    April 2026
    Pupdate It’s been mostly warm and dry, so plenty of opportunities for longer walks :) Milo is now on the final cycle of his 4th chemo protocol, and it’s proceeding OK. Toronto We started the month in Toronto, which was a really fun trip deserving its own post. Eyes My cataracts are gone, and I […]  ( 14 min )
  • Open

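    Su Dongpo
    If you are unhappy at work, it is only because you have not read Su Dongpo. In his biography of Su Dongpo, Lin Yutang says that Su Dongpo's life began after forty. Now that I am past forty myself, whenever the workplace rat race makes me unhappy, I find comfort and strength in Su Dongpo's poems. Many of today's workplace pressures and anxieties look new but have in fact existed since ancient times. With some free time over the May Day holiday, I copied out "Man Ting Fang" (满庭芳), which could be called Su's "anti-rat-race manifesto for the workplace", and it struck a chord with me. I also wrote a "workplace stress-relief version" mapped to today's workplace, hoping it reminds you that what truly matters is not endless competition but leaving yourself a little room for life. Workplace stress-relief version: KPIs are hollow fame, the year-end bonus a tiny profit; all told, why so much rushing about? (PA, promotions, all kinds of awards... for this bit of hollow fame and tiny profit, is it worth working yourself this hard?) Promotions are decided in advance; who out-grinds whom, and who is stronger? (You think promotion comes from working yourself to the bone, but often it has nothing to do with ability.) While I am not yet old at 40, let me indulge a little wildness. (Before 996 wrings you dry, keep a bit of "disobedient" wildness: at night, answer no calls and reply to no messages unless they are from the boss, and dare to write complaints in your weekly report.) In a hundred years, even if every day were spent drunk, that is thirty-six thousand times. (A lifetime is only about thirty-six thousand days; relaxing now and then is more precious than being tense every day.) Think it over: how much is there, when worry and storms get in the way of half of it? (Think about it: anxiety and inner friction have stolen at least half of your life. Is it worth it?) Why, then, argue to the death over who is right and who is wrong? (Why shift blame onto each other and play all kinds of dirty tricks?) Luckily there are the clear breeze and bright moon, the milk tea shop, and the canopy of clouds stretched high. (The real "benefits" are the clear breeze and bright moon after work, the little milk tea shop by the road, and the sunset glow on the horizon.) Getting off work is good: a thousand cups of fine wine and a song of "Man Ting Fang". (Getting off work is great. Have a drink with friends and sing a song; this is your own life.) Why it is still worth consulting today: Su Dongpo's attitude reminds us that work is only one part of life. You can work hard without turning yourself into a machine; you can be conscientious without handing all of your time and emotions to the workplace. This is not encouragement to slack off, but encouragement to keep your sense of self at work. Looking at pressure the way Su Dongpo did is not escape; it gives you one more choice: beyond the rat race, there are many more days worth living well. Life is not only about performance reviews; there are also quiet moments outside the busyness. May this May Day let you relax and find a "Man Ting Fang" of your own.  ( 1 min )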

  • Open

    Reinforcement fine-tuning with LLM-as-a-judge
    In this post, we take a deeper look at how RLAIF, or RL with LLM-as-a-judge, works effectively with Amazon Nova models.  ( 115 min )
    AWS Generative AI Model Agility Solution: A comprehensive guide to migrating LLMs for generative AI production
    In this post, we introduce a systematic framework for LLM migration or upgrade in generative AI production, encompassing essential tools, methodologies, and best practices. The framework facilitates transitions between different LLMs by providing robust protocols for prompt conversion and optimization.  ( 125 min )
    Sun Finance automates ID extraction and fraud detection with generative AI on AWS
    In this post, we show how Sun Finance used Amazon Bedrock, Amazon Textract, and Amazon Rekognition to build an AI-powered identity verification (IDV) pipeline. The solution improved extraction accuracy from 79.7% to 90.8%, cut per-document costs by 91%, and reduced processing time from up to 20 hours to under 5 seconds. You'll learn how combining specialized OCR with large language model (LLM) structuring outperformed using either tool alone. You'll also learn how to architect a serverless fraud detection system using vector similarity search.  ( 117 min )
    Unleashing Agentic AI Analytics on Amazon SageMaker with Amazon Athena and Amazon Quick
    This post demonstrates how an agentic AI assistant from Amazon Quick transforms data analytics into a self-service capability by using Amazon Simple Storage Service (Amazon S3) for storage, Amazon SageMaker and AWS Glue for the lakehouse, and Amazon Athena for serverless SQL querying across multiple storage formats (S3 Table, Iceberg, and Parquet).  ( 122 min )
    Configuring Amazon Bedrock AgentCore Gateway for secure access to private resources
    In this post, you will configure Amazon Bedrock AgentCore Gateway to access private endpoints using Resource Gateway, a managed construct that provisions Elastic Network Interfaces (ENIs) directly inside your Amazon VPC, one per subnet. You will explore two implementation modes (managed and self-managed) and walk through three practical scenarios: connecting to a private Amazon API Gateway endpoint, integrating with an MCP server on Amazon Elastic Kubernetes Service (Amazon EKS), and accessing a private REST API.  ( 113 min )

  • Open

    Extracting contract insights with PwC’s AI-driven annotation on AWS
    This post was co-written with Yash Munsadwala, Adam Hood, Justin Guse, and Hector Hernandez from PwC. Contract analysis often consumes significant time for legal, compliance, and procurement teams, especially when important insights are buried in lengthy, unstructured agreements. As contract volumes grow, finding specific clauses and assessing extracted terms can become increasingly difficult to scale. […]  ( 113 min )
    Organizing Agents’ memory at scale: Namespace design patterns in AgentCore Memory
    In this post, you will learn how to design namespace hierarchies, choose the right retrieval patterns, and implement AWS Identity and Access Management (IAM)-based access control for AgentCore Memory.  ( 112 min )
    Building AI-ready data: Vanguard’s Virtual Analyst journey
    In this post, you'll learn how Vanguard built their Virtual Analyst solution by focusing on eight guiding principles of AI-ready data, the AWS services that powered their implementation, and the measurable business outcomes they achieved.  ( 111 min )
    Run custom MCP proxies serverless on Amazon Bedrock AgentCore Runtime
    This post shows you how to deploy a serverless MCP proxy on Amazon Bedrock AgentCore Runtime that gives you a programmable layer to implement proper governance, controls, and observability aligned with an organization's security policies.  ( 116 min )
  • Open

    Toronto
    TL;DR Toronto is a fantastic city, with plenty to keep us entertained over our 6 night stay. Why Toronto? We’d originally talked about returning to Halifax Nova Scotia, which we last visited in 2000; and then $wife announced that she’d like to go somewhere new. Getting there We picked flights with Air Canada on their […]  ( 16 min )

  • Open

    Migrating a text agent to a voice assistant with Amazon Nova 2 Sonic
    In this post, we explore what it takes to migrate a traditional text agent into a conversational voice assistant using Amazon Nova 2 Sonic. We compare text and voice agent requirements, highlight design priorities for different use cases, break down agent architecture, and address common concerns like tools and sub-agents for reuse and system prompt adaptation. This post helps you navigate the migration process and avoid common pitfalls.  ( 113 min )
    NVIDIA Nemotron 3 Nano Omni model now available on Amazon SageMaker JumpStart
    Today, we are excited to announce the day zero availability of NVIDIA Nemotron 3 Nano Omni on Amazon SageMaker JumpStart. In this post, we walk through the model architecture and key capabilities of Nemotron 3 Nano Omni, explore the enterprise use cases it unlocks, and show you how to deploy and run inference using Amazon SageMaker JumpStart.  ( 109 min )

  • Open

    Automate repetitive tasks with Amazon Quick Flows
    This post shows you how to build your first AI-powered workflow, using Amazon Quick, starting with a financial analysis tool and progressing to an advanced employee onboarding automation.  ( 115 min )
    Build and deploy an automatic sync solution for Amazon Bedrock Knowledge Bases
    In this post, we explore an automated solution that detects S3 events and triggers ingestion jobs while respecting service quotas and providing comprehensive monitoring. This serverless solution uses an event-driven architecture to keep your knowledge base current without overwhelming the Amazon Bedrock APIs.  ( 114 min )
    Build Strands Agents with SageMaker AI models and MLflow
    In this post, we demonstrate how to build AI agents using Strands Agents SDK with models deployed on SageMaker AI endpoints. You will learn how to deploy foundation models from SageMaker JumpStart, integrate them with Strands Agents, and establish production-grade observability using SageMaker Serverless MLflow for agent tracing. We also cover how to implement A/B testing across multiple model variants and evaluate agent performance using MLflow metrics, and we show how you can build, deploy, and continuously improve AI agents on infrastructure you control.  ( 115 min )
    How Popsa used Amazon Nova to inspire customers with personalised title suggestions
    In this post, we share how we applied Amazon Bedrock and the Amazon Nova family of models to reimagine our Title Suggestion feature. By combining metadata, computer vision, and retrieval-augmented generative AI, we now automatically generate creative, brand-aligned titles and subtitles across 12 languages. Using the unified API of Amazon Bedrock, Anthropic’s Claude 3 Haiku, and Amazon Nova Lite and Pro, we improved quality, reduced cost, and cut response times. This resulted in higher customer satisfaction, measurable uplifts in engagement and purchase rates, and over 5.5 million personalised titles generated in 2025.  ( 111 min )
  • Open

    SRE Weekly Issue #514
    View on sreweekly.com How we built a real-world evaluation platform for autonomous SRE agents at scale Finally! Someone actually explaining how they test their SRE agent. Having a testing methodology is table stakes. Showing their work helps us decide whether we can trust the tool. With so many SRE agents floating around, it’s quite surprising […]  ( 4 min )
2026-05-24T19:13:43.281Z osmosfeed 1.15.1