
    Build AI agents with Amazon Bedrock AgentCore using AWS CloudFormation
    Amazon Bedrock AgentCore services are now supported by several IaC frameworks, including AWS Cloud Development Kit (AWS CDK), Terraform, and AWS CloudFormation templates. This integration brings the power of IaC directly to AgentCore so developers can provision, configure, and manage their AI agent infrastructure. In this post, we use CloudFormation templates to build an end-to-end application for a weather activity planner.  ( 109 min )
    How the Amazon.com Catalog Team built self-learning generative AI at scale with Amazon Bedrock
    In this post, we demonstrate how the Amazon Catalog Team built a self-learning system that continuously improves accuracy while reducing costs at scale using Amazon Bedrock.  ( 114 min )


    How PDI built an enterprise-grade RAG system for AI applications with AWS
    PDI Technologies is a global leader in the convenience retail and petroleum wholesale industries. In this post, we walk through the PDI Intelligence Query (PDIQ) process flow and architecture, focusing on the implementation details and the business outcomes it has helped PDI achieve.  ( 114 min )
    How CLICKFORCE accelerates data-driven advertising with Amazon Bedrock Agents
    In this post, we demonstrate how CLICKFORCE used AWS services to build Lumos and transform advertising industry analysis from weeks-long manual work into an automated, one-hour process.  ( 108 min )


    How Thomson Reuters built an Agentic Platform Engineering Hub with Amazon Bedrock AgentCore
    This blog post explains how TR's Platform Engineering team, a geographically distributed unit overseeing TR's service availability, boosted its operational productivity by transitioning from a manual process to an automated agentic system using Amazon Bedrock AgentCore.  ( 111 min )
    Build agents to learn from experiences using Amazon Bedrock AgentCore episodic memory
    In this post, we walk you through the complete architecture to structure and store episodes, discuss the reflection module, and share compelling benchmarks that demonstrate significant improvements in agent task success rates.  ( 116 min )
    How bunq handles 97% of support with Amazon Bedrock
    In this post, we show how bunq upgraded Finn, its in-house generative AI assistant, using Amazon Bedrock to transform user support and banking operations to be seamless, in multiple languages and time zones.  ( 111 min )
    Using Strands Agents to create a multi-agent solution with Meta’s Llama 4 and Amazon Bedrock
    In this post, we explore how to build a multi-agent video processing workflow using Strands Agents, Meta's Llama 4 models, and Amazon Bedrock to automatically analyze and understand video content through specialized AI agents working in coordination. To showcase the solution, we will use Amazon SageMaker AI to walk you through the code.  ( 114 min )


    Introducing multimodal retrieval for Amazon Bedrock Knowledge Bases
    In this post, we'll guide you through building multimodal RAG applications. You'll learn how multimodal knowledge bases work, how to choose the right processing strategy based on your content type, and how to configure and implement multimodal retrieval using both the console and code examples.  ( 112 min )


    SRE Weekly Issue #506
    View on sreweekly.com A message from our sponsor, Costory: You didn’t sign up to do FinOps. Costory automatically explains why your cloud costs change, and reports it straight to Slack. Built for SREs who want to code, not wrestle with spreadsheets. Now on AWS & GCP Marketplaces. Start your free trial at costory.io What came first- the CNAME […]  ( 4 min )


    Agentic Product Development and Theory of Constraints
    TL;DR Coding is no longer the constraint. It’s now cheaper than ever to make software. But there are supply side constraints on innovation, and getting apps to market. Who dreams up something worth making? How do apps get in front of users? There’s also a demand side constraint on adoption – how do people learn […]  ( 17 min )

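    A Tale of a Cyst
    A Tale of a Cyst: Search Engines and AI Advertising. A few years ago, a small cyst appeared on the surface of my skin. I did not know what it was at the time, so I instinctively searched for it. Alas, the search engine was like a ghost market: on the left they shouted "benign or malignant," on the right "cryo-excision," with hospitals pushed in front and ads laid out behind. A cyst the size of a mung bean was hawked as a monstrous menace. Frightened and flustered, I picked at it in the shower and broke it, and it became infected and festered. In the end I had to go to a Grade-A tertiary hospital, where the doctor advised treating the inflammation first and then excising it, and ordered a pathology exam as well. The chunk of flesh cut out was about the size of a piece of Su Dongpo's braised pork, dozens of times larger than the cyst itself, and it left a permanent scar, a "permanent souvenir" of sorts. This year, something similar happened again. The cause was probably vanity: I sprayed perfume in my armpit twice, and a few days later another small cyst popped up. In the old days I would have searched again and scared myself sleepless. The difference is that this time I had AI. I took a photo, described what had led up to it, and asked several AI tools. The AI replied: a minor infection, don't panic, keep it clean, don't break it, and keep an eye on it. The tone was even, neither alarming nor selling anything. I did as I was told. Although friction in the armpit caused some discomfort, it did not get any worse. A few days later, the little lump turned from pink to dark, gradually shrank, and then one day in the shower it was suddenly gone, as naturally as a ripe melon dropping from its stem. No surgery, no scar, only a soft sigh. These two experiences are like a comparative experiment between traditional search and AI. The former manufactures anxiety, like street hawking, and pushes users toward hospitals and spending; the latter offers a calm reference that helps people observe and choose rationally. That is how a mung-bean-sized cyst gets blown up into a monstrous menace, with users swept along by fear. The AI's answer cannot replace a doctor, but it is at least more objective, is not driven by profit, and does not use exaggerated words to frighten people. In the information age, the way we obtain knowledge shapes our state of mind. Traditional search is like a noisy market where the hawkers' cries rise and fall; AI is more like a calm friend who helps you weigh the pros and cons and reminds you to observe and be patient. The former breeds anxiety; the latter brings peace of mind. I hope the information tools of the future will have fewer paid-ranking tricks and more genuine help. After all, if knowledge exists only to make money, the heart festers before the cyst even breaks; if it can put people's minds at ease, that is the beauty of the ripe melon dropping from its stem. Update: just yesterday I heard that OpenAI is about to put ads in ChatGPT, and a chill ran down my spine.  ( 1 min )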


    Advanced fine-tuning techniques for multi-agent orchestration: Patterns from Amazon at scale
    In this post, we show you how fine-tuning enabled a 33% reduction in dangerous medication errors (Amazon Pharmacy), an 80% reduction in human engineering effort (Amazon Global Engineering Services), and an improvement in content quality assessment accuracy from 77% to 96% (Amazon A+). This post details the techniques behind these outcomes: from foundational methods like Supervised Fine-Tuning (SFT) (instruction tuning), and Proximal Policy Optimization (PPO), to Direct Preference Optimization (DPO) for human alignment, to cutting-edge reasoning optimizations such as Group Relative Policy Optimization (GRPO), Direct Advantage Policy Optimization (DAPO), and Group Sequence Policy Optimization (GSPO) purpose-built for agentic systems.  ( 118 min )
    How Palo Alto Networks enhanced device security infra log analysis with Amazon Bedrock
    Palo Alto Networks’ Device Security team wanted to detect early warning signs of potential production issues to give SMEs more time to react to these emerging problems. They partnered with the AWS Generative AI Innovation Center (GenAIIC) to develop an automated log classification pipeline powered by Amazon Bedrock. In this post, we discuss how Amazon Bedrock, through Anthropic’s Claude Haiku model, and Amazon Titan Text Embeddings work together to automatically classify and analyze log data. We explore how this automated pipeline detects critical issues, examine the solution architecture, and share implementation insights that have delivered measurable operational improvements.  ( 111 min )
    From beginner to champion: A student’s journey through the AWS AI League ASEAN finals
    The AWS AI League, launched by Amazon Web Services (AWS), expanded its reach to the Association of Southeast Asian Nations (ASEAN) last year, welcoming student participants from Singapore, Indonesia, Malaysia, Thailand, Vietnam, and the Philippines. In this blog post, you’ll hear directly from the AWS AI League champion, Blix D. Foryasen, as he shares his reflection on the challenges, breakthroughs, and key lessons discovered throughout the competition.  ( 121 min )
    Deploy AI agents on Amazon Bedrock AgentCore using GitHub Actions
    In this post, we demonstrate how to use a GitHub Actions workflow to automate the deployment of AI agents on AgentCore Runtime. This approach delivers a scalable solution with enterprise-level security controls, providing complete continuous integration and delivery (CI/CD) automation.  ( 111 min )


    How the Amazon AMET Payments team accelerates test case generation with Strands Agents
    In this post, we explain how we overcame the limitations of single-agent AI systems through a human-centric approach, implemented structured outputs to significantly reduce hallucinations, and built a scalable solution now positioned for expansion across the AMET QA team and later across other QA teams in the International Emerging Stores and Payments (IESP) organization.  ( 120 min )
    Build a generative AI-powered business reporting solution with Amazon Bedrock
    This post introduces generative AI guided business reporting—with a focus on writing achievements & challenges about your business—providing a smart, practical solution that helps simplify and accelerate internal communication and reporting.  ( 110 min )
    Safeguard generative AI applications with Amazon Bedrock Guardrails
    In this post, we demonstrate how you can address these challenges by adding centralized safeguards to a custom multi-provider generative AI gateway using Amazon Bedrock Guardrails.  ( 115 min )
    Scale creative asset discovery with Amazon Nova Multimodal Embeddings unified vector search
    In this post, we describe how you can use Amazon Nova Multimodal Embeddings to retrieve specific video segments. We also review a real-world use case in which Nova Multimodal Embeddings achieved a recall success rate of 96.7% and a high-precision recall of 73.3% (returning the target content in the top two results) when tested against a library of 170 gaming creative assets. The model also demonstrates strong cross-language capabilities with minimal performance degradation across multiple languages.  ( 115 min )


    How AutoScout24 built a Bot Factory to standardize AI agent development with Amazon Bedrock
    In this post, we explore the architecture that AutoScout24 used to build their standardized AI development framework, enabling rapid deployment of secure and scalable AI agents.  ( 112 min )
    Transform AI development with new Amazon SageMaker AI model customization and large-scale training capabilities
    This post explores how new serverless model customization capabilities, elastic training, checkpointless training, and serverless MLflow work together to accelerate your AI development from months to days.  ( 112 min )


    Securing Amazon Bedrock cross-Region inference: Geographic and global
    In this post, we explore the security considerations and best practices for implementing Amazon Bedrock cross-Region inference profiles. Whether you're building a generative AI application or need to meet specific regional compliance requirements, this guide will help you understand the secure architecture of Amazon Bedrock CRIS and how to properly configure your implementation.  ( 116 min )


    How Omada Health scaled patient care by fine-tuning Llama models on Amazon SageMaker AI
    This post is co-written with Sunaina Kavi, AI/ML Product Manager at Omada Health. Omada Health, a longtime innovator in virtual healthcare delivery, launched a new nutrition experience in 2025, featuring OmadaSpark, an AI agent trained with robust clinical input that delivers real-time motivational interviewing and nutrition education. It was built on AWS. OmadaSpark was designed […]  ( 110 min )

    SRE Weekly Issue #505
    View on sreweekly.com A message from our sponsor, Hopp: Paging at 2am? 🚨 Make incident triage feel like you’re at the same keyboard with Hopp. crisp, readable screen-sharing no more “can you zoom in?” click + type together bring the incident bridge into one session Start pair programming: https://www.gethopp.app/?via=sreweekly 2013–09–17 Outage Postmortem An incident write-up […]  ( 4 min )

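    The Meaning of Life
    AI cannot replace our search for the meaning of life. Everyone loves a shortcut. When AI arrived, people exclaimed: a painting in thirty seconds, an essay in five minutes, code generated with a single click. It was like pie falling from the sky, saving both effort and thought. But if we hand our whole lives over to shortcuts, do we not become tourists who just "buy souvenirs"? We glance at the flowers from horseback, snap a few photos, and when we look through them at home, our hearts are empty. It is said that the Southern Song poet Yang Wanli wrote more than twenty thousand poems in his lifetime, of which more than 4,200 survive, yet the ones we truly know by heart come down to a few lines: "The little lotus has just shown its pointed tip, and already a dragonfly is perched upon it," and "Lotus leaves reaching to the sky, an endless green; lotus blossoms in the sun, an uncommon red." Yet he is said to have founded a school of his own, and his poetry is said to be called the "Chengzhai style": unadorned, tossed off effortlessly, yet full of genuine charm. Life is the same. In writing, painting, or any work, if you seek only the result, you lose the flavor steeped in the process. It is like brewing tea: the leaves go in only once the water boils, and it is slow simmering over a low flame that lets the aroma rise. A one-click instant brew is drinkable, but it lacks the leisure of "waiting." AI's talent is speed, speed enough to unsettle people. But the value of the human heart lies precisely in slowness. Writing a poem, weighing each word, and then a sudden flash of inspiration: how could a machine stand in for the joy of that moment? Painting a picture, the brush moving like dragons and snakes, the scent of ink everywhere, heart and hand in accord: how could an algorithm give you that exhilaration? Process is a whetstone. It teaches patience and focus, the taste of failure, and how hard-won success is. If you want only the result, you lose the chance to temper your character. It is like climbing a mountain: if you take the cable car straight to the summit, the scenery is right there before you, but the memory of sweat and a pounding heart is missing. The real scenery is often halfway up, where you grumble at yourself for "going out of your way to suffer." Meaning, too, lies in the process. People do not work only for food and warmth. Belonging, self-esteem, self-actualization, and the search for meaning are all found in it. Just as I am putting together this casual essay now, not only to publish it, but to connect with other people's hearts. So AI can give results, but it cannot give stories. Stories need people to tell them, and the process needs people to walk it. Shortcuts are tempting, but a life made entirely of shortcuts becomes an empty shell. True value lies in taking one step at a time, in slow simmering, in that genuine "Chengzhai" charm. The result is the fruit; the process is the flower. Flowers bloom and fade, and that is how we get four seasons.  ( 1 min )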


    Crossmodal search with Amazon Nova Multimodal Embeddings
    In this post, we explore how Amazon Nova Multimodal Embeddings addresses the challenges of crossmodal search through a practical ecommerce use case. We examine the technical limitations of traditional approaches and demonstrate how Amazon Nova Multimodal Embeddings enables retrieval across text, images, and other modalities. You learn how to implement a crossmodal search system by generating embeddings, handling queries, and measuring performance. We provide working code examples and share how to add these capabilities to your applications.  ( 113 min )


    Accelerating LLM inference with post-training weight and activation using AWQ and GPTQ on Amazon SageMaker AI
    Quantized models can be seamlessly deployed on Amazon SageMaker AI using a few lines of code. In this post, we explore why quantization matters—how it enables lower-cost inference, supports deployment on resource-constrained hardware, and reduces both the financial and environmental impact of modern LLMs, while preserving most of their original performance. We also take a deep dive into the principles behind PTQ and demonstrate how to quantize the model of your choice and deploy it on Amazon SageMaker.  ( 125 min )
    How Beekeeper optimized user personalization with Amazon Bedrock
    Beekeeper’s automated leaderboard approach and human feedback loop system for dynamic LLM and prompt pair selection addresses the key challenges organizations face in navigating the rapidly evolving landscape of language models.  ( 114 min )
    Sentiment Analysis with Text and Audio Using AWS Generative AI Services: Approaches, Challenges, and Solutions
    This post, developed through a strategic scientific partnership between AWS and the Instituto de Ciência e Tecnologia Itaú (ICTi), an R&D hub maintained by Itaú Unibanco, the largest private bank in Latin America, explores the technical aspects of sentiment analysis for both text and audio. We present experiments comparing multiple machine learning (ML) models and services, discuss the trade-offs and pitfalls of each approach, and highlight how AWS services can be orchestrated to build robust, end-to-end solutions. We also offer insights into potential future directions, including more advanced prompt engineering for large language models (LLMs) and expanding the scope of audio-based analysis to capture emotional cues that text data alone might miss.  ( 114 min )
    Architecting TrueLook’s AI-powered construction safety system on Amazon SageMaker AI
    This post provides a detailed architectural overview of how TrueLook built its AI-powered safety monitoring system using SageMaker AI, highlighting key technical decisions, pipeline design patterns, and MLOps best practices. You will gain valuable insights into designing scalable computer vision solutions on AWS, particularly around model training workflows, automated pipeline creation, and production deployment strategies for real-time inference.  ( 114 min )


    Scaling medical content review at Flo Health using Amazon Bedrock (Part 1)
    This two-part series explores Flo Health's journey with generative AI for medical content verification. Part 1 examines our proof of concept (PoC), including the initial solution, capabilities, and early results. Part 2 focuses on scaling challenges and real-world implementation. Each article stands alone while collectively showing how AI transforms medical content management at scale.  ( 114 min )
    Detect and redact personally identifiable information using Amazon Bedrock Data Automation and Guardrails
    This post shows an automated PII detection and redaction solution using Amazon Bedrock Data Automation and Amazon Bedrock Guardrails through a use case of processing text and image content in high volumes of incoming emails and attachments. The solution features a complete email processing workflow with a React-based user interface for authorized personnel to more securely manage and review redacted email communications and attachments. We walk through the step-by-step solution implementation procedures used to deploy this solution. Finally, we discuss the solution benefits, including operational efficiency, scalability, security and compliance, and adaptability.  ( 116 min )
    Speed meets scale: Load testing SageMaker AI endpoints with Observe.AI’s testing tool
    Observe.AI developed the One Load Audit Framework (OLAF), which integrates with SageMaker to identify bottlenecks and performance issues in ML services, offering latency and throughput measurements under both static and dynamic data loads. In this blog post, you will learn how to use the OLAF utility to test and validate your SageMaker endpoint.  ( 113 min )


    Milo cancer diary part 22 – remission again
    CHOP #4 has worked, and Milo’s scan today shows that he’s in remission again (before even getting his Epirubicin). This cycle of chemo seemed to go better than previous protocols, until we got to the planned Epirubicin last week, and his neutrophils were too low. So we were back at North Downs Specialist Referrals (NDSR) […]  ( 14 min )

    SRE Weekly Issue #504
    View on sreweekly.com Finding the grain of sand in a heap of Salt Salt is Cloudflare’s configuration management tool. How do you find the root cause of a configuration management failure when you have a peak of hundreds of changes in 15 minutes on thousands of servers? The result of this has been a reduction […]  ( 4 min )


    December 2025
    Pupdate It’s been quite dry over the Christmas break, which has encouraged some longer than usual walks that the boys have enjoyed. After a scan at the start of the month Milo has now almost completed the first cycle of his 4th modified ‘CHOP’ chemotherapy protocol. As before, low neutrophils mean we’re a little behind […]  ( 16 min )


    Migrate MLflow tracking servers to Amazon SageMaker AI with serverless MLflow
    This post shows you how to migrate your self-managed MLflow tracking server to an MLflow App, a serverless tracking server on SageMaker AI that automatically scales resources based on demand while removing server patching and storage management tasks at no cost. Learn how to use the MLflow Export Import tool to transfer your experiments, runs, models, and other MLflow resources, including instructions to validate your migration's success.  ( 111 min )
    Build an AI-powered website assistant with Amazon Bedrock
    This post demonstrates how to solve this challenge by building an AI-powered website assistant using Amazon Bedrock and Amazon Bedrock Knowledge Bases.  ( 110 min )

    Silent PC GPU upgrade
    TL;DR Nvidia have ended Linux support for my ‘Pascal’ GTX 1050 Ti GPU. I’ve been able to fit an RTX 5050 card in its place, though the process was problematic due to driver issues. And I’m still concerned that it can only be limited to 110W when my passive cooling is rated up to 75W. […]  ( 16 min )

    SRE Weekly Issue #503
    View on sreweekly.com The Abstraction Debt in Infrastructure as Code Abstraction is meant to encapsulate complexity, but when done poorly, it creates opacity—a lack of visibility into what’s actually happening under the hood.   RoseSecurity Fun with incident data and statistical process control This article uses publicly available incident data and an open source tool to […]  ( 4 min )
2026-01-23T18:30:40.429Z osmosfeed 1.15.1