
    Set up a custom plugin on Amazon Q Business and authenticate with Amazon Cognito to interact with backend systems
    In this post, we demonstrate how to build a custom plugin with Amazon Q Business for backend integration. This plugin can integrate existing systems, including third-party systems, with little to no development, in just weeks, and automate critical workflows. Additionally, we show how to safeguard the solution using Amazon Cognito and AWS IAM Identity Center, maintaining the safety and integrity of sensitive data and workflows.  ( 12 min )
    Detect hallucinations for RAG-based systems
    This post walks you through how to create a basic hallucination detection system for RAG-based applications. We also weigh the pros and cons of different methods in terms of accuracy, precision, recall, and cost.  ( 13 min )
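One simple baseline for the kind of detector the post describes is a lexical grounding check: flag answer sentences whose tokens are poorly covered by the retrieved context. The sketch below is an illustrative assumption, not the post's implementation; the 0.5 threshold is a knob you would tune against labeled examples to trade precision against recall.

```python
# Minimal grounding check: flag answer sentences with low lexical
# overlap against the retrieved context. Illustrative baseline only,
# not the method described in the post.
import re

def _tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def sentence_support(sentence, context):
    """Fraction of a sentence's tokens that also appear in the context."""
    sent = _tokens(sentence)
    if not sent:
        return 1.0
    return len(sent & _tokens(context)) / len(sent)

def detect_hallucinations(answer, context, threshold=0.5):
    """Return answer sentences whose token support falls below the threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if sentence_support(s, context) < threshold]

context = "The Eiffel Tower is in Paris and was completed in 1889."
answer = "The Eiffel Tower is in Paris. It was painted gold in 1920."
print(detect_hallucinations(answer, context))  # → ['It was painted gold in 1920.']
```

A lexical check like this is cheap but misses paraphrases; LLM-as-a-judge or entailment-model approaches trade cost for better recall on reworded claims.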
    AWS machine learning supports Scuderia Ferrari HP pit stop analysis
    Pit crews are trained to operate at optimum efficiency, but measuring their performance has been challenging, until now. In this post, we share how Amazon Web Services (AWS) is helping Scuderia Ferrari HP develop more accurate pit stop analysis techniques using machine learning (ML).  ( 6 min )
    Accelerate edge AI development with SiMa.ai Edgematic with a seamless AWS integration
    In this post, we demonstrate how to retrain and quantize a model using SageMaker AI and the SiMa.ai Palette software suite. The goal is to accurately detect individuals in environments where visibility and protective equipment detection are essential for compliance and safety.  ( 14 min )


    How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod
    Building on this foundation of specialized information extraction solutions and using the capabilities of SageMaker HyperPod, we collaborate with Apoidea Group to explore the use of large vision language models (LVLMs) to further improve table structure recognition performance on banking and financial documents. In this post, we present our work and step-by-step code on fine-tuning the Qwen2-VL-7B-Instruct model using LLaMA-Factory on SageMaker HyperPod.  ( 15 min )
    How Qualtrics built Socrates: An AI platform powered by Amazon SageMaker and Amazon Bedrock
    In this post, we share how Qualtrics built an AI platform powered by Amazon SageMaker and Amazon Bedrock.  ( 12 min )
    Vxceed secures transport operations with Amazon Bedrock
    AWS partnered with Vxceed to support their AI strategy, resulting in the development of LimoConnect Q, an innovative ground transportation management solution. Using AWS services including Amazon Bedrock and AWS Lambda, Vxceed successfully built a secure, AI-powered solution that streamlines trip booking and document processing.  ( 9 min )


    Cost-effective AI image generation with PixArt-Sigma inference on AWS Trainium and AWS Inferentia
    This post is the first in a series where we will run multiple diffusion transformers on Trainium and Inferentia-powered instances. In this post, we show how you can deploy PixArt-Sigma to Trainium and Inferentia-powered instances.  ( 9 min )
    Customize DeepSeek-R1 671b model using Amazon SageMaker HyperPod recipes – Part 2
    In this post, we use the recipes to fine-tune the original DeepSeek-R1 671b parameter model. We demonstrate this through the step-by-step implementation of these recipes using both SageMaker training jobs and SageMaker HyperPod.  ( 15 min )
    Build a financial research assistant using Amazon Q Business and Amazon QuickSight for generative AI–powered insights
    In this post, we show you how Amazon Q Business can help augment your generative AI needs in all the abovementioned use cases and more by answering questions, providing summaries, generating content, and securely completing tasks based on data and information in your enterprise systems.  ( 12 min )


    Securing Amazon Bedrock Agents: A guide to safeguarding against indirect prompt injections
    Generative AI tools have transformed how we work, create, and process information. At Amazon Web Services (AWS), security is our top priority. Therefore, Amazon Bedrock provides comprehensive security controls and best practices to help protect your applications and data. In this post, we explore the security measures and practical strategies provided by Amazon Bedrock Agents to safeguard your AI interactions against indirect prompt injections, making sure that your applications remain both secure and reliable.  ( 11 min )
    Build scalable containerized RAG-based generative AI applications on AWS using Amazon EKS with Amazon Bedrock
    In this post, we demonstrate a solution using Amazon Elastic Kubernetes Service (EKS) with Amazon Bedrock to build scalable and containerized RAG solutions for your generative AI applications on AWS while bringing your unstructured user file data to Amazon Bedrock in a straightforward, fast, and secure way.  ( 7 min )
    How Hexagon built an AI assistant using AWS generative AI services
    Recognizing the transformative benefits of generative AI for enterprises, we at Hexagon’s Asset Lifecycle Intelligence division sought to enhance how users interact with our Enterprise Asset Management (EAM) products. Understanding these advantages, we partnered with AWS to embark on a journey to develop HxGN Alix, an AI-powered digital worker using AWS generative AI services. This blog post explores the strategy, development, and implementation of HxGN Alix, demonstrating how a tailored AI solution can drive efficiency and enhance user satisfaction.  ( 13 min )


    Build an intelligent community agent to revolutionize IT support with Amazon Q Business
    In this post, we demonstrate how your organization can reduce the end-to-end burden of resolving regular challenges experienced by your IT support teams—from understanding errors and reviewing diagnoses, remediation steps, and relevant documentation, to opening external support tickets using common third-party services such as Jira.  ( 11 min )

    Milo cancer diary part 19 – Four
    Milo is four today, another milestone worth celebrating :) Last week he had a scan, eight weeks after the end of his third (modified) CHOP protocol. The scan was clear, so we’ll be back to NDSR at the start of July. It’s all a very similar situation to a year ago. Insurance ManyPets […]  ( 13 min )

    SRE Weekly Issue #476
    View on sreweekly.com Automation and The Substitution Myth The myth is: The underlying and often unexamined assumption for the benefits of automation is the notion that computers/machines are better at some tasks, and humans are better at a different, non-overlapping set of tasks. This article lays out several pitfalls to this approach, with references.   Courtney […]  ( 4 min )


    Elevate marketing intelligence with Amazon Bedrock and LLMs for content creation, sentiment analysis, and campaign performance evaluation
    In the media and entertainment industry, understanding and predicting the effectiveness of marketing campaigns is crucial for success. Marketing campaigns are the driving force behind successful businesses, playing a pivotal role in attracting new customers, retaining existing ones, and ultimately boosting revenue. However, launching a campaign isn’t enough; to maximize their impact and help achieve […]  ( 13 min )


    How Deutsche Bahn redefines forecasting using Chronos models – Now available on Amazon Bedrock Marketplace
    Whereas traditional forecasting methods typically rely on statistical modeling, Chronos treats time series data as a language to be modeled and uses a pre-trained FM to generate forecasts, similar to how large language models (LLMs) generate text. Chronos helps you achieve accurate predictions faster, significantly reducing development time compared to traditional methods. In this post, we share how Deutsche Bahn is redefining forecasting using Chronos models, and provide an example use case to demonstrate how you can get started using Chronos.  ( 9 min )
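The "time series as a language" idea can be made concrete with a toy tokenizer: mean-scale the series, then quantize each value into a fixed vocabulary of bins so a pre-trained language model can consume the series as tokens. This is a simplified sketch of the concept only; the actual Chronos implementation differs in its scaling, binning, and special tokens.

```python
# Toy Chronos-style tokenization: mean-absolute scaling followed by
# uniform binning into a small "vocabulary" of token IDs.
def tokenize_series(values, n_bins=20, low=-3.0, high=3.0):
    # Scale by the mean absolute value (guard against an all-zero series).
    mean = sum(abs(v) for v in values) / len(values) or 1.0
    width = (high - low) / n_bins
    tokens = []
    for v in values:
        scaled = v / mean
        # Clip into [low, high) so every value maps to a valid bin.
        clipped = min(max(scaled, low), high - 1e-9)
        tokens.append(int((clipped - low) / width))
    return tokens

series = [10.0, 20.0, 16.0, 4.0]
print(tokenize_series(series))  # → [12, 15, 14, 11]
```

Once a series is a token sequence, next-token prediction over the bin vocabulary yields a forecast, which is then mapped back to real values by inverting the binning and scaling.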


    Use custom metrics to evaluate your generative AI application with Amazon Bedrock
    Now with Amazon Bedrock, you can develop custom evaluation metrics for both model and RAG evaluations. This capability extends the LLM-as-a-judge framework that drives Amazon Bedrock Evaluations. In this post, we demonstrate how to use custom metrics in Amazon Bedrock Evaluations to measure and improve the performance of your generative AI applications according to your specific business requirements and evaluation criteria.  ( 18 min )


    SRE Weekly Issue #475
    View on sreweekly.com Anomaly Detection in Time Series Using Statistical Analysis I haven’t seen this level of detail in an article on anomaly detection in quite a while. Still, the math is very approachable even if you slept through stats class.   Ivan Shubin — Booking.com A Key Incident Response Skill That Can Reduce Resolution Time TL;DR: […]  ( 4 min )

    Plex Media Server
    Today I want to recommend a free, open source media server that lets you store your own media collection (movies, TV shows, music, photos, and home videos) and stream it to all kinds of devices, such as smartphones, tablets, streaming boxes, and smart TVs. This Media Server runs on a range of operating systems, including Windows, macOS, Linux, and FreeBSD, and can also be installed on a NAS (network-attached storage) device or a Raspberry Pi. It is a powerful streaming media server: you add media files from local storage and they become available to play on any device, anywhere (on the local LAN or remotely over the public internet). Here is how I set it up on my home network. 1. Install the Media Server on a Windows 11 machine; the download is available from the official Plex website. 2. Register a Plex account; after logging in you will be prompted to install Plex Media Server. Once installed, open Plex Media Server. 3. Add media folders: click the “+” in the top-left corner and select the folders to add. Plex automatically scans them for media files. 4. Set up the library: choose the media type (movies, TV shows, music, and so on) and give the library a name. 5. Wait for Plex to finish scanning; you can then browse and play your media from Plex Media Server. 6. Install the Plex client app on your other devices; clients are available for Android, iOS, Windows, macOS, and smart TVs. 7. No smart TV at home? No problem: I connected a Raspberry Pi to my 15-year-old Sony TV. 7.1 Install Kodi on the Raspberry Pi. 7.2 In Kodi, install the Plex add-on (PM4K for Plex). With that, even an old non-smart TV can enjoy a smart media life.  ( 1 min )


    Build a gen AI–powered financial assistant with Amazon Bedrock multi-agent collaboration
    This post explores a financial assistant system that specializes in three key tasks: portfolio creation, company research, and communication. This post aims to illustrate the use of multiple specialized agents within the Amazon Bedrock multi-agent collaboration capability, with particular emphasis on their application in financial analysis.  ( 17 min )
    WordFinder app: Harnessing generative AI on AWS for aphasia communication
    In this post, we showcase how Dr. Kori Ramajoo, Dr. Sonia Brownsett, Prof. David Copland, from QARC, and Scott Harding, a person living with aphasia, used AWS services to develop WordFinder, a mobile, cloud-based solution that helps individuals with aphasia increase their independence through the use of AWS generative AI technology.  ( 11 min )
    Get faster and actionable AWS Trusted Advisor insights to make data-driven decisions using Amazon Q Business
    In this post, we show how to create an application using Amazon Q Business with Jira integration that used a dataset containing a Trusted Advisor detailed report. This solution demonstrates how to use new generative AI services like Amazon Q Business to get data insights faster and make them actionable.  ( 9 min )


    Best practices for Meta Llama 3.2 multimodal fine-tuning on Amazon Bedrock
    In this post, we share comprehensive best practices and scientific insights for fine-tuning Meta Llama 3.2 multimodal models on Amazon Bedrock. By following these guidelines, you can fine-tune smaller, more cost-effective models to achieve performance that rivals or even surpasses much larger models—potentially reducing both inference costs and latency, while maintaining high accuracy for your specific use case.  ( 12 min )
    Extend large language models powered by Amazon SageMaker AI using Model Context Protocol
    The Model Context Protocol (MCP), proposed by Anthropic, offers a standardized way of connecting FMs to data sources, and now you can use this capability with SageMaker AI. In this post, we present an example of combining the power of SageMaker AI and MCP to build an application that offers a new perspective on loan underwriting through specialized roles and automated workflows.  ( 15 min )
    Automate document translation and standardization with Amazon Bedrock and Amazon Translate
    In this post, we show how you can automate language localization through translating documents using Amazon Web Services (AWS). The solution combines Amazon Bedrock and AWS Serverless technologies, a suite of fully managed event-driven services for running code, managing data, and integrating applications—all without managing servers.  ( 7 min )
    Autonomous mortgage processing using Amazon Bedrock Data Automation and Amazon Bedrock Agents
    In this post, we introduce agentic automatic mortgage approval, a next-generation sample solution that uses autonomous AI agents powered by Amazon Bedrock Agents and Amazon Bedrock Data Automation. These agents orchestrate the entire mortgage approval process—intelligently verifying documents, assessing risk, and making data-driven decisions with minimal human intervention.  ( 12 min )
    Amazon Bedrock Model Distillation: Boost function calling accuracy while reducing cost and latency
    In this post, we highlight the advanced data augmentation techniques and performance improvements in Amazon Bedrock Model Distillation with Meta's Llama model family. This technique transfers knowledge from larger, more capable foundation models (FMs) that act as teachers to smaller, more efficient models (students), creating specialized models that excel at specific tasks.  ( 13 min )
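The teacher-student transfer described above can be illustrated with the classic distillation objective: train the student to match the teacher's temperature-softened output distribution. The plain-Python sketch below shows only the loss computation and is an assumption for illustration; it is not how Amazon Bedrock Model Distillation is implemented internally.

```python
# Classic knowledge-distillation loss: KL divergence between the
# teacher's and student's temperature-softened output distributions.
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions; 0 when they match."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.5]
aligned_student = [4.1, 0.9, 0.6]   # close to the teacher's behavior
random_student = [0.2, 3.0, 1.0]    # far from the teacher's behavior
print(distillation_loss(teacher, aligned_student)
      < distillation_loss(teacher, random_student))  # True
```

Raising the temperature spreads probability mass over non-top tokens, which is what lets the student learn the teacher's "soft" preferences rather than just its argmax labels.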

    April 2025
    Pupdate April has been rather nice, which has given the opportunity for plenty of long walks :) Florida We started the month in Florida, more on that in its own post – Florida 2025. Bike bother Vespa I mentioned my broken Vespa last month, and returned from Florida to a diagnosis of low compression, and […]  ( 15 min )


    Build public-facing generative AI applications using Amazon Q Business for anonymous users
    Today, we’re excited to announce that Amazon Q Business now supports anonymous user access. With this new feature, you can now create Amazon Q Business applications with anonymous user mode, where user authentication is not required and content is publicly accessible. In this post, we demonstrate how to build a public-facing generative AI application using Amazon Q Business for anonymous users.  ( 10 min )
    FloQast builds an AI-powered accounting transformation solution with Anthropic’s Claude 3 on Amazon Bedrock
    In this post, we share how FloQast built an AI-powered accounting transaction solution using Anthropic’s Claude 3 on Amazon Bedrock.  ( 10 min )
    Insights in implementing production-ready solutions with generative AI
    As generative AI revolutionizes industries, organizations are eager to harness its potential. However, the journey from production-ready solutions to full-scale implementation can present distinct operational and technical considerations. This post explores key insights and lessons learned from AWS customers in Europe, Middle East, and Africa (EMEA) who have successfully navigated this transition, providing a roadmap for others looking to follow suit.  ( 11 min )


    Responsible AI in action: How Data Reply red teaming supports generative AI safety on AWS
    In this post, we explore how AWS services can be seamlessly integrated with open source tools to help establish a robust red teaming mechanism within your organization. Specifically, we discuss Data Reply’s red teaming solution, a comprehensive blueprint to enhance AI safety and responsible AI practices.  ( 11 min )
    InterVision accelerates AI development using AWS LLM League and Amazon SageMaker AI
    This post demonstrates how AWS LLM League’s gamified enablement accelerates partners’ practical AI development capabilities, while showcasing how fine-tuning smaller language models can deliver cost-effective, specialized solutions for specific industry needs.  ( 9 min )
    Improve Amazon Nova migration performance with data-aware prompt optimization
    In this post, we present an LLM migration paradigm and architecture, including a continuous process of model evaluation, prompt generation using Amazon Bedrock, and data-aware optimization. The solution evaluates the model performance before migration and iteratively optimizes the Amazon Nova model prompts using a user-provided dataset and objective metrics.  ( 14 min )


    Customize Amazon Nova models to improve tool usage
    In this post, we demonstrate model customization (fine-tuning) for tool use with Amazon Nova. We first introduce a tool usage use case and give details about the dataset. We walk through the details of Amazon Nova-specific data formatting and show how to do tool calling through the Converse and Invoke APIs in Amazon Bedrock. After getting the baseline results from Amazon Nova models, we explain in detail the fine-tuning process, hosting fine-tuned models with provisioned throughput, and using the fine-tuned Amazon Nova models for inference.  ( 14 min )
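As a concrete illustration of tool calling through the Converse API, the following sketch builds the request shape with a toolConfig. The get_weather tool and the model ID are hypothetical examples; no AWS call is made here — in practice you would pass this request to boto3.client("bedrock-runtime").converse(**request).

```python
# Request shape for tool use with the Bedrock Converse API.
# The "get_weather" tool and the model ID are illustrative examples.
tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "city": {"type": "string", "description": "City name"}
                        },
                        "required": ["city"],
                    }
                },
            }
        }
    ]
}

request = {
    "modelId": "us.amazon.nova-lite-v1:0",  # example model ID
    "messages": [
        {"role": "user", "content": [{"text": "What's the weather in Seattle?"}]}
    ],
    "toolConfig": tool_config,
}

print(sorted(request.keys()))  # → ['messages', 'modelId', 'toolConfig']
```

When the model decides to call the tool, the response contains a toolUse content block with the generated input; your code runs the tool and returns a toolResult message to continue the conversation.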
    Evaluate Amazon Bedrock Agents with Ragas and LLM-as-a-judge
    In this post, we introduced the Open Source Bedrock Agent Evaluation framework, a Langfuse-integrated solution that streamlines the agent development process. We demonstrated how this evaluation framework can be integrated with pharmaceutical research agents. We used it to evaluate agent performance against biomarker questions and sent traces to Langfuse to view evaluation metrics across question types.  ( 11 min )

    SRE Weekly Issue #474
    View on sreweekly.com A message from our sponsor, incident.io: We’ve just raised $62M at incident.io to build AI agents that resolve incidents with you. See how we’re pioneering a new era of incident management. https://go.incident.io/blog/incident.io-raises-62m Why do we do blameless incident reviews? This is a truly outstanding article about blameless incident analysis! Beyond just “why”, […]  ( 4 min )


    Enterprise-grade natural language to SQL generation using LLMs: Balancing accuracy, latency, and scale
    In this post, the AWS and Cisco teams unveil a new methodical approach that addresses the challenges of enterprise-grade SQL generation. The teams were able to reduce the complexity of the NL2SQL process while delivering higher accuracy and better overall performance.  ( 16 min )
    AWS Field Experience reduced cost and delivered low latency and high performance with Amazon Nova Lite foundation model
    The AFX team’s product migration to the Nova Lite model has delivered tangible enterprise value by enhancing sales workflows. By migrating to the Amazon Nova Lite model, the team has not only achieved significant cost savings and reduced latency, but has also empowered sellers with a leading intelligent and reliable solution.  ( 5 min )
    Combine keyword and semantic search for text and images using Amazon Bedrock and Amazon OpenSearch Service
    In this post, we walk you through how to build a hybrid search solution using OpenSearch Service powered by multimodal embeddings from the Amazon Titan Multimodal Embeddings G1 model through Amazon Bedrock. This solution demonstrates how you can enable users to submit both text and images as queries to retrieve relevant results from a sample retail image dataset.  ( 12 min )
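One common way to blend the keyword and semantic result lists in a hybrid search is reciprocal rank fusion (RRF). The sketch below is a standalone illustration with made-up document IDs; OpenSearch Service also offers built-in hybrid query support with score normalization, which the post's solution can use instead.

```python
# Reciprocal rank fusion: each document scores sum(1 / (k + rank))
# across the ranked lists it appears in; documents ranked well by
# either retriever rise to the top of the fused list.
def reciprocal_rank_fusion(ranked_lists, k=60):
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["red-sneaker", "red-boot", "blue-sneaker"]       # BM25-style results
semantic_hits = ["crimson-trainer", "red-sneaker", "red-boot"]   # embedding results
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Note that "crimson-trainer" surfaces even though a keyword query for "red" would never match it — that complementarity is the motivation for hybrid search.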


    Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker
    In this post, we discuss how you can build an AI-powered document processing platform with open source NER and LLMs on SageMaker.  ( 9 min )
    Protect sensitive data in RAG applications with Amazon Bedrock
    In this post, we explore two approaches for securing sensitive data in RAG applications using Amazon Bedrock. The first approach focuses on identifying and redacting sensitive data before ingestion into an Amazon Bedrock knowledge base, and the second demonstrates a fine-grained role-based access control (RBAC) pattern for managing access to sensitive information during retrieval. These solutions represent just two possible approaches among many for securing sensitive data in generative AI applications.  ( 15 min )


    Supercharge your LLM performance with Amazon SageMaker Large Model Inference container v15
    Today, we’re excited to announce the launch of Amazon SageMaker Large Model Inference (LMI) container v15, powered by vLLM 0.8.4 with support for the vLLM V1 engine. This release introduces significant performance improvements, expanded model compatibility with multimodality (that is, the ability to understand and analyze text-to-text, images-to-text, and text-to-images data), and provides built-in integration with vLLM to help you seamlessly deploy and serve large language models (LLMs) with the highest performance at scale.  ( 8 min )
    Accuracy evaluation framework for Amazon Q Business – Part 2
    In the first post of this series, we introduced a comprehensive evaluation framework for Amazon Q Business, a fully managed Retrieval Augmented Generation (RAG) solution that uses your company’s proprietary data without the complexity of managing large language models (LLMs). The first post focused on selecting appropriate use cases, preparing data, and implementing metrics to […]  ( 14 min )
    Use Amazon Bedrock Intelligent Prompt Routing for cost and latency benefits
    Today, we’re happy to announce the general availability of Amazon Bedrock Intelligent Prompt Routing. In this blog post, we detail various highlights from our internal testing, how you can get started, and point out some caveats and best practices. We encourage you to incorporate Amazon Bedrock Intelligent Prompt Routing into your new and existing generative AI applications.  ( 10 min )
    How Infosys improved accessibility for Event Knowledge using Amazon Nova Pro, Amazon Bedrock, and AWS Elemental Media Services
    In this post, we explore how Infosys developed Infosys Event AI to unlock the insights generated from events and conferences. Through its suite of features—including real-time transcription, intelligent summaries, and an interactive chat assistant—Infosys Event AI makes event knowledge accessible and provides an immersive engagement solution for the attendees, during and after the event.  ( 11 min )


    Amazon Bedrock Prompt Optimization Drives LLM Applications Innovation for Yuewen Group
    Today, we are excited to announce the availability of Prompt Optimization on Amazon Bedrock. With this capability, you can now optimize your prompts for several use cases with a single API call or a click of a button on the Amazon Bedrock console. In this blog post, we discuss how Prompt Optimization improves the performance of large language models (LLMs) for intelligent text processing tasks at Yuewen Group.  ( 8 min )
    Build a location-aware agent using Amazon Bedrock Agents and Foursquare APIs
    In this post, we combine Amazon Bedrock Agents and Foursquare APIs to demonstrate how you can use a location-aware agent to bring personalized responses to your users.  ( 8 min )
    Build an automated generative AI solution evaluation pipeline with Amazon Nova
    In this post, we explore the importance of evaluating LLMs in the context of generative AI applications, highlighting the challenges posed by issues like hallucinations and biases. We introduce a comprehensive solution using AWS services to automate the evaluation process, allowing for continuous monitoring and assessment of LLM performance. By using tools like the FMeval Library, Ragas, LLMeter, and Step Functions, the solution provides flexibility and scalability, meeting the evolving needs of LLM consumers.  ( 14 min )

    SRE Weekly Issue #473
    View on sreweekly.com A message from our sponsor, incident.io: We’ve just raised $62M at incident.io to build AI agents that resolve incidents with you. See how we’re pioneering a new era of incident management. https://go.incident.io/blog/incident.io-raises-62m Scaling Nextdoor’s Datastores: Part 5 In this final installment of the Scaling Nextdoor’s Datastores blog series, we detail how the […]  ( 4 min )
2025-05-18T08:19:47.769Z osmosfeed 1.15.1