• Open

    Build an AI-powered automated summarization system with Amazon Bedrock and Amazon Transcribe using Terraform
    This post introduces a serverless meeting summarization system that harnesses the advanced capabilities of Amazon Bedrock and Amazon Transcribe to transform audio recordings into concise, structured, and actionable summaries. By automating this process, organizations can reclaim countless hours while making sure key insights, action items, and decisions are systematically captured and made accessible to stakeholders.  ( 37 min )
    Kyruus builds a generative AI provider matching solution on AWS
    In this post, we demonstrate how Kyruus Health uses AWS services to build Guide. We show how Amazon Bedrock, a fully managed service that provides access to foundation models (FMs) from leading AI companies and Amazon through a single API, and Amazon OpenSearch Service, a managed search and analytics service, work together to understand everyday language about health concerns and connect members with the right providers.  ( 28 min )
    Use generative AI in Amazon Bedrock for enhanced recommendation generation in equipment maintenance
    In the manufacturing world, valuable insights from service reports often remain underutilized in document storage systems. This post explores how Amazon Web Services (AWS) customers can build a solution that automates the digitisation and extraction of crucial information from many reports using generative AI.  ( 29 min )
  • Open

    SRE Weekly Issue #486
    View on sreweekly.com A message from our sponsor, Spacelift: IaC Experts! IaCConf Call for Presenters – August 27, 2025 The upcoming IaCConf Spotlight dives into the security and governance challenges of managing infrastructure as code at scale. From embedding security in your pipelines to navigating the realities of open source risk, this event brings together […]  ( 4 min )

  • Open

    Build real-time travel recommendations using AI agents on Amazon Bedrock
    In this post, we show how to build a generative AI solution using Amazon Bedrock that creates bespoke holiday packages by combining customer profiles and preferences with real-time pricing data. We demonstrate how to use Amazon Bedrock Knowledge Bases for travel information, Amazon Bedrock Agents for real-time flight details, and Amazon OpenSearch Serverless for efficient package search and retrieval.  ( 30 min )
    Deploy a full stack voice AI agent with Amazon Nova Sonic
    In this post, we show how to create an AI-powered call center agent for a fictional company called AnyTelco. The agent, named Telly, can handle customer inquiries about plans and services while accessing real-time customer data using custom tools implemented with the Model Context Protocol (MCP) framework.  ( 29 min )
    Manage multi-tenant Amazon Bedrock costs using application inference profiles
    This post explores how to implement a robust monitoring solution for multi-tenant AI deployments using a feature of Amazon Bedrock called application inference profiles. We demonstrate how to create a system that enables granular usage tracking, accurate cost allocation, and dynamic resource management across complex multi-tenant environments.  ( 30 min )

  • Open

    Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI
    Evaluating the performance of large language models (LLMs) goes beyond statistical metrics like perplexity or bilingual evaluation understudy (BLEU) scores. For most real-world generative AI scenarios, it’s crucial to understand whether a model is producing better outputs than a baseline or an earlier iteration. This is especially important for applications such as summarization, content generation, […]  ( 37 min )
    Building cost-effective RAG applications with Amazon Bedrock Knowledge Bases and Amazon S3 Vectors
    In this post, we demonstrate how to integrate Amazon S3 Vectors with Amazon Bedrock Knowledge Bases for RAG applications. You'll learn a practical approach to scale your knowledge bases to handle millions of documents while maintaining retrieval quality and using the cost-effective storage of S3 Vectors.  ( 32 min )
    Implementing on-demand deployment with customized Amazon Nova models on Amazon Bedrock
    In this post, we walk through the custom model on-demand deployment workflow for Amazon Bedrock and provide step-by-step implementation guides using both the AWS Management Console and APIs or AWS SDKs. We also discuss best practices and considerations for deploying customized Amazon Nova models on Amazon Bedrock.  ( 30 min )
    Building enterprise-scale RAG applications with Amazon S3 Vectors and DeepSeek R1 on Amazon SageMaker AI
    Organizations are adopting large language models (LLMs), such as DeepSeek R1, to transform business processes, enhance customer experiences, and drive innovation at unprecedented speed. However, standalone LLMs have key limitations such as hallucinations, outdated knowledge, and no access to proprietary data. Retrieval Augmented Generation (RAG) addresses these gaps by combining semantic search with generative AI, […]  ( 36 min )

  • Open

    Accenture scales video analysis with Amazon Nova and Amazon Bedrock Agents
    This post was written with Ilan Geller, Kamal Mannar, Debasmita Ghosh, and Nakul Aggarwal of Accenture. Video highlights offer a powerful way to boost audience engagement and extend content value for content publishers. These short, high-impact clips capture key moments that drive viewer retention, amplify reach across social media, reinforce brand identity, and open new […]  ( 31 min )
    Deploy conversational agents with Vonage and Amazon Nova Sonic
    In this post, we explore how developers can integrate Amazon Nova Sonic with the Vonage communications service to build responsive, natural-sounding voice experiences in real time. By combining the Vonage Voice API with the low-latency and expressive speech capabilities of Amazon Nova Sonic, businesses can deploy AI voice agents that deliver more human-like interactions than traditional voice interfaces. These agents can be used as customer support, virtual assistants, and more.  ( 29 min )
    Enabling customers to deliver production-ready AI agents at scale
    Today, I’m excited to share how we’re bringing this vision to life with new capabilities that address the fundamental aspects of building and deploying agents at scale. These innovations will help you move beyond experiments to production-ready agent systems that can be trusted with your most critical business processes.  ( 32 min )

  • Open

    Amazon Bedrock Knowledge Bases now supports Amazon OpenSearch Service Managed Cluster as vector store
    Amazon Bedrock Knowledge Bases has extended its vector store options by enabling support for Amazon OpenSearch Service managed clusters, further strengthening its capabilities as a fully managed Retrieval Augmented Generation (RAG) solution. This enhancement builds on the core functionality of Amazon Bedrock Knowledge Bases, which is designed to seamlessly connect foundation models (FMs) with internal data sources. This post provides a comprehensive, step-by-step guide on integrating an Amazon Bedrock knowledge base with an OpenSearch Service managed cluster as its vector store.  ( 44 min )
    Monitor agents built on Amazon Bedrock with Datadog LLM Observability
    We’re excited to announce a new integration between Datadog LLM Observability and Amazon Bedrock Agents that helps monitor agentic applications built on Amazon Bedrock. In this post, we'll explore how Datadog's LLM Observability provides the visibility and control needed to successfully monitor, operate, and debug production-grade agentic applications built on Amazon Bedrock Agents.  ( 29 min )
    How PayU built a secure enterprise AI assistant using Amazon Bedrock
    PayU offers a full-stack digital financial services system that serves the financial needs of merchants, banks, and consumers through technology. In this post, we explain how we equipped the PayU team with an enterprise AI solution and democratized AI access using Amazon Bedrock, without compromising on data residency requirements.  ( 32 min )
    Supercharge generative AI workflows with NVIDIA DGX Cloud on AWS and Amazon Bedrock Custom Model Import
    This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA. DGX Cloud on Amazon Web Services (AWS) represents a significant leap forward in democratizing access to high-performance AI infrastructure. By combining NVIDIA GPU expertise with AWS scalable cloud services, organizations can accelerate their time-to-train, reduce operational complexity, and unlock […]  ( 32 min )
    Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS
    This post introduces NVIDIA Dynamo and explains how to set it up on Amazon EKS for automated scaling and streamlined Kubernetes operations. We provide a hands-on walkthrough, which uses the NVIDIA Dynamo blueprint on the AI on EKS GitHub repo by AWS Labs to provision the infrastructure, configure monitoring, and install the NVIDIA Dynamo operator.  ( 35 min )
    AWS doubles investment in AWS Generative AI Innovation Center, marking two years of customer success
    In this post, AWS announces a $100 million additional investment in its AWS Generative AI Innovation Center, marking two years of successful customer collaborations across industries from financial services to healthcare. The investment comes as AI evolves toward more autonomous, agentic systems, with the center already helping thousands of customers drive millions in productivity gains and transform customer experiences.  ( 30 min )

  • Open

    Build AI-driven policy creation for vehicle data collection and automation using Amazon Bedrock
    Sonatus partnered with the AWS Generative AI Innovation Center to develop a natural language interface to generate data collection and automation policies using generative AI. This innovation aims to reduce the policy generation process from days to minutes while making it accessible to both engineers and non-experts alike. In this post, we explore how we built this system using Sonatus’s Collector AI and Amazon Bedrock. We discuss the background, challenges, and high-level solution architecture.  ( 31 min )
    How Rapid7 automates vulnerability risk scores with ML pipelines using Amazon SageMaker AI
    In this post, we share how Rapid7 implemented end-to-end automation for the training, validation, and deployment of ML models that predict CVSS vectors. Rapid7 customers have the information they need to accurately understand their risk and prioritize remediation measures.  ( 31 min )
    Build secure RAG applications with AWS serverless data lakes
    In this post, we explore how to build a secure RAG application using serverless data lake architecture, an important data strategy to support generative AI development. We use Amazon Web Services (AWS) services including Amazon S3, Amazon DynamoDB, AWS Lambda, and Amazon Bedrock Knowledge Bases to create a comprehensive solution supporting unstructured data assets which can be extended to structured data. The post covers how to implement fine-grained access controls for your enterprise data and design metadata-driven retrieval systems that respect security boundaries. These approaches will help you maximize the value of your organization's data while maintaining robust security and compliance.  ( 34 min )
  • Open

    SRE Weekly Issue #485
    View on sreweekly.com Migrating the Jira Database Platform to AWS Aurora How would you migrate several million databases, with minimal impact to your users? Atlassian allocates one Postgres database per tenant customer, with a few […]  ( 4 min )

  • Open

    Advanced fine-tuning methods on Amazon SageMaker AI
    When fine-tuning ML models on AWS, you can choose the right tool for your specific needs. AWS provides a comprehensive suite of tools for data scientists, ML engineers, and business users to achieve their ML goals. AWS has built solutions to support various levels of ML sophistication, from simple SageMaker training jobs for FM fine-tuning to the power of SageMaker HyperPod for cutting-edge research. We invite you to explore these options, starting with what suits your current needs, and evolve your approach as those needs change.  ( 35 min )
    Streamline machine learning workflows with SkyPilot on Amazon SageMaker HyperPod
    This post is co-written with Zhanghao Wu, co-creator of SkyPilot. The rapid advancement of generative AI and foundation models (FMs) has significantly increased computational resource requirements for machine learning (ML) workloads. Modern ML pipelines require efficient systems for distributing workloads across accelerated compute resources, while making sure developer productivity remains high. Organizations need infrastructure solutions […]  ( 31 min )
    Intelligent document processing at scale with generative AI and Amazon Bedrock Data Automation
    This post presents an end-to-end IDP application powered by Amazon Bedrock Data Automation and other AWS services. It provides a reusable AWS infrastructure as code (IaC) that deploys an IDP pipeline and provides an intuitive UI for transforming documents into structured tables at scale. The application only requires the user to provide the input documents (such as contracts or emails) and a list of attributes to be extracted. It then performs IDP with generative AI.  ( 35 min )
    Build a conversational data assistant, Part 2 – Embedding generative business intelligence with Amazon Q in QuickSight
    In this post, we dive into how we integrated Amazon Q in QuickSight to transform natural language requests like “Show me how many items were returned in the US over the past 6 months” into meaningful data visualizations. We demonstrate how combining Amazon Bedrock Agents with Amazon Q in QuickSight creates a comprehensive data assistant that delivers both SQL code and visual insights through a single, intuitive conversational interface—democratizing data access across the enterprise.  ( 34 min )
    Build a conversational data assistant, Part 1: Text-to-SQL with Amazon Bedrock Agents
    In this post, we focus on building a Text-to-SQL solution with Amazon Bedrock, a managed service for building generative AI applications. Specifically, we demonstrate the capabilities of Amazon Bedrock Agents. Part 2 explains how we extended the solution to provide business insights using Amazon Q in QuickSight, a business intelligence assistant that answers questions with auto-generated visualizations.  ( 34 min )
    Implement user-level access control for multi-tenant ML platforms on Amazon SageMaker AI
    In this post, we discuss permission management strategies, focusing on attribute-based access control (ABAC) patterns that enable granular user access control while minimizing the proliferation of AWS Identity and Access Management (IAM) roles. We also share proven best practices that help organizations maintain security and compliance without sacrificing operational efficiency in their ML workflows.  ( 34 min )
    Long-running execution flows now supported in Amazon Bedrock Flows in public preview
    We announce the public preview of long-running execution (asynchronous) flow support within Amazon Bedrock Flows. With Amazon Bedrock Flows, you can link foundation models (FMs), Amazon Bedrock Prompt Management, Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, and other AWS services together to build and scale predefined generative AI workflows.  ( 31 min )
    Fraud detection empowered by federated learning with the Flower framework on Amazon SageMaker AI
    In this post, we explore how SageMaker and federated learning help financial institutions build scalable, privacy-first fraud detection systems.  ( 29 min )
    Building intelligent AI voice agents with Pipecat and Amazon Bedrock – Part 2
    In Part 1 of this series, you learned how you can use the combination of Amazon Bedrock and Pipecat, an open source framework for voice and multimodal conversational AI agents, to build applications with human-like conversational AI. You learned about common use cases of voice agents and the cascaded models approach, where you orchestrate several components to build your voice AI agent. In this post (Part 2), you explore how to use the speech-to-speech foundation model Amazon Nova Sonic, and the benefits of using a unified model.  ( 29 min )
    Uphold ethical standards in fashion using multimodal toxicity detection with Amazon Bedrock Guardrails
    In the fashion industry, teams are frequently innovating quickly, often utilizing AI. Sharing content, whether it be through videos, designs, or otherwise, can lead to content moderation challenges. There remains a risk (through intentional or unintentional actions) of inappropriate, offensive, or toxic content being produced and shared. In this post, we cover the use of the multimodal toxicity detection feature of Amazon Bedrock Guardrails to guard against toxic content. Whether you’re an enterprise giant in the fashion industry or an up-and-coming brand, you can use this solution to screen potentially harmful content before it impacts your brand’s reputation and ethical standards. For the purposes of this post, ethical standards refer to toxic, disrespectful, or harmful content and images that could be created by fashion designers.  ( 31 min )

  • Open

    New capabilities in Amazon SageMaker AI continue to transform how organizations develop AI models
    In this post, we share some of the new innovations in SageMaker AI that can accelerate how you build and train AI models. These innovations include new observability capabilities in SageMaker HyperPod, the ability to deploy JumpStart models on HyperPod, remote connections to SageMaker AI from local development environments, and fully managed MLflow 3.0.  ( 29 min )
    Accelerate foundation model development with one-click observability in Amazon SageMaker HyperPod
    With a one-click installation of the Amazon Elastic Kubernetes Service (Amazon EKS) add-on for SageMaker HyperPod observability, you can consolidate health and performance data from NVIDIA DCGM, instance-level Kubernetes node exporters, Elastic Fabric Adapter (EFA), integrated file systems, Kubernetes APIs, Kueue, and SageMaker HyperPod task operators. In this post, we walk you through installing and using the unified dashboards of the out-of-the-box observability feature in SageMaker HyperPod. We cover the one-click installation from the Amazon SageMaker AI console, navigating the dashboard and metrics it consolidates, and advanced topics such as setting up custom alerts.  ( 29 min )
    Accelerating generative AI development with fully managed MLflow 3.0 on Amazon SageMaker AI
    In this post, we explore how Amazon SageMaker now offers fully managed support for MLflow 3.0, streamlining AI experimentation and accelerating your generative AI journey from idea to production. This release transforms managed MLflow from experiment tracking to providing end-to-end observability, reducing time-to-market for generative AI development.  ( 31 min )
    Amazon SageMaker HyperPod launches model deployments to accelerate the generative AI model development lifecycle
    In this post, we announce Amazon SageMaker HyperPod support for deploying foundation models from SageMaker JumpStart, as well as custom or fine-tuned models from Amazon S3 or Amazon FSx. This new capability allows customers to train, fine-tune, and deploy models on the same HyperPod compute resources, maximizing resource utilization across the entire model lifecycle.  ( 36 min )
    Supercharge your AI workflows by connecting to SageMaker Studio from Visual Studio Code
    AI developers and machine learning (ML) engineers can now use the capabilities of Amazon SageMaker Studio directly from their local Visual Studio Code (VS Code). With this capability, you can use your customized local VS Code setup, including AI-assisted development tools, custom extensions, and debugging tools while accessing compute resources and your data in SageMaker Studio. In this post, we show you how to remotely connect your local VS Code to SageMaker Studio development environments to use your customized development environment while accessing Amazon SageMaker AI compute resources.  ( 33 min )
    Use K8sGPT and Amazon Bedrock for simplified Kubernetes cluster maintenance
    This post demonstrates the best practices to run K8sGPT in AWS with Amazon Bedrock in two modes: K8sGPT CLI and K8sGPT Operator. It showcases how the solution can help SREs simplify Kubernetes cluster management through continuous monitoring and operational intelligence.  ( 34 min )
    How Rocket streamlines the home buying experience with Amazon Bedrock Agents
    Rocket AI Agent is more than a digital assistant. It’s a reimagined approach to client engagement, powered by agentic AI. By combining Amazon Bedrock Agents with Rocket’s proprietary data and backend systems, Rocket has created a smarter, more scalable, and more human experience available 24/7, without the wait. This post explores how Rocket brought that vision to life using Amazon Bedrock Agents, powering a new era of AI-driven support that is consistently available, deeply personalized, and built to take action.  ( 32 min )
    Build an MCP application with Mistral models on AWS
    This post demonstrates building an intelligent AI assistant using Mistral AI models on AWS and MCP, integrating real-time location services, time data, and contextual memory to handle complex multimodal queries. This use case, restaurant recommendations, serves as an example, but this extensible framework can be adapted for enterprise use cases by modifying MCP server configurations to connect with your specific data sources and business systems.  ( 37 min )
    Build real-time conversational AI experiences using Amazon Nova Sonic and LiveKit
    Amazon Nova Sonic is now integrated with LiveKit’s WebRTC framework, a widely used platform that enables developers to build real-time audio, video, and data communication applications. This integration makes it possible for developers to build conversational voice interfaces without needing to manage complex audio pipelines or signaling protocols. In this post, we explain how this integration works, how it addresses the historical challenges of voice-first applications, and some initial steps to start using this solution.  ( 28 min )
  • Open

    Using Architecture Decision Records (ADRs) with AI coding assistants
    Last week my former colleague Doug Todd asked a question about recording decisions on BlueSky: Of course I replied suggesting Architecture Decision Records (ADRs), with a pointer to the at_protocol GitHub repo where we use them. A few days back Doug demoed how he’s using ADRs with his coding assistant (Claude and Claude Code), and […]  ( 13 min )

  • Open

    AWS AI infrastructure with NVIDIA Blackwell: Two powerful compute solutions for the next frontier of AI
    In this post, we announce general availability of Amazon EC2 P6e-GB200 UltraServers and P6-B200 instances, powered by NVIDIA Blackwell GPUs, designed for training and deploying the largest, most sophisticated AI models.  ( 30 min )
    Unlock retail intelligence by transforming data into actionable insights using generative AI with Amazon Q Business
    Amazon Q Business for Retail Intelligence is an AI-powered assistant designed to help retail businesses streamline operations, improve customer service, and enhance decision-making processes. This solution is specifically engineered to be scalable and adaptable to businesses of various sizes, helping them compete more effectively. In this post, we show how you can use Amazon Q Business for Retail Intelligence to transform your data into actionable insights.  ( 30 min )
    Democratize data for timely decisions with text-to-SQL at Parcel Perform
    The business team in Parcel Perform often needs access to data to answer questions related to merchants’ parcel deliveries, such as “Did we see a spike in delivery delays last week? If so, in which transit facilities was this observed, and what was the primary cause of the issue?” Previously, the data team had to manually form the query and run it to fetch the data. With the new generative AI-powered text-to-SQL capability in Parcel Perform, the business team can self-serve their data needs by using an AI assistant interface. In this post, we discuss how Parcel Perform incorporated generative AI, data storage, and data access through AWS services to make timely decisions.  ( 34 min )
    Query Amazon Aurora PostgreSQL using Amazon Bedrock Knowledge Bases structured data
    In this post, we discuss how to make your Amazon Aurora PostgreSQL-Compatible Edition data available for natural language querying through Amazon Bedrock Knowledge Bases while maintaining data freshness.  ( 31 min )
    Configure fine-grained access to Amazon Bedrock models using Amazon SageMaker Unified Studio
    In this post, we demonstrate how to use SageMaker Unified Studio and AWS Identity and Access Management (IAM) to establish a robust permission framework for Amazon Bedrock models. We show how administrators can precisely manage which users and teams have access to specific models within a secure, collaborative environment. We guide you through creating granular permissions to control model access, with code examples for common enterprise governance scenarios.  ( 35 min )
    Improve conversational AI response times for enterprise applications with the Amazon Bedrock streaming API and AWS AppSync
    This post demonstrates how integrating an Amazon Bedrock streaming API with AWS AppSync subscriptions significantly enhances AI assistant responsiveness and user satisfaction. By implementing this streaming approach, the global financial services organization reduced initial response times for complex queries by approximately 75%—from 10 seconds to just 2–3 seconds—empowering users to view responses as they’re generated rather than waiting for complete answers.  ( 30 min )
    Scale generative AI use cases, Part 1: Multi-tenant hub and spoke architecture using AWS Transit Gateway
    In this two-part series, we discuss a hub and spoke architecture pattern for building a multi-tenant and multi-account architecture. This pattern supports abstractions for shared services across use cases and teams, helping create secure, scalable, and reliable generative AI systems. In Part 1, we present a centralized hub for generative AI service abstractions and tenant-specific spokes, using AWS Transit Gateway for cross-account interoperability.  ( 32 min )

  • Open

    Accelerate AI development with Amazon Bedrock API keys
    Today, we’re excited to announce a significant improvement to the developer experience of Amazon Bedrock: API keys. API keys provide quick access to the Amazon Bedrock APIs, streamlining the authentication process so that developers can focus on building rather than configuration.  ( 28 min )
    Accelerating data science innovation: How Bayer Crop Science used AWS AI/ML services to build their next-generation MLOps service
    In this post, we show how Bayer Crop Science manages large-scale data science operations by training models for their data analytics needs and maintaining high-quality code documentation to support developers. Through these solutions, Bayer Crop Science projects up to a 70% reduction in developer onboarding time and up to a 30% improvement in developer productivity.  ( 30 min )
    Combat financial fraud with GraphRAG on Amazon Bedrock Knowledge Bases
    In this post, we show how to use Amazon Bedrock Knowledge Bases GraphRAG with Amazon Neptune Analytics to build a financial fraud detection solution.  ( 31 min )
    Classify call center conversations with Amazon Bedrock batch inference
    In this post, we demonstrate how to build an end-to-end solution for text classification using the Amazon Bedrock batch inference capability with Anthropic’s Claude Haiku model. We walk through classifying travel agency call center conversations into categories, showcasing how to generate synthetic training data, process large volumes of text data, and automate the entire workflow using AWS services.  ( 34 min )
    Effective cross-lingual LLM evaluation with Amazon Bedrock
    In this post, we demonstrate how to use the evaluation features of Amazon Bedrock to deliver reliable results across language barriers without the need for localized prompts or custom infrastructure. Through comprehensive testing and analysis, we share practical strategies to help reduce the cost and complexity of multilingual evaluation while maintaining high standards across global large language model (LLM) deployments.  ( 33 min )
    Cohere Embed 4 multimodal embeddings model is now available on Amazon SageMaker JumpStart
    The Cohere Embed 4 multimodal embeddings model is now generally available on Amazon SageMaker JumpStart. The Embed 4 model is built for multimodal business documents, has leading multilingual capabilities, and offers notable improvement over Embed 3 across key benchmarks. In this post, we discuss the benefits and capabilities of this new model. We also walk you through how to deploy and use the Embed 4 model using SageMaker JumpStart.  ( 31 min )

  • Open

    How INRIX accelerates transportation planning with Amazon Bedrock
    INRIX pioneered the use of GPS data from connected vehicles for transportation intelligence. In this post, we partnered with Amazon Web Services (AWS) customer INRIX to demonstrate how Amazon Bedrock can be used to determine the best countermeasures for specific city locations using rich transportation data and how such countermeasures can be automatically visualized in street view images. This approach allows for significant planning acceleration compared to traditional approaches using conceptual drawings.  ( 30 min )
    Qwen3 family of reasoning models now available in Amazon Bedrock Marketplace and Amazon SageMaker JumpStart
    Today, we are excited to announce that Qwen3, the latest generation of large language models (LLMs) in the Qwen family, is available through Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, you can deploy the Qwen3 models—available in 0.6B, 4B, 8B, and 32B parameter sizes—to build, experiment, and responsibly scale your generative AI applications on AWS. In this post, we demonstrate how to get started with Qwen3 on Amazon Bedrock Marketplace and SageMaker JumpStart.  ( 34 min )
    Build a just-in-time knowledge base with Amazon Bedrock
    Traditional Retrieval Augmented Generation (RAG) systems consume valuable resources by ingesting and maintaining embeddings for documents that might never be queried, resulting in unnecessary storage costs and reduced system efficiency. This post presents a just-in-time knowledge base solution that reduces unused consumption through intelligent document processing. The solution processes documents only when needed and automatically removes unused resources, so organizations can scale their document repositories without proportionally increasing infrastructure costs.  ( 31 min )
    Agents as escalators: Real-time AI video monitoring with Amazon Bedrock Agents and video streams
    In this post, we show how to build a fully deployable solution that processes video streams using OpenCV, uses Amazon Bedrock for contextual scene understanding, and triggers automated responses through Amazon Bedrock Agents. This solution extends the capabilities demonstrated in "Automate chatbot for document and data retrieval using Amazon Bedrock Agents and Knowledge Bases", which discussed using Amazon Bedrock Agents for document and data retrieval. Here, we apply Amazon Bedrock Agents to real-time video analysis and event monitoring.  ( 37 min )
  • Open

    SRE Weekly Issue #484
    View on sreweekly.com Exact Code Search: Find code faster across repositories This is really neat! They’ve developed a new approach to search that uses 3-letter “trigrams” rather than tokenizing words, making it especially well-suited to code search. It converts regular expressions to trigram searches behind the scenes.   Dmitry Gruzd — GitLab Pattern machines that we […]  ( 4 min )
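    The trigram idea mentioned above can be illustrated with a small inverted index. This is a simplified sketch, not GitLab's implementation: it handles literal substrings only (the regular-expression conversion is out of scope), and the function names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set


def trigrams(text: str) -> Set[str]:
    """Return every 3-character window in the text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each trigram to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, content in docs.items():
        for gram in trigrams(content):
            index[gram].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], docs: Dict[str, str], needle: str) -> List[str]:
    """Find documents containing the literal needle.

    The index narrows the candidates to documents that contain every
    trigram of the needle; a final substring check removes false positives.
    Needles shorter than three characters fall back to a full scan.
    """
    candidates = set(docs)
    for gram in trigrams(needle):
        candidates &= index.get(gram, set())
    return sorted(doc_id for doc_id in candidates if needle in docs[doc_id])
```

    Because the index is keyed on raw character windows instead of word tokens, queries such as `parse_` or `->next` need no language-aware tokenizer.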

  • Open

    Public Ollama Models
    Public Ollama Models 2025-07-05

    How to chat with Ollama models

    Select an IP and model from the table below, then use them in this command:

        # Start a conversation with a model
        # Replace <IP> with an IP from the table below
        # Replace <MODEL> with one of the models listed for that IP
        curl -X POST http://<IP>:11434/api/chat -d '{ "model": "<MODEL>", "messages": [{ "role": "user", "content": "Hello, how are you?" }] }'

    Available Models (IP: Models)

        222.70.88.44: qwen3:8b, deepseek-r1:8b, deepseek-r1:7b, qwen3:30b-a3b, nomic-embed-text:latest, bge-m3:latest
        117.50.179.196: smollm2:135m, hf.co/IlyaGusev/saiga_nemo_12b_gguf:Q5_K_M
        117.50.197.100: qwen2.5:7b, bge-m3:567m, qwen2.5vl:7b
        117.50.164.136: qwen3:1.7b, nomic-embed-text:v1.5, qwen3:4b, llama3.2:3b-instruct-q5_K_M
        106.14.202.11: mxbai-embed-large:latest
        163.228.156.198: qwen3:8b-nothink, qwen3:8b, deepseek-r1:8b, qwen2.5:32b, qwen2.5-coder:7b, Qwen2.5-7B-Instruct-Distill-ds-r1-110k:latest, Qwen2.5-7B-Instruct:7b, Qwen2.5-7B-Distill-ds-r1-110k:7b, qwq:latest, smollm2:135m, qwen2.5:3B-Trained, deepseek-r1:32b, deepseek-r1:14b, llava:latest, nomic-embed-text:latest, qwen2.5:latest, deepseek-r1:7b
        218.1.151.175: smollm2:135m, nomic-embed-text:latest, qwen2.5:latest
        218.78.108.171: deepseek-r1:14b, deepseek-r1:7b, deepseek-r1:1.5b
        117.50.245.70: smollm2:135m, qwen2.5:32b, qwen2.5:7b, gemma2:27b, gemma2:2b, qwen2.5:14b, deepseek-r1:14b, deepseek-r1:7b, gemma3:4b, nomic-embed-text:latest, deepseek-r1:1.5b, gemma3:12b, gemma3:27b, qwen2.5-coder:latest, unsloth.F16.gguf:latest, unsloth.Q8_0.gguf:latest
        117.50.194.3: dengcao/Qwen3-Embedding-8B:Q5_K_M, dengcao/Qwen3-Embedding-4B:Q5_K_M
        61.165.183.106: huihui_ai/deepseek-r1-abliterated:70b-llama-distill-q8_0
        124.71.154.35: llama3.2:3b-instruct-q5_K_M, deepseek-r1:1.5b, nomic-embed-text:latest
        117.50.174.178: smollm2:135m, qwen2.5:7b, deepseek-r1:8b
        117.50.175.121: changji_medical_deepseek_r1:14b, changji_medical_deepseek_r1:32b
        101.132.102.117: smollm2:135m, bge-m3:567m, deepseek-r1:1.5b
        117.50.250.245: qwen3:8b_nothink, qwen3:8b
        143.64.160.92: llama3.2:3b-instruct-q5_K_M, MartinRizzo/Ayla-Light-v2:12b-q4_K_M
        58.246.1.174: llama3.2:3b-instruct-q5_K_M, qwen2.5:32b
        61.172.167.153: deepseek-r1:7b
        61.172.167.211: deepseek-r1:7b
        61.169.115.204: nomic-embed-text:latest, deepseek-r1:32b
        47.116.202.9: qwen3-no-think:latest, qwen3:latest, qwen3:8b, qwen:7b, llava:latest, mistral:7b-instruct, nomic-embed-text:latest, qllama/bge-reranker-v2-m3:latest, bge-large:latest, deepseek-r1:7b, bge-m3:latest, deepseek-r1:latest, deepseek-r1:1.5b, qwen2:latest
        223.166.95.229: deepseek-r1:7b, deepseek-r1:14b, deepseek-r1:8b, qwen3:latest, qwen3:14b, qwen2.5vl:32b, qwen2.5vl:latest, qwen3:8b, gemma3:12b, gemma3:27b, llava:34b, llava:13b, mxbai-embed-large:latest, nomic-embed-text:latest, qwq:latest, codellama:13b, llama3.2-vision:latest, qwen2.5-coder:latest, qwen2.5-coder:14b, phi4:latest, phi3:14b, mistral:latest, llama3.3:latest, llama3.2:latest, llama3.1:latest, llama3:latest, llama3:70b, gemma2:latest, gemma2:27b
        180.158.174.61: qwq:32b-q8_0, qwq:32b-16384context, qwq:32b, nomic-embed-text:latest, deepseek-r1:32b, deepseek-r1:14b, llama3.2-vision:11b, qwen2.5:32b, llama3.2:latest

    Disclaimer

    These Ollama model endpoints are publicly exposed interfaces found on the internet. They are listed here for informational purposes only. Please be aware that:

      • These endpoints are not maintained or controlled by us
      • The availability and stability of these services cannot be guaranteed
      • Use these services at your own risk
      • We take no responsibility for any issues or damages that may arise from using these endpoints

    ( 2 min )

  • Open

    Transforming network operations with AI: How Swisscom built a network assistant using Amazon Bedrock
    In this post, we explore how Swisscom developed their Network Assistant. We discuss the initial challenges and how they implemented a solution that delivers measurable benefits. We examine the technical architecture, discuss key learnings, and look at future enhancements that can further transform network operations.  ( 32 min )
    End-to-End model training and deployment with Amazon SageMaker Unified Studio
    In this post, we guide you through the stages of customizing large language models (LLMs) with SageMaker Unified Studio and SageMaker AI, covering the end-to-end process starting from data discovery to fine-tuning FMs with SageMaker AI distributed training, tracking metrics using MLflow, and then deploying models using SageMaker AI inference for real-time inference. We also discuss best practices to choose the right instance size and share some debugging best practices while working with JupyterLab notebooks in SageMaker Unified Studio.  ( 37 min )

  • Open

    Optimize RAG in production environments using Amazon SageMaker JumpStart and Amazon OpenSearch Service
    In this post, we show how to use Amazon OpenSearch Service as a vector store to build an efficient RAG application.  ( 34 min )
    Advancing AI agent governance with Boomi and AWS: A unified approach to observability and compliance
    In this post, we share how Boomi partnered with AWS to help enterprises accelerate and scale AI adoption with confidence using Agent Control Tower.  ( 28 min )

  • Open

    Use Amazon SageMaker Unified Studio to build complex AI workflows using Amazon Bedrock Flows
    In this post, we demonstrate how you can use SageMaker Unified Studio to create complex AI workflows using Amazon Bedrock Flows.  ( 31 min )
    Accelerating AI innovation: Scale MCP servers for enterprise workloads with Amazon Bedrock
    In this post, we present a centralized Model Context Protocol (MCP) server implementation using Amazon Bedrock that provides shared access to tools and resources for enterprise AI workloads. The solution enables organizations to accelerate AI innovation by standardizing access to resources and tools through MCP, while maintaining security and governance through a centralized approach.  ( 32 min )
    Choosing the right approach for generative AI-powered structured data retrieval
    In this post, we explore five different patterns for implementing LLM-powered structured data query capabilities in AWS, including direct conversational interfaces, BI tool enhancements, and custom text-to-SQL solutions.  ( 32 min )
    Revolutionizing drug data analysis using Amazon Bedrock multimodal RAG capabilities
    In this post, we explore how Amazon Bedrock's multimodal RAG capabilities revolutionize drug data analysis by efficiently processing complex medical documentation containing text, images, graphs, and tables.  ( 32 min )
  • Open

    Milo cancer diary part 20 – extended remission
    Milo was back at North Downs Specialist Referrals today for his second scan since finishing his third (modified) ‘CHOP’ chemotherapy protocol. Amazingly he’s still looking clear, which means this is now the longest period of remission since he started treatment :) Our fingers will be crossed for the next scan in a couple of months […]  ( 12 min )
    June 2025
    Pupdate There’s been a bumper crop of raspberries this year, which has kept the boys entertained.. Berlin Google’s I/O Connect event was in Berlin once again, which provided a good chance to catch up with various communities and some of the product folk. I also took the chance to grab dinner with some local ex-pat […]  ( 15 min )

  • Open

    Build and deploy AI inference workflows with new enhancements to the Amazon SageMaker Python SDK
    In this post, we provide an overview of the user experience, detailing how to set up and deploy these workflows with multiple models using the SageMaker Python SDK. We walk through examples of building complex inference workflows, deploying them to SageMaker endpoints, and invoking them for real-time inference.  ( 35 min )
    Context extraction from image files in Amazon Q Business using LLMs
    In this post, we look at a step-by-step implementation for using the custom document enrichment (CDE) feature within an Amazon Q Business application to process standalone image files. We walk you through an AWS Lambda function configured within CDE to process various image file types, and showcase an example scenario of how this integration enhances Amazon Q Business's ability to provide comprehensive insights.  ( 100 min )
    Build AWS architecture diagrams using Amazon Q CLI and MCP
    In this post, we explore how to use Amazon Q Developer CLI with the AWS Diagram MCP and the AWS Documentation MCP servers to create sophisticated architecture diagrams that follow AWS best practices. We discuss techniques for basic diagrams and real-world diagrams, with detailed examples and step-by-step instructions.  ( 98 min )
  • Open

    SRE Weekly Issue #483
    View on sreweekly.com A message from our sponsor, PagerDuty: When the internet faltered on June 12th, other incident management platforms may have crashed—but PagerDuty handled a 172% surge in incidents and 433% spike in notifications flawlessly. Your platform should be rock-solid during a storm, not another worry. See what sets PagerDuty’s reliability apart. The same […]  ( 4 min )

  • Open

    AWS costs estimation using Amazon Q CLI and AWS Cost Analysis MCP
    In this post, we explore how to use Amazon Q CLI with the AWS Cost Analysis MCP server to perform sophisticated cost analysis that follows AWS best practices. We discuss basic setup and advanced techniques, with detailed examples and step-by-step instructions.  ( 98 min )

  • Open

    Tailor responsible AI with new safeguard tiers in Amazon Bedrock Guardrails
    In this post, we introduce the new safeguard tiers available in Amazon Bedrock Guardrails, explain their benefits and use cases, and provide guidance on how to implement and evaluate them in your AI applications.  ( 98 min )
    Structured data response with Amazon Bedrock: Prompt Engineering and Tool Use
    We demonstrate two methods for generating structured responses with Amazon Bedrock: Prompt Engineering and Tool Use with the Converse API. Prompt Engineering is flexible, works with Bedrock models (including those without Tool Use support), and handles various schema types (e.g., OpenAPI schemas), making it a great starting point. Tool Use offers greater reliability, consistent results, seamless API integration, and runtime validation of JSON schema for enhanced control.  ( 95 min )
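    As a rough illustration of the Tool Use method, the sketch below builds a tool definition from a JSON schema and pulls the structured result out of a response. It makes no AWS call. The dictionary shapes follow the Converse API as I understand it (`toolConfig` with `toolSpec` and `inputSchema`, and `toolUse` content blocks) and should be checked against the current documentation; the helper names, the tool name `record_summary`, and the sample response are hypothetical.

```python
from typing import Any, Dict


def build_tool_config(name: str, description: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a JSON schema as a single tool and ask the model to call it."""
    return {
        "tools": [
            {
                "toolSpec": {
                    "name": name,
                    "description": description,
                    "inputSchema": {"json": schema},
                }
            }
        ],
        # Naming the tool here requests that the reply be a call to that tool.
        "toolChoice": {"tool": {"name": name}},
    }


def extract_tool_input(response: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Return the structured arguments the model supplied for the named tool."""
    for block in response["output"]["message"]["content"]:
        tool_use = block.get("toolUse")
        if tool_use is not None and tool_use["name"] == name:
            return tool_use["input"]
    raise ValueError(f"no toolUse block found for tool {name!r}")


# Hypothetical schema describing the structure we want back.
SUMMARY_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
    },
    "required": ["summary", "sentiment"],
}

tool_config = build_tool_config(
    "record_summary", "Record a structured summary of the text.", SUMMARY_SCHEMA
)
```

    With boto3, `tool_config` would typically be passed as the `toolConfig` argument of the `bedrock-runtime` client's `converse` call, and the response handed to `extract_tool_input`.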
    Using Amazon SageMaker AI Random Cut Forest for NASA’s Blue Origin spacecraft sensor data
    In this post, we demonstrate how to use SageMaker AI to apply the Random Cut Forest (RCF) algorithm to detect anomalies in spacecraft position, velocity, and quaternion orientation data from NASA and Blue Origin’s demonstration of lunar Deorbit, Descent, and Landing Sensors (BODDL-TP).  ( 99 min )

  • Open

    Build an intelligent multi-agent business expert using Amazon Bedrock
    In this post, we demonstrate how to build a multi-agent system using multi-agent collaboration in Amazon Bedrock Agents to solve complex business questions in the biopharmaceutical industry. We show how specialized agents in research and development (R&D), legal, and finance domains can work together to provide comprehensive business insights by analyzing data from multiple sources.  ( 100 min )
    Driving cost-efficiency and speed in claims data processing with Amazon Nova Micro and Amazon Nova Lite
    In this post, we share how an internal technology team at Amazon evaluated Amazon Nova models, resulting in notable improvements in inference speed and cost-efficiency.  ( 93 min )

  • Open

    Power Your LLM Training and Evaluation with the New SageMaker AI Generative AI Tools
    Today we are excited to introduce the Text Ranking and Question and Answer UI templates to SageMaker AI customers. In this blog post, we’ll walk you through how to set up these templates in SageMaker to create high-quality datasets for training your large language models.  ( 95 min )
    Amazon Bedrock Agents observability using Arize AI
    Today, we’re excited to announce a new integration between Arize AI and Amazon Bedrock Agents that addresses one of the most significant challenges in AI development: observability. In this post, we demonstrate the Arize Phoenix system for tracing and evaluation.  ( 100 min )
    How SkillShow automates youth sports video processing using Amazon Transcribe
    SkillShow, a leader in youth sports video production, films over 300 events yearly in the youth sports industry, creating content for over 20,000 young athletes annually. This post describes how SkillShow used Amazon Transcribe and other Amazon Web Services (AWS) machine learning (ML) services to automate their video processing workflow, reducing editing time and costs while scaling their operations.  ( 93 min )
    NewDay builds A Generative AI based Customer service Agent Assist with over 90% accuracy
    This post is co-written with Sergio Zavota and Amy Perring from NewDay. NewDay has a clear and defining purpose: to help people move forward with credit. NewDay provides around 4 million customers access to credit responsibly and delivers exceptional customer experiences, powered by their in-house technology system. NewDay’s contact center handles 2.5 million calls annually, […]  ( 95 min )

  • Open

    No-code data preparation for time series forecasting using Amazon SageMaker Canvas
    Amazon SageMaker Canvas offers no-code solutions that simplify data wrangling, making time series forecasting accessible to all users regardless of their technical background. In this post, we explore how SageMaker Canvas and SageMaker Data Wrangler provide no-code data preparation techniques that empower users of all backgrounds to prepare data and build time series forecasting models in a single interface with confidence.  ( 92 min )
    Build an agentic multimodal AI assistant with Amazon Nova and Amazon Bedrock Data Automation
    In this post, we demonstrate how agentic workflow patterns such as Retrieval Augmented Generation (RAG), multi-tool orchestration, and conditional routing with LangGraph enable end-to-end solutions that artificial intelligence and machine learning (AI/ML) developers and enterprise architects can adopt and extend. We walk through an example of a financial management AI assistant that can provide quantitative research and grounded financial advice by analyzing both the earnings call (audio) and the presentation slides (images), along with relevant financial data feeds.  ( 98 min )
  • Open

    SRE Weekly Issue #482
    View on sreweekly.com A message from our sponsor, PagerDuty: Incidents move fast. But you’ll never get left behind with PagerDuty’s GenAI incident response assistant, available in all paid plans. Get instant business impact analysis, troubleshooting steps, and auto-drafted status updates—directly in Slack. Stop context-switching, start resolving faster. https://fnf.dev/4dZ5V36 Service Disruption on multiple Salesforce services on […]  ( 4 min )
2025-07-22T10:20:10.441Z osmosfeed 1.15.1