    AWS and NVIDIA deepen strategic collaboration to accelerate AI from pilot to production
    Today at NVIDIA GTC 2026, AWS and NVIDIA announced an expanded collaboration with new technology integrations to support growing AI compute demand and help you build and run AI solutions that are production-ready.  ( 109 min )
    Agentic AI in the Enterprise Part 2: Guidance by Persona
    This is Part II of a two-part series from the AWS Generative AI Innovation Center. In Part II, we speak directly to the leaders who must turn that shared foundation into action. Each role carries a distinct set of responsibilities, risks, and leverage points. Whether you own a P&L, run enterprise architecture, lead security, govern data, or manage compliance, this section is written in the language of your job—because that's where agentic AI either succeeds or quietly dies.  ( 113 min )
    Introducing Disaggregated Inference on AWS powered by llm-d
    In this blog post, we introduce the concepts behind next-generation inference capabilities, including disaggregated serving, intelligent request scheduling, and expert parallelism. We discuss their benefits and walk through how you can implement them on Amazon SageMaker HyperPod EKS to achieve significant improvements in inference performance, resource utilization, and operational efficiency.  ( 115 min )
    How Workhuman built multi-tenant self-service reporting using Amazon Quick Sight embedded dashboards
    This post explores how Workhuman transformed their analytics delivery model and the key lessons learned from their implementation. We go through their architecture approach, implementation strategy, and the business outcomes they achieved—providing you with a practical blueprint for adding embedded analytics to your own software as a service (SaaS) applications.  ( 114 min )
    Build an offline feature store using Amazon SageMaker Unified Studio and SageMaker Catalog
    This blog post provides step-by-step guidance on implementing an offline feature store using SageMaker Catalog within a SageMaker Unified Studio domain. By adopting a publish-subscribe pattern, data producers can use this solution to publish curated, versioned feature tables—while data consumers can securely discover, subscribe to, and reuse them for model development.  ( 116 min )

    P-EAGLE: Faster LLM inference with Parallel Speculative Decoding in vLLM
    In this post, we explain how P-EAGLE works, how we integrated it into vLLM starting from v0.16.0 (PR#32887), and how to serve it with our pre-trained checkpoints.  ( 113 min )

    Improve operational visibility for inference workloads on Amazon Bedrock with new CloudWatch metrics for TTFT and Estimated Quota Consumption
    Today, we’re announcing two new Amazon CloudWatch metrics for Amazon Bedrock, TimeToFirstToken and EstimatedTPMQuotaUsage. In this post, we cover how these work and how to set alarms, establish baselines, and proactively manage capacity using them.  ( 112 min )
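    The alarm setup described above can be sketched with CloudWatch's `put_metric_alarm`. The metric name and namespace follow the announcement; the `ModelId` dimension, the p90 statistic, and the 1500 ms threshold are assumptions to adapt to your workload:

```python
# Sketch: a p90 alarm on the new TimeToFirstToken metric for one Bedrock
# model. The "ModelId" dimension and the threshold are assumptions, not
# prescriptions from the announcement.

def ttft_alarm_params(model_id: str, threshold_ms: float) -> dict:
    """Build the kwargs for cloudwatch.put_metric_alarm()."""
    return {
        "AlarmName": f"bedrock-ttft-p90-{model_id}",
        "Namespace": "AWS/Bedrock",
        "MetricName": "TimeToFirstToken",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "ExtendedStatistic": "p90",          # watch tail latency, not the average
        "Period": 300,                       # evaluate over 5-minute windows
        "EvaluationPeriods": 3,              # require 3 consecutive breaches
        "Threshold": threshold_ms,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # idle periods should not alarm
    }

# To create the alarm (requires AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**ttft_alarm_params("my-model-id", 1500))
params = ttft_alarm_params("my-model-id", 1500)
```

    Using p90 rather than the average keeps the alarm focused on the tail latency users actually feel.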
    Secure AI agents with Policy in Amazon Bedrock AgentCore
    In this post, you will understand how Policy in Amazon Bedrock AgentCore creates a deterministic enforcement layer that operates independently of the agent's own reasoning. You will learn how to turn natural language descriptions of your business rules into Cedar policies, then use those policies to enforce fine-grained, identity-aware controls so that agents only access the tools and data that their users are authorized to use. You will also see how to apply Policy through AgentCore Gateway, intercepting and evaluating every agent-to-tool request at runtime.  ( 115 min )
    Multimodal embeddings at scale: AI data lake for media and entertainment workloads
    This post shows you how to build a scalable multimodal video search system that enables natural language search across large video datasets using Amazon Nova models and Amazon OpenSearch Service. You will learn how to move beyond manual tagging and keyword-based searches to enable semantic search that captures the full richness of video content.  ( 114 min )
    Fine-tuning NVIDIA Nemotron Speech ASR on Amazon EC2 for domain adaptation
    In this post, we explore how to fine-tune Parakeet TDT 0.6B V2, a leaderboard-topping NVIDIA Nemotron Speech automatic speech recognition (ASR) model, using synthetic speech data to achieve superior transcription results for specialised applications. We walk through an end-to-end workflow that combines AWS infrastructure with popular open-source frameworks.  ( 120 min )

    Operationalizing Agentic AI Part 1: A Stakeholder’s Guide
    The AWS Generative AI Innovation Center has helped 1,000+ customers move AI into production, delivering millions in documented productivity gains. In this post, we share guidance for leaders across the C-suite: CTOs, CISOs, CDOs, and Chief Data Science/AI officers, as well as business owners and compliance leads.  ( 110 min )

    Accelerate custom LLM deployment: Fine-tune with Oumi and deploy to Amazon Bedrock
    In this post, we show how to fine-tune a Llama model using Oumi on Amazon EC2 (with the option to create synthetic data using Oumi), store artifacts in Amazon S3, and deploy to Amazon Bedrock using Custom Model Import for managed inference.  ( 110 min )

    Run NVIDIA Nemotron 3 Nano as a fully managed serverless model on Amazon Bedrock
    We are excited to announce that NVIDIA’s Nemotron 3 Nano is now available as a fully managed and serverless model in Amazon Bedrock. This follows our earlier announcement at AWS re:Invent supporting NVIDIA Nemotron 2 Nano 9B and NVIDIA Nemotron 2 Nano VL 12B models. This post explores the technical characteristics of the NVIDIA Nemotron 3 Nano model and discusses potential application use cases. Additionally, it provides technical guidance to help you get started using this model for your generative AI applications within the Amazon Bedrock environment.  ( 111 min )
    Access Anthropic Claude models in India on Amazon Bedrock with Global cross-Region inference
    In this post, you will discover how to use Amazon Bedrock's Global cross-Region Inference for Claude models in India. We will guide you through the capabilities of each Claude model variant and how to get started with a code example to help you start building generative AI applications immediately.  ( 115 min )
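    As a rough sketch of what such a code example might look like, here is the shape of a Converse API call routed through a global inference profile. The profile ID below is a placeholder, not a documented identifier; check the Bedrock console for the exact one available in your account:

```python
# Sketch: invoking a Claude model via a Global cross-Region inference
# profile with the Bedrock Converse API. GLOBAL_PROFILE_ID is a
# placeholder (an assumption).

GLOBAL_PROFILE_ID = "global.anthropic.claude-sonnet-4-5-example-v1:0"  # placeholder

def build_converse_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble kwargs for bedrock_runtime.converse(); Bedrock handles routing."""
    return {
        "modelId": GLOBAL_PROFILE_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": 0.2},
    }

# To invoke (requires AWS credentials and model access):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="ap-south-1")
#   reply = client.converse(**build_converse_request("Hello from Mumbai"))
#   print(reply["output"]["message"]["content"][0]["text"])
req = build_converse_request("Hello")
```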

    The importance of rule design
    Seeing the importance of rule design through one election.
    Rules for the contested election: 1. Method: a contested election with 6 candidates for 5 seats (a margin of 1). 2. Valid-ballot criteria: a ballot is valid if it approves 5 or fewer people (counting both "approve" marks and write-ins); a ballot approving more than 5 is spoiled. 3. Marking conventions, using the space above each candidate's name: approve, draw a circle (○); oppose, draw a cross (×); abstain, leave the space blank. To vote for someone not on the list, write that person's name in the designated write-in field and draw ○ above it; write-ins count toward the total, so the combined number of approvals (candidates plus write-ins) must not exceed 5, or the ballot is void. 4. Notes: marks must be clear and legible; ballots must not be altered, since an altered or illegible ballot may be ruled invalid, as may any ballot not filled out according to the rules.
    In plain terms (choose 5 of 6): you may approve at most 5 people. Circle a name to approve it, cross it to oppose, leave it blank to abstain. Circle more than 5 names and your ballot is void. You may approve just 1 person, or 4, or 5, or oppose or abstain on everyone.
    A real election: 41 voters, 40 identical ballots. Recently I took part in a small election: 41 voters choosing 5 of 6 candidates. Honestly, I really knew only 2 of the 6; the rest were strangers to me. Studying the ballot, the names seemed to be listed in no particular order, with no obvious bias (I later learned they were sorted strictly by character stroke count, fairness taken to its extreme). Faced with unfamiliar names, how do you choose? In that moment nobody around me was whispering or deliberating; an unconscious instinct took over: since I can't tell them apart, I'll just go down the list and pick the first five. I felt my choice was "objective", even carried off with a touch of "random" nonchalance. Then the count came in, and I froze. Of the 41 voters, 40 had made exactly the same choice I had. Only one person voted differently. Forty people, without talking to each other, without collusion, without any hint, had uniformly ticked the first five names on the list. There was no backroom dealing; the process was open and transparent. Yet the result felt like some silent magic, and it left me deeply shaken.
    Why such a startling "silent consensus"? I asked Qwen, and the AI offered an answer: however rigorous the ballot design, the symmetry of information in voters' hands matters. When information is asymmetric, most people fall back on the inertia of mental shortcuts. Sorting candidates by stroke count was meant to embody fairness and rule out human interference, but when voters know little about the candidates, this "absolutely neutral" ordering subconsciously becomes a "recommended order". The primacy effect leads the brain to assume that whoever is listed first is perhaps more senior and more reliable, so "tick the first five" becomes the decision path with the lowest cognitive cost.  ( 2 min )
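    To see how unlikely that unanimity would be if voters chose at random (a simplifying assumption; the post argues they clearly did not), a quick back-of-the-envelope calculation:

```python
from math import comb

# There are only comb(6, 5) = 6 possible "approve exactly 5 of 6" ballots.
n_ballots = comb(6, 5)

# If the other 40 voters each picked one of those 6 ballots uniformly at
# random (the simplifying assumption), the chance that all 40 match the
# narrator's ballot is (1/6)**40: vanishingly small, which is why the
# primacy-effect explanation is far more plausible than coincidence.
p_all_match = (1 / n_ballots) ** 40
```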

    Drive organizational growth with Amazon Lex multi-developer CI/CD pipeline
    In this post, we walk through a multi-developer CI/CD pipeline for Amazon Lex that enables isolated development environments, automated testing, and streamlined deployments. We show you how to set up the solution and share real-world results from teams using this approach.  ( 111 min )
    Building custom model provider for Strands Agents with LLMs hosted on SageMaker AI endpoints
    This post demonstrates how to build custom model parsers for Strands agents when working with LLMs hosted on SageMaker that don't natively support the Bedrock Messages API format. We'll walk through deploying Llama 3.1 with SGLang on SageMaker using awslabs/ml-container-creator, then implementing a custom parser to integrate it with Strands agents.  ( 110 min )

    Embed Amazon Quick Suite chat agents in enterprise applications
    Organizations find it challenging to implement secure embedded chat in their applications, and doing so can require weeks of development to build authentication, token validation, domain security, and global distribution infrastructure. In this post, we show you how to solve this with a one-click deployment solution that embeds chat agents using the Quick Suite Embedding SDK in enterprise portals.  ( 109 min )
    Unlock powerful call center analytics with Amazon Nova foundation models
    In this post, we discuss how Amazon Nova demonstrates capabilities in conversational analytics, call classification, and other use cases often relevant to contact center solutions. We examine these capabilities for both single-call and multi-call analytics use cases.  ( 111 min )
    How Ricoh built a scalable intelligent document processing solution on AWS
    This post explores how Ricoh built a standardized, multi-tenant solution for automated document classification and extraction using the AWS GenAI IDP Accelerator as a foundation, transforming their document processing from a custom-engineering bottleneck into a scalable, repeatable service.  ( 118 min )

    Building a scalable virtual try-on solution using Amazon Nova on AWS: part 1
    In this post, we explore the virtual try-on capability now available in Amazon Nova Canvas, including sample code to get started quickly and tips to help get the best outputs.  ( 112 min )
    How Lendi revamped the refinance journey for its customers using agentic AI in 16 weeks using Amazon Bedrock
    This post details how Lendi Group built their AI-powered Home Loan Guardian using Amazon Bedrock, the challenges they faced, the architecture they implemented, and the significant business outcomes they’ve achieved. Their journey offers valuable insights for organizations that want to use generative AI to transform customer experiences while maintaining the human touch that builds trust and loyalty.  ( 113 min )
    How Tines enhances security analysis with Amazon Quick Suite
    In this post, we show you how to connect Quick Suite with Tines to securely retrieve, analyze, and visualize enterprise data from any security or IT system. We walk through an example that uses an MCP server in Tines to retrieve data from various tools, such as AWS CloudTrail, Okta, and VirusTotal, to remediate security events using Quick Suite.  ( 110 min )

    Building specialized AI without sacrificing intelligence: Nova Forge data mixing in action
    In this post, we share results from the AWS China Applied Science team's comprehensive evaluation of Nova Forge using a challenging Voice of Customer (VOC) classification task, benchmarked against open-source models.  ( 111 min )
    Build a serverless conversational AI agent using Claude with LangGraph and managed MLflow on Amazon SageMaker AI
    This post explores how to build an intelligent conversational agent using Amazon Bedrock, LangGraph, and managed MLflow on Amazon SageMaker AI.  ( 114 min )
    Build safe generative AI applications like a Pro: Best Practices with Amazon Bedrock Guardrails
    In this post, we will show you how to configure Amazon Bedrock Guardrails for efficient performance, implement best practices to protect your applications, and monitor your deployment effectively to maintain the right balance between safety and user experience.  ( 114 min )
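    One practice in this area is screening text with the standalone ApplyGuardrail API before it ever reaches the model. The sketch below builds that request; the guardrail ID and version are placeholders (assumptions), not values from the post:

```python
# Sketch: checking user input against a Bedrock guardrail independently of
# model invocation, via the bedrock-runtime ApplyGuardrail API. The
# guardrail ID and version are placeholders.

def build_apply_guardrail_request(text: str,
                                  guardrail_id: str = "gr-EXAMPLE",
                                  version: str = "1") -> dict:
    """Assemble kwargs for bedrock_runtime.apply_guardrail()."""
    return {
        "guardrailIdentifier": guardrail_id,  # placeholder ID
        "guardrailVersion": version,
        "source": "INPUT",                    # use "OUTPUT" to screen model responses
        "content": [{"text": {"text": text}}],
    }

# To apply (requires AWS credentials):
#   import boto3
#   resp = boto3.client("bedrock-runtime").apply_guardrail(
#       **build_apply_guardrail_request("user text here"))
#   blocked = resp["action"] == "GUARDRAIL_INTERVENED"
req = build_apply_guardrail_request("How do I reset my password?")
```

    Screening input and output separately lets you tune each direction's policies without re-invoking the model.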

    February 2026
    Pupdate It’s been a pretty dank February, so the coats have mostly stayed on for walks. But the boys have been enjoying their usual doggy mischief. Milo is now halfway through his 4th chemo protocol, and the second half has previously been easier as the pace slows down to vet visits every two weeks. […]  ( 16 min )

    Publishing apt and yum/dnf repos on GitHub Pages
    TL;DR GitHub Pages is a practical way to host a low volume repo for apt and yum/dnf. The relevant metadata can be generated using GitHub Actions, and the process can be triggered by a release from the source repo. Background In my last post I wrote about creating .deb and .rpm packages (for our Dart […]  ( 14 min )

    Learnings from COBOL modernization in the real world
    Delivering successful COBOL modernization requires a solution that can reverse engineer deterministically, produce validated and traceable specs, and help those specs flow into any AI-powered coding assistant for forward engineering. A successful modernization requires both reverse engineering and forward engineering. Learn more about COBOL modernization in this post.  ( 109 min )
    Reinforcement fine-tuning for Amazon Nova: Teaching AI through feedback
    In this post, we explore reinforcement fine-tuning (RFT) for Amazon Nova models, which can be a powerful customization technique that learns through evaluation rather than imitation. We'll cover how RFT works, when to use it versus supervised fine-tuning, real-world applications from code generation to customer service, and implementation options ranging from fully managed Amazon Bedrock to multi-turn agentic workflows with Nova Forge. You'll also learn practical guidance on data preparation, reward function design, and best practices for achieving optimal results.  ( 118 min )
    Large model inference container – latest capabilities and performance enhancements
    AWS recently released significant updates to the Large Model Inference (LMI) container, delivering comprehensive performance improvements, expanded model support, and streamlined deployment capabilities for customers hosting LLMs on AWS. These releases focus on reducing operational complexity while delivering measurable performance gains across popular model architectures.  ( 111 min )

    Efficiently serve dozens of fine-tuned models with vLLM on Amazon SageMaker AI and Amazon Bedrock
    In this post, we explain how we implemented multi-LoRA inference for Mixture of Experts (MoE) models in vLLM, describe the kernel-level optimizations we performed, and show you how you can benefit from this work. We use GPT-OSS 20B as our primary example throughout this post.  ( 114 min )
    Building intelligent event agents using Amazon Bedrock AgentCore and Amazon Bedrock Knowledge Bases
    This post demonstrates how to quickly deploy a production-ready event assistant using the components of Amazon Bedrock AgentCore. We'll build an intelligent companion that remembers attendee preferences and builds personalized experiences over time, while Amazon Bedrock AgentCore handles the heavy lifting of production deployment: Amazon Bedrock AgentCore Memory for maintaining both conversation context and long-term preferences without custom storage solutions, Amazon Bedrock AgentCore Identity for secure multi-IDP authentication, and Amazon Bedrock AgentCore Runtime for serverless scaling and session isolation. We will also use Amazon Bedrock Knowledge Bases for managed RAG and event data retrieval.  ( 112 min )

    Build an intelligent photo search using Amazon Rekognition, Amazon Neptune, and Amazon Bedrock
    In this post, we show you how to build a comprehensive photo search system using the AWS Cloud Development Kit (AWS CDK) that integrates Amazon Rekognition for face and object detection, Amazon Neptune for relationship mapping, and Amazon Bedrock for AI-powered captioning.  ( 112 min )
    Train CodeFu-7B with veRL and Ray on Amazon SageMaker Training jobs
    In this post, we demonstrate how to train CodeFu-7B, a specialized 7-billion-parameter model for competitive programming, using Group Relative Policy Optimization (GRPO) with veRL within a distributed Ray cluster managed by SageMaker training jobs. veRL is a flexible and efficient training library for large language models (LLMs) that enables straightforward extension of diverse RL algorithms and seamless integration with existing LLM infrastructure. We walk through the complete implementation, covering data preparation, distributed training setup, and comprehensive observability, showcasing how this unified approach delivers both computational scale and a solid developer experience for sophisticated RL training workloads.  ( 118 min )
    Generate structured output from LLMs with Dottxt Outlines in AWS
    This post explores the implementation of Dottxt’s Outlines framework as a practical approach to implementing structured outputs using AWS Marketplace in Amazon SageMaker.  ( 114 min )
    Global cross-Region inference for latest Anthropic Claude Opus, Sonnet and Haiku models on Amazon Bedrock in Thailand, Malaysia, Singapore, Indonesia, and Taiwan
    In this post, we are excited to announce the availability of Global CRIS for customers in Thailand, Malaysia, Singapore, Indonesia, and Taiwan. We walk through the technical implementation steps, cover quota management best practices to maximize the value of your AI inference deployments, and share guidance for production deployments.  ( 116 min )
    Introducing Amazon Bedrock global cross-Region inference for Anthropic’s Claude models in the Middle East Regions (UAE and Bahrain)
    We’re excited to announce the availability of Anthropic’s Claude Opus 4.6, Claude Sonnet 4.6, Claude Opus 4.5, Claude Sonnet 4.5, and Claude Haiku 4.5 through Amazon Bedrock global cross-Region inference for customers operating in the Middle East. In this post, we guide you through the capabilities of each Anthropic Claude model variant, the key advantages of global cross-Region inference including improved resilience, real-world use cases you can implement, and a code example to help you start building generative AI applications immediately.  ( 110 min )

    Packaging Dart binaries as .deb and .rpm etc.
    TL;DR nFPM makes it very easy to put your binaries into a Debian .deb or RedHat Package Manager .rpm file. Background We’ve been using full stack Dart and Flutter at Atsign since the dawn of the company in 2019, so when NoPorts came along we released the binaries in tarballs (or zip files) from GitHub […]  ( 14 min )

    Scaling data annotation using vision-language models to power physical AI systems
    In this post, we examine how Bedrock Robotics tackles this challenge. By joining the AWS Physical AI Fellowship, the startup partnered with the AWS Generative AI Innovation Center to apply vision-language models that analyze construction video footage, extract operational details, and generate labeled training datasets at scale, to improve data preparation for autonomous construction equipment.  ( 109 min )
    How Sonrai uses Amazon SageMaker AI to accelerate precision medicine trials
    In this post, we explore how Sonrai, a life sciences AI company, partnered with AWS to build a robust MLOps framework using Amazon SageMaker AI that addresses these challenges while maintaining the traceability and reproducibility required in regulated environments.  ( 111 min )
    Accelerating AI model production at Hexagon with Amazon SageMaker HyperPod
    In this blog post, we demonstrate how Hexagon collaborated with Amazon Web Services to scale their AI model production by pretraining state-of-the-art segmentation models, using the model training infrastructure of Amazon SageMaker HyperPod.  ( 110 min )
    Agentic AI with multi-model framework using Hugging Face smolagents on AWS
    Hugging Face smolagents is an open source Python library designed to make it straightforward to build and run agents using a few lines of code. We will show you how to build an agentic AI solution by integrating Hugging Face smolagents with Amazon Web Services (AWS) managed services. You'll learn how to deploy a healthcare AI agent that demonstrates multi-model deployment options, vector-enhanced knowledge retrieval, and clinical decision support capabilities.  ( 116 min )

    Amazon SageMaker AI in 2025, a year in review part 1: Flexible Training Plans and improvements to price performance for inference workloads
    In 2025, Amazon SageMaker AI saw dramatic improvements to core infrastructure offerings along four dimensions: capacity, price performance, observability, and usability. In this series of posts, we discuss these various improvements and their benefits. In Part 1, we discuss capacity improvements with the launch of Flexible Training Plans. We also describe improvements to price performance for inference workloads. In Part 2, we discuss enhancements made to observability, model customization, and model hosting.  ( 113 min )
    Amazon SageMaker AI in 2025, a year in review part 2: Improved observability and enhanced features for SageMaker AI model customization and hosting
    In 2025, Amazon SageMaker AI made several improvements designed to help you train, tune, and host generative AI workloads. In Part 1 of this series, we discussed Flexible Training Plans and price performance improvements made to inference components. In this post, we discuss enhancements made to observability, model customization, and model hosting. These improvements facilitate a whole new class of customer use cases to be hosted on SageMaker AI.  ( 111 min )
    Integrate external tools with Amazon Quick Agents using Model Context Protocol (MCP)
    In this post, you’ll use a six-step checklist to build a new MCP server, or to validate and adjust an existing MCP server, for Amazon Quick integration. The Amazon Quick User Guide describes the MCP client behavior and constraints. This is a how-to guide covering the detailed implementation required for third-party (3P) partners to integrate with Amazon Quick via MCP.  ( 112 min )

    Build AI workflows on Amazon EKS with Union.ai and Flyte
    In this post, we explain how you can use the Flyte Python SDK to orchestrate and scale AI/ML workflows. We explore how the Union.ai 2.0 system enables deployment of Flyte on Amazon Elastic Kubernetes Service (Amazon EKS), integrating seamlessly with AWS services like Amazon Simple Storage Service (Amazon S3), Amazon Aurora, AWS Identity and Access Management (IAM), and Amazon CloudWatch. We explore the solution through an AI workflow example, using the new Amazon S3 Vectors service.  ( 115 min )
    Amazon Quick Suite now supports key pair authentication to Snowflake data source
    In this blog post, we will guide you through establishing data source connectivity between Amazon Quick Sight and Snowflake through secure key pair authentication.  ( 110 min )

    Build unified intelligence with Amazon Bedrock AgentCore
    In this post, we demonstrate how to build unified intelligence systems using Amazon Bedrock AgentCore through our real-world implementation of the Customer Agent and Knowledge Engine (CAKE).  ( 115 min )
    Evaluating AI agents: Real-world lessons from building agentic systems at Amazon
    In this post, we present a comprehensive evaluation framework for Amazon agentic AI systems that addresses the complexity of agentic AI applications at Amazon through two core components: a generic evaluation workflow that standardizes assessment procedures across diverse agent implementations, and an agent evaluation library that provides systematic measurements and metrics in Amazon Bedrock AgentCore Evaluations, along with Amazon use case-specific evaluation approaches and metrics.  ( 116 min )

    Supercharge regulated workloads with Claude Code and Amazon Bedrock
    The release of Anthropic Claude Sonnet 4.5 in the AWS GovCloud (US) Region introduces a straightforward on-ramp for AI-assisted development for workloads with regulatory compliance requirements. In this post, we explore how to combine Claude Sonnet 4.5 on Amazon Bedrock in AWS GovCloud (US) with Claude Code, an agentic coding assistant released by Anthropic. This […]  ( 110 min )
2026-03-17T14:54:03.006Z osmosfeed 1.15.1