| Management number | 231974955 |
|---|---|
| Release Date | 2026/06/18 |
| List Price | US$8.66 |
| Model Number | 231974955 |
| Category | |

Transform Generative AI Prototypes into Enterprise-Grade AWS Production Systems

The gap between a local generative AI script and a mission-critical cloud environment is massive. While building a conversational chatbot prototype is relatively simple, scaling an autonomous, multi-agent architecture introduces severe engineering bottlenecks. Unpredictable token costs, multi-second autoregressive latency, and non-deterministic model hallucinations routinely derail enterprise AI deployments before they ever reach production.

This book is the definitive engineering manual for cloud developers, AI Architects, and DevOps professionals tasked with operationalizing large language models. Moving far beyond the hype of basic prompt engineering, this comprehensive guide delivers hands-on, architectural blueprints for building highly available, cost-effective, and secure generative AI systems natively on Amazon Web Services (AWS).

Inside, you will master the exact infrastructure patterns required to conquer autoregressive latency and strictly govern your cloud compute costs. Through detailed, programmatic examples utilizing Python and the AWS Cloud Development Kit (CDK), you will learn how to bypass standard REST API timeouts by streaming foundational model responses directly to the frontend using Amazon API Gateway and WebSockets. You will discover how to implement sub-millisecond semantic caching layers with Amazon ElastiCache for Redis, drastically reducing expensive GPU compute cycles by intercepting and resolving mathematically similar user queries before they ever reach the foundation model.

Furthermore, this guide establishes a rigorous, production-ready foundation for LLMOps. You will learn how to transition from brittle, manual prompt testing to fully automated continuous integration pipelines using the RAGAS framework and automated LLM-as-a-judge methodologies. By configuring dynamic model routing, cross-region inference profiles, and granular Amazon CloudWatch telemetry dashboards, you will secure complete operational observability over your token economics and system health.

Finally, you will prepare for the agentic future of cloud computing by learning how to orchestrate stateful, multi-agent networks equipped with persistent episodic memory, tool-calling capabilities, and strict enterprise security guardrails using Amazon Bedrock.

Core Topics Covered:

- Mitigating perceived latency via Bedrock response streaming and serverless WebSocket broadcasting.
- Drastically reducing token consumption through vector embeddings and semantic caching.
- Deploying secure, version-controlled Generative AI infrastructure using the AWS CDK.
- Evaluating Retrieval-Augmented Generation (RAG) pipelines automatically using the RAGAS framework.
- Monitoring token economics, execution latency, and capacity throttling with Amazon CloudWatch.
- Architecting stateful, autonomous agentic networks and reasoning workflows.

Stop wrestling with fragile AI wrappers and unpredictable cloud billing. Equip yourself with the deep architectural mastery required to design, deploy, and monitor resilient generative AI workloads at scale.

Scroll up and click "Buy Now" to secure your competitive edge in the agentic cloud era.

| ASIN | B0GSJLSVWX |
|---|---|
| ISBN13 | 979-8251944853 |
| Language | English |
| Publisher | Independently published |
| Dimensions | 7 x 0.53 x 10 inches |
| Item Weight | 1.15 pounds |
| Print length | 231 pages |
| Publication date | March 13, 2026 |