The Kong AI Gateway enhances the security and management of generative AI workloads in several ways:
Introspecting AI Traffic: The gateway can introspect AI traffic and provide a unified API to consume one or more AI providers. This allows for better control and management of AI requests.
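As an illustrative sketch (not Kong's implementation), the kind of translation a unified AI API performs can be shown as one request shape adapted to two provider-specific payloads. The model names are placeholders; the OpenAI-style and Anthropic-style payload shapes reflect those providers' public chat APIs:

```python
# Sketch: a gateway-style translation layer that lets one client request
# shape target different LLM providers.

def to_openai(model: str, messages: list[dict]) -> dict:
    # OpenAI-style chat payload: system messages stay inside the list
    return {"model": model, "messages": messages}

def to_anthropic(model: str, messages: list[dict]) -> dict:
    # Anthropic-style payload: the system prompt is a top-level field
    system = " ".join(m["content"] for m in messages if m["role"] == "system")
    rest = [m for m in messages if m["role"] != "system"]
    return {"model": model, "system": system, "messages": rest}

msgs = [{"role": "system", "content": "Be concise."},
        {"role": "user", "content": "Hello"}]
print(to_openai("gpt-4o", msgs)["messages"][0]["role"])   # system
print(to_anthropic("claude-3", msgs)["system"])           # Be concise.
```

Because the client only ever builds `msgs`, switching providers becomes a gateway configuration change rather than a code change.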
Prompt Security: The gateway governs and controls AI prompts. Administrators can configure rules to authorize or block free-form prompts created by applications, helping to enforce compliance and restrict discussion of sensitive topics.
Centralized AI Credential Management: The gateway helps ensure secure and centralized storage of AI credentials within Kong Gateway. This design negates the need for credentials within applications, streamlining credential rotation and updates directly from the gateway.
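The idea can be sketched in a few lines: the gateway holds the provider keys and attaches them to outbound requests, so client applications never carry credentials. The store and function names here are hypothetical, not Kong's configuration schema:

```python
# Sketch of centralized credential injection at the gateway hop.
# PROVIDER_KEYS stands in for the gateway's secure credential store.

PROVIDER_KEYS = {"openai": "sk-stored-in-gateway"}  # hypothetical store

def inject_auth(provider: str, headers: dict) -> dict:
    # Add the Authorization header on the way out; rotating the key in
    # PROVIDER_KEYS updates every caller at once.
    out = dict(headers)
    out["Authorization"] = f"Bearer {PROVIDER_KEYS[provider]}"
    return out

client_headers = {"Content-Type": "application/json"}  # no secret client-side
print("Authorization" in inject_auth("openai", client_headers))  # True
```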
Layer 7 AI Metrics Collection: The gateway captures detailed Layer 7 AI analytics, including metrics such as request and response token counts, along with usage data for LLM providers and models. This enhances observability and offers insights into developer preferences and AI usage.
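A minimal sketch of this kind of metrics collection reads the `usage` block that OpenAI-compatible responses include; the shape of the emitted metrics record is illustrative:

```python
# Sketch: deriving Layer 7 AI metrics from an OpenAI-style response body.

def collect_metrics(provider: str, model: str, response: dict) -> dict:
    usage = response.get("usage", {})
    return {
        "provider": provider,
        "model": model,
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }

resp = {"usage": {"prompt_tokens": 12, "completion_tokens": 30,
                  "total_tokens": 42}}
m = collect_metrics("openai", "gpt-4o", resp)
print(m["total_tokens"])  # 42
```

Aggregating these records per provider and model is what yields the usage and cost visibility described above.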
AI Rate Limiting Advanced Plugin: This plugin can be used to implement rate limiting on AI request traffic. It introspects LLM responses to calculate token cost and enforce rate limits for the LLM backend.
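Token-cost rate limiting differs from ordinary request counting: each response debits its token count against a budget. The following is a simplified sliding-window sketch of that idea, not the plugin's actual algorithm; the limit and window values are illustrative:

```python
# Sketch: rate limiting by token cost over a sliding time window.
import time
from collections import deque

class TokenRateLimiter:
    def __init__(self, max_tokens: int, window_seconds: float):
        self.max_tokens = max_tokens
        self.window = window_seconds
        self.events = deque()  # (timestamp, tokens) pairs

    def allow(self, token_cost: int, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop events that have aged out of the window
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()
        spent = sum(t for _, t in self.events)
        if spent + token_cost > self.max_tokens:
            return False
        self.events.append((now, token_cost))
        return True

limiter = TokenRateLimiter(max_tokens=100, window_seconds=60)
print(limiter.allow(60, now=0.0))   # True
print(limiter.allow(60, now=1.0))   # False: would exceed 100 tokens
print(limiter.allow(60, now=61.5))  # True: the first event has expired
```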
AI Prompt Guard: This plugin provides a governance layer for AI prompts. It establishes rules to authorize or block free-form prompts created by applications, ensuring prompts adhere to approved standards before being transmitted to LLM providers.
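The rule model can be sketched as deny patterns that block a prompt outright, plus optional allow patterns the prompt must match. The regexes below are illustrative policy values, not a real configuration:

```python
# Sketch of prompt-guard style allow/deny rules.
import re

def check_prompt(prompt: str, allow: list, deny: list) -> bool:
    # Any deny match blocks the prompt
    if any(re.search(p, prompt, re.IGNORECASE) for p in deny):
        return False
    # If allow rules exist, at least one must match
    if allow and not any(re.search(p, prompt, re.IGNORECASE) for p in allow):
        return False
    return True

deny_rules = [r"\bsalary\b", r"\bpassword\b"]
allow_rules = [r"\binvoice\b", r"\border\b"]

print(check_prompt("Summarize this invoice", allow_rules, deny_rules))      # True
print(check_prompt("What is the CEO's salary?", allow_rules, deny_rules))   # False
```

Running this check at the gateway means the policy applies uniformly to every application, before any prompt reaches an LLM provider.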
AI Prompt Decorator Plugin: This plugin injects an array of llm/v1/chat messages at the start or end of a caller's chat history. This capability allows the caller to create more complex prompts and have more control over how a Large Language Model (LLM) is used when called via Kong Gateway.
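The decoration itself is simple to sketch: fixed messages are injected before and/or after the caller's history on the way to the provider. The message contents below are illustrative:

```python
# Sketch: injecting llm/v1/chat messages around a caller's chat history.

def decorate(history: list, prepend: list, append: list) -> list:
    # The caller never sees the injected messages; the gateway adds them
    # before forwarding to the LLM provider.
    return prepend + history + append

prepend = [{"role": "system", "content": "Answer only about billing."}]
append = [{"role": "user", "content": "Reply in one sentence."}]
history = [{"role": "user", "content": "Why was I charged twice?"}]

full = decorate(history, prepend, append)
print(len(full))        # 3
print(full[0]["role"])  # system
```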
By providing these features, the Kong AI Gateway helps organizations accelerate their AI transformation while increasing developer productivity, security, and performance at scale.
The Kong AI Gateway's ability to introspect AI traffic and provide a unified API offers several benefits for organizations:
Simplified AI Management: The unified API allows developers to consume multiple AI providers from the same client codebase, simplifying the process of connecting, managing, and securing access to multiple large language models (LLMs) without needing to write provider-specific code.
Enhanced Security and Compliance: The gateway provides prompt security, compliance, governance, templating, and a lifecycle around AI prompts. It also offers "L7 AI observability metrics" to give visibility into provider performance, token usage, and costs across AI traffic. According to Palladino, these capabilities distinguish Kong from competitors, which "stop at the API level, they don’t go deeper into the AI level."
Monetization of Fine-tuned AI Models: Kong’s unified control plane, Kong Konnect, enables organizations to monetize their fine-tuned AI models alongside traditional APIs. This is particularly valuable when the models are being fine-tuned with specific corporate intelligence that only the organization has, allowing them to harness and monetize that intelligence.
Scalability and Performance: The Kong AI Gateway is built on top of Kong's existing gateway features, providing deep AI-specific capabilities. This allows organizations to scale their AI workloads across any cloud environment, managing AI alongside traditional APIs.
Ease of Use: The Kong AI Gateway is designed for anyone, including developers, teams, and broader business functions, looking to integrate AI into their applications with greater ease, security, and standardization. This is particularly beneficial as organizations look to adopt AI in order to drive innovation and enhance their competitive edge.
The Kong AI Gateway offers a range of infrastructure capabilities tailored for AI, including:
Support for multiple large language models (LLMs): This allows users to leverage different LLMs for various use cases and manage them through a single interface.
Semantic caching: This capability stores and quickly retrieves frequently requested AI responses, improving performance and reducing costs.
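What makes the cache "semantic" is that hits are decided by embedding similarity rather than exact string match. The sketch below uses a toy bag-of-words embedding and an illustrative threshold in place of a real embedding model:

```python
# Sketch: a semantic cache keyed by vector similarity, not exact text.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: word counts stand in for a real embedding model
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # (vector, cached response) pairs

    def get(self, prompt: str):
        vec = embed(prompt)
        for cached_vec, response in self.entries:
            if cosine(vec, cached_vec) >= self.threshold:
                return response  # similar enough: reuse the stored answer
        return None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))

cache = SemanticCache()
cache.put("what is the refund policy", "Refunds within 30 days.")
print(cache.get("what is the refund policy please"))  # cache hit
print(cache.get("how do I reset my password"))        # None
```

A near-duplicate prompt hits the cache even though the text differs, which is what saves repeated LLM calls and their token costs.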
Semantic routing: AI requests are intelligently routed to the most appropriate LLM based on the request content, enhancing efficiency and accuracy.
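The routing decision can be sketched as scoring a request against a description of each backend and picking the best match. The word-overlap scoring, route descriptions, and model names below are all illustrative stand-ins for real embedding similarity:

```python
# Sketch: routing a request to the backend whose description it best matches.

def score(text: str, description: str) -> int:
    # Toy similarity: count shared words (a real system would use embeddings)
    return len(set(text.lower().split()) & set(description.lower().split()))

ROUTES = {
    "code-model": "write debug code python function program",
    "chat-model": "general chat conversation question answer",
}

def route(prompt: str) -> str:
    return max(ROUTES, key=lambda name: score(prompt, ROUTES[name]))

print(route("please debug this python function"))  # code-model
print(route("general question about shipping"))    # chat-model
```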
Semantic firewalling: This feature protects against malicious AI requests by analyzing the content and context of the requests.
Model lifecycle management: The Kong AI Gateway simplifies the process of managing the entire lifecycle of AI models, from development to deployment and maintenance.
These capabilities are designed to help organizations more easily adopt, manage, and secure AI technology across any cloud environment.