{"id":3893,"date":"2025-12-24T10:00:23","date_gmt":"2025-12-24T15:00:23","guid":{"rendered":"https:\/\/www.mymiller.name\/wordpress\/?p=3893"},"modified":"2025-12-24T10:00:23","modified_gmt":"2025-12-24T15:00:23","slug":"building-intelligent-apps-with-spring-ai","status":"publish","type":"post","link":"https:\/\/www.mymiller.name\/wordpress\/spring_ai\/building-intelligent-apps-with-spring-ai\/","title":{"rendered":"Building Intelligent Apps with Spring AI"},"content":{"rendered":"\n<p>In today&#8217;s fast-paced world of software development, integrating artificial intelligence into applications is no longer just a trend\u2014it&#8217;s a necessity. At the heart of this revolution is <strong>Generative AI<\/strong>, a type of artificial intelligence that can create new content, such as text, images, and code, in response to prompts. It&#8217;s fundamentally changing how we interact with technology and build software solutions. For the millions of developers who rely on the Spring Framework, the good news is that you don&#8217;t need to be an AI expert to get started. The <strong>Spring AI<\/strong> project provides a robust, idiomatic, and simplified approach to bringing these capabilities directly into your Java applications.<\/p>\n\n\n\n<p>This article will guide you through the process of adding Spring AI to your project, explore the core AI patterns it supports, and outline the key technologies you can integrate to build powerful, intelligent applications.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Adding Spring AI to Your Project<\/h2>\n\n\n\n<p>The first step is to configure your build file to include the necessary dependencies. 
Spring AI follows the familiar Spring Boot conventions, providing starter dependencies that handle the heavy lifting of auto-configuration.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Gradle Project<\/h3>\n\n\n\n<p>For Gradle, you&#8217;ll first need to add the Spring AI Bill of Materials (BOM) to your <code>dependencies<\/code> block using <code>platform()<\/code>. The BOM ensures that all Spring AI-related dependencies use compatible versions. You can then add the specific AI model and other dependencies you need.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>dependencies {\n    \/\/ Spring AI BOM for consistent versions\n    implementation platform(\"org.springframework.ai:spring-ai-bom:1.0.0\")\n\n    \/\/ Starter for OpenAI (or other LLMs)\n    implementation 'org.springframework.ai:spring-ai-starter-model-openai'\n\n    \/\/ Optional: for a vector database like Pinecone\n    implementation 'org.springframework.ai:spring-ai-starter-vector-store-pinecone'\n\n    \/\/ Other Spring Boot dependencies\n    implementation 'org.springframework.boot:spring-boot-starter-web'\n    \/\/ ...\n}\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Maven Project<\/h3>\n\n\n\n<p>For a Maven project, the process is very similar. 
You add the Spring AI BOM to the <code>&lt;dependencyManagement&gt;<\/code> section of your <code>pom.xml<\/code>, and then include the individual starter dependencies in your <code>&lt;dependencies&gt;<\/code> section.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt;dependencyManagement&gt;\n    &lt;dependencies&gt;\n        &lt;dependency&gt;\n            &lt;groupId&gt;org.springframework.ai&lt;\/groupId&gt;\n            &lt;artifactId&gt;spring-ai-bom&lt;\/artifactId&gt;\n            &lt;version&gt;1.0.0&lt;\/version&gt;\n            &lt;type&gt;pom&lt;\/type&gt;\n            &lt;scope&gt;import&lt;\/scope&gt;\n        &lt;\/dependency&gt;\n    &lt;\/dependencies&gt;\n&lt;\/dependencyManagement&gt;\n\n&lt;dependencies&gt;\n    &lt;dependency&gt;\n        &lt;groupId&gt;org.springframework.ai&lt;\/groupId&gt;\n        &lt;artifactId&gt;spring-ai-starter-model-openai&lt;\/artifactId&gt;\n    &lt;\/dependency&gt;\n    &lt;!-- Other dependencies --&gt;\n    &lt;dependency&gt;\n        &lt;groupId&gt;org.springframework.boot&lt;\/groupId&gt;\n        &lt;artifactId&gt;spring-boot-starter-web&lt;\/artifactId&gt;\n    &lt;\/dependency&gt;\n    &lt;!-- ... --&gt;\n&lt;\/dependencies&gt;\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">2. Key AI Patterns Supported by Spring AI<\/h2>\n\n\n\n<p>Spring AI is not just a simple API wrapper; it&#8217;s designed to help you implement sophisticated AI patterns in a portable and modular way.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>System Prompts &amp; Prompt Engineering<\/strong><\/h3>\n\n\n\n<p><strong>Description:<\/strong> Prompt Engineering is the art of crafting specific instructions and context to guide an LLM&#8217;s behavior. A <strong>System Prompt<\/strong> is a key part of this, acting as the foundation for the conversation by defining the LLM&#8217;s role, rules, and style. 
It provides constraints and instructions before the user ever provides input.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> A system prompt is invaluable for ensuring consistency. For a customer service chatbot, you could use a system prompt that says, &#8220;You are a friendly and professional customer support assistant. You must always be polite and ask for a ticket number for every new issue.&#8221; This helps the LLM maintain a specific persona and follow business rules.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Retrieval-Augmented Generation (RAG)<\/strong><\/h3>\n\n\n\n<p><strong>Description:<\/strong> RAG enhances an LLM&#8217;s ability to answer questions by giving it access to external, private, or real-time data sources. It overcomes the LLM&#8217;s static knowledge by retrieving relevant information from your documents and &#8220;stuffing&#8221; it into the prompt. This process, often called <strong>Prompt Stuffing<\/strong>, provides the LLM with the context it needs to generate a grounded, accurate response.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> A great example is a Q&amp;A chatbot for an enterprise. The chatbot can&#8217;t answer questions about internal policies because that information wasn&#8217;t in the LLM&#8217;s training data. With RAG, you can use an <strong>embedding model<\/strong> to convert your company&#8217;s documents into numerical representations (vectors) and store them in a <strong>vector database<\/strong>. When a user asks a question, the application finds the most relevant document snippets, which are then used as context for the LLM to formulate an answer.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Function Calling \/ Tooling<\/strong><\/h3>\n\n\n\n<p><strong>Description:<\/strong> This pattern allows an LLM to dynamically call external APIs or code functions to retrieve real-time data or perform actions. The LLM acts as a reasoning engine, deciding when a tool is needed based on a user&#8217;s request. 
The model doesn&#8217;t execute the code itself; it simply provides a structured response indicating the function to call and the parameters to use.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> Imagine a travel booking chatbot. A user asks, &#8220;What&#8217;s the weather like in Paris?&#8221; The LLM, recognizing that it needs current information, will &#8220;request&#8221; a call to a <code>getWeather<\/code> function, passing &#8220;Paris&#8221; as the city. Your application intercepts this request, calls a weather API, and feeds the live weather data back to the LLM. The LLM then uses this information to formulate a polite, accurate response to the user.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Output Converters<\/strong><\/h3>\n\n\n\n<p><strong>Description:<\/strong> LLMs often return responses as unstructured text. An <strong>Output Converter<\/strong> solves this by instructing the model to return a structured format (like JSON or a list) and then parsing that output into a Java object. Spring AI provides a convenient way to map the raw text to a <code>List<\/code>, <code>Map<\/code>, or a custom POJO.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> A common use case is generating a structured report. You could prompt the LLM to &#8220;Give me the top 5 trending topics from the past week in JSON format with a title and summary for each.&#8221; An output converter would then automatically parse this JSON string into a <code>List<\/code> of <code>Topic<\/code> objects, making it easy to use the data in your application.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Chat Memory<\/strong><\/h3>\n\n\n\n<p><strong>Description:<\/strong> By default, LLMs are stateless; they treat each new prompt as a completely new conversation. <strong>Chat Memory<\/strong> gives your application the ability to remember previous messages and provide conversational context. 
Spring AI offers different implementations, from simple in-memory storage to persistent repositories like JDBC.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> Chat memory is crucial for creating natural, multi-turn conversations. Without it, if a user asks, &#8220;What&#8217;s my name?&#8221; after telling the chatbot &#8220;Hello, my name is Alex,&#8221; the chatbot won&#8217;t know the answer. With chat memory, the previous message is included in the new prompt, allowing the LLM to recall the user&#8217;s name and provide a relevant, personalized response.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Evaluators<\/strong><\/h3>\n\n\n\n<p><strong>Description:<\/strong> An <strong>Evaluator<\/strong> is a tool used to automatically assess the quality of an LLM&#8217;s response. This is a critical pattern for building reliable and safe AI applications. Spring AI provides built-in evaluators that can check for things like relevance to the prompt or factual accuracy against a given context.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> For a RAG-based Q&amp;A system, you can use a <code>RelevancyEvaluator<\/code> to automatically score how well the LLM&#8217;s answer aligns with the user&#8217;s question. This allows you to set a quality threshold and, if a response falls below it, either discard it or flag it for human review, ensuring your application provides high-quality information.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">3. Technology Integrations<\/h2>\n\n\n\n<p>One of the greatest strengths of Spring AI is its modularity and extensive support for a wide range of AI technologies. 
This allows you to easily switch providers with minimal code changes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Large Language Models (LLMs)<\/strong><\/h3>\n\n\n\n<p>Spring AI provides starters for all major LLM providers, including:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>OpenAI:<\/strong> The most popular choice, providing access to models like GPT-4.<\/li>\n\n\n\n<li><strong>Google Gemini:<\/strong> Integrates with Google&#8217;s powerful family of models.<\/li>\n\n\n\n<li><strong>Hugging Face:<\/strong> Connects to a vast ecosystem of open-source models.<\/li>\n\n\n\n<li><strong>Ollama:<\/strong> Allows you to use a local, self-hosted LLM.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Vector Databases<\/strong><\/h3>\n\n\n\n<p>Vector databases are essential for implementing the RAG pattern. Spring AI supports a number of popular solutions, providing a consistent <code>VectorStore<\/code> API for each:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Pinecone<\/strong><\/li>\n\n\n\n<li><strong>Chroma<\/strong><\/li>\n\n\n\n<li><strong>Milvus<\/strong><\/li>\n\n\n\n<li><strong>PostgreSQL with the <code>pgvector<\/code> extension<\/strong><\/li>\n\n\n\n<li><strong>Elasticsearch<\/strong><\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Embedding Models<\/strong><\/h3>\n\n\n\n<p>Embedding models are responsible for converting text into numerical vectors. Spring AI offers integrations for popular providers, including:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>OpenAI<\/strong><\/li>\n\n\n\n<li><strong>Google<\/strong><\/li>\n\n\n\n<li><strong>Mistral AI<\/strong><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">4. Model Configuration<\/h2>\n\n\n\n<p>Configuring Spring AI is a straightforward process thanks to Spring Boot&#8217;s property-based configuration. 
You can manage your API keys, model names, and other options in the <code>application.properties<\/code> or <code>application.yml<\/code> file.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">OpenAI<\/h3>\n\n\n\n<p>To connect to OpenAI, you must provide your API key. You can also specify the model and other options like <code>temperature<\/code> for creativity.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>spring.ai.openai.api-key=YOUR_API_KEY\nspring.ai.openai.chat.options.model=gpt-4o-mini\nspring.ai.openai.chat.options.temperature=0.7\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Google Gemini<\/h3>\n\n\n\n<p>For Google Gemini, you configure the project ID and location, which are used to authenticate with Google Cloud&#8217;s Vertex AI.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>spring.ai.vertex.ai.gemini.project-id=YOUR_PROJECT_ID\nspring.ai.vertex.ai.gemini.location=us-central1\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Ollama<\/h3>\n\n\n\n<p>Since Ollama runs locally, it doesn&#8217;t require an API key. You just need to specify the model you want to use.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>spring.ai.ollama.chat.options.model=llama3\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Hugging Face<\/h3>\n\n\n\n<p>For Hugging Face, you provide an API key and the URL for the specific inference endpoint you want to use.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>spring.ai.huggingface.chat.api-key=YOUR_API_KEY\nspring.ai.huggingface.chat.url=YOUR_INFERENCE_ENDPOINT_URL\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">5. Synchronous vs. Streaming API<\/h2>\n\n\n\n<p>The <code>ChatClient<\/code> in Spring AI provides two primary ways to interact with an LLM: a <strong>synchronous<\/strong> <code>call()<\/code> method and a <strong>reactive<\/strong> <code>stream()<\/code> method. 
Choosing between them depends on your application&#8217;s requirements for responsiveness and user experience.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Synchronous <code>call()<\/code><\/h3>\n\n\n\n<p>The <code>call()<\/code> method is a blocking operation. Your application sends a request to the LLM and waits for the entire response to be generated before it can proceed.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> This approach is suitable for single-turn requests where the response is expected to be relatively short, such as a summary, a classification, or a joke. It&#8217;s simple to implement and doesn&#8217;t require a reactive programming model.<\/p>\n\n\n\n<p><strong>Example Code:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import org.springframework.ai.chat.client.ChatClient;\nimport org.springframework.web.bind.annotation.PostMapping;\nimport org.springframework.web.bind.annotation.RequestParam;\nimport org.springframework.web.bind.annotation.RestController;\n\n@RestController\npublic class ChatController {\n\n    private final ChatClient chatClient;\n\n    public ChatController(ChatClient.Builder builder) {\n        this.chatClient = builder.build();\n    }\n\n    @PostMapping(\"\/chat\/call\")\n    public String chatWithCall(@RequestParam String message) {\n        \/\/ The call() method blocks until the full response is received.\n        return chatClient.prompt().user(message).call().content();\n    }\n}\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Streaming <code>stream()<\/code><\/h3>\n\n\n\n<p>The <code>stream()<\/code> method provides a non-blocking, reactive approach. The LLM&#8217;s response is sent back as a continuous stream of tokens, and your application can process these tokens as they arrive. 
This is handled using Spring&#8217;s reactive framework, <strong>Project Reactor<\/strong>, which returns a <code>Flux<\/code>.<\/p>\n\n\n\n<p><strong>Use Case:<\/strong> This is ideal for building real-time, interactive applications like chatbots or content generators where you want to provide a &#8220;typewriter&#8221; effect to the user, showing the response as it&#8217;s being generated. It significantly improves the perceived responsiveness of your application for longer responses.<\/p>\n\n\n\n<p><strong>Example Code:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import org.springframework.ai.chat.client.ChatClient;\nimport org.springframework.web.bind.annotation.PostMapping;\nimport org.springframework.web.bind.annotation.RequestParam;\nimport org.springframework.web.bind.annotation.RestController;\nimport reactor.core.publisher.Flux;\n\n@RestController\npublic class ChatController {\n\n    private final ChatClient chatClient;\n\n    public ChatController(ChatClient.Builder builder) {\n        this.chatClient = builder.build();\n    }\n\n    @PostMapping(value = \"\/chat\/stream\", produces = \"text\/event-stream\")\n    public Flux&lt;String&gt; chatWithStream(@RequestParam String message) {\n        \/\/ The stream() method returns a Flux, emitting tokens as they are generated.\n        return chatClient.prompt().user(message).stream().content();\n    }\n}\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">6. AI Model Evaluation and Testing<\/h2>\n\n\n\n<p>Building a reliable AI application requires more than just integrating with a model; it requires a strategy for validating its outputs. 
<strong>AI Model Evaluation and Testing<\/strong> is a critical part of the development lifecycle, especially for preventing issues like <strong>hallucinations<\/strong> (where the model generates false information) or irrelevant responses.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The Role of Evaluators<\/h3>\n\n\n\n<p>Spring AI provides a core <code>Evaluator<\/code> interface and several built-in implementations to help you test and validate your AI-generated content. These evaluators use a separate AI model to act as a judge, assessing the quality of your primary model&#8217;s output. This is a common and effective approach because an LLM can be an excellent tool for judging the output of another.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key Evaluators in Spring AI<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong><code>RelevancyEvaluator<\/code><\/strong>: This evaluator checks how well an AI-generated response aligns with the original user prompt. It assesses the semantic similarity to ensure the answer is on-topic and helpful.<\/li>\n\n\n\n<li><strong><code>FactCheckingEvaluator<\/code><\/strong>: This evaluator is designed to combat hallucinations. It compares a specific claim made by the AI against a provided context (e.g., a document from a RAG pipeline) to verify factual accuracy.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Testing with Evaluators<\/h3>\n\n\n\n<p>You can integrate these evaluators directly into your JUnit tests to create a robust CI\/CD pipeline for your AI features. 
For example, you can write a test that sends a prompt to your application, receives the response, and then uses a <code>RelevancyEvaluator<\/code> to assert that the response is relevant to the original prompt.<\/p>\n\n\n\n<p><strong>Example Test Snippet:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import org.junit.jupiter.api.Test;\nimport org.springframework.ai.chat.client.ChatClient;\nimport org.springframework.ai.evaluation.EvaluationRequest;\nimport org.springframework.ai.evaluation.EvaluationResponse;\nimport org.springframework.ai.evaluation.RelevancyEvaluator;\nimport org.springframework.beans.factory.annotation.Autowired;\nimport org.springframework.boot.test.context.SpringBootTest;\n\nimport java.util.List;\n\nimport static org.assertj.core.api.Assertions.assertThat;\n\n@SpringBootTest\npublic class ChatControllerTests {\n\n    @Autowired\n    private ChatController chatController;\n\n    @Autowired\n    private ChatClient.Builder chatClientBuilder;\n\n    @Test\n    void testChatResponseRelevance() {\n        String prompt = \"What are the key features of Spring AI?\";\n        String response = chatController.chatWithCall(prompt);\n\n        \/\/ A second model acts as the judge, scoring the response against the prompt\n        RelevancyEvaluator evaluator = new RelevancyEvaluator(chatClientBuilder);\n        EvaluationResponse evaluation = evaluator.evaluate(\n            new EvaluationRequest(prompt, List.of(), response)\n        );\n\n        \/\/ Assert that the response was judged relevant to the prompt\n        assertThat(evaluation.isPass()).isTrue();\n    }\n}\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">7. Implementing Tool Calling with Custom Functions<\/h2>\n\n\n\n<p>Tool Calling is one of the most powerful features of Spring AI, allowing you to seamlessly connect an LLM to your own business logic. Spring AI makes this process incredibly simple by using standard annotations on Plain Old Java Objects (POJOs).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Create a Tool Service<\/h3>\n\n\n\n<p>First, create a simple Spring component that contains the methods you want the LLM to be able to call. 
You can annotate any plain method with <code>@Tool<\/code>; its parameters and return type define the tool&#8217;s input and output.<\/p>\n\n\n\n<p><strong>Example Code:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import org.springframework.ai.tool.annotation.Tool;\nimport org.springframework.stereotype.Component;\n\n@Component\npublic class WeatherService {\n\n    \/**\n     * Get the current weather for a given city.\n     * @param city the city to look up\n     * @return a response containing the weather data\n     *\/\n    @Tool(description = \"Get the current weather for a given city\")\n    public WeatherResponse getWeather(String city) {\n        \/\/ In a real application, you would call a weather API here.\n        \/\/ For this example, we'll return mock data.\n        System.out.println(\"Calling the weather service for city: \" + city);\n        return new WeatherResponse(city, 25.0, \"Sunny\");\n    }\n\n    public record WeatherResponse(String city, double temperature, String conditions) {}\n}\n<\/code><\/pre>\n\n\n\n<p>The key is the <code>@Tool<\/code> annotation. It tells Spring AI to expose this method to the LLM, and the <code>description<\/code> is crucial because the LLM uses it to understand when to call the tool.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Register the Tool with the <code>ChatClient<\/code><\/h3>\n\n\n\n<p>Next, you need to tell your <code>ChatClient<\/code> about the tools it has access to. 
You can do this by passing the annotated bean to the <code>defaultTools()<\/code> method on the <code>ChatClient.Builder<\/code>, or to <code>tools()<\/code> on an individual prompt.<\/p>\n\n\n\n<p><strong>Example Code:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import org.springframework.ai.chat.client.ChatClient;\nimport org.springframework.web.bind.annotation.PostMapping;\nimport org.springframework.web.bind.annotation.RequestParam;\nimport org.springframework.web.bind.annotation.RestController;\n\n@RestController\npublic class ChatController {\n\n    private final ChatClient chatClient;\n\n    public ChatController(ChatClient.Builder builder, WeatherService weatherService) {\n        this.chatClient = builder\n                .defaultTools(weatherService)\n                .build();\n    }\n\n    @PostMapping(\"\/chat\/tool-calling\")\n    public String chatWithToolCalling(@RequestParam String message) {\n        \/\/ When the user asks about the weather, the LLM can call our getWeather tool\n        return chatClient.prompt().user(message).call().content();\n    }\n}\n<\/code><\/pre>\n\n\n\n<p>When a user&#8217;s prompt (e.g., &#8220;What is the weather like in New York?&#8221;) matches the description of the <code>getWeather<\/code> tool, the LLM will request the tool call; Spring AI invokes the method and returns the result to the model, which uses it to formulate a response. Spring AI handles the entire orchestration\u2014from the LLM&#8217;s request to the method&#8217;s execution and the final response generation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">8. 
RAG Implementation: A Deeper Dive<\/h2>\n\n\n\n<p>While RAG is a powerful concept, its implementation involves a detailed, multi-step pipeline. Spring AI provides the necessary abstractions to manage each step seamlessly. The process is broken down into two main phases: <strong>Data Ingestion<\/strong> and <strong>Query &amp; Retrieval<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data Ingestion Pipeline<\/h3>\n\n\n\n<p>The first step is to get your unstructured data (documents, PDFs, etc.) into a format that a vector database can understand. This process is a classic ETL (Extract, Transform, Load) pipeline.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Extract:<\/strong> A <code>DocumentReader<\/code> extracts content from a data source. Spring AI includes readers for common formats like Markdown, PDFs, and web pages.<\/li>\n\n\n\n<li><strong>Transform:<\/strong> A <code>TextSplitter<\/code> breaks down large documents into smaller, semantically meaningful chunks. This is crucial because LLMs have a limited context window.<\/li>\n\n\n\n<li><strong>Embed:<\/strong> An <code>EmbeddingModel<\/code> converts these text chunks into numerical vectors. 
This process captures the semantic meaning of the text, allowing for a similarity search later.<\/li>\n\n\n\n<li><strong>Load:<\/strong> The <code>VectorStore<\/code> then stores these vectors, ready for retrieval.<\/li>\n<\/ol>\n\n\n\n<p><strong>Example Ingestion Code:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import org.springframework.ai.document.Document;\nimport org.springframework.ai.embedding.EmbeddingModel;\nimport org.springframework.ai.reader.pdf.PagePdfDocumentReader;\nimport org.springframework.ai.transformer.splitter.TokenTextSplitter;\nimport org.springframework.ai.vectorstore.SimpleVectorStore;\nimport org.springframework.ai.vectorstore.VectorStore;\nimport org.springframework.beans.factory.annotation.Value;\nimport org.springframework.context.annotation.Bean;\nimport org.springframework.context.annotation.Configuration;\nimport org.springframework.core.io.Resource;\nimport java.util.List;\n\n@Configuration\npublic class VectorStoreConfig {\n\n    @Bean\n    public VectorStore vectorStore(EmbeddingModel embeddingModel,\n                                   @Value(\"classpath:\/docs\/my-policy-manual.pdf\") Resource pdfResource) {\n\n        \/\/ Use a simple in-memory vector store for this example\n        SimpleVectorStore vectorStore = SimpleVectorStore.builder(embeddingModel).build();\n\n        \/\/ Extract text from the PDF\n        PagePdfDocumentReader pdfReader = new PagePdfDocumentReader(pdfResource);\n\n        \/\/ Split the documents into manageable chunks\n        TokenTextSplitter textSplitter = new TokenTextSplitter();\n        List&lt;Document&gt; documents = textSplitter.split(pdfReader.read());\n\n        \/\/ Add the documents to the vector store\n        vectorStore.add(documents);\n\n        return vectorStore;\n    }\n}\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Query &amp; Retrieval Pipeline<\/h3>\n\n\n\n<p>Once your data is in the vector store, you can use it to answer user questions.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Retrieve:<\/strong> When a user 
submits a query, the application uses an <code>Advisor<\/code> to first perform a similarity search on the <code>VectorStore<\/code> to find the most relevant documents.<\/li>\n\n\n\n<li><strong>Augment:<\/strong> The retrieved documents are then &#8220;stuffed&#8221; into the user&#8217;s prompt, providing the LLM with the specific context it needs to generate a grounded response.<\/li>\n\n\n\n<li><strong>Generate:<\/strong> The augmented prompt is sent to the LLM, which uses the provided context to answer the user&#8217;s question.<\/li>\n<\/ol>\n\n\n\n<p><strong>Example Retrieval Code:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import org.springframework.ai.chat.client.ChatClient;\nimport org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;\nimport org.springframework.ai.vectorstore.VectorStore;\nimport org.springframework.web.bind.annotation.PostMapping;\nimport org.springframework.web.bind.annotation.RequestParam;\nimport org.springframework.web.bind.annotation.RestController;\n\n@RestController\npublic class RAGController {\n\n    private final ChatClient chatClient;\n\n    public RAGController(ChatClient.Builder builder, VectorStore vectorStore) {\n        this.chatClient = builder\n                \/\/ The advisor automatically retrieves documents and adds them to the prompt\n                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))\n                .build();\n    }\n\n    @PostMapping(\"\/chat\/rag\")\n    public String chatWithRAG(@RequestParam String message) {\n        \/\/ The user's message is passed, and the advisor handles the RAG process\n        return chatClient.prompt()\n                .user(message)\n                .call()\n                .content();\n    }\n}\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>In today&#8217;s fast-paced world of software development, integrating artificial intelligence into applications is no longer just a trend\u2014it&#8217;s a necessity. 
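You add the Spring AI BOM under Maven's <code>dependencyManagement</code> section, then declare the starters you need without version tags. The original Maven listing is not preserved in this copy of the post, so the snippet below is a sketch that simply mirrors the Gradle configuration above (same artifact coordinates and version):

```xml
<!-- Spring AI BOM for consistent versions (mirrors the Gradle block above) -->
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>0.8.1</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <!-- Starter for OpenAI (or other LLMs); version is managed by the BOM -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>
</dependencies>
```

With the BOM imported, individual Spring AI dependencies can omit their <code>&lt;version&gt;</code> elements. Note that pre-GA releases such as 0.8.1 were published to the Spring milestone repository, so you may also need to declare that repository in your build file.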