Search systems are rapidly moving beyond traditional keyword matching. Instead of relying only on exact text matches, applications now use AI-powered semantic search to understand the meaning of queries.
In this tutorial, we will build a complete AI-powered search engine using Spring Boot and Elasticsearch. This system will automatically generate embeddings using an ML model, store them as vectors, and perform high-quality semantic search using kNN vector similarity.
If you are new to Elasticsearch semantic search, you may first want to read our foundational guide: Setting up ElasticSearch for Semantic Search
If you want to learn the basics of vector search in Spring Boot, check this guide: Spring Boot Elasticsearch Vector Search
What We Will Build
By the end of this tutorial you will have a working AI search engine that:
- Generates text embeddings using an Elasticsearch ML model
- Indexes documents with vector embeddings
- Performs semantic search using kNN vector similarity
- Exposes a REST API through Spring Boot
This architecture is similar to what modern AI search systems use.
System Architecture
A clean AI search architecture separates responsibilities across multiple layers.
User Query
|
Spring Boot REST API
|
Search Service
|
Embedding Service
|
Elasticsearch ML Model
|
Vector Search (kNN)
|
Relevant Documents
Project Structure
Elasticsearch Client Configuration
To run the app locally, the client below bypasses SSL certificate and hostname validation. Do not use this configuration in production.
@Bean
public ElasticsearchClient elasticsearchClient() throws Exception {
    // 1. Create SSLContext that trusts all certificates (bypass validation)
    SSLContext sslContext = SSLContextBuilder.create()
            .loadTrustMaterial(null, (chain, authType) -> true) // trust all
            .build();
    // 2. Basic authentication
    BasicCredentialsProvider credentialsProvider = new BasicCredentialsProvider();
    credentialsProvider.setCredentials(
            AuthScope.ANY,
            new UsernamePasswordCredentials(
                    elasticProperties.getUsername(),
                    elasticProperties.getPassword()
            )
    );
    // 3. Build RestClient
    RestClient restClient = RestClient.builder(HttpHost.create(elasticProperties.getUrl()))
            .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder
                    .setSSLContext(sslContext)
                    .setSSLHostnameVerifier(NoopHostnameVerifier.INSTANCE) // bypass hostname check
                    .setDefaultCredentialsProvider(credentialsProvider)
                    .setMaxConnTotal(100)
                    .setMaxConnPerRoute(20)
            )
            .setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
                    .setConnectTimeout(5_000)
                    .setSocketTimeout(60_000)
            )
            .build();
    // 4. Create Elasticsearch client
    ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
    return new ElasticsearchClient(transport);
}
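Outside local development, the trust-all context can be swapped for one that trusts only the cluster's CA certificate. The following is a minimal sketch using only JDK classes. The class name CaTrustSketch, the alias "elasticsearch-ca", and the certificate file passed in are illustrative assumptions, not part of the original project.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper: builds an SSLContext that trusts a single CA certificate.
final class CaTrustSketch {

    // Creates an empty in-memory key store to hold the trusted certificate.
    private static KeyStore emptyTrustStore() throws Exception {
        KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
        keyStore.load(null, null);
        return keyStore;
    }

    // Reads one X.509 certificate (PEM or DER) and stores it as a trusted entry.
    static KeyStore trustStoreFor(Path caCertFile) {
        try (InputStream in = Files.newInputStream(caCertFile)) {
            Certificate ca = CertificateFactory.getInstance("X.509").generateCertificate(in);
            KeyStore trustStore = emptyTrustStore();
            trustStore.setCertificateEntry("elasticsearch-ca", ca); // alias is arbitrary
            return trustStore;
        } catch (Exception e) {
            throw new IllegalStateException("Could not load CA certificate", e);
        }
    }

    // Builds a TLS context from the trust store.
    // Passing null makes the JDK fall back to its default trust store.
    static SSLContext sslContextFor(KeyStore trustStore) {
        try {
            TrustManagerFactory factory =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            factory.init(trustStore);
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, factory.getTrustManagers(), null);
            return context;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build SSLContext", e);
        }
    }
}
```

A context built this way could be passed to setSSLContext(...) in place of the trust-all one, with the NoopHostnameVerifier line removed so that hostname checks stay enabled. Elasticsearch 8 typically writes its HTTP CA certificate to config/certs/http_ca.crt, but check your own installation.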
Embedding Service (Generate AI Vectors)
This service calls the Elasticsearch ML model to generate embeddings.
@Service
@RequiredArgsConstructor
public class EmbeddingService {
    private final ElasticsearchClient elasticsearchClient;
    private final ElasticProperties elasticProperties;
    // POST /_ml/trained_models/multilingual-e5-small/_infer
    public List<Float> inferEmbedding(String text) {
        List<Map<String, JsonData>> docs = new ArrayList<>();
        docs.add(Map.of("text_field", JsonData.of(text)));
        try {
            InferTrainedModelRequest request = new InferTrainedModelRequest.Builder()
                    .modelId(elasticProperties.getModelId())
                    .docs(docs)
                    .build();
            InferTrainedModelResponse response = elasticsearchClient.ml().inferTrainedModel(request);
            List<List<FieldValue>> predictions = response.inferenceResults().get(0).predictedValue();
            return predictions.stream()
                    .flatMap(fieldValues -> fieldValues.stream().map(FieldValue::doubleValue))
                    .map(Double::floatValue)
                    .toList();
        } catch (Exception e) {
            throw new RuntimeException("Embedding generation failed", e);
        }
    }
}
Index Documents with Embeddings
Documents are indexed using an ingest pipeline that automatically generates embeddings.
@Service
@RequiredArgsConstructor
public class DocumentService {
    private final ElasticsearchClient elasticsearchClient;
    private final ElasticProperties elasticProperties;
    public void indexDocument(String content) {
        try {
            Map<String, Object> doc = Map.of("content", content);
            IndexRequest<Map<String, Object>> request =
                    new IndexRequest.Builder<Map<String, Object>>()
                            .index(elasticProperties.getIndex())
                            .pipeline("semantic-pipeline")
                            .document(doc)
                            .build();
            elasticsearchClient.index(request);
        } catch (Exception e) {
            throw new RuntimeException("Document indexing failed", e);
        }
    }
}
We use the ingest pipeline that we created through Postman in the previous article. The equivalent curl request is:
curl --location --request PUT 'https://localhost:9200/_ingest/pipeline/semantic-pipeline' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic <base64 of username:password>' \
--data '{
  "processors": [
    {
      "inference": {
        "model_id": "multilingual-e5-small",
        "field_map": {
          "content": "text_field"
        },
        "target_field": "ml_output"
      }
    },
    {
      "script": {
        "source": "ctx.content_vector = ctx.ml_output.predicted_value; ctx.remove('\''ml_output'\'');"
      }
    }
  ]
}'
The inference processor takes the field named "content", feeds it to the ML model as "text_field", and writes the generated embedding to "ml_output". For example, given this input document:
{
  "content": "Traditional keyword search relies on exact term matching."
}
This line is the key:
ctx.content_vector = ctx.ml_output.predicted_value;
This creates a new field: content_vector. Hence, the final stored document becomes:
{
  "content": "Traditional keyword search relies on exact term matching.",
  "content_vector": [0.12, -0.88, 0.34, ...]
}
Why Pipeline-Based Embeddings Are Better
Generating embeddings in an Elasticsearch ingest pipeline, rather than in the Spring Boot services, has several advantages:
- Keeps Spring Boot lightweight
- Centralizes AI logic inside Elasticsearch
- Easier model upgrades
- Works with bulk indexing
- Same pipeline works for any indexing source
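To illustrate the bulk indexing point: the Elasticsearch _bulk endpoint accepts a pipeline query parameter, so a request such as POST /_bulk?pipeline=semantic-pipeline runs every document through the same inference step. The sketch below only builds the newline-delimited JSON body with plain JDK code. The class name BulkBodySketch and the index name "articles" are assumptions made for illustration.

```java
import java.util.List;

// Hypothetical helper: builds an NDJSON body for the Elasticsearch _bulk endpoint.
final class BulkBodySketch {

    // Minimal JSON string escaping (quotes, backslashes, control characters).
    static String jsonString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.append('"').toString();
    }

    // Emits one action line and one source line per document.
    // The body must end with a newline, which the loop guarantees.
    static String bulkBody(String index, List<String> contents) {
        StringBuilder body = new StringBuilder();
        for (String content : contents) {
            body.append("{\"index\":{\"_index\":").append(jsonString(index)).append("}}\n");
            body.append("{\"content\":").append(jsonString(content)).append("}\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.print(bulkBody("articles", List.of(
                "Traditional keyword search relies on exact term matching.",
                "Semantic search compares meaning using vectors.")));
    }
}
```

Only the "content" field is sent; the pipeline adds content_vector on the Elasticsearch side. In the Spring Boot services the same effect would more likely be achieved through the Java client's bulk request rather than a hand-built body.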
Perform Vector Search
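Next we perform kNN search using the generated query embedding. Here k is the number of nearest neighbors to return, and numCandidates is the number of candidates examined on each shard before the top k are selected. Because k is 3, at most three hits come back even though size is 10.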
public List<SearchResult> knnSearch(List<Float> queryVector) {
    try {
        SearchRequest request = new SearchRequest.Builder()
                .index(elasticProperties.getIndex())
                .knn(knn -> knn
                        .field(elasticProperties.getField())
                        .queryVector(queryVector)
                        .k(3)
                        .numCandidates(100)
                )
                .size(10)
                .build();
        SearchResponse<SearchResult> response =
                elasticsearchClient.search(request, SearchResult.class);
        return response.hits().hits()
                .stream()
                .map(hit -> {
                    SearchResult result = hit.source();
                    result.setScore(hit.score());
                    result.setIndex(hit.index());
                    return result;
                })
                .toList();
    } catch (Exception e) {
        throw new RuntimeException("Vector search failed", e);
    }
}
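To make concrete what a kNN query computes, here is an exact, brute-force version in plain Java. It is an illustration only: Elasticsearch uses an approximate index rather than scanning every vector, the similarity function depends on the field mapping (cosine is assumed here), and the _score it reports for cosine similarity is a rescaled value rather than the raw cosine. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of exact k-nearest-neighbor search with cosine similarity.
final class BruteForceKnn {

    // Cosine similarity of two vectors; assumes equal length and non-zero norms.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the k vectors most similar to the query, best match first.
    static List<String> topK(Map<String, float[]> vectors, float[] query, int k) {
        return vectors.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, float[]> e) -> cosine(e.getValue(), query)).reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        Map<String, float[]> docs = Map.of(
                "doc-a", new float[] {1f, 0f},
                "doc-b", new float[] {0f, 1f},
                "doc-c", new float[] {0.9f, 0.1f});
        System.out.println(topK(docs, new float[] {1f, 0f}, 2));
    }
}
```

The scan is linear in the number of documents, which is why a vector index with a candidate budget such as numCandidates is used for larger collections.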
Search API
Expose the search functionality via a REST controller.
@RestController
@RequestMapping("/api/search")
public class SearchController {
    @Autowired
    private SearchService searchService;
    @GetMapping
    public List<SearchResult> search(@RequestParam String query) {
        return searchService.semanticSearch(query);
    }
}
Example request:
GET /api/search?query=What is vector embedding
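When the request is sent programmatically, the spaces in the query need to be URL-encoded. A minimal sketch with the JDK's HTTP classes follows. The class name SearchRequestSketch and the base URL http://localhost:8080 (Spring Boot's default port) are assumptions, so adjust them to your setup.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper for calling the search endpoint.
final class SearchRequestSketch {

    // Encodes the query text and appends it as the "query" parameter.
    // URLEncoder uses form encoding, so spaces become '+'.
    static URI searchUri(String baseUrl, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/search?query=" + encoded);
    }

    // Builds (but does not send) a GET request for the search endpoint.
    static HttpRequest searchRequest(String baseUrl, String query) {
        return HttpRequest.newBuilder(searchUri(baseUrl, query)).GET().build();
    }

    public static void main(String[] args) {
        System.out.println(searchUri("http://localhost:8080", "What is vector embedding"));
    }
}
```

The request could then be sent with java.net.http.HttpClient once the application is running.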
How the Complete AI Search Flow Works
- User sends a query to Spring Boot
- The query is converted into an embedding
- Elasticsearch performs kNN vector search
- Relevant documents are returned
This allows your application to find results based on meaning rather than keywords.
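The controller calls searchService.semanticSearch(query), which is not shown in this article. The flow above suggests it simply chains the embedding call and the kNN search. Here is a sketch of that composition, with plain functions standing in for the Spring beans so it runs without a cluster. The class name SemanticSearchFlow and the blank-query handling are assumptions.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the search flow: query text -> embedding -> kNN hits.
final class SemanticSearchFlow<R> {

    private final Function<String, List<Float>> embedder;       // e.g. EmbeddingService::inferEmbedding
    private final Function<List<Float>, List<R>> vectorSearch;  // e.g. the knnSearch method

    SemanticSearchFlow(Function<String, List<Float>> embedder,
                       Function<List<Float>, List<R>> vectorSearch) {
        this.embedder = embedder;
        this.vectorSearch = vectorSearch;
    }

    // Converts the query into a vector, then looks up the nearest documents.
    // Blank queries return no results instead of calling the model (an assumption).
    List<R> semanticSearch(String query) {
        if (query == null || query.isBlank()) {
            return List.of();
        }
        List<Float> queryVector = embedder.apply(query.strip());
        return vectorSearch.apply(queryVector);
    }

    public static void main(String[] args) {
        // Stand-in functions: a fake one-dimensional "embedding" and a fake search.
        SemanticSearchFlow<String> flow = new SemanticSearchFlow<String>(
                text -> List.of((float) text.length()),
                vector -> List.of("nearest document for vector " + vector));
        System.out.println(flow.semanticSearch("What is vector embedding"));
    }
}
```

In the application, the two functions would presumably be method references to the embedding service and to whichever bean holds knnSearch.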
What You Can Build With This
This architecture can power many modern AI applications:
- AI knowledge base search
- ChatGPT-style document assistants
- Semantic blog search
- AI-powered product discovery
- Developer documentation search
Conclusion
In this guide we built a complete AI-powered semantic search engine using Spring Boot and Elasticsearch. The system generates embeddings using an ML model, stores vectors in Elasticsearch, and performs high-quality semantic retrieval.
The source code can be found here on GitHub.
This architecture is widely used in modern AI applications including knowledge assistants, document search systems, and intelligent chatbots.