MAX-MIN SEMANTIC CHUNKING FOR TECHNICAL DOCUMENTATION RETRIEVAL: A CASE STUDY WITH HAZELCAST DOCUMENTATION
DOI: https://doi.org/10.35546/kntu2078-4481.2026.2.49

Keywords: Max-Min semantic chunking, retrieval-augmented generation, technical documentation, Hazelcast, vector embeddings

Abstract
Effective chunking of source documents is a critical factor in determining retrieval quality in retrieval-augmented generation (RAG) systems. Traditional fixed-size and sentence-based chunking strategies process documents without semantic awareness, frequently breaking cohesive information units at arbitrary boundaries. Max-Min Semantic Chunking uses an embedding-first approach: all sentences are embedded before segmentation, and chunk boundaries are determined by comparing the similarity between a candidate sentence and the existing chunk against a predefined minimum similarity threshold. This paper presents a theoretical case study analysing the suitability of Max-Min Semantic Chunking for large-scale technical documentation, using the Hazelcast documentation as the target corpus. The analysis identifies and characterises seven content types in the Hazelcast documentation: narrative text, API reference entries, code blocks, configuration tables, step-by-step instructions, admonition blocks, and tabbed content panels. For each content type, the expected behaviour of Max-Min chunking is evaluated relative to fixed-size and sentence-based baselines across four dimensions: intra-chunk semantic coherence, retrieval precision, chunk size distribution and variance, and boundary detection quality at content-type transitions. A brief illustrative walkthrough demonstrates the algorithm's operation on a representative Hazelcast page. The analysis reveals a nuanced picture. Max-Min is predicted to substantially outperform baselines on narrative, step-by-step, and inline admonition content, where embedding-space similarity reliably tracks logical structure. However, it faces structural limitations on code blocks, configuration tables, standalone admonitions, and – most severely – tabbed multi-language content panels, where near-identical embeddings across panels prevent boundary detection. Four concrete adaptation strategies are proposed: content-type-aware threshold adjustment, hybrid pre-segmentation of structured elements with per-panel tab extraction, admonition-aware boundary handling, and context prefix injection.
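The embedding-first procedure described in the abstract can be sketched as follows. This is a minimal illustrative reading of Max-Min chunking, not the authors' reference implementation: it assumes the candidate sentence is compared to the existing chunk by its maximum similarity to any sentence already in the chunk, and the parameter name `min_sim` and the toy two-dimensional "embeddings" are hypothetical stand-ins for a real sentence-embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def max_min_chunk(sentences, embed, min_sim=0.5):
    """Embedding-first chunking sketch: embed every sentence up front,
    then start a new chunk whenever the candidate sentence's maximum
    similarity to the current chunk falls below the minimum threshold.
    `embed` maps a sentence to a vector; `min_sim` is a hypothetical
    parameter name for the predefined minimum similarity threshold."""
    vectors = [embed(s) for s in sentences]  # all sentences embedded before segmentation
    chunks, current = [], [0]                # `current` holds indices of the open chunk
    for i in range(1, len(sentences)):
        # max similarity between the candidate and the existing chunk
        best = max(cosine(vectors[i], vectors[j]) for j in current)
        if best >= min_sim:
            current.append(i)                # cohesive: extend the chunk
        else:
            chunks.append([sentences[j] for j in current])
            current = [i]                    # boundary: start a new chunk
    chunks.append([sentences[j] for j in current])
    return chunks

# Toy deterministic "embeddings" for illustration only.
toy = {
    "Hazelcast is a distributed data grid.": [1.0, 0.0],
    "It stores entries in partitioned maps.": [0.9, 0.1],
    "Bake the cake at 180 degrees.": [0.0, 1.0],
}
chunks = max_min_chunk(list(toy), toy.get, min_sim=0.5)
# The topically unrelated third sentence is split into its own chunk.
```

With the toy vectors above, the two Hazelcast sentences are kept together (cosine similarity ≈ 0.99) while the unrelated sentence falls below the threshold and opens a new chunk, illustrating how boundary detection depends entirely on embedding-space similarity.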
References
Lewis, P., Perez, E., Piktus, A., et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS '20). Curran Associates Inc., Red Hook, NY, USA, Article 793, 9459–9474. https://dl.acm.org/doi/abs/10.5555/3495724.3496517
Gao, Y., Xiong, Y., Gao, X., et al. (2023). Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997. https://doi.org/10.48550/arXiv.2312.10997
Qu, R., Tu, R., & Bao, F. S. (2025). Is Semantic Chunking Worth the Computational Cost? Findings of the Association for Computational Linguistics: NAACL 2025, 2155–2177. https://doi.org/10.18653/v1/2025.findings-naacl.114
Hazelcast. (2026). Hazelcast Documentation. Retrieved from https://docs.hazelcast.com
Kiss, C., Nagy, M., & Szilágyi, P. (2025). Max-Min semantic chunking of documents for RAG application. Discover Computing, 28, 117. https://doi.org/10.1007/s10791-025-09638-7
Shi, W., Min, S., Yasunaga, M., Seo, M., James, R., Lewis, M., Zettlemoyer, L., & Yih, W. (2023). REPLUG: Retrieval-Augmented Black-Box Language Models. North American Chapter of the Association for Computational Linguistics. https://doi.org/10.48550/arXiv.2301.12652
Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 3982–3992. https://doi.org/10.18653/v1/D19-1410
Milvus. (2025). Embedding First, Then Chunking: Smarter RAG Retrieval with Max-Min Semantic Chunking. Retrieved from https://milvus.io/blog/embedding-first-chunking-second-smarter-rag-retrieval-with-max-min-semanticchunking.md