#chunking

Fabrizio Musacchio @FabMusacchio
2026-01-12

🧠 New preprint by Zhong et al. proposes a mechanism for chunking in working memory.

Using short-term synaptic augmentation, their model shows how items can be temporarily suppressed and later retrieved as chunks, increasing effective capacity without increasing simultaneous activity.

🌍 doi.org/10.7554/eLife.109538.1

Fig. 1: Illustration of the hierarchical working memory model.
2025-12-05

Architecture of high-load RAG systems: 10 chunking optimization strategies and integration with Weaviate, Qwen / Llama / Gemma

Hi, Habr! I'm Andrey Nosov, an AI architect at Raft, where I design and deploy high-load RAG systems for enterprises. Today I'll talk about the challenges we overcome every day while building such systems, with a focus on chunking. Let's outline the areas we'll be working in: today we'll discuss only two applications of large language models, MedTech and LegalTech. They are currently the most in-demand on the market when it comes to search systems. This choice of domains reflects the global trend toward working with professional knowledge that both Gartner and OpenAI point to.

habr.com/ru/companies/oleg-bun

#rag #chunking #llm #genai #архитектура #чанкинг #highload #highload++

2025-11-30

Do long-context models really solve 'attention dilution'? Some claim that Gemini's 1M-token context makes RAG and document chunking obsolete. The author is skeptical, though: in their experience, performance drops sharply past 100K-200K tokens. This matters a lot for legal documents, where high accuracy is essential.

#AI #LLM #LongContext #AttentionDilution #RAG #Chunking #Gemini #TechNews #ArtificialIntelligence
#MôHìnhAI #NgữCảnhDài #XửLýNgônNgữ #CôngNghệ #TríTuệNhânTạo #HỏiĐáp

reddit.com/r/LocalLLa

2025-11-15

The rag-chunk tool helps you test chunking strategies for documents: it lets you parse, test, and evaluate different strategies, and calculates a Recall score to measure their effectiveness. #RAG #chunking #CLI #Python #ragchunk #công_cụ #phần_mềm

reddit.com/r/programming/comme
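The Recall score mentioned above is straightforward to compute. A minimal sketch of the metric (this is an illustration, not rag-chunk's actual API):

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant chunks that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = set(retrieved_ids[:k]) & set(relevant_ids)
    return len(hits) / len(set(relevant_ids))

# 2 of the 3 relevant chunks appear in the top 4 retrieved chunks.
score = recall_at_k(["c1", "c7", "c2", "c9"], ["c1", "c2", "c5"], k=4)
```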

2025-10-30

Optimizing your RAG? Don't overlook how you split text (chunking)! Instead of splitting on raw characters, semantic chunking that preserves context, using 500-1000-token chunks with a small overlap, delivers bigger gains than swapping out your model or embeddings. Split by meaning, not by count!

#RAG #AI #NLP #TextSplitting #Chunking #TốiƯuAI

reddit.com/r/LocalLLaMA/commen
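The advice above can be sketched in a few lines. This is a toy approximation that splits on sentence boundaries with whitespace-delimited "tokens" and a one-sentence overlap, not a real semantic chunker:

```python
import re

def semantic_ish_chunks(text, max_tokens=1000, overlap_sentences=1):
    """Greedily pack whole sentences into chunks of roughly max_tokens
    words, carrying a small sentence overlap so context survives the split."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for s in sentences:
        n = len(s.split())  # whitespace words stand in for tokens
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current = current[-overlap_sentences:]  # overlap carries context
            count = sum(len(x.split()) for x in current)
        current.append(s)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A real implementation would use a tokenizer for counting and embedding similarity to decide where one topic ends and the next begins; the packing-with-overlap loop stays the same.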

N-gated Hacker News @ngate
2025-10-26

🤖 Oh joy, yet another model that promises to chunk across languages, because apparently, understanding words needed a chonkier approach. 🙄 proudly presents a delightfully complex name, ideal for confusing your cat and impressing no one at dinner parties. 🌍 Expect world peace and better any day now! 🎉
huggingface.co/mirth/chonky_mm

Hacker News @h4ckernews
2025-10-26
2025-10-25

Have you tried Reducto for document parsing? They combine vision-language models with smart embedding-based chunking, producing LLM-ready chunks while preserving structure (tables, figures). Does this approach hold up in real-world use? TAG: #Reducto #Chunking #LLM #RAG

reddit.com/r/LocalLLaMA/commen

2025-10-08

Chonkie: a revolution in RAG chunking (speed, lightness, convenience)

In an era when large language models (LLMs) are becoming ever more powerful and are being applied to a growing range of tasks, one key problem remains the same: how to efficiently supply them with relevant context. One popular solution is the RAG approach, where the quality of the final answer depends on a whole range of factors, one of which is good chunking of the source texts. Today we'll look at one of the newer and more interesting solutions. Hi everyone! My name is Vadim, and I'm a Data Scientist at Raft. In this article I'll cover Chonkie, a library for simple and fast document chunking, try it out in practice, and compare it with other popular solutions: LangChain and LlamaIndex.

habr.com/ru/companies/raft/art

#rag #chunking #ai #поиск #чанкинг #векторные_базы_данных #библиотека #llm_память

2025-06-04

From problem to solution: an LLM with a RAG configuration and ROC-AUC. A 121-run, 40-hour experiment with the help of AI

My name is Anton, and I currently work on applied projects for the BRICS digital maturity index. I use AI tools to assemble cascades of AI models that surface non-obvious dependencies in various economic and cultural processes, based on data extracted from open sources. For this experiment, I set myself the task of applying AI to a practical problem while using only tools available to everyone and narratives anyone can follow. In short, I decided to try on the role of "Go do something with AI over there, and make it quick!" Here's how it turned out (links to working notebooks, prompts, and screenshots included).

habr.com/ru/companies/mipt_dig

#llm #rag #f1_score #rocauc #google_colab #openrouter #Groq_api #chunking #DeepSeek #perplexity

Hacker News @h4ckernews
2025-04-13
Justin Buzzard @jdbuzzard
2024-09-29

Chunking - Breaking things into smaller pieces, making it easier to digest/read/memorize. Good way to train/learn.

Towards Data Science @towardsdatascience@me.dm
2024-09-19

💡 HOW TO: Use embeddings and visualization tools to split text into meaningful chunks. Robert Martin-Short shows you how.

#Semantic #Text #Chunking

towardsdatascience.com/a-visua

David Wakeham @wakehamAMR
2024-06-07

I can’t believe it took me this long to discover this wonderful app.

For myself and many of my students, this is a godsend.

Yes, it is AI, so it is not perfect (e.g., the estimator function).

A wonderful tool, especially for teachers/educators with little experience in this.

goblin.tools/About

Joe Steinbring @joe@jws.news
2024-05-09

Back in January, we started looking at AI and how to run a large language model (LLM) locally (instead of just using something like ChatGPT or Gemini).  A tool like Ollama is great for building a system that uses AI without dependence on OpenAI.  Today, we will look at creating a Retrieval-augmented generation (RAG) application, using Python, LangChain, Chroma DB, and Ollama. Retrieval-augmented generation is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response.  If you have a source of truth that isn’t in the training data, it is a good way to get the model to know about it.  Let’s get started!

Your RAG will need a model (like llama3 or mistral), an embedding model (like mxbai-embed-large), and a vector database.  The vector database contains relevant documentation to help the model answer specific questions better.  For this demo, our vector database is going to be Chroma DB.  You will need to “chunk” the text you are feeding into the database.  Let’s start there.

Chunking

There are many ways of choosing the right chunk size and overlap, but for this demo I am just going to use a chunk size of 7,500 characters and an overlap of 100 characters, with LangChain's CharacterTextSplitter doing the chunking. The overlap means that the last 100 characters of each chunk are duplicated at the start of the next database record.
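The demo itself uses LangChain's CharacterTextSplitter; here is a plain-Python sketch of the same idea (the helper name is mine, and the defaults mirror the numbers above):

```python
def chunk_text(text, chunk_size=7500, overlap=100):
    """Fixed-size character chunks; each chunk begins with the previous
    chunk's last `overlap` characters, like the demo's splitter."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

With small numbers it is easy to see the duplication: chunking "abcdefghij" with a size of 4 and an overlap of 1 gives "abcd", "defg", "ghij", where each chunk repeats the previous chunk's final character.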

The Vector Database

A vector database is a type of database designed to store, manage, and manipulate vector embeddings. Vector embeddings are representations of data (such as text, images, or sounds) in a high-dimensional space, where each data item is represented as a dense vector of real numbers. When you query a vector database, your query is transformed into a vector of real numbers. The database then uses this vector to perform similarity searches.

You can think of it as being like a two-dimensional chart with points on it.  One of those points is your query.  The rest are your database records.  What are the points that are closest to the query point?
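That nearest-points picture is just vector similarity. A minimal sketch using cosine similarity and a linear scan (a real vector database like Chroma uses an approximate index instead of scoring every record):

```python
import math

def cosine(a, b):
    """Cosine similarity: the angle between two vectors, ignoring length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, records):
    """records: list of (id, vector); return the id closest to the query."""
    return max(records, key=lambda r: cosine(query_vec, r[1]))[0]
```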

Embedding Model

You can't do this with a regular Ollama model alone; you also need an embedding model. Three are available to pull from the Ollama library as of this writing. For this demo, we are going to use nomic-embed-text.

Main Model

Our main model for this demo is going to be phi3, a 3.8B-parameter model trained by Microsoft.

LangChain

You will notice that today’s demo is heavily using LangChain. LangChain is an open-source framework designed for developing applications that use LLMs. It provides tools and structures that enhance the customization, accuracy, and relevance of the outputs produced by these models. Developers can leverage LangChain to create new prompt chains or modify existing ones.  LangChain pretty much has APIs for everything that we need to do in this app.

The Actual App

Before we start, you are going to want to pip install tiktoken langchain langchain-community langchain-core.  You are also going to want to ollama pull phi3 and ollama pull nomic-embed-text.  This is going to be a CLI app.  You can run it from the terminal like python3 app.py "<Question Here>".

You also need a sources.txt file containing the URLs of things that you want to have in your vector database.

So, what is happening here?  Our app.py file is reading sources.txt to get a list of URLs for news stories from Tuesday’s Apple event.  It then uses WebBaseLoader to download the pages behind those URLs, uses CharacterTextSplitter to chunk the data, and creates the vectorstore using Chroma.  It then creates and invokes rag_chain.
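The load, chunk, embed, retrieve, generate flow described above can be caricatured with stubs. Everything here (the toy bag-of-words "embedding", the function names) is illustrative rather than the demo's actual LangChain code; the real app sends the assembled prompt to phi3 via Ollama instead of returning it:

```python
def embed(text):
    # Toy "embedding": word counts over a tiny fixed vocabulary.
    vocab = ["apple", "event", "may", "chip"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def retrieve(question, chunks, k=1):
    # Score every chunk by dot product with the question's embedding
    # and keep the top k, like the vector store's similarity search.
    q = embed(question)
    scored = sorted(chunks,
                    key=lambda c: -sum(a * b for a, b in zip(q, embed(c))))
    return scored[:k]

def rag_answer(question, chunks):
    # Assemble the retrieved context into the prompt the LLM would see.
    context = "\n".join(retrieve(question, chunks))
    return f"Context:\n{context}\n\nQuestion: {question}"
```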

Here is what the output looks like:

The May 7th event is too recent to be in the model’s training data.  This makes sure that the model knows about it.  You could also feed the model company policy documents, the rules to a board game, or your diary and it will magically know that information.  Since you are running the model in Ollama, there is no risk of that information getting out, too.  It is pretty awesome.

Have any questions, comments, etc?  Feel free to drop a comment, below.

 

https://jws.news/2024/how-to-build-a-rag-system-using-python-ollama-langchain-and-chroma-db/

#AI #ChromaDB #Chunking #LangChain #LLM #Ollama #Python #RAG

2024-04-26

Indulge in the vibrant world of "Chun King" by Sonnyjim, a track that embodies Anthony Bourdain infused Hip Hop finesse. Produced by Statik Selektah, this song is a journey through the art of making money, spending it, and doing so with unmatched style.

#ChunKing #E$ #sonnyjim #StatikSelektah

britishhiphop.co.uk/downloads/
