Basic usage:
By default, MarkdownHeaderTextSplitter strips the headers it splits on from each output chunk’s content. This can be disabled by setting strip_headers = False.
The default MarkdownHeaderTextSplitter strips white spaces and new lines. To preserve the original formatting of your Markdown documents, check out ExperimentalMarkdownSyntaxTextSplitter.
How to return Markdown lines as separate documents:
By default, MarkdownHeaderTextSplitter aggregates lines based on the headers specified in headers_to_split_on. We can disable this by setting return_each_line=True.
Note that header information is retained in the metadata for each document.
How to constrain chunk size:
Within each markdown group we can then apply any text splitter we want, such as RecursiveCharacterTextSplitter, which allows for further control of the chunk size.
Troubleshooting: chunk_overlap doesn’t seem to apply
- After header-based splitting (e.g., MarkdownHeaderTextSplitter), use split_documents(docs) (not split_text) so that overlap is applied within each section and per-section metadata (headers) is preserved on chunks.
- Overlap appears only when a single section exceeds chunk_size and is split into multiple chunks.
- Overlap does not cross section/document boundaries (e.g., # H1 → ## H2).
- If the header becomes a tiny first chunk, consider setting strip_headers to True so the header line doesn’t become a standalone chunk.
- If your text lacks newlines/spaces, keep a fallback "" in separators so the splitter can still split and apply overlap.