Overview
Integration details
| Class | Package | Local | Serializable | JS support | 
|---|---|---|---|---|
| OpenDataLoader PDF | langchain-opendataloader-pdf | ✅ | ❌ | ❌ | 
Loader features
| Source | Document Lazy Loading | Native Async Support | 
|---|---|---|
| OpenDataLoaderPDFLoader | ✅ | ❌ | 
The OpenDataLoaderPDFLoader component parses PDFs into structured Document objects.
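The loader features table marks lazy loading as supported. Below is a minimal sketch of why that matters, assuming the loader follows LangChain's standard `lazy_load()` generator interface. `take_documents` and `FakeLoader` are hypothetical names used only so the example runs without Java or the real package.

```python
from itertools import islice
from typing import Any, Iterator, List


def take_documents(loader: Any, limit: int) -> List[Any]:
    """Return at most `limit` documents without materializing the rest."""
    # lazy_load() yields documents one at a time, so islice stops early
    # instead of parsing every file up front.
    return list(islice(loader.lazy_load(), limit))


class FakeLoader:
    """Hypothetical stand-in for a real loader (not part of any library)."""

    def lazy_load(self) -> Iterator[str]:
        for page in range(1, 1001):
            yield f"page {page}"


first_three = take_documents(FakeLoader(), 3)
print(first_three)
```

With a real loader instance in place of `FakeLoader()`, the same helper would stop after the first few documents rather than loading everything.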
Requirements
- Python >= 3.9
- Java 11 or newer available on the system PATH
- opendataloader-pdf >= 1.1.1
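The first two requirements can be checked from Python before installing anything. This is a rough sketch using only the standard library. It confirms that a `java` executable is on the PATH but does not verify that it is version 11 or newer, and `check_requirements` is a name made up for this example.

```python
import shutil
import sys
from typing import Any, Dict, Tuple


def check_requirements(min_python: Tuple[int, int] = (3, 9)) -> Dict[str, Any]:
    """Report whether the interpreter and a Java executable look usable."""
    java_path = shutil.which("java")  # None when no java executable is on PATH
    return {
        "python_ok": sys.version_info >= min_python,
        "java_path": java_path,
        "java_ok": java_path is not None,
    }


status = check_requirements()
print(status)
```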
Installation
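Assuming the usual pip workflow, the package named in the integration table would be installed with `pip install -U langchain-opendataloader-pdf`. The sketch below confirms what is actually present in the current environment. It uses only the standard library, and `installed_version` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("langchain-opendataloader-pdf", "opendataloader-pdf"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed'}")
```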
Quick start
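A minimal sketch of loading PDFs, assuming the class is importable from a module named `langchain_opendataloader_pdf` (the PyPI name with underscores) and accepts the arguments listed in the Parameters table. The wrapper `load_pdfs` and the path in the comment are illustrative only.

```python
from typing import Any, List


def load_pdfs(paths: List[str], output_format: str = "text") -> List[Any]:
    """Parse the given PDF files or directories into Document objects."""
    # Imported lazily so this file can still be imported when the package
    # (or Java) is not available. The module path is an assumption.
    from langchain_opendataloader_pdf import OpenDataLoaderPDFLoader

    loader = OpenDataLoaderPDFLoader(
        file_path=paths,       # one or more PDF paths or directories
        format=output_format,  # e.g. "json", "html", "markdown", "text"
        quiet=True,            # suppress CLI logging output
    )
    return loader.load()


# Example call (requires the package, Java, and a real PDF; path is hypothetical):
# docs = load_pdfs(["path/to/document.pdf"])
# print(docs[0].metadata, docs[0].page_content[:80])
```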
Parameters
| Parameter | Type | Required | Default | Description | 
|---|---|---|---|---|
| file_path | List[str] | ✅ Yes | — | One or more PDF file paths or directories to process. | 
| format | str | No | None | Output formats (e.g. "json", "html", "markdown", "text"). | 
| quiet | bool | No | False | Suppresses CLI logging output when True. | 
| content_safety_off | Optional[List[str]] | No | None | List of content safety filters to disable (e.g. "all", "hidden-text", "off-page", "tiny", "hidden-ocg"). | 
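To illustrate how the parameters combine, here is a small sketch that assembles keyword arguments matching the table. `build_loader_kwargs` is a hypothetical helper, not part of the package, and the file names are placeholders. It only enforces the one required parameter and leaves optional values out when unset, so the loader's own defaults apply.

```python
from typing import Any, Dict, List, Optional, Union


def build_loader_kwargs(
    file_path: Union[str, List[str]],
    format: Optional[str] = None,  # name mirrors the table, shadowing the builtin
    quiet: bool = False,
    content_safety_off: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments following the Parameters table."""
    # The table types file_path as List[str]; wrap a lone string for convenience.
    if isinstance(file_path, str):
        file_path = [file_path]
    if not file_path:
        raise ValueError("file_path is required and needs at least one entry")

    kwargs: Dict[str, Any] = {"file_path": list(file_path), "quiet": quiet}
    # Optional values are added only when set.
    if format is not None:
        kwargs["format"] = format
    if content_safety_off is not None:
        kwargs["content_safety_off"] = list(content_safety_off)
    return kwargs


kwargs = build_loader_kwargs(
    ["report.pdf", "scans/"],
    format="markdown",
    content_safety_off=["hidden-text", "tiny"],
)
print(kwargs)
```

The resulting dictionary could then be unpacked into the loader constructor, for example `OpenDataLoaderPDFLoader(**kwargs)`, assuming the constructor accepts these names as listed.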
Additional Resources
- LangChain OpenDataLoader PDF integration GitHub
- LangChain OpenDataLoader PDF integration PyPI package
- OpenDataLoader PDF GitHub
- OpenDataLoader PDF Homepage