News

Llama Index
llamaindex.ai > blog > building-a-better-liteparse-skill-with-evals

Building a Faster, Cheaper PDF-Parsing Skill for Claude Agents: A Lite Parse Case Study

3+ hour, 35+ min ago (568+ words) OSS repos trusted by millions of developers. In this blog post, we go through how we improved our Lite Parse skill for document parsing into a cheaper, faster and higher-quality helper by evaluating the agent's usage of it, analyzing…

Google News
llamaindex.ai > blog > how-to-make-a-pdf-searchable

How to Make a PDF Searchable: Methods and Limits

1+ week, 4+ day ago (410+ words) What "Searchable" Means: Two Layers, One of Them Invisible. The fastest way to make a PDF searchable takes about four clicks in Adobe Acrobat: open the file, run Scan & OCR, recognize text, save…

Llama Index
llamaindex.ai > blog > extract-contract-metadata

Extract Contract Metadata: Methods, Challenges, and Workflows

1+ week, 4+ day ago (328+ words) Why Contract Metadata Extraction Is Difficult. The diagram below illustrates how metadata extraction fits into a full contract lifecycle workflow, from ingestion through compliance monitoring and renewal. Modern metadata extraction workflows operate through…

Llama Index
llamaindex.ai > glossary > strikethrough-detection

What is Strikethrough Detection?

2+ week, 3+ day ago (264+ words) The following table distinguishes strikethrough from visually or functionally similar annotation types, a distinction that matters especially in image-based detection, where horizontal marks of different kinds can be easily confused. Detection methods vary…

Llama Index
llamaindex.ai > glossary > code-block-extraction

What is Code Block Extraction?

2+ week, 3+ day ago (306+ words) How Code Block Extraction Works. Code block extraction targets and isolates code content from within a larger body of text. Rather than processing an entire document as undifferentiated content, extraction logic locates the…

Llama Index
llamaindex.ai > glossary > header-detection

What is Header Detection?

2+ week, 3+ day ago (457+ words) What Header Detection Means Across Different Contexts. "Detection" in this context means the process by which a system locates, reads, and interprets that structured block, distinguishing it from surrounding content and extracting the information…

Llama Index
llamaindex.ai > glossary > document-denoising

What is Document Denoising?

2+ week, 3+ day ago (1100+ words) Types of Noise in Document Processing. Document denoising refers to the systematic removal of unwanted elements that obscure or distort the intended content of a document. These elements, collectively called "noise," can originate…

Llama Index
llamaindex.ai > glossary > bold-and-italic-detection

What is Bold and Italic Detection?

2+ week, 3+ day ago (426+ words) What Bold and Italic Detection Actually Means. Bold and italic detection is the process of identifying text formatted with bold or italic styling within a document, image, or digital file, distinguishing it from…

Llama Index
llamaindex.ai > glossary > highlighted-text-extraction

What is Highlighted Text Extraction?

2+ week, 3+ day ago (379+ words) What Highlighted Text Extraction Actually Does. Highlighted text extraction serves a broad range of users across different workflows: Highlighted text extraction can be performed in two fundamentally different ways. Manual extraction involves a…

Llama Index
llamaindex.ai > glossary > reading-order-detection

What is Reading Order Detection?

2+ week, 3+ day ago (585+ words) What Reading Order Detection Actually Does. Getting this right matters for accessibility compliance, screen reader compatibility, and any downstream process that depends on coherent, logically ordered text. Reading order detection determines the logical…
