Key concepts
DocETL is a system that optimizes complex document processing pipelines for accuracy. It uses LLM agents to rewrite and evaluate user-defined pipelines, addressing a limitation of existing declarative frameworks, which prioritize cost reduction over accuracy.
Statistics
Outputs from DocETL-generated pipelines were 1.34 to 4.6 times higher in quality than those from hand-engineered baselines.
As of October 2024, DocETL had over 800 GitHub stars.