Insight Edge Inc. is pleased to announce that it has released “Exparso”, a Python document analysis library built on large language models (LLMs), as open source software (OSS). Exparso uses multimodal LLMs to analyze unstructured data such as PDFs, Office files, and images, improving retrieval accuracy and answer quality in RAG (Retrieval Augmented Generation).
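As text data analysis with LLMs becomes commonplace, RAG has come to be widely used as a representative technique. However, extracting information with high accuracy from documents that contain charts, flowcharts, handwritten text, and similar elements, while keeping that information searchable, has been a major challenge that determines the accuracy of RAG systems.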
Through technical support for DX projects in various industries, including within the Sumitomo Corporation Group, we have come to realize that business documents in the field are highly diverse and that the way they are pre-processed directly affects project outcomes. It has also become clear that document processing tends to depend on the know-how of individuals, which leads to variations in quality and start-up speed from project to project. We therefore developed “Exparso” as a platform technology that removes this dependency on individuals, standardizes the quality of the work delivered, and provides sustainable value across multiple projects.
Source: PR TIMES
