Commit Graph

21 Commits

Author SHA1 Message Date
wxg0103 c5585da57d feat: i18n 2025-01-14 09:46:21 +08:00
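CaptainB fb8b96779c fix: handle PDFs that lack a table of contents or internal links and could not be fully imported 2024-12-06 10:49:37 +08:00
CaptainB f638abdea2 fix: images in doc files were not saved or displayed during document extraction 2024-11-28 15:07:21 +08:00
CaptainB 59f5c8ac76 fix: document extraction errors were not displayed 2024-11-27 12:20:16 +08:00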
CaptainB e1df4b2857 fix: handle \0 characters in PDFs that raised "Null characters are not allowed"
--bug=1048190 --user=刘瑞斌 [Knowledge Base] - Error when uploading a PDF document, related issue #1468 https://www.tapd.cn/57709429/s/1611070
2024-11-18 12:46:37 +08:00
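CaptainB b57a619bdb feat: advanced orchestration supports file upload (WIP) 2024-11-14 13:36:16 +08:00
CaptainB 834ccaa35b refactor: enforce the character limit when segmenting PDFs
--bug=1047568 --user=刘瑞斌 [github#1363] Advanced segmentation of PDF files defaults to a segment length of 500, but generated paragraphs exceed 29000 characters https://www.tapd.cn/57709429/s/1600183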
2024-10-29 11:44:37 +08:00
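wxg0103 d5bbf48d01 style: improve styling 2024-10-18 15:51:03 +08:00
CaptainB e16e827028 fix: strip leading and trailing whitespace from text 2024-09-25 16:00:30 +08:00
CaptainB 6cacb5be71 fix: handle nonstandard PDFs whose preface is not listed in the table of contents and so was not recognized 2024-09-24 12:06:51 +08:00
CaptainB 70f44b990c refactor: segment well-formed PDFs by their table of contents 2024-09-06 10:56:27 +08:00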
shaohuzhang1 a9443a638c fix: uploaded documents with the extension PDF were not recognized 2024-08-27 14:16:03 +08:00
CaptainB 2a87af6172 chore: output the error cause when parsing fails 2024-08-22 10:43:48 +08:00
shaohuzhang1 00af530d27
chore: output the error cause when parsing fails (#996)
Co-authored-by: CaptainB <bin@fit2cloud.com>
2024-08-20 22:03:58 +08:00
CaptainB 17af603397 refactor: optimize PDF loading and fix garbled Chinese text in some PDFs
2024-08-20 16:58:04 +08:00
CaptainB 01d8204cb5 refactor: load PDFs page by page; save images as separate files for loading 2024-08-16 15:08:22 +08:00
CaptainB 0d59ab2be9 refactor: load PDFs using lazy_load
2024-08-16 10:43:20 +08:00
CaptainB e266dd9d99 refactor: support parsing images in PDFs
2024-08-15 20:53:44 +08:00
shaohuzhang1 1f916a5c3e
feat: [Knowledge Base] docx supports image upload #69 (#267) 2024-04-26 18:03:02 +08:00
shaohuzhang1 fb7abb432f
Pr@main@fix bugs (#41)
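* fix: fix the prompt issue

* fix: document upload limits

* feat: question management

* fix: update the segmentation regex and improve segmentation logic

* feat: question management

* fix: Word segmentation supports table data

* fix: deduplicate questions on batch insert

* fix: fix document issues

* feat: improve document pagination

* fix: improve associated questions

* fix: embed styling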
2024-04-10 14:16:56 +08:00
shaohuzhang1 c55bb3f6e5
Pr@main@pdf (#23)
* feat: segmentation API supports Word and PDF

* fix: general-purpose knowledge base supports uploading documents in PDF/DOC format #19

---------

Co-authored-by: wangdan-fit2cloud <dan.wang@fit2cloud.com>
2024-03-29 18:28:05 +08:00