CaptainB
|
834ccaa35b
|
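refactor: enforce the character-count limit when segmenting PDFs
--bug=1047568 --user=刘瑞斌 [github#1363] Advanced segmentation of PDF files defaults to a segment length of 500, but generated segments exceeded 29,000 characters https://www.tapd.cn/57709429/s/1600183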
|
2024-10-29 11:44:37 +08:00 |
|
wxg0103
|
d5bbf48d01
|
style: improve styling
|
2024-10-18 15:51:03 +08:00 |
|
CaptainB
|
e16e827028
|
fix: handle leading and trailing whitespace in text
|
2024-09-25 16:00:30 +08:00 |
|
CaptainB
|
6cacb5be71
|
fix: handle non-standard PDFs whose preface is not listed in the table of contents, which prevented it from being recognized correctly
|
2024-09-24 12:06:51 +08:00 |
|
CaptainB
|
70f44b990c
|
refactor: segment well-formatted PDFs by their table of contents
|
2024-09-06 10:56:27 +08:00 |
|
shaohuzhang1
|
a9443a638c
|
fix: uploaded documents with the PDF suffix were not recognized
|
2024-08-27 14:16:03 +08:00 |
|
CaptainB
|
2a87af6172
|
chore: output the error reason when parsing fails
|
2024-08-22 10:43:48 +08:00 |
|
shaohuzhang1
|
00af530d27
|
chore: output the error reason when parsing fails (#996)
Co-authored-by: CaptainB <bin@fit2cloud.com>
|
2024-08-20 22:03:58 +08:00 |
|
CaptainB
|
17af603397
|
refactor: optimize PDF loading and fix garbled Chinese text in some PDFs
|
2024-08-20 16:58:04 +08:00 |
|
CaptainB
|
01d8204cb5
|
refactor: load PDFs page by page; save image content to separate files for loading
|
2024-08-16 15:08:22 +08:00 |
|
CaptainB
|
0d59ab2be9
|
refactor: load PDFs using lazy_load
|
2024-08-16 10:43:20 +08:00 |
|
CaptainB
|
e266dd9d99
|
refactor: support parsing images in PDFs
|
2024-08-15 20:53:44 +08:00 |
|
shaohuzhang1
|
1f916a5c3e
|
feat: [Knowledge base] support image upload for docx #69 (#267)
|
2024-04-26 18:03:02 +08:00 |
|
shaohuzhang1
|
fb7abb432f
|
Pr@main@fix bugs (#41)
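* fix: fix prompt issue
* fix: document upload limit
* feat: question management
* fix: update the segmentation regex and improve segmentation logic
* feat: question management
* fix: support table data in Word segmentation
* fix: deduplicate questions on batch insert
* fix: fix document issues
* feat: improve document pagination
* fix: improve associated questions
* fix: embed styling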
|
2024-04-10 14:16:56 +08:00 |
|
shaohuzhang1
|
c55bb3f6e5
|
Pr@main@pdf (#23)
* feat: segmentation API supports Word and PDF
* fix: general-purpose knowledge base supports uploading documents in PDF/DOC format #19
---------
Co-authored-by: wangdan-fit2cloud <dan.wang@fit2cloud.com>
|
2024-03-29 18:28:05 +08:00 |
|