Commit Graph

88 Commits

Author SHA1 Message Date
CaptainB 0f1d57f0cb feat: enhance error logging for file processing in CSV, XLS, and DOC handlers 2025-06-30 12:49:50 +08:00
CaptainB 82a2203be6 fix: handle string type for limit and improve error logging in pdf_split_handle
--bug=1057493 --user=刘瑞斌 【知识库】上传文档,使用高级分段报错 https://www.tapd.cn/62980211/s/1720110
2025-06-30 12:47:47 +08:00
CaptainB d49f448a5f fix: correct image path replacement logic in zip_split_handle 2025-06-26 17:02:34 +08:00
CaptainB 37ac79dc5a feat: import File model in zip_split_handle for enhanced functionality
--bug=1057478 --user=刘瑞斌 【知识库】通用知识库上传ZIP文件,分段失败 https://www.tapd.cn/62980211/s/1719181
2025-06-26 16:56:28 +08:00
CaptainB e24a2001c5 feat: refine regex patterns in text_split_handle for improved comment detection
--bug=1057526 --user=刘瑞斌 【知识库】markdown文件导入知识库,分段详情中代码块展示异常 https://www.tapd.cn/62980211/s/1719131
2025-06-26 16:23:32 +08:00
CaptainB a73e0b10f9 refactor: replace logging with maxkb_logger for consistent logging across modules 2025-06-25 17:00:18 +08:00
CaptainB fe8f87834d refactor: replace logging with maxkb_logger for consistent logging across modules 2025-06-25 16:46:50 +08:00
CaptainB 3aa0847506 refactor: replace print statements with logging for improved error tracking 2025-06-25 16:18:19 +08:00
wxg0103 c253e8b696 refactor: remove print 2025-06-24 15:30:42 +08:00
CaptainB 45908b91ff refactor: update dataset_id to knowledge_id in zip_split_handle.py and tools.py 2025-06-18 21:28:33 +08:00
CaptainB c0b770f41e refactor: update dataset_id to knowledge_id in zip_split_handle.py and tools.py 2025-06-18 21:15:53 +08:00
CaptainB 9a7281212d fix: update image URL paths to use OSS endpoints 2025-06-12 15:49:54 +08:00
wxg0103 b8b14884bd refactor: add application settings 2025-06-07 17:57:11 +08:00
wxg0103 93833849c1 refactor: file to oss
Some checks are pending
sync2gitee / repo-sync (push) Waiting to run
2025-06-06 11:42:31 +08:00
CaptainB c3581be9bd fix: rename image_name to file_name in zip_split_handle and remove workspace_id assignment in document 2025-05-13 12:47:59 +08:00
CaptainB e702af8c2b feat: enhance Document API with workspace ID support for get, put, and delete operations 2025-05-06 15:24:36 +08:00
CaptainB 43bef216d5 refactor: reorganize file handling imports into a structured directory 2025-04-30 16:08:17 +08:00
CaptainB 48297d81e5 feat: add initial implementations of various file handling classes for CSV, XLS, and XLSX formats 2025-04-30 15:52:58 +08:00
CaptainB c78a6babb6 ci: v2 2025-04-11 15:47:59 +08:00
CaptainB 560890f717 fix: limit chapter title length to 256 characters in pdf_split_handle.py
--bug=1054363 --user=刘瑞斌 【知识库】导入PDF文档,分段标题长度超长时,没有自动截断 https://www.tapd.cn/57709429/s/1681044
2025-04-07 10:54:59 +08:00
CaptainB 675adeeb63 fix: exclude macOS specific files from zip processing
--bug=1054264 --user=刘瑞斌 【知识库】QA问答对模式,导入在mac上压缩的zip文件,会出现2个乱码文档 https://www.tapd.cn/57709429/s/1681034
2025-04-07 10:37:06 +08:00
CaptainB 27bc01d442 fix: skip macOS specific metadata directories and files in zip parsing
--bug=1054264 --user=刘瑞斌 【知识库】QA问答对模式,导入在mac上压缩的zip文件,会出现2个乱码文档 https://www.tapd.cn/57709429/s/1679674
2025-04-02 16:06:36 +08:00
shaohuzhang1 9750c6d605
fix: garbled zip import file names (#2747) 2025-03-31 16:22:39 +08:00
shaohuzhang1 55cdd0a708
fix: Zip with title cannot be parsed (#2683) 2025-03-26 10:31:31 +08:00
shaohuzhang1 5ec94860b2
perf: Enhance Word parsing (#2612) 2025-03-19 12:04:43 +08:00
shaohuzhang1 e420a01e0d
fix: Enterprise WeChat docking sub application cannot output thinking process (#2489) 2025-03-04 19:31:49 +08:00
shaohuzhang1 8c45e92ee4
feat: The OpenAI interface supports the thought process (#2392) 2025-02-25 14:22:51 +08:00
CaptainB c524fbc0e4 fix: Fix excel merge cells header 2025-02-14 10:26:18 +08:00
CaptainB 89c08b4bb0 fix: Filter blank sheet
--bug=1052097 --user=刘瑞斌 【github#2196】【应用编排】应用对话的时候上传带空白sheet的表格会报错 https://www.tapd.cn/57709429/s/1653414
2025-02-11 15:17:24 +08:00
shaohuzhang1 f16f417bd5
fix: The knowledge base table file upload is missing a header (#2185) 2025-02-10 10:22:23 +08:00
wxg0103 b90995d3aa fix: defect of incorrect document names after importing CSV and docx files into the knowledge base
--bug=1052039 --user=王孝刚 【知识库】-压缩文件中包含csv、docx文件时,导入到知识库后,文档名称包含文件夹名称 https://www.tapd.cn/57709429/s/1651752
2025-02-08 16:00:57 +08:00
shaohuzhang1 a3d6083188
fix: XLS, XLSX, CSV file upload lost data (#2150) 2025-02-07 15:13:14 +08:00
wxg0103 c5585da57d feat: i18n 2025-01-14 09:46:21 +08:00
shaohuzhang1 a28de6feaf
feat: i18n (#2011) 2025-01-13 11:15:51 +08:00
shaohuzhang1 d9df013e33
fix: Part of the docx document is parsed incorrectly (#1981)
Some checks are pending
sync2gitee / repo-sync (push) Waiting to run
Typos Check / Spell Check with Typos (push) Waiting to run
2025-01-06 14:37:51 +08:00
shaohuzhang1 832b0dbd63
feat: Knowledge base import supports zip, xls, xlsx, and csv formats, while knowledge base export supports zip format (#1869) 2024-12-18 18:00:19 +08:00
CaptainB fb8b96779c fix: 处理某些pdf中不包括目录和内部链接不能完整导入的问题 2024-12-06 10:49:37 +08:00
CaptainB 7346ef6a2c fix: 过滤空白的sheet
--bug=1049943 --user=刘瑞斌 【文档内容提取】-上传的excel中sheet为空时报错 https://www.tapd.cn/57709429/s/1625062
2024-12-04 16:30:43 +08:00
shaohuzhang1 6b4cee1412
fix: 修复对话使用api调用无法响应数据 (#1755) 2024-12-04 14:19:37 +08:00
shaohuzhang1 b6c65154c5
fix: 修复子应用表单调用无法调用问题 (#1741) 2024-12-03 15:23:53 +08:00
shaohuzhang1 b8aa4756c5
fix: 修复工作流节点输出等问题 (#1716)
Some checks failed
sync2gitee / repo-sync (push) Has been cancelled
Typos Check / Spell Check with Typos (push) Has been cancelled
2024-11-29 19:26:16 +08:00
CaptainB 78cd949f43 fix: 修复上传xlsx里的图片没在文档提取中显示的问题 2024-11-29 16:28:34 +08:00
CaptainB 2a07a50a60 fix: 修复文档提取doc图片没有保存和展示的问题 2024-11-28 16:17:23 +08:00
CaptainB f638abdea2 fix: 修复文档提取doc图片没有保存和展示的问题 2024-11-28 15:07:21 +08:00
CaptainB 59f5c8ac76 fix: 修复文档提取报错没有显示的问题 2024-11-27 12:20:16 +08:00
CaptainB 64e8f4dc9f chore: 文档内容无法提取的时候输出错误信息 2024-11-22 17:56:07 +08:00
CaptainB e1df4b2857 fix: 处理PDF中出现 \0 字符报 Null characters are not allowed
--bug=1048190 --user=刘瑞斌 【知识库】- 上传PDF文档 报错  ,关联issue #1468 https://www.tapd.cn/57709429/s/1611070
2024-11-18 12:46:37 +08:00
CaptainB 10e53f08e2 feat: 高级编排支持文件上传(WIP) 2024-11-14 14:24:36 +08:00
CaptainB b57a619bdb feat: 高级编排支持文件上传(WIP) 2024-11-14 13:36:16 +08:00
shaohuzhang1 22d9fdc42f fix: 修复旧word文档图片无法正常识别 #1533 2024-11-06 14:20:10 +08:00