Commit Graph

98 Commits

Author SHA1 Message Date
liqiang-fit2cloud 8c0836627a refactor: remove print.
2025-11-19 17:16:22 +08:00
shaohuzhang1 a8d0729e65 perf: Memory optimization (#4318) 2025-11-05 19:05:26 +08:00
CaptainB d147b794ce chore: replace split_text with smart_split_paragraph in pdf_split_handle.py 2025-10-27 14:23:42 +08:00
shaohuzhang1 d92dcd722b fix: Add file name to prompt when processing images with doc (#4114)
2025-09-25 18:51:21 +08:00
shaohuzhang1 7264545ab6 feat: Support loop node (#4045) 2025-09-16 15:49:49 +08:00
CaptainB 75c461f385 chore: replace datetime.now() with timezone.now() for consistent time handling 2025-08-29 10:16:53 +08:00
CaptainB 4c9756839a chore: normalize with_filter parameter to boolean in split handle files
--bug=1057879 --user=刘瑞斌 [Knowledge base] Automatic cleaning does not take effect in advanced segmentation https://www.tapd.cn/62980211/s/1727744
2025-07-10 15:06:19 +08:00
CaptainB cb40d62162 refactor: allow loading of truncated images and increase max pixel limit in common_handle.py
--bug=1057749 --user=刘瑞斌 [Knowledge base] Images in a QA-pair document are not displayed after import https://www.tapd.cn/62980211/s/1723700
2025-07-04 15:53:37 +08:00
CaptainB aa901c7fc7 fix: update file URL paths to use relative references 2025-07-02 22:45:11 +08:00
CaptainB 089915f488 refactor: improve error logging for image reading and enhance image handling logic
--bug=1057749 --user=刘瑞斌 [Knowledge base] Images in a QA-pair document are not displayed after import https://www.tapd.cn/62980211/s/1720856
2025-07-01 14:17:10 +08:00
CaptainB 0f1d57f0cb feat: enhance error logging for file processing in CSV, XLS, and DOC handlers 2025-06-30 12:49:50 +08:00
CaptainB 82a2203be6 fix: handle string type for limit and improve error logging in pdf_split_handle
--bug=1057493 --user=刘瑞斌 [Knowledge base] Uploading a document with advanced segmentation raises an error https://www.tapd.cn/62980211/s/1720110
2025-06-30 12:47:47 +08:00
CaptainB d49f448a5f fix: correct image path replacement logic in zip_split_handle 2025-06-26 17:02:34 +08:00
CaptainB 37ac79dc5a feat: import File model in zip_split_handle for enhanced functionality
--bug=1057478 --user=刘瑞斌 [Knowledge base] Segmentation fails when a ZIP file is uploaded to a general knowledge base https://www.tapd.cn/62980211/s/1719181
2025-06-26 16:56:28 +08:00
CaptainB e24a2001c5 feat: refine regex patterns in text_split_handle for improved comment detection
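--bug=1057526 --user=刘瑞斌 [Knowledge base] After a Markdown file is imported into the knowledge base, code blocks are displayed incorrectly in the segment details https://www.tapd.cn/62980211/s/1719131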
2025-06-26 16:23:32 +08:00
CaptainB a73e0b10f9 refactor: replace logging with maxkb_logger for consistent logging across modules 2025-06-25 17:00:18 +08:00
CaptainB fe8f87834d refactor: replace logging with maxkb_logger for consistent logging across modules 2025-06-25 16:46:50 +08:00
CaptainB 3aa0847506 refactor: replace print statements with logging for improved error tracking 2025-06-25 16:18:19 +08:00
wxg0103 c253e8b696 refactor: remove print 2025-06-24 15:30:42 +08:00
CaptainB 45908b91ff refactor: update dataset_id to knowledge_id in zip_split_handle.py and tools.py 2025-06-18 21:28:33 +08:00
CaptainB c0b770f41e refactor: update dataset_id to knowledge_id in zip_split_handle.py and tools.py 2025-06-18 21:15:53 +08:00
CaptainB 9a7281212d fix: update image URL paths to use OSS endpoints 2025-06-12 15:49:54 +08:00
wxg0103 b8b14884bd refactor: add application settings 2025-06-07 17:57:11 +08:00
wxg0103 93833849c1 refactor: file to oss
2025-06-06 11:42:31 +08:00
CaptainB c3581be9bd fix: rename image_name to file_name in zip_split_handle and remove workspace_id assignment in document 2025-05-13 12:47:59 +08:00
CaptainB e702af8c2b feat: enhance Document API with workspace ID support for get, put, and delete operations 2025-05-06 15:24:36 +08:00
CaptainB 43bef216d5 refactor: reorganize file handling imports into a structured directory 2025-04-30 16:08:17 +08:00
CaptainB 48297d81e5 feat: add initial implementations of various file handling classes for CSV, XLS, and XLSX formats 2025-04-30 15:52:58 +08:00
CaptainB c78a6babb6 ci: v2 2025-04-11 15:47:59 +08:00
CaptainB 560890f717 fix: limit chapter title length to 256 characters in pdf_split_handle.py
--bug=1054363 --user=刘瑞斌 [Knowledge base] When a PDF document is imported, overlong segment titles are not automatically truncated https://www.tapd.cn/57709429/s/1681044
2025-04-07 10:54:59 +08:00
CaptainB 675adeeb63 fix: exclude macOS specific files from zip processing
--bug=1054264 --user=刘瑞斌 [Knowledge base] In QA-pair mode, importing a zip file compressed on a Mac produces 2 garbled documents https://www.tapd.cn/57709429/s/1681034
2025-04-07 10:37:06 +08:00
CaptainB 27bc01d442 fix: skip macOS specific metadata directories and files in zip parsing
--bug=1054264 --user=刘瑞斌 [Knowledge base] In QA-pair mode, importing a zip file compressed on a Mac produces 2 garbled documents https://www.tapd.cn/57709429/s/1679674
2025-04-02 16:06:36 +08:00
shaohuzhang1 9750c6d605 fix: garbled zip import file names (#2747) 2025-03-31 16:22:39 +08:00
shaohuzhang1 55cdd0a708 fix: Zip with title cannot be parsed (#2683) 2025-03-26 10:31:31 +08:00
shaohuzhang1 5ec94860b2 perf: Enhance Word parsing (#2612) 2025-03-19 12:04:43 +08:00
shaohuzhang1 e420a01e0d fix: Enterprise WeChat docking sub application cannot output thinking process (#2489) 2025-03-04 19:31:49 +08:00
shaohuzhang1 8c45e92ee4 feat: The OpenAI interface supports the thought process (#2392) 2025-02-25 14:22:51 +08:00
CaptainB c524fbc0e4 fix: Fix excel merge cells header 2025-02-14 10:26:18 +08:00
CaptainB 89c08b4bb0 fix: Filter blank sheet
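--bug=1052097 --user=刘瑞斌 [github#2196] [Application orchestration] Uploading a spreadsheet that contains a blank sheet during an application conversation raises an error https://www.tapd.cn/57709429/s/1653414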
2025-02-11 15:17:24 +08:00
shaohuzhang1 f16f417bd5 fix: The knowledge base table file upload is missing a header (#2185) 2025-02-10 10:22:23 +08:00
wxg0103 b90995d3aa fix: defect of incorrect document names after importing CSV and docx files into the knowledge base
--bug=1052039 --user=王孝刚 [Knowledge base] When an archive contains csv and docx files, the document names include the folder name after import into the knowledge base https://www.tapd.cn/57709429/s/1651752
2025-02-08 16:00:57 +08:00
shaohuzhang1 a3d6083188 fix: XLS, XLSX, CSV file upload lost data (#2150) 2025-02-07 15:13:14 +08:00
wxg0103 c5585da57d feat: i18n 2025-01-14 09:46:21 +08:00
shaohuzhang1 a28de6feaf feat: i18n (#2011) 2025-01-13 11:15:51 +08:00
shaohuzhang1 d9df013e33 fix: Part of the docx document is parsed incorrectly (#1981)
2025-01-06 14:37:51 +08:00
shaohuzhang1 832b0dbd63 feat: Knowledge base import supports zip, xls, xlsx, and csv formats, while knowledge base export supports zip format (#1869) 2024-12-18 18:00:19 +08:00
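CaptainB fb8b96779c fix: handle PDFs that cannot be fully imported because they lack a table of contents and internal links 2024-12-06 10:49:37 +08:00
CaptainB 7346ef6a2c fix: filter blank sheets
--bug=1049943 --user=刘瑞斌 [Document content extraction] An error is raised when a sheet in the uploaded Excel file is empty https://www.tapd.cn/57709429/s/1625062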
2024-12-04 16:30:43 +08:00
shaohuzhang1 6b4cee1412 fix: conversations invoked through the API return no response data (#1755) 2024-12-04 14:19:37 +08:00
shaohuzhang1 b6c65154c5 fix: sub-application form calls cannot be invoked (#1741) 2024-12-03 15:23:53 +08:00