FastGPT/test/cases/global/common/string/test.md
Archer 2c681bcdd1
fix: text split (#5933)
* fix: text split

* remove test
2025-11-17 12:30:56 +08:00

[
"这是一个测试的内容包含代码块快速了解FastGPTFastGPT的能力与优势FastGPT是一个基于LLM大语言模型的知识库问答系统提供开箱即用的数据处理、模型调用等能力。同时可以通过Flow可视化进行工作流编排从而实现复杂的问答场景FastGPT在线使用https://fastgpt.ioFastGPT能力1.专属AI客服通过导入文档或已有问答对进行训练让AI模型能根据你的文档以交互式对话方式回答问题。2.简单易用的可视化界面FastGPT采用直观的可视化界面设计为各种应用场景提供了丰富实用的功能。通过简洁易懂的操作步骤可以轻松完成AI客服的创建和训练流程。~~~jsimport{defaultMaxChunkSize}from'../../core/dataset/training/utils';import{getErrText}from'../error/utils';constgetOneTextOverlapText=({text,step}:{text:string;step:number}):string=>{constforbidOverlap=checkForbidOverlap(step);constmaxOverlapLen=chunkSize*0.4;//step>=stepReges.length:Donotoverlapincompletesentencesif(forbidOverlap||overlapLen===0||step>=stepReges.length)return'';constsplitTexts=getSplitTexts({text,step});letoverlayText='';for(leti=splitTexts.length-1;i>=0;i--){constcurrentText=splitTexts[i].text;constnewText=currentText+overlayText;constnewTextLen=newText.length;if(newTextLen>overlapLen){if(newTextLen>maxOverlapLen){consttext=getOneTextOverlapText({text:newText,step:step+1});returntext||overlayText;}returnnewText;}overlayText=newText;}returnoverlayText;};constgetOneTextOverlapText=({text,step}:{text:string;step:number}):string=>{constforbidOverlap=checkForbidOverlap(step);constmaxOverlapLen=chunkSize*0.4;//step>=stepReges.length:Donotoverlapincompletesentencesif(forbidOverlap||overlapLen===0||step>=stepReges.length)return'';constsplitTexts=getSplitTexts({text,step});letoverlayText='';for(leti=splitTexts.length-1;i>=0;i--){constcurrentText=splitTexts[i].text;constnewText=currentText+overlayText;constnewTextLen=newText.length;if(newTextLen>overlapLen){if(newTextLen>maxOverlapLen){consttext=getOneTextOverlapText({text:newText,step:step+1});returntext||overlayText;}returnnewText;}overlayText=newText;}returnoverlayText;};constgetOneTextOverlapText=({text,step}:{text:string;step:number}):string=>{constforbidOverlap=checkForbidOverlap(step);constmaxOverlapLen=chunkSize*0.4;//step>=stepReges.length:Donotoverlapincompletesentencesif(forbidOverlap||overlapLen===0||step>=stepReges.length)return'';constsplitTexts=getSplitTexts({text,step});letoverlayText='';for(leti=splitTexts.length-1;i>=0;i--){constcurrentText=splitTexts[i].text;constnewText=currentText+overlayText;constnewTextLen=newText.length;if(newTextLen>overlapLen){if(newTextLen>maxOverlapLen){consttext=getOneTextOverlapText({text:newText,step:step+1});returntext||overlayText;}returnnewText;}overlayText=newText;}returnoverlayText;};constgetOneTextOverlapText=({text,step}:{text:string;step:number}):string=>{constforbidOverlap=checkForbidOverlap(step);constmaxOverlapLen=chunkSize*0.4;//step>=stepReges.length:Donotoverlapincompletesentencesif(forbidOverlap||overlapLen===0||step>=stepReges.length)return'';constsplitTexts=getSplitTexts({text,step});letoverlayText='';for(leti=splitTexts.length-1;i>=0;i--){constcurrentText=splitTexts[i].text;constnewText=currentText+overlayText;constnewTextLen=newText.length;if(newTextLen>overlapLen){if(newTextLen>maxOverlapLen){consttext=getOneTextOverlapText({text:newText,step:step+1});returntext||overlayText;}returnnewText;}overlayText=newText;}returnoverlayText;};constgetOneTextOverlapText=({text,step}:{text:string;step:number}):string=>{constforbidOverlap=checkForbidOverlap(step);constmaxOverlapLen=chunkSize*0.4;//step>=stepReges.length:Donotoverlapincompletesentencesif(forbidOverlap||overlapLen===0||step>=stepReges.length)return'';constsplitTexts=getSplitTexts({text,step});letoverlayText='';for(leti=splitTexts.length-1;i>=0;i--){constcurrentText=splitTexts[i].text;constnewText=currentText+overlayText;constnewTextLen=newText.length;if(newTextLen>overlapLen){if(newTextLen>maxOverlapLen){consttext=getOneTextOverlapText({text:newText,step:step+1});returntext||overlayText;}returnnewText;}overlayText=newText;}returnoverlayText;};~~~",
"3.自动数据预处理提供手动输入、直接分段、LLM自动处理和CSV等多种数据导入途径其中“直接分段”支持通过PDF、WORD、Markdown和CSV文档内容作为上下文。FastGPT会自动对文本数据进行预处理、向量化和QA分割节省手动训练时间提升效能。4.工作流编排基于Flow模块的工作流编排可以帮助你设计更加复杂的问答流程。例如查询数据库、查询库存、预约实验室等。5.强大的API集成FastGPT对外的API接口对齐了OpenAI官方接口可以直接接入现有的GPT应用也可以轻松集成到企业微信、公众号、飞书等平台。FastGPT特点项目开源FastGPT遵循附加条件ApacheLicense2.0开源协议你可以Fork之后进行二次开发和发布。FastGPT社区版将保留核心功能商业版仅在社区版基础上使用API的形式进行扩展不影响学习使用。独特的QA结构针对客服问答场景设计的QA结构提高在大量数据场景中的问答准确性。可视化工作流通过Flow模块展示了从问题输入到模型输出的完整流程便于调试和设计复杂流程。无限扩展基于API进行扩展无需修改FastGPT源码也可快速接入现有的程序中。",
"便于调试提供搜索测试、引用修改、完整对话预览等多种调试途径。支持多种模型支持GPT、Claude、文心一言等多种LLM模型未来也将支持自定义的向量模型。知识库核心流程FastGPTAI相关参数配置说明在FastGPT的AI对话模块中有一个AI高级配置里面包含了AI模型的参数配置本文详细介绍这些配置的含义。返回AI内容高级编排特有这是一个开关打开的时候当AI对话模块运行时会将其输出的内容返回到浏览器API响应如果关闭AI输出的内容不会返回到浏览器但是生成的内容仍可以通过【AI回复】进行输出。你可以将【AI回复】连接到其他模块中。最大上下文代表模型最多容纳的文字数量。"
]