doc/安装.txt
I. First, set up Tika, the document-processing tool (it requires a JDK)
sudo apt install openjdk-11-jdk
or
sudo apt install default-jdk

If the system packages cannot be used, download a build manually from https://adoptium.net/zh-CN/, extract it, and then link the java binary into a directory on PATH:

sudo ln -s <extract-dir>/jdk-17.0.14+7-jre/bin/java /usr/bin/java

Run the following in a terminal to verify that Java is installed:
java -version

Then install tika:
pip install tika -i https://pypi.tuna.tsinghua.edu.cn/simple
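
Because tika needs a working Java runtime, the Java check in this section can also be scripted. The following is a minimal sketch, not part of k3GPT: the function names parse_java_major and java_major_on_path are hypothetical, and it assumes the usual `java -version` banner of the form `version "17.0.14"` (or the legacy `"1.8.0_292"` form).

```python
import re
import shutil
import subprocess


def parse_java_major(version_output: str):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: Java 8 reports itself as "1.8.0_xxx".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def java_major_on_path():
    """Return the major version of the `java` found on PATH, or None."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr, so both streams are read.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = java_major_on_path()
    print("java not found on PATH" if major is None else f"Java major version: {major}")
```

A result of None means that either no java executable is on PATH or the banner had an unexpected format.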


II. Document sync and splitting
1. Document sync
pip install smbprotocol -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install paramiko -i https://pypi.tuna.tsinghua.edu.cn/simple


2. Upgrade langchain to 0.3
pip install -U langchain -i https://pypi.tuna.tsinghua.edu.cn/simple

3. Document generation
# docx
sudo apt-get install pandoc
pip install pypandoc -i https://pypi.tuna.tsinghua.edu.cn/simple

# pdf support (this downloads a fairly large set of resources)
sudo apt-get install wkhtmltopdf
pip install pdfkit -i https://pypi.tuna.tsinghua.edu.cn/simple

# ppt support
pip install lxml python-pptx -i https://pypi.tuna.tsinghua.edu.cn/simple
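
Step 2 above targets langchain 0.3, but `pip install -U` installs the newest release the index offers, which is not necessarily in the 0.3 series. A minimal sketch for checking what was actually installed is shown here; the helper names in_series and installed_version are hypothetical, and only the standard-library importlib.metadata module is used.

```python
from importlib import metadata


def in_series(version: str, series: str) -> bool:
    """Return True if `version` belongs to release series `series` (e.g. "0.3")."""
    wanted = series.split(".")
    return version.split(".")[: len(wanted)] == wanted


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("langchain")
    if found is None:
        print("langchain is not installed")
    elif in_series(found, "0.3"):
        print(f"langchain {found} is in the 0.3 series")
    else:
        print(f"langchain {found} is outside the 0.3 series")
```

If the installed version turns out to be outside the 0.3 series, a version specifier such as "langchain>=0.3,<0.4" on the pip command line is one way to constrain it.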


III. Full-text search
pip install xapian-bindings-binary -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install jieba -i https://pypi.tuna.tsinghua.edu.cn/simple


IV. ORM model
pip install peewee -i https://pypi.tuna.tsinghua.edu.cn/simple


V. FastAPI
pip install fastapi -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install python-multipart -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install "uvicorn[standard]" -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install gunicorn -i https://pypi.tuna.tsinghua.edu.cn/simple
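
Most pip commands in this guide pass the same mirror with `-i`. If the installs are ever scripted, the repeated option can be factored out. The following is only a sketch: pip_install_command is a hypothetical helper, and it builds the command line without running it.

```python
import shlex
import sys

# Mirror used throughout this guide.
MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_command(packages, index_url=MIRROR, upgrade=False):
    """Build the argv for one `pip install` call against `index_url`."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("-U")
    cmd += ["-i", index_url]
    cmd += list(packages)
    return cmd


if __name__ == "__main__":
    web_stack = ["fastapi", "python-multipart", "uvicorn[standard]", "gunicorn"]
    # Print only; pass the list to subprocess.run(...) to actually install.
    print(shlex.join(pip_install_command(web_stack)))
```

Passing the arguments as a list avoids shell quoting issues with names such as uvicorn[standard].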


VI. qwen-agent
pip install qwen-agent -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install python-dotenv -i https://pypi.tuna.tsinghua.edu.cn/simple
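
VII. Initialization and running
1. Extract the k3GPT-Vx.xz archive and change into the extracted directory
2. The default storage path is /mnt; make sure the user has permission to create and write files there
sudo chown -R <user>:<group> /mnt
3. Start the three services with the ./start.sh script:
one main service, one service that shares knowledge entities (知识体) externally, and one service that builds the full-text index
Started uvicorn with PID: 47382
Started uvicorn with PID: 47384
Started build_full_index.py with PID: 47386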


4. Once the services are up, open https://ip:8000/ in a browser

5. The /stop file stops all services

6. The tika service starts only when a document is actually processed; upload a file in the File Center to try it,
then verify with ps -ef | grep java
element+ 169987 1 1 16:43 ? 00:00:01 java -cp /tmp/tika-server.jar org.apache.tika.server.core.TikaServerCli --port 9998 --host localhost
element+ 170024 169987 27 16:43 ? 00:00:20 java -Djava.awt.headless=true -cp /tmp/tika-server.jar -Dtika.server.id=1b8dcb6f-a416-45b2-9e07-5349ae2d61fb org.apache.tika.server.core.TikaServerProcess -h localhost -p 9998 -i 1b8dcb6f-a416-45b2-9e07-5349ae2d61fb -forkedStatusFile /tmp/apache-tika-server-forked-tmp-6459948668656732552 -numRestarts 0
element+ 170272 4933 0 16:44 pts/2 00:00:00 grep --color=auto java

7. External access to a knowledge entity

https://ip:7000/?sn=LOZxAy06AaQ
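
The checks in this section (write access to the storage path, the PIDs printed at startup, and whether a service port answers) can be scripted. This is a minimal sketch under stated assumptions: the helper names can_write, parse_started_pids and port_open are hypothetical, the startup lines are assumed to have the "Started NAME with PID: N" form shown above, and the ports 8000, 7000 and 9998 are the ones that appear in this guide.

```python
import os
import re
import socket


def can_write(path: str) -> bool:
    """True if the current user may create entries inside directory `path`."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


def parse_started_pids(output: str) -> dict:
    """Map each started program name to the list of PIDs reported for it."""
    pids = {}
    for name, pid in re.findall(r"Started (\S+) with PID: (\d+)", output):
        pids.setdefault(name, []).append(int(pid))
    return pids


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, can_write("/mnt") covers step 2, and port_open("localhost", 9998) should only return True after a document has been processed, since tika starts on demand. port_open tests TCP reachability only; it does not confirm that the service behind the port is healthy.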

VIII. To start the services individually:
First change into the main directory
1. Start the main web service
uvicorn web:app --host 0.0.0.0 --port 8000
2. Start the service that shares knowledge entities externally
uvicorn webx:app --host 0.0.0.0 --port 7000
3. Build the knowledge-base index
python3 build_full_index.py
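
The three commands above could also be launched from one Python script. The following is a hedged sketch, not a description of what start.sh does: service_commands and start_services are hypothetical names, and the commands are copied from this section.

```python
import subprocess


def service_commands():
    """Return the three commands from this section as argv lists."""
    return [
        ["uvicorn", "web:app", "--host", "0.0.0.0", "--port", "8000"],
        ["uvicorn", "webx:app", "--host", "0.0.0.0", "--port", "7000"],
        ["python3", "build_full_index.py"],
    ]


def start_services(main_dir: str):
    """Start every service with `main_dir` as the working directory."""
    procs = []
    for argv in service_commands():
        # Popen returns immediately, so the services run side by side.
        proc = subprocess.Popen(argv, cwd=main_dir)
        print(f"launched {' '.join(argv)} (pid {proc.pid})")
        procs.append(proc)
    return procs
```

Calling start_services("main") from the extracted directory would start all three processes; calling terminate() on each returned process object stops them again.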


IX. Excel data analysis
1. Analysis library
pip install polars
2. Libraries for reading and writing Excel files
pip install fastexcel
pip install xlsxwriter

X. Poster generation
pip install --upgrade Pillow
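
After working through all the sections, it can help to confirm that the installed packages are importable. The sketch below is an assumption-laden helper, not part of k3GPT: missing_packages is a hypothetical name, and the mapping from pip names to import names is partial and should be verified against each package's documentation (python-multipart and dotenv are left out on purpose).

```python
from importlib import util

# pip distribution name -> top-level import name (assumed, partial mapping).
IMPORT_NAMES = {
    "tika": "tika",
    "smbprotocol": "smbprotocol",
    "paramiko": "paramiko",
    "langchain": "langchain",
    "pypandoc": "pypandoc",
    "pdfkit": "pdfkit",
    "lxml": "lxml",
    "python-pptx": "pptx",
    "xapian-bindings-binary": "xapian",
    "jieba": "jieba",
    "peewee": "peewee",
    "fastapi": "fastapi",
    "uvicorn": "uvicorn",
    "gunicorn": "gunicorn",
    "qwen-agent": "qwen_agent",
    "polars": "polars",
    "fastexcel": "fastexcel",
    "xlsxwriter": "xlsxwriter",
    "Pillow": "PIL",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the pip names whose import module cannot be located."""
    return sorted(
        dist for dist, module in import_names.items()
        if util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    print("all packages found" if not missing else "missing: " + ", ".join(missing))
```

find_spec only locates a module without importing it, so the check is fast but does not catch packages that are installed and still fail at import time.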