File Upload / KnowledgeBase Deployment Feedback | 知识库部署问题反馈 #3527

Closed
arvinxx opened this issue Aug 21, 2024 · 159 comments · Fixed by #3607

@arvinxx
Contributor

arvinxx commented Aug 21, 2024

This issue is only for feedback on deployment problems. For feature requests or bug-related issues, please open a new issue.



The first phase of the knowledge base has been released, and here is a brief deployment guide:

  1. Create a PostgreSQL (pg) instance that includes the pgvector extension. (Note: the command below is for demonstration only, because this pg instance has no persistent storage. Please build a production-grade pg instance that meets your own requirements.)

If your pg instance is deployed via Docker, you can use the pgvector/pgvector:pg16 image. (Platforms like Vercel / Neon / Supabase need no additional setup, as they come with pgvector by default.)

# --network pg requires the Docker network to exist, so create it once first
docker network create pg
docker run --name my-postgres --network pg -e POSTGRES_PASSWORD=mysecretpassword -p 5432:5432 -d pgvector/pgvector:pg16
  2. Create a lobe-chat.env file to store environment variables:
# Website Domain Name
APP_URL=https://your-prod-domain.com

# DB Required
KEY_VAULTS_SECRET=jgwsK28dspyVQoIf8/M3IIHl1h6LYYceSYNXeLpy6uk=
DATABASE_URL=postgres://postgres:mysecretpassword@my-postgres:5432/postgres

# NEXT_AUTH related. Auth0 is used here; if you need another provider, feel free to submit a PR
NEXT_AUTH_SECRET=3904039cd41ea1bdf6c93db0db96e250
NEXT_AUTH_SSO_PROVIDERS=auth0
NEXTAUTH_URL=https://your-prod-domain.com/api/auth
AUTH0_CLIENT_ID=xxxxxx
AUTH0_CLIENT_SECRET=cSX_xxxxx
AUTH0_ISSUER=https://lobe-chat-demo.us.auth0.com

# S3 Related
S3_ACCESS_KEY_ID=xxxxxxxxxx
S3_SECRET_ACCESS_KEY=xxxxxxxxxx
S3_ENDPOINT=https://xxxxxxxxxx.r2.cloudflarestorage.com
S3_BUCKET=lobechat
S3_PUBLIC_DOMAIN=https://s3-for-lobechat.your-domain.com
  3. Start the lobe-chat-database Docker image:
docker run -it -d -p 3210:3210 --network pg --env-file lobe-chat.env --name lobe-chat-database lobehub/lobe-chat-database
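
Before starting the container, it can help to confirm that lobe-chat.env actually defines the variables the guide lists. The following is a minimal sketch, not part of LobeChat: check_env_file is a hypothetical helper, and its key list is an assumption taken from the sample file above (the AUTH0_* keys are left out because they depend on the chosen provider).

```shell
# Hypothetical pre-flight helper (not part of LobeChat).
# Key list is an assumption that mirrors the sample lobe-chat.env in this guide.
REQUIRED_KEYS="APP_URL KEY_VAULTS_SECRET DATABASE_URL NEXT_AUTH_SECRET NEXT_AUTH_SSO_PROVIDERS NEXTAUTH_URL S3_ACCESS_KEY_ID S3_SECRET_ACCESS_KEY S3_ENDPOINT S3_BUCKET S3_PUBLIC_DOMAIN"

# Prints every required key that is missing or empty in the env file "$1".
# Returns 0 if all keys are set, 1 otherwise.
check_env_file() {
  local file="$1" key missing=0
  for key in $REQUIRED_KEYS; do
    # A key counts as set only if "KEY=" is followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_env_file lobe-chat.env before the docker run command lists any keys that still need a value and returns a non-zero status if one is missing.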


Other considerations:

  • Vectorization requires OpenAI's text-embedding-3-small model. Please confirm in advance that your API supports this model.
  • If chunking fails because of an unsupported chunk type, you can report it at Better File Chunk | 更加强大的文件分块 #3550.
  • When using MinIO, add S3_ENABLE_PATH_STYLE=1 to the environment variables, so that buckets are addressed as <endpoint>/<bucket> rather than <bucket>.<endpoint>.
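
One way to confirm in advance that your API supports text-embedding-3-small is to send a single tiny request to the embeddings endpoint and look at the HTTP status. This is only a sketch: the function names and the OPENAI_BASE_URL / OPENAI_API_KEY shell variables are assumptions made for this example (they are not presented here as LobeChat settings), and the base URL defaults to OpenAI's public API; point it at your own proxy if you use one.

```shell
# Hypothetical helper names; not part of LobeChat.

# Build the JSON body for an embeddings request.
# $1 = model name, $2 = text to embed (no JSON escaping is done, so keep
# the text free of quotes and backslashes).
embedding_request_body() {
  printf '{"model":"%s","input":"%s"}' "$1" "$2"
}

# Send one small embeddings request and succeed only on HTTP 200.
check_embedding_model() {
  local base_url="${OPENAI_BASE_URL:-https://api.openai.com/v1}"
  local status
  status="$(curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${OPENAI_API_KEY:-}" \
    -H 'Content-Type: application/json' \
    -d "$(embedding_request_body text-embedding-3-small ping)" \
    "${base_url}/embeddings")"
  echo "HTTP ${status}"
  [ "$status" = "200" ]
}
```

A non-200 status (for example 401 or 404) suggests the key or the provider behind the URL cannot serve this model, which would also make vectorization fail in the knowledge base.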

@arvinxx
Contributor Author

arvinxx commented Aug 23, 2024

I'd like to ask about an error I get after connecting to the database. Is it caused by the database migration? My tables were not generated automatically; I created them manually with the "src/database/server/migrations/0000_init.sql" file.

Yes. If you create the tables manually, you will have to run the migration SQL yourself every time the database schema changes.

I ran it again, and then logged in with OAuth 2.0. Why wasn't my user information saved to the table? Do I need to write that method myself, or is it enough to define the matching table fields in profile?

@cy948 could you take a look?
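
For anyone who does apply migrations by hand, the reply above means that each new file under src/database/server/migrations has to be run against the database in order. A rough sketch under stated assumptions: psql is installed, DATABASE_URL points at your database, the migration files sort correctly by name (as 0000_init.sql suggests), and you keep track yourself of the last file you applied. is_pending and apply_migrations are hypothetical helpers, not project scripts, and nothing here records which migrations have been applied.

```shell
# Hypothetical helpers; not part of the LobeChat repository.
MIGRATIONS_DIR="${MIGRATIONS_DIR:-src/database/server/migrations}"

# Succeeds if migration file name $1 sorts strictly after $2,
# where $2 is the last file already applied (empty means "none yet").
is_pending() {
  [ -z "$2" ] && return 0
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | head -n 1)" = "$2" ]
}

# Apply, in name order, every migration that comes after "$1".
apply_migrations() {
  local last="${1:-}" f
  for f in "$MIGRATIONS_DIR"/*.sql; do
    [ -e "$f" ] || continue
    is_pending "$(basename "$f")" "$last" || continue
    echo "applying ${f}"
    # ON_ERROR_STOP makes psql exit non-zero on the first failing statement.
    psql "$DATABASE_URL" -v ON_ERROR_STOP=1 -f "$f" || return 1
  done
}
```

For example, apply_migrations 0000_init.sql would run only the files that sort after 0000_init.sql, stopping at the first one that fails.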


@mujiannan
image
This project fixes bugs remarkably fast. Image storage for DALL-E 3 in the server version now works properly too.


@asd4259682
Cannot read properties of null (reading 'toString'). I uploaded a PDF to the knowledge base and chunking had completed; this error is reported when I ask a question. This is the latest v.12.8 version, compiled, deployed, and started. image

I just tested it: the problem occurs whenever the knowledge base contains a PDF whose vectorization failed. After deleting the failed PDF, the problem went away.

I ran into a similar problem, but even after deleting every PDF that failed vectorization or chunking, the error is still reported...

@arvinxx
Contributor Author

arvinxx commented Aug 26, 2024

@asd4259682 Please open a dedicated issue for this. It is a usage problem, not a deployment problem.

@wcu1117
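wcu1117 commented Aug 26, 2024

I'd like to ask about an error I get after connecting to the database. Is it caused by the database migration? My tables were not generated automatically; I created them manually with the "src/database/server/migrations/0000_init.sql" file.

Yes. If you create the tables manually, you will have to run the migration SQL yourself every time the database schema changes.

I ran it again, and then logged in with OAuth 2.0. Why wasn't my user information saved to the table? Do I need to write that method myself, or is it enough to define the matching table fields in profile?

@cy948 could you take a look?

Could someone help with this question? Right after the authorization callback returns, the following error is reported, even though the web page already shows a successful login:
{"level":30,"time":1724662603462,"pid":15288,"hostname":"鍗庡","msg":"Error in tRPC handler (lambda) on path: user.getUserState, type: query"}
UserNotFoundError [TRPCError]: user not found
    at UserModel.getUserState (webpack-internal:///(rsc)/./src/database/server/models/user.ts:117:23)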

@zdt3476
zdt3476 commented Aug 26, 2024

Hello, I'm using Alibaba Cloud OSS for storage. Uploads work normally, but chunking fails with an error, and no more detailed error information is shown.
image

@wcu1117
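wcu1117 commented Aug 27, 2024

I'd like to ask about an error I get after connecting to the database. Is it caused by the database migration? My tables were not generated automatically; I created them manually with the "src/database/server/migrations/0000_init.sql" file.

Yes. If you create the tables manually, you will have to run the migration SQL yourself every time the database schema changes.

I ran it again, and then logged in with OAuth 2.0. Why wasn't my user information saved to the table? Do I need to write that method myself, or is it enough to define the matching table fields in profile?

@cy948 could you take a look?

Could someone help with this question? Right after the authorization callback returns, the following error is reported, even though the web page already shows a successful login:
{"level":30,"time":1724662603462,"pid":15288,"hostname":"鍗庡","msg":"Error in tRPC handler (lambda) on path: user.getUserState, type: query"}
UserNotFoundError [TRPCError]: user not found
    at UserModel.getUserState (webpack-internal:///(rsc)/./src/database/server/models/user.ts:117:23)

My current workaround is to change the code in this file: src/server/routers/lambda/user.ts

if (enableClerk && error instanceof UserNotFoundError) {

I removed the Clerk check from this condition and added handling logic for NextAuth. I don't know whether this affects later parts of the flow, but everything works normally for now.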

@lobehubbot
Member

@arvinxx

This issue is closed. If you have any questions, you can comment and reply.

@arvinxx arvinxx unpinned this issue Aug 27, 2024
@lobehubbot
Member

🎉 This issue has been resolved in version 1.14.3 🎉

The release is available on:

Your semantic-release bot 📦🚀

@zwjzxh520
Vectorization requires OpenAI's text-embedding-3-small model. Please confirm in advance that your API supports this model.

Is there a domestic (Chinese) API that could be used for vectorization instead?

@wcu1117
wcu1117 commented Sep 5, 2024

Where would I make changes in order to calculate the total token usage of a given user?

@lobehub lobehub locked as resolved and limited conversation to collaborators Sep 6, 2024