How to get the position and content of the regions in the layout_detection rendered image #199

Open
Tsuki95 opened this issue Jan 3, 2025 · 1 comment

Comments

Tsuki95 commented Jan 3, 2025

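Hi, I have a question. layout_detection outputs PNG images rendered from the split PDF pages, but before those images are generated I would like to get the position of each region (figure, figure content, table, plain text, etc.) and also the content of each region: the text for text regions, and an image file (JPG, PNG, etc.) for figures. How can this be done? Thanks.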

@JulioZhao97
Collaborator

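@Tsuki95 Hello. You can first run the layout_detection module to get the category and layout information of each detected region. Then, for image-type regions, crop the region out of the page image; for text-type regions, run an OCR tool on the region to extract the text directly.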
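
The crop-then-OCR flow described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the project's actual API: the detection format (a list of dicts with `label` and `bbox` as `[x0, y0, x1, y1]` in pixels), the `TEXT_LABELS` set, the `Region` class, and the `extract_regions` helper are all hypothetical names, and the real layout_detection output may be shaped differently. The page only needs a `crop((left, upper, right, lower))` method, which a Pillow `Image` provides, and `ocr` is any callable you supply that turns a cropped image into a string.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Protocol, Sequence


class Croppable(Protocol):
    """Anything with a Pillow-style crop((left, upper, right, lower)) method."""

    def crop(self, box: tuple[int, int, int, int]) -> Any: ...


# Assumption: which labels count as text. Adjust to the labels your model emits.
TEXT_LABELS = {"plain text"}


@dataclass
class Region:
    label: str
    bbox: tuple[int, int, int, int]  # (x0, y0, x1, y1) in page pixels
    image: Any                       # cropped region, e.g. a PIL image
    text: Optional[str] = None       # filled only for text regions when OCR is given


def extract_regions(
    page: Croppable,
    detections: Sequence[dict],
    ocr: Optional[Callable[[Any], str]] = None,
) -> list[Region]:
    """Crop every detected region; run OCR on the ones labelled as text."""
    regions = []
    for det in detections:
        # Detectors usually return float coordinates; crop needs integers.
        x0, y0, x1, y1 = (int(round(v)) for v in det["bbox"])
        crop = page.crop((x0, y0, x1, y1))
        region = Region(label=det["label"], bbox=(x0, y0, x1, y1), image=crop)
        if det["label"] in TEXT_LABELS and ocr is not None:
            region.text = ocr(crop)
        regions.append(region)
    return regions
```

With Pillow, `page` would be something like `Image.open("page_1.png")`, and each figure crop can then be written out with `region.image.save(...)`. The OCR callable is left to the caller so that any OCR tool can be plugged in.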
