How to get the position and content of each region in the layout_detection rendered output #199

Tsuki95 · 2025-01-03T03:15:28Z

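Hi, I have a question. layout_detection outputs PNG images rendered from the split PDF pages, but before those images are generated I would like to get the position of each region in them (figure, figure content, table, plain text, etc.) together with its content: the text for text regions, and an image file (JPG, PNG, etc.) for figures. How can I do this? Thanks.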

JulioZhao97 · 2025-01-10T09:15:21Z

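@Tsuki95 Hello, you can first use the layout_detection module to get the category and layout information of each detected region. Then, for image-type regions, crop the region out of the page image; for text-type regions, run an OCR tool on the region to extract the text directly.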
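The crop-then-OCR flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's own API: the detection format (a list of dicts with `label` and `box` keys), the `TEXT_LABELS` set, the `Region` type, and the `extract_regions` helper are all hypothetical names, and the actual output structure of layout_detection should be checked and adapted. The page object only needs a `size` attribute and a `crop(box)` method, which is what a Pillow `Image` provides.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# Hypothetical label names; replace with the classes your layout model emits.
TEXT_LABELS = {"title", "plain text", "figure caption", "table caption"}


@dataclass
class Region:
    label: str
    box: Box
    text: Optional[str] = None  # filled in for text-type regions
    image: Any = None           # cropped image for every other region


def clamp_box(box, width: int, height: int) -> Box:
    """Round a box to integers and clip it to the page bounds."""
    left, top, right, bottom = (int(round(v)) for v in box)
    left = max(0, min(left, width))
    right = max(0, min(right, width))
    top = max(0, min(top, height))
    bottom = max(0, min(bottom, height))
    return (left, top, right, bottom)


def extract_regions(
    page: Any,
    detections: List[Dict[str, Any]],
    ocr: Callable[[Any], str],
) -> List[Region]:
    """Crop each detected region; OCR text regions, keep crops for the rest.

    `page` is assumed to behave like a Pillow image (`size`, `crop`).
    `detections` is an assumed format: [{"label": str, "box": (l, t, r, b)}].
    `ocr` is any callable that maps a cropped image to a string.
    """
    width, height = page.size
    regions = []
    for det in detections:
        box = clamp_box(det["box"], width, height)
        crop = page.crop(box)
        if det["label"] in TEXT_LABELS:
            regions.append(Region(det["label"], box, text=ocr(crop)))
        else:
            regions.append(Region(det["label"], box, image=crop))
    return regions
```

With Pillow installed, `page` could be the result of `Image.open(...)` on a rendered page, each `region.image` could then be written out with its `save(...)` method as PNG or JPG, and `ocr` could be any OCR function you already use, for example `pytesseract.image_to_string`. If the detector reports boxes in a different coordinate system or scale than the page image, they would need to be converted before cropping.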
