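feat: get pdf2zh translation working inside Docker + batch translation

- Dockerfile: libgl1-mesa-glx→libgl1, libglib2.0-0→libglib2.0-0t64
- batch_docker.py: batch translation script that runs inside the container (thread=2)
- Pre-download the model into the models/ directory and include it in the image
- Fix the inconsistencies in the index template caused by repeated edits
- Footer: spdis link + DeepSeek maintenance note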
@@ -3,6 +3,8 @@ FROM python:3.11-slim
 RUN apt-get update && apt-get install -y --no-install-recommends \
     poppler-utils \
     libgomp1 \
+    libgl1 \
+    libglib2.0-0t64 \
     && rm -rf /var/lib/apt/lists/*
 
 WORKDIR /app
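One way to confirm that the two added packages actually provide the shared libraries is to try loading them inside the built container. The following is a minimal sketch, not part of the commit: `check_shared_libs` is a hypothetical helper, and the sonames `libGL.so.1` and `libglib-2.0.so.0` are assumed to be the ones these Debian packages ship.

```python
import ctypes

# Assumed sonames: libgl1 is expected to ship libGL.so.1 and
# libglib2.0-0t64 is expected to ship libglib-2.0.so.0.
REQUIRED_LIBS = ("libGL.so.1", "libglib-2.0.so.0")


def check_shared_libs(names=REQUIRED_LIBS):
    """Return {soname: bool}, True when the dynamic loader can open the library."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in check_shared_libs().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Running this as a script in the image prints one line per library, so a missing package shows up before the first translation job starts rather than as an import error halfway through a batch.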
@@ -52,7 +52,7 @@ for arxiv_id in papers:
     translate(
         [pdf_path], output=TRANSLATED_DIR,
         lang_in='en', lang_out='zh',
-        service='deepseek', thread=4, model=model,
+        service='deepseek', model=model,
     )
     mono = os.path.join(TRANSLATED_DIR, f"{arxiv_id}-mono.pdf")
     dual = os.path.join(TRANSLATED_DIR, f"{arxiv_id}-dual.pdf")
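The loop in the hunk above, combined with the `thread=2` setting the commit message gives for batch_docker.py, could be sketched roughly as follows. This is an illustration under assumptions, not the script from the commit: `batch_translate` is a hypothetical name, the input naming `<arxiv_id>.pdf` is assumed, and the translate function is passed in as a parameter so the sketch runs without pdf2zh installed.

```python
import os


def batch_translate(papers, pdf_dir, out_dir, translate_fn, thread=2, **kwargs):
    """Translate each paper, skipping IDs whose mono and dual PDFs already exist.

    translate_fn is injected; inside the container it would presumably be
    pdf2zh's translate (assumption). Extra kwargs such as model= pass through.
    Returns (done, skipped, failed).
    """
    os.makedirs(out_dir, exist_ok=True)
    done, skipped, failed = [], [], []
    for arxiv_id in papers:
        # Assumed input layout: one PDF per arXiv ID in pdf_dir.
        pdf_path = os.path.join(pdf_dir, f"{arxiv_id}.pdf")
        mono = os.path.join(out_dir, f"{arxiv_id}-mono.pdf")
        dual = os.path.join(out_dir, f"{arxiv_id}-dual.pdf")
        if os.path.exists(mono) and os.path.exists(dual):
            skipped.append(arxiv_id)  # already translated on an earlier run
            continue
        try:
            translate_fn(
                [pdf_path], output=out_dir,
                lang_in="en", lang_out="zh",
                service="deepseek", thread=thread, **kwargs,
            )
        except Exception as exc:  # keep going so one bad PDF does not stop the batch
            failed.append((arxiv_id, str(exc)))
            continue
        done.append(arxiv_id)
    return done, skipped, failed
```

Checking for the `-mono.pdf` and `-dual.pdf` outputs before calling the translator makes the batch safe to re-run after an interruption, and collecting failures instead of raising lets the remaining papers finish.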
BIN
models/doclayout_yolo_docstructbench_imgsz1024.onnx
Normal file
Binary file not shown.