@wantbook-book

Issue resolved by this Pull Request:
Resolves #2465

Introduced a reusable retry-enabled HTTP helper (docling.utils.http_client) and refactored all existing remote requests calls to use it. This makes download helpers, HTML backend image fetches, and OpenAI-compatible image requests more resilient to transient errors and rate limiting without duplicating retry configuration.
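For readers who have not opened the diff, a retry-configured `requests` session along these lines can be sketched as follows. This is a minimal illustration, not the code in docling.utils.http_client: the function name `build_retry_session`, the default values, and the status list are assumptions made for the example.

```python
from typing import Collection

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Hypothetical defaults; the values chosen in the PR are not shown in this thread.
DEFAULT_STATUS_FORCELIST = (429, 500, 502, 503, 504)


def build_retry_session(
    total: int = 3,
    backoff_factor: float = 0.5,
    status_forcelist: Collection[int] = DEFAULT_STATUS_FORCELIST,
) -> requests.Session:
    """Return a Session whose HTTP and HTTPS adapters share one retry policy."""
    # urllib3's Retry applies status-based retries only to its default set of
    # idempotent methods (GET, HEAD, ...), so POST is not retried on a 5xx
    # response unless allowed_methods is configured explicitly.
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Mount the same adapter for both schemes so every call shares the settings.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

A call site would then use `build_retry_session().get(url, stream=True)` in place of `requests.get(url, stream=True)`, which is how one session can cover both the sync and streaming paths.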

Checklist:

  • Documentation has been updated, if necessary.
  • Examples have been added, if necessary.
  • Tests have been added, if necessary.

Introduce docling.utils.http_client with a retry-configured Session and helper so remote calls share the same backoff settings. Refactor image download, API image requests, and HTML backend remote fetches to use the helper, enabling consistent retries for both sync and streaming paths without changing call sites’ behavior.

Signed-off-by: akai <[email protected]>
Update docling.utils.http_client to type status_forcelist as Collection[int] in the retry builder/session helper so it matches the signature of urllib3's Retry, which requests uses for its adapters.

Signed-off-by: akai <[email protected]>
@github-actions
Contributor

github-actions bot commented Dec 6, 2025

DCO Check Passed

Thanks @wantbook-book, all your commits are properly signed off. 🎉

@dosubot

dosubot bot commented Dec 6, 2025

Related Documentation

Checked 5 published document(s) in 1 knowledge base(s). No updates required.


@mergify

mergify bot commented Dec 6, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
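To check a title locally before pushing, the rule above can be tried with Python's `re` module. This is a rough sketch that assumes the pattern behaves the same under Python's regex engine as in Mergify; the helper name `is_conventional` and the sample titles are made up for illustration.

```python
import re

# The pattern is copied verbatim from the merge protection rule above.
TITLE_RULE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return TITLE_RULE.match(title) is not None


# Hypothetical titles, for illustration only.
print(is_conventional("feat: add retry-enabled HTTP helper"))    # True
print(is_conventional("fix(http)!: retry on 429"))               # True
print(is_conventional("Add retry-enabled HTTP helper"))          # False
```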

@codecov

codecov bot commented Dec 8, 2025

Codecov Report

❌ Patch coverage is 51.51515% with 16 lines in your changes missing coverage. Please review.

Files with missing lines              Patch %   Lines
docling/utils/http_client.py          52.00%    12 Missing ⚠️
docling/utils/api_image_request.py    50.00%    2 Missing ⚠️
docling/backend/html_backend.py       50.00%    1 Missing ⚠️
docling/utils/utils.py                50.00%    1 Missing ⚠️


@cau-git
Contributor

cau-git commented Dec 8, 2025

@wantbook-book Many thanks for the effort, this definitely looks useful. However, some tests are failing. Could you please review those?

To run locally:

uv run pytest tests/ -s -v # Note: some tests may fail related to OCR/webp if you don't have TESSDATA_PREFIX env set up. The tests failing in CI are not related to these.

@wantbook-book
Author

I ran uv run pytest tests/ -s -v locally, and a lot of TypeError: StringConstraints() takes no arguments errors appeared, for example:

==================================== ERRORS ====================================
________________ ERROR collecting tests/test_asr_mlx_whisper.py ________________
tests/test_asr_mlx_whisper.py:12: in <module>
    from docling.datamodel.asr_model_specs import (
docling/datamodel/asr_model_specs.py:9: in <module>
    from docling.datamodel.pipeline_options_asr_model import (
docling/datamodel/pipeline_options_asr_model.py:8: in <module>
    from docling.datamodel.pipeline_options_vlm_model import (
docling/datamodel/pipeline_options_vlm_model.py:4: in <module>
    from docling_core.types.doc.page import SegmentedPage
.venv/lib/python3.9/site-packages/docling_core/types/__init__.py:8: in <module>
    from docling_core.types.doc.document import DoclingDocument
.venv/lib/python3.9/site-packages/docling_core/types/doc/__init__.py:9: in <module>
    from .document import (
.venv/lib/python3.9/site-packages/docling_core/types/doc/document.py:39: in <module>
    from docling_core.search.package import VERSION_PATTERN
.venv/lib/python3.9/site-packages/docling_core/search/package.py:23: in <module>
    class Package(BaseModel, extra="forbid"):
.venv/lib/python3.9/site-packages/docling_core/search/package.py:30: in Package
    version: Annotated[str, StringConstraints(strict=True, pattern=VERSION_PATTERN)] = (
E   TypeError: StringConstraints() takes no arguments

Do you have any suggestions for how to fix this?


Development

Successfully merging this pull request may close these issues.

implement retry strategy for remote calls
