maintenance: Performance and Logging enhancements #76

Open
bencap wants to merge 2 commits into mavedb-dev from maintenance/bencap/logging-and-performance-enhancements

bencap commented Feb 19, 2026

This pull request migrates the codebase from using the requests library to httpx for HTTP requests, both in the main application and in the test suite. Additionally, it improves error logging throughout the mapping and data retrieval code, and updates the testing dependencies to use respx instead of requests-mock.

When querying the Ensembl API, the requests library took 30-70s per request, while curl returned results in less than a second. This migration reduces Ensembl wait times and speeds up mapping jobs.

HTTP library migration:

  • Replaced all usage of requests with httpx across the codebase, including imports, request calls, exception handling, and documentation.
  • Updated exception handling to use httpx.HTTPStatusError and httpx.HTTPError where appropriate, replacing the requests equivalents.

Error handling and logging improvements:

  • Added detailed logging for various error cases in map_scoreset to aid debugging and operational visibility.
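
As a rough illustration of this kind of error logging, the sketch below logs a failed step together with an identifier for the item being processed. It is not the actual `map_scoreset` code: `map_scoreset_sketch`, `align`, `AlignmentError`, and the URN strings are all hypothetical names chosen for the example.

```python
import logging

logger = logging.getLogger("mapping_sketch")


class AlignmentError(Exception):
    """Hypothetical error type standing in for a failed mapping step."""


def align(urn: str) -> str:
    # Hypothetical step that fails for one input so the error path is exercised.
    if urn.endswith("bad"):
        raise AlignmentError("no alignment found")
    return f"aligned:{urn}"


def map_scoreset_sketch(urn: str) -> dict:
    """Run one step and log failures with the score set URN attached."""
    try:
        result = align(urn)
    except AlignmentError:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Alignment failed for score set %s", urn)
        return {"urn": urn, "status": "failed"}
    logger.info("Mapped score set %s", urn)
    return {"urn": urn, "status": "ok", "result": result}
```

Including the identifier in every message makes it possible to trace a failed job back to its input without reproducing the run.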

Testing and dependency updates:

  • Switched test dependencies from requests-mock to respx for mocking httpx requests in tests.
  • Updated test code to use respx and httpx in place of requests-mock and requests.

Supporting changes for HTTPX:

  • Updated streaming and chunked download logic to use httpx methods such as iter_bytes instead of requests' iter_content.
  • Updated function signatures and docstrings to reflect the switch to httpx types and exceptions.

Dependency management:

  • Added httpx~=0.28 to the main dependencies in pyproject.toml and removed requests.
  • Added respx to the test dependencies and removed requests-mock.

urllib3 2.6.3 seems to have a regression causing requests-based calls to
rest.ensembl.org to time out at 30-70s while curl and httpx complete
in ~0.8s. The root cause is in urllib3's connection handling layer;
switching to httpx bypasses it entirely and restores expected latency.

Rather than patching only the Ensembl calls, all HTTP call sites are
migrated to ensure consistent performance throughout the codebase
(MaveDB API, UCSC genome download, cdot transcript lookups, UniProt).

resource_utils.py already used httpx for request_with_backoff; the
remaining requests.get calls and the streaming download in http_download
are updated to match. Test infrastructure is updated from requests-mock
to respx and exception types are updated to their httpx equivalents
(HTTPStatusError, ConnectError, TimeoutException, HTTPError).

bencap requested review from jstone-dev and sallybg on February 19, 2026 17:41