maintenance: Performance and Logging enhancements #76
Open
bencap wants to merge 2 commits into mavedb-dev
urllib3 2.6.3 seems to have a regression causing requests-based calls to rest.ensembl.org to time out at 30-70s while curl and httpx complete in ~0.8s. The root cause is in urllib3's connection handling layer; switching to httpx bypasses it entirely and restores expected latency. Rather than patching only the Ensembl calls, all HTTP call sites are migrated to ensure consistent performance throughout the codebase (MaveDB API, UCSC genome download, cdot transcript lookups, UniProt). resource_utils.py already used httpx for request_with_backoff; the remaining requests.get calls and the streaming download in http_download are updated to match. Test infrastructure is updated from requests-mock to respx and exception types are updated to their httpx equivalents (HTTPStatusError, ConnectError, TimeoutException, HTTPError).
This pull request migrates the codebase from the `requests` library to `httpx` for HTTP requests, both in the main application and in the test suite. It also improves error logging throughout the mapping and data retrieval code, and updates the testing dependencies to use `respx` instead of `requests-mock`.

When querying the Ensembl API, the `requests` library was taking 30-70s per request, while curl returned results in less than a second. This migration reduces Ensembl wait times and speeds up mapping jobs.

HTTP library migration:
- Replaced `requests` with `httpx` across the codebase, including imports, request calls, exception handling, and documentation [1] [2] [3] [4] [5] [6] [7] [8].
- Exception handling now uses `httpx.HTTPStatusError` and `httpx.HTTPError` where appropriate, replacing the `requests` equivalents [1] [2] [3] [4] [5] [6] [7] [8] [9] [10].

Error handling and logging improvements:
- Improved error logging in `map_scoreset` to aid debugging and operational visibility [1] [2] [3] [4] [5] [6] [7].

Testing and dependency updates:
- Switched from `requests-mock` to `respx` for mocking `httpx` requests in tests [1] [2].
- Tests now use `respx` and `httpx` in place of `requests-mock` and `requests` [1] [2].

Supporting changes for HTTPX:
- Streaming code now uses `httpx` methods such as `iter_bytes` instead of `requests`' `iter_content`.
- Updated code to use `httpx` types and exceptions [1] [2].

Dependency management:
- Added `httpx~=0.28` to the main dependencies in `pyproject.toml` and removed `requests`.
- Added `respx` to the test dependencies and removed `requests-mock`.