feat(spp_api_v2): add outgoing API log model, service, and auditor security#38
feat(spp_api_v2): add outgoing API log model, service, and auditor security#38
Conversation
Summary of ChangesHello @jeremi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the API auditing capabilities by introducing a new system for logging outgoing API calls. It provides a detailed record of external interactions, crucial for troubleshooting, compliance, and security analysis. The changes also include a specialized auditor role with restricted access to sensitive log data, ensuring that only authorized personnel can view detailed request and response payloads. This feature set improves the overall observability and security posture of the API V2 module. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive feature for logging outgoing API calls, including a new model, service, views, and security configurations, and adds a new 'Auditor' role to control access to sensitive log data. While the implementation is robust, well-structured, and thoroughly tested, and the code is clean, there are two medium-severity security concerns related to insecure data handling. The promised 'automatic payload masking' is missing from the service implementation, leading to the potential logging of unmasked secrets. Additionally, the full URL of outgoing calls is logged and exposed to low-privileged users, which could leak sensitive information if secrets are passed in query parameters. Addressing these security issues is crucial to prevent data leakage from the logging system itself. There is also one minor suggestion for improving code style in a new test file.
| truncated_request = self._truncate_payload(request_summary) | ||
| truncated_response = self._truncate_payload(response_summary) |
There was a problem hiding this comment.
The OutgoingApiLogService is missing the 'automatic payload masking' functionality mentioned in the pull request description. The current implementation only performs truncation of large payloads via _truncate_payload, but it does not redact sensitive information such as credentials, tokens, or PII from the request and response summaries. Storing unmasked sensitive data in the database, even if restricted to auditors, poses a security risk. Please implement a masking mechanism to redact sensitive keys (e.g., 'Authorization', 'password', 'token') before logging.
| url = fields.Char( | ||
| required=True, | ||
| index=True, | ||
| help="Full URL called", | ||
| ) |
There was a problem hiding this comment.
The url field stores the full URL of outgoing API calls, which may contain sensitive information such as API keys or tokens in query parameters. Unlike the payload fields, the url field is not restricted to the group_api_v2_auditor group, making it visible to any user with 'viewer' access to the logs. Consider adding groups="spp_api_v2.group_api_v2_auditor" to this field or implementing logic to strip sensitive query parameters before logging.
| ) | ||
|
|
||
| # Build a payload whose JSON serialization is exactly max_length | ||
| import json |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 19.0 #38 +/- ##
==========================================
- Coverage 71.31% 68.97% -2.35%
==========================================
Files 299 400 +101
Lines 23618 32081 +8463
==========================================
+ Hits 16844 22128 +5284
- Misses 6774 9953 +3179
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Restrict sensitive PII fields in API logs to a dedicated group_api_v2_auditor group. Regular viewers can see log metadata (timestamp, endpoint, status, duration) but not payloads or error details. - Add privilege_api_v2_auditor and group_api_v2_auditor - Auditor implies Viewer; Admin implies Auditor - Restrict outgoing log: request_summary, response_summary, error_detail - Restrict audit log: search_parameters, fields_returned, extensions_returned, ip_address, user_agent, error_detail - Hide payload pages and error_detail in outgoing log form view - Add field-level security tests for both models
Wire up the spp.api.outgoing.log model with manifest registration, ACL rules (viewer read-only, officer/manager read+write), the OutgoingApiLogService wrapper with payload truncation and error resilience, and an "Outgoing API Logs" menu item under API V2.
… API log - Add groups restriction to url field and XML view to limit access to auditors - Add _sanitize_url method to strip sensitive query parameters before logging - Call _sanitize_url in log_call so stored URLs never contain tokens or keys - Add tests covering _sanitize_url and end-to-end URL sanitization in log_call
…urity leak The _compute_display_name method was falling back to record.url when endpoint was not set. Since url has groups="spp_api_v2.group_api_v2_auditor" but display_name is store=True with no groups restriction, the URL value was being persisted into an unrestricted field, bypassing field-level security. Also adds url to @api.depends implicitly by removing the reference entirely. Replace the url fallback with a generic "API Call" string.
ff1b282 to
6438991
Compare
- Use sudo() when accessing OAuth2 credentials in data source - Use sudo() when caching OAuth2 token (write to restricted model) - Add _after_submit() hook to auto-approve CR on submit if DCI verified This enables users like demo_officer to use DCI birth verification without requiring administrator privileges.
Add a dedicated method for system-initiated approvals that bypasses user permission checks. This enables automated approval workflows triggered by system events (e.g., DCI verification match) where there is no human approver. Also adds security control to invalidate verification when verified fields (name, DOB, gender, BRN) are edited after verification.
Instrument _make_request to log all outgoing calls via spp.api.outgoing.log (soft dependency). Logs persist in a separate cursor so they survive transaction rollback on UserError. Captures timing, status codes, request/response payloads, and error details for success, HTTP errors, connection errors, timeouts, and 401 retries.
…oval and narrow exception handling Rename action_approve_system to _action_approve_system to prevent Odoo from exposing it via the RPC interface, since it uses sudo() and skips _check_can_approve() authorization checks. Replace broad Exception catches in spp_dci_client with specific exception types (json.JSONDecodeError, KeyError, TypeError) and add json import.
…d fix test log ordering Pass auto=True to _do_approve in _action_approve_system so automated approvals show "(auto)" in activity feedback and skip submitter notification, matching the intent of system-initiated approvals. Fix reversed log assertions in test_401_retry_creates_two_log_entries: due to try-finally semantics with recursive calls, the inner retry's finally creates the success log first (lower ID), so with order="id desc" logs[0] is the 401 entry and logs[1] is the success entry.
Features: - Birth verification via OpenCRVS DCI integration - Auto-approval when DCI data matches CR detail fields - Auto-enrollment of household in configured program on CR apply - Add Child wizard for streamlined UX - Verified BRN registry ID created on apply Technical: - Add search_by_id_opencrvs method for OpenCRVS-specific format - Extract DCI verification logic to utils module - Add system parameters for configuration - Add post_init_hook for auto-configuration Note: DCI data source credentials must be configured manually via Settings > Technical > System Parameters or UI.
…exists Add computed single_dci_data_source field to auto-hide the DCI data source dropdown when there is zero or one active Civil registry, removing an unnecessary selection step from the UI.
…after_submit The DCI demo wizard adds no value over the standard CR flow now that detail forms have Submit buttons. Remove the wizard and its tests, and remove the duplicated _try_auto_approve() from the detail model. Auto- approval now only happens via the _after_submit() hook on the CR model.
…emo household - Add Masters household (Adam + Mary) as demo data for DCI CR flow - Hide Contact Information and Relationship fields in add-member form - Target Conditional Child Grant program in post_init_hook - Add Conditional Child Grant program with first-1,000-days eligibility - Add Health Visit event type for compliance tracking - Configure compliance manager with CEL expression support
…add system parameter Invalidate household group_membership_ids cache before iterating so the newly created child membership is included in auto-enrollment. Add missing spp_dci_demo.default_crvs_data_source system parameter with an empty default to system_parameters.xml so the parameter is defined on module install.
emjay0921
left a comment
There was a problem hiding this comment.
Reviewed code and tested manually on a running instance. Architecture is clean — model stores, service sanitizes (masking, URL sanitization, truncation). Auditor group field-level security verified working via XML-RPC.
Minor items for follow-up:
- Model
log_calldocstring saysrequest_summary: Request payload (secrets redacted)but the model doesn't redact — only the service does. Misleading docstring. error_detailis not sanitized by the service layer — could leak internal infrastructure details (IPs, hostnames). The field is gated by auditor group in the UI, but consider sanitizing at the service level too.
Neither is a blocker. Good to merge.
emjay0921
left a comment
There was a problem hiding this comment.
3 deterministic test failures that will break the 19.0 build
The new outgoing log feature looks good overall (666/669 tests pass), but 3 new tests have bugs that will fail on every CI run after merge. These must be fixed before merging.
1. test_display_name_falls_back_to_url (test_api_outgoing_log.py:106)
Test expects the URL in display_name when endpoint is empty:
self.assertIn("https://example.org/api/test", log.display_name)But the model (api_outgoing_log.py:145) uses "API Call" as the fallback:
record.display_name = f"{record.http_method} {record.endpoint or 'API Call'} @ {timestamp_str}"Fix: Either update the test to assert "API Call" or change _compute_display_name to fall back to self.url instead of "API Call".
2. test_log_call_sanitizes_url (test_outgoing_api_log_service.py:366)
Test expects literal ***MASKED*** in the stored URL:
self.assertIn("***MASKED***", result.url)But _sanitize_url uses urlencode() which percent-encodes * → %2A, so the actual stored value is api_key=%2A%2A%2AMASKED%2A%2A%2A.
3. test_sanitize_url_masks_sensitive_params (test_outgoing_api_log_service.py:323)
Same urlencode issue as #2.
Fix for #2 and #3: Use urllib.parse.quote(safe='*') when building the sanitized URL, or change MASK_VALUE to something URL-safe (e.g., MASKED), or update the assertions to account for percent-encoding.
Everything else looks solid — the model, service, auditor security, and masking logic are well-designed. Just these 3 test/implementation mismatches need to be resolved.
feat(spp_dci_client): DCI client improvements and audit trail logging
feat(spp_dci_demo): add DCI birth verification demo module and Conditional Child Grant
| "Birth verification for BRN %s completed with status: %s, data_match: %s", | ||
| self.birth_registration_number, | ||
| verification_status, | ||
| data_matches, |
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 3 days ago
General approach: stop logging raw or easily re‑identifiable personal data. Instead, log only high‑level indicators (e.g., which fields mismatched, not their values) or counts. Ensure any utility that builds diagnostic messages for logging does so in a privacy‑preserving way.
Best concrete fix in this codebase:
-
In
spp_dci_demo/utils/dci_verification.py, changecheck_data_matchesso that:- It no longer embeds actual field values (names, birthdates, sex) in the
mismatcheslist. - It instead records only which field mismatched, e.g.
"given_name","family_name","birthdate","gender", or some similar generic indicator. - All comparisons still occur exactly as before, so functionality (match / mismatch decision) is unchanged; only the mismatch description strings are altered.
- It no longer embeds actual field values (names, birthdates, sex) in the
-
In
spp_dci_demo/models/cr_detail_add_member.py, the_check_data_matches_dci_responsemethod already logs:_logger.info( "DCI data mismatch for BRN %s: %s", self.birth_registration_number, "; ".join(mismatches), )
After the change above, this log will only contain field identifiers, not raw PII, so we keep the logging call unchanged. The log line that CodeQL originally points at:
_logger.info( "Birth verification for BRN %s completed with status: %s, data_match: %s", self.birth_registration_number, verification_status, data_matches, )
will still log only a boolean
data_matches; once the mismatch strings are sanitized, the taint source of PII that concerned CodeQL is removed, and this log becomes acceptable.
Files/regions to change:
spp_dci_demo/utils/dci_verification.py- Function
check_data_matches(around lines 126–170): update themismatches.append(...)calls to no longer interpolate sensitive values, and to use generic labels instead.
- Function
- No changes are required to imports or method signatures; the function interface remains the same.
This single focused change sanitizes all the variants reported: they all stem from PII used in mismatch messages built in check_data_matches and then logged.
| @@ -143,14 +143,16 @@ | ||
| cr_given_name = (given_name or "").strip().upper() | ||
| dci_given_name = person_data["given_name"] | ||
| if cr_given_name != dci_given_name: | ||
| mismatches.append(f"given_name: CR='{cr_given_name}' vs DCI='{dci_given_name}'") | ||
| # Do not log the actual values to avoid exposing PII in logs. | ||
| mismatches.append("given_name") | ||
|
|
||
| # Compare family name | ||
| if person_data.get("family_name"): | ||
| cr_family_name = (family_name or "").strip().upper() | ||
| dci_family_name = person_data["family_name"] | ||
| if cr_family_name != dci_family_name: | ||
| mismatches.append(f"family_name: CR='{cr_family_name}' vs DCI='{dci_family_name}'") | ||
| # Record only the field name, not the concrete values. | ||
| mismatches.append("family_name") | ||
|
|
||
| # Compare birth date | ||
| if person_data.get("birth_date") and birthdate: | ||
| @@ -158,13 +154,13 @@ | ||
| dci_birthdate = person_data["birth_date"] | ||
| # Handle different date formats (YYYY-MM-DD) | ||
| if cr_birthdate[:10] != dci_birthdate[:10]: | ||
| mismatches.append(f"birthdate: CR='{cr_birthdate}' vs DCI='{dci_birthdate}'") | ||
| mismatches.append("birthdate") | ||
|
|
||
| # Compare gender/sex | ||
| if person_data.get("sex") and gender_display: | ||
| cr_gender = gender_display.lower() | ||
| dci_sex = person_data["sex"].lower() | ||
| if cr_gender != dci_sex: | ||
| mismatches.append(f"gender: CR='{cr_gender}' vs DCI='{dci_sex}'") | ||
| mismatches.append("gender") | ||
|
|
||
| return (len(mismatches) == 0, mismatches) |
| _logger.info( | ||
| "DCI data mismatch for BRN %s: %s", | ||
| self.birth_registration_number, | ||
| "; ".join(mismatches), |
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 3 days ago
In general, the problem should be fixed by ensuring that logs do not contain raw sensitive personal data. Instead of logging full field values (names, birthdates, gender) and differences, logs should either (a) only state that mismatches exist (possibly with a count), or (b) log only non-sensitive, high-level metadata (e.g., which categories mismatched, or a generic reason code).
For this specific case, the most targeted fix that preserves functionality is:
- Change
check_data_matchesso that it no longer constructs mismatch messages containing the actual values from CR/DCI. Instead, it should record only which fields mismatch, for example"given_name","family_name","birthdate","gender". This keeps the boolean outcome and a list of which fields differed, which is likely sufficient for debugging and for the caller to react, without exposing the actual PII. - Keep the
_check_data_matches_dci_responselogging incr_detail_add_member.py, but adjust it to log only a generic description built from the new, non-sensitive mismatch markers (e.g.,"Fields mismatched: given_name, birthdate"). Sincemismatcheswill then only contain field names,"; ".join(mismatches)will no longer embed sensitive values. - No changes are required for imports or external dependencies; we only adjust string construction.
Concretely:
- In
spp_dci_demo/utils/dci_verification.py, editcheck_data_matchesso that it appends e.g."given_name","family_name","birthdate","gender"rather than formatted strings with CR/DCI values. - In
spp_dci_demo/models/cr_detail_add_member.py, keep the logging but rely on the now redactedmismatchescontent (no sensitive data) so the existing"; ".join(mismatches)becomes safe. The rest of the method remains unchanged.
This single change in the utility function ensures that all tainted variants (birthdate, gender, sex, etc.) are no longer propagated into log messages, addressing all the CodeQL variants at once.
| @@ -294,8 +294,10 @@ | ||
| ) | ||
|
|
||
| if mismatches: | ||
| # `mismatches` contains only field identifiers (no PII), see | ||
| # spp_dci_demo.utils.dci_verification.check_data_matches. | ||
| _logger.info( | ||
| "DCI data mismatch for BRN %s: %s", | ||
| "DCI data mismatch for BRN %s (fields: %s)", | ||
| self.birth_registration_number, | ||
| "; ".join(mismatches), | ||
| ) |
| @@ -135,6 +135,11 @@ | ||
|
|
||
| Returns: | ||
| Tuple of (matches: bool, mismatches: list[str]) | ||
|
|
||
| Note: | ||
| The `mismatches` list intentionally contains only field identifiers | ||
| (for example, "given_name", "birthdate") and does not include any | ||
| concrete personal data values, to avoid logging sensitive data. | ||
| """ | ||
| mismatches = [] | ||
|
|
||
| @@ -143,14 +148,15 @@ | ||
| cr_given_name = (given_name or "").strip().upper() | ||
| dci_given_name = person_data["given_name"] | ||
| if cr_given_name != dci_given_name: | ||
| mismatches.append(f"given_name: CR='{cr_given_name}' vs DCI='{dci_given_name}'") | ||
| # Do not include actual values in mismatch details | ||
| mismatches.append("given_name") | ||
|
|
||
| # Compare family name | ||
| if person_data.get("family_name"): | ||
| cr_family_name = (family_name or "").strip().upper() | ||
| dci_family_name = person_data["family_name"] | ||
| if cr_family_name != dci_family_name: | ||
| mismatches.append(f"family_name: CR='{cr_family_name}' vs DCI='{dci_family_name}'") | ||
| mismatches.append("family_name") | ||
|
|
||
| # Compare birth date | ||
| if person_data.get("birth_date") and birthdate: | ||
| @@ -158,13 +158,13 @@ | ||
| dci_birthdate = person_data["birth_date"] | ||
| # Handle different date formats (YYYY-MM-DD) | ||
| if cr_birthdate[:10] != dci_birthdate[:10]: | ||
| mismatches.append(f"birthdate: CR='{cr_birthdate}' vs DCI='{dci_birthdate}'") | ||
| mismatches.append("birthdate") | ||
|
|
||
| # Compare gender/sex | ||
| if person_data.get("sex") and gender_display: | ||
| cr_gender = gender_display.lower() | ||
| dci_sex = person_data["sex"].lower() | ||
| if cr_gender != dci_sex: | ||
| mismatches.append(f"gender: CR='{cr_gender}' vs DCI='{dci_sex}'") | ||
| mismatches.append("gender") | ||
|
|
||
| return (len(mismatches) == 0, mismatches) |
| if "sex" in person_data: | ||
| normalized["sex"] = person_data["sex"].lower() | ||
| elif "gender" in person_data: | ||
| normalized["sex"] = person_data["gender"].lower() |
There was a problem hiding this comment.
Null values in DCI response cause AttributeError
Low Severity
extract_person_from_dci_response calls .strip().upper() on values from name.get("given_name", "") and .lower() on person_data["sex"]. If the DCI response contains explicit null values (e.g., {"given_name": null}), dict.get() returns None (not the default ""), causing AttributeError on the chained string method call. The same applies to the "sex" and "gender" fields.
| CRVS = "ns:org:RegistryType:Civil" | ||
| IBR = "ns:org:RegistryType:IBR" | ||
| DISABILITY_REGISTRY = "ns:org:RegistryType:DR" | ||
| FUNCTIONAL_REGISTRY = "ns:org:RegistryType:FR" |
There was a problem hiding this comment.
RegistryType enum values changed without database migration
High Severity
The RegistryType enum values changed from simple strings (e.g., "CRVS", "SOCIAL_REGISTRY") to namespaced strings (e.g., "ns:org:RegistryType:Civil", "ns:org:RegistryType:Social"), but there is no database migration to update existing spp.dci.data.source records. The registry_type Selection field's options come from _get_registry_types() which uses the new enum values, so any existing records with old values will have invalid selection values, appear blank in the UI, and fail lookups like get_by_registry_type().
Additional Locations (1)
- display_name: fall back to url when endpoint is empty, add url to @api.depends - _sanitize_url: use quote_via to preserve asterisks in MASK_VALUE so urlencode does not percent-encode them
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| CRVS = "ns:org:RegistryType:Civil" | ||
| IBR = "ns:org:RegistryType:IBR" | ||
| DISABILITY_REGISTRY = "ns:org:RegistryType:DR" | ||
| FUNCTIONAL_REGISTRY = "ns:org:RegistryType:FR" |
There was a problem hiding this comment.
RegistryType enum change breaks dependent modules
High Severity
The RegistryType enum values were changed from simple strings (e.g., "CRVS", "DR", "SOCIAL_REGISTRY") to namespaced strings (e.g., "ns:org:RegistryType:Civil", "ns:org:RegistryType:DR"), but not all consumer modules were updated. Notably, spp_dci_client_dr/services/dr_service.py still compares self.data_source.registry_type != "DR", which now always evaluates to True (the actual value is "ns:org:RegistryType:DR"), making DR service initialization always raise ValidationError. Similarly, spp_dci_server subscription/transaction models have Selection fields with the old values.


Summary
spp.api.outgoing.logmodel to track outgoing API requests (URL, method, headers, payload, response, status, duration)OutgoingApiLogServicefor creating and querying log entries with automatic payload maskingTest plan
test_single_module.sh spp_api_v2Note
High Risk
Touches security boundaries (new auditor privilege +
sudo()usage), adds persistence of outbound request/response payloads, and changes DCI request/envelope construction and logging behavior, so misconfiguration could leak sensitive data or break integrations.Overview
Adds a new
spp.api.outgoing.logaudit trail for outgoing HTTP calls (request/response summaries, status, duration, origin context), plus UI views/menu access under API V2.Introduces an opt-in
API V2: Auditorgroup/privilege and applies field-level security to hide sensitive log payload fields (including existingspp.api.audit.logIP/user-agent/search params/error details) from non-auditors, with updated access rules and tests.Extends the DCI client to (a) support SPDCI-compliant registry type constants and OpenCRVS-specific search/envelope formats, and (b) automatically log every outbound request via the new service using masking/truncation and a separate DB cursor so logging survives rollbacks; also tightens OAuth2 credential access via
sudo()and adds a system-only approval bypass method for automated approvals.Adds a new
spp_dci_demomodule implementing birth verification in “add member” change requests (verification UI/actions, auto-creating a verified BRN registry ID, optional auto-enrollment via system params, and demo data) with comprehensive tests.Written by Cursor Bugbot for commit f5fbed7. This will update automatically on new commits. Configure here.