fix(experiments): support /cmsHomePage vanity for experiment URL matching#35360
fix(experiments): support /cmsHomePage vanity for experiment URL matching#35360dsolistorres merged 13 commits intomainfrom
Conversation
|
Claude finished @dsolistorres's task in 4m 8s —— View job 🔍 dotCMS Backend Review
Review posted to comment #4277243445. Found 5 medium-severity findings across security, database, and Java standards domains. REST API review was clean (NO_FINDINGS). |
🔍 dotCMS Backend Review[🟡 Medium]
// line 122 — empty pattern source yields a catch-all template expansion
.map(vanity -> String.format(DEFAULT_URL_REGEX_TEMPLATE, vanity.pattern.pattern()))💡 Add a guard to skip blank pattern sources: vanityUrls.stream()
.filter(vanity -> !vanity.pattern.pattern().isEmpty())
.map(vanity -> String.format(DEFAULT_URL_REGEX_TEMPLATE, vanity.pattern.pattern()))Next steps
|
a7a3d5a to
52fbcf2
Compare
1eff62f to
bf98f5a
Compare
ce3c444 to
bd5e5a4
Compare
|
Tracked the client-side ReDoS hardening discussion as a follow-up: #35379. Surfaced during this review but scoped out of the PR — it's a cross-cutting SDK hardening decision (Worker + timeout, static regex linting, |
|
Pull Request Unsafe to Rollback!!!
|
733f154 to
122e1ad
Compare
|
Pull Request Unsafe to Rollback!!!
|
🔍 dotCMS Backend Review[🟡 Medium]
.filter(cachedVanityUrl -> cachedVanityUrl.forwardTo.equals(forward))💡 Flip the receiver: [🟡 Medium]
// "ByteBuddy advice fires on the self-invocation" ← factually incorrect
return findByForward(host, language, forward, action, false);💡 Remove the incorrect comment. Add [🟡 Medium]
public static final String LEGACY_CMS_HOME_PAGE = VanityUrlAPI.LEGACY_CMS_HOME_PAGE;💡 Add [🟡 Medium]
default List<CachedVanityUrl> findByForward(..., final boolean includeSystemHost) {
return findByForward(host, language, forward, action); // no permission guard
}💡 Consider moving the SYSTEM_HOST-broadened query to a package-private helper in [🟡 Medium]
.map(vanity -> String.format(DEFAULT_URL_REGEX_TEMPLATE, vanity.pattern.pattern()))💡 A server-side complexity heuristic (e.g. reject patterns containing Next steps
|
0694bb5 to
27afec3
Compare
…hing (#34747) VanityUrlAPIImpl.resolveVanityUrl has a legacy fallback that forwards a "/" request to the /cmsHomePage vanity. ExperimentUrlPatternCalculator did not account for this, so the generated regex only matched /cmsHomePage (not /) and variants were never served when the experiment page was reached via that fallback. Adds an integration test covering the /cmsHomePage scenario and removes an unused /cmsHomePage constant from VisitorFilter. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…face Addresses PR review feedback: the experiments package was reaching into VanityUrlAPIImpl for the LEGACY_CMS_HOME_PAGE constant, leaking an impl detail across module boundaries. Moves the constant to the VanityUrlAPI interface (matching the existing VANITY_URL_RESPONSE_HEADER pattern) and updates callers. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses PR review feedback: passing a java.util.regex.Pattern to
String.format("%s", ...) relies on implicit Pattern.toString(). Switch to
vanity.pattern.pattern() so the intent (extract the regex source string) is
explicit. Semantically a no-op.
Refs: #34747
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses PR review feedback: streams are single-use and lazy; storing them in named locals invites misuse. Inline both streams directly into Stream.concat(...) since each is consumed exactly once. Semantically a no-op. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
findByForward previously only searched the given host, unlike resolveVanityUrl which falls back to SYSTEM_HOST. As a result, a /cmsHomePage vanity published on SYSTEM_HOST (the recommended setup for cross-site home-page aliasing) would not be picked up by the Experiment URL regex calculator, and experiments on the target page would still be skipped for visitors arriving at "/". Updates findByForward to also collect vanities from SYSTEM_HOST, mirroring resolveVanityUrl's host resolution. Adds an integration test covering the SYSTEM_HOST + /cmsHomePage scenario. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CachedVanityUrl compiles each vanity's URI with Pattern.CASE_INSENSITIVE, so VanityUrlAPIImpl.resolveVanityUrl triggers the /cmsHomePage legacy fallback even when an admin created the vanity with a different casing (/cmshomepage, /CmsHomePage, etc.). The previous equals() check missed those variants and the "/" regex fallback would not be added. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ching The server-assembled vanity-URL regex preserved the admin-entered casing of each vanity URI, but the SDK tracker (parser.ts verifyRegex) lowercases the incoming URL path before matching. A mixed-case vanity URI such as /cmsHomePage therefore never matched the lowercased client path, even though server-side vanity resolution itself is case-insensitive (CachedVanityUrl compiles patterns with Pattern.CASE_INSENSITIVE). Mirrors the toLowerCase() already applied to the experiment page regex in calculatePageUrlRegexPattern. Tests updated to use lowercase URLs that reflect real SDK matching. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tract, API contract - ExperimentUrlPatternCalculator.calculatePageUrlRegexPattern: Javadoc note that the returned regex is consumed client-side by the Analytics SDK and is NOT protected by MatcherTimeoutFactory; follow-up tracked in #35379. - ExperimentUrlPatternCalculator.getVanityUrlsRegex: extract the /cmsHomePage-detection anyMatch into a named boolean so vanityUrls is no longer traversed twice inside Stream.concat. Adds a comment clarifying that the exact-match check is intentional — regex-based cmsHomePage URIs are unsupported by design, mirroring resolveVanityUrl's literal fallback lookup. - VanityUrlAPI.findByForward: Javadoc note that this is intended for system-user / internal routing contexts and performs no permission check; callers without READ permission on the host, or paths that expose results to end users, must not use it. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The final toLowerCase() in getVanityUrlsRegex affects every vanity URL alternative in the assembled pattern, not only the /cmsHomePage one. A reviewer flagged that this is a scope expansion: admin-authored vanity URIs that rely on uppercase characters or uppercase-only character classes (e.g. "[A-Z]+") lose those semantics in this path. Document the behavior and its justification in both the public Javadoc on calculatePageUrlRegexPattern and in the inline comment at the return statement. Rationale: CachedVanityUrl already compiles each vanity's URI with Pattern.CASE_INSENSITIVE, so case-sensitive regex constructs never actually influence vanity matching anywhere — no consumer loses behavior by the lowercase fold here. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…licit Previous iteration silently widened findByForward to also return SYSTEM_HOST vanities, documented only in Javadoc. Caller intent is now visible at every call site via a required includeSystemHost boolean parameter: - VanityUrlAPI.findByForward: new 5-arg signature; Javadoc updated. - VanityUrlAPIImpl.findByForward: flag gates SYSTEM_HOST stream inclusion. - ExperimentUrlPatternCalculator.getVanityUrlsRegex: passes true with a comment explaining why (a /cmsHomePage vanity forwarding to the experiment page may live on SYSTEM_HOST). - VanityUrlAPITest: three direct findByForward tests pass false — they assert host-specific behavior (pre-PR semantics preserved). Regression suite: 6 experiment + 3 findByForward tests all pass. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/ LEGACY_CMS_HOME_PAGE The previous iteration changed the public `findByForward` signature to require an explicit `includeSystemHost` flag, and moved `LEGACY_CMS_HOME_PAGE` from `VanityUrlAPIImpl` to the `VanityUrlAPI` interface. Both were source-breaking changes for external consumers. - VanityUrlAPI: add a default 4-arg `findByForward` overload that delegates to the 5-arg form with `includeSystemHost = false`, restoring pre-PR behavior for callers that relied on host-specific results. - VanityUrlAPIImpl: re-add `LEGACY_CMS_HOME_PAGE` as a public-static-final shim pointing at the canonical interface constant; update the internal `resolveVanityUrl` reference to qualify as `VanityUrlAPI.LEGACY_CMS_HOME_PAGE` so the impl's own code doesn't depend on the compat field. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d regex CachedVanityUrl.normalize() returns "" when VanityUrlUtil.isValidRegex rejects the URI as an invalid regex, in which case the compiled Pattern's source is also "". Previously such vanities were still joined into the assembled vanity URL regex; String.format on the URL template then produced a path alternative that matched any URL, effectively creating a catch-all for the experiment and causing page-view events to be recorded for every URL the visitor navigated to. Filter out those entries before the map step so only vanities with a non-empty pattern source contribute alternatives. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous iteration made the new 5-arg findByForward(..., boolean includeSystemHost) abstract on the VanityUrlAPI interface, so any OSGi alternative provider that only implemented the pre-PR 4-arg abstract method would throw AbstractMethodError when the 5-arg is invoked on its instance. Invert the abstract/default split on the interface so the 5-arg is the default method and the 4-arg stays abstract with pre-PR semantics: - 4-arg findByForward is now the canonical abstract method (pre-PR shape). All existing implementors already override it, so nothing breaks. - 5-arg findByForward gets a default body that delegates to the 4-arg (host-only results). Legacy providers that don't override the 5-arg get a safe, backward-compatible implementation automatically. - VanityUrlAPIImpl overrides both: the 4-arg delegates to the 5-arg with false; the 5-arg carries @CloseDBIfOpened and the SYSTEM_HOST-aware logic. ByteBuddy bytecode advice fires on the self-invocation, so the 4-arg delegation preserves connection lifecycle without duplicating the annotation. Regression suite: 6 experiment + 3 findByForward tests all pass. Refs: #34747 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
27afec3 to
0b9d0ee
Compare
Summary
Fixes the remaining scope of #34747 — Experiments not firing when the experiment page is reached via the legacy
/cmsHomePagevanity URL fallback.VanityUrlAPIImpl.resolveVanityUrlhas a legacy fallback (dotCMS/src/main/java/com/dotcms/vanityurl/business/VanityUrlAPIImpl.java:257-261) that, when a visitor requests/and no vanity directly matches, resolves/cmsHomePageinstead and forwards (action=200) to the configured target page. The browser URL bar stays at/.ExperimentUrlPatternCalculator.getVanityUrlsRegexwas only adding the vanity's own URI (/cmsHomePage) as a regex alternative — so the incoming analytics URLhttp://site/(orhttp://site) never matched and the Experiment engine always served the Original variant./?alternative (matching/and empty) to the generated regex when any returned vanity has URI/cmsHomePage. The approach mirrors the existing permissive pattern used inRootIndexRegexUrlPatterStrategyfor the real root-index page.An earlier PR (#26803, commit 53718e0) fixed the common vanity-URL case but did not cover the
/cmsHomePagelegacy fallback — details in the issue comment thread.Also removes an unused
CMS_HOME_PAGEconstant fromVisitorFilter.Test plan
ExperimentUrlPatternCalculatorIntegrationTest#experimentWithCmsHomePageVanity— fails on main, passes on this branch.Tests run: 5, Failures: 0, Errors: 0)./cmsHomePagescenario and regression-check other vanity-URL + Experiment combinations.Refs: #34747
🤖 Generated with Claude Code