Add EntryLinks support #12

Open
eeshadutta1408 wants to merge 11 commits into GoogleCloudPlatform:main from eeshadutta1408:entry-links
@eeshadutta1408

This PR implements full support for Dataplex EntryLinks (both top-level and schema column-inlined links) in the kcmd tool. It enables catalog curators to manage relationships (such as definition/glossary linkages or schema joins) locally in a structured format and synchronize them with the cloud catalog.


Key Features

  • Manifest & Alias Mapping: Added entryLinks configuration blocks to snapshot/publishing schemas in manifest.ts, supporting custom and global/system aliases (e.g. definition, schema-join).
  • Dataplex API Integration: Extended dataplex.ts with Glossary and EntryLinks endpoints (lookupEntryLinks, createEntryLink, deleteEntryLink, listEntryLinks), with built-in CRM project ID/number normalization.
  • Bi-directional Synchronization: In sync.ts, pulls are paginated and cached per entry group scope, and pushes use Symmetrical Reference Matching to prevent accidental link deletion or duplicate creation.
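
As a rough illustration of the matching idea in the last bullet, here is a minimal sketch. It assumes "symmetrical" means that an undirected link between A and B is treated as the same link regardless of which side is listed first. The `LinkRef` shape, `UNDIRECTED_TYPES`, `linkKey`, and `diffLinks` are hypothetical names for illustration and are not taken from sync.ts.

```typescript
// Hypothetical shape of a link between two entries; not the PR's actual type.
interface LinkRef {
  type: string;   // link type, e.g. "schema-join" or "definition"
  source: string; // name of one endpoint entry
  target: string; // name of the other endpoint entry
}

// Assumption: these link types are undirected, so (A, B) and (B, A) are the same link.
const UNDIRECTED_TYPES = new Set<string>(["schema-join"]);

// Build a comparison key; undirected links sort their endpoints so order does not matter.
function linkKey(link: LinkRef): string {
  const ends = UNDIRECTED_TYPES.has(link.type)
    ? [link.source, link.target].sort()
    : [link.source, link.target];
  return [link.type, ...ends].join("|");
}

// Compute which local links are missing remotely and which remote links are missing locally.
function diffLinks(
  local: LinkRef[],
  remote: LinkRef[]
): { toCreate: LinkRef[]; toDelete: LinkRef[] } {
  const localKeys = new Set(local.map(linkKey));
  const remoteKeys = new Set(remote.map(linkKey));
  return {
    toCreate: local.filter((l) => !remoteKeys.has(linkKey(l))),
    toDelete: remote.filter((r) => !localKeys.has(linkKey(r))),
  };
}
```

With a key like this, a schema-join recorded remotely from the other side would not be deleted and re-created on push, while a directional definition link would still be compared in its stated direction.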

EntryLink Examples

Below are examples of how entry links are serialized locally in an entry's YAML file:

1. Schema-Join Link

For undirected relationships containing aspect metadata, such as join keys and inference attributes:

links:
  schema-join:
    - target: myzh-test-project-1.froyotech_dataset.component
      id: bigquery.googleapis.com/projects/myzh-test-project-1/datasets/froyotech_dataset/tables/component
      schema-join:
        joins:
          - source:
              name: product
              fields:
                - name
            target:
              name: component
              fields:
                - name
            type: JOIN
            inferenceSource: QUERY_HISTORY
        userManaged: true
        job:
          runTime: 2026-02-27T13:24:32Z
          name: Dataplex Catalog Usage Signals

2. Glossary Definition Link

For directional references mapped to glossary term path targets (resolved dynamically using display names):

links:
  definition:
    - target: dc-cuj-staging-playground.global.UtilityTest.UtilityTerm1
      id: projects/dc-cuj-staging-playground/locations/global/glossaries/pephjajda-c7c4b951-9c078501-4f35fa82/terms/imbaoooej-b14cc513-abdaf298-1b8843c2
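
For readers who think in types, the serialized shape in both examples could be modeled roughly as follows. This is a sketch under assumptions: `LocalLink`, `LocalLinks`, `FlatLink`, and `flattenLinks` are hypothetical names, and the actual types in the PR may differ.

```typescript
// Hypothetical types mirroring the YAML layout of the examples; not taken from the PR.
interface LocalLink {
  target: string;                 // human-readable target path
  id: string;                     // full resource id of the target
  [aspectAlias: string]: unknown; // optional aspect data keyed by alias, e.g. "schema-join"
}

// The "links" block: link type (or alias) mapped to the links of that type.
type LocalLinks = { [linkType: string]: LocalLink[] };

interface FlatLink {
  type: string;
  target: string;
  id: string;
}

// Flatten the nested block into one row per link, which is convenient for comparison.
function flattenLinks(links: LocalLinks): FlatLink[] {
  const flat: FlatLink[] = [];
  for (const type of Object.keys(links)) {
    for (const link of links[type]) {
      flat.push({ type, target: link.target, id: link.id });
    }
  }
  return flat;
}

// The glossary definition example, expressed with these types.
const definitionExample: LocalLinks = {
  definition: [
    {
      target: "dc-cuj-staging-playground.global.UtilityTest.UtilityTerm1",
      id: "projects/dc-cuj-staging-playground/locations/global/glossaries/pephjajda-c7c4b951-9c078501-4f35fa82/terms/imbaoooej-b14cc513-abdaf298-1b8843c2",
    },
  ],
};
```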

@eeshadutta1408 eeshadutta1408 changed the title from "Add EntryLinks support with alias resolution, bi-directional sync, and dry-run mode" to "Add EntryLinks support" Jun 3, 2026
@eeshadutta1408 eeshadutta1408 marked this pull request as ready for review June 3, 2026 16:09
} while (pageToken);
}

async lookupEntryLinks(

@nikhilk nikhilk Jun 6, 2026

Collaborator

Does this need to page through the list of entry links to retrieve them all, similar to list?
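
If paging does turn out to be needed, the loop could follow the same `do { ... } while (pageToken)` pattern visible in the diff context. A minimal sketch, assuming the lookup response carries a next-page token; `Page`, `nextPageToken`, and `fetchAllPages` are hypothetical names here, not the actual dataplex.ts API.

```typescript
// Hypothetical page shape; the field names are assumptions for illustration.
interface Page<T> {
  items: T[];
  nextPageToken?: string;
}

// Collect every item by following page tokens until the service stops returning one.
async function fetchAllPages<T>(
  fetchPage: (pageToken?: string) => Promise<Page<T>>
): Promise<T[]> {
  const all: T[] = [];
  let pageToken: string | undefined;
  do {
    const page = await fetchPage(pageToken);
    all.push(...page.items);
    pageToken = page.nextPageToken;
  } while (pageToken);
  return all;
}
```

A shared helper like this would let both the list and lookup calls share one paging loop, if the two responses are shaped alike.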

  // Fix all entries and aspects to consistently use project id. Its currently a mess with an
  // inconsistent mix of project ids and unusable project numbers.
- async function _fixEntry(entry: Entry, ctx: context.ApiContext): Promise<void> {
+ export async function _fixEntry(entry: Entry, ctx: context.ApiContext): Promise<void> {

@nikhilk nikhilk Jun 6, 2026

Collaborator

This function is expected to be used only within the service wrapper, the idea being all responses returned from this client layer are 'fixed', and any downstream user of this client should not have to think about such details.
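
To illustrate the layering described here, a minimal sketch follows. It assumes a simplified `Entry` with only a `name` field and a plain string replacement for normalization; `RawGetEntry`, `fixEntry`, and `CatalogClient` are hypothetical stand-ins and do not reflect the real dataplex.ts code.

```typescript
// Hypothetical minimal entry shape, for illustration only.
interface Entry {
  name: string;
}

// Hypothetical raw transport whose responses may still contain project numbers.
type RawGetEntry = (name: string) => Promise<Entry>;

// Module-private helper: it would not carry `export`, so callers cannot depend on it.
// Rewrites the project number in the entry name to the project id.
function fixEntry(entry: Entry, projectNumber: string, projectId: string): Entry {
  return {
    ...entry,
    name: entry.name.replace(`projects/${projectNumber}/`, `projects/${projectId}/`),
  };
}

// In a real module, only this class would be exported.
// Every entry it returns has already been normalized.
class CatalogClient {
  constructor(
    private readonly rawGetEntry: RawGetEntry,
    private readonly projectNumber: string,
    private readonly projectId: string
  ) {}

  async getEntry(name: string): Promise<Entry> {
    const entry = await this.rawGetEntry(name);
    return fixEntry(entry, this.projectNumber, this.projectId);
  }
}
```

Under this arrangement, code in the sync layer would call the client and never the fix-up helper directly.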

import { CatalogSource, createSource, Sources } from './source';


export const SYSTEM_LINK_ALIASES: Record<string, string> = {

@nikhilk nikhilk Jun 6, 2026

Collaborator

A couple of things:

  1. Alias functionality would probably be better added in the future, alongside support for entry type and aspect type aliases. That is being worked on in https://github.com/GoogleCloudPlatform/knowledge-catalog/pull/20/changes ...
  2. Still looking ahead, we should support all link types without having to list them all here. The set of link types to be retrieved or created/modified should depend on the types listed in the catalog manifest's snapshot and publishing configuration.
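
One way the second point could look in code is sketched below, assuming the manifest exposes an `entryLinks` list under both its snapshot and publishing sections. The `Manifest` and `LinkConfig` shapes and both function names are hypothetical; the real schema in manifest.ts may differ.

```typescript
// Hypothetical manifest shape; the actual schema in manifest.ts may differ.
interface LinkConfig {
  entryLinks?: string[]; // link types named in this section
}

interface Manifest {
  snapshot?: LinkConfig;   // governs what is pulled
  publishing?: LinkConfig; // governs what is created, modified, or deleted
}

// Remove duplicates while keeping first-seen order.
function dedupe(values: string[]): string[] {
  return Array.from(new Set(values));
}

// Link types to retrieve, taken from the manifest rather than a hard-coded table.
function linkTypesToPull(manifest: Manifest): string[] {
  return dedupe(manifest.snapshot?.entryLinks ?? []);
}

// Link types that a push is allowed to create, modify, or delete.
function linkTypesToPush(manifest: Manifest): string[] {
  return dedupe(manifest.publishing?.entryLinks ?? []);
}
```

With this approach a new link type would only need a manifest entry, not a code change.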

# Conflicts:
#	toolbox/mdcode/src/libts/gcp/dataplex.ts
#	toolbox/mdcode/tests/libts/mocks.ts