[Feature Request]: End-to-end Azure deployment recipe (azd + Bicep) for GraphRAG #2362

@jluocsa

Do you need to file an issue?

  • I have searched the existing issues and this feature is not already filed.
  • My model is hosted on OpenAI or Azure.
  • I believe this is a legitimate feature request, not just a question.

Is your feature request related to a problem? Please describe.

The docs cover the Azure building blocks individually — Azure OpenAI with model_provider: azure, auth_type: azure_managed_identity, Azure AI Search as the vector store, Azure Blob Storage / Cosmos DB for input + cache + reporting (see docs/get_started.md and docs/config/yaml.md) — but there is no single, worked end-to-end Azure recipe that ties them together.

As a result, anyone trying to deploy GraphRAG against managed Azure services (Azure OpenAI, Azure AI Search, Azure Storage, optionally Cosmos DB) currently has to:

  1. Provision each resource by hand (or wire up their own ARM/Bicep/Terraform).
  2. Hand-assign the right RBAC roles to whichever identity will run graphrag index (User-Assigned Managed Identity, the developer's user principal, or a service principal). The roles needed (Cognitive Services OpenAI User, Search Index Data Contributor, Storage Blob Data Contributor, etc.) aren't documented in one place.
  3. Hand-author settings.yaml to point all of models:, input:, cache:, storage:, reporting:, and vector_store: at the right Azure endpoints with auth_type: azure_managed_identity.
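
To make step 2 concrete, here is a rough sketch of how the role assignments could be scripted today. It only builds `az role assignment create` command lines for the three roles named above; the mapping of each role to a resource scope, the `ROLE_SCOPES` table, and the `role_assignment_commands` helper are assumptions for illustration, not something the docs prescribe.

```python
import shlex

# Roles named in this issue. Which resource each role is scoped to is an
# assumption made for this sketch.
ROLE_SCOPES = {
    "Cognitive Services OpenAI User": "openai",
    "Search Index Data Contributor": "search",
    "Storage Blob Data Contributor": "storage",
}


def role_assignment_commands(principal_id, resource_ids):
    """Return one `az role assignment create` command line per role.

    principal_id: object ID of the identity that will run `graphrag index`.
    resource_ids: dict with "openai", "search" and "storage" keys holding
                  the full Azure resource IDs to use as scopes.
    """
    commands = []
    for role, key in ROLE_SCOPES.items():
        argv = [
            "az", "role", "assignment", "create",
            "--assignee", principal_id,
            "--role", role,
            "--scope", resource_ids[key],
        ]
        # shlex.join quotes arguments that contain spaces (the role names).
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    # Placeholder values; substitute real IDs before running the output.
    ids = {
        "openai": "/subscriptions/SUB/resourceGroups/RG/providers/Microsoft.CognitiveServices/accounts/NAME",
        "search": "/subscriptions/SUB/resourceGroups/RG/providers/Microsoft.Search/searchServices/NAME",
        "storage": "/subscriptions/SUB/resourceGroups/RG/providers/Microsoft.Storage/storageAccounts/NAME",
    }
    for command in role_assignment_commands("PRINCIPAL-OBJECT-ID", ids):
        print(command)
```

Having to maintain something like this by hand for every deployment is exactly the gap a Bicep-based recipe would close.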

This is a frequent ask from customers / CSAs evaluating GraphRAG on Azure, and the failure mode when something is misconfigured (typically RBAC or endpoint shape) is opaque.

Describe the solution you'd like

Add a worked Azure recipe to the repository — e.g. under samples/azure/ or docs/recipes/azure/ — containing:

  • azure.yaml for Azure Developer CLI (azd), so the whole stack provisions and configures with azd up.
  • infra/main.bicep (subscription- or resource-group-scope) that provisions:
    • Azure OpenAI account with two deployments (chat + embeddings), with disabling public network access and adding a private endpoint (PE) both available as options behind a flag.
    • Azure AI Search service (Basic or S1).
    • Azure Storage account (for input, cache, output, reporting containers).
    • Optional Azure Cosmos DB account for the alternative storage backends.
    • User-Assigned Managed Identity + role assignments: Cognitive Services OpenAI User, Search Index Data Contributor, Search Service Contributor, Storage Blob Data Contributor, and Cosmos DB data-plane role where applicable.
    • Log Analytics + diagnostic settings (optional, behind a flag).
  • settings.yaml.template wired to those resources with auth_type: azure_managed_identity (no API keys in the recipe), produced from the azd outputs.
  • A short README that walks through azd up → upload docs → graphrag init (skipping the existing template) → graphrag index → graphrag query, plus the obvious teardown command (azd down).
  • A small GitHub Action or azd pipeline config snippet showing how to run the indexer on a schedule against the deployed stack (nice-to-have, can be a follow-up).
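
As a sketch of how settings.yaml could be produced from the azd outputs: the snippet below parses dotenv-style `KEY="value"` text (the format printed by `azd env get-values`) and fills a template. The output names `AZURE_OPENAI_ENDPOINT` and `AZURE_SEARCH_ENDPOINT`, the helper names, and the exact key layout of the template fragment are hypothetical; the real template should follow docs/config/yaml.md.

```python
from string import Template

# Illustrative fragment only. The key layout is an assumption and must be
# checked against docs/config/yaml.md; the ${...} names are hypothetical
# azd output names, not ones defined by any existing recipe.
SETTINGS_TEMPLATE = Template("""\
models:
  default_chat_model:
    model_provider: azure
    auth_type: azure_managed_identity
    api_base: ${AZURE_OPENAI_ENDPOINT}
vector_store:
  url: ${AZURE_SEARCH_ENDPOINT}
""")


def parse_env(text):
    """Parse dotenv-style KEY="value" lines into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def render_settings(env_text):
    """Fill the template from azd outputs.

    substitute() raises KeyError when an expected output is missing, so a
    misconfigured deployment fails here rather than during indexing.
    """
    return SETTINGS_TEMPLATE.substitute(parse_env(env_text))


if __name__ == "__main__":
    sample = (
        'AZURE_OPENAI_ENDPOINT="https://example.openai.azure.com"\n'
        'AZURE_SEARCH_ENDPOINT="https://example.search.windows.net"\n'
    )
    print(render_settings(sample))
```

Failing fast on a missing output is deliberate here, since opaque errors from misconfigured endpoints are one of the problems this request is trying to address.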

The recipe should target the default GraphRAG indexer (no custom code), so it stays in sync with mainline by following docs/config changes only.

Additional context

  • Prior art for the shape of recipe I'm proposing: the Azure-Samples/azure-search-openai-demo repo (azd-based, single azd up, Bicep + RBAC + managed identity) — that pattern has been very effective for newcomers and would map cleanly onto GraphRAG.
  • This is additive — nothing in the existing repo structure or docs has to move. The recipe lives in its own folder and the existing per-feature Azure docs in docs/get_started.md and docs/config/yaml.md can link out to it.
  • Happy to contribute the PR myself if maintainers agree on:
    1. Target location (samples/azure/ vs docs/recipes/azure/ vs a separate microsoft/graphrag-azure-samples repo).
    2. Whether AI Search + Storage (the minimum), or AI Search + Cosmos DB + Storage (the full matrix) should be the first cut.
    3. Whether the recipe should ship as a public-network-allowed quickstart, or a private-endpoint-by-default reference.

I'd like to take #1 = samples/azure/, #2 = AI Search + Storage as the first cut, #3 = public-network quickstart with a flag for PE, unless there's a strong opinion otherwise.
