4 changes: 3 additions & 1 deletion README.md
@@ -120,7 +120,9 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr
* **Tools:** Let AI call your Ruby methods
* **Structured output:** JSON schemas that just work
* **Streaming:** Real-time responses with blocks
-* **Rails:** ActiveRecord integration with `acts_as_chat`
+* **Rails:**
+  * ActiveRecord integration with `acts_as_chat`
+  * Instrumentation with `ActiveSupport::Notifications` for observability
* **Async:** Fiber-based concurrency
* **Model registry:** 500+ models with capability detection and pricing
* **Providers:** OpenAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
144 changes: 144 additions & 0 deletions docs/_advanced/rails.md
@@ -31,6 +31,7 @@ After reading this guide, you will know:
* How to store raw provider payloads (Anthropic prompt caching, etc.)
* How to integrate streaming responses with Hotwire/Turbo Streams
* How to customize the persistence behavior for validation-focused scenarios
* How to use the built-in instrumentation to observe RubyLLM operations

## Understanding the Persistence Flow

@@ -978,6 +979,149 @@ class Chat < ApplicationRecord
end
```

## Instrumentation

RubyLLM automatically instruments all operations with `ActiveSupport::Notifications` when running inside Rails. This enables monitoring, logging, and performance tracking without any configuration.

### Available Events

RubyLLM instruments six events:

| Event Name | Operation | When It Fires |
| --------------------------- | ------------------- | -------------------------- |
| `complete_chat.ruby_llm` | Chat completions | Every chat API call |
| `embed_text.ruby_llm` | Text embeddings | Every embedding generation |
| `paint_image.ruby_llm` | Image generation | Every image creation |
| `moderate_content.ruby_llm` | Content moderation | Every moderation check |
| `transcribe_audio.ruby_llm` | Audio transcription | Every transcription |
| `execute_tool.ruby_llm` | Tool execution | Every tool call |

### Subscribing to Events

Use `ActiveSupport::Notifications` to subscribe to these events:

```ruby
# config/initializers/ruby_llm.rb or anywhere else

# Subscribe to all RubyLLM events
ActiveSupport::Notifications.subscribe(/\.ruby_llm$/) do |name, start, finish, id, payload|
duration = (finish - start) * 1000 # Convert to milliseconds

Rails.logger.info({
event: name,
duration_ms: duration.round(2),
provider: payload[:provider],
model: payload[:model],
timestamp: start
}.to_json)
end

# Subscribe to specific events
ActiveSupport::Notifications.subscribe('complete_chat.ruby_llm') do |name, start, finish, id, payload|
duration = (finish - start) * 1000

# Track costs and usage
MetricsService.record(
event: 'llm_completion',
duration: duration,
input_tokens: payload[:input_tokens],
output_tokens: payload[:output_tokens],
cached_tokens: payload[:cached_tokens],
provider: payload[:provider],
model: payload[:model]
)
end
```
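
If you'd rather not unpack the subscriber arguments by hand, `ActiveSupport::Notifications::Event` wraps them and computes the duration for you. This is standard ActiveSupport, not RubyLLM-specific:

```ruby
ActiveSupport::Notifications.subscribe(/\.ruby_llm$/) do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)

  # Event#name, #payload and #duration (in milliseconds) come from the wrapped arguments
  Rails.logger.info "#{event.name}: #{event.duration.round(2)}ms (#{event.payload[:model]})"
end
```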

### Event Payloads

Each event includes relevant information in its payload:

#### `complete_chat.ruby_llm`

Fired for every chat completion request.

```ruby
{
provider: 'anthropic', # Provider slug
model: 'claude-sonnet-4-5', # Model ID
streaming: true, # Whether streaming was used
input_tokens: 150, # Prompt tokens consumed
output_tokens: 250, # Completion tokens generated
cached_tokens: 100, # Cache read tokens (if supported)
cache_creation_tokens: 50, # Cache write tokens (if supported)
tool_calls: 2 # Number of tools called (if any)
}
```

Token fields are populated from the provider's response and vary by provider capabilities.
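
As an illustration, these token counts can feed rough cost tracking. This is a minimal sketch — the price table is a placeholder, so substitute your providers' actual per-million-token rates:

```ruby
# Hypothetical USD prices per million tokens -- replace with real rates for your models
LLM_PRICES = {
  'claude-sonnet-4-5' => { input: 3.00, output: 15.00 }
}.freeze

ActiveSupport::Notifications.subscribe('complete_chat.ruby_llm') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  payload = event.payload
  prices = LLM_PRICES[payload[:model]]
  next unless prices && payload[:input_tokens]

  cost = (payload[:input_tokens] * prices[:input] +
          payload[:output_tokens].to_i * prices[:output]) / 1_000_000.0
  Rails.logger.info "Estimated spend: $#{format('%.6f', cost)} (#{payload[:model]})"
end
```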

#### `execute_tool.ruby_llm`

Fired for each tool executed during chat completions.

```ruby
{
tool_name: 'Weather', # Tool class name
arguments: { city: 'Paris' }, # Arguments passed to tool
halted: false # Whether tool returned Tool::Halt
}
```

The `halted` field indicates whether the tool stopped the conversation loop.
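
A subscriber can combine this with the event duration to flag slow tools or halted loops — a minimal sketch:

```ruby
ActiveSupport::Notifications.subscribe('execute_tool.ruby_llm') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  payload = event.payload

  # Surface tools that stop the loop or run noticeably long
  Rails.logger.warn "Tool #{payload[:tool_name]} halted the conversation loop" if payload[:halted]
  Rails.logger.warn "Tool #{payload[:tool_name]} took #{event.duration.round}ms" if event.duration > 1_000
end
```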

#### `embed_text.ruby_llm`

Fired for embedding generation requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'text-embedding-3-large', # Model ID
dimensions: 1024, # Embedding dimensions (if specified)
input_tokens: 45, # Input tokens consumed
vector_count: 1 # Number of vectors generated
}
```

#### `paint_image.ruby_llm`

Fired for image generation requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'dall-e-3', # Model ID
size: '1024x1024' # Image dimensions
}
```

#### `moderate_content.ruby_llm`

Fired for content moderation requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'text-moderation-latest', # Model ID
flagged: false # Whether content was flagged
}
```

#### `transcribe_audio.ruby_llm`

Fired for audio transcription requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'whisper-1', # Model ID
  input_tokens: 0,                 # Input tokens (if reported by the provider)
  output_tokens: 120,              # Output tokens (if reported by the provider)
duration: 45.3 # Audio duration in seconds (if available)
}
```
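
When the provider reports the audio duration, you can compare it against the event's wall-clock time to get a real-time factor — a sketch assuming the `duration` field is present:

```ruby
ActiveSupport::Notifications.subscribe('transcribe_audio.ruby_llm') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  audio_seconds = event.payload[:duration]
  next unless audio_seconds

  # event.duration is elapsed wall-clock time in milliseconds
  realtime_factor = audio_seconds / (event.duration / 1000.0)
  Rails.logger.info "Transcribed #{audio_seconds}s of audio at #{realtime_factor.round(1)}x real time"
end
```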

## Next Steps

* [Chatting with AI Models]({% link _core_features/chat.md %})
72 changes: 46 additions & 26 deletions lib/ruby_llm/chat.rb
@@ -4,6 +4,7 @@ module RubyLLM
# Represents a conversation with an AI model
class Chat
include Enumerable
include Instrumentation

attr_reader :model, :messages, :tools, :params, :headers, :schema

@@ -122,35 +123,48 @@ def each(&)
    end

    def complete(&) # rubocop:disable Metrics/PerceivedComplexity
-      response = @provider.complete(
-        messages,
-        tools: @tools,
-        temperature: @temperature,
-        model: @model,
-        params: @params,
-        headers: @headers,
-        schema: @schema,
-        &wrap_streaming_block(&)
-      )
-
-      @on[:new_message]&.call unless block_given?
-
-      if @schema && response.content.is_a?(String)
-        begin
-          response.content = JSON.parse(response.content)
-        rescue JSON::ParserError
-          # If parsing fails, keep content as string
-        end
-      end
-
-      add_message response
-      @on[:end_message]&.call(response)
-
-      if response.tool_call?
-        handle_tool_calls(response, &)
-      else
-        response
-      end
+      # rubocop:disable Metrics/BlockLength
+      instrument('complete_chat.ruby_llm',
+                 { provider: @provider.slug, model: @model.id, streaming: block_given? }) do |payload|
+        response = @provider.complete(
+          messages,
+          tools: @tools,
+          temperature: @temperature,
+          model: @model,
+          params: @params,
+          headers: @headers,
+          schema: @schema,
+          &wrap_streaming_block(&)
+        )
+
+        @on[:new_message]&.call unless block_given?
+
+        if @schema && response.content.is_a?(String)
+          begin
+            response.content = JSON.parse(response.content)
+          rescue JSON::ParserError
+            # If parsing fails, keep content as string
+          end
+        end
+
+        add_message response
+        @on[:end_message]&.call(response)
+
+        if payload
+          %i[input_tokens output_tokens cached_tokens cache_creation_tokens].each do |field|
+            value = response.public_send(field)
+            payload[field] = value unless value.nil?
+          end
+          payload[:tool_calls] = response.tool_calls.size if response.tool_call?
+        end
+
+        if response.tool_call?
+          handle_tool_calls(response, &)
+        else
+          response
+        end
+      end
+      # rubocop:enable Metrics/BlockLength
end

def add_message(message_or_attributes)
@@ -207,7 +221,13 @@ def handle_tool_calls(response, &) # rubocop:disable Metrics/PerceivedComplexity
def execute_tool(tool_call)
tool = tools[tool_call.name.to_sym]
args = tool_call.arguments
-      tool.call(args)
+
+      instrument('execute_tool.ruby_llm',
+                 { tool_name: tool_call.name, arguments: args }) do |payload|
+        tool.call(args).tap do |result|
+          payload[:halted] = result.is_a?(Tool::Halt) if payload
+        end
+      end
end

def build_content(message, attachments)
12 changes: 11 additions & 1 deletion lib/ruby_llm/embedding.rb
@@ -3,6 +3,8 @@
module RubyLLM
# Core embedding interface.
class Embedding
extend Instrumentation

attr_reader :vectors, :model, :input_tokens

def initialize(vectors:, model:, input_tokens: 0)
@@ -23,7 +25,15 @@ def self.embed(text, # rubocop:disable Metrics/ParameterLists
config: config)
model_id = model.id

-      provider_instance.embed(text, model: model_id, dimensions:)
+      instrument('embed_text.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id, dimensions: dimensions }) do |payload|
+        provider_instance.embed(text, model: model_id, dimensions:).tap do |result|
+          if payload
+            payload[:input_tokens] = result.input_tokens unless result.input_tokens.nil?
+            payload[:vector_count] = result.vectors.is_a?(Array) ? result.vectors.length : 1
+          end
+        end
+      end
end
end
end
7 changes: 6 additions & 1 deletion lib/ruby_llm/image.rb
@@ -3,6 +3,8 @@
module RubyLLM
# Represents a generated image from an AI model.
class Image
extend Instrumentation

attr_reader :url, :data, :mime_type, :revised_prompt, :model_id

def initialize(url: nil, data: nil, mime_type: nil, revised_prompt: nil, model_id: nil)
@@ -43,7 +45,10 @@ def self.paint(prompt, # rubocop:disable Metrics/ParameterLists
config: config)
model_id = model.id

-      provider_instance.paint(prompt, model: model_id, size:)
+      instrument('paint_image.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id, size: size }) do |_payload|
+        provider_instance.paint(prompt, model: model_id, size:)
+      end
end
end
end
15 changes: 15 additions & 0 deletions lib/ruby_llm/instrumentation.rb
@@ -0,0 +1,15 @@
# frozen_string_literal: true

module RubyLLM
# Optional instrumentation if ActiveSupport::Notifications is defined
module Instrumentation
def instrument(name, payload = {})
if defined?(ActiveSupport::Notifications)
ActiveSupport::Notifications.instrument(name, payload) { yield payload }
else
yield
end
end
module_function :instrument
end
end
Author:
An alternative to this module is a module per class (e.g. `RubyLLM::Chat::Instrumentation`) that would be included only when Rails is loaded. For example, in lib/ruby_llm/ruby_llm.rb, at the bottom:

```ruby
# we currently have this:
if defined?(Rails::Railtie)
  require 'ruby_llm/railtie'
  require 'ruby_llm/active_record/acts_as'
  require 'ruby_llm/chat/instrumentation'
end
```

and then:

```ruby
module RubyLLM
  class Chat
    module Instrumentation
      alias_method :original_complete, :complete

      def complete(&)
        ActiveSupport::Notifications.instrument(
          'complete_chat.ruby_llm',
          { ... }
        ) do
          original_complete(&)
        end
      end
    end
  end
end
```

I'm unsure of the best convention for these files in this project, so I'm open to suggestions.

This `instrument` method feels simpler, though; I'm willing to change to the other implementation if preferred.

9 changes: 8 additions & 1 deletion lib/ruby_llm/moderation.rb
@@ -4,6 +4,8 @@ module RubyLLM
# Identify potentially harmful content in text.
# https://platform.openai.com/docs/guides/moderation
class Moderation
extend Instrumentation

attr_reader :id, :model, :results

def initialize(id:, model:, results:)
@@ -23,7 +25,12 @@ def self.moderate(input,
config: config)
model_id = model.id

-      provider_instance.moderate(input, model: model_id)
+      instrument('moderate_content.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id }) do |payload|
+        provider_instance.moderate(input, model: model_id).tap do |result|
+          payload[:flagged] = result.flagged? if payload && result.respond_to?(:flagged?)
+        end
+      end
end

# Convenience method to get content from moderation result
14 changes: 13 additions & 1 deletion lib/ruby_llm/transcription.rb
@@ -3,6 +3,8 @@
module RubyLLM
# Represents a transcription of audio content.
class Transcription
extend Instrumentation

attr_reader :text, :model, :language, :duration, :segments, :input_tokens, :output_tokens

def initialize(text:, model:, **attributes)
@@ -29,7 +31,17 @@ def self.transcribe(audio_file, **kwargs)
config: config)
model_id = model.id

-      provider_instance.transcribe(audio_file, model: model_id, language:, **options)
+      instrument('transcribe_audio.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id }) do |payload|
+        provider_instance.transcribe(audio_file, model: model_id, language:, **options).tap do |result|
+          if payload
+            %i[input_tokens output_tokens duration].each do |field|
+              value = result.public_send(field)
+              payload[field] = value unless value.nil?
+            end
+          end
+        end
+      end
end
end
end