4 changes: 3 additions & 1 deletion README.md
@@ -120,7 +120,9 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr
* **Tools:** Let AI call your Ruby methods
* **Structured output:** JSON schemas that just work
* **Streaming:** Real-time responses with blocks
-* **Rails:** ActiveRecord integration with `acts_as_chat`
+* **Rails:**
+  * ActiveRecord integration with `acts_as_chat`
+  * Instrumentation with `ActiveSupport::Notifications` for observability
* **Async:** Fiber-based concurrency
* **Model registry:** 500+ models with capability detection and pricing
* **Providers:** OpenAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
144 changes: 144 additions & 0 deletions docs/_advanced/rails.md
@@ -31,6 +31,7 @@ After reading this guide, you will know:
* How to store raw provider payloads (Anthropic prompt caching, etc.)
* How to integrate streaming responses with Hotwire/Turbo Streams
* How to customize the persistence behavior for validation-focused scenarios
* How to use the built-in instrumentation to observe RubyLLM operations

## Understanding the Persistence Flow

@@ -978,6 +979,149 @@ class Chat < ApplicationRecord
end
```

## Instrumentation

RubyLLM automatically instruments all operations with `ActiveSupport::Notifications` when running inside Rails. This enables monitoring, logging, and performance tracking without any configuration.

### Available Events

RubyLLM instruments six events:

| Event Name | Operation | When It Fires |
| --------------------------- | ------------------- | -------------------------- |
| `complete_chat.ruby_llm` | Chat completions | Every chat API call |
| `embed_text.ruby_llm` | Text embeddings | Every embedding generation |
| `paint_image.ruby_llm` | Image generation | Every image creation |
| `moderate_content.ruby_llm` | Content moderation | Every moderation check |
| `transcribe_audio.ruby_llm` | Audio transcription | Every transcription |
| `execute_tool.ruby_llm` | Tool execution | Every tool call |

### Subscribing to Events

Use `ActiveSupport::Notifications` to subscribe to these events:

```ruby
# config/initializers/ruby_llm.rb or anywhere else

# Subscribe to all RubyLLM events
ActiveSupport::Notifications.subscribe(/\.ruby_llm$/) do |name, start, finish, id, payload|
duration = (finish - start) * 1000 # Convert to milliseconds

Rails.logger.info({
event: name,
duration_ms: duration.round(2),
provider: payload[:provider],
model: payload[:model],
timestamp: start
}.to_json)
end

# Subscribe to specific events
ActiveSupport::Notifications.subscribe('complete_chat.ruby_llm') do |name, start, finish, id, payload|
duration = (finish - start) * 1000

# Track costs and usage
MetricsService.record(
event: 'llm_completion',
duration: duration,
input_tokens: payload[:input_tokens],
output_tokens: payload[:output_tokens],
cached_tokens: payload[:cached_tokens],
provider: payload[:provider],
model: payload[:model]
)
end
```
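
If you'd rather not unpack the subscriber arguments by hand, `ActiveSupport::Notifications::Event` wraps them and computes the duration for you. This is standard ActiveSupport, not RubyLLM-specific:

```ruby
ActiveSupport::Notifications.subscribe(/\.ruby_llm$/) do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)

  # Event#name, #payload and #duration (in milliseconds) come from the wrapped arguments
  Rails.logger.info "#{event.name}: #{event.duration.round(2)}ms (#{event.payload[:model]})"
end
```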

### Event Payloads

Each event includes relevant information in its payload:

#### `complete_chat.ruby_llm`

Fired for every chat completion request.

```ruby
{
provider: 'anthropic', # Provider slug
model: 'claude-sonnet-4-5', # Model ID
streaming: true, # Whether streaming was used
input_tokens: 150, # Prompt tokens consumed
output_tokens: 250, # Completion tokens generated
cached_tokens: 100, # Cache read tokens (if supported)
cache_creation_tokens: 50, # Cache write tokens (if supported)
tool_calls: 2 # Number of tools called (if any)
}
```

Token fields are populated from the provider's response and vary by provider capabilities.
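
As an illustration, these token counts can feed rough cost tracking. This is a minimal sketch — the price table is a placeholder, so substitute your providers' actual per-million-token rates:

```ruby
# Hypothetical USD prices per million tokens -- replace with real rates for your models
LLM_PRICES = {
  'claude-sonnet-4-5' => { input: 3.00, output: 15.00 }
}.freeze

ActiveSupport::Notifications.subscribe('complete_chat.ruby_llm') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  payload = event.payload
  prices = LLM_PRICES[payload[:model]]
  next unless prices && payload[:input_tokens]

  cost = (payload[:input_tokens] * prices[:input] +
          payload[:output_tokens].to_i * prices[:output]) / 1_000_000.0
  Rails.logger.info "Estimated spend: $#{format('%.6f', cost)} (#{payload[:model]})"
end
```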

#### `execute_tool.ruby_llm`

Fired for each tool executed during chat completions.

```ruby
{
tool_name: 'Weather', # Tool class name
arguments: { city: 'Paris' }, # Arguments passed to tool
halted: false # Whether tool returned Tool::Halt
}
```

The `halted` field indicates whether the tool stopped the conversation loop.
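
A subscriber can combine this with the event duration to flag slow tools or halted loops — a minimal sketch:

```ruby
ActiveSupport::Notifications.subscribe('execute_tool.ruby_llm') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  payload = event.payload

  # Surface tools that stop the loop or run noticeably long
  Rails.logger.warn "Tool #{payload[:tool_name]} halted the conversation loop" if payload[:halted]
  Rails.logger.warn "Tool #{payload[:tool_name]} took #{event.duration.round}ms" if event.duration > 1_000
end
```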

#### `embed_text.ruby_llm`

Fired for embedding generation requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'text-embedding-3-large', # Model ID
dimensions: 1024, # Embedding dimensions (if specified)
input_tokens: 45, # Input tokens consumed
vector_count: 1 # Number of vectors generated
}
```

#### `paint_image.ruby_llm`

Fired for image generation requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'dall-e-3', # Model ID
size: '1024x1024' # Image dimensions
}
```

#### `moderate_content.ruby_llm`

Fired for content moderation requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'text-moderation-latest', # Model ID
flagged: false # Whether content was flagged
}
```

#### `transcribe_audio.ruby_llm`

Fired for audio transcription requests.

```ruby
{
provider: 'openai', # Provider slug
model: 'whisper-1', # Model ID
  input_tokens: 0,                 # Input tokens (if reported by the provider)
  output_tokens: 120,              # Output tokens (if reported by the provider)
duration: 45.3 # Audio duration in seconds (if available)
}
```
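
When the provider reports the audio duration, you can compare it against the event's wall-clock time to get a real-time factor — a sketch assuming the `duration` field is present:

```ruby
ActiveSupport::Notifications.subscribe('transcribe_audio.ruby_llm') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  audio_seconds = event.payload[:duration]
  next unless audio_seconds

  # event.duration is elapsed wall-clock time in milliseconds
  realtime_factor = audio_seconds / (event.duration / 1000.0)
  Rails.logger.info "Transcribed #{audio_seconds}s of audio at #{realtime_factor.round(1)}x real time"
end
```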

## Next Steps

* [Chatting with AI Models]({% link _core_features/chat.md %})
72 changes: 46 additions & 26 deletions lib/ruby_llm/chat.rb
@@ -4,6 +4,7 @@ module RubyLLM
# Represents a conversation with an AI model
class Chat
include Enumerable
include Instrumentation

attr_reader :model, :messages, :tools, :params, :headers, :schema

@@ -122,35 +123,48 @@ def each(&)
    end

    def complete(&) # rubocop:disable Metrics/PerceivedComplexity
-      response = @provider.complete(
-        messages,
-        tools: @tools,
-        temperature: @temperature,
-        model: @model,
-        params: @params,
-        headers: @headers,
-        schema: @schema,
-        &wrap_streaming_block(&)
-      )
-
-      @on[:new_message]&.call unless block_given?
-
-      if @schema && response.content.is_a?(String)
-        begin
-          response.content = JSON.parse(response.content)
-        rescue JSON::ParserError
-          # If parsing fails, keep content as string
-        end
-      end
-
-      add_message response
-      @on[:end_message]&.call(response)
-
-      if response.tool_call?
-        handle_tool_calls(response, &)
-      else
-        response
-      end
+      # rubocop:disable Metrics/BlockLength
+      instrument('complete_chat.ruby_llm',
+                 { provider: @provider.slug, model: @model.id, streaming: block_given? }) do |payload|
+        response = @provider.complete(
+          messages,
+          tools: @tools,
+          temperature: @temperature,
+          model: @model,
+          params: @params,
+          headers: @headers,
+          schema: @schema,
+          &wrap_streaming_block(&)
+        )
+
+        @on[:new_message]&.call unless block_given?
+
+        if @schema && response.content.is_a?(String)
+          begin
+            response.content = JSON.parse(response.content)
+          rescue JSON::ParserError
+            # If parsing fails, keep content as string
+          end
+        end
+
+        add_message response
+        @on[:end_message]&.call(response)
+
+        if payload
+          %i[input_tokens output_tokens cached_tokens cache_creation_tokens].each do |field|
+            value = response.public_send(field)
+            payload[field] = value unless value.nil?
+          end
+          payload[:tool_calls] = response.tool_calls.size if response.tool_call?
+        end
+
+        if response.tool_call?
+          handle_tool_calls(response, &)
+        else
+          response
+        end
+      end
+      # rubocop:enable Metrics/BlockLength
end

def add_message(message_or_attributes)
@@ -207,7 +221,13 @@ def handle_tool_calls(response, &) # rubocop:disable Metrics/PerceivedComplexity
def execute_tool(tool_call)
tool = tools[tool_call.name.to_sym]
args = tool_call.arguments
-      tool.call(args)
+
+      instrument('execute_tool.ruby_llm',
+                 { tool_name: tool_call.name, arguments: args }) do |payload|
+        tool.call(args).tap do |result|
+          payload[:halted] = result.is_a?(Tool::Halt) if payload
+        end
+      end
end

def build_content(message, attachments)
12 changes: 11 additions & 1 deletion lib/ruby_llm/embedding.rb
@@ -3,6 +3,8 @@
module RubyLLM
# Core embedding interface.
class Embedding
extend Instrumentation

attr_reader :vectors, :model, :input_tokens

def initialize(vectors:, model:, input_tokens: 0)
@@ -23,7 +25,15 @@ def self.embed(text, # rubocop:disable Metrics/ParameterLists
config: config)
model_id = model.id

-      provider_instance.embed(text, model: model_id, dimensions:)
+      instrument('embed_text.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id, dimensions: dimensions }) do |payload|
+        provider_instance.embed(text, model: model_id, dimensions:).tap do |result|
+          if payload
+            payload[:input_tokens] = result.input_tokens unless result.input_tokens.nil?
+            payload[:vector_count] = result.vectors.is_a?(Array) ? result.vectors.length : 1
+          end
+        end
+      end
end
end
end
7 changes: 6 additions & 1 deletion lib/ruby_llm/image.rb
@@ -3,6 +3,8 @@
module RubyLLM
# Represents a generated image from an AI model.
class Image
extend Instrumentation

attr_reader :url, :data, :mime_type, :revised_prompt, :model_id

def initialize(url: nil, data: nil, mime_type: nil, revised_prompt: nil, model_id: nil)
@@ -43,7 +45,10 @@ def self.paint(prompt, # rubocop:disable Metrics/ParameterLists
config: config)
model_id = model.id

-      provider_instance.paint(prompt, model: model_id, size:)
+      instrument('paint_image.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id, size: size }) do |_payload|
+        provider_instance.paint(prompt, model: model_id, size:)
+      end
end
end
end
15 changes: 15 additions & 0 deletions lib/ruby_llm/instrumentation.rb
@@ -0,0 +1,15 @@
# frozen_string_literal: true

module RubyLLM
# Optional instrumentation if ActiveSupport::Notifications is defined
module Instrumentation
def instrument(name, payload = {})
if defined?(ActiveSupport::Notifications)
ActiveSupport::Notifications.instrument(name, payload) { yield payload }
else
yield
end
end
module_function :instrument
end
end
Author:
An alternative to this module is a module per class (e.g. `RubyLLM::Chat::Instrumentation`) that would be included only when Rails is loaded. For example, in lib/ruby_llm/ruby_llm.rb, at the bottom:

```ruby
# we currently have this:
if defined?(Rails::Railtie)
  require 'ruby_llm/railtie'
  require 'ruby_llm/active_record/acts_as'
  require 'ruby_llm/chat/instrumentation'
end
```

and then:

```ruby
module RubyLLM
  class Chat
    module Instrumentation
      alias_method :original_complete, :complete

      def complete(&)
        ActiveSupport::Notifications.instrument(
          'complete_chat.ruby_llm',
          { ... }
        ) do
          original_complete(&)
        end
      end
    end
  end
end
```

I'm unsure of the best convention for these files in this project, so I'm open to suggestions.

This `instrument` method feels simpler, though; I'm willing to change to the other implementation if preferred.

9 changes: 8 additions & 1 deletion lib/ruby_llm/moderation.rb
@@ -4,6 +4,8 @@ module RubyLLM
# Identify potentially harmful content in text.
# https://platform.openai.com/docs/guides/moderation
class Moderation
extend Instrumentation

attr_reader :id, :model, :results

def initialize(id:, model:, results:)
@@ -23,7 +25,12 @@ def self.moderate(input,
config: config)
model_id = model.id

-      provider_instance.moderate(input, model: model_id)
+      instrument('moderate_content.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id }) do |payload|
+        provider_instance.moderate(input, model: model_id).tap do |result|
+          payload[:flagged] = result.flagged? if payload && result.respond_to?(:flagged?)
+        end
+      end
end

# Convenience method to get content from moderation result
14 changes: 13 additions & 1 deletion lib/ruby_llm/transcription.rb
@@ -3,6 +3,8 @@
module RubyLLM
# Represents a transcription of audio content.
class Transcription
extend Instrumentation

attr_reader :text, :model, :language, :duration, :segments, :input_tokens, :output_tokens

def initialize(text:, model:, **attributes)
@@ -29,7 +31,17 @@ def self.transcribe(audio_file, **kwargs)
config: config)
model_id = model.id

-      provider_instance.transcribe(audio_file, model: model_id, language:, **options)
+      instrument('transcribe_audio.ruby_llm',
+                 { provider: provider_instance.slug, model: model_id }) do |payload|
+        provider_instance.transcribe(audio_file, model: model_id, language:, **options).tap do |result|
+          if payload
+            %i[input_tokens output_tokens duration].each do |field|
+              value = result.public_send(field)
+              payload[field] = value unless value.nil?
+            end
+          end
+        end
+      end
end
end
end