diff --git a/README.md b/README.md index c0b8fcd50..ce2a4f038 100644 --- a/README.md +++ b/README.md @@ -120,7 +120,9 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr * **Tools:** Let AI call your Ruby methods * **Structured output:** JSON schemas that just work * **Streaming:** Real-time responses with blocks -* **Rails:** ActiveRecord integration with `acts_as_chat` +* **Rails:** + * ActiveRecord integration with `acts_as_chat` + * Instrumentation with `ActiveSupport::Notifications` for observability * **Async:** Fiber-based concurrency * **Model registry:** 500+ models with capability detection and pricing * **Providers:** OpenAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API diff --git a/docs/_advanced/rails.md b/docs/_advanced/rails.md index d34e1a25b..cea4cf4dd 100644 --- a/docs/_advanced/rails.md +++ b/docs/_advanced/rails.md @@ -31,6 +31,7 @@ After reading this guide, you will know: * How to store raw provider payloads (Anthropic prompt caching, etc.) * How to integrate streaming responses with Hotwire/Turbo Streams * How to customize the persistence behavior for validation-focused scenarios +* How to use the built-in instrumentation to observe your messages ## Understanding the Persistence Flow @@ -978,6 +979,149 @@ class Chat < ApplicationRecord end ``` +## Instrumentation + +RubyLLM automatically instruments all operations using `ActiveSupport::Notifications` when used within Rails (or anywhere `ActiveSupport::Notifications` is loaded). This enables monitoring, logging, and performance tracking without any configuration. 
+ +### Available Events + +RubyLLM instruments six events: + +| Event Name | Operation | When It Fires | +| --------------------------- | ------------------- | -------------------------- | +| `complete_chat.ruby_llm` | Chat completions | Every chat API call | +| `embed_text.ruby_llm` | Text embeddings | Every embedding generation | +| `paint_image.ruby_llm` | Image generation | Every image creation | +| `moderate_content.ruby_llm` | Content moderation | Every moderation check | +| `transcribe_audio.ruby_llm` | Audio transcription | Every transcription | +| `execute_tool.ruby_llm` | Tool execution | Every tool call | + +### Subscribing to Events + +Use `ActiveSupport::Notifications` to consume those events: + +```ruby +# config/initializers/ruby_llm.rb or anywhere else + +# Subscribe to all RubyLLM events +ActiveSupport::Notifications.subscribe(/\.ruby_llm$/) do |name, start, finish, id, payload| + duration = (finish - start) * 1000 # Convert to milliseconds + + Rails.logger.info({ + event: name, + duration_ms: duration.round(2), + provider: payload[:provider], + model: payload[:model], + timestamp: start + }.to_json) +end + +# Subscribe to specific events +ActiveSupport::Notifications.subscribe('complete_chat.ruby_llm') do |name, start, finish, id, payload| + duration = (finish - start) * 1000 + + # Track costs and usage + MetricsService.record( + event: 'llm_completion', + duration: duration, + input_tokens: payload[:input_tokens], + output_tokens: payload[:output_tokens], + cached_tokens: payload[:cached_tokens], + provider: payload[:provider], + model: payload[:model] + ) +end +``` + +### Event Payloads + +Each event includes relevant information in its payload: + +#### `complete_chat.ruby_llm` + +Fired for every chat completion request. 
+ +```ruby +{ + provider: 'anthropic', # Provider slug + model: 'claude-sonnet-4-5', # Model ID + streaming: true, # Whether streaming was used + input_tokens: 150, # Prompt tokens consumed + output_tokens: 250, # Completion tokens generated + cached_tokens: 100, # Cache read tokens (if supported) + cache_creation_tokens: 50, # Cache write tokens (if supported) + tool_calls: 2 # Number of tools called (if any) +} +``` + +Token fields are copied from the provider's response; a field is omitted when the provider does not report it. + +#### `execute_tool.ruby_llm` + +Fired for each tool executed during chat completions. + +```ruby +{ + tool_name: 'weather', # Tool name from the tool call (e.g. 'calculator' for a Calculator tool) + arguments: { city: 'Paris' }, # Arguments passed to tool + halted: false # Whether tool returned Tool::Halt +} +``` + +The `halted` field indicates if the tool stopped the conversation loop. + +#### `embed_text.ruby_llm` + +Fired for embedding generation requests. + +```ruby +{ + provider: 'openai', # Provider slug + model: 'text-embedding-3-large', # Model ID + dimensions: 1024, # Embedding dimensions (if specified) + input_tokens: 45, # Input tokens consumed + vector_count: 1 # Number of vectors generated +} +``` + +#### `paint_image.ruby_llm` + +Fired for image generation requests. + +```ruby +{ + provider: 'openai', # Provider slug + model: 'dall-e-3', # Model ID + size: '1024x1024' # Image dimensions +} +``` + +#### `moderate_content.ruby_llm` + +Fired for content moderation requests. + +```ruby +{ + provider: 'openai', # Provider slug + model: 'text-moderation-latest', # Model ID + flagged: false # Whether content was flagged +} +``` + +#### `transcribe_audio.ruby_llm` + +Fired for audio transcription requests. 
+ +```ruby +{ + provider: 'openai', # Provider slug + model: 'whisper-1', # Model ID + input_tokens: 0, # Input tokens (if provider reports) + output_tokens: 120, # Output tokens (if provider reports) + duration: 45.3 # Audio duration in seconds (if available) +} +``` + ## Next Steps * [Chatting with AI Models]({% link _core_features/chat.md %}) diff --git a/lib/ruby_llm/chat.rb b/lib/ruby_llm/chat.rb index d03d872ca..3e128163d 100644 --- a/lib/ruby_llm/chat.rb +++ b/lib/ruby_llm/chat.rb @@ -4,6 +4,7 @@ module RubyLLM # Represents a conversation with an AI model class Chat include Enumerable + include Instrumentation attr_reader :model, :messages, :tools, :params, :headers, :schema @@ -122,35 +123,48 @@ def each(&) end def complete(&) # rubocop:disable Metrics/PerceivedComplexity - response = @provider.complete( - messages, - tools: @tools, - temperature: @temperature, - model: @model, - params: @params, - headers: @headers, - schema: @schema, - &wrap_streaming_block(&) - ) - - @on[:new_message]&.call unless block_given? - - if @schema && response.content.is_a?(String) - begin - response.content = JSON.parse(response.content) - rescue JSON::ParserError - # If parsing fails, keep content as string + # rubocop:disable Metrics/BlockLength + instrument('complete_chat.ruby_llm', + { provider: @provider.slug, model: @model.id, streaming: block_given? }) do |payload| + response = @provider.complete( + messages, + tools: @tools, + temperature: @temperature, + model: @model, + params: @params, + headers: @headers, + schema: @schema, + &wrap_streaming_block(&) + ) + + @on[:new_message]&.call unless block_given? 
+ + if @schema && response.content.is_a?(String) + begin + response.content = JSON.parse(response.content) + rescue JSON::ParserError + # If parsing fails, keep content as string + end end - end - add_message response - @on[:end_message]&.call(response) + add_message response + @on[:end_message]&.call(response) + + if payload + %i[input_tokens output_tokens cached_tokens cache_creation_tokens].each do |field| + value = response.public_send(field) + payload[field] = value unless value.nil? + end + payload[:tool_calls] = response.tool_calls.size if response.tool_call? + end - if response.tool_call? - handle_tool_calls(response, &) - else - response + if response.tool_call? + handle_tool_calls(response, &) + else + response + end end + # rubocop:enable Metrics/BlockLength end def add_message(message_or_attributes) @@ -207,7 +221,13 @@ def handle_tool_calls(response, &) # rubocop:disable Metrics/PerceivedComplexity def execute_tool(tool_call) tool = tools[tool_call.name.to_sym] args = tool_call.arguments - tool.call(args) + + instrument('execute_tool.ruby_llm', + { tool_name: tool_call.name, arguments: args }) do |payload| + tool.call(args).tap do |result| + payload[:halted] = result.is_a?(Tool::Halt) if payload + end + end end def build_content(message, attachments) diff --git a/lib/ruby_llm/embedding.rb b/lib/ruby_llm/embedding.rb index 159620f83..f99afb279 100644 --- a/lib/ruby_llm/embedding.rb +++ b/lib/ruby_llm/embedding.rb @@ -3,6 +3,8 @@ module RubyLLM # Core embedding interface. 
class Embedding + extend Instrumentation + attr_reader :vectors, :model, :input_tokens def initialize(vectors:, model:, input_tokens: 0) @@ -23,7 +25,15 @@ def self.embed(text, # rubocop:disable Metrics/ParameterLists config: config) model_id = model.id - provider_instance.embed(text, model: model_id, dimensions:) + instrument('embed_text.ruby_llm', + { provider: provider_instance.slug, model: model_id, dimensions: dimensions }) do |payload| + provider_instance.embed(text, model: model_id, dimensions:).tap do |result| + if payload + payload[:input_tokens] = result.input_tokens unless result.input_tokens.nil? + payload[:vector_count] = result.vectors.is_a?(Array) ? result.vectors.length : 1 + end + end + end end end end diff --git a/lib/ruby_llm/image.rb b/lib/ruby_llm/image.rb index 738ed6bbf..d70d5c9b7 100644 --- a/lib/ruby_llm/image.rb +++ b/lib/ruby_llm/image.rb @@ -3,6 +3,8 @@ module RubyLLM # Represents a generated image from an AI model. class Image + extend Instrumentation + attr_reader :url, :data, :mime_type, :revised_prompt, :model_id def initialize(url: nil, data: nil, mime_type: nil, revised_prompt: nil, model_id: nil) @@ -43,7 +45,10 @@ def self.paint(prompt, # rubocop:disable Metrics/ParameterLists config: config) model_id = model.id - provider_instance.paint(prompt, model: model_id, size:) + instrument('paint_image.ruby_llm', + { provider: provider_instance.slug, model: model_id, size: size }) do |_payload| + provider_instance.paint(prompt, model: model_id, size:) + end end end end diff --git a/lib/ruby_llm/instrumentation.rb b/lib/ruby_llm/instrumentation.rb new file mode 100644 index 000000000..6f9165051 --- /dev/null +++ b/lib/ruby_llm/instrumentation.rb @@ -0,0 +1,15 @@ +# frozen_string_literal: true + +module RubyLLM + # Optional instrumentation if ActiveSupport::Notifications is defined + module Instrumentation + def instrument(name, payload = {}) + if defined?(ActiveSupport::Notifications) + ActiveSupport::Notifications.instrument(name, 
payload) { yield payload } + else + yield + end + end + module_function :instrument + end +end diff --git a/lib/ruby_llm/moderation.rb b/lib/ruby_llm/moderation.rb index 5c7a87098..54b444997 100644 --- a/lib/ruby_llm/moderation.rb +++ b/lib/ruby_llm/moderation.rb @@ -4,6 +4,8 @@ module RubyLLM # Identify potentially harmful content in text. # https://platform.openai.com/docs/guides/moderation class Moderation + extend Instrumentation + attr_reader :id, :model, :results def initialize(id:, model:, results:) @@ -23,7 +25,12 @@ def self.moderate(input, config: config) model_id = model.id - provider_instance.moderate(input, model: model_id) + instrument('moderate_content.ruby_llm', + { provider: provider_instance.slug, model: model_id }) do |payload| + provider_instance.moderate(input, model: model_id).tap do |result| + payload[:flagged] = result.flagged? if payload && result.respond_to?(:flagged?) + end + end end # Convenience method to get content from moderation result diff --git a/lib/ruby_llm/transcription.rb b/lib/ruby_llm/transcription.rb index c1fe13850..28e09cc30 100644 --- a/lib/ruby_llm/transcription.rb +++ b/lib/ruby_llm/transcription.rb @@ -3,6 +3,8 @@ module RubyLLM # Represents a transcription of audio content. class Transcription + extend Instrumentation + attr_reader :text, :model, :language, :duration, :segments, :input_tokens, :output_tokens def initialize(text:, model:, **attributes) @@ -29,7 +31,17 @@ def self.transcribe(audio_file, **kwargs) config: config) model_id = model.id - provider_instance.transcribe(audio_file, model: model_id, language:, **options) + instrument('transcribe_audio.ruby_llm', + { provider: provider_instance.slug, model: model_id }) do |payload| + provider_instance.transcribe(audio_file, model: model_id, language:, **options).tap do |result| + if payload + %i[input_tokens output_tokens duration].each do |field| + value = result.public_send(field) + payload[field] = value unless value.nil? 
+ end + end + end + end end end end diff --git a/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_emits_complete_chat_ruby_llm_event.yml b/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_emits_complete_chat_ruby_llm_event.yml new file mode 100644 index 000000000..117d6ad11 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_emits_complete_chat_ruby_llm_event.yml @@ -0,0 +1,34 @@ +--- +http_interactions: +- request: + method: post + uri: "/chat/completions" + body: + encoding: UTF-8 + string: '{"model":"qwen3","messages":[{"role":"user","content":"Hello"}],"stream":false}' + headers: + User-Agent: + - Faraday v2.14.0 + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Content-Type: + - application/json + Date: + - Thu, 04 Dec 2025 16:31:49 GMT + Content-Length: + - '705' + body: + encoding: ASCII-8BIT + string: !binary |- + eyJpZCI6ImNoYXRjbXBsLTI2NSIsIm9iamVjdCI6ImNoYXQuY29tcGxldGlvbiIsImNyZWF0ZWQiOjE3NjQ4NjU5MDksIm1vZGVsIjoicXdlbjMiLCJzeXN0ZW1fZmluZ2VycHJpbnQiOiJmcF9vbGxhbWEiLCJjaG9pY2VzIjpbeyJpbmRleCI6MCwibWVzc2FnZSI6eyJyb2xlIjoiYXNzaXN0YW50IiwiY29udGVudCI6IkhlbGxvISBIb3cgY2FuIEkgYXNzaXN0IHlvdSB0b2RheT8g8J+YiiIsInJlYXNvbmluZyI6Ik9rYXksIHRoZSB1c2VyIGdyZWV0ZWQgbWUgd2l0aCBcIkhlbGxvXCIsIHNvIEkgc2hvdWxkIHJlc3BvbmQgcG9saXRlbHkuIExldCBtZSBtYWtlIHN1cmUgdG8gYWNrbm93bGVkZ2UgdGhlaXIgZ3JlZXRpbmcgYW5kIG9mZmVyIGFzc2lzdGFuY2UuIEkgc2hvdWxkIGtlZXAgdGhlIHRvbmUgZnJpZW5kbHkgYW5kIG9wZW4tZW5kZWQuIE1heWJlIHNvbWV0aGluZyBsaWtlLCBcIkhlbGxvISBIb3cgY2FuIEkgYXNzaXN0IHlvdSB0b2RheT9cIiBUaGF0IHNvdW5kcyBnb29kLiBMZXQgbWUgY2hlY2sgaWYgdGhlcmUncyBhbnl0aGluZyBlbHNlIEkgbmVlZCB0byBjb25zaWRlci4gTm8sIHRoYXQgc2hvdWxkIGJlIGZpbmUuIFJlYWR5IHRvIHJlc3BvbmQuXG4ifSwiZmluaXNoX3JlYXNvbiI6InN0b3AifV0sInVzYWdlIjp7InByb21wdF90b2tlbnMiOjExLCJjb21wbGV0aW9uX3Rva2VucyI6OTYsInRvdGFsX3Rva2VucyI6MTA3fX0K + recorded_at: Thu, 04 Dec 
2025 16:31:49 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_includes_streaming_flag_as_false_for_non-streaming_requests.yml b/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_includes_streaming_flag_as_false_for_non-streaming_requests.yml new file mode 100644 index 000000000..3ef33a70d --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_includes_streaming_flag_as_false_for_non-streaming_requests.yml @@ -0,0 +1,34 @@ +--- +http_interactions: +- request: + method: post + uri: "/chat/completions" + body: + encoding: UTF-8 + string: '{"model":"qwen3","messages":[{"role":"user","content":"Test"}],"stream":false}' + headers: + User-Agent: + - Faraday v2.14.0 + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Content-Type: + - application/json + Date: + - Thu, 04 Dec 2025 16:31:55 GMT + Content-Length: + - '790' + body: + encoding: ASCII-8BIT + string: !binary |- + 
eyJpZCI6ImNoYXRjbXBsLTk2OCIsIm9iamVjdCI6ImNoYXQuY29tcGxldGlvbiIsImNyZWF0ZWQiOjE3NjQ4NjU5MTUsIm1vZGVsIjoicXdlbjMiLCJzeXN0ZW1fZmluZ2VycHJpbnQiOiJmcF9vbGxhbWEiLCJjaG9pY2VzIjpbeyJpbmRleCI6MCwibWVzc2FnZSI6eyJyb2xlIjoiYXNzaXN0YW50IiwiY29udGVudCI6IkhlbGxvISBJdCBzZWVtcyBsaWtlIHlvdSdyZSB0ZXN0aW5nIG9yIGp1c3Qgc2F5aW5nIGhlbGxvLiBIb3cgY2FuIEkgYXNzaXN0IHlvdSB0b2RheT8g8J+YiiIsInJlYXNvbmluZyI6Ik9rYXksIHRoZSB1c2VyIHNlbnQgXCJUZXN0IC90aGlua1wiLiBJIG5lZWQgdG8gZmlndXJlIG91dCB3aGF0IHRoZXkncmUgYXNraW5nLiBUaGV5IG1pZ2h0IGJlIHRlc3RpbmcgaWYgdGhlIEFJIGlzIHdvcmtpbmcgb3IgdHJ5aW5nIG91dCBhIGZlYXR1cmUuIFNpbmNlIHRoZXJlJ3Mgbm8gc3BlY2lmaWMgcXVlc3Rpb24sIEkgc2hvdWxkIHJlc3BvbmQgaW4gYSBmcmllbmRseSwgb3Blbi1lbmRlZCB3YXkuIE1heWJlIGFzayBob3cgSSBjYW4gYXNzaXN0IHRoZW0uIEtlZXAgaXQgc2ltcGxlIGFuZCBpbnZpdGluZy4gTGV0IHRoZW0ga25vdyBJJ20gaGVyZSB0byBoZWxwIHdpdGggYW55IHF1ZXN0aW9ucyB0aGV5IGhhdmUuIFRoYXQgc2hvdWxkIGNvdmVyIGl0IHdpdGhvdXQgYXNzdW1pbmcgdG9vIG11Y2guXG4ifSwiZmluaXNoX3JlYXNvbiI6InN0b3AifV0sInVzYWdlIjp7InByb21wdF90b2tlbnMiOjExLCJjb21wbGV0aW9uX3Rva2VucyI6MTE3LCJ0b3RhbF90b2tlbnMiOjEyOH19Cg== + recorded_at: Thu, 04 Dec 2025 16:31:55 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_includes_streaming_flag_as_true_for_streaming_requests.yml b/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_includes_streaming_flag_as_true_for_streaming_requests.yml new file mode 100644 index 000000000..6a588b2d6 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_chat_instrumentation_includes_streaming_flag_as_true_for_streaming_requests.yml @@ -0,0 +1,34 @@ +--- +http_interactions: +- request: + method: post + uri: "/chat/completions" + body: + encoding: UTF-8 + string: '{"model":"qwen3","messages":[{"role":"user","content":"Test"}],"stream":true,"stream_options":{"include_usage":true}}' + headers: + User-Agent: + - Faraday v2.14.0 + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" 
+ response: + status: + code: 200 + message: OK + headers: + Content-Type: + - text/event-stream + Date: + - Thu, 04 Dec 2025 16:31:55 GMT + Transfer-Encoding: + - chunked + body: + encoding: ASCII-8BIT + string: !binary |- +  + recorded_at: Thu, 04 Dec 2025 16:32:00 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/instrumentation_embedding_instrumentation_emits_embed_text_ruby_llm_event.yml b/spec/fixtures/vcr_cassettes/instrumentation_embedding_instrumentation_emits_embed_text_ruby_llm_event.yml new file mode 100644 index 000000000..8c49cd89b --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_embedding_instrumentation_emits_embed_text_ruby_llm_event.yml @@ -0,0 +1,68 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/embeddings + body: + encoding: UTF-8 + string: '{"model":"text-embedding-3-small","input":"Test"}' + headers: + User-Agent: + - Faraday v2.14.0 + Authorization: + - Bearer test + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 401 + message: Unauthorized + headers: + Date: + - Thu, 04 Dec 2025 16:32:00 GMT + Content-Type: + - application/json; charset=utf-8 + Content-Length: + - '254' + Connection: + - keep-alive + Vary: + - Origin + X-Request-Id: + - "" + X-Envoy-Upstream-Service-Time: + - '1' + X-Openai-Proxy-Wasm: + - v0.1 + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: UTF-8 + string: | + { + "error": { + "message": "Incorrect API key provided: test. 
You can find your API key at https://platform.openai.com/account/api-keys.", + "type": "invalid_request_error", + "param": null, + "code": "invalid_api_key" + } + } + recorded_at: Thu, 04 Dec 2025 16:32:00 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/instrumentation_embedding_instrumentation_includes_dimensions_in_payload_when_specified.yml b/spec/fixtures/vcr_cassettes/instrumentation_embedding_instrumentation_includes_dimensions_in_payload_when_specified.yml new file mode 100644 index 000000000..be4fef329 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_embedding_instrumentation_includes_dimensions_in_payload_when_specified.yml @@ -0,0 +1,68 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/embeddings + body: + encoding: UTF-8 + string: '{"model":"text-embedding-3-small","input":"Test","dimensions":512}' + headers: + User-Agent: + - Faraday v2.14.0 + Authorization: + - Bearer test + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 401 + message: Unauthorized + headers: + Date: + - Thu, 04 Dec 2025 16:32:01 GMT + Content-Type: + - application/json; charset=utf-8 + Content-Length: + - '254' + Connection: + - keep-alive + Vary: + - Origin + X-Request-Id: + - "" + X-Envoy-Upstream-Service-Time: + - '1' + X-Openai-Proxy-Wasm: + - v0.1 + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: UTF-8 + string: | + { + "error": { + "message": "Incorrect API key provided: test. 
You can find your API key at https://platform.openai.com/account/api-keys.", + "type": "invalid_request_error", + "param": null, + "code": "invalid_api_key" + } + } + recorded_at: Thu, 04 Dec 2025 16:32:01 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/instrumentation_image_instrumentation_emits_paint_image_ruby_llm_event.yml b/spec/fixtures/vcr_cassettes/instrumentation_image_instrumentation_emits_paint_image_ruby_llm_event.yml new file mode 100644 index 000000000..9e8ab68a3 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_image_instrumentation_emits_paint_image_ruby_llm_event.yml @@ -0,0 +1,70 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/images/generations + body: + encoding: UTF-8 + string: '{"model":"dall-e-3","prompt":"Test","n":1,"size":"1024x1024"}' + headers: + User-Agent: + - Faraday v2.14.0 + Authorization: + - Bearer test + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 401 + message: Unauthorized + headers: + Date: + - Thu, 04 Dec 2025 16:32:01 GMT + Content-Type: + - application/json + Content-Length: + - '233' + Connection: + - keep-alive + Www-Authenticate: + - Bearer realm="OpenAI API" + Openai-Version: + - '2020-10-01' + X-Request-Id: + - "" + Openai-Processing-Ms: + - '41' + X-Envoy-Upstream-Service-Time: + - '44' + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: UTF-8 + string: |- + { + "error": { + "message": "Incorrect API key provided: test. 
You can find your API key at https://platform.openai.com/account/api-keys.", + "type": "invalid_request_error", + "param": null, + "code": "invalid_api_key" + } + } + recorded_at: Thu, 04 Dec 2025 16:32:01 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/instrumentation_moderation_instrumentation_emits_moderate_content_ruby_llm_event.yml b/spec/fixtures/vcr_cassettes/instrumentation_moderation_instrumentation_emits_moderate_content_ruby_llm_event.yml new file mode 100644 index 000000000..424e5541f --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_moderation_instrumentation_emits_moderate_content_ruby_llm_event.yml @@ -0,0 +1,72 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/moderations + body: + encoding: UTF-8 + string: '{"model":"omni-moderation-latest","input":"Test"}' + headers: + User-Agent: + - Faraday v2.14.0 + Authorization: + - Bearer test + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 401 + message: Unauthorized + headers: + Date: + - Thu, 04 Dec 2025 16:32:01 GMT + Content-Type: + - application/json + Content-Length: + - '233' + Connection: + - keep-alive + Www-Authenticate: + - Bearer realm="OpenAI API" + Openai-Version: + - '2020-10-01' + X-Request-Id: + - "" + Openai-Processing-Ms: + - '15' + X-Envoy-Upstream-Service-Time: + - '21' + X-Openai-Proxy-Wasm: + - v0.1 + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: UTF-8 + string: |- + { + "error": { + "message": "Incorrect API key provided: test. 
You can find your API key at https://platform.openai.com/account/api-keys.", + "type": "invalid_request_error", + "param": null, + "code": "invalid_api_key" + } + } + recorded_at: Thu, 04 Dec 2025 16:32:01 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/instrumentation_tool_execution_instrumentation_emits_execute_tool_ruby_llm_event_when_tools_are_called.yml b/spec/fixtures/vcr_cassettes/instrumentation_tool_execution_instrumentation_emits_execute_tool_ruby_llm_event_when_tools_are_called.yml new file mode 100644 index 000000000..be8a1491b --- /dev/null +++ b/spec/fixtures/vcr_cassettes/instrumentation_tool_execution_instrumentation_emits_execute_tool_ruby_llm_event_when_tools_are_called.yml @@ -0,0 +1,80 @@ +--- +http_interactions: +- request: + method: post + uri: "/chat/completions" + body: + encoding: UTF-8 + string: '{"model":"qwen3","messages":[{"role":"user","content":"What is 2 + + 2?"}],"stream":false,"tools":[{"type":"function","function":{"name":"calculator","description":"Perform + calculations","parameters":{"type":"object","properties":{"expression":{"type":"string","description":"Math + expression"}},"required":["expression"],"additionalProperties":false,"strict":true}}}]}' + headers: + User-Agent: + - Faraday v2.14.0 + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Content-Type: + - application/json + Date: + - Thu, 04 Dec 2025 16:32:08 GMT + Content-Length: + - '904' + body: + encoding: UTF-8 + string: '{"id":"chatcmpl-860","object":"chat.completion","created":1764865928,"model":"qwen3","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"","reasoning":"Okay, + the user is asking \"What is 2 + 2?\" So I need to calculate that. Let me + check the tools provided. There''s a calculator function that takes an expression. 
+ The parameters require an expression as a string. The expression here is \"2 + + 2\", so I should call the calculator function with that expression. I just + need to make sure to format the tool call correctly in JSON inside the XML + tags. No other functions are available, so this is straightforward.\n","tool_calls":[{"id":"call_mdtpeblv","index":0,"type":"function","function":{"name":"calculator","arguments":"{\"expression\":\"2 + + 2\"}"}}]},"finish_reason":"tool_calls"}],"usage":{"prompt_tokens":144,"completion_tokens":125,"total_tokens":269}} + + ' + recorded_at: Thu, 04 Dec 2025 16:32:08 GMT +- request: + method: post + uri: "/chat/completions" + body: + encoding: UTF-8 + string: '{"model":"qwen3","messages":[{"role":"user","content":"What is 2 + + 2?"},{"role":"assistant","content":"","tool_calls":[{"id":"call_mdtpeblv","type":"function","function":{"name":"calculator","arguments":"{\"expression\":\"2 + + 2\"}"}}]},{"role":"tool","content":"4","tool_call_id":"call_mdtpeblv"}],"stream":false,"tools":[{"type":"function","function":{"name":"calculator","description":"Perform + calculations","parameters":{"type":"object","properties":{"expression":{"type":"string","description":"Math + expression"}},"required":["expression"],"additionalProperties":false,"strict":true}}}]}' + headers: + User-Agent: + - Faraday v2.14.0 + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Content-Type: + - application/json + Date: + - Thu, 04 Dec 2025 16:32:13 GMT + Content-Length: + - '755' + body: + encoding: ASCII-8BIT + string: !binary |- + 
eyJpZCI6ImNoYXRjbXBsLTEyNCIsIm9iamVjdCI6ImNoYXQuY29tcGxldGlvbiIsImNyZWF0ZWQiOjE3NjQ4NjU5MzMsIm1vZGVsIjoicXdlbjMiLCJzeXN0ZW1fZmluZ2VycHJpbnQiOiJmcF9vbGxhbWEiLCJjaG9pY2VzIjpbeyJpbmRleCI6MCwibWVzc2FnZSI6eyJyb2xlIjoiYXNzaXN0YW50IiwiY29udGVudCI6IlRoZSByZXN1bHQgb2YgMiArIDIgaXMgKio0KiouIExldCBtZSBrbm93IGlmIHlvdSBuZWVkIGhlbHAgd2l0aCBhbnl0aGluZyBlbHNlISDwn5iKIiwicmVhc29uaW5nIjoiT2theSwgdGhlIHVzZXIgYXNrZWQgXCJXaGF0IGlzIDIgKyAyP1wiIGFuZCBJIHVzZWQgdGhlIGNhbGN1bGF0b3IgZnVuY3Rpb24gdG8gY29tcHV0ZSBpdC4gVGhlIHJlc3BvbnNlIGZyb20gdGhlIHRvb2wgd2FzIDQuIE5vdyBJIG5lZWQgdG8gcHJlc2VudCB0aGlzIGFuc3dlciBjbGVhcmx5LiBTaW5jZSB0aGUgY2FsY3VsYXRpb24gaXMgc3RyYWlnaHRmb3J3YXJkLCBJJ2xsIHN0YXRlIHRoZSByZXN1bHQgZGlyZWN0bHkuIE1ha2Ugc3VyZSB0byBmb3JtYXQgdGhlIHJlc3BvbnNlIGluIGEgZnJpZW5kbHkgYW5kIGhlbHBmdWwgbWFubmVyLCBjb25maXJtaW5nIHRoZSByZXN1bHQgYW5kIG9mZmVyaW5nIGZ1cnRoZXIgYXNzaXN0YW5jZSBpZiBuZWVkZWQuXG4ifSwiZmluaXNoX3JlYXNvbiI6InN0b3AifV0sInVzYWdlIjp7InByb21wdF90b2tlbnMiOjE3OSwiY29tcGxldGlvbl90b2tlbnMiOjEwOSwidG90YWxfdG9rZW5zIjoyODh9fQo= + recorded_at: Thu, 04 Dec 2025 16:32:13 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/ruby_llm/instrumentation_spec.rb b/spec/ruby_llm/instrumentation_spec.rb new file mode 100644 index 000000000..b70770b35 --- /dev/null +++ b/spec/ruby_llm/instrumentation_spec.rb @@ -0,0 +1,213 @@ +# frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe 'RubyLLM::Instrumentation' do + include_context 'with configured RubyLLM' + + let(:events) { [] } + let(:subscriber) do + lambda do |name, start, finish, _id, payload| + events << { + name: name, + duration: finish - start, + payload: payload.dup + } + end + end + + before do + events.clear + ActiveSupport::Notifications.subscribe(/\.ruby_llm$/, subscriber) + end + + after do + ActiveSupport::Notifications.unsubscribe(subscriber) + end + + describe 'Chat instrumentation' do + it 'emits complete_chat.ruby_llm event' do + chat = RubyLLM.chat(model: 'qwen3', provider: :ollama) + + begin + chat.ask 'Hello' + rescue 
Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      chat_events = events.select { |e| e[:name] == 'complete_chat.ruby_llm' }
+      expect(chat_events).not_to be_empty
+
+      event = chat_events.first
+      expect(event[:payload][:provider]).to eq('ollama')
+      expect(event[:payload][:model]).to eq('qwen3')
+      expect(event[:payload][:streaming]).to eq(false)
+      expect(event[:duration]).to be >= 0
+    end
+
+    it 'includes streaming flag as false for non-streaming requests' do
+      chat = RubyLLM.chat(model: 'qwen3', provider: :ollama)
+
+      begin
+        chat.ask 'Test'
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      event = events.find { |e| e[:name] == 'complete_chat.ruby_llm' }
+      expect(event).not_to be_nil
+      expect(event[:payload][:streaming]).to eq(false)
+      expect(event[:payload][:provider]).to eq('ollama')
+      expect(event[:payload][:model]).to eq('qwen3')
+    end
+
+    it 'includes streaming flag as true for streaming requests' do
+      chat = RubyLLM.chat(model: 'qwen3', provider: :ollama)
+
+      begin
+        chat.ask('Test') { |_chunk| }
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      event = events.find { |e| e[:name] == 'complete_chat.ruby_llm' }
+      expect(event).not_to be_nil
+      expect(event[:payload][:streaming]).to eq(true)
+      expect(event[:payload][:provider]).to eq('ollama')
+      expect(event[:payload][:model]).to eq('qwen3')
+    end
+  end
+
+  describe 'Embedding instrumentation' do
+    it 'emits embed_text.ruby_llm event' do
+      begin
+        RubyLLM.embed('Test', model: 'text-embedding-3-small', provider: :openai)
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      embed_events = events.select { |e| e[:name] == 'embed_text.ruby_llm' }
+      expect(embed_events).not_to be_empty
+
+      event = embed_events.first
+      expect(event[:payload][:provider]).to eq('openai')
+      expect(event[:payload][:model]).to eq('text-embedding-3-small')
+      # vector_count may not be present if an error occurred
+      expect(event[:payload]).to have_key(:vector_count).or have_key(:exception)
+      expect(event[:duration]).to be >= 0
+    end
+
+    it 'includes dimensions in payload when specified' do
+      begin
+        RubyLLM.embed('Test', model: 'text-embedding-3-small', provider: :openai, dimensions: 512)
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      event = events.find { |e| e[:name] == 'embed_text.ruby_llm' }
+      expect(event).not_to be_nil
+      expect(event[:payload][:dimensions]).to eq(512)
+      expect(event[:payload][:provider]).to eq('openai')
+      expect(event[:payload][:model]).to eq('text-embedding-3-small')
+    end
+  end
+
+  describe 'Image instrumentation' do
+    it 'emits paint_image.ruby_llm event' do
+      begin
+        RubyLLM.paint('Test', model: 'dall-e-3', provider: :openai)
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      paint_events = events.select { |e| e[:name] == 'paint_image.ruby_llm' }
+      expect(paint_events).not_to be_empty
+
+      event = paint_events.first
+      expect(event[:payload][:provider]).to eq('openai')
+      expect(event[:payload][:model]).to eq('dall-e-3')
+      expect(event[:payload]).to have_key(:size)
+      expect(event[:duration]).to be >= 0
+    end
+  end
+
+  describe 'Moderation instrumentation' do
+    it 'emits moderate_content.ruby_llm event' do
+      begin
+        RubyLLM.moderate('Test', model: 'omni-moderation-latest', provider: :openai)
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      moderate_events = events.select { |e| e[:name] == 'moderate_content.ruby_llm' }
+      expect(moderate_events).not_to be_empty
+
+      event = moderate_events.first
+      expect(event[:payload][:provider]).to eq('openai')
+      expect(event[:payload][:model]).to eq('omni-moderation-latest')
+      # flagged may not be present if an error occurred
+      expect(event[:payload]).to have_key(:flagged).or have_key(:exception)
+      expect(event[:duration]).to be >= 0
+    end
+  end
+
+  describe 'Transcription instrumentation' do
+    let(:audio_file) { File.expand_path('../../fixtures/audio.wav', __dir__) }
+
+    it 'emits transcribe_audio.ruby_llm event' do
+      skip 'Audio file not available' unless File.exist?(audio_file)
+
+      begin
+        RubyLLM.transcribe(audio_file, model: 'whisper-1', provider: :openai)
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      transcribe_events = events.select { |e| e[:name] == 'transcribe_audio.ruby_llm' }
+      expect(transcribe_events).not_to be_empty
+
+      event = transcribe_events.first
+      expect(event[:payload][:provider]).to eq('openai')
+      expect(event[:payload][:model]).to eq('whisper-1')
+      expect(event[:duration]).to be >= 0
+    end
+  end
+
+  describe 'Tool execution instrumentation' do
+    let(:calculator_tool) do
+      Class.new(RubyLLM::Tool) do
+        def self.name
+          'Calculator'
+        end
+
+        description 'Perform calculations'
+        param :expression, desc: 'Math expression'
+
+        def execute(expression:)
+          eval expression
+        end
+      end
+    end
+
+    it 'emits execute_tool.ruby_llm event when tools are called' do
+      chat = RubyLLM.chat(model: 'qwen3', provider: :ollama)
+      chat.with_tool(calculator_tool)
+
+      begin
+        chat.ask 'What is 2 + 2?'
+      rescue Faraday::Error, RubyLLM::Error => e
+        skip "API not available: #{e.message}" if events.empty?
+      end
+
+      tool_events = events.select { |e| e[:name] == 'execute_tool.ruby_llm' }
+
+      skip 'Model did not call the tool in this test run' if tool_events.empty?
+
+      event = tool_events.first
+      expect(event[:payload][:tool_name]).to eq('calculator')
+      expect(event[:payload]).to have_key(:arguments)
+      expect(event[:payload][:halted]).to be_in([true, false])
+      expect(event[:duration]).to be >= 0
+    end
+  end
+end