diff --git a/README.md b/README.md index 89396fc..4f18804 100644 --- a/README.md +++ b/README.md @@ -6,10 +6,12 @@
-**Potomatic** is a command-line tool for translating `.pot` (Portable Object Template) files into multiple languages using AI (currently OpenAI). We built it to streamline large-scale localization of WordPress products, a process we detail in [this behind‑the‑scenes article](https://www.gravitykit.com/translating-wordpress-plugins-using-chatgpt/). +**Potomatic** is a command-line tool for translating `.pot` (Portable Object Template) files into multiple languages using AI. We built it to streamline large-scale localization of WordPress products, a process we detail in [this behind‑the‑scenes article](https://www.gravitykit.com/translating-wordpress-plugins-using-chatgpt/). While [`gpt-po`](https://github.com/ryanhex53/gpt-po) helped us get started, we needed smarter retry logic, cost controls, and better visibility into large jobs, among other things. **Potomatic** delivers those improvements and more, as well adds fine‑grained prompt tuning through a built‑in [A/B testing utility](#-ab-testing-for-prompt-optimization). +Supports multiple AI providers: **OpenAI**, **Google Gemini**, and **Google Translate API**. + ## 📢 Disclaimer Translation quality varies depending on factors such as model selection, prompt design, and the complexity of the source text. **Potomatic** can generate a baseline translation, but the output should always be reviewed and verified before use. @@ -40,7 +42,7 @@ For improved results, consider refining your prompt, using a higher-tier model, ## 🚀 Key Features -* **🤖 AI‑powered translations** – Translate into any language supported by OpenAI models. +* **🤖 AI‑powered translations** – Translate into any language supported by OpenAI, Google Gemini, or Google Translate API. * **📦 Smart batch handling** – Tune batch size, concurrency and retries for the right balance of cost and speed. * **💰 Cost‑conscious execution** – Accurately estimate costs and tokens, and control the maximum cost of a job. 
* **🔄 Incremental & resumable workflows** – Resume interrupted jobs, merge with existing `.po` files, or force a re‑translation. @@ -285,6 +287,73 @@ Using this example, "Block Editor" and other terms will not be translated to tar --- +## 🤖 AI Providers + +Potomatic supports multiple AI providers for translation: + +### Google Gemini + +Uses Google's Gemini models for translation. Set the provider to `gemini` and configure your API key. + +**Available Models:** + +| Model | Prompt Cost | Completion Cost | Best For | +|-------|-------------|-----------------|----------| +| `gemini-3.1-pro-preview` | $0.00125/1K tokens | $0.005/1K tokens | Complex translations, highest quality | +| `gemini-2.5-pro` | $0.0005/1K tokens | $0.0015/1K tokens | High-quality general use | +| `gemini-2.5-flash` | $0.000175/1K tokens | $0.000525/1K tokens | Fast, cost-effective translations | +| `gemini-flash-latest` | $0.000175/1K tokens | $0.000525/1K tokens | Latest optimized flash model | +| `gemini-3.1-flash-lite-preview` | $0.0000375/1K tokens | $0.00015/1K tokens | Ultra-budget translations | + +**Usage:** +```bash +# Using Gemini 2.5 Flash +./potomatic -l fr_FR -p translations.pot --provider gemini --model gemini-2.5-flash -k $GOOGLE_API_KEY + +# Using the latest flash model +./potomatic -l fr_FR -p translations.pot --provider gemini --model gemini-flash-latest -k $GOOGLE_API_KEY + +# Using budget-friendly flash-lite +./potomatic -l fr_FR -p translations.pot --provider gemini --model gemini-3.1-flash-lite-preview -k $GOOGLE_API_KEY +``` + +### Google Translate API + +Uses Google Cloud Translate API for fast, cost-effective translations. Ideal for straightforward text without complex nuances. 
+ +**Pricing:** Google Translate API charges based on character count: +- $20 per 1M characters (standard tier) +- Volume discounts available for higher usage + +**Usage:** +```bash +./potomatic -l fr_FR -p translations.pot --provider google-translate -k $GOOGLE_CLOUD_API_KEY +``` + +### OpenAI + +Uses OpenAI's GPT models for translation. This was the original provider and offers the widest model selection. + +**Available Models:** + +| Model | Prompt Cost | Completion Cost | Best For | +|-------|-------------|-----------------|----------| +| `gpt-4o-mini` | $0.00015/1K tokens | $0.0006/1K tokens | Fast, budget-friendly | +| `gpt-4o` | $0.0025/1K tokens | $0.01/1K tokens | High-quality | +| `gpt-4o-mini-2024-07-18` | $0.00015/1K tokens | $0.0006/1K tokens | Specific dated version | + +**Usage:** +```bash +./potomatic -l fr_FR -p translations.pot --provider openai --model gpt-4o-mini -k $OPENAI_API_KEY +``` + +**Environment Variables:** +- `GOOGLE_API_KEY` - For Gemini provider +- `GOOGLE_CLOUD_API_KEY` - For Google Translate provider +- `API_KEY` - Falls back to OpenAI + +--- + ## ⚙️ Configuration Files **Potomatic** uses several configuration files in the `config/` directory to customize its behavior: diff --git a/config/gemini-pricing.json b/config/gemini-pricing.json new file mode 100644 index 0000000..4715027 --- /dev/null +++ b/config/gemini-pricing.json @@ -0,0 +1,28 @@ +{ + "models": { + "gemini-2.5-pro": { + "prompt": 0.0005, + "completion": 0.0015 + }, + "gemini-2.5-flash": { + "prompt": 0.000175, + "completion": 0.000525 + }, + "gemini-3.1-pro-preview": { + "prompt": 0.00125, + "completion": 0.005 + }, + "gemini-flash-latest": { + "prompt": 0.000175, + "completion": 0.000525 + }, + "gemini-3.1-flash-lite-preview": { + "prompt": 0.0000375, + "completion": 0.00015 + } + }, + "fallback": { + "prompt": 0.0005, + "completion": 0.0015 + } +} \ No newline at end of file diff --git a/config/google-translate-pricing.json b/config/google-translate-pricing.json new 
file mode 100644 index 0000000..7a0c5bc --- /dev/null +++ b/config/google-translate-pricing.json @@ -0,0 +1,12 @@ +{ + "models": { + "default": { + "prompt": 0, + "completion": 0 + } + }, + "fallback": { + "prompt": 0, + "completion": 0 + } +} \ No newline at end of file diff --git a/package-lock.json b/package-lock.json index 1f60de1..98251a1 100644 --- a/package-lock.json +++ b/package-lock.json @@ -7,8 +7,9 @@ "": { "name": "potomatic", "version": "1.0.0", - "license": "GPL-3.0-or-later", + "license": "MIT", "dependencies": { + "@google/generative-ai": "^0.24.1", "chalk": "^5.3.0", "commander": "^12.0.0", "dotenv": "^16.5.0", @@ -38,7 +39,7 @@ "vitest": "^3.1.4" }, "engines": { - "node": ">=18.0.0" + "node": ">=18" } }, "node_modules/@ampproject/remapping": { @@ -590,6 +591,15 @@ "node": "^12.22.0 || ^14.17.0 || >=16.0.0" } }, + "node_modules/@google/generative-ai": { + "version": "0.24.1", + "resolved": "https://registry.npmjs.org/@google/generative-ai/-/generative-ai-0.24.1.tgz", + "integrity": "sha512-MqO+MLfM6kjxcKoy0p1wRzG3b4ZZXtPI+z2IE26UogS2Cm/XHO+7gGRBh6gcJsOiIVoH93UwKvW4HdgiOZCy9Q==", + "license": "Apache-2.0", + "engines": { + "node": ">=18.0.0" + } + }, "node_modules/@humanwhocodes/config-array": { "version": "0.13.0", "resolved": "https://registry.npmjs.org/@humanwhocodes/config-array/-/config-array-0.13.0.tgz", @@ -1270,6 +1280,7 @@ "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.14.1.tgz", "integrity": "sha512-OvQ/2pUDKmgfCg++xsTX1wGxfTaszcHVcTctW4UJB4hibJx2HXxxO5UmVgyjMa+ZDsiaf5wWLXYpRWMmBI0QHg==", "dev": true, + "peer": true, "bin": { "acorn": "bin/acorn" }, @@ -1577,7 +1588,6 @@ "resolved": "https://registry.npmjs.org/builtin-modules/-/builtin-modules-3.3.0.tgz", "integrity": "sha512-zhaCDicdLuWN5UbN5IMnFqNMhNfo919sH85y2/ea+5Yg9TsTkeZxpL+JLbp6cgYFS4sRLp3YV4S6yDuqVWHYOw==", "dev": true, - "peer": true, "engines": { "node": ">=6" }, @@ -1590,7 +1600,6 @@ "resolved": "https://registry.npmjs.org/builtins/-/builtins-5.1.0.tgz", "integrity": 
"sha512-SW9lzGTLvWTP1AY8xeAMZimqDrIaSdLQUcVr9DMef51niJ022Ri87SwRRKYm4A6iHfkPaiVUu/Duw2Wc4J7kKg==", "dev": true, - "peer": true, "dependencies": { "semver": "^7.0.0" } @@ -2199,6 +2208,7 @@ "integrity": "sha512-ypowyDxpVSYpkXr9WPv2PAZCtNip1Mv5KTW0SCurXv/9iOpcrH9PaqUElksqEB6pChqHGDRCFTyrZlGhnLNGiA==", "deprecated": "This version is no longer supported. Please see https://eslint.org/version-support for other options.", "dev": true, + "peer": true, "dependencies": { "@eslint-community/eslint-utils": "^4.2.0", "@eslint-community/regexpp": "^4.6.1", @@ -2254,7 +2264,6 @@ "resolved": "https://registry.npmjs.org/eslint-compat-utils/-/eslint-compat-utils-0.5.1.tgz", "integrity": "sha512-3z3vFexKIEnjHE3zCMRo6fn/e44U7T1khUjg+Hp0ZQMCigh28rALD0nPFBcGZuiLC5rLZa2ubQHDRln09JfU2Q==", "dev": true, - "peer": true, "dependencies": { "semver": "^7.5.4" }, @@ -2383,7 +2392,6 @@ "https://github.com/sponsors/ota-meshi", "https://opencollective.com/eslint" ], - "peer": true, "dependencies": { "@eslint-community/eslint-utils": "^4.1.2", "@eslint-community/regexpp": "^4.11.0", @@ -2401,6 +2409,7 @@ "resolved": "https://registry.npmjs.org/eslint-plugin-import/-/eslint-plugin-import-2.31.0.tgz", "integrity": "sha512-ixmkI62Rbc2/w8Vfxyh1jQRTdRTF52VxwRVHl/ykPAmqG+Nb7/kNn+byLP0LxPgI7zWA16Jt82SybJInmMia3A==", "dev": true, + "peer": true, "dependencies": { "@rtsao/scc": "^1.1.0", "array-includes": "^3.1.8", @@ -2486,7 +2495,6 @@ "resolved": "https://registry.npmjs.org/eslint-plugin-n/-/eslint-plugin-n-16.6.2.tgz", "integrity": "sha512-6TyDmZ1HXoFQXnhCTUjVFULReoBPOAjpuiKELMkeP40yffI/1ZRO+d9ug/VC6fqISo2WkuIBk3cvuRPALaWlOQ==", "dev": true, - "peer": true, "dependencies": { "@eslint-community/eslint-utils": "^4.4.0", "builtins": "^5.0.1", @@ -2515,7 +2523,6 @@ "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-1.1.11.tgz", "integrity": "sha512-iCuPHDFgrHX7H2vEI/5xpz07zSHB00TpugqhmYtVmMO6518mCuRMoOYFldEBl0g187ufozdaHgWKcYFb61qGiA==", "dev": true, - "peer": true, "dependencies": { 
"balanced-match": "^1.0.0", "concat-map": "0.0.1" @@ -2526,7 +2533,6 @@ "resolved": "https://registry.npmjs.org/minimatch/-/minimatch-3.1.2.tgz", "integrity": "sha512-J7p63hRiAjw1NDEww1W7i37+ByIrOWO5XQQAzZ3VOcL0PNybwpfmV/N05zFAzwQ9USyEcX6t3UO+K5aqBQOIHw==", "dev": true, - "peer": true, "dependencies": { "brace-expansion": "^1.1.7" }, @@ -2590,6 +2596,7 @@ "resolved": "https://registry.npmjs.org/eslint-plugin-promise/-/eslint-plugin-promise-6.6.0.tgz", "integrity": "sha512-57Zzfw8G6+Gq7axm2Pdo3gW/Rx3h9Yywgn61uE/3elTCOePEHVrn2i5CdfBwA1BLK0Q0WqctICIUSqXZW/VprQ==", "dev": true, + "peer": true, "engines": { "node": "^12.22.0 || ^14.17.0 || >=16.0.0" }, @@ -3232,7 +3239,6 @@ "resolved": "https://registry.npmjs.org/get-tsconfig/-/get-tsconfig-4.10.1.tgz", "integrity": "sha512-auHyJ4AgMz7vgS8Hp3N6HXSmlMdUyhSUrfBF16w153rxtLIEOE+HGqaBppczZvnHLqQJfiHotCYpNhl0lUROFQ==", "dev": true, - "peer": true, "dependencies": { "resolve-pkg-maps": "^1.0.0" }, @@ -3608,7 +3614,6 @@ "resolved": "https://registry.npmjs.org/is-builtin-module/-/is-builtin-module-3.2.1.tgz", "integrity": "sha512-BSLE3HnV2syZ0FK0iMA/yUGplUeMmNz4AW5fnTunbCIqZi4vG3WjJT9FHMy5D69xmAYBHXQhJdALdpwVxV501A==", "dev": true, - "peer": true, "dependencies": { "builtin-modules": "^3.3.0" }, @@ -4738,6 +4743,7 @@ "resolved": "https://registry.npmjs.org/picomatch/-/picomatch-4.0.2.tgz", "integrity": "sha512-M7BAV6Rlcy5u+m6oPhAPFgJTzAioX/6B0DxyvDlo9l8+T3nLKbrczg2WLUyzd45L8RqfUMyGPzekbMvX2Ldkwg==", "dev": true, + "peer": true, "engines": { "node": ">=12" }, @@ -4964,7 +4970,6 @@ "resolved": "https://registry.npmjs.org/resolve-pkg-maps/-/resolve-pkg-maps-1.0.0.tgz", "integrity": "sha512-seS2Tj26TBVOC2NIc2rOe2y2ZO7efxITtLZcGSOnHHNOQ7CkiUBfw0Iw2ck6xkIhPwLhKNLS8BO+hEpngQlqzw==", "dev": true, - "peer": true, "funding": { "url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1" } @@ -5865,6 +5870,7 @@ "resolved": "https://registry.npmjs.org/vite/-/vite-6.3.5.tgz", "integrity": 
"sha512-cZn6NDFE7wdTpINgs++ZJ4N49W2vRp8LCKrn3Ob1kYNtOo21vfDoaV5GzBfLU4MovSAB8uNRm4jgzVQZ+mBzPQ==", "dev": true, + "peer": true, "dependencies": { "esbuild": "^0.25.0", "fdir": "^6.4.4", @@ -5961,6 +5967,7 @@ "resolved": "https://registry.npmjs.org/vitest/-/vitest-3.1.4.tgz", "integrity": "sha512-Ta56rT7uWxCSJXlBtKgIlApJnT6e6IGmTYxYcmxjJ4ujuZDI59GUQgVDObXXJujOmPDBYXHK1qmaGtneu6TNIQ==", "dev": true, + "peer": true, "dependencies": { "@vitest/expect": "3.1.4", "@vitest/mocker": "3.1.4", @@ -6302,6 +6309,7 @@ "version": "3.25.23", "resolved": "https://registry.npmjs.org/zod/-/zod-3.25.23.tgz", "integrity": "sha512-Od2bdMosahjSrSgJtakrwjMDb1zM1A3VIHCPGveZt/3/wlrTWBya2lmEh2OYe4OIu8mPTmmr0gnLHIWQXdtWBg==", + "peer": true, "funding": { "url": "https://github.com/sponsors/colinhacks" } diff --git a/package.json b/package.json index 9979a3b..9ac3835 100644 --- a/package.json +++ b/package.json @@ -32,6 +32,7 @@ }, "scripts": { "translate": "./potomatic", + "translate:gemini": "./potomatic --provider gemini", "ab-prompt-test": "node tools/ab-prompt-test", "test": "vitest run", "test:watch": "vitest", @@ -45,6 +46,7 @@ "node": ">=18" }, "dependencies": { + "@google/generative-ai": "^0.24.1", "chalk": "^5.3.0", "commander": "^12.0.0", "dotenv": "^16.5.0", diff --git a/src/config/index.js b/src/config/index.js index 4c15947..41886bb 100644 --- a/src/config/index.js +++ b/src/config/index.js @@ -216,7 +216,8 @@ export function parseCliArguments() { .option('--locale-format ', 'Format to use for locale codes in file names: `wp_locale` (ru_RU), `iso_639_1` (ru), `iso_639_2` (rus), or `target_lang` (default)', DEFAULTS.LOCALE_FORMAT) // === Translation Options === - .option('-k, --api-key ', 'OpenAI API key (overrides API_KEY env var)') + .option('--provider ', 'AI provider to use (e.g., "openai", "gemini", "google-translate")', DEFAULTS.PROVIDER) + .option('-k, --api-key ', 'Provider API key (overrides API_KEY env var)') .option('-m, --model ', 'AI model name (e.g., 
"gpt-4o-mini")', DEFAULTS.MODEL) .option('--temperature ', 'Creativity level (0.0-2.0); lower = more deterministic, higher = more creative', (val) => Math.max(0, Math.min(2, parseFloat(val))), DEFAULTS.TEMPERATURE) .option('-F, --force-translate', 'Re-translate all strings, ignoring any existing translations', DEFAULTS.FORCE_TRANSLATE) @@ -302,7 +303,15 @@ export function validateConfiguration(options) { const errors = []; if (!options.dryRun && !options.apiKey && !process.env.API_KEY) { - errors.push('🔑 API key required (set API_KEY env var, use --api-key, or try --dry-run)'); + // Google Translate doesn't require an API key + if (options.provider !== 'google-translate') { + errors.push('🔑 API key required (set API_KEY env var, use --api-key, or try --dry-run)'); + } + } + + // Google Translate requires source language + if (options.provider === 'google-translate' && !options.sourceLanguage && !process.env.SOURCE_LANGUAGE) { + errors.push('🌐 Source language required for Google Translate (use -s or --source-language)'); } if (!options.targetLanguages || options.targetLanguages.length === 0) { diff --git a/src/providers/ProviderFactory.js b/src/providers/ProviderFactory.js index 6296920..2e9be95 100644 --- a/src/providers/ProviderFactory.js +++ b/src/providers/ProviderFactory.js @@ -1,4 +1,6 @@ import { OpenAIProvider } from './openai/OpenAIProvider.js'; +import { GeminiProvider } from './gemini/GeminiProvider.js'; +import { GoogleTranslateProvider } from './google-translate/GoogleTranslateProvider.js'; /** * Creates and configures AI translation providers based on configuration. @@ -26,6 +28,10 @@ export class ProviderFactory { switch (providerName.toLowerCase()) { case 'openai': return new OpenAIProvider(config, logger); + case 'gemini': + return new GeminiProvider(config, logger); + case 'google-translate': + return new GoogleTranslateProvider(config, logger); default: throw new Error(`Unsupported provider: ${providerName}. 
` + `Supported providers: ${ProviderFactory.getSupportedProviders().join(', ')}`); } @@ -39,7 +45,7 @@ export class ProviderFactory { * @return {Array} Array of supported provider names. */ static getSupportedProviders() { - return ['openai']; + return ['openai', 'gemini', 'google-translate']; } /** @@ -76,6 +82,29 @@ export class ProviderFactory { model: 'gpt-3.5-turbo', }, }, + { + name: 'gemini', + displayName: 'Google Gemini', + description: 'Google Gemini models', + status: 'implemented', + models: ['gemini-2.5-pro', 'gemini-2.5-flash'], + configExample: { + provider: 'gemini', + apiKey: 'your-gemini-api-key', + model: 'gemini-2.5-flash', + }, + }, + { + name: 'google-translate', + displayName: 'Google Translate (Free)', + description: 'Free Google Translate API - no API key required', + status: 'implemented', + models: ['default'], + configExample: { + provider: 'google-translate', + sourceLanguage: 'en', + }, + }, ]; } diff --git a/src/providers/gemini/GeminiProvider.js b/src/providers/gemini/GeminiProvider.js new file mode 100644 index 0000000..ac9b257 --- /dev/null +++ b/src/providers/gemini/GeminiProvider.js @@ -0,0 +1,774 @@ +import { GoogleGenerativeAI } from '@google/generative-ai'; +import { Provider } from '../base/Provider.js'; +import { buildXmlPrompt, parseXmlResponse, buildDictionaryResponse } from '../../utils/xmlTranslation.js'; +import { loadDictionary, findDictionaryMatches } from '../../utils/dictionaryUtils.js'; + +/** + * Gemini Provider Implementation. + * + * Handles translation using Google's Gemini models. + * Implements the Provider interface with Gemini-specific functionality. + * + * @since 1.0.0 + */ +export class GeminiProvider extends Provider { + /** + * Creates a new Gemini Provider instance. + * + * @since 1.0.0 + * + * @param {Object} config - Gemini provider configuration. + * @param {Object} logger - Logger instance. 
+ */ + constructor(config, logger) { + super(config, logger); + + this.client = null; + } + + /** + * Initializes the Gemini provider. + * Sets up authentication and loads pricing information. + * + * @since 1.0.0 + * + * @throws {Error} If API key is missing or initialization fails. + * + * @return {Promise} Resolves when initialization is complete. + */ + async initialize() { + if (!this.config.apiKey && !this.config.dryRun) { + throw new Error('API key is required for non-dry-run mode'); + } + + if (!this.config.dryRun && this.config.apiKey) { + const genAI = new GoogleGenerativeAI(this.config.apiKey); + this.client = genAI.getGenerativeModel({ model: this.config.model }); + } + + await this._loadProviderPricing('gemini'); + + this.logger.debug(`Gemini provider initialized with model: ${this.config.model}`); + } + + /** + * Validates Gemini provider configuration. + * + * @since 1.0.0 + * + * @param {Object} config - Configuration to validate. + * + * @return {Object} Validation result. + */ + validateConfig(config) { + const errors = []; + + if (!config.dryRun && !config.apiKey) { + errors.push('API key is required (set API_KEY or use --dry-run)'); + } + + const supportedModels = this.getSupportedModels(); + + if (config.model && !supportedModels.includes(config.model)) { + errors.push(`Unsupported model: ${config.model}. Supported: ${supportedModels.join(', ')}`); + } + + if (config.temperature !== undefined && (config.temperature < 0 || config.temperature > 2)) { + errors.push('Temperature must be between 0.0 and 2.0'); + } + + return { + isValid: errors.length === 0, + errors, + }; + } + + /** + * Translates a batch of strings using Gemini's API. + * + * @since 1.0.0 + * + * @param {Array} batch - Array of translation items. + * @param {string} targetLang - Target language code. + * @param {string} model - Gemini model to use. + * @param {string} systemPrompt - System prompt for translation. + * @param {number} maxRetries - Maximum retry attempts. 
+ * @param {number} retryDelayMs - Delay between retries. + * @param {number} timeout - Request timeout. + * @param {boolean} isDryRun - Whether this is a dry run. + * @param {Function} retryProgressCallback - Optional callback for retry progress updates. + * @param {Object} debugConfig - Optional debug configuration object. + * @param {number} pluralCount - Number of plural forms for target language. + * + * @return {Promise} Translation result. + */ + async translateBatch(batch, targetLang, model, systemPrompt, maxRetries, retryDelayMs, timeout, isDryRun, retryProgressCallback = null, debugConfig = null, pluralCount = 1) { + let dictionaryMatches = []; + + if (this.config.useDictionary) { + const dictionary = loadDictionary(this.config.dictionaryPath, targetLang, this.logger); + + dictionaryMatches = findDictionaryMatches(batch, dictionary); + + if (dictionaryMatches.length > 0) { + this.logger.info(`Using dictionary: Found ${dictionaryMatches.length} matching terms for ${targetLang}: ${dictionaryMatches.map((m) => m.source).join(', ')}`); + } else { + this.logger.debug(`No dictionary matches found for ${targetLang} in this batch`); + } + } + + const promptResult = buildXmlPrompt(batch, targetLang, pluralCount, dictionaryMatches); + const xmlPrompt = promptResult.xmlPrompt; + + const messages = [ + { role: 'user', content: systemPrompt }, + { role: 'model', content: 'OK' }, + { role: 'user', content: xmlPrompt }, + ]; + + if (dictionaryMatches.length > 0) { + const dictionaryResponse = buildDictionaryResponse(dictionaryMatches); + + messages.push({ role: 'model', content: dictionaryResponse }); + + const exampleTerms = dictionaryMatches + .slice(0, 2) + .map((match) => `"${match.source}" MUST be translated as "${match.target}"`) + .join(' and '); + + const instruction = `IMPORTANT: When translating the following strings, you MUST use the exact dictionary translations shown above for any terms that appear in the dictionary. For example, ${exampleTerms}. 
Use these exact translations, not alternatives. Now translate the actual strings:`; + + messages.push({ + role: 'user', + content: instruction, + }); + } + + if (isDryRun) { + return this._handleDryRun(messages, model, batch, pluralCount, promptResult.dictionaryCount); + } + + return await this._makeApiCallWithRetries(messages, model, batch, maxRetries, retryDelayMs, retryProgressCallback, debugConfig, pluralCount, promptResult.dictionaryCount); + } + + /** + * Calculates cost based on Gemini token usage. + * + * @since 1.0.0 + * + * @param {Object} usage - Token usage from Gemini API response. + * @param {string} model - Model used. + * + * @return {Object} Cost breakdown. + */ + calculateCost(usage, model) { + if (!usage || typeof usage !== 'object') { + return { + promptCost: 0, + completionCost: 0, + totalCost: 0, + model, + error: 'Invalid usage data', + }; + } + + const { prompt_tokens: promptTokens, completion_tokens: completionTokens, total_tokens: totalTokens } = usage; + + if (!promptTokens && !completionTokens) { + return { + promptCost: 0, + completionCost: 0, + totalCost: 0, + model, + error: 'No token usage data', + }; + } + + const pricingUsed = this.getModelPricing(model); + const promptCost = (promptTokens / 1000) * pricingUsed.prompt; + const completionCost = (completionTokens / 1000) * pricingUsed.completion; + const totalCost = promptCost + completionCost; + + return { + model, + promptTokens, + completionTokens, + totalTokens, + promptCost, + completionCost, + totalCost, + pricingUsed, + }; + } + + /** + * Gets token count using Gemini's API. + * + * @since 1.0.0 + * + * @param {string} text - Text to count tokens for. + * @param {string} model - Model to use for tokenization. + * + * @return {number} Token count. 
+ */ + async getTokenCount(text, model) { + if (!text || typeof text !== 'string') { + return 0; + } + + if (!this.client) { + this.logger.warn('Client not initialized, using fallback token count'); + return Math.ceil(text.length / 4); + } + + try { + const response = await this.client.countTokens(text); + return response.totalTokens; + } catch (error) { + this.logger.warn(`Failed to get exact token count: ${error.message}`); + + return Math.ceil(text.length / 4); + } + } + + /** + * Gets supported Gemini models. + * Returns all models from the pricing configuration. + * + * @since 1.0.0 + * + * @return {Array} Supported model identifiers. + */ + getSupportedModels() { + if (this.providerPricing && this.providerPricing.models) { + return Object.keys(this.providerPricing.models).sort(); + } + + return ['gemini-2.5-pro', 'gemini-2.5-flash', 'gemini-3.1-pro-preview', 'gemini-flash-latest', 'gemini-3.1-flash-lite-preview']; + } + + /** + * Gets Gemini model pricing. + * + * @since 1.0.0 + * + * @param {string} model - Model to get pricing for. + * + * @return {Object} Pricing information. + */ + getModelPricing(model) { + if (!this.providerPricing) { + return { prompt: 0.0005, completion: 0.0015 }; + } + + return this.providerPricing.models[model] || this.providerPricing.fallback; + } + + /** + * Gets the provider name. + * + * @since 1.0.0 + * + * @return {string} Provider name. + */ + getProviderName() { + return 'gemini'; + } + + /** + * Estimates output tokens based on input tokens. + * Uses a conservative multiplier for Gemini models. + * + * @since 1.0.0 + * + * @param {number} inputTokens - Number of input tokens. + * @param {string} targetLang - Target language (unused in base implementation). + * + * @return {number} Estimated output tokens. + */ + estimateOutputTokens(inputTokens, targetLang) { + // Use conservative 1.4x multiplier for Gemini. 
+ return Math.round(inputTokens * 1.4); + } + + /** + * Gets Gemini-specific fallback pricing when pricing file cannot be loaded. + * + * @since 1.0.0 + * + * @return {Object} Gemini fallback pricing structure. + * + * @protected + */ + _getFallbackPricing() { + return { + models: { + 'gemini-2.5-flash': { prompt: 0.000175, completion: 0.000525 }, + 'gemini-2.5-pro': { prompt: 0.0005, completion: 0.0015 }, + 'gemini-3.1-pro-preview': { prompt: 0.00125, completion: 0.005 }, + 'gemini-flash-latest': { prompt: 0.000175, completion: 0.000525 }, + 'gemini-3.1-flash-lite-preview': { prompt: 0.0000375, completion: 0.00015 }, + }, + fallback: { prompt: 0.0005, completion: 0.0015 }, + }; + } + + /** + * Handles dry-run mode by estimating costs without API calls. + * + * @since 1.0.0 + * + * @param {Array} messages - Chat messages for the API. + * @param {string} model - Model to use. + * @param {Array} batch - Translation batch. + * @param {number} pluralCount - Number of plural forms for target language. + * + * @return {Object} Dry-run result with estimated costs. + * + * @private + */ + async _handleDryRun(messages, model, batch, pluralCount, dictionaryCount) { + // Calculate input tokens. + const fullPrompt = messages.map((m) => m.content).join('\n'); + const inputTokens = await this.getTokenCount(fullPrompt, model); + + // Estimate output tokens. + const estimatedOutputTokens = this.estimateOutputTokens(inputTokens); + + // Calculate estimated costs. + const pricing = this.getModelPricing(model); + const inputCost = (inputTokens / 1000) * pricing.prompt; + const outputCost = (estimatedOutputTokens / 1000) * pricing.completion; + const totalCost = inputCost + outputCost; + + // Generate dry run translations with proper plural forms. 
+ const translations = batch.map((item) => { + const msgstr = Array(pluralCount).fill(`[DRY RUN] ${item.msgid}`); + + return { msgid: item.msgid, msgstr }; + }); + + return { + success: true, + translations, + usage: { + prompt_tokens: inputTokens, + completion_tokens: estimatedOutputTokens, + total_tokens: inputTokens + estimatedOutputTokens, + }, + cost: { + model, + promptTokens: inputTokens, + completionTokens: estimatedOutputTokens, + totalTokens: inputTokens + estimatedOutputTokens, + promptCost: inputCost, + completionCost: outputCost, + totalCost, + pricingUsed: pricing, + isDryRun: true, + dictionaryCount, + }, + isDryRun: true, + debugData: { + messages, + batchSize: batch.length, + }, + }; + } + + /** + * Makes API call with retry logic. + * + * @since 1.0.0 + * + * @param {Array} messages - Chat messages. + * @param {string} model - Model to use. + * @param {Array} batch - Translation batch. + * @param {number} maxRetries - Maximum retries. + * @param {number} retryDelayMs - Retry delay. + * @param {Function} retryProgressCallback - Optional callback for retry progress updates. + * @param {Object} debugConfig - Optional debug configuration object. + * @param {number} pluralCount - Number of plural forms for target language. + * + * @return {Promise} API call result. + * + * @private + */ + async _makeApiCallWithRetries(messages, model, batch, maxRetries, retryDelayMs, retryProgressCallback = null, debugConfig = null, pluralCount = 1, dictionaryCount = 0) { + let lastError = null; + + // Debug: Log complete conversation at verbose level 3. 
+ this.logger.debug('=== FULL CONVERSATION WITH AI ==='); + + messages.forEach((message, index) => { + this.logger.debug(`Message ${index + 1} (${message.role}):`); + this.logger.debug(message.content); + if (index < messages.length - 1) { + this.logger.debug('---'); + } + }); + + this.logger.debug('=== END CONVERSATION ==='); + + for (let attempt = 0; attempt <= maxRetries; attempt++) { + try { + // Notify progress callback about retry status. + this._notifyRetryProgress(retryProgressCallback, attempt, maxRetries); + + if (attempt > 0) { + this.logger.info(`Retry attempt ${attempt}/${maxRetries} after ${retryDelayMs}ms delay`); + + await new Promise((resolve) => setTimeout(resolve, retryDelayMs)); + } + + // Handle test mode failure simulation. + this._handleTestModeFailures(attempt, maxRetries); + + const history = messages.slice(0, -1).map((message) => ({ + role: message.role, + parts: [{ text: message.content || '' }], + })); + + const chat = this.client.startChat({ + history, + generationConfig: { + temperature: this.config.temperature || 0.1, + maxOutputTokens: this._calculateMaxTokens(model, batch.length), + }, + }); + + const lastMessage = messages[messages.length - 1]; + const result = await chat.sendMessage(lastMessage.content || ''); + const response = await result.response; + const responseText = response.text(); + + // Debug: Log raw AI response at verbose level 3. + this.logger.debug('=== RAW AI RESPONSE ==='); + this.logger.debug(responseText); + this.logger.debug('=== END RAW RESPONSE ==='); + + // Save debug files if enabled. + if (debugConfig && debugConfig.saveDebugInfo) { + await this._saveDebugFiles(messages, response, debugConfig, batch.length); + } + + // Parse response. + const translations = this._parseApiResponse(responseText, batch, pluralCount, dictionaryCount); + + // Debug: Log parsed translations at verbose level 3. 
+ this.logger.debug('=== PARSED TRANSLATIONS ==='); + + translations.forEach((translation, index) => { + this.logger.debug(`${index + 1}. "${translation.msgid}" → ${JSON.stringify(translation.msgstr)}`); + }); + + this.logger.debug('=== END PARSED TRANSLATIONS ==='); + + const usage = { + prompt_tokens: await this.getTokenCount(messages.map(m => m.content || '').join('\n'), model), + completion_tokens: await this.getTokenCount(responseText, model), + }; + usage.total_tokens = usage.prompt_tokens + usage.completion_tokens; + + const cost = this.calculateCost(usage, model); + + // Notify progress callback that we're no longer retrying. + this._notifyRetryProgress(retryProgressCallback, attempt, maxRetries, false); + + return { + success: true, + translations, + usage, + cost, + isDryRun: false, + debugData: { + messages, + response: responseText, + }, + dictionaryCount, + }; + } catch (error) { + lastError = error; + + this.logger.warn(`API call attempt ${attempt + 1} failed: ${error.message}`); + + // Don't retry on certain errors. + if (this._shouldStopRetrying(error)) { + break; + } + } + } + + // Final progress callback update to clear retry status. + this._notifyRetryProgress(retryProgressCallback, maxRetries, maxRetries, false); + + return { + success: false, + error: `Failed after ${maxRetries + 1} attempts. Last error: ${lastError.message}`, + translations: [], + cost: { totalCost: 0 }, + dictionaryCount, + }; + } + + /** + * Notifies retry progress callback if provided. + * + * @private + * @since 1.0.0 + * @param {Function} callback - Progress callback function. + * @param {number} attempt - Current attempt number. + * @param {number} maxRetries - Maximum retry attempts. + * @param {boolean} isRetrying - Whether currently retrying. 
+ */ + _notifyRetryProgress(callback, attempt, maxRetries, isRetrying = true) { + if (!callback) { + return; + } + + callback({ + isRetrying: isRetrying && attempt > 0, + attempt, + maxRetries, + }); + } + + /** + * Determines if retrying should stop based on error type. + * + * @private + * @since 1.0.0 + * @param {Error} error - The error that occurred. + * @return {boolean} True if retrying should stop. + */ + _shouldStopRetrying(error) { + return error.status === 401 || error.status === 403; + } + + /** + * Handles test mode failure simulation for retry logic testing. + * + * @private + * @since 1.0.0 + * @param {number} attempt - Current attempt number. + * @param {number} maxRetries - Maximum retry attempts. + * @throws {Error} Simulated API error for testing. + */ + _handleTestModeFailures(attempt, maxRetries) { + if (!this.config.testRetryFailureRate || this.config.testRetryFailureRate <= 0) { + return; + } + + const shouldFail = Math.random() < this.config.testRetryFailureRate; + + if (!shouldFail) { + return; + } + + // Check if we should fail this attempt. + const isFinalAttempt = attempt === maxRetries; + const shouldProtectFinalAttempt = !this.config.testAllowCompleteFailure; + + if (isFinalAttempt && shouldProtectFinalAttempt) { + this.logger.info(`🧪 TEST MODE: Would simulate failure but allowing final attempt to succeed (final attempt protection enabled)`); + return; + } + + this.logger.warn(`🧪 TEST MODE: Simulating API failure (attempt ${attempt + 1}/${maxRetries + 1}) - failure rate: ${(this.config.testRetryFailureRate * 100).toFixed(1)}%`); + + const errorType = this._getRandomTestError(); + + this.logger.warn(`🧪 TEST MODE: Simulating ${errorType.status ? `HTTP ${errorType.status}` : 'network'} error: ${errorType.message}`); + + const testError = new Error(errorType.message); + + if (errorType.status) { + testError.status = errorType.status; + } + + // For rate limiting, add some extra properties that Gemini API might include. 
+ if (errorType.status === 429) { + testError.response = { + headers: { + 'retry-after': '60', + 'x-ratelimit-remaining': '0', + }, + }; + } + + throw testError; + } + + /** + * Gets a random test error for failure simulation. + * + * @private + * + * @since 1.0.0 + * + * @return {Object} Random error configuration. + */ + _getRandomTestError() { + const errorTypes = [ + { status: 429, message: 'Rate limit exceeded. Please retry after 60 seconds.' }, + { status: 500, message: 'Internal server error' }, + { status: 502, message: 'Bad gateway' }, + { status: 503, message: 'Service temporarily unavailable' }, + { status: 504, message: 'Gateway timeout' }, + { status: null, message: 'Network connection failed' }, // Simulate network error. + ]; + + return errorTypes[Math.floor(Math.random() * errorTypes.length)]; + } + + /** + * Parses API response and extracts translations. + * + * @since 1.0.0 + * + * @param {string} responseContent - API response content. + * @param {Array} batch - Original batch for fallback. + * @param {number} pluralCount - Number of plural forms for target language. + * @param {number} dictionaryCount - Number of dictionary entries to skip. + * + * @return {Array} Parsed translations. + * + * @private + */ + _parseApiResponse(responseContent, batch, pluralCount, dictionaryCount = 0) { + try { + return parseXmlResponse(responseContent, batch, pluralCount, this.logger, dictionaryCount); + } catch (error) { + this.logger.warn(`Failed to parse API response: ${error.message}`); + + // Return empty translations as fallback. + return batch.map((item) => ({ + msgid: item.msgid, + msgstr: Array(pluralCount).fill(''), + })); + } + } + + /** + * Saves API request and response data to debug files when debug mode is enabled. + * Creates timestamped files with detailed information for troubleshooting. + * + * @private + * + * @since 1.0.0 + * + * @param {Array} messages - The API request messages sent to Gemini. 
+ * @param {Object} response - The full API response from Gemini. + * @param {Object} debugConfig - Debug configuration object. + * @param {number} batchSize - Size of the batch for max_tokens calculation. + * + * @return {Promise} Resolves when debug files are saved successfully. + */ + async _saveDebugFiles(messages, response, debugConfig, batchSize) { + try { + const fs = await import('fs'); + const path = await import('path'); + + // Create debug directory if it doesn't exist. + const debugDir = path.join(debugConfig.outputDir || '.', 'debug'); + + if (!fs.existsSync(debugDir)) { + fs.mkdirSync(debugDir, { recursive: true }); + } + + // Create timestamp for unique file naming. + const now = new Date(); + const dateStr = now.toISOString().slice(0, 10).replace(/-/g, ''); // YYYYMMDD. + const timeStr = now.toISOString().slice(11, 16).replace(':', ''); // HHMM. + const batchStr = `${debugConfig.batchNum}-of-${debugConfig.totalBatches}`; + const filePrefix = `${dateStr}--${timeStr}--${debugConfig.targetLang}--${batchStr}`; + + // Prepare debug data with metadata and complete request parameters. + const { totalBatches } = debugConfig; + const { model } = this.config; + const debugData = { + metadata: { + timestamp: new Date().toISOString(), + targetLanguage: debugConfig.targetLang, + batchNumber: debugConfig.batchNum, + totalBatches, + model, + }, + request: { + model, + messages, + temperature: this.config.temperature || 0.1, + max_tokens: this._calculateMaxTokens(model, batchSize), + systemPromptLength: messages[0].content.length, + userMessageLength: messages[1].content.length, + }, + response, + }; + + // Save debug file. 
+ const debugFilePath = path.join(debugDir, `${filePrefix}.json`); + + fs.writeFileSync(debugFilePath, JSON.stringify(debugData, null, 2), 'utf8'); + + this.logger.debug(`Debug file saved: ${debugFilePath}`); + } catch (error) { + this.logger.warn(`Failed to save debug files: ${error.message}`); + } + } + + /** + * Calculates max_tokens value with smart auto-calculation. + * When not configured, estimates based on batch size and expected output. + * + * @private + * @since 1.0.0 + * @param {string} model - Gemini model (for token estimation). + * @param {number} batchSize - Number of items in the batch. + * @return {number} Max tokens value. + */ + _calculateMaxTokens(model, batchSize) { + // Use configured value if provided. + if (this.config.maxTokens) { + this.logger.debug(`Using configured max_tokens: ${this.config.maxTokens} for batch of ${batchSize} string${batchSize === 1 ? '' : 's'}`); + + return this.config.maxTokens; + } + + // Auto-calculate based on batch size and expected output. + const estimatedTokensPerString = this._estimateTokensPerString(); + const estimatedOutputTokens = batchSize * estimatedTokensPerString; + + // Add safety buffer (30%) to account for: + // - Longer translations in some languages. + // - XML formatting overhead. + // - Some strings being longer than average. + const safetyBuffer = 1.3; + const calculatedMaxTokens = Math.round(estimatedOutputTokens * safetyBuffer); + + // Apply reasonable bounds. + const minTokens = 100; // Minimum for any response. + const maxTokens = 8192; // Gemini API limit. + const finalMaxTokens = Math.max(minTokens, Math.min(maxTokens, calculatedMaxTokens)); + + this.logger.debug(`Auto-calculated max_tokens: ${finalMaxTokens} for batch of ${batchSize} string${batchSize === 1 ? '' : 's'} (estimated: ${estimatedOutputTokens}, with 30% buffer: ${calculatedMaxTokens})`); + + return finalMaxTokens; + } + + /** + * Estimates average tokens needed per string translation. 
+ * Based on typical translation patterns and XML formatting overhead. + * + * @private + * @since 1.0.0 + * @return {number} Estimated tokens per translated string. + */ + _estimateTokensPerString() { + // Conservative estimate based on: + // - Average translation length (50-80 tokens). + // - XML formatting overhead (...). + // - Plural forms (may double the output). + // - Some strings being longer than average. + return 120; + } +} diff --git a/src/providers/google-translate/GoogleTranslateProvider.js b/src/providers/google-translate/GoogleTranslateProvider.js new file mode 100644 index 0000000..298e044 --- /dev/null +++ b/src/providers/google-translate/GoogleTranslateProvider.js @@ -0,0 +1,504 @@ +import { Provider } from '../base/Provider.js'; + +/** + * Google Translate Provider Implementation. + * + * Handles translation using Google's free public Translate API. + * No API key required - uses translate.googleapis.com endpoint. + * + * @since 1.0.0 + */ +export class GoogleTranslateProvider extends Provider { + /** + * Creates a new Google Translate Provider instance. + * + * @since 1.0.0 + * + * @param {Object} config - Google Translate provider configuration. + * @param {Object} logger - Logger instance. + */ + constructor(config, logger) { + super(config, logger); + + this.baseUrl = 'https://translate.googleapis.com/translate_a/single'; + } + + /** + * Initializes the Google Translate provider. + * Sets up pricing information (free service). + * + * @since 1.0.0 + * + * @return {Promise} Resolves when initialization is complete. + */ + async initialize() { + await this._loadProviderPricing('google-translate'); + this.logger.debug('Google Translate provider initialized (free service)'); + } + + /** + * Validates Google Translate provider configuration. + * + * @since 1.0.0 + * + * @param {Object} config - Configuration to validate. + * + * @return {Object} Validation result. 
+ */ + validateConfig(config) { + const errors = []; + + if (!config.sourceLanguage) { + errors.push('Source language is required for Google Translate'); + } + + return { + isValid: errors.length === 0, + errors, + }; + } + + /** + * Translates a batch of strings using Google Translate API. + * + * @since 1.0.0 + * + * @param {Array} batch - Array of translation items. + * @param {string} targetLang - Target language code. + * @param {string} model - Model identifier (unused for Google Translate). + * @param {string} systemPrompt - System prompt (ignored, Google Translate is direct). + * @param {number} maxRetries - Maximum retry attempts. + * @param {number} retryDelayMs - Delay between retries. + * @param {number} timeout - Request timeout in ms. + * @param {boolean} isDryRun - Whether this is a dry run. + * @param {Function} retryProgressCallback - Optional callback for retry progress updates. + * @param {Object} debugConfig - Optional debug configuration object. + * @param {number} pluralCount - Number of plural forms for target language. + * + * @return {Promise} Translation result. + */ + async translateBatch(batch, targetLang, model, systemPrompt, maxRetries, retryDelayMs, timeout, isDryRun, retryProgressCallback = null, debugConfig = null, pluralCount = 1) { + const sourceLang = this.config.sourceLanguage || 'en'; + + if (isDryRun) { + return this._handleDryRun(batch, targetLang, sourceLang, pluralCount); + } + + return await this._makeApiCallWithRetries(batch, targetLang, sourceLang, maxRetries, retryDelayMs, timeout, retryProgressCallback, pluralCount); + } + + /** + * Calculates cost based on usage data. + * Google Translate is free, so cost is always 0. + * + * @since 1.0.0 + * + * @param {Object} usage - Token usage (not used for Google Translate). + * @param {string} model - Model used (unused). + * + * @return {Object} Cost breakdown (zero for free service). 
+ */ + calculateCost(usage, model) { + return { + model: 'google-translate', + promptCost: 0, + completionCost: 0, + totalCost: 0, + }; + } + + /** + * Gets token count using character-based estimation. + * + * @since 1.0.0 + * + * @param {string} text - Text to count tokens for. + * @param {string} model - Model identifier (unused). + * + * @return {number} Estimated token count. + */ + async getTokenCount(text, model) { + if (!text || typeof text !== 'string') { + return 0; + } + + // Rough estimate: 1 token ≈ 4 characters + return Math.ceil(text.length / 4); + } + + /** + * Gets supported "models" for Google Translate. + * Returns default since Google Translate doesn't use model selection. + * + * @since 1.0.0 + * + * @return {Array} Supported model identifiers. + */ + getSupportedModels() { + return ['default']; + } + + /** + * Gets the provider name. + * + * @since 1.0.0 + * + * @return {string} Provider name. + */ + getProviderName() { + return 'google-translate'; + } + + /** + * Estimates output tokens based on input tokens. + * Google Translate typically expands text by ~10-30%. + * + * @since 1.0.0 + * + * @param {number} inputTokens - Number of input tokens. + * @param {string} targetLang - Target language (unused). + * + * @return {number} Estimated output tokens. + */ + estimateOutputTokens(inputTokens, targetLang) { + // Google Translate output is typically 1.2x input + return Math.round(inputTokens * 1.2); + } + + /** + * Gets fallback pricing when pricing file cannot be loaded. + * Google Translate is free. + * + * @since 1.0.0 + * + * @return {Object} Google Translate fallback pricing structure. + * + * @protected + */ + _getFallbackPricing() { + return { + models: { + default: { prompt: 0, completion: 0 }, + }, + fallback: { prompt: 0, completion: 0 }, + }; + } + + /** + * Gets model pricing for Google Translate. + * + * @since 1.0.0 + * + * @param {string} model - Model to get pricing for. 
+ * + * @return {Object} Pricing information (always zeros). + */ + getModelPricing(model) { + if (this.providerPricing && this.providerPricing.models && this.providerPricing.models[model]) { + return this.providerPricing.models[model]; + } + + return { prompt: 0, completion: 0 }; + } + + /** + * Handles dry-run mode by returning mock translations. + * + * @since 1.0.0 + * + * @param {Array} batch - Translation batch. + * @param {string} targetLang - Target language code. + * @param {string} sourceLang - Source language code. + * @param {number} pluralCount - Number of plural forms. + * + * @return {Object} Dry-run result. + * + * @private + */ + async _handleDryRun(batch, targetLang, sourceLang, pluralCount) { + const translations = batch.map((item) => { + const msgstr = Array(pluralCount).fill(`[DRY RUN] ${item.msgid}`); + return { msgid: item.msgid, msgstr }; + }); + + return { + success: true, + translations, + usage: { + prompt_tokens: 0, + completion_tokens: 0, + total_tokens: 0, + }, + cost: { + totalCost: 0, + model: 'google-translate', + isDryRun: true, + }, + isDryRun: true, + }; + } + + /** + * Makes API call with retry logic. + * + * @since 1.0.0 + * + * @param {Array} batch - Translation batch. + * @param {string} targetLang - Target language code. + * @param {string} sourceLang - Source language code. + * @param {number} maxRetries - Maximum retries. + * @param {number} retryDelayMs - Retry delay in ms. + * @param {number} timeout - Request timeout in ms. + * @param {Function} retryProgressCallback - Optional callback for retry progress updates. + * @param {number} pluralCount - Number of plural forms. + * + * @return {Promise} Translation result. 
+ * + * @private + */ + async _makeApiCallWithRetries(batch, targetLang, sourceLang, maxRetries, retryDelayMs, timeout, retryProgressCallback, pluralCount) { + // Heuristic: numeric timeouts below 1000 are assumed to be seconds (as passed from config) and are converted to milliseconds + const timeoutMs = typeof timeout === 'number' && timeout < 1000 ? timeout * 1000 : timeout; + + // For Google Translate, process strings in parallel for speed, + // but cap the number of in-flight requests to avoid rate limiting + const maxConcurrent = 10; // Number of concurrent requests + + const translateItem = async (item, itemIndex) => { + let attempts = 0; + + while (attempts <= maxRetries) { + try { + // Translate the string + const translation = await this._translateText(item.msgid, sourceLang, targetLang, timeoutMs); + + // Handle plural forms - Google Translate doesn't support plural forms directly + // If the item has a plural form, translate it too + let msgstr = [translation]; + + if (item.msgid_plural) { + const pluralTranslation = await this._translateText(item.msgid_plural, sourceLang, targetLang, timeoutMs); + // Keep the singular translation in msgstr[0]; reuse the plural translation for every remaining form + msgstr = [translation, ...Array(Math.max(pluralCount - 1, 0)).fill(pluralTranslation)]; + } + + return { + msgid: item.msgid, + msgid_plural: item.msgid_plural || null, + msgstr, + }; + } catch (error) { + attempts++; + + if (attempts > maxRetries) { + this.logger.error(`Failed to translate "${item.msgid}" after ${maxRetries + 1} attempts`); + return { + msgid: item.msgid, + msgid_plural: item.msgid_plural || null, + msgstr: Array(pluralCount).fill(''), + error: error?.message || 'Translation failed', + }; + } + + // Short delay before retry for Google Translate + if (error.message?.includes('timeout') || error.message?.includes('429')) { + await this._sleep(500); // Short delay for retries + } + + this.logger.warn(`Retry ${attempts}/${maxRetries} for "${item.msgid}": ${error.message}`); + } + } + + // Should not reach here, but just in case + return { + msgid: item.msgid, + msgid_plural: 
item.msgid_plural || null, + msgstr: Array(pluralCount).fill(''), + error: 'Translation failed', + }; + }; + + // Process in chunks for parallel execution + const translations = []; + for (let i = 0; i < batch.length; i += maxConcurrent) { + const chunk = batch.slice(i, i + maxConcurrent); + const results = await Promise.all(chunk.map((item, idx) => translateItem(item, i + idx))); + translations.push(...results); + } + + return { + success: true, + translations, + usage: { + prompt_tokens: 0, + completion_tokens: 0, + total_tokens: 0, + }, + cost: { + totalCost: 0, + model: 'google-translate', + }, + }; + } + + /** + * Translates a single text using Google Translate API. + * + * @since 1.0.0 + * + * @param {string} text - Text to translate. + * @param {string} sourceLang - Source language code. + * @param {string} targetLang - Target language code. + * @param {number} timeout - Request timeout in ms. + * + * @return {Promise} Translated text. + * + * @private + */ + async _translateText(text, sourceLang, targetLang, timeout = 30000) { + const url = this._buildUrl(text, sourceLang, targetLang); + + const controller = new AbortController(); + const timeoutId = setTimeout(() => controller.abort(), timeout); + + try { + const response = await fetch(url, { + method: 'GET', + signal: controller.signal, + headers: { + 'User-Agent': 'Mozilla/5.0', + }, + }); + + clearTimeout(timeoutId); + + if (!response.ok) { + throw new Error(`HTTP ${response.status}: ${response.statusText}`); + } + + const data = await response.json(); + + return this._parseResponse(data); + } catch (error) { + clearTimeout(timeoutId); + + if (error.name === 'AbortError') { + throw new Error(`Request timeout after ${timeout}ms`); + } + + throw error; + } + } + + /** + * Builds the Google Translate API URL. + * + * @since 1.0.0 + * + * @param {string} text - Text to translate. + * @param {string} sourceLang - Source language code. + * @param {string} targetLang - Target language code. 
+ * + * @return {string} Complete URL with query parameters. + * + * @private + */ + _buildUrl(text, sourceLang, targetLang) { + const params = new URLSearchParams({ + client: 'gtx', + sl: sourceLang, + tl: targetLang, + dt: 't', + q: text, + }); + + return `${this.baseUrl}?${params.toString()}`; + } + + /** + * Parses Google Translate API response. + * + * @since 1.0.0 + * + * @param {Object} data - JSON response from Google Translate. + * + * @return {string} Extracted translation text. + * + * @private + */ + _parseResponse(data) { + // Google Translate returns: [[translated_text, original_text, detected_lang], ...] + if (!data || !Array.isArray(data) || data.length === 0) { + throw new Error('Invalid response format from Google Translate'); + } + + // The first element is the translation + const firstElement = data[0]; + + if (Array.isArray(firstElement)) { + // For single sentences, data[0] is an array of segments + // Join all segments to get full translation + return firstElement.map((segment) => (Array.isArray(segment) ? segment[0] : segment)).join(''); + } + + return String(firstElement); + } + + /** + * Determines if retrying should stop based on error type. + * + * @private + * + * @since 1.0.0 + * + * @param {Error} error - The error that occurred. + * + * @return {boolean} True if retrying should stop. + */ + _shouldStopRetrying(error) { + // Don't retry on client errors + if (error.message && error.message.includes('timeout')) { + return false; // Timeouts are retriable + } + return false; // Most errors are retriable for Google Translate + } + + /** + * Notifies retry progress callback if provided. + * + * @private + * + * @since 1.0.0 + * + * @param {Function} callback - Progress callback function. + * @param {number} attempt - Current attempt number. + * @param {number} maxRetries - Maximum retry attempts. + * @param {boolean} isRetrying - Whether currently retrying. 
+ */ + _notifyRetryProgress(callback, attempt, maxRetries, isRetrying = true) { + if (!callback) { + return; + } + + callback({ + isRetrying: isRetrying && attempt > 0, + attempt, + maxRetries, + }); + } + + /** + * Sleep utility for delays. + * + * @private + * + * @param {number} ms - Milliseconds to sleep. + * + * @return {Promise} + */ + _sleep(ms) { + return new Promise((resolve) => setTimeout(resolve, ms)); + } +} \ No newline at end of file diff --git a/src/utils/promptLoader.js b/src/utils/promptLoader.js index 8024704..156ad20 100644 --- a/src/utils/promptLoader.js +++ b/src/utils/promptLoader.js @@ -23,16 +23,22 @@ import { getPluralForms, extractPluralCount } from './poFileUtils.js'; */ function loadPromptTemplate(promptFilePath) { try { + if (!fs.existsSync(promptFilePath)) { + // Return empty string if prompt file doesn't exist (e.g., Google Translate provider) + return ''; + } + const content = fs.readFileSync(promptFilePath, 'utf-8'); const prompt = content.trim(); if (!prompt) { - throw new Error('Prompt file is empty'); + return ''; } return prompt; } catch (error) { - throw new Error(`Failed to load prompt from ${promptFilePath}: ${error.message}`); + // Return empty string on error instead of throwing + return ''; } } @@ -57,7 +63,7 @@ export function buildSystemPrompt(targetLang, sourceLang = 'English', promptFile const template = loadPromptTemplate(promptFilePath); const targetLanguageName = getApiTargetLanguage(targetLang); const sourceLanguageName = getApiTargetLanguage(sourceLang); - const pluralFormsString = getPluralForms(targetLang, { debug: () => {}, warn: () => {} }); + const pluralFormsString = getPluralForms(targetLang, { debug: () => { }, warn: () => { } }); const pluralCount = extractPluralCount(pluralFormsString); return template diff --git a/tests/integration/google-translate.test.js b/tests/integration/google-translate.test.js new file mode 100644 index 0000000..b3b5633 --- /dev/null +++ 
b/tests/integration/google-translate.test.js @@ -0,0 +1,182 @@ +import { describe, it, expect, vi, beforeAll } from 'vitest'; +import { GoogleTranslateProvider } from '../../src/providers/google-translate/GoogleTranslateProvider.js'; +import { ProviderFactory } from '../../src/providers/ProviderFactory.js'; + +// Mock logger +const createMockLogger = () => ({ + debug: vi.fn(), + info: vi.fn(), + warn: vi.fn(), + error: vi.fn(), + success: vi.fn(), +}); + +describe('Google Translate Integration', () => { + let mockLogger; + + beforeAll(() => { + mockLogger = createMockLogger(); + }); + + describe('ProviderFactory integration', () => { + it('should register google-translate as a supported provider', () => { + const supported = ProviderFactory.getSupportedProviders(); + expect(supported).toContain('google-translate'); + }); + + it('should create google-translate provider via factory', () => { + const provider = ProviderFactory.createProvider( + { provider: 'google-translate', sourceLanguage: 'en' }, + mockLogger + ); + expect(provider).toBeInstanceOf(GoogleTranslateProvider); + }); + + it('should return provider info for google-translate', () => { + const providers = ProviderFactory.getProviderInfo(); + const googleTranslateInfo = providers.find(p => p.name === 'google-translate'); + + expect(googleTranslateInfo).toBeDefined(); + expect(googleTranslateInfo.status).toBe('implemented'); + expect(googleTranslateInfo.models).toContain('default'); + expect(googleTranslateInfo.description).toContain('free'); + }); + + it('should validate google-translate as a supported provider', () => { + expect(ProviderFactory.isProviderSupported('google-translate')).toBe(true); + }); + }); + + describe('End-to-end translation simulation', () => { + it('should translate batch in dry run mode', async () => { + const provider = new GoogleTranslateProvider( + { sourceLanguage: 'en' }, + mockLogger + ); + + const batch = [ + { msgid: 'Hello', msgid_plural: null }, + { msgid: 'Goodbye', 
msgid_plural: null }, + { msgid: 'Thank you', msgid_plural: null }, + ]; + + const result = await provider.translateBatch( + batch, + 'fr', + 'default', + 'Translate to French', + 3, + 1000, + 30000, + true // dry run + ); + + expect(result.success).toBe(true); + expect(result.translations).toHaveLength(3); + expect(result.cost.totalCost).toBe(0); + + // Check that dry run markers are present + result.translations.forEach(trans => { + expect(trans.msgstr[0]).toContain('[DRY RUN]'); + }); + }); + + it('should handle multiple plural forms in dry run', async () => { + const provider = new GoogleTranslateProvider( + { sourceLanguage: 'en' }, + mockLogger + ); + + const batch = [ + { msgid: '%d file', msgid_plural: '%d files' }, + ]; + + const result = await provider.translateBatch( + batch, + 'fr', + 'default', + 'Translate to French', + 3, + 1000, + 30000, + true, + null, + null, + 2 // plural count + ); + + expect(result.success).toBe(true); + expect(result.translations[0].msgstr).toHaveLength(2); + }); + }); + + describe('Configuration validation', () => { + it('should fail validation for missing sourceLanguage', () => { + const provider = new GoogleTranslateProvider({}, mockLogger); + const result = provider.validateConfig({}); + + expect(result.isValid).toBe(false); + expect(result.errors.length).toBeGreaterThan(0); + }); + + it('should pass validation with sourceLanguage', () => { + const provider = new GoogleTranslateProvider({}, mockLogger); + const result = provider.validateConfig({ sourceLanguage: 'en' }); + + expect(result.isValid).toBe(true); + }); + }); + + describe('API URL construction', () => { + it('should construct valid Google Translate API URL', () => { + const provider = new GoogleTranslateProvider( + { sourceLanguage: 'en' }, + mockLogger + ); + + const url = provider._buildUrl('Hello World', 'en', 'fr'); + + // URLSearchParams uses form encoding, so the space is serialized as '+', not '%20' + expect(url).toBe('https://translate.googleapis.com/translate_a/single?client=gtx&sl=en&tl=fr&dt=t&q=Hello+World'); + }); + + it('should 
handle special characters in text', () => { + const provider = new GoogleTranslateProvider( + { sourceLanguage: 'en' }, + mockLogger + ); + + const url = provider._buildUrl('Hello & goodbye ', 'en', 'fr'); + + // URL encoding should handle special characters + expect(url).toContain('client=gtx'); + expect(url).toContain('sl=en'); + expect(url).toContain('tl=fr'); + expect(url).toContain('q='); + }); + }); + + describe('Pricing configuration', () => { + it('should have zero cost for all translations', async () => { + const provider = new GoogleTranslateProvider( + { sourceLanguage: 'en' }, + mockLogger + ); + + await provider.initialize(); + + const pricing = provider.getModelPricing('default'); + expect(pricing.prompt).toBe(0); + expect(pricing.completion).toBe(0); + }); + + it('should calculate zero cost', () => { + const provider = new GoogleTranslateProvider( + { sourceLanguage: 'en' }, + mockLogger + ); + + const cost = provider.calculateCost({ tokens: 1000 }, 'default'); + expect(cost.totalCost).toBe(0); + }); + }); +}); \ No newline at end of file diff --git a/tests/unit/googleTranslateProvider.test.js b/tests/unit/googleTranslateProvider.test.js new file mode 100644 index 0000000..fa3ec5c --- /dev/null +++ b/tests/unit/googleTranslateProvider.test.js @@ -0,0 +1,223 @@ +import { describe, it, expect, vi, beforeEach } from 'vitest'; +import { GoogleTranslateProvider } from '../../src/providers/google-translate/GoogleTranslateProvider.js'; + +// Mock logger +const createMockLogger = () => ({ + debug: vi.fn(), + info: vi.fn(), + warn: vi.fn(), + error: vi.fn(), + success: vi.fn(), +}); + +describe('GoogleTranslateProvider', () => { + let provider; + let mockLogger; + + beforeEach(() => { + mockLogger = createMockLogger(); + provider = new GoogleTranslateProvider( + { sourceLanguage: 'en' }, + mockLogger + ); + }); + + describe('constructor', () => { + it('should create a new provider instance', () => { + expect(provider).toBeDefined(); + 
expect(provider.baseUrl).toBe('https://translate.googleapis.com/translate_a/single'); + }); + }); + + describe('getProviderName', () => { + it('should return google-translate as the provider name', () => { + expect(provider.getProviderName()).toBe('google-translate'); + }); + }); + + describe('getSupportedModels', () => { + it('should return default model', () => { + expect(provider.getSupportedModels()).toEqual(['default']); + }); + }); + + describe('validateConfig', () => { + it('should return valid for config with sourceLanguage', () => { + const result = provider.validateConfig({ sourceLanguage: 'en' }); + expect(result.isValid).toBe(true); + expect(result.errors).toHaveLength(0); + }); + + it('should return errors when sourceLanguage is missing', () => { + const result = provider.validateConfig({}); + expect(result.isValid).toBe(false); + expect(result.errors).toContain('Source language is required for Google Translate'); + }); + }); + + describe('calculateCost', () => { + it('should always return zero cost', () => { + const cost = provider.calculateCost({}, 'default'); + expect(cost.totalCost).toBe(0); + expect(cost.promptCost).toBe(0); + expect(cost.completionCost).toBe(0); + expect(cost.model).toBe('google-translate'); + }); + }); + + describe('getTokenCount', () => { + it('should return estimated token count based on character length', async () => { + const count = await provider.getTokenCount('hello world', 'default'); + expect(count).toBe(Math.ceil('hello world'.length / 4)); + }); + + it('should return 0 for empty text', async () => { + const count = await provider.getTokenCount('', 'default'); + expect(count).toBe(0); + }); + + it('should return 0 for null text', async () => { + const count = await provider.getTokenCount(null, 'default'); + expect(count).toBe(0); + }); + }); + + describe('estimateOutputTokens', () => { + it('should estimate output as 1.2x input', () => { + const output = provider.estimateOutputTokens(100, 'fr'); + 
expect(output).toBe(120); + }); + }); + + describe('getModelPricing', () => { + it('should return zero pricing', () => { + const pricing = provider.getModelPricing('default'); + expect(pricing.prompt).toBe(0); + expect(pricing.completion).toBe(0); + }); + }); + + describe('_getFallbackPricing', () => { + it('should return zero pricing structure', () => { + const fallback = provider._getFallbackPricing(); + expect(fallback.models.default.prompt).toBe(0); + expect(fallback.models.default.completion).toBe(0); + expect(fallback.fallback.prompt).toBe(0); + expect(fallback.fallback.completion).toBe(0); + }); + }); + + describe('_buildUrl', () => { + it('should build correct URL with parameters', () => { + const url = provider._buildUrl('hello', 'en', 'fr'); + expect(url).toContain('client=gtx'); + expect(url).toContain('sl=en'); + expect(url).toContain('tl=fr'); + expect(url).toContain('dt=t'); + expect(url).toContain('q=hello'); + }); + }); + + describe('_parseResponse', () => { + it('should parse Google Translate response format', () => { + // Simulate Google Translate response: [[translated_text, original, detected_lang], ...] 
+ const mockData = [[['Bonjour le monde', 'Hello world']]]; + const result = provider._parseResponse(mockData); + expect(result).toBe('Bonjour le monde'); + }); + + it('should handle array of segments', () => { + // Multiple segments in response; only the first element of each segment is the translation + const mockData = [[['Hello', 'Hola'], ['world', 'mundo']]]; + const result = provider._parseResponse(mockData); + expect(result).toBe('Helloworld'); + }); + + it('should throw error for invalid response', () => { + expect(() => provider._parseResponse(null)).toThrow('Invalid response format'); + expect(() => provider._parseResponse([])).toThrow('Invalid response format'); + }); + }); + + describe('translateBatch (dry run)', () => { + it('should return mock translations in dry run mode', async () => { + const batch = [ + { msgid: 'Hello', msgid_plural: null }, + { msgid: 'World', msgid_plural: null } + ]; + + const result = await provider.translateBatch( + batch, + 'fr', + 'default', + 'system prompt', + 3, + 1000, + 30000, + true // dry run + ); + + expect(result.success).toBe(true); + expect(result.translations).toHaveLength(2); + expect(result.translations[0].msgstr[0]).toContain('[DRY RUN]'); + expect(result.cost.totalCost).toBe(0); + expect(result.isDryRun).toBe(true); + }); + + it('should handle plural forms in dry run', async () => { + const batch = [ + { msgid: 'One item', msgid_plural: '%d items', msgstr: [''] } + ]; + + const result = await provider.translateBatch( + batch, + 'fr', + 'default', + 'system prompt', + 3, + 1000, + 30000, + true + ); + + expect(result.success).toBe(true); + expect(result.translations[0].msgstr).toHaveLength(1); // pluralCount defaults to 1 + }); + }); + + describe('_shouldStopRetrying', () => { + it('should return false for timeout errors (retriable)', () => { + const error = new Error('Request timeout'); + expect(provider._shouldStopRetrying(error)).toBe(false); + }); + + it('should return false for other errors (retriable)', () => { + const error = new Error('Network error'); +
expect(provider._shouldStopRetrying(error)).toBe(false); + }); + }); + + describe('_notifyRetryProgress', () => { + it('should call callback with progress info', () => { + const callback = vi.fn(); + provider._notifyRetryProgress(callback, 1, 3, true); + + expect(callback).toHaveBeenCalledWith({ + isRetrying: true, + attempt: 1, + maxRetries: 3, + }); + }); + + it('should not call callback if not provided', () => { + expect(() => provider._notifyRetryProgress(null, 1, 3, true)).not.toThrow(); + }); + }); + + describe('_sleep', () => { + it('should return a promise', () => { + const result = provider._sleep(10); + expect(result).toBeInstanceOf(Promise); + }); + }); +}); \ No newline at end of file
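
Review note: the behavior of `_buildUrl` and `_parseResponse` on representative inputs can be checked in isolation. The sketch below re-implements both helpers as free functions (`buildUrl` and `parseResponse` are illustrative names, not part of the diff) so they run without the `Provider` base class. The sample payload is an assumption based on the comments in `_parseResponse`: `data[0]` is taken to be a list of segments, each shaped `[translated, original, ...]`. The endpoint's real payload may differ.

```javascript
// Standalone sketch of the two helpers from GoogleTranslateProvider.
// Assumption: same base URL and query parameters as in the diff.
const BASE_URL = 'https://translate.googleapis.com/translate_a/single';

// Mirrors _buildUrl: URLSearchParams applies form encoding to each value.
function buildUrl(text, sourceLang, targetLang) {
  const params = new URLSearchParams({
    client: 'gtx',
    sl: sourceLang,
    tl: targetLang,
    dt: 't',
    q: text,
  });

  return `${BASE_URL}?${params.toString()}`;
}

// Mirrors _parseResponse: joins the first element of every segment in data[0].
function parseResponse(data) {
  if (!Array.isArray(data) || data.length === 0) {
    throw new Error('Invalid response format from Google Translate');
  }

  const first = data[0];

  if (Array.isArray(first)) {
    return first
      .map((segment) => (Array.isArray(segment) ? segment[0] : segment))
      .join('');
  }

  return String(first);
}

// Form encoding turns a space into "+" and "&" into "%26".
console.log(buildUrl('Hello & goodbye', 'en', 'fr'));

// Hypothetical two-segment payload: each segment is [translated, original].
const sample = [[['Bonjour. ', 'Hello. '], ['Au revoir.', 'Goodbye.']], null, 'en'];
console.log(parseResponse(sample)); // prints "Bonjour. Au revoir."
```

Two points follow from this: any test that compares the full URL should expect `+` for spaces, and any mock response needs one array per segment, because a flat `[translated, original, lang]` array would be joined into a single string.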