Add the ability to use the system with or without LLaMa support #28
Conversation
- The goal is to simplify local deployment: just run `make build_llama` (instead of `make build`).
- The `llama_cpp` import is deferred so exceptions can be handled more gracefully.
- `GenericLlama` raises exceptions that every caller must catch.
- Exceptions raised by `GenericLlama` are handled by catching and logging them with the Tracker, and by returning a string indicating that marking failed, when that is the case (see the sketch below).
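A minimal sketch of the handling described above, assuming a `mark_reference` entry point in `reference/marker.py` and a `Tracker` object with a `register` method; only the module and exception names come from this PR, the exact signatures are assumptions:

```python
# Hypothetical sketch; mark_reference, Tracker, and the exception names
# appear in this PR, but their signatures are assumptions.
from llama3.generic_llama import (
    GenericLlama,
    LlamaDisabledError,
    LlamaModelNotFoundError,
    LlamaNotInstalledError,
)

MARKING_FAILED = "Automatic marking failed."

def mark_reference(text: str, tracker) -> str:
    """Mark a reference with the LLM, degrading gracefully when LLaMa
    is disabled, not installed, or the model file is missing."""
    try:
        return GenericLlama().mark(text)  # assumed method name
    except (LlamaDisabledError, LlamaModelNotFoundError, LlamaNotInstalledError) as exc:
        # Log the failure for traceability instead of crashing the caller.
        tracker.register(event="llama_marking_failed", detail=str(exc))
        return MARKING_FAILED
```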
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| - | - | Hugging Face user access token | b0f1a31 | .envs/.local/.django |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn the best practices here.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and to ease remediation.
Pull Request Overview
This PR refactors the LLaMa model integration to make it optional, separating LLaMa-specific dependencies from the core application. This allows for lighter Docker images when LLaMa functionality is not needed while maintaining the ability to enable it when required.
- Separated LLaMa dependencies into optional Docker images and requirements files
- Added robust error handling with custom exceptions for LLaMa-related issues (sketched below)
- Implemented singleton pattern for LLaMa model loading and configuration-based enabling
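The three custom exceptions named in this review suggest a small hierarchy. The sketch below assumes a common `LlamaError` base class; only the three concrete names are confirmed by the PR:

```python
# Hypothetical reconstruction; only the three leaf exception names are
# confirmed by the review, the base class and docstrings are assumptions.
class LlamaError(Exception):
    """Base class for all LLaMa-related failures."""

class LlamaDisabledError(LlamaError):
    """Raised when LLAMA_ENABLED is False but the model is requested."""

class LlamaNotInstalledError(LlamaError):
    """Raised when the llama-cpp-python package is not importable."""

class LlamaModelNotFoundError(LlamaError):
    """Raised when the model file is missing from disk."""
```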
Reviewed Changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| requirements/extra-llama.txt | New file containing LLaMa-specific dependencies |
| requirements/base.txt | Removed LLaMa dependencies from core requirements |
| reference/wagtail_hooks.py | Migrated from deprecated wagtail_modeladmin to snippets |
| reference/models.py | Minor formatting and label improvements |
| reference/marker.py | Added comprehensive error handling for LLaMa operations |
| reference/management/commands/download_model.py | New management command for downloading LLaMa models |
| llama3/generic_llama.py | Refactored with singleton pattern and robust error handling |
| llama3/download_model.py | Removed obsolete hardcoded download script |
| llama.local.yml | New Docker Compose configuration for LLaMa-enabled development |
| config/settings/base.py | Added LLAMA_ENABLED setting and reorganized LLaMa configuration |
| compose/production/django/Dockerfile.llama | New production Docker image with LLaMa support |
| compose/production/django/Dockerfile | Simplified by removing LLaMa-specific build logic |
| compose/local/django/Dockerfile.llama | New local development Docker image with LLaMa support |
| compose/local/django/Dockerfile | Simplified by removing LLaMa-specific build logic |
| README.md | Added reference to LLaMa model setup guide |
| Makefile | Added build_llama command for LLaMa-enabled images |
| .envs/.local/.django | Added HF_TOKEN placeholder for Hugging Face authentication |
```dockerfile
ARG ENABLE_OPTIMIZATIONS=true
ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
```
The CMAKE_ARGS environment variable name on line 37 conflicts with the variable used in the generic_llama.py file for a different purpose. Consider using a more specific name like LLAMA_CMAKE_ARGS to avoid confusion.
Suggested change:
```diff
-    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
+    LLAMA_CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
```
```dockerfile
# Instalar gcc-10 y g++-10 en Debian Bullseye
RUN apt-get update && \
    apt-get install -y gcc-10 g++-10 ninja-build cmake && \
    update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 50 && \
    update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 50 && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Instalar ninja-build y cmake
RUN apt-get install -y ninja-build cmake

# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
ARG ENABLE_OPTIMIZATIONS=true
ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

# Actualizar pip, setuptools y wheel antes de instalar dependencias
```
Comment should be in English to maintain consistency with the rest of the codebase.
Suggested change:
```diff
-# Instalar gcc-10 y g++-10 en Debian Bullseye
+# Install gcc-10 and g++-10 in Debian Bullseye
 RUN apt-get update && \
     apt-get install -y gcc-10 g++-10 ninja-build cmake && \
     update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 50 && \
     update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 50 && \
     apt-get clean && rm -rf /var/lib/apt/lists/*
-# Instalar ninja-build y cmake
+# Install ninja-build and cmake
 RUN apt-get install -y ninja-build cmake
-# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
+# Set environment variables to conditionally compile with BLAS and SIMD
 ARG ENABLE_OPTIMIZATIONS=true
 ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
     CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
     CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
-# Actualizar pip, setuptools y wheel antes de instalar dependencias
+# Upgrade pip, setuptools, and wheel before installing dependencies
```
```dockerfile
# Instalar ninja-build y cmake
RUN apt-get install -y ninja-build cmake

# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
```
Comment should be in English to maintain consistency with the rest of the codebase.
Suggested change:
```diff
-# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
+# Configure environment variables to compile with BLAS and SIMD conditionally
```
```dockerfile
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

# Actualizar pip, setuptools y wheel antes de instalar dependencias
```
Comment should be in English to maintain consistency with the rest of the codebase.
Suggested change:
```diff
-# Actualizar pip, setuptools y wheel antes de instalar dependencias
+# Update pip, setuptools, and wheel before installing dependencies
```
Add the ability to use the system with or without LLaMa support
Description
This PR refactors the Llama language model integration, making it optional. Previously, the Llama dependencies and configuration were loaded unconditionally, which increased the Docker image size and the complexity of the environment. The main changes introduce a dedicated Docker environment for Llama, separate its dependencies, and improve error handling and model loading in the application.
Main Changes
1. Optional Docker Environment for Llama
- New Dockerfiles (`Dockerfile.llama`): `Dockerfile.llama` files were created for the local and production environments. They contain all the build dependencies (`cmake`, `ninja-build`, etc.) and the Python packages required to run `llama-cpp-python`.
- Standard Dockerfiles (`Dockerfile`): cleaned up by removing all the conditional installation logic and the Llama dependencies.
- Docker Compose (`llama.local.yml`): a new compose file allows bringing up the local development environment with Llama enabled, using the new images.
- Makefile: a new `build_llama` target builds the Llama-specific Docker images.
2. Dependency Management
- The Llama dependencies (`huggingface-hub`, `llama-cpp-python`) were moved from `requirements/base.txt` to a new file, `requirements/extra-llama.txt`.
- The Llama images install both files (`base.txt` + `extra-llama.txt`), while the standard images install only `base.txt` (plus `local.txt` or `production.txt`).
3. Code Robustness and Improvements
- A `LLAMA_ENABLED` setting was added to `settings.py` to globally enable or disable the Llama functionality.
- `GenericLlama` (loading logic sketched below):
  - The loaded model is cached as a singleton (`_cached_llm`), ensuring it is initialized only once and improving performance.
  - It verifies that `LLAMA_ENABLED` is `True` before attempting to load the model.
  - Custom exceptions (`LlamaDisabledError`, `LlamaModelNotFoundError`, `LlamaNotInstalledError`) handle specific scenarios, such as the `llama-cpp-python` package not being installed or the model file not being found.
- `marker.py`:
  - `mark_reference` now catches the new exceptions.
  - A `GeneralEvent` is logged for traceability, and a friendly message is returned to the user.
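As an illustration, here is a minimal sketch of the lazy, cached loading described above. `GenericLlama`'s internals are not shown in this PR summary, so the function shape, the `LLAMA_MODEL_PATH` setting name, and the exception messages are assumptions; `llama_cpp.Llama(model_path=...)` is the real `llama-cpp-python` entry point:

```python
# Hypothetical sketch of the singleton loading in llama3/generic_llama.py;
# _cached_llm, LLAMA_ENABLED, and the exception names come from the PR,
# everything else is illustrative.
from pathlib import Path

from django.conf import settings

class LlamaError(Exception): ...
class LlamaDisabledError(LlamaError): ...
class LlamaNotInstalledError(LlamaError): ...
class LlamaModelNotFoundError(LlamaError): ...

_cached_llm = None  # module-level singleton: the model is loaded once per process

def get_llm():
    global _cached_llm
    if _cached_llm is not None:
        return _cached_llm
    if not getattr(settings, "LLAMA_ENABLED", False):
        raise LlamaDisabledError("LLAMA_ENABLED is False.")
    try:
        # Deferred import so the core image can run without llama-cpp-python.
        from llama_cpp import Llama
    except ImportError as exc:
        raise LlamaNotInstalledError("llama-cpp-python is not installed.") from exc
    model_path = Path(settings.LLAMA_MODEL_PATH)  # assumed setting name
    if not model_path.exists():
        raise LlamaModelNotFoundError(f"Model file not found: {model_path}")
    _cached_llm = Llama(model_path=str(model_path))
    return _cached_llm
```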
How to Test
Scenario 1: Standard Environment (without Llama)
1. Run `make build`.
2. Run `make up`.
Scenario 2: Environment with Llama
1. Set `HF_TOKEN` in the `.envs/.local/.django` file (see the download sketch below).
2. Run `make build_llama`.
3. Run `docker compose -f llama.local.yml up`.
4. The model is expected under `llama3/llama-3.2`; if the file is missing, a `LlamaModelNotFoundError` is raised.
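For completeness, the `download_model` management command listed in the review's file table might look roughly like the sketch below; the command path and `HF_TOKEN` come from this PR, while the `LLAMA_REPO_ID`, `LLAMA_MODEL_FILENAME`, and `LLAMA_MODEL_DIR` settings are hypothetical placeholders:

```python
# Hypothetical sketch of reference/management/commands/download_model.py;
# only the command's existence and HF_TOKEN are confirmed by the PR.
import os

from django.conf import settings
from django.core.management.base import BaseCommand, CommandError
from huggingface_hub import hf_hub_download

class Command(BaseCommand):
    help = "Download the Llama model file from the Hugging Face Hub."

    def handle(self, *args, **options):
        token = os.environ.get("HF_TOKEN")
        if not token:
            raise CommandError("Set HF_TOKEN before downloading the model.")
        # hf_hub_download returns the local path of the downloaded file.
        path = hf_hub_download(
            repo_id=settings.LLAMA_REPO_ID,          # hypothetical setting
            filename=settings.LLAMA_MODEL_FILENAME,  # hypothetical setting
            local_dir=settings.LLAMA_MODEL_DIR,      # hypothetical setting
            token=token,
        )
        self.stdout.write(self.style.SUCCESS(f"Model downloaded to {path}"))
```

It would be invoked with `python manage.py download_model` inside the Llama-enabled container.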