
Adds the ability to use the system with or without LLaMa support #28

Merged
robertatakenaka merged 27 commits into scieloorg:main from pitangainnovare:impl/llama-or-not-llama on Oct 6, 2025
Conversation

@pitangainnovare (Contributor)

Description

This PR refactors the integration of the Llama language model, making it optional. Previously, the Llama dependencies and configuration were loaded unconditionally, increasing the Docker image size and the complexity of the environment. The main changes introduce a dedicated Docker environment for Llama, separate its dependencies, and improve error handling and model loading in the application.

Main Changes

1. Optional Docker Environment for Llama

  • New Dockerfiles (Dockerfile.llama): Dockerfile.llama files were created for the local and production environments. They contain all the build dependencies (cmake, ninja-build, etc.) and the Python packages required to run llama-cpp-python.
  • Simplified default Dockerfiles: The original Dockerfiles (Dockerfile) were cleaned up, removing all the conditional installation logic and Llama dependencies.
  • New Docker Compose file (llama.local.yml): A new compose file was added to bring up the local development environment with Llama enabled, using the new images. A sketch of the resulting compose layout follows this list.
  • Makefile: Added the build_llama target to build the Llama-specific Docker images.
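
To make the wiring concrete, here is a minimal sketch of what llama.local.yml could look like. The django service name, the Dockerfile.llama path, and the env file are confirmed elsewhere in this PR; the build context, and exposing LLAMA_ENABLED as an environment variable, are assumptions.

# Hypothetical sketch of llama.local.yml; not copied from the PR.
services:
  django:
    build:
      context: .
      dockerfile: ./compose/local/django/Dockerfile.llama
    env_file:
      - ./.envs/.local/.django
    environment:
      - LLAMA_ENABLED=true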

2. Dependency Management

  • Requirements split: The Llama dependencies (huggingface-hub, llama-cpp-python) were moved from requirements/base.txt into a new requirements/extra-llama.txt file.
  • The Llama Docker images now install the requirements from both files (base.txt + extra-llama.txt), while the default images install only base.txt (plus local.txt or production.txt), as sketched below.
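
A minimal sketch of how the local Dockerfile.llama might layer the two requirement files; only the file names come from this PR, while the COPY path and layer ordering are assumptions.

# Hypothetical excerpt from compose/local/django/Dockerfile.llama
COPY ./requirements /requirements
RUN pip install --no-cache-dir -r /requirements/base.txt \
    && pip install --no-cache-dir -r /requirements/local.txt \
    && pip install --no-cache-dir -r /requirements/extra-llama.txt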

3. Robustness and Code Improvements

  • Settings toggle: A new LLAMA_ENABLED variable was added to settings.py to globally enable or disable the Llama functionality.
  • GenericLlama refactoring:
    • Singleton pattern: The model is now loaded as a singleton (_cached_llm), guaranteeing it is initialized only once and improving performance.
    • Conditional loading: The class now checks whether LLAMA_ENABLED is True before attempting to load the model.
    • Exception handling: Custom exceptions were created (LlamaDisabledError, LlamaModelNotFoundError, LlamaNotInstalledError) to handle specific scenarios, such as the llama-cpp-python package not being installed or the model file not being found.
  • Error handling in marker.py:
    • The mark_reference function now catches the new exceptions.
    • On error, a GeneralEvent is recorded for traceability and a friendly message is returned to the user. A condensed sketch of this loading and exception flow follows this list.
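
A condensed sketch of the pattern described above, assuming simplified signatures. The names LLAMA_ENABLED, _cached_llm, and the three exception classes come from this PR; the function name and llama_cpp call details are illustrative.

# Hypothetical sketch; the actual GenericLlama implementation may differ.
from django.conf import settings

_cached_llm = None  # module-level singleton: the model is loaded at most once


class LlamaDisabledError(Exception): ...
class LlamaNotInstalledError(Exception): ...
class LlamaModelNotFoundError(Exception): ...


def get_llm(model_path):
    global _cached_llm
    if not getattr(settings, "LLAMA_ENABLED", False):
        raise LlamaDisabledError("LLaMA is disabled in settings.")
    if _cached_llm is None:
        try:
            # Deferred import: llama_cpp is only needed when Llama is enabled
            from llama_cpp import Llama
        except ImportError as exc:
            raise LlamaNotInstalledError("llama-cpp-python is not installed.") from exc
        try:
            _cached_llm = Llama(model_path=model_path)
        except ValueError as exc:
            raise LlamaModelNotFoundError(f"Model file not found: {model_path}") from exc
    return _cached_llm

In marker.py, mark_reference would then wrap its call to GenericLlama in a try/except over these three exceptions, record a GeneralEvent, and return the friendly message instead of propagating the error.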

How to Test

Scenario 1: Default Environment (Without Llama)

  1. Build the default image: make build
  2. Start the containers: make up
  3. The application should work normally.
  4. Try running a task that uses reference marking. The system should return the message "Llama model is not available: LLaMA is disabled in settings." (or similar) without breaking the application.

Scenario 2: Environment with Llama

  1. Add your Hugging Face token to the HF_TOKEN variable in the .envs/.local/.django file.
  2. Build the image with Llama: make build_llama
  3. Start the containers using the new compose file: docker compose -f llama.local.yml up
  4. Download the model:
    docker compose -f llama.local.yml run --rm django python manage.py download_model
  5. Check that the model was downloaded into the llama3/llama-3.2 folder.
  6. Run the reference-marking functionality. It should process the text using the Llama model and return the expected result.
  7. (Optional) Remove the model file and run the functionality again to verify the LlamaModelNotFoundError error message.
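
For convenience, the happy path of the steps above consolidated into one shell sequence. It assumes HF_TOKEN is already set in .envs/.local/.django; the -d flag is added here so the next command can run in the same terminal.

make build_llama
docker compose -f llama.local.yml up -d
docker compose -f llama.local.yml run --rm django python manage.py download_model
ls llama3/llama-3.2  # the downloaded model file should appear here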

samuelveigarangel and others added 26 commits September 15, 2025 20:23
- The goal is to make local deployment easier
- Just run make build_llama (instead of make build)
- Delay the llama_cpp import to handle exceptions more gracefully
- Raise exceptions that must be caught by everything that uses GenericLlama
- Handle the exceptions raised in GenericLlama
- Catch and record exceptions with the Tracker
- Return a string indicating that the marking failed, when applicable
@gitguardian (bot) commented Oct 3, 2025

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request
GitGuardian id | GitGuardian status | Secret | Commit | Filename
- | - | Hugging Face user access token | b0f1a31 | .envs/.local/.django
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely, following best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.


Copilot AI left a comment

Pull Request Overview

This PR refactors the LLaMa model integration to make it optional, separating LLaMa-specific dependencies from the core application. This allows for lighter Docker images when LLaMa functionality is not needed while maintaining the ability to enable it when required.

  • Separated LLaMa dependencies into optional Docker images and requirements files
  • Added robust error handling with custom exceptions for LLaMa-related issues
  • Implemented singleton pattern for LLaMa model loading and configuration-based enabling

Reviewed Changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 5 comments.

Summary per file:

File | Description
requirements/extra-llama.txt | New file containing LLaMa-specific dependencies
requirements/base.txt | Removed LLaMa dependencies from core requirements
reference/wagtail_hooks.py | Migrated from deprecated wagtail_modeladmin to snippets
reference/models.py | Minor formatting and label improvements
reference/marker.py | Added comprehensive error handling for LLaMa operations
reference/management/commands/download_model.py | New management command for downloading LLaMa models
llama3/generic_llama.py | Refactored with singleton pattern and robust error handling
llama3/download_model.py | Removed obsolete hardcoded download script
llama.local.yml | New Docker Compose configuration for LLaMa-enabled development
config/settings/base.py | Added LLAMA_ENABLED setting and reorganized LLaMa configuration
compose/production/django/Dockerfile.llama | New production Docker image with LLaMa support
compose/production/django/Dockerfile | Simplified by removing LLaMa-specific build logic
compose/local/django/Dockerfile.llama | New local development Docker image with LLaMa support
compose/local/django/Dockerfile | Simplified by removing LLaMa-specific build logic
README.md | Added reference to LLaMa model setup guide
Makefile | Added build_llama command for LLaMa-enabled images
.envs/.local/.django | Added HF_TOKEN placeholder for Hugging Face authentication


Comment thread on compose/local/django/Dockerfile.llama (outdated)
ARG ENABLE_OPTIMIZATIONS=true
ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

Copilot AI Oct 3, 2025


The CMAKE_ARGS environment variable name on line 37 conflicts with the variable used in the generic_llama.py file for a different purpose. Consider using a more specific name like LLAMA_CMAKE_ARGS to avoid confusion.

Suggested change
CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
LLAMA_CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

Comment on lines +23 to +39
# Instalar gcc-10 y g++-10 en Debian Bullseye
RUN apt-get update && \
    apt-get install -y gcc-10 g++-10 ninja-build cmake && \
    update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 50 && \
    update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 50 && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Instalar ninja-build y cmake
RUN apt-get install -y ninja-build cmake

# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
ARG ENABLE_OPTIMIZATIONS=true
ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

# Actualizar pip, setuptools y wheel antes de instalar dependencias

Copilot AI Oct 3, 2025


Comment should be in English to maintain consistency with the rest of the codebase.

Suggested change (the quoted block above, with the comments in English):
# Install gcc-10 and g++-10 in Debian Bullseye
RUN apt-get update && \
    apt-get install -y gcc-10 g++-10 ninja-build cmake && \
    update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 50 && \
    update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 50 && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Install ninja-build and cmake
RUN apt-get install -y ninja-build cmake

# Set environment variables to conditionally compile with BLAS and SIMD
ARG ENABLE_OPTIMIZATIONS=true
ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

# Upgrade pip, setuptools, and wheel before installing dependencies

# Instalar ninja-build y cmake
RUN apt-get install -y ninja-build cmake

# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente

Copilot AI Oct 3, 2025


Comment should be in English to maintain consistency with the rest of the codebase.

Suggested change
# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
# Configure environment variables to compile with BLAS and SIMD conditionally

CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

# Actualizar pip, setuptools y wheel antes de instalar dependencias

Copilot AI Oct 3, 2025


Comment should be in English to maintain consistency with the rest of the codebase.

Suggested change
# Actualizar pip, setuptools y wheel antes de instalar dependencias
# Update pip, setuptools, and wheel before installing dependencies

robertatakenaka merged commit c886e00 into scieloorg:main on Oct 6, 2025
2 of 3 checks passed