Add the ability to use the system with or without LLaMa support #28
Conversation
- The goal is to simplify local deployment: just run `make build_llama` (instead of `make build`).
- The `llama_cpp` import is deferred so exceptions can be handled more gracefully.
- `GenericLlama` raises exceptions that every caller must catch.
- Exceptions raised by `GenericLlama` are handled by catching and logging them with the Tracker, and by returning a string indicating that marking failed, when that is the case (see the sketch below).
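A minimal sketch of the handling described above, assuming a `mark_reference` entry point in `reference/marker.py` and a `Tracker` object with a `register` method; only the module and exception names come from this PR, the exact signatures are assumptions:

```python
# Hypothetical sketch; mark_reference, Tracker, and the exception names
# appear in this PR, but their signatures are assumptions.
from llama3.generic_llama import (
    GenericLlama,
    LlamaDisabledError,
    LlamaModelNotFoundError,
    LlamaNotInstalledError,
)

MARKING_FAILED = "Automatic marking failed."

def mark_reference(text: str, tracker) -> str:
    """Mark a reference with the LLM, degrading gracefully when LLaMa
    is disabled, not installed, or the model file is missing."""
    try:
        return GenericLlama().mark(text)  # assumed method name
    except (LlamaDisabledError, LlamaModelNotFoundError, LlamaNotInstalledError) as exc:
        # Log the failure for traceability instead of crashing the caller.
        tracker.register(event="llama_marking_failed", detail=str(exc))
        return MARKING_FAILED
```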
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| - | - | Hugging Face user access token | b0f1a31 | .envs/.local/.django |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn the best practices here.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and to ease remediation.
Pull Request Overview
This PR refactors the LLaMa model integration to make it optional, separating LLaMa-specific dependencies from the core application. This allows for lighter Docker images when LLaMa functionality is not needed while maintaining the ability to enable it when required.
- Separated LLaMa dependencies into optional Docker images and requirements files
- Added robust error handling with custom exceptions for LLaMa-related issues (sketched below)
- Implemented singleton pattern for LLaMa model loading and configuration-based enabling
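The three custom exceptions named in this review suggest a small hierarchy. The sketch below assumes a common `LlamaError` base class; only the three concrete names are confirmed by the PR:

```python
# Hypothetical reconstruction; only the three leaf exception names are
# confirmed by the review, the base class and docstrings are assumptions.
class LlamaError(Exception):
    """Base class for all LLaMa-related failures."""

class LlamaDisabledError(LlamaError):
    """Raised when LLAMA_ENABLED is False but the model is requested."""

class LlamaNotInstalledError(LlamaError):
    """Raised when the llama-cpp-python package is not importable."""

class LlamaModelNotFoundError(LlamaError):
    """Raised when the model file is missing from disk."""
```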
Reviewed Changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| requirements/extra-llama.txt | New file containing LLaMa-specific dependencies |
| requirements/base.txt | Removed LLaMa dependencies from core requirements |
| reference/wagtail_hooks.py | Migrated from deprecated wagtail_modeladmin to snippets |
| reference/models.py | Minor formatting and label improvements |
| reference/marker.py | Added comprehensive error handling for LLaMa operations |
| reference/management/commands/download_model.py | New management command for downloading LLaMa models |
| llama3/generic_llama.py | Refactored with singleton pattern and robust error handling |
| llama3/download_model.py | Removed obsolete hardcoded download script |
| llama.local.yml | New Docker Compose configuration for LLaMa-enabled development |
| config/settings/base.py | Added LLAMA_ENABLED setting and reorganized LLaMa configuration |
| compose/production/django/Dockerfile.llama | New production Docker image with LLaMa support |
| compose/production/django/Dockerfile | Simplified by removing LLaMa-specific build logic |
| compose/local/django/Dockerfile.llama | New local development Docker image with LLaMa support |
| compose/local/django/Dockerfile | Simplified by removing LLaMa-specific build logic |
| README.md | Added reference to LLaMa model setup guide |
| Makefile | Added build_llama command for LLaMa-enabled images |
| .envs/.local/.django | Added HF_TOKEN placeholder for Hugging Face authentication |
```dockerfile
ARG ENABLE_OPTIMIZATIONS=true
ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
```
The CMAKE_ARGS environment variable name on line 37 conflicts with the variable used in the generic_llama.py file for a different purpose. Consider using a more specific name like LLAMA_CMAKE_ARGS to avoid confusion.
Suggested change:
```diff
-    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
+    LLAMA_CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
```
```dockerfile
# Instalar gcc-10 y g++-10 en Debian Bullseye
RUN apt-get update && \
    apt-get install -y gcc-10 g++-10 ninja-build cmake && \
    update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 50 && \
    update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 50 && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Instalar ninja-build y cmake
RUN apt-get install -y ninja-build cmake

# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
ARG ENABLE_OPTIMIZATIONS=true
ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

# Actualizar pip, setuptools y wheel antes de instalar dependencias
```
Comment should be in English to maintain consistency with the rest of the codebase.
Suggested change:
```diff
-# Instalar gcc-10 y g++-10 en Debian Bullseye
+# Install gcc-10 and g++-10 in Debian Bullseye
 RUN apt-get update && \
     apt-get install -y gcc-10 g++-10 ninja-build cmake && \
     update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 50 && \
     update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 50 && \
     apt-get clean && rm -rf /var/lib/apt/lists/*
-# Instalar ninja-build y cmake
+# Install ninja-build and cmake
 RUN apt-get install -y ninja-build cmake
-# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
+# Set environment variables to conditionally compile with BLAS and SIMD
 ARG ENABLE_OPTIMIZATIONS=true
 ENV CFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
     CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
     CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"
-# Actualizar pip, setuptools y wheel antes de instalar dependencias
+# Upgrade pip, setuptools, and wheel before installing dependencies
```
```dockerfile
# Instalar ninja-build y cmake
RUN apt-get install -y ninja-build cmake

# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
```
Comment should be in English to maintain consistency with the rest of the codebase.
Suggested change:
```diff
-# Configurar variables de entorno para compilar con BLAS y SIMD condicionalmente
+# Configure environment variables to compile with BLAS and SIMD conditionally
```
```dockerfile
    CXXFLAGS="${ENABLE_OPTIMIZATIONS:+-mfma -mavx2}" \
    CMAKE_ARGS="${ENABLE_OPTIMIZATIONS:+-DGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS}"

# Actualizar pip, setuptools y wheel antes de instalar dependencias
```
Comment should be in English to maintain consistency with the rest of the codebase.
Suggested change:
```diff
-# Actualizar pip, setuptools y wheel antes de instalar dependencias
+# Update pip, setuptools, and wheel before installing dependencies
```
Add the ability to use the system with or without LLaMa support
Description
This PR refactors the Llama language model integration, making it optional. Previously, the Llama dependencies and configuration were loaded unconditionally, which increased the Docker image size and the complexity of the environment. The main changes introduce a dedicated Docker environment for Llama, separate its dependencies, and improve error handling and model loading in the application.
Main Changes
1. Optional Docker Environment for Llama
- New Dockerfiles (`Dockerfile.llama`): `Dockerfile.llama` files were created for the local and production environments. They contain all the build dependencies (`cmake`, `ninja-build`, etc.) and the Python packages required to run `llama-cpp-python`.
- Standard Dockerfiles (`Dockerfile`): cleaned up by removing all the conditional installation logic and the Llama dependencies.
- Docker Compose (`llama.local.yml`): a new compose file allows bringing up the local development environment with Llama enabled, using the new images.
- Makefile: a new `build_llama` target builds the Llama-specific Docker images.
2. Dependency Management
- The Llama dependencies (`huggingface-hub`, `llama-cpp-python`) were moved from `requirements/base.txt` to a new file, `requirements/extra-llama.txt`.
- The Llama images install both files (`base.txt` + `extra-llama.txt`), while the standard images install only `base.txt` (plus `local.txt` or `production.txt`).
3. Code Robustness and Improvements
- A `LLAMA_ENABLED` setting was added to `settings.py` to globally enable or disable the Llama functionality.
- `GenericLlama` (loading logic sketched below):
  - The loaded model is cached as a singleton (`_cached_llm`), ensuring it is initialized only once and improving performance.
  - It verifies that `LLAMA_ENABLED` is `True` before attempting to load the model.
  - Custom exceptions (`LlamaDisabledError`, `LlamaModelNotFoundError`, `LlamaNotInstalledError`) handle specific scenarios, such as the `llama-cpp-python` package not being installed or the model file not being found.
- `marker.py`:
  - `mark_reference` now catches the new exceptions.
  - A `GeneralEvent` is logged for traceability, and a friendly message is returned to the user.
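As an illustration, here is a minimal sketch of the lazy, cached loading described above. `GenericLlama`'s internals are not shown in this PR summary, so the function shape, the `LLAMA_MODEL_PATH` setting name, and the exception messages are assumptions; `llama_cpp.Llama(model_path=...)` is the real `llama-cpp-python` entry point:

```python
# Hypothetical sketch of the singleton loading in llama3/generic_llama.py;
# _cached_llm, LLAMA_ENABLED, and the exception names come from the PR,
# everything else is illustrative.
from pathlib import Path

from django.conf import settings

class LlamaError(Exception): ...
class LlamaDisabledError(LlamaError): ...
class LlamaNotInstalledError(LlamaError): ...
class LlamaModelNotFoundError(LlamaError): ...

_cached_llm = None  # module-level singleton: the model is loaded once per process

def get_llm():
    global _cached_llm
    if _cached_llm is not None:
        return _cached_llm
    if not getattr(settings, "LLAMA_ENABLED", False):
        raise LlamaDisabledError("LLAMA_ENABLED is False.")
    try:
        # Deferred import so the core image can run without llama-cpp-python.
        from llama_cpp import Llama
    except ImportError as exc:
        raise LlamaNotInstalledError("llama-cpp-python is not installed.") from exc
    model_path = Path(settings.LLAMA_MODEL_PATH)  # assumed setting name
    if not model_path.exists():
        raise LlamaModelNotFoundError(f"Model file not found: {model_path}")
    _cached_llm = Llama(model_path=str(model_path))
    return _cached_llm
```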
How to Test
Scenario 1: Standard Environment (without Llama)
1. Run `make build`.
2. Run `make up`.
Scenario 2: Environment with Llama
1. Set `HF_TOKEN` in the `.envs/.local/.django` file (see the download sketch below).
2. Run `make build_llama`.
3. Run `docker compose -f llama.local.yml up`.
4. The model is expected under `llama3/llama-3.2`; if the file is missing, a `LlamaModelNotFoundError` is raised.
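For completeness, the `download_model` management command listed in the review's file table might look roughly like the sketch below; the command path and `HF_TOKEN` come from this PR, while the `LLAMA_REPO_ID`, `LLAMA_MODEL_FILENAME`, and `LLAMA_MODEL_DIR` settings are hypothetical placeholders:

```python
# Hypothetical sketch of reference/management/commands/download_model.py;
# only the command's existence and HF_TOKEN are confirmed by the PR.
import os

from django.conf import settings
from django.core.management.base import BaseCommand, CommandError
from huggingface_hub import hf_hub_download

class Command(BaseCommand):
    help = "Download the Llama model file from the Hugging Face Hub."

    def handle(self, *args, **options):
        token = os.environ.get("HF_TOKEN")
        if not token:
            raise CommandError("Set HF_TOKEN before downloading the model.")
        # hf_hub_download returns the local path of the downloaded file.
        path = hf_hub_download(
            repo_id=settings.LLAMA_REPO_ID,          # hypothetical setting
            filename=settings.LLAMA_MODEL_FILENAME,  # hypothetical setting
            local_dir=settings.LLAMA_MODEL_DIR,      # hypothetical setting
            token=token,
        )
        self.stdout.write(self.style.SUCCESS(f"Model downloaded to {path}"))
```

It would be invoked with `python manage.py download_model` inside the Llama-enabled container.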