This repository provides easy scripts to run PrimeQA applications via docker.
We use docker and docker-compose to run our application. Make sure you have the most up-to-date version of those tools.
OS: Ubuntu 20.04.4 LTS
Memory: 32GB (64GB - Recommended)
GPU: NVIDIA Corporation GV100GL [V100 PCIe 16GB]
NVIDIA Driver version: 470.141.03
Disk space: 50 GB required for Docker; 25 GB of free space needed in the Docker container storage
PrimeQA services now adds support for:
Rerankers
For more details:
Generative Readers
For more details on GenerativeReader and PromptReader:
PrimeQA services now adds support for BM25 and DPR Retrievers.
The `information.json` file in the index directory must include an `engine_type` field set to one of BM25, ColBERT or DPR.
If you have existing ColBERT indexes in `primeqa-store/indexes`, please update the `information.json` file in the index directory to include a configuration section as follows:

    "configuration": {
        "engine_type": "ColBERT",
        "checkpoint": "<checkpoint-dir-name>"
    }
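
For existing indexes, the update described above can also be scripted. The following is a minimal sketch and not part of the PrimeQA tooling: the helper names `add_configuration` and `update_information_file` are hypothetical, and it assumes that `engine_type` and `checkpoint` live inside the `configuration` section exactly as shown above.

```python
import json
from pathlib import Path

# Engine types listed in this README.
ALLOWED_ENGINE_TYPES = {"BM25", "ColBERT", "DPR"}


def add_configuration(info, engine_type, checkpoint=None):
    """Return a copy of an information.json dict with a configuration section."""
    if engine_type not in ALLOWED_ENGINE_TYPES:
        raise ValueError(f"engine_type must be one of {sorted(ALLOWED_ENGINE_TYPES)}")
    updated = dict(info)
    # Preserve any keys already present in an existing configuration section.
    configuration = dict(updated.get("configuration", {}))
    configuration["engine_type"] = engine_type
    if checkpoint is not None:
        configuration["checkpoint"] = checkpoint
    updated["configuration"] = configuration
    return updated


def update_information_file(path, engine_type, checkpoint=None):
    """Rewrite the information.json file at `path` in place (hypothetical helper)."""
    path = Path(path)
    info = json.loads(path.read_text(encoding="utf-8"))
    updated = add_configuration(info, engine_type, checkpoint)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
    return updated
```

Calling `update_information_file(<path to information.json>, "ColBERT", "<checkpoint-dir-name>")` would then add the section while leaving the other fields of the file untouched.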
- Set the environment variable `PUBLIC_IP` to the IP address of the host running the containers. This host must be reachable from the machine where you will open the browser; otherwise, please use VNC to access the host. If you access the application from a browser on the same machine, `PUBLIC_IP` can be set to `localhost`.

  ```
  export PUBLIC_IP=<hostname>
  ```

- Please ensure that the following three ports are free and available: `50051`, `50059` and `82`.
- Launch the container using `bash` in `cpu` (default) or `gpu` mode:
  - CPU mode (default): `launch.sh`
  - GPU mode: `launch.sh -m gpu`

  🚨 Note: This process will take a while to complete as it will download the necessary docker images and bring up the services.
- Run `docker ps` to verify that all three containers (primeqa-ui, primeqa-orchestrator and primeqa-service) are running.
- You will need to configure a few additional settings before first use. These settings are intentionally left blank for security purposes.
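
Before launching, the port requirement above can be checked with a short script. This is only a sketch: `port_in_use` and `busy_ports` are hypothetical helpers, and the check assumes that an occupied port has a TCP listener reachable on the local loopback address.

```python
import socket

# Ports this README asks to keep free.
REQUIRED_PORTS = (50051, 50059, 82)


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports=REQUIRED_PORTS):
    """Return the subset of `ports` that already have a listener."""
    return [port for port in ports if port_in_use(port)]


if __name__ == "__main__":
    busy = busy_ports()
    print("Ports already in use:", busy if busy else "none")
```

Connecting rather than binding is a deliberate choice here: binding to port 82 usually needs elevated privileges on Linux, whereas a connection attempt does not.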
- Settings are defined in the file `orchestrator-store/primeqa.json`. Create this file and copy-paste the Reader and Retriever settings that you would like to use from the examples below.

  a. To use the IBM® Watson Discovery retriever and PrimeQA reader, first configure an IBM® Watson Discovery Cloud instance using the instructions here and create a collection index.

      {
        "retrievers": {
          "Watson Discovery": {
            "service_endpoint": "<IBM® Watson Discovery Cloud/CP4D Instance Endpoint>",
            "service_api_key": "<API key (ONLY If using IBM® Watson Discovery Cloud instance)>",
            "service_project_id": "<IBM® Watson Discovery Project ID>"
          }
        },
        "readers": {
          "PrimeQA": {
            "service_endpoint": "primeqa:50051",
            "beta": 0.7
          }
        }
      }

  b. To use the PrimeQA retriever and PrimeQA reader, first set up the collection index for the Retriever using the instructions here.

      {
        "retrievers": {
          "PrimeQA": {
            "service_endpoint": "primeqa:50051"
          }
        },
        "readers": {
          "PrimeQA": {
            "service_endpoint": "primeqa:50051",
            "beta": 0.7
          }
        }
      }

  NOTE: The final scoring and ranking is done with a weighted sum of the Reader answer scores and the Retriever search hit scores. The `beta` field is the weight assigned to the reader scores and `1-beta` is the weight assigned to the retriever scores.
- Please allow 30 seconds for the primeqa-orchestrator to establish connectivity to IBM® Watson Discovery and the PrimeQA service.
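
To make the role of `beta` concrete, the weighted sum described in the note above can be written out directly. This is an illustrative sketch rather than the orchestrator's actual code: the function name `combined_score` and the sample scores are made up, and it assumes reader and retriever scores are already on comparable scales.

```python
def combined_score(reader_score, retriever_score, beta=0.7):
    """Weighted sum: beta weights the reader, (1 - beta) the retriever."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    return beta * reader_score + (1.0 - beta) * retriever_score


# Hypothetical (reader_score, retriever_score) pairs for two candidate answers.
candidates = {
    "answer A": (0.9, 0.2),
    "answer B": (0.6, 0.8),
}

# Rank candidates by their combined score, best first.
ranked = sorted(
    candidates,
    key=lambda name: combined_score(*candidates[name]),
    reverse=True,
)
print(ranked)
```

With `beta` at 0.7 the reader dominates, so answer A (about 0.69) ranks above answer B (about 0.66). Lowering `beta` to 0.3 shifts the weight to the retriever and reverses the order.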
- You can test the PrimeQA orchestrator's connectivity to your IBM® Watson Discovery (WD) instance by executing the [GET] `/retrievers/{retriever_id}/collections` endpoint.

      curl -X 'GET' "http://${PUBLIC_IP}:50059/retrievers/WatsonDiscovery/collections" -H 'accept: application/json'

- To see all available retrievers, execute the [GET] `/retrievers` endpoint.

      curl -X 'GET' "http://${PUBLIC_IP}:50059/retrievers" -H 'accept: application/json'

- To run a sample question answering query, execute the [POST] `/ask` endpoint.

  a. Using the IBM® Watson Discovery Retriever (you must provide the name of your `<collection_id>`):

      curl -X 'POST' "http://${PUBLIC_IP}:50059/ask" \
        -H 'accept: application/json' \
        -H 'Content-Type: application/json' \
        -d '{
          "question": "<SAMPLE QUERY>",
          "retriever": {
            "retriever_id": "WatsonDiscovery"
          },
          "collection": {
            "collection_id": "<collection_id> from collections returned by [GET]/collections API.",
            "name": "Name of corresponding collection"
          },
          "reader": {
            "reader_id": "ExtractiveReader"
          }
        }'

  b. Using the PrimeQA Retriever (you must provide the name of your `<collection_id>`):

      curl -X 'POST' "http://${PUBLIC_IP}:50059/ask" \
        -H 'accept: application/json' \
        -H 'Content-Type: application/json' \
        -d '{
          "question": "<SAMPLE QUERY>",
          "retriever": {
            "retriever_id": "ColBERTRetriever"
          },
          "collection": {
            "collection_id": "<collection_id> from collections returned by [GET]/collections API.",
            "name": "Name of corresponding collection"
          },
          "reader": {
            "reader_id": "ExtractiveReader"
          }
        }'
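
The same `/ask` request can be issued from Python. The sketch below uses only the standard library and mirrors the JSON body of the curl examples above. `build_ask_payload` and `ask` are hypothetical helper names, and the response is assumed to be JSON.

```python
import json
import os
import urllib.request


def build_ask_payload(question, retriever_id, collection_id,
                      collection_name=None, reader_id="ExtractiveReader"):
    """Build the JSON body used by the curl examples for POST /ask."""
    collection = {"collection_id": collection_id}
    if collection_name is not None:
        collection["name"] = collection_name
    return {
        "question": question,
        "retriever": {"retriever_id": retriever_id},
        "collection": collection,
        "reader": {"reader_id": reader_id},
    }


def ask(payload, host=None, port=50059, timeout=60):
    """POST the payload to the orchestrator and return the decoded response."""
    # Fall back to the PUBLIC_IP environment variable, then to localhost.
    host = host or os.environ.get("PUBLIC_IP", "localhost")
    request = urllib.request.Request(
        f"http://{host}:{port}/ask",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `ask(build_ask_payload("<SAMPLE QUERY>", "ColBERTRetriever", "<collection_id>"))` would send the request from example b once the services are running.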
- To run reading:

      curl -X 'POST' \
        "http://${PUBLIC_IP}:50059/GetAnswersRequest" \
        -H 'accept: application/json' \
        -H 'Content-Type: application/json' \
        -d '{
          "question": "Where was Genghis Khan buried?",
          "contexts": [
            "Before Genghis Khan died, he assigned Ögedei Khan as his successor and split his empire into khanates among his sons and grandsons. He died in 1227 after defeating the Western Xia. He was buried in an unmarked grave somewhere in Mongolia at an unknown location. His descendants extended the Mongol Empire across most of Eurasia by conquering or creating vassal states out of all of modern-day China, Korea, the Caucasus, Central Asia, and substantial portions of modern Eastern Europe, Russia, and Southwest Asia. Many of these invasions repeated the earlier large-scale slaughters of local populations. As a result, Genghis Khan and his empire have a fearsome reputation in local histories.."
          ],
          "reader": {
            "reader_id": "ExtractiveReader",
            "parameters": [
              {
                "parameter_id": "max_num_answers",
                "value": 5
              }
            ]
          }
        }'

  Example Answer:

      [
        {
          "text": "Mongolia at an unknown location",
          "confidence_score": 1,
          "start_char_offset": 229,
          "end_char_offset": 260,
          "context_index": 0
        }
      ]
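
The `start_char_offset` and `end_char_offset` fields can be read as a half-open character range into the context selected by `context_index`. That reading is inferred from the example above, where it reproduces the returned `text`, and not from the API documentation. The helper name `answer_span` is hypothetical.

```python
def answer_span(contexts, answer):
    """Slice the answer text out of its context using the character offsets."""
    context = contexts[answer["context_index"]]
    return context[answer["start_char_offset"]:answer["end_char_offset"]]


# Opening sentences of the context from the request above.
contexts = [
    "Before Genghis Khan died, he assigned \u00d6gedei Khan as his successor "
    "and split his empire into khanates among his sons and grandsons. "
    "He died in 1227 after defeating the Western Xia. "
    "He was buried in an unmarked grave somewhere in Mongolia at an unknown location."
]

# The example answer returned by the service.
answer = {
    "text": "Mongolia at an unknown location",
    "confidence_score": 1,
    "start_char_offset": 229,
    "end_char_offset": 260,
    "context_index": 0,
}

# Check that the offsets point at the returned answer text.
print(answer_span(contexts, answer) == answer["text"])
```

A check like this is a cheap way to confirm that the contexts sent in a request and the offsets in the response still line up, for example when highlighting answers in a custom front end.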
You can now open a browser of your choice (Mozilla Firefox/Google Chrome) and visit "http://{PUBLIC_IP}:82" to interact with the PrimeQA application. You will see our Retrieval, Reader and QuestionAnswering components. Some features include the ability to adjust settings and for users to provide feedback on retrieved answers.
Users can provide feedback via the 👍 and 👎 icons to the answers shown in the results page.
To use the feedback to fine-tune your Reader model:
- Get the feedback data:

      curl -X 'GET' \
        'http://localhost:50059/feedbacks?application=reading&application=qa&_format=primeqa' \
        -H 'accept: application/json' > feedbacks.json

- Follow the instructions on how to fine-tune a PrimeQA reader with custom data here. Generally, fine-tuning would start with the model used when collecting the feedback data, as specified in the `Model` field under `Reader` settings in the `Reading` and/or `QuestionAnswering` UI.
- To deploy the fine-tuned model, follow the instructions here.
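
The feedback download can also be scripted. The sketch below rebuilds the query string of the curl command above, including the repeated `application` parameter, using only the standard library. `build_feedback_url` and `download_feedback` are hypothetical helper names, and the response is assumed to be JSON.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_feedback_url(host="localhost", port=50059, applications=("reading", "qa")):
    """Build the /feedbacks URL used by the curl command above."""
    # A list of pairs lets the same key ("application") appear more than once.
    params = [("application", app) for app in applications]
    params.append(("_format", "primeqa"))
    return f"http://{host}:{port}/feedbacks?{urlencode(params)}"


def download_feedback(path="feedbacks.json", **kwargs):
    """Fetch the feedback data and save it to `path` (assumes a JSON response)."""
    with urllib.request.urlopen(build_feedback_url(**kwargs), timeout=60) as response:
        feedback = json.loads(response.read().decode("utf-8"))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(feedback, handle, indent=2)
    return feedback


print(build_feedback_url())
```

Passing `applications=("reading",)` would restrict the export to feedback collected in a single application.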
a. If the UI is not loading properly or a field is blank, please try these quick steps:
- clear the browser cache and retry
- restart the containers by running `terminate.sh` and then `launch.sh`
b. To view the logs, use the docker logs command, for example:
```
docker logs primeqa-ui
docker logs primeqa-orchestrator
docker logs primeqa-services
```
- How do I switch to a different PrimeQA Reader model from the Hugging Face model hub?

  Paste the model name from the Hugging Face model hub into the `Model` field under `Reader` settings in the `Reading` and/or `QuestionAnswering` UI.

  IMPORTANT: Only models trained using PrimeQA are supported. Other models based on Hugging Face QA models will not work.

- How do I use my custom model for the reader in the `Reading` or `QA` application?

  By default, the reader initializes the `PrimeQA/nq_tydi_sq1-reader-xlmr_large-20221110` model from the Hugging Face model hub.

  To use your own reader model, place your model in a directory under the `primeqa-store/models` directory. To point to your model from the UI, navigate to `Application Settings`, scroll down to `Reader Settings` and then to `Model`, and set it to `/store/model/<model-dir>`, replacing `model-dir` with the name of the directory containing the model files.

  The service will load the model and initialize a new reader. This may take a few minutes. Subsequent queries will use this model.

- How do I use my ColBERT index and checkpoint?

  Please follow the instructions here

- The Corpus field is blank in the 'Retriever' or 'Question Answering' page

  See Troubleshooting
