The PIXL CLI driver provides functionality to populate a queue with messages containing information required to run electronic health queries against the VNA image system. Once a set of queues is populated, the consumers can be started and updated, and the system extractions stopped cleanly.
- Python version 3.11 (matching python versions in pixl-ci and dev).
- Docker with version >=27.0.3
- Docker Compose with version >=v2.28.1-desktop.1
- We recommend installing the PIXL project in a dedicated virtual environment using an environment management tool such as conda or virtualenv. See detailed instructions here
Activate your Python virtual environment and install the PIXL project in editable mode by running:
python -m pip install -e ../pixl_core -e .
Note: The rabbitmq, export-api and imaging-api services must be started prior to using the CLI. This is done by spinning up the necessary Docker containers through docker compose.
For convenience, we provide the pixl dc command, which acts as a wrapper for docker compose, but takes care of some of the configuration for you.
See the commands and subcommands with
pixl --help
The rabbitmq and postgres services are configured by setting the following environment variables (default values shown):
RABBITMQ_HOST=localhost
RABBITMQ_PORT=7008
RABBITMQ_USERNAME=rabbitmq_username
RABBITMQ_PASSWORD=rabbitmq_password
POSTGRES_HOST=localhost
POSTGRES_PORT=7001
PIXL_DB_USER=pixl_db_username
PIXL_DB_PASSWORD=pixl_db_password
PIXL_DB_NAME=pixl
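As an illustration of how these variables and their defaults fit together, the following minimal sketch reads them from the environment and falls back to the values listed above. The helper name `load_service_config` is hypothetical and is not part of the PIXL code base; the CLI's own configuration loading may work differently.

```python
import os

# Defaults copied from the list above. The dictionary and the helper below
# are illustrative assumptions, not PIXL's actual configuration code.
DEFAULTS = {
    "RABBITMQ_HOST": "localhost",
    "RABBITMQ_PORT": "7008",
    "RABBITMQ_USERNAME": "rabbitmq_username",
    "RABBITMQ_PASSWORD": "rabbitmq_password",
    "POSTGRES_HOST": "localhost",
    "POSTGRES_PORT": "7001",
    "PIXL_DB_USER": "pixl_db_username",
    "PIXL_DB_PASSWORD": "pixl_db_password",
    "PIXL_DB_NAME": "pixl",
}


def load_service_config(environ=None):
    """Return each setting, preferring the environment over the default."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}
```

Calling `load_service_config()` with no arguments reads from the current process environment; passing a dictionary makes the behaviour easy to check without touching real environment variables.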
The rabbitmq queues for the imaging API are configured by setting:
PIXL_IMAGING_API_HOST=localhost
PIXL_IMAGING_API_PORT=7007
PIXL_IMAGING_API_RATE=1
where the *_RATE variables set the default querying rate for the message queues.
Populate queue for Imaging using parquet files:
pixl populate </path/to/parquet_dir>
where parquet_dir contains at least the following files:
parquet_dir
├── extract_summary.json
├── private
│   ├── PERSON_LINKS.parquet
│   └── PROCEDURE_OCCURRENCE_LINKS.parquet
└── public
    └── PROCEDURE_OCCURRENCE.parquet
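Before calling pixl populate, it can be useful to confirm that a directory matches the layout listed above. The sketch below is a standalone pre-flight check using only the standard library; the function name `missing_extract_files` is hypothetical and the check is not something the CLI is known to expose.

```python
from pathlib import Path

# Relative paths taken from the directory layout shown above.
REQUIRED_FILES = (
    "extract_summary.json",
    "private/PERSON_LINKS.parquet",
    "private/PROCEDURE_OCCURRENCE_LINKS.parquet",
    "public/PROCEDURE_OCCURRENCE.parquet",
)


def missing_extract_files(parquet_dir):
    """Return the required files that are absent from parquet_dir."""
    root = Path(parquet_dir)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

An empty list means all four files are present; any returned entries name the files that still need to be supplied.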
Alternatively, the queue can be populated based on records in CSV files:
pixl populate <path/to/file.csv>
One advantage of using a CSV file is that multiple projects can be listed for export in the file. Using the parquet format, in contrast, only supports exporting a single project per call to pixl populate.
Extraction will start automatically after populating the queues. If granular customisation of the rate per queue is required, or a queue should not be started, then supply the argument --no-start and use pixl start... to launch processing.
Once the messages have been processed, the OMOP extracts (including radiology reports) can be exported to a parquet file using:
pixl export-patient-data </path/to/parquet_dir>
Stop Imaging extraction
pixl stop
By default, messages will be sent to the queue with the lowest priority (1). To send to the queue with a different priority, you can use the --priority argument to populate:
pixl populate --priority 5 <path/to/file.csv>
priority must be an integer between 1 and 5, with 5 being the highest priority.
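The priority rule above can be expressed as a small range check. This is a minimal sketch in plain Python, assuming only the documented bounds of 1 and 5; the function name `validate_priority` is hypothetical, and the CLI itself may enforce the range differently (for example through click's option types).

```python
MIN_PRIORITY = 1  # lowest priority, the documented default
MAX_PRIORITY = 5  # highest priority


def validate_priority(value):
    """Convert value to an int and check it lies within the documented range."""
    priority = int(value)  # raises ValueError for non-integer strings
    if not MIN_PRIORITY <= priority <= MAX_PRIORITY:
        raise ValueError(
            f"priority must be between {MIN_PRIORITY} and {MAX_PRIORITY}, got {priority}"
        )
    return priority
```

Values such as `"5"` or `3` pass through as integers, while `0`, `6` or a non-numeric string raise `ValueError`.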
The CLI is created using click. To see which commands are currently available, you can use the pixl --help command.
Activate your Python environment and install the project locally in editable mode, with the development and testing dependencies, by running:
python -m pip install -e ../pixl_core -e ../pytest-pixl -e ".[test]" -e ".[dev]"
The CLI tests require a running instance of the rabbitmq service, for which we provide a docker-compose file. The service is automatically started by the run_containers pytest fixture. So to run the tests, run:
pytest -vs tests #for all tests
pytest -vs tests/test_docker_commands.py #e.g., for particular tests