Add profiling capabilities #3

Merged
merged 14 commits into from
Dec 5, 2024
Binary file added .github/API_fink.png
2 changes: 0 additions & 2 deletions .github/workflows/linter.yml
Original file line number Diff line number Diff line change
Expand Up @@ -28,9 +28,7 @@ jobs:
run: |
ruff check --statistics *.py
ruff check --statistics apps/
ruff check --ignore D205 tests/
- name: Format
run: |
ruff format --check *.py
ruff format --check apps/
ruff format --check tests/
114 changes: 113 additions & 1 deletion README.md
@@ -1,11 +1,123 @@
# Fink object API

[![Sentinel](https://github.com/astrolabsoftware/fink-object-api/workflows/Sentinel/badge.svg)](https://github.com/astrolabsoftware/fink-object-api/actions?query=workflow%3ASentinel)

![structure](.github/API_fink.png)

This repository contains the source code of the Fink REST API used to access object data stored in tables in Apache HBase.

## Installation
## Requirements and installation

You will need Python installed (>=3.11) with requirements listed in [requirements.txt](requirements.txt). You will also need [fink-cutout-api](https://github.com/astrolabsoftware/fink-cutout-api) fully installed (which implies Hadoop installed on the machine, and Java 11/17). For the full installation and deployment, refer to the [procedure](install/README.md).

## Configuration

First you need to configure the parameters in [config.yml](config.yml):

```yml
# Host and port of the application
HOST: localhost
PORT: 32000

# URL of the fink_cutout_api
CUTOUTAPIURL: http://localhost

# HBase configuration
HBASEIP: localhost
ZOOPORT: 2183

# Table schema (schema_{fink_broker}_{fink_science})
SCHEMAVER: schema_3.1_5.21.14

# Maximum number of rows to
# return in one call
NLIMIT: 10000
```

Make sure that `SCHEMAVER` matches the schema version used for your tables in HBase.
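
As a rough illustration of the naming convention noted in the configuration comment (`schema_{fink_broker}_{fink_science}`), the sketch below splits a `SCHEMAVER` string into its two version components. The helper name `parse_schema_version` is hypothetical and not part of this repository.

```python
def parse_schema_version(schemaver: str) -> tuple[str, str]:
    """Split 'schema_{fink_broker}_{fink_science}' into its two versions.

    Hypothetical helper for illustration only; not part of the repository.
    """
    prefix, sep, rest = schemaver.partition("_")
    if prefix != "schema" or not sep:
        raise ValueError(f"Unexpected schema name: {schemaver!r}")

    parts = rest.split("_")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Expected two version fields in: {schemaver!r}")

    fink_broker, fink_science = parts
    return fink_broker, fink_science


broker_version, science_version = parse_schema_version("schema_3.1_5.21.14")
print(broker_version, science_version)
```

A check like this could catch a malformed value early, but it does not verify that the HBase tables actually use that schema.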

TODO:
- [ ] Find a way to automatically sync schema with tables.

## Deployment

### Debug

After starting [fink-cutout-api](https://github.com/astrolabsoftware/fink-cutout-api), you can simply test the API using:

```bash
python app.py
```

### Production

The application is managed by `gunicorn` and `systemd` (see [install](install/README.md)), and you can simply manage it using:

```bash
# start the application
systemctl start fink_object_api

# reload the application if code changed
systemctl restart fink_object_api

# stop the application
systemctl stop fink_object_api
```

TODO:
- [ ] Add nginx management
- [ ] Add bash scripts under `bin/` to manage both nginx and gunicorn

## Tests

All the routes are extensively tested. To trigger a test on a route, simply run:

```bash
python apps/routes/objects/test.py $HOST:$PORT
```

Replace `$HOST` and `$PORT` with their values (this could be the main API instance). If the program exits with no error or message, the test has been successful.

TODO:
- [ ] Make tests more verbose, even if successful.

Alternatively, you can launch all tests using:


```bash
./run_tests.sh --url $HOST:$PORT
```
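
For readers who prefer to drive the tests from Python, here is a minimal sketch of the same idea: find every `apps/routes/*/test.py` and run it against a URL. It assumes each test script takes the URL as its only argument, as in the single-route command above. The function names `discover_route_tests` and `run_route_tests` are hypothetical, and this is not how `run_tests.sh` is implemented.

```python
import subprocess
import sys
from pathlib import Path


def discover_route_tests(root: str = ".") -> list[Path]:
    """Return every apps/routes/<route>/test.py under root, sorted by path."""
    return sorted(Path(root).glob("apps/routes/*/test.py"))


def run_route_tests(url: str, root: str = ".") -> dict[str, int]:
    """Run each route test against url and collect the exit codes."""
    results = {}
    for test_file in discover_route_tests(root):
        # Same interpreter as the caller; a non-zero exit code means failure
        completed = subprocess.run(
            [sys.executable, str(test_file), url], check=False
        )
        results[str(test_file)] = completed.returncode
    return results
```

Calling `run_route_tests("localhost:32000")` from the repository root would return a mapping from each test file to its exit code, where `0` means the test passed.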

## Profiling a route

To profile a route, simply use:

```bash
./profile_route.sh --route apps/routes/<route>
```

Depending on the route, you will see the details of the timings and a summary similar to:

```python
Wrote profile results to profiling.py.lprof
Inspect results with:
python -m line_profiler -rmt "profiling.py.lprof"
Timer unit: 1e-06 s

Total time: 0.000241599 s
File: /home/peloton/codes/fink-object-api/apps/routes/template/utils.py
Function: my_function at line 19

Line # Hits Time Per Hit % Time Line Contents
==============================================================
19 @profile
20 def my_function(payload):
21 1 241.6 241.6 100.0 return pd.DataFrame({payload["arg1"]: [1, 2, 3]})


0.00 seconds - /home/peloton/codes/fink-object-api/apps/routes/template/utils.py:19 - my_function
```
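
A function shows up in this report when it is decorated with `@profile` from `line_profiler`, as is done for `my_function` in the template route. The sketch below shows the pattern on a pandas-free stand-in. The no-op fallback when `line_profiler` is missing and the name `build_columns` are assumptions added for illustration, not code from this repository.

```python
try:
    # Decorator shipped with line_profiler (listed in requirements.txt)
    from line_profiler import profile
except ImportError:
    # Assumption for this sketch: fall back to a no-op so the code still runs
    def profile(func):
        return func


@profile
def build_columns(payload):
    """Stand-in for the template's my_function, without the pandas dependency."""
    return {payload["arg1"]: [1, 2, 3]}


result = build_columns({"arg1": "toto"})
print(result)
```

Run directly, the decorated function behaves like the undecorated one; the line-by-line timings are only collected when the script is launched through `kernprof -l`, as `profile_route.sh` does.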

## Adding a new route

You will find a [template](apps/routes/template) route to use as a starting point for a new route. Just copy this folder, and modify it for your new route. Alternatively, you can look at how other routes are structured for inspiration. Do not forget to add tests in the [test folder](tests/)!
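
The copy step can also be scripted. This is a minimal sketch, assuming it is run from the repository root and that the template lives in `apps/routes/template`; the helper name `scaffold_route` is hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path


def scaffold_route(name: str, routes_dir: str = "apps/routes") -> Path:
    """Copy the template route folder to a new folder called name."""
    template = Path(routes_dir) / "template"
    target = Path(routes_dir) / name

    # Refuse to overwrite an existing route
    if target.exists():
        raise FileExistsError(f"Route folder already exists: {target}")

    shutil.copytree(template, target)
    return target
```

For example, `scaffold_route("my_route")` would create `apps/routes/my_route` with the same files as the template, which you then edit by hand.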
25 changes: 25 additions & 0 deletions apps/routes/cutouts/profiling.py
@@ -0,0 +1,25 @@
# Copyright 2024 AstroLab Software
# Author: Julien Peloton
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Call format_and_send_cutout"""

from apps.routes.cutouts.utils import format_and_send_cutout

payload = {
"objectId": "ZTF21abfmbix",
"kind": "All",
"output-format": "array",
}

format_and_send_cutout(payload)
File renamed without changes.
25 changes: 25 additions & 0 deletions apps/routes/objects/profiling.py
@@ -0,0 +1,25 @@
# Copyright 2024 AstroLab Software
# Author: Julien Peloton
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Call extract_object_data"""

from apps.routes.objects.utils import extract_object_data

payload = {
"objectId": "ZTF21abfmbix",
"withupperlim": True,
# "withcutouts": True,
}

extract_object_data(payload)
File renamed without changes.
23 changes: 23 additions & 0 deletions apps/routes/template/profiling.py
@@ -0,0 +1,23 @@
# Copyright 2024 AstroLab Software
# Author: Julien Peloton
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Call my_function"""

from apps.routes.template.utils import my_function

payload = {
"arg1": "toto",
}

my_function(payload)
3 changes: 3 additions & 0 deletions apps/routes/template/utils.py
Expand Up @@ -14,6 +14,9 @@
# limitations under the License.
import pandas as pd

from line_profiler import profile


@profile
def my_function(payload):
    return pd.DataFrame({payload["arg1"]: [1, 2, 3]})
45 changes: 45 additions & 0 deletions install/README.md
@@ -0,0 +1,45 @@
# API installation and deployment

Fire up a virtual machine, and follow the instructions below. This procedure works on recent AlmaLinux.

## Python dependencies

Clone this repository, and install all Python dependencies:

```bash
pip install -r requirements.txt
```

## Fink cutout API installation

Follow the instructions in the [fink-cutout-api](https://github.com/astrolabsoftware/fink-cutout-api/blob/main/install/README.md).

## Systemctl and gunicorn

Install a new unit for systemd under `/etc/systemd/system/fink_object_api.service`:

```bash
[Unit]
Description=gunicorn daemon for fink_object_api
After=network.target

[Service]
User=almalinux
Group=almalinux
WorkingDirectory=/home/almalinux/fink-object-api

ExecStart=/bin/sh -c 'source /home/almalinux/.bashrc; exec /home/almalinux/fink-env/bin/gunicorn --log-file=/tmp/fink_object_api.log app:app -b localhost:PORT2 --workers=1 --threads=8 --timeout 180 --chdir /home/almalinux/fink-object-api --bind unix:/home/almalinux/fink_object_api.sock >> /tmp/fink_object_api.out 2>&1'

[Install]
WantedBy=multi-user.target
```

Make sure you replace `PORT2` with your actual port and `localhost` with your domain, and update the path to `gunicorn`. Then update `config.yml`, reload the units, and launch the application:

```bash
sudo systemctl daemon-reload
sudo systemctl start fink_object_api
```


You are ready to use the API!
43 changes: 43 additions & 0 deletions profile_route.sh
@@ -0,0 +1,43 @@
#!/bin/bash
# Copyright 2024 AstroLab Software
# Author: Julien Peloton
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
## Script to profile a route with kernprof and line_profiler.
set -e
message_help="""
Profile a route\n\n
Usage:\n
\t./profile_route.sh --route <route_path>\n\n
"""

export ROOTPATH=`pwd`
# Grab the command line arguments
NO_SPARK=false
while [ "$#" -gt 0 ]; do
  case "$1" in
    --route)
      ROUTE_PATH=$2
      shift 2
      ;;
    -h)
      echo -e $message_help
      exit
      ;;
    *)
      # Unknown argument: show the help instead of looping forever
      echo -e $message_help
      exit 1
      ;;
  esac
done

kernprof -l $ROUTE_PATH/profiling.py
python -m line_profiler -rmt "profiling.py.lprof"

3 changes: 3 additions & 0 deletions requirements.txt
Expand Up @@ -8,3 +8,6 @@ line_profiler
requests
pyarrow
matplotlib
JPype1
PyYAML
pyspark
3 changes: 1 addition & 2 deletions run_tests.sh
Expand Up @@ -41,8 +41,7 @@ if [[ -f $URL ]]; then
fi

# Run the test suite on the utilities
cd tests
for filename in ./*.py
for filename in apps/routes/*/test.py
do
echo $filename
# Run test suite