Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎇 BLE matching in the pipeline #965

Merged
merged 6 commits into from
May 5, 2024
Merged

Conversation

JGreenlee
Copy link
Contributor

add dynamic_config.py

This will provide a way for the pipeline to access the dynamic config.
the first time get_dynamic_config is called, it fetches the config associated with the study from Github. The study is defined by the environment variable STUDY_CONFIG.
The downloaded config is cached within one session (each pipeline run is a different session), so subsequent calls to get_dynamic_config will not need to re-fetch.


add ble_sensed_mode to sections

Testing done:
-loaded a dump of dfc-fermata from 24 april into my local database
-ran . setup/setup.sh to update dependencies (e-mission-common is a new dependency)
-located one user (the opcode I've been using the most for my own testing). cleared pipeline for that user via ./e-mission-py.bash bin/reset_pipeline.py -e nrelop_dfc-fermata_test_*
-ran pipeline for that user, supplying the STUDY_CONFIG as an environment variable STUDY_CONFIG=dfc-fermata ./e-mission-py.bash bin/debug/intake_single_user.py -e nrelop_dfc-fermata_test_*
-observed log output. BLE modes were detected for at least some of the sections

2024-05-02 17:11:52,566:DEBUG:140704482301888:Returning cached dynamic config for dfc-fermata at version 5
2024-05-02 17:11:52,566:DEBUG:140704482301888:getting BLE sensed vehicle for section from 1713192674 to 1713193146
2024-05-02 17:11:52,593:DEBUG:140704482301888:After filtering, 471 BLE ranging entries during the section
2024-05-02 17:11:52,595:DEBUG:140704482301888:after counting, ble_beacon_counts = {'dfc0:fff0': 471}
2024-05-02 17:11:52,595:DEBUG:140704482301888:found vehicle Jack's Mazda 3 with BLE beacon dfc0:fff0

Some sections still have a ble_sensed_mode of None but this is plausible because we hadn't integrated the BLE sensing until a couple weeks ago and I've been using this opcode longer than that.

  • observed that the sections have ble_sensed_mode:
ts = esta.TimeSeries.get_time_series(user_id)
entries = ts.find_entries(["analysis/cleaned_section"])
print([e['data']['ble_sensed_mode'] for e in entries])

Output shows 31 sections with 'None' and 9 sections with the following:

{'value': 'car_jacks_mazda3', 'bluetooth_major_minor': ['dfc0:fff0'], 'text': "Jack's Mazda 3", 'baseMode': 'CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.16777, 'vehicle_info': {'type': 'car', 'license': 'JHK ****', 'make': 'Mazda', 'model': '3', 'year': 2014, 'color': 'red', 'engine': 'ICE', 'mpg': 33}}

add ble_sensed_summary to confirmed trips

Since sections now have a ble_sensed_mode property in addition to sensed_mode, we can use the ble_sensed_mode as the basis for a new kind of section summary.

To enable this, the get_section_summary function was refactored to accept an additional argument, which can be either sensed_mode or ble_sensed_mode. If it's ble_sensed_mode, the summary values will be summed and grouped by the baseMode of the vehicle that was sensed by BLE. If the additional argument is not given, it just defaults to sensed_mode (same behavior as before)

Testing done:
-using a dump of dfc-fermata from 24 april loaded into my local database
-reset and re-ran the pipeline for a user
-inspected the ble_sensed_summarys of the resulting confirmed trips:

ts = esta.TimeSeries.get_time_series(user_id)
entries = ts.find_entries(["analysis/confirmed_trip"])
for e in entries:
    print(e['data']['ble_sensed_summary'])

Output:

{'distance': {'UNKNOWN': 4294.125850161847}, 'duration': {'UNKNOWN': 10379.947390556335}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 1511402.9404509964}, 'duration': {'UNKNOWN': 131559.69582343102}, 'count': {'UNKNOWN': 4}}
{'distance': {'UNKNOWN': 34462.99760301494}, 'duration': {'UNKNOWN': 977.1180453300476}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 19485.795458445176}, 'duration': {'UNKNOWN': 3718.8894975185394}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 850.4852679201318}, 'duration': {'UNKNOWN': 15861.93504524231}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 47455.78820030493}, 'duration': {'UNKNOWN': 7241.158731937408}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 1551245.9810044689}, 'duration': {'UNKNOWN': 13537.040816783905}, 'count': {'UNKNOWN': 5}}
{'distance': {'UNKNOWN': 713.4707286619786}, 'duration': {'UNKNOWN': 1448.6543893814087}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 3981236.406827432}, 'duration': {'UNKNOWN': 26685.684997320175}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 333.2678850005484}, 'duration': {'UNKNOWN': 21.80204725265503}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2866.186136018516}, 'duration': {'UNKNOWN': 3249.8193860054016}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2645.5964919933926}, 'duration': {'UNKNOWN': 750.7993273735046}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2442.091053542208}, 'duration': {'UNKNOWN': 489.8975965976715}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 4449.832978365483}, 'duration': {'UNKNOWN': 2710.8742623329163}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 6409.15214382765}, 'duration': {'UNKNOWN': 745.2548396587372}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 163.76752941202727}, 'duration': {'UNKNOWN': 12.731597900390625}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 4440.681532305813}, 'duration': {'UNKNOWN': 93.64632654190063}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 366.71884583356365}, 'duration': {'UNKNOWN': 1299.6981484889984}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2078.9642225193907}, 'duration': {'UNKNOWN': 11704.498789787292}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 2328.846316296626, 'UNKNOWN': 6971.404103278014}, 'duration': {'E_CAR': 3389.3658237457275, 'UNKNOWN': 945.0989944934845}, 'count': {'E_CAR': 1, 'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 155.63066751202888}, 'duration': {'UNKNOWN': 116.02340650558472}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 7177.555954021382, 'UNKNOWN': 43.50518475446066}, 'duration': {'E_CAR': 1140.2288630008698, 'UNKNOWN': 175.48782801628113}, 'count': {'E_CAR': 1, 'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 846.2578386118126}, 'duration': {'UNKNOWN': 766.3598084449768}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 104.02576498063684}, 'duration': {'E_CAR': 38.734825134277344}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 8250.614829636044}, 'duration': {'E_CAR': 825.4048693180084}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 6171.384206967509}, 'duration': {'E_CAR': 781.507465839386}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 1065.2594223724116, 'UNKNOWN': 73.02336166659416}, 'duration': {'E_CAR': 491.0299062728882, 'UNKNOWN': 105.17022156715393}, 'count': {'E_CAR': 1, 'UNKNOWN': 1}}
{'distance': {'E_CAR': 733.2246673403289}, 'duration': {'E_CAR': 845.3612954616547}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 4275.13516142465}, 'duration': {'E_CAR': 541.7157530784607}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 7391.707788417107}, 'duration': {'E_CAR': 611.8730540275574}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 10035.89912694906}, 'duration': {'E_CAR': 1205.663957118988}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 9085.234840541858}, 'duration': {'E_CAR': 3597.4890780448914}, 'count': {'E_CAR': 1}}

JGreenlee added 3 commits May 2, 2024 16:36
This will provide a way for the pipeline to access the dynamic config.
the first time `get_dynamic_config` is called, it fetches the config associated with the study from Github. The study is defined by the environment variable `STUDY_CONFIG`.
The downloaded config is cached within one session (each pipeline run is a different session), so subsequent calls to `get_dynamic_config` will not need to re-fetch.
Testing done:
-loaded a dump of dfc-fermata from 24 april into my local database
-ran `. setup/setup.sh` to update dependencies (e-mission-common is a new dependency)
-located one user (the opcode I've been using the most for my own testing). cleared pipeline for that user via `./e-mission-py.bash bin/reset_pipeline.py -e nrelop_dfc-fermata_test_*`
-ran pipeline for that user, supplying the STUDY_CONFIG as an environment variable `STUDY_CONFIG=dfc-fermata ./e-mission-py.bash bin/debug/intake_single_user.py -e nrelop_dfc-fermata_test_*`
-observed log output. BLE modes were detected for at least some of the sections
```
2024-05-02 17:11:52,566:DEBUG:140704482301888:Returning cached dynamic config for dfc-fermata at version 5
2024-05-02 17:11:52,566:DEBUG:140704482301888:getting BLE sensed vehicle for section from 1713192674 to 1713193146
2024-05-02 17:11:52,593:DEBUG:140704482301888:After filtering, 471 BLE ranging entries during the section
2024-05-02 17:11:52,595:DEBUG:140704482301888:after counting, ble_beacon_counts = {'dfc0:fff0': 471}
2024-05-02 17:11:52,595:DEBUG:140704482301888:found vehicle Jack's Mazda 3 with BLE beacon dfc0:fff0
```
Some sections still have a ble_sensed_mode of None but this is plausible because we hadn't integrated the BLE sensing until a couple weeks ago and I've been using this opcode longer than that.
- observed that the sections have `ble_sensed_mode`:
```
ts = esta.TimeSeries.get_time_series(user_id)
entries = ts.find_entries(["analysis/cleaned_section"])
print([e['data']['ble_sensed_mode'] for e in entries])
```
Output shows 31 sections with 'None' and 9 sections with the following:
```
{'value': 'car_jacks_mazda3', 'bluetooth_major_minor': ['dfc0:fff0'], 'text': "Jack's Mazda 3", 'baseMode': 'CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.16777, 'vehicle_info': {'type': 'car', 'license': 'JHK ****', 'make': 'Mazda', 'model': '3', 'year': 2014, 'color': 'red', 'engine': 'ICE', 'mpg': 33}}
```
Since sections now have a `ble_sensed_mode` property in addition to `sensed_mode`, we can use the `ble_sensed_mode` as the basis for a new kind of section summary.

To enable this, the `get_section_summary` function was refactored to accept an additional argument, which can be either `sensed_mode` or `ble_sensed_mode`. If it's `ble_sensed_mode`, the summary values will be summed and grouped by the `baseMode` of the vehicle that was sensed by BLE. If the additional argument is not given, it just defaults to `sensed_mode` (same behavior as before)

Testing done:
-using a dump of dfc-fermata from 24 april loaded into my local database
-reset and re-ran the pipeline for a user
-inspected the `ble_sensed_summary`s of the resulting confirmed trips:

```
ts = esta.TimeSeries.get_time_series(user_id)
entries = ts.find_entries(["analysis/confirmed_trip"])
for e in entries:
    print(e['data']['ble_sensed_summary'])
```

Output:
```
{'distance': {'UNKNOWN': 4294.125850161847}, 'duration': {'UNKNOWN': 10379.947390556335}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 1511402.9404509964}, 'duration': {'UNKNOWN': 131559.69582343102}, 'count': {'UNKNOWN': 4}}
{'distance': {'UNKNOWN': 34462.99760301494}, 'duration': {'UNKNOWN': 977.1180453300476}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 19485.795458445176}, 'duration': {'UNKNOWN': 3718.8894975185394}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 850.4852679201318}, 'duration': {'UNKNOWN': 15861.93504524231}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 47455.78820030493}, 'duration': {'UNKNOWN': 7241.158731937408}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 1551245.9810044689}, 'duration': {'UNKNOWN': 13537.040816783905}, 'count': {'UNKNOWN': 5}}
{'distance': {'UNKNOWN': 713.4707286619786}, 'duration': {'UNKNOWN': 1448.6543893814087}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 3981236.406827432}, 'duration': {'UNKNOWN': 26685.684997320175}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 333.2678850005484}, 'duration': {'UNKNOWN': 21.80204725265503}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2866.186136018516}, 'duration': {'UNKNOWN': 3249.8193860054016}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2645.5964919933926}, 'duration': {'UNKNOWN': 750.7993273735046}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2442.091053542208}, 'duration': {'UNKNOWN': 489.8975965976715}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 4449.832978365483}, 'duration': {'UNKNOWN': 2710.8742623329163}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 6409.15214382765}, 'duration': {'UNKNOWN': 745.2548396587372}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 163.76752941202727}, 'duration': {'UNKNOWN': 12.731597900390625}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 4440.681532305813}, 'duration': {'UNKNOWN': 93.64632654190063}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 366.71884583356365}, 'duration': {'UNKNOWN': 1299.6981484889984}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2078.9642225193907}, 'duration': {'UNKNOWN': 11704.498789787292}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 2328.846316296626, 'UNKNOWN': 6971.404103278014}, 'duration': {'E_CAR': 3389.3658237457275, 'UNKNOWN': 945.0989944934845}, 'count': {'E_CAR': 1, 'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 155.63066751202888}, 'duration': {'UNKNOWN': 116.02340650558472}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 7177.555954021382, 'UNKNOWN': 43.50518475446066}, 'duration': {'E_CAR': 1140.2288630008698, 'UNKNOWN': 175.48782801628113}, 'count': {'E_CAR': 1, 'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 846.2578386118126}, 'duration': {'UNKNOWN': 766.3598084449768}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 104.02576498063684}, 'duration': {'E_CAR': 38.734825134277344}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 8250.614829636044}, 'duration': {'E_CAR': 825.4048693180084}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 6171.384206967509}, 'duration': {'E_CAR': 781.507465839386}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 1065.2594223724116, 'UNKNOWN': 73.02336166659416}, 'duration': {'E_CAR': 491.0299062728882, 'UNKNOWN': 105.17022156715393}, 'count': {'E_CAR': 1, 'UNKNOWN': 1}}
{'distance': {'E_CAR': 733.2246673403289}, 'duration': {'E_CAR': 845.3612954616547}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 4275.13516142465}, 'duration': {'E_CAR': 541.7157530784607}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 7391.707788417107}, 'duration': {'E_CAR': 611.8730540275574}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 10035.89912694906}, 'duration': {'E_CAR': 1205.663957118988}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 9085.234840541858}, 'duration': {'E_CAR': 3597.4890780448914}, 'count': {'E_CAR': 1}}
```
@JGreenlee JGreenlee marked this pull request as draft May 3, 2024 05:05
@JGreenlee
Copy link
Contributor Author

JGreenlee commented May 3, 2024

Ok. The test-with-docker run failed because of the new dependency on e-mission-common. That workflow isn't set up to download pip dependencies via git. Probably need to install git in the workflow


The ubuntu-only-test-with-manual-install also failed because confirmed_trip objects now have a new property (ble_sensed_summary)

AssertionError: dict_[343 chars]ion_summary', 'ble_sensed_summary', 'user_input', 'additions']) != dict_[343 chars]ion_summary', 'user_input', 'additions'])

I just need to regenerate the 'expected output' snapshots

@JGreenlee
Copy link
Contributor Author

Ideally, I should add some new test data under emission/tests/data/real_examples/ that has background/bluetooth_ble entries.
Then use that in a unit test to validate the result of ble_sensed_summary.

After I fix the existing tests, I'll try to find a day of my own travel that has multiple BLE trips and not too many gaps / not any untracked time

JGreenlee added 2 commits May 3, 2024 10:21
When conda installs dependencies, it will attempt to use git for any dependencies that reference a git repository (e.g. `git+https://github.com/my/repo`)
So git must be installed on the container before dependencies are installed, or it will fail.
`ble_sensed_summary` is a new property for confirmed trips, so tests fail on the old snapshots. New generated snapshots are the same, but with different uuid and oids, and with empty ble_sensed_summary values in each confirmed trip.
Currently we don't have any tests with BLE sensor data, so there are no tests that have a `ble_sensed_summary` that is filled in. Coming later...
@JGreenlee
Copy link
Contributor Author

A different test failed. I think it's because I got the UUID formats wrong when I regenerated the snapshots
image

@shankari
Copy link
Contributor

shankari commented May 3, 2024

There is a script to download the expected values, under bin/debug which should take care of the proper formatting and so on

@JGreenlee
Copy link
Contributor Author

If you're referring to the save_ground_truth.py script, that's a method I already tried

@JGreenlee
Copy link
Contributor Author

JGreenlee commented May 3, 2024

On master, shankari_2016-06-20.expected_confirmed_trips has a different UUID format than shankari_2016-06-20.expected_confirmed_trips.manual_trip_addition_input and shankari_2016-06-20.expected_confirmed_trips.manual_trip_user_input

expected_confirmed_trips has dashes separating UUID chunks. The other two are continuous strings. I can't find a way to generate dash-separated UUIDs

@shankari
Copy link
Contributor

shankari commented May 3, 2024

expected_confirmed_trips has dashes separating UUID chunks. The other two are continuous strings.

I am not sure why this happened. Will have to look at git blame to figure it out.

I can't find a way to generate dash-separated UUIDs

str should work

>>> import uuid
>>> uuid.uuid4() # generate a new random UUID
UUID('5c7fec48-8197-40be-b017-832b9561fcbe')
>>> str(uuid.uuid4()) # convert it to a dash-separated string
'de9f9b74-e196-4bc0-8140-27bd281d4678'

@shankari
Copy link
Contributor

shankari commented May 5, 2024

@JGreenlee I am also confused as to why you end up with E_CAR in the summary

{'distance': {'E_CAR': 104.02576498063684}, 'duration': {'E_CAR': 38.734825134277344}, 'count': {'E_CAR': 1}}

when the basemode for your entry is CAR

{'value': 'car_jacks_mazda3', 'bluetooth_major_minor': ['dfc0:fff0'], 'text': "Jack's Mazda 3", 'baseMode': 'CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.16777, 'vehicle_info': {'type': 'car', 'license': 'JHK ****', 'make': 'Mazda', 'model': '3', 'year': 2014, 'color': 'red', 'engine': 'ICE', 'mpg': 33}}

@JGreenlee
Copy link
Contributor Author

I used a different user (not me) to demonstrate the ble_sensed_summary.

I encountered an error in running the pipeline on my opcode, causing trips to not get generated. not a regression

Planning to follow up on it, but I ran out of time on Friday

This test is failing because it expects shankari_2016-06-20.expected_confirmed_trips to have a specific UUID, which is hardcoded in the test.

This came up the last time these "expected" snapshots were re-generated: e-mission@3c52375

I think it will be annoying to keep replacing the user_id every time the expected outputs change. So I just reworked the test to use whatever user_id is in the file rather than some arbitrary one
@JGreenlee
Copy link
Contributor Author

BTW there was nothing wrong with the UUID format or the way I was creating the dumps. The test that was failing (TestRunGreedyIncrementalModel) actually wanted a specific hardcoded UUID

make TestRunGreedyIncrementalModel not use a hardcoded user_id

This test is failing because it expects shankari_2016-06-20.expected_confirmed_trips to have a specific UUID, which is hardcoded in the test.

This came up the last time these "expected" snapshots were re-generated: 3c52375

I think it will be annoying to keep replacing the user_id every time the expected outputs change. So I just reworked the test to use whatever user_id is in the file rather than some arbitrary one

@JGreenlee JGreenlee marked this pull request as ready for review May 5, 2024 04:45
@JGreenlee
Copy link
Contributor Author

Still need to add a test that runs the pipeline with some BLE sensor data & validates the expected ble_sensed_summary

But this PR can be reviewed & merged to unblock others if needed

Copy link
Contributor

@shankari shankari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with this in general. The secret sauce is e-mission-common, which I have not reviewed yet, but at least this will generate the correct data structures to unblock @Abby-Wheelis

Comment on lines +22 to +24
# sys.exit(1)
# TODO what to do here? What if Github is down or something?
# If we terminate, will the pipeline just try again later?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should throw an exception instead of terminating. That will cause the pipeline to catch the exception and mark that state as incomplete, which will ensure that any unprocessed data will be processed in the next run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

@@ -233,6 +236,7 @@ def create_confirmed_entry(ts, tce, confirmed_key, input_key_list):
tce["data"]["cleaned_trip"])
confirmed_object_data['inferred_section_summary'] = get_section_summary(ts, cleaned_trip, "analysis/inferred_section")
confirmed_object_data['cleaned_section_summary'] = get_section_summary(ts, cleaned_trip, "analysis/cleaned_section")
confirmed_object_data['ble_sensed_summary'] = get_section_summary(ts, cleaned_trip, "analysis/inferred_section", mode_prop="ble_sensed_mode")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume that you would want to use analysis/cleaned_section here since the inferred section is not based on raw/smoothed data from sensors, but by running an ML pipeline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sensed data goes with sensor data, makes sense
TODO

@shankari shankari merged commit cc20ac4 into e-mission:master May 5, 2024
5 checks passed
@shankari
Copy link
Contributor

shankari commented May 5, 2024

@JGreenlee I merged this and then reset and re-ran the pipeline on the production dfc_fermata instance. I am seeing that all the summary entries are UNKNOWN

>>> all_cols = pd.json_normalize(edb.get_analysis_timeseries_db().find({"metadata.key": "analysis/confirmed_trip"})).columns
>>> [c for c in all_cols if "ble_sensed_summary" in c]
['data.ble_sensed_summary.distance.UNKNOWN', 'data.ble_sensed_summary.duration.UNKNOWN', 'data.ble_sensed_summary.count.UNKNOWN']

There are a few users for whom there is no ble_sensed_summary. Taking a closer look at them...

@shankari
Copy link
Contributor

shankari commented May 5, 2024

Ah I know, it's because I don't have the DYNAMIC_CONFIG environment variable set in the terminal where I manually ran the pipeline. So let me reset the pipeline and let the default AWS pipeline re-create the entries. I do think that some additional logging would be helpful when you clean this up.

2024-05-05 17:02:14,295:DEBUG:139855285856064:Returning cached dynamic config for stage
-program at version 1
2024-05-05 17:02:14,295:DEBUG:139855285856064:getting BLE sensed vehicle for section fr
om 1714661096 to 1714661297

shankari added a commit that referenced this pull request May 6, 2024
We use a `git` based reference to the `em-common` repo.
The related PR added git to the test dockerfile
#965
to get the tests to pass.

We need to add a similar command to the production Dockerfile
JGreenlee added a commit to JGreenlee/e-mission-phone that referenced this pull request May 7, 2024
`ble_sensed_mode` and `ble_sensed_summary` were added to sections and confirmed trips in e-mission/e-mission-server#965.

We are going to need `ble_sensed_mode` to determine vehicle info.

The section summaries (`ble_sensed_summary`, `cleaned_section_summary` and `inferred_section_summary`) are not used in unprocessed trips currently, but I am adding them so there is less of a gap between composite trips and unprocessed trips
shankari added a commit to Abby-Wheelis/em-public-dashboard that referenced this pull request May 8, 2024
We do not need to manually install em-common since it was already installed in the base server image as part of
e-mission/e-mission-server#965
and
e-mission/e-mission-server@d33ce2e

So we removed it in 334c95b
But we do need to bump up the base image to include it
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Address code reviews with "future"
Development

Successfully merging this pull request may close these issues.

2 participants