This repository has been archived by the owner on Jan 4, 2018. It is now read-only.

stream data in real time #3

Open
mimming opened this issue May 10, 2016 · 1 comment

mimming commented May 10, 2016

Current state

Sensor data is dropped into Google Cloud Storage daily. A Python script picks it up and loads it into BigQuery.

Desired state

Create a Pub/Sub pipeline that accepts sensor data and follows this path:

Pub/Sub -> Dataflow (Beam) [as necessary] -> BigQuery
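To make the "as necessary" transform step a bit more concrete, here is a rough sketch of what it might do to a single message: decode the payload and flatten it into a row that BigQuery could accept. The field names (`sensor`, `value`, `ts`), the `unit_id` argument, and the function name `message_to_row` are all assumptions for illustration; the actual sensor payload format is not described in this issue.

```python
import json
from datetime import datetime, timezone


def message_to_row(payload: bytes, unit_id: str) -> dict:
    """Decode one Pub/Sub-style message payload into a flat row dict.

    Assumes (hypothetically) that the payload is UTF-8 JSON with the keys
    "sensor", "value" and "ts" (seconds since the Unix epoch).
    """
    reading = json.loads(payload.decode("utf-8"))
    return {
        "unit_id": unit_id,
        "sensor": reading["sensor"],
        # Coerce to float so string and numeric readings land in one column type.
        "value": float(reading["value"]),
        # Store the timestamp as an ISO 8601 string in UTC.
        "recorded_at": datetime.fromtimestamp(
            reading["ts"], tz=timezone.utc
        ).isoformat(),
    }


if __name__ == "__main__":
    example = b'{"sensor": "temp", "value": "21.5", "ts": 0}'
    print(message_to_row(example, unit_id="PAN001"))
```

In a Beam pipeline, a function like this would probably be applied with `beam.Map` between the Pub/Sub read and the BigQuery write, but that wiring is left out here so the sketch runs with the standard library alone.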

mimming commented May 18, 2016

It looks like Dataflow for Python is still in alpha.

I'm going to wait until it's in beta before I move to it.

In the meantime, use this model to load data in batches:

Panoptes unit -> Cloud Storage bucket unit_sensors -> Python script on Compute Engine VM -> BigQuery
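For the batch path, the Python script on the VM presumably has to turn whatever it reads from the bucket into something BigQuery can load. A minimal sketch of that step, assuming the readings are already parsed into dicts, is to group them into newline-delimited JSON chunks, which is a format BigQuery load jobs accept. The helper name `to_ndjson_batches` and the default batch size are hypothetical and not taken from the existing script.

```python
import json
from typing import Iterable, Iterator


def to_ndjson_batches(rows: Iterable[dict], batch_size: int = 500) -> Iterator[str]:
    """Yield newline-delimited JSON strings holding up to batch_size rows each."""
    batch = []
    for row in rows:
        # sort_keys keeps the output stable, which makes batches easy to diff.
        batch.append(json.dumps(row, sort_keys=True))
        if len(batch) == batch_size:
            yield "\n".join(batch) + "\n"
            batch = []
    # Emit whatever is left over as a final, smaller batch.
    if batch:
        yield "\n".join(batch) + "\n"


if __name__ == "__main__":
    readings = [{"sensor": "temp", "value": v} for v in (20.1, 20.4, 20.9)]
    for chunk in to_ndjson_batches(readings, batch_size=2):
        print(repr(chunk))
```

Each yielded string could then be written to a file or handed to a load job. That part depends on the client library in use, so it is not shown.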

mimming self-assigned this May 18, 2016