Configuration of normalising data pipelines #37
Comments
@rhiaro could you give a quick outline of what a particular pipe might encompass? Is it, for example, carrying out a particular normalisation, or "the geo stuff"? Do we know if there are any dependencies between pipes that would mean that particular ones can't be disabled without other ones being useless/pointless/unreliable?
There will be pipes for "the geo stuff", "the activity tag stuff" and "the organisation stuff", aka the enhancement pipes. There will also be pipes for particular normalisations, but in several cases these are functionally the same, so they are merged into one pipe. There are object types that will need to pass through more than one pipe to be completed, eg. an
Today I've been thinking about breaking it up a bit so there are pipes for things that are common between all/several pipes, eg. dealing with the presence of invalid fields. However, these could potentially be reorganised as methods on the parent Pipe that all the other pipes can call instead. It would be helpful to know exactly what sorts of things will need turning on and off, so I can architect this better.
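To make the "methods on the parent Pipe" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the class names `Pipe` and `NormaliseEventPipe`, the `allowed_fields` attribute and the `drop_invalid_fields` helper are hypothetical and not taken from the project's code.

```python
from typing import Any, Dict, Set

Item = Dict[str, Any]


class Pipe:
    """Hypothetical parent class: shared helpers live here, not in separate pipes."""

    # Fields this pipe considers valid (assumed convention, for illustration only).
    allowed_fields: Set[str] = set()

    def drop_invalid_fields(self, item: Item) -> Item:
        # Return a copy of the item containing only the allowed fields.
        return {k: v for k, v in item.items() if k in self.allowed_fields}

    def run(self, item: Item) -> Item:
        raise NotImplementedError


class NormaliseEventPipe(Pipe):
    """Hypothetical normalisation pipe that reuses the shared helper."""

    allowed_fields = {"name", "startDate"}

    def run(self, item: Item) -> Item:
        item = self.drop_invalid_fields(item)
        if "name" in item:
            # Example normalisation: trim surrounding whitespace from the name.
            item["name"] = item["name"].strip()
        return item
```

With this shape, handling invalid fields is not a pipe in its own right, so there is nothing extra to turn on or off; each pipe calls the helper when it needs it.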
The requirement is the ability to turn the enhancement pipes off at runtime; normalisation pipes don't need this. Dependencies between normalisation pipes should be noted, in case someone alters the code to disable some, but that's not something we need to explicitly support at the CLI.
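One possible way to meet that requirement, sketched in Python with the standard `argparse` module. The flag names (`--no-geo` and so on), the pipe functions and `build_pipeline` are all hypothetical placeholders, not the project's actual CLI: normalisation pipes always run, and each enhancement pipe gets an opt-out flag.

```python
import argparse
from typing import Callable, Dict, List, Optional

Item = Dict[str, object]
PipeFn = Callable[[Item], Item]


# Placeholder pipes; real ones would do actual normalisation/enhancement.
def normalise(item: Item) -> Item:
    return {**item, "normalised": True}


def geo(item: Item) -> Item:
    return {**item, "geo": True}


def activity(item: Item) -> Item:
    return {**item, "activity": True}


def organisation(item: Item) -> Item:
    return {**item, "organisation": True}


# Normalisation pipes always run; only enhancement pipes are switchable.
NORMALISATION_PIPES: List[PipeFn] = [normalise]
ENHANCEMENT_PIPES: Dict[str, PipeFn] = {
    "geo": geo,
    "activity": activity,
    "organisation": organisation,
}


def build_pipeline(argv: Optional[List[str]] = None) -> List[PipeFn]:
    parser = argparse.ArgumentParser()
    for name in ENHANCEMENT_PIPES:
        # store_false defaults to True, so every enhancement pipe is on unless disabled.
        parser.add_argument(f"--no-{name}", dest=name, action="store_false",
                            help=f"disable the {name} enhancement pipe")
    args = parser.parse_args(argv)
    enabled = [fn for name, fn in ENHANCEMENT_PIPES.items() if getattr(args, name)]
    return NORMALISATION_PIPES + enabled


def run_pipeline(item: Item, pipes: List[PipeFn]) -> Item:
    # Pass the item through each pipe in order.
    for pipe in pipes:
        item = pipe(item)
    return item
```

For example, `build_pipeline(["--no-geo"])` would return the normalisation pipe plus the activity and organisation pipes. Dependencies between normalisation pipes are deliberately not exposed as flags here; they would be documented in the code instead.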
This app, like the last one, uses a system of pipelines that perform certain actions on the data.
Is the work here to enable configuration options so it's easy to turn certain pipes on and off?
Are there other use cases you're looking to meet here?