This repository has been archived by the owner on Jan 23, 2024. It is now read-only.

Add network to docker containers #43

Open
sdave2 opened this issue Dec 22, 2020 · 2 comments

Comments


sdave2 commented Dec 22, 2020

I am testing/trying out Fiber from my local machine. I want to use Fiber processes to do a side-effect job (put data into databases) and use Docker as the Fiber backend. For testing, I have Elasticsearch and Postgres running in Docker containers on a Docker network called test.
I would like to pass the network name as a parameter (just like the Docker image) to the process running the Docker container.
I tried this out locally and it works for me. This is the modification I made to the docker_backend.py file:

    try:
        container = self.client.containers.run(
            image,
            job_spec.command,
            name=job_spec.name + '-' + str(uuid.uuid4()),
            volumes=volumes,
            cap_add=["SYS_PTRACE"],
            tty=tty,
            stdin_open=stdin_open,
            network="test",  # <-- added line
            detach=True,
        )
I am not sure how to pass the network in as a parameter, though. Possibly via job_spec?

Questions:

  1. Is it recommended to use a Fiber process to do side-effect jobs, specifically to insert data into a database?
    If I have 5 places I want to put the data (Elasticsearch, Redis streams, Postgres, other places), is it recommended to use 5 Fiber processes to insert data into the respective "databases"?
@calio
Collaborator

calio commented Dec 23, 2020

Hi @sdave2, this is an interesting use case. `job_spec` currently doesn't have a network attribute, but it could be useful. Not all Fiber backends need a "network" config, so probably the best way is to add an "extras" field to `job_spec` and also feed into that through some config value specified in `config.py` and `popen_fiber_spawn.py`. Feel free to create a PR for this.
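A rough sketch of what such an "extras" field could look like. This is purely illustrative: `JobSpec` and `run_kwargs` here are hypothetical stand-ins, not Fiber's actual internals, and the Docker backend would forward the merged kwargs to `containers.run()`.

```python
# Hypothetical sketch: a JobSpec-like object with a generic "extras" dict
# that backend-specific options (e.g. a Docker network) can flow through.
from dataclasses import dataclass, field

@dataclass
class JobSpec:
    command: list
    image: str
    name: str
    extras: dict = field(default_factory=dict)  # e.g. {"network": "test"}

def run_kwargs(job_spec):
    """Merge backend defaults with per-job extras for containers.run()."""
    kwargs = {
        "command": job_spec.command,
        "name": job_spec.name,
        "detach": True,
    }
    kwargs.update(job_spec.extras)  # extras extend/override the defaults
    return kwargs

spec = JobSpec(command=["python", "worker.py"], image="fiber-test",
               name="job-1", extras={"network": "test"})
print(run_kwargs(spec)["network"])  # test
```

Keeping the field a plain dict means backends that don't understand "network" can simply ignore it.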

For your question, it's perfectly fine to use Fiber to do jobs with side effects. The only thing you need to pay attention to is to use the lower-level Process rather than Pool, as Pool has error-handling logic that may interfere with your data insertion.

@sdave2
Author

sdave2 commented Dec 23, 2020

> The only thing you need to pay attention to is to use lower-level Process rather than Pool, as Pool has error handling logic which may mess up with your data insertion.

Right, I am using processes. Also, I think with Pool you map data across a function, whereas in my case I am mapping functions across the dataset.

I will create a PR for passing in a network attribute.

Also, I took it a little further yesterday and started running the main process inside Docker, having that process spawn Fiber processes, i.e., Docker containers. I want to avoid running anything on my local machine and encapsulate everything within Docker.
The only problem I ran into was volume mapping: I am running the process as root in my container, and when I spawn other Docker containers I end up mapping /root:/:rw. Docker doesn't like a destination ending in /, and I also want to avoid mapping root.
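One way to guard against that bad mapping is to validate the bind destination before handing it to docker-py. A minimal sketch, assuming docker-py's `volumes={host_path: {"bind": dest, "mode": mode}}` format; the paths and the `/fiber` default are illustrative:

```python
# Build a docker-py volumes mapping while rejecting the "/root:/:rw" case:
# strip any trailing slash and refuse to bind over the container root.
def volume_mapping(host_dir, container_dir="/fiber", mode="rw"):
    container_dir = container_dir.rstrip("/") or "/"
    if container_dir == "/":
        raise ValueError("refusing to bind over the container root")
    # docker-py expects {host_path: {"bind": dest, "mode": mode}}
    return {host_dir: {"bind": container_dir, "mode": mode}}

print(volume_mapping("/home/user/project"))
# {'/home/user/project': {'bind': '/fiber', 'mode': 'rw'}}
```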
If this is also something you find useful, I can create a PR for it as well once I figure out the volume mapping issue.
Feedback is welcome!
