Improve upload speed #118

Closed
flavioamieiro opened this issue Jun 27, 2014 · 5 comments

Comments

@flavioamieiro
Member

As described in NAMD/mediacloud_backend#65, the upload speed is too low for uploading to be part of a regular process. We need to benchmark the upload and see where we can improve speed.
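
A benchmark along these lines could start by timing each upload call separately, so that slow outliers stand out from the average. The following is only a minimal sketch: `upload` is a hypothetical stand-in for whatever client call sends one document, not the actual PyPLN API, and `benchmark_uploads` is a name made up for this example.

```python
import statistics
import time


def benchmark_uploads(upload, documents):
    """Time each call to `upload` and return summary statistics in seconds.

    `upload` is a hypothetical callable that sends one document; swap in
    the real client call when measuring an actual deployment.
    `documents` must contain at least one item.
    """
    durations = []
    for document in documents:
        start = time.perf_counter()
        upload(document)
        durations.append(time.perf_counter() - start)
    return {
        "count": len(durations),
        "total": sum(durations),
        "mean": statistics.mean(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Fake upload that only sleeps, used to exercise the benchmark itself.
    def fake_upload(document):
        time.sleep(0.01)

    print(benchmark_uploads(fake_upload, ["doc %d" % i for i in range(5)]))
```

Running the same measurement with the workers idle and with the workers busy would help separate the cost of the upload call itself from contention on the machine.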

@fccoelho
Member

I think the reason the add_document call takes a while to return
is that it has to wait for the pipeline to end, so that it can return
the document ID to the caller. Perhaps we could modify the code to create
an empty document and return the ID immediately; the pipeline can then
update this document whenever it finishes. @turicas, what do you think?
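
The idea of returning the ID first and filling in the document later could be sketched roughly as below. This is an illustration under stated assumptions, not the project's code: the in-memory `documents` dict, `run_pipeline`, and this `add_document` signature are all hypothetical, and a thread stands in for the real workers.

```python
import threading
import uuid

# In-memory stand-in for the document store (an assumption for this sketch).
documents = {}


def run_pipeline(text):
    """Hypothetical placeholder for the real processing pipeline."""
    return {"tokens": text.split()}


def process(doc_id, text):
    # Runs in the background and fills in the placeholder when done.
    documents[doc_id]["result"] = run_pipeline(text)
    documents[doc_id]["status"] = "done"


def add_document(text):
    """Create an empty document, start processing, and return the ID at once.

    The worker thread is returned only so that this example can wait for it;
    a real caller would just keep the ID.
    """
    doc_id = uuid.uuid4().hex
    documents[doc_id] = {"status": "pending", "result": None}
    worker = threading.Thread(target=process, args=(doc_id, text))
    worker.start()
    return doc_id, worker


if __name__ == "__main__":
    doc_id, worker = add_document("some example text")
    print(doc_id, documents[doc_id]["status"])  # status may still be pending
    worker.join()
    print(documents[doc_id])
```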

@turicas
Contributor

turicas commented Jun 29, 2014

@fccoelho, the add_document call (on the client) does not wait for the pipeline to finish; it only waits for the web interface to add the document pipeline on the backend (a single ZMQ call). We considered not sending this message to the backend directly and using a background daemon instead, so that we could return to the API user right away, but my tests showed that this part does not take much time.

@flavioamieiro
Member Author

As @turicas said, this was our first thought, but after some tests we realized creating the document was not our bottleneck. Also, I don't think any operation is going to be faster than just creating the document before returning as we do now. @turicas do you remember if we found any bottlenecks in our tests?

@turicas
Contributor

turicas commented Jun 30, 2014

@flavioamieiro, the bottlenecks I found were related to running the services (basically the broker and pypln-web) on the same machine: when the workers are consuming 100% CPU, the web process cannot answer in a normal time (we need to modify the deployment process to allow running them separately).

@fccoelho
Member

We must either renice the processes to give priority to the frontend or
simply not use the same machine for both.
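
For the renice option, one possibility is to have the CPU-heavy processes raise their own niceness at startup. The sketch below uses Python's `os.setpriority`, which is available on Unix only; the `renice` helper name and the idea of calling it from a worker are assumptions made for illustration, not part of the project.

```python
import os


def renice(pid, niceness):
    """Set the niceness of process `pid` (0 means the calling process).

    Higher values mean lower CPU priority. Unprivileged users can usually
    only raise the value, not lower it.
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    # Lower the priority of the current process by one step, as a worker
    # might do at startup so the web frontend stays responsive.
    current = os.getpriority(os.PRIO_PROCESS, 0)
    print(renice(0, min(current + 1, 19)))
```

The same effect can be had from the shell with the standard `renice` command. Either way, this only changes scheduling priority; it does not reserve CPU for the web process.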

@flavioamieiro flavioamieiro added this to the Next Release milestone Jul 7, 2014
@fccoelho fccoelho closed this as completed Jun 9, 2016