Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FTP Connection Management Optimization using Singleton Pattern #213

Merged
merged 17 commits into from
Dec 3, 2024

Conversation

felipeadeildo
Copy link
Contributor

Overview

This PR implements an FTP connection singleton pattern to significantly improve module import performance. The change was motivated by discussion #211 where multiple redundant FTP connections were identified as a performance bottleneck.

Changes

  • Implemented FTPSingleton class to manage a single FTP connection
  • Modified Directory class to use the singleton pattern
  • Reduced redundant FTP connections across database modules
  • Maintained all existing functionality while improving performance

Performance Impact

The implementation shows dramatic performance improvements:

image

Before (main branch):

  • Total import time: 21.0434s
  • Individual module imports taking 1.1-4.4s each
  • Multiple redundant FTP connections being created

After (with FTP singleton):

  • Total import time: 0.3499s (98.3% reduction)
  • First import: 0.3479s (creates singleton)
  • Subsequent imports: ~0.0003s (reuses connection)

Performance Test Code

import importlib
from time import perf_counter
from typing import Dict
from loguru import logger

def measure_import_time(module_name: str) -> float:
    """Measure import time for a single module"""
    start = perf_counter()
    importlib.import_module(module_name)
    end = perf_counter()
    return end - start

def main():
    # List of modules to test that use Directory class
    modules = [
        "pysus.ftp.databases.sim",
        "pysus.ftp.databases.sia",
        "pysus.ftp.databases.sih",
        "pysus.ftp.databases.sinan",
        "pysus.ftp.databases.sinasc",
        "pysus.ftp.databases.cnes",
        "pysus.ftp.databases.pni",
        "pysus.ftp.databases.ibge_datasus",
    ]

    results: Dict[str, float] = {}
    total_time = 0.0

    logger.info("Starting import measurements...")

    for module in modules:
        try:
            time_taken = measure_import_time(module)
            results[module] = time_taken
            total_time += time_taken
            logger.info(f"{module}: {time_taken:.4f}s")
        except Exception as e:
            logger.error(f"Error importing {module}: {e}")

    logger.info("=" * 50)
    logger.info(f"Total import time: {total_time:.4f}s")
    logger.info("=" * 50)

if __name__ == "__main__":
    main()

I believe this change significantly improves the library's performance while maintaining its ease of use and reliability.

Note: I've added some type-hints that was lacking, a lot of warnings was appearing (some of them was solved putting a type ignore alias.)

Copy link
Collaborator

@luabida luabida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felipeadeildo thank you for your PR. I've ran some simple tests locally and notice that the Databases are not loading their Directories' __content__. I've included some comments pointing what's missing in the load_directory_content method to make it to work.

simple testcase:

from pysus.ftp import Directory
Directory("/dissemin/publicos/CIHA/201101_/Dados").load().__content__ # should not be []

pysus/ftp/__init__.py Show resolved Hide resolved
pysus/ftp/__init__.py Show resolved Hide resolved
pysus/ftp/__init__.py Outdated Show resolved Hide resolved
@felipeadeildo
Copy link
Contributor Author

Oh, i agree!
Ive removed the content append when rewriting the whole file.
I'm implementing the content loading and removing the unused line parser with operational system conditional (Ive noticed that we don't need it)

@felipeadeildo
Copy link
Contributor Author

felipeadeildo commented Nov 26, 2024

I've fixed the problem with the last commits. I would appreciate your revision, @luabida
I believe that everything is working fine now

@luabida
Copy link
Collaborator

luabida commented Nov 26, 2024

Hello @felipeadeildo, could you please rebase your PR based on the latest release? I've fixed the CI and added some endpoint tests that will ensure the ftp functionality. Also, please note that you should be using semantic release commit message format in the PR title to include your changes in the next release

@felipeadeildo felipeadeildo marked this pull request as draft November 26, 2024 16:40
@felipeadeildo felipeadeildo marked this pull request as ready for review November 27, 2024 16:59
@luabida
Copy link
Collaborator

luabida commented Nov 28, 2024

I approved approved the tests on CI, but I think the rebase has failed, since there should have only the files you eddited in the PR. Please check the rebase with the command git fetch --all && git rebase -i upstream/main to see if prompts any conflicts, then use git push -f to push the rebase itself

@felipeadeildo
Copy link
Contributor Author

Hey!

I'm waiting the ftp server turns on again.

When the service comes back, I will execute the tests (make test-pysus) that i've been using to confirm if is everything fine.

So, i will mark this PR as "still in progress" for a while.

@felipeadeildo felipeadeildo marked this pull request as draft November 29, 2024 21:25
@felipeadeildo felipeadeildo marked this pull request as ready for review December 1, 2024 19:52
@felipeadeildo
Copy link
Contributor Author

felipeadeildo commented Dec 1, 2024

Hi! I've run the tests, and all FTP-related tests passed successfully. However, there are some warnings regarding deprecated functions in the tqdm library.

pysus/tests/test_ftp.py::TestDirectoryAndFile::test_directory_cache PASSED                                                                                                                       [ 41%]
pysus/tests/test_ftp.py::TestDirectoryAndFile::test_root_directory PASSED                                                                                                                        [ 44%]
pysus/tests/test_ftp.py::TestDirectoryAndFile::test_root_load PASSED                                                                                                                             [ 47%]
pysus/tests/test_ftp.py::TestDirectoryAndFile::test_root_reload PASSED                                                                                                                           [ 50%]
pysus/tests/test_ftp.py::TestDirectoryAndFile::test_sinan_file PASSED                                                                                                                            [ 52%]
pysus/tests/test_ftp.py::TestDatabases::test_ciha PASSED                                                                                                                                         [ 55%]
pysus/tests/test_ftp.py::TestDatabases::test_cnes PASSED                                                                                                                                         [ 58%]
pysus/tests/test_ftp.py::TestDatabases::test_ibge_datasus PASSED                                                                                                                                 [ 61%]
pysus/tests/test_ftp.py::TestDatabases::test_pni PASSED                                                                                                                                          [ 63%]
pysus/tests/test_ftp.py::TestDatabases::test_sia PASSED                                                                                                                                          [ 66%]
pysus/tests/test_ftp.py::TestDatabases::test_sih PASSED                                                                                                                                          [ 69%]
pysus/tests/test_ftp.py::TestDatabases::test_sim PASSED                                                                                                                                          [ 72%]
pysus/tests/test_ftp.py::TestDatabases::test_sinan PASSED                                                                                                                                        [ 75%]
pysus/tests/test_ftp.py::TestDatabases::test_sinasc PASSED

@luabida
Copy link
Collaborator

luabida commented Dec 2, 2024

@felipeadeildo you may run the command pre-commit run --all-files and fix the errors in order to pass on CI.

@felipeadeildo
Copy link
Contributor Author

@felipeadeildo you may run the command pre-commit run --all-files and fix the errors in order to pass on CI.

Ok, done.

@luabida luabida merged commit cf49f88 into AlertaDengue:main Dec 3, 2024
4 checks passed
Copy link

github-actions bot commented Dec 3, 2024

🎉 This PR is included in version 0.15.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants