Releases: AlertaDengue/PySUS
0.10.0
New PySUS version 0.10.0 (2023-09-19) has been released!
This release mainly introduces new classes for interacting with the DATASUS FTP server. They are intended to replace the classes and functionality in pysus.online_data.__init__. These classes, located in the pysus.ftp module, are the building blocks for retrieving data files, which can then be viewed as DataFrames once parsed locally into parquet files. An additional class, pysus.data.local.ParquetSet, has been introduced to work with DBC or DBF files downloaded from the server; it is responsible for converting them into parquet files.
pysus.online_data Deprecation
This release aims to provide a better interface for interacting with the DATASUS FTP server. The classes and methods in pysus.online_data, mainly those in __init__.py, are therefore intended to be deprecated in future releases. To keep usage compatible with previous versions, the methods of the former databases have been updated to use the new FTP interface, while the methods in the modules remain similar until the FTP_Inspect and FTP_Downloader classes become completely obsolete.
pysus.ftp module
The classes introduced in this module fall into two groups. The first is the base FTP classes, found in the __init__.py file: File, Directory and Database. The second is the Databases themselves, which represent DATASUS groups of data inside the server.
pysus.ftp.File & pysus.ftp.Directory classes
An FTP File is the output class when listing DATASUS FTP content with PySUS. A File's main methods are download() and async_download(), and it can also display information retrieved from the FTP server through File.info.
An FTP Directory is responsible for actually parsing the FTP content into Files or other Directories. When instantiated, a Directory CWDs into the path provided and loads itself and its parent into CACHE, but not its content yet. To load the content inside a directory, it has to be explicitly loaded with Directory.load(), which parses all the FTP content into Files/Directories in its own content. The CACHE matters here because, when a child directory is loaded, it can be linked to a Directory instance that has already been loaded.
Note that only Directories are stored in cache.
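The lazy loading and caching behaviour described above can be illustrated with a small, self-contained sketch. It does not use PySUS or a real FTP connection: `FAKE_FTP`, the `Directory.at()` factory and the example paths and file names are assumptions made for illustration, not PySUS's actual implementation.

```python
from __future__ import annotations

import posixpath

# Hypothetical stand-in for a remote FTP tree: path -> entry names.
# Names ending in "/" are sub-directories, everything else is a file.
FAKE_FTP = {
    "/dissemin": ["publicos/"],
    "/dissemin/publicos": ["README.txt", "SINAN/"],
    "/dissemin/publicos/SINAN": ["DENGBR23.dbc"],
}

# Only Directory objects are cached, keyed by their path.
CACHE: dict[str, "Directory"] = {}


class File:
    def __init__(self, parent: str, name: str) -> None:
        self.name = name
        self.path = posixpath.join(parent, name)


class Directory:
    def __init__(self, path: str) -> None:
        self.path = path
        self.loaded = False
        self.content = []  # Files and Directories, filled by load()

    @classmethod
    def at(cls, path: str) -> "Directory":
        """Return the cached Directory for `path`, caching it and its parent."""
        path = posixpath.normpath(path)
        if path not in CACHE:
            CACHE[path] = cls(path)
        parent = posixpath.dirname(path)
        if parent not in CACHE:
            CACHE[parent] = cls(parent)
        return CACHE[path]

    def load(self) -> "Directory":
        """Parse this directory's listing into File/Directory objects."""
        self.content = []
        for name in FAKE_FTP.get(self.path, []):
            if name.endswith("/"):
                child = posixpath.join(self.path, name.rstrip("/"))
                # Reuses an already cached instance when one exists.
                self.content.append(Directory.at(child))
            else:
                self.content.append(File(self.path, name))
        self.loaded = True
        return self


pub = Directory.at("/dissemin/publicos")  # caches itself and its parent only
pub.load()                                # content is parsed on explicit load
print(sorted(CACHE))
```

Because child directories are looked up in the cache before being created, asking for the same path twice yields the same object, so content loaded once does not need to be listed again.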
Database classes
PySUS FTP Databases are the reason File and Directory exist. A Database consists of specific Directories in DATASUS whose Files (DBC or DBF) follow a specific name format. When extracted from DATASUS to a local machine, these files are parsed into ParquetSets (parquet format) so that they can be read as pandas DataFrames.
A list of all databases implemented to date can be found in the pysus.ftp.databases directory. Each Database has its own specifications, but they all share the same main attributes and methods:
- name: <ABBREVIATION> - <Long Name>
- paths: A list of Directories, or a single Directory, in DATASUS where the Database's Files will be searched for
- metadata: A dictionary with detailed information about the Database
- groups: A Database's specific groups of data found in the FTP server
- content: The loaded content (Files/Directories) of a Database
- files: Its content, filtered by Files only
- load(): Loads a Directory's content into its own content. The default Directories are its paths
- describe(): Displays a File's information (specific to its Database) in a human-readable format
- format(): Extracts a File's information into a tuple
- get_files(): Filters its content based on its specifications
- download(): Downloads a list of Files and returns them in ParquetSet format
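To make the idea of a Database-specific file name format more concrete, here is a minimal sketch of how a name could be split into a tuple and then used for filtering, in the spirit of format() and get_files(). The naming scheme (four-letter group, two-letter state, two-digit year), the helper names `format_name` and `get_files`, and the sample file names are all hypothetical; each real Database defines its own rules.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical naming scheme, for illustration only:
# 4-letter group code + 2-letter state (UF) + 2-digit year + .dbc/.dbf
NAME_PATTERN = re.compile(
    r"^(?P<group>[A-Z]{4})(?P<uf>[A-Z]{2})(?P<year>\d{2})\.(?:dbc|dbf)$",
    re.IGNORECASE,
)


def format_name(filename: str) -> Optional[Tuple[str, str, str]]:
    """Split a file name into (group, uf, year), or None if it does not match."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return (match["group"].upper(), match["uf"].upper(), match["year"])


def get_files(
    content: List[str],
    group: Optional[str] = None,
    uf: Optional[str] = None,
    year: Optional[str] = None,
) -> List[str]:
    """Keep only the names that match the scheme and the given filters."""
    selected = []
    for filename in content:
        info = format_name(filename)
        if info is None:
            continue  # not a data file of this (hypothetical) database
        wanted = (group, uf, year)
        if all(w is None or w == value for w, value in zip(wanted, info)):
            selected.append(filename)
    return selected


listing = ["DENGBR23.dbc", "ZIKABR22.dbc", "README.txt", "dengsp23.DBF"]
print(get_files(listing, group="DENG"))
```

Keeping the parsing step separate from the filtering step means the same tuple can also feed a human-readable description of a file.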
pysus.data.local.ParquetSet
ParquetSets are the output class when retrieving Database files to another machine. They represent the final file format after parsing DBC -> DBF -> parquet. The parquet format splits the data into smaller chunks, so it can be better managed when loaded into memory for viewing. In general, a ParquetSet can load all the chunks into memory and display the data as a DataFrame with to_dataframe(). Be aware, however, that large parquet sets may fill the entire memory.
Features
- databases: create CACHE structure to ftp Directories & add CNES database (#152) (b99dd38)
- pbar: include a progress bar to download and parsing data (8cd691c)
- struc: database modularization and code improvement (#137) (d7e6d27)
Bug Fixes
0.9.4
0.9.3
0.9.2
0.9.1
0.9.0
0.8.0
0.8.0 (2023-03-14)
Features
- SINAN: moving EGH changes to PySUS (72f2d93)
Commits
- 84ee4a9 - Mon, 13 Mar 2023 11:01:21 -0300 (23 hours ago) (HEAD -> sinan-metadata, origin/sinan-metadata) | linter - Luã Bida Vacaro
- 70be800 - Mon, 13 Mar 2023 10:57:32 -0300 (23 hours ago) | Parse df method - Luã Bida Vacaro
- 3dff56b - Mon, 13 Mar 2023 10:26:25 -0300 (24 hours ago) | Minor CNES test fix - Luã Bida Vacaro
- a9be440 - Mon, 13 Mar 2023 09:38:03 -0300 (24 hours ago) | Delete poetry lock - Luã Bida Vacaro
- 4a85ee5 - Mon, 13 Mar 2023 09:35:51 -0300 (24 hours ago) | Poetry lock - Luã Bida Vacaro
- 2324821 - Mon, 13 Mar 2023 09:31:27 -0300 (24 hours ago) | Tests - Luã Bida Vacaro
- 0d11d9c - Mon, 13 Mar 2023 08:15:03 -0300 (26 hours ago) | Sinan metadata - Luã Bida Vacaro
- 061b4f0 - Fri, 10 Mar 2023 18:41:11 -0300 (4 days ago) | Minor fixes - Luã Bida Vacaro
- f93e2ef - Fri, 10 Mar 2023 17:13:51 -0300 (4 days ago) | SINAN disease returns circular import - Luã Bida Vacaro
- 5a88c79 - Fri, 10 Mar 2023 16:15:33 -0300 (4 days ago) | FTP classes implementation - Luã Bida Vacaro
- e849bd4 - Thu, 9 Mar 2023 22:16:28 -0300 (5 days ago) | Improvements & docstrings - Luã Bida Vacaro
- 9cab15a - Wed, 8 Mar 2023 19:11:14 -0300 (6 days ago) | Abstract download classes - Luã Bida Vacaro
- 70b3f36 - Mon, 6 Mar 2023 02:30:28 -0300 (8 days ago) | linter - Luã Bida Vacaro
- ab4db1e - Mon, 6 Mar 2023 01:31:27 -0300 (8 days ago) | Trim wrong encoded columns - Luã Bida Vacaro
- e93a932 - Mon, 27 Feb 2023 18:51:02 -0300 (2 weeks ago) | SQLAlchemy 2.0 breaks airflow - Luã Bida Vacaro
- 8be8283 - Mon, 27 Feb 2023 18:35:20 -0300 (2 weeks ago) | Minor Fixes - Luã Bida Vacaro
- 01c32b9 - Mon, 27 Feb 2023 16:03:18 -0300 (2 weeks ago) | Updating pyarrow version - Luã Bida Vacaro
- a75a686 - Mon, 27 Feb 2023 14:58:37 -0300 (2 weeks ago) | Tests - Luã Bida Vacaro
- 9927972 - Mon, 27 Feb 2023 11:51:15 -0300 (2 weeks ago) | Finishing converting found dtypes - Luã Bida Vacaro
- 72f2d93 - Fri, 24 Feb 2023 19:05:52 -0300 (3 weeks ago) | feat(SINAN): moving EGH changes to PySUS - Luã Bida Vacaro
What's Changed
Full Changelog: 0.7.0...0.8.0
0.7.0
BugFix
Logging modules
Adding stderr logs in online_data modules using Loguru.
b91a7f1 - Wed, 23 Nov 2022 09:31:06 -0300 (HEAD -> logging, tag: v0.6.3, origin/logging)
Release 0.6.3 - Luã Bida Vacaro
- Updating tag version in pyproject.toml to 0.6.3
dcabe89 - Tue, 22 Nov 2022 14:27:03 -0300
Logging sinasc - Luã Bida Vacaro
- Adding logs in module pysus.online_data.sinasc
- Minor bug-fixes
de856dc - Tue, 22 Nov 2022 11:47:12 -0300
Logging SINAN - Luã Bida Vacaro
- Changing the builtin logger to Loguru in pysus.online_data.SINAN
3c5b388 - Tue, 22 Nov 2022 11:35:54 -0300
Logging SIM - Luã Bida Vacaro
- Adding logs in module pysus.online_data.SIM
- Lint
febba36 - Tue, 22 Nov 2022 11:16:06 -0300
Logging SIH - Luã Bida Vacaro
- Adding logs in module pysus.online_data.SIH
8d28a5b - Tue, 22 Nov 2022 11:11:19 -0300
Logging SIA - Luã Bida Vacaro
- Adding logs in module pysus.online_data.SIA
78df7b0 - Tue, 22 Nov 2022 11:01:42 -0300
Logging PNI - Luã Bida Vacaro
- Adding logs in module pysus.online_data.PNI
- Minor cleaning in module pysus.online_data.IBGE
568cb62 - Tue, 22 Nov 2022 10:52:12 -0300
Logging ESUS - Luã Bida Vacaro
- Adding logs in module pysus.online_data.ESUS
e720a2a - Tue, 22 Nov 2022 10:44:04 -0300
Logging CNES - Luã Bida Vacaro
- Adding logs in module pysus.online_data.CNES
7425f97 - Tue, 22 Nov 2022 10:30:29 -0300
Upgrading package version to prepare Pypi build - Luã Bida Vacaro
- Updating tag version in pyproject.toml to 0.6.2 (Fixed)
9cd4760 - Tue, 22 Nov 2022 09:20:55 -0300
CIHA logging - Luã Bida Vacaro
- Adding logs in module pysus.online_data.CIHA
44b3b7c - Tue, 22 Nov 2022 08:54:22 -0300
Logging vaccine.py - Luã Bida Vacaro
- Adding logs in module pysus.online_data.vaccine
fac2d62 - Tue, 22 Nov 2022 08:04:53 -0300
Start logging with loguru - Luã Bida Vacaro
- Installing Loguru package
What's Changed
- Fix test failures by @fccoelho in #96
- BREAKING CHANGE (SINAN): Improve downloading data from SINAN by @luabida in #100
- CONTINUE Update unit tests by @luabida in #102
Full Changelog: v0.6.1...v0.6.3