add optional arg for minio_client #73

Merged
merged 1 commit into from
Aug 27, 2024
23 changes: 18 additions & 5 deletions src/minio_utils/minio_utils.py
@@ -7,15 +7,28 @@
 from minio import Minio


-def get_minio_client() -> Minio:
+def get_minio_client(
+    minio_url: str = None,
+    access_key: str = None,
+    secret_key: str = None,
+    secure: bool = False
+) -> Minio:
     """
     Helper function to get the Minio client.

+    :param minio_url: URL for the Minio server (environment variable used if not provided)
+    :param access_key: Access key for Minio (environment variable used if not provided)
+    :param secret_key: Secret key for Minio (environment variable used if not provided)
+    :param secure: Whether to use HTTPS (optional, default is False)
     :return: A Minio client object
     """
+    minio_url = minio_url or os.environ['MINIO_URL'].replace("http://", "")
+    access_key = access_key or os.environ['MINIO_ACCESS_KEY']
+    secret_key = secret_key or os.environ['MINIO_SECRET_KEY']

     return Minio(
-        os.environ['MINIO_URL'].replace("http://", ""),
-        access_key=os.environ['MINIO_ACCESS_KEY'],
-        secret_key=os.environ['MINIO_SECRET_KEY'],
-        secure=False
+        minio_url,
+        access_key=access_key,
+        secret_key=secret_key,
+        secure=secure
     )

Codecov / codecov/patch warning on src/minio_utils/minio_utils.py#L25-L27: added lines #L25 - L27 were not covered by tests.
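The fallback order introduced by the new signature can be illustrated without a running MinIO server. The sketch below is a stand-in, not the merged code: `resolve_minio_settings` and its `env` parameter are hypothetical names introduced here so the logic can be exercised against a plain dictionary instead of `os.environ`. It mirrors the `or` expressions in the diff, including the detail that `http://` is stripped only from the environment value and not from an explicitly passed `minio_url`.

```python
import os
from typing import Mapping, Optional


def resolve_minio_settings(
    minio_url: Optional[str] = None,
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
    secure: bool = False,
    env: Mapping[str, str] = os.environ,  # hypothetical hook, not in the PR
) -> dict:
    """Hypothetical helper mirroring the fallback logic shown in the diff."""
    # Explicit arguments win. The environment is consulted only when an
    # argument is None or empty, because `or` short-circuits.
    endpoint = minio_url or env["MINIO_URL"].replace("http://", "")
    return {
        "endpoint": endpoint,
        "access_key": access_key or env["MINIO_ACCESS_KEY"],
        "secret_key": secret_key or env["MINIO_SECRET_KEY"],
        "secure": secure,
    }


# Made-up values standing in for the real environment.
fake_env = {
    "MINIO_URL": "http://minio:9000",
    "MINIO_ACCESS_KEY": "env-access",
    "MINIO_SECRET_KEY": "env-secret",
}

from_env = resolve_minio_settings(env=fake_env)
explicit = resolve_minio_settings("other:9000", "ak", "sk", secure=True, env=fake_env)
no_env_needed = resolve_minio_settings("other:9000", "ak", "sk", env={})
```

Because the lookups short-circuit, passing all three values explicitly means the environment variables do not need to be set at all. A unit test of this shape, applied to the real function with `os.environ` patched, is one way to cover the lines Codecov flagged.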
8 changes: 7 additions & 1 deletion src/spark/utils.py
@@ -170,6 +170,9 @@
     path: str,
     header: bool = True,
     sep: str = None,
+    minio_url: str = None,
+    access_key: str = None,
+    secret_key: str = None,
     **kwargs
 ) -> DataFrame:
     """
@@ -179,13 +182,16 @@
     :param path: The minIO path to the CSV file. e.g. s3a://bucket-name/file.csv or bucket-name/file.csv
     :param header: Whether the CSV file has a header. Default is True.
     :param sep: The delimiter to use. If not provided, the function will try to detect it.
+    :param minio_url: The minIO URL. Default is None (environment variable used if not provided).
+    :param access_key: The minIO access key. Default is None (environment variable used if not provided).
+    :param secret_key: The minIO secret key. Default is None (environment variable used if not provided).
     :param kwargs: Additional arguments to pass to spark.read.csv.

     :return: A DataFrame.
     """

     if not sep:
-        client = get_minio_client()
+        client = get_minio_client(minio_url=minio_url, access_key=access_key, secret_key=secret_key)
         bucket, key = path.replace("s3a://", "").split("/", 1)
         obj = client.get_object(bucket, key)
         sample = obj.read(8192).decode()

Codecov / codecov/patch warning on src/spark/utils.py#L194: added line #L194 was not covered by tests.
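The hunk ends before the detection step itself, so how `sep` is inferred from `sample` is not visible here. As a sketch under that caveat, the standard library's `csv.Sniffer` is one way to guess a delimiter from the first bytes of a file. `split_minio_path` and `detect_delimiter` are hypothetical names, the candidate delimiter list is an assumption, and the sample text is made up.

```python
import csv
from typing import Tuple


def split_minio_path(path: str) -> Tuple[str, str]:
    """Split 's3a://bucket/key' or 'bucket/key' into (bucket, key), as the diff does."""
    bucket, key = path.replace("s3a://", "").split("/", 1)
    return bucket, key


def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the delimiter of a CSV sample (assumed approach, not taken from the PR)."""
    # Sniffer.sniff raises csv.Error when no candidate fits the sample.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Stand-in for the decoded first 8192 bytes read from the object.
sample = "id;name;score\n1;alice;3.5\n2;bob;4.0\n"

bucket, key = split_minio_path("s3a://bucket-name/dir/file.csv")
sep = detect_delimiter(sample)
```

Splitting with `maxsplit=1` keeps any further slashes inside the object key, which is why nested keys such as `dir/file.csv` survive intact.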