Merge pull request #27 from JuliaTrustworthyAI/issue-with-seed
couple more things
pat-alt authored Jan 9, 2025
2 parents 3c8c062 + 382bb26 commit 1ebf84a
Showing 6 changed files with 21 additions and 4 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -10,4 +10,4 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),

### Changed

-- Changed the way the default seed is set to avoid overriding the global seed.
+- Changed the way the default seed is set to avoid overriding the global seed. [#26], [#27]
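
One common way to avoid overriding the global seed is to draw from a locally constructed RNG instead of calling `Random.seed!`. The sketch below illustrates that idea only; the package's actual implementation is not shown in this diff, and `sample_with_local_rng` is a hypothetical helper introduced for illustration.

```julia
using Random

# Hypothetical helper, not part of TaijaData: draws `n` numbers from a
# local RNG seeded with `seed`, so the global RNG is never reseeded.
function sample_with_local_rng(n::Int; seed::Int=42)
    rng = MersenneTwister(seed)  # assumption: any seeded AbstractRNG would do here
    return rand(rng, n)
end

# Record the first global draw after seeding the global RNG.
Random.seed!(123)
expected = rand()

# Reseed, call the helper, then draw again from the global RNG.
Random.seed!(123)
sample = sample_with_local_rng(5)
actual = rand()

# The helper neither consumed nor reset the global stream.
@assert actual == expected
# The helper itself is reproducible for a fixed seed.
@assert sample == sample_with_local_rng(5)
```

Had the helper called `Random.seed!(seed)` instead, the caller's global random stream would be silently reset on every call, which is the kind of side effect the changelog entry describes avoiding.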
6 changes: 6 additions & 0 deletions src/TaijaData.jl
@@ -11,7 +11,13 @@ using Flux
using MLDatasets
using StatsBase

+"""
+    data_seed
+A global seed to produce standardized synthetic data. This seed is used in various functions to ensure reproducibility of the synthetic datasets.
+"""
+const data_seed = 42
+
data_dir = joinpath(artifact"data-tabular", "data-tabular")

include("synthetic/blobs.jl")
5 changes: 4 additions & 1 deletion src/synthetic/linearly_separable.jl
@@ -2,8 +2,11 @@
    load_linearly_separable(n=250; seed=data_seed)
Loads linearly separable synthetic data.
+!!! note
+    This calls the [`load_blobs`](@ref) function with specific parameters and the seed set to [`data_seed`](@ref). To ensure linear separability and reproducibility, setting the `seed` keyword argument has no effect. For more flexibility, you can use [`load_blobs`](@ref) directly with different parameters if needed.
"""
function load_linearly_separable(n=250; seed=data_seed)
-    data = load_blobs(n; seed=seed, centers=2, cluster_std=0.5)
+    data = load_blobs(n; seed=data_seed, centers=2, cluster_std=0.5)
    return data
end
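
The effect of hard-coding `seed=data_seed` in the call can be illustrated with a small self-contained sketch. The names `toy_seed`, `toy_blobs`, and `toy_linearly_separable` are hypothetical stand-ins; the real `load_blobs` is not shown in this diff and may differ in how it generates data and what it returns.

```julia
using Random

const toy_seed = 42  # stand-in for `data_seed`

# Hypothetical stand-in for `load_blobs`: `n` 2-D points scattered
# around `centers` randomly placed Gaussian cluster centres.
function toy_blobs(n::Int; seed::Int=toy_seed, centers::Int=2, cluster_std::Real=0.5)
    rng = MersenneTwister(seed)
    labels = rand(rng, 1:centers, n)
    means = 10 .* randn(rng, 2, centers)   # one centre per column
    X = means[:, labels] .+ cluster_std .* randn(rng, 2, n)
    return X, labels
end

# Mirrors the pattern in the diff: `seed` stays in the signature,
# but the fixed seed is what gets forwarded.
function toy_linearly_separable(n::Int=250; seed=toy_seed)
    return toy_blobs(n; seed=toy_seed, centers=2, cluster_std=0.5)
end

X1, y1 = toy_linearly_separable(100; seed=1)
X2, y2 = toy_linearly_separable(100; seed=999)
@assert X1 == X2 && y1 == y2   # the keyword is accepted but ignored

# Calling the underlying generator directly still honours the seed.
Xa, _ = toy_blobs(100; seed=1)
Xb, _ = toy_blobs(100; seed=999)
@assert Xa != Xb
```

Keeping the keyword in the signature means existing callers that pass `seed` do not error, at the cost of the argument silently doing nothing, which is why the docstring note spells this out.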
2 changes: 2 additions & 0 deletions src/synthetic/moons.jl
@@ -2,6 +2,8 @@
    load_moons(n=250; seed=data_seed, kwrgs...)
Loads synthetic moons data.
"""
function load_moons(n=250; seed=data_seed, kwrgs...)
    if isa(seed, Int)
5 changes: 4 additions & 1 deletion src/synthetic/multi_class.jl
@@ -1,7 +1,10 @@
"""
-    load_multi_class(n=250; seed=data_seed)
+    load_multi_class(n=250; seed=data_seed, centers=4)
Loads multi-class synthetic data.
+!!! note
+    This calls the [`load_blobs`](@ref) function with specific parameters and the seed set to [`data_seed`](@ref).
"""
function load_multi_class(n=250; seed=data_seed, centers=4)
    data = load_blobs(n; seed=seed, centers=centers, cluster_std=0.5)
5 changes: 4 additions & 1 deletion src/synthetic/overlapping.jl
@@ -2,8 +2,11 @@
    load_overlapping(n=250; seed=data_seed)
Loads overlapping synthetic data.
+!!! note
+    This calls the [`load_blobs`](@ref) function with specific parameters and the seed set to [`data_seed`](@ref). To ensure overlapping clusters and reproducibility, setting the `seed` keyword argument has no effect. For more flexibility, you can use [`load_blobs`](@ref) directly with different parameters if needed.
"""
function load_overlapping(n=250; seed=data_seed)
-    data = load_blobs(n; seed=seed, centers=2, cluster_std=2.0)
+    data = load_blobs(n; seed=data_seed, centers=2, cluster_std=2.0)
    return data
end
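
In this diff, `load_overlapping` differs from `load_linearly_separable` only in `cluster_std` (2.0 versus 0.5). The following sketch suggests why a larger spread with the same fixed seed produces overlap. It uses a hypothetical generator `toy_two_blobs` with hand-picked cluster centres, which is an assumption for illustration and not how `load_blobs` places its centres.

```julia
using Random, Statistics

# Hypothetical two-cluster generator (not TaijaData's `load_blobs`).
# Cluster centres are fixed at (-3, 0) and (3, 0) for illustration.
function toy_two_blobs(n::Int; seed::Int=42, cluster_std::Real=0.5)
    rng = MersenneTwister(seed)
    labels = rand(rng, 1:2, n)
    centres = [-3.0 3.0; 0.0 0.0]   # one centre per column
    X = centres[:, labels] .+ cluster_std .* randn(rng, 2, n)
    return X, labels
end

# Share of points lying on the wrong side of the vertical line x = 0.
wrong_side_share(X, labels) = mean((X[1, :] .> 0) .!= (labels .== 2))

X_tight, y_tight = toy_two_blobs(1000; cluster_std=0.5)
X_wide, y_wide = toy_two_blobs(1000; cluster_std=2.0)

# Same seed, so both calls assign identical labels; only the spread differs.
@assert y_tight == y_wide
# A larger spread pushes more points across the separating line.
@assert wrong_side_share(X_wide, y_wide) > wrong_side_share(X_tight, y_tight)
```

Because both calls share a seed, the underlying noise draws are identical and only scaled differently, so any difference in overlap comes from `cluster_std` alone.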

2 comments on commit 1ebf84a

@pat-alt pat-alt commented on 1ebf84a Jan 9, 2025


@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/122687

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

@JuliaRegistrator register

Release notes:

## Breaking changes

- blah

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v1.0.1 -m "<description of version>" 1ebf84aef12c4b51a556cb8ed9c2565059458799
git push origin v1.0.1
