Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CategoricalNB #313

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Add CategoricalNB #313

wants to merge 2 commits into from

Conversation

ksew1
Copy link
Contributor

@ksew1 ksew1 commented Dec 22, 2024

Added CategoricalNB :)

@krstopro
Copy link
Member

krstopro commented Dec 23, 2024

Hi @ksew1 and thanks for the pull request! Looks great on the first look and I will be having a detailed review soon.

},
x
) do
{_, _, _, jll} =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you don't use the first three variables, you can wrap them in a tuple, and then use more convenient pattern match {_, jil}

{i + 1, feature_log_probability, x, jll}
end

total_jll = jll + class_log_priors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simply returned


feature_count = Nx.broadcast(0.0, {num_features, num_classes, num_categories})

{_, _, _, _, feature_count} =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as above

Copy link
Contributor

@msluszniak msluszniak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've Dropped some minor comments, but LGMT. There are some part that are common for all of NB algorithms. I think that it will be cool to separate them into utils.ex inside naive_bayes dir. But it's not necessary as a scope of this PR.

@ksew1
Copy link
Contributor Author

ksew1 commented Dec 31, 2024

I've Dropped some minor comments, but LGMT. There are some part that are common for all of NB algorithms. I think that it will be cool to separate them into utils.ex inside naive_bayes dir. But it's not necessary as a scope of this PR.

I’ve addressed all the comments 😄. I agree that it would be a good idea to extract the common parts into a separate file. I’ll do this in a separate PR.

@krstopro
Copy link
Member

krstopro commented Jan 2, 2025

Any reason why num_categories isn't a required option?

@ksew1
Copy link
Contributor Author

ksew1 commented Jan 2, 2025

Any reason why num_categories isn't a required option?

num_categories isn’t needed as an option since it’s just an implementation detail. We can easily calculate it from min_categories, which the user can already specify, so i think adding it would just be redundant

@krstopro
Copy link
Member

krstopro commented Jan 4, 2025

@josevalim @msluszniak I am still having a look. Seems very good, but please give me some time before merging. I might have some improvements to suggest (e.g. maybe the code can be vectorised; this could be valuable for @ksew1 to learn in case they don't know it already).

Happy New Year everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants