Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve support for relational algebra #3438

Open
odow opened this issue Jul 29, 2023 · 5 comments
Open

Improve support for relational algebra #3438

odow opened this issue Jul 29, 2023 · 5 comments
Labels
Category: Containers Related to the Containers submodule Type: Feature request

Comments

@odow
Copy link
Member

odow commented Jul 29, 2023

Motivated by https://jump.dev/2023/07/20/gams-blog/

To make this syntax a bit nicer

image
@odow
Copy link
Member Author

odow commented Aug 14, 2023

The alternative is to parse SparseAxisArray into a data frame: #3441 (comment)

Ideally, we should parse the condition of the JuMP.Containers.NestedIterator then it would be an expression tree of conditions with && and ||. Then we put it into conjunctive normal form c1 && c2 && ... && cm. Then for each condition, we should check: which indices does it involve ? is this an equality lhs == rhs
If it involves a subset of the indices then we should filter with this condition earlier and not in the most nested level of the for loops.
If it is an equality, we could do the indexing ourself (as in https://discourse.julialang.org/t/is-it-possible-to-incorporate-the-relational-algebra-technique-into-jump-in-the-future-to-expedite-model-generation/101343/4?u=blegat) or use DataFrames but that's kind of an heavy dependency for just this.
Now what's a bit tricky is that you could have [i = I, j = J, k = K, l = L; f(i, j, k) == g(j, k, l)]. Then, it means you need to build the set of (i, j, k) and the set of (j, k, l) and then index them with the Dict to take the intersection. One could wonder if it wouldn't be better to just to the 4-nested for loops instead of having to do twice a 3-nested for loops. I think however than unless K has only 2 elements, it is always better to do two 3-nested for-loops than a 4-nested for loops so doing this intersection with dictionaries is always better.

In some cases, it might be better to do the nested for loops in an order that is different from the order of the indices used by the user but that can be left as future work since I don't think this is easy to find that. But detecting the two things I mentioned above and improving the iteration shouldn't be too complicated and it would solve the issue in the GAMS example without the user to do anything. The SparseAxisArray would automatically build itself efficiently.

@odow odow changed the title Add DataFrames extension Improve support for relational algebra Sep 18, 2023
@odow
Copy link
Member Author

odow commented Sep 18, 2023

This tutorial is relevant for any changes in the future looking at this:

https://jump.dev/JuMP.jl/stable/tutorials/linear/multi_commodity_network/

@odow
Copy link
Member Author

odow commented Sep 18, 2023

See also #3212

@odow
Copy link
Member Author

odow commented Sep 9, 2024

Perhaps I do like #3441

julia> using JuMP, DataFrames

julia> function JuMP.Containers.container(
           f::Function,
           indices,
           ::Type{DataFrames.DataFrame},
           names,
       )
           rows = vec(collect(indices))
           df = DataFrames.DataFrame(NamedTuple{tuple(names...)}(arg) for arg in rows)
           df.value = [f(arg...) for arg in rows]
           return df
       end

julia> model = Model();

julia> @variables(model, begin
           x[i in 1:2, j in i:2], (container = DataFrames.DataFrame)
           y[i in 1:2, 1:2], (container = DataFrames.DataFrame)
           z[i in 1:2, j in 1:2; isodd(i+j)], (container = DataFrames.DataFrame)
       end);

julia> x
3×3 DataFrame
 Row │ i      j      value     
     │ Int64  Int64  GenericV 
─────┼─────────────────────────
   11      1  x[1,1]
   21      2  x[1,2]
   32      2  x[2,2]

julia> y
4×3 DataFrame
 Row │ i      ##288  value     
     │ Int64  Int64  GenericV 
─────┼─────────────────────────
   11      1  y[1,1]
   22      1  y[2,1]
   31      2  y[1,2]
   42      2  y[2,2]

julia> z
2×3 DataFrame
 Row │ i      j      value     
     │ Int64  Int64  GenericV 
─────┼─────────────────────────
   11      2  z[1,2]
   22      1  z[2,1]

@odow
Copy link
Member Author

odow commented Sep 9, 2024

If we supported that syntax, then improvements to how we parsed and constructed the sparse index sets would just be a performance optimization.

@odow odow mentioned this issue Sep 10, 2024
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category: Containers Related to the Containers submodule Type: Feature request
Development

Successfully merging a pull request may close this issue.

1 participant