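I am interested in your code and am trying to run it with TPC-H. I wrote a subclass of Balsa_JOBRandSplit and changed its params p. The PostgreSQL version and conda environment are the same as recommended in README.md. When I run python run.py --run Balsa_TPCH --local, an error occurs with the following traceback.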
Traceback (most recent call last):
  File "run.py", line 2155, in <module>
    app.run(Main)
  File "/home/xxx/anaconda3/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/xxx/anaconda3/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run.py", line 2150, in Main
    agent = BalsaAgent(p)
  File "run.py", line 754, in __init__
    self.exp, self.exp_val = self._MakeExperienceBuffer()
  File "run.py", line 809, in _MakeExperienceBuffer
    wi = self.GetOrTrainSim().training_workload_info
  File "run.py", line 1160, in GetOrTrainSim
    self.sim = TrainSim(p, self.loggers)
  File "run.py", line 379, in TrainSim
    sim.CollectSimulationData()
  File "/home/xxx/balsa/sim.py", line 728, in CollectSimulationData
    self.search.Run(query_node, query_node.info['sql_str'])
  File "/home/xxx/balsa/balsa/search.py", line 245, in Run
    dp_tables)
  File "/home/xxx/balsa/balsa/search.py", line 317, in _dp_bushy_search_space
    return list(dp_tables[num_rels].values())[0][1], dp_tables
IndexError: list index out of range
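For readers following the traceback: the last frame indexes into list(dp_tables[num_rels].values()), which raises exactly this IndexError when that level of the table is empty. The sketch below reproduces that in isolation. The shape of dp_tables (a mapping from relation count to a mapping whose values are (cost, plan) pairs) is an assumption inferred from the failing expression, and pick_final_plan is a hypothetical helper, not a function from the Balsa code base.

```python
from collections import defaultdict


def pick_final_plan(dp_tables, num_rels):
    """Hypothetical guard around the expression from the traceback.

    Assumes dp_tables[num_rels] maps a set of relations to a
    (cost, plan) pair; this layout is inferred, not confirmed.
    """
    final_level = dp_tables[num_rels]
    if not final_level:
        # An empty level means no plan covering all relations was built,
        # for example because the join graph had no edges.
        raise ValueError(
            f"no plan enumerated for {num_rels} relations; "
            "is the join graph empty?"
        )
    return list(final_level.values())[0][1]


# Case 1: nothing was enumerated, so the raw expression fails.
empty_tables = defaultdict(dict)
try:
    list(empty_tables[6].values())[0][1]
    raw_error = None
except IndexError as exc:
    raw_error = str(exc)

# Case 2: the guarded helper reports the likely cause instead.
try:
    pick_final_plan(empty_tables, 6)
    guarded_error = None
except ValueError as exc:
    guarded_error = str(exc)

# Case 3: with one entry at the final level, the plan is returned.
filled_tables = defaultdict(dict)
filled_tables[2][frozenset({"a", "b"})] = (10.0, "a JOIN b")
best_plan = pick_final_plan(filled_tables, 2)

print(raw_error)
print(guarded_error)
print(best_plan)
```

So the IndexError itself is most likely a symptom: the search ran over a join graph with no nodes or edges and never filled the final level.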
I use only three queries in the query_dir, like:
select supp_nation, cust_nation, l_year, sum(volume) as revenue
from (
  select n1.n_name as supp_nation, n2.n_name as cust_nation,
         extract(year from l_shipdate) as l_year,
         l_extendedprice * (1 - l_discount) as volume
  from supplier, lineitem, orders, customer, nation n1, nation n2
  where s_suppkey = l_suppkey
    and o_orderkey = l_orderkey
    and c_custkey = o_custkey
    and s_nationkey = n1.n_nationkey
    and c_nationkey = n2.n_nationkey
    and ( (n1.n_name = 'VIETNAM' and n2.n_name = 'UNITED KINGDOM')
       or (n1.n_name = 'UNITED KINGDOM' and n2.n_name = 'VIETNAM') )
    and l_shipdate between date '1995-01-01' and date '1996-12-31'
) as shipping
group by supp_nation, cust_nation, l_year
order by supp_nation, cust_nation, l_year;
Then I added print(join_graph) after line 257 of balsa/balsa/search.py, which is
r = r_tup[1]
and it prints "Graph with 0 nodes and 0 edges". I suspect the join graph is not built correctly at line 224 of balsa/balsa/search.py.
I then checked the definition of GetOrParseSql(self) in balsa/balsa/util/plans_lib.py and printed graph and join_conds. The graph prints as "Graph with 0 nodes and 0 edges" and join_conds is []. Next I checked simple_sql_parser in balsa/balsa/util/simple_sql_parser.py and printed join_conds right after
join_conds = join_cond_pat.findall(sql)
The sql is one of the queries in my query_dir, but join_conds is still []. Looking at the regular expression, I guess it cannot handle conditions such as c_custkey = o_custkey in my queries: the pattern contains dots, so it seems to expect table.column on both sides, while my join columns are unqualified.
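To make the suspicion concrete, here is a minimal sketch of how a dot-requiring pattern behaves on these join conditions. The pattern dotted_join_pat is a stand-in written for illustration, assuming the parser expects table.column = table.column; it is not copied from simple_sql_parser.py, and the aliases s, l, o, c in the second string are made up.

```python
import re

# Illustrative stand-in (assumption): requires "alias.column" on both
# sides of "=". The real join_cond_pat in Balsa may differ in detail.
dotted_join_pat = re.compile(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)")

# Join conditions as written in the TPC-H query: no table qualifiers.
unqualified = (
    "where s_suppkey = l_suppkey and o_orderkey = l_orderkey "
    "and c_custkey = o_custkey"
)

# Same conditions with hypothetical aliases s, l, o, c added.
qualified = (
    "where s.s_suppkey = l.l_suppkey and o.o_orderkey = l.l_orderkey "
    "and c.c_custkey = o.o_custkey"
)

# Half-qualified condition taken from the query: only one side is dotted.
mixed = "s_nationkey = n1.n_nationkey"

unqualified_conds = dotted_join_pat.findall(unqualified)
qualified_conds = dotted_join_pat.findall(qualified)
mixed_conds = dotted_join_pat.findall(mixed)

print(unqualified_conds)  # []
print(qualified_conds)
print(mixed_conds)        # []
```

If the real pattern behaves like this one, a possible workaround is to give every table in the FROM clause an alias and qualify both sides of each join condition. Whether the rest of the parser copes with the subquery in this query is a separate question that this sketch does not cover.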
According to the paper, TPC-H is used as a benchmark. Could you please give me some hints on the parser problem above, or add the code used for TPC-H? Many thanks in advance.
Another point of confusion: when I run the above command for the first time with my settings, it shows
3 train queries: ['test1', 'test2', 'test3']
0 test queries: []
wandb: (1) Create a W&B account
even though test_query_glob in the BalsaAgent params is ['test1.sql']. I am also curious why the baseline PG performance has to be measured by running all test and training queries before training. I look forward to your reply!
Blondig changed the title from "IndexError occurred when running python run.py --run Balsa --local" to "IndexError occurred when running python run.py --run Balsa——TPCH --local" on Jun 9, 2022
Blondig changed the title from "IndexError occurred when running python run.py --run Balsa——TPCH --local" to "IndexError occurred when running python run.py --run Balsa_TPCH --local" on Jun 9, 2022