
Zone Aggregation #8

Closed
wants to merge 32 commits into from

Conversation

DavidOry
Collaborator

Placeholder pull request for zone aggregation functionality

@DavidOry DavidOry added this to the Product 1A: Zone Creator MVP milestone May 26, 2022
@DavidOry
Collaborator Author

@jpn--, @JoeJimFlood
File references have been moved to the resources directory. @JoeJimFlood: you should now be able to run the demo notebook if you're interested.

jpn-- added 2 commits June 8, 2022 10:47
these are now in `resources`
@JoeJimFlood
Collaborator

JoeJimFlood commented Jun 9, 2022

@DavidOry @jpn-- I tried cloning the zone-agg branch of the repo and wasn't able to run the notebook. It looks like it's failing while the trip list is being loaded. Do I need to run it in Docker? Or is there something else I'm missing?

Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 trips = load_trip_list("trips_sample.pq", data_dir=data_dir)

File ~\.conda\envs\test1\lib\site-packages\sandag_rsm\data_load\triplist.py:19, in load_trip_list(trips_filename, data_dir)
     17 try:
     18     if trips_filename.endswith(".pq") or trips_filename.endswith(".parquet"):
---> 19         trips = pd.read_parquet(trips_filename)
     20     else:
     21         trips = pd.read_csv(trips_filename)

File ~\.conda\envs\test1\lib\site-packages\pandas\io\parquet.py:493, in read_parquet(path, engine, columns, storage_options, use_nullable_dtypes, **kwargs)
    446 """
    447 Load a parquet object from the file path, returning a DataFrame.
    448 
   (...)
    489 DataFrame
    490 """
    491 impl = get_engine(engine)
--> 493 return impl.read(
    494     path,
    495     columns=columns,
    496     storage_options=storage_options,
    497     use_nullable_dtypes=use_nullable_dtypes,
    498     **kwargs,
    499 )

File ~\.conda\envs\test1\lib\site-packages\pandas\io\parquet.py:347, in FastParquetImpl.read(self, path, columns, storage_options, **kwargs)
    343     path = handles.handle
    345 parquet_file = self.api.ParquetFile(path, **parquet_kwargs)
--> 347 result = parquet_file.to_pandas(columns=columns, **kwargs)
    349 if handles is not None:
    350     handles.close()

File ~\.conda\envs\test1\lib\site-packages\fastparquet\api.py:751, in ParquetFile.to_pandas(self, columns, categories, filters, index, row_filter)
    747         continue
    748     parts = {name: (v if name.endswith('-catdef')
    749                     else v[start:start + thislen])
    750              for (name, v) in views.items()}
--> 751     self.read_row_group_file(rg, columns, categories, index,
    752                              assign=parts, partition_meta=self.partition_meta,
    753                              row_filter=sel, infile=infile)
    754     start += thislen
    755 return df

File ~\.conda\envs\test1\lib\site-packages\fastparquet\api.py:361, in ParquetFile.read_row_group_file(self, rg, columns, categories, index, assign, partition_meta, row_filter, infile)
    358     ret = True
    359 f = infile or self.open(fn, mode='rb')
--> 361 core.read_row_group(
    362     f, rg, columns, categories, self.schema, self.cats,
    363     selfmade=self.selfmade, index=index,
    364     assign=assign, scheme=self.file_scheme, partition_meta=partition_meta,
    365     row_filter=row_filter
    366 )
    367 if ret:
    368     return df

File ~\.conda\envs\test1\lib\site-packages\fastparquet\core.py:608, in read_row_group(file, rg, columns, categories, schema_helper, cats, selfmade, index, assign, scheme, partition_meta, row_filter)
    606 if assign is None:
    607     raise RuntimeError('Going with pre-allocation!')
--> 608 read_row_group_arrays(file, rg, columns, categories, schema_helper,
    609                       cats, selfmade, assign=assign, row_filter=row_filter)
    611 for cat in cats:
    612     if cat not in assign:
    613         # do no need to have partition columns in output

File ~\.conda\envs\test1\lib\site-packages\fastparquet\core.py:580, in read_row_group_arrays(file, rg, columns, categories, schema_helper, cats, selfmade, assign, row_filter)
    577 if name not in columns:
    578     continue
--> 580 read_col(column, schema_helper, file, use_cat=name+'-catdef' in out,
    581          selfmade=selfmade, assign=out[name],
    582          catdef=out.get(name+'-catdef', None),
    583          row_filter=row_filter)
    585 if _is_map_like(schema_helper, column):
    586     # TODO: could be done in fast loop in _assemble_objects?
    587     if name not in maps:

File ~\.conda\envs\test1\lib\site-packages\fastparquet\core.py:549, in read_col(column, schema_helper, infile, use_cat, selfmade, assign, catdef, row_filter)
    547     piece[:] = i.codes
    548 elif d and not use_cat:
--> 549     piece[:] = dic[val]
    550 elif not use_cat:
    551     piece[:] = convert(val, se)

IndexError: index 132096 is out of bounds for axis 0 with size 131469

@jpn--
Collaborator

jpn-- commented Jun 9, 2022

That's an odd error. Maybe the trips_sample.pq file somehow got corrupted during the download? I'd try deleting it and downloading it again. If the same error persists, you can try Docker, or we can dig into it tomorrow when we talk.

@JoeJimFlood
Collaborator

I had been using an environment that I use for testing things out, but I realized I should create a new environment from the yaml file in the repo. After doing that, installing pyarrow, and upgrading scipy, the notebook appears to run all the way through successfully.
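
The fix described above can be sketched as shell commands. The environment file name and environment name below are assumptions for illustration; check the repo for the actual yaml file:

```shell
# Recreate the conda environment from the repo's yaml file
# (file name and env name are assumed, not taken from the repo)
conda env create -f environment.yml -n rsm-test
conda activate rsm-test

# Install pyarrow so pandas uses it instead of fastparquet,
# and upgrade scipy as described above
pip install pyarrow
pip install --upgrade scipy
```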

Removing the existing centroid connectors and creating new ones based on aggregate zone structure
@DavidOry DavidOry linked an issue Aug 9, 2022 that may be closed by this pull request
@DavidOry DavidOry marked this pull request as ready for review October 25, 2022 16:20
@DavidOry DavidOry changed the title Zone Aggregation (WIP) Zone Aggregation Oct 25, 2022
@DavidOry
Collaborator Author

@jpn--
Can we start over with a new PR or modify this one to back out the hundreds of files committed with the model reference?

@DavidOry DavidOry mentioned this pull request Nov 9, 2022
@DavidOry
Collaborator Author

DavidOry commented Nov 9, 2022

Replaced by #19


Successfully merging this pull request may close these issues.

PRODUCT 1A Zone Creator: First Pass at Algorithm for Transit Connectors
6 participants