Support table referencing with columns #2063

Open
wants to merge 9 commits into
base: master

Conversation

IvanHristov98

Add support for table referencing (aka structured references)

Description

After experimenting I noticed that table references aren't supported.

Docs used for reference during implementation:

Related Issue

#2062

Motivation and Context

It allows table references to be resolved when calculating cell formulas.

How Has This Been Tested

Tested with unit tests. TODO: test with actual files.

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document. (couldn't find such a document)
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@IvanHristov98
Author

I missed the case where a table is on a different sheet. I will cover it.

It is a bit trickier because it has to be handled when the Excel file is first read, so I'll finish what's left at the beginning of next week.

Until then, the PR is in a draft phase.

@xuri xuri added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 4, 2025
@IvanHristov98
Author

Hi @xuri, I think that it is ready for a review. Implementing the cross sheet table reference was tricky.

TL;DR: I had to read a handful of metadata files before I could build the mapping:

  1. Read xl/workbook.xml to get the sheet metadata.
  2. Read xl/_rels/workbook.xml.rels to find the metadata file of each sheet. It turns out that the numeric sheet ID alone isn't a reliable way to locate that file, so I had to parse this file as well.
  3. Read the xl/worksheets/_rels/sheetX.xml.rels files to find the tables each sheet references.
  4. Read the xl/tables/tableX.xml files to get the related table names.
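As a rough illustration of the four steps above, here is a minimal Go sketch that works on an in-memory copy of the parts. It is not the code from this PR: `tableSheets`, `parse`, `sampleParts`, and the sample XML are hypothetical, namespaces in the sample are trimmed for brevity, and it assumes that relationship targets are relative and that `displayName` is the name formulas refer to.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"path"
	"strings"
)

const officeRel = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

// Minimal views of the four kinds of parts; only the attributes needed here.
type workbookXML struct {
	Sheets []struct {
		Name string `xml:"name,attr"`
		RID  string `xml:"http://schemas.openxmlformats.org/officeDocument/2006/relationships id,attr"`
	} `xml:"sheets>sheet"`
}

type relsXML struct {
	Rels []struct {
		ID     string `xml:"Id,attr"`
		Type   string `xml:"Type,attr"`
		Target string `xml:"Target,attr"`
	} `xml:"Relationship"`
}

type tableXML struct {
	DisplayName string `xml:"displayName,attr"`
}

// parse unmarshals the named part from an in-memory copy of the package.
func parse(parts map[string]string, name string, v interface{}) error {
	data, ok := parts[name]
	if !ok {
		return fmt.Errorf("missing part %s", name)
	}
	return xml.Unmarshal([]byte(data), v)
}

// tableSheets maps each table's display name to the name of its sheet.
func tableSheets(parts map[string]string) (map[string]string, error) {
	var wb workbookXML // step 1: sheet names and relationship IDs
	if err := parse(parts, "xl/workbook.xml", &wb); err != nil {
		return nil, err
	}
	var wbRels relsXML // step 2: relationship ID -> worksheet part
	if err := parse(parts, "xl/_rels/workbook.xml.rels", &wbRels); err != nil {
		return nil, err
	}
	sheetPart := map[string]string{}
	for _, r := range wbRels.Rels {
		sheetPart[r.ID] = path.Join("xl", r.Target) // assumes a relative target
	}
	out := map[string]string{}
	for _, s := range wb.Sheets {
		dir, base := path.Split(sheetPart[s.RID])
		relsName := path.Join(dir, "_rels", base+".rels")
		if _, ok := parts[relsName]; !ok {
			continue // no relationships part, so no tables on this sheet
		}
		var sheetRels relsXML // step 3: tables referenced by the sheet
		if err := parse(parts, relsName, &sheetRels); err != nil {
			return nil, err
		}
		for _, r := range sheetRels.Rels {
			if !strings.HasSuffix(r.Type, "/table") {
				continue
			}
			var t tableXML // step 4: the table's name
			if err := parse(parts, path.Join(dir, r.Target), &t); err != nil {
				return nil, err
			}
			out[t.DisplayName] = s.Name
		}
	}
	return out, nil
}

// sampleParts is made-up data: "Data" has sheetId 1 but lives in sheet2.xml.
func sampleParts() map[string]string {
	rel := func(id, kind, target string) string {
		return `<Relationship Id="` + id + `" Type="` + officeRel + `/` + kind + `" Target="` + target + `"/>`
	}
	return map[string]string{
		"xl/workbook.xml": `<workbook xmlns:r="` + officeRel + `"><sheets>` +
			`<sheet name="Summary" sheetId="2" r:id="rId1"/>` +
			`<sheet name="Data" sheetId="1" r:id="rId2"/></sheets></workbook>`,
		"xl/_rels/workbook.xml.rels": `<Relationships>` + rel("rId1", "worksheet", "worksheets/sheet1.xml") +
			rel("rId2", "worksheet", "worksheets/sheet2.xml") + `</Relationships>`,
		"xl/worksheets/_rels/sheet2.xml.rels": `<Relationships>` +
			rel("rId1", "table", "../tables/table1.xml") + `</Relationships>`,
		"xl/tables/table1.xml": `<table id="1" name="Sales" displayName="Sales" ref="A1:C4"/>`,
	}
}

func main() {
	m, err := tableSheets(sampleParts())
	fmt.Println(m, err)
}
```

The sample deliberately gives the "Data" sheet `sheetId="1"` while its relationship points at `sheet2.xml`, which is the situation that makes step 2 necessary.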

The parsed files are metadata files and should be small, so I think (and hope) that reading them won't hurt the library's performance when calling OpenFile. 🤞


I think that the PR is now ready for a review.


Just one additional question. I manually tested table referencing on files opened with OpenFile. I tried adding a test xlsx file to the repo, but a .gitignore rule prevents me from adding it to the test directory.

Any advice here? I'd like to add more complete unit test coverage.

@xuri
Member

xuri commented Jan 6, 2025

Thanks for your PR. I have been busy recently, but I will review it as soon as possible.

@IvanHristov98 IvanHristov98 changed the title from "Support table refs" to "Support table referencing with columns" Jan 8, 2025
@IvanHristov98
Author

IvanHristov98 commented Jan 8, 2025

Hi, I'd like to make one more note.

This PR targets only the most basic syntax, TableName[ColumnName], so I renamed it accordingly.
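To make the basic syntax concrete, here is a minimal Go sketch of how a TableName[ColumnName] reference could be turned into a plain cell range. It is an illustration, not the PR's implementation: the `Table` type, `resolveColumnRef`, and the sample data are hypothetical, and it assumes the table has a header row, no totals row, and case-insensitive column names.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Table is a hypothetical, simplified description of a worksheet table.
type Table struct {
	Ref     string   // full range including the header row, e.g. "A1:C4"
	Columns []string // header names, left to right
}

var (
	colRefRe = regexp.MustCompile(`^([A-Za-z_][A-Za-z0-9_.]*)\[([^\[\]#]+)\]$`)
	cellRe   = regexp.MustCompile(`^([A-Z]+)([0-9]+)$`)
)

// colToNum converts column letters to a 1-based index ("A" -> 1, "AA" -> 27).
func colToNum(s string) int {
	n := 0
	for _, c := range s {
		n = n*26 + int(c-'A') + 1
	}
	return n
}

// numToCol is the inverse of colToNum.
func numToCol(n int) string {
	s := ""
	for n > 0 {
		n--
		s = string(rune('A'+n%26)) + s
		n /= 26
	}
	return s
}

// resolveColumnRef maps "Table[Column]" to the column's data range,
// skipping the header row (assumed to be the first row of Table.Ref).
func resolveColumnRef(ref string, tables map[string]Table) (string, error) {
	m := colRefRe.FindStringSubmatch(ref)
	if m == nil {
		return "", fmt.Errorf("unsupported reference %q", ref)
	}
	t, ok := tables[m[1]]
	if !ok {
		return "", fmt.Errorf("unknown table %q", m[1])
	}
	corners := strings.Split(t.Ref, ":")
	if len(corners) != 2 {
		return "", fmt.Errorf("invalid table range %q", t.Ref)
	}
	first := cellRe.FindStringSubmatch(corners[0])
	last := cellRe.FindStringSubmatch(corners[1])
	if first == nil || last == nil {
		return "", fmt.Errorf("invalid table range %q", t.Ref)
	}
	// The regexp guarantees digits, so Atoi errors are ignored here.
	topRow, _ := strconv.Atoi(first[2])
	bottomRow, _ := strconv.Atoi(last[2])
	for i, name := range t.Columns {
		if strings.EqualFold(name, m[2]) {
			col := numToCol(colToNum(first[1]) + i)
			return fmt.Sprintf("%s%d:%s%d", col, topRow+1, col, bottomRow), nil
		}
	}
	return "", fmt.Errorf("unknown column %q in table %q", m[2], m[1])
}

func main() {
	tables := map[string]Table{
		"Sales": {Ref: "A1:C4", Columns: []string{"Region", "Units", "Price"}},
	}
	fmt.Println(resolveColumnRef("Sales[Units]", tables))
}
```

With the sample table above, `Sales[Units]` resolves to the second column below the header row, and anything containing `#` is rejected as unsupported.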

There are other, more sophisticated syntaxes for table referencing. Examples include (see the MS docs):

  • TableName -> the data range of a table (without the header or totals rows)
  • TableName[#Headers] -> the header row of a table
  • TableName[#All] -> the whole range of a table, including the header and totals rows
  • TableName[#Totals] -> the totals row (aggregations)
  • TableName[[#Headers], [ColName]], TableName[[#All],[ColName]], TableName[[#Totals],[ColName]] -> just the given part of a table for a given column
  • TableName[[ColName1]:[ColNameOther]] -> a range between columns
  • TableName[[#Totals], [ColName1]:[ColNameOther]] -> the totals row for a range between columns
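For a sense of what handling these forms might involve, here is a minimal Go sketch that splits a structured reference into a table name, an optional special item, and a column span. It is only an illustration under simplifying assumptions: `StructuredRef` and `parseStructuredRef` are hypothetical names, separators between bracketed parts are not validated, and escaped characters in column names are not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// StructuredRef is a hypothetical parsed form of a table reference.
type StructuredRef struct {
	Table    string
	Item     string // "" or a special item such as "#Headers", "#All", "#Totals"
	FirstCol string // empty when no column is named
	LastCol  string // equals FirstCol for a single column
}

// partRe matches one bracketed part such as [#Totals] or [ColName].
var partRe = regexp.MustCompile(`\[([^\[\]]+)\]`)

func parseStructuredRef(s string) (StructuredRef, error) {
	open := strings.Index(s, "[")
	if open < 0 {
		return StructuredRef{Table: s}, nil // bare table name
	}
	if open == 0 || !strings.HasSuffix(s, "]") {
		return StructuredRef{}, fmt.Errorf("malformed reference %q", s)
	}
	ref := StructuredRef{Table: s[:open]}
	inner := s[open+1 : len(s)-1]

	// Either a single specifier (Table[Col]) or a list of bracketed parts.
	parts := []string{inner}
	if strings.Contains(inner, "[") {
		parts = nil
		for _, m := range partRe.FindAllStringSubmatch(inner, -1) {
			parts = append(parts, m[1])
		}
	}

	var cols []string
	for _, p := range parts {
		if strings.HasPrefix(p, "#") {
			ref.Item = p
		} else {
			cols = append(cols, p)
		}
	}
	switch len(cols) {
	case 0:
		// no column named: the reference covers the whole item
	case 1:
		ref.FirstCol, ref.LastCol = cols[0], cols[0]
	case 2:
		ref.FirstCol, ref.LastCol = cols[0], cols[1]
	default:
		return StructuredRef{}, fmt.Errorf("too many columns in %q", s)
	}
	return ref, nil
}

func main() {
	for _, s := range []string{
		"Sales",
		"Sales[#All]",
		"Sales[[#Totals], [Q1]:[Q3]]",
	} {
		ref, err := parseStructuredRef(s)
		fmt.Printf("%-30s %+v %v\n", s, ref, err)
	}
}
```

A real parser would also need to check the separators (`,` versus `:`) and decide how each special item maps onto the table's header, data, and totals rows.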

I haven't covered them because the PR would otherwise become too big. I propose implementing them in follow-up PRs. The whole point of this PR is to lay the foundations for table referencing. Once the foundations are in place, adding further syntaxes should be easy.
