
[IR] Improve external data handling #2020

Draft · wants to merge 18 commits into base: main

Conversation

@justinchuby (Collaborator) commented Jan 17, 2025

  1. Add an external_data option to ir.save. This saves all initializers as external tensors and is robust against data loss when overwriting.
  2. Add a modify_model option to ir.save so users can control whether the model is kept unchanged when saving to external data.
  3. Simplified the torch_apis logic by leveraging the updated ir.save method.
  4. Updated the to_external_data function to always load data to memory if, and only if, the tensor references an external data file that is being written to. This simplifies the logic and avoids creating and managing temporary files.

Note

We do not need to add external data options to ir.load: external data is always loaded lazily in the IR. If we want to transfer the data to memory at load time, we can use _external_tensor_to_memory_tensor internally.
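Conceptually, the lazy behavior works like this. The class below is a minimal illustrative sketch, not the IR's actual external-tensor class; the names are invented:

```python
class LazyExternalTensor:
    """Minimal sketch of lazy loading: bytes are read from the external
    data file only on first access, not when the model is loaded."""

    def __init__(self, path: str, offset: int, length: int):
        self.path = path
        self.offset = offset
        self.length = length
        self._data = None  # nothing is read from disk yet

    def tobytes(self) -> bytes:
        if self._data is None:  # load lazily on first access
            with open(self.path, "rb") as f:
                f.seek(self.offset)
                self._data = f.read(self.length)
        return self._data
```

Loading a model thus only records (path, offset, length) triples; disk I/O happens when a consumer actually asks for the bytes.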

Example usage

ir.save(model, "model.onnx", external_data="model.onnx.data")
# Can save many times
ir.save(model, "model_copy.onnx", external_data="model_copy.onnx.data")

Copilot AI (Contributor) left a comment

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

onnxscript/ir/_external_data_test.py:352

  • The tobytes method should raise a TypeError instead of returning it.
return TypeError


codecov bot commented Jan 17, 2025

❌ 20 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 13032 | 20 | 13012 | 2455 |
View the top 3 failed tests by shortest run time
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0034_test_and_bcast4v3d
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_and_bcast4v3d'

The above exception was the direct cause of the following exception:
.nox\test_ort_nightly\Lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_and_bcast4v3d' (e=No module named 'tests.onnx_backend_test_code.test_and_bcast4v3d') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_and_bcast4v3d.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_and_bcast4v3d.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import BOOL
E   from onnxscript.onnx_opset import opset7
E   
E   @script()
E   def bck_test_and_bcast4v3d(x: BOOL[3,4,5,6], y: BOOL[4,5,6]) -> (BOOL[3,4,5,6]):
E       r_and = opset7.And(x, y)
E       return r_and
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0634_test_max_int32
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_max_int32'

The above exception was the direct cause of the following exception:
.nox\test_ort_nightly\Lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_max_int32' (e=No module named 'tests.onnx_backend_test_code.test_max_int32') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_max_int32.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_max_int32.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import INT32
E   from onnxscript.onnx_opset import opset13
E   
E   @script()
E   def bck_test_max_int32(data_0: INT32[3], data_1: INT32[3]) -> (INT32[3]):
E       result = opset13.Max(data_0, data_1)
E       return result
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0752_test_optional_has_element_empty_no_input_name_optional_input
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_optional_has_element_empty_no_input_name_optional_input'

The above exception was the direct cause of the following exception:
.nox\test_torch_nightly\Lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_optional_has_element_empty_no_input_name_optional_input' (e=No module named 'tests.onnx_backend_test_code.test_optional_has_element_empty_no_input_name_optional_input') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_optional_has_element_empty_no_input_name_optional_input.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_optional_has_element_empty_no_input_name_optional_input.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import BOOL
E   from onnxscript.onnx_opset import opset18
E   
E   @script()
E   def bck_test_optional_has_element_empty_no_input_name_optional_input() -> (BOOL):
E       output = opset18.OptionalHasElement(None)
E       return output


@justinchuby added labels "hold on merging (Don't merge yet)" and "topic: IR (Intermediate representation)" on Jan 17, 2025


Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

onnxscript/ir/_external_data.py:174

  • The variable name 'base_path' should be renamed to 'base_dir' for consistency.
base_path: str | os.PathLike,

onnxscript/ir/_external_data.py:252

  • The word 'unneccesarry' should be corrected to 'unnecessary'.
# Sort all tensors based on tensor sizes, in order to avoid unneccesarry alignment.
onnxscript/ir/_io_test.py Fixed
model: _core.Model,
path: str | os.PathLike,
format: str | None = None,
external_data: str | os.PathLike | None = None,
A contributor commented:
nit suggestion: maybe name it external_data_path to clarify that it is a path to a file?

@justinchuby (Collaborator, Author) replied:
Since the option is intended to be provided by keyword, I find external_data to be just as clear and more concise. Let me know if you think differently.

@justinchuby justinchuby marked this pull request as draft January 21, 2025 16:44
@justinchuby justinchuby marked this pull request as ready for review January 21, 2025 20:38
@gramalingam (Collaborator) commented:

I have a question (the same one discussed earlier too): is it possible to ensure that small tensors (typically scalar constants, and shapes) stay internal to the model?

This is a question that impacts all ONNX users, so it is worth making a decision based on overall requirements (as opposed to just the IR perspective). The clear need is to have large tensors external (typically weights). Scalars and shapes: not so much. Further, no point in incurring any of the overheads of external tensors for them (not worth memory-mapping, not worth lazy loading, etc.) ONNX shape inference specifically does not load any external tensors during inference, and will be impacted if shapes are stored externally. Updating ONNX shape inference to do this will be a somewhat non-trivial task.

Another related question is the role of Constant() nodes vs. initializers. Here, there is no issue with using initializers in favor of Constant nodes (except inside functions, which do not support them, but that is another issue worth discussing separately). E.g., initializers seem to be handled better by Netron (and a few other tools) than Constant nodes.

@justinchuby (Collaborator, Author) replied:

> is it possible to ensure that small tensors (typically scalar constants, and shapes) stay internal to the model?

That makes sense. We can have a size limit like the onnx.save function does.
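A minimal sketch of such a size limit, modeled on the size_threshold idea in onnx's external-data helpers. The function name, signature, and default below are illustrative, not the PR's actual API:

```python
def partition_initializers(
    initializers: dict, size_threshold: int = 1024
) -> tuple:
    """Split initializer names into (keep_internal, write_external).

    Tensors whose serialized size is below size_threshold stay inside the
    model proto; only larger ones (typically weights) go to the external
    data file, so scalars and shapes avoid external-tensor overhead.
    """
    internal = []
    external = []
    for name, data in initializers.items():
        (internal if len(data) < size_threshold else external).append(name)
    return internal, external
```

With this split, ONNX shape inference can keep reading small shape tensors directly from the proto without touching the external file.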

@justinchuby justinchuby marked this pull request as draft January 21, 2025 23:20