
[IR] Improve external data handling #2020

Draft · wants to merge 18 commits into base: main

Conversation

@justinchuby (Collaborator) commented Jan 17, 2025

  1. Add an external_data option to ir.save. This saves all initializers as external tensors and is robust against data loss when overwriting.
  2. Add a modify_model option to ir.save so users can control whether the model is kept unchanged when saving to external data.
  3. Simplified the torch_apis logic by leveraging the updated ir.save method.
  4. Updated the to_external_data function to always load data to memory if, and only if, the tensor references an external data file that is being written to. This simplifies the logic and avoids creating and managing temporary files.

Note

We do not need to add external data options to ir.load: external data is always loaded lazily in the IR. If we want to transfer the data to memory at load time, we can use _external_tensor_to_memory_tensor internally.
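Conceptually, the lazy behavior works like this. The class below is a minimal illustrative sketch, not the IR's actual external-tensor class; the names are invented:

```python
class LazyExternalTensor:
    """Minimal sketch of lazy loading: bytes are read from the external
    data file only on first access, not when the model is loaded."""

    def __init__(self, path: str, offset: int, length: int):
        self.path = path
        self.offset = offset
        self.length = length
        self._data = None  # nothing is read from disk yet

    def tobytes(self) -> bytes:
        if self._data is None:  # load lazily on first access
            with open(self.path, "rb") as f:
                f.seek(self.offset)
                self._data = f.read(self.length)
        return self._data
```

Loading a model thus only records (path, offset, length) triples; disk I/O happens when a consumer actually asks for the bytes.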

Example usage

ir.save(model, "model.onnx", external_data="model.onnx.data")
# Can save many times
ir.save(model, "model_copy.onnx", external_data="model_copy.onnx.data")

Copilot AI (Contributor) left a comment

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

onnxscript/ir/_external_data_test.py:352

  • The tobytes method should raise a TypeError instead of returning it.
return TypeError


codecov bot commented Jan 17, 2025

❌ 20 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 13032 | 20 | 13012 | 2455 |
View the top 3 failed tests by shortest run time
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0034_test_and_bcast4v3d
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_and_bcast4v3d'

The above exception was the direct cause of the following exception:
.nox\test_ort_nightly\Lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_and_bcast4v3d' (e=No module named 'tests.onnx_backend_test_code.test_and_bcast4v3d') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_and_bcast4v3d.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_and_bcast4v3d.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import BOOL
E   from onnxscript.onnx_opset import opset7
E   
E   @script()
E   def bck_test_and_bcast4v3d(x: BOOL[3,4,5,6], y: BOOL[4,5,6]) -> (BOOL[3,4,5,6]):
E       r_and = opset7.And(x, y)
E       return r_and
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0634_test_max_int32
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_max_int32'

The above exception was the direct cause of the following exception:
.nox\test_ort_nightly\Lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_max_int32' (e=No module named 'tests.onnx_backend_test_code.test_max_int32') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_max_int32.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_max_int32.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import INT32
E   from onnxscript.onnx_opset import opset13
E   
E   @script()
E   def bck_test_max_int32(data_0: INT32[3], data_1: INT32[3]) -> (INT32[3]):
E       result = opset13.Max(data_0, data_1)
E       return result
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0752_test_optional_has_element_empty_no_input_name_optional_input
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_optional_has_element_empty_no_input_name_optional_input'

The above exception was the direct cause of the following exception:
.nox\test_torch_nightly\Lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_optional_has_element_empty_no_input_name_optional_input' (e=No module named 'tests.onnx_backend_test_code.test_optional_has_element_empty_no_input_name_optional_input') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_optional_has_element_empty_no_input_name_optional_input.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_optional_has_element_empty_no_input_name_optional_input.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import BOOL
E   from onnxscript.onnx_opset import opset18
E   
E   @script()
E   def bck_test_optional_has_element_empty_no_input_name_optional_input() -> (BOOL):
E       output = opset18.OptionalHasElement(None)
E       return output


@justinchuby added labels "hold on merging (Don't merge yet)" and "topic: IR (Intermediate representation)" on Jan 17, 2025


Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

onnxscript/ir/_external_data.py:174

  • The variable name 'base_path' should be renamed to 'base_dir' for consistency.
base_path: str | os.PathLike,

onnxscript/ir/_external_data.py:252

  • The word 'unneccesarry' should be corrected to 'unnecessary'.
# Sort all tensors based on tensor sizes, in order to avoid unneccesarry alignment.
onnxscript/ir/_io_test.py Fixed
model: _core.Model,
path: str | os.PathLike,
format: str | None = None,
external_data: str | os.PathLike | None = None,
A contributor commented:
nit suggestion: maybe name it external_data_path to clarify that it is a path to a file?

@justinchuby (Collaborator, Author) replied:
Since the option is intended to be provided by keyword, I find external_data to be just as clear and more concise. Let me know if you think differently.

@justinchuby justinchuby marked this pull request as draft January 21, 2025 16:44
@justinchuby justinchuby marked this pull request as ready for review January 21, 2025 20:38
@gramalingam (Collaborator) commented:

I have a question (the same one discussed earlier too): is it possible to ensure that small tensors (typically scalar constants, and shapes) stay internal to the model?

This is a question that impacts all ONNX users, so it is worth making a decision based on overall requirements (as opposed to just the IR perspective). The clear need is to have large tensors external (typically weights). Scalars and shapes: not so much. Further, no point in incurring any of the overheads of external tensors for them (not worth memory-mapping, not worth lazy loading, etc.) ONNX shape inference specifically does not load any external tensors during inference, and will be impacted if shapes are stored externally. Updating ONNX shape inference to do this will be a somewhat non-trivial task.

Another related question is the role of Constant() nodes vs. initializers. Here, there is no issue with using initializers in favor of Constant nodes (except inside functions, which do not support them, but that is another issue worth discussing separately). E.g., initializers seem to be handled better by Netron (and a few other tools) than Constant nodes.

@justinchuby (Collaborator, Author) replied:

> is it possible to ensure that small tensors (typically scalar constants, and shapes) stay internal to the model?

That makes sense. We can have a size limit like the onnx.save function does.
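A minimal sketch of such a size limit, modeled on the size_threshold idea in onnx's external-data helpers. The function name, signature, and default below are illustrative, not the PR's actual API:

```python
def partition_initializers(
    initializers: dict, size_threshold: int = 1024
) -> tuple:
    """Split initializer names into (keep_internal, write_external).

    Tensors whose serialized size is below size_threshold stay inside the
    model proto; only larger ones (typically weights) go to the external
    data file, so scalars and shapes avoid external-tensor overhead.
    """
    internal = []
    external = []
    for name, data in initializers.items():
        (internal if len(data) < size_threshold else external).append(name)
    return internal, external
```

With this split, ONNX shape inference can keep reading small shape tensors directly from the proto without touching the external file.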

@justinchuby justinchuby marked this pull request as draft January 21, 2025 23:20