[BUG]: Pipeline left in an inconsistent state when the init function passed to make_segment throws #360

Open
2 tasks done
dagardner-nv opened this issue Aug 2, 2023 · 0 comments · May be fixed by #434
Assignees
Labels
bug Something isn't working

Comments

@dagardner-nv
Contributor

Version

23.11

Which installation method(s) does this occur on?

Source

Describe the bug.

This occurs with Morpheus' test_multi_segment_bad_data_type test, which intentionally builds a multi-segment pipeline in which one segment fails a pre-flight check. The resulting exception is raised from the init function passed to mrc::pymrc::Pipeline::make_segment.

When the pipeline then goes out of scope, an assertion in edge_holder fails:

A node was destructed which still had dependent connections. Nodes must be kept alive while dependent connections are still active

Minimum reproducible example

From the root of the Morpheus repository, run:

pytest -v -s -x tests/test_multi_segment.py::test_multi_segment_bad_data_type

Note:
Morpheus currently has a circular reference that prevents the MRC executor from ever being destroyed. Fixing that circular reference is what exposed this bug.
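
To show why a circular reference can hide a destructor-time failure, here is a minimal Python sketch. `ToyExecutor` and `ToyPipeline` are hypothetical stand-ins, not Morpheus or MRC classes, and the behaviour shown relies on CPython's reference counting. The cycle collector is disabled only to make the demo deterministic. While the two objects reference each other, dropping the last outside reference does not finalize the executor; once the cycle is broken, it is finalized immediately.

```python
import gc

finalized = []


class ToyExecutor:
    """Hypothetical executor that records when it is finalized."""

    def __init__(self):
        self.pipeline = None

    def __del__(self):
        finalized.append("executor")


class ToyPipeline:
    """Hypothetical pipeline that holds a reference to its executor."""

    def __init__(self, executor):
        self.executor = executor
        executor.pipeline = self  # back-reference creates the cycle

    def stop(self):
        # Break the cycle so reference counting alone can free the executor.
        self.executor.pipeline = None
        self.executor = None


gc.disable()  # keep the cycle collector out of the picture for this demo
try:
    # Case 1: cycle left in place; dropping the name finalizes nothing.
    pipeline = ToyPipeline(ToyExecutor())
    del pipeline
    finalized_with_cycle = list(finalized)

    # Case 2: cycle broken on stop; the executor is finalized right away.
    pipeline = ToyPipeline(ToyExecutor())
    pipeline.stop()
    finalized_after_stop = list(finalized)
finally:
    gc.enable()
    gc.collect()  # clean up the cycle left behind by case 1

print(finalized_with_cycle, finalized_after_stop)
```

Any check that runs in the executor's destructor is skipped in the first case and runs in the second, which is consistent with a destructor-time assert only appearing after the cycle is removed.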



### Relevant log output

```shell
tests/test_multi_segment.py::test_multi_segment_bad_data_type[use_cpp] E20230802 09:29:40.420850 1218132 builder_definition.cpp:283] Exception during segment initializer. Segment name: linear_segment_0, Segment Rank: 0. Exception message:
RuntimeError: The linear_segment_egress stage cannot handle input of <class 'morpheus.messages.message_meta.MessageMeta'>. Accepted input types: (<class 'int'>,)

At:
  /home/dagardner/work/morpheus/morpheus/pipeline/single_port_stage.py(66): _pre_build
  /home/dagardner/work/morpheus/morpheus/pipeline/stream_wrapper.py(325): build
  /home/dagardner/work/morpheus/morpheus/pipeline/stream_wrapper.py(349): build
  /home/dagardner/work/morpheus/morpheus/pipeline/pipeline.py(257): inner_build
E20230802 09:29:40.421111 1218132 controller.cpp:64] exception caught while performing update - this is fatal - issuing kill
E20230802 09:29:40.421136 1218132 context.cpp:124] rank: 0; size: 1; tid: 139798018127424; fid: 0x7f253c041200: set_exception issued; issuing kill to current runnable. Exception msg: RuntimeError: The linear_segment_egress stage cannot handle input of <class 'morpheus.messages.message_meta.MessageMeta'>. Accepted input types: (<class 'int'>,)

At:
  /home/dagardner/work/morpheus/morpheus/pipeline/single_port_stage.py(66): _pre_build
  /home/dagardner/work/morpheus/morpheus/pipeline/stream_wrapper.py(325): build
  /home/dagardner/work/morpheus/morpheus/pipeline/stream_wrapper.py(349): build
  /home/dagardner/work/morpheus/morpheus/pipeline/pipeline.py(257): inner_build
E20230802 09:29:40.421155 1218132 manager.cpp:89] error detected on controller
E20230802 09:29:40.421192 1218069 runner.cpp:189] Runner::await_join - an exception was caught while awaiting on one or more contexts/instances - rethrowing
E20230802 09:29:40.422310 1218069 service.cpp:136] mrc::service: service was not joined before being destructed; issuing join
F20230802 09:29:40.422335 1218069 edge_holder.hpp:61] A node was destructed which still had dependent connections. Nodes must be kept alive while dependent connections are still active
*** Check failure stack trace: ***
Exception occurred in pipeline. Rethrowing
Traceback (most recent call last):
  File "/home/dagardner/work/morpheus/morpheus/pipeline/pipeline.py", line 328, in join
    await self._mrc_executor.join_async()
  File "/home/dagardner/work/morpheus/morpheus/pipeline/pipeline.py", line 257, in inner_build
    stage.build(builder)
  File "/home/dagardner/work/morpheus/morpheus/pipeline/stream_wrapper.py", line 349, in build
    dep.build(builder, do_propagate=do_propagate)
  File "/home/dagardner/work/morpheus/morpheus/pipeline/stream_wrapper.py", line 325, in build
    in_ports_pairs = self._pre_build(builder=builder)
  File "/home/dagardner/work/morpheus/morpheus/pipeline/single_port_stage.py", line 66, in _pre_build
    raise RuntimeError("The {} stage cannot handle input of {}. Accepted input types: {}".format(
RuntimeError: The linear_segment_egress stage cannot handle input of <class 'morpheus.messages.message_meta.MessageMeta'>. Accepted input types: (<class 'int'>,)
    @     0x7f271ba5af8d  google::LogMessage::Fail()
    @     0x7f271ba5ee67  google::LogMessage::SendToLog()
    @     0x7f271ba5aa55  google::LogMessage::Flush()
    @     0x7f271ba5beaa  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f26041afc84  (unknown)
    @     0x7f260418da37  (unknown)
    @     0x7f26041a28fa  (unknown)
    @     0x7f26041a724c  (unknown)
    @     0x7f26fc884b0a  mrc::pipeline::PipelineInstance::~PipelineInstance()
    @     0x7f26fc884d9a  mrc::pipeline::PipelineInstance::~PipelineInstance()
    @     0x7f26fc8783c0  mrc::pipeline::Controller::~Controller()
    @     0x7f26fc87865e  mrc::pipeline::Controller::~Controller()
    @     0x7f26fc995fe0  mrc::runnable::Runner::~Runner()
    @     0x7f26fc7768b8  mrc::runnable::SpecializedRunner<>::~SpecializedRunner()
    @     0x7f26fc879a57  mrc::pipeline::Manager::~Manager()
    @     0x7f26fc879a8a  mrc::pipeline::Manager::~Manager()
    @     0x7f26fc837780  mrc::executor::ExecutorDefinition::~ExecutorDefinition()
    @     0x7f26fc83782a  mrc::executor::ExecutorDefinition::~ExecutorDefinition()
    @     0x7f26fceb9aea  std::_Sp_counted_base<>::_M_release()
    @     0x7f26fcec572c  mrc::pymrc::Executor::~Executor()
    @     0x7f26040207c3  (unknown)
    @     0x7f26040205aa  (unknown)
    @     0x7f271bbf8d16  pybind11::detail::clear_instance()
    @     0x7f271bbf8e1d  pybind11_object_dealloc
    @     0x563fe52bbc1f  insertdict
    @     0x563fe52c009a  _PyObject_GenericSetAttrWithDict.localalias
    @     0x563fe52bfe71  PyObject_SetAttr.localalias
    @     0x563fe52cb2f2  _PyEval_EvalFrameDefault
    @     0x563fe52da99c  _PyFunction_Vectorcall
    @     0x563fe52cac5c  _PyEval_EvalFrameDefault
    @     0x563fe53776b4  gen_send_ex2
    @     0x563fe52cc0ea  _PyEval_EvalFrameDefault
Aborted
```

Full env printout

No response

Other/Misc.

No response

Code of Conduct

  • I agree to follow MRC's Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
dagardner-nv added the bug Something isn't working label Aug 2, 2023
dagardner-nv self-assigned this Aug 2, 2023
dagardner-nv added a commit to dagardner-nv/Morpheus that referenced this issue Aug 2, 2023
dagardner-nv added a commit to dagardner-nv/MRC that referenced this issue Aug 2, 2023
mdemoret-nv moved this from Todo to In Progress in Morpheus Boards Aug 21, 2023
rapids-bot bot pushed a commit to nv-morpheus/Morpheus that referenced this issue Aug 30, 2023
* Fixes a memory leak #1114 by releasing reference to MRC Pipeline & Executor on stop.
* Numerous pylint fixes for `morpheus/pipeline/pipeline.py`
* Skip `tests/test_multi_segment.py::test_multi_segment_bad_data_type` due to nv-morpheus/MRC#360

fixes #1114

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Christopher Harris (https://github.com/cwharris)
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #1115
dagardner-nv added a commit to dagardner-nv/MRC that referenced this issue Sep 26, 2023
rapids-bot bot pushed a commit that referenced this issue Sep 28, 2023
* Prevents `check_active_connection` from mistakenly returning true for a holder where `init_owned_edge` has been called but neither the `init_connected_edge` method nor the `add_connector` method has been called.

Relates to issue #360

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #402
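
The distinction drawn by the commit above can be sketched in Python. This is a hypothetical toy model, not the C++ code in `edge_holder.hpp`: the method names are borrowed from the commit message, but the attributes and the logic are assumptions made for illustration. The point is that owning an edge alone should not count as an active connection.

```python
class ToyEdgeHolder:
    """Hypothetical model of a holder; attribute names are illustrative only."""

    def __init__(self):
        self.owned_edge = None
        self.connected_edge = None
        self.connectors = []

    def init_owned_edge(self, edge):
        self.owned_edge = edge

    def init_connected_edge(self, edge):
        self.connected_edge = edge

    def add_connector(self, connector):
        self.connectors.append(connector)

    def check_active_connection(self):
        # An owned edge by itself is not treated as a connection; only a
        # connected edge or a registered connector makes the holder active.
        return self.connected_edge is not None or bool(self.connectors)


owned_only = ToyEdgeHolder()
owned_only.init_owned_edge("edge")

connected = ToyEdgeHolder()
connected.init_owned_edge("edge")
connected.init_connected_edge("edge")

with_connector = ToyEdgeHolder()
with_connector.init_owned_edge("edge")
with_connector.add_connector(lambda: None)

print(owned_only.check_active_connection(),
      connected.check_active_connection(),
      with_connector.check_active_connection())
```

Under these assumptions, a holder that was only given an owned edge can be torn down without tripping a "dependent connections" check, while the other two cannot.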
dagardner-nv added a commit to dagardner-nv/MRC that referenced this issue Jan 22, 2024
…n loaded and added to the segment [no ci]
dagardner-nv added a commit to dagardner-nv/MRC that referenced this issue Jan 22, 2024
Projects
Status: In Progress
1 participant