Avoid redundant shutdown in TracerProvider::drop when already shut down #2197

Merged

Conversation

lalitb
Member

@lalitb lalitb commented Oct 11, 2024

Changes

Apply changes similar to #2195 to TracerProvider.

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@lalitb lalitb requested a review from a team as a code owner October 11, 2024 19:35
@lalitb lalitb changed the title from "void redundant shutdown in TracerProvider::drop when already shut down" to "Avoid redundant shutdown in TracerProvider::drop when already shut down" Oct 11, 2024

codecov bot commented Oct 11, 2024

Codecov Report

Attention: Patch coverage is 96.46018% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.1%. Comparing base (2bb53b4) to head (1272312).
Report is 1 commit behind head on main.

Files with missing lines                   Patch %   Lines
opentelemetry-sdk/src/trace/provider.rs    96.4%     4 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff          @@
##            main   #2197   +/-   ##
=====================================
  Coverage   79.1%   79.1%           
=====================================
  Files        121     121           
  Lines      21084   21180   +96     
=====================================
+ Hits       16678   16772   +94     
- Misses      4406    4408    +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Member

@cijothomas cijothomas left a comment

Let's make it similar to #2195.

@lalitb
Member Author

lalitb commented Oct 14, 2024

> Let's make it similar to #2195.

Done.

@@ -200,6 +200,10 @@ pub enum TraceError {
#[error("Exporting timed out after {} seconds", .0.as_secs())]
ExportTimedOut(time::Duration),

/// already shutdown error
#[error("{0} already shutdown")]
AlreadyShutdown(String),
Contributor

Do we expect to use this variant for anything other than TracerProvider?

Member Author

Yes, I had planned to use it for the processors and exporters too, but I believe we can customize it later if required. For now, I have made it static for TracerProvider.
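
For illustration only, here is a minimal sketch of how a variant carrying the component name could serve processors and exporters as well as the provider. It uses a hand-written `Display` impl instead of `thiserror` so that it compiles without dependencies; the `TraceError` below is a hypothetical stand-in that models only this variant, and `already_shutdown` is a hypothetical helper, not part of the SDK.

```rust
use std::fmt;

/// Hypothetical stand-in for the SDK's `TraceError`; only the variant
/// discussed in this thread is modeled.
#[derive(Debug, PartialEq)]
pub enum TraceError {
    /// The named component was already shut down.
    AlreadyShutdown(String),
}

impl fmt::Display for TraceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Mirrors the `#[error("{0} already shutdown")]` format string from the diff.
            TraceError::AlreadyShutdown(component) => {
                write!(f, "{} already shutdown", component)
            }
        }
    }
}

impl std::error::Error for TraceError {}

/// Hypothetical helper: builds the error for any component name.
pub fn already_shutdown(component: &str) -> TraceError {
    TraceError::AlreadyShutdown(component.to_string())
}
```

Because the payload is a plain `String`, the same variant can name a provider, a processor, or an exporter without further changes to the enum.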

lalitb and others added 2 commits October 15, 2024 18:33
///
/// ## Cloning and Shutdown
///
/// The `TracerProvider` is designed to be lightweight and clonable. Cloning a `TracerProvider`
Member

I don't think TracerProvider is lightweight. It is pretty heavy, and we expect users to create it only once. It is correct to say that cloning is cheap, since it just creates a new reference.

Member Author

@lalitb lalitb Oct 22, 2024

Good point. I have updated the comment to remove the lightweight reference.
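
To make the cloning and shutdown behavior discussed here concrete, the following is a rough sketch, not the SDK's actual code. It assumes the provider is a thin handle around an `Arc` of shared state guarded by an `AtomicBool`; the names `Provider`, `ProviderInner`, `shutdown_once`, and `shutdown_runs` are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

/// Shared state; hypothetical stand-in for the SDK's inner provider.
struct ProviderInner {
    is_shutdown: AtomicBool,
    /// Counts how many times the (expensive) shutdown work actually ran.
    shutdown_runs: Arc<AtomicUsize>,
}

impl ProviderInner {
    /// Runs the shutdown work at most once; returns false if it already ran.
    fn shutdown_once(&self) -> bool {
        // `swap` returns the previous value, so only the first caller sees `false`.
        if self.is_shutdown.swap(true, Ordering::SeqCst) {
            return false;
        }
        self.shutdown_runs.fetch_add(1, Ordering::SeqCst);
        true
    }
}

impl Drop for ProviderInner {
    fn drop(&mut self) {
        // Skips the work if an explicit shutdown already happened.
        let _ = self.shutdown_once();
    }
}

/// Cloning only bumps the `Arc` reference count; the heavy state is shared.
#[derive(Clone)]
pub struct Provider {
    inner: Arc<ProviderInner>,
}

impl Provider {
    pub fn new(shutdown_runs: Arc<AtomicUsize>) -> Self {
        Provider {
            inner: Arc::new(ProviderInner {
                is_shutdown: AtomicBool::new(false),
                shutdown_runs,
            }),
        }
    }

    /// Explicit shutdown; a second call on any clone reports an error.
    pub fn shutdown(&self) -> Result<(), String> {
        if self.inner.shutdown_once() {
            Ok(())
        } else {
            Err("TracerProvider already shutdown".to_string())
        }
    }
}
```

In this sketch the shutdown work runs exactly once, whether it is triggered by an explicit `shutdown` call or by dropping the last clone.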

///
/// The `TracerProvider` manages the lifecycle of span processors, which are responsible for
/// collecting, processing, and exporting spans. To ensure all spans are processed before shutdown,
/// users can call the [`force_flush`](TracerProvider::force_flush) method at any time to trigger
Member

Let's remove the force_flush mention here. I have seen many users calling force_flush in their code (and blocking their threads). Not sure why, but let's make sure the official docs don't recommend it.

Member Author

I have reworded it so that it doesn't read as a recommendation. I think it's better to at least document it, since we provide it.

/// ## Span Processing and Force Flush
///
/// The `TracerProvider` manages the lifecycle of span processors, which are responsible for
/// collecting, processing, and exporting spans. The [`force_flush`](TracerProvider::force_flush) method
/// invoked at any time will trigger an immediate flush of all pending spans (if any) to the exporters.
/// This will block the user thread till all the spans are passed to exporters

/// `TracerProvider` is lightweight container holding pointers to `SpanProcessor` and other components.
/// Cloning and dropping them will not stop the span processing. To stop span processing, users
/// must either call `shutdown` method explicitly, or drop every clone of `TracerProvider`.
/// `TracerProvider` is a lightweight container holding pointers to `SpanProcessor` and other components.
Member

Not introduced in this PR, but advertising TracerProvider as lightweight is incorrect and can lead users to create providers repeatedly instead of creating one once.

Member Author

I have updated the comment to remove the lightweight reference.


#[derive(Debug)]
struct CountingShutdownProcessor {
shutdown_count: Arc<Mutex<i32>>,
Member

nit: Atomics may be easier here.

Member Author

Done
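
As a sketch of what the atomics suggestion might look like (the merged test code may differ), the counter can be an `Arc<AtomicU32>` instead of an `Arc<Mutex<i32>>`. The `shutdown` method below is a plain inherent method standing in for the real `SpanProcessor::shutdown` trait method, which this example does not reproduce.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

#[derive(Debug, Clone)]
pub struct CountingShutdownProcessor {
    // A shared atomic counter; no lock or poisoning to handle.
    shutdown_count: Arc<AtomicU32>,
}

impl CountingShutdownProcessor {
    pub fn new(shutdown_count: Arc<AtomicU32>) -> Self {
        CountingShutdownProcessor { shutdown_count }
    }

    /// Stand-in for the processor's shutdown hook: records one invocation.
    pub fn shutdown(&self) {
        self.shutdown_count.fetch_add(1, Ordering::SeqCst);
    }
}
```

The test can then read the count with `load(Ordering::SeqCst)` and assert that shutdown ran exactly once.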

…tb/opentelemetry-rust into tracer-provider-drop-shutdown-check
@@ -36,36 +90,60 @@ static NOOP_TRACER_PROVIDER: Lazy<TracerProvider> = Lazy::new(|| TracerProvider
span_limits: SpanLimits::default(),
resource: Cow::Owned(Resource::empty()),
},
is_shutdown: AtomicBool::new(true),
Contributor

@utpilla utpilla Oct 21, 2024

Not added in this PR but why have we initialized the no-op tracer provider as an already shut down provider?

I know this wouldn't make much difference in functionality but semantically it would be weird if I call shutdown on the global provider and get an error saying it has already been shut down.

Member Author

@lalitb lalitb Oct 22, 2024

The naming of the global shutdown_tracer_provider is a bit confusing. It doesn't invoke shutdown; it only decrements the TracerProviderInner reference count. Once all the tracers are dropped, shutdown is invoked through Drop. release_tracer_provider would probably be a better name :)

And the user will never get hold of the NOOP_TRACER_PROVIDER instance, nor can they invoke shutdown on it directly or indirectly. It is only used internally to create the tracer instance after shutdown.
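
The release-versus-shutdown distinction described in this comment can be sketched as follows. This is a simplified model under stated assumptions, not the real global API: `GlobalSlot`, `Tracer`, `Provider`, and `Inner` are hypothetical names, and the slot is simply emptied on release, whereas the comment above describes the SDK using a no-op provider after shutdown.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Shared provider state; its `Drop` stands in for the real shutdown work.
struct Inner {
    drops: Arc<AtomicUsize>,
}

impl Drop for Inner {
    fn drop(&mut self) {
        // Runs only when the last `Arc<Inner>` reference goes away.
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

#[derive(Clone)]
pub struct Provider {
    inner: Arc<Inner>,
}

impl Provider {
    pub fn new(drops: Arc<AtomicUsize>) -> Self {
        Provider { inner: Arc::new(Inner { drops }) }
    }
}

/// A tracer keeps its provider alive by holding a clone of it.
pub struct Tracer {
    _provider: Provider,
}

/// Hypothetical model of the global provider slot.
pub struct GlobalSlot {
    current: Mutex<Option<Provider>>,
}

impl GlobalSlot {
    pub fn new(provider: Provider) -> Self {
        GlobalSlot { current: Mutex::new(Some(provider)) }
    }

    /// Hands out a tracer that holds its own reference to the provider.
    pub fn tracer(&self) -> Option<Tracer> {
        let guard = self.current.lock().unwrap();
        guard.as_ref().map(|p| Tracer { _provider: p.clone() })
    }

    /// Releases only the slot's reference; it does not call shutdown itself.
    pub fn release(&self) {
        drop(self.current.lock().unwrap().take());
    }
}
```

In this model, calling `release` while a tracer is still alive leaves the shared state untouched; the `Drop` logic runs only once that last tracer is dropped.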

Contributor

Okay, so then we could initialize the no-op tracer provider with is_shutdown as false?

Member Author

> Okay, so then we could initialize the no-op tracer provider with is_shutdown as false?

Making it false fails this test -


And this test made me realise that a user can access it through tracer.provider() and try to invoke shutdown on it. But feel free to raise an issue if there are any improvements to make here.

@cijothomas cijothomas merged commit a47b429 into open-telemetry:main Oct 23, 2024
25 checks passed