Crash: android.os.TransactionTooLargeException #9685

Closed
loremattei opened this issue Apr 23, 2019 · 54 comments · Fixed by #10240 or #20046

Comments

@loremattei
Contributor

This one started in 11.8 and has been becoming more prominent in 12.0 and 12.1.
It's currently the most common crash in Fabric.

Unfortunately, the crash logs don't say much about where it occurs, but Fabric suggests that it is often seen when transferring bitmaps between Activities, or when saving a large amount of state across Activity configuration changes.

We had other occurrences of it in the past: #5456.

Also, this thread facebook/react-native#19458 suggests that it may be related to React Native. The trend we see in Fabric (started in 11.8, grew a bit in 11.9, grew a lot in 12.0 and 12.1) seems to be consistent with the mobile Gutenberg deployment in the app.

android.os.BinderProxy.transactNative (BinderProxy.java)
android.os.Looper.loop (Looper.java:176)
android.app.ActivityThread.main (ActivityThread.java:6635)
java.lang.reflect.Method.invoke (Method.java)
com.android.internal.os.ZygoteInit.main (ZygoteInit.java:823)

5c93a36df8b88c2963f28894-fabric

@designsimply
Contributor

30-day impact: ~40 times a day
Latest version affected: 12.2
5c93a36df8b88c2963f28894-fabric-android

The trend we see in Fabric (started in 11.8, grew a bit in 11.9, grew a lot in 12.0 and 12.1) seems to be consistent with the mobile Gutenberg deployment in the app.

cc @koke to get input on priority since it was mentioned as possibly related to Gutenberg Mobile and also because it is currently the most common crash.

@koke
Member

koke commented Apr 29, 2019

I'm not super familiar with Android, but I looked at the logs for a bunch of crashes and the pattern I'm seeing is that it happens when the application is closing and it's trying to save its state. For instance:

11:18:30:394 (UTC) | D/CrashlyticsCore w/WordPress-UTILS: No valid URLs passed to URLFilteredWebViewClient! HTTP Links in the page are NOT disabled, and ALL URLs could be loaded by the user!!
11:18:30:689 (UTC) | D/CrashlyticsCore d/WordPress-NOTIFS: notifications pager > adapter saveState
11:27:17:484 (UTC) | D/CrashlyticsCore i/WordPress-UTILS: App goes to background
11:27:17:501 (UTC) | D/CrashlyticsCore i/WordPress-STATS: 🔵 Tracked: application_closed, Properties: {"last_visible_screen":"Notifications","time_in_app":569}

@designsimply
Contributor

designsimply commented May 9, 2019

30-day impact: ~41 per day
Users affected: 961
Last seen in: 12.3 (Sentry issue: WORDPRESS-ANDROID-23)

(5c93a36df8b88c2963f28894-fabric-android)

Note: in 12.3 we switched to using Sentry and this issue is showing up there but is far more common in older versions of Android so far (or we do not have enough data for 12.3 yet to compare).

@planarvoid
Contributor

This crash is indeed related to #5456 and it's happening because we're saving too much stuff into the savedInstanceState all over the place. I've found out the following:

TransactionTooLargeException occurs because the Parcel objects stored in the Binder transaction buffer exceed its limited size of 1 MB. The Binder transaction buffer is shared by all transactions in progress for the process. Consequently, this exception can be thrown when there are many transactions in progress, even when most of the individual transactions are of moderate size.
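
To illustrate why a parcel well under 1 MB can still fail, here is a deliberately simplified plain-Java model of a shared buffer. It is not how Binder allocates memory internally; `SharedTransactionBuffer` and its methods are hypothetical names used only for this sketch.

```java
// Simplified model of a fixed-size buffer shared by all in-flight transactions.
// Hypothetical class: it only illustrates the accounting, not Binder itself.
final class SharedTransactionBuffer {
    static final int CAPACITY_BYTES = 1024 * 1024; // the 1 MB limit quoted above

    private int inFlightBytes = 0;

    // Reserves space for a transaction; fails if the shared buffer cannot hold it.
    synchronized void begin(int parcelBytes) {
        if (inFlightBytes + parcelBytes > CAPACITY_BYTES) {
            throw new IllegalStateException("transaction too large: "
                    + parcelBytes + " bytes requested, "
                    + (CAPACITY_BYTES - inFlightBytes) + " bytes free");
        }
        inFlightBytes += parcelBytes;
    }

    // Releases the space once the transaction has completed.
    synchronized void end(int parcelBytes) {
        inFlightBytes -= parcelBytes;
    }
}
```

In this model a single 535,144-byte transaction fits, but a second one of the same size started before the first completes does not.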

I think this means that any place in the app could cause this issue (or any place where we store a lot of stuff into the savedInstanceState). The proper solution would be to cache the data and only save IDs into savedInstanceState (and I see that's been done in the NotificationsDetailActivity) and use ViewModels to keep the visible state.
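
A minimal sketch of the "save only IDs" idea, written as plain Java so it can run outside Android. The `Map<String, String>` stands in for the savedInstanceState Bundle, and `LargeStateCache`, `EditorScreen` and the `content_id` key are hypothetical names, not code from the app.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical process-wide cache that holds large payloads out of band.
final class LargeStateCache {
    private static final Map<String, String> CACHE = new HashMap<>();

    private LargeStateCache() { }

    // Stores the payload and returns the small ID that goes into saved state.
    static synchronized String put(String payload) {
        String id = UUID.randomUUID().toString();
        CACHE.put(id, payload);
        return id;
    }

    // Returns the payload, or null if the cache no longer has it.
    static synchronized String get(String id) {
        return CACHE.get(id);
    }
}

// Stand-in for an Activity or Fragment that owns a large piece of state.
final class EditorScreen {
    static final String KEY_CONTENT_ID = "content_id"; // hypothetical key

    String content = "";

    // Analogous to onSaveInstanceState: only the ID is written.
    void saveState(Map<String, String> outState) {
        outState.put(KEY_CONTENT_ID, LargeStateCache.put(content));
    }

    // Analogous to restoring from a non-null savedInstanceState.
    void restoreState(Map<String, String> savedState) {
        String id = savedState.get(KEY_CONTENT_ID);
        String cached = (id == null) ? null : LargeStateCache.get(id);
        content = (cached == null) ? "" : cached;
    }
}
```

Note that an in-memory cache like this one does not survive process death, so a real implementation would presumably back it with a database or files.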

@designsimply
Contributor

A mobile support request that came in during April and May may be an instance of this crash. The user reported the following steps to reproduce the issue they are facing:

  1. Create a new post using the Aztec editor in the app on an Android Tablet.
  2. Add 3000+ words.
  3. Add at least 6 YouTube video embeds.
  4. Add 45 to 85 images throughout the post.
  5. Try landscape and portrait modes.

I think the large number of images is the culprit in their case.

(internal references: p4a5px-2om-p2/#comment-9755 and 1953510-zen)

@designsimply
Contributor

Copying over steps to reproduce from a similar crash report at #5456 (comment).

Steps to reproduce:

  1. On the web, write a huge post. I used plain text only (460 KB and 1 MB).
  2. On your device enable "Don't keep activities".
  3. Open the post in Visual Editor.
  4. Press Home button to leave application.
  5. Notice crash.

@byencho

byencho commented Jun 6, 2019

As mentioned in #5456 (comment), if you can identify specific places where you put too much stuff in a savedInstanceState Bundle, you can use https://github.com/livefront/bridge as a (potentially temporary) workaround to avoid the crash. This works for Activities, Fragments, Views, ViewModels, etc. The library takes your Bundle data and saves it manually, so that the OS doesn't need to push that data across processes to save it itself (which is when the TransactionTooLargeException gets triggered).
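
The general idea of saving state manually, outside the Bundle, can be sketched in plain Java as below. This is not the Bridge API; `DiskStateStore` and its methods are hypothetical, and the sketch simply writes the payload to a file and hands back a short token that would be the only thing stored in the saved state.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper: persists large state to files in a given directory.
final class DiskStateStore {
    private final Path dir;

    DiskStateStore(Path dir) {
        this.dir = dir;
    }

    // Writes the payload to <dir>/<token>.state and returns the token.
    String save(String payload) throws IOException {
        String token = UUID.randomUUID().toString();
        Files.write(fileFor(token), payload.getBytes(StandardCharsets.UTF_8));
        return token;
    }

    // Reads the payload back, or returns null if the file no longer exists.
    String restore(String token) throws IOException {
        Path file = fileFor(token);
        if (!Files.exists(file)) {
            return null;
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Deletes the file once the owning screen is finished for good.
    void clear(String token) throws IOException {
        Files.deleteIfExists(fileFor(token));
    }

    private Path fileFor(String token) {
        return dir.resolve(token + ".state");
    }
}
```

Because the data lives on disk, it survives process death, at the cost of file I/O and of having to clean up stale files.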

@designsimply
Contributor

30-day impact: ~37 per day
Users affected: 885 in the last 30d
First seen in: 11.8
Last seen in: 12.7.1 (latest major release at the time of this report)

https://sentry.io/share/issue/ed6aca84b5d74e40b016c9472defe828/

@antonis
Contributor

antonis commented Oct 26, 2023

Reopening the issue since there are still occurrences in the latest release 23.4 (ref https://a8c.sentry.io/share/issue/ed6aca84b5d74e40b016c9472defe828/)

Pasting the stack trace:

android.os.TransactionTooLargeException: data parcel size 535144 bytes
    at android.os.BinderProxy.transactNative(BinderProxy.java)
    at android.os.BinderProxy.transact(BinderProxy.java:662)
    at android.app.IActivityClientController$Stub$Proxy.activityStopped(IActivityClientController.java:1309)
    at android.app.ActivityClient.activityStopped(ActivityClient.java:85)
    at android.app.servertransaction.PendingTransactionActions$StopInfo.run(PendingTransactionActions.java:143)
    at android.os.Handler.handleCallback(Handler.java:942)
    at android.os.Handler.dispatchMessage(Handler.java:99)
    at android.os.Looper.loopOnce(Looper.java:226)
    at android.os.Looper.loop(Looper.java:313)
    at android.app.ActivityThread.main(ActivityThread.java:8762)
    at java.lang.reflect.Method.invoke(Method.java)
    at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:604)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1067)
java.lang.RuntimeException: android.os.TransactionTooLargeException: data parcel size 535144 bytes
    at android.app.ActivityClient.activityStopped(ActivityClient.java:88)
    at android.app.servertransaction.PendingTransactionActions$StopInfo.run(PendingTransactionActions.java:143)
    at android.os.Handler.handleCallback(Handler.java:942)
    at android.os.Handler.dispatchMessage(Handler.java:99)
    at android.os.Looper.loopOnce(Looper.java:226)
    at android.os.Looper.loop(Looper.java:313)
    at android.app.ActivityThread.main(ActivityThread.java:8762)
    at java.lang.reflect.Method.invoke(Method.java)
    at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:604)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1067)

@antonis antonis reopened this Oct 26, 2023
@fluiddot
Contributor

Sentry events analysis

I couldn't find any pattern after reviewing different Sentry events and sessions. However, I did notice that all of them are Atomic sites. This might be a clue although I couldn't reproduce the crash when testing an Atomic site.

As shared internally by @antonis (pcdRpT-4n3-p2#comment-7328), this crash has experienced a dramatic increase in version 23.5. However, none of the changes introduced in either the app (reference) or the editor (reference) seem to be related to this.

Debugging stack trace

The only area of the app that can be associated with the crash is the editor, as the stack trace points to the function EditPostActivity.saveInstanceState. I've analyzed the data saved in the state using toolargetool in order to identify the culprit of this crash. Here are the results:

Steps:

  1. Create a post.
  2. Add different blocks.
  3. Add an Image block.
  4. In the "Choose image" bottom sheet, select "Choose from device".
  5. Dismiss the media picker and repeat step 4.

For the above scenario, I found that the saved state can easily reach between 200 - 300 KB. I identified two main actors for the unexpected size:

Theme data

Theme data is included as arguments for the two Gutenberg fragments (GutenbergEditorFragment and GutenbergContainerFragment). As part of this data, we have two properties (rawFeatures and rawStyles) that hold a long JSON string. Per the tests I performed using different themes, this adds around 64 KB per fragment, i.e. 128 KB in total.

I tried to increase the data size by customizing a theme but this didn't end up incurring an extra size. I thought this might be a theme-specific issue. However, after testing the same themes used by users who experienced the issue, none of them have led to the crash.

NOTE: The key used to hold the data in the bundle is param_gutenberg_props_builder.editorTheme.

State of views used within the editor

I noticed that the instance state grows depending on the content of the post:

  • +41 KB for an empty image and one paragraph.
  • +81 KB for an empty image and 15 paragraphs.
  • +117 KB for an empty image and 30 paragraphs.

This data is in the state of GutenbergEditorFragment with the key android:view_state in the form of an array. In the case of having 15 paragraphs, I saw the array can be really long with over 800 items. Some of them are generic and I couldn't identify the origin, but others referenced Aztec.
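
A per-key breakdown like the one above can be modelled in plain Java as follows. This is only a sketch: `Map<String, byte[]>` stands in for a Bundle whose values have already been serialized, and `StateSizeReport` is a hypothetical name. On Android the sizes would instead come from writing the values to a Parcel and reading its data size.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that reports how many bytes each state key contributes.
final class StateSizeReport {
    private StateSizeReport() { }

    // Returns key -> size in bytes, ordered from largest to smallest.
    static Map<String, Integer> sizesByKey(Map<String, byte[]> state) {
        List<Map.Entry<String, byte[]>> entries = new ArrayList<>(state.entrySet());
        entries.sort(Comparator.comparingInt(
                (Map.Entry<String, byte[]> e) -> e.getValue().length).reversed());
        Map<String, Integer> report = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : entries) {
            report.put(entry.getKey(), entry.getValue().length);
        }
        return report;
    }

    // Returns the combined size of all values in bytes.
    static int totalBytes(Map<String, byte[]> state) {
        int total = 0;
        for (byte[] value : state.values()) {
            total += value.length;
        }
        return total;
    }
}
```

Sorting by size makes the heaviest keys (here, most likely `android:view_state` and the theme data) show up first.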

I'm not sure whether we really need to keep the state of Aztec views, so it might be worth exploring how to remove it in order to reduce the state size.

TransactionTooLargeException

Oddly, after testing different scenarios, I couldn't manage to make the instance state big enough to raise this exception. I'm wondering if the previous state upon entering the editor might be implicated. Based on the error logs, several of the crashes happen when reaching 500 KB, so taking into account that the editor easily takes 250 KB, the other half might be coming from other views/fragments. In any case, I haven't confirmed this hypothesis and we'd need to investigate further.
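
The rough arithmetic behind this estimate can be written out as below. It is a sketch that combines figures reported in this thread and assumes the view-state figure is additive to the theme data, which has not been confirmed. `StateBudget` is a hypothetical name.

```java
// Back-of-the-envelope budget using figures reported in this thread.
final class StateBudget {
    static final int KB = 1024;

    static final int THEME_DATA_PER_FRAGMENT = 64 * KB;   // rawFeatures + rawStyles
    static final int GUTENBERG_FRAGMENTS = 2;             // editor + container fragment
    static final int VIEW_STATE_30_PARAGRAPHS = 117 * KB; // empty image + 30 paragraphs
    static final int CRASH_PARCEL_BYTES = 535_144;        // from the pasted stack trace

    private StateBudget() { }

    // Assumption: theme data and view state simply add up.
    static int editorBytes() {
        return THEME_DATA_PER_FRAGMENT * GUTENBERG_FRAGMENTS + VIEW_STATE_30_PARAGRAPHS;
    }

    // Portion of the crashing parcel not explained by the editor figures.
    static int unexplainedBytes() {
        return CRASH_PARCEL_BYTES - editorBytes();
    }

    public static void main(String[] args) {
        System.out.println("editor: " + editorBytes() / KB + " KB");
        System.out.println("unexplained: " + unexplainedBytes() / KB + " KB");
    }
}
```

Under these assumptions the editor accounts for 245 KB, which leaves roughly 277 KB of the 535,144-byte parcel unaccounted for.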

Version 23.5 vs version 23.4

I performed the same analysis on both versions 23.4 and 23.5 and got the same results.

Next Steps

  • Test Atomic sites thoroughly and perform stress tests in the editor. The idea is to find a consistent way to reproduce the crash.
  • Explore options to remove theme data from Gutenberg fragments.
  • Investigate further views under the editor that are saving their state (e.g. Aztec).

@antonis
Contributor

antonis commented Nov 10, 2023

Thank you for the detailed investigation and report @fluiddot 🙇

after testing different scenarios, I couldn't manage to make the instance state big enough to raise this exception.

Same. I wasn't able to reproduce a crash.

I'm not sure whether we really need to keep the state of Aztec views, so it might be worth exploring how to remove it in order to reduce the state size.

Good idea 👍

Since we cannot predict the size of the data on each site I think our best bet would be to replace the current mechanism that passes the data in the Bundle with an implementation that saves everything (or at least the large data) in the database.

@fluiddot
Contributor

Since we cannot predict the size of the data on each site I think our best bet would be to replace the current mechanism that passes the data in the Bundle with an implementation that saves everything (or at least the large data) in the database.

I totally agree. A comment in this issue (#9685 (comment)) also pointed out a similar solution by using the library https://github.com/livefront/bridge.

@antonis
Contributor

antonis commented Jan 30, 2024

  1. The TransactionTooLargeException also occurs in WPWebViewActivity

Reopening since this is not covered yet

@antonis antonis reopened this Jan 30, 2024
@antonis
Contributor

antonis commented Feb 2, 2024

  1. The TransactionTooLargeException also occurs in WPWebViewActivity

I haven't managed to reproduce this issue on my device but there are reports (Chromium issue) that the WebView.saveState (used in the parent activity WebViewActivity) might be the root of the crash.
Note that the behaviour of this method has changed since this code was introduced 9 years ago and the method no longer stores the display data for the WebView which might make it obsolete.
Similar to this SO report, I experimented with removing this call (see the patch below) and didn't notice any difference in the WebView's behaviour in my (not so extensive) tests.

diff --git a/WordPress/src/main/java/org/wordpress/android/ui/WebViewActivity.java b/WordPress/src/main/java/org/wordpress/android/ui/WebViewActivity.java
index 8770f417b37..2cd5bb5fc55 100644
--- a/WordPress/src/main/java/org/wordpress/android/ui/WebViewActivity.java
+++ b/WordPress/src/main/java/org/wordpress/android/ui/WebViewActivity.java
@@ -80,24 +80,6 @@ public abstract class WebViewActivity extends LocaleAwareActivity {
         }
     }
 
-    /*
-     * save the webView state with the bundle so it can be restored
-     */
-    @Override
-    protected void onSaveInstanceState(Bundle outState) {
-        mWebView.saveState(outState);
-        super.onSaveInstanceState(outState);
-    }
-
-    /*
-     * restore the webView state saved above
-     */
-    @Override
-    protected void onRestoreInstanceState(Bundle savedInstanceState) {
-        super.onRestoreInstanceState(savedInstanceState);
-        mWebView.restoreState(savedInstanceState);
-    }
-
     @Override
     protected void onPause() {
         super.onPause();

A fix like the above would require extensive testing since it would affect the SSLCertsViewActivity and the WPWebViewActivity, which is used across the app and is also extended by: DomainRegistrationCheckoutWebViewActivity, DomainManagementDetailsActivity, ThemeWebActivity, SupportWebViewActivity and JetpackConnectionWebViewActivity.

Considering the above, and that I haven't managed to reproduce the crash to validate this as a fix, I haven't proceeded with a PR for this approach yet.

Another (maybe safer) hacky alternative would be to remove the WEBVIEW_CHROMIUM_STATE from the saved Bundle when it exceeds a certain threshold. Something like this is also used in the Chromium source code.
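
A minimal sketch of that threshold check, in plain Java so it runs outside Android. The `Map<String, byte[]>` stands in for the saved Bundle, the sketch assumes the WebView state is stored as a byte array under WEBVIEW_CHROMIUM_STATE, and both `WebViewStateTrimmer` and the 300 KB threshold are hypothetical choices rather than values taken from the app or from Chromium.

```java
import java.util.Map;

// Hypothetical helper: drops the WebView state when it is too large to save safely.
final class WebViewStateTrimmer {
    static final String WEBVIEW_CHROMIUM_STATE = "WEBVIEW_CHROMIUM_STATE";

    // Assumed threshold; a real value would need to be tuned against crash data.
    static final int MAX_STATE_BYTES = 300 * 1024;

    private WebViewStateTrimmer() { }

    // Removes the WebView state if it exceeds the threshold.
    // Returns true when the entry was removed.
    static boolean trimIfTooLarge(Map<String, byte[]> outState) {
        byte[] webViewState = outState.get(WEBVIEW_CHROMIUM_STATE);
        if (webViewState != null && webViewState.length > MAX_STATE_BYTES) {
            outState.remove(WEBVIEW_CHROMIUM_STATE);
            return true;
        }
        return false;
    }
}
```

In an Activity this check would run in onSaveInstanceState after mWebView.saveState(outState), so small states are still restored while oversized ones are dropped and the page is reloaded instead.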


sentry-io bot commented Feb 7, 2024

Sentry issue: JETPACK-ANDROID-7Y4


sentry-io bot commented Feb 7, 2024

Sentry issue: JETPACK-ANDROID-8HV

@antonis
Contributor

antonis commented Feb 12, 2024

Marking as resolved in 24.2 with #19747, #20046 and #20139. I'll keep an eye out for any regressions in Sentry.

@antonis antonis closed this as completed Feb 12, 2024

sentry-io bot commented Feb 12, 2024

Sentry issue: WORDPRESS-ANDROID-2V8H

@antonis
Contributor

antonis commented Feb 23, 2024

Marking as resolved in 24.2 with #19747, #20046 and #20139. I'll keep an eye out for any regressions in Sentry.

With 24.2 fully rolled out, 6 crashes by 3 users were recorded this week, all of them in the WordPress app editor on low-end devices. From the logs, the following might indicate the usage of Aztec 🤔

  • e/WordPress-EDITOR: HTML content of Aztec Editor before the crash:
  • editor_opened, Properties: {"blog_id":...,"post_id":..,"has_gutenberg_blocks":false,"site_type":"blog","post_type":"post","post_format":"standard","is_jetpack":false,"editor_has_hw_disabled":"0"}

I'll keep monitoring the issue and iterate with additional fixes if needed.

@antonis antonis reopened this Feb 23, 2024

sentry-io bot commented Mar 14, 2024

Sentry Issue: WORDPRESS-ANDROID-2QCB


sentry-io bot commented Mar 14, 2024

Sentry Issue: JETPACK-ANDROID-E9W


sentry-io bot commented Mar 14, 2024

Sentry Issue: WORDPRESS-ANDROID-2TD8

@antonis
Contributor

antonis commented Mar 14, 2024

I've linked a few more occurrences of TransactionTooLargeException. In the past 30 days, in the latest public versions of the app (24.2 and 24.3.1, which include the fixes from #9685 (comment)), 12 users encountered 21 crashes in total.


sentry-io bot commented Mar 28, 2024

Sentry Issue: JETPACK-ANDROID-MKP

@dangermattic
Collaborator

Thanks for reporting! 👍

@antonis
Contributor

antonis commented Mar 28, 2024

Given that the number of TransactionTooLargeException-related crashes after 24.2 is low, I lowered the priority to medium. I'll keep monitoring and fix issues as needed.

[Three screenshots attached, dated 2024-03-28.]


sentry-io bot commented Mar 28, 2024

Sentry Issue: WORDPRESS-ANDROID-2W93


sentry-io bot commented Apr 4, 2024

Sentry Issue: JETPACK-ANDROID-P5N

@antonis
Contributor

antonis commented Apr 12, 2024

I'm marking this as resolved as there are no occurrences in the latest version. I'll keep an eye out for regressions.

@antonis
Contributor

antonis commented May 30, 2024

Opened a separate issue for the MediaPreviewActivity related crashes #20913
