Add cache mechanism to sherpa tts #1732

mah92 · 2025-01-17T16:04:01Z

Latency is a major drawback of sherpa TTS on Android phones; it makes the engine unusable for average blind users on average phones. By caching the most frequent TTS requests, sherpa responds almost instantly most of the time while using much less CPU and battery. Some blind individuals have tested this upgrade and have been happy :)

Guidelines followed:

The app does not cache user texts, due to security and privacy concerns. Only a hash of each text is saved.
Usage statistics for the texts (hashes) are kept in a list and stored periodically (every 10 minutes). If the cache space fills up, the less-used wav files are removed.
A setting for the cache size and the ability to remove cached wav files are added in MainActivity.
Existing JNI functions are kept intact where possible, for compatibility reasons.
When the speed changes, the cache could be cleared automatically so that cached audio matches the new speed. This is not implemented yet, since there is a clear button in MainActivity.kt.
The code was scanned for possible bugs with DeepSeek AI.
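The hashing and eviction guidelines above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the class name `HashedAudioCacheSketch`, its methods, the in-memory storage (the PR stores wav files on disk), and the use of `std::hash` are all hypothetical, and the periodic (every 10 minutes) persistence of usage statistics is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch; none of these names come from the PR.
class HashedAudioCacheSketch {
 public:
  explicit HashedAudioCacheSketch(std::size_t max_bytes)
      : max_bytes_(max_bytes) {}

  // Only the hash of the text is used as a key; the text is never stored.
  // std::hash is an assumption; a real cache may want a stronger hash
  // to make collisions between different texts less likely.
  static std::size_t Key(const std::string &text) {
    return std::hash<std::string>{}(text);
  }

  // Returns the cached samples, or nullptr on a miss.
  // A hit increments the usage count of the entry.
  const std::vector<float> *Get(const std::string &text) {
    auto it = entries_.find(Key(text));
    if (it == entries_.end()) return nullptr;
    ++it->second.use_count;
    return &it->second.samples;
  }

  // Stores samples for a text, evicting the least-used entries if needed.
  void Put(const std::string &text, std::vector<float> samples) {
    const std::size_t bytes = samples.size() * sizeof(float);
    if (bytes > max_bytes_) return;  // can never fit, so skip caching

    const std::size_t key = Key(text);
    auto it = entries_.find(key);
    if (it != entries_.end()) {  // replace an existing entry
      used_bytes_ -= it->second.samples.size() * sizeof(float);
      entries_.erase(it);
    }
    while (used_bytes_ + bytes > max_bytes_) EvictLeastUsed();

    used_bytes_ += bytes;
    entries_.emplace(key, Entry{std::move(samples), 1});
  }

  std::size_t UsedBytes() const { return used_bytes_; }
  std::size_t NumEntries() const { return entries_.size(); }

 private:
  struct Entry {
    std::vector<float> samples;
    std::int64_t use_count;
  };

  // Linear scan for the entry with the smallest usage count.
  // Only called when the cache is non-empty.
  void EvictLeastUsed() {
    auto victim = entries_.begin();
    for (auto it = entries_.begin(); it != entries_.end(); ++it) {
      if (it->second.use_count < victim->second.use_count) victim = it;
    }
    used_bytes_ -= victim->second.samples.size() * sizeof(float);
    entries_.erase(victim);
  }

  std::size_t max_bytes_;
  std::size_t used_bytes_ = 0;
  std::unordered_map<std::size_t, Entry> entries_;
};
```

The linear scan keeps the sketch short; a real implementation would likely keep the usage statistics in a structure that makes finding the least-used entry cheaper.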

mah92 · 2025-01-17T16:40:43Z

csukuangfj · 2025-01-20T01:16:33Z

Thank you for your contribution!

Will review it.

android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/MainActivity.kt

sherpa-onnx/csrc/offline-tts-cache-mechanism.h

sherpa-onnx/csrc/CMakeLists.txt

sherpa-onnx/csrc/offline-tts.h

csukuangfj · 2025-01-22T04:18:36Z

sherpa-onnx/csrc/offline-tts.h

@@ -32,6 +33,9 @@ struct OfflineTtsConfig {
  // If you set it to -1, then we process all sentences in a single batch.
  int32_t max_num_sentences = 1;

+  // Path to cache_directory
+  std::string cache_dir;


Please create a new config for cache.

What do you mean exactly?

I can remove cache_dir by calling the constructor from within GetOfflineTtsConfig. Is it acceptable?
If not, please give more details.

For example, OfflineTts has a config called OfflineTtsConfig.

You can create a struct called CacheMechanismConfig and put cache_dir in it. In this way, you don't need to add a new field to OfflineTtsConfig. You also don't need to add new methods to OfflineTts. (Note you need to wrap CacheMechanism and its config to Kotlin via jni)

Remember that sherpa-onnx currently supports 12 programming languages. If we change the config in C++, we also need to update APIs of other programming languages.

You can create a smart pointer of CacheMechanism and pass it to the constructor of OfflineTts.

If later we need to extend the CacheMechanism, we can leave OfflineTts and OfflineTtsConfig untouched. We only need to change CacheMechanism or/and CacheMechanismConfig.

By the way, since the CacheMechanism is related to tts, can you rename it to OfflineTtsCacheMechanism?

In general, the filename of a class contains the class name, although the filename uses dashes and the class name does not.

Methods of CacheMechanism don't need to be repeated in OfflineTts.
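A minimal sketch of the layout suggested in this thread, assuming stand-in types: everything lives in a hypothetical `sketch` namespace and is not the real sherpa-onnx API. The `cache_size_bytes` field, its default of 0, and the choice of `std::shared_ptr` (so that the caller can keep its own handle to the cache) are assumptions made for illustration; the thread only asks for "a smart pointer".

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

namespace sketch {

// Stand-in for the existing TTS config. It gains no cache-related fields.
struct OfflineTtsConfig {
  std::int32_t max_num_sentences = 1;
};

// Separate config that owns everything related to caching.
struct OfflineTtsCacheMechanismConfig {
  std::string cache_dir;
  std::int64_t cache_size_bytes = 0;  // hypothetical field and default
};

class OfflineTtsCacheMechanism {
 public:
  explicit OfflineTtsCacheMechanism(OfflineTtsCacheMechanismConfig config)
      : config_(std::move(config)) {}

  const OfflineTtsCacheMechanismConfig &GetConfig() const { return config_; }

 private:
  OfflineTtsCacheMechanismConfig config_;
};

class OfflineTts {
 public:
  // Existing-style constructor: no cache involved.
  explicit OfflineTts(const OfflineTtsConfig &config) : config_(config) {}

  // Overloaded constructor: the cache is built outside and passed in.
  OfflineTts(const OfflineTtsConfig &config,
             std::shared_ptr<OfflineTtsCacheMechanism> cache)
      : config_(config), cache_(std::move(cache)) {}

  bool HasCache() const { return cache_ != nullptr; }

 private:
  OfflineTtsConfig config_;
  std::shared_ptr<OfflineTtsCacheMechanism> cache_;
};

}  // namespace sketch
```

With this shape, extending the cache later only touches the cache class and its config; the TTS config, and therefore the bindings for other languages, stay unchanged.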

csukuangfj · 2025-01-22T04:21:45Z

sherpa-onnx/csrc/offline-tts.h

+  // Return the maximum number of cached audio files size
+  int32_t CacheSize() const;
+
+  // Set the maximum number of cached audio files size
+  void SetCacheSize(const int32_t cache_size);
+
+  // Remove all cache data
+  void ClearCache();
+
+  // To get total used cache size(for wav files) in bytes
+  int64_t GetTotalUsedCacheSize();


Please remove these methods from OfflineTts.

You can pass a smart pointer of CacheMechanism to the constructor of OfflineTts.
(You can add an overloaded constructor for OfflineTts.)

I did not get how.
You mean I directly call functions in CacheMechanism from jni functions themselves?

Yes. CacheMechanism is constructed outside of OfflineTts.

OfflineTts is only for converting text to speech. Anything related to cache should be put in a separate class, i.e., please don't repeat the methods of CacheMechanism in OfflineTts.
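One way to read this answer is sketched below: the cache-management methods stay on the cache class, and the binding layer forwards to them through a handle, without going through OfflineTts. The class `CacheMechanismSketch`, the `Binding...` functions, and the `RecordWav` helper are hypothetical; passing the object as an integer handle (as a JNI function would receive it in a `jlong`) is an assumption about the binding, and real JNI signatures are omitted so that the sketch compiles without `jni.h`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the cache class. The four management methods
// mirror the names in the diff above; the bodies are placeholders.
class CacheMechanismSketch {
 public:
  explicit CacheMechanismSketch(std::int32_t cache_size)
      : cache_size_(cache_size) {}

  std::int32_t CacheSize() const { return cache_size_; }
  void SetCacheSize(std::int32_t cache_size) { cache_size_ = cache_size; }
  void ClearCache() { used_bytes_ = 0; }
  std::int64_t GetTotalUsedCacheSize() const { return used_bytes_; }

  // Hypothetical helper that simulates storing a wav file of `bytes` bytes.
  void RecordWav(std::int64_t bytes) { used_bytes_ += bytes; }

 private:
  std::int32_t cache_size_;
  std::int64_t used_bytes_ = 0;
};

// Converts the integer handle held by the binding layer back to a pointer.
inline CacheMechanismSketch *FromHandle(std::intptr_t handle) {
  return reinterpret_cast<CacheMechanismSketch *>(handle);
}

// Each function plays the role of one binding entry point. None of them
// touches the TTS object; they talk to the cache directly.
void BindingSetCacheSize(std::intptr_t handle, std::int32_t cache_size) {
  FromHandle(handle)->SetCacheSize(cache_size);
}

void BindingClearCache(std::intptr_t handle) {
  FromHandle(handle)->ClearCache();
}

std::int64_t BindingGetTotalUsedCacheSize(std::intptr_t handle) {
  return FromHandle(handle)->GetTotalUsedCacheSize();
}
```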

csukuangfj · 2025-01-22T04:22:02Z

sherpa-onnx/csrc/offline-tts.h

+  OfflineTtsConfig config_;
  std::unique_ptr<OfflineTtsImpl> impl_;
+  std::unique_ptr<CacheMechanism> cache_mechanism_;


Suggested change

-  OfflineTtsConfig config_;
-  std::unique_ptr<OfflineTtsImpl> impl_;
-  std::unique_ptr<CacheMechanism> cache_mechanism_;
+  std::unique_ptr<OfflineTtsImpl> impl_;

I do not get it. If CacheMechanism is to be added to the constructor of OfflineTts, aren't these lines still needed?

csukuangfj · 2025-01-22T04:27:13Z

Please use
https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/check_style_cpplint.sh
to check the style of your C++ code.

You can refer to

sherpa-onnx/scripts/check_style_cpplint.sh

Lines 19 to 26 in 66e02d8

    
# (1) To check files of the last commit
#  ./scripts/check_style_cpplint.sh
#
# (2) To check changed files not committed yet
#  ./scripts/check_style_cpplint.sh 1
#
# (3) To check all files in the project
#  ./scripts/check_style_cpplint.sh 2

You can use

pip install clang-format==12.0.1

to install clang-format and use it to auto-format your code.

mah92 · 2025-01-22T04:59:19Z

Thank you, I will correct tomorrow.
I wonder if it is better to remove the clear button and remove the cache automatically when the speed is changed. By the way the clear button seems to be displaced after merging to the newer commit.

csukuangfj · 2025-01-22T05:01:31Z

I wonder if it is better to remove the clear button and remove the cache automatically when the speed is changed

Yes, I agree.

By the way, you can even set a default cache size. Users don't need to set the cache size through the UI.
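The two points agreed on here (clear the cache automatically when the speed changes, and ship a default cache size) could look roughly like this. It is a sketch under assumptions: `SpeedAwareCacheSketch`, `OnSpeedChanged`, and the constant `kDefaultCacheSizeMbSketch` with its value of 20 are hypothetical placeholders, not names or numbers taken from the PR's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical default; the real value would be chosen by the project.
constexpr std::int32_t kDefaultCacheSizeMbSketch = 20;

class SpeedAwareCacheSketch {
 public:
  // Users who never open the settings get the default size.
  explicit SpeedAwareCacheSketch(
      std::int32_t max_size_mb = kDefaultCacheSizeMbSketch)
      : max_size_mb_(max_size_mb) {}

  // Cached audio was generated at the old speed, so it is stale as soon
  // as the speed changes. Clearing here replaces a manual clear button.
  void OnSpeedChanged(float new_speed) {
    if (new_speed != speed_) {
      entries_.clear();
      speed_ = new_speed;
    }
  }

  // Records that audio for the given text hash is cached (payload omitted).
  void Add(std::size_t text_hash) { entries_.insert(text_hash); }

  std::size_t NumEntries() const { return entries_.size(); }
  std::int32_t MaxSizeMb() const { return max_size_mb_; }
  float Speed() const { return speed_; }

 private:
  std::int32_t max_size_mb_;
  float speed_ = 1.0f;
  std::unordered_set<std::size_t> entries_;
};
```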

mah92 · 2025-01-22T05:02:11Z

Another proposal: whatever we do in the model, there still remain some words that are not spelled correctly (like keyboard letters). Would it be good to add them permanently to the cache so that they are read from the cache instead of the model?

mah92 · 2025-01-22T05:05:46Z

About the cache slider, I think it is good to keep it for a while and then rethink based on user feedback. I know a value between 20 and 200 MB works well, but I need feedback to fix it (there may not be a single optimal value; it may depend on the phone). I also think users may be happier knowing the cache size and would accept the extra memory usage more easily. Seeing the slider is also better for privacy concerns.

csukuangfj · 2025-01-22T05:59:47Z

the cache instead of the model?

Ok, it is fine with me.

csukuangfj · 2025-01-22T06:00:22Z

Another proposal: whatever we do in the model, there still remain some words that are not spelled correctly (like keyboard letters). Would it be good to add them permanently to the cache so that they are read from the cache instead of the model?

Please make the cache do only 1 thing.

You can create a separate PR to add it.

mah92 · 2025-01-23T13:23:55Z

Please use https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/check_style_cpplint.sh to check the style of your C++ code.
...

done.

integrated caching feature into last commit

7d35712

mah92 and others added 2 commits January 17, 2025 20:18

Merge branch 'master' into master

de504ee

Speed up least repeated txt search

43e0262

mah92 mentioned this pull request Jan 19, 2025

Add cache mechanism to sherpa tts #1734

Open

csukuangfj requested changes Jan 22, 2025


Your Name added 2 commits January 23, 2025 11:48

Done some of the reviewer's request

b7c91b4

cpplint passed

42d6d24
