Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK-321: Optimizing SDK Cache Management #255

Merged
merged 4 commits into from
Oct 16, 2023

Conversation

wikumChamith
Copy link
Member

Description of what I changed

  • Add the enableDataSaving parameter to the setup-sdk plugin. This parameter enables the data-saving mode. When using this mode, the SDK will reuse the existing cache folder rather than creating a new one.

  • In normal mode, SDK will remove the cache folder after running the setup.

Issue I worked on

see https://issues.openmrs.org/browse/SDK-321

Checklist: I completed these to help reviewers :)

  • My IDE is configured to follow the code style of this project.

    No? Unsure? -> configure your IDE, format the code and add the changes with git add . && git commit --amend

  • I have added tests to cover my changes. (If you refactored
    existing code that was well tested you do not have to add tests)

    No? -> write tests and add them to this commit git add . && git commit --amend

  • I ran mvn clean install right before creating this pull request and
    added all formatting changes to my commit.

    No? -> execute the above command

  • All new and existing tests passed.

    No? -> figure out why and add the fix to your commit. It is your responsibility to make sure your code works.

  • My pull request is based on the latest changes of the master branch.

    No? Unsure? -> execute command git pull --rebase upstream master

Add the enableDataSaving parameter to the setup-sdk plugin. This parameter enables the data-saving mode. When using this mode, the SDK will reuse the existing cache folder rather than creating a new one.

In normal mode, SDK will remove the cache folder after running the setup.
Add the enableDataSaving parameter to the setup-sdk plugin. This parameter enables the data-saving mode. When using this mode, the SDK will reuse the existing cache folder rather than creating a new one.

In normal mode, SDK will remove the cache folder after running the setup.

@Mojo(name = "setup-sdk", requiresProject = false)
public class SetupSDK extends AbstractTask {

@Parameter(defaultValue = "false", property = "enableDataSaving")
private boolean enableDataSaving;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should change the above name to something that reflects what exactly the effects of this flag will be on the actual implementation.

boolean isDataSavingMode = Boolean.parseBoolean(sdkProperties.getProperty("enableDataSaving"));
String cacheName = "openmrs-sdk-node";
if (isDataSavingMode) {
logger.info("OpenMRS SDK is running on data saving mode");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not call this a data saving mode because it may be used for other reasons like performance instead of just saving data.


@Mojo(name = "setup-sdk", requiresProject = false)
public class SetupSDK extends AbstractTask {

@Parameter(defaultValue = "true", property = "removeCacheAfterSetup")
private boolean removeCacheAfterSetup;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some documentation to this parameter?

File tempDir = new File(tempDirPath);
if(tempDir.exists() && tempDir.isDirectory()) {
File[] cache = tempDir.listFiles(file -> file.isDirectory() && file.getName().startsWith("openmrs-sdk-node"));
if(cache == null) return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pom.xml Show resolved Hide resolved
boolean isRemoveCacheAfterSetup = Boolean.parseBoolean(sdkProperties.getProperty("removeCacheAfterSetup"));
String cacheName = "openmrs-sdk-node";
if (!isRemoveCacheAfterSetup) {
logger.info("OpenMRS SDK will reuse the NPM cache. Run openmrs-sdk:setup-sdk -DremoveCacheAfterSetup=true to disable it.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we be consistent? The description says reuse npm cache but the method name says remove cache.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename the parameter as "ReuseNpmCache" ??

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think is better?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think logger info is consistent with implementation. If isRemoveCacheAfterSetup is false the NPM will reuse NPM caches without deleting them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something along the lines of reuseNodeCache?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the parameter name

if (cache == null) {
return null;
}
if (cache.length > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to the rest of the caches?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we delete them??

Copy link
Member

@dkayiwa dkayiwa Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they are more than one, will the first one (at index 0) be consistently the same for subsequent calls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Member Author

@wikumChamith wikumChamith Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dkayiwa what do you suggest we should do here? We could delete the other caches.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

This actually isn't guaranteed. We're operating on files in the order returned from listFiles() which has this caveat:

"There is no guarantee that the name strings in the resulting array will appear in any specific order; they are not, in particular, guaranteed to appear in alphabetical order. "

Similarly, createTempDirectory() says:

"The details as to how the name of the directory is constructed is implementation dependent and therefore not specified."

In order to guarantee that this always returns the same folder, we'd need a guarantee that listFiles() returned files in a consistent order and that createTempDirectory() generates files in a defined order that will fit the order returned by listFiles(), but those details are specifically disclaimed.

(On Windows, the default JVM does indeed use the tick count as the source of randomness for createTempDirectory() and listFiles() likely returns in a consistent order, which would likely result in this appearing to work; however, Linux, the underlying mkstemp() implementation will try to use a RNG, falling back to ticks if not available on that device. I haven't looked into the details of macOS.)

Probably the thing to do is to make the tempDir a class variable instead of an instance variable so it's only set once per JVM, which would guarantee consistent results across multiple invocations of the NodeHelper within a single invocation of the SDK.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, actually, I think a better strategy is: if the user wants to use a consistent NPM cache directory, use a named folder in the directory the SDK "servers directory" (defined here). Just name it something unlikely to be a server name, e.g., _openmrs_sdk_node_cache. That's cleaner, easier for end-users to understand and nobody has to go poking around for the right "openmrs-sdk-node.1234567" folder.

* By default, this is set to 'true' to save storage space.
* If set to 'false', the SDK will reuse the existing NPM cache
*/
@Parameter(defaultValue = "true", property = "removeCacheAfterSetup")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the default be what it was before these changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the default value and the code to reflect the property name.


private static final Logger logger = LoggerFactory.getLogger(NodeHelper.class);

public Path getTempDir() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be public? Also, coding style, accessor properties should always come after constructors.

this.mavenProject = mavenProject;
this.mavenSession = mavenSession;
this.pluginManager = pluginManager;
try {
this.tempDir = Files.createTempDirectory("openmrs-sdk-node");
this.tempDir.toFile().deleteOnExit();
Properties sdkProperties = getSdkProperties();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's preferrable, I think to move this code into getTempDir() that way we can defer creating a temporary directory until we actually need it.

this.tempDir.toFile().deleteOnExit();
Properties sdkProperties = getSdkProperties();
boolean reuseNodeCache = Boolean.parseBoolean(sdkProperties.getProperty("reuseNodeCache"));
String cacheName = "openmrs-sdk-node";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this should just be a class constant, no?

if (cache == null) {
return null;
}
if (cache.length > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

This actually isn't guaranteed. We're operating on files in the order returned from listFiles() which has this caveat:

"There is no guarantee that the name strings in the resulting array will appear in any specific order; they are not, in particular, guaranteed to appear in alphabetical order. "

Similarly, createTempDirectory() says:

"The details as to how the name of the directory is constructed is implementation dependent and therefore not specified."

In order to guarantee that this always returns the same folder, we'd need a guarantee that listFiles() returned files in a consistent order and that createTempDirectory() generates files in a defined order that will fit the order returned by listFiles(), but those details are specifically disclaimed.

(On Windows, the default JVM does indeed use the tick count as the source of randomness for createTempDirectory() and listFiles() likely returns in a consistent order, which would likely result in this appearing to work; however, Linux, the underlying mkstemp() implementation will try to use a RNG, falling back to ticks if not available on that device. I haven't looked into the details of macOS.)

Probably the thing to do is to make the tempDir a class variable instead of an instance variable so it's only set once per JVM, which would guarantee consistent results across multiple invocations of the NodeHelper within a single invocation of the SDK.

@wikumChamith
Copy link
Member Author

wikumChamith commented Sep 30, 2023

@ibacher I made the changes you requested.

@ibacher
Copy link
Member

ibacher commented Oct 3, 2023

@wikumChamith I think what I actually asked for was this:

On reflection, actually, I think a better strategy is: if the user wants to use a consistent NPM cache directory, use a named folder in the directory the SDK "servers directory" (defined here). Just name it something unlikely to be a server name, e.g., _openmrs_sdk_node_cache. That's cleaner, easier for end-users to understand and nobody has to go poking around for the right "openmrs-sdk-node.1234567" folder.

Also, even if we have a global option for this (which is fine), it would be nice to have a local option to override the global option without needing to run setup-sdk all over again (especially as that has side-effects I'd rather avoid).

Add the enableDataSaving parameter to the setup-sdk plugin. This parameter enables the data-saving mode. When using this mode, the SDK will reuse the existing cache folder rather than creating a new one.

In normal mode, SDK will remove the cache folder after running the setup.
@wikumChamith
Copy link
Member Author

@ibacher I made the changes you suggested.

Copy link
Member

@ibacher ibacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wikumChamith! There are two more places we should be able to override the node cache value:

  1. BuildDistro
  2. ServerUpgrader

Then I think this will be all set. Thanks for the work on this!

@wikumChamith
Copy link
Member Author

@ibacher I integrated the cache management to build-distro and server upgrader.

@makombe
Copy link

makombe commented Oct 15, 2023 via email

@@ -33,6 +36,7 @@
public class PropertiesUtils {

private static final Logger log = LoggerFactory.getLogger(PropertiesUtils.class);
private static final String SDK_PROPERTIES_FILE = "SDK Properties file";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this just be something like sdk.properties?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When writing the SDK properties to the files using Properties#store(java.io.OutputStream, java.lang.String) we have to pass a comment. This comment is used as the heading of the file. I renamed the variable name to eliminate any confusion. The actual file name of the properties file is "sdk.properties".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the constant name would be less confusing if it reflected that in one way or another like SDK_PROPERTIES_FILE_HEADER. Or just get rid of the constant because after all, you use it in only one place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dkayiwa I removed the variable in this PR: #257

public static void savePropertiesChangesToFile(Properties properties, File file)
throws MojoExecutionException {
try (OutputStream fos = new FileOutputStream(file)) {
properties.store(fos, SDK_PROPERTIES_FILE + ":");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use of the full colon at the end of this comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the colon.

@dkayiwa dkayiwa merged commit 2623cd9 into openmrs:master Oct 16, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants