Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Next Build and Vets Website branches can be chosen on Tugboat #16853

Closed
3 tasks done
timcosgrove opened this issue Jan 16, 2024 · 20 comments · Fixed by #17154
Closed
3 tasks done

Next Build and Vets Website branches can be chosen on Tugboat #16853

timcosgrove opened this issue Jan 16, 2024 · 20 comments · Fixed by #17154
Assignees
Labels
Accelerated Publishing Blocked Issues that are blocked on factors other than blocking issues.

Comments

@timcosgrove
Copy link
Contributor

timcosgrove commented Jan 16, 2024

Requirements

We need to be able to select different Next Build and Vets Website branches on Tugboat, so that engineers can preview their front-end work in conjunction with CMS changes

Acceptance criteria

Preview Give feedback

Background & implementation details

This will be heavily based on the framework available in vets_website_content_release and vets_website_build_trigger. More concrete information will come.

@alexfinnarn
Copy link
Contributor

Adding a note that the va_gov_content_release module contains the form code and it loads at /admin/content/deploy.

I think I can copy/paste the form element code, use the same autocomplete lookup, and just modify the route parameter for frontend.

      '#autocomplete_route_name' => 'va_gov_content_release.frontend_version_autocomplete',
      '#autocomplete_route_parameters' => [
        'frontend' => 'next_build',
        'count' => 10,
      ],

This will then set a version that is picked up and placed into a queue in Request::submitRequest() like:

      $job = Job::create(static::JOB_TYPE, [
        'reason' => $reason,
      ]);
      $this->queue->enqueueJob($job);

Once added, you should be able to view at "/admin/config/system/queues/jobs/content_release".

@alexfinnarn
Copy link
Contributor

I feel like I'm in Alice's rabbit hole...but I finally found where the queue job actually executes in EnvironmentPluginBase::triggerFrontendBuild

  public function triggerFrontendBuild() : void {
    // The existence of this file triggers a content release.
    // See scripts/queue_runner/queue_runner.sh.
    $this->filesystem->saveData('build plz', 'public://.buildrequest', FileSystemInterface::EXISTS_REPLACE);
    $this->messenger()->addStatus('A request to rebuild the front end has been submitted.');
  }

So at least now I understand the full execution from submitting the form to the frontend actually building. Then the release script gathers values set by other parts of the job enqueueing.

I wonder if this could have been done in one Form class connected to a job queue service. I'd run the script using the Symfony Process component so the queue can observe the status and re-run if there are reported failures. As it is now, the form submission just sets values and doesn't really guarantee that the content build process will start. I know the process takes a long time, but it'd be nice to kick it off from the form submission itself vs. just setting variables since the submission button says "Release Content" and not "Set Variables That Will Be Used To Release Content At A Later Date".

Based on this knowledge and the ticket requirements, I will create the same form I see on "/admin/content/deploy" but move it to "/admin/content/deploy-next". That should set the variables in Drupal's state key-value store and enqueue the job that writes a file, but I assume the release script to build the frontend will need additional work, especially with the content negotiation fallback strategy.

@alexfinnarn
Copy link
Contributor

I've gotten farther after copying GitForm and setting the autocomplete parameters for next_build. I was very confused on how the repo information came into the search class since the constructor only type hints by interface and not class implementation.

However, I was able to find the service definitions for the autocomplete search:

  va_gov_content_release.frontend_version_search:
    class: Drupal\va_gov_content_release\FrontendVersionSearch\FrontendVersionSearch
    arguments:
      - '@va_gov_git.branch_search.content_build'
      - '@va_gov_github.api_client.content_build'
      - '@va_gov_git.branch_search.vets_website'
      - '@va_gov_github.api_client.vets_website'
      - '@logger.factory'

So I need to create services for the API client and branch service following those examples.

@alexfinnarn
Copy link
Contributor

alexfinnarn commented Jan 18, 2024

Now I'm onto the setting used after wiring everything up for the autocomplete search. In settings.php I see variables relating to where code is cloned to, but I'm not sure where next-build should go:

// The default project root locations. These settings are currently only used on Tugboat and local environments.
$settings['va_gov_web_root'] = '/var/www/cms/web';
$settings['va_gov_app_root'] = '/var/www/cms';
$settings['va_gov_vets_website_root'] = '/var/www/cms/docroot/vendor/va-gov/vets-website';

// @todo Add the next build root to the settings.
//$settings['va_gov_next_build_root'] = '/var/www/cms/docroot/vendor/va-gov/next-build';

I will look in docs, and the example is just a guess. Locally, I don't see it...

Screenshot 2024-01-18 at 2 29 32 PM

but maybe the code adds to the vendor directory?

I also have to set the GitHub token somehow, but I haven't seen what that variable is called yet.

@tjheffner
Copy link
Contributor

@alexfinnarn next-build gets cloned into the repo root on tugboat currently. would be var/lib/tugboat/next or /var/www/cms/next

(locally, it is currently living in an adjacent folder to the cms repo entirely)

see build-next in tugboat.yml.

other things that may help shine a light:

composer.json:

        "va:next:install": [
            "# Prepare the next build project",
            "! ./scripts/should-run-directly.sh || ./scripts/next-install.sh",
            "./scripts/should-run-directly.sh || ddev composer va:next:install --"
        ],
        "va:next:build": [
            "# Prepare the next build server",
            "! ./scripts/should-run-directly.sh || ./scripts/next-build.sh",
            "./scripts/should-run-directly.sh || ddev composer va:next:build --"
        ],
        "va:next:start": [
            "# Start the next build server",
            "! ./scripts/should-run-directly.sh || ./scripts/next-start.sh",
            "./scripts/should-run-directly.sh || ddev composer va:next:start --"
        ]

@alexfinnarn
Copy link
Contributor

@tjheffner I see that content-build is on Packagist https://packagist.org/?query=va-gov and it is listed in va.gov-cms. Will it be necessary for next-build to also add a composer.json file and become registered on Packagist? If so, then it should end up at /var/www/cms/docroot/vendor/va-gov/next-build, at least for local ddev.

I also see a symlink /var/www/html/web -> docroot/vendor/va-gov/content-build in ddev that might come into play?

But pulling in the repo via composer seems like it should work locally, on Tugboat, and wherever this repo is deployed. I'm not sure if pulling in git information is related to building the content and previews, but then again, it wouldn't make sense to have multiple copies of external repositories.

@tjheffner
Copy link
Contributor

I don't think we have any plans currently to put next-build onto packagist / introduce composer.json. We do want the ability for developers to use their local cms with their local next-build

@alexfinnarn
Copy link
Contributor

alexfinnarn commented Jan 22, 2024

Okay, I think the code on #16923 now should pull in the next-build git info at "/admin/content/deploy/next_git".

I was thrown for a loop for a while since I could get the git information from the locally cloned repo even without GITHUB_TOKEN being set. After spotting some code I failed to update and fiddling with the repository filesystem paths, I got complete git information on the content release form on Tugboat.

I'm still not sure if the variable for next-build's filesystem path needs any overrides or if using a relative path in settings.php will suffice. As long as next-build is cloned next to Drupal's docroot, it should work.

// In settings.tugboat.php.
$settings['va_gov_next_build_root'] = getenv('TUGBOAT_ROOT')  . '/next';

// In settings.php.
$settings['va_gov_next_build_root'] = '../next';

// I also noticed this isn't at the docroot, but I think '../vets-website' exists?
$settings['va_gov_vets_website_root'] = '/var/www/cms/docroot/vendor/va-gov/vets-website';

There's more to this issue in the acceptance criteria, but I think we were going to break it up into multiple tickets? I'll try and see what happens as far as a build goes but I've never run through the whole process.

@tjheffner
Copy link
Contributor

I think what you have is correct.

// I also noticed this isn't at the docroot, but I think '../vets-website' exists?
$settings['va_gov_vets_website_root'] = '/var/www/cms/docroot/vendor/va-gov/vets-website';

yeah, both exist 😭 ../vets-website exists solely for the purposes of assets for next-build... symlinks and coordinating rebuilding between all three systems (next-build content-build vets-website) is a PITA.

the one inside docroot/vendor is the one that should be changed via git (i believe this is in place already)

@alexfinnarn
Copy link
Contributor

The current path forward is to write a build request file specifically for next-build that will trigger a release in the queue runner like...

# in scripts/queue_runner/queue_runner.sh

cd "${TUGBOAT_ROOT}"
./bin/drush advancedqueue:queue:process command_runner 2>&1
./bin/drush advancedqueue:queue:process content_release 2>&1
[ -f "./docroot/sites/default/files/.buildrequest" ] && ./scripts/build-frontend.sh && rm ./docroot/sites/default/files/.buildrequest

# New for next-build
[ -f "./docroot/sites/default/files/.next-buildrequest" ] && ./scripts/next-build-frontend.sh && rm ./docroot/sites/default/files/.next-buildrequest
sleep 60s

I found the point where the .buildrequest file is saved to the file system to be later picked up for a content release build: https://github.com/department-of-veterans-affairs/va.gov-cms/blob/main/docroot/modules/custom/va_gov_content_release/src/LocalFilesystem/LocalFilesystemBuildFile.php#L45

However, it isn't configurable or based on a frontend so the code needs to be refactored to accommodate more than one file. I will start by...well I just went down rabbit holes and I'm not sure what I typed above is useful...but I will leave it be as a reference.

In other news, I thought I had the content release form working so you can actually set the version and get it returned from drush va-gov-content-release:frontend-version:get next_build. However, I got a strange error that what next-install.sh cloned down was not a git repository. Sure enough, when I looked in ddev the git directory was removed.

https://github.com/department-of-veterans-affairs/va.gov-cms/blob/main/scripts/remove-git-dirs.sh#L15

Turns out that at the end of a composer install that remove-git-dirs script removes all git information except from the content-build and vets-website dependencies. I will try adding an exclusion line so the git information isn't removed from the next-build cloned repo, but just pointing out another gotcha.

My hope is that I can pass down the frontend type through several functions, kind of like React prop drilling, since I'm not sure looking for other pieces of state is safe considering the release states, queue, etc. Back to debugging, validating, and stepping through code...saving comment this time.

@alexfinnarn
Copy link
Contributor

In a prior comment, I mentioned how the .buildrequest file is written from an enqueued job in https://github.com/department-of-veterans-affairs/va.gov-cms/blob/main/docroot/modules/custom/va_gov_build_trigger/src/Environment/EnvironmentPluginBase.php#L61 . But I guess I forgot that and found another place where the same code is run: https://github.com/department-of-veterans-affairs/va.gov-cms/blob/main/docroot/modules/custom/va_gov_content_release/src/LocalFilesystem/LocalFilesystemBuildFile.php#L41

I will try to follow around the LocalFilesystemBuildFile::submit() and see what calls that, but my current impression has changed from trying to pass the frontend information through a series of services to using state.

The content_release advanced queue job, takes a payload, but curiously the payload is never used. It might be logged somewhere but the actual queue job never uses it. So, the job "information" is really just a boolean value determining if someone submitted the content release form or an entity was updated. I think it might have been possible to just set a variable build_needed if one of those two events happened and check that...but then that's really the same as checking for a .buildrequest file. So, I think writing the .buildrequest file after submitting the form or when content changes would have been enough for a cron job to check and initiate a build if the file exists. But I digress...

I should be able to add a key frontend to the job payload and read that during processing without affecting anything as long as I leave the other payload values as-is. So the code would change to:

// In ReleaseRequest::process().
    $payload = $job->getPayload();
    $dispatch_job = Job::create('va_gov_content_release_dispatch', ['placeholder' => 'placeholder', 'frontend' => $payload['frontend] ?? 'content_build']);

and then use the frontend value in the code writing the .buildrequest file.

@alexfinnarn
Copy link
Contributor

alexfinnarn commented Jan 26, 2024

Screenshot 2024-01-26 at 1 33 16 PM

My approach seems to have worked as far as getting the job in the queue. However, when trying to process I keep getting ReleaseStateManager::STATE_TRANSITION_WAIT even though the other jobs are complete/success so now I will look into that issue.

@alexfinnarn
Copy link
Contributor

alexfinnarn commented Feb 1, 2024

I was able to trace a build through the whole way from submitting the content release form until the build completes. I'm still not sure how the queue_runner.sh script gets called in production although I can see it being moved to a background processing directory on Tugboat. What I did locally was:

  1. With the release state set to "ready" on "/admin/content/deploy", choose a content_build version and submit the form.
  2. That will put a job in "/admin/config/system/queues/jobs/content_release".
  3. Normally, the queue_runner.sh script runs via some background method, but I logged into ddev and ran than script manually.
  4. You can now follow the "/sites/default/files/build.txt" build log linked from the content release form's "Content release status" block.
  5. After some time, the state reset to "ready" after the build was completed.

The main issue I see with the current code is that an event subscriber called ContinuousReleaseSubscriber adds an item to the content release queue after a build completes. This is how a job is guaranteed to always be there even if no content changes on the site. However, it means that content_build trigger will always go first before a next_build queue job. You'd have to manually inspect and delete queued jobs to ensure the first job ran would trigger for next_build.

There is some kind of va-gov:content-release:toggle-continuous command that can disable the event subscriber from adding another item, but that's getting into setting another variable or a conditional check.

So, at this point, I'm wondering if it would be better to set the "frontend build type" in configuration, which would be either next_build or content_build. Then, whenever a command is run that needs to differentiate between build types, that variable is used to determine which build to trigger. I think that approach might be cleaner than trying to allow two builds to happen at once.

It all comes down to how the content release form is used. I think it is to rebuild Tugboat, and I would think whoever is doing that knows which build they want. I don't think they would need both at the same time. I guess I can create a second branch to try this out.

@alexfinnarn
Copy link
Contributor

After some discussion on the approach, it was determined that content_build and next_build must run simultaneously. I started looking at passing parameters around function calls, but that won't work in this case.

My conclusion is to create a new NextReleaseStateManager that extends ReleaseStateManager, and I think I only have to overwrite public const STATE_KEY = 'va_gov_build_trigger.release_state'; to be able to inject that new service into the needed places. Then, I need to duplicate the drush commands that call the release state manager for state transitions.

It would be easier if the state key were used throughout the code so I could simply change that key depending on the frontend being built, but I think creating a new release manager should be cleaner since I'm mainly injecting a different service and only changing one constant.

There are about 9 places where the current ReleaseStateManager is injected into the code, and I will now go and try to replicate whatever is needed to create a different state key and inject the service.

@alexfinnarn
Copy link
Contributor

After some debugging, I don't think creating a separate ReleaseStateManager will work. I think the service container injects one instance and can't reinitialize during the request, which is needed to run both builds at the same time.

Initially, I had wondered if the state manager could take the frontend as a parameter but I didn't want to mess with the current implementation too much if I didn't need to. Now, I will try adding frontend information so the release request commands turn into:

# Old command.
drush va-gov:content-release:advance-state starting

# New command.
drush va-gov:content-release:advance-state next-build starting

Since we have separate frontend build scripts, swapping the frontend name should be easy. I think drush can take positional arguments for commands but maybe it'd be better to name them.

@alexfinnarn
Copy link
Contributor

alexfinnarn commented Feb 6, 2024

After some more "hmm..maybe this will work...oh darn...but wait I didn't try...oh that doesn't work either..." I decided to go rouge and attempt to simplify the build using mostly the form submission function.

New draft PR: #17154

I kept running into walls with event subscribers not having all the information needed to distinguish between frontends, the release state manager not respecting two separate state keys when checking status, how the builds would run concurrently, and how not to potentially break content build continuous release...especially right when people who maintain the release are leaving the project.

So, I made a much simpler version of the submit function for the same content release form I have in other branches. It seems to meet all the requirements of this ticket, and I'm already father after a few hours than I was with my other attempts spanning more than a week...granted, I kept some code from my prior attempts, but it is still a dramatic increase in productivity without having to sort through all the queue, state machine, event subscriber, etc code.

Content release form when you can trigger a build.
Screenshot 2024-02-06 at 1 06 05 PM

Content release form when a build is running with a link to the build log.
Screenshot 2024-02-06 at 1 08 05 PM

The status block and contents can be updated, I simply added bare bones to test on Tugboat.

I am now seeing what happens on Tugboat, but I did create another background process so I hope it will start and be kept up using runit.

@alexfinnarn
Copy link
Contributor

alexfinnarn commented Feb 7, 2024

So locally, I can see the build kick off and view the progress via a build file. It functions pretty much the same way as the current content release workflow, but it is far simpler...still messy at this point, but I want to see it run on Tugboat.

I've tried setting up the background job like I saw for content build in the .tugboat/config.yml file:

        # Setup background processing service.  This uses runit to keep process up
        # See https://docs.tugboat.qa/setting-up-services/how-to-set-up-services/running-a-background-process
        - mkdir -p /etc/service/drupal_events
        - mv "${TUGBOAT_ROOT}/scripts/queue_runner/queue_runner.sh" /etc/service/drupal_events/run
        - chmod +x /etc/service/drupal_events/run

        # Separate process for next-build preview.
        - mkdir -p /etc/service/next-build
        - mv "${TUGBOAT_ROOT}/scripts/queue_runner/next_queue_runner.sh" /etc/service/next-build/run
        - chmod +x /etc/service/next-build/run

But that didn't seem to work so I made a cron job.

function va_gov_content_release_cron(): void {
  \Drupal::logger('va_gov_content_release')->info('Cron run starting.');
  
  $script = dirname(DRUPAL_ROOT) . '/scripts/queue_runner/next_queue_runner.sh';
  if (file_exists($script)) {
    exec("nohup $script > /dev/null 2>&1 &");
  }
  else {
    \Drupal::logger('va_gov_content_release')->error('Queue runner script not found at %script', ['%script' => $script]);
  }
}

I had to disable the linkychecker module to look at my debugging log messages, but I can see it reportedly run the cron job.

So, I'm not sure how to view Tugboat logs for running processes or if you can ssh in. I can try debugging with PHP code that checks for various things using shell commands, but ssh access would be nice.

@alexfinnarn
Copy link
Contributor

Aha, I figured it out. After ssh-ing into the Tugboat instance, I tried to manually run the /scripts/queue_runner/next_queue_runner.sh script, which calls /scripts/next--build-frontend.sh. However, the second script had no permissions for execution. So, I think the queue runner script was running in the background but never actually ran the build script, which is what would have removed the build request file.

I will chmod +x that script in the Tugboat config and see what happens. I bet the cron process failed for the same reason, and locally I modified permissions and forgot. I know that ignoring file permissions is in the git instructions for this repo so that's why it probably wasn't picked up when I committed my code.

@alexfinnarn alexfinnarn added Blocked Issues that are blocked on factors other than blocking issues. and removed Needs refining Issue status labels Mar 6, 2024
@alexfinnarn
Copy link
Contributor

alexfinnarn commented Mar 6, 2024

Re-opening the issue and moving to blocked status so I can close my tab reminding me to follow this issue. We merged this, and then an issue on staging made us revert the code changes.

I made a comment in #17285 with the issue/PR that un-blocks this issue.

@alexfinnarn
Copy link
Contributor

Closing since this code is in #17209 . I started that branch off of this code, but enough time has passed we will just merge this code in the other issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment