Merge pull request #117 from kbase/static_narratives_view
Mostly Blobstore details.
jkbaumohl authored Oct 10, 2024
2 parents 09bacc7 + 11ef6db commit 5869feb
Showing 26 changed files with 3,196 additions and 263 deletions.
16 changes: 12 additions & 4 deletions README.md
@@ -43,6 +43,15 @@ source/daily/upload_public_narratives_count.py
source/daily/make_reporting_tables.py


-------------------
Within the logstash Dockerfile there is:
https://github.com/kbase/logstash/blob/41778da1238129a65296bdddcb6ff26e9c694779/Dockerfile#L24-L29
The rm at the end is, I believe, just cleaning up after itself. This was set up by Steve for Cheyenne's work.
This is used by this code:
https://github.com/kbase/metrics/blob/master/source/daily_cron_jobs/methods_upload_elasticsearch_sumrydicts.py
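The fetch / extract / rm pattern referenced above can be sketched as follows. This is a minimal stand-in using hypothetical local paths (not the real logstash artifacts); the point is only that the trailing rm deletes the downloaded archive once its contents are unpacked, so it does not bloat the image.

```python
import os
import tarfile

# Hypothetical paths standing in for the downloaded archive in the Dockerfile.
os.makedirs("/tmp/demo_src", exist_ok=True)
os.makedirs("/tmp/demo_install", exist_ok=True)
with open("/tmp/demo_src/data.txt", "w") as f:
    f.write("geoip data\n")

# Build a local archive; this stands in for the curl/wget download step.
with tarfile.open("/tmp/demo.tar.gz", "w:gz") as tar:
    tar.add("/tmp/demo_src/data.txt", arcname="data.txt")

# Install the contents, then delete the archive (the "rm" cleanup step).
with tarfile.open("/tmp/demo.tar.gz", "r:gz") as tar:
    tar.extractall("/tmp/demo_install")
os.remove("/tmp/demo.tar.gz")
```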



-------------------

CRON Jobs are run from mysql-metrics
@@ -53,23 +62,22 @@ There are nightly CRON jobs located in bin/master_cron_shell.sh,
which runs scripts from the source/daily directory.

Then there are also monthly CRON jobs, located in bin/upload_workspace_stats.sh.
It used to be workspaces only (user info is needed first because of potential FK issues).
Runs scripts from the source/monthly directory.

There is also a doi_monthly CRON job for Credit Engine, located in bin/upload_doi_metrics.sh.

These create logs to keep track of (note: the nightly metrics entry calls master_cron_shell):
01 17 * * * /root/metrics/nightly_metrics.sh >>/mnt/metrics_logs/crontab_nightly 2>&1
01 0 1 * * /root/metrics/monthly_metrics.sh >>/mnt/metrics_logs/crontab_monthly 2>&1
01 0 15 * * /root/metrics/monthly_metrics.sh >>/mnt/metrics_logs/crontab_doi_monthly 2>&1
01 07 * * * /root/metrics/nightly_errorlogs.sh >>/mnt/metrics_logs/crontab_errorlogs 2>&1
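The schedule fields in the crontab entries above can be decoded with a small sketch (the entries are copied from the lines above; the helper name is ours, not part of the repo):

```python
def cron_when(entry: str) -> str:
    """Decode minute, hour, and day-of-month from a crontab line."""
    minute, hour, dom, month, dow, *cmd = entry.split()
    day = "daily" if dom == "*" else f"on day {dom} of each month"
    script = " ".join(cmd).split("/")[-1]
    return f"{script} runs {day} at {int(hour):02d}:{int(minute):02d}"

entries = [
    "01 17 * * * /root/metrics/nightly_metrics.sh",
    "01 0 1 * * /root/metrics/monthly_metrics.sh",
    "01 0 15 * * /root/metrics/monthly_metrics.sh",
    "01 07 * * * /root/metrics/nightly_errorlogs.sh",
]
for e in entries:
    print(cron_when(e))
# nightly_metrics.sh runs daily at 17:01
# monthly_metrics.sh runs on day 1 of each month at 00:01
# monthly_metrics.sh runs on day 15 of each month at 00:01
# nightly_errorlogs.sh runs daily at 07:01
```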

From Docker03 the logs can be checked by doing the following:
cat /mnt/nfs3/data1/metrics/crontab_logs/crontab_nightly
cat /mnt/nfs3/data1/metrics/crontab_logs/crontab_monthly
cat /mnt/nfs3/data1/metrics/crontab_logs/crontab_doi_monthly


You can also confirm things ran by looking in the database (if not, backfills need to be done).
Example (there should be a record for the first of each month):
select DATE_FORMAT(`record_date`,'%Y-%m') as narrative_cron_month, count(*) as narrative_count from metrics.workspaces ws group by narrative_cron_month;
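The same monthly-count check can be sketched against an in-memory SQLite table with hypothetical record dates; MySQL's DATE_FORMAT(record_date, '%Y-%m') becomes strftime('%Y-%m', record_date) in SQLite:

```python
import sqlite3

# Hypothetical workspace records: two saved for August 1, one for September 1.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE workspaces (record_date TEXT)")
con.executemany(
    "INSERT INTO workspaces VALUES (?)",
    [("2024-08-01",), ("2024-08-01",), ("2024-09-01",)],
)

# Group per month, mirroring the MySQL verification query above.
rows = con.execute(
    "SELECT strftime('%Y-%m', record_date) AS narrative_cron_month, COUNT(*) "
    "FROM workspaces GROUP BY narrative_cron_month ORDER BY narrative_cron_month"
).fetchall()
print(rows)  # [('2024-08', 2), ('2024-09', 1)]
```

If a month is missing from the output, that month's upload did not run and needs a backfill.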
3 changes: 3 additions & 0 deletions bin/dump_get_copy_info_for_narratives.sh
@@ -0,0 +1,3 @@
#!/bin/bash

python custom_scripts/get_copy_info_for_narratives.py
3 changes: 3 additions & 0 deletions bin/dump_weekly_ADAM_app_categories.sh
@@ -0,0 +1,3 @@
#!/bin/bash

python custom_scripts/dump_weekly_ADAM_app_categories.py
3 changes: 3 additions & 0 deletions bin/dump_weekly_app_categories_v2.sh
@@ -0,0 +1,3 @@
#!/bin/bash

python custom_scripts/dump_weekly_app_categories_v2.py
2 changes: 2 additions & 0 deletions bin/master_cron_shell.sh
@@ -14,6 +14,8 @@ python daily_cron_jobs/upload_public_narratives_count.py

python daily_cron_jobs/upload_user_orcid_count.py

python daily_cron_jobs/upload_blobstore_details.py

python daily_cron_jobs/make_reporting_tables.py


242 changes: 0 additions & 242 deletions source/custom_scripts/backfill_blobstore_details.py

This file was deleted.

10 changes: 7 additions & 3 deletions source/custom_scripts/dump_narratives_results.py
@@ -30,12 +30,16 @@ def dump_narratives_results():

# CHANGE QUERY HERE
# Query for Adam Narratives dump of information:
query = ("select wc.* from metrics.user_info ui inner join metrics_reporting.workspaces_current wc on ui.username = wc.username "
"where ui.kb_internal_user = 0 and wc.narrative_version > 0 and is_deleted = 0 and is_temporary = 0")
query = ("select * from metrics.workspaces_current_plus_users ")
# query = ("select wc.* from metrics.user_info ui inner join metrics_reporting.workspaces_current wc on ui.username = wc.username "
# "where ui.kb_internal_user = 0 and wc.narrative_version > 0 and is_deleted = 0 and is_temporary = 0")
# Headers for Adam's narratives query (Note: if more columns are added, this may need updating)
print("ws_id\tusername\tmod_date\tinitial_save_date\trecord_date\ttop_lvl_object_count\ttotal_object_count\tvisible_app_cells_count\tcode_cells_count\t"
"narrative_version\thidden_object_count\tdeleted_object_count\ttotal_size\ttop_lvl_size\tis_public\tis_temporary\tis_deleted\tnumber_of_shares\t"
"num_nar_obj_ids\tstatic_narratives_count\tstatic_narratives_views\tunique_object_types_count")
"num_nar_obj_ids\tstatic_narratives_count\tstatic_narratives_views\tunique_object_types_count\t"
"orig_saver_count\tnon_orig_saver_count\torig_saver_size_GB\tnon_orig_saver_size_GB")

# "num_nar_obj_ids\tstatic_narratives_count\tstatic_narratives_views\tunique_object_types_count")

cursor.execute(query)
row_values = list()
9 changes: 8 additions & 1 deletion source/custom_scripts/dump_query_results.py
@@ -42,7 +42,9 @@ def dump_query_results():
"last_narrative_modified_date\ttotal_narrative_objects_count\ttop_lvl_narrative_objects_count\ttotal_narrative_objects_size\t"
"top_lvl_narrative_objects_size\ttotal_narrative_count\ttotal_public_narrative_count\tdistinct_static_narratives_count\t"
"static_narratives_created_count\ttotal_visible_app_cells\ttotal_code_cells_count\tfirst_file_date\tlast_file_date\t"
"total_file_sizes_MB\ttotal_file_count\tmost_used_app\tdistinct_apps_used\ttotal_apps_run_all_time\ttotal_apps_run_last365\t"
"total_file_sizes_MB\ttotal_file_count\tblobstore_orig_saver_count\tblobstore_non_orig_saver_count\t"
"blobstore_orig_saver_size_GB\tblobstore_non_orig_saver_size_GB\t"
"most_used_app\tdistinct_apps_used\ttotal_apps_run_all_time\ttotal_apps_run_last365\t"
"total_apps_run_last90\ttotal_apps_run_last30\ttotal_app_errors_all_time\tfirst_app_run\tlast_app_run\ttotal_run_time_hours\t"
"total_queue_time_hours\ttotal_CPU_hours\tsession_count_all_time\tsession_count_last_year\tsession_count_last_90\tsession_count_last_30"
)
@@ -88,6 +90,11 @@ def dump_query_results():
# order by avg_hours_active desc, session_count, total_hours_active")
#print("username\tsession_count\ttotal_hours_active\tavg_hours_active\tstd_hours_active\tfirst_seen\tlast_seen")

# Custom apps updates for RSV
# query = ("select app_name, git_commit_hash, min(finish_date) as first_run_date from user_app_usage \
# group by app_name, git_commit_hash having first_run_date > '2021-01-01'")
# print("appname\tgit_commit_hash\tfirst_run_date")

#Blobstore cumulative sizes over users
# query = ("select sum(total_size) as blobstore_size, bs.username from blobstore_stats bs \
# group by username order by blobstore_size")
Expand Down
