The GitHub plugin provides tables, including github_search_issue and github_search_pull_request, that leverage GitHub's powerful search syntax. Here are some ways to use it to explore a user's activity.
Here I'm looking for issues that I created in any of 100+ repos whose names match turbot
.
select
*
from
github_search_issue
where
query = 'is:issue author:judell'
and html_url ~ 'turbot'
The native GitHub query can almost do this, but https://github.com/issues?q=is:issue+author:judell doesn't filter by repo, and you can't say https://github.com/issues?q=is:issue+author:judell+repo:*turbot*.
This query combines eight CTEs that encapsulate variations of the GitHub query syntax for both issues and pull requests
with my_created_issues as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_issue
where
query = 'is:issue author:judell'
and html_url ~ 'turbot'
),
my_assigned_issues as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_issue
where
query = 'is:issue assignee:judell'
and html_url ~ 'turbot'
),
my_mentioned_issues as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_issue
where
query = 'is:issue mentions:judell'
and html_url ~ 'turbot'
),
my_commenter_issues as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_issue
where
query = 'is:issue commenter:judell'
and html_url ~ 'turbot'
),
my_created_pulls as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_pull_request
where
query = 'is:pr author:judell'
and html_url ~ 'turbot'
),
my_assigned_pulls as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_pull_request
where
query = 'is:pr assignee:judell'
and html_url ~ 'turbot'
),
my_mentioned_pulls as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_pull_request
where
query = 'is:pr mentions:judell'
and html_url ~ 'turbot'
),
my_commenter_pulls as (
select
html_url,
title,
updated_at,
created_at,
closed_at,
comments
from
github_search_issue
where
query = 'is:pr commenter:judell'
and html_url ~ 'turbot'
),
combined as (
select * from my_created_issues
union
select * from my_assigned_issues
union
select * from my_mentioned_issues
union
select * from my_commenter_issues
union
select * from my_created_pulls
union
select * from my_assigned_pulls
union
select * from my_mentioned_pulls
union
select * from my_commenter_pulls
)
select distinct
*
from
combined
order by
updated_at desc
For the above parameters -- user judell
and repo match turbot
-- Steampipe runs the initial query in 13.5 seconds. Followup queries that filter the initial results are instantaneous within the default 5-minute cache duration. One way to persist the cache is to wrap the query in a materialized view.
create materialized view my_github_activity as (
-- include above sql
) with data;
If I create that view today, it's immediately available throughout the day. Tomorrow I can say refresh materialized view my_github_activity
to spend another 13.5 seconds recaching the view for the rest of that day.
Here are three ways to refine the query:
-
Find issues/pulls for a different user
-
Find issues/pulls in repos whose names match a different pattern
-
Find issues/pulls whose bodies match a search string
This function parameterizes the query in those ways.
create or replace function github_activity(match_user text, match_repo text, match_body text)
returns table (
html_url text,
title text,
updated_at timestamptz,
created_at timestamptz,
closed_at timestamptz,
comments bigint,
body text
) as $$
begin
return query
with my_created_issues as (
select
i.html_url,
i.title,
i.updated_at,
i.created_at,
i.closed_at,
i.comments,
i.body
from
github_search_issue i
where
i.query = 'is:issue author:' || match_user
and i.html_url ~ match_repo
),
my_assigned_issues as (
select
i.html_url,
i.title,
i.updated_at,
i.created_at,
i.closed_at,
i.comments,
i.body
from
github_search_issue i
where
i.query = 'is:issue assignee:' || match_user
and i.html_url ~ match_repo
),
my_mentioned_issues as (
select
i.html_url,
i.title,
i.updated_at,
i.created_at,
i.closed_at,
i.comments,
i.body
from
github_search_issue i
where
i.query = 'is:issue mentions:' || match_user
and i.html_url ~ match_repo
),
my_commenter_issues as (
select
i.html_url,
i.title,
i.updated_at,
i.created_at,
i.closed_at,
i.comments,
i.body
from
github_search_issue i
where
i.query = 'is:issue commenter:' || match_user
and i.html_url ~ match_repo
),
my_created_pulls as (
select
p.html_url,
p.title,
p.updated_at,
p.created_at,
p.closed_at,
p.comments,
p.body
from
github_search_pull_request p
where
p.query = 'is:pr author:' || match_user
and p.html_url ~ match_repo
),
my_assigned_pulls as (
select
p.html_url,
p.title,
p.updated_at,
p.created_at,
p.closed_at,
p.comments,
p.body
from
github_search_pull_request p
where
p.query = 'is:pr assignee:' || match_user
and p.html_url ~ match_repo
),
my_mentioned_pulls as (
select
p.html_url,
p.title,
p.updated_at,
p.created_at,
p.closed_at,
p.comments,
p.body
from
github_search_pull_request p
where
p.query = 'is:pr mentions:' || match_user
and p.html_url ~ match_repo
),
my_commenter_pulls as (
select
p.html_url,
p.title,
p.updated_at,
p.created_at,
p.closed_at,
p.comments,
p.body
from
github_search_pull_request p
where
p.query = 'is:pr commenter:' || match_user
and p.html_url ~ match_repo
),
combined as (
select * from my_created_issues
union
select * from my_assigned_issues
union
select * from my_mentioned_issues
union
select * from my_commenter_issues
union
select * from my_created_pulls
union
select * from my_assigned_pulls
union
select * from my_mentioned_pulls
union
select * from my_commenter_pulls
),
filtered as (
select distinct
*
from
combined c
where
( c.body is not null and c.body ~* match_body )
or
( c.body is null and match_body = '')
)
select
*
from
filtered f
order by
f.updated_at desc;
end;
$$ language plpgsql;
The SQL wrapped in this function uses Postgres POSIX regular expression match operators: ~
and its case-insensitive counterpart ~*
.
When you use one of these operators to match a string and the empty string, the result is always true.
select 'abc123' ~ '' as match
match
-------
t
(1 row)
But when you match null and the empty string, it isn't.
select null ~ '' as match;
match
-------
(1 row)
(I'm not sure why Postgres doesn't report f
here.)
Anyway, since issue/pr bodies can be null, the filtered
CTE has to handle both cases.
To create the function, paste that code into the Steampipe CLI (steampipe query
) or psql
.
Now these queries are possible.
This does exactly what the above query does.
select * from github_activity('judell','turbot','')
It can be cached like so.
create materialized view my_github_activity as (
select * from github_activity('judell','turbot','')
) with data;
select * from github_activity('judell','','')
select * from github_activity('rajlearner17','steampipe-mod','')
select * from github_activity('kaidaguerre','turbot','nil pointer')
If you connect a visualizer to Steampipe, and if the visualizer supports UX for substitutable parameters, then you can make the function's parameters interactive. Here's an example in Tableau.