RecentChangesEnumerator not properly populating all results #90

Open
tigerpaw28 opened this issue Oct 31, 2021 · 6 comments
@tigerpaw28

I used the following code to get log entries via the RecentChangesEnumerator:

 var generator = new RecentChangesGenerator(wiki)
 {
           PaginationSize = 50,
           EndTime = DateTime.Parse("13:36, 30 October 2021"),
           TypeFilters = RecentChangesFilterTypes.Log
 };

 var items = await generator.EnumItemsAsync().ToListAsync();

This results in two items being returned, but neither has any of its properties populated. Broader searches seem to return a mix of populated and unpopulated results. As usual, I'm working with TFWiki.net, which is on MW 1.19.

@CXuesong CXuesong added the bug label Oct 31, 2021
@CXuesong CXuesong self-assigned this Oct 31, 2021
@CXuesong
Owner

Your code eventually sends the following request to TFWiki:

POST https://tfwiki.net/mediawiki/api.php

format=json&action=query&maxlag=5&list=recentchanges&rcdir=older&rcend=2021-10-30T05%3a36%3a00Z&rctype=log&rclimit=50&rcprop=user%7cuserid%7ccomment%7cparsedcomment%7cflags%7ctimestamp%7ctitle%7cids%7csizes%7credirect%7cloginfo%7ctags%7csha1

You can see the response by opening the following link:
https://tfwiki.net/mediawiki/api.php?format=json&action=query&maxlag=5&list=recentchanges&rcdir=older&rcend=2021-10-30T05%3a36%3a00Z&rctype=log&rclimit=50&rcprop=user%7cuserid%7ccomment%7cparsedcomment%7cflags%7ctimestamp%7ctitle%7cids%7csizes%7credirect%7cloginfo%7ctags%7csha1
TFWiki responds to the request with:

{
    "warnings": {
        "recentchanges": {
            "*": "Unrecognized value for parameter 'rcprop': sha1"
        }
    },
    "query": {
        "recentchanges": [
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            },
            {
                "tags": []
            }
        ]
    }
}

This response is abnormal. In particular, the entries contain no fields other than tags. I think something must be wrong in the MediaWiki server code for it to send such a response.
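As a rough way to detect this situation on the client side, here is a minimal sketch that counts the entries in `query.recentchanges` whose only property is `tags`. It uses `System.Text.Json` rather than anything from WikiClientLibrary, and the class and method names are hypothetical, made up for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical helper, not part of WikiClientLibrary.
static class RecentChangesResponseCheck
{
    // Counts entries in query.recentchanges that carry "tags" and nothing else.
    public static int CountTagsOnlyEntries(string json)
    {
        using var doc = JsonDocument.Parse(json);
        if (!doc.RootElement.TryGetProperty("query", out var query)
            || !query.TryGetProperty("recentchanges", out var changes)
            || changes.ValueKind != JsonValueKind.Array)
        {
            return 0;
        }

        return changes.EnumerateArray().Count(entry =>
            entry.ValueKind == JsonValueKind.Object
            && entry.EnumerateObject().Any()
            && entry.EnumerateObject().All(p => p.NameEquals("tags")));
    }
}
```

A non-zero count for a response like the one above would suggest the server returned stripped-down entries rather than real recent changes.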

Actually, if you remove the |tags part from the rcprop parameter, you will see an empty response, as expected:
https://tfwiki.net/mediawiki/api.php?format=json&action=query&maxlag=5&list=recentchanges&rcdir=older&rcend=2021-10-30T05%3a36%3a00Z&rctype=log&rclimit=50&rcprop=user%7cuserid%7ccomment%7cparsedcomment%7cflags%7ctimestamp%7ctitle%7cids%7csizes%7credirect%7cloginfo%7csha1

{"warnings":{"recentchanges":{"*":"Unrecognized value for parameter 'rcprop': sha1"}},"query":{"recentchanges":[]}}

@CXuesong
Owner

So what you can do here is:

  1. Find out why MediaWiki is sending such a response (there could be a bug in the MW 1.19 software).
  2. Regardless of whether you plan to do 1, you can derive your own class from RecentChangesGenerator and override the EnumParams method, so that you can intercept the rcprop parameter and remove the |tags part.
private IEnumerable<KeyValuePair<string, object?>> EnumParams(bool isList)
    => base.EnumParams(isList).Select(p => p.Key == "rcprop" ? new KeyValuePair<string, object?>(p.Key, ((string)p.Value).Replace("|tags", "")) : p);
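Note that a plain `Replace("|tags", "")` only works when `tags` is not the first value in the list. A minimal, library-independent sketch of a more position-tolerant removal is shown below; `PropListHelper` and `RemoveValue` are hypothetical names used only for illustration and are not part of WikiClientLibrary.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of WikiClientLibrary.
static class PropListHelper
{
    // Removes every occurrence of `value` from a '|'-separated list,
    // such as the string sent in the rcprop parameter.
    public static string RemoveValue(string pipeSeparated, string value)
        => string.Join("|", pipeSeparated.Split('|').Where(v => v != value));
}
```

For example, `PropListHelper.RemoveValue("user|tags|sha1", "tags")` yields `user|sha1`, and `PropListHelper.RemoveValue("tags|user", "tags")` yields `user`.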

@CXuesong
Owner

To further demonstrate this, try the code below (.NET Fiddle):

using System;
using System.Linq;
using WikiClientLibrary.Client;
using WikiClientLibrary.Sites;
using WikiClientLibrary.Generators;

using var client = new WikiClient();
var site = new WikiSite(client, "https://tfwiki.net/mediawiki/api.php");
await site.Initialization;

Console.WriteLine(site.SiteInfo + " " + site.SiteInfo.Version);

var generator = new RecentChangesGenerator(site)
{
	PaginationSize = 50,
	EndTime = DateTime.Parse("13:36, 30 October 2021"),
	TypeFilters = RecentChangesFilterTypes.Log
};

Console.WriteLine("Server side log filtering");
var items = await generator.EnumItemsAsync().ToListAsync();
Console.WriteLine("{0} items:", items.Count);
foreach (var i in items) Console.WriteLine(i);

Console.WriteLine("Client side log filtering");
generator.TypeFilters = RecentChangesFilterTypes.All;
items = await generator.EnumItemsAsync().Where(i => i.Type == RecentChangesType.Log).ToListAsync();
Console.WriteLine("{0} items:", items.Count);
foreach (var i in items) Console.WriteLine(i);

The output is

WikiClientLibrary.Sites.SiteInfo 1.19.20
Server side log filtering
13 items:
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
0,01/01/0001 00:00:00,Edit,[None],,,
Client side log filtering
0 items:

It seems that the issue doesn't manifest if you list everything instead of filtering for logs on the server side.

@tigerpaw28
Author

And now I see this isn't even what I want to query since recent changes doesn't appear to include the user creation log entries, despite those log events appearing on the recent changes page.

I don't see a generator for log events so I'm guessing I need to write my own generator and/or use InvokeMediaWikiApiAsync to query that list. Would the same apply to retrieving the allusers list as well?

@tigerpaw28
Author

Further API testing shows that I can get user creation logs from the RecentChanges API as long as I don't ask it to populate loginfo. Requesting loginfo causes it to filter out some types of logs, presumably because they don't have those fields.

With this in mind, I was going to adopt your suggestion of deriving a class from RecentChangesGenerator and overriding EnumParams, except that EnumParams is a private method and can't be overridden.

Would you be open to changing that or do you have another suggestion? The question about retrieving users still stands as well.
