-
-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update setBaseUrl() to handle URL directors #1929
Conversation
a52ea6c
to
3a5526e
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1929 +/- ##
==========================================
- Coverage 74.17% 73.89% -0.29%
==========================================
Files 39 39
Lines 3106 3095 -11
Branches 686 679 -7
==========================================
- Hits 2304 2287 -17
- Misses 687 689 +2
- Partials 115 119 +4
☔ View full report in Codecov by Sentry. |
c0b1112
to
c5fe712
Compare
The idea behind this patch is to remove |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the proper architecture:
- CPU inefficient, why testing millions of times something which can be decided once for all?
- The overall concept of having classes sharing the same API and they are interchangeable is always underused, although this is a base principle of the new architecture
- Director in a technicality which should be in the Downloader... and ideally nowhere else.
- To me it looks like a serious regression has been introduced by not handling the case with anymore when
forceRender is null
.
src/Downloader.ts
Outdated
public getArticleUrl(renderer, articleId: string): string { | ||
let articleUrl: string | ||
// Renders and URL builders are independent so there should be a switch/case scenario here | ||
switch (renderer.constructor.name) { | ||
case 'WikimediaDesktopRenderer': | ||
articleUrl = MediaWiki.wikimediaDesktopUrlDirector.buildArticleURL(articleId) | ||
break | ||
case 'VisualEditorRenderer': | ||
articleUrl = MediaWiki.visualEditorURLDirector.buildArticleURL(articleId) | ||
break | ||
case 'WikimediaMobileRenderer': | ||
articleUrl = MediaWiki.wikimediaMobileUrlDirector.buildArticleURL(articleId) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not going do go through this millions of times, if this is can be done once per scrape. Decide which Director to use once for all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this has been revamped in favor of using URL builder's .buildArticleURL(articleId)
method.
public async setBaseUrls(forceRender = null) { | ||
if (!forceRender) { | ||
//* Objects order in array matters! | ||
this.baseUrl = basicURLDirector.buildDownloaderBaseUrl([ | ||
{ condition: await MediaWiki.hasWikimediaMobileApi(), value: MediaWiki.WikimediaMobileApiUrl.href }, | ||
{ condition: await MediaWiki.hasWikimediaDesktopApi(), value: MediaWiki.WikimediaDesktopApiUrl.href }, | ||
{ condition: await MediaWiki.hasVisualEditorApi(), value: MediaWiki.visualEditorApiUrl.href }, | ||
]) | ||
|
||
//* Objects order in array matters! | ||
this.baseUrlForMainPage = basicURLDirector.buildDownloaderBaseUrl([ | ||
{ condition: await MediaWiki.hasWikimediaDesktopApi(), value: MediaWiki.WikimediaDesktopApiUrl.href }, | ||
{ condition: await MediaWiki.hasVisualEditorApi(), value: MediaWiki.visualEditorApiUrl.href }, | ||
{ condition: await MediaWiki.hasWikimediaMobileApi(), value: MediaWiki.WikimediaMobileApiUrl.href }, | ||
]) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where goes all this logic to decide automatically which API/render to use? Init the Director object at construction time. You can not get ride of a some kind of init code for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for choosing render is in saveArticles()
. Here you can find a condition where forceRender is null
. When choosing render, src/renderers/renderer.builder.ts
automatically selects the necessary API based on the predefined renderType
option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upd: I brought back this logic + calculate the necessary URL builder here.
src/Downloader.ts
Outdated
default: | ||
throw new Error('Unable to find specific API end-point to retrieve article HTML') | ||
} | ||
public getArticleUrl(renderer, articleId: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should have only one argument: articleId
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of renderer
I passed dump
to apply isMainPage condition, pls check.
src/MediaWiki.ts
Outdated
@@ -152,7 +152,7 @@ class MediaWiki { | |||
|
|||
public async hasVisualEditorApi(): Promise<boolean> { | |||
if (this.#hasVisualEditorApi === null) { | |||
this.#hasVisualEditorApi = await checkApiAvailability(this.VisualEditorURLDirector.buildArticleURL(this.apiCheckArticleId)) | |||
this.#hasVisualEditorApi = await checkApiAvailability(this.visualEditorURLDirector.buildArticleURL(this.apiCheckArticleId)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instanciate the needed director here temporary if you need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/MediaWiki.ts
Outdated
this.wikimediaDesktopUrlDirector = new WikimediaDesktopURLDirector(this.WikimediaDesktopApiUrl.href) | ||
this.wikimediaMobileUrlDirector = new WikimediaMobileURLDirector(this.WikimediaMobileApiUrl.href) | ||
this.VisualEditorURLDirector = new VisualEditorURLDirector(this.visualEditorApiUrl.href) | ||
this.visualEditorURLDirector = new VisualEditorURLDirector(this.visualEditorApiUrl.href) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should vanish
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, now instantiates in related Mediawiki hasCapabilies lazy methods.
@VadimKovalenkoSNF Unfortunately didn't have the time to look at this in detail. But please rebase and fix conflict and I will within 24 hours. |
df90842
to
4e244a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not there yet unfortunately. See my comments.
src/Downloader.ts
Outdated
{ condition: await MediaWiki.hasVisualEditorApi(), value: MediaWiki.VisualEditorApiUrl.href, Director: VisualEditorURLDirector }, | ||
{ condition: await MediaWiki.hasWikimediaMobileApi(), value: MediaWiki.WikimediaMobileApiUrl.href, Director: WikimediaMobileURLDirector }, | ||
] | ||
this.baseUrlForMainPage = basicURLDirector.buildDownloaderBaseUrl(mainPageCapabilitiesList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we stick with a baseUrl here and don't have a director like for other articles? Should be the case IMO. We should not have anything about baseUrl here in this class and probably nowhere else probably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls check recent changes, the property Downloader.baseUrl
is no longer used.
src/Downloader.ts
Outdated
const articlesCapabilitiesList = [ | ||
{ condition: await MediaWiki.hasWikimediaMobileApi(), value: MediaWiki.WikimediaMobileApiUrl.href, Director: WikimediaMobileURLDirector }, | ||
{ condition: await MediaWiki.hasWikimediaDesktopApi(), value: MediaWiki.WikimediaDesktopApiUrl.href, Director: WikimediaDesktopURLDirector }, | ||
{ condition: await MediaWiki.hasVisualEditorApi(), value: MediaWiki.VisualEditorApiUrl.href, Director: VisualEditorURLDirector }, | ||
] | ||
|
||
this.baseUrl = basicURLDirector.buildDownloaderBaseUrl(articlesCapabilitiesList) | ||
this.articleUrlDirector = this.getUrlDirector(articlesCapabilitiesList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the condition does not returns directly the director.... like it was returning directly the baseURL in the past? Concept of capabilites (and related functions) seem useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored this to downloader.setUrlsDirectors(mainPageRenderer, articlesRenderer)
src/MediaWiki.ts
Outdated
@@ -181,15 +184,12 @@ class MediaWiki { | |||
const baseUrlDirector = new BaseURLDirector(this.baseUrl.href) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole function/method initMWApis()
should disappear:
baseUrlDirector
should be make class public variable athis.urlDirector
- BaseURLDirector should take all the necessary arguments to be able to deliver all the necessary paths:
buildWikimediaDesktopApiUrl
,buildVisualEditorURL
... BaseURLDirector
methodbuildWikimediaDesktopApiUrl
,buildVisualEditorURL
... should be made lazy and renamed ingetWikimediaDekstopApiUrl()
, ....- Mediawiki method
urlDirector
should be used in place of all these variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored this part of the code, pls check set base(value: string) { ... }
. Once we set this.baseUrl
in Mediawiki we are eligible to calculate other endpoints.
src/Downloader.ts
Outdated
|
||
if (!this.baseUrl || !this.baseUrlForMainPage) throw new Error('Unable to find appropriate API end-point to retrieve article HTML') | ||
public getArticleUrl(dump: Dump, articleId: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't take dump
as parameter, only the articleId
. Create in addition a getMainPageUrl(articleId)
... the function interface naming is not great I know....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created downloader.getMainPageUrl()
9f313bd
to
aa14e7a
Compare
@VadimKovalenkoSNF This PR should be rebased and hopefuly is ready to review |
aa14e7a
to
db23e5c
Compare
Fixes: #1927
Depends on #1939
Depends on #1944