v0.2.1 -- bugfixes for loading new chapters #22
Looks good to go! Tested it just now and it loaded new chapters! Date parsing was tested earlier before the scraper driver code was actually committed.
Might want to add a publish commit here as I'm planning to publish immediately upon merging.
The driver code could probably use more comments, but I don't remember the reasons for some of the decisions (another reason for version control!), and many were inherited (though I believe I experimented with changing several to see whether they were necessary).
- Previously, the new chapter list would just overwrite the old, persisted one
  - but the read/unread & new tracking meant the two lists had to be merged, and the previous merge logic was causing the bugs that are fixed here
- For any manga that were persisted, new chapters wouldn't appear in the chapter list; it would only show the ones that were persisted
  - this is apparently because MST was swallowing an error when a node is used more than once in the tree
- Instead of updating each entry in the chapter list as we go, just update the whole list at once with the new one, and do the merging out of tree before updating
- Fix the incorrect logic that merged via indices
  - for one thing, chapters are in reverse chronological order, so when a new chapter is released, old[0] will be equal to new[1], as new[0] is the new chapter; the previous logic merged old[0] with new[0], which is super broken (whoops)
  - instead, merge via `link`, which is the MST identifier of a chapter and therefore must be unique
  - currently assuming that the chapters are still ordered the same way, so we don't have to search the entire list to see if a chapter exists and can instead move an offset as needed
    - this is somewhat of an over-optimization, and it falls over relatively hard if a chapter were released in between old chapters (like a 0.5, an uncensored version, a different translation, etc.)
- TODO: the optimal fix is to have a map of chapters, update the map entries, and then the list is just the ordering of refs
  - would be somewhat similar to how mangas are stored
  - this breaks the stored model data though, so it requires a migration, and those aren't yet implemented in mst-persist, nor are transforms to work around the lack of them
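The merge-by-`link` approach with a moving offset could be sketched roughly like this (a simplified standalone version; the real code operates on MST models, and the field names here are illustrative):

```typescript
interface Chapter {
  link: string;      // unique identifier (the MST identifier in the app)
  title: string;
  read?: boolean;    // persisted read/unread state we want to preserve
}

// Merge a freshly scraped chapter list (reverse chronological) with the
// persisted one. Chapters are matched by `link`, not by index: when a new
// chapter is released, newList[0] is the new chapter and newList[1] lines
// up with oldList[0], so index-based merging pairs the wrong chapters.
function mergeChapters(oldList: Chapter[], newList: Chapter[]): Chapter[] {
  let offset = 0; // assumes both lists keep the same relative order
  return newList.map((fresh, i) => {
    const candidate = oldList[i - offset];
    if (candidate && candidate.link === fresh.link) {
      // existing chapter: take fresh metadata, keep persisted read state
      return { ...candidate, ...fresh, read: candidate.read };
    }
    offset++; // not in the persisted list (e.g. a newly released chapter)
    return fresh;
  });
}
```

As noted above, the offset only advances forward, so a chapter inserted between existing ones (a 0.5, a different translation, etc.) would defeat this shortcut; a `link`-keyed map avoids that at the cost of a migration.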
- Bugs were occurring in the scraper code, and it felt weird to fix them without being able to commit or version control that code
- In the future, when multiple providers are supported and the API is somewhat stable, we'll probably want to split this out into its own library / SDK and have the app depend on the SDK instead of doing its own parsing
- Still gitignore site names and URLs for now
  - and rename the exposed files to numbers instead of names for now
  - might use acronyms or something in the future for multi-provider support and letting the user choose providers
- The provider uses relative dates: "hours ago", "days ago", "Today", "Yesterday", etc.
  - this seems to be all the permutations, for this provider at least
- For now, just convert them to Date.now(); ideally the number of days or hours should actually be subtracted out for an accurate date
  - this is a much simpler fix right now, but it may be confusing to users who know their release dates, so not ideal
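The "ideal" version mentioned above (actually subtracting the stated amount rather than returning the current time) could look something like this; the exact string formats are assumed from the examples listed, and anything unrecognized falls back to "now" as the current fix does:

```typescript
// Parse the provider's relative date strings ("3 hours ago", "2 days ago",
// "Today", "Yesterday") into an absolute Date by subtracting the stated
// amount from the current time. Unknown formats fall back to `now`.
function parseRelativeDate(text: string, now: Date = new Date()): Date {
  const HOUR = 60 * 60 * 1000;
  const DAY = 24 * HOUR;
  const trimmed = text.trim();
  const ms = now.getTime();

  if (/^today$/i.test(trimmed)) return new Date(ms);
  if (/^yesterday$/i.test(trimmed)) return new Date(ms - DAY);

  const match = trimmed.match(/^(\d+)\s+(hour|day)s?\s+ago$/i);
  if (match) {
    const amount = parseInt(match[1], 10);
    const unit = match[2].toLowerCase() === "day" ? DAY : HOUR;
    return new Date(ms - amount * unit);
  }
  return new Date(ms); // unknown format: current behavior, use "now"
}
```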
- Fixes parsing of release dates when a relative date is used
- Fixes merge bugs causing new chapters to not show up
- Internal changes:
  - expose / unignore / version control some scraper code
Opened #20, as I don't think parsing the relative dates more accurately is high priority / worth the time right now.

A slightly big change is that, in the process of this PR, I've exposed much of the scraper driver code that is specific to scraping / parsing so that it can be version controlled (as well as used as an example for forks, custom providers, etc. -- see #1). A small portion is still private for now. Once more providers are added (#6), I think it would be wise to consider extracting this logic into its own "SDK" of sorts and just have the app use the SDK. I'm still unsure how safe-listing, custom providers, etc. might work, but I do think that logic is independent enough to be split out; the goal of this app is to just be a reader -- where & how it reads is not a central concern (so long as it supports user needs/wants).