Rendering JATS/XML as HTML5 #15

Open
mrchristian opened this issue Sep 15, 2019 · 7 comments

Comments

@mrchristian
Contributor

You want to have some JATS/XML rendered as HTML5 for the Oxford XML Summer School. Can you point me to the type of source, or some example content, that needs rendering? That way I can try some things out. Ideally the GitHub Pages Jekyll framework could just use the JATS as is, but we will have to see.

I take it we would want either to concatenate a series of papers from directories into one big HTML output, or to create a mini website linking to the papers?
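For the rendering step itself, here is a rough sketch of what a first pass could look like. It is only an illustration using Python's standard library, not what ami3 or Jekyll does: the `TAG_MAP` table, the `to_html` and `render_body` names, and the choice to turn unknown elements into `span` are all assumptions made for the example, and attributes are dropped.

```python
import xml.etree.ElementTree as ET

# Illustrative mapping only; real JATS has many more elements than this.
TAG_MAP = {"body": "main", "sec": "section", "p": "p", "italic": "i", "bold": "b"}


def to_html(el, depth=1):
    """Recursively copy a JATS element into an HTML5-style element tree."""
    if el.tag == "title":
        # Section titles become headings; nesting depth picks h2, h3, ...
        new = ET.Element(f"h{min(depth, 6)}")
    else:
        # Unknown elements fall back to <span> (an assumption of this sketch).
        new = ET.Element(TAG_MAP.get(el.tag, "span"))
    new.text, new.tail = el.text, el.tail
    child_depth = depth + 1 if el.tag == "sec" else depth
    for child in el:
        new.append(to_html(child, child_depth))
    return new


def render_body(jats_xml):
    """Return the <body> of a JATS article as an HTML fragment string."""
    root = ET.fromstring(jats_xml)
    body = root.find("body")
    if body is None:
        return ""
    html_body = to_html(body)
    html_body.tail = None  # do not serialize whitespace that followed <body>
    return ET.tostring(html_body, encoding="unicode", method="html")


SAMPLE_ARTICLE = (
    "<article><body><sec><title>Intro</title>"
    "<p>Some <italic>text</italic> here.</p></sec></body></article>"
)

if __name__ == "__main__":
    print(render_body(SAMPLE_ARTICLE))
```

The fragment could then be dropped into a page template, either one page per paper or several fragments concatenated into one file.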

@petermr
Owner

petermr commented Sep 15, 2019 via email

@mrchristian
Contributor Author

What I need to understand is what we want to do with the directory https://github.com/petermr/climate/clim107/

Here you can see an example where I'm simply letting GitHub Pages render the HTML pages that already exist in /clim107/ as a mini website:

Use this link:
https://mrchristian.github.io/climate-publishing/commonest.dataTables.html

From https://github.com/mrchristian/climate-publishing/tree/master/docs

This is obviously not what we want as an end result; I just wanted to show GitHub Pages in action.

So... what would be a stage one version of publishing the results of a 'getpapers' process as GitHub pages?

It could simply be a homepage with a list of articles (title, date) and links to the HTML version and the original, all well styled.

After that we could move on to generating a mini website that represents the papers and dataset in a way that exposes the different aspects of the collection: word frequencies, dictionaries used, artifacts. Essentially, a website representation of what has already been created in XML.
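To make the stage one idea concrete, here is a minimal sketch of building such a list from JATS files, using only Python's standard library. The function names, the sample record, and the file names `fulltext.html` and `fulltext.xml` in the usage example are assumptions for illustration; the element paths follow the usual JATS layout (`article-meta/title-group/article-title` and `article-meta/pub-date`).

```python
import html
import xml.etree.ElementTree as ET


def article_summary(jats_xml):
    """Pull the title and first publication date out of one JATS document."""
    root = ET.fromstring(jats_xml)
    title_el = root.find(".//article-meta/title-group/article-title")
    if title_el is not None:
        title = "".join(title_el.itertext()).strip()
    else:
        title = "(untitled)"
    parts = []
    date_el = root.find(".//article-meta/pub-date")
    if date_el is not None:
        # Build year, year-month, or year-month-day from whatever is present.
        for tag in ("year", "month", "day"):
            value = date_el.findtext(tag)
            if value is None:
                break
            parts.append(value.strip().zfill(2))
    return {"title": title, "date": "-".join(parts)}


def index_page(entries):
    """entries: list of (summary, html_href, original_href) tuples."""
    items = []
    for summary, html_href, original_href in entries:
        items.append(
            f'<li><a href="{html.escape(html_href)}">{html.escape(summary["title"])}</a>'
            f' ({html.escape(summary["date"])})'
            f' <a href="{html.escape(original_href)}">original</a></li>'
        )
    return "<h1>Papers</h1>\n<ul>\n" + "\n".join(items) + "\n</ul>\n"


# Made-up sample record, not taken from the clim107 directory.
SAMPLE = """<article><front><article-meta>
<title-group><article-title>Climate <italic>change</italic> impacts</article-title></title-group>
<pub-date pub-type="epub"><day>5</day><month>9</month><year>2019</year></pub-date>
</article-meta></front></article>"""

if __name__ == "__main__":
    summary = article_summary(SAMPLE)
    print(index_page([(summary, "PMC0000000/fulltext.html", "PMC0000000/fulltext.xml")]))
```

Styling would be left to the site's own CSS or Jekyll theme; the sketch only produces the list markup.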

And we can add more features as we move along.

Thanks S

@petermr
Owner

petermr commented Sep 15, 2019 via email

@mrchristian
Contributor Author

So if I get it right, first off you're interested in getting the articles rendered as, say, Markdown so they can be shown in the GitHub repository itself rather than as a separate GitHub Pages website.

I'll have a go at aggregating 5 Scholarly HTML files into one Markdown file with a ToC, so it can be displayed in the GitHub repo. Does this sound right?
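A rough sketch of the aggregation step, assuming each paper has already been converted to Markdown (for example with a converter such as pandoc) and is available as a `(title, body)` pair. The `slugify` helper only approximates how GitHub derives heading anchors, and both function names are made up for this example.

```python
import re


def slugify(heading):
    """Approximate GitHub heading anchors: lowercase, drop punctuation, spaces to hyphens."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)
    return slug.replace(" ", "-")


def aggregate(papers):
    """papers: list of (title, markdown_body). Returns one Markdown document with a ToC."""
    toc = ["## Contents", ""]
    sections = []
    seen = {}
    for title, body in papers:
        slug = slugify(title)
        count = seen.get(slug, 0)
        seen[slug] = count + 1
        # Repeated headings get -1, -2, ... appended to keep anchors unique.
        anchor = slug if count == 0 else f"{slug}-{count}"
        toc.append(f"- [{title}](#{anchor})")
        sections.append(f"## {title}\n\n{body.strip()}\n")
    return "\n".join(toc) + "\n\n" + "\n".join(sections)


if __name__ == "__main__":
    print(aggregate([("Paper One", "First body."), ("Paper Two", "Second body.")]))
```

The output can be committed as a single `.md` file, which GitHub renders directly in the repository view.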

PS when is your Oxford XML day?

@petermr
Owner

petermr commented Sep 16, 2019 via email

@mrchristian
Contributor Author

OK, I'll carve out some time to get something in place, and I'll see if I can pull in some help. Sections: yes, very good. I can see there is something very exciting here, but I'm still making my way up the learning curve of the processes and outputs. More soon...

@petermr
Owner

petermr commented Sep 16, 2019

I have now upgraded ami3 to extract sections. Using

--sections ALL

we get

sectionList             [ABBREVIATION, ABSTRACT, ACK_FUND, APPENDIX, ARTICLE_META, ARTICLE_TITLE, CONTRIB, AUTH_CONT, BACK, BODY, CASE, CONCL, COMP_INT, DISCUSS, FINANCIAL, FIG, FRONT, INTRO, JOURNAL_META, JOURNAL_TITLE, PUBLISHER_NAME, KEYWORD, METHODS, OTHER, PMCID, REF, RESULTS, SUPPL, TABLE, SUBTITLE, TITLE]

Extracting all of them takes quite a while, but individual ones are quickish.
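For anyone who wants to experiment with the idea outside ami3, here is a small sketch of pulling a few named sections out of a JATS file with Python's standard library. It is not how ami3 implements `--sections`; the `SECTION_PATHS` mapping covers only a handful of the names above, and the paths chosen for each name are guesses based on common JATS structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a few section names to JATS element paths.
SECTION_PATHS = {
    "ARTICLE_TITLE": ".//article-meta/title-group/article-title",
    "ABSTRACT": ".//article-meta/abstract",
    "BODY": "./body",
    "FIG": ".//fig",
    "REF": ".//ref-list/ref",
}


def extract_sections(jats_xml, names):
    """Return {name: [text, ...]} for each requested section name."""
    root = ET.fromstring(jats_xml)
    result = {}
    for name in names:
        texts = []
        for el in root.findall(SECTION_PATHS[name]):
            # Join text nodes with spaces, then collapse runs of whitespace.
            # This can add a space around inline markup, fine for a sketch.
            texts.append(" ".join(" ".join(el.itertext()).split()))
        result[name] = texts
    return result


# Made-up sample article for illustration.
SAMPLE_JATS = """<article>
<front><article-meta>
<title-group><article-title>A title</article-title></title-group>
<abstract><p>Short   abstract.</p></abstract>
</article-meta></front>
<body><sec sec-type="intro"><title>Introduction</title><p>Text.</p></sec></body>
<back><ref-list>
<ref id="r1"><mixed-citation>Ref one</mixed-citation></ref>
<ref id="r2"><mixed-citation>Ref two</mixed-citation></ref>
</ref-list></back>
</article>"""

if __name__ == "__main__":
    print(extract_sections(SAMPLE_JATS, ["ARTICLE_TITLE", "ABSTRACT", "REF"]))
```

Asking for one or two names at a time keeps each run small, in the same spirit as extracting individual sections rather than ALL.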
