This repository has been archived by the owner on Feb 16, 2018. It is now read-only.

How do I use mwdum.py? #1

Closed
bjarnitor opened this issue Mar 20, 2014 · 5 comments

@bjarnitor

I don't know much about SQL, but I intend to import articles from Wikipedia into a local MediaWiki instance.

So I was thinking about doing something like this:

python mwdum.py enwiki-latest-pages-articles.xml > wikipedia.sql

And then figure out how to inject the SQL statements in the file into my MediaWiki database.

Is this the right way to do it?

Thanks.

@andreasnuesslein
Owner

=) Yes, that sounds correct.

However, I should warn you: a) I have not maintained it for almost half a year (although it should still work), and b) if you do not know much about SQL, you might want to reconsider tinkering around with the Wikipedia dumps. They're quite huge, and it's a challenge to import and actually use them.

Cheers

@chrispadfield

I just tried this, and get the following error. Am I making a silly mistake here?

python2 mwdum.py enwiki-20140811-pages-articles1.xml-p000000010p000010000 > wiki.sql
Traceback (most recent call last):
  File "mwdum.py", line 232, in <module>
    mwd = MWDump(filename, MySQL_Output)
  File "mwdum.py", line 48, in __init__
    self.output_function = output_function()
  File "mwdum.py", line 188, in __init__
    myprint = self.MyPrint()
  File "mwdum.py", line 154, in __init__
    uprint('BEGIN;')
  File "mwdum.py", line 14, in <lambda>
    uprint = lambda text: sys.stdout.buffer.write((text+'\n').encode('utf-8'))
AttributeError: 'file' object has no attribute 'buffer'

@andreasnuesslein
Owner

Hey Chris,

I haven't used this code in months and am not even sure if it's still working. Have you tried debugging it a bit, and have you gotten anywhere with that?

On Fri, Aug 22, 2014 at 12:40 PM, Chris Padfield wrote:

> I just tried this, and get the following error. Am I making a silly mistake here?

@chrispadfield

Hi, I didn't, I'm afraid, largely because I've never written a line of Python. I tried some google-fu on the error itself but didn't get far. I wasn't sure whether I was using the wrong syntax, so I wanted to check that first.
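
A note on the error itself: the last frame of the traceback calls `sys.stdout.buffer`, an attribute that text streams have in Python 3 but that Python 2 `file` objects lack. That suggests the script was written for Python 3, so running it with `python3` instead of `python2` is probably the simplest thing to try. If it has to run under both, a version-tolerant variant of the `uprint` helper might look like the sketch below. The `make_uprint` wrapper and its `stream` parameter are hypothetical names introduced here for illustration and are not part of mwdum.py.

```python
import sys


def make_uprint(stream=None):
    """Return a function that writes text plus a newline to `stream` as UTF-8.

    Hypothetical helper, not part of mwdum.py.
    """
    if stream is None:
        stream = sys.stdout
    # Python 3 text streams expose their underlying byte stream as `.buffer`.
    # Python 2 file objects have no such attribute but accept byte strings
    # directly, so fall back to the stream itself.
    raw = getattr(stream, "buffer", stream)

    def uprint(text):
        raw.write((text + u"\n").encode("utf-8"))

    return uprint


if __name__ == "__main__":
    uprint = make_uprint()
    uprint(u"BEGIN;")
```

The `getattr` fallback keeps the single `encode('utf-8')` call from the original line, so the bytes written should be the same under either interpreter.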

@axiom17

axiom17 commented Feb 16, 2018

See @mrgloom's response here.
nenadmarkus/pico#4
