[{"id":1,"title":"Programming languages created in the 1940s","level":0,"transitivePages":1,"pages":1,"transitiveSubcategories":0,"parentCategories":0,"subcategories":0,"type":"Category"},{"start":1,"type":"ContainsPage","end":2},{"id":2,"title":"Plankalk<9F>l","type":"Page"}]
Loading this in, say, Python gives:
>>> import json
>>> jsonfile = open("bug.json", 'rb')
>>> cgraph = json.load(jsonfile)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 278, in load
    **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x9f in position 8: invalid start byte
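It looks like the exported JSON has an encoding problem.
The JSON above is what you get when extracting http://en.wikipedia.org/wiki/Category:Programming_languages_created_in_the_1940s
The page title should be "Plankalkül", but the ü is written as the single byte 0x9f, which is not valid UTF-8, so the JSON parser rejects the file. (0x9f is ü in Mac OS Roman, which suggests the export used a platform encoding instead of UTF-8.)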
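Until the export is fixed, a reader-side workaround is possible. The following is only a minimal sketch: the helper name `load_json_with_fallback` is made up for illustration, and the choice of `mac_roman` as the fallback codec is an assumption based on the 0x9f byte, not something confirmed about the exporter.

```python
import json


def load_json_with_fallback(raw, fallback="mac_roman"):
    """Parse JSON bytes as UTF-8, retrying with a fallback codec on failure.

    Hypothetical helper; the fallback codec is a guess about how the
    export was encoded.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the file was written in a legacy single-byte encoding.
        text = raw.decode(fallback)
    return json.loads(text)


# Bytes mirroring the page entry from the export, with 0x9f where "ü" belongs.
raw = b'[{"id":2,"title":"Plankalk\x9fl","type":"Page"}]'
cgraph = load_json_with_fallback(raw)
print(cgraph[0]["title"])
```

For the real file, the same helper could be fed `open("bug.json", "rb").read()`. This only hides the symptom; the export itself should write UTF-8.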