I am trying to read a JSON file into a Python pandas (0.14.0) data frame. Here is the first line of the JSON file:
{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "P_Mk0ygOilLJo4_WEvabAA", "review_id": "OeT5kgUOe3vcN7H6ImVmZQ", "stars": 3, "date": "2005-08-26", "text": "This is a pretty typical cafe. The sandwiches and wraps are good but a little overpriced and the food items are the same. The chicken caesar salad wrap is my favorite here but everything else is pretty much par for the course.", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}
I am trying to do the following:
    df = pd.read_json(path)
I get the following error (with full traceback):
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 198, in read_json
    date_unit).parse()
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 266, in parse
    self._parse_no_numpy()
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 483, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Trailing data
What is the Trailing data error? How do I read the file into a data frame?
Following some suggestions, here are a few lines of the .json file:
{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "P_Mk0ygOilLJo4_WEvabAA", "review_id": "OeT5kgUOe3vcN7H6ImVmZQ", "stars": 3, "date": "2005-08-26", "text": "This is a pretty typical cafe. The sandwiches and wraps are good but a little overpriced and the food items are the same. The chicken caesar salad wrap is my favorite here but everything else is pretty much par for the course.", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}
{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "TNJRTBrl0yjtpAACr1Bthg", "review_id": "qq3zF2dDUh3EjMDuKBqhEA", "stars": 3, "date": "2005-11-23", "text": "I agree with other reviewers - this is a pretty typical financial district cafe. However, they have fantastic pies. I ordered three pies for an office event (apple, pumpkin cheesecake, and pecan) - all were delicious, particularly the cheesecake. The sucker weighed in about 4 pounds - no joke.\n\nNo surprises on the cafe side - great pies and cakes from the catering business.", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}
{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "H_mngeK3DmjlOu595zZMsA", "review_id": "i3eQTINJXe3WUmyIpvhE9w", "stars": 3, "date": "2005-11-23", "text": "Decent enough food, but very overpriced. Just a large soup is almost $5. Their specials are $6.50, and with an overpriced soda or juice, it's approaching $10. A bit much for a cafe lunch!", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}
The .json file I am using contains one JSON object on each line, as per the specification.
I tried the jsonlint.com website as suggested, and it gives the following error:
Parse error on line 14:
...t7sRT4zwdbzQ8KQmw"}{ "votes": {
----------------------^
Expecting 'EOF', '}', ',', ']'
Tags: python, json, python-2.7, pandas
Asked by user62198
Answers:
From pandas version 0.19.0 you can use the lines parameter, like this:
    import pandas as pd
    data = pd.read_json('/path/to/file.json', lines=True)
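To make the lines parameter concrete, here is a minimal, self-contained sketch. The two sample records and the temporary file are made up for illustration (they are a cut-down version of the records in the question), and it assumes pandas 0.19.0 or later is installed.

```python
import os
import tempfile

import pandas as pd

# Two records in the layout from the question: one complete JSON object per line.
# The field values are illustrative only.
sample = (
    '{"stars": 3, "date": "2005-08-26", "type": "review"}\n'
    '{"stars": 4, "date": "2005-11-23", "type": "review"}\n'
)

# Write the sample to a temporary file so the example does not depend on any real data.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

try:
    # lines=True asks pandas to parse each line as its own JSON object.
    df = pd.read_json(sample_path, lines=True)
finally:
    os.remove(sample_path)

print(df.shape)
```

Each line becomes one row and each key becomes one column, so the frame here has two rows and three columns.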
[comment, beginning missing] ... lines argument? github.com/pandas-dev/pandas/issues/15132

You have to read it line by line. For example, you can use the following code provided by ryptophan on reddit:
    import io
    import pandas as pd

    # read the file into a list of lines (text mode, so each line is a str)
    with open('your.json', 'r') as f:
        data = f.readlines()

    # remove the trailing "\n" from each line
    data = [line.rstrip() for line in data]

    # each element of 'data' is an individual JSON object.
    # join them with commas and wrap them in square brackets,
    # which turns them into one JSON array of objects
    data_json_str = "[" + ','.join(data) + "]"

    # now, load it into pandas (wrapped in StringIO, because recent pandas
    # versions deprecate passing a literal JSON string to read_json)
    data_df = pd.read_json(io.StringIO(data_json_str))
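As for why the file has to be read line by line: a file with one object per line is not a single JSON document, so a parser that expects exactly one document stops after the first object. The sketch below reproduces this with the standard library's json module, which reports the same condition as "Extra data"; the two tiny records are invented for illustration.

```python
import json

# Two JSON objects, one per line, as in the question's file (values are illustrative).
two_records = '{"stars": 3}\n{"stars": 4}\n'

# Parsing the whole text fails: once the first object ends, the parser
# finds more data where it expected the end of the document.
try:
    json.loads(two_records)
    whole_file_error = None
except json.JSONDecodeError as exc:
    whole_file_error = exc.msg

# Parsing line by line works, because each line is valid JSON by itself.
records = [json.loads(line) for line in two_records.splitlines() if line.strip()]

print(whole_file_error)
print(records)
```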
The following code helped me load the JSON content into a dataframe:
    import json
    import pandas as pd

    with open('Appointment.json', encoding="utf8") as f:
        data = f.readlines()

    # convert each line from a JSON string to a dict
    data = [json.loads(line) for line in data]

    # build the dataframe from the list of dicts
    # (pd.read_json expects a path or JSON text, not parsed objects)
    df = pd.DataFrame(data)
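A slightly more defensive variant of the same line-by-line idea is sketched below. The helper name iter_json_lines is hypothetical (it is not part of pandas or the json module); it skips blank lines and reports the line number of any line that fails to parse. The sample lines are made up.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_json_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed object per non-blank line (hypothetical helper)."""
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. an empty line at the end of the file
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Re-raise with the line number so the bad record is easy to find.
            raise ValueError(f"line {line_number}: {exc.msg}") from exc


# Illustrative input: two records separated by a blank line.
sample_lines = [
    '{"stars": 3, "votes": {"funny": 0, "useful": 0, "cool": 0}}\n',
    "\n",
    '{"stars": 4, "votes": {"funny": 1, "useful": 2, "cool": 0}}\n',
]
records = list(iter_json_lines(sample_lines))
print(len(records))
```

An open file handle can be passed in place of sample_lines, and the resulting list of dicts can then be handed to pd.DataFrame.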
I had a similar problem.
It turned out that pd.read_json('myfile.json') will look in the parent folder automatically, but returns this 'trailing data' error if you are not in the same folder as the file.
I figured it out because when I tried to do it with open('myfile.json', 'r') I got a FileNotFound error, so I checked the path. I had failed to move myfile.json into the same folder as my notebook.
Changing it to pd.read_json('../myfile.json') just worked.
[comment, beginning missing] ... ValueError: Trailing data when it should give FileNotFound. This happened to me too.
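Since a wrong path showed up here as a confusing parse error, one option is to check the path explicitly before handing it to pd.read_json. The sketch below uses only the standard library; require_file is a hypothetical helper name, and the temporary folder and file names are made up for the demonstration.

```python
import tempfile
from pathlib import Path


def require_file(path: str) -> Path:
    """Return the resolved path, or raise FileNotFoundError (hypothetical helper)."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        # Include the working directory, since relative paths depend on it.
        raise FileNotFoundError(f"No such file: {resolved} (cwd: {Path.cwd()})")
    return resolved


with tempfile.TemporaryDirectory() as folder:
    # A file that exists passes the check.
    existing = Path(folder) / "myfile.json"
    existing.write_text('{"stars": 3}\n', encoding="utf8")
    found = require_file(str(existing)).name

    # A missing file fails early with a clear error instead of a parse error.
    try:
        require_file(str(Path(folder) / "missing.json"))
        missing_error = None
    except FileNotFoundError as exc:
        missing_error = type(exc).__name__

print(found, missing_error)
```

Calling require_file on the path first means a misplaced file is reported as a missing file, whatever the installed pandas version does with a bad path.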