I'm trying to parse this JSON and I'm running into a problem. The JSON looks like this:

    <ListObject list at 0x2161945a860> JSON: {
  "data": [
    {
      "amount": 100,
      "available_on": 1621382400,
      "created": 1621264875,
      "currency": "usd",
      "description": "0123456",
      "exchange_rate": null,
      "fee": 266,
      "fee_details": [
        {
          "amount": 266,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_abvgd1234",
      "net": 9999,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust1",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    },
    {
      "amount": 25984,
      "available_on": 1621382400,
      "created": 1621264866,
      "currency": "usd",
      "description": "0326489",
      "exchange_rate": null,
      "fee": 93,
      "fee_details": [
        {
          "amount": 93,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_65987jihgf4984oihydgrd",
      "net": 9874,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust2",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    }
  ],
  "has_more": true,
  "object": "list",
  "url": "/v1/balance_"
}

I tried to parse it in Python with this script:

import pandas as pd
df = pd.json_normalize(json)
df.head()

But this is what I get:

[image]

What I need is to parse each of these data points into its own column, so that I end up with 2 rows of data and one column per data point. Something like this:

[image]

How can I do this?

Slavisha84 18 May 2021, 05:56

1 answer

Best answer

All but one of your fields are direct copies from the JSON, so you can simply build a list of the fields that can be copied as-is, and then do the extra processing for fee_details.

import json
import pandas as pd

inp =  """{
  "data": [
    {
      "amount": 100,
      "available_on": 1621382400,
      "created": 1621264875,
      "currency": "usd",
      "description": "0123456",
      "exchange_rate": null,
      "fee": 266,
      "fee_details": [
        {
          "amount": 266,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_abvgd1234",
      "net": 9999,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust1",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    },
    {
      "amount": 25984,
      "available_on": 1621382400,
      "created": 1621264866,
      "currency": "usd",
      "description": "0326489",
      "exchange_rate": null,
      "fee": 93,
      "fee_details": [
        {
          "amount": 93,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_65987jihgf4984oihydgrd",
      "net": 9874,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust2",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    }
  ],
  "has_more": true,
  "object": "list",
  "url": "/v1/balance_"
}"""

copies = [
    'id',
    'net',
    'object',
    'reporting_category',
    'source',
    'amount',
    'available_on',
    'created',
    'currency',
    'description',
    'exchange_rate',
    'fee'
]

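# Parse the JSON text into Python objects.
data = json.loads(inp)
rows = []
for inrow in data['data']:
    # Copy the plain fields straight across.
    outrow = {}
    for copy in copies:
        outrow[copy] = inrow[copy]
    # fee_details is a list of dicts; keep the description of its first entry.
    outrow['fee_details'] = inrow['fee_details'][0]['description']
    rows.append(outrow)

# Each dict in rows becomes one DataFrame row.
df = pd.DataFrame(rows)
print(df)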

Output:

timr@tims-gram:~/src$ python x.py
                           id   net               object reporting_category source  amount  ...     created  currency description exchange_rate  fee      fee_details
0               txn_abvgd1234  9999  balance_transaction             charge  cust1     100  ...  1621264875       usd     0123456          None  266  processing fees
1  txn_65987jihgf4984oihydgrd  9874  balance_transaction             charge  cust2   25984  ...  1621264866       usd     0326489          None   93  processing fees

[2 rows x 13 columns]
timr@tims-gram:~/src$ 
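
As an alternative sketch, pd.json_normalize can also do the flattening if it is given the list under "data" rather than the whole response. This is not the approach used above, only a variation on it. The sample below is a trimmed copy of the JSON from the question, and the names raw, payload, flat, fees and the fee_description column are made up for illustration. It assumes you only want the first fee's description, as in the loop above.

```python
import json
import pandas as pd

# Trimmed copy of the response from the question (only a few fields kept).
raw = """{
  "data": [
    {
      "amount": 100,
      "fee": 266,
      "fee_details": [
        {"amount": 266, "application": null, "currency": "usd",
         "description": "processing fees", "type": "fee"}
      ],
      "id": "txn_abvgd1234",
      "net": 9999,
      "sourced_transfers": {"data": [], "has_more": false, "total_count": 0}
    },
    {
      "amount": 25984,
      "fee": 93,
      "fee_details": [
        {"amount": 93, "application": null, "currency": "usd",
         "description": "processing fees", "type": "fee"}
      ],
      "id": "txn_65987jihgf4984oihydgrd",
      "net": 9874,
      "sourced_transfers": {"data": [], "has_more": false, "total_count": 0}
    }
  ],
  "has_more": true,
  "object": "list"
}"""
payload = json.loads(raw)

# One row per transaction: nested dicts become dotted column names
# (e.g. "sourced_transfers.total_count"); lists are left as they are.
flat = pd.json_normalize(payload["data"])

# Pull the first fee's description out of the fee_details list,
# tolerating an empty list, then drop the raw list column.
flat["fee_description"] = flat["fee_details"].apply(
    lambda fee_list: fee_list[0]["description"] if fee_list else None
)
flat = flat.drop(columns=["fee_details"])

# Optional: one row per fee entry instead, carrying the transaction id and net
# along. record_prefix avoids name clashes such as "amount".
fees = pd.json_normalize(
    payload["data"],
    record_path="fee_details",
    meta=["id", "net"],
    record_prefix="fee_",
)

print(flat)
print(fees)
```

Passing the whole response to pd.json_normalize treats the top-level object as a single record, so it produces one row with a "data" column holding the entire list. Passing payload["data"] gives one row per transaction.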
Tim Roberts 18 May 2021, 03:23