最近在使用 Pandas 处理 json 数据时遇到了 ValueError: Protocol not known
的问题
后面使用 json 库就解决了,不明白为什么
json 的数据就是 data 里包个 contestUpcomingContests ,里面再包一个数组,内有两个元素
import json
import pandas as pd
data = '{"data":{"contestUpcomingContests":[{"containsPremium":false,"title":"\u7b2c 99 \u573a\u53cc\u5468\u8d5b","cardImg":"https://assets.leetcode.cn/aliyun-lc-upload/contest-config/biweekly-contest-99/contest_detail/pc_card.png","titleSlug":"biweekly-contest-99","startTime":1677940200,"duration":5400,"originStartTime":1677940200},{"containsPremium":false,"title":"\u7b2c 334 \u573a\u5468\u8d5b","cardImg":"https://assets.leetcode.cn/aliyun-lc-upload/contest-config/weekly-contest-334/contest_detail/pc_card.png","titleSlug":"weekly-contest-334","startTime":1677378600,"duration":5400,"originStartTime":1677378600}]}}'
df1 = json.loads(data)
print(df1)
df2 = pd.read_json(data)
print(df2)
{'data': {'contestUpcomingContests': [{'containsPremium': False, 'title': '第 99 场双周赛', 'cardImg': 'https://assets.leetcode.cn/aliyun-lc-upload/contest-config/biweekly-contest-99/contest_detail/pc_card.png', 'titleSlug': 'biweekly-contest-99', 'startTime': 1677940200, 'duration': 5400, 'originStartTime': 1677940200}, {'containsPremium': False, 'title': '第 334 场周赛', 'cardImg': 'https://assets.leetcode.cn/aliyun-lc-upload/contest-config/weekly-contest-334/contest_detail/pc_card.png', 'titleSlug': 'weekly-contest-334', 'startTime': 1677378600, 'duration': 5400, 'originStartTime': 1677378600}]}}
Traceback (most recent call last):
File "/Users/world/Developer/AlgorithmSharkSpider/test.py", line 8, in <module>
df2 = pd.read_json(data)
File "/opt/homebrew/lib/python3.9/site-packages/pandas/util/_decorators.py", line 199, in wrapper
return func(*args, **kwargs)
File "/opt/homebrew/lib/python3.9/site-packages/pandas/util/_decorators.py", line 299, in wrapper
return func(*args, **kwargs)
File "/opt/homebrew/lib/python3.9/site-packages/pandas/io/json/_json.py", line 540, in read_json
json_reader = JsonReader(
File "/opt/homebrew/lib/python3.9/site-packages/pandas/io/json/_json.py", line 622, in __init__
data = self._get_data_from_filepath(filepath_or_buffer)
File "/opt/homebrew/lib/python3.9/site-packages/pandas/io/json/_json.py", line 659, in _get_data_from_filepath
self.handles = get_handle(
File "/opt/homebrew/lib/python3.9/site-packages/pandas/io/common.py", line 558, in get_handle
ioargs = _get_filepath_or_buffer(
File "/opt/homebrew/lib/python3.9/site-packages/pandas/io/common.py", line 333, in _get_filepath_or_buffer
file_obj = fsspec.open(
File "/opt/homebrew/lib/python3.9/site-packages/fsspec/core.py", line 419, in open
return open_files(
File "/opt/homebrew/lib/python3.9/site-packages/fsspec/core.py", line 272, in open_files
fs, fs_token, paths = get_fs_token_paths(
File "/opt/homebrew/lib/python3.9/site-packages/fsspec/core.py", line 574, in get_fs_token_paths
chain = _un_chain(urlpath0, storage_options or {})
File "/opt/homebrew/lib/python3.9/site-packages/fsspec/core.py", line 315, in _un_chain
cls = get_filesystem_class(protocol)
File "/opt/homebrew/lib/python3.9/site-packages/fsspec/registry.py", line 208, in get_filesystem_class
raise ValueError("Protocol not known: %s" % protocol)
ValueError: Protocol not known: {"data":{"contestUpcomingContests":[{"containsPremium":false,"title":"第 99 场双周赛","cardImg":"https
这是一个专为移动设备优化的页面(即为了让你能够在 Google 搜索结果里秒开这个页面),如果你希望参与 V2EX 社区的讨论,你可以继续到 V2EX 上打开本讨论主题的完整版本。
V2EX 是创意工作者们的社区,是一个分享自己正在做的有趣事物、交流想法,可以遇见新朋友甚至新机会的地方。
V2EX is a community of developers, designers and creative people.