@
faketemp #15 再试试 pandas 、dsq 、fx 这些,以及 dsq 提到的其他比较快的工具 octosql 、duckdb ( q 太大了,算了):
- **pandas**:总计 5.24 s ,实际 pandas 只用 3.85 s 。看来确实有点重量级,不太适合日常小型数据处理。
- **dsq**:5.30 s ,但是不支持 CSV 输出,是以 json 原样输出的。*(看时间统计,似乎是双线程?)*
- **fx**:很久,且占了 4GB+ 内存,直接中断了。。*(好像也不支持 CSV 输出)*
- **octosql**:不支持读取 JSON Array ,只能 JSON Object ?
- **duckdb**:4.61 s ,和 SQLite 差不多。
Shell 操作记录:
```shell
~/Downloads $ time py -c 'import time; import pandas as pd; begin = time.time(); pd.read_json(r"Y:\villages.json")[["code", "name"]].set_index("code").to_csv(r"Y:\out.csv"); print(f"{time.time() - begin:.2f}s")' && showCsv
3.85s
real 0m 5.24s
user 0m 0.00s
sys 0m 0.01s
~/Downloads $ time ./dsq.exe Y:/villages.json 'SELECT code, name FROM {}' > Y:/out.csv && showCsv
real 0m 5.30s
user 0m 8.81s
sys 0m 1.50s
619503 33905785 Y:/out.csv
{"code":"659011502509","name":"十一连生活区"},
{"code":"659011502510","name":"十二连生活区"},
{"code":"659011502512","name":"五连生活区"}]
~/Downloads $ time ./fx.exe Y:/villages.json 'x.map(y => [y.code,
y.name])' > Y:/out.csv && showCsv
^C
~/Downloads $ ./octosql.exe 'SELECT * FROM provinces.json'
Error: typecheck error: couldn't create datasource: expected JSON object, got '[{"code":"11","name":"北京市"},{"code":"12","name":"天津市"},{"code":"13","name":"河北省"},{"code":"14","name":"山西省"},{"code":"15","name":"内蒙古自治区"},{"code":"21","name":"辽宁省"},{"code":"22","name":"吉 林省"},{"code":"23","name":"黑龙江省"},{"code":"31","name":"上海市"},{"code":"32","name":"江苏省"},{"code":"33","name":"浙江省"},{"code":"34","name":"安徽省"},{"code":"35","name":"福建省"},{"code":"36","name":"江西省"},{"code":"37","name":"山东省"},{"code":"41","name":"河南省"},{"code":"42","name":"湖北省"},{"code":"43","name":"湖南省"},{"code":"44","name":"广东省"},{"code":"45","name":"广西壮族自治区"},{"code":"46","name":"海南省"},{"code":"50","name":"重庆市"},{"code":"51","name":"四川省"},{"code":"52","name":"贵州省"},{"code":"53","name":"云南省"},{"code":"54","name":"西藏自治区"},{"code":"61","name":"陕西省"},{"code":"62","name":"甘肃省"},{"code":"63","name":"青海省"},{"code":"64","name":"宁夏回族自治区"},{"code":"65","name":"新疆维吾尔自治区"}]'
~/Downloads $ time ./duckdb.exe -csv :memory: "SELECT code, name FROM 'Y:/villages.json'" > Y:/out.csv && showCsv
real 0m 4.61s
user 0m 3.40s
sys 0m 1.15s
619504 22135235 Y:/out.csv
659011502509,"十一连生活区"
659011502510,"十二连生活区"
659011502512,"五连生活区"
```