Chinese text encoding problem in the WeChat File Transfer Assistant web version

28 days ago
chenliang0571

Does anyone know what the original Chinese encoding is here, and how to decode it?

String returned in the JSON

流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023秋冬 男士系列。
#TodsTheItalianPortrait

创意总监:Walter Chiapponi

Actual content

流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023 秋冬 男士系列。
#TodsTheItalianPortrait

创意总监:Walter Chiapponi

I asked several AIs, and most of them suggested decoding it like this:

const iconv = require('iconv-lite');
const garbledText = "流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023秋冬 男士系列。\n#TodsTheItalianPortrait\n\n创意总监:Walter Chiapponi";
// 'binary' is an alias of 'latin1': each character is reduced to a single byte
const buf = Buffer.from(garbledText, 'binary');
// Encodings tried for the decode step:
// const decodedText = iconv.decode(buf, 'windows-1252');
// const decodedText = iconv.decode(buf, 'latin1');
// const decodedText = iconv.decode(buf, 'gbk');
const decodedText = iconv.decode(buf, 'utf-8');
console.log(decodedText);

But the actual output looks like this, with only a small part of the content decoded:

流�"&轮�㬁精�!�� �`�R�!����9�`��R�&�����}�#TODS2023�9� � ��士系���
#TodsTheItalianPortrait

��:��欻�:�aWalter Chiapponi
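One possible reason for the partial decode, offered as an assumption since the exact garbled string is not preserved here: if the text was misread as windows-1252, some of its characters (such as € or Ž) sit above U+00FF, and `Buffer.from(str, 'binary')` keeps only the low byte of each character instead of the windows-1252 byte. A minimal sketch using only Node's built-in `Buffer`:

```javascript
// In windows-1252, byte 0x80 is displayed as '€' (U+20AC) and 0x8E as 'Ž' (U+017D).
const euro = '\u20ac';
const zCaron = '\u017d';

// 'binary' (alias of 'latin1') keeps only the low 8 bits of each UTF-16 code unit,
// so these characters do not map back to the bytes they came from.
const euroByte = Buffer.from(euro, 'binary')[0];     // 0xac, not 0x80
const zCaronByte = Buffer.from(zCaron, 'binary')[0]; // 0x7d, not 0x8e

console.log(euroByte.toString(16), zCaronByte.toString(16));
```

Characters in the range U+00A0 to U+00FF survive this step unchanged, which would be consistent with only some of the text decoding correctly.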

request url

https://filehelper.weixin.qq.com/cgi-bin/mmwebwx-bin/webwxsync?sid=9jOOuKz7HAavYjQY&skey=%40crypt_f9b8121a_46a00ef5be0a151552b6da4e4a72d526&pass_ticket=Pr31qvGLTC%252FORqib4BZsCAhorw3BuGqE1br%252FNUxuFFQSjJKCW5AJlqsth2T3UdkxflKlNmfUnuEQXEBe%252FE8mcQ%253D%253D

response

{
    "BaseResponse": {
        "Ret": 0,
        "ErrMsg": ""
    },
    "AddMsgCount": 1,
    "AddMsgList": [
        {
            "MsgId": "100930987469004064",
            "FromUserName": "@c4dcd4010dc50e5ee03f32ae786701de",
            "ToUserName": "filehelper",
            "MsgType": 1,
            "Content": "流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023秋冬 男士系列。<br/>#TodsTheItalianPortrait<br/><br/>创意总监:Walter Chiapponi",
            "Status": 3,
            "ImgStatus": 1,
            "CreateTime": 1740397130,
            "VoiceLength": 0,
            "PlayLength": 0,
            "FileName": "",
            "FileSize": "",
            "MediaId": "",
            "Url": "",
            "AppMsgType": 0,
            "StatusNotifyCode": 0,
            "StatusNotifyUserName": "",
            "RecommendInfo": {
                "UserName": "",
                "NickName": "",
                "QQNum": 0,
                "Province": "",
                "City": "",
                "Content": "",
                "Signature": "",
                "Alias": "",
                "Scene": 0,
                "VerifyFlag": 0,
                "AttrStatus": 0,
                "Sex": 0,
                "Ticket": "",
                "OpCode": 0
            },
            "ForwardFlag": 0,
            "AppInfo": {
                "AppID": "",
                "Type": 0
            },
            "HasProductId": 0,
            "Ticket": "",
            "ImgHeight": 0,
            "ImgWidth": 0,
            "SubMsgType": 0,
            "NewMsgId": 100930987469004064,
            "OriContent": "",
            "EncryFileName": ""
        }
    ],
    "ModContactCount": 0,
    "ModContactList": [],
    "DelContactCount": 0,
    "DelContactList": [],
    "ModChatRoomMemberCount": 0,
    "ModChatRoomMemberList": [],
    "Profile": {
        "BitFlag": 0,
        "UserName": {
            "Buff": ""
        },
        "NickName": {
            "Buff": ""
        },
        "BindUin": 0,
        "BindEmail": {
            "Buff": ""
        },
        "BindMobile": {
            "Buff": ""
        },
        "Status": 0,
        "Sex": 0,
        "PersonalCard": 0,
        "Alias": "",
        "HeadImgUpdateFlag": 0,
        "HeadImgUrl": "",
        "Signature": ""
    },
    "ContinueFlag": 0,
    "SyncKey": {
        "Count": 14,
        "List": [
            {
                "Key": 1,
                "Val": 940546031
            },
            {
                "Key": 2,
                "Val": 897439235
            },
            {
                "Key": 3,
                "Val": 940546023
            },
            {
                "Key": 11,
                "Val": 940546048
            },
            {
                "Key": 19,
                "Val": 44482
            },
            {
                "Key": 23,
                "Val": 1740396794
            },
            {
                "Key": 24,
                "Val": 1740397130
            },
            {
                "Key": 25,
                "Val": 897439235
            },
            {
                "Key": 27,
                "Val": 308443
            },
            {
                "Key": 201,
                "Val": 1740397130
            },
            {
                "Key": 203,
                "Val": 1740396590
            },
            {
                "Key": 206,
                "Val": 101
            },
            {
                "Key": 1000,
                "Val": 1740395520
            },
            {
                "Key": 1001,
                "Val": 1740395522
            }
        ]
    },
    "SKey": "",
    "SyncCheckKey": {
        "Count": 14,
        "List": [
            {
                "Key": 1,
                "Val": 940546031
            },
            {
                "Key": 2,
                "Val": 897439235
            },
            {
                "Key": 3,
                "Val": 940546023
            },
            {
                "Key": 11,
                "Val": 940546048
            },
            {
                "Key": 19,
                "Val": 44482
            },
            {
                "Key": 23,
                "Val": 1740396794
            },
            {
                "Key": 24,
                "Val": 1740397130
            },
            {
                "Key": 25,
                "Val": 897439235
            },
            {
                "Key": 27,
                "Val": 308443
            },
            {
                "Key": 201,
                "Val": 1740397130
            },
            {
                "Key": 203,
                "Val": 1740396590
            },
            {
                "Key": 206,
                "Val": 101
            },
            {
                "Key": 1000,
                "Val": 1740395520
            },
            {
                "Key": 1001,
                "Val": 1740395522
            }
        ]
    }
}
4 replies
ntedshen
28 days ago
现(utf8)=e78eb0=现(latin1)
chenliang0571
27 days ago
@ntedshen That doesn't seem right?
> iconv.encode('现', 'utf-8')
<Buffer e7 8e b0>

> iconv.encode('现', 'latin1')
<Buffer e7 3f b0>
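The two results above can be reconciled with Node's built-in `Buffer` alone (a small sketch without iconv-lite; the variable names are made up for illustration). Encoding 现 to latin1 is lossy, but reading the UTF-8 bytes e7 8e b0 as latin1 yields three Latin characters, and that is the direction that produces mojibake:

```javascript
// UTF-8 bytes of the character, same as iconv.encode('现', 'utf-8')
const utf8Bytes = Buffer.from('现', 'utf8');
const utf8Hex = utf8Bytes.toString('hex'); // 'e78eb0'

// Misreading those bytes as latin1 gives one character per byte
const misread = utf8Bytes.toString('latin1');
const misreadCodePoints = [...misread].map((ch) => ch.codePointAt(0).toString(16));

// Reversing the misreading recovers the original text
const recovered = Buffer.from(misread, 'latin1').toString('utf8');

console.log(utf8Hex, misreadCodePoints, recovered);
```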
ntedshen
27 days ago
@chenliang0571
https://cs.stanford.edu/people/miles/iso8859.html
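3f is a question mark

Actually you don't need to worry about this. All you need to know right now is that the encoding is wrong; there is no way the API would hand you a Latin character set and leave you to handle the Chinese yourself...
Check whether the Content-Type is missing utf8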
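If the Content-Type does turn out to carry no charset, one way to rule out a wrong guess by the HTTP client is to take the raw response bytes and decode them as UTF-8 explicitly. This is a sketch under the assumption that the body really is UTF-8; `parseUtf8Json` and the sample body are hypothetical and stand in for the real `webwxsync` response:

```javascript
// Hypothetical helper: decode raw response bytes as UTF-8 regardless of the
// Content-Type header, then parse the JSON.
function parseUtf8Json(bytes) {
  const text = new TextDecoder('utf-8').decode(bytes);
  return JSON.parse(text);
}

// Simulated body standing in for the bytes of the real response
const sampleBody = Buffer.from(JSON.stringify({ Content: '秋冬 男士系列' }), 'utf8');
const parsed = parseUtf8Json(sampleBody);

console.log(parsed.Content);
```

With a real request, the bytes would come from whatever raw-body option the HTTP client offers (for example `arrayBuffer()` on a `fetch` response) rather than from a text accessor that applies its own charset guess.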
chenliang0571
27 days ago
@ntedshen
request:content-type:application/json;charset=UTF-8
response:content-type:text/plain
---

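I found the cause: bytes 81, 8D, 8F, 90, and 9D are unused in windows-1252 ( https://zh.wikipedia.org/wiki/Windows-1252 ).

So if the Chinese text below is passed through windows-1252 and then decoded as utf-8 again, some of the characters come out wrong.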

iconv.decode(iconv.encode(iconv.decode(iconv.encode('流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023 秋冬 男士系列。#TodsTheItalianPortrait 创意总监:Walter Chiapponi', 'utf-8'), 'windows-1252'), 'windows-1252'), 'utf-8')

浝畅轮廓〝精致细节和自由律动,糅坈呈现#TODS2023 秋冬 男士系列。#TodsTheItalianPortrait 创愝总监:Walter Chiapponi
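The round trip above loses data only at those five unassigned bytes. A reversible mapping is possible if they are passed through as the control characters U+0081, U+008D, U+008F, U+0090, and U+009D, which is how the WHATWG Encoding Standard defines windows-1252. The sketch below hand-writes that table so it needs no dependencies; the function names are made up, and it only helps if the garbled string still contains those control characters (if they were already replaced with U+FFFD or `?`, the bytes are gone):

```javascript
// Code points windows-1252 assigns to bytes 0x80..0x9F; 0 marks the five
// unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D).
const CP1252_HIGH = [
  0x20ac, 0, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0, 0x017d, 0,
  0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0, 0x017e, 0x0178,
];

// Map a byte to a code point; unassigned bytes pass through unchanged.
function byteToCodePoint(b) {
  if (b < 0x80 || b > 0x9f) return b;
  return CP1252_HIGH[b - 0x80] || b;
}

// Inverse table: code point back to the byte it came from.
const CODE_POINT_TO_BYTE = new Map();
for (let b = 0; b < 256; b++) CODE_POINT_TO_BYTE.set(byteToCodePoint(b), b);

// Simulate the corruption: UTF-8 bytes read as windows-1252.
function toMojibake(text) {
  return Array.from(Buffer.from(text, 'utf8'), (b) =>
    String.fromCodePoint(byteToCodePoint(b))).join('');
}

// Reverse it: characters back to bytes, then decode the bytes as UTF-8.
function fixMojibake(garbled) {
  const bytes = Array.from(garbled, (ch) => {
    const b = CODE_POINT_TO_BYTE.get(ch.codePointAt(0));
    if (b === undefined) throw new Error(`not a windows-1252 character: ${ch}`);
    return b;
  });
  return Buffer.from(bytes).toString('utf8');
}

const original = '流畅轮廓、精致细节和自由律动,糅合呈现';
const roundTripped = fixMojibake(toMojibake(original));
console.log(roundTripped === original);
```

The characters that broke in the output above (流, 、, 合, 意) each contain one of the five unassigned bytes in their UTF-8 form, which is why they are the ones that need the pass-through rule.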


https://www.v2ex.com/t/1113928
