What technique does Dangdang's online book reader use to scramble its text? It works well against copying

The characters are output in scrambled order, but on screen they read in the correct order:

e.dangdang dot com/pc/reader/index.html?id=19004 (too many digits in a row trips the sensitive-word filter, so append the rest:) 65429

<div class="wraper" style="width: 660px; height: 880px; position: absolute; left:0px; top:0px; overflow: hidden; ">
<div class="text" style="width: 660px; height: 880px; position: absolute; left:0px; top:0px;">
<div><style type="text/css">
.fs-40bdb7ae-1aa {
font-size:19px;
 font-family:'Microsoft Yahei';
 position:absolute;
} 
.fs-40bdb7ae-2b0 {
font-size:19px;
 font-family:Arial;
 position:absolute;
} 
</style><span class="fs-40bdb7ae-1aa" style="left:346px; bottom:361px; ">日</span>
<span class="fs-40bdb7ae-1aa" style="left:432px; bottom:469px; ">《</span>
<span class="fs-40bdb7ae-1aa" style="left:134px; bottom:730px; ">、</span>
<span class="fs-40bdb7ae-1aa" style="left:520px; bottom:760px; ">进</span>
<span class="fs-40bdb7ae-1aa" style="left:83px; bottom:439px; ">短</span>
<span class="fs-40bdb7ae-1aa" style="left:467px; bottom:591px; ">；</span>
<span class="fs-40bdb7ae-1aa" style="left:121px; bottom:652px; ">治</span>
<span class="fs-40bdb7ae-1aa" style="left:122px; bottom:561px; ">阅</span>
<span class="fs-40bdb7ae-1aa" style="left:596px; bottom:730px; ">的</span>
<span class="fs-40bdb7ae-1aa" style="left:425px; bottom:300px; ">门</span>
<span class="fs-40bdb7ae-1aa" style="left:463px; bottom:300px; ">族</span>

(several hundred more characters omitted here...)

</div>
</div>
</div>
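
For anyone wondering how markup like this could be produced: a minimal sketch, and only a guess, not Dangdang's actual code. It lays the characters out on a fixed grid, emits one absolutely positioned span per character, then shuffles the spans. The function name `scramble`, the class name `fs-demo`, and the grid parameters are all made up for illustration.

```javascript
// Hypothetical generator: position every character on a grid, then shuffle
// the emitted spans so the source order no longer matches the reading order.
function scramble(text, { cols = 30, cell = 19, height = 880, cls = "fs-demo" } = {}) {
  // Note: characters such as "<" or "&" would need HTML escaping in real use.
  const spans = [...text].map((ch, i) => {
    const left = (i % cols) * cell;                            // column -> x
    const bottom = height - cell * (Math.floor(i / cols) + 1); // row -> y from the bottom
    return `<span class="${cls}" style="left:${left}px; bottom:${bottom}px; ">${ch}</span>`;
  });
  // Fisher-Yates shuffle of the span order.
  for (let i = spans.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [spans[i], spans[j]] = [spans[j], spans[i]];
  }
  return spans.join("\n");
}

console.log(scramble("当当网的在线图书", { cols: 4 }));
```

The browser still renders the page correctly because each span carries its own coordinates; only the order in the source is random.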

imdong

2019-10-15 23:16:31 +08:00

// Requires jQuery ($).
$.ajax({
    type: "GET",
    url: "https://e.dangdang.com/media/api.go?action=getPcChapterInfo&epubID=1900465429&permanentId=2.0170111163553175e+34&consumeType=1&platform=3&deviceType=Android&deviceVersion=5.0.0&channelId=70000&platformSource=DDDS-P&fromPaltform=ds_android&deviceSerialNo=html5&clientVersionNo=5.8.4&token=&chapterID=2&pageIndex=0&locationIndex=3&wordSize=2&style=2&autoBuy=0&chapterIndex=",
    dataType: "json",
    success: function (response) {
        // chapterInfo is a JSON string; its snippet field holds the scrambled HTML.
        const data = JSON.parse(response.data.chapterInfo).snippet;
        const regex = /<span class="[^"]+" style="left:([0-9]+)px; bottom:([0-9]+)px; ">([^<]+)<\/span>/g;
        const result = []; // result[bottom][left] = character (sparse arrays)
        let str = "";

        // matchAll yields nothing when there is no match, so no null check is needed.
        for (const info of data.matchAll(regex)) {
            result[info[2]] = result[info[2]] || [];
            result[info[2]][info[1]] = info[3];
        }

        // forEach visits rows by ascending bottom value (bottom-up on the page),
        // so prepend each row to end up with top-down reading order.
        result.forEach(item => {
            str = item.join("") + str;
        });
        console.log(str);
    }
});

imdong

2019-10-15 23:27:33 +08:00

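It seems you don't even need to sort: put the characters into arrays indexed by their coordinates, read them back out, and they are already in order...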
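
To show why no explicit sort is needed, here is a small offline sketch using four spans copied from the first post. JavaScript arrays visit integer indices in ascending order and skip holes, so writing to `rows[bottom][left]` orders everything as a side effect. The names `snippet`, `spanRe` and `reorder` are just for this example.

```javascript
// Four spans copied from the first post, deliberately out of order.
const snippet = `
<span class="fs-40bdb7ae-1aa" style="left:463px; bottom:300px; ">族</span>
<span class="fs-40bdb7ae-1aa" style="left:596px; bottom:730px; ">的</span>
<span class="fs-40bdb7ae-1aa" style="left:425px; bottom:300px; ">门</span>
<span class="fs-40bdb7ae-1aa" style="left:134px; bottom:730px; ">、</span>
`;

const spanRe = /<span class="[^"]+" style="left:(\d+)px; bottom:(\d+)px; ">([^<]+)<\/span>/g;

function reorder(html) {
  const rows = []; // rows[bottom][left] = character; both levels are sparse arrays
  for (const [, left, bottom, ch] of html.matchAll(spanRe)) {
    const row = (rows[bottom] = rows[bottom] || []);
    row[left] = ch;
  }
  const lines = [];
  // forEach skips holes and goes in ascending index order, i.e. bottom-up;
  // unshift puts rows with a larger bottom value (higher on the page) first.
  // join("") treats the holes inside a row as empty strings.
  rows.forEach(row => lines.unshift(row.join("")));
  return lines.join("\n");
}

console.log(reorder(snippet));
```

For these four spans it prints `、的` on the first line and `门族` on the second. Unlike the snippet in my previous reply, this version keeps one output line per row.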

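Different paragraphs do need some thought, and so do different fonts, for example the preface and the headings.
That is actually simple too: just group the spans by CSS class as well.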
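
A rough sketch of that grouping, assuming each font or text style gets its own class as in the `<style>` block of the first post. The two `fs-40bdb7ae-2b0` spans in the sample are made up (the post defines that class but shows no span using it), and `groupByClass` is a hypothetical helper name.

```javascript
// Two spans from the first post plus two invented ones for the second class.
const sample = `
<span class="fs-40bdb7ae-1aa" style="left:463px; bottom:300px; ">族</span>
<span class="fs-40bdb7ae-2b0" style="left:210px; bottom:820px; ">B</span>
<span class="fs-40bdb7ae-1aa" style="left:425px; bottom:300px; ">门</span>
<span class="fs-40bdb7ae-2b0" style="left:200px; bottom:820px; ">A</span>
`;

const classSpanRe = /<span class="([^"]+)" style="left:(\d+)px; bottom:(\d+)px; ">([^<]+)<\/span>/g;

function groupByClass(html) {
  const groups = new Map(); // class name -> rows[bottom][left] = character
  for (const [, cls, left, bottom, ch] of html.matchAll(classSpanRe)) {
    if (!groups.has(cls)) groups.set(cls, []);
    const rows = groups.get(cls);
    rows[bottom] = rows[bottom] || [];
    rows[bottom][left] = ch;
  }
  const out = {};
  for (const [cls, rows] of groups) {
    const lines = [];
    // Same trick as before: ascending index order, rebuilt top-down.
    rows.forEach(row => lines.unshift(row.join("")));
    out[cls] = lines.join("\n");
  }
  return out;
}

console.log(groupByClass(sample));
```

If a class is also used for characters mixed into an ordinary line (the Arial class might be), grouping by class alone would split that line, so this would need adjusting.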

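I think Dangdang should raise the difficulty: use different CSS classes even within the same row or column, and add offsets inside the CSS.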
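
A sketch of what I mean, purely hypothetical: each class carries its own margin offset, and the inline `left`/`bottom` values are reduced by that offset, so the inline numbers alone no longer give the real position. The class names, the offsets, and the helpers `cssFor` and `obfuscate` are all invented, and it assumes margins shift an absolutely positioned span in the usual way.

```javascript
// Hypothetical offset classes: each shifts its spans by (dx, dy) px via margins.
const offsetClasses = [
  { name: "o-a", dx: 7, dy: 13 },
  { name: "o-b", dx: 31, dy: 5 },
  { name: "o-c", dx: 19, dy: 29 },
];

// Build the stylesheet that applies the per-class offsets.
function cssFor(classes) {
  return classes
    .map(c => `.${c.name} { position:absolute; margin-left:${c.dx}px; margin-bottom:${c.dy}px; }`)
    .join("\n");
}

// glyphs: [{ ch, x, y }] holding the true on-page position of each character.
// rand is injectable so the output can be made deterministic.
function obfuscate(glyphs, classes, rand = Math.random) {
  return glyphs.map(({ ch, x, y }) => {
    const c = classes[Math.floor(rand() * classes.length)];
    // Inline value + class margin = true position.
    return `<span class="${c.name}" style="left:${x - c.dx}px; bottom:${y - c.dy}px; ">${ch}</span>`;
  });
}

console.log(cssFor(offsetClasses));
console.log(obfuscate([{ ch: "门", x: 425, y: 300 }, { ch: "族", x: 463, y: 300 }], offsetClasses).join("\n"));
```

Two characters on the same row can then have different inline `bottom` values, so a scraper would have to parse the stylesheet and add the offsets back before grouping by coordinates.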

maomaomao001

2019-10-16 10:31:57 +08:00

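@axwz88 How on earth did you read that logic into it... What I actually said was:
they (Baidu, Tencent, Alibaba, for example) are not exactly small companies, so why don't they try to push for a more regulated (normal) business model?
I find it hard to believe that if Alipay got hacked they would just patch the hole and leave it at that... surely not.
Since scraping their content is illegal, why not try taking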

https://www.v2ex.com/t/609635