How can I set a cookie that is passed along with a redirect request?

2017-12-08 16:27:08 +08:00
 shanechiu

I ran into a problem when I tried to scrape data from Baidu Tongji.

After I logined successfully, there was a 302 redirect to the main page.

That means the internal redirect was from https://tongji.baidu.com/web/welcome/ico?s=sdfsdfsdfsdf to s://tongji.baidu.com/web/12323243/overview/index?siteId=sdfsf.

I wonder how the program (maybe the browser? I am not sure either, LOL) passes the cookie from the 302 page to the destination page, and why?

Could anyone help me out? Thanks in advance.

Append:

302 page : https://tongji.baidu.com/web/welcome/ico?s=sdfsdfsdfsdf

destination page: s://tongji.baidu.com/web/12323243/overview/index?siteId=sdfsf

2002 views
Node    Python
12 replies
Cooky
2017-12-08 16:31:01 +08:00
Chinese please or go to Stack Overflow
shanechiu
2017-12-08 16:34:56 +08:00
@Cooky I am a little worried about whether this question lives up to Stack Overflow's strict standards.
vincenttone
2017-12-08 16:35:05 +08:00
Not sure whether you can read Chinese.
Answer (originally in Chinese):
1. HTTP is stateless
2. Cookies are passed via headers
3. Pay attention to the cookie's domain
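
The three points above can be sketched with Python's standard library. This is only an illustration, not how Baidu Tongji actually sets its cookies: the cookie name `session_id`, its value, and the helpers `domain_matches` and `cookie_header_for` are made up for this example, and the domain check is a simplified version of what a real client does.

```python
from http.cookies import SimpleCookie

# Server side: the response carries the cookie in a Set-Cookie header.
# "session_id", "abc123" and the attribute values are made-up examples.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
response_cookie["session_id"]["domain"] = "tongji.baidu.com"
response_cookie["session_id"]["path"] = "/"
set_cookie_header = response_cookie.output()
print(set_cookie_header)


def domain_matches(host: str, cookie_domain: str) -> bool:
    """Hypothetical helper: simplified domain check (ignores IP addresses)."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    return host == cookie_domain or host.endswith("." + cookie_domain)


def cookie_header_for(host: str, jar: SimpleCookie) -> str:
    """Build the Cookie request header value for cookies whose domain covers host."""
    pairs = [
        f"{name}={morsel.value}"
        for name, morsel in jar.items()
        if domain_matches(host, morsel["domain"])
    ]
    return "; ".join(pairs)


# Client side: every later request to a matching host sends the cookie back
# in a Cookie header. For brevity the same object stands in for the client's store.
print("Cookie:", cookie_header_for("tongji.baidu.com", response_cookie))
print("Cookie:", cookie_header_for("example.com", response_cookie))
```

The server never remembers the client between requests; the client simply re-sends the stored cookie on each request whose host falls under the cookie's domain.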
hhacker
2017-12-08 16:38:11 +08:00
Because the cookie is shared within the same domain.
shanechiu
2017-12-08 16:43:35 +08:00
@vincenttone Well, does that mean the request cookie of the 302 page is also passed to the destination page via the header, where it again acts as a request cookie?
fml87
2017-12-08 16:50:47 +08:00
What is "logined"?
shanechiu
2017-12-08 16:54:05 +08:00
@fml87 It is the past tense of the word "login"; it means the event or action happened in the past.
vincenttone
2017-12-08 16:54:42 +08:00
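@shanechiu If you want to understand how cookies behave on a 302 page, you first have to understand how they behave on an ordinary page.
As I just said:
1. HTTP is stateless
That is the premise.
Cookies are stored locally. Because HTTP is stateless, it makes no difference whether you went through a 302 redirect or not.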
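
To see this in action, here is a minimal sketch using only the standard library. It assumes a throwaway local server whose paths `/login` and `/overview` and whose cookie `session_id=abc123` are invented for the demo; it does not talk to Baidu Tongji. The client keeps its cookies in a local jar, so the cookie set on the 302 response is sent again on the follow-up request.

```python
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/login":
            # 302 response: set a cookie and point at the destination page.
            self.send_response(302)
            self.send_header("Set-Cookie", "session_id=abc123; Path=/")
            self.send_header("Location", "/overview")
            self.end_headers()
        else:
            # Destination page: echo back the Cookie header the client sent.
            body = (self.headers.get("Cookie") or "").encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Start the demo server on a free local port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# The jar is the client's local cookie store; the opener follows redirects.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
with opener.open(base + "/login") as resp:
    final_url = resp.geturl()
    echoed = resp.read().decode()

server.shutdown()
server.server_close()

print("final URL:", final_url)
print("cookie seen by the destination page:", echoed)
```

The redirect itself carries no cookie. The client makes two independent requests, and the second one includes the cookie only because the client stored it after the first response.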
shanechiu
2017-12-08 17:03:15 +08:00
@vincenttone Thanks for your kindness and patience. I think I have a rough outline of this now.
knightdf
2017-12-08 18:04:17 +08:00
Showing off, are we? Judging by your post history, you can write Chinese, can't you?
yospan
2017-12-09 15:19:00 +08:00
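I did exactly this recently. Use a session: set a third-party viewing password in the Tongji admin console, POST it to the login URL, and keep the session when requesting the other pages. After that you can pull whatever data you like from Tongji. Take the code below as a reference; I'm a Python beginner.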

```
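import json

from requests import Session

# Assumption: REQ_HEADERS was not defined in the original post; this is a placeholder.
REQ_HEADERS = {"User-Agent": "Mozilla/5.0"}

## Third-party viewing password for Baidu Tongji: log in, then get the session and site ID
idwd = {'passwd': '66666'}
S = Session()
# requests follows the 302 by default, so logined.url is the destination URL
logined = S.post("https://tongji.baidu.com/web/welcome/ico?s=8dfdafdafadfa4bccd", data=idwd, headers=REQ_HEADERS)

# Read the site ID and web ID (both strings) from the destination URL
siteid = logined.url.split("=")[1]
webid = logined.url.split("/")[4]

## POST parameters for the search-keyword report
keyjson = {"siteId":siteid,"st":"","et":"","st2":"","et2":"","indicators":"['pv_count','visitor_count','ip_count','bounce_ratio','avg_visit_time']","order":"pv_count,desc","offset":"","pageSize":"","target":"-1","flag":"indicator","source":"","isGroup":"0","clientDevice":"all","reportId":"12","method":"source/searchword/a","queryId":""}

# The same session sends the login cookies along with this request
readkeyjson = S.post("https://tongji.baidu.com/web/" + webid + "/ajax/post", data=keyjson, headers=REQ_HEADERS)

# Read the response as text and parse the JSON
jsondata = readkeyjson.text
readjsondict = json.loads(jsondata)

# Print the name of each keyword entry
keyNamejson = readjsondict['data']['items'][0]
for items in keyNamejson:
    print(items[0]['name'])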
```
shanechiu
2017-12-09 17:43:19 +08:00
@yospan Well, I do not know whether your method works, but thank you all the same.


https://www.v2ex.com/t/413149
