12306 Shows Tickets but Then Says "Not Enough Tickets Left" (The Road Home Isn't Easy: 12306 Remaining-Ticket Query with Alternative Plans, v2)
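The first version, "The Road Home Isn't Easy: 12306 Remaining-Ticket Query with Alternative Plans", queried remaining tickets and recommended alternatives, but it had two problems:
- No ranking of alternatives: alternatives were listed, but nothing said which one was better.
- No seat information (business / first class / second class / hard seat / hard sleeper / standing): a ticket might be available, but not necessarily one that suits you (a cheap one), which is a bit extravagant.
So I spent the past few days updating the code.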
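First, the principle behind ranking the alternatives. If no direct ticket from our origin to our destination is available, then for a given train, as long as the boarding station on the ticket lies between the train's first station and our origin, and the alighting station on the ticket lies between our origin and the train's terminus, we can still get on that train. At worst we buy a few extra stops or top up the fare on board. The ranking rule is then: get home for the least money. How?
Buy a boarding station as close to the origin as possible and an alighting station as close to the destination as possible, and prefer topping up on board over buying extra stops, since the goal is to spend as little as possible.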
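The ordering rule above can be sketched in isolation. This is a minimal sketch, not the code from the listing further down: the function name `rank_alternatives` and the single-letter stop names are made up for illustration, and "cheapest" here only means "closest to the original trip", not a real fare calculation.

```python
from itertools import product


def rank_alternatives(stops, origin, destination):
    """Return (board, alight) station pairs for one train, preferred first.

    `stops` is the train's stop list in travel order. The order follows the
    rule in the text: board as close to the origin as possible, alight as
    close to the destination as possible, and prefer a shorter ticket plus
    an on-board top-up over buying extra stops.
    """
    o = stops.index(origin)
    d = stops.index(destination)
    # Boarding candidates: the origin itself, then earlier stops, nearest first.
    board = stops[:o + 1][::-1]
    # Alighting candidates: the destination, then stops before it (top up on
    # board), then stops after it (buy extra stops).
    alight = stops[o + 1:d + 1][::-1] + stops[d + 1:]
    pairs = list(product(board, alight))
    # The first pair is the direct ticket, which is the one that is sold out.
    return pairs[1:]


# Hypothetical five-stop train; we want to travel from B to D.
print(rank_alternatives(['A', 'B', 'C', 'D', 'E'], 'B', 'D'))
```

Each returned pair would still need a remaining-ticket query before it can be offered as an alternative.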
The output is a table with one row per train, where the last column lists the alternative station pairs.
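While building 2.0, I also optimized the code:
- City abbreviations (station codes) are read from a saved file instead of being fetched from the 12306 API, which is faster. (The file can be generated and saved with the Citys() function from the first version; it looks like this:)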
{
    "北京北": "VAP",
    "北京东": "BOP",
    "北京": "BJP",
    "北京南": "VNP",
    "北京西": "BXP",
    "广州南": "IZQ",
    "重庆北": "CUW",
    "重庆": "CQW",
    "重庆南": "CRW",
    ...
}
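- Added a self-checking retry mechanism to the crawler: if an API call fails, it waits and then calls again;
- Added seat information;
- Ranked the alternatives.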
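The retry idea from the list above can be pulled out into a small helper. This is a minimal sketch rather than what the listing does (the listing repeats the loop inline in each method). The name `call_with_retry` and the injectable `sleep` parameter are my own additions, while the 20 attempts and 2-second delay match the values used in the code.

```python
import time


def call_with_retry(func, attempts=20, delay=2.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Returns func()'s result, or None if every attempt raised an exception.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            print(f'request failed ({exc!r}), attempt {attempt}/{attempts}')
            if attempt < attempts:
                # Wait before trying again so the server is not hammered.
                sleep(delay)
    return None
```

A call such as `call_with_retry(lambda: session.get(url, timeout=5).json())` would then replace each hand-written loop. Returning None on total failure means the caller has to handle that case explicitly.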
The full code is as follows:
# coding=utf-8
import requests
import urllib.parse as parse
import time
import json
import pretty_errors
import re
from fake_useragent import UserAgent

# Field indices in each '|'-separated record returned by the left-ticket query
TRAIN_NO = 2
TRAIN = 3
DEPARTURE_STATION = 6
TERMINUS = 7
DEPARTURE_TIME = 8
ARRIVAL_TIME = 9
DURATION = 10
IF_BOOK = 11
DATE = 13
FROM_STATION_NO = 16
TO_STATION_NO = 17
SEAT_TYPES = 35
# Seat availability fields; the trailing comment is the matching price key
OTHER = 22
NO_SEAT = 26  # WZ
HARD_SEAT = 29  # A1
SECOND_SEAT = 30  # O
FIRST_SEAT = 31  # M
BUSINESS_SEAT = 32  # A9
HARD_SLEEPER = 28  # A3
SOFT_SLEEPER = 23  # A4

# City abbreviations: station name -> station code
with open('citys.json', 'r', encoding='utf-8') as f:
    Citys = json.load(f)


def Time():
    """
    Get the current local date.
    :return: (year, month, day) as zero-padded strings
    """
    list_time = list(time.localtime())
    year = str(list_time[0])
    month = str(list_time[1])
    day = str(list_time[2])
    if len(month) == 1:
        month = '0' + month
    if len(day) == 1:
        day = '0' + day
    return year, month, day


proxy = {'http': '113.125.128.4:8888'}


class Train:
    def __init__(self,
                 from_station,
                 to_station,
                 train_date=Time()[0] + '-' + Time()[1] + '-' + Time()[2]
                 ):
        self.from_station = from_station
        self.to_station = to_station
        self.train_date = train_date
        self.url = 'https://kyfw.12306.cn/otn/leftTicket/queryA?leftTicketDTO.%s&leftTicketDTO.%s&leftTicketDTO.%s&purpose_codes=ADULT'
        self.headers = {'User-Agent': str(UserAgent().random)}
        self.session = requests.session()
        # Open the query page once before calling the query APIs
        self.session.get(
            'https://kyfw.12306.cn/otn/leftTicket/init?linktypeid=dc&fs=杭州东,HGH&ts=太原南,TNV&date=2022-01-19&flag=N,N,Y',
            headers=self.headers, proxies=proxy, timeout=5)

    def station(self, train_number):
        """
        Find the stations where a ticket for this train may start and end.
        :return: (boarding candidates, alighting candidates), preferred first
        """
        url = f'https://kyfw.12306.cn/otn/czxx/queryByTrainNo?' \
              f'{parse.urlencode({"train_no": train_number})}&' \
              f'{parse.urlencode({"from_station_telecode": Citys[self.from_station]})}&' \
              f'{parse.urlencode({"to_station_telecode": Citys[self.to_station]})}&' \
              f'{parse.urlencode({"depart_date": self.train_date})}'
        self.headers['User-Agent'] = str(UserAgent().random)
        content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
        content = content.content.decode('utf-8')
        data = json.loads(content)
        stations_data = data['data']['data']
        stations_data.sort(key=lambda x: x['station_no'])
        # station_no is 1-based, so it can be used directly as a slice bound
        from_station_idx = int(
            list(filter(lambda x: self.from_station in x['station_name'], stations_data))[0]['station_no'])
        to_station_idx = int(
            list(filter(lambda x: self.to_station in x['station_name'], stations_data))[0]['station_no'])
        # Boarding: our station first, then earlier stations, nearest first
        from_station_buy = [station['station_name'] for station in stations_data[:from_station_idx]][::-1]
        # Alighting: the destination and the stations before it (top up on board) ...
        to_station_buy_1 = [station['station_name'] for station in stations_data[from_station_idx:to_station_idx]][::-1]
        # ... then the stations after the destination (buy extra stops)
        to_station_buy_2 = [station['station_name'] for station in stations_data[to_station_idx:]]
        to_station_buy = to_station_buy_1 + to_station_buy_2
        return from_station_buy, to_station_buy

    def train(self):
        """
        Fetch the train list and build one table row per train.
        :return: list of formatted rows
        """
        url = self.url % (parse.urlencode({"train_date": self.train_date}),
                          parse.urlencode({"from_station": Citys[self.from_station]}),
                          parse.urlencode({"to_station": Citys[self.to_station]}))
        self.headers['User-Agent'] = str(UserAgent().random)
        content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
        content = content.content.decode('utf-8')
        data = json.loads(content)
        dict_train = data['data']['result']
        dict_map = data['data']['map']
        res = []
        TRAINs = []
        for train in dict_train:
            train_split = train.split('|')
            # Skip trains that have already been processed
            if train_split[TRAIN] in TRAINs:
                continue
            print(train_split[TRAIN])
            prices = self.price(train_split[FROM_STATION_NO], train_split[TO_STATION_NO],
                                train_split[SEAT_TYPES], train_split[TRAIN_NO])
            from_station_buy, to_station_buy = self.station(train_split[TRAIN_NO])
            buy = []
            # Only look for alternatives when the direct ticket cannot be booked
            for from_station, to_station in [[x, y] for x in from_station_buy for y in to_station_buy]:
                if train_split[IF_BOOK] == 'N' and self.book_if(from_station, to_station, train_split[TRAIN]):
                    buy.append(f'{from_station}-{to_station}')
            # '无' ("none") is the value the seat fields are compared against
            train_str = [train_split[TRAIN], dict_map[train_split[DEPARTURE_STATION]],
                         dict_map[train_split[TERMINUS]], train_split[DEPARTURE_TIME],
                         train_split[ARRIVAL_TIME], train_split[DURATION],
                         f'{train_split[BUSINESS_SEAT]} / {prices.get("A9")}' if train_split[BUSINESS_SEAT] and train_split[BUSINESS_SEAT] != '无' else '无',
                         f'{train_split[FIRST_SEAT]} / {prices.get("M")}' if train_split[FIRST_SEAT] and train_split[FIRST_SEAT] != '无' else '无',
                         f'{train_split[SECOND_SEAT]} / {prices.get("O")}' if train_split[SECOND_SEAT] and train_split[SECOND_SEAT] != '无' else '无',
                         f'{train_split[SOFT_SLEEPER]} / {prices.get("A4")}' if train_split[SOFT_SLEEPER] and train_split[SOFT_SLEEPER] != '无' else '无',
                         f'{train_split[HARD_SLEEPER]} / {prices.get("A3")}' if train_split[HARD_SLEEPER] and train_split[HARD_SLEEPER] != '无' else '无',
                         f'{train_split[HARD_SEAT]} / {prices.get("A1")}' if train_split[HARD_SEAT] and train_split[HARD_SEAT] != '无' else '无',
                         f'{train_split[NO_SEAT]} / {prices.get("WZ")}' if train_split[NO_SEAT] and train_split[NO_SEAT] != '无' else '无',
                         'yes' if train_split[IF_BOOK] == 'Y' else 'no', ' '.join(buy)]
            res.append('| ' + ' | '.join(train_str) + ' |')
            TRAINs.append(train_split[TRAIN])
        return res

    def book_if(self, from_station, to_station, train_number):
        """
        Check whether a ticket can be booked.
        :param from_station: boarding station name
        :param to_station: alighting station name
        :param train_number: train number, e.g. the value of the TRAIN field
        :return: True if the train can be booked between the two stations
        """
        url = self.url % (parse.urlencode({"train_date": self.train_date}),
                          parse.urlencode({"from_station": Citys[from_station]}),
                          parse.urlencode({"to_station": Citys[to_station]}))
        train = []
        for _ in range(20):
            try:
                self.headers['User-Agent'] = str(UserAgent().random)
                content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
                content = content.content.decode('utf-8')
                data = json.loads(content)
                dict_train = data['data']['result']
                train = list(filter(lambda x: x.split('|')[TRAIN] == train_number, dict_train))
                if not train:
                    # The train does not serve this station pair
                    return False
                break
            except Exception:
                print('Remaining-ticket request failed, retrying')
                time.sleep(2)
        if not train:
            # Every retry failed, so treat the pair as not bookable
            return False
        return train[0].split('|')[IF_BOOK] == 'Y'

    def price(self, from_station_no, to_station_no, seat_types, train_no):
        """
        Query ticket prices.
        :return: dict of seat code -> price (empty if every request failed)
        """
        url = f'https://kyfw.12306.cn/otn/leftTicket/queryTicketPrice?' \
              f'{parse.urlencode({"train_no": train_no})}&' \
              f'{parse.urlencode({"from_station_no": from_station_no})}&' \
              f'{parse.urlencode({"to_station_no": to_station_no})}&' \
              f'{parse.urlencode({"seat_types": seat_types})}&' \
              f'{parse.urlencode({"train_date": self.train_date})}'
        data = {}
        for _ in range(20):
            try:
                self.headers['User-Agent'] = str(UserAgent().random)
                content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
                content = content.content.decode('utf-8')
                data = json.loads(content)
                break
            except Exception:
                print('Price request failed, retrying')
                time.sleep(2)
        # Fall back to an empty dict so callers can always use .get()
        return data.get('data') or {}


if __name__ == '__main__':
    print('--------------------12306 information query-----------------------')
    # Keep asking until the station name is one we have a code for
    while True:
        from_station = input('Enter the origin: ') or '杭州'
        if from_station in Citys:
            break
    while True:
        to_station = input('Enter the destination: ') or '太原'
        if to_station in Citys:
            break
    pattern = re.compile(r'\d{4}-\d{2}-\d{2}')
    while True:
        date = input('Enter the departure date (format: 2022-02-01; defaults to today): ')
        if not date or re.fullmatch(pattern, date):
            break
    if not date:
        date = Time()[0] + '-' + Time()[1] + '-' + Time()[2]
    train = Train(from_station, to_station, date)
    information = train.train()
    print('-------------------------------------------------------')
    print('-------------------12306 query results---------------------')
    print('-------------------------------------------------------')
    print('| Train | From | To | Departs | Arrives | Duration | Business | First class | Second class | Soft sleeper | Hard sleeper | Hard seat | Standing | Bookable | Alternatives |')
    for info in information:
        print(info)
    print('-------------------------------------------------------')
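To make the index constants at the top of the listing more concrete, here is a small standalone sketch of how one `|`-separated record is read. The record is entirely made up (train number `G0000`, the times and the seat value are placeholders), and `summarize` is a hypothetical helper that does not appear in the listing. Only the index values are taken from the code above.

```python
TRAIN = 3
DEPARTURE_TIME = 8
ARRIVAL_TIME = 9
IF_BOOK = 11
SECOND_SEAT = 30


def summarize(record):
    """Pick a few fields out of one '|'-separated result row."""
    fields = record.split('|')
    return {
        'train': fields[TRAIN],
        'departs': fields[DEPARTURE_TIME],
        'arrives': fields[ARRIVAL_TIME],
        'bookable': fields[IF_BOOK] == 'Y',
        'second_class': fields[SECOND_SEAT],
    }


# Build a made-up 36-field row; real rows come from data['data']['result'].
fake = [''] * 36
fake[TRAIN] = 'G0000'
fake[DEPARTURE_TIME] = '08:00'
fake[ARRIVAL_TIME] = '14:30'
fake[IF_BOOK] = 'N'
fake[SECOND_SEAT] = '无'
print(summarize('|'.join(fake)))
```

Because the fields are positional, any change to the response layout on the 12306 side would silently shift these indices, so they are worth re-checking whenever the output looks wrong.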
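Shanxi's epidemic-control policy has also been relaxed. Previously, anyone who had stayed in Haidian, Beijing had to spend 14 days in centralized quarantine after returning home. After the "malicious return home" incident came out, the policy was loosened, and returning from Haidian now only requires a nucleic acid test. The road home is much smoother this time. Kudos to the country, and kudos to the Shanxi government!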
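Of course, the current version still has a problem: the ranking is too rigid. It only considers how far the ticket's boarding and alighting stations are from the real ones and ignores seat information. For example, you might be able to buy a few extra stops and sit all the way home, yet the ranking insists on saving a little money by topping up on board and standing, which is not ideal either.
So, in order to get home comfortably, I have started on the next version…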
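One possible way to express the trade-off described above is to rank by price plus a penalty for standing. This is only a sketch of the idea and not the next version: the function `comfort_cost`, the penalty of 50, and both plans with their prices are invented for illustration.

```python
def comfort_cost(price, has_seat, standing_penalty=50.0):
    """Ticket price plus a flat penalty when the plan means standing."""
    return price + (0.0 if has_seat else standing_penalty)


# Made-up plans and prices, purely for illustration.
plans = [
    {'plan': 'short ticket, top up on board, standing', 'price': 180.0, 'has_seat': False},
    {'plan': 'buy a few extra stops, seated', 'price': 210.0, 'has_seat': True},
]
best = min(plans, key=lambda p: comfort_cost(p['price'], p['has_seat']))
print(best['plan'])
```

With these numbers the seated plan wins even though its ticket costs more. How large the penalty should be is a personal choice, and it probably ought to grow with the length of the journey.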