Download Bilibili Video

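Motivation

  • My favorite Bilibili uploader team recently announced that they are disbanding, and said in their last video that the account will be handed back to the company. I was worried the videos might be deleted, so I decided to download them all to my local disk.

  • There are plugins online that can download videos, but each video has to be downloaded by hand, which is tedious and inelegant. So why not write a Python program to download them in bulk?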

Disclaimer

  • This post only records my own process. If it infringes on anyone's rights, please let me know and I will remove it.

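Environment

Approach

  • First, fetch the BV IDs of all the uploader's videos through the API. This endpoint often responds that requests are too frequent and to try again later, so some retry-on-failure logic is needed.

  • Here the code waits 1 s after a failure before retrying, for up to 100 attempts, and returns the BV IDs as a list once they have been fetched.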

    import time
    from json import JSONDecodeError

    import requests


    def get_video_lists(page, max_retries=100):
        """Return the BV IDs on one page of the uploader's video list."""
        global author  # uploader name, reused later as the download folder name
        # user_id (the uploader's mid) is expected to be defined at module level
        url = 'https://api.bilibili.com/x/space/arc/search?mid={}&ps=30&tid=0&pn={}&keyword=&order=pubdate&jsonp=jsonp'.format(
            user_id, page)
        headers = {
            "accept": "application/json, text/plain, */*",
            "accept-encoding": "gzip, deflate, br",
            "accept-language": "zh,en;q=0.9,en-US;q=0.8,zh-CN;q=0.7,zh-TW;q=0.6",
            "cookie": "buvid3=52EE1424-8352-DE0D-C2F9-8CEFBD6D7D2024853infoc; i-wanna-go-back=-1; _uuid=D7F4D7102-F510C-9EFD-B44C-5A15BB3D2B9825216infoc; buvid4=79C7023E-28E0-B231-6510-54E406718DAA25965-022021913-c0D4n8mIkOPQS7cPZ5EOlQ%3D%3D; CURRENT_BLACKGAP=0; LIVE_BUVID=AUTO7016452474409017; rpdid=|(Rlllkm)mY0J'uYRlkRmRum; buvid_fp_plain=undefined; blackside_state=0; fingerprint=6c8532a24d1ddc22356289c4c2d1958f; buvid_fp=34e58163f7b4e31c1736ba5b8416e000; SESSDATA=c35a2a31%2C1662290982%2Ca3c0d%2A31; bili_jct=de750fd4e484b47f40b8bb42a5a72869; DedeUserID=73827743; DedeUserID__ckMd5=9d571d9b5b827b73; sid=c3w73yp7; b_ut=5; hit-dyn-v2=1; nostalgia_conf=-1; PVID=2; innersign=0; b_lsid=B710CBE88_180E5C4ABA4; bp_video_offset_73827743=662643097963855900; CURRENT_FNVAL=80; b_timer=%7B%22ffp%22%3A%7B%22333.1007.fp.risk_52EE1424%22%3A%22180E5C4B0BF%22%2C%22333.337.fp.risk_52EE1424%22%3A%22180E5C521EF%22%2C%22333.999.fp.risk_52EE1424%22%3A%22180E5C5494B%22%7D%7D",
            "origin": "https://space.bilibili.com",
            "referer": "https://space.bilibili.com/518973111/video?tid=0&page=2&keyword=&order=pubdate",
            "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36",
        }

        for _ in range(max_retries):
            resp = requests.get(
                url=url,
                headers=headers
            )

            if resp.status_code != 200:
                print(f"Error: HTTP status code {resp.status_code}")
                time.sleep(1)  # wait for a while before retrying
                continue

            if not resp.text:
                print("Error: Response is empty")
                time.sleep(1)  # wait for a while before retrying
                continue

            try:
                js = resp.json()
            except JSONDecodeError:
                print("Error: Unable to parse JSON, trying again")
                time.sleep(1)  # wait for a while before retrying
                continue

            # The rate-limit response ("too frequent, try again later") uses code -799
            if 'code' in js and js['code'] == -799:
                print(f"Error: {js['message']}")
                time.sleep(1)  # wait for a while before retrying
                continue

            if 'data' not in js or 'list' not in js['data'] or 'vlist' not in js['data']['list']:
                print("Error: Unexpected response")
                time.sleep(1)  # wait for a while before retrying
                continue

            vlist = js['data']['list']['vlist']
            if not vlist:  # empty page: no author to read, nothing to return
                return []
            author = vlist[0]['author']
            bvid_list = [x.get('bvid') for x in vlist]
            return bvid_list

        print(f"Error: Failed to get data after {max_retries} attempts")
        return []
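
The function above returns the BV IDs for a single page, so collecting everything means walking the pages until one comes back empty. A minimal sketch of that loop follows. `collect_all_bvids` and its `fetch_page` parameter are names made up for illustration (in the script, `fetch_page` would be `get_video_lists`), and stopping on the first empty page is an assumption about how the listing ends.

```python
from typing import Callable, List


def collect_all_bvids(fetch_page: Callable[[int], List[str]],
                      max_pages: int = 1000) -> List[str]:
    """Call fetch_page(1), fetch_page(2), ... until a page comes back empty."""
    all_bvids: List[str] = []
    for page in range(1, max_pages + 1):
        bvids = fetch_page(page)
        if not bvids:  # empty page: assume there is nothing further to fetch
            break
        all_bvids.extend(bvids)
    return all_bvids


if __name__ == "__main__":
    # Stand-in for get_video_lists: two pages of fake IDs, then nothing.
    fake_pages = {1: ["BV1aaa", "BV1bbb"], 2: ["BV1ccc"]}
    print(collect_all_bvids(lambda page: fake_pages.get(page, [])))
```

Note that `get_video_lists` also returns an empty list once its retries are exhausted, so with this stopping rule a persistent API failure would end the walk early instead of raising an error.
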
  • Create a directory named after the uploader under the current directory; all videos are downloaded into it

    import os

    current_directory = os.getcwd()
    folder_path = os.path.join(current_directory, author)
    if not os.path.exists(folder_path):
        os.makedirs(folder_path)
        print(f"Created folder: {folder_path}")
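
Uploader names can contain characters that are not valid in directory names on some systems. One possible guard, not part of the original script, is to clean the name before joining it to the path. In the sketch below, `safe_folder_name`, `ensure_folder`, the fallback name, and the set of characters being replaced are all illustrative assumptions.

```python
import os
import re


def safe_folder_name(name: str, fallback: str = "bilibili_videos") -> str:
    """Replace characters that are commonly invalid in directory names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", name).strip(" .")
    return cleaned or fallback


def ensure_folder(base_dir: str, name: str) -> str:
    """Create base_dir/<cleaned name> if needed and return its path."""
    folder_path = os.path.join(base_dir, safe_folder_name(name))
    os.makedirs(folder_path, exist_ok=True)  # no error if it already exists
    return folder_path
```

With this helper, the folder step would become `folder_path = ensure_folder(os.getcwd(), author)`.
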
  • Call you-get to download each video

    import subprocess
    import time


    def download_videos(bv):
        url = 'https://www.bilibili.com/video/{}'.format(bv)
        print("download link", url, flush=True)
        # folder_path is the uploader's directory created in the previous step
        command = ['you-get', '-o', folder_path, url]
        result = subprocess.run(command)
        if result.returncode != 0:
            raise RuntimeError(f'Failed to download video {bv}')
        print(f'Video {bv} downloaded successfully', flush=True)
        time.sleep(5)  # pause before this worker picks up the next video
        return bv
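  • Calling the download function above directly fetches only one video at a time, so a thread pool is used to raise concurrency; here 5 threads run at once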

    import concurrent.futures

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # Map each future back to its BV ID so failures can be reported by ID
        futures = {executor.submit(download_videos, bv): bv for bv in bv_lists}

        for future in concurrent.futures.as_completed(futures):
            bv = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
            except Exception as e:
                print(f'Error downloading video {bv}: {e}', flush=True)
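
The loop above only prints failures. If the failed BV IDs should be retried later, one option is to collect them while draining the futures. The sketch below is an assumption about how that could look: `run_all` is a hypothetical helper, and `fake_download` stands in for `download_videos` so the example runs without network access.

```python
import concurrent.futures
from typing import Callable, Iterable, List, Tuple


def run_all(task: Callable[[str], str], items: Iterable[str],
            max_workers: int = 5) -> Tuple[List[str], List[str]]:
    """Run task(item) in a thread pool; return (succeeded, failed) items."""
    succeeded: List[str] = []
    failed: List[str] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(task, item): item for item in items}
        for future in concurrent.futures.as_completed(futures):
            item = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
                succeeded.append(item)
            except Exception as exc:
                print(f"Error processing {item}: {exc}", flush=True)
                failed.append(item)
    return succeeded, failed


def fake_download(bv: str) -> str:
    """Stand-in for download_videos: fails for IDs ending in 'x'."""
    if bv.endswith("x"):
        raise RuntimeError(f"Failed to download video {bv}")
    return bv


if __name__ == "__main__":
    ok, bad = run_all(fake_download, ["BV1a", "BV1x", "BV1b"])
    print(sorted(ok), sorted(bad))
```

The `failed` list could then be passed to `run_all` again for a second attempt. Completion order is not deterministic, which is why the usage example sorts before printing.
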
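  • Finally, you-get downloads not only the video but also the danmaku (bullet comments), saved as files with the .xml suffix. If they are not needed, they can be deleted in bulk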

    import glob
    import os

    xml_files = glob.glob(os.path.join(folder_path, '*.xml'))
    for xml_file in xml_files:
        try:
            os.remove(xml_file)
            print(f'Successfully deleted {xml_file}')
        except Exception as e:
            print(f'Error deleting {xml_file}: {e}')
  • Full code: https://github.com/Acolevia/DownloadVideo

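Limitations

  • When not logged in, Bilibili by default only allows lower-quality video to be downloaded. To download high-quality video, a logged-in Cookie parameter has to be added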
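
you-get lists a `--cookies` option for loading a cookies file and a `--no-caption` option for skipping captions and danmaku, so the command could be built along the following lines. This is only a sketch: `build_you_get_command` is a made-up helper, the cookies file path is a placeholder, and both flags are worth checking against `you-get --help` for the installed version.

```python
from typing import List, Optional


def build_you_get_command(url: str, out_dir: str,
                          cookies_file: Optional[str] = None,
                          skip_captions: bool = False) -> List[str]:
    """Build the argument list to pass to subprocess.run for one video."""
    command = ["you-get", "-o", out_dir]
    if cookies_file:
        # Assumption: a cookies file exported from a logged-in browser session.
        command += ["--cookies", cookies_file]
    if skip_captions:
        # Assumption: --no-caption also skips the danmaku .xml files.
        command.append("--no-caption")
    command.append(url)
    return command
```

For example, `build_you_get_command(url, folder_path, cookies_file="cookies.txt", skip_captions=True)` would produce a list that can replace `command` in the download function.
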

Follow-up

  • While looking into how to download high-quality video, I found a more convenient tool: https://github.com/HFrost0/bilix

  • Download a given uploader's videos

    bilix get_up 'https://space.bilibili.com/uploader_id' --num [number of videos]