timerring · Apr 5, 2025
diff --git a/README.md b/README.md
@@ -16,6 +16,8 @@
   <img src="assets/zhipu-color.svg" alt="Zhipu GLM-4V-PLUS" width="60" height="60" />
   <img src="assets/gemini-brand-color.svg" alt="Google Gemini 1.5 Pro" width="60" height="60" />
   <img src="assets/qwen-color.svg" alt="Qwen-2.5-72B-Instruct" width="60" height="60" />
+  <img src="assets/minimax-color.svg" alt="Minimax" width="20" height="60" />
+  <img src="assets/minimax-text.svg" alt="Minimax" width="60" height="60" />
 
 </div>
 
@@ -41,6 +43,8 @@
   - `Qwen-2.5-72B-Instruct`
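 - **( :tada: NEW) Persistent login / video download / video upload (multi-part submissions supported)**: [bilitool](https://github.com/timerring/bilitool) is now open source. It provides persistent login, downloading videos and danmaku (including multi-part videos), uploading videos (optionally as multi-part submissions), querying submission status, querying detailed information, and more. It installs with a single pip command and can be used from the command line (CLI) or called as an API.
 - **( :tada: NEW) Automatic multi-platform looped live streaming**: [looplive](https://github.com/timerring/looplive), now open source, is a fully automatic 24/7 tool for **looped streaming to multiple platforms simultaneously**.
+- **( :tada: NEW) Automatic style-transferred video covers**: uses an image-to-image multimodal model to automatically capture a video screenshot and upload a style-transferred cover.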
+  - `Minimax image-01`
 
 The project architecture and workflow are as follows:
 
@@ -144,11 +148,11 @@ pip install -r requirements.txt
 
 ##### 3.1.1 Using the API
 
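-Set the `ASR_METHOD` parameter in `src/config.py` to `api`, then set `WHISPER_API_KEY` to your [API Key](https://console.groq.com/keys). This project uses the `whisper-large-v3-turbo` model on groq's free tier, which limits uploads to 40 MB (about half an hour), so to use API-based recognition, set the recording segment length to 30 minutes. In addition, the free tier request limits are 7200 seconds / 20 requests per hour and 28800 seconds / 2000 requests per day. If you need more, you can upgrade to the dev tier; see the [groq website](https://console.groq.com/docs/rate-limits) for details.
+Set the `ASR_METHOD` parameter in `settings.toml` to `api`, then set `WHISPER_API_KEY` to your [API Key](https://console.groq.com/keys). This project uses the `whisper-large-v3-turbo` model on groq's free tier, which limits uploads to 40 MB (about half an hour), so to use API-based recognition, set the recording segment length to 30 minutes. In addition, the free tier request limits are 7200 seconds / 20 requests per hour and 28800 seconds / 2000 requests per day. If you need more, you can upgrade to the dev tier; see the [groq website](https://console.groq.com/docs/rate-limits) for details.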
 
 ##### 3.1.2 Using local deployment (requires an NVIDIA GPU)
 
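-Set the `ASR_METHOD` parameter in `src/config.py` to `deploy`, then download the required model files and place them in the `src/subtitle/models` folder.
+Set the `ASR_METHOD` parameter in `settings.toml` to `deploy`, then download the required model files and place them in the `src/subtitle/models` folder.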
 
 The project uses the [`small`](https://openaipublic.azureedge.net/main/whisper/models/9ecf779972d90ba49c06d968637d720dd632c55bbf19d441fb42bf17a411e794/small.pt) model by default. Click the link to download the file and place it in the `src/subtitle/models` folder.
 
@@ -160,7 +164,7 @@ pip install -r requirements.txt
 
 ##### 3.2 MLLM models
 
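-MLLM models are mainly used to generate titles for clips after automatic slicing. This feature is off by default; to enable it, set the `AUTO_SLICE` parameter in `src/config.py` to `True`. The other settings are:
+MLLM models are mainly used to generate titles for clips after automatic slicing. This feature is off by default; to enable it, set the `AUTO_SLICE` parameter in `settings.toml` to `True`. The other settings are:
 - `SLICE_DURATION` sets the slice duration in seconds (more than 60 seconds is not recommended).
 - `SLICE_NUM` sets the number of slices.
 - `SLICE_OVERLAP` sets the overlap duration between slices. Slicing uses a sliding-window approach; see [auto-slice-video](https://github.com/timerring/auto-slice-video) for details.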
@@ -169,21 +173,27 @@ MLLM models are mainly used to generate titles for clips after automatic slicing
 
 ##### 3.2.1 GLM-4V-PLUS model
 
-> To use the GLM-4V-PLUS model, set the `MLLM_MODEL` parameter in `src/config.py` to `zhipu`.
+> To use the GLM-4V-PLUS model, set the `MLLM_MODEL` parameter in `settings.toml` to `zhipu`.
 
-The project's auto-slice feature uses Zhipu's [`GLM-4V-PLUS`](https://bigmodel.cn/dev/api/normal-model/glm-4) model. [Register an account](https://www.bigmodel.cn/invite?icode=shBtZUfNE6FfdMH1R6NybGczbXFgPRGIalpycrEwJ28%3D) and apply for an API key, then enter it as `ZHIPU_API_KEY` in `src/config.py`.
+The project's auto-slice feature uses Zhipu's [`GLM-4V-PLUS`](https://bigmodel.cn/dev/api/normal-model/glm-4) model. [Register an account](https://www.bigmodel.cn/invite?icode=shBtZUfNE6FfdMH1R6NybGczbXFgPRGIalpycrEwJ28%3D) and apply for an API key, then enter it as `ZHIPU_API_KEY` in `settings.toml`.
 
 ##### 3.2.2 Gemini model
 
-> To use the Gemini-2.0-flash model, set the `MLLM_MODEL` parameter in `src/config.py` to `gemini`.
+> To use the Gemini-2.0-flash model, set the `MLLM_MODEL` parameter in `settings.toml` to `gemini`.
 
-The project's auto-slice feature uses the Gemini-2.0-flash model. [Register an account](https://aistudio.google.com/app/apikey) and apply for an API key, then enter it as `GEMINI_API_KEY` in `src/config.py`.
+The project's auto-slice feature uses the Gemini-2.0-flash model. [Register an account](https://aistudio.google.com/app/apikey) and apply for an API key, then enter it as `GEMINI_API_KEY` in `settings.toml`.
 
 ##### 3.2.3 Qwen model
 
-> To use the Qwen-2.5-72B-Instruct model, set the `MLLM_MODEL` parameter in `src/config.py` to `qwen`.
+> To use the Qwen-2.5-72B-Instruct model, set the `MLLM_MODEL` parameter in `settings.toml` to `qwen`.
 
-The project's auto-slice feature uses the Qwen-2.5-72B-Instruct model. [Register an account](https://bailian.console.aliyun.com/?apiKey=1) and apply for an API key, then enter it as `QWEN_API_KEY` in `src/config.py`.
+The project's auto-slice feature uses the Qwen-2.5-72B-Instruct model. [Register an account](https://bailian.console.aliyun.com/?apiKey=1) and apply for an API key, then enter it as `QWEN_API_KEY` in `settings.toml`.
+
+##### 3.2.4 Minimax model
+
+> To use the Minimax model, set the `generate_cover` parameter in `settings.toml` to `true` and set the `IMAGE_GEN_MODEL` parameter to `minimax`.
+
+The project's automatic cover generation feature uses the Minimax model. [Register an account](https://www.minimax.chat/) and apply for an API key, then enter it as `MINIMAX_API_KEY` in `settings.toml`.
 
 #### 4. bilitool login
 
@@ -248,7 +258,7 @@ logs # log folder
 #### 8. Configure upload parameters
 
 > [!TIP]
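-> The default upload parameters are listed below; everything inside [] is replaced automatically. You can customize them in `src/config.py`. The template keywords are `{artist}`, `{date}`, `{title}`, and `{source_link}`, which you can combine or drop to build your own templates:
+> The default upload parameters are listed below; everything inside [] is replaced automatically. You can customize them in `settings.toml`. The template keywords are `{artist}`, `{date}`, `{title}`, and `{source_link}`, which you can combine or drop to build your own templates:
 > + The title template is `{artist}直播回放-{date}-{title}` (roughly "{artist} live replay-{date}-{title}"), which renders as "【弹幕+字幕】[XXX]直播回放-[日期]-[直播间标题]" ("[danmaku + subtitles] [XXX] live replay-[date]-[live room title]"). You can change it as needed.
 > + The description template is `{artist}直播，直播间地址：{source_link} 内容仅供娱乐，直播中主播的言论、观点和行为均由主播本人负责，不代表录播员的观点或立场。` (roughly "{artist} live stream, room URL: {source_link}. This content is for entertainment only; the streamer's remarks, opinions, and actions during the stream are the streamer's own responsibility and do not represent the views or position of the recorder."), which renders as "【弹幕+字幕】[XXX]直播，直播间地址：[https://live.bilibili.com/XXX] 内容仅供娱乐，直播中主播的言论、观点和行为均由主播本人负责，不代表录播员的观点或立场。". You can change it as needed.
 > + The default tags are trending search terms scraped automatically from Bilibili's search suggestions based on the streamer's name.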

diff --git a/assets/minimax-color.svg b/assets/minimax-color.svg
diff --git a/assets/minimax-text.svg b/assets/minimax-text.svg
diff --git a/settings.toml b/settings.toml
@@ -38,6 +38,11 @@ zhipu_api_key = "" # Apply for your own GLM-4v-Plus API key at https://www.bigmo
 gemini_api_key = "" # Apply for your own Gemini API key at https://aistudio.google.com/app/apikey
 qwen_api_key = "" # Apply for your own Qwen API key at https://bailian.console.aliyun.com/?apiKey=1
 
+[cover]
+generate_cover = false # whether to generate cover
+image_gen_model = "minimax" # the image generation model, can be "minimax"
+minimax_api_key = "" # Apply for your own Minimax API key at https://platform.minimaxi.com/user-center/basic-information/interface-key
+
 # blrec Settings
 [[tasks]]
 room_id = 173551

diff --git a/src/config.py b/src/config.py
@@ -71,3 +71,7 @@ def get_interface_config():
 ZHIPU_API_KEY = config.get('slice', {}).get('zhipu_api_key')
 GEMINI_API_KEY = config.get('slice', {}).get('gemini_api_key')
 QWEN_API_KEY = config.get('slice', {}).get('qwen_api_key')
+
+GENERATE_COVER = config.get('cover', {}).get('generate_cover')
+IMAGE_GEN_MODEL = config.get('cover', {}).get('image_gen_model')
+MINIMAX_API_KEY = config.get('cover', {}).get('minimax_api_key')
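The three new constants read the `[cover]` table with chained `.get` calls, so a `settings.toml` that predates this change yields `None` for each of them. As a minimal sketch of one way to make those defaults explicit, the following assumes the parsed TOML is available as a plain mapping; `CoverSettings` and `load_cover_settings` are hypothetical names that do not appear in the repository, and the default values mirror the ones added to `settings.toml` in this diff.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class CoverSettings:
    """Hypothetical container for the [cover] table; not part of the project."""
    generate_cover: bool = False
    image_gen_model: str = "minimax"
    minimax_api_key: str = ""


def load_cover_settings(config: Mapping[str, Any]) -> CoverSettings:
    """Read the [cover] table, falling back to defaults for missing keys."""
    cover = config.get("cover") or {}
    return CoverSettings(
        generate_cover=bool(cover.get("generate_cover", False)),
        image_gen_model=str(cover.get("image_gen_model", "minimax")),
        minimax_api_key=str(cover.get("minimax_api_key", "")),
    )


# A config parsed from an older settings.toml that has no [cover] table.
print(load_cover_settings({}))
# A config with cover generation switched on.
print(load_cover_settings({"cover": {"generate_cover": True}}))
```

With explicit defaults, callers such as the `if GENERATE_COVER:` check in `src/upload/upload.py` would see `False` rather than `None` when the table is missing; both are falsy, so this is a readability choice rather than a behavior change.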
diff --git a/src/cover/__init__.py b/src/cover/__init__.py
diff --git a/src/cover/cover_generator.py b/src/cover/cover_generator.py
@@ -0,0 +1,59 @@
+from functools import wraps
+from src.log.logger import upload_log
+from src.config import IMAGE_GEN_MODEL
+import subprocess
+
+def cut_cover_use_ffmpeg(video_path):
+    """Extract a cover frame from the video using ffmpeg
+    Args:
+        video_path: str, path to the video file
+    Returns:
+        str: path to the extracted cover image, or None if ffmpeg fails
+    """
+    upload_log.info("begin to generate cover")
+    cover_path = video_path[:-4] + ".jpg"
+    ffmpeg_command = [
+        'ffmpeg', '-y', '-i', video_path, '-t', '1', '-r', '1', cover_path
+    ]
+    try:
+        result = subprocess.run(ffmpeg_command, check=True, capture_output=True, text=True)
+        upload_log.debug(f"FFmpeg output: {result.stdout}")
+        if result.stderr:
+            upload_log.debug(f"FFmpeg debug: {result.stderr}")
+        return cover_path
+    except subprocess.CalledProcessError as e:
+        upload_log.error(f"Error: {e.stderr}")
+        return None
+
+
+def cover_generator(model_type):
+    """Decorator to select cover generation function based on model type
+    Args:
+        model_type: str, type of model to use
+    Returns:
+        function: decorator that wraps a cover generation function
+    """
+    def decorator(func):
+        def wrapper(video_path):
+            cover_path = cut_cover_use_ffmpeg(video_path)
+            if cover_path is None:
+                upload_log.error("Failed to generate cover using ffmpeg")
+                return None
+            if model_type == "minimax":
+                from .image_model_sdk.minimax_sdk import minimax_generate_cover
+                return minimax_generate_cover(cover_path)
+            else:
+                upload_log.error(f"Unsupported model type: {model_type}")
+                return None
+        return wrapper
+    return decorator
+
+@cover_generator(IMAGE_GEN_MODEL)
+def generate_cover(video_path):
+    """Generate cover for video
+    Args:
+        video_path: str, path to the video file
+    Returns:
+        str: path to the generated cover image, or None on failure
+    """
+    pass  # The actual implementation is handled by the decorator
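`cut_cover_use_ffmpeg` derives the cover path with `video_path[:-4]`, which assumes a three-character extension such as `.flv` or `.mp4`. The sketch below shows one alternative way to build the same kind of command with `pathlib`, so the extension length does not matter. `build_cover_command` is a hypothetical helper, and using `-frames:v 1` to write exactly one frame is a substitution for the `-t 1 -r 1` pair in the diff, not the author's choice.

```python
from pathlib import Path
from typing import List


def build_cover_command(video_path: str) -> List[str]:
    """Build an ffmpeg command that writes the first frame next to the video.

    Hypothetical helper: the diff slices the path with [:-4] instead.
    """
    # with_suffix swaps whatever extension is present for .jpg.
    cover_path = Path(video_path).with_suffix(".jpg")
    return [
        "ffmpeg", "-y",          # overwrite an existing cover without prompting
        "-i", str(video_path),   # input video
        "-frames:v", "1",        # stop after one video frame
        str(cover_path),
    ]


print(build_cover_command("room_20250405.flv"))
print(build_cover_command("clip.webm"))
```

The command list can be passed to `subprocess.run(..., check=True, capture_output=True, text=True)` exactly as the diff does.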
diff --git a/src/cover/image_model_sdk/minimax_sdk.py b/src/cover/image_model_sdk/minimax_sdk.py
@@ -0,0 +1,53 @@
+import requests
+import json
+import base64
+import os
+import time
+from src.config import MINIMAX_API_KEY
+
+
+def minimax_generate_cover(your_file_path):
+    """Generate a cover image using the Minimax API
+    Args:
+        your_file_path: str, path to the image file
+    Returns:
+        str, local path of the downloaded cover image, or None on failure
+    """
+    cover_name = time.strftime("%Y%m%d%H%M%S") + ".png"
+    temp_cover_path = os.path.join(os.path.dirname(your_file_path), cover_name)
+
+    with open(your_file_path, "rb") as image_file:
+        data = base64.b64encode(image_file.read()).decode('utf-8')
+
+    payload = json.dumps({
+        "model": "image-01",
+        "prompt": "这是一个视频截图，请生成其对应的吉卜力风格的图片",
+        "subject_reference": [
+            {
+                "type": "character",
+                "image_file": f"data:image/jpeg;base64,{data}"
+            }
+        ],
+        "n": 2
+    })
+    headers = {
+        'Authorization': f'Bearer {MINIMAX_API_KEY}',
+        'Content-Type': 'application/json'
+    }
+
+    url = "https://api.minimax.chat/v1/image_generation"
+    response = requests.request("POST", url, headers=headers, data=payload).json()
+    if response['base_resp']['status_code'] == 0:
+        image_url = response['data']['image_urls'][0]
+        img_data = requests.get(image_url).content
+        with open(temp_cover_path, 'wb') as handler:
+            handler.write(img_data)
+        os.remove(your_file_path)
+        return temp_cover_path
+    else:
+        print(response['base_resp']['error_msg'])
+        return None
+
+if __name__ == "__main__":
+    your_file_path = ""
+    print(minimax_generate_cover(your_file_path))
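`minimax_generate_cover` builds the request body and indexes into the response in one function, which makes the response handling hard to exercise without a network call. As a rough sketch, the two steps could be split into pure helpers like the ones below. The field names (`model`, `subject_reference`, `base_resp.status_code`, `data.image_urls`) are copied from the diff, not checked against Minimax's documentation; `build_payload` and `first_image_url` are hypothetical names; and the default of `n=1` is an assumption, chosen because the diff requests two images but only downloads the first.

```python
import base64
from typing import Any, Dict, Optional


def build_payload(image_bytes: bytes, prompt: str, n: int = 1) -> Dict[str, Any]:
    """Build the request body in the shape used by the diff (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "model": "image-01",
        "prompt": prompt,
        "subject_reference": [
            {"type": "character", "image_file": f"data:image/jpeg;base64,{encoded}"}
        ],
        "n": n,
    }


def first_image_url(response: Dict[str, Any]) -> Optional[str]:
    """Return the first image URL, or None if the call failed or fields are missing."""
    base_resp = response.get("base_resp") or {}
    if base_resp.get("status_code") != 0:
        return None
    urls = (response.get("data") or {}).get("image_urls") or []
    return urls[0] if urls else None


# Exercise both helpers with made-up data; no network access is needed.
print(build_payload(b"abc", "example prompt")["subject_reference"][0]["image_file"])
print(first_image_url({"base_resp": {"status_code": 0},
                       "data": {"image_urls": ["https://example.invalid/a.png"]}}))
print(first_image_url({"base_resp": {"status_code": 1, "error_msg": "denied"}}))
```

Using `.get` throughout means a malformed response returns `None` instead of raising `KeyError`, which matches the `return None` path the function already has for API errors.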
diff --git a/src/log/retry.py b/src/log/retry.py
@@ -29,7 +29,7 @@ def run(self, func, *args, **kwargs) -> Tuple[bool, Any]:
                     status = (True,return_value)
                     break
             except Exception as e:
-                scan_log.error(f"Exceptions in trial {i+1}/{self.max_retry} : {e}")
+                scan_log.error(f"Exceptions in function {func.__name__} trial {i+1}/{self.max_retry} : {e}")
                 sleep(self.interval)
 
         return status
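The changed log line now names the function being retried. Only a fragment of `Retry.run` is visible in this hunk, so the following is a simplified stand-in rather than the project's class: `run_with_retry` is a hypothetical function, it treats any non-raising call as success, and it uses `getattr` as a fallback because some callables (for example `functools.partial` objects) have no `__name__`.

```python
import functools
import logging
from time import sleep
from typing import Any, Callable, Tuple

log = logging.getLogger("retry_sketch")  # stand-in for the project's scan_log


def run_with_retry(func: Callable[..., Any], *args: Any,
                   max_retry: int = 3, interval: float = 0.0,
                   **kwargs: Any) -> Tuple[bool, Any]:
    """Call func up to max_retry times; return (succeeded, return_value)."""
    # partial objects and some callables lack __name__, so fall back to repr.
    name = getattr(func, "__name__", repr(func))
    status: Tuple[bool, Any] = (False, None)
    for i in range(max_retry):
        try:
            status = (True, func(*args, **kwargs))
            break
        except Exception as e:
            log.error("Exceptions in function %s trial %d/%d : %s",
                      name, i + 1, max_retry, e)
            sleep(interval)
    return status


def always_fails() -> None:
    raise ValueError("boom")


print(run_with_retry(len, "abc"))
print(run_with_retry(functools.partial(always_fails), max_retry=2))
```

If the real `Retry.run` is ever handed a callable without `__name__`, the f-string in the diff would itself raise inside the `except` block, so a fallback of this kind may be worth considering.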

diff --git a/src/upload/upload.py b/src/upload/upload.py
@@ -3,26 +3,31 @@
 import subprocess
 import os
 import sys
-from src.config import SRC_DIR, BILIVE_DIR, RESERVE_FOR_FIXING, UPLOAD_LINE
+from src.config import SRC_DIR, BILIVE_DIR, RESERVE_FOR_FIXING, UPLOAD_LINE, GENERATE_COVER
 from datetime import datetime
 from src.upload.generate_upload_data import generate_video_data, generate_slice_data
 from src.upload.extract_video_info import generate_title
-from src.log.logger import upload_log
+from src.log.logger import upload_log, scan_log
 import time
 from concurrent.futures import ThreadPoolExecutor, as_completed
 from db.conn import get_single_upload_queue, delete_upload_queue, update_upload_queue_lock, get_single_lock_queue
 from .bilitool.bilitool import UploadController, FeedController, LoginController
 from src.log.retry import Retry
+from src.cover.cover_generator import generate_cover
 
 @Retry(max_retry = 3, interval = 5).decorator
 def upload_video(upload_path):
     try:
         if upload_path.endswith('.flv'):
             copyright, title, tid, tag = generate_slice_data(upload_path)
-            yaml, desc, source, cover, dynamic = ("",) * 5
+            if GENERATE_COVER:
+                cover = generate_cover(upload_path)
+            else:
+                cover = ""
+            yaml, desc, source, dynamic = ("",) * 4
             if title is None:
-                upload_log.error("Fail to upload slice video, the files will be reserved.")
-                update_upload_queue_lock(upload_path, 0)
+                upload_log.error("Fail to upload slice video, the files will be locked.")
+                update_upload_queue_lock(upload_path, 1)
                 return False
         else:
             copyright, title, desc, tid, tag, source, cover, dynamic = generate_video_data(upload_path)
@@ -31,16 +36,17 @@ def upload_video(upload_path):
         if result == True:
             upload_log.info("Upload successfully, then delete the video")
             os.remove(upload_path)
+            if cover:
+                os.remove(cover)
             delete_upload_queue(upload_path)
             return True
         else:
-            upload_log.error("Fail to upload, the files will be reserved.")
-            update_upload_queue_lock(upload_path, 0)
+            upload_log.error("Fail to upload, the files will be locked.")
+            update_upload_queue_lock(upload_path, 1)
             return False
-
-    except subprocess.CalledProcessError as e:
-        upload_log.error(f"The upload_video called failed, the files will be reserved. error: {e}")
-        update_upload_queue_lock(upload_path, 0)
+    except Exception as e:
+        upload_log.error(f"The upload_video called failed, the files will be converted to locked. error: {e}")
+        update_upload_queue_lock(upload_path, 1)
         return False
 
 @Retry(max_retry = 3, interval = 5).decorator
@@ -54,13 +60,13 @@ def append_upload(upload_path, bv_result):
             delete_upload_queue(upload_path)
             return True
         else:
-            upload_log.error("Fail to append, the files will be reserved.")
-            update_upload_queue_lock(upload_path, 0)
+            upload_log.error("Fail to append, the files will be locked.")
+            update_upload_queue_lock(upload_path, 1)
             return False
 
-    except subprocess.CalledProcessError as e:
-        upload_log.error(f"The append_upload called failed, the files will be reserved. error: {e}")
-        update_upload_queue_lock(upload_path, 0)
+    except Exception as e:
+        upload_log.error(f"The append_upload called failed, the files will be locked. error: {e}")
+        update_upload_queue_lock(upload_path, 1)
         return False
 
 def video_gate(video_path):
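In `upload_video`, `generate_cover` can return `None` when ffmpeg or the image API fails, and the generated file is removed only after a successful upload. The sketch below illustrates one possible way to normalize the cover value and clean it up defensively. `resolve_cover` and `remove_if_exists` are hypothetical helpers, and falling back to an empty cover when generation raises is a design assumption; in the diff an exception would instead reach the outer `except` and lock the queue entry.

```python
import os
import tempfile
from typing import Callable, Optional


def resolve_cover(upload_path: str, enabled: bool,
                  generator: Callable[[str], Optional[str]]) -> str:
    """Return a cover path, or "" when generation is disabled or fails."""
    if not enabled:
        return ""
    try:
        cover = generator(upload_path)
    except Exception:
        # Assumption: a missing cover should not block the upload itself.
        return ""
    return cover or ""


def remove_if_exists(path: str) -> bool:
    """Delete path if it is an existing file; report whether anything was removed."""
    if path and os.path.isfile(path):
        os.remove(path)
        return True
    return False


# Demonstrate with a throwaway file standing in for a generated cover.
fd, tmp_cover = tempfile.mkstemp(suffix=".png")
os.close(fd)
print(resolve_cover("clip.flv", True, lambda _p: tmp_cover) == tmp_cover)
print(resolve_cover("clip.flv", True, lambda _p: None))
print(remove_if_exists(tmp_cover), remove_if_exists(tmp_cover))
```

Normalizing to `""` keeps the value consistent with the `cover = ""` branch in the diff, so downstream code sees one "no cover" representation.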