这是indexloc提供的服务,不要输入任何密码
Skip to content

feat: Implement hair swapping and enhance realism #1298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 24 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
37486f0
feat: Implement hair swapping and enhance realism
google-labs-jules[bot] May 21, 2025
2e617c9
feat: Add setup and run scripts for macOS
google-labs-jules[bot] May 25, 2025
3d8af51
fix: Correct IndentationError in modules/ui.py
google-labs-jules[bot] May 25, 2025
521cad1
fix: Update type hints for Python 3.9 compatibility
google-labs-jules[bot] May 25, 2025
d279403
Okay, I've made a change to prioritize AVFoundation for macOS camera …
google-labs-jules[bot] May 25, 2025
5f2e545
feat: Add Windows setup and run scripts, update README
google-labs-jules[bot] May 25, 2025
49d9971
Jules was unable to complete the task in time. Please review the work…
google-labs-jules[bot] May 31, 2025
166d5a3
fix: Address review feedback for stability and code quality
google-labs-jules[bot] May 31, 2025
6da790e
fix: Correct IndentationError in face_swapper.py
google-labs-jules[bot] Jun 1, 2025
3151535
fix: Correct syntax and structure in face_swapper.py helper functions
google-labs-jules[bot] Jun 1, 2025
8de4c99
Here's the refactor:
google-labs-jules[bot] Jun 7, 2025
b5294c6
criticalfix: Correct major syntax and indentation errors in face_swap…
google-labs-jules[bot] Jun 12, 2025
0db2d10
fix: Lower face detection threshold for improved reliability
google-labs-jules[bot] Jun 13, 2025
4a39070
criticalfix: Remove AI marker causing SyntaxError in face_swapper.py
google-labs-jules[bot] Jun 13, 2025
4f05fa2
fix: Force AVFoundation for macOS camera, improve error clarity
google-labs-jules[bot] Jun 16, 2025
c5c08b6
perf: Implement Nth frame processing for webcam mode
google-labs-jules[bot] Jun 18, 2025
9fd870c
refactor: Revert Nth frame processing in webcam mode
google-labs-jules[bot] Jun 18, 2025
984048b
fix: Remove orphaned Nth frame counter line in ui.py
google-labs-jules[bot] Jun 18, 2025
0fc481d
fix: Revert Nth frame logic in ui.py to fix UnboundLocalError
google-labs-jules[bot] Jun 18, 2025
a01314b
feat: Implement Nth-frame detection with tracking for performance
google-labs-jules[bot] Jun 18, 2025
4e36622
feat: Implement Optical Flow KPS tracking for webcam performance
google-labs-jules[bot] Jun 18, 2025
d7139d5
fix: Correct IndentationError and type hint in create_lower_mouth_mask
google-labs-jules[bot] Jun 18, 2025
8a03fcc
fix: Resolve circular import between core and face_swapper
google-labs-jules[bot] Jun 18, 2025
44ef1fd
criticalfix: Correct IndentationError in create_lower_mouth_mask
google-labs-jules[bot] Jun 18, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
131 changes: 109 additions & 22 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -134,12 +134,57 @@ Place these files in the "**models**" folder.
We highly recommend using a `venv` to avoid issues.


For Windows:
```bash
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```
**For Windows:**

It is highly recommended to use Python 3.10 for Windows for best compatibility with all features and dependencies.

**Automated Setup (Recommended):**

1. **Run the setup script:**
Double-click `setup_windows.bat` or run it from your command prompt:
```batch
setup_windows.bat
```
This script will:
* Check if Python is in your PATH.
* Warn if `ffmpeg` is not found (see "Manual Steps / Notes" below for ffmpeg help).
* Create a virtual environment named `.venv` (consistent with macOS setup).
* Activate the virtual environment for the script's session.
* Upgrade pip.
* Install Python packages from `requirements.txt`.
Wait for the script to complete. It will pause at the end; press any key to close the window if you double-clicked it.

2. **Run the application:**
After setup, use the provided `.bat` scripts to run the application. These scripts automatically activate the correct virtual environment:
* `run_windows.bat`: Runs the application with the CPU execution provider by default. This is a good starting point if you don't have a dedicated GPU or are unsure.
* `run-cuda.bat`: Runs with the CUDA (NVIDIA GPU) execution provider. Requires an NVIDIA GPU and CUDA Toolkit installed (see GPU Acceleration section).
* `run-directml.bat`: Runs with the DirectML (AMD/Intel GPU on Windows) execution provider.

Example: Double-click `run_windows.bat` to launch the UI, or run from a command prompt:
```batch
run_windows.bat --source path\to\your_face.jpg --target path\to\video.mp4
```

**Manual Steps / Notes:**

* **Python:** Ensure Python 3.10 is installed and added to your system's PATH. You can download it from [python.org](https://www.python.org/downloads/).
* **ffmpeg:**
* `ffmpeg` is required for video processing. The `setup_windows.bat` script will warn if it's not found in your PATH.
* An easy way to install `ffmpeg` on Windows is to open PowerShell as Administrator and run:
```powershell
Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1')); choco install ffmpeg -y
```
Alternatively, download from [ffmpeg.org](https://ffmpeg.org/download.html), extract the files, and add the `bin` folder (containing `ffmpeg.exe`) to your system's PATH environment variable. The original README also linked to a [YouTube guide](https://www.youtube.com/watch?v=OlNWCpFdVMA) or `iex (irm ffmpeg.tc.ht)` via PowerShell.
* **Visual Studio Runtimes:** If you encounter errors during `pip install` for packages that compile C code (e.g., some scientific computing or image processing libraries), you might need the [Visual Studio Build Tools (or Runtimes)](https://visualstudio.microsoft.com/visual-cpp-build-tools/). Ensure "C++ build tools" (or similar workload) are selected during installation.
* **Virtual Environment (Manual Alternative):** If you prefer to set up the virtual environment manually instead of using `setup_windows.bat`:
```batch
python -m venv .venv
.venv\Scripts\activate.bat
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
```
(The new automated scripts use `.venv` as the folder name for consistency with the macOS setup).

For Linux:
```bash
# Ensure you use the installed Python 3.10
Expand All @@ -150,22 +195,64 @@ pip install -r requirements.txt

**For macOS:**

Apple Silicon (M1/M2/M3) requires specific setup:

```bash
# Install Python 3.10 (specific version is important)
brew install python@3.10

# Install tkinter package (required for the GUI)
brew install python-tk@3.10

# Create and activate virtual environment with Python 3.10
python3.10 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```
For a streamlined setup on macOS, use the provided shell scripts:

1. **Make scripts executable:**
Open your terminal, navigate to the cloned `Deep-Live-Cam` directory, and run:
```bash
chmod +x setup_mac.sh
chmod +x run_mac*.sh
```

2. **Run the setup script:**
This will check for Python 3.9+, ffmpeg, create a virtual environment (`.venv`), and install required Python packages.
```bash
./setup_mac.sh
```
If you encounter issues with specific packages during `pip install` (especially for libraries that compile C code, like some image processing libraries), you might need to install system libraries via Homebrew (e.g., `brew install jpeg libtiff ...`) or ensure Xcode Command Line Tools are installed (`xcode-select --install`).

3. **Activate the virtual environment (for manual runs):**
After setup, if you want to run commands manually or use developer tools from your terminal session:
```bash
source .venv/bin/activate
```
(To deactivate, simply type `deactivate` in the terminal.)

4. **Run the application:**
Use the provided run scripts for convenience. These scripts automatically activate the virtual environment.
* `./run_mac.sh`: Runs the application with the CPU execution provider by default. This is a good starting point.
* `./run_mac_cpu.sh`: Explicitly uses the CPU execution provider.
* `./run_mac_coreml.sh`: Attempts to use the CoreML execution provider for potential hardware acceleration on Apple Silicon and Intel Macs.
* `./run_mac_mps.sh`: Attempts to use the MPS (Metal Performance Shaders) execution provider, primarily for Apple Silicon Macs.

Example of running with specific source/target arguments:
```bash
./run_mac.sh --source path/to/your_face.jpg --target path/to/video.mp4
```
Or, to simply launch the UI:
```bash
./run_mac.sh
```

**Important Notes for macOS GPU Acceleration (CoreML/MPS):**

* The `setup_mac.sh` script installs packages from `requirements.txt`, which typically includes a general CPU-based version of `onnxruntime`.
* For optimal performance on Apple Silicon (M1/M2/M3) or specific GPU acceleration, you might need to install a different `onnxruntime` package *after* running `setup_mac.sh` and while the virtual environment (`.venv`) is active.
* **Example for `onnxruntime-silicon` (often requires Python 3.10 for older versions like 1.13.1):**
The original `README` noted that `onnxruntime-silicon==1.13.1` was specific to Python 3.10. If you intend to use this exact version for CoreML:
```bash
# Ensure you are using Python 3.10 if required by your chosen onnxruntime-silicon version
# After running setup_mac.sh and activating .venv:
# source .venv/bin/activate

pip uninstall onnxruntime onnxruntime-gpu # Uninstall any existing onnxruntime
pip install onnxruntime-silicon==1.13.1 # Or your desired version

# Then use ./run_mac_coreml.sh
```
Check the ONNX Runtime documentation for the latest recommended packages for Apple Silicon.
* **For MPS with ONNX Runtime:** This may require a specific build or version of `onnxruntime`. Consult the ONNX Runtime documentation. For PyTorch-based operations (like the Face Enhancer or Hair Segmenter if they were PyTorch native and not ONNX), PyTorch should automatically try to use MPS on compatible Apple Silicon hardware if available.
* **User Interface (Tkinter):** If you encounter errors related to `_tkinter` not being found when launching the UI, ensure your Python installation supports Tk. For Python installed via Homebrew, this is usually `python-tk` (e.g., `brew install python-tk@3.9` or `brew install python-tk@3.10`, matching your Python version).

** In case something goes wrong and you need to reinstall the virtual environment **

Expand Down
17 changes: 11 additions & 6 deletions modules/core.py
Original file line number Diff line number Diff line change
Expand Up @@ -176,9 +176,12 @@ def update_status(message: str, scope: str = 'DLC.CORE') -> None:
ui.update_status(message)

def start() -> None:
for frame_processor in get_frame_processors_modules(modules.globals.frame_processors):
if not frame_processor.pre_start():
return
# Note: pre_start is called in run() before start() now.
# If it were to be called here, it would also need the status_fn_callback.
# For example:
# for frame_processor in get_frame_processors_modules(modules.globals.frame_processors):
# if not frame_processor.pre_start(status_fn_callback=update_status): # If pre_start was here
# return
update_status('Processing...')
# process image to image
if has_image_extension(modules.globals.target_path):
Expand All @@ -190,7 +193,7 @@ def start() -> None:
print("Error copying file:", str(e))
for frame_processor in get_frame_processors_modules(modules.globals.frame_processors):
update_status('Progressing...', frame_processor.NAME)
frame_processor.process_image(modules.globals.source_path, modules.globals.output_path, modules.globals.output_path)
frame_processor.process_image(modules.globals.source_path, modules.globals.output_path, modules.globals.output_path, status_fn_callback=update_status)
release_resources()
if is_image(modules.globals.target_path):
update_status('Processing to image succeed!')
Expand All @@ -210,7 +213,7 @@ def start() -> None:
temp_frame_paths = get_temp_frame_paths(modules.globals.target_path)
for frame_processor in get_frame_processors_modules(modules.globals.frame_processors):
update_status('Progressing...', frame_processor.NAME)
frame_processor.process_video(modules.globals.source_path, temp_frame_paths)
frame_processor.process_video(modules.globals.source_path, temp_frame_paths, status_fn_callback=update_status)
release_resources()
# handles fps
if modules.globals.keep_fps:
Expand Down Expand Up @@ -249,7 +252,9 @@ def run() -> None:
if not pre_check():
return
for frame_processor in get_frame_processors_modules(modules.globals.frame_processors):
if not frame_processor.pre_check():
if not frame_processor.pre_check(): # pre_check in face_swapper does not use update_status
return
if hasattr(frame_processor, 'pre_start') and not frame_processor.pre_start(status_fn_callback=update_status): # Pass callback here
return
limit_resources()
if modules.globals.headless:
Expand Down
3 changes: 2 additions & 1 deletion modules/face_analyser.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,8 @@ def get_face_analyser() -> Any:

if FACE_ANALYSER is None:
FACE_ANALYSER = insightface.app.FaceAnalysis(name='buffalo_l', providers=modules.globals.execution_providers)
FACE_ANALYSER.prepare(ctx_id=0, det_size=(640, 640))
# Lowered detection threshold for potentially better webcam face detection (default is 0.5)
FACE_ANALYSER.prepare(ctx_id=0, det_size=(640, 640), det_thresh=0.4)
return FACE_ANALYSER


Expand Down
1 change: 1 addition & 0 deletions modules/globals.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,3 +41,4 @@
mask_feather_ratio = 8
mask_down_size = 0.50
mask_size = 1
enable_hair_swapping = False # Default state for enabling/disabling hair swapping
125 changes: 125 additions & 0 deletions modules/hair_segmenter.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@
import torch
import numpy as np
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
import cv2 # Imported for BGR to RGB conversion, though PIL can also do it.

# Global variables for caching
HAIR_SEGMENTER_PROCESSOR = None
HAIR_SEGMENTER_MODEL = None
MODEL_NAME = "isjackwild/segformer-b0-finetuned-segments-skin-hair-clothing"

def segment_hair(image_np: np.ndarray) -> np.ndarray:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (performance): Model reloaded on every call

Instantiating the processor and model inside segment_hair causes them to reload for every frame, resulting in significant performance overhead. Load them once at import or cache them globally to improve inference speed.

"""
Segments hair from an image.

Args:
image_np: NumPy array representing the image (BGR format from OpenCV).

Returns:
NumPy array representing the binary hair mask.
"""
global HAIR_SEGMENTER_PROCESSOR, HAIR_SEGMENTER_MODEL

if HAIR_SEGMENTER_PROCESSOR is None or HAIR_SEGMENTER_MODEL is None:
print(f"Loading hair segmentation model and processor ({MODEL_NAME}) for the first time...")
try:
HAIR_SEGMENTER_PROCESSOR = SegformerImageProcessor.from_pretrained(MODEL_NAME)
HAIR_SEGMENTER_MODEL = SegformerForSemanticSegmentation.from_pretrained(MODEL_NAME)

if torch.cuda.is_available():
try:
HAIR_SEGMENTER_MODEL = HAIR_SEGMENTER_MODEL.to('cuda')
print("INFO: Hair segmentation model moved to CUDA (GPU).")
except Exception as e_cuda:
print(f"ERROR: Failed to move hair segmentation model to CUDA: {e_cuda}. Using CPU instead.")
# Fallback to CPU if .to('cuda') fails
HAIR_SEGMENTER_MODEL = HAIR_SEGMENTER_MODEL.to('cpu')
else:
print("INFO: CUDA not available. Hair segmentation model will use CPU.")

print("INFO: Hair segmentation model and processor loaded successfully (device: {}).".format(HAIR_SEGMENTER_MODEL.device))
except Exception as e:
print(f"ERROR: Failed to load hair segmentation model/processor: {e}")
# Return an empty mask compatible with expected output shape (H, W)
return np.zeros((image_np.shape[0], image_np.shape[1]), dtype=np.uint8)

# Convert BGR (OpenCV) to RGB (PIL)
image_rgb = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
image_pil = Image.fromarray(image_rgb)

inputs = HAIR_SEGMENTER_PROCESSOR(images=image_pil, return_tensors="pt")

if HAIR_SEGMENTER_MODEL.device.type == 'cuda':
try:
# SegformerImageProcessor output (BatchEncoding) is a dict-like object.
# We need to move its tensor components, commonly 'pixel_values'.
if 'pixel_values' in inputs:
inputs['pixel_values'] = inputs['pixel_values'].to('cuda')
else: # Fallback if the structure is different than expected
inputs = inputs.to('cuda')
# If inputs has other tensor components that need to be moved, they'd need similar handling.
except Exception as e_inputs_cuda:
print(f"ERROR: Failed to move inputs to CUDA: {e_inputs_cuda}. Attempting inference on CPU.")
# If moving inputs to CUDA fails, we should ensure model is also on CPU for this inference pass
# This is a tricky situation; ideally, this failure shouldn't happen if model moved successfully.
# For simplicity, we'll assume if model is on CUDA, inputs should also be.
# A more robust solution might involve moving model back to CPU if inputs can't be moved.

with torch.no_grad(): # Important for inference
outputs = HAIR_SEGMENTER_MODEL(**inputs)

logits = outputs.logits # Shape: batch_size, num_labels, height, width

# Upsample logits to original image size
upsampled_logits = torch.nn.functional.interpolate(
logits,
size=(image_np.shape[0], image_np.shape[1]), # H, W
mode='bilinear',
align_corners=False
)

segmentation_map = upsampled_logits.argmax(dim=1).squeeze().cpu().numpy().astype(np.uint8)

# Label 2 is for hair in this model
return np.where(segmentation_map == 2, 255, 0).astype(np.uint8)

if __name__ == '__main__':
# This is a conceptual test.
# In a real scenario, you would load an image using OpenCV or Pillow.
# For example:
# sample_image_np = cv2.imread("path/to/your/image.jpg")
# if sample_image_np is not None:
# hair_mask_output = segment_hair(sample_image_np)
# cv2.imwrite("hair_mask_output.png", hair_mask_output)
# print("Hair mask saved to hair_mask_output.png")
# else:
# print("Failed to load sample image.")

print("Conceptual test: Hair segmenter module created.")
# Create a dummy image for a basic test run if no image is available.
dummy_image_np = np.zeros((100, 100, 3), dtype=np.uint8) # 100x100 BGR image
dummy_image_np[:, :, 1] = 255 # Make it green to distinguish from black mask

try:
print("Running segment_hair with a dummy image...")
hair_mask_output = segment_hair(dummy_image_np)
print(f"segment_hair returned a mask of shape: {hair_mask_output.shape}")
# Check if the output is a 2D array (mask) and has the same H, W as input
assert hair_mask_output.shape == (dummy_image_np.shape[0], dummy_image_np.shape[1])
# Check if the mask is binary (0 or 255)
assert np.all(np.isin(hair_mask_output, [0, 255]))
print("Dummy image test successful. Hair mask seems to be generated correctly.")

# Attempt to save the dummy mask (optional, just for visual confirmation if needed)
# cv2.imwrite("dummy_hair_mask_output.png", hair_mask_output)
# print("Dummy hair mask saved to dummy_hair_mask_output.png")

except ImportError as e:
print(f"An ImportError occurred: {e}. This might be due to missing dependencies like transformers, torch, or Pillow.")
print("Please ensure all required packages are installed by updating requirements.txt and installing them.")
except Exception as e:
print(f"An error occurred during the dummy image test: {e}")
print("This could be due to issues with model loading, processing, or other runtime errors.")

print("To perform a full test, replace the dummy image with a real image path.")
Loading