
Yolov4 Based Dog Clip Detection #7


Open · wants to merge 2 commits into main

Conversation

drewaogle

No description provided.

@drewaogle self-assigned this Nov 21, 2024
@@ -0,0 +1,963 @@
{
Collaborator

Let's follow a different organization - `video_object_detection/yolov4_clips/*` or something along those lines.
I want to call out applications more up top. `notebooks` doesn't say much, and you end up having helper files that are not notebooks.

Collaborator

@vishakha041 left a comment

I mainly focused on the notebook; I haven't checked the Python code parts - @gsaluja9 or @bovlb can help with that.

"id": "2ef71144-e2c1-4696-a82a-a9c49dae001e",
"metadata": {},
"source": [
"# Creating Video Clips\n",
Collaborator

Let's change the name to something application-specific --> "Find Dogs in Videos Stored in ApertureDB" or something like that. Remember, we are doing apps here - creating clips / detecting / loading are steps for that app.

"source": [
"# Creating Video Clips\n",
"This notebook demonstrates using a YOLOv4 model to process individual frames and use the resulting output to generate information about the video.\n",
"We use detections of labels over sequential frames to generate Clips which describe the existance of those objects within a specific portion of the video."
Collaborator

Suggested change
"We use detections of labels over sequential frames to generate Clips which describe the existance of those objects within a specific portion of the video."
"We use detections of labels over sequential frames to generate Clips which describe the existence of those objects within a specific portion of the video."
"Once these objects are detected, you can then search videos with those objects using ApertureDB video search."

"id": "8835379a-1c02-44d3-930c-f321279ee0bb",
"metadata": {},
"source": [
"# Install ApertureDB\n",
Collaborator

Suggested change
"# Install ApertureDB\n",
"## Install ApertureDB\n",

Collaborator

I like the structuring in this example: https://docs.aperturedata.io/HowToGuides/Applications/similarity_search
I'm looking for where we started defining standards, but I've lost track of @bovlb's PRs.

Collaborator

But we do need to point to cloud setup in all these.

"id": "7e1a96c2-b0d1-47b6-b836-c0eb0142101f",
"metadata": {},
"source": [
"# Download resources\n",
Collaborator

Could you please introduce these as sections / subsections? Right now they are all main headers - so Setup can be a main section, then Download a subsection, etc. This is going to land in docs, so I'm thinking about the look there.

"# Download resources\n",
"Now we need to download the python code to run the model, `yolov4.py` and the video we are going to use, `norman.mp4`, a video about a dog riding a bicycle.\n",
"\n",
"This was chosen because it includes several labels but also because it has detections which overlap - dogs and bikes."
Collaborator

Suggested change
"This was chosen because it includes several labels but also because it has detections which overlap - dogs and bikes."
"We chose this video because it includes several labels and has detections that overlap - dogs and bikes."

{
"cell_type": "code",
"execution_count": null,
"id": "1fda14d1-dc17-48da-af51-bdeb8d6e468d",
Collaborator

Could we move this to the helper file for this notebook too?

"outputs": [],
"source": [
"import pandas as pd\n",
"df = pd.read_csv(\"output/norman/detections.csv\")\n",
Collaborator

Could you run this code or show a snippet? It's nice to see the output right away
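To illustrate the reviewer's ask, here is a minimal sketch of what inspecting the detections CSV might look like. The rows are synthetic and the column names are taken from the `preprocess` cell quoted later in this thread; the real `output/norman/detections.csv` would be loaded with `pd.read_csv` instead.

```python
import pandas as pd

# Synthetic rows in the assumed shape of output/norman/detections.csv.
rows = [
    (0, "dog", 91, 10, 20, 50, 40),
    (0, "bicycle", 88, 5, 15, 120, 80),
    (1, "dog", 87, 12, 21, 50, 40),
]
df = pd.DataFrame(rows, columns=["frame", "label", "confidence",
                                 "left", "top", "width", "height"])

# The quick looks a reader would want to see right after read_csv:
print(df.head())
print(df["label"].value_counts())
```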

" initconf=50 # minimum confidence to start a clip (0-100)\n",
" initlen=5 # minimum detection duration in frames to start a clip\n",
" dropconf=25 # confidence below which a detection counts as missed (0-100)\n",
" droplen=5 # number of missed detection frames to end a clip\n",
Collaborator

There is a lot of helper code here that isn't showing anything Aperture-related, and it's distracting. It can be imported in the dependencies section via a helper file.
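The four parameters quoted above (`initconf`, `initlen`, `dropconf`, `droplen`) imply a hysteresis scheme for turning per-frame detections into clips. This is a minimal sketch of that logic under that assumption; the function name and exact rules are illustrative, not the notebook's actual helper code.

```python
def build_clips(confs, initconf=50, initlen=5, dropconf=25, droplen=5):
    """Turn a per-frame confidence series (0-100) for one label into
    (start_frame, end_frame) clips, using hysteresis thresholds:
    a clip opens after `initlen` consecutive frames at >= initconf,
    and closes after `droplen` consecutive frames at < dropconf."""
    clips = []
    in_clip = False
    run = 0  # consecutive frames meeting the current criterion
    start = 0
    for i, c in enumerate(confs):
        if not in_clip:
            if c >= initconf:
                run += 1
                if run == initlen:          # sustained detection: open clip
                    in_clip = True
                    start = i - initlen + 1
                    run = 0
            else:
                run = 0
        else:
            if c < dropconf:
                run += 1
                if run == droplen:          # sustained miss: close clip
                    clips.append((start, i - droplen))
                    in_clip = False
                    run = 0
            else:
                run = 0
    if in_clip:                             # clip runs to end of video
        clips.append((start, len(confs) - 1))
    return clips
```

For example, six strong frames followed by six misses yields a single clip spanning the strong frames.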

Collaborator

Also take a look at this from our previous notebook, which @gsaluja9 built to wrap Clips into DataModel classes in the Python SDK. I believe you will need to enhance those to remove all this JSON creation from this notebook, and we will get more useful code into the Python SDK.

"id": "bd469e3d-c84b-4ce6-ac53-6e24ff47c2f5",
"metadata": {},
"source": [
"## Issue - dropped dog at 120\n",
Collaborator

It's these query parts that are very interesting... do run the code and check it in with the responses.

"## Verification Solution\n",
"Now we've been able to tell with a powerful query that we didn't have a dog confidence high enough to start a clip.\n",
"\n",
"We can then decide to fine-tune our model - have it train more on partially obscured dogs - lower the threshold, or accept our results.\n",
Collaborator

Where should people go next from here? Put some suggested links (remember docs :) )
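The verification step described above ("we didn't have a dog confidence high enough to start a clip") can be sketched in pandas against the detections frame. The rows here are synthetic and the 115-125 window and `initconf` value are assumptions from earlier in the thread; the notebook's actual verification runs as an ApertureDB query.

```python
import pandas as pd

# Hypothetical detections around frame 120 (columns assumed from the CSV).
df = pd.DataFrame(
    [(118, "dog", 42), (119, "dog", 38), (120, "dog", 45), (121, "dog", 48)],
    columns=["frame", "label", "confidence"],
)

initconf = 50  # minimum confidence to start a clip, as in the notebook

# Did any dog detection in this window reach the clip-start threshold?
window = df[(df.label == "dog") & (df.frame.between(115, 125))]
peak = window["confidence"].max()
print(f"peak dog confidence near frame 120: {peak} (threshold {initconf})")
```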

def initialize_network(self):
""" Method to initialize and load the model. """

self.net = cv2.dnn_DetectionModel(self.args.cfg, self.args.weights)
Contributor

We should provide more information on the core of this wrapper.

Do we want to clarify to the reader that up until v4 the official repo was not from Ultralytics, and that here we are creating a cv2 model? https://github.com/AlexeyAB/darknet

Also, this link would be useful to curious folks: https://arxiv.org/abs/2004.10934

A segue into cv2 dnn would also be a good next step to explore: https://docs.opencv.org/4.x/da/d9d/tutorial_dnn_yolo.html

def parse_arguments(self):
""" Method to parse arguments using argparser. """

parser = argparse.ArgumentParser(description='Object Detection using YOLOv4 and OpenCV4')
Contributor

Is there a reason to add this style of arguments? Is this intended to be used as a CLI in the future?

def stream_inf(self):
""" Method to run inference on a stream. """

source = cv2.VideoCapture(0 if self.args.stream == 'webcam' else self.args.stream)
Contributor

This would need to be run on a local machine (sans Docker) for the webcam to work as expected.


if __name__== '__main__':

yolo = YOLOv4.__new__(YOLOv4)
Contributor

This is odd. Isn't `yolo = YOLOv4()` the way to instantiate an object, rather than calling the underlying dunder methods?
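The reviewer's point can be shown with a tiny example: `cls.__new__(cls)` only allocates the instance and skips `__init__`, so any attributes set there never exist. The `Detector` class here is illustrative, not the PR's `YOLOv4` class.

```python
class Detector:
    def __init__(self):
        self.loaded = True  # normal construction runs __init__

a = Detector()                   # __new__ then __init__
b = Detector.__new__(Detector)   # allocation only; __init__ is skipped

print(hasattr(a, "loaded"))  # True
print(hasattr(b, "loaded"))  # False
```

This is why `YOLOv4.__new__(YOLOv4)` is suspicious: the object exists but none of its initialization has run.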

Collaborator

@bovlb left a comment

partial review

"metadata": {},
"outputs": [],
"source": [
"!pip install aperturedb tqdm 2>&1 >/dev/null\n",
Collaborator

Suggested change
"!pip install aperturedb tqdm 2>&1 >/dev/null\n",
"%pip install --quiet --upgrade aperturedb tqdm\n",

Using "%" rather than "!" is more reliable wrt virtual environments.

I'd also prefer to see PIP commands in a separate cell.

"\n",
"# Retrieve the YOLO4 interface\n",
"\n",
"!wget https://raw.githubusercontent.com/drewaogle/YOLOv4-OpenCV-CUDA-DNN/refs/heads/main/yolo4.py\n",
Collaborator

It's not ideal that we're relying on your personal repo here. Could this be under applications?

"\n",
"!wget https://raw.githubusercontent.com/drewaogle/YOLOv4-OpenCV-CUDA-DNN/refs/heads/main/yolo4.py\n",
"\n",
"# Retreive video\n",
Collaborator

Suggested change
"# Retreive video\n",
"## Retrieve video\n",

"metadata": {},
"outputs": [],
"source": [
"# Now we retrieve the items we are working with:\n",
Collaborator

Suggested change
"# Now we retrieve the items we are working with:\n",
"## Now we retrieve the items we are working with:\n",

"source": [
"# Now we retrieve the items we are working with:\n",
"\n",
"# Retrieve the YOLO4 interface\n",
Collaborator

Suggested change
"# Retrieve the YOLO4 interface\n",
"## Retrieve the YOLO4 interface\n",

"id": "b5dae152-0643-4701-a505-0b331dd9b204",
"metadata": {},
"source": [
"# Run The Detector\n",
Collaborator

Suggested change
"# Run The Detector\n",
"## Run The Detector\n",

"source": [
"# Run The Detector\n",
"Now that we've downloaded our YOLOv4 code, let's run it.\n",
"This will need to download the weights and some configuration; about 300M and will do it automatically.\n",
Collaborator

Suggested change
"This will need to download the weights and some configuration; about 300M and will do it automatically.\n",
"This will automatically download about 300MB of weights and some configuration.\n",

"Now that we've downloaded our YOLOv4 code, let's run it.\n",
"This will need to download the weights and some configuration; about 300M and will do it automatically.\n",
"\n",
"After downloading or verify files, it will then process the video. with `no_squash_detections` as `True` it won't overwrite an existing output dir, so delete it to rerun. This code can support hardware acceleration, but is designed so it won't be unwieldly without it. Detections should be at about 3-10fps without hardware, and take less 5 minutes.\n",
Collaborator

Suggested change
"After downloading or verify files, it will then process the video. with `no_squash_detections` as `True` it won't overwrite an existing output dir, so delete it to rerun. This code can support hardware acceleration, but is designed so it won't be unwieldly without it. Detections should be at about 3-10fps without hardware, and take less 5 minutes.\n",
"After downloading and verifying files, it will then process the video. With `no_squash_detections` as `True` it won't overwrite an existing output directory, so delete `output` to rerun. This code can support hardware acceleration, but is designed so it won't be unwieldy without it. Detections should run at about 3-10 fps without a GPU, and take less than 5 minutes.\n",

" self.stream=stream # 'webcam' to open webcam w/ OpenCV\n",
"\n",
"# now we pull data and run detection.\n",
"dopts = DetectorOptions( stream=\"norman.mp4\")\n",
Collaborator

Suggested change
"dopts = DetectorOptions( stream=\"norman.mp4\")\n",
"dopts = DetectorOptions(stream=\"norman.mp4\")\n",

Collaborator

@bovlb left a comment

Generally looks good.

I think it would be good if our notebooks complied with PEP8. Specifically, methods should have a blank line before and after.

"# function to prepare dataframe for work; add columns and trim frames we don't want.\n",
"def preprocess(df, args ):\n",
" processed = df\n",
" processed.columns = [\"frame\",\"label\",\"confidence\",\"left\",\"top\",\"width\",\"height\" ]\n",
Collaborator

Suggested change
" processed.columns = [\"frame\",\"label\",\"confidence\",\"left\",\"top\",\"width\",\"height\" ]\n",
" processed.columns = [\"frame\", \"label\", \"confidence\", \"left\", \"top\", \"width\", \"height\"]\n",

" processed.drop(processed[processed.frame > args.end_frame].index,inplace=True)\n",
" return processed\n",
"\n",
"norman_detects = preprocess( df, clip_opts )\n",
Collaborator

Suggested change
"norman_detects = preprocess( df, clip_opts )\n",
"norman_detects = preprocess(df, clip_opts)\n",
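The `preprocess` function quoted above can be exercised on a tiny synthetic frame to see what it does. The `clip_opts` object here is a hypothetical stand-in (a `SimpleNamespace` with just `end_frame`), not the notebook's actual options class.

```python
import pandas as pd
from types import SimpleNamespace

def preprocess(df, args):
    """Name the raw CSV columns and drop frames past args.end_frame."""
    processed = df
    processed.columns = ["frame", "label", "confidence",
                         "left", "top", "width", "height"]
    processed.drop(processed[processed.frame > args.end_frame].index,
                   inplace=True)
    return processed

# Three synthetic detections; the raw CSV has no header row.
raw = pd.DataFrame([
    (0, "dog", 90, 1, 2, 3, 4),
    (5, "dog", 80, 1, 2, 3, 4),
    (9, "dog", 70, 1, 2, 3, 4),
])
clip_opts = SimpleNamespace(end_frame=5)  # hypothetical options object
detects = preprocess(raw, clip_opts)
print(detects["frame"].tolist())  # frames past end_frame are gone
```

Note that the `inplace=True` drop mutates the caller's DataFrame, which is worth a comment in the notebook itself.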

"metadata": {},
"source": [
"### Detection Verification\n",
"This is pretty much what we would expect. bike, dog, person .. a car detection in the background, a nice find.\n",
Collaborator

Suggested change
"This is pretty much what we would expect. bike, dog, person .. a car detection in the background, a nice find.\n",
"This is pretty much what we would expect. bike, dog, person … and a car detection in the background, a nice find.\n",

"import cv2\n",
"from PIL import Image\n",
"\n",
"def display_image_and_bb( num, df ):\n",
Collaborator

Suggested change
"def display_image_and_bb( num, df ):\n",
"def display_image_and_bb( num, df ):\n",
" \"\"\"Display the image with bounding box and label\"\"\"\n",

Comment on lines +483 to +486
" return clip_store.finished\n",
" \n",
"\n",
" "
Collaborator

Suggested change
" return clip_store.finished\n",
" \n",
"\n",
" "
" return clip_store.finished\n"
