
Yolov4 Based Dog Clip Detection #7


Open · wants to merge 2 commits into main

Conversation

drewaogle

No description provided.

@drewaogle self-assigned this Nov 21, 2024
@@ -0,0 +1,963 @@
{
Collaborator

Let's follow a different organization - `video_object_detection/yolov4_clips/*` or something along those lines.
I want to call out applications more up top. `notebooks` doesn't say much, and you end up having helper files that are not notebooks.

Collaborator

@vishakha041 left a comment

I mainly focused on the notebook; I haven't checked the Python code parts - @gsaluja9 or @bovlb can help with that.

"id": "2ef71144-e2c1-4696-a82a-a9c49dae001e",
"metadata": {},
"source": [
"# Creating Video Clips\n",
Collaborator

Let's change the name to something application-specific --> "Find Dogs in Videos Stored in ApertureDB" or something like that. Remember, we are doing apps here - creating clips / detecting / loading are steps for that app.

"source": [
"# Creating Video Clips\n",
"This notebook demonstrates using a YOLOv4 model to process individual frames and use the resulting output to generate information about the video.\n",
"We use detections of labels over sequential frames to generate Clips which describe the existance of those objects within a specific portion of the video."
Collaborator

Suggested change
"We use detections of labels over sequential frames to generate Clips which describe the existance of those objects within a specific portion of the video."
"We use detections of labels over sequential frames to generate Clips which describe the existence of those objects within a specific portion of the video."
"Once these objects are detected, you can then search videos with those objects using ApertureDB video search."

"id": "8835379a-1c02-44d3-930c-f321279ee0bb",
"metadata": {},
"source": [
"# Install ApertureDB\n",
Collaborator

Suggested change
"# Install ApertureDB\n",
"## Install ApertureDB\n",

Collaborator

I like the structuring in this example: https://docs.aperturedata.io/HowToGuides/Applications/similarity_search
I'm looking for where we started defining standards, but I've lost track of @bovlb's PRs.

Collaborator

But we do need to point to cloud setup in all these.

"id": "7e1a96c2-b0d1-47b6-b836-c0eb0142101f",
"metadata": {},
"source": [
"# Download resources\n",
Collaborator

Could you please introduce these as sections / subsections? Right now they are all main headers - so Setup can be a main section, then Download a subsection, etc. This is going to land in docs, so I'm thinking about the look there.

"# Download resources\n",
"Now we need to download the python code to run the model, `yolov4.py` and the video we are going to use, `norman.mp4`, a video about a dog riding a bicycle.\n",
"\n",
"This was chosen because it includes several labels but also because it has detections which overlap - dogs and bikes."
Collaborator

Suggested change
"This was chosen because it includes several labels but also because it has detections which overlap - dogs and bikes."
"We chose this video because it includes several labels and has detections that overlap - dogs and bikes."

{
"cell_type": "code",
"execution_count": null,
"id": "1fda14d1-dc17-48da-af51-bdeb8d6e468d",
Collaborator

Could we move this to the helper file for this notebook too?

"outputs": [],
"source": [
"import pandas as pd\n",
"df = pd.read_csv(\"output/norman/detections.csv\")\n",
Collaborator

Could you run this code or show a snippet? It's nice to see the output right away
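To illustrate the reviewer's ask, here is a minimal sketch of what inspecting the detections CSV might look like. The rows are synthetic and the column names are taken from the `preprocess` cell quoted later in this thread; the real `output/norman/detections.csv` would be loaded with `pd.read_csv` instead.

```python
import pandas as pd

# Synthetic rows in the assumed shape of output/norman/detections.csv.
rows = [
    (0, "dog", 91, 10, 20, 50, 40),
    (0, "bicycle", 88, 5, 15, 120, 80),
    (1, "dog", 87, 12, 21, 50, 40),
]
df = pd.DataFrame(rows, columns=["frame", "label", "confidence",
                                 "left", "top", "width", "height"])

# The quick looks a reader would want to see right after read_csv:
print(df.head())
print(df["label"].value_counts())
```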

" initconf=50 # minimum confidence to start a clip (0-100)\n",
" initlen=5 # minimum detection duration in frames to start a clip\n",
" dropconf=25 # confidence below which a detection counts as missed (0-100)\n",
" droplen=5 # number of missed detection frames to end a clip\n",
Collaborator

There is a lot of helper code here that isn't showing anything Aperture-related, and it's distracting. It can be imported in the dependencies section via a helper file.
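The four parameters quoted above (`initconf`, `initlen`, `dropconf`, `droplen`) imply a hysteresis scheme for turning per-frame detections into clips. This is a minimal sketch of that logic under that assumption; the function name and exact rules are illustrative, not the notebook's actual helper code.

```python
def build_clips(confs, initconf=50, initlen=5, dropconf=25, droplen=5):
    """Turn a per-frame confidence series (0-100) for one label into
    (start_frame, end_frame) clips, using hysteresis thresholds:
    a clip opens after `initlen` consecutive frames at >= initconf,
    and closes after `droplen` consecutive frames at < dropconf."""
    clips = []
    in_clip = False
    run = 0  # consecutive frames meeting the current criterion
    start = 0
    for i, c in enumerate(confs):
        if not in_clip:
            if c >= initconf:
                run += 1
                if run == initlen:          # sustained detection: open clip
                    in_clip = True
                    start = i - initlen + 1
                    run = 0
            else:
                run = 0
        else:
            if c < dropconf:
                run += 1
                if run == droplen:          # sustained miss: close clip
                    clips.append((start, i - droplen))
                    in_clip = False
                    run = 0
            else:
                run = 0
    if in_clip:                             # clip runs to end of video
        clips.append((start, len(confs) - 1))
    return clips
```

For example, six strong frames followed by six misses yields a single clip spanning the strong frames.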

Collaborator

Also take a look at this from our previous notebook, which @gsaluja9 built to wrap Clips into DataModel classes in the Python SDK. I believe you will need to enhance those to remove all this JSON creation from this notebook, and we will get more useful code into the Python SDK.

"id": "bd469e3d-c84b-4ce6-ac53-6e24ff47c2f5",
"metadata": {},
"source": [
"## Issue - dropped dog at 120\n",
Collaborator

It's these query parts that are very interesting... do run the code and check it in with the responses.

"## Verification Solution\n",
"Now we've been able to tell with a powerful query that we didn't have a dog confidence high enough to start a clip.\n",
"\n",
"We can then decide to fine-tune our model - have it train more on partially obscured dogs - lower the threshold, or accept our results.\n",
Collaborator

Where should people go next from here? Put some suggested links (remember docs :) )
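The verification step described above ("we didn't have a dog confidence high enough to start a clip") can be sketched in pandas against the detections frame. The rows here are synthetic and the 115-125 window and `initconf` value are assumptions from earlier in the thread; the notebook's actual verification runs as an ApertureDB query.

```python
import pandas as pd

# Hypothetical detections around frame 120 (columns assumed from the CSV).
df = pd.DataFrame(
    [(118, "dog", 42), (119, "dog", 38), (120, "dog", 45), (121, "dog", 48)],
    columns=["frame", "label", "confidence"],
)

initconf = 50  # minimum confidence to start a clip, as in the notebook

# Did any dog detection in this window reach the clip-start threshold?
window = df[(df.label == "dog") & (df.frame.between(115, 125))]
peak = window["confidence"].max()
print(f"peak dog confidence near frame 120: {peak} (threshold {initconf})")
```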

def initialize_network(self):
""" Method to initialize and load the model. """

self.net = cv2.dnn_DetectionModel(self.args.cfg, self.args.weights)
Contributor

We should provide more information on the core of this wrapper.

Do we want to clarify to the reader that up until v4 the official repo was not from Ultralytics, and that here we are creating a cv2 model? https://github.com/AlexeyAB/darknet

Also, this link would be useful to curious folks: https://arxiv.org/abs/2004.10934

A segue into cv2 dnn would also be a good next step to explore: https://docs.opencv.org/4.x/da/d9d/tutorial_dnn_yolo.html

def parse_arguments(self):
""" Method to parse arguments using argparser. """

parser = argparse.ArgumentParser(description='Object Detection using YOLOv4 and OpenCV4')
Contributor

Is there a reason to add this style of arguments? Is this intended to be used as a CLI in the future?

def stream_inf(self):
""" Method to run inference on a stream. """

source = cv2.VideoCapture(0 if self.args.stream == 'webcam' else self.args.stream)
Contributor

This would need to be run on a local machine (sans Docker) for the webcam to work as expected.


if __name__== '__main__':

yolo = YOLOv4.__new__(YOLOv4)
Contributor

This is odd. Isn't `yolo = YOLOv4()` the way to instantiate an object, rather than calling the underlying dunder methods?
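The reviewer's point can be shown with a tiny example: `cls.__new__(cls)` only allocates the instance and skips `__init__`, so any attributes set there never exist. The `Detector` class here is illustrative, not the PR's `YOLOv4` class.

```python
class Detector:
    def __init__(self):
        self.loaded = True  # normal construction runs __init__

a = Detector()                   # __new__ then __init__
b = Detector.__new__(Detector)   # allocation only; __init__ is skipped

print(hasattr(a, "loaded"))  # True
print(hasattr(b, "loaded"))  # False
```

This is why `YOLOv4.__new__(YOLOv4)` is suspicious: the object exists but none of its initialization has run.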

Collaborator

@bovlb left a comment

partial review

"metadata": {},
"outputs": [],
"source": [
"!pip install aperturedb tqdm 2>&1 >/dev/null\n",
Collaborator

Suggested change
"!pip install aperturedb tqdm 2>&1 >/dev/null\n",
"%pip install --quiet --upgrade aperturedb tqdm\n",

Using "%" rather than "!" is more reliable wrt virtual environments.

I'd also prefer to see PIP commands in a separate cell.

"\n",
"# Retrieve the YOLO4 interface\n",
"\n",
"!wget https://raw.githubusercontent.com/drewaogle/YOLOv4-OpenCV-CUDA-DNN/refs/heads/main/yolo4.py\n",
Collaborator

It's not ideal that we're relying on your personal repo here. Could this be under applications?

"\n",
"!wget https://raw.githubusercontent.com/drewaogle/YOLOv4-OpenCV-CUDA-DNN/refs/heads/main/yolo4.py\n",
"\n",
"# Retreive video\n",
Collaborator

Suggested change
"# Retreive video\n",
"## Retrieve video\n",

"metadata": {},
"outputs": [],
"source": [
"# Now we retrieve the items we are working with:\n",
Collaborator

Suggested change
"# Now we retrieve the items we are working with:\n",
"## Now we retrieve the items we are working with:\n",

"source": [
"# Now we retrieve the items we are working with:\n",
"\n",
"# Retrieve the YOLO4 interface\n",
Collaborator

Suggested change
"# Retrieve the YOLO4 interface\n",
"## Retrieve the YOLO4 interface\n",

"id": "b5dae152-0643-4701-a505-0b331dd9b204",
"metadata": {},
"source": [
"# Run The Detector\n",
Collaborator

Suggested change
"# Run The Detector\n",
"## Run The Detector\n",

"source": [
"# Run The Detector\n",
"Now that we've downloaded our YOLOv4 code, let's run it.\n",
"This will need to download the weights and some configuration; about 300M and will do it automatically.\n",
Collaborator

Suggested change
"This will need to download the weights and some configuration; about 300M and will do it automatically.\n",
"This will automatically download about 300MB of weights and some configuration.\n",

"Now that we've downloaded our YOLOv4 code, let's run it.\n",
"This will need to download the weights and some configuration; about 300M and will do it automatically.\n",
"\n",
"After downloading or verify files, it will then process the video. with `no_squash_detections` as `True` it won't overwrite an existing output dir, so delete it to rerun. This code can support hardware acceleration, but is designed so it won't be unwieldly without it. Detections should be at about 3-10fps without hardware, and take less 5 minutes.\n",
Collaborator

Suggested change
"After downloading or verify files, it will then process the video. with `no_squash_detections` as `True` it won't overwrite an existing output dir, so delete it to rerun. This code can support hardware acceleration, but is designed so it won't be unwieldly without it. Detections should be at about 3-10fps without hardware, and take less 5 minutes.\n",
"After downloading and verifying files, it will then process the video. With `no_squash_detections` as `True` it won't overwrite an existing output directory, so delete `output` to rerun. This code can support hardware acceleration, but is designed so it won't be unwieldy without it. Detections should run at about 3-10 fps without a GPU, and take less than 5 minutes.\n",

" self.stream=stream # 'webcam' to open webcam w/ OpenCV\n",
"\n",
"# now we pull data and run detection.\n",
"dopts = DetectorOptions( stream=\"norman.mp4\")\n",
Collaborator

Suggested change
"dopts = DetectorOptions( stream=\"norman.mp4\")\n",
"dopts = DetectorOptions(stream=\"norman.mp4\")\n",

Collaborator

@bovlb left a comment

Generally looks good.

I think it would be good if our notebooks complied with PEP8. Specifically, methods should have a blank line before and after.

"# function to prepare dataframe for work; add columns and trim frames we don't want.\n",
"def preprocess(df, args ):\n",
" processed = df\n",
" processed.columns = [\"frame\",\"label\",\"confidence\",\"left\",\"top\",\"width\",\"height\" ]\n",
Collaborator

Suggested change
" processed.columns = [\"frame\",\"label\",\"confidence\",\"left\",\"top\",\"width\",\"height\" ]\n",
" processed.columns = [\"frame\", \"label\", \"confidence\", \"left\", \"top\", \"width\", \"height\"]\n",

" processed.drop(processed[processed.frame > args.end_frame].index,inplace=True)\n",
" return processed\n",
"\n",
"norman_detects = preprocess( df, clip_opts )\n",
Collaborator

Suggested change
"norman_detects = preprocess( df, clip_opts )\n",
"norman_detects = preprocess(df, clip_opts)\n",
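The `preprocess` function quoted above can be exercised on a tiny synthetic frame to see what it does. The `clip_opts` object here is a hypothetical stand-in (a `SimpleNamespace` with just `end_frame`), not the notebook's actual options class.

```python
import pandas as pd
from types import SimpleNamespace

def preprocess(df, args):
    """Name the raw CSV columns and drop frames past args.end_frame."""
    processed = df
    processed.columns = ["frame", "label", "confidence",
                         "left", "top", "width", "height"]
    processed.drop(processed[processed.frame > args.end_frame].index,
                   inplace=True)
    return processed

# Three synthetic detections; the raw CSV has no header row.
raw = pd.DataFrame([
    (0, "dog", 90, 1, 2, 3, 4),
    (5, "dog", 80, 1, 2, 3, 4),
    (9, "dog", 70, 1, 2, 3, 4),
])
clip_opts = SimpleNamespace(end_frame=5)  # hypothetical options object
detects = preprocess(raw, clip_opts)
print(detects["frame"].tolist())  # frames past end_frame are gone
```

Note that the `inplace=True` drop mutates the caller's DataFrame, which is worth a comment in the notebook itself.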

"metadata": {},
"source": [
"### Detection Verification\n",
"This is pretty much what we would expect. bike, dog, person .. a car detection in the background, a nice find.\n",
Collaborator

Suggested change
"This is pretty much what we would expect. bike, dog, person .. a car detection in the background, a nice find.\n",
"This is pretty much what we would expect. bike, dog, person … and a car detection in the background, a nice find.\n",

"import cv2\n",
"from PIL import Image\n",
"\n",
"def display_image_and_bb( num, df ):\n",
Collaborator

Suggested change
"def display_image_and_bb( num, df ):\n",
"def display_image_and_bb( num, df ):\n",
" \"\"\"Display the image with bounding box and label\"\"\"\n",

Comment on lines +483 to +486
" return clip_store.finished\n",
" \n",
"\n",
" "
Collaborator

Suggested change
" return clip_store.finished\n",
" \n",
"\n",
" "
" return clip_store.finished\n"
