Add Face Detection Explainer #69

Merged
merged 2 commits into from
Aug 25, 2022

eehakkin
Contributor

Here comes a face detection Explainer.

The explainer still contains two alternative API proposals. It should be coordinated with the WebCodecs WG so that an appropriate alternative can be chosen.

@eehakkin force-pushed the feature/face-detection-explainer branch from c608b91 to 25e47cb on August 23, 2022 10:15
Contributor

@youennf left a comment

LGTM.
Some comments below that can be addressed here or as follow-ups.


* Face Detection API should be anchored to [VideoFrameMetadata](https://wicg.github.io/video-rvfc/#dictdef-videoframemetadata) defined in [HTMLVideoElement.requestVideoFrameCallback](https://wicg.github.io/video-rvfc/).

* Face Detection API should try to return a **contour** instead of a bounding box. The number of points describing the contour can be user-defined via **faceDetectionMode** settings, and implementations can currently default to a four-point rectangle.
Contributor

I am not sure that returning a contour instead of a bounding box should be a goal. Maybe the two should not be mutually exclusive; see https://wicg.github.io/shape-detection-api/#dictdef-detectedface for instance.

Contributor

contour removed from the new PR
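
As an illustration of the four-point default mentioned in the bullet above, a bounding box can be expanded into a rectangular contour. This is a minimal sketch only: `boundingBoxToContour` and the `{x, y, width, height}` box shape are hypothetical names chosen for this example and are not defined by the explainer.

```javascript
// Hypothetical helper, not part of the explainer or of any shipped API.
// Expands a bounding box {x, y, width, height} into a four-point contour,
// listed from the top-left corner in screen order (y grows downwards):
// top-left, top-right, bottom-right, bottom-left.
function boundingBoxToContour(box) {
  const { x, y, width, height } = box;
  return [
    { x: x, y: y },
    { x: x + width, y: y },
    { x: x + width, y: y + height },
    { x: x, y: y + height },
  ];
}
```

Under this assumption a bounding box and a four-point contour carry the same information, which is consistent with the suggestion that the two need not be mutually exclusive.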


* Face Detection API could be used as an input to various other APIs like Background Blur, Eye Gaze Correction, Face Framing, etc. Face Detection minimizes the surface area other dependent features need to work on for a faster implementation. It should be easy to use Face Detection API along with a custom Eye Gaze Correction or a Funny Hats feature from an ML framework by passing the face coordinates.

* In the spirit of successive enhancement, it should be possible to get results from Face Detection API and add custom enhancements before feeding the metadata to the video stream.
Contributor

I am not very clear about this goal.

Contributor

changed
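
One way to read the "custom enhancement" goals quoted in this thread is that an application takes the detected face coordinates and derives its own overlay geometry from them. The sketch below assumes a bounding box in pixel coordinates; `placeHat`, its `heightRatio` parameter, and the box shape are hypothetical and are not defined by the explainer.

```javascript
// Hypothetical "Funny Hats" helper, for illustration only.
// Given a face bounding box {x, y, width, height} in pixel coordinates
// (y grows downwards), returns the rectangle where a hat image could be
// drawn: as wide as the face and sitting directly on top of it.
function placeHat(faceBox, heightRatio = 0.5) {
  const hatHeight = faceBox.height * heightRatio;
  return {
    x: faceBox.x,
    y: faceBox.y - hatHeight, // directly above the top edge of the face
    width: faceBox.width,
    height: hatHeight,
  };
}
```

The caller would still need to clamp the result to the frame, since a face near the top edge yields a negative `y`.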


* Facial Landmarks like *eyes*, *nose* and *mouth* should be detected if there's support in the platform.
Contributor

s/detected/exposed?

Contributor

fixed


## Face Detection API

### Common API
Contributor

Maybe we should link to the Shape Detection API (https://wicg.github.io/shape-detection-api)?

Contributor

fixed

"presence", // Only the presence of face or faces is returned, not location
"bounding-box", // Return bounding box for face
"contour", // Approximate contour of the detected faces is returned
"landmarks", // Approximate contour of the detected faces is returned with facial landmarks
Contributor

Do we need all these values?
What is the practical advantage over a simple true/false value?

Contributor

A user should be able to select face detection without landmarks, since that could save computation. Otherwise, the other options were removed.
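
To make the trade-off in this thread concrete, the sketch below narrows a full detection result down to what each mode would expose, using the four mode strings from the diff above. The function `filterByMode` and the result fields (`present`, `boundingBox`, `contour`, `landmarks`) are assumptions made for illustration; a real implementation would presumably skip the extra computation rather than discard its output afterwards.

```javascript
// Hypothetical helper, for illustration only. `face` is assumed to be a
// full detection result with boundingBox, contour and landmarks fields.
function filterByMode(face, mode) {
  switch (mode) {
    case "presence":
      // Only the fact that a face was found, no location data.
      return { present: true };
    case "bounding-box":
      return { boundingBox: face.boundingBox };
    case "contour":
      return { contour: face.contour };
    case "landmarks":
      // Contour plus facial landmarks, the most expensive option.
      return { contour: face.contour, landmarks: face.landmarks };
    default:
      throw new RangeError(`Unknown face detection mode: ${mode}`);
  }
}
```

A plain true/false setting would correspond to keeping only two of these branches, which is the simplification the review comment asks about.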

Examples are shown in sections [Example 1-1](#example-1-1) and
[Example 2-1](#example-2-1).

### VideoFrame Side-Channel API (alternative 2)
Contributor

@youennf Aug 25, 2022

Maybe we should link to #70 and #71.

Contributor

I think the relevant link nowadays is the link to VideoFrameMetadata which was added into the explainer.

@aboba merged commit f885c2a into w3c:main on Aug 25, 2022
@alvestrand
Contributor

Merged, but do address comments. (Iterate!)

@youennf
Contributor

youennf commented Aug 25, 2022

Editor's call: let's merge it but authors should look at comments in this PR.

@eehakkin deleted the feature/face-detection-explainer branch on March 17, 2023 12:35
5 participants