Add Face Detection Explainer #69
Conversation
c608b91 to 25e47cb
LGTM.
Some comments below that can be addressed here or as follow-ups.
* Face Detection API should be anchored to [VideoFrameMetadata](https://wicg.github.io/video-rvfc/#dictdef-videoframemetadata) defined in [HTMLVideoElement.requestVideoFrameCallback](https://wicg.github.io/video-rvfc/).
* Face Detection API should try to return a **contour** instead of a bounding box. The number of points describing the contour can be user defined via **faceDetectionMode** settings and implementations presently can default to a four point rectangle.
I am not sure that returning a contour instead of a bounding box should be a goal. Maybe it should not be exclusive; see https://wicg.github.io/shape-detection-api/#dictdef-detectedface, for instance.
contour removed from the new PR
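Since the goals quoted above anchor face detection to VideoFrameMetadata, a minimal sketch of how a page could consume the results may help; the `detectedFaces` field name and the `boundingBox` shape are illustrative assumptions, since that dictionary is exactly what this explainer still has to settle.

```js
// Minimal sketch, assuming the explainer ends up exposing detected faces on
// VideoFrameMetadata under a field such as `detectedFaces` (name and shape
// are assumptions for illustration, not settled API).
const video = document.querySelector('video');

function onVideoFrame(now, metadata) {
  const faces = metadata.detectedFaces ?? [];
  for (const face of faces) {
    // Assumed: each face carries a bounding box; contours or landmarks would
    // be additional optional members rather than a replacement, per the
    // review comment above.
    console.log('face bounding box:', face.boundingBox);
  }
  video.requestVideoFrameCallback(onVideoFrame);
}

video.requestVideoFrameCallback(onVideoFrame);
```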
* Face Detection API could be used as an input to various other APIs like Background Blur, Eye Gaze Correction, Face Framing, etc. Face Detection minimizes the surface area other dependent features need to work on for a faster implementation. It should be easy to use Face Detection API along with a custom Eye Gaze Correction or a Funny Hats feature from a ML framework by passing the face coordinates.
* In the spirit of successive enhancement, it should be possible to get results from Face Detection API and add custom enhancements before feeding the metadata to the video stream.
This goal is not very clear to me.
changed
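As a rough illustration of the two goals quoted above (face results as input to other features, and custom enhancements layered on top), here is a hedged sketch that applies a page-defined effect at the reported face coordinates; `detectedFaces`, `boundingBox`, and the normalized-coordinate assumption are again illustrative, not settled API.

```js
// Sketch only: feed browser-provided face coordinates into a custom effect
// (a "funny hat" overlay, custom eye gaze correction, etc.) before presenting
// the frame. Field names are assumptions carried over from the sketch above.
const video = document.querySelector('video');
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');

function renderEnhancedFrame(now, metadata) {
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  for (const face of metadata.detectedFaces ?? []) {
    // Assumed: boundingBox uses normalized [0..1] coordinates.
    const { x, y, width, height } = face.boundingBox;
    // Placeholder for a custom enhancement driven by the face coordinates,
    // e.g. an ML-framework-based filter; here we just outline the face.
    ctx.strokeRect(x * canvas.width, y * canvas.height,
                   width * canvas.width, height * canvas.height);
  }
  video.requestVideoFrameCallback(renderEnhancedFrame);
}

video.requestVideoFrameCallback(renderEnhancedFrame);
```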
* Facial Landmarks like *eyes*, *nose* and *mouth* should be detected if there's support in the platform.
s/detected/exposed?
fixed
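For the landmarks point above, the Shape Detection API's DetectedFace (linked earlier in this thread) already models landmarks as an optional sequence alongside the bounding box; a per-frame variant might be read roughly like this, with every name an assumption for illustration.

```js
// Sketch: landmarks exposed only when the platform supports them, modeled
// loosely on the Shape Detection API's DetectedFace. `landmarks`, `type`,
// and `locations` are assumed names, not settled API.
function logLandmarks(face) {
  for (const landmark of face.landmarks ?? []) {
    // e.g. type is "eye", "nose" or "mouth"; locations are points in the frame.
    console.log(landmark.type, landmark.locations);
  }
}
```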
## Face Detection API

### Common API
Maybe we should link to the https://wicg.github.io/shape-detection-api API?
fixed
"presence", // Only the presence of face or faces is returned, not location | ||
"bounding-box", // Return bounding box for face | ||
"contour", // Approximate contour of the detected faces is returned | ||
"landmarks", // Approximate contour of the detected faces is returned with facial landmarks |
Do we need all these values?
What is the practical advantage over a simple true/false value?
The user should be able to select just face detection without landmarks, as it could save computation. Otherwise, the other options have been removed.
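The point about saving computation suggests the mode would be selected up front. A hedged sketch of how that could look via constraints follows, assuming `faceDetectionMode` (named in the goals above) is surfaced as a constrainable video track property taking the string values listed above; none of this is a shipped API.

```js
// Sketch, assuming `faceDetectionMode` becomes a constrainable video track
// property taking the string values listed above (proposal only, not shipped).
async function startFaceDetection() {
  // Ask for plain face detection up front; cheaper than landmark detection.
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { faceDetectionMode: 'bounding-box' }
  });
  const [track] = stream.getVideoTracks();

  // Opt into landmarks later only if the application actually needs them.
  await track.applyConstraints({ faceDetectionMode: 'landmarks' });
  return stream;
}
```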
Examples are shown in sections [Example 1-1](#example-1-1) and [Example 2-1](#example-2-1).
### VideoFrame Side-Channel API (alternative 2) |
I think the relevant link nowadays is the link to VideoFrameMetadata, which was added to the explainer.
Merged, but do address comments. (Iterate!)
Editor's call: let's merge it, but the authors should look at the comments in this PR.
Here comes a face detection Explainer.
The explainer still contains two alternative API proposals. It should be coordinated with the WebCodecs WG so that an appropriate alternative can be chosen.