一、Overview
Built on TensorFlow.js, this AI audio/video recording tool integrates a face-recognition model, uses WebGL to accelerate AI inference, and parallelizes work across multiple threads so the recording frame rate stays at 40 FPS+. It supports face, gesture, and smile detection.
一、Audio/Video Recording: The user's audio and video streams are obtained via navigator.mediaDevices.getUserMedia. The video element (videoEl) is fetched and the media stream (mediaStream) is assigned to its srcObject property so the video can play. We also listen for the video's play event so frame capture starts once playback begins, and each captured frame is drawn to a Canvas. During capture, requestVideoFrameCallback (where the browser supports it) or requestAnimationFrame grabs every frame of the video element in real time; createImageBitmap converts the frame into an ImageBitmap, which is sent to a Worker. Inside the Worker, an OffscreenCanvas processes the frame on a separate thread (for example flipping it), transferToImageBitmap() then converts the OffscreenCanvas into an ImageBitmap with zero copying, and postMessage transfers the processed ImageBitmap back to the main thread, where it is drawn through the Canvas 2D context (processCtx2D). Next, this.processCanvas.captureStream(60) produces a 60 FPS canvas stream containing the frames drawn on the Canvas; the audio tracks from the user's stream are added to this canvas stream to combine video and audio. A MediaRecorder instance is created from the composited stream with recording options (such as mimeType); if the requested mimeType is not supported by the browser, the unsupported option is removed dynamically. Whenever the MediaRecorder produces data (via the ondataavailable event), the data is pushed into the recordChunks array and an upload is triggered: the data is sent to the backend through uploadWorker, so recorded segments can be uploaded in real time. Every 5 seconds the recorded data is packaged and uploaded. For uploads and network monitoring, network-state changes (disconnects, speed changes, and so on) are forwarded to uploadWorker so the upload strategy adjusts in real time, with support for scheduled uploads, concurrent uploads after the network recovers, and retries on failure. Various media-permission and device error cases are also handled.
1.1 Detecting media permissions and device issues:
- Request camera and microphone permissions: call navigator.mediaDevices.getUserMedia with the configuration object {video: true, audio: true} to request access to both the video and audio devices at once. The method returns a Promise that resolves with a MediaStream on success and rejects with an exception on failure. Afterwards, mediaStream.getTracks() retrieves every media track, and stopping all of them releases the media resources. Releasing hardware such as the camera and microphone promptly prevents memory leaks and lingering device locks; call this when recording ends or the component unmounts so system resources are freed in time.
- Enumerate the system's available media input devices: navigator.mediaDevices.enumerateDevices() returns the full device list, which is then filtered and classified: keep only input devices (kind.endsWith("input")), drop entries with an empty device ID (deviceId !== ""), and split the rest into audio inputs (audioinput) and video inputs (videoinput). This lets the user switch between multiple cameras or microphones and pick the best recording device (such as an HD camera or a professional microphone).
- Check whether the camera and microphone are allowed via navigator.permissions.query({ name: "camera" }) and navigator.permissions.query({ name: "microphone" }), testing whether the returned status's state property is "granted".
- Call navigator.mediaDevices.getSupportedConstraints() to get the list of constraints that window.navigator.mediaDevices.getUserMedia supports, for example audio: {echoCancellation, noiseSuppression, autoGainControl} and video: {width, height, frameRate, facingMode}.
- Use window.MediaRecorder.isTypeSupported() to check encoding-format support; Safari, for example, does not support the mimeType option in new MediaRecorder(stream, { mimeType: "video/webm" }), so when it is unsupported the option must be removed (see the sketch below).
1.2 Getting the user's audio and video streams: navigator.mediaDevices.getUserMedia returns the user's audio and video streams. getUserMedia is part of the WebRTC (Web Real-Time Communication) API; it lets a web page access the camera and microphone and provides the foundation for audio/video streaming applications. Its main job is to capture a media stream (MediaStream). A constraints parameter can be passed in to customize how the camera and microphone behave: the audio constraints control microphone quality, noise suppression, echo cancellation, automatic gain, and so on; the video constraints control camera behaviour such as resolution, frame rate, facing mode, and device ID.
- audio: deviceId specifies the audio input device (get the available microphones via enumerateDevices()); sampleRate is a number giving the audio sample rate — { ideal: 48000 } for 48 kHz high quality (recommended), { ideal: 44100 } for 44.1 kHz standard CD quality, { ideal: 8000 } for 8 kHz low telephone quality; sampleSize is the sample depth — { ideal: 8 } for 8-bit, { ideal: 16 } for the 16-bit standard, { ideal: 24 } for 24-bit high quality, { ideal: 32 } for 32-bit professional quality; channelCount is a number giving the channel count — 1 = mono (suited to voice calls), 2 = stereo (suited to music recording); autoGainControl is a boolean enabling automatic gain control, which adjusts the volume automatically so it is neither too loud nor too quiet; echoCancellation is a boolean enabling echo cancellation, used in voice calls to reduce echo caused by headphones or speakers; noiseSuppression is a boolean enabling noise suppression, reducing background noise such as wind or keyboard clicks.
- video: width is a number giving the video width, e.g. width: 1920, width: { ideal: 1280 }, or width: { min: 1024, ideal: 1280, max: 1920 }; height gives the video height, e.g. height: 1080, height: { ideal: 1080 }, or height: { min: 576, ideal: 1080, max: 1280 }; deviceId specifies the video input device; frameRate is a number giving the frame rate (FPS), which controls smoothness — 30 FPS suits most applications, 60 FPS suits high-performance scenarios; facingMode is a string choosing the front or rear camera — user (front) or environment (rear); to require the rear camera, use facingMode: { exact: "environment" }; resolution is an object giving an ideal or exact resolution, e.g. { width: 1280, height: 720 }; aspectRatio is a number giving the aspect ratio, e.g. aspectRatio: 1.7777, aspectRatio: { ideal: 1.7777777778 }, or aspectRatio: { min: 1, ideal: 1.7777777778, max: 2 }; noiseSuppression is a boolean that turns on the camera's denoising to improve video clarity (supported by some browsers). A combined constraints example is sketched below.
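A constraints object along the lines above might look like this — a minimal sketch in an async context, where selectedMicId and selectedCamId are assumed to come from enumerateDevices() and the numbers are illustrative:

const constraints = {
  audio: {
    deviceId: selectedMicId,
    sampleRate: { ideal: 48000 }, // 48 kHz
    channelCount: 1, // mono is enough for speech
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
  },
  video: {
    deviceId: selectedCamId,
    width: { min: 1024, ideal: 1280, max: 1920 },
    height: { min: 576, ideal: 720, max: 1080 },
    frameRate: { ideal: 30 },
    facingMode: "user", // front camera
  },
};
const mediaStream = await navigator.mediaDevices.getUserMedia(constraints);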
1.3 Rendering the user media stream into the Video element and capturing frames to Canvas: the video element (videoEl) is fetched and the media stream (mediaStream) is assigned to its srcObject property so the video can play. muted = true is set so the microphone does not pick up the video's own playback; if the video were audible, the microphone could record it a second time, causing echo and noise. The video's play event is also listened for, so frame capture starts once playback begins and each captured frame is drawn to the Canvas. During capture, requestVideoFrameCallback (where the browser supports it) or requestAnimationFrame grabs every frame of the video element in real time; createImageBitmap converts the frame into an ImageBitmap, which is sent to a Worker. In the Worker, an OffscreenCanvas processes the frame on its own thread (for example flipping it), transferToImageBitmap() converts the OffscreenCanvas into an ImageBitmap with zero copying, and postMessage transfers the processed ImageBitmap back to the main thread, where it is drawn via the Canvas 2D context (processCtx2D).
- Capturing frames: the first choice is the newer requestVideoFrameCallback API, designed specifically for video-frame synchronization, which provides precise timestamps and metadata. If it is unavailable, fall back to requestAnimationFrame, which is synchronized with the browser's render loop and gives smooth capture. If that too is unavailable, fall back to a setInterval-style timer that simulates capture at a fixed 30 fps. In the wrapped callback, video.paused and video.ended are checked so the next frame is only scheduled while the video is actually playing. A plain timer cannot stay in sync with video frames, so frames may be dropped or duplicated; requestAnimationFrame likewise does not guarantee a one-to-one mapping to video frames and can drop or repeat them. requestVideoFrameCallback, by contrast, is driven by the video frames themselves, so every frame is captured exactly once, and it exposes extra frame metadata (timestamp, presentationTime, expectedDisplayTime). It also fires only when a new video frame is available, avoiding wasted work. That makes requestVideoFrameCallback the right tool for efficiently keeping in step with video frames without missing or repeating any; its frame-driven callback suits video analysis, AI inference, and filter rendering, and it is more precise than requestAnimationFrame, but it requires browser support.
- Drawing frames: the video element is converted to an imageBitmap with createImageBitmap and sent to the Worker, with the imageBitmap listed in postMessage's transfer parameter as a transferable object so ownership moves to the Worker thread and no copy is made. In the Worker, an OffscreenCanvas does the high-performance image processing, and setTransform() implements the horizontal flip (mirror) efficiently. The Worker supports initializing the canvas (init) and processing an image (process); transferToImageBitmap() turns the OffscreenCanvas into an ImageBitmap with zero copying, and postMessage transfers that ImageBitmap back to the main thread, again moving ownership rather than copying. Doing the drawing and processing in a Web Worker keeps the computation off the main thread and improves rendering efficiency. Finally, the main thread simply draws the returned ImageBitmap. A combined capture-and-transfer sketch follows this list.
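A minimal sketch of the capture fallback chain plus the zero-copy handoff to the processing Worker; the full version lives in videoRecorder/index.js below:

function captureFrames(videoEl, processWorker) {
  let requestFrame;
  if ("requestVideoFrameCallback" in HTMLVideoElement.prototype) {
    requestFrame = (cb) => videoEl.requestVideoFrameCallback(cb); // frame-driven
  } else if (window.requestAnimationFrame) {
    requestFrame = (cb) => requestAnimationFrame(cb); // render-loop driven
  } else {
    requestFrame = (cb) => setTimeout(() => cb(performance.now()), 1000 / 30); // ~30 fps timer
  }
  const loop = async () => {
    if (videoEl.paused || videoEl.ended) return; // capture only while playing
    const bitmap = await createImageBitmap(videoEl); // snapshot the current frame
    processWorker.postMessage({ type: "process", imageBitmap: bitmap }, [bitmap]); // transferred, not copied
    requestFrame(loop);
  };
  requestFrame(loop);
}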
1.4 Capturing the Canvas stream: this.processCanvas.captureStream(60) produces a 60-frames-per-second stream of the canvas, containing the video frames drawn onto it. The audio tracks from the user's media stream are then added to the canvas stream to combine video and audio: const audioTracks = this.stream.getAudioTracks(); audioTracks.forEach((track) => this.processStream.addTrack(track));
1.5 Instantiating the MediaRecorder: a recorder instance is created from the composited canvas stream with MediaRecorder, with recording options set (such as mimeType). The recorder's callbacks are registered to listen for its events: onstart (recording started), onstop (recording stopped), onpause (recording paused), onresume (recording resumed), and so on. A minimal sketch appears below.
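A minimal sketch of creating the recorder from the composited stream; processStream is assumed to be the canvas stream from 1.4 with the audio tracks already added:

const options = { mimeType: "video/webm" };
if (!MediaRecorder.isTypeSupported(options.mimeType)) {
  delete options.mimeType; // e.g. Safari: fall back to the browser's default container
}
const recorder = new MediaRecorder(processStream, options);
recorder.onstart = () => console.log("recording started");
recorder.onpause = () => console.log("recording paused");
recorder.onresume = () => console.log("recording resumed");
recorder.onstop = () => console.log("recording stopped");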
1.6 Collecting and uploading recorded data: every 5 s we call mediaRecorder.requestData() to manually trigger the mediaRecorder.ondataavailable event. Whenever the MediaRecorder produces recorded data (via ondataavailable), the data is pushed into the recordChunks array and an upload is triggered; the data is sent to the backend through uploadWorker, so recorded segments can be uploaded in real time. To keep the timing accurate and unaffected by main-thread blocking, the recording timer itself also runs in a Worker.
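A sketch of the 5-second slicing loop, assuming timerWorker posts an updateTime message every 5 s (see video-recorder-time-worker.js in 2.10) and uploadWorker handles the actual upload:

timerWorker.onmessage = (event) => {
  if (event.data.type === "updateTime") {
    recorder.requestData(); // flush the data recorded since the last slice
  }
};
const recordChunks = [];
recorder.ondataavailable = (event) => {
  if (event.data.size <= 0) return;
  recordChunks.push(event.data); // keep locally for preview/export
  uploadWorker.postMessage({ type: "upload", recorderChunk: event.data }); // hand off for upload
};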
1.7 Scheduling recorded segments for upload in the background: the Worker implements a Scheduler. Each recorded segment that arrives is added to the scheduler as a pending task and waits to be dispatched; every task supports retry on request failure, and the scheduler automatically pauses/resumes based on network state.
- On receiving the init message, the Scheduler is initialized. addTask adds a task to the queue and runs the scheduler function; pause suspends task execution; resume continues it; the scheduler function is a while loop that checks the parallelism limit, the paused flag, and the queue length, and dispatches tasks accordingly.
- The main thread listens for the navigator.connection change, window online, and window offline events. On a connection change it reads navigator.connection.downlink (the current speed) and posts it to the Worker; the Worker checks downlink — if navigator.connection.downlink < 1 Mbps it calls scheduler.pause() to suspend uploads, and if navigator.connection.downlink > 1 Mbps it calls scheduler.resume() to continue them. On the window online event (network restored), a message is posted to the Worker and scheduler.resume() resumes the upload tasks; on the window offline event (network lost), a message is posted and scheduler.pause() suspends them. The wiring is sketched after this list.
- Retry logic is added for failed tasks (a failed task is retried automatically up to 3 times).
- The main thread calls recorder.requestData() every 5 s; in ondataavailable the 5 s of recorded data is posted to the Worker (a Blob is passed by structured clone, which is cheap because its underlying bytes are shared rather than copied). Each upload request is wrapped with retry support; after 3 failed attempts, the upload of that segment is marked as failed.
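A sketch of the network-monitoring wiring described above; uploadWorker is the Worker running the Scheduler, and the optional chaining covers browsers without navigator.connection:

navigator.connection?.addEventListener?.("change", () => {
  uploadWorker.postMessage({
    type: "network",
    online: navigator.onLine,
    speed: navigator.connection.downlink, // current downlink estimate in Mbps
  });
});
window.addEventListener("online", () =>
  uploadWorker.postMessage({ type: "network", online: true })
);
window.addEventListener("offline", () =>
  uploadWorker.postMessage({ type: "network", online: false })
);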
二、Face Detection and Feature Drawing: this stage uses three Canvases: the Canvas kept in sync with the video stream, the Canvas used to produce the image data for face detection, and the Canvas on which the detected features are drawn. Each frame of the video-synced Canvas is drawn onto the detection Canvas and scaled down, and the resulting image data is passed to a Web Worker for further face and hand inference. In the Web Worker, TensorFlow.js loads and runs the pre-trained face-detection model in graph-model format, runs inference on the frame's image data, and returns the prediction. Once the main thread receives the detected features, it draws them.
2.1 Grabbing the video frame and scaling it: each frame of the video-synced Canvas is drawn onto the detection Canvas and scaled down. An OffscreenCanvas does the scaling in the background; transferToImageBitmap() converts the OffscreenCanvas into an ImageBitmap with zero copying, and the ImageBitmap is listed in postMessage's transfer parameter as a transferable object, so ownership moves to the main thread without the cost of a copy.
- Scaling the image: on one hand, scaling downsamples and compresses the model's input, reducing the amount of input data and speeding up inference. Shrinking the frame (for example, from a large image down to 320x320) not only speeds up detection but also reduces the computational complexity of the model; with TensorFlow, smaller image sizes generally mean faster processing. On the other hand, the model.json model expects a consistent input size because it was trained at that size; an incorrect input size can prevent the model from capturing the important image features and hurt accuracy. Scaling this way ensures the key features are still captured.
- Using OffscreenCanvas for background scaling: OffscreenCanvas is an API for doing graphics rendering inside a Web Worker, so image processing runs on a background thread and does not block the main thread.
- The postMessage transfer parameter: an optional array of transferable objects. Ownership of each listed object is transferred from the sender to the receiver instead of being copied, which improves performance. Type: an Array containing transferables such as ArrayBuffer, MessagePort, or ImageBitmap. Effect: ownership moves from sender to receiver, avoiding data copies. Note that once ownership has been transferred, the sender can no longer use the object (a zero-copy handoff is sketched below).
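A minimal sketch of such a zero-copy handoff, here transferring the underlying ArrayBuffer of the scaled ImageData to the prediction Worker (ctx, predictImageSize, and worker are assumed):

const imageData = ctx.getImageData(0, 0, predictImageSize, predictImageSize);
worker.postMessage(
  { width: imageData.width, height: imageData.height, buffer: imageData.data.buffer },
  [imageData.data.buffer] // listed in the transfer parameter: moved, not copied
);
// imageData.data is now detached in this thread; the Worker owns the buffer.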
2.2 Loading and optimizing the model: in the Worker, TensorFlow.js is loaded with importScripts and the model is loaded with tf.loadGraphModel pointing at model.json. A Service Worker caches the TensorFlow model files and the related JavaScript library and controls the cache version. A custom fetchFunc first tries caches.match(url) to read the resource from the browser cache, returning the cached content directly if it exists and otherwise issuing the network request. This lowers network latency and speeds up model loading. The reason: the model files are large (model.json plus the bin file), so repeated requests waste bandwidth and slow loading. Reading the model from caches.match(url) first avoids re-downloading it, improves the application's cold-start time, and suits offline AI detection on the web.
- What TensorFlow.js is: TensorFlow.js is an open-source machine-learning library implemented in JavaScript that can run machine-learning models — both training and inference — in the browser and in Node.js.
- Registering the Service Worker: use navigator.serviceWorker.register. Note that a service worker's maximum scope is the location of the worker (in other words, if the script sw.js lives at /js/sw.js, by default it can only control URLs under /js/). The Service-Worker-Allowed header can be used to widen the worker's maximum scope; if navigator.serviceWorker.register(workerPath, { scope: "xx" }) exceeds the maximum scope, registration throws an error.
- Writing the Service Worker: 1. Cache versioning: CACHE_VERSION and CACHE_NAME put a version number into the cache name so that each resource update can clear the old cache and use a new one; on every version bump only the current version's cache is kept and older caches are deleted. 2. Install phase (install): when the Service Worker installs, inside the event.waitUntil callback the cache is opened with caches.open and the specified files (model.json, group1-shard1of1.bin, and the TensorFlow JS library) are added with cache.addAll. event.waitUntil keeps the Service Worker from moving to the activated state before the install handler finishes, so cache setup cannot be cut short. 3. Activate phase (activate): when the Service Worker activates, inside the event.waitUntil callback the no-longer-needed old caches are removed from caches so that only the current version remains and storage is not wasted. 4. Intercepting requests (fetch): when a fetch is intercepted, the request URL is first checked against the list of files to cache; if it matches, event.respondWith hijacks the response. Inside the event.respondWith callback, caches.match(event.request) matches the request against the available cached resources to see whether a cached copy exists; if so, the file is served from the cache, otherwise it is fetched over the network and can then be cached.
- Loading the model in a Web Worker: on the Web Worker's own thread, tf.loadGraphModel loads the model resources, with the fetchFunc parameter customizing the resource-request logic so that TensorFlow.js loads the model files from the cache first rather than always going over the network. This avoids duplicate network requests and improves performance and offline support. Web Workers execute JavaScript off the browser's main thread and suit compute-heavy tasks; Service Workers mainly cache and intercept network requests and suit offline support, push notifications, and similar scenarios. The cache-first fetchFunc is sketched just below; the full Worker code is in 2.12.
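A minimal sketch of the cache-first fetchFunc passed to tf.loadGraphModel inside the Worker's async setup (modelUrl is assumed):

const model = await tf.loadGraphModel(modelUrl, {
  fetchFunc: (url, options) =>
    caches.match(url).then((cached) => cached || fetch(url, options)), // cache first, network as fallback
});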
2.3 Model preflight (warm-up): once the model loads successfully, postPreflightPredictionMessage is called. This preflight step sends an initial prediction request, usually to warm up the model so later predictions run more efficiently. Preflight prediction serves three purposes. Warming the model: after a deep-learning model is loaded, the first inference is typically slow, and the warm-up absorbs that delay so later predictions stay smooth. Verifying the model works: preflight confirms the model loaded correctly and can run inference, which helps catch loading problems such as corrupted model files or network delays. Resource initialization: some models perform extra initialization after loading, and preflight helps ensure those steps complete.
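A sketch of the warm-up request: an all-zero RGBA buffer of the expected input size is sent through the normal prediction path so the first real frame does not pay the warm-up cost (predictImageSize and worker are assumed):

const size = predictImageSize; // e.g. 320
const buffer = new ArrayBuffer(size * size * 4); // blank RGBA frame
worker.postMessage(
  { id: "preflight", width: size, height: size, buffer, predictImageSize: size },
  [buffer]
);
// The "preflight" id lets the main thread ignore this result when it comes back.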
2.4 Preprocessing the image data: 1. The input is the ImageData buffer, and RGBA is converted to RGB (dropping the alpha channel) to fit TensorFlow.js. The alpha channel is dropped because ImageData.data is in RGBA format (4 channels) while most models only accept RGB (3 channels); dropping the alpha (transparency) channel cuts the data volume by 25%, makes tf.tensor() processing more efficient, and avoids unnecessary computation. 2. tf.tensor(data) then converts the preprocessed buffer array into a one-dimensional tensor. 3. The Graph Model, however, expects a four-dimensional tensor in the format [batch, height, width, channels]: batch is the batch dimension — models usually process several samples at once, so even a single image needs a leading batch dimension, here 1; height/width are the spatial dimensions of the image, corresponding to imageData.height and imageData.width; channels is the number of channels — 3 for an RGB image (red, green, and blue). So, to make the input tensor's shape match what the model expects, the one-dimensional tensor is reshaped with .reshape([1, imageData.width, imageData.height, 3]) into a four-dimensional tensor in the required format. tf.tidy is used so that intermediate Tensors created inside the closure are released automatically when the function returns, preventing memory leaks. A preprocessing sketch follows.
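A sketch of the preprocessing above, assuming imageData is the scaled frame: drop the alpha channel, then build the four-dimensional input tensor inside tf.tidy so intermediates are released automatically:

const rgb = new Array((imageData.data.length / 4) * 3);
let j = 0;
for (let i = 0; i < imageData.data.length; i++) {
  if ((i & 3) !== 3) rgb[j++] = imageData.data[i]; // skip every 4th byte (alpha)
}
const inputTensor = tf.tidy(() =>
  tf.tensor(rgb).reshape([1, imageData.width, imageData.height, 3])
);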
2.5 Asynchronous inference and optimization: inference runs through model.executeAsync(inputData), and performance.now() is used to compute the frame rate and measure inference performance. tf.tidy() plus manual dispose() release the Tensors, so the Web Worker does not leak GPU or regular memory while it runs. What model.executeAsync does internally: 1. Input validation and preparation — when model.executeAsync(inputTensor) is called, the input tensor is first validated (shape, data type, and so on) against the computation graph's requirements; by this point the input has already been created via tf.tensor() and formatted with .reshape([batch, height, width, channels]). 2. Graph Executor scheduling — internally TensorFlow.js uses a GraphExecutor to manage the computation graph; the GraphExecutor topologically sorts the graph, determines each node's (operation's) dependencies, and produces an execution plan. Every operation node is placed in the correct execution order, and a node's computation only fires once all of the nodes it depends on have finished. 3. Dispatch to the backend — following the plan, the GraphExecutor submits each operation node to the corresponding backend in turn; in the WebGL backend, for example, each operation calls runProgram, which dynamically compiles or reuses a shader program for the operation type and uploads the input data to GPU memory (usually as textures or buffers). 4. Asynchronous GPU computation — after submission, the shader programs execute asynchronously in the GPU command queue, so JavaScript is not blocked and the computation runs in parallel in the background; some operations involve asynchronous data transfers or special GPU calls (for example asynchronous reads via gl.readPixels), all managed through Promises. 5. Dependency management and layer-by-layer execution — the GraphExecutor manages the dependencies between nodes and only triggers the next operation once all of its dependencies have completed; this step-by-step progression keeps every node in the graph computing in the correct order while still being scheduled in parallel. 6. GPU-to-CPU data transfer — when the last operation in the graph finishes, if the final result has to be returned to the JavaScript layer (for example via .arraySync()), the backend starts an asynchronous data transfer: the WebGL backend calls APIs like gl.readPixels to read the data from GPU memory back into CPU memory, again without blocking the main thread, handled through a Promise. 7. Promise resolution and output — once all the computation and data transfers are done, the GraphExecutor collects the output tensors, the Promise returned by model.executeAsync resolves, and the output Tensor (or array of Tensors) is returned to the caller.
- Asynchronous inference improves FPS: the Web Worker runs on its own thread, but TensorFlow.js inference still consumes GPU/CPU resources; executed synchronously it could block the whole Worker and hurt UI interaction.
- Freeing memory prevents GPU leaks: TensorFlow.js creates many Tensors during inference, and if they are not released manually they keep occupying WebGL/GPU memory, leading to leaked video memory and application crashes. dispose() is called proactively to release Tensors and reduce WebGL resource usage, and tf.tidy(() => {...}) provides scope management so intermediate variables are released promptly. A compact inference-and-cleanup sketch follows this list.
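A compact sketch of inference plus cleanup, assuming model, inputTensor, and the transform helper from the surrounding sections:

const t0 = performance.now();
const prediction = await model.executeAsync(inputTensor); // asynchronous: does not block the Worker
const output = transform(prediction.arraySync()); // Tensor -> plain JS structure
const fps = 1000 / (performance.now() - t0); // rough inference frame rate
inputTensor.dispose(); // release the input explicitly
tf.dispose(prediction); // works for a single Tensor or an array of Tensors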
2.6 Converting and classifying the output: prediction.arraySync() converts the Tensor into a JavaScript array, and the transform function then turns it into a structured result; in other words, the raw model output is post-processed from an array of floats into something easy to use and understand. Concretely: the confidences are extracted with splice(-4), which removes the last four numbers — smileRate (smile probability), bgRate (background probability), faceRate (face probability), and handRate (hand probability). The remaining values are normalized coordinates, which are multiplied by width * scale to get real pixel coordinates. Based on the extracted confidences, the background, face, and hand probabilities are compared to decide which class the prediction belongs to; the position data for the corresponding parts (the face's bounding box, eye and mouth regions; the hand's bounding box) is then extracted and pushed into the matching arrays.
2.7 Web Worker communication: the Worker's onmessage handler processes the image data sent from the main thread, and after prediction postMessage sends the result back, giving efficient asynchronous communication.
2.8 Drawing the detected features: once the Worker returns a prediction, the main thread uses the data to draw the features on the Canvas — mainly the face box, hand box, mouth box, and eye box. Each feature (face, eyes, mouth, hand, and so on) has its own colour and opacity settings.
- Computing the scale ratio: the Web Worker ran detection on the scaled-down frame, so the returned feature coordinates are all relative to the scaled image. We therefore compute the scale ratio and scale the coordinates back up when drawing.
- Drawing the face box: if faceRate is above 0.3, a face was detected, and the face box is drawn from the feature coordinates x1, y1, x2, y2.
- Drawing the eye box: given a detected face, we only draw confident gazes. A gaze is judged confident when the average of two ratios — the pupil's horizontal offset from the eye's centre as a fraction of the horizontal radius, and its vertical offset as a fraction of the vertical radius — is below 0.2, meaning the eye is centred and the gaze is considered confident.
  - Counting confident gazes (getSightPassCount): iterate over all face data; within each face, eyesPos is an array containing all the eye keypoint positions. To process the keypoints in groups, EYE_GROUP_LEN (10 here) divides them, which determines each face's number of eye groups, eyesCount. For each eye group, its symmetry is computed from a horizontal ratio (horizontalRatio) and a vertical ratio (verticalRatio): the horizontal ratio is computed from three of the eye's Y coordinates — the midpoint of ey1 and ey2 compared against the position of ey3; the vertical ratio is computed from three of the eye's X coordinates — the midpoint of ex1 and ex2 compared against ex3. sightRatio is then the sum of the horizontal and vertical ratios. The per-group ratios are accumulated and averaged over eyesCount to get the face's average gaze ratio; if the ratio is below 0.035, the eye group satisfies the gaze condition and passedEyeCount is incremented by 1. Finally, the count of confident gazes is returned.
  - Averaging confident gazes: passedEyeCount (the number of confident gazes) divided by totalEyeCount (the total number of eye groups across all faces) gives the proportion of confident gazes among all eyes.
- Drawing the mouth box: given a detected face, if smileRate is above 0.4, the mouth box is drawn.
- Drawing the hand box: hand boxes are drawn directly.
2.9 Fetch the next video frame and repeat the pipeline: our logic is that the next frame is only processed after the previous frame's scaling, detection, and drawing have completed.
二、Implementation
2.1 index.js
import { VideoRecorder } from "../videoRecorder/index.js";
import { FaceDrawCanvas } from "../faceDrawCanvas/index.js";
import { calculatePredictScore } from "../predictScore/index.js";
import { FacePredictWorker } from "../facePredictWorker/index.js";
import { FacePredictCanvas } from "../facePredictCanvas/index.js";
export class AIVideoRecorder {
constructor(options) {
const {
videoElId,
videoWidth,
videoHeight,
videoDeviceId,
audioDeviceId,
aiScoreSetting,
predictImageSize,
faceDrawCanvasElId,
videoUploadWorkerUrl,
facePredictCanvasElId,
facePredictWorkerUrl,
processVideoCanvasElId,
videoProcessorWorkerUrl,
videoRecorderTimeWorkerUrl,
} = options;
this.predictResult = [];
this.modelStatus = false;
this.videoElId = videoElId;
this.videoWidth = videoWidth;
this.videoHeight = videoHeight;
this.audioDeviceId = audioDeviceId;
this.videoDeviceId = videoDeviceId;
this.faceDrawCanvasElId = faceDrawCanvasElId;
this.predictImageSize = predictImageSize || 320;
this.facePredictWorkerUrl = facePredictWorkerUrl;
this.videoUploadWorkerUrl = videoUploadWorkerUrl;
this.facePredictCanvasElId = facePredictCanvasElId;
this.processVideoCanvasElId = processVideoCanvasElId;
this.videoProcessorWorkerUrl = videoProcessorWorkerUrl;
this.videoRecorderTimeWorkerUrl = videoRecorderTimeWorkerUrl;
this.videoEl = document.getElementById(this.videoElId);
this.aiScoreSetting = aiScoreSetting || {
smileThreshold: 0.4,
volumeThreshold: 0.2,
smilePassLine: 0.6,
};
this.videoRecorder = null;
this.faceDrawCanvas = null;
this.facePredictWorker = null;
this.facePredictCanvas = null;
this.run();
}
run() {
this.initWorker();
this.initVideoAudio();
}
initCanvas(videoEl) {
this.faceDrawCanvas = new FaceDrawCanvas({
canvasElId: this.faceDrawCanvasElId,
});
this.facePredictCanvas = new FacePredictCanvas({
videoEl: videoEl,
videoWidth: this.videoWidth,
videoHeight: this.videoHeight,
canvasElId: this.facePredictCanvasElId,
predictImageSize: this.predictImageSize,
});
}
initWorker() {
this.facePredictWorker = new FacePredictWorker(this.facePredictWorkerUrl, {
predictImageSize: this.predictImageSize,
predictSuccessCallback: this.predictSuccessCallback,
loadModelSuccessCallback: this.loadModelSuccessCallback,
});
}
getRecorderVideoUrl = () => {
return this.videoRecorder.getRecorderChunkUrl();
};
async initVideoAudio() {
this.videoRecorder = new VideoRecorder({
videoWidth: this.videoWidth,
videoHeight: this.videoHeight,
videoElId: this.videoElId,
videoConstraint: {
noiseSuppression: true,
deviceId: this.videoDeviceId,
width: { ideal: this.videoWidth },
height: { ideal: this.videoHeight },
},
audioConstraint: {
noiseSuppression: true, // noise suppression
echoCancellation: true, // echo cancellation
autoGainControl: true, // automatic gain control
deviceId: this.audioDeviceId,
},
stopCallback: this.videoStopCallback,
pauseCallback: this.videoPauseCallback,
resumeCallback: this.videoResumeCallback,
uploadWorkerUrl: this.videoUploadWorkerUrl,
processWorkerUrl: this.videoProcessorWorkerUrl,
processVideoCanvasElId: this.processVideoCanvasElId,
recorderTimeWorkerUrl: this.videoRecorderTimeWorkerUrl,
});
const stream = await this.videoRecorder.prepareRecord();
this.initCanvas(this.videoRecorder.processCanvas);
}
predict = async () => {
const imageData = await this.facePredictCanvas.getPredictImageData(
this.videoEl,
this.predictImageSize
);
this.facePredictWorker.postMessage(imageData);
};
startRecord = () => {
if (!this.modelStatus) {
console.log("模型未加载");
return;
}
if (this.videoRecorder.status !== "notStarted") {
return;
}
this.videoRecorder.startRecorder();
this.predict();
};
pauseRecord = () => {
this.videoRecorder.pauseRecorder();
};
resumeRecord = () => {
this.videoRecorder.resumeRecorder();
};
stopRecord = () => {
this.videoRecorder.stopRecorder();
};
videoStopCallback = () => {
this.faceDrawCanvas.clear();
this.audioVolumeCanvas?.clear?.(); // optional: not wired up in this class
const predictScoreResult = calculatePredictScore({
config: {
...this.aiScoreSetting,
},
duration: 30, // hard-coded for now
predictResult: this.predictResult,
soundVolumes: [],
});
console.log("predictScoreResult", predictScoreResult);
};
videoPauseCallback = () => {
this.faceDrawCanvas.clear();
this.audioVolumeCanvas?.clear?.();
};
videoResumeCallback = () => {
this.predict();
};
predictSuccessCallback = (predictData) => {
const { id, output } = predictData;
if (id === "preflight") {
return;
}
if (this.videoRecorder.status !== "recording") {
return;
}
if (output && output.faces?.length !== 0) {
this.faceDrawCanvas?.draw?.(predictData);
this.predictResult.push({ ...output });
}
this.predict();
};
loadModelSuccessCallback = () => {
this.modelStatus = true;
};
}
2.2 predictScore.js
const DEFAULT_SCORE = 3;
const EYE_GROUP_LEN = 10;
function getASRSentencesDuration(sentences) {
const sentencesCount = sentences.length;
if (sentencesCount === 0) {
return 0;
}
const firstSentences = sentences[0];
const lastSentences = sentences[sentencesCount - 1];
const startTime = firstSentences.start_time;
const endTime = lastSentences.end_time;
const duration = endTime - startTime; // ms
const minutes = duration / 1000 / 60;
return minutes;
}
/**
 * Compute the characters-per-minute rate from Tencent ASR (speech-to-text) sentences
 * @param sentences Tencent ASR sentence list
 * @returns characters per minute
 */
function getASRSentencesSpeed(sentences) {
const duration = getASRSentencesDuration(sentences);
if (duration === 0) {
return "0";
}
// Total characters, including commas, periods, and question marks
let totalText = "";
sentences.forEach((sentence) => {
totalText += sentence.voice_text_str;
});
// Total character length after stripping commas, periods, and question marks (alternative):
// const totalCharLength = totalText.replace(/,|。|?/g, '').length
const totalCharLength = totalText.length;
const charLengthPerMinutes = totalCharLength / duration;
return charLengthPerMinutes;
}
function parseSentenceData(sentenceData) {
return sentenceData.map((sentence) => ({
start_time: sentence.start_time,
end_time: sentence.end_time,
}));
}
function getHandAverage(handCount, duration) {
const handAverage = (handCount / duration) * 60;
return handAverage;
}
function getSightAverage(totalEyeCount, passedEyeCount) {
if (totalEyeCount === 0) {
return "0";
}
const average = totalEyeCount == 0 ? 0 : passedEyeCount / totalEyeCount;
return average;
}
function getSmileAverage(smileCount, aiSize, smilePassLine) {
let smileAverage = 0;
if (smileCount < 1) {
smileAverage = smilePassLine;
} else if (smileCount < 2) {
smileAverage = smilePassLine;
} else {
smileAverage = Math.min(
1,
smilePassLine + (1 - smilePassLine) * ((4 * smileCount) / aiSize)
);
}
return smileAverage;
}
function getVolumeAverage(soundVolumes, volumeThreshold) {
if (soundVolumes.length === 0) {
return "0";
}
let valueTotal = 0;
let volumeCount = 0;
soundVolumes.forEach((item) => {
if (item > volumeThreshold) {
volumeCount += 1;
valueTotal += item;
}
});
const volumeAverage = (valueTotal / volumeCount) * 100;
return volumeAverage;
}
function getSmileScore(smileCount, aiSize) {
let newScore = DEFAULT_SCORE * 10;
if (!aiSize || !smileCount) {
return 0;
} else if (smileCount < 4) {
newScore += smileCount;
return newScore / 10;
}
const smileRate = smileCount / aiSize;
const scoreLevelList = [
1 / 300,
2 / 300,
3 / 300,
4 / 300,
1 / 60,
4 / 180,
5 / 180,
6 / 180,
1 / 10,
2 / 10,
3 / 10,
4 / 10,
5 / 10,
0.6,
0.8,
1,
];
newScore = newScore + 4;
for (let i = 0; i < scoreLevelList.length; i++) {
const rate = scoreLevelList[i];
newScore++;
if (smileRate <= rate) {
break;
}
}
return newScore / 10;
}
function getSightScore(passedEyeCount, totalEyeCount) {
if (!passedEyeCount) {
return 0;
} else {
const ratio = passedEyeCount / totalEyeCount;
const num = Math.round(1.9 * ratio * 10) / 10;
return 3.1 + num;
}
}
function getHandScore(handCount, aiSize) {
if (!handCount) {
return 0;
}
const handRate = handCount / aiSize;
if (handRate > 0.52) {
return 5;
}
const handLevelList = [
0.005, 0.01, 0.015, 0.02, 0.04, 0.06, 0.08, 0.1, 0.12, 0.16, 0.2, 0.24,
0.28, 0.32, 0.36, 0.4, 0.44, 0.48, 0.52,
];
let newScore = DEFAULT_SCORE * 10;
for (let i = 0; i < handLevelList.length; i++) {
const rate = handLevelList[i];
newScore++;
if (handRate <= rate) {
break;
}
}
return newScore / 10;
}
function parseAIFacePredictData({ predictResult, smileThreshold }) {
let smileCount = 0;
let handCount = 0;
let totalEyeCount = 0;
let passedEyeCount = 0;
predictResult.forEach((item) => {
const { hands, faces } = item;
if (hands.length > 0) {
handCount += item.hands.length;
}
if (faces.length > 0) {
// accumulate the face count
totalEyeCount += faces.length;
faces.forEach((face) => {
// A smileRate above smileThreshold counts as a smile; increment the smile count.
if (face.smileRate > smileThreshold) {
smileCount += 1;
}
// If the average of the pupil's horizontal-offset ratio and vertical-offset ratio is below 0.2, treat the eye as centred (a confident gaze) and increment the count.
const eyesCount = Math.round(face.eyesPos.length / EYE_GROUP_LEN);
let ratio = 0;
for (let i = 0; i < eyesCount; i++) {
const startIndex = i * EYE_GROUP_LEN;
const eyePos = face.eyesPos.slice(
startIndex,
startIndex + EYE_GROUP_LEN
);
const ey1 = eyePos[1];
const ey2 = eyePos[3];
const ey3 = eyePos[9];
const ex1 = eyePos[4];
const ex2 = eyePos[6];
const ex3 = eyePos[8];
const horizontalRatio = Math.abs(
((ey1 + ey2) / 2 - ey3) / (ey1 - ey2)
);
const verticalRatio = Math.abs(((ex1 + ex2) / 2 - ex3) / (ex1 - ex2));
const sightRatio = horizontalRatio + verticalRatio;
ratio += sightRatio;
}
ratio /= eyesCount;
if (ratio < 0.2) {
passedEyeCount += 1;
}
});
}
});
return {
smileCount,
handCount,
totalEyeCount,
passedEyeCount,
};
}
export function calculatePredictScore({
duration,
predictResult,
soundVolumes,
totalSentences,
config: { smileThreshold, volumeThreshold, smilePassLine },
}) {
const aiSize = predictResult.length;
const { smileCount, handCount, totalEyeCount, passedEyeCount } =
parseAIFacePredictData({
predictResult,
smileThreshold,
});
const handAverage = getHandAverage(handCount, duration);
const sightAverage = getSightAverage(totalEyeCount, passedEyeCount);
const smileAverage = getSmileAverage(smileCount, aiSize, smilePassLine);
const volumeAverage = getVolumeAverage(soundVolumes, volumeThreshold);
const asr = totalSentences
? {
charLengthPerMinutes: getASRSentencesSpeed(totalSentences),
duration: getASRSentencesDuration(totalSentences),
totalSentences: parseSentenceData(totalSentences),
}
: undefined;
const smileScore = getSmileScore(smileCount, aiSize);
const sightScore = getSightScore(passedEyeCount, totalEyeCount);
const handScore = getHandScore(handCount, aiSize);
return {
handAverage,
sightAverage,
smileAverage,
volumeAverage,
asr,
scoreMap: {
smileScore,
sightScore,
handScore,
},
};
}
2.3 serviceWorker.js
const CACHE_VERSION = "v2"; // 设置缓存的版本号
const CACHE_NAME = `tensorflow-model-cache-${CACHE_VERSION}`; // 缓存名称包含版本号
const cacheUrlList = [
"./facePredictWorker/model.json",
"./facePredictWorker/group1-shard1of1.bin",
"https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js",
];
// Install phase: cache the model files
self.addEventListener("install", (event) => {
event.waitUntil(
caches.open(CACHE_NAME).then((cache) => {
return cache.addAll(cacheUrlList);
})
);
});
// Activate phase: delete old caches
self.addEventListener("activate", (event) => {
const cacheWhitelist = [CACHE_NAME]; // keep only the current version's cache
event.waitUntil(
caches.keys().then((cacheNames) => {
return Promise.all(
cacheNames
.filter((cacheName) => !cacheWhitelist.includes(cacheName)) // 删除不在白名单中的缓存
.map((cacheName) => caches.delete(cacheName))
);
})
);
});
function checkIsUseCache(requestUrl) {
const newCacheUrlList = cacheUrlList.map((url) => {
if (url.startsWith("./")) {
url = url.replace("./", "");
}
return url;
});
return newCacheUrlList.some((url) => requestUrl.includes(url));
}
// Intercept network requests and load the model from the cache
self.addEventListener("fetch", (event) => {
const requestUrl = event?.request?.url || "";
const isUseCache = checkIsUseCache(requestUrl);
if (isUseCache) {
event.respondWith(
caches.match(event.request).then((cachedResponse) => {
// If the requested file is in the cache, return it directly
return cachedResponse || fetch(event.request);
})
);
}
});
2.4 faceDrawCanvas.js
const EYE_GROUP_LEN = 10;
const CANVAS_PALETTE = {
face: {
color: "#FFD92B",
pointColor: "#FF9800",
opacity: ["100%", "20%"],
},
eye: {
color: "#46D06E",
pointColor: "#1F9E40",
opacity: ["100%", "20%"],
},
mouth: {
color: "#FFA43E",
pointColor: "#E56014",
opacity: ["100%", "20%"],
},
hand: {
color: "#5ec8fe",
pointColor: "#0b66f9",
opacity: ["100%", "20%"],
},
sound: {
color: "#F5A623",
pointColor: "",
opacity: ["100%", "8%", "%0"],
},
};
function getSightAverage(totalEyeCount, passedEyeCount) {
return totalEyeCount === 0 ? 0 : passedEyeCount / totalEyeCount;
}
function getSightPassCount(workerResult) {
let passedEyeCount = 0;
const faces = workerResult.faces || [];
faces.forEach((face) => {
const eyesCount = Math.round(face.eyesPos.length / EYE_GROUP_LEN);
let ratio = 0;
for (let i = 0; i < eyesCount; i++) {
const startIndex = i * EYE_GROUP_LEN;
const eyePos = face.eyesPos.slice(startIndex, startIndex + EYE_GROUP_LEN);
const ey1 = eyePos[1];
const ey2 = eyePos[3];
const ey3 = eyePos[9];
const ex1 = eyePos[4];
const ex2 = eyePos[6];
const ex3 = eyePos[8];
const horizontalRatio = Math.abs(((ey1 + ey2) / 2 - ey3) / (ey1 - ey2));
const verticalRatio = Math.abs(((ex1 + ex2) / 2 - ex3) / (ex1 - ex2));
const sightRatio = horizontalRatio + verticalRatio;
ratio += sightRatio;
}
ratio /= eyesCount;
if (ratio < 0.035) {
passedEyeCount += 1;
}
});
return passedEyeCount;
}
function drawStartPoint(ctx, x, y, color, radius) {
ctx.fillStyle = color;
ctx.beginPath();
ctx.arc(x, y, radius, 0, 2 * Math.PI);
ctx.fill();
ctx.closePath();
}
function drawGradientLine(
ctx,
xStartPos,
yStartPos,
xEndPos,
yEndPos,
lineColor
) {
const gradient = ctx.createLinearGradient(
xStartPos,
yStartPos,
xEndPos,
yEndPos
);
gradient.addColorStop(0, lineColor);
gradient.addColorStop(0.8, `${lineColor}88`);
gradient.addColorStop(1, `${lineColor}00`);
ctx.beginPath();
ctx.moveTo(xStartPos, yStartPos);
ctx.lineTo(xEndPos, yEndPos);
ctx.strokeStyle = gradient;
ctx.stroke();
}
function drawLine(
ctx,
xStartPos,
yStartPos,
xEndPos,
yEndPos,
lineColor,
pointColor
) {
drawStartPoint(ctx, xStartPos, yStartPos, pointColor, 2);
drawGradientLine(ctx, xStartPos, yStartPos, xEndPos, yEndPos, lineColor);
}
function drawRectBorder(
ctx,
xPos,
yPos,
x2Pos,
y2Pos,
lineWidth,
lineColor,
pointColor
) {
ctx.lineWidth = lineWidth;
drawLine(ctx, xPos, yPos, xPos, y2Pos, lineColor, pointColor);
drawLine(ctx, xPos, y2Pos, x2Pos, y2Pos, lineColor, pointColor);
drawLine(ctx, x2Pos, y2Pos, x2Pos, yPos, lineColor, pointColor);
drawLine(ctx, x2Pos, yPos, xPos, yPos, lineColor, pointColor);
ctx.closePath();
}
function calculateScaleRatio(width, height, predictImageSize) {
return width / height > 1
? width / predictImageSize
: height / predictImageSize;
}
function drawFaceBorder(
ctx,
xPos,
yPos,
x2Pos,
y2Pos,
scaleRatio,
lineWidth = 3,
type = "face"
) {
const { color: lineColor, pointColor } = CANVAS_PALETTE[type];
xPos *= scaleRatio;
yPos *= scaleRatio;
x2Pos *= scaleRatio;
y2Pos *= scaleRatio;
drawRectBorder(
ctx,
xPos,
yPos,
x2Pos,
y2Pos,
lineWidth,
lineColor,
pointColor
);
}
export function drawBy2D(canvas, ctx, predictData) {
const { output, predictImageSize } = predictData;
const { hands, faces } = output;
ctx.clearRect(0, 0, canvas.width, canvas.height);
const sightAverage = getSightAverage(faces.length, getSightPassCount(output));
const showSight = sightAverage > 0.2;
const scaleRatio = calculateScaleRatio(
canvas.width,
canvas.height,
predictImageSize
);
faces.forEach((face) => {
const { faceRate, facePos } = face;
if (faceRate > 0.3) {
const [x1, y1, x2, y2] = facePos;
drawFaceBorder(ctx, x1, y1, x2, y2, scaleRatio, 3, "face");
}
const { eyesPos: eyesPosition } = face;
if (showSight) {
const ex3 = eyesPosition[4];
const ey1 = eyesPosition[1];
const ey2 = eyesPosition[3];
const ex9 = eyesPosition[16];
const ey6 = eyesPosition[11];
const ey7 = eyesPosition[13];
const xOffset = 5;
const yOffset = 5;
const eRect1 = { x: ex3 - xOffset, y: Math.min(ey1, ey6) - yOffset };
const eRect2 = { x: ex9 + xOffset, y: Math.max(ey2, ey7) + yOffset };
drawFaceBorder(
ctx,
eRect1.x,
eRect1.y,
eRect2.x,
eRect2.y,
scaleRatio,
3,
"eye"
);
}
const { mouthPos } = face;
const mx1 = mouthPos[0];
const my1 = mouthPos[7];
const mx2 = mouthPos[12];
const my2 = mouthPos[19];
if (face.smileRate >= 0.4) {
drawFaceBorder(ctx, mx1, my1, mx2, my2, scaleRatio, 3, "mouth");
}
});
hands.forEach((hand) => {
const [x1, y1, x2, y2] = hand.handPos;
drawFaceBorder(ctx, x1, y1, x2, y2, scaleRatio, 3, "hand");
});
}
export class FaceDrawCanvas {
constructor(options = {}) {
const { canvasElId } = options;
this.canvas = document.getElementById(canvasElId);
this.ctx2D = this.canvas.getContext("2d");
}
draw = async (predictData) => {
drawBy2D(this.canvas, this.ctx2D, predictData);
};
clear = async () => {
this.ctx2D.clearRect(0, 0, this.canvas.width, this.canvas.height);
};
}
2.5 facePredictCanvas.js
function calculateNewDimensions(videoWidth, videoHeight, predictImageSize) {
const aspectRatio = videoWidth / videoHeight;
let newWidth, newHeight;
if (aspectRatio > 1) {
newWidth = predictImageSize;
newHeight = Math.round(predictImageSize / aspectRatio); // round to the nearest integer
} else {
newHeight = predictImageSize;
newWidth = Math.round(predictImageSize * aspectRatio); // round to the nearest integer
}
return { newWidth, newHeight };
}
function scalePredictImageData(
ctx,
videoEl,
videoWidth,
videoHeight,
predictImageSize,
isFlipHorizontal = false,
isFlipVertical = false
) {
const { newWidth, newHeight } = calculateNewDimensions(
videoWidth,
videoHeight,
predictImageSize
);
ctx.clearRect(0, 0, newWidth, newHeight);
ctx.save();
if (isFlipHorizontal) ctx.scale(-1, 1);
if (isFlipVertical) ctx.scale(1, -1);
ctx.drawImage(
videoEl,
0,
0,
videoWidth,
videoHeight,
isFlipHorizontal ? -newWidth : 0,
isFlipVertical ? -newHeight : 0,
newWidth,
newHeight
);
ctx.restore();
}
export function getPredictImageDataBy2D(
ctx,
videoEl,
videoWidth,
videoHeight,
predictImageSize,
isFlipHorizontal,
isFlipVertical
) {
scalePredictImageData(
ctx,
videoEl,
videoWidth,
videoHeight,
predictImageSize,
isFlipHorizontal,
isFlipVertical
);
const imageData = ctx.getImageData(0, 0, predictImageSize, predictImageSize);
// For debugging the scaled imageData:
// ctx.putImageData(imageData, predictImageSize, predictImageSize);
return {
id: +new Date(),
width: imageData.width,
height: imageData.height,
buffer: imageData.data.buffer,
predictImageSize: predictImageSize,
};
}
export class FacePredictCanvas {
constructor(options = {}) {
const {
videoEl,
canvasElId,
videoWidth,
videoHeight,
isFlipVertical,
isFlipHorizontal,
predictImageSize,
} = options;
this.videoEl = videoEl;
this.videoWidth = videoWidth;
this.videoHeight = videoHeight;
this.isFlipVertical = isFlipVertical;
this.predictImageSize = predictImageSize;
this.isFlipHorizontal = isFlipHorizontal;
this.canvas = document.getElementById(canvasElId);
this.ctx2D = this.canvas.getContext("2d");
}
getPredictImageData = async () => {
return getPredictImageDataBy2D(
this.ctx2D,
this.videoEl,
this.videoWidth,
this.videoHeight,
this.predictImageSize,
this.isFlipHorizontal,
this.isFlipVertical
);
};
}
2.6 registerServiceWorker.js
export async function registerServiceWorker(workerPath) {
if (!("serviceWorker" in navigator)) {
console.warn("Service Worker is not supported in this browser.");
return;
}
try {
const registration = await navigator.serviceWorker.register(workerPath);
console.log("Service Worker registered with scope:", registration.scope);
} catch (error) {
console.error(`注册失败:${error}`);
}
}
2.7 videoRecorder/index.js
const recordModeMap = {
face: "face",
screen: "screen",
};
const statusMap = {
notStarted: "notStarted",
recording: "recording",
paused: "paused",
stopped: "stopped",
};
const EMediaError = {
AbortError: "media_aborted",
NotAllowedError: "permission_denied",
NotFoundError: "no_specified_media_found",
NotReadableError: "media_in_use",
OverconstrainedError: "invalid_media_constraints",
TypeError: "no_constraints",
SecurityError: "security_error",
OtherError: "other_error",
NoRecorder: "recorder_error",
// Recording is unavailable in this browser; upgrade it or use Chrome
NotChrome: "not_chrome",
// In Chrome, for security reasons, recording permission is only granted on localhost, 127.0.0.1, or https
UrlSecurity: "url_security",
None: "",
};
function isObject(object) {
return typeof object === "object" && object != null;
}
async function checkPermissions(name) {
try {
return await navigator.permissions.query({ name: name });
} catch (error) {
return false;
}
}
function removeUnsupportedConstraints(constraints) {
try {
const supportedMediaConstraints =
navigator.mediaDevices.getSupportedConstraints();
if (!supportedMediaConstraints) {
return;
}
Object.keys(constraints).forEach((constraint) => {
if (!supportedMediaConstraints[constraint]) {
console.log(
`VideoRecorder removeUnsupportedConstraints: Removing unsupported constraint "${constraint}".`
);
delete constraints[constraint];
}
});
} catch (error) {
console.log("VideoRecorder removeUnsupportedConstraints error: ", error);
}
}
function sanitizeStreamConstraints(streamConstraints) {
if (
isObject(streamConstraints.audio) &&
typeof streamConstraints.audio !== "boolean"
) {
removeUnsupportedConstraints(streamConstraints.audio);
}
if (
isObject(streamConstraints.video) &&
typeof streamConstraints.video !== "boolean"
) {
removeUnsupportedConstraints(streamConstraints.video);
}
}
function sanitizeRecorderOptions(recorderOptions) {
if (!window.MediaRecorder) {
console.log(
"VideoRecorder sanitizeRecorderOptions error",
"该浏览器不支持 window.MediaRecorder"
);
return;
}
if (
recorderOptions?.mimeType &&
typeof window.MediaRecorder.isTypeSupported === "function" &&
!window.MediaRecorder.isTypeSupported(recorderOptions.mimeType)
) {
console.log(
`VideoRecorder sanitizeRecorderOptions: Removing unsupported mimeType "${recorderOptions.mimeType}".`
);
delete recorderOptions.mimeType;
}
}
function getMediaPermissionErrorMessage(error) {
const errName = error.name;
if (errName === "NotFoundError" || errName === "DevicesNotFoundError") {
// required track is missing
// No media type satisfying the requested constraints was found.
return EMediaError.NotFoundError;
} else if (errName === "NotReadableError" || errName === "TrackStartError") {
// The media device is already in use by another application.
// A hardware, browser, or page-level error prevents the device from being accessed.
// webcam or mic already in use
return EMediaError.NotReadableError;
} else if (
errName === "OverConstrainedError" ||
errName === "ConstraintNotSatisfiedError"
) {
// The current device cannot satisfy the requested constraints
return EMediaError.OverconstrainedError;
} else if (
errName === "NotAllowedError" ||
errName === "permissionDeniedError"
) {
// permission denied in browser
// The user denied access for this browser instance, for this session, or globally for all media requests.
return EMediaError.NotAllowedError;
} else if (errName === "TypeError") {
// Type error: the constraints object is empty, or every constraint is set to false.
return EMediaError.TypeError;
} else if (errName === "AbortError") {
// Hardware problem
return EMediaError.AbortError;
} else if (errName === "SecurityError") {
// Security error: media access is disabled on the Document that called getUserMedia(); whether it is enabled depends on the user's preference settings.
return EMediaError.SecurityError;
} else {
// other errors
return error;
}
}
function releaseMediaStream(mediaStream) {
if (!mediaStream) {
return;
}
mediaStream.getTracks().forEach((track) => track.stop());
}
async function getMediaStream(recordMode, streamConstraints) {
try {
if (recordMode === recordModeMap.screen) {
const displayMediaSteam =
await window.navigator.mediaDevices.getDisplayMedia(streamConstraints);
/**
* @description: Merge the microphone audio stream into the screen-capture stream
*/
const displayAudioStream =
await window.navigator.mediaDevices.getUserMedia({
audio: streamConstraints.audio, // use the function parameter; `this` is undefined here
});
displayAudioStream
.getAudioTracks()
.forEach((audioTrack) => displayMediaSteam.addTrack(audioTrack));
return displayMediaSteam;
}
const userMediaStream = await window.navigator.mediaDevices.getUserMedia(
streamConstraints
);
return userMediaStream;
} catch (error) {
console.log("error", error);
const errorMessage = getMediaPermissionErrorMessage(error);
console.log("VideoRecorder setMediaStream Error:", errorMessage);
}
}
export class VideoRecorder {
constructor(options = {}) {
const {
videoElId,
recordMode,
videoWidth,
videoHeight,
stopCallback,
pauseCallback,
startCallback,
resumeCallback,
processCallback,
recorderOptions,
videoConstraint,
audioConstraint,
uploadWorkerUrl,
processWorkerUrl,
recorderTimeWorkerUrl,
processVideoCanvasElId,
} = options;
this.stream = null;
this.videoEl = null;
this.recorder = null;
this.recordChunks = [];
this.videoElId = videoElId;
this.videoWidth = videoWidth;
this.videoHeight = videoHeight;
this.stopCallback = stopCallback;
this.pauseCallback = pauseCallback;
this.startCallback = startCallback;
this.resumeCallback = resumeCallback;
this.processCallback = processCallback;
this.uploadWorkerUrl = uploadWorkerUrl;
this.processWorkerUrl = processWorkerUrl;
this.recorderTimeWorkerUrl = recorderTimeWorkerUrl;
this.processVideoCanvasElId = processVideoCanvasElId;
this.uploadWorker = null;
this.processCtx2D = null;
this.processCanvas = null;
this.processStream = null;
this.processWorker = null;
this.recorderTimerWorker = null;
this.isPauseIng = false;
this.status = statusMap.notStarted;
this.recordMode = recordMode || recordModeMap.face;
this.streamConstraints = {
audio: audioConstraint,
video: videoConstraint,
};
this.recorderOptions = recorderOptions || {
mimeType: "video/webm",
};
this.run();
}
run() {
this.initVideoEl();
this.initEventListener();
this.initUploadWorker();
this.initProcessCanvas();
this.initProcessWorker();
this.initRecorderTimerWorker();
}
initVideoEl() {
this.videoEl = document.getElementById(this.videoElId);
if (!this.videoEl) {
return;
}
// Mute local playback so the microphone does not re-record the video's own audio (prevents echo)
this.videoEl.muted = true;
this.videoEl.addEventListener("play", () => {
VideoRecorder.captureVideoFrame(this.videoEl, this.processVideoFrame);
});
}
initEventListener() {
navigator.connection?.addEventListener?.("change", () => {
const { downlink, effectiveType } = navigator.connection;
const online = navigator.onLine;
console.log(`🌐 当前网络速度:${downlink} Mbps,类型:${effectiveType}`);
this.uploadWorker.postMessage({
type: "network",
online,
speed: downlink,
});
});
window.addEventListener("online", () => {
this.uploadWorker.postMessage({ type: "network", online: true });
});
window.addEventListener("offline", () => {
this.uploadWorker.postMessage({ type: "network", online: false });
});
}
initProcessCanvas() {
this.processCanvas = document.getElementById(this.processVideoCanvasElId);
this.processCanvas.width = this.videoWidth;
this.processCanvas.height = this.videoHeight;
this.processCtx2D = this.processCanvas.getContext("2d");
}
initProcessWorker() {
this.processWorker = new Worker(this.processWorkerUrl);
this.processWorker.onmessage = this.onVideoFrameMessage;
this.processWorker.postMessage({
type: "init",
width: this.videoWidth,
height: this.videoHeight,
});
}
initUploadWorker() {
this.uploadWorker = new Worker(this.uploadWorkerUrl);
this.uploadWorker.onmessage = this.onUploadMessage;
}
initRecorderTimerWorker() {
this.recorderTimerWorker = new Worker(this.recorderTimeWorkerUrl);
this.recorderTimerWorker.onmessage = this.onRecorderTimerMessage;
}
clearAll() {
this.clearRecorder();
releaseMediaStream(this.stream);
}
clearRecorder() {
this.recorder = null;
this.recordChunks = [];
}
getRecorderChunkUrl() {
const mimeType = "video/mp4";
const blob = new Blob(this.recordChunks, { type: mimeType });
const url = URL.createObjectURL(blob);
return url;
}
static async enumMediaDevices() {
try {
const devices = await window.navigator.mediaDevices.enumerateDevices();
const inputDevices = devices.filter((item) => {
return item.kind.endsWith("input") && item.deviceId !== "";
});
const audioInputs = inputDevices.filter(
(item) => item.kind === "audioinput"
);
const videoInputs = inputDevices.filter(
(item) => item.kind === "videoinput"
);
return [videoInputs, audioInputs];
} catch (error) {
console.error("VideoRecorder static enumMediaDevices error:", error);
return [[], []];
}
}
static isVideoPlaying(video) {
if (!video) {
return false;
}
return !video.paused && !video.ended;
}
static captureVideoFrame(video, callback) {
if (!video) {
return;
}
let requestFrame;
function processFrame(now, metadata) {
if (!VideoRecorder.isVideoPlaying(video)) {
return;
}
callback?.(video, now, metadata);
requestFrame(processFrame);
}
if ("requestVideoFrameCallback" in HTMLVideoElement.prototype) {
requestFrame = (cb) => video.requestVideoFrameCallback(cb);
} else if (window.requestAnimationFrame) {
requestFrame = (cb) => requestAnimationFrame(cb);
} else {
// Fallback: a one-shot timer at ~30 fps. setTimeout is used instead of
// setInterval so a new interval is not stacked on every frame, since
// processFrame re-schedules itself through requestFrame.
requestFrame = (cb) => {
setTimeout(() => {
cb(performance.now());
}, 1000 / 30);
};
}
requestFrame(processFrame);
}
static async requestMediaPermissions() {
let mediaStream = null;
try {
mediaStream = await window.navigator.mediaDevices.getUserMedia({
video: true,
audio: true,
});
} catch (error) {
console.log("VideoRecorder static requestMediaPermissions error", error);
} finally {
// Always release the stream after this permission probe; if the tracks are not stopped, the camera or microphone may stay occupied and block other applications.
releaseMediaStream(mediaStream);
return !!mediaStream;
}
}
async checkMediaPermissions() {
try {
const camera = await checkPermissions("camera");
const microphone = await checkPermissions("microphone");
if (camera.state === "granted" && microphone.state === "granted") {
return true;
}
console.log("请允许使用摄像头和麦克风");
return false;
} catch (error) {
const errorMessage = getMediaPermissionErrorMessage(error);
console.log(
"VideoRecorder checkPermissions 出现错误, 错误信息为",
errorMessage
);
return false;
}
}
drawVideoFrame = (imageBitmap) => {
this.processCtx2D.clearRect(
0,
0,
this.processCanvas.width,
this.processCanvas.height
);
this.processCtx2D.drawImage(
imageBitmap,
0,
0,
this.processCanvas.width,
this.processCanvas.height
);
};
onVideoFrameMessage = (event) => {
const { data = {} } = event;
if (data.type !== "processed") {
return;
}
this.drawVideoFrame(data.imageBitmap);
};
processVideoFrame = async (video) => {
const imageBitmap = await createImageBitmap(video);
this.processWorker.postMessage({ type: "process", imageBitmap }, [
imageBitmap,
]);
};
onUploadMessage = (event) => {
const { message } = event.data;
console.log("message", message);
};
onRecorderTimerMessage = (event) => {
const { type } = event.data;
if (type === "updateTime") {
this.recorder.requestData();
}
};
pauseVideo() {
this.videoEl?.pause?.();
}
playVideo() {
if (!this.videoEl) {
return;
}
this.videoEl?.play?.();
}
async prepareRecord() {
const checkResult1 = await this.checkMediaPermissions();
if (!checkResult1) {
return null;
}
sanitizeRecorderOptions(this.recorderOptions);
sanitizeStreamConstraints(this.streamConstraints);
if (this.recorder) {
this.recorder = null;
this.recordChunks = [];
}
this.processStream = this.processCanvas.captureStream(60);
this.stream = await getMediaStream(this.recordMode, this.streamConstraints);
if (!this.stream) {
return null;
}
if (this.videoEl) {
this.videoEl.srcObject = this.stream;
}
const audioTracks = this.stream.getAudioTracks();
audioTracks.forEach((track) => this.processStream.addTrack(track));
this.recorder = new window.MediaRecorder(
this.processStream,
this.recorderOptions
);
this.recorder.onstop = this.onRecorderStop;
this.recorder.onstart = this.onRecorderStart;
this.recorder.onpause = this.onRecorderPause;
this.recorder.onresume = this.onRecorderResume;
this.recorder.ondataavailable = this.onRecorderDataavailable;
return this.processStream;
}
startRecorder() {
if (this.status !== statusMap.notStarted) {
return;
}
this.playVideo();
this.recorder.start();
this.status = statusMap.recording;
this.uploadWorker.postMessage({ type: "init" });
this.recorderTimerWorker.postMessage({ action: "start" });
}
pauseRecorder() {
if (this.status !== statusMap.recording) {
return;
}
this.pauseVideo();
this.recorder.pause();
this.recorderTimerWorker.postMessage({ action: "stop" });
}
resumeRecorder() {
if (!this.isPauseIng) {
return;
}
this.playVideo();
this.recorder.resume();
this.recorderTimerWorker.postMessage({ action: "start" });
}
stopRecorder() {
this.pauseVideo();
this.recorder.stop();
this.recorderTimerWorker.postMessage({ action: "stop" });
}
onRecorderStop = () => {
this.stopCallback?.();
this.status = statusMap.stopped;
};
onRecorderStart = () => {
this.startCallback?.();
this.status = statusMap.recording;
};
onRecorderPause = () => {
this.isPauseIng = true;
this.recorder.requestData();
this.status = statusMap.paused;
this.pauseCallback?.();
};
onRecorderResume = () => {
this.resumeCallback?.();
this.isPauseIng = false;
this.status = statusMap.recording;
};
onRecorderError = () => {
this.errorCallback?.();
};
onRecorderDataavailable = (event) => {
const { data } = event;
if (data.size <= 0) {
return;
}
this.recordChunks.push(data);
this.processCallback?.(data);
// A Blob is not a transferable object, so it is passed by structured clone
// (cheap: the underlying bytes are shared rather than copied eagerly).
this.uploadWorker.postMessage({ type: "upload", recorderChunk: data });
};
}
2.8 videoRecorder/video-upload-worker.js
let scheduler = null;
let uploadQueue = [];
let isUploading = false;
let recorderChunkIndex = 0;
class Scheduler {
constructor(parallelism) {
this.queue = [];
this.paused = false;
this.runningTask = 0;
this.parallelism = parallelism;
}
add(task, callback) {
return new Promise((resolve, reject) => {
const taskItem = {
reject,
resolve,
callback,
processor: () => Promise.resolve().then(() => task()),
};
this.queue.push(taskItem);
this.schedule();
});
}
pause() {
this.paused = true;
console.log("⚠️ 网络状态不佳,上传已暂停");
}
resume() {
if (this.paused) {
this.paused = false;
console.log("✅ 网络恢复,恢复上传");
this.schedule();
}
}
schedule() {
while (
!this.paused &&
this.runningTask < this.parallelism &&
this.queue.length
) {
this.runningTask++;
const taskItem = this.queue.shift();
const { processor, resolve, reject, callback } = taskItem;
processor()
.then((res) => {
resolve && resolve(res);
callback && callback(null, res);
})
.catch((error) => {
reject && reject(error);
callback && callback(error, null);
})
.finally(() => {
this.runningTask--;
this.schedule();
});
}
}
}
function request(data) {
return new Promise((resolve) => {
setTimeout(() => {
console.log(`📤 上传 ${data.id}:`, data.chunk);
resolve();
}, 1000);
});
}
function requestWithRetry(data, retries = 3) {
return new Promise((resolve, reject) => {
function attempt(remaining) {
request(data)
.then(resolve)
.catch((err) => {
if (remaining > 0) {
console.log(
`⚠️ 任务 ${data.id} 失败,正在重试 (${3 - remaining + 1}/3)...`
);
attempt(remaining - 1);
} else {
reject(err);
}
});
}
attempt(retries);
});
}
function addTask(data) {
scheduler.add(
() => requestWithRetry(data),
(error, result) => {
console.log(`${data.id} 已上传完成`);
if (scheduler.queue.length === 0 && scheduler.runningTask === 0) {
console.log("✅ 上传队列已清空");
self.postMessage({ message: "✅ 上传队列已清空" });
}
}
);
}
self.onmessage = (event) => {
const { type, speed, online, recorderChunk } = event.data;
switch (type) {
case "init":
recorderChunkIndex = 0;
scheduler = new Scheduler(4);
break;
case "upload":
addTask({ id: recorderChunkIndex, chunk: recorderChunk });
recorderChunkIndex++;
break;
case "network":
if (!online || (speed !== undefined && speed < 1)) {
scheduler?.pause?.();
} else {
scheduler?.resume?.();
}
break;
}
};
2.9 videoRecorder/video-processor-worker.js
let offscreenCanvas;
let offscreenCanvasCtx2D;
self.onmessage = async (event) => {
const { data = {} } = event;
const { type } = data;
switch (type) {
case "init":
const { width, height } = data;
offscreenCanvas = new OffscreenCanvas(width, height);
offscreenCanvasCtx2D = offscreenCanvas.getContext("2d");
break;
case "process":
const { imageBitmap } = data;
offscreenCanvasCtx2D.setTransform(-1, 0, 0, 1, offscreenCanvas.width, 0);
offscreenCanvasCtx2D.drawImage(
imageBitmap,
0,
0,
offscreenCanvas.width,
offscreenCanvas.height
);
const newImageBitmap = await offscreenCanvas.transferToImageBitmap();
self.postMessage({ type: "processed", imageBitmap: newImageBitmap }, [
newImageBitmap,
]);
break;
default:
console.log("未知情况!!!");
}
};
2.10 videoRecorder/video-recorder-time-worker.js
let recordingTime = 0;
self.onmessage = (event) => {
const { action } = event.data;
if (action === "start") {
recordingTime = 0;
startTimer();
} else if (action === "stop") {
stopTimer();
}
};
let timerId;
function startTimer() {
timerId = setInterval(() => {
recordingTime += 5; // Increment time by 5 seconds
self.postMessage({ type: "updateTime", time: recordingTime });
}, 5000);
}
function stopTimer() {
clearInterval(timerId);
}
2.11 facePredictWorker/index.js
export class FacePredictWorker {
constructor(url, options = {}) {
const {
predictImageSize,
predictSuccessCallback,
loadModelSuccessCallback,
} = options;
this.url = url;
this.worker = null;
this.predictImageSize = predictImageSize;
this.predictSuccessCallback = predictSuccessCallback;
this.loadModelSuccessCallback = loadModelSuccessCallback;
this.run();
}
run() {
this.worker = new Worker(this.url);
this.worker.onmessage = this.onMessage;
}
onMessage = (event) => {
const data = event.data;
if (data.message === "load_model_success") {
console.log("Model loaded successfully.");
this.loadModelSuccessCallback?.();
this.postPreflightPredictionMessage();
} else if (data.message === "load_model_failed") {
console.error("Model loading failed.", event);
} else if (data.message === "predict_success") {
this.predictSuccessCallback?.(data);
} else if (data.message === "model_not_loaded") {
console.log("Model Not Load");
}
};
postMessage(data) {
this.worker.postMessage(
{
id: data.id,
width: data.width,
height: data.height,
buffer: data.buffer,
predictImageSize: data.predictImageSize || this.predictImageSize,
},
[data.buffer]
);
}
postPreflightPredictionMessage() {
const width = this.predictImageSize;
const height = this.predictImageSize;
const buffer = new ArrayBuffer(width * height * 4);
this.worker.postMessage(
{
id: "preflight",
width,
height,
buffer,
predictImageSize: this.predictImageSize,
},
[buffer]
);
}
}
2.12 facePredictWorker/worker.js
importScripts("https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js");
let model;
const width = 640;
const scale = 0.5;
(async () => {
const baseUrl = self.location.href.replace(/\/[^\/]*$/, "");
const modelUrl = `${baseUrl}/model.json`;
try {
model = await tf.loadGraphModel(modelUrl, {
fetchFunc: (url, options) => {
return caches.match(url).then((cachedResponse) => {
return cachedResponse || fetch(url, options); // return the cached response if present, otherwise fall back to the network
});
},
});
postMessage({ message: "load_model_success" });
} catch (e) {
postMessage({ message: "load_model_failed" });
}
})();
function transform(prediction) {
let hands = [];
let faces = [];
prediction.forEach((object) => {
let rates = object.splice(-4);
let smileRate = rates[0];
let bgRate = rates[1];
let faceRate = rates[2];
let handRate = rates[3];
let pos = object.map((val) => {
return val * width * scale;
});
let type = [bgRate, faceRate, handRate].indexOf(
Math.max(bgRate, faceRate, handRate)
);
if (type === 1) {
let facePos = pos.slice(0, 4);
let eyesPos = pos.slice(4, 24);
let mouthPos = pos.slice(24, 52);
faces.push({
faceRate,
smileRate,
facePos,
mouthPos,
eyesPos,
});
}
if (type === 2) {
hands.push({
handRate,
handPos: pos.slice(0, 4),
});
}
});
return {
faces,
hands,
};
}
async function predict(imageData, options = {}) {
const { id: imageId, predictImageSize } = options;
const data = new Array(imageData.data.length * 0.75 || 0);
let j = 0;
for (let i = 0; i < imageData.data.length; i++) {
if ((i & 3) !== 3) {
data[j++] = imageData.data[i];
}
}
let inputData = tf.tidy(() => {
return tf.tensor(data).reshape([1, imageData.width, imageData.height, 3]);
});
try {
const t0 = performance.now();
let prediction = await model.executeAsync(inputData);
inputData.dispose();
tf.dispose(inputData);
inputData = null;
let output = tf.tidy(() => {
return transform(prediction.arraySync());
});
const t1 = performance.now();
postMessage({
output,
id: imageId,
predictImageSize,
frame: 1000 / (t1 - t0),
message: "predict_success",
});
prediction.dispose && prediction.dispose();
tf.dispose(prediction);
prediction = null;
output = null;
} catch (error) {
inputData && inputData.dispose();
tf.dispose(inputData);
inputData = null;
postMessage({ message: "predict_failed", error });
}
}
onmessage = function (event) {
const image = new ImageData(
new Uint8ClampedArray(event.data.buffer),
event.data.width,
event.data.height
);
if (!model) {
postMessage({ message: "model_not_loaded" });
return;
}
predict(image, event.data);
};
2.13 facePredictWorker/model.json
2.14 facePredictWorker/group1-shard1of1.bin
三、Testing
3.1 test.js
import { VideoRecorder } from "./videoRecorder/index.js";
import { AIVideoRecorder } from "./aiVideoRecorder/index.js";
import { registerServiceWorker } from "./registerServiceWorker.js";
registerServiceWorker("./serviceWorker.js");
await VideoRecorder.requestMediaPermissions();
const mediaDevices = await VideoRecorder.enumMediaDevices();
const videoDeviceId = mediaDevices[0][0].deviceId;
const audioDeviceId = mediaDevices[1][0].deviceId;
const aiVideoRecorder = new AIVideoRecorder({
videoDeviceId,
audioDeviceId,
videoWidth: 862,
videoHeight: 485,
videoElId: "recorder-video",
faceDrawCanvasElId: "face-draw-canvas",
facePredictCanvasElId: "face-predict-canvas",
processVideoCanvasElId: "process-video-canvas",
facePredictWorkerUrl: "./facePredictWorker/worker.js",
videoUploadWorkerUrl: "./videoRecorder/video-upload-worker.js",
videoProcessorWorkerUrl: "./videoRecorder/video-processor-worker.js",
videoRecorderTimeWorkerUrl: "./videoRecorder/video-recorder-time-worker.js",
});
function handleStartRecord() {
aiVideoRecorder.startRecord();
}
function handlePauseRecord() {
aiVideoRecorder.pauseRecord();
}
function handleResumeRecord() {
aiVideoRecorder.resumeRecord();
}
function handleStopRecord() {
aiVideoRecorder.stopRecord();
}
function handlePreviewRecord() {
const url = aiVideoRecorder.getRecorderVideoUrl();
const previewViewEl = document.getElementById("preview-recorder-video");
previewViewEl.src = url;
previewViewEl.play();
}
const stopRecordEl = document.getElementById("stop-record");
const startRecordEl = document.getElementById("start-record");
const pauseRecordEl = document.getElementById("pause-record");
const resumeRecordEl = document.getElementById("resume-record");
const previewRecordEl = document.getElementById("preview-record");
startRecordEl.addEventListener("click", () => handleStartRecord());
pauseRecordEl.addEventListener("click", () => handlePauseRecord());
resumeRecordEl.addEventListener("click", () => handleResumeRecord());
stopRecordEl.addEventListener("click", () => handleStopRecord());
previewRecordEl.addEventListener("click", () => handlePreviewRecord());
3.2 test.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>视频录制</title>
<style>
#recorder-container {
width: 862px;
height: 485px;
position: relative;
}
#recorder-video {
opacity: 0;
width: 100%;
height: 100%;
}
#process-video-canvas {
width: 862px;
height: 485px;
left: 0;
top: 0;
z-index: 100;
position: absolute;
}
#audio-volume-canvas-container {
position: absolute;
top: 12px;
left: 12px;
width: 180px;
height: 34px;
background: #000000;
border-radius: 17px;
opacity: 0.8;
z-index: 999;
}
#audio-volume-canvas {
position: absolute;
top: 13px;
left: 36px;
border-radius: 4px;
}
#face-draw-canvas {
position: absolute;
left: 0;
top: 0;
z-index: 888;
}
#face-predict-canvas {
position: absolute;
left: 0;
top: 0;
}
#recorder-operation {
width: 862px;
margin-top: 24px;
display: flex;
justify-content: center;
gap: 24px;
}
</style>
</head>
<body>
<div id="recorder-container">
<video id="recorder-video"></video>
<canvas id="face-predict-canvas" width="862" height="485"></canvas>
<canvas id="process-video-canvas"></canvas>
<div id="audio-volume-canvas-container">
<canvas id="audio-volume-canvas" width="110" height="8"></canvas>
</div>
<canvas id="face-draw-canvas" width="862" height="485"></canvas>
</div>
<div id="recorder-operation">
<button id="start-record">开始录制</button>
<button id="pause-record">暂停录制</button>
<button id="resume-record">继续录制</button>
<button id="stop-record">停止录制</button>
<button id="preview-record">预览录制</button>
</div>
<div id="preview-record-container">
<video id="preview-recorder-video"></video>
</div>
<script type="module" src="./test.js"></script>
</body>
</html>