first commit

This commit is contained in:
陈赣
2026-06-03 12:43:14 +08:00
commit ba76cfae28
608 changed files with 120791 additions and 0 deletions

145
nodes/README.md Normal file
View File

@@ -0,0 +1,145 @@
## important tips when using node ##
Most nodes support multi-channel, which means that multiple channels of data can be used as its input. In addition, some nodes do not support multi-channel by default, that is, there can only be one channel of data as its input (the input channel index cannot be changed once it is determined), otherwise an error will occur. The following are common examples:
*support multi-channel*
- all infer nodes, broker nodes, because they work independently of the channel index.
- split node, it works independently of channel index too. it could be used to split pipeline into multi branches accordding to different channel index.
- some nodes which has been designed to support multi-channel, such as record nodes, track nodes, ba nodes, which are able to distinguish different channels inside logic code (`such as using std::map<int, ...> to maintain different data of channels`).
*do NOT support multi-channel*
- all src node and all des node, you MUST specify channel index when initializing instances of them.
- part of osd nodes, because they work dependently of channel index, you can do some work to let them support multi-channel, for example just use some data structure like `std::map<int, ...>` to maintain different data of channels inside code of nodes.
Note, although some nodes support multi-channel, you'd better be careful because using single instance of node to deal with multiple channels of data would have lower performance(process data serially
). On the contrary, one instance dealing with only one channel of data (multiple channels use multiple instances of the SAME node) have higher performance(process data in parallel). below is an example of 2 methods to create pipeline:
```
single instance of track/ba node works on 2 channels
file_src_0 --> osd_0 --> screen_des_0
--> detector --> multi-classifiers --> tracker --> ba_crossline --> split(by channel index)
file_src_1 --> osd_1 --> screen_des_1
2 instances of track/ba node work on 2 channels:
file_src_0 --> tracker_0 --> ba_crossline_0 --> osd_0 --> screen_des_0
--> detector --> multi-classifiers --> split(by channel index)
file_src_1 --> tracker_0 --> ba_crossline_0 --> osd_1 --> screen_des_1
```
## 节点目录 ##
<details open>
<summary>ba</summary>
- vp_ba_crossline_node跨线判断
- vp_ba_jam_node拥堵判断
- vp_ba_stop_node停止判断
</details>
<details open>
<summary>broker</summary>
- vp_ba_socket_broker_node使用udp转发行为分析结果
- vp_embeddings_properties_socket_broker_node使用udp转发目标特征、属性结果
- vp_embeddings_socket_broker_node使用udp转发目标特征结果
- vp_expr_socket_broker_node使用udp转发数学表达式检查结果
- vp_json_console_broker_node以json格式将结构化数据输出到控制台
- vp_json_kafka_broker_node以json格式将结构化数据通过kafka发送给第三方
- vp_msg_broker_node数据代理基类节点
- vp_plate_socket_broker_node使用udp转发车牌识别结果
- vp_xml_file_broker_node以xml格式将结构化数据存储到文件
- vp_xml_socket_broker_node以xml格式将结构化数据通过udp发送给第三方
</details>
<details open>
<summary>infers</summary>
- vp_classifier_node基于resnet系列的图像分类节点opencv::dnn
- vp_enet_seg_node基于ENet网络的图像分割节点opencv::dnn
- vp_face_swap_node基于insightface的人脸替换节点opencv::dnn
- vp_feature_encoder_node基于resnet系列的目标特征提取节点opencv::dnn
- vp_lane_detector_node基于CenterNet的车道线检测节点opencv::dnn
- vp_mask_rcnn_detector_node基于maskrcnn的目标检测节点opencv::dnn
- vp_openpose_detector_node基于openpose的肢体检测节点opencv::dnn
- vp_ppocr_text_detector_node基于paddleocr的文字检测节点paddleinference
- vp_restoration_node基于real-esrgan的图像增强修复节点opencv::dnn
- vp_sface_feature_encoder_node基于sface网络的人脸特征提取节点opencv::dnn
- vp_trt_vehicle_color_classifier基于resnet18的车辆颜色分类节点tensorrt
- vp_trt_vehicle_detector基于yolov5s的车辆检测节点tensorrt
- vp_trt_vehicle_feature_encoder基于fastreid的车辆特征提取节点tensorrt
- vp_trt_vehicle_plate_detector_v2基于yolov5s的车牌检测识别节点一级推理tensorrt
- vp_trt_vehicle_plate_detector基于yolov5s的车牌检测识别节点二级推理tensorrt
- vp_trt_vehicle_scanner基于yolov5s的车身扫描节点tensorrt
- vp_trt_vehicle_type_classifier基于resnet18的车辆车型分类节点tensorrt
- vp_yolo_detector_node基于yolov3含tiny的目标检测节点opencv::dnn
- yolo_yunet_face_detector_node基于yunet网络的人脸检测节点opencv::dnn
</details>
<details open>
<summary>osd</summary>
- vp_ba_crossline_osd_node跨线判断结果绘制节点
- vp_ba_jam_osd_node拥堵判断结果绘制节点
- vp_ba_stop_osd_node停止判断结果绘制节点
- vp_cluster_node目标聚类结果绘制节点
- vp_expr_osd_node数学表达式检查结果绘制节点
- vp_face_osd_node_v2人脸检测结果绘制节点含相似度显示
- vp_face_osd_node人脸检测结果绘制节点
- vp_lane_osd_node车道线检测结果绘制节点
- vp_osd_node_v2目标绘制节点含子目标
- vp_osd_node_v3目标绘制节点含目标mask
- vp_osd_node目标绘制节点
- vp_plate_osd_node车牌检测识别结果绘制节点
- vp_pose_osd_node肢体检测结果绘制节点
- vp_seg_osd_node图像分割结果绘制节点
- vp_text_osd_node文字检测识别结果绘制节点
</details>
<details open>
<summary>proc</summary>
- vp_expr_check_node数学等式准确性判断节点
- vp_frame_fusion_node视频帧按像素比融合节点支持2个通道
</details>
<details open>
<summary>record</summary>
- vp_record_node视频/图片录制节点
</details>
<details open>
<summary>track</summary>
- vp_dsort_track_node基于deepsort的跟踪节点
- vp_sort_track_node基于sort的跟踪节点
</details>
<details open>
<summary>common</summary>
- vp_app_des_node将图片数据推送给application的目标节点
- vp_app_src_node从application接收图片数据的原始节点
- vp_des_node所有目标节点基类
- vp_fake_des_node虚拟目标节点不做任何事
- vp_file_des_node将视频数据存入文件的目标节点
- vp_file_src_node从文件读取视频数据的原始节点
- vp_image_des_node将数据以图片的形式发送给socket或者file的目标节点
- vp_image_src_node从file或者socket读取图片数据的原始节点
- vp_infer_node所有推理节点基类
- vp_message_broker_node所有数据代理节点基类
- vp_node所有节点基类
- vp_placeholder_node虚拟中间节点不做任何事
- vp_primary_infer_node所有一级推理节点基类
- vp_rtmp_des_node将视频数据以rtmp格式推送到rtmp服务器的目标节点
- vp_rtsp_des_node将视频数据以rtsp格式推送无需rtsp服务器的目标节点
- vp_rtsp_src_node以rtsp格式读取网络流的原始节点
- vp_screen_des_node将视频/图片显示到屏幕的目标节点
- vp_secondary_infer_node所有二级推理节点基类
- vp_split_node管道拆分节点
- vp_src_node所有原始节点基类
- vp_sync_node管道分支同步节点
- vp_udp_src_node以udp格式读取网络流的原始节点
</details>

5
nodes/ba/README.md Normal file
View File

@@ -0,0 +1,5 @@
## tips for BA node ##
- behaviour analysisshort as BA works dependently on tracking, all BA nodes Must attached after track node in pipeline.
- all types of BA nodes support multi-channel, single instance of BA node supports multi channels as input.
- all BA nodes can be divided into 2 categories, one is instantaneous(like crossline) and another one is continuous(like stop, jam). `vp_ba_type` contains pair member for continuous BA such as `STOP` and `UNSTOP`, which means target enter BA status and leave BA status.
- there is no flag of BA inside `vp_frame_target` type, all nodes in pipeline need maintain it by themself if they want to know BA status for some task(like counter for crossline).

View File

@@ -0,0 +1,124 @@
#include "vp_ba_crossline_node.h"
namespace vp_nodes {
vp_ba_crossline_node::vp_ba_crossline_node(std::string node_name,
std::map<int, vp_objects::vp_line> lines,
bool need_record_image,
bool need_record_video):
vp_node(node_name), all_lines(lines), need_record_image(need_record_image), need_record_video(need_record_video) {
VP_INFO(vp_utils::string_format("[%s] %s", node_name.c_str(), to_string().c_str()));
this->initialized();
}
vp_ba_crossline_node::~vp_ba_crossline_node() {
deinitialized();
}
std::string vp_ba_crossline_node::to_string() {
/*
* return 2 points of all lines
* [channel 0: x1,y1 x2,y2][channel 1: x1,y1 x2,y2]...
*/
std::stringstream ss;
for(auto& p: all_lines) {
ss << "[channel" << p.first << ": " << p.second.start.x << "," << p.second.start.y << " " << p.second.end.x << "," << p.second.end.y << "]";
}
return ss.str();
}
std::shared_ptr<vp_objects::vp_meta> vp_ba_crossline_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// if need applied on current channel or not
if (all_lines.count(meta->channel_index) == 0) {
return meta;
}
// for current channel
auto& total_crossline = all_total_crossline[meta->channel_index];
auto& line = all_lines[meta->channel_index];
// for vp_frame_target only
for (auto& target : meta->targets) {
auto len = target->tracks.size();
std::vector<int> involve_targets;
if (len > 1 && target->track_id >= 0) {
// check the last 2 points in tracks
auto p1 = target->tracks[len - 1].track_point();
auto p2 = target->tracks[len - 2].track_point();
auto check1 = at_1_side_of_line(p1, line);
auto check2 = at_1_side_of_line(p2, line);
// `true and false` or `false and true`
// means target passed the line in current frame
if (check1 ^ check2) {
total_crossline++;
involve_targets.push_back(target->track_id);
// not empty need fill ba result back to frame meta
if (involve_targets.size() > 0) {
// send record image and record video signal, recording actions would occur if record nodes exist in pipeline
std::string image_file_name_without_ext = ""; // empty means no recording image
std::string video_file_name_without_ext = ""; // empty means no recording video
// send image record control meta
if (need_record_image) {
image_file_name_without_ext = vp_utils::time_format(NOW, "crossline_image__<year><mon><day><hour><min><sec><mili>");
auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
pendding_meta(image_record_control_meta);
}
// send video record control meta
if (need_record_video) {
video_file_name_without_ext = vp_utils::time_format(NOW, "crossline_video__<year><mon><day><hour><min><sec><mili>");
auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
pendding_meta(video_record_control_meta);
}
std::vector<vp_objects::vp_point> involve_region {line.start, line.end};
auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::CROSSLINE,
meta->channel_index,
meta->frame_index,
involve_targets,
involve_region,
"cross line", // meaningful label
image_file_name_without_ext,
video_file_name_without_ext);
// fill back to frame meta
meta->ba_results.push_back(ba_result);
// info log
VP_INFO(vp_utils::string_format("[%s] [channel %d] has found target cross line, total number of crossline: [%d]", node_name.c_str(), meta->channel_index, total_crossline));
if (need_record_image || need_record_video) {
VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
}
}
}
}
}
return meta;
}
bool vp_ba_crossline_node::at_1_side_of_line(vp_objects::vp_point p, vp_objects::vp_line line) {
auto p1 = line.start;
auto p2 = line.end;
if (p1.x == p2.x) {
return p.x < p1.x;
}
if (p1.y == p2.y) {
return p.y < p1.y;
}
if (p2.x < p1.x) {
auto tmp = p2;
p2 = p1;
p1 = tmp;
}
int ret = (p2.y - p.y) * (p2.x - p1.x) - (p2.y - p1.y) * (p2.x - p.x);
return ret < 0;
}
}

View File

@@ -0,0 +1,38 @@
#pragma once
#include <map>
#include "../vp_node.h"
#include "../../objects/shapes/vp_point.h"
#include "../../objects/shapes/vp_line.h"
#include "../../objects/vp_image_record_control_meta.h"
#include "../../objects/vp_video_record_control_meta.h"
namespace vp_nodes {
// behaviour analysis node for crossline (support multi channels)
class vp_ba_crossline_node: public vp_node
{
private:
// channel -> counter
std::map<int, int> all_total_crossline;
// channel -> line, 1 channel supports only 1 line at most (can be 0, which means no crossline check on this channel)
std::map<int, vp_objects::vp_line> all_lines;
// record params
bool need_record_image;
bool need_record_video;
// check if point at one side of line
bool at_1_side_of_line(vp_objects::vp_point p, vp_objects::vp_line line);
protected:
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
public:
vp_ba_crossline_node(std::string node_name,
std::map<int, vp_objects::vp_line> lines,
bool need_record_image = true,
bool need_record_video = false);
~vp_ba_crossline_node();
std::string to_string() override;
};
}

180
nodes/ba/vp_ba_jam_node.cpp Normal file
View File

@@ -0,0 +1,180 @@
#include "vp_ba_jam_node.h"
namespace vp_nodes {
vp_ba_jam_node::vp_ba_jam_node(std::string node_name,
std::map<int, std::vector<vp_objects::vp_point>> jam_regions,
bool need_record_image,
bool need_record_video):
vp_node(node_name), all_jam_regions(jam_regions), need_record_image(need_record_image), need_record_video(need_record_video) {
VP_INFO(vp_utils::string_format("[%s] %s", node_name.c_str(), to_string().c_str()));
this->initialized();
}
vp_ba_jam_node::~vp_ba_jam_node() {
deinitialized();
}
std::string vp_ba_jam_node::to_string() {
/*
* return vertexs of all jam regions
* [channel0: x1,y1 x2,y2 ...][channel1: x1,y1 x2,y2 ...]...
*/
std::stringstream ss;
for(auto& r: all_jam_regions) {
ss << "[" << r.first << ":";
for(auto& p: r.second) {
ss << " " << p.x << "," << p.y;
}
ss << "]";
}
return ss.str();
}
bool vp_ba_jam_node::point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region) {
int i, j, c = 0;
int nvert = region.size();
for (i = 0, j = nvert-1; i < nvert; j = i++) {
if (((region[i].y > p.y) != (region[j].y > p.y)) &&
(p.x < (region[j].x - region[i].x) * (p.y - region[i].y)
/ (region[j].y - region[i].y) + region[i].x))
c = !c;
}
return c;
}
std::shared_ptr<vp_objects::vp_meta> vp_ba_jam_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// if need applied on current channel or not
if (all_jam_regions.count(meta->channel_index) == 0) {
return meta;
}
// for current channel
auto& jam_region = all_jam_regions[meta->channel_index];
auto& stop_checking_status = all_stop_checking_status[meta->channel_index];
auto& jam_result = all_jam_results[meta->channel_index];
auto& last_notify = all_last_notifys[meta->channel_index];
// for vp_frame_target only
std::vector<int> hit_traget_ids;
for (auto& target : meta->targets) {
auto len = target->tracks.size();
auto loc = target->get_rect().track_point();
// target has been tracked AND tracked enough frames
if (len < check_interval_frames || target->track_id < 0) {
continue;
}
// if target inside of stop region or not
if (!point_in_poly(loc, jam_region)) {
continue;
}
auto pre_loc = target->tracks[len - check_interval_frames].track_point();
if (pre_loc.distance_with(loc) <= check_max_distance) {
stop_checking_status[target->track_id]++;
hit_traget_ids.push_back(target->track_id);
}
}
// count for stop targets
auto stops = 0;
std::vector<int> involve_targets;
for (auto i = stop_checking_status.begin(); i != stop_checking_status.end();) {
if (std::find(hit_traget_ids.begin(), hit_traget_ids.end(), i->first) == hit_traget_ids.end()) {
// remove since it not satisfy stop condition
i = stop_checking_status.erase(i);
continue;
}
if (i->second >= check_min_hit_frames) {
involve_targets.push_back(i->first);
stops++;
}
i++;
}
// enter jam status
if (stops >= check_min_stops && !jam_result && (meta->frame_index - last_notify) > (check_notify_interval * meta->fps)) {
jam_result = true;
last_notify = meta->frame_index;
// send record image and record video signal, recording actions would occur if record nodes exist in pipeline
std::string image_file_name_without_ext = ""; // empty means no recording image
std::string video_file_name_without_ext = ""; // empty means no recording video
// send image record control meta
if (need_record_image) {
image_file_name_without_ext = vp_utils::time_format(NOW, "jam_image__<year><mon><day><hour><min><sec><mili>");
auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
pendding_meta(image_record_control_meta);
}
// send video record control meta
if (need_record_video) {
video_file_name_without_ext = vp_utils::time_format(NOW, "jam_video__<year><mon><day><hour><min><sec><mili>");
auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
pendding_meta(video_record_control_meta);
}
std::vector<vp_objects::vp_point> involve_region = jam_region;
auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::JAM,
meta->channel_index,
meta->frame_index,
involve_targets,
involve_region,
"jam", // meaningful label
image_file_name_without_ext,
video_file_name_without_ext);
// fill back to frame meta
meta->ba_results.push_back(ba_result);
// info log
VP_INFO(vp_utils::string_format("[%s] [channel %d] has found jam", node_name.c_str(), meta->channel_index));
if (need_record_image || need_record_video) {
VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
}
}
// leave jam status
if (stops < check_min_stops && jam_result) {
jam_result = false;
last_notify = meta->frame_index;
// send record image and record video signal, recording actions would occur if record nodes exist in pipeline
std::string image_file_name_without_ext = ""; // empty means no recording image
std::string video_file_name_without_ext = ""; // empty means no recording video
// send image record control meta
if (need_record_image) {
image_file_name_without_ext = vp_utils::time_format(NOW, "unjam_image__<year><mon><day><hour><min><sec><mili>");
auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
pendding_meta(image_record_control_meta);
}
// send video record control meta
if (need_record_video) {
video_file_name_without_ext = vp_utils::time_format(NOW, "unjam_video__<year><mon><day><hour><min><sec><mili>");
auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
pendding_meta(video_record_control_meta);
}
std::vector<vp_objects::vp_point> involve_region = jam_region;
auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::UNJAM,
meta->channel_index,
meta->frame_index,
involve_targets,
involve_region,
"unjam", // meaningful label
image_file_name_without_ext,
video_file_name_without_ext);
// fill back to frame meta
meta->ba_results.push_back(ba_result);
// info log
VP_INFO(vp_utils::string_format("[%s] [channel %d] has found unjam", node_name.c_str(), meta->channel_index));
if (need_record_image || need_record_video) {
VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
}
}
return meta;
}
}

50
nodes/ba/vp_ba_jam_node.h Normal file
View File

@@ -0,0 +1,50 @@
#pragma once
#include <map>
#include "../vp_node.h"
#include "../../objects/shapes/vp_point.h"
#include "../../objects/shapes/vp_line.h"
#include "../../objects/vp_image_record_control_meta.h"
#include "../../objects/vp_video_record_control_meta.h"
namespace vp_nodes {
// behaviour analysis node for stop (support multi channels)
class vp_ba_jam_node: public vp_node
{
private:
// channel -> vertexs of region, 1 channel supports only 1 region at most (can be 0, which means no jam check on this channel)
std::map<int, std::vector<vp_objects::vp_point>> all_jam_regions;
// channel -> status of targets (id -> num of hit frames)
std::map<int, std::map<int, int>> all_stop_checking_status;
// channel -> jam status of channel (jam or not)
std::map<int, bool> all_jam_results;
// channel -> last frame index to notify jam
std::map<int, int> all_last_notifys;
// record params
bool need_record_image;
bool need_record_video;
// check if point inside of polygon
bool point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region);
// jam checking logic parameters which may be configed by constructor passed in by user
const int check_interval_frames = 20;
const int check_min_hit_frames = 25 * 2; // 25 fps * 2 seconds
const int check_max_distance = 8;
const int check_min_stops = 8;
const int check_notify_interval = 10; // interval time (seconds) to notify (ensure not frequently)
protected:
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
public:
vp_ba_jam_node(std::string node_name,
std::map<int, std::vector<vp_objects::vp_point>> jam_regions,
bool need_record_image = true,
bool need_record_video = true);
~vp_ba_jam_node();
std::string to_string() override;
};
}

View File

@@ -0,0 +1,172 @@
#include "vp_ba_stop_node.h"
namespace vp_nodes {
vp_ba_stop_node::vp_ba_stop_node(std::string node_name,
std::map<int, std::vector<vp_objects::vp_point>> stop_regions,
bool need_record_image,
bool need_record_video):
vp_node(node_name), all_stop_regions(stop_regions), need_record_image(need_record_image), need_record_video(need_record_video) {
VP_INFO(vp_utils::string_format("[%s] %s", node_name.c_str(), to_string().c_str()));
this->initialized();
}
vp_ba_stop_node::~vp_ba_stop_node() {
deinitialized();
}
std::string vp_ba_stop_node::to_string() {
/*
* return vertexs of all stop regions
* [channel0: x1,y1 x2,y2 ...][channel1: x1,y1 x2,y2 ...]...
*/
std::stringstream ss;
for(auto& r: all_stop_regions) {
ss << "[channel" << r.first << ":";
for(auto& p: r.second) {
ss << " " << p.x << "," << p.y;
}
ss << "]";
}
return ss.str();
}
bool vp_ba_stop_node::point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region) {
int i, j, c = 0;
int nvert = region.size();
for (i = 0, j = nvert-1; i < nvert; j = i++) {
if (((region[i].y > p.y) != (region[j].y > p.y)) &&
(p.x < (region[j].x - region[i].x) * (p.y - region[i].y)
/ (region[j].y - region[i].y) + region[i].x))
c = !c;
}
return c;
}
std::shared_ptr<vp_objects::vp_meta> vp_ba_stop_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// if need applied on current channel or not
if (all_stop_regions.count(meta->channel_index) == 0) {
return meta;
}
// for current channel
auto& stop_region = all_stop_regions[meta->channel_index];
auto& stop_checking_status = all_stop_checking_status[meta->channel_index];
// for vp_frame_target only
std::vector<int> hit_traget_ids;
for (auto& target : meta->targets) {
auto len = target->tracks.size();
auto loc = target->get_rect().track_point();
// target has been tracked AND tracked enough frames
if (len < check_interval_frames || target->track_id < 0) {
continue;
}
// if target inside of stop region or not
if (!point_in_poly(loc, stop_region)) {
continue;
}
auto pre_loc = target->tracks[len - check_interval_frames].track_point();
if (pre_loc.distance_with(loc) <= check_max_distance) {
stop_checking_status[target->track_id]++;
hit_traget_ids.push_back(target->track_id);
}
}
for (auto i = stop_checking_status.begin(); i != stop_checking_status.end();) {
if (std::find(hit_traget_ids.begin(), hit_traget_ids.end(), i->first) == hit_traget_ids.end()) {
// statisfy unstop condition
if (i->second >= check_min_hit_frames) {
std::vector<int> involve_targets;
involve_targets.push_back(i->first);
// send record image and record video signal, recording actions would occur if record nodes exist in pipeline
std::string image_file_name_without_ext = ""; // empty means no recording image
std::string video_file_name_without_ext = ""; // empty means no recording video
// send image record control meta
if (need_record_image) {
image_file_name_without_ext = vp_utils::time_format(NOW, "unstop_image__<year><mon><day><hour><min><sec><mili>");
auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
pendding_meta(image_record_control_meta);
}
// send video record control meta
if (need_record_video) {
video_file_name_without_ext = vp_utils::time_format(NOW, "unstop_video__<year><mon><day><hour><min><sec><mili>");
auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
pendding_meta(video_record_control_meta);
}
std::vector<vp_objects::vp_point> involve_region = stop_region;
auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::UNSTOP,
meta->channel_index,
meta->frame_index,
involve_targets,
involve_region,
"unstop", // meaningful label
image_file_name_without_ext,
video_file_name_without_ext);
// fill back to frame meta
meta->ba_results.push_back(ba_result);
// info log
VP_INFO(vp_utils::string_format("[%s] [channel %d] has found target unstop", node_name.c_str(), meta->channel_index));
if (need_record_image || need_record_video) {
VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
}
}
// remove since it not satisfy stop condition
i = stop_checking_status.erase(i);
continue;
}
// equal means first time to satisfy stop condition
if (i->second == check_min_hit_frames) {
std::vector<int> involve_targets;
involve_targets.push_back(i->first);
// send record image and record video signal, recording actions would occur if record nodes exist in pipeline
std::string image_file_name_without_ext = ""; // empty means no recording image
std::string video_file_name_without_ext = ""; // empty means no recording video
// send image record control meta
if (need_record_image) {
image_file_name_without_ext = vp_utils::time_format(NOW, "stop_image__<year><mon><day><hour><min><sec><mili>");
auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
pendding_meta(image_record_control_meta);
}
// send video record control meta
if (need_record_video) {
video_file_name_without_ext = vp_utils::time_format(NOW, "stop_video__<year><mon><day><hour><min><sec><mili>");
auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
pendding_meta(video_record_control_meta);
}
std::vector<vp_objects::vp_point> involve_region = stop_region;
auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::STOP,
meta->channel_index,
meta->frame_index,
involve_targets,
involve_region,
"stop", // meaningful label
image_file_name_without_ext,
video_file_name_without_ext);
// fill back to frame meta
meta->ba_results.push_back(ba_result);
// info log
VP_INFO(vp_utils::string_format("[%s] [channel %d] has found target stop", node_name.c_str(), meta->channel_index));
if (need_record_image || need_record_video) {
VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
}
}
i++;
}
return meta;
}
}

View File

@@ -0,0 +1,42 @@
#pragma once
#include <map>
#include "../vp_node.h"
#include "../../objects/shapes/vp_point.h"
#include "../../objects/shapes/vp_line.h"
#include "../../objects/vp_image_record_control_meta.h"
#include "../../objects/vp_video_record_control_meta.h"
namespace vp_nodes {
// behaviour analysis node for stop (support multi channels)
class vp_ba_stop_node: public vp_node
{
private:
// channel -> vertexs of region, 1 channel supports only 1 region at most (can be 0, which means no stop check on this channel)
std::map<int, std::vector<vp_objects::vp_point>> all_stop_regions;
// channel -> status of targets (id -> num of hit frames)
std::map<int, std::map<int, int>> all_stop_checking_status;
// record params
bool need_record_image;
bool need_record_video;
// check if point inside of polygon
bool point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region);
// stop checking logic parameters which may be configed by constructor passed in by user
const int check_interval_frames = 20;
const int check_min_hit_frames = 25 * 2; // 25 fps * 2 seconds
const int check_max_distance = 5;
protected:
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
public:
vp_ba_stop_node(std::string node_name,
std::map<int, std::vector<vp_objects::vp_point>> stop_regions,
bool need_record_image = true,
bool need_record_video = true);
~vp_ba_stop_node();
std::string to_string() override;
};
}

View File

@@ -0,0 +1,100 @@
#pragma once
/*
* define EXTERNAL archive functions for objects which need to be serialized by cereal library in VideoPipe.
* refer to `https://uscilab.github.io/cereal/serialization_functions.html` for more details.
*/
// object types
#include "../../../objects/vp_frame_target.h"
#include "../../../objects/vp_frame_face_target.h"
#include "../../../objects/vp_frame_text_target.h"
#include "../../../objects/vp_frame_pose_target.h"
#include "../../../objects/vp_sub_target.h"
/* extend for more types of objects in VideoPipe. */
// headers from cereal
#include "../../../third_party/cereal/cereal.hpp"
#include "../../../third_party/cereal/types/vector.hpp"
#include "../../../third_party/cereal/types/memory.hpp"
#include "../../../third_party/cereal/types/string.hpp"
#include "../../../third_party/cereal/types/utility.hpp"
#include "../../../third_party/cereal/archives/json.hpp"
#include "../../../third_party/cereal/archives/xml.hpp"
/* same namespace as object types */
namespace vp_objects {
/* vp_frame_target */
template<typename Archive>
void serialize(Archive& archive, vp_frame_target& target) {
// define the form of structured data for vp_frame_target
archive(cereal::make_nvp("x", target.x),
cereal::make_nvp("y", target.y),
cereal::make_nvp("width", target.width),
cereal::make_nvp("height", target.height),
cereal::make_nvp("primary_class_id", target.primary_class_id),
cereal::make_nvp("primary_score", target.primary_score),
cereal::make_nvp("primary_label", target.primary_label),
cereal::make_nvp("channel_index", target.channel_index),
cereal::make_nvp("frame_index", target.frame_index),
cereal::make_nvp("track_id", target.track_id),
cereal::make_nvp("secondary_class_ids", target.secondary_class_ids),
cereal::make_nvp("secondary_scores", target.secondary_scores),
cereal::make_nvp("secondary_labels", target.secondary_labels),
cereal::make_nvp("sub_targets", target.sub_targets),
cereal::make_nvp("embeddings", target.embeddings));
}
template<typename Archive>
void serialize(Archive& archive, vp_sub_target& target) {
// define the form of structured data for vp_sub_target
archive(cereal::make_nvp("x", target.x),
cereal::make_nvp("y", target.y),
cereal::make_nvp("width", target.width),
cereal::make_nvp("height", target.height),
cereal::make_nvp("class_id", target.class_id),
cereal::make_nvp("score", target.score),
cereal::make_nvp("label", target.label),
cereal::make_nvp("frame_index", target.frame_index),
cereal::make_nvp("channel_index", target.channel_index),
cereal::make_nvp("attachments", target.attachments));
}
/* END OF vp_frame_target */
/* vp_frame_face_target */
template<typename Archive>
void serialize(Archive& archive, vp_frame_face_target& target) {
// define the form of structured data for vp_frame_face_target
archive(cereal::make_nvp("x", target.x),
cereal::make_nvp("y", target.y),
cereal::make_nvp("width", target.width),
cereal::make_nvp("height", target.height),
cereal::make_nvp("score", target.score),
cereal::make_nvp("embeddings", target.embeddings),
cereal::make_nvp("key_points", target.key_points),
cereal::make_nvp("track_id", target.track_id));
}
/* END OF vp_frame_face_target */
/* vp_frame_text_target */
template<typename Archive>
void serialize(Archive& archive, vp_frame_text_target& target) {
// define the form of structured data for vp_frame_text_target
archive(cereal::make_nvp("text", target.text),
cereal::make_nvp("score", target.score),
cereal::make_nvp("region", target.region_vertexes),
cereal::make_nvp("flags", target.flags));
}
/* END OF vp_frame_text_target */
/* vp_frame_pose_target */
template<typename Archive>
void serialize(Archive& archive, vp_frame_pose_target& target) {
// define the form of structured data for vp_frame_pose_target
}
/* END OF vp_frame_pose_target */
}

View File

@@ -0,0 +1,167 @@
#ifdef VP_WITH_KAFKA
#include "KafkaProducer.h"
// callbacks
class ProducerDeliveryReportCb : public RdKafka::DeliveryReportCb {
public:
void dr_cb(RdKafka::Message &message) {
if (message.err())
std::cerr << "Message delivery failed: " << message.errstr() << std::endl;
else {
// Message delivered to topic test [0] at offset 135000
/*
std::cerr << "Message delivered to topic " << message.topic_name()
<< " [" << message.partition() << "] at offset "
<< message.offset() << std::endl;*/
}
}
};
class ProducerEventCb : public RdKafka::EventCb {
public:
void event_cb(RdKafka::Event &event) {
switch (event.type()) {
case RdKafka::Event::EVENT_ERROR:
std::cout << "RdKafka::Event::EVENT_ERROR: " << RdKafka::err2str(event.err()) << std::endl;
break;
case RdKafka::Event::EVENT_STATS:
//std::cout << "RdKafka::Event::EVENT_STATS: " << event.str() << std::endl;
break;
case RdKafka::Event::EVENT_LOG:
//std::cout << "RdKafka::Event::EVENT_LOG " << event.fac() << std::endl;
break;
case RdKafka::Event::EVENT_THROTTLE:
//std::cout << "RdKafka::Event::EVENT_THROTTLE " << event.broker_name() << std::endl;
break;
}
}
};
class HashPartitionerCb : public RdKafka::PartitionerCb {
public:
int32_t partitioner_cb(const RdKafka::Topic *topic, const std::string *key,
int32_t partition_cnt, void *msg_opaque) {
char msg[128] = { 0 };
int32_t partition_id = generate_hash(key->c_str(), key->size()) % partition_cnt;
// [topic][key][partition_cnt][partition_id]
// :[test][6419][2][1]
/*sprintf(msg, "HashPartitionerCb:topic:[%s], key:[%s]partition_cnt:[%d], partition_id:[%d]", topic->name().c_str(),
key->c_str(), partition_cnt, partition_id);
std::cout << msg << std::endl;*/
return partition_id;
}
private:
static inline unsigned int generate_hash(const char *str, size_t len) {
unsigned int hash = 5381;
for (size_t i = 0; i < len; i++)
hash = ((hash << 5) + hash) + str[i];
return hash;
}
};
KafkaProducer::KafkaProducer(const std::string& brokers, const std::string& topic, int partition) {
m_brokers = brokers;
m_topicStr = topic;
m_partition = partition;
m_config = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
if(m_config==NULL)
std::cout << "Create RdKafka Conf failed." << std::endl;
m_topicConfig = RdKafka::Conf::create(RdKafka::Conf::CONF_TOPIC);
if (m_topicConfig == NULL)
std::cout << "Create RdKafka Topic Conf failed." << std::endl;
RdKafka::Conf::ConfResult errCode;
std::string errorStr;
m_dr_cb = new ProducerDeliveryReportCb;
errCode = m_config->set("dr_cb", m_dr_cb, errorStr);
if (errCode != RdKafka::Conf::CONF_OK)
{
std::cout << "Conf set failed:" << errorStr << std::endl;
}
m_event_cb = new ProducerEventCb;
errCode = m_config->set("event_cb", m_event_cb, errorStr);
if (errCode != RdKafka::Conf::CONF_OK)
{
std::cout << "Conf set failed:" << errorStr << std::endl;
}
m_partitioner_cb = new HashPartitionerCb;
errCode = m_topicConfig->set("partitioner_cb", m_partitioner_cb, errorStr);
if (errCode != RdKafka::Conf::CONF_OK)
{
std::cout << "Conf set failed:" << errorStr << std::endl;
}
errCode = m_config->set("statistics.interval.ms", "10000", errorStr);
if (errCode != RdKafka::Conf::CONF_OK)
{
std::cout << "Conf set failed:" << errorStr << std::endl;
}
errCode = m_config->set("message.max.bytes", "10240000", errorStr);
if (errCode != RdKafka::Conf::CONF_OK)
{
std::cout << "Conf set failed:" << errorStr << std::endl;
}
errCode = m_config->set("bootstrap.servers", m_brokers, errorStr);
if (errCode != RdKafka::Conf::CONF_OK)
{
std::cout << "Conf set failed:" << errorStr << std::endl;
}
m_producer = RdKafka::Producer::create(m_config, errorStr);
if (m_producer == NULL)
{
std::cout << "Create Producer failed:" << errorStr << std::endl;
}
m_topic = RdKafka::Topic::create(m_producer, m_topicStr, m_topicConfig, errorStr);
if (m_topic == NULL)
{
std::cout << "Create Topic failed:" << errorStr << std::endl;
}
}
KafkaProducer::~KafkaProducer() {
while (m_producer->outq_len() > 0) {
std::cerr << "Waiting for " << m_producer->outq_len() << std::endl;
m_producer->flush(5000);
}
delete m_config;
delete m_topicConfig;
delete m_topic;
delete m_producer;
delete m_dr_cb;
delete m_event_cb;
delete m_partitioner_cb;
}
void KafkaProducer::pushMessage(const std::string& str) {
int32_t len = str.length();
void* payload = const_cast<void*>(static_cast<const void*>(str.data()));
RdKafka::ErrorCode errorCode = m_producer->produce(
m_topic,
RdKafka::Topic::PARTITION_UA,
RdKafka::Producer::RK_MSG_COPY,
payload,
len,
NULL,
NULL);
m_producer->poll(0);
if (errorCode != RdKafka::ERR_NO_ERROR) {
std::cerr << "Produce failed: " << RdKafka::err2str(errorCode) << std::endl;
if (errorCode == RdKafka::ERR__QUEUE_FULL) {
m_producer->poll(100);
}
}
}
#endif

View File

@@ -0,0 +1,35 @@
#pragma once
#ifdef VP_WITH_KAFKA
#include <string>
#include <iostream>
// compile tips:
// run `apt-get install librdkafka-dev`, v0.11.3 for ubuntu 18.04 by default.
#include <librdkafka/rdkafkacpp.h>
// wrapper class for Producer in librdkafka
class KafkaProducer
{
public:
explicit KafkaProducer(const std::string& brokers, const std::string& topic, int partition);
void pushMessage(const std::string& str);
~KafkaProducer();
private:
std::string m_brokers;
std::string m_topicStr;
int m_partition;
RdKafka::Conf* m_config;
RdKafka::Conf* m_topicConfig;
RdKafka::Topic* m_topic;
RdKafka::Producer* m_producer;
RdKafka::DeliveryReportCb* m_dr_cb;
RdKafka::EventCb* m_event_cb;
RdKafka::PartitionerCb* m_partitioner_cb;
};
#endif

View File

@@ -0,0 +1,130 @@
#include "vp_ba_socket_broker_node.h"
namespace vp_nodes {
vp_ba_socket_broker_node::vp_ba_socket_broker_node(std::string node_name,
std::string des_ip,
int des_port,
vp_broke_for broke_for,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
des_ip(des_ip),
des_port(des_port) {
// only for vp_frame_target since BA logic ONLY works on vp_frame_target
assert(broke_for == vp_broke_for::NORMAL);
udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
this->initialized();
}
vp_ba_socket_broker_node::~vp_ba_socket_broker_node() {
deinitialized();
stop_broking();
}
void vp_ba_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
/* format:
line 0, <--
line 1, time
line 2, channel_index, frame_index
line 3, ba_type, ba_label
line 4, ids of involve targets (target_id_1, target_id_2,...)
line 5, vertexs of involve region (x1,y1,x2,y2,...)
line 6, property labels split by '|' and split by ',' between different involve targets
line 7, record image path
line 8, record video path
line 9, -->
line 10, <--
line 11, time
line 12, channel_index, frame_index
line 13, ba_type, ba_label
line 14, ids of involve targets (target_id_1, target_id_2,...)
line 15, vertexs of involve region (x1,y1,x2,y2,...)
line 16, property labels split by '|' and split by ',' between different involve targets
line 17, record image path
linr 18, record video path
line 19, -->
line 20, ...
*/
std::stringstream msg_stream;
auto format_basic_info = [&](int channel_index, int frame_index) {
msg_stream << vp_utils::time_format(NOW) << std::endl; // line1
msg_stream << channel_index << "," << frame_index << std::endl; // line2
};
auto format_ba_info = [&](vp_objects::vp_ba_type ba_type,
std::string ba_label,
const std::vector<int>& involve_target_ids,
const std::vector<vp_objects::vp_point>& involve_region_vertexs) {
msg_stream << int(ba_type) << "," << ba_label << std::endl; // line3
for (int i = 0; i < involve_target_ids.size(); i++) {
msg_stream << involve_target_ids[i]; // line 4
if (i != involve_target_ids.size() - 1) {
msg_stream << ",";
}
}
msg_stream << std::endl;
for (int i = 0; i < involve_region_vertexs.size(); i++) {
msg_stream << involve_region_vertexs[i].x << "," << involve_region_vertexs[i].y; // line 5
if (i != involve_region_vertexs.size() - 1) {
msg_stream << ",";
}
}
msg_stream << std::endl;
auto targets = meta->get_targets_by_ids(involve_target_ids);
for (int i = 0; i < targets.size(); i++) {
auto t = targets[i];
for (int j = 0; j < t->secondary_labels.size(); j++) {
msg_stream << t->secondary_labels[j]; // line 6
if (j != t->secondary_labels.size() - 1) {
msg_stream << "|";
}
}
if (i != targets.size() - 1) {
msg_stream << ",";
}
}
msg_stream << std::endl;
};
auto format_record_info = [&](const std::string& record_image_name, const std::string& record_video_name) {
msg_stream << record_image_name << std::endl; // line7
msg_stream << record_video_name << std::endl; // line8
};
if (broke_for == vp_broke_for::NORMAL) {
for (int i = 0; i < meta->ba_results.size(); i++) {
auto& ba = meta->ba_results[i];
// start flag
msg_stream << "<--" << std::endl;
// basic info
format_basic_info(meta->channel_index, meta->frame_index);
//ba info
format_ba_info(ba->type, ba->ba_label, ba->involve_target_ids_in_frame, ba->involve_region_in_frame);
// record info
format_record_info(ba->record_image_name, ba->record_video_name);
// end flag
msg_stream << "-->";
if (i != meta->ba_results.size() - 1) {
msg_stream << std::endl; // not the last one
}
}
}
msg = msg_stream.str();
}
void vp_ba_socket_broker_node::broke_msg(const std::string& msg) {
// broke msg to socket by udp
auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
auto bytes_2_send_len = msg.size();
udp_writer.send(bytes_2_send, bytes_2_send_len);
}
}

View File

@@ -0,0 +1,37 @@
#pragma once
#include "vp_msg_broker_node.h"
#include "../../objects/ba/vp_ba_result.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
// light weight socket support
#include "../../third_party/kissnet/kissnet.hpp"
namespace vp_nodes {
// message broker node, broke BA results (ONLY for vp_frame_target) to socket via udp.
// BA results could be used for archive.
class vp_ba_socket_broker_node: public vp_msg_broker_node
{
private:
// host the data sent to via udp
std::string des_ip = "";
// port the data sent to via udp
int des_port = 0;
// udp socket writer
kissnet::udp_socket udp_writer;
protected:
// to custom format
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to socket via udp
virtual void broke_msg(const std::string& msg) override;
public:
vp_ba_socket_broker_node(std::string node_name,
std::string des_ip = "",
int des_port = 0,
vp_broke_for broke_for = vp_broke_for::NORMAL,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_ba_socket_broker_node();
};
}

View File

@@ -0,0 +1,137 @@
#include "vp_embeddings_properties_socket_broker_node.h"
namespace vp_nodes {
vp_embeddings_properties_socket_broker_node::vp_embeddings_properties_socket_broker_node(std::string node_name,
std::string des_ip,
int des_port,
std::string cropped_dir,
int min_crop_width,
int min_crop_height,
vp_broke_for broke_for,
bool only_for_tracked,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
des_ip(des_ip),
des_port(des_port),
cropped_dir(cropped_dir),
min_crop_width(min_crop_width),
min_crop_height(min_crop_height),
only_for_tracked(only_for_tracked) {
// only for vp_frame_target
assert(broke_for == vp_broke_for::NORMAL);
udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
this->initialized();
}
vp_embeddings_properties_socket_broker_node::~vp_embeddings_properties_socket_broker_node() {
deinitialized();
stop_broking();
}
void vp_embeddings_properties_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
/* format:
line 0, <--
line 1, 1st cropped image's path
line 2, 1st properties, multi property labels splited by ','
line 3, 1st sub target info, empty line if sub target detected (sub target may be vehicle plate)
line 4, 1st embeddings
line 5, -->
line 6, <--
line 7, 2nd cropped image's path
line 8, 2nd properties, multi property labels splited by ','
line 9, 2nd sub target info, empty line if sub target detected (sub target may be vehicle plate)
line 10, 2nd embeddings
line 11, -->
line 12, ...
*/
auto& broked = all_broked[meta->channel_index];
// remove 50 elements every 100 ids
if (broked.size() > 100) {
broked.erase(broked.begin(), broked.begin() + 50);
}
std::stringstream msg_stream;
auto format_embeddings = [&](const std::vector<float>& embeddings) {
for (int i = 0; i < embeddings.size(); i++) {
msg_stream << embeddings[i];
if (i != embeddings.size() - 1) {
msg_stream << ",";
}
}
msg_stream << std::endl;
};
auto format_sub_target = [&](const std::string& sub_target_info) {
msg_stream << sub_target_info << std::endl;
};
auto format_properties = [&](const std::vector<std::string>& secondary_labels) {
for (int i = 0; i < secondary_labels.size(); i++) {
msg_stream << secondary_labels[i];
if (i != secondary_labels.size() - 1) {
msg_stream << ",";
}
}
msg_stream << std::endl;
};
auto save_cropped_image = [&](cv::Mat& frame, cv::Rect rect, std::string name) {
auto cropped = frame(rect);
cv::imwrite(name, cropped);
msg_stream << name << std::endl;
};
if (broke_for == vp_broke_for::NORMAL) {
for (int i = 0; i < meta->targets.size(); i++) {
auto& t = meta->targets[i];
// only broke for tracked targets and have enough frames
if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
continue;
}
// only broke 1 time for specific track id if it has been tracked, or broke many times
if (t->track_id >= 0 &&
std::find(broked.begin(), broked.end(), t->track_id) != broked.end()) {
continue;
}
// size filter
if (t->width < min_crop_width || t->height < min_crop_height) {
continue;
}
if (t->track_id >= 0) {
broked.push_back(t->track_id);
}
auto name = cropped_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + std::to_string(t->track_id >= 0 ? t->track_id : i) + ".jpg";
// start flag
msg_stream << "<--" << std::endl;
// save small cropped image
save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), name);
// format properties
format_properties(t->secondary_labels);
// format sub target (vehicle plate here), just using the first sub target's label is enough.
format_sub_target(t->sub_targets.size() > 0 ? t->sub_targets[0]->label : "");
// format embeddings
format_embeddings(t->embeddings);
// end flag
msg_stream << "-->";
if (i != meta->targets.size() - 1) {
msg_stream << std::endl; // not the last one
}
}
}
msg = msg_stream.str();
}
void vp_embeddings_properties_socket_broker_node::broke_msg(const std::string& msg) {
// broke msg to socket by udp
auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
auto bytes_2_send_len = msg.size();
udp_writer.send(bytes_2_send, bytes_2_send_len);
}
}

View File

@@ -0,0 +1,55 @@
#pragma once
#include "vp_msg_broker_node.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
// light weight socket support
#include "../../third_party/kissnet/kissnet.hpp"
namespace vp_nodes {
// message broker node, broke embeddings, AND properties (ONLY for vp_frame_target) to socket via udp.
// embeddings and properties could be used for similiarity search and property search later.
class vp_embeddings_properties_socket_broker_node: public vp_msg_broker_node
{
private:
// save dir for cropped images, which would be used for embeddings similiarity search or property search later
std::string cropped_dir = "cropped_images";
// min width to crop (embedding/property will be ignored if target's width is smaller than this value)
int min_crop_width = 50;
// min height to crop (embedding/property will be ignored if target's height is smaller than this value)
int min_crop_height = 50;
// only broke for tracked targets (track_id is not -1)
bool only_for_tracked = false;
// min tracked frames if only_for_tracked is true
int min_tracked_frames = 25;
// host the data sent to via udp
std::string des_ip = "";
// port the data sent to via udp
int des_port = 0;
// udp socket writer
kissnet::udp_socket udp_writer;
// support multi-channel
std::map<int, std::vector<int>> all_broked; // channel -> target ids which have been broked
protected:
// to custom format
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to socket via udp
virtual void broke_msg(const std::string& msg) override;
public:
vp_embeddings_properties_socket_broker_node(std::string node_name,
std::string des_ip = "",
int des_port = 0,
std::string cropped_dir = "cropped_images",
int min_crop_width = 50,
int min_crop_height = 50,
vp_broke_for broke_for = vp_broke_for::NORMAL,
bool only_for_tracked = false,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_embeddings_properties_socket_broker_node();
};
}

View File

@@ -0,0 +1,156 @@
#include "vp_embeddings_socket_broker_node.h"
namespace vp_nodes {
vp_embeddings_socket_broker_node::vp_embeddings_socket_broker_node(std::string node_name,
std::string des_ip,
int des_port,
std::string cropped_dir,
int min_crop_width,
int min_crop_height,
vp_broke_for broke_for,
bool only_for_tracked,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
des_ip(des_ip),
des_port(des_port),
cropped_dir(cropped_dir),
min_crop_width(min_crop_width),
min_crop_height(min_crop_height),
only_for_tracked(only_for_tracked) {
// only for vp_frame_target or vp_frame_face_target
assert(broke_for == vp_broke_for::NORMAL || broke_for == vp_broke_for::FACE);
udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
this->initialized();
}
vp_embeddings_socket_broker_node::~vp_embeddings_socket_broker_node() {
deinitialized();
stop_broking();
}
void vp_embeddings_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
/* format:
line 0, <--
line 1, 1st cropped image's path
line 2, 1st embeddings
line 3, -->
line 4, <--
line 5, 2nd cropped image's path
line 6, 2nd embeddings
line 7, -->
line 8, ...
*/
auto& broked = all_broked[meta->channel_index];
// remove 50 elements every 100 ids
if (broked.size() > 100) {
broked.erase(broked.begin(), broked.begin() + 50);
}
std::stringstream msg_stream;
auto format_embeddings = [&](const std::vector<float>& embeddings) {
for (int i = 0; i < embeddings.size(); i++) {
msg_stream << embeddings[i];
if (i != embeddings.size() - 1) {
msg_stream << ",";
}
}
msg_stream << std::endl;
};
auto save_cropped_image = [&](cv::Mat& frame, cv::Rect rect, std::string name) {
auto cropped = frame(rect);
cv::imwrite(name, cropped);
msg_stream << name << std::endl;
};
if (broke_for == vp_broke_for::NORMAL) {
for (int i = 0; i < meta->targets.size(); i++) {
auto& t = meta->targets[i];
// only broke for tracked targets and have enough frames
if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
continue;
}
// only broke 1 time for specific track id if it has been tracked, or broke many times
if (t->track_id >= 0 &&
std::find(broked.begin(), broked.end(), t->track_id) != broked.end()) {
continue;
}
// size filter
if (t->width < min_crop_width || t->height < min_crop_height) {
continue;
}
if (t->track_id >= 0) {
broked.push_back(t->track_id);
}
auto name = cropped_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + std::to_string(t->track_id >= 0 ? t->track_id : i) + ".jpg";
// start flag
msg_stream << "<--" << std::endl;
// save small cropped image
save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), name);
// format embeddings
format_embeddings(t->embeddings);
// end flag
msg_stream << "-->";
if (i != meta->targets.size() - 1) {
msg_stream << std::endl; // not the last one
}
}
}
if (broke_for == vp_broke_for::FACE) {
for (int i = 0; i < meta->face_targets.size(); i++) {
auto& t = meta->face_targets[i];
// only broke for tracked targets and have enough frames
if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
continue;
}
// only broke 1 time for specific track id if it has been tracked, or broke many times
if (t->track_id >= 0 &&
std::find(broked.begin(), broked.end(), t->track_id) != broked.end()) {
continue;
}
// size filter
if (t->width < min_crop_width || t->height < min_crop_height) {
continue;
}
if (t->track_id >= 0) {
broked.push_back(t->track_id);
}
auto name = cropped_dir + "/" + std::to_string(meta->channel_index) + "_" + std::to_string(meta->frame_index) + "_" + std::to_string(t->track_id >= 0 ? t->track_id : i) + ".jpg";
// start flag
msg_stream << "<--" << std::endl;
// save small cropped image
save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), name);
// format embeddings
format_embeddings(t->embeddings);
// end flag
msg_stream << "-->";
if (i != meta->face_targets.size() - 1) {
msg_stream << std::endl; // not the last one
}
}
}
msg = msg_stream.str();
}
void vp_embeddings_socket_broker_node::broke_msg(const std::string& msg) {
// broke msg to socket by udp
auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
auto bytes_2_send_len = msg.size();
udp_writer.send(bytes_2_send, bytes_2_send_len);
}
}

View File

@@ -0,0 +1,55 @@
#pragma once
#include "vp_msg_broker_node.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
// light weight socket support
#include "../../third_party/kissnet/kissnet.hpp"
namespace vp_nodes {
// message broker node, broke ONLY embeddings (for vp_frame_target or vp_frame_face_target) to socket via udp.
// embeddings could be used for similiarity search later.
class vp_embeddings_socket_broker_node: public vp_msg_broker_node
{
private:
// save dir for cropped images, which would be used for embeddings similiarity search later
std::string cropped_dir = "cropped_images";
// min width to crop (embedding will be ignored if target's width is smaller than this value)
int min_crop_width = 50;
// min height to crop (embedding will be ignored if target's height is smaller than this value)
int min_crop_height = 50;
// only broke for tracked targets (track_id is not -1)
bool only_for_tracked = false;
// min tracked frames if only_for_tracked is true
int min_tracked_frames = 25;
// host the data sent to via udp
std::string des_ip = "";
// port the data sent to via udp
int des_port = 0;
// udp socket writer
kissnet::udp_socket udp_writer;
// support multi-channel
std::map<int, std::vector<int>> all_broked; // channel -> target ids which have been broked
protected:
// to custom format
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to socket via udp
virtual void broke_msg(const std::string& msg) override;
public:
vp_embeddings_socket_broker_node(std::string node_name,
std::string des_ip = "",
int des_port = 0,
std::string cropped_dir = "cropped_images",
int min_crop_width = 50,
int min_crop_height = 50,
vp_broke_for broke_for = vp_broke_for::NORMAL,
bool only_for_tracked = false,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_embeddings_socket_broker_node();
};
}

View File

@@ -0,0 +1,95 @@
#include "vp_expr_socket_broker_node.h"
namespace vp_nodes {
vp_expr_socket_broker_node::vp_expr_socket_broker_node(std::string node_name,
std::string des_ip,
int des_port,
std::string screenshot_dir,
vp_broke_for broke_for,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
des_ip(des_ip),
des_port(des_port),
screenshot_dir(screenshot_dir) {
// only for vp_frame_text_target
assert(broke_for == vp_broke_for::TEXT);
udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
this->initialized();
}
vp_expr_socket_broker_node::~vp_expr_socket_broker_node() {
deinitialized();
stop_broking();
}
void vp_expr_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
/* format:
line 0, <--
line 1, time
line 2, channel_index, frame_index
line 3, total num, num of yes, num of no, number of invalid
line 4, 1st expression string, 2nd expression string, 3rd expression string, ...
line 5, screenshot path
line 6, -->
*/
std::stringstream msg_stream;
auto format_basic_info = [&](int channel_index, int frame_index) {
msg_stream << vp_utils::time_format(NOW) << std::endl; // line1
msg_stream << channel_index << "," << frame_index << std::endl; // line2
};
auto format_expr_info = [&](const std::vector<std::shared_ptr<vp_objects::vp_frame_text_target>>& expr_targets) {
auto num_yes = 0, num_no = 0, num_invalid = 0;
auto expr_strs = std::string();
for (int i = 0; i < expr_targets.size(); i++) {
expr_strs += expr_targets[i]->text;
if (i != expr_targets.size() - 1) {
expr_strs += ",";
}
if (expr_targets[i]->flags.find("yes") != std::string::npos) {
num_yes++;
}
if (expr_targets[i]->flags.find("no") != std::string::npos) {
num_no++;
}
if (expr_targets[i]->flags.find("invalid") != std::string::npos) {
num_invalid++;
}
}
msg_stream << expr_targets.size() << "," << num_yes << "," << num_no << "," << num_invalid << std::endl; // line3
msg_stream << expr_strs << std::endl; // line4
};
auto format_screenshot = [&](cv::Mat& screenshot, const std::string& name) {
cv::imwrite(name, screenshot);
msg_stream << name << std::endl; // line5
};
// at most 1 record for each frame
if (broke_for == vp_broke_for::TEXT) {
// start flag
msg_stream << "<--" << std::endl;
format_basic_info(meta->channel_index, meta->frame_index);
format_expr_info(meta->text_targets);
auto screenshot_name = screenshot_dir + "/" + std::to_string(meta->channel_index) + "_" + std::to_string(meta->frame_index) + ".jpg";
format_screenshot(meta->osd_frame.empty() ? meta->frame : meta->osd_frame, screenshot_name);
// end flag
msg_stream << "-->" << std::endl;
}
msg = msg_stream.str();
}
void vp_expr_socket_broker_node::broke_msg(const std::string& msg) {
// broke msg to socket by udp
auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
auto bytes_2_send_len = msg.size();
udp_writer.send(bytes_2_send, bytes_2_send_len);
}
}

View File

@@ -0,0 +1,40 @@
#pragma once
#include "vp_msg_broker_node.h"
#include "../../objects/ba/vp_ba_result.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
// light weight socket support
#include "../../third_party/kissnet/kissnet.hpp"
namespace vp_nodes {
// message broker node, broke math expression checking results (ONLY for vp_frame_text_target) to socket via udp.
// math expression checking results could be used for archive.
class vp_expr_socket_broker_node: public vp_msg_broker_node
{
private:
// save dir for screenshot images, which would be used for displaying in web client
std::string screenshot_dir = "screenshot_images";
// host the data sent to via udp
std::string des_ip = "";
// port the data sent to via udp
int des_port = 0;
// udp socket writer
kissnet::udp_socket udp_writer;
protected:
// to custom format
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to socket via udp
virtual void broke_msg(const std::string& msg) override;
public:
vp_expr_socket_broker_node(std::string node_name,
std::string des_ip = "",
int des_port = 0,
std::string screenshot_dir = "screenshot_images",
vp_broke_for broke_for = vp_broke_for::TEXT,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_expr_socket_broker_node();
};
}

View File

@@ -0,0 +1,58 @@
#include "vp_json_console_broker_node.h"
namespace vp_nodes {
vp_json_console_broker_node::vp_json_console_broker_node(std::string node_name,
vp_broke_for broke_for,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold) {
this->initialized();
}
vp_json_console_broker_node::~vp_json_console_broker_node() {
deinitialized();
stop_broking();
}
void vp_json_console_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
// serialize objects to json by cereal
std::stringstream msg_stream;
{
cereal::JSONOutputArchive json_archive(msg_stream);
// global values
json_archive(cereal::make_nvp("channel_index", meta->channel_index),
cereal::make_nvp("frame_index", meta->frame_index),
cereal::make_nvp("width", meta->frame.cols),
cereal::make_nvp("height", meta->frame.rows),
cereal::make_nvp("fps", meta->fps),
cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
// serialize values according to broke_for
if (broke_for == vp_broke_for::NORMAL) {
json_archive(cereal::make_nvp("target_size", meta->targets.size()),
cereal::make_nvp("targets", meta->targets));
}
else if (broke_for == vp_broke_for::FACE) {
json_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
cereal::make_nvp("face_targets", meta->face_targets));
}
else if (broke_for == vp_broke_for::TEXT) {
json_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
cereal::make_nvp("text_targets", meta->text_targets));
}
else {
throw "invalid broke_for!";
}
} // flush
msg = msg_stream.str();
}
void vp_json_console_broker_node::broke_msg(const std::string& msg) {
// broke msg to console by std::cout
std::cout << msg << std::endl;
}
}

View File

@@ -0,0 +1,26 @@
#pragma once
#include <sstream>
#include "vp_msg_broker_node.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
namespace vp_nodes {
// message broker node (for debug purpose), broke json data to console.
class vp_json_console_broker_node: public vp_msg_broker_node
{
private:
/* data */
protected:
// to json
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to console
virtual void broke_msg(const std::string& msg) override;
public:
vp_json_console_broker_node(std::string node_name,
vp_broke_for broke_for = vp_broke_for::NORMAL,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_json_console_broker_node();
};
}

View File

@@ -0,0 +1,65 @@
#ifdef VP_WITH_KAFKA
#include "vp_json_kafka_broker_node.h"
namespace vp_nodes {
vp_json_kafka_broker_node::vp_json_kafka_broker_node(std::string node_name,
std::string kafka_servers,
std::string topic_name,
vp_broke_for broke_for,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold) {
VP_INFO(vp_utils::string_format("[%s] kafka_servers:[%s] topic_name:[%s]", node_name.c_str(), kafka_servers.c_str(), topic_name.c_str()));
kafka_producer = std::make_shared<KafkaProducer>(kafka_servers, topic_name, 0);
this->initialized();
}
vp_json_kafka_broker_node::~vp_json_kafka_broker_node() {
deinitialized();
stop_broking();
}
void vp_json_kafka_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
// serialize objects to json by cereal
std::stringstream msg_stream;
{
cereal::JSONOutputArchive json_archive(msg_stream);
// global values
json_archive(cereal::make_nvp("channel_index", meta->channel_index),
cereal::make_nvp("frame_index", meta->frame_index),
cereal::make_nvp("width", meta->frame.cols),
cereal::make_nvp("height", meta->frame.rows),
cereal::make_nvp("fps", meta->fps),
cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
// serialize values according to broke_for
if (broke_for == vp_broke_for::NORMAL) {
json_archive(cereal::make_nvp("target_size", meta->targets.size()),
cereal::make_nvp("targets", meta->targets));
}
else if (broke_for == vp_broke_for::FACE) {
json_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
cereal::make_nvp("face_targets", meta->face_targets));
}
else if (broke_for == vp_broke_for::TEXT) {
json_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
cereal::make_nvp("text_targets", meta->text_targets));
}
else {
throw "invalid broke_for!";
}
} // flush
msg = msg_stream.str();
}
void vp_json_kafka_broker_node::broke_msg(const std::string& msg) {
// broke msg to kafka by kafka client api
if (kafka_producer != nullptr) {
kafka_producer->pushMessage(msg);
}
}
}
#endif

View File

@@ -0,0 +1,32 @@
#pragma once
#ifdef VP_WITH_KAFKA
#include <sstream>
#include "vp_msg_broker_node.h"
#include "kafka_utils/KafkaProducer.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
namespace vp_nodes {
// message broker node, broke json data to kafka.
class vp_json_kafka_broker_node: public vp_msg_broker_node
{
private:
/* data */
std::shared_ptr<KafkaProducer> kafka_producer = nullptr;
protected:
// to json
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to console
virtual void broke_msg(const std::string& msg) override;
public:
vp_json_kafka_broker_node(std::string node_name,
std::string kafka_servers = "127.0.0.1:9092",
std::string topic_name = "videopipe_topic",
vp_broke_for broke_for = vp_broke_for::NORMAL,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_json_kafka_broker_node();
};
}
#endif

View File

@@ -0,0 +1,85 @@
#include "vp_msg_broker_node.h"
namespace vp_nodes {
vp_msg_broker_node::vp_msg_broker_node(std::string node_name,
vp_broke_for broke_for,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_node(node_name),
broke_for(broke_for),
broking_cache_warn_threshold(broking_cache_warn_threshold),
broking_cache_ignore_threshold(broking_cache_ignore_threshold) {
broking_th = std::thread(&vp_msg_broker_node::broking_run, this);
}
vp_msg_broker_node::~vp_msg_broker_node() {
}
void vp_msg_broker_node::stop_broking() {
broking = false;
frames_to_broke.push(nullptr); // send dead flag to broking_thread
broking_cache_semaphore.signal();
if (broking_th.joinable()) {
broking_th.join();
}
}
std::shared_ptr<vp_objects::vp_meta> vp_msg_broker_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// cache frame meta only if cache size is not greater than threshold
if (frames_to_broke.size() < broking_cache_ignore_threshold) {
// it is a producer
frames_to_broke.push(meta);
broking_cache_semaphore.signal();
}
// warning 1 time in log
auto size = frames_to_broke.size();
if (size > broking_cache_warn_threshold && !broking_cache_warned) {
broking_cache_warned = true;
VP_WARN(vp_utils::string_format("[%s] [message broker] cache size is exceeding threshold! cache size is [%d], threshold is [%d]", node_name.c_str(), size, broking_cache_warn_threshold));
}
if (size <= broking_cache_warn_threshold) {
broking_cache_warned = false;
}
return meta;
}
std::shared_ptr<vp_objects::vp_meta> vp_msg_broker_node::handle_control_meta(std::shared_ptr<vp_objects::vp_control_meta> meta) {
return meta;
}
void vp_msg_broker_node::broking_run() {
while (broking) {
// it is a consumer
broking_cache_semaphore.wait();
auto frame_meta = frames_to_broke.front();
frames_to_broke.pop();
// dead flag
if (frame_meta == nullptr) {
continue;
}
// message to be broked
std::string message;
// step 1, format message
format_msg(frame_meta, message); // MUST be implemented in child class
// ignore if message is empty, because no broking occurs is allowed for some frames if some conditions not satisfied
if (message.empty()) {
continue;
}
// step 2, broke message
broke_msg(message); // MUST be implemented in child class
}
}
}

View File

@@ -0,0 +1,65 @@
#pragma once
#include "../vp_node.h"
namespace vp_nodes {
// broke for what type of data (vp_frame_target, vp_frame_face_target or others)
enum class vp_broke_for {
NORMAL, // vp_frame_target
FACE, // vp_frame_face_target
TEXT, // vp_frame_text_target
POSE // vp_frame_pose_target
// others to extend
};
// base node for message brokers,
// used to serialize objects (inside vp_frame_meta) to structured data and then push them to external modules like kafka, file or sockets.
// note:
// 1. this node works asynchronously which would not block pipeline.
// 2. this class can not be initialized directly.
class vp_msg_broker_node: public vp_node
{
private:
// warning if cache size greater than threshold
int broking_cache_warn_threshold = 50;
bool broking_cache_warned = false;
// ignore if cache size greater than threshold (skip directly)
int broking_cache_ignore_threshold = 200;
// cache frames to be broked
std::queue<std::shared_ptr<vp_objects::vp_frame_meta>> frames_to_broke;
vp_utils::vp_semaphore broking_cache_semaphore;
// broking thread
std::thread broking_th;
// broking function
void broking_run();
bool broking = true;
protected:
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override final;
virtual std::shared_ptr<vp_objects::vp_meta> handle_control_meta(std::shared_ptr<vp_objects::vp_control_meta> meta) override final;
// serialize objects to message which SHOULD be implemented in child class.
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) = 0;
// broke message to external modules which SHOULD be implemented in child class.
virtual void broke_msg(const std::string& msg) = 0;
// wait thread exits in vp_msg_broker_node
void stop_broking();
// node applied for what type of target
vp_broke_for broke_for = vp_broke_for::NORMAL;
// string for broke_for
std::map<vp_broke_for, std::string> broke_fors = {{vp_broke_for::NORMAL, "normal"},
{vp_broke_for::FACE, "face"},
{vp_broke_for::TEXT, "text"},
{vp_broke_for::POSE, "pose"}};
public:
vp_msg_broker_node(std::string node_name,
vp_broke_for broke_for = vp_broke_for::NORMAL,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_msg_broker_node();
};
}

View File

@@ -0,0 +1,160 @@
#include "vp_plate_socket_broker_node.h"
namespace vp_nodes {
vp_plate_socket_broker_node::vp_plate_socket_broker_node(std::string node_name,
std::string des_ip,
int des_port,
std::string plates_dir,
int min_crop_width,
int min_crop_height,
vp_broke_for broke_for,
bool only_for_tracked,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
des_ip(des_ip),
des_port(des_port),
plates_dir(plates_dir),
min_crop_width(min_crop_width),
min_crop_height(min_crop_height),
only_for_tracked(only_for_tracked) {
// only for vp_frame_target
assert(broke_for == vp_broke_for::NORMAL);
udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
this->initialized();
}
vp_plate_socket_broker_node::~vp_plate_socket_broker_node() {
deinitialized();
stop_broking();
}
void vp_plate_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
/* format:
line 0, <--
line 1, 1st time
line 2, 1st channel index, frame index
line 3, 1st the cropped image's path
line 4, 1st the whole image's path
line 5, 1st plate color
line 6, 1st plate text
line 7, -->
line 8, <--
line 9, 2nd time
line 10, 2nd channel index, frame index
line 11, 2nd the cropped image's path
line 12, 2nd the whole image's path
line 13, 2nd plate color
line 14, 2nd plate text
line 15, -->
line 16, ...
*/
auto& broked_ids = all_broked_ids[meta->channel_index];
auto& broked_texts = all_broked_texts[meta->channel_index];
// remove 50 elements every 100 ids
if (broked_ids.size() > 100) {
broked_ids.erase(broked_ids.begin(), broked_ids.begin() + 50);
}
if (broked_texts.size() > 100) {
broked_texts.erase(broked_texts.begin(), broked_texts.begin() + 50);
}
std::stringstream msg_stream;
auto format_basic_info = [&](int channel_index, int frame_index) {
msg_stream << vp_utils::time_format(NOW) << std::endl; // line1
msg_stream << channel_index << "," << frame_index << std::endl; // line2
};
auto save_cropped_image = [&](cv::Mat& frame, cv::Rect rect, std::string name) {
auto cropped = frame(rect);
cv::imwrite(name, cropped);
msg_stream << name << std::endl; // line3
};
auto save_whole_image = [&](cv::Mat& frame, std::string name) {
cv::imwrite(name, frame);
msg_stream << name << std::endl; // line4
};
auto format_plate = [&](const std::string& plate_color, const std::string& plate_text) {
msg_stream << plate_color << std::endl; // line5
msg_stream << plate_text << std::endl; // line6
};
if (broke_for == vp_broke_for::NORMAL) {
for (int i = 0; i < meta->targets.size(); i++) {
auto& t = meta->targets[i];
// only broke for tracked targets and have enough frames
if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
continue;
}
// only broke 1 time for specific track id if it has been tracked, or broke many times
if (t->track_id >= 0 &&
std::find(broked_ids.begin(), broked_ids.end(), t->track_id) != broked_ids.end()) {
broked_ids.push_back(t->track_id);
continue;
}
// size filter
if (t->width < min_crop_width || t->height < min_crop_height) {
continue;
}
auto color_and_text = vp_utils::string_split(t->primary_label, '_');
// make sure plate text and color detected
if(color_and_text.size() != 2) {
continue;
}
auto& color = color_and_text[0];
auto& text = color_and_text[1];
// check for length, 7 or 3 + 6 or 3 + 7
if (text.length() != 7 && text.length() != 9 && text.length() != 10) {
continue;
}
// only broke 1 time for specific plate text in small period
if (std::find(broked_texts.begin(), broked_texts.end(), text) != broked_texts.end()) {
broked_texts.push_back(text);
continue;
}
// cache for texts
broked_texts.push_back(text);
// cache for ids
if (t->track_id >= 0) {
broked_ids.push_back(t->track_id);
}
auto cropped_name = plates_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + color + "_" + text + "_cropped.jpg";
auto whole_name = plates_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + color + "_" + text + "_whole.jpg";
// start flag
msg_stream << "<--" << std::endl;
// basic info
format_basic_info(meta->channel_index, meta->frame_index);
// save small cropped image
save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), cropped_name);
// save whole image
save_whole_image(meta->frame, whole_name);
// format color and text
format_plate(color, text);
// end flag
msg_stream << "-->";
if (i != meta->targets.size() - 1) {
msg_stream << std::endl; // not the last one
}
}
}
msg = msg_stream.str();
}
void vp_plate_socket_broker_node::broke_msg(const std::string& msg) {
// broke msg to socket by udp
auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
auto bytes_2_send_len = msg.size();
udp_writer.send(bytes_2_send, bytes_2_send_len);
}
}

View File

@@ -0,0 +1,56 @@
#pragma once
#include "vp_msg_broker_node.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
// light weight socket support
#include "../../third_party/kissnet/kissnet.hpp"
namespace vp_nodes {
// message broker node, broke text & color for license plate (hold by vp_frame_target) to socket via udp.
// which could be used for lpr camera.
class vp_plate_socket_broker_node: public vp_msg_broker_node
{
private:
// save dir for cropped images and whole images, which would be used for displaying in lpr camera
std::string plates_dir = "plate_images";
// min width to crop (license plate will be ignored if target's width is smaller than this value)
int min_crop_width = 50;
// min height to crop (license plate will be ignored if target's height is smaller than this value)
int min_crop_height = 50;
// only broke for tracked targets (track_id is not -1)
bool only_for_tracked = false;
// min tracked frames if only_for_tracked is true
int min_tracked_frames = 25;
// host the data sent to via udp
std::string des_ip = "";
// port the data sent to via udp
int des_port = 0;
// udp socket writer
kissnet::udp_socket udp_writer;
// support multi-channel, used for avoid duplicate data
std::map<int, std::vector<int>> all_broked_ids; // channel -> target ids which have been broked
std::map<int, std::vector<std::string>> all_broked_texts; // channel -> plate texts which have been broked (filter using similiarity comparison)
protected:
// to custom format
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to socket via udp
virtual void broke_msg(const std::string& msg) override;
public:
vp_plate_socket_broker_node(std::string node_name,
std::string des_ip = "",
int des_port = 0,
std::string plates_dir = "plate_images",
int min_crop_width = 100,
int min_crop_height = 0,
vp_broke_for broke_for = vp_broke_for::NORMAL,
bool only_for_tracked = true,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_plate_socket_broker_node();
};
}

View File

@@ -0,0 +1,70 @@
#include "vp_xml_file_broker_node.h"
namespace vp_nodes {
vp_xml_file_broker_node::vp_xml_file_broker_node(std::string node_name,
vp_broke_for broke_for,
std::string file_path_and_name,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold) {
// open file
if (file_path_and_name.empty()) {
file_path_and_name = "vp_broker_" + vp_utils::time_format(NOW, "<hour>-<min>-<sec>-<mili>") + ".xml";
}
VP_INFO(vp_utils::string_format("[%s] [message broker] set broking file path as `%s`", node_name.c_str(), file_path_and_name.c_str()));
xml_writer.open(file_path_and_name);
this->initialized();
}
vp_xml_file_broker_node::~vp_xml_file_broker_node() {
deinitialized();
stop_broking();
}
void vp_xml_file_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
// serialize objects to xml by cereal
std::stringstream msg_stream;
{
cereal::XMLOutputArchive xml_archive(msg_stream);
// global values
xml_archive(cereal::make_nvp("channel_index", meta->channel_index),
cereal::make_nvp("frame_index", meta->frame_index),
cereal::make_nvp("width", meta->frame.cols),
cereal::make_nvp("height", meta->frame.rows),
cereal::make_nvp("fps", meta->fps),
cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
// serialize values according to broke_for
if (broke_for == vp_broke_for::NORMAL) {
xml_archive(cereal::make_nvp("target_size", meta->targets.size()),
cereal::make_nvp("targets", meta->targets));
}
else if (broke_for == vp_broke_for::FACE) {
xml_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
cereal::make_nvp("face_targets", meta->face_targets));
}
else if (broke_for == vp_broke_for::TEXT) {
xml_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
cereal::make_nvp("text_targets", meta->text_targets));
}
else {
throw "invalid broke_for!";
}
} // flush
msg = msg_stream.str();
}
void vp_xml_file_broker_node::broke_msg(const std::string& msg) {
// broke msg to file by ofstream
if (xml_writer.is_open()) {
xml_writer << msg << std::endl;
}
else {
// TO-DO
}
}
}

View File

@@ -0,0 +1,28 @@
#pragma once
#include <fstream>
#include "vp_msg_broker_node.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
namespace vp_nodes {
// message broker node (for demo/debug purpose), broke xml data to file.
class vp_xml_file_broker_node: public vp_msg_broker_node
{
private:
// xml file writer
ofstream xml_writer;
protected:
// to xml
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to file
virtual void broke_msg(const std::string& msg) override;
public:
vp_xml_file_broker_node(std::string node_name,
vp_broke_for broke_for = vp_broke_for::NORMAL,
std::string file_path_and_name = "",
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_xml_file_broker_node();
};
}

View File

@@ -0,0 +1,68 @@
#include "vp_xml_socket_broker_node.h"
namespace vp_nodes {
vp_xml_socket_broker_node::vp_xml_socket_broker_node(std::string node_name,
std::string des_ip,
int des_port,
vp_broke_for broke_for,
int broking_cache_warn_threshold,
int broking_cache_ignore_threshold):
vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
des_ip(des_ip),
des_port(des_port) {
udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
this->initialized();
}
vp_xml_socket_broker_node::~vp_xml_socket_broker_node() {
deinitialized();
stop_broking();
}
void vp_xml_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
// serialize objects to xml by cereal
std::stringstream msg_stream;
{
cereal::XMLOutputArchive xml_archive(msg_stream);
// global values
xml_archive(cereal::make_nvp("channel_index", meta->channel_index),
cereal::make_nvp("frame_index", meta->frame_index),
cereal::make_nvp("width", meta->frame.cols),
cereal::make_nvp("height", meta->frame.rows),
cereal::make_nvp("fps", meta->fps),
cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
// serialize values according to broke_for
if (broke_for == vp_broke_for::NORMAL) {
xml_archive(cereal::make_nvp("target_size", meta->targets.size()),
cereal::make_nvp("targets", meta->targets));
}
else if (broke_for == vp_broke_for::FACE) {
xml_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
cereal::make_nvp("face_targets", meta->face_targets));
}
else if (broke_for == vp_broke_for::TEXT) {
xml_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
cereal::make_nvp("text_targets", meta->text_targets));
}
else {
throw "invalid broke_for!";
}
} // flush
msg = msg_stream.str();
}
void vp_xml_socket_broker_node::broke_msg(const std::string& msg) {
// broke msg to socket by udp
auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
auto bytes_2_send_len = msg.size();
udp_writer.send(bytes_2_send, bytes_2_send_len);
}
}

View File

@@ -0,0 +1,35 @@
#pragma once
#include "vp_msg_broker_node.h"
#include "cereal_archive/vp_objects_cereal_archive.h"
// light weight socket support
#include "../../third_party/kissnet/kissnet.hpp"
namespace vp_nodes {
// message broker node, broke xml data to socket via udp.
class vp_xml_socket_broker_node: public vp_msg_broker_node
{
private:
// host the data sent to via udp
std::string des_ip = "";
// port the data sent to via udp
int des_port = 0;
// udp socket writer
kissnet::udp_socket udp_writer;
protected:
// to xml
virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
// to socket via udp
virtual void broke_msg(const std::string& msg) override;
public:
vp_xml_socket_broker_node(std::string node_name,
std::string des_ip = "",
int des_port = 0,
vp_broke_for broke_for = vp_broke_for::NORMAL,
int broking_cache_warn_threshold = 50,
int broking_cache_ignore_threshold = 200);
~vp_xml_socket_broker_node();
};
}

55
nodes/ffio/ff_common.cpp Normal file
View File

@@ -0,0 +1,55 @@
#ifdef VP_WITH_FFMPEG
#include "ff_common.h"
namespace vp_nodes {
ff_scaler::ff_scaler(int src_width,
int src_height,
AVPixelFormat src_fmt,
int dst_width,
int dst_height,
AVPixelFormat dst_fmt):
m_src_width(src_width),
m_src_height(src_height),
m_src_fmt(src_fmt),
m_dst_width(dst_width),
m_dst_height(dst_height),
m_dst_fmt(dst_fmt) {
sws_ctx = sws_getContext(src_width, src_height, src_fmt, dst_width,dst_height, dst_fmt, SWS_FAST_BILINEAR, NULL, NULL, NULL);
}
ff_scaler::~ff_scaler() {
if (!sws_ctx) {
sws_freeContext(sws_ctx);
}
}
bool ff_scaler::scale(ff_av_frame_ptr src, ff_av_frame_ptr& dst) {
if (!sws_ctx || !src || !dst) {
return false;
}
if (src->format != m_src_fmt || src->width != m_src_width || src->height != m_src_height ||
dst->format != m_dst_fmt || dst->width != m_dst_width || dst->height != m_dst_height) {
return false;
}
if (dst->format == AV_PIX_FMT_BGR24) {
dst->linesize[0] = m_dst_width * 3;
}
/* allocate buffer if need for dst */
if (dst->data[0] == NULL) {
auto ret = av_frame_get_buffer(dst.get(), 0);
if (ret < 0) {
return false;
}
}
auto ret = sws_scale(sws_ctx, src->data, src->linesize, 0, src->height, dst->data, dst->linesize);
dst->pts = src->pts;
dst->pkt_dts = src->pkt_dts;
return ret >= 0;
}
}
#endif

64
nodes/ffio/ff_common.h Normal file
View File

@@ -0,0 +1,64 @@
#pragma once
#ifdef VP_WITH_FFMPEG
#include <memory>
#include <thread>
#include <condition_variable>
#include <mutex>
#include <sstream>
#include <unordered_map>
#include <queue>
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libswscale/swscale.h>
#include <libavutil/avutil.h>
#include <libavutil/pixdesc.h>
#include <libavutil/imgutils.h>
#include <libavdevice/avdevice.h>
}
#define ff_av_packet_ptr std::shared_ptr<AVPacket>
#define ff_av_frame_ptr std::shared_ptr<AVFrame>
#define ff_src_ptr std::shared_ptr<vp_nodes::ff_src>
#define ff_des_ptr std::shared_ptr<vp_nodes::ff_des>
#define ff_scaler_ptr std::shared_ptr<vp_nodes::ff_scaler>
#define alloc_ff_src(channel_index) std::make_shared<vp_nodes::ff_src>(channel_index)
#define alloc_ff_des(channel_index) std::make_shared<vp_nodes::ff_des>(channel_index)
#define alloc_ff_scaler(src_width, src_height, src_fmt, dst_width, dst_height, dst_fmt) \
std::make_shared<vp_nodes::ff_scaler>(src_width, src_height, src_fmt, dst_width, dst_height, dst_fmt)
#define alloc_ff_av_frame() \
std::shared_ptr<AVFrame>(av_frame_alloc(), [](AVFrame *frame) { av_frame_free(&frame); })
#define alloc_ff_av_packet() \
std::shared_ptr<AVPacket>(av_packet_alloc(), [](AVPacket *pkt) { av_packet_free(&pkt); })
namespace vp_nodes {
/**
* tools for color conversion & image resize using FFmpeg on CPUs.
*
*/
class ff_scaler {
private:
SwsContext* sws_ctx = NULL;
int m_src_width;
int m_src_height;
AVPixelFormat m_src_fmt;
int m_dst_width;
int m_dst_height;
AVPixelFormat m_dst_fmt;
public:
/* disable copy and assignment operations */
ff_scaler(const ff_scaler&) = delete;
ff_scaler& operator=(const ff_scaler&) = delete;
ff_scaler(int src_width,
int src_height,
AVPixelFormat src_fmt,
int dst_width,
int dst_height,
AVPixelFormat dst_fmt);
~ff_scaler();
bool scale(ff_av_frame_ptr src, ff_av_frame_ptr& dst);
};
}
#endif

482
nodes/ffio/ff_des.cpp Normal file
View File

@@ -0,0 +1,482 @@
#ifdef VP_WITH_FFMPEG
#include <iostream>
#include <sstream>
#include "ff_des.h"
namespace vp_nodes {
ff_des::ff_des(int channel_index): m_channel_index(channel_index) {
}
ff_des::~ff_des() {
inner_close();
}
void ff_des::enmux_run() {
auto ret = 0;
while (m_enmux_running) {
ff_av_packet_ptr ff_packet = nullptr;
m_enmux_semaphore.wait();
{
std::lock_guard<std::mutex> g(m_enmux_packets_m);
ff_packet = m_enmux_packets_q.front();
m_enmux_packets_q.pop();
}
if (!ff_packet) {
VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] get exit flag.",
get_channel_index()));
break;
} else {
/* set parameters for packets */
av_packet_rescale_ts(ff_packet.get(), m_enc_ctx->time_base, m_ofmt_ctx->streams[m_inner_stream_index]->time_base);
ff_packet->stream_index = m_inner_stream_index;
ret = av_interleaved_write_frame(m_ofmt_ctx, ff_packet.get());
}
if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] av_interleaved_write_frame failed. ret: %d",
get_channel_index(), ret));
break;
}
}
/* write the trailer for output stream */
ret = av_write_trailer(m_ofmt_ctx);
if (ret != 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] av_write_trailer failed. ret: %d",
get_channel_index(), ret));
}
/* set enmux running flag to false in case of abnormal exit */
m_enmux_running = false;
VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] enmux thread exits.",
get_channel_index()));
}
void ff_des::encode_run() {
while (m_encode_running) {
ff_av_frame_ptr ff_frame;
m_encode_semaphore.wait();
{
std::lock_guard<std::mutex> g(m_encode_frames_m);
ff_frame = m_encode_frames_q.front();
m_encode_frames_q.pop();
}
auto ret = 0;
if (!ff_frame) {
VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][encode_run] get exit flag.",
get_channel_index()));
break;
}
ff_frame->pts = m_enc_ctx->frame_number;
ret = avcodec_send_frame(m_enc_ctx, ff_frame.get());
if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][encode_run] avcodec_send_frame failed. ret: %d",
get_channel_index(), ret));
break;
}
while (ret >= 0) {
auto ff_packet = alloc_ff_av_packet();
ret = avcodec_receive_packet(m_enc_ctx, ff_packet.get());
if (ret == AVERROR(EAGAIN)) {
break;
} else if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][encode_run] avcodec_receive_packet failed. ret: %d",
get_channel_index(), ret));
ff_frame = nullptr;
break;
}
bool notify = true;
{
std::lock_guard<std::mutex> g(m_enmux_packets_m);
m_enmux_packets_q.push(ff_packet);
if (m_enmux_packets_q.size() > m_enmux_packets_q_max_size) {
m_enmux_packets_q.pop();
notify = false;
VP_WARN(vp_utils::string_format("[ffio/ff_des][%d][encode_run] exceed m_enmux_packets_q_max_size(%d), discard the front in queue.",
get_channel_index(), m_enmux_packets_q_max_size));
}
}
if (notify) {
m_enmux_semaphore.signal();
}
}
if (!ff_frame) {
break;
}
}
/* set encode running flag to false in case of abnormal exit */
m_encode_running = false;
{
// send exit flag to notify enmux thread
std::lock_guard<std::mutex> g(m_enmux_packets_m);
m_enmux_packets_q.push(nullptr);
VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][encode_run] send exit flag to enmux thread.",
get_channel_index()));
}
m_enmux_semaphore.signal();
VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][encode_run] encode thread exits.",
get_channel_index()));
}
bool ff_des::
open(const std::string& uri,
int width, int height, int fps, int bitrate, int max_b_frames,
const std::string& encoder_name,
AVPixelFormat sw_pix_fmt,
AVHWDeviceType hw_type,
AVPixelFormat hw_pix_fmt) {
inner_close();
auto uri_valid = false;
if (!uri_valid) {
auto uri_parts = vp_utils::string_split(uri, '.'); // file
if (std::find(m_supported_files.begin(), m_supported_files.end(), uri_parts[uri_parts.size() - 1]) != m_supported_files.end()) {
uri_valid = true;
m_live_stream = false;
}
}
if (!uri_valid) {
auto uri_parts = vp_utils::string_split(uri, ':'); // live stream
if (std::find(m_supported_protocols.begin(), m_supported_protocols.end(), uri_parts[0]) != m_supported_protocols.end()) {
uri_valid = true;
m_live_stream = true;
}
}
if (!uri_valid) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][open] uri invalid! uri MUST start with `rtsp/rtmp/..`(network streams) or end with `mp4/mkv/..`(file streams).",
get_channel_index()));
return false;
}
return inner_open(uri, width, height, fps, bitrate, max_b_frames, encoder_name, sw_pix_fmt, hw_type, hw_pix_fmt);
}
void ff_des::close() {
inner_close();
}
bool ff_des::is_opened() const {
return m_encode_running && m_enmux_running;
}
void ff_des::inner_close() {
/* set running flags to false */
m_encode_running = false;
m_enmux_running = false;
inner_exit_signal();
/* waiting for threads exist */
if (m_encode_th != nullptr && m_encode_th->joinable()) {
m_encode_th->join();
}
if (m_enmux_th != nullptr && m_enmux_th->joinable()) {
m_enmux_th->join();
}
/* free format & codec context */
if (m_ofmt_ctx) {
if (m_ofmt_ctx->pb) {
avio_closep(&m_ofmt_ctx->pb);
}
avformat_free_context(m_ofmt_ctx);
}
if (m_enc_ctx) {
avcodec_free_context(&m_enc_ctx);
}
/* clear queue */
{
// clear enmux_packets_q
std::lock_guard<std::mutex> g(m_enmux_packets_m);
m_enmux_packets_q = {};
m_enmux_semaphore.reset();
}
{
// clear encode_frames_q
std::lock_guard<std::mutex> g(m_encode_frames_m);
m_encode_frames_q = {};
m_encode_semaphore.reset();
}
VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][inner_close] close successfully.",
get_channel_index()));
}
std::string ff_des::print_summary() {
return "";
}
bool ff_des::
inner_open(const std::string& uri,
int width, int height, int fps, int bitrate, int max_b_frames,
const std::string& encoder_name,
AVPixelFormat sw_pix_fmt,
AVHWDeviceType hw_type,
AVPixelFormat hw_pix_fmt) {
auto ret = 0;
/* allocate the output media context */
ret = avformat_alloc_output_context2(&m_ofmt_ctx, NULL, NULL, uri.c_str());
if (ret < 0) {
VP_WARN(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not deduce output format from uri, try to use FLV.",
get_channel_index()));
ret = avformat_alloc_output_context2(&m_ofmt_ctx, NULL, "flv", uri.c_str());
}
if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not deduce output format from uri(used FLV).",
get_channel_index()));
return false;
}
auto ofmt = m_ofmt_ctx->oformat;
/* find encoder for the output stream */
const AVCodec* enc = nullptr;
if (encoder_name.empty()) {
enc = avcodec_find_encoder(ofmt->video_codec); // get default encoder by AVCodecID
} else {
enc = avcodec_find_encoder_by_name(encoder_name.c_str()); // get by encoder name
}
if (!enc) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not find proper encoder for output stream.",
get_channel_index()));
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
/* allocate a codec context for the encoder */
m_enc_ctx = avcodec_alloc_context3(enc);
if (!m_enc_ctx) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not allocate context for encoder(%s).",
get_channel_index(), enc->name));
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
/* set parameters for encoder */
m_enc_ctx->framerate = (AVRational) {fps, 1};
m_enc_ctx->time_base = av_inv_q(m_enc_ctx->framerate);
m_enc_ctx->pix_fmt = sw_pix_fmt;
m_enc_ctx->width = width;
m_enc_ctx->height = height;
m_enc_ctx->bit_rate = bitrate * 1024;
m_enc_ctx->max_b_frames = max_b_frames;
/* check if need init hwaccels context for encoder */
if (hw_type != AVHWDeviceType::AV_HWDEVICE_TYPE_NONE) {
m_enc_ctx->pix_fmt = hw_pix_fmt;
if ((ret = hw_encoder_init(m_enc_ctx, hw_type, sw_pix_fmt)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] hw_encoder_init failed. ret: %d",
get_channel_index(), ret));
avcodec_free_context(&m_enc_ctx);
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
}
/* open the encoder */
if ((ret = avcodec_open2(m_enc_ctx, enc, NULL)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not open encoder. ret: %d",
get_channel_index(), ret));
avcodec_free_context(&m_enc_ctx);
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
/* create new stream for output */
auto out_stream = avformat_new_stream(m_ofmt_ctx, NULL);
if (!out_stream) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not create new stream for output uri.",
get_channel_index()));
avcodec_free_context(&m_enc_ctx);
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
m_inner_stream_index = out_stream->index;
out_stream->time_base = m_enc_ctx->time_base;
ret = avcodec_parameters_from_context(out_stream->codecpar, m_enc_ctx);
if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] failed to copy codec parameters from encode context. ret: %d",
get_channel_index(), ret));
avcodec_free_context(&m_enc_ctx);
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
if (!(m_ofmt_ctx->flags & AVFMT_NOFILE)) {
ret = avio_open(&m_ofmt_ctx->pb, uri.c_str(), AVIO_FLAG_WRITE);
m_ofmt_ctx->flush_packets = 1;
m_ofmt_ctx->flags |= AVFMT_FLAG_FLUSH_PACKETS;
if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] avio_open failed on output uri(%s). ret: %d",
get_channel_index(), uri.c_str(), ret));
avcodec_free_context(&m_enc_ctx);
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
}
/* write the stream header */
if ((ret = avformat_write_header(m_ofmt_ctx, NULL)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] avformat_write_header failed. ret: %d",
get_channel_index(), ret));
avcodec_free_context(&m_enc_ctx);
avformat_free_context(m_ofmt_ctx);
m_ofmt_ctx = NULL;
return false;
}
/* initialize video & ff_des properties */
m_hw_type_name = hw_type != AV_HWDEVICE_TYPE_NONE ? std::string(av_hwdevice_get_type_name(hw_type)) : "none";
m_encoder_name = encoder_name.empty() ? std::string(m_enc_ctx->codec->name) : encoder_name;
m_uri = uri;
m_fps = fps;
m_width = width;
m_height = height;
m_bitrate = bitrate;
m_codec_name = std::string(avcodec_get_name(out_stream->codecpar->codec_id));;
m_pixel_format = std::string(av_get_pix_fmt_name(static_cast<AVPixelFormat>(out_stream->codecpar->format)));
m_max_b_frames = max_b_frames;
/* collect summary notify caller */
auto summary = print_summary();
/* go! */
m_enmux_running = true;
m_encode_running = true;
m_enmux_th = std::make_shared<std::thread>(&ff_des::enmux_run, this);
m_encode_th = std::make_shared<std::thread>(&ff_des::encode_run, this);
VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][inner_open] open successfully.",
get_channel_index()));
return true;
}
bool ff_des::write(const ff_av_frame_ptr& frame) {
if (!is_opened()) {
return false;
}
bool notify = true;
{
std::lock_guard<std::mutex> g(m_encode_frames_m);
m_encode_frames_q.push(frame);
if (m_encode_frames_q.size() > m_encode_frames_q_max_size) {
m_encode_frames_q.pop();
notify = false;
VP_WARN(vp_utils::string_format("[ffio/ff_des][%d][write] exceed m_encode_frames_q_max_size(%d), discard the front in queue.",
get_channel_index(), m_encode_frames_q_max_size));
}
}
if (notify) {
m_encode_semaphore.signal();
}
return true;
}
ff_des& ff_des::operator<<(const ff_av_frame_ptr& frame) {
write(frame);
return *this;
}
void ff_des::inner_exit_signal() {
{
// send exit flag to notify encode thread
std::lock_guard<std::mutex> g(m_encode_frames_m);
m_encode_frames_q.push(nullptr);
}
m_encode_semaphore.signal();
}
int ff_des::get_video_fps() const {
return m_fps;
}
int ff_des::get_video_width() const {
return m_width;
}
int ff_des::get_video_height() const {
return m_height;
}
long ff_des::get_video_bitrate() const {
return m_bitrate;
}
std::string ff_des::get_video_codec_name() const {
return m_codec_name;
}
std::string ff_des::get_video_pixel_format_name() const {
return m_pixel_format;
}
bool ff_des::is_live_stream() const {
return m_live_stream;
}
std::string ff_des::get_hw_type_name() const {
return m_hw_type_name;
}
std::string ff_des::get_encoder_name() const {
return m_encoder_name;
}
std::string ff_des::get_uri() const {
return m_uri;
}
int ff_des::get_channel_index() const {
return m_channel_index;
}
const AVCodecContext* ff_des::get_encode_ctx() const {
return m_enc_ctx;
}
int ff_des::
hw_encoder_init(AVCodecContext* enc_ctx,
const enum AVHWDeviceType hw_type,
const enum AVPixelFormat sw_pix_fmt) {
AVBufferRef* hw_device_ctx = nullptr;
AVBufferRef* hw_frames_ref = nullptr;
AVHWFramesContext* frames_ctx = nullptr;
int err = 0;
if ((err = av_hwdevice_ctx_create(&hw_device_ctx, hw_type,
NULL, NULL, 0)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][hw_encoder_init] failed to create specified(%s) HW device context for encoder.",
get_channel_index(), av_hwdevice_get_type_name(hw_type)));
return err;
}
if (!(hw_frames_ref = av_hwframe_ctx_alloc(hw_device_ctx))) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][hw_encoder_init] failed to create specified(%s) HW frame context for encoder.",
get_channel_index(), av_hwdevice_get_type_name(hw_type)));
return err;
}
frames_ctx = (AVHWFramesContext* )(hw_frames_ref->data);
frames_ctx->format = enc_ctx->pix_fmt;
frames_ctx->sw_format = sw_pix_fmt;
frames_ctx->width = enc_ctx->width;
frames_ctx->height = enc_ctx->height;
frames_ctx->initial_pool_size = 20;
if ((err = av_hwframe_ctx_init(hw_frames_ref)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][hw_encoder_init] failed to initialize specified(%s) HW frame context for encoder.",
get_channel_index(), av_hwdevice_get_type_name(hw_type)));
av_buffer_unref(&hw_frames_ref);
av_buffer_unref(&hw_device_ctx);
return err;
}
enc_ctx->hw_frames_ctx = av_buffer_ref(hw_frames_ref);
if (!enc_ctx->hw_frames_ctx)
err = AVERROR(ENOMEM);
av_buffer_unref(&hw_frames_ref);
av_buffer_unref(&hw_device_ctx);
return err;
}
}
#endif

241
nodes/ffio/ff_des.h Normal file
View File

@@ -0,0 +1,241 @@
#pragma once
#ifdef VP_WITH_FFMPEG
#include <string>
#include <queue>
#include <vector>
#include <mutex>
#include <algorithm>
#include <functional>
#include "ff_common.h"
#include "../../utils/vp_semaphore.h"
#include "../../utils/vp_utils.h"
#include "../../utils/logger/vp_logger.h"
namespace vp_nodes {
/**
* encode and enmux using FFmpeg.
* used to encode & enmux network streams or file streams.
*/
class ff_des: public std::enable_shared_from_this<ff_des> {
private:
/* core members */
const std::vector<std::string> m_supported_files = {"mp4", "mkv", "flv", "h265", "h264"};
const std::vector<std::string> m_supported_protocols = {"rtsp", "rtmp", "udp", "rtp"};
AVFormatContext* m_ofmt_ctx = nullptr;
AVCodecContext* m_enc_ctx = nullptr;
std::queue<ff_av_packet_ptr> m_enmux_packets_q;
std::queue<ff_av_frame_ptr> m_encode_frames_q;
std::mutex m_enmux_packets_m;
std::mutex m_encode_frames_m;
int m_enmux_packets_q_max_size = 25;
int m_encode_frames_q_max_size = 25;
std::shared_ptr<std::thread> m_enmux_th = nullptr;
std::shared_ptr<std::thread> m_encode_th = nullptr;
vp_utils::vp_semaphore m_enmux_semaphore;
vp_utils::vp_semaphore m_encode_semaphore;
/**
* live stream or not for output.
*/
bool m_live_stream = false;
/**
* fps of output stream.
*/
int m_fps = 0;
/**
* width of output stream in pixels.
*/
int m_width = 0;
/**
* height of output stream in pixels.
*/
int m_height = 0;
/**
* bitrate of output stream (kbit/s).
*/
long m_bitrate = 0;
/**
* codec type name of output stream (strings like `h264/hevc/vp8/...`).
*/
std::string m_codec_name = "";
/**
* pixel format of output stream (strings like `YUV420P/NV12/...`).
*/
std::string m_pixel_format = "";
/**
* max B frames for encode.
*/
int m_max_b_frames = 0;
/**
* channel index of output.
*/
int m_channel_index = -1;
/**
* type name of hardware used for encode (`none` if no hardware used).
*/
std::string m_hw_type_name = "";
/**
* encoder name used for encoding.
*/
std::string m_encoder_name = "";
/**
* uri for output (rtmp/file/...).
*/
std::string m_uri = "";
/* inner flags */
int m_inner_stream_index = -1;
bool m_enmux_running = false;
bool m_encode_running = false;
/* inner methods */
void enmux_run();
void encode_run();
void inner_close();
void inner_exit_signal();
std::string print_summary();
bool inner_open(const std::string& uri,
int width, int height, int fps, int bitrate, int max_b_frames,
const std::string& encoder_name = "",
AVPixelFormat sw_pix_fmt = AVPixelFormat::AV_PIX_FMT_YUV420P,
AVHWDeviceType hw_type = AVHWDeviceType::AV_HWDEVICE_TYPE_NONE,
AVPixelFormat hw_pix_fmt = AVPixelFormat::AV_PIX_FMT_NONE);
int hw_encoder_init(AVCodecContext* enc_ctx, const enum AVHWDeviceType hw_type, const enum AVPixelFormat sw_pix_fmt);
public:
/**
* create ff_des instance using initial parameters.
*
* @param channel_index specify the channel index for output stream.
*/
ff_des(int channel_index);
~ff_des();
/* disable copy and assignment operations */
ff_des(const ff_des&) = delete;
ff_des& operator=(const ff_des&) = delete;
/**
* try to open ff_des, not thread-safe.
*
* @param
*/
bool open(const std::string& uri,
int width, int height, int fps, int bitrate, int max_b_frames = 0,
const std::string& encoder_name = "",
AVPixelFormat sw_pix_fmt = AVPixelFormat::AV_PIX_FMT_YUV420P,
AVHWDeviceType hw_type = AVHWDeviceType::AV_HWDEVICE_TYPE_NONE,
AVPixelFormat hw_pix_fmt = AVPixelFormat::AV_PIX_FMT_NONE);
/**
* close ff_des.
*/
void close();
/**
* if working or not.
*
* @return
* true if it is working.
*/
bool is_opened() const;
/**
* write the next frame to ff_des, keep as same as cv::VideoWriter.
*
* @param frame frame to be written.
*
* @return
* true if write successfully.
*
* @note
* return false means:
* 1. write too quickly no space prepared, please try to write again.
* 2. ff_des not opened yet or not working.
*/
bool write(const ff_av_frame_ptr& frame);
/**
* write the next frame to ff_des, support for `<<` operator.
*
* @param frame frame to be written.
*
* @return
* reference for ff_des.
*/
ff_des& operator<<(const ff_av_frame_ptr& frame);
/**
* get fps of output stream in pixels.
*/
int get_video_fps() const;
/**
* get width of output stream in pixels.
*/
int get_video_width() const;
/**
* get height of output stream in pixels.
*/
int get_video_height() const;
/**
* get bitrate of output stream (kbit/s).
*/
long get_video_bitrate() const;
/**
* get codec type name of output stream (strings like `h264/hevc/vp8/...`).
*/
std::string get_video_codec_name() const;
/**
* get pixel format of output stream (strings like `YUV420P/NV12/...`).
*/
std::string get_video_pixel_format_name() const;
/**
* check is live stream or not.
*/
bool is_live_stream() const;
/**
* get type name of hardware used for encoding (strings like 'cuda/vaapi', 'none' if no hardware used).
*/
std::string get_hw_type_name() const;
/**
* get encoder name used for encoding.
*/
std::string get_encoder_name() const;
/**
* get uri of output.
*/
std::string get_uri() const;
/**
* get channel index of output.
*/
int get_channel_index() const;
/**
* get pointer of encode context from FFmpeg.
* used to allocate hardware AVFrame outside of ff_des.
*/
const AVCodecContext* get_encode_ctx() const;
};
}
#endif

475
nodes/ffio/ff_src.cpp Normal file
View File

@@ -0,0 +1,475 @@
#ifdef VP_WITH_FFMPEG
#include <iostream>
#include <sstream>
#include "ff_src.h"
namespace vp_nodes {
ff_src::ff_src(int channel_index): m_channel_index(channel_index) {
}
ff_src::~ff_src() {
inner_close();
}
void ff_src::demux_run() {
auto start_time = std::chrono::system_clock::now();
auto total_bytes = 0;
while (m_demux_running) {
auto demux_start_t = std::chrono::system_clock::now();
auto ff_packet = alloc_ff_av_packet();
auto ret = 0;
if ((ret = av_read_frame(m_ifmt_ctx, ff_packet.get())) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][demux_run] av_read_frame failed. ret: %d",
get_channel_index(),
ret));
break;
}
// only demux the right video stream
if (ff_packet->stream_index != m_inner_stream_index) {
continue;
}
/*
// analyse
vp_media::stream_analyser analyser(ff_packet->data, ff_packet->size, true);
std::vector<vp_media::nal_unit> nal_units;
analyser.analyse(nal_units);
std::ostringstream oss;
for (auto& nal: nal_units) {
oss << std::setw(8) << std::setfill(' ') << nal.index
<< std::setw(16) << std::setfill(' ') << nal.offset
<< std::setw(8) << std::setfill(' ') << nal.nal_length
<< std::setw(16) << std::setfill(' ') << vp_media::stream_analyser::to_hex(nal.start_bytes) << vp_media::stream_analyser::to_hex(nal.head_bytes, false)
<< std::setw(4) << std::setfill(' ') << nal.nal_type
<< std::setw(24) << std::setfill(' ') << nal.nal_type_name << std::endl;
}
std::cout << oss.str();
*/
// need calculate bitrate manually
if (m_bitrate <= 0) {
total_bytes += ff_packet->size;
auto current_time = std::chrono::system_clock::now();
auto elapsed_seconds = std::chrono::duration_cast<std::chrono::seconds>(current_time - start_time);
if (elapsed_seconds.count() > 5.0) {
m_bitrate = (total_bytes * 8) / elapsed_seconds.count() / 1024; // convert to kbit/s
total_bytes = 0;
start_time = std::chrono::system_clock::now();
}
}
bool notify = true;
{
std::lock_guard<std::mutex> g(m_demux_packets_m);
m_demux_packets_q.push(ff_packet);
if (m_demux_packets_q.size() > m_demux_packets_q_max_size) {
m_demux_packets_q.pop();
notify = false;
VP_WARN(vp_utils::string_format("[ffio/ff_src][%d][demux_run] exceed m_demux_packets_q_max_size(%d), discard the front in queue.",
get_channel_index(),
m_demux_packets_q_max_size));
}
}
if (notify) {
m_demux_semaphore.signal();
}
/* 1. wait for a while for video file
* 2. demux as soon as possible for live stream
*/
if (!m_live_stream) {
auto cost_t = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - demux_start_t);
auto duration_t = std::chrono::milliseconds(int(1000.0 / m_fps));
if (cost_t < duration_t) {
std::this_thread::sleep_for(duration_t - cost_t);
}
}
}
/* set demux running flag to false in case of abnormal exit */
m_demux_running = false;
{
// send exit flag to notify decode thread
std::lock_guard<std::mutex> g(m_demux_packets_m);
m_demux_packets_q.push(nullptr);
VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][demux_run] send exit flag to decode thread.",
get_channel_index()));
}
m_demux_semaphore.signal();
VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][demux_run] demux thread exits.",
get_channel_index()));
}
void ff_src::decode_run() {
while (m_decode_running) {
m_demux_semaphore.wait();
ff_av_packet_ptr ff_packet = nullptr;
{
std::lock_guard<std::mutex> g(m_demux_packets_m);
ff_packet = m_demux_packets_q.front();
m_demux_packets_q.pop();
}
auto ret = 0;
/* get exit flag */
if (!ff_packet) {
VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][decode_run] get exit flag, go to flush decoder.",
get_channel_index()));
ret = avcodec_send_packet(m_dec_ctx, NULL); // flush decoder
} else {
ret = avcodec_send_packet(m_dec_ctx, ff_packet.get());
}
if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][decode_run] avcodec_send_packet failed. ret: %d",
get_channel_index(), ret));
break;
}
while (ret >= 0) {
auto ff_frame = alloc_ff_av_frame();
ret = avcodec_receive_frame(m_dec_ctx, ff_frame.get());
if (ret == AVERROR(EAGAIN)) {
break;
} else if (ret < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][decode_run] avcodec_receive_frame failed. ret: %d",
get_channel_index(),
ret));
ff_packet = nullptr;
break;
}
{
std::lock_guard<std::mutex> g(m_decode_frames_m);
m_decode_frames_q.push(ff_frame);
if (m_decode_frames_q.size() > m_decode_frames_q_max_size) {
m_decode_frames_q.pop();
VP_WARN(vp_utils::string_format("[ffio/ff_src][%d][decode_run] exceed m_decode_frames_q_max_size(%d), discard the front in queue.",
get_channel_index(),
m_decode_frames_q_max_size));
}
}
}
if (!ff_packet) {
break;
}
}
/* set decode running flag to false in case of abnormal exit */
m_decode_running = false;
VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][decode_run] decode thread exits.",
get_channel_index()));
}
bool ff_src::open(const std::string& uri, const std::string& decoder_name, AVHWDeviceType hw_type) {
inner_close();
auto uri_valid = false;
if (!uri_valid) {
auto uri_parts = vp_utils::string_split(uri, '.'); // file
if (std::find(m_supported_files.begin(), m_supported_files.end(), uri_parts[uri_parts.size() - 1]) != m_supported_files.end()) {
uri_valid = true;
m_live_stream = false;
}
}
if (!uri_valid) {
auto uri_parts = vp_utils::string_split(uri, ':'); // live stream
if (std::find(m_supported_protocols.begin(), m_supported_protocols.end(), uri_parts[0]) != m_supported_protocols.end()) {
uri_valid = true;
m_live_stream = true;
}
}
if (!uri_valid) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][open] uri invalid! uri MUST start with `rtsp/rtmp/..`(network streams) or end with `mp4/mkv/..`(file streams).",
get_channel_index()));
return false;
}
return inner_open(uri, decoder_name, hw_type);
}
bool ff_src::inner_open(const std::string& uri, const std::string& decoder_name, AVHWDeviceType hw_type) {
auto ret = 0;
AVDictionary *options = NULL;
// Set options
av_dict_set(&options, "max_delay", "100000", 0); // 设置最大延迟为100ms
av_dict_set(&options, "buffer_size", "2000000", 0); // 设置缓冲区大小为2MB
av_dict_set(&options, "timeout", "3000000", 0); // 设置超时时间为3秒
av_dict_set(&options, "rtsp_transport", "tcp", 0); // 使用TCP传输
av_dict_set(&options, "max_interleave_delta", "1000000", 0); // 设置最大间隔时间为1秒
/* open input uri(file/stream), and allocate format context */
if (avformat_open_input(&m_ifmt_ctx, uri.c_str(), NULL, NULL) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not open input uri.",
get_channel_index()));
return false;
}
/* retrieve stream information */
if (avformat_find_stream_info(m_ifmt_ctx, NULL) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not find stream information.",
get_channel_index()));
avformat_close_input(&m_ifmt_ctx);
return false;
}
/* find video stream */
if ((ret = av_find_best_stream(m_ifmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not find proper VIDEO stream.",
get_channel_index()));
avformat_close_input(&m_ifmt_ctx);
return false;
}
m_inner_stream_index = ret;
auto in_stream = m_ifmt_ctx->streams[m_inner_stream_index];
/* initialize video properties */
m_fps = int(av_q2d(in_stream->r_frame_rate));
m_width = in_stream->codecpar->width;
m_height = in_stream->codecpar->height;
m_bitrate = in_stream->codecpar->bit_rate / 1024; // kb/s
m_codec_name = std::string(avcodec_get_name(in_stream->codecpar->codec_id));
m_duration = av_q2d(in_stream->time_base) * in_stream->duration;
m_duration = m_duration < 0 ? 0 : m_duration;
m_pixel_format = std::string(av_get_pix_fmt_name(static_cast<AVPixelFormat>(in_stream->codecpar->format)));
/* find decoder for the input video stream */
const AVCodec* dec = nullptr;
if (decoder_name.empty()) {
dec = avcodec_find_decoder(in_stream->codecpar->codec_id); // get default decoder by AVCodecID
} else {
dec = avcodec_find_decoder_by_name(decoder_name.c_str()); // get by decoder name
}
if (!dec) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not find the proper decoder for input stream.",
get_channel_index()));
avformat_close_input(&m_ifmt_ctx);
return false;
}
/* allocate codec context for the decoder */
m_dec_ctx = avcodec_alloc_context3(dec);
if (!m_dec_ctx) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not allocate context for decoder(%s).",
get_channel_index(), dec->name));
avformat_close_input(&m_ifmt_ctx);
return false;
}
/* copy codec parameters from input stream to codec context */
if ((ret = avcodec_parameters_to_context(m_dec_ctx, in_stream->codecpar)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not copy codec parameters to decode context. ret: %d",
get_channel_index(), ret));
avcodec_free_context(&m_dec_ctx);
avformat_close_input(&m_ifmt_ctx);
return false;
}
/* check if need init hwaccels context for decoder */
if (hw_type != AVHWDeviceType::AV_HWDEVICE_TYPE_NONE
&& hw_decoder_init(m_dec_ctx, hw_type) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] failed to create specified(%s) HW device context for decoder.",
get_channel_index(), av_hwdevice_get_type_name(hw_type)));
avcodec_free_context(&m_dec_ctx);
avformat_close_input(&m_ifmt_ctx);
return false;
}
/* open the decoder */
if ((ret = avcodec_open2(m_dec_ctx, dec, NULL)) < 0) {
VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not open decoder. ret: %d",
get_channel_index(), ret));
avcodec_free_context(&m_dec_ctx);
avformat_close_input(&m_ifmt_ctx);
return false;
}
/* initialize node properties */
m_decoder_name = decoder_name.empty() ? std::string(m_dec_ctx->codec->name) : decoder_name;
m_uri = uri;
m_hw_type_name = hw_type != AV_HWDEVICE_TYPE_NONE ? std::string(av_hwdevice_get_type_name(hw_type)) : "none";
/* collect summary notify caller */
auto summary = print_summary();
if (m_src_opened_hooker) {
m_src_opened_hooker(shared_from_this(), summary);
}
/* go! */
m_demux_running = true;
m_decode_running = true;
m_demux_th = std::make_shared<std::thread>(&ff_src::demux_run, this);
m_decode_th = std::make_shared<std::thread>(&ff_src::decode_run, this);
VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][inner_open] open successfully.",
get_channel_index()));
return true;
}
void ff_src::close() {
inner_close();
}
void ff_src::inner_close() {
/* set running flags to false */
m_demux_running = false;
m_decode_running = false;
/* waiting for threads exist */
if (m_demux_th != nullptr && m_demux_th->joinable()) {
m_demux_th->join();
}
if (m_decode_th != nullptr && m_decode_th->joinable()) {
m_decode_th->join();
}
/* free format & codec context */
if (m_ifmt_ctx) {
avformat_close_input(&m_ifmt_ctx);
}
if (m_dec_ctx) {
avcodec_free_context(&m_dec_ctx);
}
/* clear queue */
{
// clear m_demux_packets_q
std::lock_guard<std::mutex> g(m_demux_packets_m);
m_demux_packets_q = {};
m_demux_semaphore.reset();
}
{
// clear m_decode_frames_q
std::lock_guard<std::mutex> g(m_decode_frames_m);
m_decode_frames_q = {};
}
VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][inner_close] close successfully.",
get_channel_index()));
}
bool ff_src::is_opened() const {
return m_demux_running && m_decode_running;
}
int ff_src::get_video_fps() const {
// no check is_opened or not
return m_fps;
}
int ff_src::get_video_width() const {
// no check is_opened or not
return m_width;
}
int ff_src::get_video_height() const {
// no check is opened or not
return m_height;
}
long ff_src::get_video_bitrate() const {
// no check is opened or not
return m_bitrate;
}
std::string ff_src::get_video_codec_name() const {
// no check is_opened or not
return m_codec_name;
}
double ff_src::get_video_duration() const {
// no check is opened or not
return m_duration;
}
bool ff_src::is_live_stream() const {
// no check is opened or not
return m_live_stream;
}
std::string ff_src::get_hw_type_name() const {
// no check is opened or not
return m_hw_type_name;
}
std::string ff_src::get_decoder_name() const {
// no check is opened or not
return m_decoder_name;
}
std::string ff_src::get_uri() const {
// no check is opened or not
return m_uri;
}
int ff_src::get_channel_index() const {
// no check is opened or not
return m_channel_index;
}
std::string ff_src::get_video_pixel_format_name() const {
// no check is opened or not
return m_pixel_format;
}
std::string ff_src::print_summary() {
std::ostringstream s_stream;
s_stream << std::endl;
s_stream << "################# summary for [ff_src] #################" << std::endl;
s_stream << "|channel_index |: " << get_channel_index() << std::endl;
s_stream << "|uri |: " << get_uri() << std::endl;
s_stream << "|is_live_stream |: " << (is_live_stream() ? std::string("YES") : std::string("NO")) << std::endl;
s_stream << "|decoder_name |: " << get_decoder_name() << std::endl;
s_stream << "|hw_type_name |: " << get_hw_type_name() << std::endl;
s_stream << "|------------------| " << std::endl;
s_stream << "|video_codec |: " << get_video_codec_name() << std::endl;
s_stream << "|video_pic_format |: " << get_video_pixel_format_name() << std::endl;
s_stream << "|video_fps |: " << get_video_fps() << std::endl;
s_stream << "|video_width |: " << get_video_width() << std::endl;
s_stream << "|video_height |: " << get_video_height() << std::endl;
s_stream << "|video_bitrate |: " << get_video_bitrate() << " [kbit/s]" << std::endl;
s_stream << "|video_duration |: " << get_video_duration() << " [seconds]" << std::endl;
s_stream << "|------------------| " << std::endl;
s_stream << "################# summary for [ff_src] #################" << std::endl;
auto summary = s_stream.str();
VP_INFO(summary);
return summary;
}
bool ff_src::read(ff_av_frame_ptr& frame) {
auto ret = false;
{
std::lock_guard<std::mutex> g(m_decode_frames_m);
if (!m_decode_frames_q.empty()) {
frame = m_decode_frames_q.front();
m_decode_frames_q.pop();
ret = true;
} else {
frame = nullptr;
}
}
if (!ret) {
// switch control of CPUs if no frame returned, avoid of occupying CPUs for a long time by caller outside
//std::this_thread::sleep_for(std::chrono::milliseconds(1));
//VP_DEBUG(vp_utils::string_format("[ffio/ff_src][%d][read] read frame failed, sleep for 1 millisecond.",
// get_channel_index()));
}
/* false means no frame returned (not opened or read too quickly so data not prepared) */
return ret;
}
ff_src& ff_src::operator>>(ff_av_frame_ptr& frame) {
read(frame);
return *this;
}
int ff_src::hw_decoder_init(AVCodecContext* dec_ctx, const enum AVHWDeviceType type) {
int err = 0;
AVBufferRef* hw_device_ctx = nullptr;
if ((err = av_hwdevice_ctx_create(&hw_device_ctx, type,
NULL, NULL, 0)) < 0) {
return err;
}
// update using AVHWDeviceContext*(hw_device_ctx->data)
dec_ctx->hw_device_ctx = av_buffer_ref(hw_device_ctx);
av_buffer_unref(&hw_device_ctx);
return err;
}
void ff_src::set_src_opened_hooker(ff_src_opened_hooker src_opened_hooker) {
m_src_opened_hooker = src_opened_hooker;
}
}
#endif

246
nodes/ffio/ff_src.h Normal file
View File

@@ -0,0 +1,246 @@
#pragma once
#ifdef VP_WITH_FFMPEG
#include <string>
#include <queue>
#include <vector>
#include <mutex>
#include <algorithm>
#include <functional>
#include "ff_common.h"
#include "../../utils/vp_semaphore.h"
#include "../../utils/vp_utils.h"
#include "../../utils/logger/vp_logger.h"
namespace vp_nodes {
class ff_src;
typedef std::function<void(ff_src_ptr, const std::string&)> ff_src_opened_hooker;
/**
* demux and decode using FFmpeg.
* used to demux & decode network streams or file streams.
*/
class ff_src: public std::enable_shared_from_this<ff_src> {
private:
/* core members */
const std::vector<std::string> m_supported_files = {"mp4", "mkv", "flv", "avi", "h264"};
const std::vector<std::string> m_supported_protocols = {"rtsp", "rtmp", "http", "rtp"};
AVFormatContext* m_ifmt_ctx = nullptr;
AVCodecContext* m_dec_ctx = nullptr;
std::queue<ff_av_packet_ptr> m_demux_packets_q;
std::queue<ff_av_frame_ptr> m_decode_frames_q;
std::mutex m_demux_packets_m;
std::mutex m_decode_frames_m;
int m_demux_packets_q_max_size = 25;
int m_decode_frames_q_max_size = 25;
std::shared_ptr<std::thread> m_demux_th = nullptr;
std::shared_ptr<std::thread> m_decode_th = nullptr;
vp_utils::vp_semaphore m_demux_semaphore;
/**
* live stream or not for input.
*/
bool m_live_stream = false;
/**
* fps of input stream.
*/
int m_fps = 0;
/**
* width of input stream in pixels.
*/
int m_width = 0;
/**
* height of input stream in pixels.
*/
int m_height = 0;
/**
* bitrate of input stream (kbit/s).
*/
long m_bitrate = 0;
/**
* codec type name of input stream (strings like `h264/hevc/vp8/...`).
*/
std::string m_codec_name = "";
/**
* pixel format of input stream (strings like `YUV420P/NV12/...`).
*/
std::string m_pixel_format = "";
/**
* duration of input stream (seconds), only for file stream.
*/
double m_duration = 0.0;
/**
* channel index of input.
*/
int m_channel_index = -1;
/**
* type name of hardware used for decoding (`none` if no hardware used).
*/
std::string m_hw_type_name = "";
/**
* decoder name used for decoding.
*/
std::string m_decoder_name = "";
/**
* uri of input (rtsp/file/...).
*/
std::string m_uri = "";
/* inner flags */
int m_inner_stream_index = -1;
bool m_demux_running = false;
bool m_decode_running = false;
/* inner methods */
void demux_run();
void decode_run();
void inner_close();
std::string print_summary();
bool inner_open(const std::string& uri, const std::string& decoder_name, AVHWDeviceType hw_type);
int hw_decoder_init(AVCodecContext* dec_ctx, const enum AVHWDeviceType type);
/* callbacks */
ff_src_opened_hooker m_src_opened_hooker;
public:
/**
* create ff_src instance using initial parameters.
*
* @param channel_index specify the channel index for input stream.
*/
ff_src(int channel_index);
~ff_src();
/* disable copy and assignment operations */
ff_src(const ff_src&) = delete;
ff_src& operator=(const ff_src&) = delete;
/**
* try to open ff_src, not thread-safe.
*
* @param uri uri to open, url for network streams or file path for file streams.
* @param decoder_name specify the decoder name used for decoding, MUST supported by FFmpeg.
* @param hw_type type of hardware for decoding, MUST supported by FFmpeg.
*
* @note
* if `decoder_name` not specified, ff_src will choose the default decoder accordding to the codec type of input stream.
*/
bool open(const std::string& uri,
const std::string& decoder_name = "",
AVHWDeviceType hw_type = AVHWDeviceType::AV_HWDEVICE_TYPE_NONE);
/**
* close ff_src.
*/
void close();
/**
* if working or not.
*
* @return
* true if it is working.
*/
bool is_opened() const;
/**
* read the next frame from ff_src, keep as same as cv::VideoCapture.
*
* @param frame frame to be returned (return nullptr if read failed).
*
* @return
* true if read successfully.
*
* @note
* return false means:
* 1. read too quickly no data prepared, please try to read again.
* 2. ff_src not opened yet or not working.
*/
bool read(ff_av_frame_ptr& frame);
/**
* read the next frame from ff_src, support for `>>` operator.
*
* @param frame frame to be returned (return nullptr if read failed).
*
* @return
* reference for ff_src.
*/
ff_src& operator>>(ff_av_frame_ptr& frame);
/**
* get fps of input stream.
*/
int get_video_fps() const;
/**
* get width of input stream in pixels.
*/
int get_video_width() const;
/**
* get height of input stream in pixels.
*/
int get_video_height() const;
/**
* get bitrate of input stream (kbit/s).
*/
long get_video_bitrate() const;
/**
* get codec type name of input stream (strings like `h264/hevc/vp8/...`).
*/
std::string get_video_codec_name() const;
/**
* get pixel format of input stream (strings like `YUV420P/NV12/...`).
*/
std::string get_video_pixel_format_name() const;
/**
* get duration of input stream (seconds), only for file stream.
*/
double get_video_duration() const;
/**
* check is live stream or not.
*/
bool is_live_stream() const;
/**
* get type name of hardware used for decoding (strings like 'cuda/vaapi', 'none' if no hardware used).
*/
std::string get_hw_type_name() const;
/**
* get decoder name used for decoding.
*/
std::string get_decoder_name() const;
/**
* get uri of input.
*/
std::string get_uri() const;
/**
* get channel index of input.
*/
int get_channel_index() const;
/**
* set callback for opened event. would be activated every time ff_src opened.
*
* @param src_opened_hooker callback activated when ff_src opened.
*/
void set_src_opened_hooker(ff_src_opened_hooker src_opened_hooker);
};
}
#endif

View File

@@ -0,0 +1,97 @@
#ifdef VP_WITH_FFMPEG
#include "vp_ff_des_node.h"
namespace vp_nodes {
vp_ff_des_node::
vp_ff_des_node(const std::string& node_name,
int channel_index,
const std::string& out_uri,
vp_objects::vp_size resolution_w_h,
int bitrate,
bool osd,
std::string encoder_name):
vp_des_node(node_name, channel_index),
m_out_uri(out_uri),
m_resolution_w_h(resolution_w_h),
m_out_bitrate(bitrate),
m_use_osd(osd),
m_encoder_name(encoder_name) {
m_ff_des = alloc_ff_des(channel_index);
VP_INFO(vp_utils::string_format("[%s] [%s]", node_name.c_str(), out_uri.c_str()));
this->initialized();
}
vp_ff_des_node::~vp_ff_des_node() {
deinitialized();
if (sws_ctx) {
sws_freeContext(sws_ctx);
}
}
std::shared_ptr<vp_objects::vp_meta> vp_ff_des_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
VP_DEBUG(vp_utils::string_format("[%s] received frame meta, channel_index=>%d, frame_index=>%d", node_name.c_str(), meta->channel_index, meta->frame_index));
cv::Mat resize_frame = (m_use_osd && !meta->osd_frame.empty()) ? meta->osd_frame : meta->frame;
auto out_width = resize_frame.cols;
auto out_height = resize_frame.rows;
if (m_resolution_w_h.width != 0 && m_resolution_w_h.height != 0) {
out_width = m_resolution_w_h.width;
out_height = m_resolution_w_h.height;
}
/* try to open ff_dec. */
if (!m_ff_des->is_opened()) {
if(!m_ff_des->open(m_out_uri,
out_width,
out_height,
meta->fps,
m_out_bitrate,
0,
m_encoder_name,
AV_PIX_FMT_YUV420P)) {
VP_WARN(vp_utils::string_format("[%s] could not open ff_des.", node_name.c_str()));
/* general works in vp_des_node. */
return vp_des_node::handle_frame_meta(meta);
}
}
/* initialize sws_ctx. */
if (!sws_ctx) {
sws_ctx = sws_getContext(resize_frame.cols,
resize_frame.rows,
AV_PIX_FMT_BGR24,
out_width,
out_height,
AV_PIX_FMT_YUV420P,
0, NULL, NULL, NULL);
}
if (!sws_ctx) {
VP_WARN(vp_utils::string_format("[%s] could not initialize sws_ctx.", node_name.c_str()));
/* general works in vp_des_node. */
return vp_des_node::handle_frame_meta(meta);
}
/* cv::Mat -> AVFrame. */
/* resize and convert to AV_PIX_FMT_YUV420P. */
auto dst_frame = alloc_ff_av_frame();
dst_frame->width = out_width;
dst_frame->height = out_height;
dst_frame->format= AV_PIX_FMT_YUV420P;
av_frame_get_buffer(dst_frame.get(), 0);
auto p = resize_frame.data;
int linesize[1] = {resize_frame.cols * 3};
sws_scale(sws_ctx, &p, linesize, 0, resize_frame.rows, dst_frame->data, dst_frame->linesize);
/* write to ff_des. */
m_ff_des->write(dst_frame);
/* general works in vp_des_node. */
return vp_des_node::handle_frame_meta(meta);
}
std::string vp_ff_des_node::to_string() {
return m_out_uri;
}
}
#endif

View File

@@ -0,0 +1,70 @@
#pragma once
#ifdef VP_WITH_FFMPEG
#include "ff_des.h"
#include "../vp_des_node.h"
namespace vp_nodes {
/**
* universal DES node using FFmpeg.
*
* support output uri:
* 1. path of file streams like `./vp_data/out_vp_test.mp4`.
* 2. url of network streams like `rtmp://192.168.77.68/live/stream`.
*/
class vp_ff_des_node final: public vp_des_node {
private:
/* inner members. */
std::string m_out_uri = "";
bool m_use_osd = true;
int m_out_bitrate = 1024;
std::string m_encoder_name = "";
vp_objects::vp_size m_resolution_w_h;
/**
* encode & enmux.
*/
ff_des_ptr m_ff_des = nullptr;
/**
* SwsContext used fot scale by FFmpeg.
*/
SwsContext* sws_ctx = NULL;
protected:
// re-implementation, return nullptr.
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
public:
/**
* create vp_ff_des_node instance using initial parameters.
*
* @param node_name specify the name of DES node.
* @param channel_index specify the channel index of DES node.
* @param out_uri specify the uri to be written.
* @param use_osd specify use osd frame as output or not.
* @param out_width specify the final width of output, 0 means use the width of frame flowing in pipeline.
* @param out_height specify the final height of output, 0 means use the height of frame flowing in pipeline.
* @param out_fps specify the fps of output, 0 means use the fps of original stream in pipeline.
* @param out_bitrate specify the bitrate of output (kbit/s).
* @param out_max_b_frames specify the max B frames in a GOP for encoding.
* @param encoder_name specify the encoder name (`libx264`/`libx265`/`h264_nvenc`/`hevc_nvenc`) used for encoding in FFmpeg.
* @param out_sw_pix_fmt specify the pixel format of output.
*
* @note
* the encoder specified by `encoder_name` MUST be supported already in FFmpeg,
* we can run `ffmpeg -encoders` to show list of encoders supported in FFmpeg.
* if the encoder not found, please reconfigure & rebuild your FFmpeg.
*/
vp_ff_des_node(const std::string& node_name,
int channel_index,
const std::string& out_uri,
vp_objects::vp_size resolution_w_h = {},
int bitrate = 1024,
bool osd = true,
std::string encoder_name = "libx264");
~vp_ff_des_node();
/**
* return out uri of DES node.
*/
virtual std::string to_string() override;
};
}
#endif

View File

@@ -0,0 +1,149 @@
#ifdef VP_WITH_FFMPEG
#include "vp_ff_src_node.h"
namespace vp_nodes {
vp_ff_src_node::
vp_ff_src_node(const std::string& node_name,
int channel_index,
const std::string& uri,
const std::string& decoder_name,
float resize_ratio,
int skip_interval):
vp_src_node(node_name, channel_index, resize_ratio),
m_uri(uri),
m_decoder_name(decoder_name),
m_skip_interval(skip_interval) {
assert(skip_interval >= 0 && skip_interval <= 9);
m_ff_src = alloc_ff_src(channel_index);
VP_INFO(vp_utils::string_format("[%s] [%s]", node_name.c_str(), uri.c_str()));
this->initialized();
}
vp_ff_src_node::~vp_ff_src_node() {
deinitialized();
}
void vp_ff_src_node::handle_run() {
SwsContext* sws_ctx = NULL;
auto free_sws_ctx = [&]() {
/* free swsContext. */
if (sws_ctx) {
sws_freeContext(sws_ctx);
sws_ctx = NULL;
}
};
auto reopen_wait = 10;
auto reopen_times = 0;
ff_av_frame_ptr src_frame;
int video_width = 0;
int video_height = 0;
int fps = 0;
int skip = 0;
while (alive) {
/* wait for data coming. */
gate.knock();
if (!m_ff_src->is_opened()) {
if(!m_ff_src->open(m_uri, m_decoder_name)) {
reopen_times++;
if (reopen_times < 5) {
VP_WARN(vp_utils::string_format("[%s] open uri failed, try again right now...", node_name.c_str()));
}
else {
VP_WARN(vp_utils::string_format("[%s] open uri failed too many times, wait for %d seconds then try again...", node_name.c_str(), reopen_wait));
std::this_thread::sleep_for(std::chrono::milliseconds(1000 * reopen_wait));
reopen_times = 0;
}
continue;
}
free_sws_ctx();
}
// video properties
if (video_width == 0 || video_height == 0 || fps == 0) {
video_width = m_ff_src->get_video_width();
video_height = m_ff_src->get_video_height();
fps = m_ff_src->get_video_fps();
original_fps = fps;
original_width = video_width;
original_height = video_height;
// set true fps because skip some frames
fps = fps / (m_skip_interval + 1);
}
// stream_info_hooker activated if need
vp_stream_info stream_info {channel_index, original_fps, original_width, original_height, to_string()};
invoke_stream_info_hooker(node_name, stream_info);
/* try to read next frame from ff_src. */
if (!m_ff_src->read(src_frame)) {
//VP_WARN(vp_utils::string_format("[%s] reading frame failed, total frame==>%d", node_name.c_str(), frame_index));
continue;
}
// need skip
if (skip < m_skip_interval) {
skip++;
continue;
}
skip = 0;
/* AVFrame -> cv::Mat. */
/* resize and convert to BGR24. */
auto n_width = int(src_frame->width * resize_ratio);
auto n_height = int(src_frame->height * resize_ratio);
if (!sws_ctx) {
sws_ctx = sws_getContext(src_frame->width,
src_frame->height,
AVPixelFormat(src_frame->format),
n_width, n_height,
AV_PIX_FMT_BGR24,
0, NULL, NULL, NULL);
}
if (!sws_ctx) {
VP_WARN(vp_utils::string_format("[%s] could not initialize sws_ctx.", node_name.c_str()));
continue;
}
auto buffer_size = n_width * n_height * 3;
uchar* bgr24 = new uchar[buffer_size];
int linesize[1] = {n_width * 3};
sws_scale(sws_ctx, src_frame->data, src_frame->linesize, 0, src_frame->height, &bgr24, linesize);
cv::Mat frame(n_height, n_width, CV_8UC3, bgr24);
auto c_frame = frame.clone();
delete[] bgr24;
// set true size because resize
video_width = c_frame.cols;
video_height = c_frame.rows;
this->frame_index++;
// create frame meta
auto out_meta =
std::make_shared<vp_objects::vp_frame_meta>(c_frame, this->frame_index, this->channel_index, video_width, video_height, fps);
if (out_meta != nullptr) {
this->out_queue.push(out_meta);
// handled hooker activated if need
if (this->meta_handled_hooker) {
meta_handled_hooker(node_name, out_queue.size(), out_meta);
}
// important! notify consumer of out_queue in case it is waiting.
this->out_queue_semaphore.signal();
VP_DEBUG(vp_utils::string_format("[%s] after handling meta, out_queue.size()==>%d", node_name.c_str(), out_queue.size()));
}
}
free_sws_ctx();
// send dead flag for dispatch_thread
this->out_queue.push(nullptr);
this->out_queue_semaphore.signal();
}
std::string vp_ff_src_node::to_string() {
return m_uri;
}
}
#endif

View File

@@ -0,0 +1,61 @@
#pragma once
#ifdef VP_WITH_FFMPEG
#include "ff_src.h"
#include "../vp_src_node.h"
namespace vp_nodes {
/**
* universal SRC node using FFmpeg.
*
* support uri:
* 1. path of file streams like `./vp_data/vp_test.mp4`.
* 2. url of network streams like `rtsp://192.168.77.68/main_stream`.
*/
class vp_ff_src_node final: public vp_src_node {
private:
/* inner members. */
std::string m_decoder_name = "";
std::string m_uri = "";
// 0 means no skip
int m_skip_interval = 0;
/**
* demux & decode.
*/
ff_src_ptr m_ff_src = nullptr;
protected:
/**
* get frames using FFmpeg.
*/
virtual void handle_run() override;
public:
/**
* create vp_ff_src_node instance using initial parameters.
*
* @param node_name specify the name of SRC node.
* @param channel_index specify the channel index of SRC node.
* @param uri specify the uri to be opened by SRC node.
* @param decoder_name specify the decoder name (`h264`/`hevc`/`h264_cuvid`/`hevc_cuvid`) used for decoding in FFmpeg.
* @param resize_ratio specify the resize ratio applied to frames.
*
* @note
* the decoder specified by `decoder_name` MUST be supported already in FFmpeg,
* we can run `ffmpeg -decoders` to show list of decoders supported in FFmpeg.
* if the decoder not found, please reconfigure & rebuild your FFmpeg.
*/
vp_ff_src_node(const std::string& node_name,
int channel_index,
const std::string& uri,
const std::string& decoder_name = "h264",
float resize_ratio = 1.0,
int skip_interval = 0);
~vp_ff_src_node();
/**
* return uri of SRC node.
*/
virtual std::string to_string() override;
};
}
#endif

5
nodes/infers/README.md Normal file
View File

@@ -0,0 +1,5 @@
# Summary
1. If the inference node works on the whole frame, let it derived from `vp_primary_infer_node` class.
2. If the inference node works on the small cropped images, let it derived from `vp_secondary_infer_node` class.
3. We can define multi derived classes to handle different types of dl models (`detector/pose estimation/classification`). Also, if they work on the same type of target AND with the same logic (for example, hava the same `preprocess/postprocess`), we can use a unique class to load different deep learning models (like resnet18 and resnet50).

View File

@@ -0,0 +1,99 @@
#include "vp_classifier_node.h"
namespace vp_nodes {
vp_classifier_node::vp_classifier_node(std::string node_name,
std::string model_path,
std::string model_config_path,
std::string labels_path,
int input_width,
int input_height,
int batch_size,
std::vector<int> p_class_ids_applied_to,
int min_width_applied_to,
int min_height_applied_to,
int crop_padding,
bool need_softmax,
float scale,
cv::Scalar mean,
cv::Scalar std,
bool swap_rb,
bool swap_chn):
vp_secondary_infer_node(node_name,
model_path,
model_config_path,
labels_path,
input_width,
input_height,
batch_size,
p_class_ids_applied_to,
min_width_applied_to,
min_height_applied_to,
crop_padding,
scale,
mean,
std,
swap_rb,
swap_chn), need_softmax(need_softmax){
this->initialized();
}
vp_classifier_node::~vp_classifier_node() {
deinitialized();
}
void vp_classifier_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
// make sure heads of output are not zero
assert(raw_outputs.size() > 0);
assert(frame_meta_with_batch.size() == 1);
// just one head of output
auto& output = raw_outputs[0];
assert(output.dims == 2);
auto count = output.rows;
auto index = 0;
auto& frame_meta = frame_meta_with_batch[0];
cv::Point class_id_point;
int class_id;
double score;
for (int i = 0; i < count; i++) {
for (int j = index; j < frame_meta->targets.size(); j++)
{
if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
// continue as its primary_class_id is not in p_class_ids_applied_to
continue;
}
auto prob = output.row(i);
cv::minMaxLoc(prob, 0, &score, 0, &class_id_point);
if (need_softmax) {
float maxProb = 0.0;
float sum = 0.0;
cv::Mat softmaxProb;
maxProb = *std::max_element(prob.begin<float>(), prob.end<float>());
cv::exp(prob - maxProb, softmaxProb);
sum = (float)cv::sum(softmaxProb)[0];
softmaxProb /= sum;
minMaxLoc(softmaxProb.reshape(1, 1), 0, &score, 0, &class_id_point);
}
class_id = class_id_point.x;
auto label = (labels.size() < class_id + 1) ? "" : labels[class_id];
// update back to frame meta
frame_meta->targets[j]->secondary_class_ids.push_back(class_id);
frame_meta->targets[j]->secondary_scores.push_back(score);
frame_meta->targets[j]->secondary_labels.push_back(label);
// break as we found the right target!
index = j + 1;
break;
}
}
}
}

View File

@@ -0,0 +1,38 @@
#pragma once
#include "../vp_secondary_infer_node.h"
namespace vp_nodes {
// common classifier for image classification task.
// used for image classification, update secondary_class_ids/secondary_labels/secondary_scores of vp_frame_target.
class vp_classifier_node: public vp_secondary_infer_node
{
private:
// softmax logic applied on output or not
bool need_softmax;
protected:
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_classifier_node(std::string node_name,
std::string model_path,
std::string model_config_path = "",
std::string labels_path = "",
int input_width = 128,
int input_height = 128,
int batch_size = 1,
std::vector<int> p_class_ids_applied_to = std::vector<int>(),
int min_width_applied_to = 0,
int min_height_applied_to = 0,
int crop_padding = 10,
bool need_softmax = true, // imagenet dataset
float scale = 1 / 255.0,
cv::Scalar mean = cv::Scalar(123.675, 116.28, 103.53), // imagenet dataset
cv::Scalar std = cv::Scalar(0.229, 0.224, 0.225),
bool swap_rb = true,
bool swap_chn = false);
~vp_classifier_node();
};
}

View File

@@ -0,0 +1,33 @@
#include "vp_enet_seg_node.h"
namespace vp_nodes {
vp_enet_seg_node::vp_enet_seg_node(std::string node_name,
std::string model_path,
std::string model_config_path,
std::string labels_path,
int input_width,
int input_height,
int batch_size,
int class_id_offset,
float scale,
cv::Scalar mean,
cv::Scalar std,
bool swap_rb):
vp_primary_infer_node(node_name, model_path, model_config_path, labels_path, input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb) {
this->initialized();
}
vp_enet_seg_node::~vp_enet_seg_node() {
deinitialized();
}
void vp_enet_seg_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
auto& frame_meta = frame_meta_with_batch[0];
auto& mask = raw_outputs[0];
// save output as mask directly this is the mask for the whole frame.
frame_meta->mask = mask.clone();
}
}

View File

@@ -0,0 +1,30 @@
#pragma once
#include "../vp_primary_infer_node.h"
namespace vp_nodes {
// semantic segmentation based on ENet
//
class vp_enet_seg_node: public vp_primary_infer_node
{
private:
/* data */
protected:
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_enet_seg_node(std::string node_name,
std::string model_path,
std::string model_config_path = "",
std::string labels_path = "",
int input_width = 1024,
int input_height = 512,
int batch_size = 1,
int class_id_offset = 0,
float scale = 1 / 255.0,
cv::Scalar mean = cv::Scalar(0),
cv::Scalar std = cv::Scalar(1),
bool swap_rb = true);
~vp_enet_seg_node();
};
}

View File

@@ -0,0 +1,528 @@
#include "vp_face_swap_node.h"
namespace vp_nodes {
vp_face_swap_node::vp_face_swap_node(std::string node_name,
std::string yunet_face_detect_model,
std::string buffalo_l_face_encoding_model,
std::string emap_file_for_embeddings,
std::string insightface_swap_model,
std::string swap_source_image,
int swap_source_face_index,
int min_face_w_h,
bool swap_on_osd,
bool act_as_primary_detector):
vp_primary_infer_node(node_name, ""),
act_as_primary_detector(act_as_primary_detector),
swap_on_osd(swap_on_osd),
min_face_w_h(min_face_w_h) {
/* init net*/
face_extract_net = cv::dnn::readNetFromONNX(yunet_face_detect_model);
face_encoding_net = cv::dnn::readNetFromONNX(buffalo_l_face_encoding_model);
face_swap_net = cv::dnn::readNetFromONNX(insightface_swap_model);
#ifdef VP_WITH_CUDA
face_extract_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
face_extract_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
face_encoding_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
face_encoding_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
face_swap_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
face_swap_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
#endif
init_source_face_embeddings(swap_source_image, swap_source_face_index, emap_file_for_embeddings);
this->initialized();
}
vp_face_swap_node::~vp_face_swap_node() {
deinitialized();
}
// please refer to vp_infer_node::run_infer_combinations
void vp_face_swap_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
auto& frame_meta = frame_meta_with_batch[0];
if (frame_meta->face_targets.size() == 0) {
if (!act_as_primary_detector) {
return;
}
// extract faces and update back to frame meta
// to-do
}
auto start_time = std::chrono::system_clock::now();
// iterate each face target
for (int i = 0; i < frame_meta->face_targets.size(); i++) {
if (frame_meta->face_targets[i]->width < min_face_w_h || frame_meta->face_targets[i]->height < min_face_w_h) {
continue;
}
cv::Mat aligned_face, swapped_face;
//auto t = std::chrono::system_clock::now();
// align and crop first
auto warp_mat = align_crop(frame_meta->frame, frame_meta->face_targets[i]->key_points, aligned_face);
//auto T = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - t).count();
//std::cout << "1st T:" << T << std::endl;
//t = std::chrono::system_clock::now();
// swap
swap(aligned_face, swapped_face);
//T = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - t).count();
//std::cout << "2nd T:" << T << std::endl;
//t = std::chrono::system_clock::now();
// past back to frame or osd frame
if (swap_on_osd) {
if (frame_meta->osd_frame.empty()) {
frame_meta->osd_frame = frame_meta->frame.clone();
}
}
auto& bg = swap_on_osd ? frame_meta->osd_frame : frame_meta->frame;
paste_back(bg, swapped_face, warp_mat);
//T = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - t).count();
//std::cout << "3rd T:" << T << std::endl;
}
//auto total_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time).count();
//std::cout << "total T:" << total_time << std::endl;
}
cv::Mat vp_face_swap_node::read_emap_mat_from_txt(const std::string& emap_file_for_embeddings, int rows, int cols) {
std::ifstream file(emap_file_for_embeddings);
cv::Mat matrix(rows, cols, CV_32F);
std::string line;
for (int i = 0; i < rows; ++i) {
std::getline(file, line);
std::istringstream iss(line);
for (int j = 0; j < cols; ++j) {
float value;
assert(iss >> value);
matrix.at<float>(i, j) = value;
}
}
file.close();
return matrix;
}
cv::Mat vp_face_swap_node::process_embeddings_using_emap(const cv::Mat& source_face_normed_embedding, const cv::Mat& emap) {
cv::Mat latent = source_face_normed_embedding.clone().reshape(1, 1);
cv::Mat result = latent * emap;
cv::normalize(result, result);
return result;
}
cv::Mat vp_face_swap_node::get_similarity_transform_matrix(float src[5][2]) {
using namespace cv;
//float dst[5][2] = { {38.2946f, 51.6963f}, {73.5318f, 51.5014f}, {56.0252f, 71.7366f}, {41.5493f, 92.3655f}, {70.7299f, 92.2041f} }; // for 112*112
float dst[5][2] = { {43.0f, 58.0f}, {85.0f, 58.0f}, {64.0f, 81.0f}, {47.0f, 105.0f}, {80.0f, 105.0f} }; // for 128*128
//float dst[5][2] = { {38.0f, 54.0f}, {90.0f, 54.0f}, {64.0f, 85.0f}, {47.0f, 109.0f}, {80.0f, 109.0f} }; // for 128*128, zoom out
float avg0 = (src[0][0] + src[1][0] + src[2][0] + src[3][0] + src[4][0]) / 5;
float avg1 = (src[0][1] + src[1][1] + src[2][1] + src[3][1] + src[4][1]) / 5;
//Compute mean of src and dst.
float src_mean[2] = { avg0, avg1 };
float dst_mean[2] = { 56.0262f, 71.9008f };
//Subtract mean from src and dst.
float src_demean[5][2];
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 5; j++)
{
src_demean[j][i] = src[j][i] - src_mean[i];
}
}
float dst_demean[5][2];
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 5; j++)
{
dst_demean[j][i] = dst[j][i] - dst_mean[i];
}
}
double A00 = 0.0, A01 = 0.0, A10 = 0.0, A11 = 0.0;
for (int i = 0; i < 5; i++)
A00 += dst_demean[i][0] * src_demean[i][0];
A00 = A00 / 5;
for (int i = 0; i < 5; i++)
A01 += dst_demean[i][0] * src_demean[i][1];
A01 = A01 / 5;
for (int i = 0; i < 5; i++)
A10 += dst_demean[i][1] * src_demean[i][0];
A10 = A10 / 5;
for (int i = 0; i < 5; i++)
A11 += dst_demean[i][1] * src_demean[i][1];
A11 = A11 / 5;
Mat A = (Mat_<double>(2, 2) << A00, A01, A10, A11);
double d[2] = { 1.0, 1.0 };
double detA = A00 * A11 - A01 * A10;
if (detA < 0)
d[1] = -1;
double T[3][3] = { {1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0} };
Mat s, u, vt, v;
SVD::compute(A, s, u, vt);
double smax = s.ptr<double>(0)[0]>s.ptr<double>(1)[0] ? s.ptr<double>(0)[0] : s.ptr<double>(1)[0];
double tol = smax * 2 * FLT_MIN;
int rank = 0;
if (s.ptr<double>(0)[0]>tol)
rank += 1;
if (s.ptr<double>(1)[0]>tol)
rank += 1;
double arr_u[2][2] = { {u.ptr<double>(0)[0], u.ptr<double>(0)[1]}, {u.ptr<double>(1)[0], u.ptr<double>(1)[1]} };
double arr_vt[2][2] = { {vt.ptr<double>(0)[0], vt.ptr<double>(0)[1]}, {vt.ptr<double>(1)[0], vt.ptr<double>(1)[1]} };
double det_u = arr_u[0][0] * arr_u[1][1] - arr_u[0][1] * arr_u[1][0];
double det_vt = arr_vt[0][0] * arr_vt[1][1] - arr_vt[0][1] * arr_vt[1][0];
if (rank == 1)
{
if ((det_u*det_vt) > 0)
{
Mat uvt = u*vt;
T[0][0] = uvt.ptr<double>(0)[0];
T[0][1] = uvt.ptr<double>(0)[1];
T[1][0] = uvt.ptr<double>(1)[0];
T[1][1] = uvt.ptr<double>(1)[1];
}
else
{
double temp = d[1];
d[1] = -1;
Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
Mat Dvt = D*vt;
Mat uDvt = u*Dvt;
T[0][0] = uDvt.ptr<double>(0)[0];
T[0][1] = uDvt.ptr<double>(0)[1];
T[1][0] = uDvt.ptr<double>(1)[0];
T[1][1] = uDvt.ptr<double>(1)[1];
d[1] = temp;
}
}
else
{
Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
Mat Dvt = D*vt;
Mat uDvt = u*Dvt;
T[0][0] = uDvt.ptr<double>(0)[0];
T[0][1] = uDvt.ptr<double>(0)[1];
T[1][0] = uDvt.ptr<double>(1)[0];
T[1][1] = uDvt.ptr<double>(1)[1];
}
double var1 = 0.0;
for (int i = 0; i < 5; i++)
var1 += src_demean[i][0] * src_demean[i][0];
var1 = var1 / 5;
double var2 = 0.0;
for (int i = 0; i < 5; i++)
var2 += src_demean[i][1] * src_demean[i][1];
var2 = var2 / 5;
double scale = 1.0 / (var1 + var2)* (s.ptr<double>(0)[0] * d[0] + s.ptr<double>(1)[0] * d[1]);
double TS[2];
TS[0] = T[0][0] * src_mean[0] + T[0][1] * src_mean[1];
TS[1] = T[1][0] * src_mean[0] + T[1][1] * src_mean[1];
T[0][2] = dst_mean[0] - scale*TS[0];
T[1][2] = dst_mean[1] - scale*TS[1];
T[0][0] *= scale;
T[0][1] *= scale;
T[1][0] *= scale;
T[1][1] *= scale;
Mat transform_mat = (Mat_<double>(2, 3) << T[0][0], T[0][1], T[0][2], T[1][0], T[1][1], T[1][2]);
return transform_mat;
}
cv::Mat vp_face_swap_node::align_crop(cv::Mat& src_img, std::vector<std::pair<int, int>>& kps, cv::Mat& aligned_image) {
float face_keypoints[5][2] =
{{kps[0].first, kps[0].second},
{kps[1].first, kps[1].second},
{kps[2].first, kps[2].second},
{kps[3].first, kps[3].second},
{kps[4].first, kps[4].second}};
cv::Mat warp_mat = get_similarity_transform_matrix(face_keypoints);
cv::warpAffine(src_img, aligned_image, warp_mat, cv::Size(128, 128), cv::INTER_LINEAR);
return warp_mat;
}
void vp_face_swap_node::extract_faces(const cv::Mat& frame, std::vector<face_box>& face_boxes) {
auto blob_to_infer = cv::dnn::blobFromImage(frame);
face_extract_net.setInput(blob_to_infer);
const std::vector<std::string> out_names = {"loc", "conf", "iou"};
std::vector<cv::Mat> raw_outputs;
face_extract_net.forward(raw_outputs, out_names);
using namespace cv;
float scoreThreshold = 0.7;
float nmsThreshold = 0.5;
int topK = 50;
int inputW = frame.cols;
int inputH = frame.rows;
// 3 heads of output
assert(raw_outputs.size() == 3);
// Extract from output_blobs
Mat loc = raw_outputs[0];
Mat conf = raw_outputs[1];
Mat iou = raw_outputs[2];
// we need generate priors if input size changed or priors is not initialized
if (loc.rows != priors.size()) {
generatePriors(inputW, inputH);
}
assert(loc.rows == priors.size());
assert(loc.rows == conf.rows);
assert(loc.rows == iou.rows);
// Decode from deltas and priors
const std::vector<float> variance = {0.1f, 0.2f};
float* loc_v = (float*)(loc.data);
float* conf_v = (float*)(conf.data);
float* iou_v = (float*)(iou.data);
Mat faces;
// (tl_x, tl_y, w, h, re_x, re_y, le_x, le_y, nt_x, nt_y, rcm_x, rcm_y, lcm_x, lcm_y, score)
// 'tl': top left point of the bounding box
// 're': right eye, 'le': left eye
// 'nt': nose tip
// 'rcm': right corner of mouth, 'lcm': left corner of mouth
Mat face(1, 15, CV_32FC1);
for (size_t i = 0; i < priors.size(); ++i) {
// Get score
float clsScore = conf_v[i*2+1];
float iouScore = iou_v[i];
// Clamp
if (iouScore < 0.f) {
iouScore = 0.f;
}
else if (iouScore > 1.f) {
iouScore = 1.f;
}
float score = std::sqrt(clsScore * iouScore);
face.at<float>(0, 14) = score;
// Get bounding box
float cx = (priors[i].x + loc_v[i*14+0] * variance[0] * priors[i].width) * inputW;
float cy = (priors[i].y + loc_v[i*14+1] * variance[0] * priors[i].height) * inputH;
float w = priors[i].width * exp(loc_v[i*14+2] * variance[0]) * inputW;
float h = priors[i].height * exp(loc_v[i*14+3] * variance[1]) * inputH;
float x1 = cx - w / 2;
float y1 = cy - h / 2;
face.at<float>(0, 0) = x1;
face.at<float>(0, 1) = y1;
face.at<float>(0, 2) = w;
face.at<float>(0, 3) = h;
// Get landmarks
face.at<float>(0, 4) = (priors[i].x + loc_v[i*14+ 4] * variance[0] * priors[i].width) * inputW; // right eye, x
face.at<float>(0, 5) = (priors[i].y + loc_v[i*14+ 5] * variance[0] * priors[i].height) * inputH; // right eye, y
face.at<float>(0, 6) = (priors[i].x + loc_v[i*14+ 6] * variance[0] * priors[i].width) * inputW; // left eye, x
face.at<float>(0, 7) = (priors[i].y + loc_v[i*14+ 7] * variance[0] * priors[i].height) * inputH; // left eye, y
face.at<float>(0, 8) = (priors[i].x + loc_v[i*14+ 8] * variance[0] * priors[i].width) * inputW; // nose tip, x
face.at<float>(0, 9) = (priors[i].y + loc_v[i*14+ 9] * variance[0] * priors[i].height) * inputH; // nose tip, y
face.at<float>(0, 10) = (priors[i].x + loc_v[i*14+10] * variance[0] * priors[i].width) * inputW; // right corner of mouth, x
face.at<float>(0, 11) = (priors[i].y + loc_v[i*14+11] * variance[0] * priors[i].height) * inputH; // right corner of mouth, y
face.at<float>(0, 12) = (priors[i].x + loc_v[i*14+12] * variance[0] * priors[i].width) * inputW; // left corner of mouth, x
face.at<float>(0, 13) = (priors[i].y + loc_v[i*14+13] * variance[0] * priors[i].height) * inputH; // left corner of mouth, y
faces.push_back(face);
}
if (faces.rows > 1) {
// Retrieve boxes and scores
std::vector<Rect2i> faceBoxes;
std::vector<float> faceScores;
for (int rIdx = 0; rIdx < faces.rows; rIdx++)
{
faceBoxes.push_back(Rect2i(int(faces.at<float>(rIdx, 0)),
int(faces.at<float>(rIdx, 1)),
int(faces.at<float>(rIdx, 2)),
int(faces.at<float>(rIdx, 3))));
faceScores.push_back(faces.at<float>(rIdx, 14));
}
std::vector<int> keepIdx;
dnn::NMSBoxes(faceBoxes, faceScores, scoreThreshold, nmsThreshold, keepIdx, 1.f, topK);
// Get NMS results
Mat nms_faces;
for (int idx: keepIdx)
{
nms_faces.push_back(faces.row(idx));
}
// insert face target back to frame meta
for (int i = 0; i < nms_faces.rows; i++) {
auto x = int(nms_faces.at<float>(i, 0));
auto y = int(nms_faces.at<float>(i, 1));
auto w = int(nms_faces.at<float>(i, 2));
auto h = int(nms_faces.at<float>(i, 3));
// check value range
x = std::max(x, 0);
y = std::max(y, 0);
w = std::min(w, frame.cols - x);
h = std::min(h, frame.rows - y);
auto kp1 = std::pair<int, int>(int(nms_faces.at<float>(i, 4)), int(nms_faces.at<float>(i, 5)));
auto kp2 = std::pair<int, int>(int(nms_faces.at<float>(i, 6)), int(nms_faces.at<float>(i, 7)));
auto kp3 = std::pair<int, int>(int(nms_faces.at<float>(i, 8)), int(nms_faces.at<float>(i, 9)));
auto kp4 = std::pair<int, int>(int(nms_faces.at<float>(i, 10)), int(nms_faces.at<float>(i, 11)));
auto kp5 = std::pair<int, int>(int(nms_faces.at<float>(i, 12)), int(nms_faces.at<float>(i, 13)));
auto score = nms_faces.at<float>(i, 14);
face_box face;
face.x = x;
face.y = y;
face.width = w;
face.height = h;
face.score = score;
face.kps = std::vector<std::pair<int, int>>{kp1, kp2, kp3, kp4, kp5};
face_boxes.push_back(face);
}
}
}
void vp_face_swap_node::init_source_face_embeddings(std::string& swap_source_image, int swap_source_face_index, std::string& emap_file_for_embeddings) {
std::vector<face_box> source_faces;
auto source_mat = cv::imread(swap_source_image);
// extract faces
extract_faces(source_mat, source_faces);
assert(source_faces.size() > 0);
// sort from left to right
std::sort(source_faces.begin(), source_faces.end(), [](face_box a, face_box b){ return a.x < b.x;});
auto the_selected_face = (swap_source_face_index < 0 || swap_source_face_index >= source_faces.size()) ? source_faces[0] : source_faces[swap_source_face_index];
cv::Mat aligned_face;
auto warp_mat = align_crop(source_mat, the_selected_face.kps, aligned_face);
// for debug
cv::imwrite("selected_source_face.jpg", aligned_face);
// read emap from file
auto emap = read_emap_mat_from_txt(emap_file_for_embeddings);
// encoding for the selected face (infer for only 1 time)
cv::Mat source_blob = cv::dnn::blobFromImage(aligned_face, 1 / 127.5, cv::Size(112, 112), (127.5, 127.5, 127.5), true);
std::vector<cv::Mat> source_outputs;
face_encoding_net.setInput(source_blob);
face_encoding_net.forward(source_outputs, face_encoding_net.getUnconnectedOutLayersNames());
auto& source_output = source_outputs[0];
// process using emap
source_face_embeddings = process_embeddings_using_emap(source_output, emap);
}
void vp_face_swap_node::swap(cv::Mat& aligned_face, cv::Mat& swapped_face) {
cv::Mat target_blob = cv::dnn::blobFromImage(aligned_face, 1 / 255.0, cv::Size(128, 128), (0, 0, 0), true);
std::vector<cv::Mat> target_outputs;
face_swap_net.setInput(target_blob, "target");
face_swap_net.setInput(source_face_embeddings, "source");
face_swap_net.forward(target_outputs, face_swap_net.getUnconnectedOutLayersNames());
auto& output = target_outputs[0];
cv::Mat output_channel_last;
cv::transposeND(output, {0,2,3,1}, output_channel_last);
cv::Mat img_fake(output_channel_last.size[1], output_channel_last.size[2], CV_32FC3, output_channel_last.data);
cv::cvtColor(img_fake, swapped_face, cv::COLOR_RGB2BGR);
}
void vp_face_swap_node::paste_back(cv::Mat& bg, cv::Mat& swapped_face, const cv::Mat& transform_matrix) {
bg.convertTo(bg, CV_32FC3, 1.0 / 255);
cv::Mat IM;
cv::invertAffineTransform(transform_matrix, IM);
cv::Mat img_mask(swapped_face.rows, swapped_face.cols, CV_32FC1, 1);
cv::warpAffine(swapped_face, swapped_face, IM, bg.size());
cv::warpAffine(img_mask, img_mask, IM, bg.size());
// create mask
cv::threshold(img_mask, img_mask, 0, 1, cv::THRESH_BINARY);
cv::Point min_loc, max_loc;
double min_val, max_val;
cv::minMaxLoc(img_mask, &min_val, &max_val, &min_loc, &max_loc);
int mask_h = max_loc.y - min_loc.y;
int mask_w = max_loc.x - min_loc.x;
int mask_size = std::sqrt(mask_h * mask_w);
int k = std::max(mask_size / 10, 10);
cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(k, k));
cv::erode(img_mask, img_mask, kernel);
k = std::max(mask_size / 20, 5);
cv::GaussianBlur(img_mask, img_mask, cv::Size(2 * k + 1, 2 * k + 1), 0);
cv::cvtColor(img_mask, img_mask, cv::COLOR_GRAY2BGR);
// merge swapped face and original background
cv::Mat ones(bg.rows, bg.cols, CV_32FC3, cv::Scalar(1, 1, 1));
bg = img_mask.mul(swapped_face) + (ones - img_mask).mul(bg);
bg.convertTo(bg, CV_8U, 255);
}
void vp_face_swap_node::generatePriors(int inputW, int inputH) {
using namespace cv;
// Calculate shapes of different scales according to the shape of input image
Size feature_map_2nd = {
int(int((inputW+1)/2)/2), int(int((inputH+1)/2)/2)
};
Size feature_map_3rd = {
int(feature_map_2nd.width/2), int(feature_map_2nd.height/2)
};
Size feature_map_4th = {
int(feature_map_3rd.width/2), int(feature_map_3rd.height/2)
};
Size feature_map_5th = {
int(feature_map_4th.width/2), int(feature_map_4th.height/2)
};
Size feature_map_6th = {
int(feature_map_5th.width/2), int(feature_map_5th.height/2)
};
std::vector<Size> feature_map_sizes;
feature_map_sizes.push_back(feature_map_3rd);
feature_map_sizes.push_back(feature_map_4th);
feature_map_sizes.push_back(feature_map_5th);
feature_map_sizes.push_back(feature_map_6th);
// Fixed params for generating priors
const std::vector<std::vector<float>> min_sizes = {
{10.0f, 16.0f, 24.0f},
{32.0f, 48.0f},
{64.0f, 96.0f},
{128.0f, 192.0f, 256.0f}
};
CV_Assert(min_sizes.size() == feature_map_sizes.size()); // just to keep vectors in sync
const std::vector<int> steps = { 8, 16, 32, 64 };
// Generate priors
priors.clear();
for (size_t i = 0; i < feature_map_sizes.size(); ++i)
{
Size feature_map_size = feature_map_sizes[i];
std::vector<float> min_size = min_sizes[i];
for (int _h = 0; _h < feature_map_size.height; ++_h)
{
for (int _w = 0; _w < feature_map_size.width; ++_w)
{
for (size_t j = 0; j < min_size.size(); ++j)
{
float s_kx = min_size[j] / inputW;
float s_ky = min_size[j] / inputH;
float cx = (_w + 0.5f) * steps[i] / inputW;
float cy = (_h + 0.5f) * steps[i] / inputH;
Rect2f prior = { cx, cy, s_kx, s_ky };
priors.push_back(prior);
}
}
}
}
}
void vp_face_swap_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}

View File

@@ -0,0 +1,65 @@
#pragma once
#include "../vp_primary_infer_node.h"
namespace vp_nodes {
// face swap node
// used to swap faces in videos/images using a specific face
class vp_face_swap_node: public vp_primary_infer_node
{
private:
// inner temporary use
struct face_box {
int x;
int y;
int width;
int height;
float score;
std::vector<std::pair<int, int>> kps;
};
/* onnx network using opencv::dnn as backend */
cv::dnn::Net face_extract_net;
cv::dnn::Net face_encoding_net;
cv::dnn::Net face_swap_net;
cv::Mat read_emap_mat_from_txt(const std::string& emap_file_for_embeddings, int rows = 512, int cols = 512);
cv::Mat process_embeddings_using_emap(const cv::Mat& source_face_normed_embedding, const cv::Mat& emap);
cv::Mat get_similarity_transform_matrix(float src[5][2]);
cv::Mat align_crop(cv::Mat& _src_img, std::vector<std::pair<int, int>>& kps, cv::Mat& _aligned_img);
void extract_faces(const cv::Mat& frame, std::vector<face_box>& faces);
void init_source_face_embeddings(std::string& swap_source_image, int swap_source_face_index, std::string& emap_file_for_embeddings);
void swap(cv::Mat& aligned_face, cv::Mat& swapped_face);
void paste_back(cv::Mat& bg, cv::Mat& swapped_face, const cv::Mat& transform_matrix);
cv::Mat source_face_embeddings;
bool act_as_primary_detector = false;
bool swap_on_osd = true;
int min_face_w_h = 50;
/* used for extract faces */
std::vector<cv::Rect2f> priors;
void generatePriors(int inputW, int inputH);
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_face_swap_node(std::string node_name,
std::string yunet_face_detect_model,
std::string buffalo_l_face_encoding_model,
std::string emap_file_for_embeddings,
std::string insightface_swap_model,
std::string swap_source_image,
int swap_source_face_index = 0,
int min_face_w_h = 50,
bool swap_on_osd = true,
bool act_as_primary_detector = false);
~vp_face_swap_node();
};
}

View File

@@ -0,0 +1,15 @@
#include "vp_feature_encoder_node.h"
namespace vp_nodes {
vp_feature_encoder_node::vp_feature_encoder_node(std::string node_name, std::string model_path):
vp_secondary_infer_node(node_name, model_path) {
this->initialized();
}
vp_feature_encoder_node::~vp_feature_encoder_node() {
deinitialized();
}
}

View File

@@ -0,0 +1,18 @@
#pragma once
#include "../vp_secondary_infer_node.h"
namespace vp_nodes {
// common feature encoder for image feature extraction.
// used for feature extraction, update embeddings of vp_frame_target.
class vp_feature_encoder_node: public vp_secondary_infer_node
{
private:
/* data */
public:
vp_feature_encoder_node(std::string node_name, std::string model_path);
~vp_feature_encoder_node();
};
}

View File

@@ -0,0 +1,34 @@
#include "vp_lane_detector_node.h"
namespace vp_nodes {
vp_lane_detector_node::vp_lane_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path,
std::string labels_path,
int input_width,
int input_height,
int batch_size,
int class_id_offset,
float scale,
cv::Scalar mean,
cv::Scalar std,
bool swap_rb,
bool swap_chn):
vp_primary_infer_node(node_name, model_path, model_config_path, labels_path, input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb, swap_chn) {
this->initialized();
}
vp_lane_detector_node::~vp_lane_detector_node() {
deinitialized();
}
void vp_lane_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
auto& frame_meta = frame_meta_with_batch[0];
auto& mask = raw_outputs[0];
// save output as mask directly this is the mask for the whole frame.
frame_meta->mask = mask.clone();
}
}

View File

@@ -0,0 +1,31 @@
#pragma once
#include "../vp_primary_infer_node.h"
namespace vp_nodes {
// lane detect based on CenterNet
//
class vp_lane_detector_node: public vp_primary_infer_node
{
private:
/* data */
protected:
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_lane_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path = "",
std::string labels_path = "",
int input_width = 736,
int input_height = 416,
int batch_size = 1,
int class_id_offset = 0,
float scale = 1,
cv::Scalar mean = cv::Scalar(0),
cv::Scalar std = cv::Scalar(1),
bool swap_rb = true,
bool swap_chn = true);
~vp_lane_detector_node();
};
}

View File

@@ -0,0 +1,89 @@
#include "vp_mask_rcnn_detector_node.h"
namespace vp_nodes {
vp_mask_rcnn_detector_node::vp_mask_rcnn_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path,
std::string labels_path,
int input_width,
int input_height,
int batch_size,
int class_id_offset,
float score_threshold,
float scale,
cv::Scalar mean,
cv::Scalar std,
bool swap_rb):
vp_primary_infer_node(node_name, model_path, model_config_path, labels_path, input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb),
score_threshold(score_threshold) {
this->initialized();
}
vp_mask_rcnn_detector_node::~vp_mask_rcnn_detector_node() {
deinitialized();
}
void vp_mask_rcnn_detector_node::infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) {
// blob_to_infer is a 4D matrix
// the first dim is number of batch, MUST be 1
assert(blob_to_infer.dims == 4);
assert(blob_to_infer.size[0] == 1);
assert(!net.empty());
net.setInput(blob_to_infer);
net.forward(raw_outputs, out_names);
}
void vp_mask_rcnn_detector_node::preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) {
// ignore preprocess logic in base class
cv::dnn::blobFromImages(mats_to_infer, blob_to_infer, 1.0, cv::Size(), cv::Scalar(), true, false);
}
void vp_mask_rcnn_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
auto& frame_meta = frame_meta_with_batch[0];
auto outDetections = raw_outputs[0];
auto outMasks = raw_outputs[1];
// Output size of masks is NxCxHxW where
// N - number of detected boxes
// C - number of classes (excluding background)
// HxW - segmentation shape
const int numDetections = outDetections.size[2];
const int numClasses = outMasks.size[1];
auto outDetections_ = outDetections.reshape(1, outDetections.total() / 7);
auto& frame = frame_meta->frame;
for (int i = 0; i < numDetections; ++i) {
float score = outDetections_.at<float>(i, 2);
if (score > score_threshold) {
// Extract the bounding box
int classId = static_cast<int>(outDetections_.at<float>(i, 1));
int left = static_cast<int>(frame.cols * outDetections_.at<float>(i, 3));
int top = static_cast<int>(frame.rows * outDetections_.at<float>(i, 4));
int right = static_cast<int>(frame.cols * outDetections_.at<float>(i, 5));
int bottom = static_cast<int>(frame.rows * outDetections_.at<float>(i, 6));
left = max(0, min(left, frame.cols - 1));
top = max(0, min(top, frame.rows - 1));
right = max(0, min(right, frame.cols - 1));
bottom = max(0, min(bottom, frame.rows - 1));
auto label = (labels.size() < classId + 1) ? "" : labels[classId];
// Extract the mask for the object
cv::Mat objectMask(outMasks.size[2], outMasks.size[3], CV_32F, outMasks.ptr<float>(i, classId));
classId += class_id_offset;
auto target = std::make_shared<vp_objects::vp_frame_target>(left, top, right - left, bottom - top, classId, score, frame_meta->frame_index, frame_meta->channel_index, label);
target->mask = objectMask;
frame_meta->targets.push_back(target);
}
}
}
}

View File

@@ -0,0 +1,41 @@
#pragma once
#include "../vp_primary_infer_node.h"
#include "../../objects/vp_frame_target.h"
namespace vp_nodes {
// image segmentation based on Mask RCNN
// https://github.com/matterport/Mask_RCNN
class vp_mask_rcnn_detector_node: public vp_primary_infer_node
{
private:
/* data */
// names of output layers in mask_rcnn
const std::vector<std::string> out_names = {"detection_out_final", "detection_masks"};
float score_threshold = 0.5;
protected:
// override infer and preprocess as mask_rcnn has a different logic
virtual void infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) override;
virtual void preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) override;
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_mask_rcnn_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path = "",
std::string labels_path = "",
int input_width = 416,
int input_height = 416,
int batch_size = 1,
int class_id_offset = 0,
float score_threshold = 0.5,
float scale = 1 / 255.0,
cv::Scalar mean = cv::Scalar(0),
cv::Scalar std = cv::Scalar(1),
bool swap_rb = true);
~vp_mask_rcnn_detector_node();
};
}

View File

@@ -0,0 +1,34 @@
#include "vp_mllm_analyser_node.h"
#ifdef VP_WITH_LLM
namespace vp_nodes {
vp_mllm_analyser_node::vp_mllm_analyser_node(std::string node_name,
std::string model_name,
std::string prompt,
std::string api_base_url,
std::string api_key,
llmlib::LLMBackendType backend_type):
vp_primary_infer_node(node_name, ""),
llm_model_name(model_name),
llm_prompt(prompt) {
cli = llmlib::LLMClient(api_base_url, api_key, backend_type);
this->initialized();
}
vp_mllm_analyser_node::~vp_mllm_analyser_node() {
deinitialized();
}
void vp_mllm_analyser_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
auto& frame_meta = frame_meta_with_batch[0];
auto output = cli.simple_chat(llm_model_name, llm_prompt, {frame_meta->frame}, {});
frame_meta->description = output;
}
void vp_mllm_analyser_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,32 @@
#pragma once
#ifdef VP_WITH_LLM
#include "../vp_primary_infer_node.h"
#include "../../third_party/cpp_llmlib/llmlib.hpp"
namespace vp_nodes {
// image(frame) analyser based on Multimodal Large Language Model
class vp_mllm_analyser_node: public vp_primary_infer_node
{
private:
/* data */
llmlib::LLMClient cli;
std::string llm_prompt;
std::string llm_model_name;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_mllm_analyser_node(std::string node_name,
std::string model_name,
std::string prompt,
std::string api_base_url,
std::string api_key = "",
llmlib::LLMBackendType backend_type = llmlib::LLMBackendType::Ollama);
~vp_mllm_analyser_node();
};
}
#endif

View File

@@ -0,0 +1,107 @@
#include <set>
#include "vp_openpose_detector_node.h"
namespace vp_nodes {
vp_openpose_detector_node::vp_openpose_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path,
std::string labels_path,
int input_width,
int input_height,
int batch_size,
int class_id_offset,
float score_threshold,
vp_objects::vp_pose_type type,
float scale,
cv::Scalar mean,
cv::Scalar std,
bool swap_rb):
vp_primary_infer_node(node_name,
model_path,
model_config_path,
labels_path,
input_width,
input_height,
batch_size,
class_id_offset,
scale, mean,
std,
swap_rb),
score_threshold(score_threshold), type(type) {
this->initialized();
}
vp_openpose_detector_node::~vp_openpose_detector_node() {
deinitialized();
}
void vp_openpose_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
// make sure heads of output are not zero
assert(raw_outputs.size() > 0);
// as it just one head of output
auto& netOutputBlobs = raw_outputs[0];
// batch, number of heatmaps, h, w
assert(netOutputBlobs.dims == 4);
auto start1 = std::chrono::system_clock::now();
int netOutputBlob_dims[3] = {netOutputBlobs.size[1], netOutputBlobs.size[2], netOutputBlobs.size[3]};
// scan batch
for (int b = 0; b < netOutputBlobs.size[0]; b++) {
auto& frame_meta = frame_meta_with_batch[b];
//auto h_scale = frame_meta->frame.rows * 1.0 / netOutputBlobs.size[2];
//auto w_scale = frame_meta->frame.cols * 1.0 / netOutputBlobs.size[3];
cv::Mat netOutputBlob = cv::Mat(3, netOutputBlob_dims, CV_32F, const_cast<uchar*>(netOutputBlobs.ptr(b)));
int keyPointId = 0;
std::vector<std::vector<KeyPoint>> detectedKeypoints;
std::vector<KeyPoint> keyPointsList;
auto start2 = std::chrono::system_clock::now();
std::vector<cv::Mat> netOutputParts;
splitNetOutputBlobToParts(netOutputBlob, cv::Size(frame_meta->frame.cols, frame_meta->frame.rows), netOutputParts);
auto start3 = std::chrono::system_clock::now();
for(int i = 0; i < points_map.at(type); ++i) {
std::vector<KeyPoint> keyPoints;
getKeyPoints(netOutputParts[i], score_threshold, keyPoints);
for(int i = 0; i< keyPoints.size(); ++i, ++keyPointId) {
keyPoints[i].id = keyPointId;
}
detectedKeypoints.push_back(keyPoints);
keyPointsList.insert(keyPointsList.end(), keyPoints.begin(), keyPoints.end());
}
std::vector<std::vector<ValidPair>> validPairs;
std::set<int> invalidPairs;
getValidPairs(netOutputParts, detectedKeypoints, validPairs, invalidPairs);
std::vector<std::vector<int>> personwiseKeypoints;
getPersonwiseKeypoints(validPairs, invalidPairs, personwiseKeypoints);
// insert pose targets back into frame meta
for(int n = 0; n < personwiseKeypoints.size(); ++n) {
std::vector<vp_objects::vp_pose_keypoint> kps;
for (int i = 0; i < personwiseKeypoints[n].size(); i++) {
auto index = personwiseKeypoints[n][i];
if (index != -1) {
auto& p = keyPointsList[index];
kps.push_back(vp_objects::vp_pose_keypoint {i, p.point.x, p.point.y, p.probability});
}
else {
// point not detected
kps.push_back(vp_objects::vp_pose_keypoint {i, -1, -1, 0});
}
}
auto pose_target = std::make_shared<vp_objects::vp_frame_pose_target>(type, kps);
frame_meta->pose_targets.push_back(pose_target);
}
}
}
}

View File

@@ -0,0 +1,293 @@
#pragma once
#include <set>
#include <map>
#include <cmath>
#include <opencv2/imgproc.hpp>
#include "../vp_primary_infer_node.h"
#include "../../objects/vp_frame_pose_target.h"
namespace vp_nodes {
// body keypoints detector using openpose
// https://github.com/CMU-Perceptual-Computing-Lab/openpose
class vp_openpose_detector_node: public vp_primary_infer_node
{
private:
float score_threshold;
// pose type (model type)
vp_objects::vp_pose_type type;
// map indexs for PAFs
const std::map<vp_objects::vp_pose_type, std::vector<std::pair<int,int>>> mapIdxes_map = {
{vp_objects::vp_pose_type::body_25, {{0,1}, {14,15}, {22,23}, {16,17}, {18,19}, {24,25}, {26,27}, {6,7}, {2,3}, {4,5}, {8,9}, {10,11}, {12,13}, {30,31}, {32,33}, {36,37}, {34,35},
{38,39}, {40,41} ,{42,43}, {44,45}, {46,47}, {48,49}, {50,51}}},
{vp_objects::vp_pose_type::coco, {{12,13}, {20,21}, {14,15}, {16,17}, {22,23}, {24,25}, {0,1},
{2,3}, {4,5}, {6,7}, {8,9}, {10,11}, {28,29}, {30,31}, {34,35}, {32,33}, {36,37}, {18,19}, {26,27}}},
{vp_objects::vp_pose_type::mpi_15, {{0,1}, {2,3}, {4,5}, {6,7}, {8,9}, {10,11}, {12,13}, {14,15}, {16,17}, {18,19}, {20,21}, {22,23}, {24,25}, {26,27}}},
{vp_objects::vp_pose_type::hand, std::vector<std::pair<int,int>>()},
{vp_objects::vp_pose_type::face, std::vector<std::pair<int,int>>()}
};
// pose pairs for PAFs
const std::map<vp_objects::vp_pose_type, std::vector<std::pair<int,int>>> posePairs_map = {
{vp_objects::vp_pose_type::body_25, {{1,8}, {1,2}, {1,5}, {2,3}, {3,4}, {5,6}, {6,7}, {8,9},
{9,10}, {10,11}, {8,12}, {12,13}, {13,14}, {1,0}, {0,15}, {15,17}, {0,16}, {16,18},
{14,19}, {19,20}, {14,21}, {11,22}, {22,23}, {11,24}}},
{vp_objects::vp_pose_type::coco, {{1,2}, {1,5}, {2,3}, {3,4}, {5,6}, {6,7},
{1,8}, {8,9}, {9,10}, {1,11}, {11,12}, {12,13},
{1,0}, {0,14}, {14,16}, {0,15}, {15,17}, {2,16},
{5,17}}},
{vp_objects::vp_pose_type::mpi_15, {{0,1}, {1,2}, {2,3}, {3,4}, {1,5}, {5,6}, {6,7}, {1,14}, {14,8}, {8,9}, {9,10}, {14,11}, {11,12}, {12,13}, {0, 2}, {0, 5}}},
{vp_objects::vp_pose_type::hand, std::vector<std::pair<int,int>>()},
{vp_objects::vp_pose_type::face, std::vector<std::pair<int,int>>()}
};
// points count for each type of pose model
const std::map<vp_objects::vp_pose_type, int> points_map = {
{vp_objects::vp_pose_type::body_25, 25},
{vp_objects::vp_pose_type::coco, 18},
{vp_objects::vp_pose_type::mpi_15, 15},
{vp_objects::vp_pose_type::hand, 0},
{vp_objects::vp_pose_type::face, 0}
};
// for postprocess purpose
struct ValidPair{
ValidPair(int aId,int bId,float score) {
this->aId = aId;
this->bId = bId;
this->score = score;
}
int aId;
int bId;
float score;
};
struct KeyPoint{
KeyPoint(cv::Point point, float probability) {
this->id = -1;
this->point = point;
this->probability = probability;
}
int id;
cv::Point point;
float probability;
};
// for postprocess purpose end
void getKeyPoints(cv::Mat& probMap, double threshold, std::vector<KeyPoint>& keyPoints) {
cv::Mat smoothProbMap;
cv::GaussianBlur(probMap, smoothProbMap, cv::Size( 3, 3 ), 0, 0);
cv::Mat maskedProbMap;
cv::threshold(smoothProbMap, maskedProbMap, threshold, 255, cv::THRESH_BINARY);
maskedProbMap.convertTo(maskedProbMap, CV_8U, 1);
std::vector<std::vector<cv::Point> > contours;
cv::findContours(maskedProbMap, contours, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE);
for(int i = 0; i < contours.size();++i) {
cv::Mat blobMask = cv::Mat::zeros(smoothProbMap.rows, smoothProbMap.cols, smoothProbMap.type());
cv::fillConvexPoly(blobMask, contours[i], cv::Scalar(1));
double maxVal;
cv::Point maxLoc;
cv::minMaxLoc(smoothProbMap.mul(blobMask), 0, &maxVal, 0, &maxLoc);
keyPoints.push_back(KeyPoint(maxLoc, probMap.at<float>(maxLoc.y, maxLoc.x)));
}
}
void splitNetOutputBlobToParts(cv::Mat& netOutputBlob, const cv::Size& targetSize, std::vector<cv::Mat>& netOutputParts) {
int nParts = netOutputBlob.size[0];
int h = netOutputBlob.size[1];
int w = netOutputBlob.size[2];
for(int i = 0; i< nParts; ++i) {
cv::Mat part(h, w, CV_32F, netOutputBlob.ptr(i));
cv::Mat resizedPart;
cv::resize(part, resizedPart, targetSize);
netOutputParts.push_back(resizedPart);
}
}
void populateInterpPoints(const cv::Point& a,const cv::Point& b,int numPoints,std::vector<cv::Point>& interpCoords){
float xStep = ((float)(b.x - a.x)) / (float)(numPoints - 1);
float yStep = ((float)(b.y - a.y)) / (float)(numPoints - 1);
interpCoords.push_back(a);
for(int i = 1; i< numPoints - 1; ++i) {
interpCoords.push_back(cv::Point(a.x + xStep * i, a.y + yStep * i));
}
interpCoords.push_back(b);
}
void getValidPairs(const std::vector<cv::Mat>& netOutputParts,
const std::vector<std::vector<KeyPoint>>& detectedKeypoints,
std::vector<std::vector<ValidPair>>& validPairs,
std::set<int>& invalidPairs) {
int nInterpSamples = 10;
float confTh = 0.7;
for(int k = 0; k < mapIdxes_map.at(type).size(); ++k) {
//A->B constitute a limb
cv::Mat pafA = netOutputParts[mapIdxes_map.at(type)[k].first + points_map.at(type) + 1];
cv::Mat pafB = netOutputParts[mapIdxes_map.at(type)[k].second + points_map.at(type) + 1];
//Find the keypoints for the first and second limb
const std::vector<KeyPoint>& candA = detectedKeypoints[posePairs_map.at(type)[k].first];
const std::vector<KeyPoint>& candB = detectedKeypoints[posePairs_map.at(type)[k].second];
int nA = candA.size();
int nB = candB.size();
/*
# If keypoints for the joint-pair is detected
# check every joint in candA with every joint in candB
# Calculate the distance vector between the two joints
# Find the PAF values at a set of interpolated points between the joints
# Use the above formula to compute a score to mark the connection valid
*/
if(nA != 0 && nB != 0){
std::vector<ValidPair> localValidPairs;
for(int i = 0; i< nA; ++i) {
int maxJ = -1;
float maxScore = -1;
bool found = false;
for(int j = 0; j < nB; ++j) {
std::pair<float, float> distance(candB[j].point.x - candA[i].point.x, candB[j].point.y - candA[i].point.y);
float norm = std::sqrt(distance.first * distance.first + distance.second * distance.second);
if(!norm) {
continue;
}
distance.first /= norm;
distance.second /= norm;
//Find p(u)
std::vector<cv::Point> interpCoords;
populateInterpPoints(candA[i].point, candB[j].point, nInterpSamples, interpCoords);
//Find L(p(u))
std::vector<std::pair<float, float>> pafInterp;
for(int l = 0; l < interpCoords.size();++l) {
pafInterp.push_back(
std::pair<float,float>(
pafA.at<float>(interpCoords[l].y, interpCoords[l].x),
pafB.at<float>(interpCoords[l].y, interpCoords[l].x)
));
}
std::vector<float> pafScores;
float sumOfPafScores = 0;
int numOverTh = 0;
for(int l = 0; l< pafInterp.size(); ++l){
float score = pafInterp[l].first * distance.first + pafInterp[l].second * distance.second;
sumOfPafScores += score;
if(score > score_threshold){
++numOverTh;
}
pafScores.push_back(score);
}
float avgPafScore = sumOfPafScores / ((float)pafInterp.size());
if(((float)numOverTh)/((float)nInterpSamples) > confTh){
if(avgPafScore > maxScore) {
maxJ = j;
maxScore = avgPafScore;
found = true;
}
}
}/* j */
if(found) {
localValidPairs.push_back(ValidPair(candA[i].id, candB[maxJ].id, maxScore));
}
}/* i */
validPairs.push_back(localValidPairs);
} else {
invalidPairs.insert(k);
validPairs.push_back(std::vector<ValidPair>());
}
}/* k */
}
void getPersonwiseKeypoints(const std::vector<std::vector<ValidPair>>& validPairs,
const std::set<int>& invalidPairs,
std::vector<std::vector<int>>& personwiseKeypoints) {
for(int k = 0; k < mapIdxes_map.at(type).size(); ++k) {
if(invalidPairs.find(k) != invalidPairs.end()) {
continue;
}
const std::vector<ValidPair>& localValidPairs(validPairs[k]);
int indexA(posePairs_map.at(type)[k].first);
int indexB(posePairs_map.at(type)[k].second);
for(int i = 0; i< localValidPairs.size(); ++i) {
bool found = false;
int personIdx = -1;
for(int j = 0; !found && j < personwiseKeypoints.size();++j) {
if(indexA < personwiseKeypoints[j].size() &&
personwiseKeypoints[j][indexA] == localValidPairs[i].aId) {
personIdx = j;
found = true;
}
}/* j */
if(found) {
personwiseKeypoints[personIdx].at(indexB) = localValidPairs[i].bId;
}
else if(k < points_map.at(type) - 1) {
std::vector<int> lpkp(std::vector<int>(points_map.at(type), -1));
lpkp.at(indexA) = localValidPairs[i].aId;
lpkp.at(indexB) = localValidPairs[i].bId;
personwiseKeypoints.push_back(lpkp);
}
}/* i */
}/* k */
}
protected:
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_openpose_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path = "",
std::string labels_path = "",
int input_width = 368,
int input_height = 368,
int batch_size = 1,
int class_id_offset = 0,
float score_threshold = 0.1,
vp_objects::vp_pose_type type = vp_objects::vp_pose_type::coco,
float scale = 1 / 255.0,
cv::Scalar mean = cv::Scalar(0),
cv::Scalar std = cv::Scalar(1),
bool swap_rb = false);
~vp_openpose_detector_node();
};
}

View File

@@ -0,0 +1,66 @@
#ifdef VP_WITH_PADDLE
#include "vp_ppocr_text_detector_node.h"
#include "../../objects/vp_frame_text_target.h"
namespace vp_nodes {
vp_ppocr_text_detector_node::vp_ppocr_text_detector_node(std::string node_name,
std::string det_model_dir,
std::string cls_model_dir,
std::string rec_model_dir,
std::string rec_char_dict_path):
vp_primary_infer_node(node_name, "") {
// to make the code simpler, paddle_ocr has no more config other than model path
// we need modify source code at ../../third_party/paddle_ocr/ if we need tune the parameters
ocr = std::make_shared<PaddleOCR::PPOCR>(det_model_dir, cls_model_dir, rec_model_dir, rec_char_dict_path);
this->initialized();
}
vp_ppocr_text_detector_node::~vp_ppocr_text_detector_node() {
deinitialized();
}
void vp_ppocr_text_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
// please refer to vp_infer_node::run_infer_combinations
void vp_ppocr_text_detector_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
// call paddle ocr
auto ocr_results = ocr->ocr(mats_to_infer);
assert(ocr_results.size() == 1);
auto& ocr_result = ocr_results[0];
auto& frame_meta = frame_meta_with_batch[0];
// scan text detected in frame
for (int i = 0; i < ocr_result.size(); i++) {
/* code */
auto& text = ocr_result[i];
std::vector<std::pair<int, int>> region_vertexes = {{text.box[0][0], text.box[0][1]},
{text.box[1][0], text.box[1][1]},
{text.box[2][0], text.box[2][1]},
{text.box[3][0], text.box[3][1]}};
// create text target and update back into frame meta
auto text_target = std::make_shared<vp_objects::vp_frame_text_target>(region_vertexes, text.text, text.score);
frame_meta->text_targets.push_back(text_target);
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
}
#endif

View File

@@ -0,0 +1,33 @@
#pragma once
#ifdef VP_WITH_PADDLE
#include "../vp_primary_infer_node.h"
#include "../../third_party/paddle_ocr/include/paddleocr.h"
namespace vp_nodes {
// ocr based on paddle ocr
// paddle ocr project(official): https://github.com/PaddlePaddle/PaddleOCR
// source code(modified based on official): ../../third_party/paddle_ocr
// note:
// this class is not based on opencv::dnn module but paddle, a few data members declared in base class are not usable any more(just ignore), such as vp_infer_node::net.
class vp_ppocr_text_detector_node: public vp_primary_infer_node
{
private:
// paddle ocr instance
std::shared_ptr<PaddleOCR::PPOCR> ocr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_ppocr_text_detector_node(std::string node_name,
std::string det_model_dir = "",
std::string cls_model_dir = "",
std::string rec_model_dir = "",
std::string rec_char_dict_path = "");
~vp_ppocr_text_detector_node();
};
}
#endif

View File

@@ -0,0 +1,52 @@
#include "vp_restoration_node.h"
namespace vp_nodes {
vp_restoration_node::vp_restoration_node(std::string node_name,
std::string realesrgan_bg_restoration_model,
std::string face_restoration_model,
bool restoration_to_osd):
vp_primary_infer_node(node_name, ""),
restoration_to_osd(restoration_to_osd) {
/* init net*/
restoration_net = cv::dnn::readNetFromONNX(realesrgan_bg_restoration_model);
/* to-do, load face restoration model*/
#ifdef VP_WITH_CUDA
//restoration_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
//restoration_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
#endif
this->initialized();
}
vp_restoration_node::~vp_restoration_node() {
deinitialized();
}
// please refer to vp_infer_node::run_infer_combinations
void vp_restoration_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
auto& frame_meta = frame_meta_with_batch[0];
// 4x larger for RealESRGAN_x4plus model
cv::Mat target_blob = cv::dnn::blobFromImage(frame_meta->frame, 1 / 255.0, cv::Size(frame_meta->frame.cols, frame_meta->frame.rows), (0, 0, 0), true);
std::vector<cv::Mat> target_outputs;
restoration_net.setInput(target_blob);
restoration_net.forward(target_outputs, restoration_net.getUnconnectedOutLayersNames());
// parse to image
auto& output = target_outputs[0];
cv::Mat output_channel_last;
cv::transposeND(output, {0,2,3,1}, output_channel_last);
cv::Mat img_result(output_channel_last.size[1], output_channel_last.size[2], CV_32FC3, output_channel_last.data);
img_result.convertTo(img_result, CV_8U, 255);
// update back to frame meta
auto& bg = restoration_to_osd ? frame_meta->osd_frame : frame_meta->frame;
cv::cvtColor(img_result, bg, cv::COLOR_RGB2BGR);
}
void vp_restoration_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}

View File

@@ -0,0 +1,27 @@
#pragma once
#include "../vp_primary_infer_node.h"
namespace vp_nodes {
// general image restoration node using Real-ESRGAN
// used to enhance quality of frames in video see more: https://github.com/xinntao/Real-ESRGAN
class vp_restoration_node: public vp_primary_infer_node
{
private:
/* onnx network using opencv::dnn as backend */
cv::dnn::Net restoration_net;
bool restoration_to_osd = true;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_restoration_node(std::string node_name,
std::string realesrgan_bg_restoration_model,
std::string face_restoration_model = "",
bool restoration_to_osd = true);
~vp_restoration_node();
};
}

View File

@@ -0,0 +1,183 @@
#include <algorithm>
#include <opencv2/imgproc.hpp>
#include "vp_sface_feature_encoder_node.h"
namespace vp_nodes {
vp_sface_feature_encoder_node::vp_sface_feature_encoder_node(std::string node_name, std::string model_path):
vp_secondary_infer_node(node_name, model_path,
"", "",
112, 112,
1, std::vector<int>(), 0, 0,
0, 1, cv::Scalar()) {
this->initialized();
}
vp_sface_feature_encoder_node::~vp_sface_feature_encoder_node() {
deinitialized();
}
void vp_sface_feature_encoder_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
// make sure heads of output are not zero
assert(raw_outputs.size() > 0);
assert(frame_meta_with_batch.size() == 1);
// just one head of output
auto& output = raw_outputs[0];
assert(output.dims == 2);
auto count = output.rows;
auto& frame_meta = frame_meta_with_batch[0];
// update feature data back into frame meta
for (int i = 0; i < count; i++) {
cv::Mat feature = output.row(i);
for (int j = 0; j < feature.cols; j++) {
frame_meta->face_targets[i]->embeddings.push_back(feature.at<float>(0, j));
}
}
}
// refer to vp_secondary_infer_node::prepare
void vp_sface_feature_encoder_node::prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) {
// only one by one for secondary infer node
assert(frame_meta_with_batch.size() == 1);
// align face and crop
auto& frame_meta = frame_meta_with_batch[0];
// batch by batch inside single frame
for (auto& i : frame_meta->face_targets) {
// align and crop
float face_keypoints[5][2] =
{{i->key_points[0].first, i->key_points[0].second},
{i->key_points[1].first, i->key_points[1].second},
{i->key_points[2].first, i->key_points[2].second},
{i->key_points[3].first, i->key_points[3].second},
{i->key_points[4].first, i->key_points[4].second}};
cv::Mat aligned_face;
alignCrop(frame_meta->frame, face_keypoints, aligned_face);
mats_to_infer.push_back(aligned_face);
}
}
cv::Mat vp_sface_feature_encoder_node::getSimilarityTransformMatrix(float src[5][2]) {
using namespace cv;
float dst[5][2] = { {38.2946f, 51.6963f}, {73.5318f, 51.5014f}, {56.0252f, 71.7366f}, {41.5493f, 92.3655f}, {70.7299f, 92.2041f} };
float avg0 = (src[0][0] + src[1][0] + src[2][0] + src[3][0] + src[4][0]) / 5;
float avg1 = (src[0][1] + src[1][1] + src[2][1] + src[3][1] + src[4][1]) / 5;
//Compute mean of src and dst.
float src_mean[2] = { avg0, avg1 };
float dst_mean[2] = { 56.0262f, 71.9008f };
//Subtract mean from src and dst.
float src_demean[5][2];
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 5; j++)
{
src_demean[j][i] = src[j][i] - src_mean[i];
}
}
float dst_demean[5][2];
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 5; j++)
{
dst_demean[j][i] = dst[j][i] - dst_mean[i];
}
}
double A00 = 0.0, A01 = 0.0, A10 = 0.0, A11 = 0.0;
for (int i = 0; i < 5; i++)
A00 += dst_demean[i][0] * src_demean[i][0];
A00 = A00 / 5;
for (int i = 0; i < 5; i++)
A01 += dst_demean[i][0] * src_demean[i][1];
A01 = A01 / 5;
for (int i = 0; i < 5; i++)
A10 += dst_demean[i][1] * src_demean[i][0];
A10 = A10 / 5;
for (int i = 0; i < 5; i++)
A11 += dst_demean[i][1] * src_demean[i][1];
A11 = A11 / 5;
Mat A = (Mat_<double>(2, 2) << A00, A01, A10, A11);
double d[2] = { 1.0, 1.0 };
double detA = A00 * A11 - A01 * A10;
if (detA < 0)
d[1] = -1;
double T[3][3] = { {1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0} };
Mat s, u, vt, v;
SVD::compute(A, s, u, vt);
double smax = s.ptr<double>(0)[0]>s.ptr<double>(1)[0] ? s.ptr<double>(0)[0] : s.ptr<double>(1)[0];
double tol = smax * 2 * FLT_MIN;
int rank = 0;
if (s.ptr<double>(0)[0]>tol)
rank += 1;
if (s.ptr<double>(1)[0]>tol)
rank += 1;
double arr_u[2][2] = { {u.ptr<double>(0)[0], u.ptr<double>(0)[1]}, {u.ptr<double>(1)[0], u.ptr<double>(1)[1]} };
double arr_vt[2][2] = { {vt.ptr<double>(0)[0], vt.ptr<double>(0)[1]}, {vt.ptr<double>(1)[0], vt.ptr<double>(1)[1]} };
double det_u = arr_u[0][0] * arr_u[1][1] - arr_u[0][1] * arr_u[1][0];
double det_vt = arr_vt[0][0] * arr_vt[1][1] - arr_vt[0][1] * arr_vt[1][0];
if (rank == 1)
{
if ((det_u*det_vt) > 0)
{
Mat uvt = u*vt;
T[0][0] = uvt.ptr<double>(0)[0];
T[0][1] = uvt.ptr<double>(0)[1];
T[1][0] = uvt.ptr<double>(1)[0];
T[1][1] = uvt.ptr<double>(1)[1];
}
else
{
double temp = d[1];
d[1] = -1;
Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
Mat Dvt = D*vt;
Mat uDvt = u*Dvt;
T[0][0] = uDvt.ptr<double>(0)[0];
T[0][1] = uDvt.ptr<double>(0)[1];
T[1][0] = uDvt.ptr<double>(1)[0];
T[1][1] = uDvt.ptr<double>(1)[1];
d[1] = temp;
}
}
else
{
Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
Mat Dvt = D*vt;
Mat uDvt = u*Dvt;
T[0][0] = uDvt.ptr<double>(0)[0];
T[0][1] = uDvt.ptr<double>(0)[1];
T[1][0] = uDvt.ptr<double>(1)[0];
T[1][1] = uDvt.ptr<double>(1)[1];
}
double var1 = 0.0;
for (int i = 0; i < 5; i++)
var1 += src_demean[i][0] * src_demean[i][0];
var1 = var1 / 5;
double var2 = 0.0;
for (int i = 0; i < 5; i++)
var2 += src_demean[i][1] * src_demean[i][1];
var2 = var2 / 5;
double scale = 1.0 / (var1 + var2)* (s.ptr<double>(0)[0] * d[0] + s.ptr<double>(1)[0] * d[1]);
double TS[2];
TS[0] = T[0][0] * src_mean[0] + T[0][1] * src_mean[1];
TS[1] = T[1][0] * src_mean[0] + T[1][1] * src_mean[1];
T[0][2] = dst_mean[0] - scale*TS[0];
T[1][2] = dst_mean[1] - scale*TS[1];
T[0][0] *= scale;
T[0][1] *= scale;
T[1][0] *= scale;
T[1][1] *= scale;
Mat transform_mat = (Mat_<double>(2, 3) << T[0][0], T[0][1], T[0][2], T[1][0], T[1][1], T[1][2]);
return transform_mat;
}
void vp_sface_feature_encoder_node::alignCrop(cv::Mat& _src_img, float _src_point[5][2], cv::Mat& _aligned_img) {
cv::Mat warp_mat = getSimilarityTransformMatrix(_src_point);
cv::warpAffine(_src_img, _aligned_img, warp_mat, cv::Size(input_width, input_height), cv::INTER_LINEAR);
}
}

View File

@@ -0,0 +1,27 @@
#pragma once
#include "../vp_secondary_infer_node.h"
namespace vp_nodes {
// face feature encoder based on SFace, update embeddings of vp_frame_face_target
// https://github.com/opencv/opencv/blob/4.x/modules/objdetect/src/face_recognize.cpp
// https://github.com/zhongyy/SFace
class vp_sface_feature_encoder_node: public vp_secondary_infer_node
{
private:
// get transform matrix for aglin face
cv::Mat getSimilarityTransformMatrix(float src[5][2]);
// align and crop
void alignCrop(cv::Mat& _src_img, float _src_point[5][2], cv::Mat& _aligned_img);
protected:
// override prepare as sface has an additional logic for face align before preprocess
virtual void prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) override;
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_sface_feature_encoder_node(std::string node_name, std::string model_path);
~vp_sface_feature_encoder_node();
};
}

View File

@@ -0,0 +1,64 @@
#ifdef VP_WITH_TRT
#include "vp_trt_vehicle_color_classifier.h"
namespace vp_nodes {
vp_trt_vehicle_color_classifier::vp_trt_vehicle_color_classifier(std::string node_name,
std::string vehicle_color_cls_model_path,
std::vector<int> p_class_ids_applied_to,
int min_width_applied_to, int min_height_applied_to):
vp_secondary_infer_node(node_name, "", "", "", 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
vehicle_color_classifier = std::make_shared<trt_vehicle::VehicleColorClassifier>(vehicle_color_cls_model_path);
this->initialized();
}
vp_trt_vehicle_color_classifier::~vp_trt_vehicle_color_classifier() {
deinitialized();
}
void vp_trt_vehicle_color_classifier::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// infer using trt_vehicle library
start_time = std::chrono::system_clock::now();
std::vector<trt_vehicle::ObjCls> vehicles_color;
vehicle_color_classifier->classify(mats_to_infer, vehicles_color);
auto& frame_meta = frame_meta_with_batch[0];
auto index = 0;
for (int i = 0; i < vehicles_color.size(); i++) {
for (int j = index; j < frame_meta->targets.size(); j++) {
// need apply or not?
if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
// continue as its primary_class_id is not in p_class_ids_applied_to
continue;
}
// update back to frame meta
frame_meta->targets[j]->secondary_class_ids.push_back(vehicles_color[i].class_);
frame_meta->targets[j]->secondary_scores.push_back(vehicles_color[i].score);
frame_meta->targets[j]->secondary_labels.push_back(vehicles_color[i].label);
// break as we found the right target!
index = j + 1;
break;
}
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_vehicle_color_classifier::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,26 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_secondary_infer_node.h"
#include "../../third_party/trt_vehicle/models/vehicle_color_classifier.h"
namespace vp_nodes {
// vehicle color classifier based on tensorrt using trt_vehicle library
// update secondary_class_ids/secondary_labels/secondary_scores of vp_frame_target.
class vp_trt_vehicle_color_classifier: public vp_secondary_infer_node
{
private:
/* data */
std::shared_ptr<trt_vehicle::VehicleColorClassifier> vehicle_color_classifier = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_vehicle_color_classifier(std::string node_name, std::string vehicle_color_cls_model_path = "", std::vector<int> p_class_ids_applied_to = std::vector<int>(), int min_width_applied_to = 0, int min_height_applied_to = 0);
~vp_trt_vehicle_color_classifier();
};
}
#endif

View File

@@ -0,0 +1,64 @@
#ifdef VP_WITH_TRT
#include "vp_trt_vehicle_detector.h"
namespace vp_nodes {
vp_trt_vehicle_detector::vp_trt_vehicle_detector(std::string node_name,
std::string vehicle_det_model_path):
vp_primary_infer_node(node_name, "") {
vehicle_detector = std::make_shared<trt_vehicle::VehicleDetector>(vehicle_det_model_path);
this->initialized();
}
vp_trt_vehicle_detector::~vp_trt_vehicle_detector() {
deinitialized();
}
// please refer to vp_infer_node::run_infer_combinations
void vp_trt_vehicle_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
std::vector<std::vector<trt_vehicle::ObjBox>> vehicles;
vehicle_detector->detect(mats_to_infer, vehicles);
assert(vehicles.size() == 1);
auto& vehicle_list = vehicles[0];
auto& frame_meta = frame_meta_with_batch[0];
for (int i = 0; i < vehicle_list.size(); i++) {
auto& objbox = vehicle_list[i];
// check value range
objbox.x = std::max(objbox.x, 0);
objbox.y = std::max(objbox.y, 0);
objbox.width = std::min(objbox.width, frame_meta->frame.cols - objbox.x);
objbox.height = std::min(objbox.height, frame_meta->frame.rows - objbox.y);
if (objbox.width <= 0 || objbox.height <= 0) {
continue;
}
auto target = std::make_shared<vp_objects::vp_frame_target>(objbox.x, objbox.y, objbox.width, objbox.height,
objbox.class_, objbox.score, frame_meta->frame_index, frame_meta->channel_index, objbox.label);
// create target and update back into frame meta
frame_meta->targets.push_back(target);
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_vehicle_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,24 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_primary_infer_node.h"
#include "../../third_party/trt_vehicle/models/vehicle_detector.h"
namespace vp_nodes {
// vehicle detector based on tensorrt using trt_vehicle library
class vp_trt_vehicle_detector: public vp_primary_infer_node
{
private:
std::shared_ptr<trt_vehicle::VehicleDetector> vehicle_detector = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_vehicle_detector(std::string node_name, std::string vehicle_det_model_path = "");
~vp_trt_vehicle_detector();
};
}
#endif

View File

@@ -0,0 +1,65 @@
#ifdef VP_WITH_TRT
#include "vp_trt_vehicle_feature_encoder.h"
namespace vp_nodes {
vp_trt_vehicle_feature_encoder::vp_trt_vehicle_feature_encoder(std::string node_name,
std::string vehicle_feature_model_path,
std::vector<int> p_class_ids_applied_to,
int min_width_applied_to, int min_height_applied_to):
vp_secondary_infer_node(node_name, "", "", "", 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
vehicle_feature_encoder = std::make_shared<trt_vehicle::VehicleFeatureEncoder>(vehicle_feature_model_path);
this->initialized();
}
vp_trt_vehicle_feature_encoder::~vp_trt_vehicle_feature_encoder() {
deinitialized();
}
void vp_trt_vehicle_feature_encoder::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// infer using trt_vehicle library
start_time = std::chrono::system_clock::now();
std::vector<std::vector<float>> vehicles_feature;
vehicle_feature_encoder->encode(mats_to_infer, vehicles_feature);
auto& frame_meta = frame_meta_with_batch[0];
auto index = 0;
for (int i = 0; i < vehicles_feature.size(); i++) {
for (int j = index; j < frame_meta->targets.size(); j++) {
// need apply or not?
if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
// continue as its primary_class_id is not in p_class_ids_applied_to
continue;
}
// update back to frame meta
auto& features = vehicles_feature[i];
for(int k = 0; k < features.size(); k++) {
frame_meta->targets[j]->embeddings.push_back(features[k]);
}
// break as we found the right target!
index = j + 1;
break;
}
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_vehicle_feature_encoder::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,26 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_secondary_infer_node.h"
#include "../../third_party/trt_vehicle/models/vehicle_feature_encoder.h"
namespace vp_nodes {
// vehicle feature encoder based on tensorrt using trt_vehicle library
// update embeddings of vp_frame_target
class vp_trt_vehicle_feature_encoder: public vp_secondary_infer_node
{
private:
/* data */
std::shared_ptr<trt_vehicle::VehicleFeatureEncoder> vehicle_feature_encoder = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_vehicle_feature_encoder(std::string node_name, std::string vehicle_feature_model_path = "", std::vector<int> p_class_ids_applied_to = std::vector<int>(), int min_width_applied_to = 0, int min_height_applied_to = 0);
~vp_trt_vehicle_feature_encoder();
};
}
#endif

View File

@@ -0,0 +1,71 @@
#ifdef VP_WITH_TRT
#include "vp_trt_vehicle_plate_detector.h"
namespace vp_nodes {
vp_trt_vehicle_plate_detector::vp_trt_vehicle_plate_detector(std::string node_name,
std::string plate_det_model_path,
std::string char_rec_model_path):
vp_secondary_infer_node(node_name, "") {
plate_detector = std::make_shared<trt_vehicle::VehiclePlateDetector>(plate_det_model_path, char_rec_model_path);
this->initialized();
}
vp_trt_vehicle_plate_detector::~vp_trt_vehicle_plate_detector() {
deinitialized();
}
void vp_trt_vehicle_plate_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
// please refer to vp_infer_node::run_infer_combinations
void vp_trt_vehicle_plate_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
// here we detect vehicle plate for all target(like car/bus) in frame
std::vector<std::vector<trt_vehicle::Plate>> plates;
plate_detector->detect(mats_to_infer, plates, true); // detect at most 1 plate for each image
auto& frame_meta = frame_meta_with_batch[0];
for (int i = 0; i < plates.size(); i++) {
for (int j = 0; j < plates[i].size(); j++) {
auto& plate = plates[i][j];
// only plate detection but no recognition result
if (plate.text.empty()) {
continue;
}
// check value range
auto x = std::max(0, plate.x + frame_meta->targets[i]->x - 10); // offset
auto y = std::max(0, plate.y + frame_meta->targets[i]->y - 8); // offset
auto w = std::min(plate.width, frame_meta->frame.cols - x);
auto h = std::min(plate.height, frame_meta->frame.rows - y);
if (w <= 0 || h <=0) {
continue;
}
// create sub target and update back into frame meta
// we treat vehicle plate as sub target of those in vp_frame_meta.targets
auto sub_target = std::make_shared<vp_objects::vp_sub_target>(x, y, w, h,
-1, 0, plate.color + "_" + plate.text, frame_meta->frame_index, frame_meta->channel_index);
frame_meta->targets[i]->sub_targets.push_back(sub_target);
}
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
}
#endif

View File

@@ -0,0 +1,29 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_secondary_infer_node.h"
#include "../../objects/vp_sub_target.h"
#include "../../third_party/trt_vehicle/models/vehicle_plate_detector.h"
namespace vp_nodes {
// vehicle plate detector based on tensorrt using trt_vehicle library
// source code: ../../third_party/trt_vehicle
// note: derived from vp_secondary_infer_node since it detects plates on small cropped images, which is different from vp_trt_vehicle_plate_detector_v2 class
// this class is not based on opencv::dnn module but tensorrt, a few data members declared in base class are not usable any more(just ignore), such as vp_infer_node::net.
class vp_trt_vehicle_plate_detector: public vp_secondary_infer_node
{
private:
/* data */
std::shared_ptr<trt_vehicle::VehiclePlateDetector> plate_detector = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_vehicle_plate_detector(std::string node_name, std::string plate_det_model_path = "", std::string char_rec_model_path = "");
~vp_trt_vehicle_plate_detector();
};
}
#endif

View File

@@ -0,0 +1,75 @@
#ifdef VP_WITH_TRT
#include "vp_trt_vehicle_plate_detector_v2.h"
namespace vp_nodes {
vp_trt_vehicle_plate_detector_v2::vp_trt_vehicle_plate_detector_v2(std::string node_name,
std::string plate_det_model_path,
std::string char_rec_model_path):
vp_primary_infer_node(node_name, "") {
plate_detector = std::make_shared<trt_vehicle::VehiclePlateDetector>(plate_det_model_path, char_rec_model_path);
this->initialized();
}
vp_trt_vehicle_plate_detector_v2::~vp_trt_vehicle_plate_detector_v2() {
deinitialized();
}
void vp_trt_vehicle_plate_detector_v2::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
// please refer to vp_infer_node::run_infer_combinations
void vp_trt_vehicle_plate_detector_v2::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
// here we detect vehicle plate on the whole big frame
std::vector<std::vector<trt_vehicle::Plate>> plates;
plate_detector->detect(mats_to_infer, plates);
for (int i = 0; i < plates.size(); i++) {
auto& frame_meta = frame_meta_with_batch[i];
for (int j = 0; j < plates[i].size(); j++) {
auto& plate = plates[i][j];
// only plate detection but no recognition result
if (plate.text.empty()) {
continue;
}
// check value range
auto x = std::max(0, plate.x);
auto y = std::max(0, plate.y);
auto w = std::min(plate.width, frame_meta->frame.cols - x);
auto h = std::min(plate.height, frame_meta->frame.rows - y);
if (w <= 0 || h <=0) {
continue;
}
// create target and update back into frame meta
/*
we treat vehicle plate as target in vp_frame_meta.targets,
1. x,y,width,height in vp_frame_target stand for location of plate in frame
2. primary_label in vp_frame_target stands for plate color and plate text, which combined with '_'
*/
auto target = std::make_shared<vp_objects::vp_frame_target>(x, y, w, h,
-1, 0, frame_meta->frame_index, frame_meta->channel_index, plate.color + "_" + plate.text);
frame_meta->targets.push_back(target);
VP_INFO("plate-color:" + plate.color + " plate-text:" + plate.text);
}
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
}
#endif

View File

@@ -0,0 +1,28 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_primary_infer_node.h"
#include "../../third_party/trt_vehicle/models/vehicle_plate_detector.h"
namespace vp_nodes {
// vehicle plate detector based on tensorrt using trt_vehicle library
// source code: ../../third_party/trt_vehicle
// note: derived from vp_primary_infer_node since it detects plates on the whole big frame, which is different from vp_trt_vehicle_plate_detector class
// this class is not based on opencv::dnn module but tensorrt, a few data members declared in base class are not usable any more(just ignore), such as vp_infer_node::net.
class vp_trt_vehicle_plate_detector_v2: public vp_primary_infer_node
{
private:
/* data */
std::shared_ptr<trt_vehicle::VehiclePlateDetector> plate_detector = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_vehicle_plate_detector_v2(std::string node_name, std::string plate_det_model_path = "", std::string char_rec_model_path = "");
~vp_trt_vehicle_plate_detector_v2();
};
}
#endif

View File

@@ -0,0 +1,64 @@
#ifdef VP_WITH_TRT
#include "vp_trt_vehicle_scanner.h"
namespace vp_nodes {
vp_trt_vehicle_scanner::vp_trt_vehicle_scanner(std::string node_name,
std::string vehicle_scan_model_path):
vp_primary_infer_node(node_name, "") {
vehicle_scanner = std::make_shared<trt_vehicle::VehicleScanner>(vehicle_scan_model_path);
this->initialized();
}
vp_trt_vehicle_scanner::~vp_trt_vehicle_scanner() {
deinitialized();
}
// please refer to vp_infer_node::run_infer_combinations
void vp_trt_vehicle_scanner::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
std::vector<std::vector<trt_vehicle::ObjBox>> vehicles;
vehicle_scanner->detect(mats_to_infer, vehicles);
assert(vehicles.size() == 1);
auto& vehicle_list = vehicles[0];
auto& frame_meta = frame_meta_with_batch[0];
for (int i = 0; i < vehicle_list.size(); i++) {
auto& objbox = vehicle_list[i];
// check value range
objbox.x = std::max(objbox.x, 0);
objbox.y = std::max(objbox.y, 0);
objbox.width = std::min(objbox.width, frame_meta->frame.cols - objbox.x);
objbox.height = std::min(objbox.height, frame_meta->frame.rows - objbox.y);
if (objbox.width <= 0 || objbox.height <= 0) {
continue;
}
auto target = std::make_shared<vp_objects::vp_frame_target>(objbox.x, objbox.y, objbox.width, objbox.height,
objbox.class_, objbox.score, frame_meta->frame_index, frame_meta->channel_index, objbox.label);
// create target and update back into frame meta
frame_meta->targets.push_back(target);
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_vehicle_scanner::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,24 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_primary_infer_node.h"
#include "../../third_party/trt_vehicle/models/vehicle_scanner.h"
namespace vp_nodes {
// vehicle scanner based on tensorrt using trt_vehicle library
class vp_trt_vehicle_scanner: public vp_primary_infer_node
{
private:
std::shared_ptr<trt_vehicle::VehicleScanner> vehicle_scanner = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_vehicle_scanner(std::string node_name, std::string vehicle_scan_model_path = "");
~vp_trt_vehicle_scanner();
};
}
#endif

View File

@@ -0,0 +1,64 @@
#ifdef VP_WITH_TRT
#include "vp_trt_vehicle_type_classifier.h"
namespace vp_nodes {
vp_trt_vehicle_type_classifier::vp_trt_vehicle_type_classifier(std::string node_name,
std::string vehicle_type_cls_model_path,
std::vector<int> p_class_ids_applied_to,
int min_width_applied_to, int min_height_applied_to):
vp_secondary_infer_node(node_name, "", "", "", 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
vehicle_type_classifier = std::make_shared<trt_vehicle::VehicleTypeClassifier>(vehicle_type_cls_model_path);
this->initialized();
}
vp_trt_vehicle_type_classifier::~vp_trt_vehicle_type_classifier() {
deinitialized();
}
void vp_trt_vehicle_type_classifier::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// infer using trt_vehicle library
start_time = std::chrono::system_clock::now();
std::vector<trt_vehicle::ObjCls> vehicles_type;
vehicle_type_classifier->classify(mats_to_infer, vehicles_type);
auto& frame_meta = frame_meta_with_batch[0];
auto index = 0;
for (int i = 0; i < vehicles_type.size(); i++) {
for (int j = index; j < frame_meta->targets.size(); j++) {
// need apply or not?
if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
// continue as its primary_class_id is not in p_class_ids_applied_to
continue;
}
// update back to frame meta
frame_meta->targets[j]->secondary_class_ids.push_back(vehicles_type[i].class_);
frame_meta->targets[j]->secondary_scores.push_back(vehicles_type[i].score);
frame_meta->targets[j]->secondary_labels.push_back(vehicles_type[i].label);
// break as we found the right target!
index = j + 1;
break;
}
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_vehicle_type_classifier::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,26 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_secondary_infer_node.h"
#include "../../third_party/trt_vehicle/models/vehicle_type_classifier.h"
namespace vp_nodes {
// vehicle type classifier based on tensorrt using trt_vehicle library
// update secondary_class_ids/secondary_labels/secondary_scores of vp_frame_target.
class vp_trt_vehicle_type_classifier: public vp_secondary_infer_node
{
private:
/* data */
std::shared_ptr<trt_vehicle::VehicleTypeClassifier> vehicle_type_classifier = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_vehicle_type_classifier(std::string node_name, std::string vehicle_type_cls_model_path = "", std::vector<int> p_class_ids_applied_to = std::vector<int>(), int min_width_applied_to = 0, int min_height_applied_to = 0);
~vp_trt_vehicle_type_classifier();
};
}
#endif

View File

@@ -0,0 +1,67 @@
#ifdef VP_WITH_TRT
#include "vp_trt_yolov8_classifier.h"
namespace vp_nodes {
vp_trt_yolov8_classifier::vp_trt_yolov8_classifier(std::string node_name,
std::string model_path,
std::string labels_path,
std::vector<int> p_class_ids_applied_to,
int min_width_applied_to, int min_height_applied_to):
vp_secondary_infer_node(node_name, "", "", labels_path, 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
yolov8_classifier = std::make_shared<trt_yolov8::trt_yolov8_classifier>(model_path);
this->initialized();
}
vp_trt_yolov8_classifier::~vp_trt_yolov8_classifier() {
deinitialized();
}
void vp_trt_yolov8_classifier::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// infer using trt_vehicle library
start_time = std::chrono::system_clock::now();
std::vector<std::vector<Classification>> classidications;
yolov8_classifier->classify(mats_to_infer, classidications);
auto& frame_meta = frame_meta_with_batch[0];
auto index = 0;
for (int i = 0; i < classidications.size(); i++) {
for (int j = index; j < frame_meta->targets.size(); j++) {
// need apply or not?
if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
// continue as its primary_class_id is not in p_class_ids_applied_to
continue;
}
// update back to frame meta
frame_meta->targets[j]->secondary_class_ids.push_back(classidications[i][0].class_id);
frame_meta->targets[j]->secondary_scores.push_back(classidications[i][0].conf);
auto label = labels.size() == 0 ? "" : labels[classidications[i][0].class_id];
frame_meta->targets[j]->secondary_labels.push_back(label);
// break as we found the right target!
index = j + 1;
break;
}
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_yolov8_classifier::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,30 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_secondary_infer_node.h"
#include "../../third_party/trt_yolov8/trt_yolov8_classifier.h"
namespace vp_nodes {
// universal yolov8 classifier based on tensorrt using third_party/trt_yolov8 library
class vp_trt_yolov8_classifier: public vp_secondary_infer_node
{
private:
/* data */
std::shared_ptr<trt_yolov8::trt_yolov8_classifier> yolov8_classifier = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_yolov8_classifier(std::string node_name,
std::string model_path,
std::string labels_path = "",
std::vector<int> p_class_ids_applied_to = std::vector<int>(),
int min_width_applied_to = 0,
int min_height_applied_to = 0);
~vp_trt_yolov8_classifier();
};
}
#endif

View File

@@ -0,0 +1,69 @@
#ifdef VP_WITH_TRT
#include "vp_trt_yolov8_detector.h"
namespace vp_nodes {
vp_trt_yolov8_detector::vp_trt_yolov8_detector(std::string node_name,
std::string model_path,
std::string labels_path):
vp_primary_infer_node(node_name, "", "", labels_path) {
yolov8_detector = std::make_shared<trt_yolov8::trt_yolov8_detector>(model_path);
this->initialized();
}
vp_trt_yolov8_detector::~vp_trt_yolov8_detector() {
deinitialized();
}
// please refer to vp_infer_node::run_infer_combinations
void vp_trt_yolov8_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
std::vector<std::vector<Detection>> detections;
yolov8_detector->detect(mats_to_infer, detections);
assert(detections.size() == 1);
auto& detection_list = detections[0];
auto& frame_meta = frame_meta_with_batch[0];
for (int i = 0; i < detection_list.size(); i++) {
auto& objbox = detection_list[i];
// objbox.bbox: center_x center_y width height
auto rect = get_rect(frame_meta->frame, objbox.bbox); // convert to: x, y, width,height
// check value range
int x = std::max(rect.x, 0);
int y = std::max(rect.y, 0);
int width = std::min(rect.width, frame_meta->frame.cols - x);
int height = std::min(rect.height, frame_meta->frame.rows - y);
if (width <= 0 || height <= 0) {
continue;
}
auto label = labels.size() == 0 ? "" : labels[objbox.class_id];
auto target = std::make_shared<vp_objects::vp_frame_target>(x, y, width, height,
objbox.class_id, objbox.conf, frame_meta->frame_index, frame_meta->channel_index, label);
// create target and update back into frame meta
frame_meta->targets.push_back(target);
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_yolov8_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,24 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_primary_infer_node.h"
#include "../../third_party/trt_yolov8/trt_yolov8_detector.h"
namespace vp_nodes {
// universal yolov8 detector based on tensorrt using third_party/trt_yolov8 library
class vp_trt_yolov8_detector: public vp_primary_infer_node
{
private:
std::shared_ptr<trt_yolov8::trt_yolov8_detector> yolov8_detector = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_yolov8_detector(std::string node_name, std::string model_path, std::string labels_path = "");
~vp_trt_yolov8_detector();
};
}
#endif

View File

@@ -0,0 +1,60 @@
#ifdef VP_WITH_TRT
#include "vp_trt_yolov8_pose_detector.h"
namespace vp_nodes {
vp_trt_yolov8_pose_detector::vp_trt_yolov8_pose_detector(std::string node_name,
std::string model_path):
vp_primary_infer_node(node_name, "") {
yolov8_pose_detector = std::make_shared<trt_yolov8::trt_yolov8_pose_detector>(model_path);
this->initialized();
}
vp_trt_yolov8_pose_detector::~vp_trt_yolov8_pose_detector() {
deinitialized();
}
// please refer to vp_infer_node::run_infer_combinations
void vp_trt_yolov8_pose_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
std::vector<std::vector<Detection>> detections;
yolov8_pose_detector->detect(mats_to_infer, detections);
assert(detections.size() == 1);
auto& detection_list = detections[0];
auto& frame_meta = frame_meta_with_batch[0];
for (int i = 0; i < detection_list.size(); i++) {
auto& objbox = detection_list[i];
auto rect = get_rect_adapt_landmark(frame_meta->frame, objbox.bbox, objbox.keypoints);
std::vector<vp_objects::vp_pose_keypoint> kps;
for (int j = 0; j < 51; j += 3) {
kps.push_back(vp_objects::vp_pose_keypoint {j, int(objbox.keypoints[j]), int(objbox.keypoints[j + 1]), objbox.keypoints[j + 2]});
}
auto pose_target = std::make_shared<vp_objects::vp_frame_pose_target>(vp_objects::vp_pose_type::yolov8_pose_17, kps);
frame_meta->pose_targets.push_back(pose_target);
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_yolov8_pose_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,24 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_primary_infer_node.h"
#include "../../third_party/trt_yolov8/trt_yolov8_pose_detector.h"
namespace vp_nodes {
// universal yolov8 pose detector based on tensorrt using third_party/trt_yolov8 library
class vp_trt_yolov8_pose_detector: public vp_primary_infer_node
{
private:
std::shared_ptr<trt_yolov8::trt_yolov8_pose_detector> yolov8_pose_detector = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_yolov8_pose_detector(std::string node_name, std::string model_path);
~vp_trt_yolov8_pose_detector();
};
}
#endif

View File

@@ -0,0 +1,78 @@
#ifdef VP_WITH_TRT
#include "vp_trt_yolov8_seg_detector.h"
namespace vp_nodes {
vp_trt_yolov8_seg_detector::vp_trt_yolov8_seg_detector(std::string node_name,
std::string model_path,
std::string labels_path):
vp_primary_infer_node(node_name, "", "", labels_path) {
yolov8_seg_detector = std::make_shared<trt_yolov8::trt_yolov8_seg_detector>(model_path);
this->initialized();
}
vp_trt_yolov8_seg_detector::~vp_trt_yolov8_seg_detector() {
deinitialized();
}
// please refer to vp_infer_node::run_infer_combinations
void vp_trt_yolov8_seg_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
assert(frame_meta_with_batch.size() == 1);
std::vector<cv::Mat> mats_to_infer;
// start
auto start_time = std::chrono::system_clock::now();
// prepare data, as same as base class
vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
start_time = std::chrono::system_clock::now();
std::vector<std::vector<Detection>> detections;
std::vector<std::vector<cv::Mat>> masks;
yolov8_seg_detector->detect(mats_to_infer, detections, masks);
assert(detections.size() == 1);
assert(masks.size() == 1);
auto& detection_list = detections[0];
auto& mask_list = masks[0];
assert(detection_list.size() == mask_list.size());
auto& frame_meta = frame_meta_with_batch[0];
for (int i = 0; i < detection_list.size(); i++) {
auto& objbox = detection_list[i];
auto& mask = mask_list[i];
auto scaled_mask = scale_mask(mask, frame_meta->frame);
// objbox.bbox: center_x center_y width height
auto rect = get_rect(frame_meta->frame, objbox.bbox); // convert to: x, y, width,height
// check value range
rect.x = std::max(rect.x, 0);
rect.y = std::max(rect.y, 0);
rect.width = std::min(rect.width, frame_meta->frame.cols - rect.x);
rect.height = std::min(rect.height, frame_meta->frame.rows - rect.y);
if (rect.width <= 0 || rect.height <= 0) {
continue;
}
auto label = labels.size() == 0 ? "" : labels[objbox.class_id];
auto target = std::make_shared<vp_objects::vp_frame_target>(rect.x, rect.y, rect.width, rect.height,
objbox.class_id, objbox.conf, frame_meta->frame_index, frame_meta->channel_index, label);
auto rect_mask = scaled_mask(rect);
target->mask = rect_mask;
// create target and update back into frame meta
frame_meta->targets.push_back(target);
}
auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
// can not calculate preprocess time and postprocess time, set 0 by default.
vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
}
void vp_trt_yolov8_seg_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
}
}
#endif

View File

@@ -0,0 +1,24 @@
#pragma once
#ifdef VP_WITH_TRT
#include "../vp_primary_infer_node.h"
#include "../../third_party/trt_yolov8/trt_yolov8_seg_detector.h"
namespace vp_nodes {
// universal yolov8 segmentation detector based on tensorrt using third_party/trt_yolov8 library
class vp_trt_yolov8_seg_detector: public vp_primary_infer_node
{
private:
std::shared_ptr<trt_yolov8::trt_yolov8_seg_detector> yolov8_seg_detector = nullptr;
protected:
// we need a totally new logic for the whole infer combinations
// no separate step pre-defined needed in base class
virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
// override pure virtual method, for compile pass
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_trt_yolov8_seg_detector(std::string node_name, std::string model_path, std::string labels_path = "");
~vp_trt_yolov8_seg_detector();
};
}
#endif

View File

@@ -0,0 +1,307 @@
#include <cmath>
#include <algorithm>
#include <numeric>
#include "vp_yolo5_seg_node.h"
namespace vp_nodes {
vp_yolo5_seg_node::vp_yolo5_seg_node(std::string node_name,
std::string model_path,
std::string labels_path,
int input_width,
int input_height,
int batch_size,
int class_id_offset,
float conf_threshold,
float iou_threshold):
vp_primary_infer_node(node_name, model_path, "", labels_path, input_width, input_height, batch_size, class_id_offset),
conf_threshold(conf_threshold),
iou_threshold(iou_threshold) {
this->initialized();
}
vp_yolo5_seg_node::~vp_yolo5_seg_node() {
deinitialized();
}
void vp_yolo5_seg_node::prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) {
for (auto& i: frame_meta_with_batch) {
auto letterboxed = letterbox(i->frame, cv::Size(input_width, input_height), scale_ratio, padding);
mats_to_infer.push_back(letterboxed);
}
}
void vp_yolo5_seg_node::infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) {
// blob_to_infer is a 4D matrix
// the first dim is number of batch, MUST be 1
assert(blob_to_infer.dims == 4);
assert(blob_to_infer.size[0] == 1);
assert(!net.empty());
net.setInput(blob_to_infer);
net.forward(raw_outputs, net.getUnconnectedOutLayersNames());
}
void vp_yolo5_seg_node::preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) {
// ignore preprocess logic in base class
cv::dnn::blobFromImages(mats_to_infer, blob_to_infer, 1.0 / 255.0, cv::Size(), cv::Scalar(), true, false);
}
void vp_yolo5_seg_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
auto& frame_meta = frame_meta_with_batch[0];
auto& frame = frame_meta->frame;
auto pred = raw_outputs[0]; // [1, N, 4+1+NUM_CLASSES+32]
auto proto = raw_outputs[1]; // [1, 32, 96, 160]
std::vector<cv::Rect> boxes;
std::vector<float> confidences;
std::vector<int> class_ids;
std::vector<cv::Mat> mask_coeffs;
int num_boxes = pred.size[1];
float* data = (float*)pred.data;
// decode
for (int i = 0; i < num_boxes; ++i) {
float obj_conf = data[4];
if (obj_conf < conf_threshold) {
data += (5 + labels.size() + MASK_CHANNELS);
continue;
}
int class_id = -1;
float max_cls_conf = 0;
for (int c = 0; c < labels.size(); ++c) {
float cls_conf = data[5 + c];
if (cls_conf > max_cls_conf) {
max_cls_conf = cls_conf;
class_id = c;
}
}
float total_conf = obj_conf * max_cls_conf;
if (total_conf < conf_threshold || class_id == -1) {
data += (5 + labels.size() + MASK_CHANNELS);
continue;
}
float cx = data[0];
float cy = data[1];
float w = data[2];
float h = data[3];
float x1 = cx - w * 0.5f;
float y1 = cy - h * 0.5f;
float x2 = cx + w * 0.5f;
float y2 = cy + h * 0.5f;
if (x2 <= x1 || y2 <= y1 || w <= 0 || h <= 0) {
data += (5 + labels.size() + MASK_CHANNELS);
continue;
}
float x_scale = scale_ratio.at<float>(0);
float y_scale = scale_ratio.at<float>(1);
int orig_x1 = cvRound((x1 - padding.x) * x_scale);
int orig_y1 = cvRound((y1 - padding.y) * y_scale);
int orig_x2 = cvRound((x2 - padding.x) * x_scale);
int orig_y2 = cvRound((y2 - padding.y) * y_scale);
orig_x1 = max(0, orig_x1);
orig_y1 = max(0, orig_y1);
orig_x2 = min(frame.cols, orig_x2);
orig_y2 = min(frame.rows, orig_y2);
if (orig_x2 <= orig_x1 || orig_y2 <= orig_y1) {
data += (5 + labels.size() + MASK_CHANNELS);
continue;
}
cv::Rect box(orig_x1, orig_y1, orig_x2 - orig_x1, orig_y2 - orig_y1);
boxes.push_back(box);
confidences.push_back(total_conf);
class_ids.push_back(class_id);
cv::Mat coeff(1, MASK_CHANNELS, CV_32F);
for (int m = 0; m < MASK_CHANNELS; ++m) {
coeff.at<float>(m) = data[5 + labels.size() + m];
}
mask_coeffs.push_back(coeff);
data += (5 + labels.size() + MASK_CHANNELS);
}
std::vector<int> final_indices;
std::unordered_map<int, std::vector<int>> class_boxes;
for (int i = 0; i < boxes.size(); ++i) {
class_boxes[class_ids[i]].push_back(i);
}
// NMS
for (auto& kv : class_boxes) {
const std::vector<int>& indices_in_class = kv.second;
std::vector<cv::Rect> boxes_cls;
std::vector<float> scores_cls;
for (int idx : indices_in_class) {
boxes_cls.push_back(boxes[idx]);
scores_cls.push_back(confidences[idx]);
}
std::vector<int> nms_indices = nms(boxes_cls, scores_cls, iou_threshold);
for (int local_idx : nms_indices) {
final_indices.push_back(indices_in_class[local_idx]);
}
}
// create target push back to frame meta
for (int idx : final_indices) {
cv::Rect box = boxes[idx];
float conf = confidences[idx];
int cls = class_ids[idx];
auto label = (labels.size() < cls + 1) ? "" : labels[cls];
cls += class_id_offset;
auto target = std::make_shared<vp_objects::vp_frame_target>(box.x, box.y, box.width, box.height, cls, conf, frame_meta->frame_index, frame_meta->channel_index, label);
// try to extract mask
auto local_mask = process_mask(proto, mask_coeffs[idx], box,
cv::Size(input_width, input_height),
padding, scale_ratio.at<float>(0), scale_ratio.at<float>(1));
target->mask = local_mask;
frame_meta->targets.push_back(target);
}
std::cout << "Detected: " << final_indices.size() << " instances" << std::endl;
}
cv::Mat vp_yolo5_seg_node::letterbox(const cv::Mat& src, cv::Size target_size, cv::Mat& scale_ratio, cv::Point& padding) {
float scale = min(static_cast<float>(target_size.height) / src.rows,
static_cast<float>(target_size.width) / src.cols);
int new_width = cvRound(src.cols * scale);
int new_height = cvRound(src.rows * scale);
cv::Mat resized;
resize(src, resized, cv::Size(new_width, new_height), 0, 0, cv::INTER_LINEAR);
cv::Mat letterboxed(target_size, CV_8UC3, cv::Scalar(114, 114, 114));
int top = (target_size.height - new_height) / 2;
int left = (target_size.width - new_width) / 2;
cv::Rect roi(left, top, new_width, new_height);
resized.copyTo(letterboxed(roi));
scale_ratio = cv::Mat(1, 2, CV_32F);
scale_ratio.at<float>(0) = 1.0f / scale; // x_scale
scale_ratio.at<float>(1) = 1.0f / scale; // y_scale
padding = cv::Point(left, top);
return letterboxed;
}
// Sigmoid
cv::Mat vp_yolo5_seg_node::sigmoid(const cv::Mat& x) {
cv::Mat result;
cv::exp(-x, result);
result = 1.0f / (1.0f + result);
return result;
}
cv::Mat vp_yolo5_seg_node::process_mask(const cv::Mat& proto, // [1, 32, H/4, W/4]
const cv::Mat& coeffs, // [1, 32]
const cv::Rect& bbox,
const cv::Size& input_size, // e.g., 640x384
const cv::Point& padding,
const float x_scale,
const float y_scale) {
// 1. 获取 proto 尺寸 [1,32,96,160] → h=96, w=160
int h_proto = proto.size[2];
int w_proto = proto.size[3];
// 2. 重塑为 [32, h, w]
cv::Mat proto_3d = proto.reshape(1, {MASK_CHANNELS, h_proto, w_proto});
// 3. 展平为 [32, h*w]
cv::Mat proto_flat = proto_3d.reshape(1, MASK_CHANNELS);
// 4. 矩阵乘: [1,32] × [32, h*w] → [1, h*w]
cv::Mat mask_raw = coeffs * proto_flat;
// 5. 重塑为 [h, w]
cv::Mat mask_2d = mask_raw.reshape(0, h_proto);
// 6. Sigmoid
cv::Mat mask_sigmoid = sigmoid(mask_2d);
// 7. 原图 bbox → 输入图坐标(含 padding
float x1_in = bbox.x / x_scale + padding.x;
float y1_in = bbox.y / y_scale + padding.y;
float x2_in = (bbox.x + bbox.width) / x_scale + padding.x;
float y2_in = (bbox.y + bbox.height) / y_scale + padding.y;
// 8. 输入图 → proto 坐标(/4
float x1_p = x1_in / 4.0f;
float y1_p = y1_in / 4.0f;
float x2_p = x2_in / 4.0f;
float y2_p = y2_in / 4.0f;
int px1 = max(0, (int)floor(x1_p));
int py1 = max(0, (int)floor(y1_p));
int px2 = min(w_proto, (int)ceil(x2_p));
int py2 = min(h_proto, (int)ceil(y2_p));
if (px2 <= px1 || py2 <= py1) {
return cv::Mat::zeros(bbox.size(), CV_8UC1);
}
// 9. 裁剪
cv::Mat cropped = mask_sigmoid(cv::Rect(px1, py1, px2 - px1, py2 - py1)).clone();
// 10. 上采样到 bbox 尺寸
cv::Mat resized;
resize(cropped, resized, bbox.size(), 0, 0, cv::INTER_LINEAR);
// 11. 二值化
cv::Mat mask_bin;
threshold(resized, mask_bin, 0.5, 255, cv::THRESH_BINARY);
mask_bin.convertTo(mask_bin, CV_8UC1);
return mask_bin;
}
// NMS
std::vector<int> vp_yolo5_seg_node::nms(const std::vector<cv::Rect>& boxes, const std::vector<float>& scores, float iou_threshold) {
std::vector<int> indices;
std::vector<float> areas(boxes.size());
for (size_t i = 0; i < boxes.size(); ++i) {
areas[i] = boxes[i].area();
}
std::vector<int> order(scores.size());
iota(order.begin(), order.end(), 0);
sort(order.begin(), order.end(), [&](int a, int b) {
return scores[a] > scores[b];
});
std::vector<bool> keep(scores.size(), true);
for (int i = 0; i < order.size(); ++i) {
int idx = order[i];
if (!keep[idx]) continue;
for (int j = i + 1; j < order.size(); ++j) {
int idx2 = order[j];
if (!keep[idx2]) continue;
cv::Rect intersect = boxes[idx] & boxes[idx2];
float iou = intersect.area() / (areas[idx] + areas[idx2] - intersect.area() + 1e-6f);
if (iou > iou_threshold) {
keep[idx2] = false;
}
}
}
for (int i = 0; i < keep.size(); ++i) {
if (keep[i]) indices.push_back(i);
}
return indices;
}
}

View File

@@ -0,0 +1,48 @@
#pragma once
#include "../vp_primary_infer_node.h"
#include "../../objects/vp_frame_target.h"
namespace vp_nodes {
// driving area segmentation based on yolov5s-seg-v7.0
// https://github.com/ultralytics/yolov5/releases/tag/v7.0
class vp_yolo5_seg_node: public vp_primary_infer_node
{
private:
/* data */
float conf_threshold = 0.5;
float iou_threshold = 0.7;
int MASK_CHANNELS = 32;
cv::Mat scale_ratio;
cv::Point padding;
cv::Mat letterbox(const cv::Mat& src, cv::Size target_size, cv::Mat& scale_ratio, cv::Point& padding);
cv::Mat sigmoid(const cv::Mat& x);
cv::Mat process_mask(const cv::Mat& proto, // [1, 32, H/4, W/4]
const cv::Mat& coeffs, // [1, 32]
const cv::Rect& bbox,
const cv::Size& input_size, // e.g., 640x384
const cv::Point& padding,
const float x_scale,
const float y_scale);
std::vector<int> nms(const std::vector<cv::Rect>& boxes, const std::vector<float>& scores, float iou_threshold);
protected:
virtual void infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) override;
virtual void preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) override;
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
virtual void prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) override;
public:
vp_yolo5_seg_node(std::string node_name,
std::string model_path,
std::string labels_path = "",
int input_width = 640,
int input_height = 384,
int batch_size = 1,
int class_id_offset = 0,
float conf_threshold = 0.5,
float iou_threshold = 0.7);
~vp_yolo5_seg_node();
};
}

View File

@@ -0,0 +1,118 @@
#include "vp_yolo_detector_node.h"
namespace vp_nodes {
vp_yolo_detector_node::vp_yolo_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path,
std::string labels_path,
int input_width,
int input_height,
int batch_size,
int class_id_offset,
float score_threshold,
float confidence_threshold,
float nms_threshold,
float scale,
cv::Scalar mean,
cv::Scalar std,
bool swap_rb):
vp_primary_infer_node(node_name, model_path, model_config_path,labels_path,input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb),
score_threshold(score_threshold),
confidence_threshold(confidence_threshold),
nms_threshold(nms_threshold) {
this->initialized();
}
vp_yolo_detector_node::~vp_yolo_detector_node() {
deinitialized();
}
void vp_yolo_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
// make sure heads of output are not zero
assert(raw_outputs.size() > 0);
// check dims of each output
auto dim_offset = raw_outputs[0].dims == 2 ? 0 : 1;
auto batch = dim_offset == 0 ? 1 : raw_outputs[0].size[0];
std::vector<int> class_ids;
std::vector<float> confidences;
std::vector<cv::Rect> boxes;
for (int k = 0; k < batch; k++) { // scan batch
auto& frame_meta = frame_meta_with_batch[k];
for (int i = 0; i < raw_outputs.size(); ++i) { // scan heads of output
cv::Mat output = cv::Mat(raw_outputs[i].size[0 + dim_offset], raw_outputs[i].size[1 + dim_offset], CV_32F, const_cast<uchar*>(raw_outputs[i].ptr(k)));
auto data = (float*)output.data;
for (int j = 0; j < output.rows; ++j, data += output.cols) {
float confidence = data[4];
// check confidence threshold
if (confidence < confidence_threshold) {
continue;
}
cv::Mat scores = output.row(j).colRange(5, output.cols);
cv::Point class_id;
double max_score;
// Get the value and location of the maximum score
cv::minMaxLoc(scores, 0, &max_score, 0, &class_id);
// check score threshold
if (max_score >= score_threshold) {
// smaller than 1.0 means relative value
// greater than 1.0 means absolute value which need to be converted to relative value
int centerX = (int)((data[0] <= 1 ? data[0] : data[0] / input_width) * frame_meta->frame.cols);
int centerY = (int)((data[1] <= 1 ? data[1] : data[1] / input_height) * frame_meta->frame.rows);
int width = (int)((data[2] <= 1 ? data[2] : data[2] / input_width) * frame_meta->frame.cols);
int height = (int)((data[3] <= 1 ? data[3] : data[3] / input_height) * frame_meta->frame.rows);
int left = centerX - width / 2;
int top = centerY - height / 2;
class_ids.push_back(class_id.x);
confidences.push_back((float)confidence);
boxes.push_back(cv::Rect(left, top, width, height));
}
}
}
// nms
std::vector<int> indices;
cv::dnn::NMSBoxes(boxes, confidences, confidence_threshold, nms_threshold, indices);
// create target
for (int i = 0; i < indices.size(); ++i) {
int idx = indices[i];
auto box = boxes[idx];
auto confidence = confidences[idx];
auto class_id = class_ids[idx];
auto label = (labels.size() < class_id + 1) ? "" : labels[class_id];
// check value range
box.x = std::max(box.x, 0);
box.y = std::max(box.y, 0);
box.width = std::min(box.width, frame_meta->frame.cols - box.x);
box.height = std::min(box.height, frame_meta->frame.rows - box.y);
if (box.width <= 0 || box.height <= 0) {
continue;
}
// apply offset to class id since multi detectors can exist at front of this one.
// later we MUST use the new class id (applied offset) instead of orignal one.
class_id += class_id_offset;
auto target = std::make_shared<vp_objects::vp_frame_target>(box.x, box.y, box.width, box.height, class_id, confidence, frame_meta->frame_index, frame_meta->channel_index, label);
// insert target back to frame meta
frame_meta->targets.push_back(target);
}
class_ids.clear();
confidences.clear();
boxes.clear();
}
}
}

View File

@@ -0,0 +1,35 @@
#pragma once
#include "../vp_primary_infer_node.h"
namespace vp_nodes {
// yolo detector, support yolov3/4/5
// https://github.com/pjreddie/darknet
class vp_yolo_detector_node: public vp_primary_infer_node
{
private:
float score_threshold;
float confidence_threshold;
float nms_threshold;
protected:
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_yolo_detector_node(std::string node_name,
std::string model_path,
std::string model_config_path = "",
std::string labels_path = "",
int input_width = 416,
int input_height = 416,
int batch_size = 1,
int class_id_offset = 0,
float score_threshold = 0.5,
float confidence_threshold = 0.5,
float nms_threshold = 0.5,
float scale = 1 / 255.0,
cv::Scalar mean = cv::Scalar(0),
cv::Scalar std = cv::Scalar(1),
bool swap_rb = true);
~vp_yolo_detector_node();
};
}

View File

@@ -0,0 +1,229 @@
#include "vp_yunet_face_detector_node.h"
namespace vp_nodes {
vp_yunet_face_detector_node::vp_yunet_face_detector_node(std::string node_name,
std::string model_path,
float score_threshold,
float nms_threshold,
int top_k):
vp_primary_infer_node(node_name, model_path),
scoreThreshold(score_threshold),
nmsThreshold(nms_threshold),
topK(top_k) {
this->initialized();
}
vp_yunet_face_detector_node::~vp_yunet_face_detector_node() {
deinitialized();
}
void vp_yunet_face_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
using namespace cv;
// 3 heads of output
assert(raw_outputs.size() == 3);
assert(frame_meta_with_batch.size() == 1);
auto& frame_meta = frame_meta_with_batch[0];
// Extract from output_blobs
Mat loc = raw_outputs[0];
Mat conf = raw_outputs[1];
Mat iou = raw_outputs[2];
// we need generate priors if input size changed or priors is not initialized
if (loc.rows != priors.size()) {
inputW = frame_meta->frame.cols;
inputH = frame_meta->frame.rows;
generatePriors();
}
assert(loc.rows == priors.size());
assert(loc.rows == conf.rows);
assert(loc.rows == iou.rows);
// Decode from deltas and priors
const std::vector<float> variance = {0.1f, 0.2f};
float* loc_v = (float*)(loc.data);
float* conf_v = (float*)(conf.data);
float* iou_v = (float*)(iou.data);
Mat faces;
// (tl_x, tl_y, w, h, re_x, re_y, le_x, le_y, nt_x, nt_y, rcm_x, rcm_y, lcm_x, lcm_y, score)
// 'tl': top left point of the bounding box
// 're': right eye, 'le': left eye
// 'nt': nose tip
// 'rcm': right corner of mouth, 'lcm': left corner of mouth
Mat face(1, 15, CV_32FC1);
for (size_t i = 0; i < priors.size(); ++i) {
// Get score
float clsScore = conf_v[i*2+1];
float iouScore = iou_v[i];
// Clamp
if (iouScore < 0.f) {
iouScore = 0.f;
}
else if (iouScore > 1.f) {
iouScore = 1.f;
}
float score = std::sqrt(clsScore * iouScore);
face.at<float>(0, 14) = score;
// Get bounding box
float cx = (priors[i].x + loc_v[i*14+0] * variance[0] * priors[i].width) * inputW;
float cy = (priors[i].y + loc_v[i*14+1] * variance[0] * priors[i].height) * inputH;
float w = priors[i].width * exp(loc_v[i*14+2] * variance[0]) * inputW;
float h = priors[i].height * exp(loc_v[i*14+3] * variance[1]) * inputH;
float x1 = cx - w / 2;
float y1 = cy - h / 2;
face.at<float>(0, 0) = x1;
face.at<float>(0, 1) = y1;
face.at<float>(0, 2) = w;
face.at<float>(0, 3) = h;
// Get landmarks
face.at<float>(0, 4) = (priors[i].x + loc_v[i*14+ 4] * variance[0] * priors[i].width) * inputW; // right eye, x
face.at<float>(0, 5) = (priors[i].y + loc_v[i*14+ 5] * variance[0] * priors[i].height) * inputH; // right eye, y
face.at<float>(0, 6) = (priors[i].x + loc_v[i*14+ 6] * variance[0] * priors[i].width) * inputW; // left eye, x
face.at<float>(0, 7) = (priors[i].y + loc_v[i*14+ 7] * variance[0] * priors[i].height) * inputH; // left eye, y
face.at<float>(0, 8) = (priors[i].x + loc_v[i*14+ 8] * variance[0] * priors[i].width) * inputW; // nose tip, x
face.at<float>(0, 9) = (priors[i].y + loc_v[i*14+ 9] * variance[0] * priors[i].height) * inputH; // nose tip, y
face.at<float>(0, 10) = (priors[i].x + loc_v[i*14+10] * variance[0] * priors[i].width) * inputW; // right corner of mouth, x
face.at<float>(0, 11) = (priors[i].y + loc_v[i*14+11] * variance[0] * priors[i].height) * inputH; // right corner of mouth, y
face.at<float>(0, 12) = (priors[i].x + loc_v[i*14+12] * variance[0] * priors[i].width) * inputW; // left corner of mouth, x
face.at<float>(0, 13) = (priors[i].y + loc_v[i*14+13] * variance[0] * priors[i].height) * inputH; // left corner of mouth, y
faces.push_back(face);
}
if (faces.rows > 1)
{
// Retrieve boxes and scores
std::vector<Rect2i> faceBoxes;
std::vector<float> faceScores;
for (int rIdx = 0; rIdx < faces.rows; rIdx++)
{
faceBoxes.push_back(Rect2i(int(faces.at<float>(rIdx, 0)),
int(faces.at<float>(rIdx, 1)),
int(faces.at<float>(rIdx, 2)),
int(faces.at<float>(rIdx, 3))));
faceScores.push_back(faces.at<float>(rIdx, 14));
}
std::vector<int> keepIdx;
dnn::NMSBoxes(faceBoxes, faceScores, scoreThreshold, nmsThreshold, keepIdx, 1.f, topK);
// Get NMS results
Mat nms_faces;
for (int idx: keepIdx)
{
nms_faces.push_back(faces.row(idx));
}
// insert face target back to frame meta
for (int i = 0; i < nms_faces.rows; i++) {
auto x = int(nms_faces.at<float>(i, 0));
auto y = int(nms_faces.at<float>(i, 1));
auto w = int(nms_faces.at<float>(i, 2));
auto h = int(nms_faces.at<float>(i, 3));
// check value range
x = std::max(x, 0);
y = std::max(y, 0);
w = std::min(w, frame_meta->frame.cols - x);
h = std::min(h, frame_meta->frame.rows - y);
auto kp1 = std::pair<int, int>(int(nms_faces.at<float>(i, 4)), int(nms_faces.at<float>(i, 5)));
auto kp2 = std::pair<int, int>(int(nms_faces.at<float>(i, 6)), int(nms_faces.at<float>(i, 7)));
auto kp3 = std::pair<int, int>(int(nms_faces.at<float>(i, 8)), int(nms_faces.at<float>(i, 9)));
auto kp4 = std::pair<int, int>(int(nms_faces.at<float>(i, 10)), int(nms_faces.at<float>(i, 11)));
auto kp5 = std::pair<int, int>(int(nms_faces.at<float>(i, 12)), int(nms_faces.at<float>(i, 13)));
auto score = nms_faces.at<float>(i, 14);
auto face_target = std::make_shared<vp_objects::vp_frame_face_target>(x, y, w, h, score, std::vector<std::pair<int, int>>{kp1, kp2, kp3, kp4, kp5});
frame_meta->face_targets.push_back(face_target);
}
}
}
// refer to vp_infer_node::preprocess
void vp_yunet_face_detector_node::preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) {
cv::dnn::blobFromImages(mats_to_infer, blob_to_infer);
}
// refer to vp_infer_node::infer
void vp_yunet_face_detector_node::infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) {
// blob_to_infer is a 4D matrix
// the first dim is number of batch, MUST be 1
assert(blob_to_infer.dims == 4);
assert(blob_to_infer.size[0] == 1);
assert(!net.empty());
net.setInput(blob_to_infer);
net.forward(raw_outputs, out_names);
}
void vp_yunet_face_detector_node::generatePriors() {
using namespace cv;
// Calculate shapes of different scales according to the shape of input image
Size feature_map_2nd = {
int(int((inputW+1)/2)/2), int(int((inputH+1)/2)/2)
};
Size feature_map_3rd = {
int(feature_map_2nd.width/2), int(feature_map_2nd.height/2)
};
Size feature_map_4th = {
int(feature_map_3rd.width/2), int(feature_map_3rd.height/2)
};
Size feature_map_5th = {
int(feature_map_4th.width/2), int(feature_map_4th.height/2)
};
Size feature_map_6th = {
int(feature_map_5th.width/2), int(feature_map_5th.height/2)
};
std::vector<Size> feature_map_sizes;
feature_map_sizes.push_back(feature_map_3rd);
feature_map_sizes.push_back(feature_map_4th);
feature_map_sizes.push_back(feature_map_5th);
feature_map_sizes.push_back(feature_map_6th);
// Fixed params for generating priors
const std::vector<std::vector<float>> min_sizes = {
{10.0f, 16.0f, 24.0f},
{32.0f, 48.0f},
{64.0f, 96.0f},
{128.0f, 192.0f, 256.0f}
};
CV_Assert(min_sizes.size() == feature_map_sizes.size()); // just to keep vectors in sync
const std::vector<int> steps = { 8, 16, 32, 64 };
// Generate priors
priors.clear();
for (size_t i = 0; i < feature_map_sizes.size(); ++i)
{
Size feature_map_size = feature_map_sizes[i];
std::vector<float> min_size = min_sizes[i];
for (int _h = 0; _h < feature_map_size.height; ++_h)
{
for (int _w = 0; _w < feature_map_size.width; ++_w)
{
for (size_t j = 0; j < min_size.size(); ++j)
{
float s_kx = min_size[j] / inputW;
float s_ky = min_size[j] / inputH;
float cx = (_w + 0.5f) * steps[i] / inputW;
float cy = (_h + 0.5f) * steps[i] / inputH;
Rect2f prior = { cx, cy, s_kx, s_ky };
priors.push_back(prior);
}
}
}
}
}
}

View File

@@ -0,0 +1,34 @@
#pragma once
#include "../vp_primary_infer_node.h"
#include "../../objects/vp_frame_face_target.h"
namespace vp_nodes {
// face detector based on YunNet
// https://github.com/opencv/opencv/blob/4.x/modules/objdetect/src/face_detect.cpp
// https://github.com/ShiqiYu/libfacedetection
class vp_yunet_face_detector_node: public vp_primary_infer_node
{
private:
// names of output layers in yunet
const std::vector<std::string> out_names = {"loc", "conf", "iou"};
float scoreThreshold = 0.7;
float nmsThreshold = 0.5;
int topK = 50;
int inputW;
int inputH;
std::vector<cv::Rect2f> priors;
void generatePriors();
protected:
// override infer and preprocess as yunet has a different logic
virtual void infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) override;
virtual void preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) override;
virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
public:
vp_yunet_face_detector_node(std::string node_name, std::string model_path, float score_threshold = 0.7, float nms_threshold = 0.5, int top_k = 50);
~vp_yunet_face_detector_node();
};
}

5
nodes/osd/README.md Normal file
View File

@@ -0,0 +1,5 @@
# Summary
Two situations to create a new osd node:
1. Need a new osd style for the same type of target
2. OSD for different types of targets, such as `vp_frame_target` (by default), `vp_frame_pose_target`, `vp_frame_face_target` ...

View File

@@ -0,0 +1,91 @@
#include "vp_ba_crossline_osd_node.h"
namespace vp_nodes {
vp_ba_crossline_osd_node::vp_ba_crossline_osd_node(std::string node_name, std::string font): vp_node(node_name) {
if (!font.empty()) {
ft2 = cv::freetype::createFreeType2();
ft2->loadFontData(font, 0);
}
this->initialized();
}
vp_ba_crossline_osd_node::~vp_ba_crossline_osd_node() {
deinitialized();
}
std::shared_ptr<vp_objects::vp_meta> vp_ba_crossline_osd_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// operations on osd_frame
if (meta->osd_frame.empty()) {
meta->osd_frame = meta->frame.clone();
}
auto& canvas = meta->osd_frame;
// scan targets
for (auto& i : meta->targets) {
// track_id
auto id = std::to_string(i->track_id);
auto labels_to_display = i->primary_label;
// tracked
if (i->track_id != -1) {
labels_to_display = "#" + id + " " + labels_to_display;
}
for (auto& label : i->secondary_labels) {
labels_to_display += "|" + label;
}
// draw tracks if size>=2
if (i->tracks.size() >= 2) {
for (int n = 0; n < (i->tracks.size() - 1); n++) {
auto p1 = i->tracks[n].track_point();
auto p2 = i->tracks[n + 1].track_point();
cv::line(canvas, cv::Point(p1.x, p1.y), cv::Point(p2.x, p2.y), cv::Scalar(0, 255, 255), 1, cv::LINE_AA);
}
}
cv::rectangle(canvas, cv::Rect(i->x, i->y, i->width, i->height), cv::Scalar(255, 255, 0), 2);
if (ft2 != nullptr) {
ft2->putText(canvas, labels_to_display, cv::Point(i->x, i->y), 20, cv::Scalar(255, 0, 255), cv::FILLED, cv::LINE_AA, true);
}
else {
//cv::putText(canvas, labels_to_display, cv::Point(i->x, i->y), 1, 1, cv::Scalar(255, 0, 255));
int baseline = 0;
auto size = cv::getTextSize(labels_to_display, 1, 1, 1, &baseline);
vp_utils::put_text_at_center_of_rect(canvas, labels_to_display, cv::Rect(i->x, i->y - size.height, size.width, size.height), true, 1, 1, cv::Scalar(), cv::Scalar(179, 52, 255), cv::Scalar(179, 52, 255));
}
// scan sub targets
for (auto& sub_target: i->sub_targets) {
cv::rectangle(canvas, cv::Rect(sub_target->x, sub_target->y, sub_target->width, sub_target->height), cv::Scalar(255));
if (ft2 != nullptr) {
ft2->putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 20, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA, true);
}
else {
cv::putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 1, 1, cv::Scalar(0, 0, 255));
}
}
}
/* crossline draw for current channel */
auto& total_crossline = all_total_crossline[meta->channel_index];
auto& line = all_lines[meta->channel_index];
// scan ba results and ONLY deal with crossline
for (auto& i : meta->ba_results) {
if (i->type == vp_objects::vp_ba_type::CROSSLINE) {
// line has 2 points
assert(i->involve_region_in_frame.size() == 2);
line = vp_objects::vp_line(i->involve_region_in_frame[0], i->involve_region_in_frame[1]);
total_crossline += 1;
}
}
// draw crossline data
cv::line(canvas, cv::Point(line.start.x, line.start.y), cv::Point(line.end.x, line.end.y), cv::Scalar(0, 255, 0), 2, cv::LINE_AA);
cv::putText(canvas, vp_utils::string_format("total crossline targets: [%d]", total_crossline), cv::Point(20, 20), 1, 2, cv::Scalar(0, 0, 255), 2);
return meta;
}
}

View File

@@ -0,0 +1,26 @@
#pragma once
#include <map>
#include <opencv2/freetype.hpp>
#include "../vp_node.h"
#include "../../objects/shapes/vp_point.h"
#include "../../objects/shapes/vp_line.h"
namespace vp_nodes {
// osd node for behaviour analysis of crossline
class vp_ba_crossline_osd_node: public vp_node
{
private:
// support chinese font
cv::Ptr<cv::freetype::FreeType2> ft2;
// support multi channels
std::map<int, int> all_total_crossline;
std::map<int, vp_objects::vp_line> all_lines;
protected:
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
public:
vp_ba_crossline_osd_node(std::string node_name, std::string font = "");
~vp_ba_crossline_osd_node();
};
}

View File

@@ -0,0 +1,109 @@
#include "vp_ba_jam_osd_node.h"
namespace vp_nodes {
vp_ba_jam_osd_node::vp_ba_jam_osd_node(std::string node_name, std::string font): vp_node(node_name) {
if (!font.empty()) {
ft2 = cv::freetype::createFreeType2();
ft2->loadFontData(font, 0);
}
this->initialized();
}
vp_ba_jam_osd_node::~vp_ba_jam_osd_node() {
deinitialized();
}
std::shared_ptr<vp_objects::vp_meta> vp_ba_jam_osd_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// operations on osd_frame
if (meta->osd_frame.empty()) {
meta->osd_frame = meta->frame.clone();
}
auto& canvas = meta->osd_frame;
// scan targets
for (auto& i : meta->targets) {
// track_id
auto id = std::to_string(i->track_id);
auto labels_to_display = i->primary_label;
// tracked
if (i->track_id != -1) {
labels_to_display = "#" + id + " " + labels_to_display;
}
for (auto& label : i->secondary_labels) {
labels_to_display += "|" + label;
}
// draw tracks if size>=2
if (i->tracks.size() >= 2) {
for (int n = 0; n < (i->tracks.size() - 1); n++) {
auto p1 = i->tracks[n].track_point();
auto p2 = i->tracks[n + 1].track_point();
cv::line(canvas, cv::Point(p1.x, p1.y), cv::Point(p2.x, p2.y), cv::Scalar(0, 255, 255), 1, cv::LINE_AA);
}
}
cv::rectangle(canvas, cv::Rect(i->x, i->y, i->width, i->height), cv::Scalar(255, 255, 0), 2);
if (ft2 != nullptr) {
ft2->putText(canvas, labels_to_display, cv::Point(i->x, i->y), 20, cv::Scalar(255, 0, 255), cv::FILLED, cv::LINE_AA, true);
}
else {
//cv::putText(canvas, labels_to_display, cv::Point(i->x, i->y), 1, 1, cv::Scalar(255, 0, 255));
int baseline = 0;
auto size = cv::getTextSize(labels_to_display, 1, 1, 1, &baseline);
vp_utils::put_text_at_center_of_rect(canvas, labels_to_display, cv::Rect(i->x, i->y - size.height, size.width, size.height), true, 1, 1, cv::Scalar(), cv::Scalar(179, 52, 255), cv::Scalar(179, 52, 255));
}
// scan sub targets
for (auto& sub_target: i->sub_targets) {
cv::rectangle(canvas, cv::Rect(sub_target->x, sub_target->y, sub_target->width, sub_target->height), cv::Scalar(255));
if (ft2 != nullptr) {
ft2->putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 20, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA, true);
}
else {
cv::putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 1, 1, cv::Scalar(0, 0, 255));
}
}
}
/* jam draw for current channel */
auto& region = all_jam_regions[meta->channel_index];
auto& jam_result = all_jam_results[meta->channel_index];
auto& involve_ids = all_involve_ids[meta->channel_index];
// scan ba results and ONLY deal with stop and unstop
for (auto& i : meta->ba_results) {
if (i->type == vp_objects::vp_ba_type::JAM) {
region = i->involve_region_in_frame;
involve_ids = i->involve_target_ids_in_frame;
jam_result = true;
}
if (i->type == vp_objects::vp_ba_type::UNJAM) {
region = i->involve_region_in_frame;
involve_ids = i->involve_target_ids_in_frame; // not used later
jam_result = false;
}
}
// draw jam data
auto poly_vertexs = [&]() {
std::vector<cv::Point> vertexs;
for(auto& p: region) {
vertexs.push_back(cv::Point(p.x, p.y));
}
return vertexs;
};
if (jam_result) {
cv::polylines(canvas, poly_vertexs(), true, cv::Scalar(0, 0, 255), 2, cv::LINE_AA); // region
// targets in jan region, highlight
auto targets = meta->get_targets_by_ids(involve_ids);
for (auto& t: targets) {
cv::rectangle(canvas, cv::Rect(t->x, t->y, t->width, t->height), cv::Scalar(0, 0, 255), 2, cv::LINE_AA);
}
}
return meta;
}
}

View File

@@ -0,0 +1,27 @@
#pragma once
#include <map>
#include <opencv2/freetype.hpp>
#include "../vp_node.h"
#include "../../objects/shapes/vp_point.h"
#include "../../objects/shapes/vp_line.h"
namespace vp_nodes {
// osd node for behaviour analysis of stop
class vp_ba_jam_osd_node: public vp_node
{
private:
// support chinese font
cv::Ptr<cv::freetype::FreeType2> ft2;
// support multi channels
std::map<int, std::vector<vp_objects::vp_point>> all_jam_regions; // channel -> jam region
std::map<int, bool> all_jam_results; // channel -> jam status
std::map<int, std::vector<int>> all_involve_ids; // channel -> target ids when enter jam status
protected:
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
public:
vp_ba_jam_osd_node(std::string node_name, std::string font = "");
~vp_ba_jam_osd_node();
};
}

View File

@@ -0,0 +1,114 @@
#include "vp_ba_stop_osd_node.h"
namespace vp_nodes {
vp_ba_stop_osd_node::vp_ba_stop_osd_node(std::string node_name, std::string font): vp_node(node_name) {
if (!font.empty()) {
ft2 = cv::freetype::createFreeType2();
ft2->loadFontData(font, 0);
}
this->initialized();
}
vp_ba_stop_osd_node::~vp_ba_stop_osd_node() {
deinitialized();
}
std::shared_ptr<vp_objects::vp_meta> vp_ba_stop_osd_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// operations on osd_frame
if (meta->osd_frame.empty()) {
meta->osd_frame = meta->frame.clone();
}
auto& canvas = meta->osd_frame;
// scan targets
for (auto& i : meta->targets) {
// track_id
auto id = std::to_string(i->track_id);
auto labels_to_display = i->primary_label;
// tracked
if (i->track_id != -1) {
labels_to_display = "#" + id + " " + labels_to_display;
}
for (auto& label : i->secondary_labels) {
labels_to_display += "|" + label;
}
// draw tracks if size>=2
if (i->tracks.size() >= 2) {
for (int n = 0; n < (i->tracks.size() - 1); n++) {
auto p1 = i->tracks[n].track_point();
auto p2 = i->tracks[n + 1].track_point();
cv::line(canvas, cv::Point(p1.x, p1.y), cv::Point(p2.x, p2.y), cv::Scalar(0, 255, 255), 1, cv::LINE_AA);
}
}
cv::rectangle(canvas, cv::Rect(i->x, i->y, i->width, i->height), cv::Scalar(255, 255, 0), 2);
if (ft2 != nullptr) {
ft2->putText(canvas, labels_to_display, cv::Point(i->x, i->y), 20, cv::Scalar(255, 0, 255), cv::FILLED, cv::LINE_AA, true);
}
else {
//cv::putText(canvas, labels_to_display, cv::Point(i->x, i->y), 1, 1, cv::Scalar(255, 0, 255));
int baseline = 0;
auto size = cv::getTextSize(labels_to_display, 1, 1, 1, &baseline);
vp_utils::put_text_at_center_of_rect(canvas, labels_to_display, cv::Rect(i->x, i->y - size.height, size.width, size.height), true, 1, 1, cv::Scalar(), cv::Scalar(179, 52, 255), cv::Scalar(179, 52, 255));
}
// scan sub targets
for (auto& sub_target: i->sub_targets) {
cv::rectangle(canvas, cv::Rect(sub_target->x, sub_target->y, sub_target->width, sub_target->height), cv::Scalar(255));
if (ft2 != nullptr) {
ft2->putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 20, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA, true);
}
else {
cv::putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 1, 1, cv::Scalar(0, 0, 255));
}
}
}
/* stop draw for current channel */
auto& region = all_stop_regions[meta->channel_index];
auto& total_stop = all_total_stop[meta->channel_index];
auto& stop_count = all_stop_count[meta->channel_index];
// scan ba results and ONLY deal with stop and unstop
for (auto& i : meta->ba_results) {
if (i->type == vp_objects::vp_ba_type::STOP) {
region = i->involve_region_in_frame;
stop_count++;
total_stop.push_back(i->involve_target_ids_in_frame[0]); // only 1 target
}
if (i->type == vp_objects::vp_ba_type::UNSTOP) {
region = i->involve_region_in_frame;
for (auto j = total_stop.begin(); j < total_stop.end();) {
if (*j == i->involve_target_ids_in_frame[0]) {
j = total_stop.erase(j);
break;
}
j++;
}
}
}
// draw stop data
auto poly_vertexs = [&]() {
std::vector<cv::Point> vertexs;
for(auto& p: region) {
vertexs.push_back(cv::Point(p.x, p.y));
}
return vertexs;
};
cv::polylines(canvas, poly_vertexs(), true, cv::Scalar(0, 255, 0), 2, cv::LINE_AA);
// cv::putText(canvas, vp_utils::string_format("total stop targets: [%d]", stop_count), cv::Point(20, 20), 1, 2, cv::Scalar(0, 0, 255), 2);
// stop targets, highlight
auto stop_targets = meta->get_targets_by_ids(total_stop);
for (auto& t: stop_targets) {
cv::rectangle(canvas, cv::Rect(t->x, t->y, t->width, t->height), cv::Scalar(0, 0, 255), 2, cv::LINE_AA);
}
return meta;
}
}

View File

@@ -0,0 +1,27 @@
#pragma once
#include <map>
#include <opencv2/freetype.hpp>
#include "../vp_node.h"
#include "../../objects/shapes/vp_point.h"
#include "../../objects/shapes/vp_line.h"
namespace vp_nodes {
// osd node for behaviour analysis of stop
class vp_ba_stop_osd_node: public vp_node
{
private:
// support chinese font
cv::Ptr<cv::freetype::FreeType2> ft2;
// support multi channels
std::map<int, std::vector<int>> all_total_stop; // channel -> stop target ids
std::map<int, std::vector<vp_objects::vp_point>> all_stop_regions; // channel -> stop region
std::map<int, int> all_stop_count; // channel -> stop count
protected:
virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
public:
vp_ba_stop_osd_node(std::string node_name, std::string font = "");
~vp_ba_stop_osd_node();
};
}

View File

@@ -0,0 +1,214 @@
#include "vp_cluster_node.h"
#include "../../third_party/bhtsne/tsne.h"
namespace vp_nodes {
vp_cluster_node::vp_cluster_node(std::string node_name,
bool use_tSNE,
std::vector<std::string> s_labels_to_display,
int sampling_frequency,
int min_sampling_width,
int min_sampling_height,
int max_sample_num_for_tsne,
int max_sample_num_per_category):
vp_node(node_name),
use_tSNE(use_tSNE),
s_labels_to_display(s_labels_to_display),
sampling_frequency(sampling_frequency),
min_sampling_width(min_sampling_width),
min_sampling_height(min_sampling_height),
max_sample_num_for_tsne(max_sample_num_for_tsne),
max_sample_num_per_category(max_sample_num_per_category) {
this->initialized();
}
vp_cluster_node::~vp_cluster_node() {
deinitialized();
}
// please refer to ../../third_party/trt_vehicle/main/vehicle_cluster.cpp
void vp_cluster_node::reduce_dims_using_tsne(std::vector<std::vector<float>>& low_features,
int no_dims, int max_iter, double perplexity,
double theta, int rand_seed, bool skip_random_init,
int stop_lying_iter, int mom_switch_iter) {
assert(cache_high_features.size() != 0);
auto N = cache_high_features.size();
auto D = cache_high_features[0].second.size(); // all the same as the first feature's dims
// prepare input
double data[N * D];
double Y[N * no_dims];
for (int i = 0; i < N; i++) {
for (int j = 0; j < D; j++) {
data[i * D + j] = cache_high_features[i].second[j];
}
}
// call t-SNE
TSNE::run(data, N, D, Y, no_dims, perplexity, theta, rand_seed, skip_random_init, max_iter, stop_lying_iter, mom_switch_iter);
// prepare output
for (int i = 0; i < N; i++) {
std::vector<float> low_dims_feature;
for (int j = 0; j < no_dims; j++) {
low_dims_feature.push_back(float(Y[i * no_dims + j]));
}
low_features.push_back(low_dims_feature);
}
}
std::shared_ptr<vp_objects::vp_meta> vp_cluster_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
// sampling frequency
if (std::chrono::duration_cast<std::chrono::milliseconds>(NOW - last_sampling_time) < std::chrono::milliseconds(sampling_frequency)) {
return meta;
}
last_sampling_time = NOW;
/* t-SNE */
if (use_tSNE) {
for (int i = 0; i < meta->targets.size(); i++) {
if (meta->targets[i]->width < min_sampling_width || meta->targets[i]->height < min_sampling_height) {
continue;
}
if (cache_high_features.size() >= max_sample_num_for_tsne) {
// too many cache, we need control number of samples
cache_high_features.erase(cache_high_features.begin());
}
// save mat->feature pair to cache
cv::Mat m;
cv::Mat roi(meta->frame, cv::Rect(meta->targets[i]->x, meta->targets[i]->y, meta->targets[i]->width, meta->targets[i]->height));
roi.copyTo(m);
std::pair<cv::Mat, std::vector<float>> p {m, meta->targets[i]->embeddings};
cache_high_features.push_back(p);
}
// at least 10 samples to t-SNE
if (cache_high_features.size() >= 10) {
// reduce dims first
std::vector<std::vector<float>> low_features;
reduce_dims_using_tsne(low_features);
// now display on screen
// normalize low dims feature to coordinate of [0:1] and display them on 2D screen
auto max_x = 0.0f, max_y = 0.0f, min_x = 0.0f, min_y = 0.0f;
for(int i = 0; i < low_features.size(); i++) {
auto& f = low_features[i];
// 2 values in f
max_x = std::max(max_x, f[0]);
max_y = std::max(max_y, f[1]);
min_x = std::min(min_x, f[0]);
min_y = std::min(min_y, f[1]);
}
auto x_range = max_x - min_x;
auto y_range = max_y - min_y;
// draw on (tsne_canvas_w_h + tsne_thumbnail_w_h) * (tsne_canvas_w_h + tsne_thumbnail_w_h)
cv::Mat canvas(tsne_canvas_w_h + tsne_thumbnail_w_h, tsne_canvas_w_h + tsne_thumbnail_w_h, CV_8UC3, cv::Scalar(127, 127, 127));
for(int i = 0; i < low_features.size(); i++) {
auto& f = low_features[i];
// convert to [0:1]
f[0] = (f[0] - min_x) / x_range;
f[1] = (f[1] - min_y) / y_range;
auto& img = cache_high_features[i].first;
cv::Mat img_tmp;
cv::resize(img, img_tmp, cv::Size(tsne_thumbnail_w_h, tsne_thumbnail_w_h));
cv::rectangle(img_tmp, cv::Rect(0, 0, img_tmp.cols, img_tmp.rows), cv::Scalar(255, 0, 0));
cv::Mat roi(canvas, cv::Rect(int(f[0] * tsne_canvas_w_h), int(f[1] * tsne_canvas_w_h), tsne_thumbnail_w_h, tsne_thumbnail_w_h));
img_tmp.copyTo(roi);
}
cv::imshow("cluster using features powered by t-SNE", canvas);
}
}
/* categories */
for (int i = 0; i < meta->targets.size(); i++) {
auto& t = meta->targets[i];
if (t->width < min_sampling_width || t->height < min_sampling_height) {
continue;
}
cv::Mat m;
cv::Mat roi(meta->frame, cv::Rect(t->x, t->y, t->width, t->height));
roi.copyTo(m);
for (int j = 0; j < t->secondary_labels.size(); j++) {
auto& category = t->secondary_labels[j];
bool filter_pass = false;
// all
if (s_labels_to_display.size() == 0) {
cache_categories[category].push_back(m);
filter_pass = true;
}
else {
// has a filter
if (std::find(s_labels_to_display.begin(), s_labels_to_display.end(), category) != s_labels_to_display.end()) {
cache_categories[category].push_back(m);
filter_pass = true;
}
}
if (filter_pass) {
// too many cache, we need control number of samples
if (cache_categories[category].size() >= max_sample_num_per_category) {
cache_categories[category].erase(cache_categories[category].begin());
}
}
}
}
// display on screen
// calculate total number of rows
auto rows_num = 0;
for (auto& p: cache_categories) {
auto num = p.second.size() / category_num_per_row;
if (p.second.size() % category_num_per_row != 0 || num == 0) {
num++;
}
rows_num += num;
}
// draw on (category_canvas_w) * (category_canvas_h)
auto category_canvas_w = (category_thumbnail_w_h + category_gap) * (category_num_per_row + 1);
auto category_canvas_h = (category_thumbnail_w_h + category_gap) * (rows_num + 1);
cv::Mat canvas(category_canvas_h, category_canvas_w, CV_8UC3, cv::Scalar(127, 127, 127));
auto row_index = 0;
for (auto& p: cache_categories) {
auto col_index = 0;
auto& category_items = p.second;
auto& category_name = p.first;
cv::putText(canvas, category_name, cv::Point(10, (category_gap + category_thumbnail_w_h) * row_index + category_gap), 1, 1, cv::Scalar(0, 0, 255));
bool new_row = false;
for (int i = 0; i < category_items.size(); i++) {
cv::Mat img_tmp;
cv::resize(category_items[i], img_tmp, cv::Size(category_thumbnail_w_h, category_thumbnail_w_h));
cv::Mat roi(canvas, cv::Rect((category_gap + category_thumbnail_w_h) * col_index + category_gap, (category_gap + category_thumbnail_w_h) * row_index + category_gap, category_thumbnail_w_h, category_thumbnail_w_h));
img_tmp.copyTo(roi);
col_index++;
if ((col_index + 1) % category_num_per_row == 0) {
col_index = 0;
row_index ++;
new_row = true;
}
else {
new_row = false;
}
}
if (!new_row) {
row_index++;
}
}
cv::imshow("cluster using labels", canvas);
return meta;
}
std::shared_ptr<vp_objects::vp_meta> vp_cluster_node::handle_control_meta(std::shared_ptr<vp_objects::vp_control_meta> meta) {
return meta;
}
}

Some files were not shown because too many files have changed in this diff Show More