first commit

2026-06-03 12:43:14 +08:00
commit ba76cfae28
608 changed files with 120791 additions and 0 deletions
--- a/nodes/README.md
+++ b/nodes/README.md
@@ -0,0 +1,145 @@
+
+## important tips when using node ##
+Most nodes support multi-channel, which means that multiple channels of data can be used as its input. In addition, some nodes do not support multi-channel by default, that is, there can only be one channel of data as its input (the input channel index cannot be changed once it is determined), otherwise an error will occur. The following are common examples:
+
+*support multi-channel*
+- all infer nodes, broker nodes, because they work independently of the channel index.
+- split node, it works independently of channel index too. it could be used to split pipeline into multi branches accordding to different channel index.
+- some nodes which has been designed to support multi-channel, such as record nodes, track nodes, ba nodes, which are able to distinguish different channels inside logic code (`such as using std::map<int, ...> to maintain different data of channels`).
+
+*do NOT support multi-channel*
+- all src node and all des node, you MUST specify channel index when initializing instances of them.
+- part of osd nodes, because they work dependently of channel index, you can do some work to let them support multi-channel, for example just use some data structure like `std::map<int, ...>` to maintain different data of channels inside code of nodes.
+
+
+Note, although some nodes support multi-channel, you'd better be careful because using single instance of node to deal with multiple channels of data would have lower performance(process data serially
+). On the contrary, one instance dealing with only one channel of data (multiple channels use multiple instances of the SAME node) have higher performance(process data in parallel). below is an example of 2 methods to create pipeline:
+```
+single instance of track/ba node works on 2 channels：
+file_src_0                                                                                             --> osd_0 --> screen_des_0
+           --> detector --> multi-classifiers --> tracker --> ba_crossline --> split(by channel index)
+file_src_1                                                                                             --> osd_1 --> screen_des_1 
+
+2 instances of track/ba node work on 2 channels:
+file_src_0                                                                --> tracker_0 --> ba_crossline_0 --> osd_0 --> screen_des_0
+           --> detector --> multi-classifiers --> split(by channel index) 
+file_src_1                                                                --> tracker_0 --> ba_crossline_0 --> osd_1 --> screen_des_1 
+```
+
+
+## 节点目录 ##
+
+<details open>
+  <summary>ba</summary>
+
+  - vp_ba_crossline_node：跨线判断
+  - vp_ba_jam_node：拥堵判断
+  - vp_ba_stop_node：停止判断
+</details>
+
+<details open>
+  <summary>broker</summary>
+  
+  - vp_ba_socket_broker_node：使用udp转发行为分析结果
+  - vp_embeddings_properties_socket_broker_node：使用udp转发目标特征、属性结果
+  - vp_embeddings_socket_broker_node：使用udp转发目标特征结果
+  - vp_expr_socket_broker_node：使用udp转发数学表达式检查结果
+  - vp_json_console_broker_node：以json格式将结构化数据输出到控制台
+  - vp_json_kafka_broker_node：以json格式将结构化数据通过kafka发送给第三方
+  - vp_msg_broker_node：数据代理基类节点
+  - vp_plate_socket_broker_node：使用udp转发车牌识别结果
+  - vp_xml_file_broker_node：以xml格式将结构化数据存储到文件
+  - vp_xml_socket_broker_node：以xml格式将结构化数据通过udp发送给第三方
+</details>
+
+<details open>
+  <summary>infers</summary>
+  
+  - vp_classifier_node：基于resnet系列的图像分类节点（opencv::dnn）
+  - vp_enet_seg_node：基于ENet网络的图像分割节点（opencv::dnn）
+  - vp_face_swap_node：基于insightface的人脸替换节点（opencv::dnn）
+  - vp_feature_encoder_node：基于resnet系列的目标特征提取节点（opencv::dnn）
+  - vp_lane_detector_node：基于CenterNet的车道线检测节点（opencv::dnn）
+  - vp_mask_rcnn_detector_node：基于maskrcnn的目标检测节点（opencv::dnn）
+  - vp_openpose_detector_node：基于openpose的肢体检测节点（opencv::dnn）
+  - vp_ppocr_text_detector_node：基于paddleocr的文字检测节点（paddleinference）
+  - vp_restoration_node：基于real-esrgan的图像增强修复节点（opencv::dnn）
+  - vp_sface_feature_encoder_node：基于sface网络的人脸特征提取节点（opencv::dnn）
+  - vp_trt_vehicle_color_classifier：基于resnet18的车辆颜色分类节点（tensorrt）
+  - vp_trt_vehicle_detector：基于yolov5s的车辆检测节点（tensorrt）
+  - vp_trt_vehicle_feature_encoder：基于fastreid的车辆特征提取节点（tensorrt）
+  - vp_trt_vehicle_plate_detector_v2：基于yolov5s的车牌检测识别节点（一级推理）（tensorrt）
+  - vp_trt_vehicle_plate_detector：基于yolov5s的车牌检测识别节点（二级推理）（tensorrt）
+  - vp_trt_vehicle_scanner：基于yolov5s的车身扫描节点（tensorrt）
+  - vp_trt_vehicle_type_classifier：基于resnet18的车辆车型分类节点（tensorrt）
+  - vp_yolo_detector_node：基于yolov3（含tiny）的目标检测节点（opencv::dnn）
+  - yolo_yunet_face_detector_node：基于yunet网络的人脸检测节点（opencv::dnn）
+
+</details>
+
+<details open>
+  <summary>osd</summary>
+  
+  - vp_ba_crossline_osd_node：跨线判断结果绘制节点
+  - vp_ba_jam_osd_node：拥堵判断结果绘制节点
+  - vp_ba_stop_osd_node：停止判断结果绘制节点
+  - vp_cluster_node：目标聚类结果绘制节点
+  - vp_expr_osd_node：数学表达式检查结果绘制节点
+  - vp_face_osd_node_v2：人脸检测结果绘制节点（含相似度显示）
+  - vp_face_osd_node：人脸检测结果绘制节点
+  - vp_lane_osd_node：车道线检测结果绘制节点
+  - vp_osd_node_v2：目标绘制节点（含子目标）
+  - vp_osd_node_v3：目标绘制节点（含目标mask）
+  - vp_osd_node：目标绘制节点
+  - vp_plate_osd_node：车牌检测识别结果绘制节点
+  - vp_pose_osd_node：肢体检测结果绘制节点
+  - vp_seg_osd_node：图像分割结果绘制节点
+  - vp_text_osd_node：文字检测识别结果绘制节点
+</details>
+
+<details open>
+  <summary>proc</summary>
+  
+  - vp_expr_check_node：数学等式准确性判断节点
+  - vp_frame_fusion_node：视频帧按像素比融合节点（支持2个通道）
+</details>
+
+<details open>
+  <summary>record</summary>
+  
+  - vp_record_node：视频/图片录制节点
+</details>
+
+<details open>
+  <summary>track</summary>
+  
+  - vp_dsort_track_node：基于deepsort的跟踪节点
+  - vp_sort_track_node：基于sort的跟踪节点
+</details>
+
+<details open>
+  <summary>common</summary>
+  
+  - vp_app_des_node：将图片数据推送给application的目标节点
+  - vp_app_src_node：从application接收图片数据的原始节点
+  - vp_des_node：所有目标节点基类
+  - vp_fake_des_node：虚拟目标节点（不做任何事）
+  - vp_file_des_node：将视频数据存入文件的目标节点
+  - vp_file_src_node：从文件读取视频数据的原始节点
+  - vp_image_des_node：将数据以图片的形式发送给socket或者file的目标节点
+  - vp_image_src_node：从file或者socket读取图片数据的原始节点
+  - vp_infer_node：所有推理节点基类
+  - vp_message_broker_node：所有数据代理节点基类
+  - vp_node：所有节点基类
+  - vp_placeholder_node：虚拟中间节点（不做任何事）
+  - vp_primary_infer_node：所有一级推理节点基类
+  - vp_rtmp_des_node：将视频数据以rtmp格式推送到rtmp服务器的目标节点
+  - vp_rtsp_des_node：将视频数据以rtsp格式推送（无需rtsp服务器）的目标节点
+  - vp_rtsp_src_node：以rtsp格式读取网络流的原始节点
+  - vp_screen_des_node：将视频/图片显示到屏幕的目标节点
+  - vp_secondary_infer_node：所有二级推理节点基类
+  - vp_split_node：管道拆分节点
+  - vp_src_node：所有原始节点基类
+  - vp_sync_node：管道分支同步节点
+  - vp_udp_src_node：以udp格式读取网络流的原始节点
+</details>
--- a/nodes/ba/README.md
+++ b/nodes/ba/README.md
@@ -0,0 +1,5 @@
+## tips for BA node ##
+- behaviour analysis（short as BA） works dependently on tracking, all BA nodes Must attached after track node in pipeline.
+- all types of BA nodes support multi-channel, single instance of BA node supports multi channels as input.
+- all BA nodes can be divided into 2 categories, one is instantaneous(like crossline) and another one is continuous(like stop, jam). `vp_ba_type` contains pair member for continuous BA such as `STOP` and `UNSTOP`, which means target enter BA status and leave BA status.
+- there is no flag of BA inside `vp_frame_target` type, all nodes in pipeline need maintain it by themself if they want to know BA status for some task(like counter for crossline).
--- a/nodes/ba/vp_ba_crossline_node.cpp
+++ b/nodes/ba/vp_ba_crossline_node.cpp
@@ -0,0 +1,124 @@
+
+
+#include "vp_ba_crossline_node.h"
+
+namespace vp_nodes {
+    
+    vp_ba_crossline_node::vp_ba_crossline_node(std::string node_name, 
+                                                std::map<int, vp_objects::vp_line> lines,
+                                                bool need_record_image,
+                                                bool need_record_video):
+                                                vp_node(node_name), all_lines(lines), need_record_image(need_record_image), need_record_video(need_record_video) {
+        VP_INFO(vp_utils::string_format("[%s] %s", node_name.c_str(), to_string().c_str()));
+        this->initialized();
+    }
+    
+    vp_ba_crossline_node::~vp_ba_crossline_node() {
+        deinitialized();
+    }
+    
+    std::string vp_ba_crossline_node::to_string() {
+        /*
+        * return 2 points of all lines
+        * [channel 0: x1,y1 x2,y2][channel 1: x1,y1 x2,y2]...
+        */
+        std::stringstream ss;
+        for(auto& p: all_lines) {
+            ss << "[channel" << p.first << ": " << p.second.start.x << "," << p.second.start.y << " " << p.second.end.x << "," << p.second.end.y << "]"; 
+        }
+        return ss.str();
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_ba_crossline_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // if need applied on current channel or not
+        if (all_lines.count(meta->channel_index) == 0) {
+            return meta;
+        }
+
+        // for current channel
+        auto& total_crossline = all_total_crossline[meta->channel_index];
+        auto& line = all_lines[meta->channel_index];
+
+        // for vp_frame_target only
+        for (auto& target : meta->targets) {
+            auto len = target->tracks.size();
+            std::vector<int> involve_targets;
+            if (len > 1 && target->track_id >= 0) {
+                // check the last 2 points in tracks
+                auto p1 = target->tracks[len - 1].track_point();
+                auto p2 = target->tracks[len - 2].track_point();
+
+                auto check1 = at_1_side_of_line(p1, line);
+                auto check2 = at_1_side_of_line(p2, line);
+
+                // `true and false`  or `false and true`
+                // means target passed the line in current frame
+                if (check1 ^ check2) {
+                    total_crossline++;
+                    involve_targets.push_back(target->track_id);
+
+                    // not empty need fill ba result back to frame meta
+                    if (involve_targets.size() > 0) {
+                        // send record image and record video signal, recording actions would occur if record nodes exist in pipeline
+                        std::string image_file_name_without_ext = "";    // empty means no recording image
+                        std::string video_file_name_without_ext = "";    // empty means no recording video
+
+                        // send image record control meta
+                        if (need_record_image) {
+                            image_file_name_without_ext = vp_utils::time_format(NOW, "crossline_image__<year><mon><day><hour><min><sec><mili>");
+                            auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
+                            pendding_meta(image_record_control_meta);
+                        }
+                        // send video record control meta
+                        if (need_record_video) {
+                            video_file_name_without_ext = vp_utils::time_format(NOW, "crossline_video__<year><mon><day><hour><min><sec><mili>");        
+                            auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
+                            pendding_meta(video_record_control_meta);
+                        }
+
+                        std::vector<vp_objects::vp_point> involve_region {line.start, line.end};
+                        auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::CROSSLINE, 
+                                                                                    meta->channel_index, 
+                                                                                    meta->frame_index, 
+                                                                                    involve_targets, 
+                                                                                    involve_region,
+                                                                                    "cross line",   // meaningful label
+                                                                                    image_file_name_without_ext, 
+                                                                                    video_file_name_without_ext);
+                        // fill back to frame meta
+                        meta->ba_results.push_back(ba_result);
+                        // info log
+                        VP_INFO(vp_utils::string_format("[%s] [channel %d] has found target cross line, total number of crossline: [%d]", node_name.c_str(), meta->channel_index, total_crossline));
+                        if (need_record_image || need_record_video) {
+                            VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
+                        }
+                    }
+                }
+            }
+        }
+
+        return meta;
+    }
+
+    bool vp_ba_crossline_node::at_1_side_of_line(vp_objects::vp_point p, vp_objects::vp_line line) {
+        auto p1 = line.start;
+        auto p2 = line.end;
+
+        if (p1.x == p2.x) {
+            return p.x < p1.x;
+        }
+
+        if (p1.y == p2.y) {
+            return p.y < p1.y;
+        }
+
+        if (p2.x < p1.x) {
+            auto tmp = p2;
+            p2 = p1;
+            p1 = tmp;
+        }
+
+        int ret = (p2.y - p.y) * (p2.x - p1.x) - (p2.y - p1.y) * (p2.x - p.x);
+        return ret < 0;
+    }
+}
--- a/nodes/ba/vp_ba_crossline_node.h
+++ b/nodes/ba/vp_ba_crossline_node.h
@@ -0,0 +1,38 @@
+#pragma once
+
+#include <map>
+#include "../vp_node.h"
+#include "../../objects/shapes/vp_point.h"
+#include "../../objects/shapes/vp_line.h"
+#include "../../objects/vp_image_record_control_meta.h"
+#include "../../objects/vp_video_record_control_meta.h"
+
+namespace vp_nodes {
+    // behaviour analysis node for crossline (support multi channels)
+    class vp_ba_crossline_node: public vp_node 
+    {
+    private:
+        // channel -> counter
+        std::map<int, int> all_total_crossline;
+
+        // channel -> line, 1 channel supports only 1 line at most (can be 0, which means no crossline check on this channel)
+        std::map<int, vp_objects::vp_line> all_lines;
+
+        // record params
+        bool need_record_image;
+        bool need_record_video;
+
+        // check if point at one side of line
+        bool at_1_side_of_line(vp_objects::vp_point p, vp_objects::vp_line line);
+    protected:
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
+    public:
+        vp_ba_crossline_node(std::string node_name, 
+                            std::map<int, vp_objects::vp_line> lines,
+                            bool need_record_image = true,
+                            bool need_record_video = false);
+        ~vp_ba_crossline_node();
+
+        std::string to_string() override;
+    };
+}
--- a/nodes/ba/vp_ba_jam_node.cpp
+++ b/nodes/ba/vp_ba_jam_node.cpp
@@ -0,0 +1,180 @@
+
+
+#include "vp_ba_jam_node.h"
+
+namespace vp_nodes {
+    
+    vp_ba_jam_node::vp_ba_jam_node(std::string node_name, 
+                                    std::map<int, std::vector<vp_objects::vp_point>> jam_regions,
+                                    bool need_record_image,
+                                    bool need_record_video):
+                                    vp_node(node_name), all_jam_regions(jam_regions), need_record_image(need_record_image), need_record_video(need_record_video) {
+        VP_INFO(vp_utils::string_format("[%s] %s", node_name.c_str(), to_string().c_str()));
+        this->initialized();
+    }
+    
+    vp_ba_jam_node::~vp_ba_jam_node() {
+        deinitialized();
+    }
+
+    std::string vp_ba_jam_node::to_string() {
+        /*
+        * return vertexs of all jam regions
+        * [channel0: x1,y1 x2,y2 ...][channel1: x1,y1 x2,y2 ...]...
+        */
+        std::stringstream ss;
+        for(auto& r: all_jam_regions) {
+            ss << "[" << r.first << ":";
+            for(auto& p: r.second) {
+                ss << " " << p.x << "," << p.y;
+            }
+            ss << "]";
+        }
+        return ss.str();
+    }
+
+    bool vp_ba_jam_node::point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region) {
+        int i, j, c = 0;
+        int nvert = region.size();
+
+        for (i = 0, j = nvert-1; i < nvert; j = i++) {
+            if (((region[i].y > p.y) != (region[j].y > p.y)) &&
+            (p.x < (region[j].x - region[i].x) * (p.y - region[i].y) 
+                / (region[j].y - region[i].y) + region[i].x))
+            c = !c;
+        }
+        return c;
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_ba_jam_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // if need applied on current channel or not
+        if (all_jam_regions.count(meta->channel_index) == 0) {
+            return meta;
+        }
+        
+        // for current channel
+        auto& jam_region = all_jam_regions[meta->channel_index];
+        auto& stop_checking_status = all_stop_checking_status[meta->channel_index];
+        auto& jam_result = all_jam_results[meta->channel_index];
+        auto& last_notify = all_last_notifys[meta->channel_index];
+
+        // for vp_frame_target only
+        std::vector<int> hit_traget_ids;
+        for (auto& target : meta->targets) {
+            auto len = target->tracks.size();
+            auto loc = target->get_rect().track_point();
+
+            // target has been tracked AND tracked enough frames
+            if (len < check_interval_frames || target->track_id < 0) {
+                continue;
+            }
+            // if target inside of stop region or not
+            if (!point_in_poly(loc, jam_region)) {
+                continue;
+            }
+
+            auto pre_loc = target->tracks[len - check_interval_frames].track_point();
+            if (pre_loc.distance_with(loc) <= check_max_distance) {
+                stop_checking_status[target->track_id]++;
+                hit_traget_ids.push_back(target->track_id);
+            }
+        }
+
+        // count for stop targets
+        auto stops = 0;
+        std::vector<int> involve_targets;
+        for (auto i = stop_checking_status.begin(); i != stop_checking_status.end();) {
+            if (std::find(hit_traget_ids.begin(), hit_traget_ids.end(), i->first) == hit_traget_ids.end()) {
+                // remove since it not satisfy stop condition
+                i = stop_checking_status.erase(i);
+                continue;
+            }
+            if (i->second >= check_min_hit_frames) {
+                involve_targets.push_back(i->first);
+                stops++;
+            }
+            i++;
+        }
+            
+        // enter jam status
+        if (stops >= check_min_stops && !jam_result && (meta->frame_index - last_notify) > (check_notify_interval * meta->fps)) {
+            jam_result = true;
+            last_notify = meta->frame_index;
+
+            // send record image and record video signal, recording actions would occur if record nodes exist in pipeline
+            std::string image_file_name_without_ext = "";    // empty means no recording image
+            std::string video_file_name_without_ext = "";    // empty means no recording video
+
+            // send image record control meta
+            if (need_record_image) {
+                image_file_name_without_ext = vp_utils::time_format(NOW, "jam_image__<year><mon><day><hour><min><sec><mili>");
+                auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
+                pendding_meta(image_record_control_meta);
+            }
+            // send video record control meta
+            if (need_record_video) {
+                video_file_name_without_ext = vp_utils::time_format(NOW, "jam_video__<year><mon><day><hour><min><sec><mili>");        
+                auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
+                pendding_meta(video_record_control_meta);
+            }
+
+            std::vector<vp_objects::vp_point> involve_region = jam_region;
+            auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::JAM, 
+                                                                        meta->channel_index, 
+                                                                        meta->frame_index, 
+                                                                        involve_targets, 
+                                                                        involve_region,
+                                                                        "jam",   // meaningful label
+                                                                        image_file_name_without_ext, 
+                                                                        video_file_name_without_ext);
+            // fill back to frame meta
+            meta->ba_results.push_back(ba_result);
+            // info log
+            VP_INFO(vp_utils::string_format("[%s] [channel %d] has found jam", node_name.c_str(), meta->channel_index));
+            if (need_record_image || need_record_video) {
+                VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
+            }
+        }
+        
+        // leave jam status
+        if (stops < check_min_stops && jam_result) {
+            jam_result = false;
+            last_notify = meta->frame_index;
+
+            // send record image and record video signal, recording actions would occur if record nodes exist in pipeline
+            std::string image_file_name_without_ext = "";    // empty means no recording image
+            std::string video_file_name_without_ext = "";    // empty means no recording video
+
+            // send image record control meta
+            if (need_record_image) {
+                image_file_name_without_ext = vp_utils::time_format(NOW, "unjam_image__<year><mon><day><hour><min><sec><mili>");
+                auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
+                pendding_meta(image_record_control_meta);
+            }
+            // send video record control meta
+            if (need_record_video) {
+                video_file_name_without_ext = vp_utils::time_format(NOW, "unjam_video__<year><mon><day><hour><min><sec><mili>");        
+                auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
+                pendding_meta(video_record_control_meta);
+            }
+
+            std::vector<vp_objects::vp_point> involve_region = jam_region;
+            auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::UNJAM, 
+                                                                        meta->channel_index, 
+                                                                        meta->frame_index, 
+                                                                        involve_targets, 
+                                                                        involve_region,
+                                                                        "unjam",   // meaningful label
+                                                                        image_file_name_without_ext, 
+                                                                        video_file_name_without_ext);
+            // fill back to frame meta
+            meta->ba_results.push_back(ba_result);
+            // info log
+            VP_INFO(vp_utils::string_format("[%s] [channel %d] has found unjam", node_name.c_str(), meta->channel_index));
+            if (need_record_image || need_record_video) {
+                VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
+            }
+        }
+        return meta;
+    }
+}
--- a/nodes/ba/vp_ba_jam_node.h
+++ b/nodes/ba/vp_ba_jam_node.h
@@ -0,0 +1,50 @@
+#pragma once
+
+#include <map>
+#include "../vp_node.h"
+#include "../../objects/shapes/vp_point.h"
+#include "../../objects/shapes/vp_line.h"
+#include "../../objects/vp_image_record_control_meta.h"
+#include "../../objects/vp_video_record_control_meta.h"
+
+namespace vp_nodes {
+    // behaviour analysis node for stop (support multi channels)
+    class vp_ba_jam_node: public vp_node 
+    {
+    private:
+        // channel -> vertexs of region, 1 channel supports only 1 region at most (can be 0, which means no jam check on this channel)
+        std::map<int, std::vector<vp_objects::vp_point>> all_jam_regions;
+
+        // channel -> status of targets (id -> num of hit frames)
+        std::map<int, std::map<int, int>> all_stop_checking_status;
+
+        // channel -> jam status of channel (jam or not)
+        std::map<int, bool> all_jam_results;
+
+        // channel -> last frame index to notify jam
+        std::map<int, int> all_last_notifys;
+
+        // record params
+        bool need_record_image;
+        bool need_record_video;
+
+        // check if point inside of polygon
+        bool point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region);
+
+        // jam checking logic parameters which may be configed by constructor passed in by user
+        const int check_interval_frames = 20;
+        const int check_min_hit_frames = 25 * 2;  // 25 fps * 2 seconds
+        const int check_max_distance = 8;
+        const int check_min_stops = 8;
+        const int check_notify_interval = 10;  // interval time (seconds) to notify (ensure not frequently)
+    protected:
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
+    public:
+        vp_ba_jam_node(std::string node_name, 
+                            std::map<int, std::vector<vp_objects::vp_point>> jam_regions,
+                            bool need_record_image = true,
+                            bool need_record_video = true);
+        ~vp_ba_jam_node();
+        std::string to_string() override;
+    };
+}
--- a/nodes/ba/vp_ba_stop_node.cpp
+++ b/nodes/ba/vp_ba_stop_node.cpp
@@ -0,0 +1,172 @@
+
+
+#include "vp_ba_stop_node.h"
+
+namespace vp_nodes {
+    
+    vp_ba_stop_node::vp_ba_stop_node(std::string node_name, 
+                                    std::map<int, std::vector<vp_objects::vp_point>> stop_regions,
+                                    bool need_record_image,
+                                    bool need_record_video):
+                                    vp_node(node_name), all_stop_regions(stop_regions), need_record_image(need_record_image), need_record_video(need_record_video) {
+        VP_INFO(vp_utils::string_format("[%s] %s", node_name.c_str(), to_string().c_str()));
+        this->initialized();
+    }
+    
+    vp_ba_stop_node::~vp_ba_stop_node() {
+        deinitialized();
+    }
+
+    std::string vp_ba_stop_node::to_string() {
+        /*
+        * return vertexs of all stop regions
+        * [channel0: x1,y1 x2,y2 ...][channel1: x1,y1 x2,y2 ...]...
+        */
+        std::stringstream ss;
+        for(auto& r: all_stop_regions) {
+            ss << "[channel" << r.first << ":";
+            for(auto& p: r.second) {
+                ss << " " << p.x << "," << p.y;
+            }
+            ss << "]";
+        }
+        return ss.str();
+    }
+
+    bool vp_ba_stop_node::point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region) {
+        int i, j, c = 0;
+        int nvert = region.size();
+
+        for (i = 0, j = nvert-1; i < nvert; j = i++) {
+            if (((region[i].y > p.y) != (region[j].y > p.y)) &&
+            (p.x < (region[j].x - region[i].x) * (p.y - region[i].y) 
+                / (region[j].y - region[i].y) + region[i].x))
+            c = !c;
+        }
+        return c;
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_ba_stop_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // if need applied on current channel or not
+        if (all_stop_regions.count(meta->channel_index) == 0) {
+            return meta;
+        }
+        
+        // for current channel
+        auto& stop_region = all_stop_regions[meta->channel_index];
+        auto& stop_checking_status = all_stop_checking_status[meta->channel_index];
+
+        // for vp_frame_target only
+        std::vector<int> hit_traget_ids;
+        for (auto& target : meta->targets) {
+            auto len = target->tracks.size();
+            auto loc = target->get_rect().track_point();
+
+            // target has been tracked AND tracked enough frames
+            if (len < check_interval_frames || target->track_id < 0) {
+                continue;
+            }
+            // if target inside of stop region or not
+            if (!point_in_poly(loc, stop_region)) {
+                continue;
+            }
+
+            auto pre_loc = target->tracks[len - check_interval_frames].track_point();
+            if (pre_loc.distance_with(loc) <= check_max_distance) {
+                stop_checking_status[target->track_id]++;
+                hit_traget_ids.push_back(target->track_id);
+            }
+        }
+
+        for (auto i = stop_checking_status.begin(); i != stop_checking_status.end();) {
+            if (std::find(hit_traget_ids.begin(), hit_traget_ids.end(), i->first) == hit_traget_ids.end()) {
+                // statisfy unstop condition
+                if (i->second >= check_min_hit_frames) {
+                    std::vector<int> involve_targets;
+                    involve_targets.push_back(i->first);
+
+                    // send record image and record video signal, recording actions would occur if record nodes exist in pipeline
+                    std::string image_file_name_without_ext = "";    // empty means no recording image
+                    std::string video_file_name_without_ext = "";    // empty means no recording video
+
+                    // send image record control meta
+                    if (need_record_image) {
+                        image_file_name_without_ext = vp_utils::time_format(NOW, "unstop_image__<year><mon><day><hour><min><sec><mili>");
+                        auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
+                        pendding_meta(image_record_control_meta);
+                    }
+                    // send video record control meta
+                    if (need_record_video) {
+                        video_file_name_without_ext = vp_utils::time_format(NOW, "unstop_video__<year><mon><day><hour><min><sec><mili>");        
+                        auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
+                        pendding_meta(video_record_control_meta);
+                    }
+
+                    std::vector<vp_objects::vp_point> involve_region = stop_region;
+                    auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::UNSTOP, 
+                                                                                meta->channel_index, 
+                                                                                meta->frame_index, 
+                                                                                involve_targets, 
+                                                                                involve_region,
+                                                                                "unstop",   // meaningful label
+                                                                                image_file_name_without_ext, 
+                                                                                video_file_name_without_ext);
+                    // fill back to frame meta
+                    meta->ba_results.push_back(ba_result);
+                    // info log
+                    VP_INFO(vp_utils::string_format("[%s] [channel %d] has found target unstop", node_name.c_str(), meta->channel_index));
+                    if (need_record_image || need_record_video) {
+                        VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
+                    }
+                }
+                
+                // remove since it not satisfy stop condition
+                i = stop_checking_status.erase(i);
+                continue;
+            }
+
+            // equal means first time to satisfy stop condition
+            if (i->second == check_min_hit_frames) {
+                std::vector<int> involve_targets;
+                involve_targets.push_back(i->first);
+
+                // send record image and record video signal, recording actions would occur if record nodes exist in pipeline
+                std::string image_file_name_without_ext = "";    // empty means no recording image
+                std::string video_file_name_without_ext = "";    // empty means no recording video
+
+                // send image record control meta
+                if (need_record_image) {
+                    image_file_name_without_ext = vp_utils::time_format(NOW, "stop_image__<year><mon><day><hour><min><sec><mili>");
+                    auto image_record_control_meta = std::make_shared<vp_objects::vp_image_record_control_meta>(meta->channel_index, image_file_name_without_ext, true);
+                    pendding_meta(image_record_control_meta);
+                }
+                // send video record control meta
+                if (need_record_video) {
+                    video_file_name_without_ext = vp_utils::time_format(NOW, "stop_video__<year><mon><day><hour><min><sec><mili>");        
+                    auto video_record_control_meta = std::make_shared<vp_objects::vp_video_record_control_meta>(meta->channel_index, video_file_name_without_ext);
+                    pendding_meta(video_record_control_meta);
+                }
+
+                std::vector<vp_objects::vp_point> involve_region = stop_region;
+                auto ba_result = std::make_shared<vp_objects::vp_ba_result>(vp_objects::vp_ba_type::STOP, 
+                                                                            meta->channel_index, 
+                                                                            meta->frame_index, 
+                                                                            involve_targets, 
+                                                                            involve_region,
+                                                                            "stop",   // meaningful label
+                                                                            image_file_name_without_ext, 
+                                                                            video_file_name_without_ext);
+                // fill back to frame meta
+                meta->ba_results.push_back(ba_result);
+                // info log
+                VP_INFO(vp_utils::string_format("[%s] [channel %d] has found target stop", node_name.c_str(), meta->channel_index));
+                if (need_record_image || need_record_video) {
+                    VP_INFO(vp_utils::string_format("[%s] [channel %d] image & video record file names are: [%s & %s]", node_name.c_str(), meta->channel_index, image_file_name_without_ext.c_str(), video_file_name_without_ext.c_str()));
+                }
+            }
+            i++;
+        }
+        
+        return meta;
+    }
+}
--- a/nodes/ba/vp_ba_stop_node.h
+++ b/nodes/ba/vp_ba_stop_node.h
@@ -0,0 +1,42 @@
+#pragma once
+
+#include <map>
+#include "../vp_node.h"
+#include "../../objects/shapes/vp_point.h"
+#include "../../objects/shapes/vp_line.h"
+#include "../../objects/vp_image_record_control_meta.h"
+#include "../../objects/vp_video_record_control_meta.h"
+
+namespace vp_nodes {
+    // behaviour analysis node for stop (support multi channels)
+    class vp_ba_stop_node: public vp_node 
+    {
+    private:
+        // channel -> vertexs of region, 1 channel supports only 1 region at most (can be 0, which means no stop check on this channel)
+        std::map<int, std::vector<vp_objects::vp_point>> all_stop_regions;
+
+        // channel -> status of targets (id -> num of hit frames)
+        std::map<int, std::map<int, int>> all_stop_checking_status;
+
+        // record params
+        bool need_record_image;
+        bool need_record_video;
+
+        // check if point inside of polygon
+        bool point_in_poly(vp_objects::vp_point p, std::vector<vp_objects::vp_point> region);
+
+        // stop checking logic parameters which may be configed by constructor passed in by user
+        const int check_interval_frames = 20;
+        const int check_min_hit_frames = 25 * 2;  // 25 fps * 2 seconds
+        const int check_max_distance = 5;
+    protected:
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
+    public:
+        vp_ba_stop_node(std::string node_name, 
+                            std::map<int, std::vector<vp_objects::vp_point>> stop_regions,
+                            bool need_record_image = true,
+                            bool need_record_video = true);
+        ~vp_ba_stop_node();
+        std::string to_string() override;
+    };
+}
--- a/nodes/broker/cereal_archive/vp_objects_cereal_archive.h
+++ b/nodes/broker/cereal_archive/vp_objects_cereal_archive.h
@@ -0,0 +1,100 @@
+#pragma once
+
+/*
+* define EXTERNAL archive functions for objects which need to be serialized by cereal library in VideoPipe.
+* refer to `https://uscilab.github.io/cereal/serialization_functions.html` for more details.
+*/
+
+// object types
+#include "../../../objects/vp_frame_target.h"
+#include "../../../objects/vp_frame_face_target.h"
+#include "../../../objects/vp_frame_text_target.h"
+#include "../../../objects/vp_frame_pose_target.h"
+#include "../../../objects/vp_sub_target.h"
+/* extend for more types of objects in VideoPipe. */
+
+// headers from cereal
+#include "../../../third_party/cereal/cereal.hpp"
+#include "../../../third_party/cereal/types/vector.hpp"
+#include "../../../third_party/cereal/types/memory.hpp"
+#include "../../../third_party/cereal/types/string.hpp"
+#include "../../../third_party/cereal/types/utility.hpp"
+#include "../../../third_party/cereal/archives/json.hpp"
+#include "../../../third_party/cereal/archives/xml.hpp"
+
+/* same namespace as object types */
+namespace vp_objects {
+    /* vp_frame_target */
+    template<typename Archive>
+    void serialize(Archive& archive, vp_frame_target& target) {
+        // define the form of structured data for vp_frame_target
+        archive(cereal::make_nvp("x", target.x),
+                cereal::make_nvp("y", target.y),
+                cereal::make_nvp("width", target.width),
+                cereal::make_nvp("height", target.height),
+                cereal::make_nvp("primary_class_id", target.primary_class_id),
+                cereal::make_nvp("primary_score", target.primary_score),
+                cereal::make_nvp("primary_label", target.primary_label),
+                cereal::make_nvp("channel_index", target.channel_index),
+                cereal::make_nvp("frame_index", target.frame_index),
+                cereal::make_nvp("track_id", target.track_id),
+                cereal::make_nvp("secondary_class_ids", target.secondary_class_ids),
+                cereal::make_nvp("secondary_scores", target.secondary_scores),
+                cereal::make_nvp("secondary_labels", target.secondary_labels),
+                cereal::make_nvp("sub_targets", target.sub_targets),
+                cereal::make_nvp("embeddings", target.embeddings));
+    }
+
+    template<typename Archive>
+    void serialize(Archive& archive, vp_sub_target& target) {
+        // define the form of structured data for vp_sub_target
+        archive(cereal::make_nvp("x", target.x),
+                cereal::make_nvp("y", target.y),
+                cereal::make_nvp("width", target.width),
+                cereal::make_nvp("height", target.height),
+                cereal::make_nvp("class_id", target.class_id),
+                cereal::make_nvp("score", target.score),
+                cereal::make_nvp("label", target.label),
+                cereal::make_nvp("frame_index", target.frame_index),
+                cereal::make_nvp("channel_index", target.channel_index),
+                cereal::make_nvp("attachments", target.attachments));
+    }
+    /* END OF vp_frame_target */
+
+
+    /* vp_frame_face_target */
+    template<typename Archive>
+    void serialize(Archive& archive, vp_frame_face_target& target) {
+        // define the form of structured data for vp_frame_face_target
+        archive(cereal::make_nvp("x", target.x),
+                cereal::make_nvp("y", target.y),
+                cereal::make_nvp("width", target.width),
+                cereal::make_nvp("height", target.height),
+                cereal::make_nvp("score", target.score),
+                cereal::make_nvp("embeddings", target.embeddings),
+                cereal::make_nvp("key_points", target.key_points),
+                cereal::make_nvp("track_id", target.track_id));
+    }
+    /* END OF vp_frame_face_target */
+
+
+    /* vp_frame_text_target */
+    template<typename Archive>
+    void serialize(Archive& archive, vp_frame_text_target& target) {
+        // define the form of structured data for vp_frame_text_target
+        archive(cereal::make_nvp("text", target.text),
+                cereal::make_nvp("score", target.score),
+                cereal::make_nvp("region", target.region_vertexes),
+                cereal::make_nvp("flags", target.flags));
+    }
+    /* END OF vp_frame_text_target */
+
+
+    /* vp_frame_pose_target */
+    template<typename Archive>
+    void serialize(Archive& archive, vp_frame_pose_target& target) {
+        // define the form of structured data for vp_frame_pose_target
+
+    }
+    /* END OF vp_frame_pose_target */
+}
--- a/nodes/broker/kafka_utils/KafkaProducer.cpp
+++ b/nodes/broker/kafka_utils/KafkaProducer.cpp
@@ -0,0 +1,167 @@
+
+#ifdef VP_WITH_KAFKA
+#include "KafkaProducer.h"
+
+// callbacks
+class ProducerDeliveryReportCb : public RdKafka::DeliveryReportCb {
+public:
+	void dr_cb(RdKafka::Message &message) {
+		if (message.err())
+			std::cerr << "Message delivery failed: " << message.errstr() << std::endl;
+		else {
+			// Message delivered to topic test [0] at offset 135000
+            /*
+			std::cerr << "Message delivered to topic " << message.topic_name()
+				<< " [" << message.partition() << "] at offset "
+				<< message.offset() << std::endl;*/
+		}
+	}
+};
+
+class ProducerEventCb : public RdKafka::EventCb {
+public:
+	void event_cb(RdKafka::Event &event) {
+		switch (event.type()) {
+		case RdKafka::Event::EVENT_ERROR:
+			std::cout << "RdKafka::Event::EVENT_ERROR: " << RdKafka::err2str(event.err()) << std::endl;
+			break;
+		case RdKafka::Event::EVENT_STATS:
+			//std::cout << "RdKafka::Event::EVENT_STATS: " << event.str() << std::endl;
+			break;
+		case RdKafka::Event::EVENT_LOG:
+			//std::cout << "RdKafka::Event::EVENT_LOG " << event.fac() << std::endl;
+			break;
+		case RdKafka::Event::EVENT_THROTTLE:
+			//std::cout << "RdKafka::Event::EVENT_THROTTLE " << event.broker_name() << std::endl;
+			break;
+		}
+	}
+};
+
+
+class HashPartitionerCb : public RdKafka::PartitionerCb {
+public:
+	int32_t partitioner_cb(const RdKafka::Topic *topic, const std::string *key,
+		int32_t partition_cnt, void *msg_opaque) {
+		char msg[128] = { 0 };
+		int32_t partition_id = generate_hash(key->c_str(), key->size()) % partition_cnt;
+		//                          [topic][key][partition_cnt][partition_id] 
+		//                          :[test][6419][2][1]
+		/*sprintf(msg, "HashPartitionerCb:topic:[%s], key:[%s]partition_cnt:[%d], partition_id:[%d]", topic->name().c_str(),
+			key->c_str(), partition_cnt, partition_id);
+		std::cout << msg << std::endl;*/
+		return partition_id;
+	}
+private:
+
+	static inline unsigned int generate_hash(const char *str, size_t len) {
+		unsigned int hash = 5381;
+		for (size_t i = 0; i < len; i++)
+			hash = ((hash << 5) + hash) + str[i];
+		return hash;
+	}
+};
+
+
+KafkaProducer::KafkaProducer(const std::string& brokers, const std::string& topic, int partition) {
+	m_brokers = brokers;
+	m_topicStr = topic;
+	m_partition = partition;
+
+	m_config = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
+	if(m_config==NULL)
+		std::cout << "Create RdKafka Conf failed." << std::endl;
+
+	m_topicConfig = RdKafka::Conf::create(RdKafka::Conf::CONF_TOPIC);
+	if (m_topicConfig == NULL)
+		std::cout << "Create RdKafka Topic Conf failed." << std::endl;
+
+	RdKafka::Conf::ConfResult errCode;
+	std::string errorStr;
+	m_dr_cb = new ProducerDeliveryReportCb;
+
+	errCode = m_config->set("dr_cb", m_dr_cb, errorStr);
+	if (errCode != RdKafka::Conf::CONF_OK)
+	{
+		std::cout << "Conf set failed:" << errorStr << std::endl;
+	}
+
+	m_event_cb = new ProducerEventCb;
+	errCode = m_config->set("event_cb", m_event_cb, errorStr);
+	if (errCode != RdKafka::Conf::CONF_OK)
+	{
+		std::cout << "Conf set failed:" << errorStr << std::endl;
+	}
+
+	m_partitioner_cb = new HashPartitionerCb;
+	errCode = m_topicConfig->set("partitioner_cb", m_partitioner_cb, errorStr);
+	if (errCode != RdKafka::Conf::CONF_OK)
+	{
+		std::cout << "Conf set failed:" << errorStr << std::endl;
+	}
+
+	errCode = m_config->set("statistics.interval.ms", "10000", errorStr);
+	if (errCode != RdKafka::Conf::CONF_OK)
+	{
+		std::cout << "Conf set failed:" << errorStr << std::endl;
+	}
+	errCode = m_config->set("message.max.bytes", "10240000", errorStr);
+	if (errCode != RdKafka::Conf::CONF_OK)
+	{
+		std::cout << "Conf set failed:" << errorStr << std::endl;
+	}
+	errCode = m_config->set("bootstrap.servers", m_brokers, errorStr);
+	if (errCode != RdKafka::Conf::CONF_OK)
+	{
+		std::cout << "Conf set failed:" << errorStr << std::endl;
+	}
+
+	m_producer = RdKafka::Producer::create(m_config, errorStr);
+	if (m_producer == NULL)
+	{
+		std::cout << "Create Producer failed:" << errorStr << std::endl;
+	}
+
+	m_topic = RdKafka::Topic::create(m_producer, m_topicStr, m_topicConfig, errorStr);
+	if (m_topic == NULL)
+	{
+		std::cout << "Create Topic failed:" << errorStr << std::endl;
+	}
+}
+
+KafkaProducer::~KafkaProducer() {
+	while (m_producer->outq_len() > 0) {
+		std::cerr << "Waiting for " << m_producer->outq_len() << std::endl;
+		m_producer->flush(5000);
+	}
+	delete m_config;
+	delete m_topicConfig;
+	delete m_topic;
+	delete m_producer;
+	delete m_dr_cb;
+	delete m_event_cb;
+	delete m_partitioner_cb;
+}
+
+
+void KafkaProducer::pushMessage(const std::string& str) {
+	int32_t len = str.length();
+	void* payload = const_cast<void*>(static_cast<const void*>(str.data()));
+	RdKafka::ErrorCode errorCode = m_producer->produce(
+		m_topic,
+		RdKafka::Topic::PARTITION_UA,
+		RdKafka::Producer::RK_MSG_COPY,
+		payload,
+		len,
+		NULL,
+		NULL);
+	m_producer->poll(0);
+	if (errorCode != RdKafka::ERR_NO_ERROR) {
+		std::cerr << "Produce failed: " << RdKafka::err2str(errorCode) << std::endl;
+		if (errorCode == RdKafka::ERR__QUEUE_FULL) {
+			m_producer->poll(100);
+		}
+	}
+}
+
+#endif
--- a/nodes/broker/kafka_utils/KafkaProducer.h
+++ b/nodes/broker/kafka_utils/KafkaProducer.h
@@ -0,0 +1,35 @@
+#pragma once
+
+#ifdef VP_WITH_KAFKA
+#include <string>
+#include <iostream>
+
+// compile tips:
+// run `apt-get install librdkafka-dev`, v0.11.3 for ubuntu 18.04 by default.
+#include <librdkafka/rdkafkacpp.h>
+
+// wrapper class for Producer in librdkafka
+class KafkaProducer
+{
+public:
+	explicit KafkaProducer(const std::string& brokers, const std::string& topic, int partition);
+
+	void pushMessage(const std::string& str);
+	~KafkaProducer();
+
+
+private:
+	std::string m_brokers;
+	std::string m_topicStr;
+	int m_partition;
+
+	RdKafka::Conf* m_config;
+	RdKafka::Conf* m_topicConfig;
+	RdKafka::Topic* m_topic;
+	RdKafka::Producer* m_producer;
+
+	RdKafka::DeliveryReportCb* m_dr_cb;
+	RdKafka::EventCb* m_event_cb;
+	RdKafka::PartitionerCb* m_partitioner_cb;
+};
+#endif
--- a/nodes/broker/vp_ba_socket_broker_node.cpp
+++ b/nodes/broker/vp_ba_socket_broker_node.cpp
@@ -0,0 +1,130 @@
+
+
+#include "vp_ba_socket_broker_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_ba_socket_broker_node::vp_ba_socket_broker_node(std::string node_name,
+                                                        std::string des_ip,
+                                                        int des_port,
+                                                        vp_broke_for broke_for, 
+                                                        int broking_cache_warn_threshold, 
+                                                        int broking_cache_ignore_threshold):
+                                                        vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
+                                                        des_ip(des_ip),
+                                                        des_port(des_port) {
+        // only for vp_frame_target since BA logic ONLY works on vp_frame_target                                                   
+        assert(broke_for == vp_broke_for::NORMAL);
+        udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
+        VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
+        this->initialized();
+    }
+    
+    vp_ba_socket_broker_node::~vp_ba_socket_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+
+    void vp_ba_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        /* format:
+        line 0, <--
+        line 1, time
+        line 2, channel_index, frame_index
+        line 3, ba_type, ba_label
+        line 4, ids of involve targets (target_id_1, target_id_2,...)
+        line 5, vertexs of involve region (x1,y1,x2,y2,...) 
+        line 6, property labels split by '|' and split by ','  between different involve targets
+        line 7, record image path
+        line 8, record video path
+        line 9, -->
+        line 10, <--
+        line 11, time
+        line 12, channel_index, frame_index
+        line 13, ba_type, ba_label
+        line 14, ids of involve targets (target_id_1, target_id_2,...)
+        line 15, vertexs of involve region (x1,y1,x2,y2,...) 
+        line 16, property labels split by '|' and split by ','  between different involve targets
+        line 17, record image path
+        linr 18, record video path
+        line 19, -->
+        line 20, ...
+        */
+
+        std::stringstream msg_stream;
+        auto format_basic_info = [&](int channel_index, int frame_index) {
+            msg_stream << vp_utils::time_format(NOW) << std::endl;           // line1
+            msg_stream << channel_index << "," << frame_index << std::endl;  // line2
+        };
+        auto format_ba_info = [&](vp_objects::vp_ba_type ba_type,
+                                std::string ba_label,
+                                const std::vector<int>& involve_target_ids, 
+                                const std::vector<vp_objects::vp_point>& involve_region_vertexs) {
+            msg_stream << int(ba_type) << "," << ba_label << std::endl;     // line3
+            for (int i = 0; i < involve_target_ids.size(); i++) {
+                msg_stream << involve_target_ids[i];  // line 4
+                if (i != involve_target_ids.size() - 1) {
+                    msg_stream << ",";
+                }
+            }
+            msg_stream << std::endl;
+            
+            for (int i = 0; i < involve_region_vertexs.size(); i++) {
+                msg_stream << involve_region_vertexs[i].x << "," << involve_region_vertexs[i].y;  // line 5
+                if (i != involve_region_vertexs.size() - 1) {
+                    msg_stream << ",";
+                }
+            }
+            msg_stream << std::endl;
+
+            auto targets = meta->get_targets_by_ids(involve_target_ids);
+            for (int i = 0; i < targets.size(); i++) {
+                auto t = targets[i];
+                for (int j = 0; j < t->secondary_labels.size(); j++) {
+                    msg_stream << t->secondary_labels[j];  // line 6
+                    if (j != t->secondary_labels.size() - 1) {
+                        msg_stream << "|";
+                    }
+                }
+                if (i != targets.size() - 1) {
+                    msg_stream << ",";
+                }
+            }
+            msg_stream << std::endl;
+        };
+        auto format_record_info = [&](const std::string& record_image_name, const std::string& record_video_name) {
+            msg_stream << record_image_name << std::endl;  // line7
+            msg_stream << record_video_name << std::endl;  // line8
+        };
+
+        if (broke_for == vp_broke_for::NORMAL) {
+            for (int i = 0; i < meta->ba_results.size(); i++) {
+                auto& ba = meta->ba_results[i];
+                
+                // start flag
+                msg_stream << "<--" << std::endl;
+                // basic info
+                format_basic_info(meta->channel_index, meta->frame_index);
+                //ba info
+                format_ba_info(ba->type, ba->ba_label, ba->involve_target_ids_in_frame, ba->involve_region_in_frame);
+                // record info
+                format_record_info(ba->record_image_name, ba->record_video_name);
+                // end flag
+                msg_stream << "-->";
+                
+                if (i != meta->ba_results.size() - 1) {
+                    msg_stream << std::endl;  // not the last one
+                }
+            }
+        }
+        
+        msg = msg_stream.str();   
+    }
+
+    void vp_ba_socket_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to socket by udp
+        auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
+        auto bytes_2_send_len = msg.size();
+        udp_writer.send(bytes_2_send, bytes_2_send_len);
+    }
+}
--- a/nodes/broker/vp_ba_socket_broker_node.h
+++ b/nodes/broker/vp_ba_socket_broker_node.h
@@ -0,0 +1,37 @@
+#pragma once
+
+#include "vp_msg_broker_node.h"
+#include "../../objects/ba/vp_ba_result.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+// light weight socket support
+#include "../../third_party/kissnet/kissnet.hpp"
+
+namespace vp_nodes {
+    // message broker node, broke BA results (ONLY for vp_frame_target) to socket via udp.
+    // BA results could be used for archive.
+    class vp_ba_socket_broker_node: public vp_msg_broker_node
+    {
+    private:
+        // host the data sent to via udp
+        std::string des_ip = "";
+        // port the data sent to via udp
+        int des_port = 0;
+
+        // udp socket writer
+        kissnet::udp_socket udp_writer;
+    protected:
+        // to custom format
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to socket via udp
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_ba_socket_broker_node(std::string node_name, 
+                                std::string des_ip = "",
+                                int des_port = 0,
+                                vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                                int broking_cache_warn_threshold = 50, 
+                                int broking_cache_ignore_threshold = 200);
+        ~vp_ba_socket_broker_node();
+    };
+}
--- a/nodes/broker/vp_embeddings_properties_socket_broker_node.cpp
+++ b/nodes/broker/vp_embeddings_properties_socket_broker_node.cpp
@@ -0,0 +1,137 @@
+
+
+#include "vp_embeddings_properties_socket_broker_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_embeddings_properties_socket_broker_node::vp_embeddings_properties_socket_broker_node(std::string node_name,
+                                                        std::string des_ip,
+                                                        int des_port,
+                                                        std::string cropped_dir,
+                                                        int min_crop_width,
+                                                        int min_crop_height,
+                                                        vp_broke_for broke_for, 
+                                                        bool only_for_tracked,
+                                                        int broking_cache_warn_threshold, 
+                                                        int broking_cache_ignore_threshold):
+                                                        vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
+                                                        des_ip(des_ip),
+                                                        des_port(des_port),
+                                                        cropped_dir(cropped_dir),
+                                                        min_crop_width(min_crop_width),
+                                                        min_crop_height(min_crop_height),
+                                                        only_for_tracked(only_for_tracked) {
+        // only for vp_frame_target                                                    
+        assert(broke_for == vp_broke_for::NORMAL);
+        udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
+        VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
+        this->initialized();
+    }
+    
+    vp_embeddings_properties_socket_broker_node::~vp_embeddings_properties_socket_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+
+    void vp_embeddings_properties_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        /* format:
+        line 0, <--
+        line 1, 1st cropped image's path
+        line 2, 1st properties, multi property labels splited by ','
+        line 3, 1st sub target info, empty line if sub target detected (sub target may be vehicle plate)
+        line 4, 1st embeddings
+        line 5, -->
+        line 6, <--
+        line 7, 2nd cropped image's path
+        line 8, 2nd properties, multi property labels splited by ','
+        line 9, 2nd sub target info, empty line if sub target detected (sub target may be vehicle plate)
+        line 10, 2nd embeddings
+        line 11, -->
+        line 12, ...
+        */
+        auto& broked = all_broked[meta->channel_index];
+        // remove 50 elements every 100 ids
+        if (broked.size() > 100) {
+            broked.erase(broked.begin(), broked.begin() + 50);
+        }
+
+        std::stringstream msg_stream;
+        auto format_embeddings = [&](const std::vector<float>& embeddings) {
+            for (int i = 0; i < embeddings.size(); i++) {
+                msg_stream << embeddings[i];
+                if (i != embeddings.size() - 1) {
+                    msg_stream << ",";
+                }
+            }
+            msg_stream << std::endl;
+        };
+        auto format_sub_target = [&](const std::string& sub_target_info) {
+            msg_stream << sub_target_info << std::endl;
+        };
+        auto format_properties = [&](const std::vector<std::string>& secondary_labels) {
+            for (int i = 0; i < secondary_labels.size(); i++) {
+                msg_stream << secondary_labels[i];
+                if (i != secondary_labels.size() - 1) {
+                    msg_stream << ",";
+                }
+            }
+            msg_stream << std::endl;
+        };
+        auto save_cropped_image = [&](cv::Mat& frame, cv::Rect rect, std::string name) {
+            auto cropped = frame(rect);
+            cv::imwrite(name, cropped);
+            msg_stream << name << std::endl;
+        };
+        if (broke_for == vp_broke_for::NORMAL) {
+            for (int i = 0; i < meta->targets.size(); i++) {
+                auto& t = meta->targets[i];
+                // only broke for tracked targets and have enough frames
+                if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
+                    continue;
+                }
+                
+                // only broke 1 time for specific track id if it has been tracked, or broke many times
+                if (t->track_id >= 0 && 
+                    std::find(broked.begin(), broked.end(), t->track_id) != broked.end()) {
+                    continue;
+                }
+
+                // size filter
+                if (t->width < min_crop_width || t->height < min_crop_height) {
+                    continue;
+                }
+
+                if (t->track_id >= 0) {
+                    broked.push_back(t->track_id);
+                }
+                auto name = cropped_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + std::to_string(t->track_id >= 0 ? t->track_id : i) + ".jpg";
+                // start flag
+                msg_stream << "<--" << std::endl;
+                // save small cropped image
+                save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), name);
+                // format properties
+                format_properties(t->secondary_labels);
+                // format sub target (vehicle plate here), just using the first sub target's label is enough.
+                format_sub_target(t->sub_targets.size() > 0 ? t->sub_targets[0]->label : "");
+                // format embeddings
+                format_embeddings(t->embeddings);
+                // end flag
+                msg_stream << "-->";
+                
+                if (i != meta->targets.size() - 1) {
+                    msg_stream << std::endl;  // not the last one
+                }
+            }
+        }
+        
+        msg = msg_stream.str();   
+    }
+
+    void vp_embeddings_properties_socket_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to socket by udp
+        auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
+        auto bytes_2_send_len = msg.size();
+        udp_writer.send(bytes_2_send, bytes_2_send_len);
+    }
+}
--- a/nodes/broker/vp_embeddings_properties_socket_broker_node.h
+++ b/nodes/broker/vp_embeddings_properties_socket_broker_node.h
@@ -0,0 +1,55 @@
+#pragma once
+
+#include "vp_msg_broker_node.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+// light weight socket support
+#include "../../third_party/kissnet/kissnet.hpp"
+
+namespace vp_nodes {
+    // message broker node, broke embeddings, AND properties (ONLY for vp_frame_target) to socket via udp.
+    // embeddings and properties could be used for similiarity search and property search later.
+    class vp_embeddings_properties_socket_broker_node: public vp_msg_broker_node
+    {
+    private:
+        // save dir for cropped images, which would be used for embeddings similiarity search or property search later
+        std::string cropped_dir = "cropped_images";
+        // min width to crop (embedding/property will be ignored if target's width is smaller than this value)
+        int min_crop_width = 50;
+        // min height to crop (embedding/property will be ignored if target's height is smaller than this value)
+        int min_crop_height = 50;
+        // only broke for tracked targets (track_id is not -1)
+        bool only_for_tracked = false;
+        
+        // min tracked frames if only_for_tracked is true
+        int min_tracked_frames = 25;
+
+        // host the data sent to via udp
+        std::string des_ip = "";
+        // port the data sent to via udp
+        int des_port = 0;
+
+        // udp socket writer
+        kissnet::udp_socket udp_writer;
+
+        // support multi-channel
+        std::map<int, std::vector<int>> all_broked;  // channel -> target ids which have been broked
+    protected:
+        // to custom format
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to socket via udp
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_embeddings_properties_socket_broker_node(std::string node_name, 
+                                std::string des_ip = "",
+                                int des_port = 0,
+                                std::string cropped_dir = "cropped_images",
+                                int min_crop_width = 50,
+                                int min_crop_height = 50,
+                                vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                                bool only_for_tracked = false, 
+                                int broking_cache_warn_threshold = 50, 
+                                int broking_cache_ignore_threshold = 200);
+        ~vp_embeddings_properties_socket_broker_node();
+    };
+}
--- a/nodes/broker/vp_embeddings_socket_broker_node.cpp
+++ b/nodes/broker/vp_embeddings_socket_broker_node.cpp
@@ -0,0 +1,156 @@
+
+
+#include "vp_embeddings_socket_broker_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_embeddings_socket_broker_node::vp_embeddings_socket_broker_node(std::string node_name,
+                                                        std::string des_ip,
+                                                        int des_port,
+                                                        std::string cropped_dir,
+                                                        int min_crop_width,
+                                                        int min_crop_height,
+                                                        vp_broke_for broke_for,
+                                                        bool only_for_tracked, 
+                                                        int broking_cache_warn_threshold, 
+                                                        int broking_cache_ignore_threshold):
+                                                        vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
+                                                        des_ip(des_ip),
+                                                        des_port(des_port),
+                                                        cropped_dir(cropped_dir),
+                                                        min_crop_width(min_crop_width),
+                                                        min_crop_height(min_crop_height),
+                                                        only_for_tracked(only_for_tracked) {
+        // only for vp_frame_target or vp_frame_face_target                                                    
+        assert(broke_for == vp_broke_for::NORMAL || broke_for == vp_broke_for::FACE);
+        udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
+        VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
+        this->initialized();
+    }
+    
+    vp_embeddings_socket_broker_node::~vp_embeddings_socket_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+
+    void vp_embeddings_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        /* format:
+        line 0, <--
+        line 1, 1st cropped image's path
+        line 2, 1st embeddings
+        line 3, -->
+        line 4, <--
+        line 5, 2nd cropped image's path
+        line 6, 2nd embeddings
+        line 7, -->
+        line 8, ...
+        */
+        auto& broked = all_broked[meta->channel_index];
+        // remove 50 elements every 100 ids
+        if (broked.size() > 100) {
+            broked.erase(broked.begin(), broked.begin() + 50);
+        }
+        
+        std::stringstream msg_stream;
+        auto format_embeddings = [&](const std::vector<float>& embeddings) {
+            for (int i = 0; i < embeddings.size(); i++) {
+                msg_stream << embeddings[i];
+                if (i != embeddings.size() - 1) {
+                    msg_stream << ",";
+                }
+            }
+            msg_stream << std::endl;
+        };
+        auto save_cropped_image = [&](cv::Mat& frame, cv::Rect rect, std::string name) {
+            auto cropped = frame(rect);
+            cv::imwrite(name, cropped);
+            msg_stream << name << std::endl;
+        };
+
+        if (broke_for == vp_broke_for::NORMAL) {
+            for (int i = 0; i < meta->targets.size(); i++) {
+                auto& t = meta->targets[i];
+                // only broke for tracked targets and have enough frames
+                if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
+                    continue;
+                }
+                
+                // only broke 1 time for specific track id if it has been tracked, or broke many times
+                if (t->track_id >= 0 && 
+                    std::find(broked.begin(), broked.end(), t->track_id) != broked.end()) {
+                    continue;
+                }
+                
+                // size filter
+                if (t->width < min_crop_width || t->height < min_crop_height) {
+                    continue;
+                }
+
+                if (t->track_id >= 0) {
+                    broked.push_back(t->track_id);
+                }
+                auto name = cropped_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + std::to_string(t->track_id >= 0 ? t->track_id : i) + ".jpg";
+                // start flag
+                msg_stream << "<--" << std::endl;
+                // save small cropped image
+                save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), name);
+                // format embeddings
+                format_embeddings(t->embeddings);
+                // end flag
+                msg_stream << "-->";
+                
+                if (i != meta->targets.size() - 1) {
+                    msg_stream << std::endl;  // not the last one
+                }
+            }
+        }
+
+        if (broke_for == vp_broke_for::FACE) {
+            for (int i = 0; i < meta->face_targets.size(); i++) {
+                auto& t = meta->face_targets[i];
+                // only broke for tracked targets and have enough frames
+                if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
+                    continue;
+                }
+
+                // only broke 1 time for specific track id if it has been tracked, or broke many times
+                if (t->track_id >= 0 && 
+                    std::find(broked.begin(), broked.end(), t->track_id) != broked.end()) {
+                    continue;
+                }
+
+                // size filter
+                if (t->width < min_crop_width || t->height < min_crop_height) {
+                    continue;
+                }
+                
+                if (t->track_id >= 0) {
+                    broked.push_back(t->track_id);
+                }
+                auto name = cropped_dir + "/" + std::to_string(meta->channel_index) + "_" + std::to_string(meta->frame_index) + "_" + std::to_string(t->track_id >= 0 ? t->track_id : i) + ".jpg";
+                // start flag
+                msg_stream << "<--" << std::endl;
+                // save small cropped image
+                save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), name);
+                // format embeddings
+                format_embeddings(t->embeddings);
+                // end flag
+                msg_stream << "-->";
+
+                if (i != meta->face_targets.size() - 1) {
+                    msg_stream << std::endl;  // not the last one
+                }
+            }
+        }
+        
+        msg = msg_stream.str();   
+    }
+
+    void vp_embeddings_socket_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to socket by udp
+        auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
+        auto bytes_2_send_len = msg.size();
+        udp_writer.send(bytes_2_send, bytes_2_send_len);
+    }
+}
--- a/nodes/broker/vp_embeddings_socket_broker_node.h
+++ b/nodes/broker/vp_embeddings_socket_broker_node.h
@@ -0,0 +1,55 @@
+#pragma once
+
+#include "vp_msg_broker_node.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+// light weight socket support
+#include "../../third_party/kissnet/kissnet.hpp"
+
+namespace vp_nodes {
+    // message broker node, broke ONLY embeddings (for vp_frame_target or vp_frame_face_target) to socket via udp.
+    // embeddings could be used for similiarity search later.
+    class vp_embeddings_socket_broker_node: public vp_msg_broker_node
+    {
+    private:
+        // save dir for cropped images, which would be used for embeddings similiarity search later
+        std::string cropped_dir = "cropped_images";
+        // min width to crop (embedding will be ignored if target's width is smaller than this value)
+        int min_crop_width = 50;
+        // min height to crop (embedding will be ignored if target's height is smaller than this value)
+        int min_crop_height = 50;
+        // only broke for tracked targets (track_id is not -1)
+        bool only_for_tracked = false;
+
+        // min tracked frames if only_for_tracked is true
+        int min_tracked_frames = 25;
+
+        // host the data sent to via udp
+        std::string des_ip = "";
+        // port the data sent to via udp
+        int des_port = 0;
+
+        // udp socket writer
+        kissnet::udp_socket udp_writer;
+
+        // support multi-channel
+        std::map<int, std::vector<int>> all_broked;  // channel -> target ids which have been broked
+    protected:
+        // to custom format
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to socket via udp
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_embeddings_socket_broker_node(std::string node_name, 
+                                std::string des_ip = "",
+                                int des_port = 0,
+                                std::string cropped_dir = "cropped_images",
+                                int min_crop_width = 50,
+                                int min_crop_height = 50,
+                                vp_broke_for broke_for = vp_broke_for::NORMAL,
+                                bool only_for_tracked = false, 
+                                int broking_cache_warn_threshold = 50, 
+                                int broking_cache_ignore_threshold = 200);
+        ~vp_embeddings_socket_broker_node();
+    };
+}
--- a/nodes/broker/vp_expr_socket_broker_node.cpp
+++ b/nodes/broker/vp_expr_socket_broker_node.cpp
@@ -0,0 +1,95 @@
+
+
+#include "vp_expr_socket_broker_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_expr_socket_broker_node::vp_expr_socket_broker_node(std::string node_name,
+                                                        std::string des_ip,
+                                                        int des_port,
+                                                        std::string screenshot_dir,
+                                                        vp_broke_for broke_for, 
+                                                        int broking_cache_warn_threshold, 
+                                                        int broking_cache_ignore_threshold):
+                                                        vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
+                                                        des_ip(des_ip),
+                                                        des_port(des_port),
+                                                        screenshot_dir(screenshot_dir) {
+        // only for vp_frame_text_target                                                 
+        assert(broke_for == vp_broke_for::TEXT);
+        udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
+        VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
+        this->initialized();
+    }
+    
+    vp_expr_socket_broker_node::~vp_expr_socket_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+
+    void vp_expr_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        /* format:
+        line 0, <--
+        line 1, time
+        line 2, channel_index, frame_index
+        line 3, total num, num of yes, num of no, number of invalid
+        line 4, 1st expression string, 2nd expression string, 3rd expression string, ...
+        line 5, screenshot path
+        line 6, -->
+        */
+
+        std::stringstream msg_stream;
+        auto format_basic_info = [&](int channel_index, int frame_index) {
+            msg_stream << vp_utils::time_format(NOW) << std::endl;           // line1
+            msg_stream << channel_index << "," << frame_index << std::endl;  // line2
+        };
+        auto format_expr_info = [&](const std::vector<std::shared_ptr<vp_objects::vp_frame_text_target>>& expr_targets) {
+            auto num_yes = 0, num_no = 0, num_invalid = 0;
+            auto expr_strs = std::string();
+            for (int i = 0; i < expr_targets.size(); i++) {
+                expr_strs += expr_targets[i]->text;
+                if (i != expr_targets.size() - 1) {
+                    expr_strs += ",";
+                }
+
+                if (expr_targets[i]->flags.find("yes") != std::string::npos) {
+                    num_yes++;
+                }
+                if (expr_targets[i]->flags.find("no") != std::string::npos) {
+                    num_no++;
+                }
+                if (expr_targets[i]->flags.find("invalid") != std::string::npos) {
+                    num_invalid++;
+                }
+            }
+            msg_stream << expr_targets.size() << "," << num_yes << "," << num_no << "," << num_invalid << std::endl;  // line3
+            msg_stream << expr_strs << std::endl;   // line4
+        };
+        auto format_screenshot = [&](cv::Mat& screenshot, const std::string& name) {
+            cv::imwrite(name, screenshot);
+            msg_stream << name << std::endl;  // line5
+        };
+        
+        // at most 1 record for each frame
+        if (broke_for == vp_broke_for::TEXT) {
+            // start flag
+            msg_stream << "<--" << std::endl;
+            format_basic_info(meta->channel_index, meta->frame_index);
+            format_expr_info(meta->text_targets);
+            auto screenshot_name = screenshot_dir + "/" + std::to_string(meta->channel_index) + "_" + std::to_string(meta->frame_index) + ".jpg";
+            format_screenshot(meta->osd_frame.empty() ? meta->frame : meta->osd_frame, screenshot_name);
+            // end flag
+            msg_stream << "-->" << std::endl;
+        }
+        
+        msg = msg_stream.str();   
+    }
+
+    void vp_expr_socket_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to socket by udp
+        auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
+        auto bytes_2_send_len = msg.size();
+        udp_writer.send(bytes_2_send, bytes_2_send_len);
+    }
+}
--- a/nodes/broker/vp_expr_socket_broker_node.h
+++ b/nodes/broker/vp_expr_socket_broker_node.h
@@ -0,0 +1,40 @@
+#pragma once
+
+#include "vp_msg_broker_node.h"
+#include "../../objects/ba/vp_ba_result.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+// light weight socket support
+#include "../../third_party/kissnet/kissnet.hpp"
+
+namespace vp_nodes {
+    // message broker node, broke math expression checking results (ONLY for vp_frame_text_target) to socket via udp.
+    // math expression checking results could be used for archive.
+    class vp_expr_socket_broker_node: public vp_msg_broker_node
+    {
+    private:
+        // save dir for screenshot images, which would be used for displaying in web client
+        std::string screenshot_dir = "screenshot_images";
+        // host the data sent to via udp
+        std::string des_ip = "";
+        // port the data sent to via udp
+        int des_port = 0;
+
+        // udp socket writer
+        kissnet::udp_socket udp_writer;
+    protected:
+        // to custom format
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to socket via udp
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_expr_socket_broker_node(std::string node_name, 
+                                std::string des_ip = "",
+                                int des_port = 0,
+                                std::string screenshot_dir = "screenshot_images",
+                                vp_broke_for broke_for = vp_broke_for::TEXT, 
+                                int broking_cache_warn_threshold = 50, 
+                                int broking_cache_ignore_threshold = 200);
+        ~vp_expr_socket_broker_node();
+    };
+}
--- a/nodes/broker/vp_json_console_broker_node.cpp
+++ b/nodes/broker/vp_json_console_broker_node.cpp
@@ -0,0 +1,58 @@
+
+#include "vp_json_console_broker_node.h"
+
+namespace vp_nodes {
+        
+    vp_json_console_broker_node::vp_json_console_broker_node(std::string node_name, 
+                                                            vp_broke_for broke_for, 
+                                                            int broking_cache_warn_threshold, 
+                                                            int broking_cache_ignore_threshold):
+                                                            vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold) {
+        this->initialized();
+    }
+    
+    vp_json_console_broker_node::~vp_json_console_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+    
+    void vp_json_console_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        // serialize objects to json by cereal
+        std::stringstream msg_stream;
+        {
+            cereal::JSONOutputArchive json_archive(msg_stream);
+            
+            // global values
+            json_archive(cereal::make_nvp("channel_index", meta->channel_index),
+            cereal::make_nvp("frame_index", meta->frame_index),
+            cereal::make_nvp("width", meta->frame.cols),
+            cereal::make_nvp("height", meta->frame.rows),
+            cereal::make_nvp("fps", meta->fps),
+            cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
+
+            // serialize values according to broke_for
+            if (broke_for == vp_broke_for::NORMAL) {
+                json_archive(cereal::make_nvp("target_size", meta->targets.size()), 
+                            cereal::make_nvp("targets", meta->targets));
+            }
+            else if (broke_for ==  vp_broke_for::FACE) {
+                json_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
+                            cereal::make_nvp("face_targets", meta->face_targets));
+            }
+            else if (broke_for == vp_broke_for::TEXT) {
+                json_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
+                            cereal::make_nvp("text_targets", meta->text_targets));
+            }
+            else {
+                throw "invalid broke_for!";
+            }
+        } // flush
+
+        msg = msg_stream.str();
+    }
+
+    void vp_json_console_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to console by std::cout
+        std::cout << msg << std::endl;
+    }
+}
--- a/nodes/broker/vp_json_console_broker_node.h
+++ b/nodes/broker/vp_json_console_broker_node.h
@@ -0,0 +1,26 @@
+#pragma once
+
+#include <sstream>
+
+#include "vp_msg_broker_node.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+namespace vp_nodes {
+    // message broker node (for debug purpose), broke json data to console.
+    class vp_json_console_broker_node: public vp_msg_broker_node
+    {
+    private:
+        /* data */
+    protected:
+        // to json
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to console
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_json_console_broker_node(std::string node_name, 
+                                    vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                                    int broking_cache_warn_threshold = 50, 
+                                    int broking_cache_ignore_threshold = 200);
+        ~vp_json_console_broker_node();
+    };
+}
--- a/nodes/broker/vp_json_kafka_broker_node.cpp
+++ b/nodes/broker/vp_json_kafka_broker_node.cpp
@@ -0,0 +1,65 @@
+#ifdef VP_WITH_KAFKA
+#include "vp_json_kafka_broker_node.h"
+
+namespace vp_nodes {
+        
+    vp_json_kafka_broker_node::vp_json_kafka_broker_node(std::string node_name, 
+                                                            std::string kafka_servers,
+                                                            std::string topic_name,
+                                                            vp_broke_for broke_for, 
+                                                            int broking_cache_warn_threshold, 
+                                                            int broking_cache_ignore_threshold):
+                                                            vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold) {
+        VP_INFO(vp_utils::string_format("[%s] kafka_servers:[%s] topic_name:[%s]", node_name.c_str(), kafka_servers.c_str(), topic_name.c_str()));
+        kafka_producer = std::make_shared<KafkaProducer>(kafka_servers, topic_name, 0);
+        this->initialized();
+    }
+    
+    vp_json_kafka_broker_node::~vp_json_kafka_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+    
+    void vp_json_kafka_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        // serialize objects to json by cereal
+        std::stringstream msg_stream;
+        {
+            cereal::JSONOutputArchive json_archive(msg_stream);
+            
+            // global values
+            json_archive(cereal::make_nvp("channel_index", meta->channel_index),
+            cereal::make_nvp("frame_index", meta->frame_index),
+            cereal::make_nvp("width", meta->frame.cols),
+            cereal::make_nvp("height", meta->frame.rows),
+            cereal::make_nvp("fps", meta->fps),
+            cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
+
+            // serialize values according to broke_for
+            if (broke_for == vp_broke_for::NORMAL) {
+                json_archive(cereal::make_nvp("target_size", meta->targets.size()), 
+                            cereal::make_nvp("targets", meta->targets));
+            }
+            else if (broke_for ==  vp_broke_for::FACE) {
+                json_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
+                            cereal::make_nvp("face_targets", meta->face_targets));
+            }
+            else if (broke_for == vp_broke_for::TEXT) {
+                json_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
+                            cereal::make_nvp("text_targets", meta->text_targets));
+            }
+            else {
+                throw "invalid broke_for!";
+            }
+        } // flush
+
+        msg = msg_stream.str();
+    }
+
+    void vp_json_kafka_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to kafka by kafka client api
+        if (kafka_producer != nullptr) {
+            kafka_producer->pushMessage(msg);
+        }
+    }
+}
+#endif
--- a/nodes/broker/vp_json_kafka_broker_node.h
+++ b/nodes/broker/vp_json_kafka_broker_node.h
@@ -0,0 +1,32 @@
+#pragma once
+
+#ifdef VP_WITH_KAFKA
+#include <sstream>
+
+#include "vp_msg_broker_node.h"
+#include "kafka_utils/KafkaProducer.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+namespace vp_nodes {
+    // message broker node, broke json data to kafka.
+    class vp_json_kafka_broker_node: public vp_msg_broker_node
+    {
+    private:
+        /* data */
+        std::shared_ptr<KafkaProducer> kafka_producer = nullptr;
+    protected:
+        // to json
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to console
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_json_kafka_broker_node(std::string node_name, 
+                                    std::string kafka_servers = "127.0.0.1:9092",
+                                    std::string topic_name = "videopipe_topic",
+                                    vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                                    int broking_cache_warn_threshold = 50, 
+                                    int broking_cache_ignore_threshold = 200);
+        ~vp_json_kafka_broker_node();
+    };
+}
+#endif
--- a/nodes/broker/vp_msg_broker_node.cpp
+++ b/nodes/broker/vp_msg_broker_node.cpp
@@ -0,0 +1,85 @@
+
+#include "vp_msg_broker_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_msg_broker_node::vp_msg_broker_node(std::string node_name,
+                                        vp_broke_for broke_for,
+                                        int broking_cache_warn_threshold, 
+                                        int broking_cache_ignore_threshold): 
+                                        vp_node(node_name), 
+                                        broke_for(broke_for),
+                                        broking_cache_warn_threshold(broking_cache_warn_threshold),
+                                        broking_cache_ignore_threshold(broking_cache_ignore_threshold) {
+        broking_th = std::thread(&vp_msg_broker_node::broking_run, this);
+    }
+    
+    vp_msg_broker_node::~vp_msg_broker_node() {
+
+    }
+
+    void vp_msg_broker_node::stop_broking() {
+        broking = false;
+        frames_to_broke.push(nullptr);   // send dead flag to broking_thread
+        broking_cache_semaphore.signal();
+        if (broking_th.joinable()) {
+            broking_th.join();
+        }
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_msg_broker_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // cache frame meta only if cache size is not greater than threshold
+        if (frames_to_broke.size() < broking_cache_ignore_threshold) {
+            // it is a producer
+            frames_to_broke.push(meta);
+            broking_cache_semaphore.signal();
+        }
+        
+        // warning 1 time in log
+        auto size = frames_to_broke.size();
+        if (size > broking_cache_warn_threshold && !broking_cache_warned) {
+            broking_cache_warned = true;
+            VP_WARN(vp_utils::string_format("[%s] [message broker] cache size is exceeding threshold! cache size is [%d], threshold is [%d]", node_name.c_str(), size, broking_cache_warn_threshold));
+        }
+
+        if (size <= broking_cache_warn_threshold) {
+            broking_cache_warned = false;
+        }
+
+        return meta;
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_msg_broker_node::handle_control_meta(std::shared_ptr<vp_objects::vp_control_meta> meta) {
+        return meta;
+    }
+
+    void vp_msg_broker_node::broking_run() {
+        while (broking) {
+            // it is a consumer
+            broking_cache_semaphore.wait();
+
+            auto frame_meta = frames_to_broke.front();
+            frames_to_broke.pop();
+
+            // dead flag
+            if (frame_meta == nullptr) {
+                continue;
+            }
+            
+            // message to be broked
+            std::string message;
+
+            // step 1, format message
+            format_msg(frame_meta, message);  // MUST be implemented in child class
+
+            // ignore if message is empty, because no broking occurs is allowed for some frames if some conditions not satisfied
+            if (message.empty()) {
+                continue;
+            }
+            
+            // step 2, broke message
+            broke_msg(message);               // MUST be implemented in child class
+        }
+    }
+}
--- a/nodes/broker/vp_msg_broker_node.h
+++ b/nodes/broker/vp_msg_broker_node.h
@@ -0,0 +1,65 @@
+
+#pragma once
+
+#include "../vp_node.h"
+
+namespace vp_nodes {
+    // broke for what type of data (vp_frame_target, vp_frame_face_target or others)
+    enum class vp_broke_for {
+        NORMAL,  // vp_frame_target
+        FACE,    // vp_frame_face_target
+        TEXT,    // vp_frame_text_target
+        POSE     // vp_frame_pose_target
+                 // others to extend
+    };
+
+    // base node for message brokers, 
+    // used to serialize objects (inside vp_frame_meta) to structured data and then push them to external modules like kafka, file or sockets. 
+    // note: 
+    // 1. this node works asynchronously which would not block pipeline.
+    // 2. this class can not be initialized directly.
+    class vp_msg_broker_node: public vp_node
+    {
+    private:
+        // warning if cache size greater than threshold
+        int broking_cache_warn_threshold = 50;
+        bool broking_cache_warned = false;
+
+        // ignore if cache size greater than threshold (skip directly)
+        int broking_cache_ignore_threshold = 200;
+
+        // cache frames to be broked
+        std::queue<std::shared_ptr<vp_objects::vp_frame_meta>> frames_to_broke;
+        vp_utils::vp_semaphore broking_cache_semaphore;
+
+        // broking thread
+        std::thread broking_th;
+        // broking function
+        void broking_run();
+        bool broking = true;
+    protected:
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override final;
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_control_meta(std::shared_ptr<vp_objects::vp_control_meta> meta) override final;
+
+        // serialize objects to message which SHOULD be implemented in child class.
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) = 0;
+        // broke message to external modules which SHOULD be implemented in child class.
+        virtual void broke_msg(const std::string& msg) = 0;
+        // wait thread exits in vp_msg_broker_node
+        void stop_broking();
+        // node applied for what type of target
+        vp_broke_for broke_for = vp_broke_for::NORMAL;
+
+        // string for broke_for
+        std::map<vp_broke_for, std::string> broke_fors = {{vp_broke_for::NORMAL, "normal"}, 
+                                                        {vp_broke_for::FACE, "face"}, 
+                                                        {vp_broke_for::TEXT, "text"}, 
+                                                        {vp_broke_for::POSE, "pose"}};
+    public:
+        vp_msg_broker_node(std::string node_name, 
+                        vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                        int broking_cache_warn_threshold = 50, 
+                        int broking_cache_ignore_threshold = 200);
+        ~vp_msg_broker_node();
+    };
+}
--- a/nodes/broker/vp_plate_socket_broker_node.cpp
+++ b/nodes/broker/vp_plate_socket_broker_node.cpp
@@ -0,0 +1,160 @@
+
+
+#include "vp_plate_socket_broker_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_plate_socket_broker_node::vp_plate_socket_broker_node(std::string node_name,
+                                                        std::string des_ip,
+                                                        int des_port,
+                                                        std::string plates_dir,
+                                                        int min_crop_width,
+                                                        int min_crop_height,
+                                                        vp_broke_for broke_for, 
+                                                        bool only_for_tracked,
+                                                        int broking_cache_warn_threshold, 
+                                                        int broking_cache_ignore_threshold):
+                                                        vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
+                                                        des_ip(des_ip),
+                                                        des_port(des_port),
+                                                        plates_dir(plates_dir),
+                                                        min_crop_width(min_crop_width),
+                                                        min_crop_height(min_crop_height),
+                                                        only_for_tracked(only_for_tracked) {
+        // only for vp_frame_target                                                    
+        assert(broke_for == vp_broke_for::NORMAL);
+        udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
+        VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
+        this->initialized();
+    }
+    
+    vp_plate_socket_broker_node::~vp_plate_socket_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+
+    void vp_plate_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        /* format:
+        line 0, <--
+        line 1, 1st time
+        line 2, 1st channel index, frame index
+        line 3, 1st the cropped image's path
+        line 4, 1st the whole image's path
+        line 5, 1st plate color
+        line 6, 1st plate text
+        line 7, -->
+        line 8, <--
+        line 9, 2nd time
+        line 10, 2nd channel index, frame index
+        line 11, 2nd the cropped image's path
+        line 12, 2nd the whole image's path
+        line 13, 2nd plate color
+        line 14, 2nd plate text
+        line 15, -->
+        line 16, ...
+        */
+        auto& broked_ids = all_broked_ids[meta->channel_index];
+        auto& broked_texts = all_broked_texts[meta->channel_index];
+        // remove 50 elements every 100 ids
+        if (broked_ids.size() > 100) {
+            broked_ids.erase(broked_ids.begin(), broked_ids.begin() + 50);
+        }
+        if (broked_texts.size() > 100) {
+            broked_texts.erase(broked_texts.begin(), broked_texts.begin() + 50);
+        }
+
+        std::stringstream msg_stream;
+        auto format_basic_info = [&](int channel_index, int frame_index) {
+            msg_stream << vp_utils::time_format(NOW) << std::endl;           // line1
+            msg_stream << channel_index << "," << frame_index << std::endl;  // line2
+        };
+        auto save_cropped_image = [&](cv::Mat& frame, cv::Rect rect, std::string name) {
+            auto cropped = frame(rect);
+            cv::imwrite(name, cropped);
+            msg_stream << name << std::endl;  // line3
+        };
+        auto save_whole_image = [&](cv::Mat& frame, std::string name) {
+            cv::imwrite(name, frame);
+            msg_stream << name << std::endl;  // line4
+        };
+        auto format_plate = [&](const std::string& plate_color, const std::string& plate_text) {
+            msg_stream << plate_color << std::endl;  // line5          
+            msg_stream << plate_text << std::endl;   // line6
+        };
+        if (broke_for == vp_broke_for::NORMAL) {
+            for (int i = 0; i < meta->targets.size(); i++) {
+                auto& t = meta->targets[i];
+                // only broke for tracked targets and have enough frames
+                if ((only_for_tracked && t->track_id < 0) || (only_for_tracked && t->tracks.size() < min_tracked_frames)) {
+                    continue;
+                }
+                
+                // only broke 1 time for specific track id if it has been tracked, or broke many times
+                if (t->track_id >= 0 && 
+                    std::find(broked_ids.begin(), broked_ids.end(), t->track_id) != broked_ids.end()) {
+                    broked_ids.push_back(t->track_id);
+                    continue;
+                }
+
+                // size filter
+                if (t->width < min_crop_width || t->height < min_crop_height) {
+                    continue;
+                }
+
+                auto color_and_text = vp_utils::string_split(t->primary_label, '_');
+                // make sure plate text and color detected
+                if(color_and_text.size() != 2) {
+                    continue;
+                }
+                auto& color = color_and_text[0];
+                auto& text = color_and_text[1];
+                // check for length, 7 or 3 + 6 or 3 + 7
+                if (text.length() != 7 && text.length() != 9 && text.length() != 10) {
+                    continue;
+                }
+
+                // only broke 1 time for specific plate text in small period
+                if (std::find(broked_texts.begin(), broked_texts.end(), text) != broked_texts.end()) {
+                    broked_texts.push_back(text);
+                    continue;
+                }
+                
+                // cache for texts
+                broked_texts.push_back(text);
+                // cache for ids
+                if (t->track_id >= 0) {
+                    broked_ids.push_back(t->track_id);
+                }
+
+                auto cropped_name = plates_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + color + "_" + text  + "_cropped.jpg";
+                auto whole_name = plates_dir + "/" + std::to_string(t->channel_index) + "_" + std::to_string(t->frame_index) + "_" + color + "_" + text  + "_whole.jpg";
+                // start flag
+                msg_stream << "<--" << std::endl;
+                // basic info
+                format_basic_info(meta->channel_index, meta->frame_index);
+                // save small cropped image
+                save_cropped_image(meta->frame, cv::Rect(t->x, t->y, t->width, t->height), cropped_name);
+                // save whole image
+                save_whole_image(meta->frame, whole_name);
+                // format color and text
+                format_plate(color, text);
+                // end flag
+                msg_stream << "-->";
+                
+                if (i != meta->targets.size() - 1) {
+                    msg_stream << std::endl;  // not the last one
+                }
+            }
+        }
+        
+        msg = msg_stream.str();   
+    }
+
+    void vp_plate_socket_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to socket by udp
+        auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
+        auto bytes_2_send_len = msg.size();
+        udp_writer.send(bytes_2_send, bytes_2_send_len);
+    }
+}
--- a/nodes/broker/vp_plate_socket_broker_node.h
+++ b/nodes/broker/vp_plate_socket_broker_node.h
@@ -0,0 +1,56 @@
+#pragma once
+
+#include "vp_msg_broker_node.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+// light weight socket support
+#include "../../third_party/kissnet/kissnet.hpp"
+
+namespace vp_nodes {
+    // message broker node, broke text & color for license plate (hold by vp_frame_target) to socket via udp.
+    // which could be used for lpr camera.
+    class vp_plate_socket_broker_node: public vp_msg_broker_node
+    {
+    private:
+        // save dir for cropped images and whole images, which would be used for displaying in lpr camera
+        std::string plates_dir = "plate_images";
+        // min width to crop (license plate will be ignored if target's width is smaller than this value)
+        int min_crop_width = 50;
+        // min height to crop (license plate will be ignored if target's height is smaller than this value)
+        int min_crop_height = 50;
+        // only broke for tracked targets (track_id is not -1)
+        bool only_for_tracked = false;
+        
+        // min tracked frames if only_for_tracked is true
+        int min_tracked_frames = 25;
+
+        // host the data sent to via udp
+        std::string des_ip = "";
+        // port the data sent to via udp
+        int des_port = 0;
+
+        // udp socket writer
+        kissnet::udp_socket udp_writer;
+
+        // support multi-channel, used for avoid duplicate data
+        std::map<int, std::vector<int>> all_broked_ids;           // channel -> target ids which have been broked
+        std::map<int, std::vector<std::string>> all_broked_texts;  // channel -> plate texts which have been broked (filter using similiarity comparison)
+    protected:
+        // to custom format
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to socket via udp
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_plate_socket_broker_node(std::string node_name, 
+                                std::string des_ip = "",
+                                int des_port = 0,
+                                std::string plates_dir = "plate_images",
+                                int min_crop_width = 100,
+                                int min_crop_height = 0,
+                                vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                                bool only_for_tracked = true, 
+                                int broking_cache_warn_threshold = 50, 
+                                int broking_cache_ignore_threshold = 200);
+        ~vp_plate_socket_broker_node();
+    };
+}
--- a/nodes/broker/vp_xml_file_broker_node.cpp
+++ b/nodes/broker/vp_xml_file_broker_node.cpp
@@ -0,0 +1,70 @@
+#include "vp_xml_file_broker_node.h"
+
+namespace vp_nodes {
+    
+    vp_xml_file_broker_node::vp_xml_file_broker_node(std::string node_name, 
+                                                    vp_broke_for broke_for, 
+                                                    std::string file_path_and_name,
+                                                    int broking_cache_warn_threshold, 
+                                                    int broking_cache_ignore_threshold):
+                                                    vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold) {
+        // open file
+        if (file_path_and_name.empty()) {
+            file_path_and_name = "vp_broker_" + vp_utils::time_format(NOW, "<hour>-<min>-<sec>-<mili>") + ".xml";
+        }
+        VP_INFO(vp_utils::string_format("[%s] [message broker] set broking file path as `%s`", node_name.c_str(), file_path_and_name.c_str()));
+        xml_writer.open(file_path_and_name);
+
+        this->initialized();
+    }
+    
+    vp_xml_file_broker_node::~vp_xml_file_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+    
+    void vp_xml_file_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        // serialize objects to xml by cereal
+        std::stringstream msg_stream;
+        {
+            cereal::XMLOutputArchive xml_archive(msg_stream);
+            
+            // global values
+            xml_archive(cereal::make_nvp("channel_index", meta->channel_index),
+            cereal::make_nvp("frame_index", meta->frame_index),
+            cereal::make_nvp("width", meta->frame.cols),
+            cereal::make_nvp("height", meta->frame.rows),
+            cereal::make_nvp("fps", meta->fps),
+            cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
+
+            // serialize values according to broke_for
+            if (broke_for == vp_broke_for::NORMAL) {
+                xml_archive(cereal::make_nvp("target_size", meta->targets.size()), 
+                            cereal::make_nvp("targets", meta->targets));
+            }
+            else if (broke_for ==  vp_broke_for::FACE) {
+                xml_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
+                            cereal::make_nvp("face_targets", meta->face_targets));
+            }
+            else if (broke_for == vp_broke_for::TEXT) {
+                xml_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
+                            cereal::make_nvp("text_targets", meta->text_targets));
+            }
+            else {
+                throw "invalid broke_for!";
+            }
+        } // flush
+
+        msg = msg_stream.str();   
+    }
+
+    void vp_xml_file_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to file by ofstream
+        if (xml_writer.is_open()) {
+            xml_writer << msg << std::endl;
+        }
+        else {
+            // TO-DO
+        }
+    }
+}
--- a/nodes/broker/vp_xml_file_broker_node.h
+++ b/nodes/broker/vp_xml_file_broker_node.h
@@ -0,0 +1,28 @@
+#pragma once
+
+#include <fstream>
+
+#include "vp_msg_broker_node.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+namespace vp_nodes {
+    // message broker node (for demo/debug purpose), broke xml data to file.
+    class vp_xml_file_broker_node: public vp_msg_broker_node
+    {
+    private:
+        // xml file writer
+        ofstream xml_writer;
+    protected:
+        // to xml
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to file
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_xml_file_broker_node(std::string node_name, 
+                                vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                                std::string file_path_and_name = "",
+                                int broking_cache_warn_threshold = 50, 
+                                int broking_cache_ignore_threshold = 200);
+        ~vp_xml_file_broker_node();
+    };
+}
--- a/nodes/broker/vp_xml_socket_broker_node.cpp
+++ b/nodes/broker/vp_xml_socket_broker_node.cpp
@@ -0,0 +1,68 @@
+
+
+#include "vp_xml_socket_broker_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_xml_socket_broker_node::vp_xml_socket_broker_node(std::string node_name,
+                                                        std::string des_ip,
+                                                        int des_port,
+                                                        vp_broke_for broke_for, 
+                                                        int broking_cache_warn_threshold, 
+                                                        int broking_cache_ignore_threshold):
+                                                        vp_msg_broker_node(node_name, broke_for, broking_cache_warn_threshold, broking_cache_ignore_threshold),
+                                                        des_ip(des_ip),
+                                                        des_port(des_port) {
+        udp_writer = kissnet::udp_socket(kissnet::endpoint(des_ip, des_port));
+        VP_INFO(vp_utils::string_format("[%s] [message broker] set des_ip as `%s` and des_port as [%d]", node_name.c_str(), des_ip.c_str(), des_port));
+        this->initialized();
+    }
+    
+    vp_xml_socket_broker_node::~vp_xml_socket_broker_node() {
+        deinitialized();
+        stop_broking();
+    }
+
+    void vp_xml_socket_broker_node::format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) {
+        // serialize objects to xml by cereal
+        std::stringstream msg_stream;
+        {
+            cereal::XMLOutputArchive xml_archive(msg_stream);
+            
+            // global values
+            xml_archive(cereal::make_nvp("channel_index", meta->channel_index),
+            cereal::make_nvp("frame_index", meta->frame_index),
+            cereal::make_nvp("width", meta->frame.cols),
+            cereal::make_nvp("height", meta->frame.rows),
+            cereal::make_nvp("fps", meta->fps),
+            cereal::make_nvp("broke_for", broke_fors.at(broke_for)));
+
+            // serialize values according to broke_for
+            if (broke_for == vp_broke_for::NORMAL) {
+                xml_archive(cereal::make_nvp("target_size", meta->targets.size()), 
+                            cereal::make_nvp("targets", meta->targets));
+            }
+            else if (broke_for ==  vp_broke_for::FACE) {
+                xml_archive(cereal::make_nvp("face_target_size", meta->face_targets.size()),
+                            cereal::make_nvp("face_targets", meta->face_targets));
+            }
+            else if (broke_for == vp_broke_for::TEXT) {
+                xml_archive(cereal::make_nvp("text_target_size", meta->text_targets.size()),
+                            cereal::make_nvp("text_targets", meta->text_targets));
+            }
+            else {
+                throw "invalid broke_for!";
+            }
+        } // flush
+
+        msg = msg_stream.str();   
+    }
+
+    void vp_xml_socket_broker_node::broke_msg(const std::string& msg) {
+        // broke msg to socket by udp
+        auto bytes_2_send = reinterpret_cast<const std::byte*>(msg.c_str());
+        auto bytes_2_send_len = msg.size();
+        udp_writer.send(bytes_2_send, bytes_2_send_len);
+    }
+}
--- a/nodes/broker/vp_xml_socket_broker_node.h
+++ b/nodes/broker/vp_xml_socket_broker_node.h
@@ -0,0 +1,35 @@
+#pragma once
+
+#include "vp_msg_broker_node.h"
+#include "cereal_archive/vp_objects_cereal_archive.h"
+
+// light weight socket support
+#include "../../third_party/kissnet/kissnet.hpp"
+
+namespace vp_nodes {
+    // message broker node, broke xml data to socket via udp.
+    class vp_xml_socket_broker_node: public vp_msg_broker_node
+    {
+    private:
+        // host the data sent to via udp
+        std::string des_ip = "";
+        // port the data sent to via udp
+        int des_port = 0;
+
+        // udp socket writer
+        kissnet::udp_socket udp_writer;
+    protected:
+        // to xml
+        virtual void format_msg(const std::shared_ptr<vp_objects::vp_frame_meta>& meta, std::string& msg) override;
+        // to socket via udp
+        virtual void broke_msg(const std::string& msg) override;
+    public:
+        vp_xml_socket_broker_node(std::string node_name, 
+                                std::string des_ip = "",
+                                int des_port = 0,
+                                vp_broke_for broke_for = vp_broke_for::NORMAL, 
+                                int broking_cache_warn_threshold = 50, 
+                                int broking_cache_ignore_threshold = 200);
+        ~vp_xml_socket_broker_node();
+    };
+}
--- a/nodes/ffio/ff_common.cpp
+++ b/nodes/ffio/ff_common.cpp
@@ -0,0 +1,55 @@
+
+#ifdef VP_WITH_FFMPEG
+#include "ff_common.h"
+
+namespace vp_nodes {
+    ff_scaler::ff_scaler(int src_width, 
+            int src_height, 
+            AVPixelFormat src_fmt, 
+            int dst_width, 
+            int dst_height,
+            AVPixelFormat dst_fmt):
+            m_src_width(src_width),
+            m_src_height(src_height),
+            m_src_fmt(src_fmt),
+            m_dst_width(dst_width),
+            m_dst_height(dst_height),
+            m_dst_fmt(dst_fmt) {
+        sws_ctx = sws_getContext(src_width, src_height, src_fmt, dst_width,dst_height, dst_fmt, SWS_FAST_BILINEAR, NULL, NULL, NULL);
+    }
+
+    ff_scaler::~ff_scaler() {
+        if (!sws_ctx) {
+            sws_freeContext(sws_ctx);
+        }
+    }
+
+    bool ff_scaler::scale(ff_av_frame_ptr src, ff_av_frame_ptr& dst) {
+        if (!sws_ctx || !src || !dst) {
+            return false;
+        }
+
+        if (src->format != m_src_fmt || src->width != m_src_width || src->height != m_src_height ||
+            dst->format != m_dst_fmt || dst->width != m_dst_width || dst->height != m_dst_height) {
+            return false;
+        }
+
+        if (dst->format == AV_PIX_FMT_BGR24) {
+            dst->linesize[0] = m_dst_width * 3;
+        }
+
+        /* allocate buffer if need for dst */
+        if (dst->data[0] == NULL) {
+            auto ret = av_frame_get_buffer(dst.get(), 0);
+            if (ret < 0) {
+                return false;
+            }
+        }
+        
+        auto ret = sws_scale(sws_ctx, src->data, src->linesize, 0, src->height, dst->data, dst->linesize);
+        dst->pts = src->pts;
+        dst->pkt_dts = src->pkt_dts;
+        return ret >= 0;
+    }
+}
+#endif
--- a/nodes/ffio/ff_common.h
+++ b/nodes/ffio/ff_common.h
@@ -0,0 +1,64 @@
+#pragma once
+#ifdef VP_WITH_FFMPEG
+#include <memory>
+#include <thread>
+#include <condition_variable>
+#include <mutex>
+#include <sstream>
+#include <unordered_map>
+#include <queue>
+extern "C" {
+#include <libavformat/avformat.h>
+#include <libavcodec/avcodec.h>
+#include <libswscale/swscale.h>
+#include <libavutil/avutil.h>
+#include <libavutil/pixdesc.h>
+#include <libavutil/imgutils.h>
+#include <libavdevice/avdevice.h>
+}
+
+#define ff_av_packet_ptr std::shared_ptr<AVPacket>
+#define ff_av_frame_ptr std::shared_ptr<AVFrame>
+#define ff_src_ptr std::shared_ptr<vp_nodes::ff_src>
+#define ff_des_ptr std::shared_ptr<vp_nodes::ff_des>
+#define ff_scaler_ptr std::shared_ptr<vp_nodes::ff_scaler>
+
+#define alloc_ff_src(channel_index) std::make_shared<vp_nodes::ff_src>(channel_index)
+#define alloc_ff_des(channel_index) std::make_shared<vp_nodes::ff_des>(channel_index)
+#define alloc_ff_scaler(src_width, src_height, src_fmt, dst_width, dst_height, dst_fmt)               \
+        std::make_shared<vp_nodes::ff_scaler>(src_width, src_height, src_fmt, dst_width, dst_height, dst_fmt)
+        
+#define alloc_ff_av_frame()                                                                           \
+        std::shared_ptr<AVFrame>(av_frame_alloc(), [](AVFrame *frame) { av_frame_free(&frame); })
+#define alloc_ff_av_packet()                                                                          \
+        std::shared_ptr<AVPacket>(av_packet_alloc(), [](AVPacket *pkt) { av_packet_free(&pkt); })
+    
+namespace vp_nodes {
+    /**
+     * tools for color conversion & image resize using FFmpeg on CPUs.
+     * 
+     */
+    class ff_scaler {
+    private:
+        SwsContext* sws_ctx = NULL;
+        int m_src_width;
+        int m_src_height;
+        AVPixelFormat m_src_fmt;
+        int m_dst_width;
+        int m_dst_height;
+        AVPixelFormat m_dst_fmt;
+    public:
+        /* disable copy and assignment operations */
+        ff_scaler(const ff_scaler&) = delete;
+        ff_scaler& operator=(const ff_scaler&) = delete;
+        ff_scaler(int src_width, 
+                 int src_height, 
+                 AVPixelFormat src_fmt, 
+                 int dst_width, 
+                 int dst_height,
+                 AVPixelFormat dst_fmt);
+        ~ff_scaler();
+        bool scale(ff_av_frame_ptr src, ff_av_frame_ptr& dst);
+    };
+}
+#endif
--- a/nodes/ffio/ff_des.cpp
+++ b/nodes/ffio/ff_des.cpp
@@ -0,0 +1,482 @@
+#ifdef VP_WITH_FFMPEG
+#include <iostream>
+#include <sstream>
+#include "ff_des.h"
+
+namespace vp_nodes {
+    ff_des::ff_des(int channel_index): m_channel_index(channel_index) {
+    }
+
+    ff_des::~ff_des() {
+        inner_close();
+    }
+
+    void ff_des::enmux_run() {
+        auto ret = 0;
+        while (m_enmux_running) {
+            ff_av_packet_ptr ff_packet = nullptr;
+            m_enmux_semaphore.wait();
+            {
+                std::lock_guard<std::mutex> g(m_enmux_packets_m);
+                ff_packet = m_enmux_packets_q.front();
+                m_enmux_packets_q.pop();
+            }
+
+            if (!ff_packet) {
+                VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] get exit flag.", 
+                                                get_channel_index()));
+                break;
+            } else {
+                /* set parameters for packets */
+                av_packet_rescale_ts(ff_packet.get(), m_enc_ctx->time_base, m_ofmt_ctx->streams[m_inner_stream_index]->time_base);
+                ff_packet->stream_index = m_inner_stream_index;
+                ret = av_interleaved_write_frame(m_ofmt_ctx, ff_packet.get());
+            }
+
+            if (ret < 0) {
+                VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] av_interleaved_write_frame failed. ret: %d", 
+                                                 get_channel_index(), ret));
+                break;
+            }
+        }
+        /* write the trailer for output stream */
+        ret = av_write_trailer(m_ofmt_ctx);
+        if (ret != 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] av_write_trailer failed. ret: %d", 
+                                             get_channel_index(), ret));
+        }
+        
+        /* set enmux running flag to false in case of abnormal exit */
+        m_enmux_running = false;
+        VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][enmux_run] enmux thread exits.", 
+                                        get_channel_index()));
+    }
+
+    void ff_des::encode_run() {
+        while (m_encode_running) {
+            ff_av_frame_ptr ff_frame;
+            m_encode_semaphore.wait();
+            {
+                std::lock_guard<std::mutex> g(m_encode_frames_m);
+                ff_frame = m_encode_frames_q.front();
+                m_encode_frames_q.pop();
+            }
+
+            auto ret = 0;
+            if (!ff_frame) {
+                VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][encode_run] get exit flag.", 
+                                                get_channel_index()));
+                break;
+            }
+            ff_frame->pts = m_enc_ctx->frame_number;
+            ret = avcodec_send_frame(m_enc_ctx, ff_frame.get());
+            if (ret < 0) {
+                VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][encode_run] avcodec_send_frame failed. ret: %d", 
+                                                 get_channel_index(), ret));
+                break;
+            }
+            
+            while (ret >= 0) {
+                auto ff_packet = alloc_ff_av_packet();
+                ret = avcodec_receive_packet(m_enc_ctx, ff_packet.get());
+                if (ret == AVERROR(EAGAIN)) {
+                    break;
+                } else if (ret < 0) {
+                    VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][encode_run] avcodec_receive_packet failed. ret: %d", 
+                                                     get_channel_index(), ret));
+                    ff_frame = nullptr;
+                    break;
+                }
+
+                bool notify = true;
+                {
+                    std::lock_guard<std::mutex> g(m_enmux_packets_m);
+                    m_enmux_packets_q.push(ff_packet);
+                    if (m_enmux_packets_q.size() > m_enmux_packets_q_max_size) {
+                        m_enmux_packets_q.pop();
+                        notify = false;
+                        VP_WARN(vp_utils::string_format("[ffio/ff_des][%d][encode_run] exceed m_enmux_packets_q_max_size(%d), discard the front in queue.", 
+                                                        get_channel_index(), m_enmux_packets_q_max_size));
+                    }
+                }
+                if (notify) {
+                    m_enmux_semaphore.signal();
+                }
+            }
+            
+            if (!ff_frame) {
+                break;
+            }  
+        }
+        /* set encode running flag to false in case of abnormal exit */
+        m_encode_running = false;
+        {
+            // send exit flag to notify enmux thread
+            std::lock_guard<std::mutex> g(m_enmux_packets_m);
+            m_enmux_packets_q.push(nullptr);
+            VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][encode_run] send exit flag to enmux thread.", 
+                                            get_channel_index()));
+        }
+        m_enmux_semaphore.signal();
+        VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][encode_run] encode thread exits.", 
+                                        get_channel_index()));
+    }
+
+    bool ff_des::
+    open(const std::string& uri,
+         int width, int height, int fps, int bitrate, int max_b_frames,
+         const std::string& encoder_name, 
+         AVPixelFormat sw_pix_fmt,
+         AVHWDeviceType hw_type,
+         AVPixelFormat hw_pix_fmt) {
+        inner_close();
+
+        auto uri_valid = false;
+        if (!uri_valid) {
+            auto uri_parts = vp_utils::string_split(uri, '.');  // file
+            if (std::find(m_supported_files.begin(), m_supported_files.end(), uri_parts[uri_parts.size() - 1]) != m_supported_files.end()) {
+                uri_valid = true;
+                m_live_stream = false;
+            }
+        }
+        if (!uri_valid) {
+            auto uri_parts = vp_utils::string_split(uri, ':');  // live stream
+            if (std::find(m_supported_protocols.begin(), m_supported_protocols.end(), uri_parts[0]) != m_supported_protocols.end()) {
+                uri_valid = true;
+                m_live_stream = true;
+            }
+        }
+        if (!uri_valid) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][open] uri invalid! uri MUST start with `rtsp/rtmp/..`(network streams) or end with `mp4/mkv/..`(file streams).", 
+                                             get_channel_index()));
+            return false;
+        }
+
+        return inner_open(uri, width, height, fps, bitrate, max_b_frames, encoder_name, sw_pix_fmt, hw_type, hw_pix_fmt);
+    }
+
+    void ff_des::close() {
+        inner_close();
+    }
+
+    bool ff_des::is_opened() const {
+        return m_encode_running && m_enmux_running;
+    }
+
+    void ff_des::inner_close() {
+        /* set running flags to false */
+        m_encode_running = false;
+        m_enmux_running = false;
+
+        inner_exit_signal();
+        /* waiting for threads exist */
+        if (m_encode_th != nullptr && m_encode_th->joinable()) {
+            m_encode_th->join();
+        }
+        if (m_enmux_th != nullptr && m_enmux_th->joinable()) {
+            m_enmux_th->join();
+        }
+
+        /* free format & codec context */
+        if (m_ofmt_ctx) {
+            if (m_ofmt_ctx->pb) {
+                avio_closep(&m_ofmt_ctx->pb);
+            }
+            avformat_free_context(m_ofmt_ctx);
+        }
+        if (m_enc_ctx) {
+            avcodec_free_context(&m_enc_ctx);
+        }
+
+        /* clear queue */
+        {
+            // clear enmux_packets_q
+            std::lock_guard<std::mutex> g(m_enmux_packets_m);
+            m_enmux_packets_q = {};
+            m_enmux_semaphore.reset();
+        }
+        {
+            // clear encode_frames_q
+            std::lock_guard<std::mutex> g(m_encode_frames_m);
+            m_encode_frames_q = {};
+            m_encode_semaphore.reset();
+        }    
+        VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][inner_close] close successfully.", 
+                                        get_channel_index()));
+    }
+
+    std::string ff_des::print_summary() {
+        return "";
+    }
+
+    bool ff_des::
+    inner_open(const std::string& uri,
+               int width, int height, int fps, int bitrate, int max_b_frames,
+               const std::string& encoder_name, 
+               AVPixelFormat sw_pix_fmt,
+               AVHWDeviceType hw_type,
+               AVPixelFormat hw_pix_fmt) {
+        auto ret = 0;
+        /* allocate the output media context */
+        ret = avformat_alloc_output_context2(&m_ofmt_ctx, NULL, NULL, uri.c_str());
+        if (ret < 0) {
+            VP_WARN(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not deduce output format from uri, try to use FLV.", 
+                                             get_channel_index()));
+            ret = avformat_alloc_output_context2(&m_ofmt_ctx, NULL, "flv", uri.c_str());
+        }
+        if (ret < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not deduce output format from uri(used FLV).", 
+                                             get_channel_index()));
+            return false;
+        }
+        auto ofmt = m_ofmt_ctx->oformat;   
+        
+        /* find encoder for the output stream */
+        const AVCodec* enc = nullptr;
+        if (encoder_name.empty()) {
+            enc = avcodec_find_encoder(ofmt->video_codec);             // get default encoder by AVCodecID
+        } else {
+            enc = avcodec_find_encoder_by_name(encoder_name.c_str());  // get by encoder name
+        }
+        if (!enc) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not find proper encoder for output stream.", 
+                                             get_channel_index()));
+            avformat_free_context(m_ofmt_ctx);
+            m_ofmt_ctx = NULL;
+            return false;
+        }
+        /* allocate a codec context for the encoder */
+        m_enc_ctx = avcodec_alloc_context3(enc);
+        if (!m_enc_ctx) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not allocate context for encoder(%s).", 
+                                             get_channel_index(), enc->name));
+            avformat_free_context(m_ofmt_ctx);
+            m_ofmt_ctx = NULL;
+            return false;
+        }
+        /* set parameters for encoder */
+        m_enc_ctx->framerate = (AVRational) {fps, 1};
+        m_enc_ctx->time_base = av_inv_q(m_enc_ctx->framerate);
+        m_enc_ctx->pix_fmt   = sw_pix_fmt;
+        m_enc_ctx->width     = width;
+        m_enc_ctx->height    = height;
+        m_enc_ctx->bit_rate  = bitrate * 1024;
+        m_enc_ctx->max_b_frames = max_b_frames;
+        /* check if need init hwaccels context for encoder */
+        if (hw_type != AVHWDeviceType::AV_HWDEVICE_TYPE_NONE) {
+            m_enc_ctx->pix_fmt   = hw_pix_fmt;
+            if ((ret = hw_encoder_init(m_enc_ctx, hw_type, sw_pix_fmt)) < 0) {
+                VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] hw_encoder_init failed. ret: %d", 
+                                                 get_channel_index(), ret));
+                avcodec_free_context(&m_enc_ctx);
+                avformat_free_context(m_ofmt_ctx);
+                m_ofmt_ctx = NULL;
+                return false;
+            }
+        }
+        /* open the encoder */
+        if ((ret = avcodec_open2(m_enc_ctx, enc, NULL)) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not open encoder. ret: %d", 
+                                             get_channel_index(), ret));
+            avcodec_free_context(&m_enc_ctx);
+            avformat_free_context(m_ofmt_ctx);
+            m_ofmt_ctx = NULL;
+            return false;
+        }
+        /* create new stream for output */
+        auto out_stream = avformat_new_stream(m_ofmt_ctx, NULL);
+        if (!out_stream) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] could not create new stream for output uri.", 
+                                             get_channel_index()));
+            avcodec_free_context(&m_enc_ctx);
+            avformat_free_context(m_ofmt_ctx);
+            m_ofmt_ctx = NULL;
+            return false;
+        }
+        m_inner_stream_index = out_stream->index;
+        out_stream->time_base = m_enc_ctx->time_base;
+        ret = avcodec_parameters_from_context(out_stream->codecpar, m_enc_ctx);
+        if (ret < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] failed to copy codec parameters from encode context. ret: %d", 
+                                             get_channel_index(), ret));
+            avcodec_free_context(&m_enc_ctx);
+            avformat_free_context(m_ofmt_ctx);
+            m_ofmt_ctx = NULL;
+            return false;
+        }
+        if (!(m_ofmt_ctx->flags & AVFMT_NOFILE)) {
+            ret = avio_open(&m_ofmt_ctx->pb, uri.c_str(), AVIO_FLAG_WRITE);
+            m_ofmt_ctx->flush_packets = 1;
+            m_ofmt_ctx->flags |= AVFMT_FLAG_FLUSH_PACKETS;
+            if (ret < 0) {
+                VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] avio_open failed on output uri(%s). ret: %d", 
+                                                 get_channel_index(), uri.c_str(), ret));
+                avcodec_free_context(&m_enc_ctx);
+                avformat_free_context(m_ofmt_ctx);
+                m_ofmt_ctx = NULL;
+                return false;
+            }
+        }
+
+        /* write the stream header */
+        if ((ret = avformat_write_header(m_ofmt_ctx, NULL)) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][inner_open] avformat_write_header failed. ret: %d", 
+                                             get_channel_index(), ret));
+            avcodec_free_context(&m_enc_ctx);
+            avformat_free_context(m_ofmt_ctx);
+            m_ofmt_ctx = NULL;
+            return false;
+        }
+        
+        /* initialize video & ff_des properties */
+        m_hw_type_name = hw_type != AV_HWDEVICE_TYPE_NONE ? std::string(av_hwdevice_get_type_name(hw_type)) : "none";
+        m_encoder_name = encoder_name.empty() ? std::string(m_enc_ctx->codec->name) : encoder_name;
+        m_uri = uri;
+        m_fps = fps;
+        m_width = width;
+        m_height = height;
+        m_bitrate = bitrate;
+        m_codec_name = std::string(avcodec_get_name(out_stream->codecpar->codec_id));;
+        m_pixel_format = std::string(av_get_pix_fmt_name(static_cast<AVPixelFormat>(out_stream->codecpar->format)));
+        m_max_b_frames = max_b_frames;
+
+        /* collect summary notify caller */
+        auto summary = print_summary();
+
+        /* go! */
+        m_enmux_running = true;
+        m_encode_running = true;
+        m_enmux_th = std::make_shared<std::thread>(&ff_des::enmux_run, this);
+        m_encode_th = std::make_shared<std::thread>(&ff_des::encode_run, this);
+        VP_INFO(vp_utils::string_format("[ffio/ff_des][%d][inner_open] open successfully.", 
+                                        get_channel_index()));
+        return true;   
+    }
+
+    bool ff_des::write(const ff_av_frame_ptr& frame) {
+        if (!is_opened()) {
+            return false;
+        }
+        
+        bool notify = true;
+        {
+            std::lock_guard<std::mutex> g(m_encode_frames_m);
+            m_encode_frames_q.push(frame);
+            if (m_encode_frames_q.size() > m_encode_frames_q_max_size) {
+                m_encode_frames_q.pop();
+                notify = false;
+                VP_WARN(vp_utils::string_format("[ffio/ff_des][%d][write] exceed m_encode_frames_q_max_size(%d), discard the front in queue.", 
+                                                get_channel_index(), m_encode_frames_q_max_size));
+            }
+        }
+        if (notify) {
+            m_encode_semaphore.signal();
+        }
+        return true;
+    }
+
+    ff_des& ff_des::operator<<(const ff_av_frame_ptr& frame) {
+        write(frame);
+        return *this;
+    }
+
+    void ff_des::inner_exit_signal() {
+        {
+            // send exit flag to notify encode thread
+            std::lock_guard<std::mutex> g(m_encode_frames_m);
+            m_encode_frames_q.push(nullptr);
+        }
+        m_encode_semaphore.signal();
+    }
+
+    int ff_des::get_video_fps() const {
+        return m_fps;
+    }
+
+    int ff_des::get_video_width() const {
+        return m_width;
+    }
+
+    int ff_des::get_video_height() const {
+        return m_height;    
+    }
+
+    long ff_des::get_video_bitrate() const {
+        return m_bitrate;
+    }
+
+    std::string ff_des::get_video_codec_name() const {
+        return m_codec_name;
+    }
+
+    std::string ff_des::get_video_pixel_format_name() const {
+        return m_pixel_format;
+    }
+
+    bool ff_des::is_live_stream() const {
+        return m_live_stream;
+    }
+
+    std::string ff_des::get_hw_type_name() const {
+        return m_hw_type_name;
+    }
+
+    std::string ff_des::get_encoder_name() const {
+        return m_encoder_name;
+    }
+
+    std::string ff_des::get_uri() const {
+        return m_uri;
+    }
+
+    int ff_des::get_channel_index() const {
+        return m_channel_index;   
+    }
+
+    const AVCodecContext* ff_des::get_encode_ctx() const {
+        return m_enc_ctx;
+    }
+
+    int ff_des::
+    hw_encoder_init(AVCodecContext* enc_ctx, 
+                    const enum AVHWDeviceType hw_type, 
+                    const enum AVPixelFormat sw_pix_fmt) {
+        AVBufferRef* hw_device_ctx = nullptr;
+        AVBufferRef* hw_frames_ref = nullptr;
+        AVHWFramesContext* frames_ctx = nullptr;
+        int err = 0;
+        if ((err = av_hwdevice_ctx_create(&hw_device_ctx, hw_type,
+                                        NULL, NULL, 0)) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][hw_encoder_init] failed to create specified(%s) HW device context for encoder.", 
+                                             get_channel_index(), av_hwdevice_get_type_name(hw_type)));
+            return err;
+        }
+
+        if (!(hw_frames_ref = av_hwframe_ctx_alloc(hw_device_ctx))) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][hw_encoder_init] failed to create specified(%s) HW frame context for encoder.", 
+                                             get_channel_index(), av_hwdevice_get_type_name(hw_type)));
+            return err;
+        }
+        frames_ctx = (AVHWFramesContext* )(hw_frames_ref->data);
+        frames_ctx->format    = enc_ctx->pix_fmt;
+        frames_ctx->sw_format = sw_pix_fmt;
+        frames_ctx->width     = enc_ctx->width;
+        frames_ctx->height    = enc_ctx->height;
+        frames_ctx->initial_pool_size = 20;
+        if ((err = av_hwframe_ctx_init(hw_frames_ref)) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_des][%d][hw_encoder_init] failed to initialize specified(%s) HW frame context for encoder.", 
+                                             get_channel_index(), av_hwdevice_get_type_name(hw_type)));
+            av_buffer_unref(&hw_frames_ref);
+            av_buffer_unref(&hw_device_ctx);
+            return err;
+        }
+        enc_ctx->hw_frames_ctx = av_buffer_ref(hw_frames_ref);
+        if (!enc_ctx->hw_frames_ctx)
+            err = AVERROR(ENOMEM);
+
+        av_buffer_unref(&hw_frames_ref);
+        av_buffer_unref(&hw_device_ctx);
+        return err;
+    }
+}
+#endif
--- a/nodes/ffio/ff_des.h
+++ b/nodes/ffio/ff_des.h
@@ -0,0 +1,241 @@
+#pragma once
+#ifdef VP_WITH_FFMPEG
+#include <string>
+#include <queue>
+#include <vector>
+#include <mutex>
+#include <algorithm>
+#include <functional>
+#include "ff_common.h"
+#include "../../utils/vp_semaphore.h"
+#include "../../utils/vp_utils.h"
+#include "../../utils/logger/vp_logger.h"
+
+namespace vp_nodes {
+    /**
+     * encode and enmux using FFmpeg.
+     * used to encode & enmux network streams or file streams.
+     */
+    class ff_des: public std::enable_shared_from_this<ff_des> {
+    private:
+        /* core members */
+        const std::vector<std::string> m_supported_files = {"mp4", "mkv", "flv", "h265", "h264"};
+        const std::vector<std::string> m_supported_protocols = {"rtsp", "rtmp", "udp", "rtp"};
+        AVFormatContext* m_ofmt_ctx = nullptr;
+        AVCodecContext* m_enc_ctx = nullptr;
+        std::queue<ff_av_packet_ptr> m_enmux_packets_q;
+        std::queue<ff_av_frame_ptr> m_encode_frames_q;
+        std::mutex m_enmux_packets_m;
+        std::mutex m_encode_frames_m;
+        int m_enmux_packets_q_max_size = 25;
+        int m_encode_frames_q_max_size = 25;
+        std::shared_ptr<std::thread> m_enmux_th = nullptr;
+        std::shared_ptr<std::thread> m_encode_th = nullptr;
+        vp_utils::vp_semaphore m_enmux_semaphore;
+        vp_utils::vp_semaphore m_encode_semaphore;
+
+        /**
+         * live stream or not for output.
+         */
+        bool m_live_stream = false;
+
+        /**
+         * fps of output stream.
+         */
+        int m_fps = 0;
+        
+        /**
+         * width of output stream in pixels.
+         */
+        int m_width = 0;
+
+        /**
+         * height of output stream in pixels.
+         */
+        int m_height = 0;
+
+        /**
+         * bitrate of output stream (kbit/s).
+         */
+        long m_bitrate = 0;
+
+        /**
+         * codec type name of output stream (strings like `h264/hevc/vp8/...`).
+         */
+        std::string m_codec_name = "";
+
+        /**
+         * pixel format of output stream (strings like `YUV420P/NV12/...`). 
+         */
+        std::string m_pixel_format = "";
+
+        /**
+         * max B frames for encode.
+         */
+        int m_max_b_frames = 0;
+
+        /**
+         * channel index of output.
+         */
+        int m_channel_index = -1;
+
+        /**
+         * type name of hardware used for encode (`none` if no hardware used).
+         */
+        std::string m_hw_type_name = "";
+
+        /**
+         * encoder name used for encoding.
+         */
+        std::string m_encoder_name = "";
+
+        /**
+         * uri for output (rtmp/file/...).
+         */
+        std::string m_uri = "";
+
+        /* inner flags */
+        int m_inner_stream_index = -1;
+        bool m_enmux_running = false;
+        bool m_encode_running = false;
+
+        /* inner methods */
+        void enmux_run();
+        void encode_run();
+        void inner_close();
+        void inner_exit_signal();
+        std::string print_summary();
+        bool inner_open(const std::string& uri,
+                        int width, int height, int fps, int bitrate, int max_b_frames,
+                        const std::string& encoder_name = "", 
+                        AVPixelFormat sw_pix_fmt = AVPixelFormat::AV_PIX_FMT_YUV420P,
+                        AVHWDeviceType hw_type = AVHWDeviceType::AV_HWDEVICE_TYPE_NONE,
+                        AVPixelFormat hw_pix_fmt = AVPixelFormat::AV_PIX_FMT_NONE);
+        int hw_encoder_init(AVCodecContext* enc_ctx, const enum AVHWDeviceType hw_type, const enum AVPixelFormat sw_pix_fmt);
+
+    public:
+        /**
+         * create ff_des instance using initial parameters.
+         * 
+         * @param channel_index specify the channel index for output stream.
+         */
+        ff_des(int channel_index);
+        ~ff_des();
+
+        /* disable copy and assignment operations */
+        ff_des(const ff_des&) = delete;
+        ff_des& operator=(const ff_des&) = delete;
+
+        /**
+         * try to open ff_des, not thread-safe.
+         * 
+         * @param 
+         */
+        bool open(const std::string& uri,
+                  int width, int height, int fps, int bitrate, int max_b_frames = 0,
+                  const std::string& encoder_name = "", 
+                  AVPixelFormat sw_pix_fmt = AVPixelFormat::AV_PIX_FMT_YUV420P,
+                  AVHWDeviceType hw_type = AVHWDeviceType::AV_HWDEVICE_TYPE_NONE,
+                  AVPixelFormat hw_pix_fmt = AVPixelFormat::AV_PIX_FMT_NONE);
+        
+        /**
+         * close ff_des.
+         */
+        void close();
+
+        /**
+         * if working or not.
+         * 
+         * @return
+         * true if it is working.
+         */
+        bool is_opened() const;
+
+        /**
+         * write the next frame to ff_des, keep as same as cv::VideoWriter.
+         * 
+         * @param frame frame to be written.
+         * 
+         * @return
+         * true if write successfully.
+         * 
+         * @note
+         * return false means:
+         * 1. write too quickly no space prepared, please try to write again.
+         * 2. ff_des not opened yet or not working.
+         */
+        bool write(const ff_av_frame_ptr& frame);
+
+        /**
+         * write the next frame to ff_des, support for `<<` operator.
+         * 
+         * @param frame frame to be written.
+         * 
+         * @return
+         * reference for ff_des.
+         */
+        ff_des& operator<<(const ff_av_frame_ptr& frame);
+
+        /**
+         * get fps of output stream in pixels.
+         */
+        int get_video_fps() const;
+
+        /**
+         * get width of output stream in pixels.
+         */
+        int get_video_width() const;
+
+        /**
+         * get height of output stream in pixels.
+         */
+        int get_video_height() const;
+
+        /**
+         * get bitrate of output stream (kbit/s).
+         */
+        long get_video_bitrate() const;
+
+        /**
+         * get codec type name of output stream (strings like `h264/hevc/vp8/...`).
+         */
+        std::string get_video_codec_name() const;
+
+        /**
+         * get pixel format of output stream (strings like `YUV420P/NV12/...`). 
+         */
+        std::string get_video_pixel_format_name() const;
+
+        /**
+         * check is live stream or not.
+         */
+        bool is_live_stream() const;
+
+        /**
+         * get type name of hardware used for encoding (strings like 'cuda/vaapi', 'none' if no hardware used).
+         */
+        std::string get_hw_type_name() const;
+
+        /**
+         * get encoder name used for encoding.
+         */
+        std::string get_encoder_name() const;
+
+        /**
+         * get uri of output.
+         */
+        std::string get_uri() const;
+
+        /**
+         * get channel index of output.
+         */
+        int get_channel_index() const;
+
+        /**
+         * get pointer of encode context from FFmpeg.
+         * used to allocate hardware AVFrame outside of ff_des.
+         */
+        const AVCodecContext* get_encode_ctx() const;
+    };
+}
+#endif
--- a/nodes/ffio/ff_src.cpp
+++ b/nodes/ffio/ff_src.cpp
@@ -0,0 +1,475 @@
+#ifdef VP_WITH_FFMPEG
+#include <iostream>
+#include <sstream>
+#include "ff_src.h"
+
+namespace vp_nodes {
+    ff_src::ff_src(int channel_index): m_channel_index(channel_index) {
+    }
+    
+    ff_src::~ff_src() {
+        inner_close();
+    }
+
+    void ff_src::demux_run() {
+        auto start_time = std::chrono::system_clock::now();
+        auto total_bytes = 0;
+        while (m_demux_running) {
+            auto demux_start_t = std::chrono::system_clock::now();
+            auto ff_packet = alloc_ff_av_packet();
+            auto ret = 0;
+
+            if ((ret = av_read_frame(m_ifmt_ctx, ff_packet.get())) < 0) {
+                VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][demux_run] av_read_frame failed. ret: %d", 
+                                                 get_channel_index(), 
+                                                 ret));
+                break;
+            }
+
+            // only demux the right video stream
+            if (ff_packet->stream_index != m_inner_stream_index) {
+                continue;
+            }
+/*
+            // analyse
+            vp_media::stream_analyser analyser(ff_packet->data, ff_packet->size, true);
+            std::vector<vp_media::nal_unit> nal_units;
+            analyser.analyse(nal_units);
+            std::ostringstream oss;
+            for (auto& nal: nal_units) {
+                oss << std::setw(8) << std::setfill(' ') << nal.index 
+                    << std::setw(16) << std::setfill(' ') << nal.offset
+                    << std::setw(8) << std::setfill(' ') << nal.nal_length 
+                    << std::setw(16) << std::setfill(' ') << vp_media::stream_analyser::to_hex(nal.start_bytes) << vp_media::stream_analyser::to_hex(nal.head_bytes, false) 
+                    << std::setw(4) << std::setfill(' ') <<  nal.nal_type
+                    << std::setw(24) << std::setfill(' ') << nal.nal_type_name << std::endl;
+            }
+            std::cout << oss.str();
+*/
+            // need calculate bitrate manually
+            if (m_bitrate <= 0) {
+                total_bytes += ff_packet->size;
+                auto current_time = std::chrono::system_clock::now();
+                auto elapsed_seconds = std::chrono::duration_cast<std::chrono::seconds>(current_time - start_time);
+                if (elapsed_seconds.count() > 5.0) {
+                    m_bitrate = (total_bytes * 8) / elapsed_seconds.count() / 1024;   // convert to kbit/s 
+                    total_bytes = 0;
+                    start_time = std::chrono::system_clock::now();
+                }
+            }
+            
+            bool notify = true;
+            {
+                std::lock_guard<std::mutex> g(m_demux_packets_m);
+                m_demux_packets_q.push(ff_packet);
+                if (m_demux_packets_q.size() > m_demux_packets_q_max_size) {
+                    m_demux_packets_q.pop();
+                    notify = false;
+                    VP_WARN(vp_utils::string_format("[ffio/ff_src][%d][demux_run] exceed m_demux_packets_q_max_size(%d), discard the front in queue.", 
+                                                    get_channel_index(), 
+                                                    m_demux_packets_q_max_size));
+                }
+            }
+            if (notify) {
+                m_demux_semaphore.signal();
+            }
+
+            /* 1. wait for a while for video file
+             * 2. demux as soon as possible for live stream
+            */
+            if (!m_live_stream) {
+                auto cost_t = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - demux_start_t);
+                auto duration_t = std::chrono::milliseconds(int(1000.0 / m_fps));
+                if (cost_t < duration_t) {
+                    std::this_thread::sleep_for(duration_t - cost_t);
+                }
+            }
+        }
+
+        /* set demux running flag to false in case of abnormal exit */
+        m_demux_running = false;
+        {
+            // send exit flag to notify decode thread
+            std::lock_guard<std::mutex> g(m_demux_packets_m);
+            m_demux_packets_q.push(nullptr);
+            VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][demux_run] send exit flag to decode thread.", 
+                                            get_channel_index()));
+        }
+        m_demux_semaphore.signal();
+        VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][demux_run] demux thread exits.", 
+                                        get_channel_index()));
+    }
+
+    void ff_src::decode_run() {
+        while (m_decode_running) {
+            m_demux_semaphore.wait();
+            ff_av_packet_ptr ff_packet = nullptr;
+            {
+                std::lock_guard<std::mutex> g(m_demux_packets_m);
+                ff_packet = m_demux_packets_q.front();
+                m_demux_packets_q.pop();
+            }
+
+            auto ret = 0;
+            /* get exit flag */
+            if (!ff_packet) {
+                VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][decode_run] get exit flag, go to flush decoder.", 
+                                                get_channel_index()));
+                ret = avcodec_send_packet(m_dec_ctx, NULL);  // flush decoder
+            } else {
+                ret = avcodec_send_packet(m_dec_ctx, ff_packet.get());
+            }
+
+            if (ret < 0) {
+                VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][decode_run] avcodec_send_packet failed. ret: %d", 
+                                                 get_channel_index(), ret));
+                break;
+            }
+            
+            while (ret >= 0) {
+                auto ff_frame = alloc_ff_av_frame();
+                ret = avcodec_receive_frame(m_dec_ctx, ff_frame.get());
+                if (ret == AVERROR(EAGAIN)) {
+                    break;
+                } else if (ret < 0) {
+                    VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][decode_run] avcodec_receive_frame failed. ret: %d", 
+                                                     get_channel_index(), 
+                                                     ret));
+                    ff_packet = nullptr;
+                    break;
+                }
+                {
+                    std::lock_guard<std::mutex> g(m_decode_frames_m);
+                    m_decode_frames_q.push(ff_frame);
+                    if (m_decode_frames_q.size() > m_decode_frames_q_max_size) {
+                        m_decode_frames_q.pop();
+                        VP_WARN(vp_utils::string_format("[ffio/ff_src][%d][decode_run] exceed m_decode_frames_q_max_size(%d), discard the front in queue.", 
+                                                        get_channel_index(), 
+                                                        m_decode_frames_q_max_size));
+                    }
+                }
+            }
+
+            if (!ff_packet) {
+                break;
+            }
+        }
+
+        /* set decode running flag to false in case of abnormal exit */
+        m_decode_running = false;
+        VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][decode_run] decode thread exits.", 
+                                        get_channel_index()));
+    }   
+
+    bool ff_src::open(const std::string& uri, const std::string& decoder_name, AVHWDeviceType hw_type) {
+        inner_close();
+        auto uri_valid = false;
+        if (!uri_valid) {
+            auto uri_parts = vp_utils::string_split(uri, '.');  // file
+            if (std::find(m_supported_files.begin(), m_supported_files.end(), uri_parts[uri_parts.size() - 1]) != m_supported_files.end()) {
+                uri_valid = true;
+                m_live_stream = false;
+            }
+        }
+        if (!uri_valid) {
+            auto uri_parts = vp_utils::string_split(uri, ':');  // live stream
+            if (std::find(m_supported_protocols.begin(), m_supported_protocols.end(), uri_parts[0]) != m_supported_protocols.end()) {
+                uri_valid = true;
+                m_live_stream = true;
+            }
+        }
+        if (!uri_valid) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][open] uri invalid! uri MUST start with `rtsp/rtmp/..`(network streams) or end with `mp4/mkv/..`(file streams).", 
+                                             get_channel_index()));
+            return false;
+        }
+        
+        return inner_open(uri, decoder_name, hw_type);
+    }
+
+    bool ff_src::inner_open(const std::string& uri, const std::string& decoder_name, AVHWDeviceType hw_type) {
+        auto ret = 0;
+        AVDictionary *options = NULL;
+        // Set options
+        av_dict_set(&options, "max_delay", "100000", 0);       // 设置最大延迟为100ms
+        av_dict_set(&options, "buffer_size", "2000000", 0);    // 设置缓冲区大小为2MB
+        av_dict_set(&options, "timeout", "3000000", 0);        // 设置超时时间为3秒
+        av_dict_set(&options, "rtsp_transport", "tcp", 0);     // 使用TCP传输
+        av_dict_set(&options, "max_interleave_delta", "1000000", 0);  // 设置最大间隔时间为1秒
+
+        /* open input uri(file/stream), and allocate format context */
+        if (avformat_open_input(&m_ifmt_ctx, uri.c_str(), NULL, NULL) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not open input uri.", 
+                                             get_channel_index()));
+            return false;
+        }
+        /* retrieve stream information */
+        if (avformat_find_stream_info(m_ifmt_ctx, NULL) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not find stream information.", 
+                                             get_channel_index()));
+            avformat_close_input(&m_ifmt_ctx);
+            return false;
+        }
+        /* find video stream */
+        if ((ret = av_find_best_stream(m_ifmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0)) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not find proper VIDEO stream.", 
+                                             get_channel_index()));
+            avformat_close_input(&m_ifmt_ctx);
+            return false;
+        }
+        m_inner_stream_index = ret;
+        auto in_stream = m_ifmt_ctx->streams[m_inner_stream_index];
+        
+        /* initialize video properties */
+        m_fps = int(av_q2d(in_stream->r_frame_rate));
+        m_width = in_stream->codecpar->width;
+        m_height = in_stream->codecpar->height;
+        m_bitrate = in_stream->codecpar->bit_rate / 1024;   // kb/s
+        m_codec_name = std::string(avcodec_get_name(in_stream->codecpar->codec_id));
+        m_duration = av_q2d(in_stream->time_base) * in_stream->duration;
+        m_duration = m_duration < 0 ? 0 : m_duration;
+        m_pixel_format = std::string(av_get_pix_fmt_name(static_cast<AVPixelFormat>(in_stream->codecpar->format)));
+        
+        /* find decoder for the input video stream */
+        const AVCodec* dec = nullptr;
+        if (decoder_name.empty()) {
+            dec = avcodec_find_decoder(in_stream->codecpar->codec_id);   // get default decoder by AVCodecID
+        } else {
+            dec = avcodec_find_decoder_by_name(decoder_name.c_str());    // get by decoder name
+        }
+        if (!dec) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not find the proper decoder for input stream.", 
+                                             get_channel_index()));
+            avformat_close_input(&m_ifmt_ctx);
+            return false;
+        }
+        /* allocate codec context for the decoder */
+        m_dec_ctx = avcodec_alloc_context3(dec);
+        if (!m_dec_ctx) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not allocate context for decoder(%s).", 
+                                             get_channel_index(), dec->name));
+            avformat_close_input(&m_ifmt_ctx);
+            return false;
+        }
+        /* copy codec parameters from input stream to codec context */
+        if ((ret = avcodec_parameters_to_context(m_dec_ctx, in_stream->codecpar)) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not copy codec parameters to decode context. ret: %d", 
+                                             get_channel_index(), ret));
+            avcodec_free_context(&m_dec_ctx);
+            avformat_close_input(&m_ifmt_ctx);
+            return false;
+        }
+        /* check if need init hwaccels context for decoder */
+        if (hw_type != AVHWDeviceType::AV_HWDEVICE_TYPE_NONE 
+            && hw_decoder_init(m_dec_ctx, hw_type) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] failed to create specified(%s) HW device context for decoder.", 
+                                             get_channel_index(), av_hwdevice_get_type_name(hw_type)));
+            avcodec_free_context(&m_dec_ctx);
+            avformat_close_input(&m_ifmt_ctx);
+            return false;
+        }
+        /* open the decoder */
+        if ((ret = avcodec_open2(m_dec_ctx, dec, NULL)) < 0) {
+            VP_ERROR(vp_utils::string_format("[ffio/ff_src][%d][inner_open] could not open decoder. ret: %d", 
+                                             get_channel_index(), ret));
+            avcodec_free_context(&m_dec_ctx);
+            avformat_close_input(&m_ifmt_ctx);
+            return false;
+        }
+        /* initialize node properties */
+        m_decoder_name = decoder_name.empty() ? std::string(m_dec_ctx->codec->name) : decoder_name;
+        m_uri = uri;
+        m_hw_type_name = hw_type != AV_HWDEVICE_TYPE_NONE ? std::string(av_hwdevice_get_type_name(hw_type)) : "none";
+
+        /* collect summary notify caller */
+        auto summary = print_summary();
+        if (m_src_opened_hooker) {
+            m_src_opened_hooker(shared_from_this(), summary);
+        }
+        
+        /* go! */
+        m_demux_running = true;
+        m_decode_running = true;
+        m_demux_th = std::make_shared<std::thread>(&ff_src::demux_run, this);
+        m_decode_th = std::make_shared<std::thread>(&ff_src::decode_run, this);
+        VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][inner_open] open successfully.", 
+                                         get_channel_index()));
+        return true;   
+    }
+
+    void ff_src::close() {
+        inner_close();
+    }
+
+    void ff_src::inner_close() {
+        /* set running flags to false */
+        m_demux_running = false;
+        m_decode_running = false;
+
+        /* waiting for threads exist */
+        if (m_demux_th != nullptr && m_demux_th->joinable()) {
+            m_demux_th->join();
+        }
+        if (m_decode_th != nullptr && m_decode_th->joinable()) {
+            m_decode_th->join();
+        }
+
+        /* free format & codec context */
+        if (m_ifmt_ctx) {
+            avformat_close_input(&m_ifmt_ctx);
+        }
+        if (m_dec_ctx) {
+            avcodec_free_context(&m_dec_ctx);
+        }
+
+        /* clear queue */
+        {
+            // clear m_demux_packets_q
+            std::lock_guard<std::mutex> g(m_demux_packets_m);
+            m_demux_packets_q = {};
+            m_demux_semaphore.reset();
+        } 
+        {
+            // clear m_decode_frames_q
+            std::lock_guard<std::mutex> g(m_decode_frames_m);
+            m_decode_frames_q = {};
+        }
+        VP_INFO(vp_utils::string_format("[ffio/ff_src][%d][inner_close] close successfully.", 
+                                         get_channel_index()));
+    }
+
+    bool ff_src::is_opened() const {
+        return m_demux_running && m_decode_running; 
+    }
+
+    int ff_src::get_video_fps() const {
+        // no check is_opened or not
+        return m_fps;
+    }
+
+    int ff_src::get_video_width() const {
+        // no check is_opened or not
+        return m_width;
+    }
+
+    int ff_src::get_video_height() const {
+        // no check is opened or not
+        return m_height;
+    }
+
+    long ff_src::get_video_bitrate() const {
+        // no check is opened or not
+        return m_bitrate;
+    }
+
+    std::string ff_src::get_video_codec_name() const {
+        // no check is_opened or not
+        return m_codec_name;
+    }
+
+    double ff_src::get_video_duration() const {
+        // no check is opened or not
+        return m_duration;    
+    }
+
+    bool ff_src::is_live_stream() const {
+        // no check is opened or not
+        return m_live_stream;
+    }
+
+    std::string ff_src::get_hw_type_name() const {
+        // no check is opened or not
+        return m_hw_type_name;
+    }
+
+    std::string ff_src::get_decoder_name() const {
+        // no check is opened or not
+        return m_decoder_name;
+    }
+
+    std::string ff_src::get_uri() const {
+        // no check is opened or not
+        return m_uri;
+    }
+
+    int ff_src::get_channel_index() const {
+        // no check is opened or not
+        return m_channel_index;
+    }
+
+    std::string ff_src::get_video_pixel_format_name() const {
+        // no check is opened or not
+        return m_pixel_format;
+    }
+
+    std::string ff_src::print_summary() {
+        std::ostringstream s_stream;
+        s_stream << std::endl;
+        s_stream << "################# summary for [ff_src] #################" << std::endl;
+        s_stream << "|channel_index     |: " << get_channel_index() << std::endl;
+        s_stream << "|uri               |: " << get_uri() << std::endl;
+        s_stream << "|is_live_stream    |: " << (is_live_stream() ? std::string("YES") : std::string("NO")) << std::endl;
+        s_stream << "|decoder_name      |: " << get_decoder_name() << std::endl;
+        s_stream << "|hw_type_name      |: " << get_hw_type_name() << std::endl;
+        s_stream << "|------------------|  " << std::endl;
+        s_stream << "|video_codec       |: " << get_video_codec_name() << std::endl;
+        s_stream << "|video_pic_format  |: " << get_video_pixel_format_name() << std::endl;
+        s_stream << "|video_fps         |: " << get_video_fps() << std::endl;
+        s_stream << "|video_width       |: " << get_video_width() << std::endl;
+        s_stream << "|video_height      |: " << get_video_height() << std::endl;
+        s_stream << "|video_bitrate     |: " << get_video_bitrate() << " [kbit/s]" << std::endl;
+        s_stream << "|video_duration    |: " << get_video_duration() << " [seconds]" << std::endl;
+        s_stream << "|------------------|  " << std::endl;
+        s_stream << "################# summary for [ff_src] #################" << std::endl;
+
+        auto summary = s_stream.str();
+        VP_INFO(summary);
+        return summary;
+    }
+
+    bool ff_src::read(ff_av_frame_ptr& frame) {
+        auto ret = false;
+        {
+            std::lock_guard<std::mutex> g(m_decode_frames_m);
+            if (!m_decode_frames_q.empty()) {
+                frame = m_decode_frames_q.front();
+                m_decode_frames_q.pop();
+                ret = true;
+            } else {
+                frame = nullptr;
+            }
+        }
+        if (!ret) {
+            // switch control of CPUs if no frame returned, avoid of occupying CPUs for a long time by caller outside
+            //std::this_thread::sleep_for(std::chrono::milliseconds(1));
+            //VP_DEBUG(vp_utils::string_format("[ffio/ff_src][%d][read] read frame failed, sleep for 1 millisecond.", 
+            //                                 get_channel_index()));
+        }
+
+        /* false means no frame returned (not opened or read too quickly so data not prepared) */
+        return ret;
+    }
+
+    ff_src& ff_src::operator>>(ff_av_frame_ptr& frame) {
+        read(frame);
+        return *this;
+    }
+
+    int ff_src::hw_decoder_init(AVCodecContext* dec_ctx, const enum AVHWDeviceType type) {
+        int err = 0;
+        AVBufferRef* hw_device_ctx = nullptr;
+        if ((err = av_hwdevice_ctx_create(&hw_device_ctx, type,
+                                        NULL, NULL, 0)) < 0) {
+            return err;
+        }
+        // update using AVHWDeviceContext*(hw_device_ctx->data)
+        dec_ctx->hw_device_ctx = av_buffer_ref(hw_device_ctx);
+        av_buffer_unref(&hw_device_ctx);
+        return err;
+    }
+
+    void ff_src::set_src_opened_hooker(ff_src_opened_hooker src_opened_hooker) {
+        m_src_opened_hooker = src_opened_hooker;
+    }
+}
+#endif
--- a/nodes/ffio/ff_src.h
+++ b/nodes/ffio/ff_src.h
@@ -0,0 +1,246 @@
+#pragma once
+#ifdef VP_WITH_FFMPEG
+#include <string>
+#include <queue>
+#include <vector>
+#include <mutex>
+#include <algorithm>
+#include <functional>
+#include "ff_common.h"
+#include "../../utils/vp_semaphore.h"
+#include "../../utils/vp_utils.h"
+#include "../../utils/logger/vp_logger.h"
+
+namespace vp_nodes {
+    class ff_src;
+    typedef std::function<void(ff_src_ptr, const std::string&)> ff_src_opened_hooker;
+
+    /**
+     * demux and decode using FFmpeg.
+     * used to demux & decode network streams or file streams.
+     */
+    class ff_src: public std::enable_shared_from_this<ff_src> {
+    private:
+        /* core members */
+        const std::vector<std::string> m_supported_files = {"mp4", "mkv", "flv", "avi", "h264"};
+        const std::vector<std::string> m_supported_protocols = {"rtsp", "rtmp", "http", "rtp"};
+        AVFormatContext* m_ifmt_ctx = nullptr;
+        AVCodecContext* m_dec_ctx = nullptr;
+        std::queue<ff_av_packet_ptr> m_demux_packets_q;
+        std::queue<ff_av_frame_ptr> m_decode_frames_q;
+        std::mutex m_demux_packets_m;
+        std::mutex m_decode_frames_m;        
+        int m_demux_packets_q_max_size = 25;
+        int m_decode_frames_q_max_size = 25;
+        std::shared_ptr<std::thread> m_demux_th = nullptr;
+        std::shared_ptr<std::thread> m_decode_th = nullptr;
+        vp_utils::vp_semaphore m_demux_semaphore;
+
+        /**
+         * live stream or not for input.
+         */
+        bool m_live_stream = false;
+        
+        /**
+         * fps of input stream.
+         */
+        int m_fps = 0;
+        
+        /**
+         * width of input stream in pixels.
+         */
+        int m_width = 0;
+
+        /**
+         * height of input stream in pixels.
+         */
+        int m_height = 0;
+
+        /**
+         * bitrate of input stream (kbit/s).
+         */
+        long m_bitrate = 0;
+
+        /**
+         * codec type name of input stream (strings like `h264/hevc/vp8/...`).
+         */
+        std::string m_codec_name = "";
+
+        /**
+         * pixel format of input stream (strings like `YUV420P/NV12/...`). 
+         */
+        std::string m_pixel_format = "";
+
+        /**
+         * duration of input stream (seconds), only for file stream.
+         */
+        double m_duration = 0.0;
+
+        /**
+         * channel index of input.
+         */
+        int m_channel_index = -1;
+
+        /**
+         * type name of hardware used for decoding (`none` if no hardware used).
+         */
+        std::string m_hw_type_name = "";
+
+        /**
+         * decoder name used for decoding.
+         */
+        std::string m_decoder_name = "";
+
+        /**
+         * uri of input (rtsp/file/...).
+         */
+        std::string m_uri = "";
+
+        /* inner flags */
+        int m_inner_stream_index = -1;
+        bool m_demux_running = false;
+        bool m_decode_running = false;
+
+        /* inner methods */
+        void demux_run();
+        void decode_run();
+        void inner_close();
+        std::string print_summary();
+        bool inner_open(const std::string& uri, const std::string& decoder_name, AVHWDeviceType hw_type);
+        int hw_decoder_init(AVCodecContext* dec_ctx, const enum AVHWDeviceType type);
+
+        /* callbacks */
+        ff_src_opened_hooker m_src_opened_hooker;
+    public:
+        /**
+         * create ff_src instance using initial parameters.
+         * 
+         * @param channel_index specify the channel index for input stream.
+         */
+        ff_src(int channel_index);
+        ~ff_src();
+
+        /* disable copy and assignment operations */
+        ff_src(const ff_src&) = delete;
+        ff_src& operator=(const ff_src&) = delete;
+
+        /**
+         * try to open ff_src, not thread-safe.
+         * 
+         * @param uri uri to open, url for network streams or file path for file streams.
+         * @param decoder_name specify the decoder name used for decoding, MUST supported by FFmpeg.
+         * @param hw_type type of hardware for decoding, MUST supported by FFmpeg.
+         * 
+         * @note
+         * if `decoder_name` not specified, ff_src will choose the default decoder accordding to the codec type of input stream.
+         */
+        bool open(const std::string& uri, 
+                  const std::string& decoder_name = "", 
+                  AVHWDeviceType hw_type = AVHWDeviceType::AV_HWDEVICE_TYPE_NONE);
+        /**
+         * close ff_src.
+         */
+        void close();
+
+        /**
+         * if working or not.
+         * 
+         * @return
+         * true if it is working.
+         */
+        bool is_opened() const;
+
+        /**
+         * read the next frame from ff_src, keep as same as cv::VideoCapture.
+         * 
+         * @param frame frame to be returned (return nullptr if read failed).
+         * 
+         * @return
+         * true if read successfully.
+         * 
+         * @note
+         * return false means:
+         * 1. read too quickly no data prepared, please try to read again.
+         * 2. ff_src not opened yet or not working.
+         */
+        bool read(ff_av_frame_ptr& frame);
+
+        /**
+         * read the next frame from ff_src, support for `>>` operator.
+         * 
+         * @param frame frame to be returned (return nullptr if read failed).
+         * 
+         * @return
+         * reference for ff_src.
+         */
+        ff_src& operator>>(ff_av_frame_ptr& frame);
+
+        /**
+         * get fps of input stream.
+         */
+        int get_video_fps() const;
+
+        /**
+         * get width of input stream in pixels.
+         */
+        int get_video_width() const;
+
+        /**
+         * get height of input stream in pixels.
+         */
+        int get_video_height() const;
+
+        /**
+         * get bitrate of input stream (kbit/s).
+         */
+        long get_video_bitrate() const;
+
+        /**
+         * get codec type name of input stream (strings like `h264/hevc/vp8/...`).
+         */
+        std::string get_video_codec_name() const;
+        
+        /**
+         * get pixel format of input stream (strings like `YUV420P/NV12/...`). 
+         */
+        std::string get_video_pixel_format_name() const;
+        
+        /**
+         * get duration of input stream (seconds), only for file stream.
+         */
+        double get_video_duration() const;
+        
+        /**
+         * check is live stream or not.
+         */
+        bool is_live_stream() const;
+        
+        /**
+         * get type name of hardware used for decoding (strings like 'cuda/vaapi', 'none' if no hardware used).
+         */
+        std::string get_hw_type_name() const;
+        
+        /**
+         * get decoder name used for decoding.
+         */
+        std::string get_decoder_name() const;
+        
+        /**
+         * get uri of input.
+         */
+        std::string get_uri() const;
+        
+        /**
+         * get channel index of input.
+         */
+        int get_channel_index() const;
+
+        /**
+         * set callback for opened event. would be activated every time ff_src opened.
+         * 
+         * @param src_opened_hooker callback activated when ff_src opened.
+         */
+        void set_src_opened_hooker(ff_src_opened_hooker src_opened_hooker);
+    };
+}
+#endif
--- a/nodes/ffio/vp_ff_des_node.cpp
+++ b/nodes/ffio/vp_ff_des_node.cpp
@@ -0,0 +1,97 @@
+#ifdef VP_WITH_FFMPEG
+#include "vp_ff_des_node.h"
+
+namespace vp_nodes {
+    vp_ff_des_node::
+    vp_ff_des_node(const std::string& node_name,
+                          int channel_index,
+                          const std::string& out_uri,
+                          vp_objects::vp_size resolution_w_h, 
+                          int bitrate,
+                          bool osd,
+                          std::string encoder_name):
+                          vp_des_node(node_name, channel_index),
+                          m_out_uri(out_uri),
+                          m_resolution_w_h(resolution_w_h),
+                          m_out_bitrate(bitrate),
+                          m_use_osd(osd),
+                          m_encoder_name(encoder_name) {
+        m_ff_des = alloc_ff_des(channel_index);
+        VP_INFO(vp_utils::string_format("[%s] [%s]", node_name.c_str(), out_uri.c_str()));
+        this->initialized();
+    }
+
+    vp_ff_des_node::~vp_ff_des_node() {
+        deinitialized();
+        if (sws_ctx) {
+            sws_freeContext(sws_ctx);
+        }
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_ff_des_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        VP_DEBUG(vp_utils::string_format("[%s] received frame meta, channel_index=>%d, frame_index=>%d", node_name.c_str(), meta->channel_index, meta->frame_index));
+        
+        cv::Mat resize_frame = (m_use_osd && !meta->osd_frame.empty()) ? meta->osd_frame : meta->frame;
+        auto out_width = resize_frame.cols;
+        auto out_height = resize_frame.rows;
+        if (m_resolution_w_h.width != 0 && m_resolution_w_h.height != 0) {
+            out_width = m_resolution_w_h.width;
+            out_height = m_resolution_w_h.height;
+        }
+              
+        /* try to open ff_dec. */
+        if (!m_ff_des->is_opened()) {
+            if(!m_ff_des->open(m_out_uri, 
+                                out_width, 
+                                out_height, 
+                                meta->fps, 
+                                m_out_bitrate, 
+                                0, 
+                                m_encoder_name, 
+                                AV_PIX_FMT_YUV420P)) {
+                VP_WARN(vp_utils::string_format("[%s] could not open ff_des.", node_name.c_str()));
+                /* general works in vp_des_node. */
+                return vp_des_node::handle_frame_meta(meta);
+            }
+        }
+
+        /* initialize sws_ctx. */
+        if (!sws_ctx) {
+            sws_ctx = sws_getContext(resize_frame.cols, 
+                                     resize_frame.rows, 
+                                     AV_PIX_FMT_BGR24, 
+                                     out_width, 
+                                     out_height, 
+                                     AV_PIX_FMT_YUV420P, 
+                                     0, NULL, NULL, NULL);
+        }
+        if (!sws_ctx) {
+            VP_WARN(vp_utils::string_format("[%s] could not initialize sws_ctx.", node_name.c_str()));
+            /* general works in vp_des_node. */
+            return vp_des_node::handle_frame_meta(meta);
+        }
+        
+        /* cv::Mat -> AVFrame. */
+        /* resize and convert to AV_PIX_FMT_YUV420P. */
+        auto dst_frame = alloc_ff_av_frame();
+        dst_frame->width = out_width;
+        dst_frame->height = out_height;
+        dst_frame->format= AV_PIX_FMT_YUV420P;
+        av_frame_get_buffer(dst_frame.get(), 0);
+        
+        auto p = resize_frame.data;
+        int linesize[1] = {resize_frame.cols * 3};
+        sws_scale(sws_ctx, &p, linesize, 0, resize_frame.rows, dst_frame->data, dst_frame->linesize);
+
+        /* write to ff_des. */
+        m_ff_des->write(dst_frame);
+
+        /* general works in vp_des_node. */
+        return vp_des_node::handle_frame_meta(meta);
+    }
+
+    std::string vp_ff_des_node::to_string() {
+        return m_out_uri;
+    }
+}
+#endif
--- a/nodes/ffio/vp_ff_des_node.h
+++ b/nodes/ffio/vp_ff_des_node.h
@@ -0,0 +1,70 @@
+#pragma once
+#ifdef VP_WITH_FFMPEG
+#include "ff_des.h"
+#include "../vp_des_node.h"
+
+namespace vp_nodes {
+    /**
+     * universal DES node using FFmpeg.
+     * 
+     * support output uri:
+     * 1. path of file streams like `./vp_data/out_vp_test.mp4`.
+     * 2. url of network streams like `rtmp://192.168.77.68/live/stream`.
+     */
+    class vp_ff_des_node final: public vp_des_node {
+    private:
+        /* inner members. */
+        std::string m_out_uri = "";
+        bool m_use_osd = true;
+        int m_out_bitrate = 1024;
+        std::string m_encoder_name = "";
+        vp_objects::vp_size m_resolution_w_h;
+
+        /**
+         * encode & enmux.
+         */
+        ff_des_ptr m_ff_des = nullptr;
+        /**
+         * SwsContext used fot scale by FFmpeg.
+         */
+        SwsContext* sws_ctx = NULL;
+    protected:
+        // re-implementation, return nullptr.
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override; 
+    public:
+        /**
+         * create vp_ff_des_node instance using initial parameters.
+         * 
+         * @param node_name specify the name of DES node.
+         * @param channel_index specify the channel index of DES node.
+         * @param out_uri specify the uri to be written.
+         * @param use_osd specify use osd frame as output or not.
+         * @param out_width specify the final width of output, 0 means use the width of frame flowing in pipeline.
+         * @param out_height specify the final height of output, 0 means use the height of frame flowing in pipeline.
+         * @param out_fps specify the fps of output, 0 means use the fps of original stream in pipeline.
+         * @param out_bitrate specify the bitrate of output (kbit/s).
+         * @param out_max_b_frames specify the max B frames in a GOP for encoding.
+         * @param encoder_name specify the encoder name (`libx264`/`libx265`/`h264_nvenc`/`hevc_nvenc`) used for encoding in FFmpeg.
+         * @param out_sw_pix_fmt specify the pixel format of output.
+         * 
+         * @note
+         * the encoder specified by `encoder_name` MUST be supported already in FFmpeg, 
+         * we can run `ffmpeg -encoders` to show list of encoders supported in FFmpeg.
+         * if the encoder not found, please reconfigure & rebuild your FFmpeg.
+         */
+        vp_ff_des_node(const std::string& node_name,
+                              int channel_index,
+                              const std::string& out_uri,
+                              vp_objects::vp_size resolution_w_h = {}, 
+                              int bitrate = 1024,
+                              bool osd = true,
+                              std::string encoder_name = "libx264");
+        ~vp_ff_des_node();
+
+        /**
+         * return out uri of DES node.
+         */
+        virtual std::string to_string() override;
+    };
+}
+#endif
--- a/nodes/ffio/vp_ff_src_node.cpp
+++ b/nodes/ffio/vp_ff_src_node.cpp
@@ -0,0 +1,149 @@
+#ifdef VP_WITH_FFMPEG
+#include "vp_ff_src_node.h"
+
+namespace vp_nodes {
+    
+    vp_ff_src_node::
+    vp_ff_src_node(const std::string& node_name, 
+                          int channel_index,
+                          const std::string& uri,
+                          const std::string& decoder_name,
+                          float resize_ratio,
+                          int skip_interval):
+                          vp_src_node(node_name, channel_index, resize_ratio),
+                          m_uri(uri),
+                          m_decoder_name(decoder_name),
+                          m_skip_interval(skip_interval) {
+        assert(skip_interval >= 0 && skip_interval <= 9);
+        m_ff_src = alloc_ff_src(channel_index);
+        VP_INFO(vp_utils::string_format("[%s] [%s]", node_name.c_str(), uri.c_str()));
+        this->initialized();
+    }
+    
+    vp_ff_src_node::~vp_ff_src_node() {
+        deinitialized();
+    }
+
+    void vp_ff_src_node::handle_run() {
+        SwsContext* sws_ctx = NULL;
+        auto free_sws_ctx = [&]() {
+            /* free swsContext. */
+            if (sws_ctx) {
+                sws_freeContext(sws_ctx);
+                sws_ctx = NULL;
+            }
+        };
+        auto reopen_wait = 10;
+        auto reopen_times = 0;
+        ff_av_frame_ptr src_frame;
+        int video_width = 0;
+        int video_height = 0;
+        int fps = 0;
+        int skip = 0;
+        while (alive) {
+            /* wait for data coming. */
+            gate.knock();
+            if (!m_ff_src->is_opened()) {
+                if(!m_ff_src->open(m_uri, m_decoder_name)) {
+                    reopen_times++;
+                    if (reopen_times < 5) {
+                        VP_WARN(vp_utils::string_format("[%s] open uri failed, try again right now...", node_name.c_str()));
+                    }
+                    else {
+                        VP_WARN(vp_utils::string_format("[%s] open uri failed too many times, wait for %d seconds then try again...", node_name.c_str(), reopen_wait));
+                        std::this_thread::sleep_for(std::chrono::milliseconds(1000 * reopen_wait));
+                        reopen_times = 0;
+                    }
+                    continue;
+                }
+                free_sws_ctx();
+            }
+
+            // video properties
+            if (video_width == 0 || video_height == 0 || fps == 0) {
+                video_width = m_ff_src->get_video_width();
+                video_height = m_ff_src->get_video_height();
+                fps = m_ff_src->get_video_fps();
+                
+                original_fps = fps;
+                original_width = video_width;
+                original_height = video_height;
+
+                // set true fps because skip some frames
+                fps = fps / (m_skip_interval + 1);
+            }
+            // stream_info_hooker activated if need
+            vp_stream_info stream_info {channel_index, original_fps, original_width, original_height, to_string()};
+            invoke_stream_info_hooker(node_name, stream_info);
+
+            /* try to read next frame from ff_src. */
+            if (!m_ff_src->read(src_frame)) {
+                //VP_WARN(vp_utils::string_format("[%s] reading frame failed, total frame==>%d", node_name.c_str(), frame_index));
+                continue;
+            }
+
+            // need skip
+            if (skip < m_skip_interval) {
+                skip++;
+                continue;
+            }
+            skip = 0;
+
+            /* AVFrame -> cv::Mat. */
+            /* resize and convert to BGR24. */
+            auto n_width = int(src_frame->width * resize_ratio);
+            auto n_height = int(src_frame->height * resize_ratio);
+            if (!sws_ctx) {
+                sws_ctx = sws_getContext(src_frame->width, 
+                                         src_frame->height, 
+                                         AVPixelFormat(src_frame->format), 
+                                         n_width, n_height, 
+                                         AV_PIX_FMT_BGR24, 
+                                         0, NULL, NULL, NULL);
+            }
+            if (!sws_ctx) {
+                VP_WARN(vp_utils::string_format("[%s] could not initialize sws_ctx.", node_name.c_str()));
+                continue;
+            }
+            auto buffer_size = n_width * n_height * 3;
+            uchar* bgr24 = new uchar[buffer_size];
+            int linesize[1] = {n_width * 3};
+            sws_scale(sws_ctx, src_frame->data, src_frame->linesize, 0, src_frame->height, &bgr24, linesize);
+
+            cv::Mat frame(n_height, n_width, CV_8UC3, bgr24);
+            auto c_frame = frame.clone();
+            delete[] bgr24;
+           // set true size because resize
+            video_width = c_frame.cols;
+            video_height = c_frame.rows;
+            
+            this->frame_index++;
+            // create frame meta
+            auto out_meta = 
+                std::make_shared<vp_objects::vp_frame_meta>(c_frame, this->frame_index, this->channel_index, video_width, video_height, fps);
+
+            if (out_meta != nullptr) {
+                this->out_queue.push(out_meta);
+                
+                // handled hooker activated if need
+                if (this->meta_handled_hooker) {
+                    meta_handled_hooker(node_name, out_queue.size(), out_meta);
+                }
+
+                // important! notify consumer of out_queue in case it is waiting.
+                this->out_queue_semaphore.signal();
+                VP_DEBUG(vp_utils::string_format("[%s] after handling meta, out_queue.size()==>%d", node_name.c_str(), out_queue.size()));
+            } 
+        }
+        
+        free_sws_ctx();
+        // send dead flag for dispatch_thread
+        this->out_queue.push(nullptr);
+        this->out_queue_semaphore.signal();    
+    }
+
+    std::string vp_ff_src_node::to_string() {
+        return m_uri;
+    }
+}
+#endif
--- a/nodes/ffio/vp_ff_src_node.h
+++ b/nodes/ffio/vp_ff_src_node.h
@@ -0,0 +1,61 @@
+
+#pragma once
+#ifdef VP_WITH_FFMPEG
+#include "ff_src.h"
+#include "../vp_src_node.h"
+
+namespace vp_nodes {
+    /**
+     * universal SRC node using FFmpeg.
+     * 
+     * support uri:
+     * 1. path of file streams like `./vp_data/vp_test.mp4`.
+     * 2. url of network streams like `rtsp://192.168.77.68/main_stream`.
+     */
+    class vp_ff_src_node final: public vp_src_node {
+    private:
+        /* inner members. */
+        std::string m_decoder_name = "";
+        std::string m_uri = "";
+        // 0 means no skip
+        int m_skip_interval = 0;
+
+        /**
+         * demux & decode.
+         */
+        ff_src_ptr m_ff_src = nullptr;
+    protected:
+        /**
+         * get frames using FFmpeg.
+         */
+        virtual void handle_run() override;
+    public:
+        /**
+         * create vp_ff_src_node instance using initial parameters.
+         * 
+         * @param node_name specify the name of SRC node.
+         * @param channel_index specify the channel index of SRC node.
+         * @param uri specify the uri to be opened by SRC node.
+         * @param decoder_name specify the decoder name (`h264`/`hevc`/`h264_cuvid`/`hevc_cuvid`) used for decoding in FFmpeg.
+         * @param resize_ratio specify the resize ratio applied to frames.
+         * 
+         * @note
+         * the decoder specified by `decoder_name` MUST be supported already in FFmpeg, 
+         * we can run `ffmpeg -decoders` to show list of decoders supported in FFmpeg.
+         * if the decoder not found, please reconfigure & rebuild your FFmpeg.
+         */
+        vp_ff_src_node(const std::string& node_name, 
+                              int channel_index,
+                              const std::string& uri,
+                              const std::string& decoder_name = "h264",
+                              float resize_ratio = 1.0,
+                              int skip_interval = 0);
+        ~vp_ff_src_node();
+
+        /**
+         * return uri of SRC node.
+         */
+        virtual std::string to_string() override;
+    };
+}
+#endif
--- a/nodes/infers/README.md
+++ b/nodes/infers/README.md
@@ -0,0 +1,5 @@
+# Summary
+
+1. If the inference node works on the whole frame, let it derived from `vp_primary_infer_node` class.
+2. If the inference node works on the small cropped images, let it derived from `vp_secondary_infer_node` class.
+3. We can define multi derived classes to handle different types of dl models (`detector/pose estimation/classification`). Also, if they work on the same type of target AND with the same logic (for example, hava the same `preprocess/postprocess`), we can use a unique class to load different deep learning models (like resnet18 and resnet50).
--- a/nodes/infers/vp_classifier_node.cpp
+++ b/nodes/infers/vp_classifier_node.cpp
@@ -0,0 +1,99 @@
+
+
+#include "vp_classifier_node.h"
+
+namespace vp_nodes {
+    
+    vp_classifier_node::vp_classifier_node(std::string node_name, 
+                                            std::string model_path, 
+                                            std::string model_config_path, 
+                                            std::string labels_path, 
+                                            int input_width, 
+                                            int input_height, 
+                                            int batch_size,
+                                            std::vector<int> p_class_ids_applied_to,
+                                            int min_width_applied_to,
+                                            int min_height_applied_to,
+                                            int crop_padding,
+                                            bool need_softmax,
+                                            float scale,
+                                            cv::Scalar mean,
+                                            cv::Scalar std,
+                                            bool swap_rb,
+                                            bool swap_chn):
+                                            vp_secondary_infer_node(node_name,
+                                                                    model_path,
+                                                                    model_config_path,
+                                                                    labels_path,
+                                                                    input_width, 
+                                                                    input_height,
+                                                                    batch_size,
+                                                                    p_class_ids_applied_to,
+                                                                    min_width_applied_to,
+                                                                    min_height_applied_to,
+                                                                    crop_padding,
+                                                                    scale,
+                                                                    mean,
+                                                                    std,
+                                                                    swap_rb,
+                                                                    swap_chn), need_softmax(need_softmax){
+        this->initialized();
+    }
+    
+    vp_classifier_node::~vp_classifier_node() {
+        deinitialized();
+    }
+
+    void vp_classifier_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        // make sure heads of output are not zero
+        assert(raw_outputs.size() > 0);
+        assert(frame_meta_with_batch.size() == 1);
+
+        // just one head of output
+        auto& output = raw_outputs[0];
+        assert(output.dims == 2);
+
+        auto count = output.rows;
+        auto index = 0;
+
+        auto& frame_meta = frame_meta_with_batch[0];
+        cv::Point class_id_point;
+        int class_id;
+        double score;
+
+        for (int i = 0; i < count; i++) {
+            for (int j = index; j < frame_meta->targets.size(); j++)
+            {
+                if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
+                    // continue as its primary_class_id is not in p_class_ids_applied_to
+                    continue;
+                }
+                
+                auto prob = output.row(i);
+                cv::minMaxLoc(prob, 0, &score, 0, &class_id_point);
+
+                if (need_softmax) {
+                    float maxProb = 0.0;
+                    float sum = 0.0;
+                    cv::Mat softmaxProb;
+                    maxProb = *std::max_element(prob.begin<float>(), prob.end<float>());
+                    cv::exp(prob - maxProb, softmaxProb);
+                    sum = (float)cv::sum(softmaxProb)[0];
+                    softmaxProb /= sum;
+                    minMaxLoc(softmaxProb.reshape(1, 1), 0, &score, 0, &class_id_point);
+                }
+                class_id = class_id_point.x;
+                auto label = (labels.size() < class_id + 1) ? "" : labels[class_id];
+
+                // update back to frame meta
+                frame_meta->targets[j]->secondary_class_ids.push_back(class_id);
+                frame_meta->targets[j]->secondary_scores.push_back(score);
+                frame_meta->targets[j]->secondary_labels.push_back(label);
+
+                // break as we found the right target!
+                index = j + 1;
+                break;
+            }
+        }
+    }
+}
--- a/nodes/infers/vp_classifier_node.h
+++ b/nodes/infers/vp_classifier_node.h
@@ -0,0 +1,38 @@
+
+
+#pragma once
+
+#include "../vp_secondary_infer_node.h"
+
+namespace vp_nodes {
+    // common classifier for image classification task.
+    // used for image classification, update secondary_class_ids/secondary_labels/secondary_scores of vp_frame_target.
+    class vp_classifier_node: public vp_secondary_infer_node
+    {
+    private:
+        // softmax logic applied on output or not
+        bool need_softmax;
+    protected:
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_classifier_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path = "", 
+                            std::string labels_path = "", 
+                            int input_width = 128, 
+                            int input_height = 128, 
+                            int batch_size = 1,
+                            std::vector<int> p_class_ids_applied_to = std::vector<int>(),
+                            int min_width_applied_to = 0,
+                            int min_height_applied_to = 0,
+                            int crop_padding = 10,
+                            bool need_softmax = true,  // imagenet dataset
+                            float scale = 1 / 255.0,
+                            cv::Scalar mean = cv::Scalar(123.675, 116.28, 103.53),  // imagenet dataset
+                            cv::Scalar std = cv::Scalar(0.229, 0.224, 0.225),
+                            bool swap_rb = true,
+                            bool swap_chn = false);
+        ~vp_classifier_node();
+    };
+
+}
--- a/nodes/infers/vp_enet_seg_node.cpp
+++ b/nodes/infers/vp_enet_seg_node.cpp
@@ -0,0 +1,33 @@
+
+#include "vp_enet_seg_node.h"
+
+namespace vp_nodes {
+
+    vp_enet_seg_node::vp_enet_seg_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path, 
+                            std::string labels_path, 
+                            int input_width, 
+                            int input_height, 
+                            int batch_size,
+                            int class_id_offset,
+                            float scale,
+                            cv::Scalar mean,
+                            cv::Scalar std,
+                            bool swap_rb):
+                            vp_primary_infer_node(node_name, model_path, model_config_path, labels_path, input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb) {
+        this->initialized();
+    }
+
+    vp_enet_seg_node::~vp_enet_seg_node() {
+        deinitialized();
+    }
+
+    void vp_enet_seg_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        auto& frame_meta = frame_meta_with_batch[0];
+        auto& mask = raw_outputs[0];
+
+        // save output as mask directly， this is the mask for the whole frame.
+        frame_meta->mask = mask.clone();
+    }
+}
--- a/nodes/infers/vp_enet_seg_node.h
+++ b/nodes/infers/vp_enet_seg_node.h
@@ -0,0 +1,30 @@
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+
+
+namespace vp_nodes {
+    // semantic segmentation based on ENet
+    // 
+    class vp_enet_seg_node: public vp_primary_infer_node
+    {
+    private:
+        /* data */
+    protected:
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_enet_seg_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path = "", 
+                            std::string labels_path = "", 
+                            int input_width = 1024, 
+                            int input_height = 512, 
+                            int batch_size = 1,
+                            int class_id_offset = 0,
+                            float scale = 1 / 255.0,
+                            cv::Scalar mean = cv::Scalar(0),
+                            cv::Scalar std = cv::Scalar(1),
+                            bool swap_rb = true);
+        ~vp_enet_seg_node();
+    };
+}
--- a/nodes/infers/vp_face_swap_node.cpp
+++ b/nodes/infers/vp_face_swap_node.cpp
@@ -0,0 +1,528 @@
+
+#include "vp_face_swap_node.h"
+
+namespace vp_nodes {
+    
+    vp_face_swap_node::vp_face_swap_node(std::string node_name, 
+                                        std::string yunet_face_detect_model,
+                                        std::string buffalo_l_face_encoding_model,
+                                        std::string emap_file_for_embeddings,
+                                        std::string insightface_swap_model,
+                                        std::string swap_source_image,
+                                        int swap_source_face_index,
+                                        int min_face_w_h,
+                                        bool swap_on_osd,
+                                        bool act_as_primary_detector):
+                                        vp_primary_infer_node(node_name, ""),
+                                        act_as_primary_detector(act_as_primary_detector),
+                                        swap_on_osd(swap_on_osd),
+                                        min_face_w_h(min_face_w_h) {
+        /* init net*/
+        face_extract_net = cv::dnn::readNetFromONNX(yunet_face_detect_model);
+        face_encoding_net = cv::dnn::readNetFromONNX(buffalo_l_face_encoding_model);
+        face_swap_net = cv::dnn::readNetFromONNX(insightface_swap_model);
+        #ifdef VP_WITH_CUDA
+        face_extract_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
+        face_extract_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
+        face_encoding_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
+        face_encoding_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
+        face_swap_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
+        face_swap_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
+        #endif
+        init_source_face_embeddings(swap_source_image, swap_source_face_index, emap_file_for_embeddings);
+        this->initialized();
+    }
+    
+    vp_face_swap_node::~vp_face_swap_node() {
+        deinitialized();
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_face_swap_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        if (frame_meta->face_targets.size() == 0) {
+            if (!act_as_primary_detector) {
+                return;
+            }
+            // extract faces and update back to frame meta
+            // to-do
+        }
+        
+        auto start_time = std::chrono::system_clock::now();
+        // iterate each face target
+        for (int i = 0; i < frame_meta->face_targets.size(); i++) {
+            if (frame_meta->face_targets[i]->width < min_face_w_h || frame_meta->face_targets[i]->height < min_face_w_h) {
+                continue;
+            }
+            
+            cv::Mat aligned_face, swapped_face;
+
+            //auto t = std::chrono::system_clock::now();
+            // align and crop first
+            auto warp_mat = align_crop(frame_meta->frame, frame_meta->face_targets[i]->key_points, aligned_face);
+            //auto T = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - t).count();
+
+            //std::cout << "1st T:" << T << std::endl;
+            //t = std::chrono::system_clock::now();
+            // swap
+            swap(aligned_face, swapped_face);
+            //T = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - t).count();
+            //std::cout << "2nd T:" << T << std::endl;
+
+            //t = std::chrono::system_clock::now();
+            // past back to frame or osd frame
+            if (swap_on_osd) {
+                if (frame_meta->osd_frame.empty()) {
+                    frame_meta->osd_frame = frame_meta->frame.clone();
+                }
+            }
+            auto& bg = swap_on_osd ? frame_meta->osd_frame : frame_meta->frame;
+            paste_back(bg, swapped_face, warp_mat);
+            //T = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - t).count();
+            //std::cout << "3rd T:" << T << std::endl;
+        }
+
+        //auto total_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time).count();
+        //std::cout << "total T:" << total_time << std::endl;
+    }
+
+    cv::Mat vp_face_swap_node::read_emap_mat_from_txt(const std::string& emap_file_for_embeddings, int rows, int cols) {
+        std::ifstream file(emap_file_for_embeddings);
+
+        cv::Mat matrix(rows, cols, CV_32F);
+        std::string line;
+        for (int i = 0; i < rows; ++i) {
+            std::getline(file, line);
+            std::istringstream iss(line);
+            for (int j = 0; j < cols; ++j) {
+                float value;
+                assert(iss >> value);
+                matrix.at<float>(i, j) = value;
+            }
+        }
+        file.close();
+        return matrix;
+    }
+
+    cv::Mat vp_face_swap_node::process_embeddings_using_emap(const cv::Mat& source_face_normed_embedding, const cv::Mat& emap) {
+        cv::Mat latent = source_face_normed_embedding.clone().reshape(1, 1);
+        cv::Mat result = latent * emap;
+        cv::normalize(result, result);
+        return result;
+    }
+
+    cv::Mat vp_face_swap_node::get_similarity_transform_matrix(float src[5][2]) {
+        using namespace cv;
+        //float dst[5][2] = { {38.2946f, 51.6963f}, {73.5318f, 51.5014f}, {56.0252f, 71.7366f}, {41.5493f, 92.3655f}, {70.7299f, 92.2041f} };  // for 112*112
+        float dst[5][2] = { {43.0f, 58.0f}, {85.0f, 58.0f}, {64.0f, 81.0f}, {47.0f, 105.0f}, {80.0f, 105.0f} };  // for 128*128
+        //float dst[5][2] = { {38.0f, 54.0f}, {90.0f, 54.0f}, {64.0f, 85.0f}, {47.0f, 109.0f}, {80.0f, 109.0f} };  // for 128*128, zoom out
+        float avg0 = (src[0][0] + src[1][0] + src[2][0] + src[3][0] + src[4][0]) / 5;
+        float avg1 = (src[0][1] + src[1][1] + src[2][1] + src[3][1] + src[4][1]) / 5;
+        //Compute mean of src and dst.
+        float src_mean[2] = { avg0, avg1 };
+        float dst_mean[2] = { 56.0262f, 71.9008f };
+        //Subtract mean from src and dst.
+        float src_demean[5][2];
+        for (int i = 0; i < 2; i++)
+        {
+            for (int j = 0; j < 5; j++)
+            {
+                src_demean[j][i] = src[j][i] - src_mean[i];
+            }
+        }
+        float dst_demean[5][2];
+        for (int i = 0; i < 2; i++)
+        {
+            for (int j = 0; j < 5; j++)
+            {
+                dst_demean[j][i] = dst[j][i] - dst_mean[i];
+            }
+        }
+        double A00 = 0.0, A01 = 0.0, A10 = 0.0, A11 = 0.0;
+        for (int i = 0; i < 5; i++)
+            A00 += dst_demean[i][0] * src_demean[i][0];
+        A00 = A00 / 5;
+        for (int i = 0; i < 5; i++)
+            A01 += dst_demean[i][0] * src_demean[i][1];
+        A01 = A01 / 5;
+        for (int i = 0; i < 5; i++)
+            A10 += dst_demean[i][1] * src_demean[i][0];
+        A10 = A10 / 5;
+        for (int i = 0; i < 5; i++)
+            A11 += dst_demean[i][1] * src_demean[i][1];
+        A11 = A11 / 5;
+        Mat A = (Mat_<double>(2, 2) << A00, A01, A10, A11);
+        double d[2] = { 1.0, 1.0 };
+        double detA = A00 * A11 - A01 * A10;
+        if (detA < 0)
+            d[1] = -1;
+        double T[3][3] = { {1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0} };
+        Mat s, u, vt, v;
+        SVD::compute(A, s, u, vt);
+        double smax = s.ptr<double>(0)[0]>s.ptr<double>(1)[0] ? s.ptr<double>(0)[0] : s.ptr<double>(1)[0];
+        double tol = smax * 2 * FLT_MIN;
+        int rank = 0;
+        if (s.ptr<double>(0)[0]>tol)
+            rank += 1;
+        if (s.ptr<double>(1)[0]>tol)
+            rank += 1;
+        double arr_u[2][2] = { {u.ptr<double>(0)[0], u.ptr<double>(0)[1]}, {u.ptr<double>(1)[0], u.ptr<double>(1)[1]} };
+        double arr_vt[2][2] = { {vt.ptr<double>(0)[0], vt.ptr<double>(0)[1]}, {vt.ptr<double>(1)[0], vt.ptr<double>(1)[1]} };
+        double det_u = arr_u[0][0] * arr_u[1][1] - arr_u[0][1] * arr_u[1][0];
+        double det_vt = arr_vt[0][0] * arr_vt[1][1] - arr_vt[0][1] * arr_vt[1][0];
+        if (rank == 1)
+        {
+            if ((det_u*det_vt) > 0)
+            {
+                Mat uvt = u*vt;
+                T[0][0] = uvt.ptr<double>(0)[0];
+                T[0][1] = uvt.ptr<double>(0)[1];
+                T[1][0] = uvt.ptr<double>(1)[0];
+                T[1][1] = uvt.ptr<double>(1)[1];
+            }
+            else
+            {
+                double temp = d[1];
+                d[1] = -1;
+                Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
+                Mat Dvt = D*vt;
+                Mat uDvt = u*Dvt;
+                T[0][0] = uDvt.ptr<double>(0)[0];
+                T[0][1] = uDvt.ptr<double>(0)[1];
+                T[1][0] = uDvt.ptr<double>(1)[0];
+                T[1][1] = uDvt.ptr<double>(1)[1];
+                d[1] = temp;
+            }
+        }
+        else
+        {
+            Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
+            Mat Dvt = D*vt;
+            Mat uDvt = u*Dvt;
+            T[0][0] = uDvt.ptr<double>(0)[0];
+            T[0][1] = uDvt.ptr<double>(0)[1];
+            T[1][0] = uDvt.ptr<double>(1)[0];
+            T[1][1] = uDvt.ptr<double>(1)[1];
+        }
+        double var1 = 0.0;
+        for (int i = 0; i < 5; i++)
+            var1 += src_demean[i][0] * src_demean[i][0];
+        var1 = var1 / 5;
+        double var2 = 0.0;
+        for (int i = 0; i < 5; i++)
+            var2 += src_demean[i][1] * src_demean[i][1];
+        var2 = var2 / 5;
+        double scale = 1.0 / (var1 + var2)* (s.ptr<double>(0)[0] * d[0] + s.ptr<double>(1)[0] * d[1]);
+        double TS[2];
+        TS[0] = T[0][0] * src_mean[0] + T[0][1] * src_mean[1];
+        TS[1] = T[1][0] * src_mean[0] + T[1][1] * src_mean[1];
+        T[0][2] = dst_mean[0] - scale*TS[0];
+        T[1][2] = dst_mean[1] - scale*TS[1];
+        T[0][0] *= scale;
+        T[0][1] *= scale;
+        T[1][0] *= scale;
+        T[1][1] *= scale;
+        Mat transform_mat = (Mat_<double>(2, 3) << T[0][0], T[0][1], T[0][2], T[1][0], T[1][1], T[1][2]);
+        return transform_mat;
+    }
+
+    cv::Mat vp_face_swap_node::align_crop(cv::Mat& src_img, std::vector<std::pair<int, int>>& kps, cv::Mat& aligned_image) {
+        float face_keypoints[5][2] = 
+            {{kps[0].first, kps[0].second}, 
+            {kps[1].first, kps[1].second}, 
+            {kps[2].first, kps[2].second},  
+            {kps[3].first, kps[3].second}, 
+            {kps[4].first, kps[4].second}};
+
+        cv::Mat warp_mat = get_similarity_transform_matrix(face_keypoints);
+        cv::warpAffine(src_img, aligned_image, warp_mat, cv::Size(128, 128), cv::INTER_LINEAR);
+        return warp_mat;
+    }
+    
+    void vp_face_swap_node::extract_faces(const cv::Mat& frame, std::vector<face_box>& face_boxes) {
+        auto blob_to_infer = cv::dnn::blobFromImage(frame);
+        face_extract_net.setInput(blob_to_infer);
+        const std::vector<std::string> out_names = {"loc", "conf", "iou"};
+        std::vector<cv::Mat> raw_outputs;
+        face_extract_net.forward(raw_outputs, out_names);
+
+        using namespace cv;
+        float scoreThreshold = 0.7;
+        float nmsThreshold = 0.5;
+        int topK = 50;
+        int inputW = frame.cols;
+        int inputH = frame.rows;
+
+        // 3 heads of output
+        assert(raw_outputs.size() == 3);
+
+        // Extract from output_blobs
+        Mat loc = raw_outputs[0];
+        Mat conf = raw_outputs[1];
+        Mat iou = raw_outputs[2];
+
+        // we need generate priors if input size changed or priors is not initialized
+        if (loc.rows != priors.size()) {
+            generatePriors(inputW, inputH);
+        }
+        
+        assert(loc.rows == priors.size());
+        assert(loc.rows == conf.rows);
+        assert(loc.rows == iou.rows);
+
+        // Decode from deltas and priors
+        const std::vector<float> variance = {0.1f, 0.2f};
+        float* loc_v = (float*)(loc.data);
+        float* conf_v = (float*)(conf.data);
+        float* iou_v = (float*)(iou.data);
+        Mat faces;
+        // (tl_x, tl_y, w, h, re_x, re_y, le_x, le_y, nt_x, nt_y, rcm_x, rcm_y, lcm_x, lcm_y, score)
+        // 'tl': top left point of the bounding box
+        // 're': right eye, 'le': left eye
+        // 'nt':  nose tip
+        // 'rcm': right corner of mouth, 'lcm': left corner of mouth
+        Mat face(1, 15, CV_32FC1);
+        for (size_t i = 0; i < priors.size(); ++i) {
+            // Get score
+            float clsScore = conf_v[i*2+1];
+            float iouScore = iou_v[i];
+            // Clamp
+            if (iouScore < 0.f) {
+                iouScore = 0.f;
+            }
+            else if (iouScore > 1.f) {
+                iouScore = 1.f;
+            }
+            float score = std::sqrt(clsScore * iouScore);
+            face.at<float>(0, 14) = score;
+
+            // Get bounding box
+            float cx = (priors[i].x + loc_v[i*14+0] * variance[0] * priors[i].width)  * inputW;
+            float cy = (priors[i].y + loc_v[i*14+1] * variance[0] * priors[i].height) * inputH;
+            float w  = priors[i].width  * exp(loc_v[i*14+2] * variance[0]) * inputW;
+            float h  = priors[i].height * exp(loc_v[i*14+3] * variance[1]) * inputH;
+            float x1 = cx - w / 2;
+            float y1 = cy - h / 2;
+            face.at<float>(0, 0) = x1;
+            face.at<float>(0, 1) = y1;
+            face.at<float>(0, 2) = w;
+            face.at<float>(0, 3) = h;
+
+            // Get landmarks
+            face.at<float>(0, 4) = (priors[i].x + loc_v[i*14+ 4] * variance[0] * priors[i].width)  * inputW;  // right eye, x
+            face.at<float>(0, 5) = (priors[i].y + loc_v[i*14+ 5] * variance[0] * priors[i].height) * inputH;  // right eye, y
+            face.at<float>(0, 6) = (priors[i].x + loc_v[i*14+ 6] * variance[0] * priors[i].width)  * inputW;  // left eye, x
+            face.at<float>(0, 7) = (priors[i].y + loc_v[i*14+ 7] * variance[0] * priors[i].height) * inputH;  // left eye, y
+            face.at<float>(0, 8) = (priors[i].x + loc_v[i*14+ 8] * variance[0] * priors[i].width)  * inputW;  // nose tip, x
+            face.at<float>(0, 9) = (priors[i].y + loc_v[i*14+ 9] * variance[0] * priors[i].height) * inputH;  // nose tip, y
+            face.at<float>(0, 10) = (priors[i].x + loc_v[i*14+10] * variance[0] * priors[i].width)  * inputW; // right corner of mouth, x
+            face.at<float>(0, 11) = (priors[i].y + loc_v[i*14+11] * variance[0] * priors[i].height) * inputH; // right corner of mouth, y
+            face.at<float>(0, 12) = (priors[i].x + loc_v[i*14+12] * variance[0] * priors[i].width)  * inputW; // left corner of mouth, x
+            face.at<float>(0, 13) = (priors[i].y + loc_v[i*14+13] * variance[0] * priors[i].height) * inputH; // left corner of mouth, y
+
+            faces.push_back(face);
+        }
+
+        if (faces.rows > 1) {
+            // Retrieve boxes and scores
+            std::vector<Rect2i> faceBoxes;
+            std::vector<float> faceScores;
+            for (int rIdx = 0; rIdx < faces.rows; rIdx++)
+            {
+                faceBoxes.push_back(Rect2i(int(faces.at<float>(rIdx, 0)),
+                                           int(faces.at<float>(rIdx, 1)),
+                                           int(faces.at<float>(rIdx, 2)),
+                                           int(faces.at<float>(rIdx, 3))));
+                faceScores.push_back(faces.at<float>(rIdx, 14));
+            }
+
+            std::vector<int> keepIdx;
+            dnn::NMSBoxes(faceBoxes, faceScores, scoreThreshold, nmsThreshold, keepIdx, 1.f, topK);
+
+            // Get NMS results
+            Mat nms_faces;
+            for (int idx: keepIdx)
+            {
+                nms_faces.push_back(faces.row(idx));
+            }
+
+            // insert face target back to frame meta
+            for (int i = 0; i < nms_faces.rows; i++) {
+                auto x = int(nms_faces.at<float>(i, 0));
+                auto y = int(nms_faces.at<float>(i, 1));
+                auto w = int(nms_faces.at<float>(i, 2));
+                auto h = int(nms_faces.at<float>(i, 3));
+                
+                // check value range
+                x = std::max(x, 0);
+                y = std::max(y, 0);
+                w = std::min(w, frame.cols - x);
+                h = std::min(h, frame.rows - y);
+
+                auto kp1 = std::pair<int, int>(int(nms_faces.at<float>(i, 4)), int(nms_faces.at<float>(i, 5)));
+                auto kp2 = std::pair<int, int>(int(nms_faces.at<float>(i, 6)), int(nms_faces.at<float>(i, 7)));
+                auto kp3 = std::pair<int, int>(int(nms_faces.at<float>(i, 8)), int(nms_faces.at<float>(i, 9)));
+                auto kp4 = std::pair<int, int>(int(nms_faces.at<float>(i, 10)), int(nms_faces.at<float>(i, 11)));
+                auto kp5 = std::pair<int, int>(int(nms_faces.at<float>(i, 12)), int(nms_faces.at<float>(i, 13)));
+                auto score = nms_faces.at<float>(i, 14);
+
+                face_box face;
+                face.x = x;
+                face.y = y;
+                face.width = w;
+                face.height = h;
+                face.score = score;
+                face.kps = std::vector<std::pair<int, int>>{kp1, kp2, kp3, kp4, kp5};
+                face_boxes.push_back(face);
+            }
+        }
+    }
+
+    void vp_face_swap_node::init_source_face_embeddings(std::string& swap_source_image, int swap_source_face_index, std::string& emap_file_for_embeddings) {
+        std::vector<face_box> source_faces;
+        auto source_mat = cv::imread(swap_source_image);
+
+        // extract faces
+        extract_faces(source_mat, source_faces);
+
+        assert(source_faces.size() > 0);
+        // sort from left to right
+        std::sort(source_faces.begin(), source_faces.end(), [](face_box a, face_box b){ return a.x < b.x;});
+
+        auto the_selected_face = (swap_source_face_index < 0 || swap_source_face_index >= source_faces.size()) ? source_faces[0] : source_faces[swap_source_face_index];
+        cv::Mat aligned_face;
+        auto warp_mat = align_crop(source_mat, the_selected_face.kps, aligned_face);
+        // for debug
+        cv::imwrite("selected_source_face.jpg", aligned_face);
+
+        // read emap from file
+        auto emap = read_emap_mat_from_txt(emap_file_for_embeddings);
+
+        // encoding for the selected face (infer for only 1 time)
+        cv::Mat source_blob = cv::dnn::blobFromImage(aligned_face, 1 / 127.5, cv::Size(112, 112), (127.5, 127.5, 127.5), true);
+        std::vector<cv::Mat> source_outputs;
+        face_encoding_net.setInput(source_blob);
+        face_encoding_net.forward(source_outputs, face_encoding_net.getUnconnectedOutLayersNames());
+
+        auto& source_output = source_outputs[0];
+        // process using emap
+        source_face_embeddings = process_embeddings_using_emap(source_output, emap);
+    }
+
+    void vp_face_swap_node::swap(cv::Mat& aligned_face, cv::Mat& swapped_face) {
+        cv::Mat target_blob = cv::dnn::blobFromImage(aligned_face, 1 / 255.0, cv::Size(128, 128), (0, 0, 0), true);
+        std::vector<cv::Mat> target_outputs;
+        face_swap_net.setInput(target_blob, "target");
+        face_swap_net.setInput(source_face_embeddings, "source");
+        face_swap_net.forward(target_outputs, face_swap_net.getUnconnectedOutLayersNames());
+
+        auto& output = target_outputs[0];
+        cv::Mat output_channel_last;
+        cv::transposeND(output, {0,2,3,1}, output_channel_last);
+
+        cv::Mat img_fake(output_channel_last.size[1], output_channel_last.size[2], CV_32FC3, output_channel_last.data);
+        cv::cvtColor(img_fake, swapped_face, cv::COLOR_RGB2BGR);
+    }
+
+    void vp_face_swap_node::paste_back(cv::Mat& bg, cv::Mat& swapped_face, const cv::Mat& transform_matrix) {
+        bg.convertTo(bg, CV_32FC3, 1.0 / 255);
+
+        cv::Mat IM;
+        cv::invertAffineTransform(transform_matrix, IM);
+        cv::Mat img_mask(swapped_face.rows, swapped_face.cols, CV_32FC1, 1);
+        cv::warpAffine(swapped_face, swapped_face, IM, bg.size());
+        cv::warpAffine(img_mask, img_mask, IM, bg.size());
+
+        // create mask
+        cv::threshold(img_mask, img_mask, 0, 1, cv::THRESH_BINARY);
+        cv::Point min_loc, max_loc;
+        double min_val, max_val;
+        cv::minMaxLoc(img_mask, &min_val, &max_val, &min_loc, &max_loc);
+
+        int mask_h = max_loc.y - min_loc.y;
+        int mask_w = max_loc.x - min_loc.x;
+
+        int mask_size = std::sqrt(mask_h * mask_w);
+        int k = std::max(mask_size / 10, 10);
+
+        cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(k, k));
+        cv::erode(img_mask, img_mask, kernel);
+
+        k = std::max(mask_size / 20, 5);
+        cv::GaussianBlur(img_mask, img_mask, cv::Size(2 * k + 1, 2 * k + 1), 0);
+        cv::cvtColor(img_mask, img_mask, cv::COLOR_GRAY2BGR);
+
+        // merge swapped face and original background
+        cv::Mat ones(bg.rows, bg.cols, CV_32FC3, cv::Scalar(1, 1, 1));
+        bg = img_mask.mul(swapped_face) + (ones - img_mask).mul(bg);
+        bg.convertTo(bg, CV_8U, 255);
+    }
+
+    void vp_face_swap_node::generatePriors(int inputW, int inputH) {
+         using namespace cv;
+        // Calculate shapes of different scales according to the shape of input image
+        Size feature_map_2nd = {
+            int(int((inputW+1)/2)/2), int(int((inputH+1)/2)/2)
+        };
+        Size feature_map_3rd = {
+            int(feature_map_2nd.width/2), int(feature_map_2nd.height/2)
+        };
+        Size feature_map_4th = {
+            int(feature_map_3rd.width/2), int(feature_map_3rd.height/2)
+        };
+        Size feature_map_5th = {
+            int(feature_map_4th.width/2), int(feature_map_4th.height/2)
+        };
+        Size feature_map_6th = {
+            int(feature_map_5th.width/2), int(feature_map_5th.height/2)
+        };
+
+        std::vector<Size> feature_map_sizes;
+        feature_map_sizes.push_back(feature_map_3rd);
+        feature_map_sizes.push_back(feature_map_4th);
+        feature_map_sizes.push_back(feature_map_5th);
+        feature_map_sizes.push_back(feature_map_6th);
+
+        // Fixed params for generating priors
+        const std::vector<std::vector<float>> min_sizes = {
+            {10.0f,  16.0f,  24.0f},
+            {32.0f,  48.0f},
+            {64.0f,  96.0f},
+            {128.0f, 192.0f, 256.0f}
+        };
+        CV_Assert(min_sizes.size() == feature_map_sizes.size()); // just to keep vectors in sync
+        const std::vector<int> steps = { 8, 16, 32, 64 };
+
+        // Generate priors
+        priors.clear();
+        for (size_t i = 0; i < feature_map_sizes.size(); ++i)
+        {
+            Size feature_map_size = feature_map_sizes[i];
+            std::vector<float> min_size = min_sizes[i];
+
+            for (int _h = 0; _h < feature_map_size.height; ++_h)
+            {
+                for (int _w = 0; _w < feature_map_size.width; ++_w)
+                {
+                    for (size_t j = 0; j < min_size.size(); ++j)
+                    {
+                        float s_kx = min_size[j] / inputW;
+                        float s_ky = min_size[j] / inputH;
+
+                        float cx = (_w + 0.5f) * steps[i] / inputW;
+                        float cy = (_h + 0.5f) * steps[i] / inputH;
+
+                        Rect2f prior = { cx, cy, s_kx, s_ky };
+                        priors.push_back(prior);
+                    }
+                }
+            }
+        }
+    }
+
+    void vp_face_swap_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
--- a/nodes/infers/vp_face_swap_node.h
+++ b/nodes/infers/vp_face_swap_node.h
@@ -0,0 +1,65 @@
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+
+namespace vp_nodes {
+    // face swap node
+    // used to swap faces in videos/images using a specific face
+    class vp_face_swap_node: public vp_primary_infer_node
+    {   
+    private:
+        // inner temporary use
+        struct face_box {
+            int x;
+            int y;
+            int width;
+            int height;
+            float score;
+            std::vector<std::pair<int, int>> kps;
+        };
+
+        /* onnx network using opencv::dnn as backend */
+        cv::dnn::Net face_extract_net;
+        cv::dnn::Net face_encoding_net;
+        cv::dnn::Net face_swap_net;
+
+        cv::Mat read_emap_mat_from_txt(const std::string& emap_file_for_embeddings, int rows = 512, int cols = 512);
+        cv::Mat process_embeddings_using_emap(const cv::Mat& source_face_normed_embedding, const cv::Mat& emap);
+
+        cv::Mat get_similarity_transform_matrix(float src[5][2]);
+        cv::Mat align_crop(cv::Mat& _src_img, std::vector<std::pair<int, int>>& kps, cv::Mat& _aligned_img);
+        
+        void extract_faces(const cv::Mat& frame, std::vector<face_box>& faces);
+        void init_source_face_embeddings(std::string& swap_source_image, int swap_source_face_index, std::string& emap_file_for_embeddings);
+        void swap(cv::Mat& aligned_face, cv::Mat& swapped_face);
+        void paste_back(cv::Mat& bg, cv::Mat& swapped_face, const cv::Mat& transform_matrix);
+
+        cv::Mat source_face_embeddings;
+        bool act_as_primary_detector = false;
+        bool swap_on_osd = true;
+
+        int min_face_w_h = 50;
+
+        /* used for extract faces */
+        std::vector<cv::Rect2f> priors;
+        void generatePriors(int inputW, int inputH);
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_face_swap_node(std::string node_name, 
+                            std::string yunet_face_detect_model,
+                            std::string buffalo_l_face_encoding_model,
+                            std::string emap_file_for_embeddings,
+                            std::string insightface_swap_model,
+                            std::string swap_source_image,
+                            int swap_source_face_index = 0,
+                            int min_face_w_h = 50,
+                            bool swap_on_osd = true,
+                            bool act_as_primary_detector = false);
+        ~vp_face_swap_node();
+    };
+}
--- a/nodes/infers/vp_feature_encoder_node.cpp
+++ b/nodes/infers/vp_feature_encoder_node.cpp
@@ -0,0 +1,15 @@
+
+#include "vp_feature_encoder_node.h"
+
+
+namespace vp_nodes {
+    
+    vp_feature_encoder_node::vp_feature_encoder_node(std::string node_name, std::string model_path):
+                                                    vp_secondary_infer_node(node_name, model_path) {
+        this->initialized();
+    }
+    
+    vp_feature_encoder_node::~vp_feature_encoder_node() {
+        deinitialized();        
+    } 
+}
--- a/nodes/infers/vp_feature_encoder_node.h
+++ b/nodes/infers/vp_feature_encoder_node.h
@@ -0,0 +1,18 @@
+
+#pragma once
+
+#include "../vp_secondary_infer_node.h"
+
+namespace vp_nodes {
+    // common feature encoder for image feature extraction.
+    // used for feature extraction, update embeddings of vp_frame_target.
+    class vp_feature_encoder_node: public vp_secondary_infer_node
+    {
+    private:
+        /* data */
+    public:
+        vp_feature_encoder_node(std::string node_name, std::string model_path);
+        ~vp_feature_encoder_node();
+    };
+
+}
--- a/nodes/infers/vp_lane_detector_node.cpp
+++ b/nodes/infers/vp_lane_detector_node.cpp
@@ -0,0 +1,34 @@
+
+#include "vp_lane_detector_node.h"
+
+namespace vp_nodes {
+
+    vp_lane_detector_node::vp_lane_detector_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path, 
+                            std::string labels_path, 
+                            int input_width, 
+                            int input_height, 
+                            int batch_size,
+                            int class_id_offset,
+                            float scale,
+                            cv::Scalar mean,
+                            cv::Scalar std,
+                            bool swap_rb,
+                            bool swap_chn):
+                            vp_primary_infer_node(node_name, model_path, model_config_path, labels_path, input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb, swap_chn) {
+        this->initialized();
+    }
+
+    vp_lane_detector_node::~vp_lane_detector_node() {
+        deinitialized();
+    }
+
+    void vp_lane_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        auto& frame_meta = frame_meta_with_batch[0];
+        auto& mask = raw_outputs[0];
+
+        // save output as mask directly， this is the mask for the whole frame.
+        frame_meta->mask = mask.clone();
+    }
+}
--- a/nodes/infers/vp_lane_detector_node.h
+++ b/nodes/infers/vp_lane_detector_node.h
@@ -0,0 +1,31 @@
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+
+
+namespace vp_nodes {
+    // lane detect based on CenterNet
+    // 
+    class vp_lane_detector_node: public vp_primary_infer_node
+    {
+    private:
+        /* data */
+    protected:
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_lane_detector_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path = "", 
+                            std::string labels_path = "", 
+                            int input_width = 736, 
+                            int input_height = 416, 
+                            int batch_size = 1,
+                            int class_id_offset = 0,
+                            float scale = 1,
+                            cv::Scalar mean = cv::Scalar(0),
+                            cv::Scalar std = cv::Scalar(1),
+                            bool swap_rb = true,
+                            bool swap_chn = true);
+        ~vp_lane_detector_node();
+    };
+}
--- a/nodes/infers/vp_mask_rcnn_detector_node.cpp
+++ b/nodes/infers/vp_mask_rcnn_detector_node.cpp
@@ -0,0 +1,89 @@
+
+#include "vp_mask_rcnn_detector_node.h"
+
+
+namespace vp_nodes {
+    vp_mask_rcnn_detector_node::vp_mask_rcnn_detector_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path, 
+                            std::string labels_path, 
+                            int input_width, 
+                            int input_height, 
+                            int batch_size,
+                            int class_id_offset,
+                            float score_threshold,
+                            float scale,
+                            cv::Scalar mean,
+                            cv::Scalar std,
+                            bool swap_rb):
+                            vp_primary_infer_node(node_name, model_path, model_config_path, labels_path, input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb),
+                            score_threshold(score_threshold) {
+        this->initialized();
+    }
+    
+    vp_mask_rcnn_detector_node::~vp_mask_rcnn_detector_node() {
+        deinitialized();
+    }
+
+
+    void vp_mask_rcnn_detector_node::infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) {
+        // blob_to_infer is a 4D matrix
+        // the first dim is number of batch, MUST be 1
+        assert(blob_to_infer.dims == 4);
+        assert(blob_to_infer.size[0] == 1);
+        assert(!net.empty());
+
+        net.setInput(blob_to_infer);
+        net.forward(raw_outputs, out_names);
+    }
+
+
+    void vp_mask_rcnn_detector_node::preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) {
+        // ignore preprocess logic in base class
+        cv::dnn::blobFromImages(mats_to_infer, blob_to_infer, 1.0, cv::Size(), cv::Scalar(), true, false);
+    }
+
+    void vp_mask_rcnn_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        auto& frame_meta = frame_meta_with_batch[0];
+        
+        auto outDetections = raw_outputs[0];
+        auto outMasks = raw_outputs[1];
+    
+        // Output size of masks is NxCxHxW where
+        // N - number of detected boxes
+        // C - number of classes (excluding background)
+        // HxW - segmentation shape
+        const int numDetections = outDetections.size[2];
+        const int numClasses = outMasks.size[1];
+    
+        auto outDetections_ = outDetections.reshape(1, outDetections.total() / 7);
+        auto& frame = frame_meta->frame;
+        for (int i = 0; i < numDetections; ++i) {
+            float score = outDetections_.at<float>(i, 2);
+            if (score > score_threshold) {
+                // Extract the bounding box
+                int classId = static_cast<int>(outDetections_.at<float>(i, 1));
+                int left = static_cast<int>(frame.cols * outDetections_.at<float>(i, 3));
+                int top = static_cast<int>(frame.rows * outDetections_.at<float>(i, 4));
+                int right = static_cast<int>(frame.cols * outDetections_.at<float>(i, 5));
+                int bottom = static_cast<int>(frame.rows * outDetections_.at<float>(i, 6));
+    
+                left = max(0, min(left, frame.cols - 1));
+                top = max(0, min(top, frame.rows - 1));
+                right = max(0, min(right, frame.cols - 1));
+                bottom = max(0, min(bottom, frame.rows - 1));
+
+                auto label = (labels.size() < classId + 1) ? "" : labels[classId];
+
+                // Extract the mask for the object
+                cv::Mat objectMask(outMasks.size[2], outMasks.size[3], CV_32F, outMasks.ptr<float>(i, classId));
+                
+                classId += class_id_offset;
+                auto target = std::make_shared<vp_objects::vp_frame_target>(left, top, right - left, bottom - top, classId, score, frame_meta->frame_index, frame_meta->channel_index, label);
+                target->mask = objectMask;
+
+                frame_meta->targets.push_back(target);
+            }
+        }
+    }
+}
--- a/nodes/infers/vp_mask_rcnn_detector_node.h
+++ b/nodes/infers/vp_mask_rcnn_detector_node.h
@@ -0,0 +1,41 @@
+
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+#include "../../objects/vp_frame_target.h"
+
+
+
+namespace vp_nodes {
+    // image segmentation based on Mask RCNN
+    // https://github.com/matterport/Mask_RCNN
+    class vp_mask_rcnn_detector_node: public vp_primary_infer_node
+    {
+    private:
+        /* data */
+        // names of output layers in mask_rcnn
+        const std::vector<std::string> out_names = {"detection_out_final", "detection_masks"};
+        float score_threshold = 0.5;
+    protected:
+        // override infer and preprocess as mask_rcnn has a different logic
+        virtual void infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) override;
+        virtual void preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) override;
+
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_mask_rcnn_detector_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path = "", 
+                            std::string labels_path = "", 
+                            int input_width = 416, 
+                            int input_height = 416, 
+                            int batch_size = 1,
+                            int class_id_offset = 0,
+                            float score_threshold = 0.5,
+                            float scale = 1 / 255.0,
+                            cv::Scalar mean = cv::Scalar(0),
+                            cv::Scalar std = cv::Scalar(1),
+                            bool swap_rb = true);
+        ~vp_mask_rcnn_detector_node();
+    };    
+}
--- a/nodes/infers/vp_mllm_analyser_node.cpp
+++ b/nodes/infers/vp_mllm_analyser_node.cpp
@@ -0,0 +1,34 @@
+#include "vp_mllm_analyser_node.h"
+
+#ifdef VP_WITH_LLM
+namespace vp_nodes {
+    vp_mllm_analyser_node::vp_mllm_analyser_node(std::string node_name,
+                              std::string model_name,
+                              std::string prompt,
+                              std::string api_base_url,
+                              std::string api_key,
+                              llmlib::LLMBackendType backend_type):
+                              vp_primary_infer_node(node_name, ""),
+                              llm_model_name(model_name),
+                              llm_prompt(prompt) {
+        cli = llmlib::LLMClient(api_base_url, api_key, backend_type);
+        this->initialized();
+    }
+
+    vp_mllm_analyser_node::~vp_mllm_analyser_node() {
+        deinitialized();
+    }
+
+    void vp_mllm_analyser_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        auto output = cli.simple_chat(llm_model_name, llm_prompt, {frame_meta->frame}, {});
+        frame_meta->description = output;
+    }
+
+    void vp_mllm_analyser_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_mllm_analyser_node.h
+++ b/nodes/infers/vp_mllm_analyser_node.h
@@ -0,0 +1,32 @@
+#pragma once
+
+#ifdef VP_WITH_LLM
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/cpp_llmlib/llmlib.hpp"
+
+namespace vp_nodes {
+    // image(frame) analyser based on Multimodal Large Language Model
+    class vp_mllm_analyser_node: public vp_primary_infer_node 
+    {
+    private:
+        /* data */
+        llmlib::LLMClient cli;
+        std::string llm_prompt;
+        std::string llm_model_name;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_mllm_analyser_node(std::string node_name,
+                              std::string model_name,
+                              std::string prompt,
+                              std::string api_base_url,
+                              std::string api_key = "",
+                              llmlib::LLMBackendType backend_type = llmlib::LLMBackendType::Ollama);
+        ~vp_mllm_analyser_node();
+    };
+}
+#endif
--- a/nodes/infers/vp_openpose_detector_node.cpp
+++ b/nodes/infers/vp_openpose_detector_node.cpp
@@ -0,0 +1,107 @@
+#include <set>
+
+#include "vp_openpose_detector_node.h"
+
+namespace vp_nodes {
+        
+    vp_openpose_detector_node::vp_openpose_detector_node(std::string node_name, 
+                                                        std::string model_path, 
+                                                        std::string model_config_path, 
+                                                        std::string labels_path, 
+                                                        int input_width, 
+                                                        int input_height, 
+                                                        int batch_size,
+                                                        int class_id_offset,
+                                                        float score_threshold,
+                                                        vp_objects::vp_pose_type type,
+                                                        float scale,
+                                                        cv::Scalar mean,
+                                                        cv::Scalar std,
+                                                        bool swap_rb):
+                                                        vp_primary_infer_node(node_name, 
+                                                                            model_path, 
+                                                                            model_config_path, 
+                                                                            labels_path, 
+                                                                            input_width, 
+                                                                            input_height, 
+                                                                            batch_size, 
+                                                                            class_id_offset, 
+                                                                            scale, mean, 
+                                                                            std, 
+                                                                            swap_rb),
+                                                        score_threshold(score_threshold), type(type) {
+        this->initialized();
+    }
+    
+    vp_openpose_detector_node::~vp_openpose_detector_node() {
+        deinitialized();
+    }
+    
+    void vp_openpose_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        // make sure heads of output are not zero
+        assert(raw_outputs.size() > 0);
+        // as it just one head of output
+        auto& netOutputBlobs = raw_outputs[0];
+        // batch, number of heatmaps, h, w
+        assert(netOutputBlobs.dims == 4);
+
+        auto start1 = std::chrono::system_clock::now();
+
+        int netOutputBlob_dims[3] = {netOutputBlobs.size[1], netOutputBlobs.size[2], netOutputBlobs.size[3]};
+        // scan batch
+        for (int b = 0; b < netOutputBlobs.size[0]; b++) {
+            auto& frame_meta = frame_meta_with_batch[b];
+            //auto h_scale = frame_meta->frame.rows * 1.0 / netOutputBlobs.size[2];
+            //auto w_scale = frame_meta->frame.cols * 1.0 / netOutputBlobs.size[3];
+
+            cv::Mat netOutputBlob = cv::Mat(3, netOutputBlob_dims, CV_32F, const_cast<uchar*>(netOutputBlobs.ptr(b)));
+            
+            int keyPointId = 0;
+            std::vector<std::vector<KeyPoint>> detectedKeypoints;
+            std::vector<KeyPoint> keyPointsList;
+
+            auto start2 = std::chrono::system_clock::now();
+            std::vector<cv::Mat> netOutputParts;
+            splitNetOutputBlobToParts(netOutputBlob, cv::Size(frame_meta->frame.cols, frame_meta->frame.rows), netOutputParts);
+            
+            auto start3 = std::chrono::system_clock::now();
+            for(int i = 0; i < points_map.at(type); ++i) {
+                std::vector<KeyPoint> keyPoints;
+                getKeyPoints(netOutputParts[i], score_threshold, keyPoints);
+
+                for(int i = 0; i< keyPoints.size(); ++i, ++keyPointId) {
+                    keyPoints[i].id = keyPointId;
+                }
+
+                detectedKeypoints.push_back(keyPoints);
+                keyPointsList.insert(keyPointsList.end(), keyPoints.begin(), keyPoints.end());
+            }
+
+            std::vector<std::vector<ValidPair>> validPairs;
+            std::set<int> invalidPairs;
+            getValidPairs(netOutputParts, detectedKeypoints, validPairs, invalidPairs);
+
+            std::vector<std::vector<int>> personwiseKeypoints;
+            getPersonwiseKeypoints(validPairs, invalidPairs, personwiseKeypoints);
+
+            // insert pose targets back into frame meta
+            for(int n = 0; n < personwiseKeypoints.size(); ++n) {
+                std::vector<vp_objects::vp_pose_keypoint> kps;
+                for (int i = 0; i < personwiseKeypoints[n].size(); i++) {
+                    auto index = personwiseKeypoints[n][i];
+                    if (index != -1) {                
+                        auto& p = keyPointsList[index];
+                        kps.push_back(vp_objects::vp_pose_keypoint {i, p.point.x, p.point.y, p.probability});
+                    }
+                    else {
+                        // point not detected
+                        kps.push_back(vp_objects::vp_pose_keypoint {i, -1, -1, 0});
+                    }
+                }
+                
+                auto pose_target = std::make_shared<vp_objects::vp_frame_pose_target>(type, kps);
+                frame_meta->pose_targets.push_back(pose_target);
+            }
+        }
+    }
+}
--- a/nodes/infers/vp_openpose_detector_node.h
+++ b/nodes/infers/vp_openpose_detector_node.h
@@ -0,0 +1,293 @@
+
+#pragma once
+
+#include <set>
+#include <map>
+#include <cmath>
+#include <opencv2/imgproc.hpp>
+
+#include "../vp_primary_infer_node.h"
+#include "../../objects/vp_frame_pose_target.h"
+
+
+namespace vp_nodes {
+    // body keypoints detector using openpose
+    // https://github.com/CMU-Perceptual-Computing-Lab/openpose
+    class vp_openpose_detector_node: public vp_primary_infer_node
+    {
+    private:
+        float score_threshold;
+        // pose type (model type)
+        vp_objects::vp_pose_type type;
+
+        // map indexs for PAFs
+        const std::map<vp_objects::vp_pose_type, std::vector<std::pair<int,int>>> mapIdxes_map = {
+            {vp_objects::vp_pose_type::body_25, {{0,1}, {14,15}, {22,23}, {16,17}, {18,19}, {24,25}, {26,27}, {6,7}, {2,3}, {4,5}, {8,9}, {10,11}, {12,13}, {30,31}, {32,33}, {36,37}, {34,35}, 
+                                                {38,39}, {40,41} ,{42,43}, {44,45}, {46,47}, {48,49}, {50,51}}},
+            {vp_objects::vp_pose_type::coco, {{12,13}, {20,21}, {14,15}, {16,17}, {22,23}, {24,25}, {0,1}, 
+                                            {2,3}, {4,5}, {6,7}, {8,9}, {10,11}, {28,29}, {30,31}, {34,35}, {32,33}, {36,37}, {18,19}, {26,27}}},
+            {vp_objects::vp_pose_type::mpi_15, {{0,1}, {2,3}, {4,5}, {6,7}, {8,9}, {10,11}, {12,13}, {14,15}, {16,17}, {18,19}, {20,21}, {22,23}, {24,25}, {26,27}}},
+            {vp_objects::vp_pose_type::hand, std::vector<std::pair<int,int>>()},
+            {vp_objects::vp_pose_type::face, std::vector<std::pair<int,int>>()}
+        };
+
+        // pose pairs for PAFs
+        const std::map<vp_objects::vp_pose_type, std::vector<std::pair<int,int>>> posePairs_map = {
+            {vp_objects::vp_pose_type::body_25, {{1,8}, {1,2}, {1,5}, {2,3}, {3,4}, {5,6}, {6,7}, {8,9}, 
+                                                {9,10}, {10,11}, {8,12}, {12,13}, {13,14}, {1,0}, {0,15}, {15,17}, {0,16}, {16,18}, 
+                                                {14,19}, {19,20}, {14,21}, {11,22}, {22,23}, {11,24}}},
+            {vp_objects::vp_pose_type::coco, {{1,2}, {1,5}, {2,3}, {3,4}, {5,6}, {6,7},
+                                            {1,8}, {8,9}, {9,10}, {1,11}, {11,12}, {12,13},
+                                            {1,0}, {0,14}, {14,16}, {0,15}, {15,17}, {2,16},
+                                            {5,17}}},
+            {vp_objects::vp_pose_type::mpi_15, {{0,1}, {1,2}, {2,3}, {3,4}, {1,5}, {5,6}, {6,7}, {1,14}, {14,8}, {8,9}, {9,10}, {14,11}, {11,12}, {12,13}, {0, 2}, {0, 5}}},
+            {vp_objects::vp_pose_type::hand, std::vector<std::pair<int,int>>()},
+            {vp_objects::vp_pose_type::face, std::vector<std::pair<int,int>>()}
+        };
+
+        // points count for each type of pose model
+        const std::map<vp_objects::vp_pose_type, int> points_map = {
+            {vp_objects::vp_pose_type::body_25, 25},
+            {vp_objects::vp_pose_type::coco, 18},
+            {vp_objects::vp_pose_type::mpi_15, 15},
+            {vp_objects::vp_pose_type::hand, 0},
+            {vp_objects::vp_pose_type::face, 0}
+        };
+
+        // for postprocess purpose
+        struct ValidPair{
+            ValidPair(int aId,int bId,float score) {
+                this->aId = aId;
+                this->bId = bId;
+                this->score = score;
+            }
+
+            int aId;
+            int bId;
+            float score;
+        };
+        struct KeyPoint{
+            KeyPoint(cv::Point point, float probability) {
+                this->id = -1;
+                this->point = point;
+                this->probability = probability;
+            }
+
+            int id;
+            cv::Point point;
+            float probability;
+        };
+        // for postprocess purpose end
+
+        void getKeyPoints(cv::Mat& probMap, double threshold, std::vector<KeyPoint>& keyPoints) {
+            cv::Mat smoothProbMap;
+            cv::GaussianBlur(probMap, smoothProbMap, cv::Size( 3, 3 ), 0, 0);
+
+            cv::Mat maskedProbMap;
+            cv::threshold(smoothProbMap, maskedProbMap, threshold, 255, cv::THRESH_BINARY);
+
+            maskedProbMap.convertTo(maskedProbMap, CV_8U, 1);
+
+            std::vector<std::vector<cv::Point> > contours;
+            cv::findContours(maskedProbMap, contours, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE);
+
+            for(int i = 0; i < contours.size();++i) {
+                cv::Mat blobMask = cv::Mat::zeros(smoothProbMap.rows, smoothProbMap.cols, smoothProbMap.type());
+                cv::fillConvexPoly(blobMask, contours[i], cv::Scalar(1));
+
+                double maxVal;
+                cv::Point maxLoc;
+
+                cv::minMaxLoc(smoothProbMap.mul(blobMask), 0, &maxVal, 0, &maxLoc);
+                keyPoints.push_back(KeyPoint(maxLoc, probMap.at<float>(maxLoc.y, maxLoc.x)));
+            }
+        }
+
+        void splitNetOutputBlobToParts(cv::Mat& netOutputBlob, const cv::Size& targetSize, std::vector<cv::Mat>& netOutputParts) {
+            int nParts = netOutputBlob.size[0];
+            int h = netOutputBlob.size[1];
+            int w = netOutputBlob.size[2];
+
+            for(int i = 0; i< nParts; ++i) {
+                cv::Mat part(h, w, CV_32F, netOutputBlob.ptr(i));
+
+                cv::Mat resizedPart;
+
+                cv::resize(part, resizedPart, targetSize);
+
+                netOutputParts.push_back(resizedPart);
+            }
+        }
+
+        void populateInterpPoints(const cv::Point& a,const cv::Point& b,int numPoints,std::vector<cv::Point>& interpCoords){
+            float xStep = ((float)(b.x - a.x)) / (float)(numPoints - 1);
+            float yStep = ((float)(b.y - a.y)) / (float)(numPoints - 1);
+            interpCoords.push_back(a);
+
+            for(int i = 1; i< numPoints - 1; ++i) {
+                interpCoords.push_back(cv::Point(a.x + xStep * i, a.y + yStep * i));
+            }
+            interpCoords.push_back(b);
+        }
+
+        void getValidPairs(const std::vector<cv::Mat>& netOutputParts,
+                        const std::vector<std::vector<KeyPoint>>& detectedKeypoints,
+                        std::vector<std::vector<ValidPair>>& validPairs,
+                        std::set<int>& invalidPairs) {
+
+            int nInterpSamples = 10;
+            float confTh = 0.7;
+
+            for(int k = 0; k < mapIdxes_map.at(type).size(); ++k) {
+                //A->B constitute a limb
+                cv::Mat pafA = netOutputParts[mapIdxes_map.at(type)[k].first + points_map.at(type) + 1];
+                cv::Mat pafB = netOutputParts[mapIdxes_map.at(type)[k].second + points_map.at(type) + 1];
+
+                //Find the keypoints for the first and second limb
+                const std::vector<KeyPoint>& candA = detectedKeypoints[posePairs_map.at(type)[k].first];
+                const std::vector<KeyPoint>& candB = detectedKeypoints[posePairs_map.at(type)[k].second];
+
+                int nA = candA.size();
+                int nB = candB.size();
+
+                /*
+                # If keypoints for the joint-pair is detected
+                # check every joint in candA with every joint in candB
+                # Calculate the distance vector between the two joints
+                # Find the PAF values at a set of interpolated points between the joints
+                # Use the above formula to compute a score to mark the connection valid
+                */
+
+                if(nA != 0 && nB != 0){
+                    std::vector<ValidPair> localValidPairs;
+
+                    for(int i = 0; i< nA; ++i) {
+                        int maxJ = -1;
+                        float maxScore = -1;
+                        bool found = false;
+
+                        for(int j = 0; j < nB; ++j) {
+                            std::pair<float, float> distance(candB[j].point.x - candA[i].point.x, candB[j].point.y - candA[i].point.y);
+
+                            float norm = std::sqrt(distance.first * distance.first + distance.second * distance.second);
+
+                            if(!norm) {
+                                continue;
+                            }
+
+                            distance.first /= norm;
+                            distance.second /= norm;
+
+                            //Find p(u)
+                            std::vector<cv::Point> interpCoords;
+                            populateInterpPoints(candA[i].point, candB[j].point, nInterpSamples, interpCoords);
+                            //Find L(p(u))
+                            std::vector<std::pair<float, float>> pafInterp;
+                            for(int l = 0; l < interpCoords.size();++l) {
+                                pafInterp.push_back(
+                                    std::pair<float,float>(
+                                        pafA.at<float>(interpCoords[l].y, interpCoords[l].x),
+                                        pafB.at<float>(interpCoords[l].y, interpCoords[l].x)
+                                    ));
+                            }
+
+                            std::vector<float> pafScores;
+                            float sumOfPafScores = 0;
+                            int numOverTh = 0;
+                            for(int l = 0; l< pafInterp.size(); ++l){
+                                float score = pafInterp[l].first * distance.first + pafInterp[l].second * distance.second;
+                                sumOfPafScores += score;
+                                if(score > score_threshold){
+                                    ++numOverTh;
+                                }
+
+                                pafScores.push_back(score);
+                            }
+
+                            float avgPafScore = sumOfPafScores / ((float)pafInterp.size());
+
+                            if(((float)numOverTh)/((float)nInterpSamples) > confTh){
+                                if(avgPafScore > maxScore) {
+                                    maxJ = j;
+                                    maxScore = avgPafScore;
+                                    found = true;
+                                }
+                            }
+                        }/* j */
+
+                        if(found) {
+                            localValidPairs.push_back(ValidPair(candA[i].id, candB[maxJ].id, maxScore));
+                        }
+
+                    }/* i */
+
+                    validPairs.push_back(localValidPairs);
+
+                } else {
+                    invalidPairs.insert(k);
+                    validPairs.push_back(std::vector<ValidPair>());
+                }
+            }/* k */
+        }
+
+        void getPersonwiseKeypoints(const std::vector<std::vector<ValidPair>>& validPairs,
+                                    const std::set<int>& invalidPairs,
+                                    std::vector<std::vector<int>>& personwiseKeypoints) {
+            for(int k = 0; k < mapIdxes_map.at(type).size(); ++k) {
+                if(invalidPairs.find(k) != invalidPairs.end()) {
+                    continue;
+                }
+
+                const std::vector<ValidPair>& localValidPairs(validPairs[k]);
+
+                int indexA(posePairs_map.at(type)[k].first);
+                int indexB(posePairs_map.at(type)[k].second);
+
+                for(int i = 0; i< localValidPairs.size(); ++i) {
+                    bool found = false;
+                    int personIdx = -1;
+
+                    for(int j = 0; !found && j < personwiseKeypoints.size();++j) {
+                        if(indexA < personwiseKeypoints[j].size() &&
+                            personwiseKeypoints[j][indexA] == localValidPairs[i].aId) {
+                            personIdx = j;
+                            found = true;
+                        }
+                    }/* j */
+
+                    if(found) {
+                        personwiseKeypoints[personIdx].at(indexB) = localValidPairs[i].bId;
+                    } 
+                    else if(k < points_map.at(type) - 1) {
+                        std::vector<int> lpkp(std::vector<int>(points_map.at(type), -1));
+
+                        lpkp.at(indexA) = localValidPairs[i].aId;
+                        lpkp.at(indexB) = localValidPairs[i].bId;
+
+                        personwiseKeypoints.push_back(lpkp);
+                    }
+
+                }/* i */
+            }/* k */
+        }
+
+    protected:
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_openpose_detector_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path = "", 
+                            std::string labels_path = "", 
+                            int input_width = 368, 
+                            int input_height = 368, 
+                            int batch_size = 1,
+                            int class_id_offset = 0,
+                            float score_threshold = 0.1,
+                            vp_objects::vp_pose_type type = vp_objects::vp_pose_type::coco,
+                            float scale = 1 / 255.0,
+                            cv::Scalar mean = cv::Scalar(0),
+                            cv::Scalar std = cv::Scalar(1),
+                            bool swap_rb = false);
+        ~vp_openpose_detector_node();
+    };
+}
--- a/nodes/infers/vp_ppocr_text_detector_node.cpp
+++ b/nodes/infers/vp_ppocr_text_detector_node.cpp
@@ -0,0 +1,66 @@
+
+#ifdef VP_WITH_PADDLE
+#include "vp_ppocr_text_detector_node.h"
+#include "../../objects/vp_frame_text_target.h"
+
+namespace vp_nodes {
+        
+    vp_ppocr_text_detector_node::vp_ppocr_text_detector_node(std::string node_name, 
+                                                            std::string det_model_dir, 
+                                                            std::string cls_model_dir, 
+                                                            std::string rec_model_dir, 
+                                                            std::string rec_char_dict_path):
+                                                            vp_primary_infer_node(node_name, "") {
+        // to make the code simpler, paddle_ocr has no more config other than model path
+        // we need modify source code at ../../third_party/paddle_ocr/ if we need tune the parameters 
+        ocr = std::make_shared<PaddleOCR::PPOCR>(det_model_dir, cls_model_dir, rec_model_dir, rec_char_dict_path);
+        this->initialized();
+    }
+    
+    vp_ppocr_text_detector_node::~vp_ppocr_text_detector_node() {
+        deinitialized();
+    }
+    
+    void vp_ppocr_text_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+    
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_ppocr_text_detector_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        // call paddle ocr
+        auto ocr_results = ocr->ocr(mats_to_infer);
+        assert(ocr_results.size() == 1);
+
+        auto& ocr_result = ocr_results[0];
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        // scan text detected in frame
+        for (int i = 0; i < ocr_result.size(); i++) {
+            /* code */
+            auto& text = ocr_result[i];
+            std::vector<std::pair<int, int>> region_vertexes = {{text.box[0][0], text.box[0][1]}, 
+                                                                {text.box[1][0], text.box[1][1]}, 
+                                                                {text.box[2][0], text.box[2][1]}, 
+                                                                {text.box[3][0], text.box[3][1]}};
+            // create text target and update back into frame meta
+            auto text_target = std::make_shared<vp_objects::vp_frame_text_target>(region_vertexes, text.text, text.score);
+            frame_meta->text_targets.push_back(text_target);
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+}
+#endif
--- a/nodes/infers/vp_ppocr_text_detector_node.h
+++ b/nodes/infers/vp_ppocr_text_detector_node.h
@@ -0,0 +1,33 @@
+#pragma once
+
+#ifdef VP_WITH_PADDLE
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/paddle_ocr/include/paddleocr.h"
+
+namespace vp_nodes {
+    // ocr based on paddle ocr
+    // paddle ocr project(official): https://github.com/PaddlePaddle/PaddleOCR
+    // source code(modified based on official): ../../third_party/paddle_ocr
+    // note:
+    // this class is not based on opencv::dnn module but paddle, a few data members declared in base class are not usable any more(just ignore), such as vp_infer_node::net.
+    class vp_ppocr_text_detector_node: public vp_primary_infer_node
+    {
+    private:
+        // paddle ocr instance
+        std::shared_ptr<PaddleOCR::PPOCR> ocr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_ppocr_text_detector_node(std::string node_name, 
+                                    std::string det_model_dir = "", 
+                                    std::string cls_model_dir = "", 
+                                    std::string rec_model_dir = "", 
+                                    std::string rec_char_dict_path = "");
+        ~vp_ppocr_text_detector_node();
+    };
+}
+#endif
--- a/nodes/infers/vp_restoration_node.cpp
+++ b/nodes/infers/vp_restoration_node.cpp
@@ -0,0 +1,52 @@
+
+#include "vp_restoration_node.h"
+
+namespace vp_nodes {
+    
+    vp_restoration_node::vp_restoration_node(std::string node_name, 
+                            std::string realesrgan_bg_restoration_model,
+                            std::string face_restoration_model,
+                            bool restoration_to_osd):
+                            vp_primary_infer_node(node_name, ""),
+                            restoration_to_osd(restoration_to_osd) {        
+        /* init net*/
+        restoration_net = cv::dnn::readNetFromONNX(realesrgan_bg_restoration_model);
+        /* to-do, load face restoration model*/
+        #ifdef VP_WITH_CUDA
+        //restoration_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
+        //restoration_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
+        #endif
+        this->initialized();
+    }
+    
+    vp_restoration_node::~vp_restoration_node() {
+        deinitialized();
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_restoration_node::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        // 4x larger for RealESRGAN_x4plus model
+        cv::Mat target_blob = cv::dnn::blobFromImage(frame_meta->frame, 1 / 255.0, cv::Size(frame_meta->frame.cols, frame_meta->frame.rows), (0, 0, 0), true);
+        std::vector<cv::Mat> target_outputs;
+        restoration_net.setInput(target_blob);
+        restoration_net.forward(target_outputs, restoration_net.getUnconnectedOutLayersNames());
+
+        // parse to image
+        auto& output = target_outputs[0];
+        cv::Mat output_channel_last;
+        cv::transposeND(output, {0,2,3,1}, output_channel_last);
+        cv::Mat img_result(output_channel_last.size[1], output_channel_last.size[2], CV_32FC3, output_channel_last.data);
+        img_result.convertTo(img_result, CV_8U, 255);
+        
+        // update back to frame meta
+        auto& bg = restoration_to_osd ? frame_meta->osd_frame : frame_meta->frame;
+        cv::cvtColor(img_result, bg, cv::COLOR_RGB2BGR);
+    }
+
+    void vp_restoration_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
--- a/nodes/infers/vp_restoration_node.h
+++ b/nodes/infers/vp_restoration_node.h
@@ -0,0 +1,27 @@
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+
+namespace vp_nodes {
+    // general image restoration node using Real-ESRGAN
+    // used to enhance quality of frames in video， see more: https://github.com/xinntao/Real-ESRGAN
+    class vp_restoration_node: public vp_primary_infer_node
+    {   
+    private:
+        /* onnx network using opencv::dnn as backend */
+        cv::dnn::Net restoration_net;
+        bool restoration_to_osd = true;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_restoration_node(std::string node_name, 
+                            std::string realesrgan_bg_restoration_model,
+                            std::string face_restoration_model = "",
+                            bool restoration_to_osd = true);
+        ~vp_restoration_node();
+    };
+}
--- a/nodes/infers/vp_sface_feature_encoder_node.cpp
+++ b/nodes/infers/vp_sface_feature_encoder_node.cpp
@@ -0,0 +1,183 @@
+
+#include <algorithm>
+#include <opencv2/imgproc.hpp>
+#include "vp_sface_feature_encoder_node.h"
+
+namespace vp_nodes {
+        
+    vp_sface_feature_encoder_node::vp_sface_feature_encoder_node(std::string node_name, std::string model_path):
+                                                                vp_secondary_infer_node(node_name, model_path, 
+                                                                "", "", 
+                                                                112, 112, 
+                                                                1, std::vector<int>(), 0, 0, 
+                                                                0, 1, cv::Scalar()) {
+        this->initialized();
+    }
+    
+    vp_sface_feature_encoder_node::~vp_sface_feature_encoder_node() {
+        deinitialized();
+    }
+
+    void vp_sface_feature_encoder_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        // make sure heads of output are not zero
+        assert(raw_outputs.size() > 0);
+        assert(frame_meta_with_batch.size() == 1);
+
+        // just one head of output
+        auto& output = raw_outputs[0];
+        assert(output.dims == 2);
+
+        auto count = output.rows;
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        // update feature data back into frame meta
+        for (int i = 0; i < count; i++) {
+            cv::Mat feature = output.row(i);
+
+            for (int j = 0; j < feature.cols; j++) {
+                frame_meta->face_targets[i]->embeddings.push_back(feature.at<float>(0, j));
+            }
+        }  
+    }
+
+    // refer to vp_secondary_infer_node::prepare
+    void vp_sface_feature_encoder_node::prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) {
+        // only one by one for secondary infer node
+        assert(frame_meta_with_batch.size() == 1);
+
+        // align face and crop 
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        // batch by batch inside single frame
+        for (auto& i : frame_meta->face_targets) {
+            // align and crop
+            float face_keypoints[5][2] = 
+                {{i->key_points[0].first, i->key_points[0].second}, 
+                 {i->key_points[1].first, i->key_points[1].second}, 
+                 {i->key_points[2].first, i->key_points[2].second}, 
+                 {i->key_points[3].first, i->key_points[3].second}, 
+                 {i->key_points[4].first, i->key_points[4].second}};
+            cv::Mat aligned_face;
+            alignCrop(frame_meta->frame, face_keypoints, aligned_face);
+            mats_to_infer.push_back(aligned_face); 
+        }
+    }
+
+    cv::Mat vp_sface_feature_encoder_node::getSimilarityTransformMatrix(float src[5][2]) {
+        using namespace cv;
+        float dst[5][2] = { {38.2946f, 51.6963f}, {73.5318f, 51.5014f}, {56.0252f, 71.7366f}, {41.5493f, 92.3655f}, {70.7299f, 92.2041f} };
+        float avg0 = (src[0][0] + src[1][0] + src[2][0] + src[3][0] + src[4][0]) / 5;
+        float avg1 = (src[0][1] + src[1][1] + src[2][1] + src[3][1] + src[4][1]) / 5;
+        //Compute mean of src and dst.
+        float src_mean[2] = { avg0, avg1 };
+        float dst_mean[2] = { 56.0262f, 71.9008f };
+        //Subtract mean from src and dst.
+        float src_demean[5][2];
+        for (int i = 0; i < 2; i++)
+        {
+            for (int j = 0; j < 5; j++)
+            {
+                src_demean[j][i] = src[j][i] - src_mean[i];
+            }
+        }
+        float dst_demean[5][2];
+        for (int i = 0; i < 2; i++)
+        {
+            for (int j = 0; j < 5; j++)
+            {
+                dst_demean[j][i] = dst[j][i] - dst_mean[i];
+            }
+        }
+        double A00 = 0.0, A01 = 0.0, A10 = 0.0, A11 = 0.0;
+        for (int i = 0; i < 5; i++)
+            A00 += dst_demean[i][0] * src_demean[i][0];
+        A00 = A00 / 5;
+        for (int i = 0; i < 5; i++)
+            A01 += dst_demean[i][0] * src_demean[i][1];
+        A01 = A01 / 5;
+        for (int i = 0; i < 5; i++)
+            A10 += dst_demean[i][1] * src_demean[i][0];
+        A10 = A10 / 5;
+        for (int i = 0; i < 5; i++)
+            A11 += dst_demean[i][1] * src_demean[i][1];
+        A11 = A11 / 5;
+        Mat A = (Mat_<double>(2, 2) << A00, A01, A10, A11);
+        double d[2] = { 1.0, 1.0 };
+        double detA = A00 * A11 - A01 * A10;
+        if (detA < 0)
+            d[1] = -1;
+        double T[3][3] = { {1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0} };
+        Mat s, u, vt, v;
+        SVD::compute(A, s, u, vt);
+        double smax = s.ptr<double>(0)[0]>s.ptr<double>(1)[0] ? s.ptr<double>(0)[0] : s.ptr<double>(1)[0];
+        double tol = smax * 2 * FLT_MIN;
+        int rank = 0;
+        if (s.ptr<double>(0)[0]>tol)
+            rank += 1;
+        if (s.ptr<double>(1)[0]>tol)
+            rank += 1;
+        double arr_u[2][2] = { {u.ptr<double>(0)[0], u.ptr<double>(0)[1]}, {u.ptr<double>(1)[0], u.ptr<double>(1)[1]} };
+        double arr_vt[2][2] = { {vt.ptr<double>(0)[0], vt.ptr<double>(0)[1]}, {vt.ptr<double>(1)[0], vt.ptr<double>(1)[1]} };
+        double det_u = arr_u[0][0] * arr_u[1][1] - arr_u[0][1] * arr_u[1][0];
+        double det_vt = arr_vt[0][0] * arr_vt[1][1] - arr_vt[0][1] * arr_vt[1][0];
+        if (rank == 1)
+        {
+            if ((det_u*det_vt) > 0)
+            {
+                Mat uvt = u*vt;
+                T[0][0] = uvt.ptr<double>(0)[0];
+                T[0][1] = uvt.ptr<double>(0)[1];
+                T[1][0] = uvt.ptr<double>(1)[0];
+                T[1][1] = uvt.ptr<double>(1)[1];
+            }
+            else
+            {
+                double temp = d[1];
+                d[1] = -1;
+                Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
+                Mat Dvt = D*vt;
+                Mat uDvt = u*Dvt;
+                T[0][0] = uDvt.ptr<double>(0)[0];
+                T[0][1] = uDvt.ptr<double>(0)[1];
+                T[1][0] = uDvt.ptr<double>(1)[0];
+                T[1][1] = uDvt.ptr<double>(1)[1];
+                d[1] = temp;
+            }
+        }
+        else
+        {
+            Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
+            Mat Dvt = D*vt;
+            Mat uDvt = u*Dvt;
+            T[0][0] = uDvt.ptr<double>(0)[0];
+            T[0][1] = uDvt.ptr<double>(0)[1];
+            T[1][0] = uDvt.ptr<double>(1)[0];
+            T[1][1] = uDvt.ptr<double>(1)[1];
+        }
+        double var1 = 0.0;
+        for (int i = 0; i < 5; i++)
+            var1 += src_demean[i][0] * src_demean[i][0];
+        var1 = var1 / 5;
+        double var2 = 0.0;
+        for (int i = 0; i < 5; i++)
+            var2 += src_demean[i][1] * src_demean[i][1];
+        var2 = var2 / 5;
+        double scale = 1.0 / (var1 + var2)* (s.ptr<double>(0)[0] * d[0] + s.ptr<double>(1)[0] * d[1]);
+        double TS[2];
+        TS[0] = T[0][0] * src_mean[0] + T[0][1] * src_mean[1];
+        TS[1] = T[1][0] * src_mean[0] + T[1][1] * src_mean[1];
+        T[0][2] = dst_mean[0] - scale*TS[0];
+        T[1][2] = dst_mean[1] - scale*TS[1];
+        T[0][0] *= scale;
+        T[0][1] *= scale;
+        T[1][0] *= scale;
+        T[1][1] *= scale;
+        Mat transform_mat = (Mat_<double>(2, 3) << T[0][0], T[0][1], T[0][2], T[1][0], T[1][1], T[1][2]);
+        return transform_mat;
+    }
+
+    void vp_sface_feature_encoder_node::alignCrop(cv::Mat& _src_img, float _src_point[5][2], cv::Mat& _aligned_img) {
+        cv::Mat warp_mat = getSimilarityTransformMatrix(_src_point);
+        cv::warpAffine(_src_img, _aligned_img, warp_mat, cv::Size(input_width, input_height), cv::INTER_LINEAR);
+    }
+}
--- a/nodes/infers/vp_sface_feature_encoder_node.h
+++ b/nodes/infers/vp_sface_feature_encoder_node.h
@@ -0,0 +1,27 @@
+
+#pragma once
+
+#include "../vp_secondary_infer_node.h"
+
+namespace vp_nodes {
+    // face feature encoder based on SFace, update embeddings of vp_frame_face_target
+    // https://github.com/opencv/opencv/blob/4.x/modules/objdetect/src/face_recognize.cpp
+    // https://github.com/zhongyy/SFace
+    class vp_sface_feature_encoder_node: public vp_secondary_infer_node
+    {
+    private:
+        // get transform matrix for aglin face
+        cv::Mat getSimilarityTransformMatrix(float src[5][2]);
+        // align and crop
+        void alignCrop(cv::Mat& _src_img, float _src_point[5][2], cv::Mat& _aligned_img);
+    protected:
+        // override prepare as sface has an additional logic for face align before preprocess
+        virtual void prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) override;
+
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_sface_feature_encoder_node(std::string node_name, std::string model_path);
+        ~vp_sface_feature_encoder_node();
+    };
+
+ }
--- a/nodes/infers/vp_trt_vehicle_color_classifier.cpp
+++ b/nodes/infers/vp_trt_vehicle_color_classifier.cpp
@@ -0,0 +1,64 @@
+#ifdef VP_WITH_TRT
+#include "vp_trt_vehicle_color_classifier.h"
+
+namespace vp_nodes {
+    vp_trt_vehicle_color_classifier::vp_trt_vehicle_color_classifier(std::string node_name, 
+                                                                    std::string vehicle_color_cls_model_path, 
+                                                                    std::vector<int> p_class_ids_applied_to,
+                                                                    int min_width_applied_to, int min_height_applied_to):
+                                                                    vp_secondary_infer_node(node_name, "", "", "", 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
+        vehicle_color_classifier = std::make_shared<trt_vehicle::VehicleColorClassifier>(vehicle_color_cls_model_path);
+        this->initialized();
+    }
+    
+    vp_trt_vehicle_color_classifier::~vp_trt_vehicle_color_classifier() {
+        deinitialized();
+    }
+
+    void vp_trt_vehicle_color_classifier::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // infer using trt_vehicle library
+        start_time = std::chrono::system_clock::now();
+        std::vector<trt_vehicle::ObjCls> vehicles_color;
+        vehicle_color_classifier->classify(mats_to_infer, vehicles_color);
+
+        auto& frame_meta = frame_meta_with_batch[0];
+        auto index = 0;
+        for (int i = 0; i < vehicles_color.size(); i++) {
+            for (int j = index; j < frame_meta->targets.size(); j++) {
+                // need apply or not?
+                if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
+                    // continue as its primary_class_id is not in p_class_ids_applied_to
+                    continue;
+                }
+
+                // update back to frame meta
+                frame_meta->targets[j]->secondary_class_ids.push_back(vehicles_color[i].class_);
+                frame_meta->targets[j]->secondary_scores.push_back(vehicles_color[i].score);
+                frame_meta->targets[j]->secondary_labels.push_back(vehicles_color[i].label);
+
+                // break as we found the right target!
+                index = j + 1;
+                break;
+            }
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_vehicle_color_classifier::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_color_classifier.h
+++ b/nodes/infers/vp_trt_vehicle_color_classifier.h
@@ -0,0 +1,26 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_secondary_infer_node.h"
+#include "../../third_party/trt_vehicle/models/vehicle_color_classifier.h"
+
+namespace vp_nodes {
+    // vehicle color classifier based on tensorrt using trt_vehicle library
+    // update secondary_class_ids/secondary_labels/secondary_scores of vp_frame_target.
+    class vp_trt_vehicle_color_classifier: public vp_secondary_infer_node
+    {
+    private:
+        /* data */
+        std::shared_ptr<trt_vehicle::VehicleColorClassifier> vehicle_color_classifier = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_vehicle_color_classifier(std::string node_name, std::string vehicle_color_cls_model_path = "", std::vector<int> p_class_ids_applied_to = std::vector<int>(), int min_width_applied_to = 0, int min_height_applied_to = 0);
+        ~vp_trt_vehicle_color_classifier();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_detector.cpp
+++ b/nodes/infers/vp_trt_vehicle_detector.cpp
@@ -0,0 +1,64 @@
+#ifdef VP_WITH_TRT
+#include "vp_trt_vehicle_detector.h"
+
+namespace vp_nodes {
+    
+    vp_trt_vehicle_detector::vp_trt_vehicle_detector(std::string node_name, 
+                                                    std::string vehicle_det_model_path):
+                                                    vp_primary_infer_node(node_name, "") {
+        vehicle_detector = std::make_shared<trt_vehicle::VehicleDetector>(vehicle_det_model_path);
+        this->initialized();
+    }
+    
+    vp_trt_vehicle_detector::~vp_trt_vehicle_detector() {
+        deinitialized();
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_trt_vehicle_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        std::vector<std::vector<trt_vehicle::ObjBox>> vehicles;
+        vehicle_detector->detect(mats_to_infer, vehicles);
+
+        assert(vehicles.size() == 1);
+        auto& vehicle_list = vehicles[0];
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        for (int i = 0; i < vehicle_list.size(); i++) {
+            auto& objbox = vehicle_list[i];
+
+            // check value range
+            objbox.x = std::max(objbox.x, 0);
+            objbox.y = std::max(objbox.y, 0);
+            objbox.width = std::min(objbox.width, frame_meta->frame.cols - objbox.x);
+            objbox.height = std::min(objbox.height, frame_meta->frame.rows - objbox.y);
+            if (objbox.width <= 0 || objbox.height <= 0) {
+                continue;
+            }
+            
+            auto target = std::make_shared<vp_objects::vp_frame_target>(objbox.x, objbox.y, objbox.width, objbox.height, 
+                                                                        objbox.class_, objbox.score, frame_meta->frame_index, frame_meta->channel_index, objbox.label);
+            // create target and update back into frame meta
+            frame_meta->targets.push_back(target);
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_vehicle_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_detector.h
+++ b/nodes/infers/vp_trt_vehicle_detector.h
@@ -0,0 +1,24 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/trt_vehicle/models/vehicle_detector.h"
+
+namespace vp_nodes {
+    // vehicle detector based on tensorrt using trt_vehicle library
+    class vp_trt_vehicle_detector: public vp_primary_infer_node
+    {
+    private:
+        std::shared_ptr<trt_vehicle::VehicleDetector> vehicle_detector = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_vehicle_detector(std::string node_name, std::string vehicle_det_model_path = "");
+        ~vp_trt_vehicle_detector();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_feature_encoder.cpp
+++ b/nodes/infers/vp_trt_vehicle_feature_encoder.cpp
@@ -0,0 +1,65 @@
+#ifdef VP_WITH_TRT
+#include "vp_trt_vehicle_feature_encoder.h"
+
+namespace vp_nodes {
+    vp_trt_vehicle_feature_encoder::vp_trt_vehicle_feature_encoder(std::string node_name, 
+                                                                    std::string vehicle_feature_model_path, 
+                                                                    std::vector<int> p_class_ids_applied_to,
+                                                                    int min_width_applied_to, int min_height_applied_to):
+                                                                    vp_secondary_infer_node(node_name, "", "", "", 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
+        vehicle_feature_encoder = std::make_shared<trt_vehicle::VehicleFeatureEncoder>(vehicle_feature_model_path);
+        this->initialized();
+    }
+    
+    vp_trt_vehicle_feature_encoder::~vp_trt_vehicle_feature_encoder() {
+        deinitialized();
+    }
+
+    void vp_trt_vehicle_feature_encoder::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // infer using trt_vehicle library
+        start_time = std::chrono::system_clock::now();
+        std::vector<std::vector<float>> vehicles_feature;
+        vehicle_feature_encoder->encode(mats_to_infer, vehicles_feature);
+
+        auto& frame_meta = frame_meta_with_batch[0];
+        auto index = 0;
+        for (int i = 0; i < vehicles_feature.size(); i++) {
+            for (int j = index; j < frame_meta->targets.size(); j++) {
+                // need apply or not?
+                if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
+                    // continue as its primary_class_id is not in p_class_ids_applied_to
+                    continue;
+                }
+
+                // update back to frame meta
+                auto& features = vehicles_feature[i];
+                for(int k = 0; k < features.size(); k++) {
+                    frame_meta->targets[j]->embeddings.push_back(features[k]);
+                }
+
+                // break as we found the right target!
+                index = j + 1;
+                break;
+            }
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_vehicle_feature_encoder::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_feature_encoder.h
+++ b/nodes/infers/vp_trt_vehicle_feature_encoder.h
@@ -0,0 +1,26 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_secondary_infer_node.h"
+#include "../../third_party/trt_vehicle/models/vehicle_feature_encoder.h"
+
+namespace vp_nodes {
+    // vehicle feature encoder based on tensorrt using trt_vehicle library
+    // update embeddings of vp_frame_target
+    class vp_trt_vehicle_feature_encoder: public vp_secondary_infer_node
+    {
+    private:
+        /* data */
+        std::shared_ptr<trt_vehicle::VehicleFeatureEncoder> vehicle_feature_encoder = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_vehicle_feature_encoder(std::string node_name, std::string vehicle_feature_model_path = "", std::vector<int> p_class_ids_applied_to = std::vector<int>(), int min_width_applied_to = 0, int min_height_applied_to = 0);
+        ~vp_trt_vehicle_feature_encoder();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_plate_detector.cpp
+++ b/nodes/infers/vp_trt_vehicle_plate_detector.cpp
@@ -0,0 +1,71 @@
+#ifdef VP_WITH_TRT
+#include "vp_trt_vehicle_plate_detector.h"
+
+namespace vp_nodes {
+    
+    vp_trt_vehicle_plate_detector::vp_trt_vehicle_plate_detector(std::string node_name, 
+                                                                std::string plate_det_model_path, 
+                                                                std::string char_rec_model_path):
+                                                                vp_secondary_infer_node(node_name, "") {
+        plate_detector = std::make_shared<trt_vehicle::VehiclePlateDetector>(plate_det_model_path, char_rec_model_path);
+        this->initialized();
+    }
+    
+    vp_trt_vehicle_plate_detector::~vp_trt_vehicle_plate_detector() {
+        deinitialized();
+    }
+
+    void vp_trt_vehicle_plate_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_trt_vehicle_plate_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        // here we detect vehicle plate for all target(like car/bus) in frame
+        std::vector<std::vector<trt_vehicle::Plate>> plates;
+        plate_detector->detect(mats_to_infer, plates, true);  // detect at most 1 plate for each image
+
+        auto& frame_meta = frame_meta_with_batch[0];
+        for (int i = 0; i < plates.size(); i++) {
+            for (int j = 0; j < plates[i].size(); j++) {
+                auto& plate = plates[i][j];
+
+                // only plate detection but no recognition result
+                if (plate.text.empty()) {
+                    continue;
+                }
+                
+                // check value range
+                auto x = std::max(0, plate.x + frame_meta->targets[i]->x - 10);  // offset
+                auto y = std::max(0, plate.y + frame_meta->targets[i]->y - 8);   // offset
+                auto w = std::min(plate.width, frame_meta->frame.cols - x);
+                auto h = std::min(plate.height, frame_meta->frame.rows - y);
+                if (w <= 0 || h <=0) {
+                    continue;
+                }
+                
+                // create sub target and update back into frame meta
+                // we treat vehicle plate as sub target of those in vp_frame_meta.targets
+                auto sub_target = std::make_shared<vp_objects::vp_sub_target>(x, y, w, h, 
+                                                    -1, 0, plate.color + "_" + plate.text, frame_meta->frame_index, frame_meta->channel_index);
+                frame_meta->targets[i]->sub_targets.push_back(sub_target);
+            }
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_plate_detector.h
+++ b/nodes/infers/vp_trt_vehicle_plate_detector.h
@@ -0,0 +1,29 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_secondary_infer_node.h"
+#include "../../objects/vp_sub_target.h"
+#include "../../third_party/trt_vehicle/models/vehicle_plate_detector.h"
+
+namespace vp_nodes {
+    // vehicle plate detector based on tensorrt using trt_vehicle library
+    // source code: ../../third_party/trt_vehicle
+    // note: derived from vp_secondary_infer_node since it detects plates on small cropped images, which is different from vp_trt_vehicle_plate_detector_v2 class
+    // this class is not based on opencv::dnn module but tensorrt, a few data members declared in base class are not usable any more(just ignore), such as vp_infer_node::net.
+    class vp_trt_vehicle_plate_detector: public vp_secondary_infer_node
+    {
+    private:
+        /* data */
+        std::shared_ptr<trt_vehicle::VehiclePlateDetector> plate_detector = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_vehicle_plate_detector(std::string node_name, std::string plate_det_model_path = "", std::string char_rec_model_path = "");
+        ~vp_trt_vehicle_plate_detector();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_plate_detector_v2.cpp
+++ b/nodes/infers/vp_trt_vehicle_plate_detector_v2.cpp
@@ -0,0 +1,75 @@
+#ifdef VP_WITH_TRT
+#include "vp_trt_vehicle_plate_detector_v2.h"
+
+namespace vp_nodes {
+    
+    vp_trt_vehicle_plate_detector_v2::vp_trt_vehicle_plate_detector_v2(std::string node_name, 
+                                                                std::string plate_det_model_path, 
+                                                                std::string char_rec_model_path):
+                                                                vp_primary_infer_node(node_name, "") {
+        plate_detector = std::make_shared<trt_vehicle::VehiclePlateDetector>(plate_det_model_path, char_rec_model_path);
+        this->initialized();
+    }
+    
+    vp_trt_vehicle_plate_detector_v2::~vp_trt_vehicle_plate_detector_v2() {
+        deinitialized();
+    }
+
+    void vp_trt_vehicle_plate_detector_v2::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_trt_vehicle_plate_detector_v2::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        // here we detect vehicle plate on the whole big frame
+        std::vector<std::vector<trt_vehicle::Plate>> plates;
+        plate_detector->detect(mats_to_infer, plates);
+
+        for (int i = 0; i < plates.size(); i++) {
+            auto& frame_meta = frame_meta_with_batch[i];
+            for (int j = 0; j < plates[i].size(); j++) {
+                auto& plate = plates[i][j];
+
+                // only plate detection but no recognition result
+                if (plate.text.empty()) {
+                    continue;
+                }
+                
+                // check value range
+                auto x = std::max(0, plate.x);
+                auto y = std::max(0, plate.y);
+                auto w = std::min(plate.width, frame_meta->frame.cols - x);
+                auto h = std::min(plate.height, frame_meta->frame.rows - y);
+                if (w <= 0 || h <=0) {
+                    continue;
+                }
+                
+                // create target and update back into frame meta
+                /* 
+                  we treat vehicle plate as target in vp_frame_meta.targets,
+                  1. x,y,width,height in vp_frame_target stand for location of plate in frame
+                  2. primary_label in vp_frame_target stands for plate color and plate text, which combined with '_'
+                */ 
+                auto target = std::make_shared<vp_objects::vp_frame_target>(x, y, w, h, 
+                                                    -1, 0, frame_meta->frame_index, frame_meta->channel_index, plate.color + "_" + plate.text);
+                frame_meta->targets.push_back(target);
+                VP_INFO("plate-color:" + plate.color + "    plate-text:" + plate.text);
+            }
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_plate_detector_v2.h
+++ b/nodes/infers/vp_trt_vehicle_plate_detector_v2.h
@@ -0,0 +1,28 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/trt_vehicle/models/vehicle_plate_detector.h"
+
+namespace vp_nodes {
+    // vehicle plate detector based on tensorrt using trt_vehicle library
+    // source code: ../../third_party/trt_vehicle
+    // note: derived from vp_primary_infer_node since it detects plates on the whole big frame, which is different from vp_trt_vehicle_plate_detector class
+    // this class is not based on opencv::dnn module but tensorrt, a few data members declared in base class are not usable any more(just ignore), such as vp_infer_node::net.
+    class vp_trt_vehicle_plate_detector_v2: public vp_primary_infer_node
+    {
+    private:
+        /* data */
+        std::shared_ptr<trt_vehicle::VehiclePlateDetector> plate_detector = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_vehicle_plate_detector_v2(std::string node_name, std::string plate_det_model_path = "", std::string char_rec_model_path = "");
+        ~vp_trt_vehicle_plate_detector_v2();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_scanner.cpp
+++ b/nodes/infers/vp_trt_vehicle_scanner.cpp
@@ -0,0 +1,64 @@
+#ifdef VP_WITH_TRT
+#include "vp_trt_vehicle_scanner.h"
+
+namespace vp_nodes {
+    
+    vp_trt_vehicle_scanner::vp_trt_vehicle_scanner(std::string node_name, 
+                                                    std::string vehicle_scan_model_path):
+                                                    vp_primary_infer_node(node_name, "") {
+        vehicle_scanner = std::make_shared<trt_vehicle::VehicleScanner>(vehicle_scan_model_path);
+        this->initialized();
+    }
+    
+    vp_trt_vehicle_scanner::~vp_trt_vehicle_scanner() {
+        deinitialized();
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_trt_vehicle_scanner::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        std::vector<std::vector<trt_vehicle::ObjBox>> vehicles;
+        vehicle_scanner->detect(mats_to_infer, vehicles);
+
+        assert(vehicles.size() == 1);
+        auto& vehicle_list = vehicles[0];
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        for (int i = 0; i < vehicle_list.size(); i++) {
+            auto& objbox = vehicle_list[i];
+
+            // check value range
+            objbox.x = std::max(objbox.x, 0);
+            objbox.y = std::max(objbox.y, 0);
+            objbox.width = std::min(objbox.width, frame_meta->frame.cols - objbox.x);
+            objbox.height = std::min(objbox.height, frame_meta->frame.rows - objbox.y);
+            if (objbox.width <= 0 || objbox.height <= 0) {
+                continue;
+            }
+            
+            auto target = std::make_shared<vp_objects::vp_frame_target>(objbox.x, objbox.y, objbox.width, objbox.height, 
+                                                                        objbox.class_, objbox.score, frame_meta->frame_index, frame_meta->channel_index, objbox.label);
+            // create target and update back into frame meta
+            frame_meta->targets.push_back(target);
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_vehicle_scanner::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_scanner.h
+++ b/nodes/infers/vp_trt_vehicle_scanner.h
@@ -0,0 +1,24 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/trt_vehicle/models/vehicle_scanner.h"
+
+namespace vp_nodes {
+    // vehicle scanner based on tensorrt using trt_vehicle library
+    class vp_trt_vehicle_scanner: public vp_primary_infer_node
+    {
+    private:
+        std::shared_ptr<trt_vehicle::VehicleScanner> vehicle_scanner = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_vehicle_scanner(std::string node_name, std::string vehicle_scan_model_path = "");
+        ~vp_trt_vehicle_scanner();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_type_classifier.cpp
+++ b/nodes/infers/vp_trt_vehicle_type_classifier.cpp
@@ -0,0 +1,64 @@
+#ifdef VP_WITH_TRT
+#include "vp_trt_vehicle_type_classifier.h"
+
+namespace vp_nodes {
+    vp_trt_vehicle_type_classifier::vp_trt_vehicle_type_classifier(std::string node_name, 
+                                                                    std::string vehicle_type_cls_model_path, 
+                                                                    std::vector<int> p_class_ids_applied_to,
+                                                                    int min_width_applied_to, int min_height_applied_to):
+                                                                    vp_secondary_infer_node(node_name, "", "", "", 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
+        vehicle_type_classifier = std::make_shared<trt_vehicle::VehicleTypeClassifier>(vehicle_type_cls_model_path);
+        this->initialized();
+    }
+    
+    vp_trt_vehicle_type_classifier::~vp_trt_vehicle_type_classifier() {
+        deinitialized();
+    }
+
+    void vp_trt_vehicle_type_classifier::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // infer using trt_vehicle library
+        start_time = std::chrono::system_clock::now();
+        std::vector<trt_vehicle::ObjCls> vehicles_type;
+        vehicle_type_classifier->classify(mats_to_infer, vehicles_type);
+
+        auto& frame_meta = frame_meta_with_batch[0];
+        auto index = 0;
+        for (int i = 0; i < vehicles_type.size(); i++) {
+            for (int j = index; j < frame_meta->targets.size(); j++) {
+                // need apply or not?
+                if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
+                    // continue as its primary_class_id is not in p_class_ids_applied_to
+                    continue;
+                }
+
+                // update back to frame meta
+                frame_meta->targets[j]->secondary_class_ids.push_back(vehicles_type[i].class_);
+                frame_meta->targets[j]->secondary_scores.push_back(vehicles_type[i].score);
+                frame_meta->targets[j]->secondary_labels.push_back(vehicles_type[i].label);
+
+                // break as we found the right target!
+                index = j + 1;
+                break;
+            }
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_vehicle_type_classifier::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_vehicle_type_classifier.h
+++ b/nodes/infers/vp_trt_vehicle_type_classifier.h
@@ -0,0 +1,26 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_secondary_infer_node.h"
+#include "../../third_party/trt_vehicle/models/vehicle_type_classifier.h"
+
+namespace vp_nodes {
+    // vehicle type classifier based on tensorrt using trt_vehicle library
+    // update secondary_class_ids/secondary_labels/secondary_scores of vp_frame_target.
+    class vp_trt_vehicle_type_classifier: public vp_secondary_infer_node
+    {
+    private:
+        /* data */
+        std::shared_ptr<trt_vehicle::VehicleTypeClassifier> vehicle_type_classifier = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_vehicle_type_classifier(std::string node_name, std::string vehicle_type_cls_model_path = "", std::vector<int> p_class_ids_applied_to = std::vector<int>(), int min_width_applied_to = 0, int min_height_applied_to = 0);
+        ~vp_trt_vehicle_type_classifier();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_classifier.cpp
+++ b/nodes/infers/vp_trt_yolov8_classifier.cpp
@@ -0,0 +1,67 @@
+
+#ifdef VP_WITH_TRT
+#include "vp_trt_yolov8_classifier.h"
+
+namespace vp_nodes {
+    vp_trt_yolov8_classifier::vp_trt_yolov8_classifier(std::string node_name, 
+                                                        std::string model_path, 
+                                                        std::string labels_path,  
+                                                        std::vector<int> p_class_ids_applied_to,
+                                                        int min_width_applied_to, int min_height_applied_to):
+                                                        vp_secondary_infer_node(node_name, "", "", labels_path, 1, 1, 1, p_class_ids_applied_to, min_width_applied_to, min_height_applied_to) {
+        yolov8_classifier = std::make_shared<trt_yolov8::trt_yolov8_classifier>(model_path);
+        this->initialized();
+    }
+    
+    vp_trt_yolov8_classifier::~vp_trt_yolov8_classifier() {
+        deinitialized();
+    }
+
+    void vp_trt_yolov8_classifier::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_secondary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // infer using trt_vehicle library
+        start_time = std::chrono::system_clock::now();
+        std::vector<std::vector<Classification>> classidications;
+        yolov8_classifier->classify(mats_to_infer, classidications);
+
+        auto& frame_meta = frame_meta_with_batch[0];
+        auto index = 0;
+        for (int i = 0; i < classidications.size(); i++) {
+            for (int j = index; j < frame_meta->targets.size(); j++) {
+                // need apply or not?
+                if (!need_apply(frame_meta->targets[j]->primary_class_id, frame_meta->targets[j]->width, frame_meta->targets[j]->height)) {
+                    // continue as its primary_class_id is not in p_class_ids_applied_to
+                    continue;
+                }
+
+                // update back to frame meta
+                frame_meta->targets[j]->secondary_class_ids.push_back(classidications[i][0].class_id);
+                frame_meta->targets[j]->secondary_scores.push_back(classidications[i][0].conf);
+                auto label = labels.size() == 0 ? "" : labels[classidications[i][0].class_id];
+                frame_meta->targets[j]->secondary_labels.push_back(label);
+
+                // break as we found the right target!
+                index = j + 1;
+                break;
+            }
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_yolov8_classifier::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_classifier.h
+++ b/nodes/infers/vp_trt_yolov8_classifier.h
@@ -0,0 +1,30 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_secondary_infer_node.h"
+#include "../../third_party/trt_yolov8/trt_yolov8_classifier.h"
+
+namespace vp_nodes {
+    // universal yolov8 classifier based on tensorrt using third_party/trt_yolov8 library
+    class vp_trt_yolov8_classifier: public vp_secondary_infer_node
+    {
+    private:
+        /* data */
+        std::shared_ptr<trt_yolov8::trt_yolov8_classifier> yolov8_classifier = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_yolov8_classifier(std::string node_name, 
+                                std::string model_path,
+                                std::string labels_path = "",  
+                                std::vector<int> p_class_ids_applied_to = std::vector<int>(), 
+                                int min_width_applied_to = 0, 
+                                int min_height_applied_to = 0);
+        ~vp_trt_yolov8_classifier();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_detector.cpp
+++ b/nodes/infers/vp_trt_yolov8_detector.cpp
@@ -0,0 +1,69 @@
+
+#ifdef VP_WITH_TRT
+#include "vp_trt_yolov8_detector.h"
+
+namespace vp_nodes {
+    
+    vp_trt_yolov8_detector::vp_trt_yolov8_detector(std::string node_name, 
+                                                    std::string model_path,
+                                                    std::string labels_path):
+                                                    vp_primary_infer_node(node_name, "", "", labels_path) {
+        yolov8_detector = std::make_shared<trt_yolov8::trt_yolov8_detector>(model_path);
+        this->initialized();
+    }
+    
+    vp_trt_yolov8_detector::~vp_trt_yolov8_detector() {
+        deinitialized();
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_trt_yolov8_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        std::vector<std::vector<Detection>> detections;
+        yolov8_detector->detect(mats_to_infer, detections);
+
+        assert(detections.size() == 1);
+        auto& detection_list = detections[0];
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        for (int i = 0; i < detection_list.size(); i++) {
+            auto& objbox = detection_list[i];
+
+            // objbox.bbox: center_x center_y width height
+            auto rect = get_rect(frame_meta->frame, objbox.bbox);  // convert to: x, y, width,height
+            
+            // check value range
+            int x = std::max(rect.x, 0);
+            int y = std::max(rect.y, 0);
+            int width = std::min(rect.width, frame_meta->frame.cols - x);
+            int height = std::min(rect.height, frame_meta->frame.rows - y);
+            if (width <= 0 || height <= 0) {
+                continue;
+            }
+            auto label = labels.size() == 0 ? "" : labels[objbox.class_id];
+            auto target = std::make_shared<vp_objects::vp_frame_target>(x, y, width, height, 
+                                                                        objbox.class_id, objbox.conf, frame_meta->frame_index, frame_meta->channel_index, label);
+            // create target and update back into frame meta
+            frame_meta->targets.push_back(target);
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_yolov8_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_detector.h
+++ b/nodes/infers/vp_trt_yolov8_detector.h
@@ -0,0 +1,24 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/trt_yolov8/trt_yolov8_detector.h"
+
+namespace vp_nodes {
+    // universal yolov8 detector based on tensorrt using third_party/trt_yolov8 library
+    class vp_trt_yolov8_detector: public vp_primary_infer_node
+    {
+    private:
+        std::shared_ptr<trt_yolov8::trt_yolov8_detector> yolov8_detector = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_yolov8_detector(std::string node_name, std::string model_path, std::string labels_path = "");
+        ~vp_trt_yolov8_detector();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_pose_detector.cpp
+++ b/nodes/infers/vp_trt_yolov8_pose_detector.cpp
@@ -0,0 +1,60 @@
+
+#ifdef VP_WITH_TRT
+#include "vp_trt_yolov8_pose_detector.h"
+
+namespace vp_nodes {
+    
+    vp_trt_yolov8_pose_detector::vp_trt_yolov8_pose_detector(std::string node_name, 
+                                                    std::string model_path):
+                                                    vp_primary_infer_node(node_name, "") {
+        yolov8_pose_detector = std::make_shared<trt_yolov8::trt_yolov8_pose_detector>(model_path);
+        this->initialized();
+    }
+    
+    vp_trt_yolov8_pose_detector::~vp_trt_yolov8_pose_detector() {
+        deinitialized();
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_trt_yolov8_pose_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        std::vector<std::vector<Detection>> detections;
+        yolov8_pose_detector->detect(mats_to_infer, detections);
+
+        assert(detections.size() == 1);
+        auto& detection_list = detections[0];
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        for (int i = 0; i < detection_list.size(); i++) {
+            auto& objbox = detection_list[i];
+            auto rect = get_rect_adapt_landmark(frame_meta->frame, objbox.bbox, objbox.keypoints);
+            
+            std::vector<vp_objects::vp_pose_keypoint> kps;    
+            for (int j = 0; j < 51; j += 3) {
+                kps.push_back(vp_objects::vp_pose_keypoint {j, int(objbox.keypoints[j]), int(objbox.keypoints[j + 1]), objbox.keypoints[j + 2]});
+            }
+                        
+            auto pose_target = std::make_shared<vp_objects::vp_frame_pose_target>(vp_objects::vp_pose_type::yolov8_pose_17, kps);
+            frame_meta->pose_targets.push_back(pose_target);            
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_yolov8_pose_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_pose_detector.h
+++ b/nodes/infers/vp_trt_yolov8_pose_detector.h
@@ -0,0 +1,24 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/trt_yolov8/trt_yolov8_pose_detector.h"
+
+namespace vp_nodes {
+    // universal yolov8 pose detector based on tensorrt using third_party/trt_yolov8 library
+    class vp_trt_yolov8_pose_detector: public vp_primary_infer_node
+    {
+    private:
+        std::shared_ptr<trt_yolov8::trt_yolov8_pose_detector> yolov8_pose_detector = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_yolov8_pose_detector(std::string node_name, std::string model_path);
+        ~vp_trt_yolov8_pose_detector();
+    };
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_seg_detector.cpp
+++ b/nodes/infers/vp_trt_yolov8_seg_detector.cpp
@@ -0,0 +1,78 @@
+
+#ifdef VP_WITH_TRT
+#include "vp_trt_yolov8_seg_detector.h"
+
+namespace vp_nodes {
+    
+    vp_trt_yolov8_seg_detector::vp_trt_yolov8_seg_detector(std::string node_name, 
+                                                    std::string model_path,
+                                                    std::string labels_path):
+                                                    vp_primary_infer_node(node_name, "", "", labels_path) {
+        yolov8_seg_detector = std::make_shared<trt_yolov8::trt_yolov8_seg_detector>(model_path);
+        this->initialized();
+    }
+    
+    vp_trt_yolov8_seg_detector::~vp_trt_yolov8_seg_detector() {
+        deinitialized();
+    }
+
+    // please refer to vp_infer_node::run_infer_combinations
+    void vp_trt_yolov8_seg_detector::run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        assert(frame_meta_with_batch.size() == 1);
+        std::vector<cv::Mat> mats_to_infer;
+
+        // start
+        auto start_time = std::chrono::system_clock::now();
+
+        // prepare data, as same as base class
+        vp_primary_infer_node::prepare(frame_meta_with_batch, mats_to_infer);
+        auto prepare_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        start_time = std::chrono::system_clock::now();
+        std::vector<std::vector<Detection>> detections;
+        std::vector<std::vector<cv::Mat>> masks;
+        yolov8_seg_detector->detect(mats_to_infer, detections, masks);
+
+        assert(detections.size() == 1);
+        assert(masks.size() == 1);
+        auto& detection_list = detections[0];
+        auto& mask_list = masks[0];
+        assert(detection_list.size() == mask_list.size());
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        for (int i = 0; i < detection_list.size(); i++) {
+            auto& objbox = detection_list[i];
+            auto& mask = mask_list[i];
+            auto scaled_mask = scale_mask(mask, frame_meta->frame);
+
+            // objbox.bbox: center_x center_y width height
+            auto rect = get_rect(frame_meta->frame, objbox.bbox);  // convert to: x, y, width,height
+            // check value range
+            rect.x = std::max(rect.x, 0);
+            rect.y = std::max(rect.y, 0);
+            rect.width = std::min(rect.width, frame_meta->frame.cols - rect.x);
+            rect.height = std::min(rect.height, frame_meta->frame.rows - rect.y);
+            if (rect.width <= 0 || rect.height <= 0) {
+                continue;
+            }
+
+            auto label = labels.size() == 0 ? "" : labels[objbox.class_id];
+            auto target = std::make_shared<vp_objects::vp_frame_target>(rect.x, rect.y, rect.width, rect.height, 
+                                                                        objbox.class_id, objbox.conf, frame_meta->frame_index, frame_meta->channel_index, label);
+            auto rect_mask = scaled_mask(rect);
+            target->mask = rect_mask;
+
+            // create target and update back into frame meta
+            frame_meta->targets.push_back(target);
+        }
+        auto infer_time = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start_time);
+
+        // can not calculate preprocess time and postprocess time, set 0 by default.
+        vp_infer_node::infer_combinations_time_cost(mats_to_infer.size(), prepare_time.count(), 0, infer_time.count(), 0);
+    }
+
+    void vp_trt_yolov8_seg_detector::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+
+    }
+}
+#endif
--- a/nodes/infers/vp_trt_yolov8_seg_detector.h
+++ b/nodes/infers/vp_trt_yolov8_seg_detector.h
@@ -0,0 +1,24 @@
+#pragma once
+
+#ifdef VP_WITH_TRT
+#include "../vp_primary_infer_node.h"
+#include "../../third_party/trt_yolov8/trt_yolov8_seg_detector.h"
+
+namespace vp_nodes {
+    // universal yolov8 segmentation detector based on tensorrt using third_party/trt_yolov8 library
+    class vp_trt_yolov8_seg_detector: public vp_primary_infer_node
+    {
+    private:
+        std::shared_ptr<trt_yolov8::trt_yolov8_seg_detector> yolov8_seg_detector = nullptr;
+    protected:
+        // we need a totally new logic for the whole infer combinations
+        // no separate step pre-defined needed in base class
+        virtual void run_infer_combinations(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        // override pure virtual method, for compile pass
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_trt_yolov8_seg_detector(std::string node_name, std::string model_path, std::string labels_path = "");
+        ~vp_trt_yolov8_seg_detector();
+    };
+}
+#endif
--- a/nodes/infers/vp_yolo5_seg_node.cpp
+++ b/nodes/infers/vp_yolo5_seg_node.cpp
@@ -0,0 +1,307 @@
+
+#include <cmath>
+#include <algorithm>
+#include <numeric> 
+#include "vp_yolo5_seg_node.h"
+
+
+namespace vp_nodes {
+    vp_yolo5_seg_node::vp_yolo5_seg_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string labels_path, 
+                            int input_width, 
+                            int input_height, 
+                            int batch_size,
+                            int class_id_offset,
+                            float conf_threshold,
+                            float iou_threshold):
+                            vp_primary_infer_node(node_name, model_path, "", labels_path, input_width, input_height, batch_size, class_id_offset),
+                            conf_threshold(conf_threshold),
+                            iou_threshold(iou_threshold) {
+        this->initialized();
+    }
+    
+    vp_yolo5_seg_node::~vp_yolo5_seg_node() {
+        deinitialized();
+    }
+
+    void vp_yolo5_seg_node::prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) {
+        for (auto& i: frame_meta_with_batch) {
+            auto letterboxed = letterbox(i->frame, cv::Size(input_width, input_height), scale_ratio, padding);
+            mats_to_infer.push_back(letterboxed);
+        }
+    }
+
+    void vp_yolo5_seg_node::infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) {
+        // blob_to_infer is a 4D matrix
+        // the first dim is number of batch, MUST be 1
+        assert(blob_to_infer.dims == 4);
+        assert(blob_to_infer.size[0] == 1);
+        assert(!net.empty());
+
+        net.setInput(blob_to_infer);
+        net.forward(raw_outputs, net.getUnconnectedOutLayersNames());
+    }
+
+    void vp_yolo5_seg_node::preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) {
+        // ignore preprocess logic in base class
+        cv::dnn::blobFromImages(mats_to_infer, blob_to_infer, 1.0 / 255.0, cv::Size(), cv::Scalar(), true, false);
+    }
+
+    void vp_yolo5_seg_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        auto& frame_meta = frame_meta_with_batch[0];
+        auto& frame = frame_meta->frame;
+        auto pred = raw_outputs[0]; // [1, N, 4+1+NUM_CLASSES+32]
+        auto proto = raw_outputs[1]; // [1, 32, 96, 160]
+
+        std::vector<cv::Rect> boxes;
+        std::vector<float> confidences;
+        std::vector<int> class_ids;
+        std::vector<cv::Mat> mask_coeffs;
+
+        int num_boxes = pred.size[1];
+        float* data = (float*)pred.data;
+
+        // decode
+        for (int i = 0; i < num_boxes; ++i) {
+            float obj_conf = data[4];
+            if (obj_conf < conf_threshold) {
+                data += (5 + labels.size() + MASK_CHANNELS);
+                continue;
+            }
+
+            int class_id = -1;
+            float max_cls_conf = 0;
+            for (int c = 0; c < labels.size(); ++c) {
+                float cls_conf = data[5 + c];
+                if (cls_conf > max_cls_conf) {
+                    max_cls_conf = cls_conf;
+                    class_id = c;
+                }
+            }
+
+            float total_conf = obj_conf * max_cls_conf;
+            if (total_conf < conf_threshold || class_id == -1) {
+                data += (5 + labels.size() + MASK_CHANNELS);
+                continue;
+            }
+
+            float cx = data[0];
+            float cy = data[1];
+            float w  = data[2];
+            float h  = data[3];
+
+            float x1 = cx - w * 0.5f;
+            float y1 = cy - h * 0.5f;
+            float x2 = cx + w * 0.5f;
+            float y2 = cy + h * 0.5f;
+
+            if (x2 <= x1 || y2 <= y1 || w <= 0 || h <= 0) {
+                data += (5 + labels.size() + MASK_CHANNELS);
+                continue;
+            }
+
+            float x_scale = scale_ratio.at<float>(0);
+            float y_scale = scale_ratio.at<float>(1);
+            int orig_x1 = cvRound((x1 - padding.x) * x_scale);
+            int orig_y1 = cvRound((y1 - padding.y) * y_scale);
+            int orig_x2 = cvRound((x2 - padding.x) * x_scale);
+            int orig_y2 = cvRound((y2 - padding.y) * y_scale);
+
+            orig_x1 = max(0, orig_x1);
+            orig_y1 = max(0, orig_y1);
+            orig_x2 = min(frame.cols, orig_x2);
+            orig_y2 = min(frame.rows, orig_y2);
+
+            if (orig_x2 <= orig_x1 || orig_y2 <= orig_y1) {
+                data += (5 + labels.size() + MASK_CHANNELS);
+                continue;
+            }
+
+            cv::Rect box(orig_x1, orig_y1, orig_x2 - orig_x1, orig_y2 - orig_y1);
+            boxes.push_back(box);
+            confidences.push_back(total_conf);
+            class_ids.push_back(class_id);
+
+            cv::Mat coeff(1, MASK_CHANNELS, CV_32F);
+            for (int m = 0; m < MASK_CHANNELS; ++m) {
+                coeff.at<float>(m) = data[5 + labels.size() + m];
+            }
+            mask_coeffs.push_back(coeff);
+
+            data += (5 + labels.size() + MASK_CHANNELS);
+        }
+
+        std::vector<int> final_indices;
+        std::unordered_map<int, std::vector<int>> class_boxes;
+        for (int i = 0; i < boxes.size(); ++i) {
+            class_boxes[class_ids[i]].push_back(i);
+        }
+
+        // NMS
+        for (auto& kv : class_boxes) {
+            const std::vector<int>& indices_in_class = kv.second;
+            std::vector<cv::Rect> boxes_cls;
+            std::vector<float> scores_cls;
+            for (int idx : indices_in_class) {
+                boxes_cls.push_back(boxes[idx]);
+                scores_cls.push_back(confidences[idx]);
+            }
+
+            std::vector<int> nms_indices = nms(boxes_cls, scores_cls, iou_threshold);
+            for (int local_idx : nms_indices) {
+                final_indices.push_back(indices_in_class[local_idx]);
+            }
+        }
+
+        // create target push back to frame meta
+        for (int idx : final_indices) {
+            cv::Rect box = boxes[idx];
+            float conf = confidences[idx];
+            int cls = class_ids[idx];
+
+            auto label = (labels.size() < cls + 1) ? "" : labels[cls];            
+            cls += class_id_offset;
+            auto target = std::make_shared<vp_objects::vp_frame_target>(box.x, box.y, box.width, box.height, cls, conf, frame_meta->frame_index, frame_meta->channel_index, label);
+
+            // try to extract mask
+            auto local_mask = process_mask(proto, mask_coeffs[idx], box,
+                                          cv::Size(input_width, input_height),
+                                          padding, scale_ratio.at<float>(0), scale_ratio.at<float>(1));
+            target->mask = local_mask;
+            
+            frame_meta->targets.push_back(target);
+        }
+
+        std::cout << "Detected: " << final_indices.size() << " instances" << std::endl;
+    }
+
+    cv::Mat vp_yolo5_seg_node::letterbox(const cv::Mat& src, cv::Size target_size, cv::Mat& scale_ratio, cv::Point& padding) {
+        float scale = min(static_cast<float>(target_size.height) / src.rows,
+                        static_cast<float>(target_size.width) / src.cols);
+        int new_width = cvRound(src.cols * scale);
+        int new_height = cvRound(src.rows * scale);
+
+        cv::Mat resized;
+        resize(src, resized, cv::Size(new_width, new_height), 0, 0, cv::INTER_LINEAR);
+
+        cv::Mat letterboxed(target_size, CV_8UC3, cv::Scalar(114, 114, 114));
+        int top = (target_size.height - new_height) / 2;
+        int left = (target_size.width - new_width) / 2;
+        cv::Rect roi(left, top, new_width, new_height);
+        resized.copyTo(letterboxed(roi));
+
+        scale_ratio = cv::Mat(1, 2, CV_32F);
+        scale_ratio.at<float>(0) = 1.0f / scale;  // x_scale
+        scale_ratio.at<float>(1) = 1.0f / scale;  // y_scale
+
+        padding = cv::Point(left, top);
+        return letterboxed;
+    }
+
+    // Sigmoid
+    cv::Mat vp_yolo5_seg_node::sigmoid(const cv::Mat& x) {
+        cv::Mat result;
+        cv::exp(-x, result);
+        result = 1.0f / (1.0f + result);
+        return result;
+    }
+
+    cv::Mat vp_yolo5_seg_node::process_mask(const cv::Mat& proto,           // [1, 32, H/4, W/4]
+                    const cv::Mat& coeffs,          // [1, 32]
+                    const cv::Rect& bbox,           
+                    const cv::Size& input_size,     // e.g., 640x384
+                    const cv::Point& padding,
+                    const float x_scale,
+                    const float y_scale) {
+
+        // 1. 获取 proto 尺寸 [1,32,96,160] → h=96, w=160
+        int h_proto = proto.size[2];
+        int w_proto = proto.size[3];
+
+        // 2. 重塑为 [32, h, w]
+        cv::Mat proto_3d = proto.reshape(1, {MASK_CHANNELS, h_proto, w_proto});
+
+        // 3. 展平为 [32, h*w]
+        cv::Mat proto_flat = proto_3d.reshape(1, MASK_CHANNELS);
+
+        // 4. 矩阵乘: [1,32] × [32, h*w] → [1, h*w]
+        cv::Mat mask_raw = coeffs * proto_flat;
+
+        // 5. 重塑为 [h, w]
+        cv::Mat mask_2d = mask_raw.reshape(0, h_proto);
+
+        // 6. Sigmoid
+        cv::Mat mask_sigmoid = sigmoid(mask_2d);
+
+        // 7. 原图 bbox → 输入图坐标（含 padding）
+        float x1_in = bbox.x / x_scale + padding.x;
+        float y1_in = bbox.y / y_scale + padding.y;
+        float x2_in = (bbox.x + bbox.width) / x_scale + padding.x;
+        float y2_in = (bbox.y + bbox.height) / y_scale + padding.y;
+
+        // 8. 输入图 → proto 坐标（/4）
+        float x1_p = x1_in / 4.0f;
+        float y1_p = y1_in / 4.0f;
+        float x2_p = x2_in / 4.0f;
+        float y2_p = y2_in / 4.0f;
+
+        int px1 = max(0, (int)floor(x1_p));
+        int py1 = max(0, (int)floor(y1_p));
+        int px2 = min(w_proto, (int)ceil(x2_p));
+        int py2 = min(h_proto, (int)ceil(y2_p));
+
+        if (px2 <= px1 || py2 <= py1) {
+            return cv::Mat::zeros(bbox.size(), CV_8UC1);
+        }
+
+        // 9. 裁剪
+        cv::Mat cropped = mask_sigmoid(cv::Rect(px1, py1, px2 - px1, py2 - py1)).clone();
+
+        // 10. 上采样到 bbox 尺寸
+        cv::Mat resized;
+        resize(cropped, resized, bbox.size(), 0, 0, cv::INTER_LINEAR);
+
+        // 11. 二值化
+        cv::Mat mask_bin;
+        threshold(resized, mask_bin, 0.5, 255, cv::THRESH_BINARY);
+        mask_bin.convertTo(mask_bin, CV_8UC1);
+
+        return mask_bin;
+    }
+
+    // NMS
+    std::vector<int> vp_yolo5_seg_node::nms(const std::vector<cv::Rect>& boxes, const std::vector<float>& scores, float iou_threshold) {
+        std::vector<int> indices;
+        std::vector<float> areas(boxes.size());
+        for (size_t i = 0; i < boxes.size(); ++i) {
+            areas[i] = boxes[i].area();
+        }
+
+        std::vector<int> order(scores.size());
+        iota(order.begin(), order.end(), 0);
+        sort(order.begin(), order.end(), [&](int a, int b) {
+            return scores[a] > scores[b];
+        });
+
+        std::vector<bool> keep(scores.size(), true);
+        for (int i = 0; i < order.size(); ++i) {
+            int idx = order[i];
+            if (!keep[idx]) continue;
+            for (int j = i + 1; j < order.size(); ++j) {
+                int idx2 = order[j];
+                if (!keep[idx2]) continue;
+                cv::Rect intersect = boxes[idx] & boxes[idx2];
+                float iou = intersect.area() / (areas[idx] + areas[idx2] - intersect.area() + 1e-6f);
+                if (iou > iou_threshold) {
+                    keep[idx2] = false;
+                }
+            }
+        }
+
+        for (int i = 0; i < keep.size(); ++i) {
+            if (keep[i]) indices.push_back(i);
+        }
+        return indices;
+    }
+}
--- a/nodes/infers/vp_yolo5_seg_node.h
+++ b/nodes/infers/vp_yolo5_seg_node.h
@@ -0,0 +1,48 @@
+
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+#include "../../objects/vp_frame_target.h"
+
+
+
+namespace vp_nodes {
+    // driving area segmentation based on yolov5s-seg-v7.0
+    // https://github.com/ultralytics/yolov5/releases/tag/v7.0
+    class vp_yolo5_seg_node: public vp_primary_infer_node
+    {
+    private:
+        /* data */
+        float conf_threshold = 0.5;
+        float iou_threshold = 0.7;
+        int MASK_CHANNELS = 32;
+        cv::Mat scale_ratio;
+        cv::Point padding;
+        cv::Mat letterbox(const cv::Mat& src, cv::Size target_size, cv::Mat& scale_ratio, cv::Point& padding);
+        cv::Mat sigmoid(const cv::Mat& x);
+        cv::Mat process_mask(const cv::Mat& proto,  // [1, 32, H/4, W/4]
+                    const cv::Mat& coeffs,          // [1, 32]
+                    const cv::Rect& bbox,           
+                    const cv::Size& input_size,     // e.g., 640x384
+                    const cv::Point& padding,
+                    const float x_scale,
+                    const float y_scale);
+        std::vector<int> nms(const std::vector<cv::Rect>& boxes, const std::vector<float>& scores, float iou_threshold);
+    protected:
+        virtual void infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) override;
+        virtual void preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) override;
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+        virtual void prepare(const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch, std::vector<cv::Mat>& mats_to_infer) override;
+    public:
+        vp_yolo5_seg_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string labels_path = "", 
+                            int input_width = 640, 
+                            int input_height = 384, 
+                            int batch_size = 1,
+                            int class_id_offset = 0,
+                            float conf_threshold = 0.5,
+                            float iou_threshold = 0.7);
+        ~vp_yolo5_seg_node();
+    };    
+}
--- a/nodes/infers/vp_yolo_detector_node.cpp
+++ b/nodes/infers/vp_yolo_detector_node.cpp
@@ -0,0 +1,118 @@
+
+#include "vp_yolo_detector_node.h"
+
+namespace vp_nodes {
+    
+    vp_yolo_detector_node::vp_yolo_detector_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path, 
+                            std::string labels_path, 
+                            int input_width, 
+                            int input_height, 
+                            int batch_size,
+                            int class_id_offset,
+                            float score_threshold,
+                            float confidence_threshold,
+                            float nms_threshold,
+                            float scale,
+                            cv::Scalar mean,
+                            cv::Scalar std,
+                            bool swap_rb):
+                            vp_primary_infer_node(node_name, model_path, model_config_path,labels_path,input_width, input_height, batch_size, class_id_offset, scale, mean, std, swap_rb),
+                            score_threshold(score_threshold),
+                            confidence_threshold(confidence_threshold),
+                            nms_threshold(nms_threshold) {
+        this->initialized();
+    }
+    
+    vp_yolo_detector_node::~vp_yolo_detector_node() {
+        deinitialized();
+    }
+    
+    void vp_yolo_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        // make sure heads of output are not zero
+        assert(raw_outputs.size() > 0);
+
+        // check dims of each output
+        auto dim_offset = raw_outputs[0].dims == 2 ? 0 : 1;
+        auto batch = dim_offset == 0 ? 1 : raw_outputs[0].size[0];
+
+        std::vector<int> class_ids;
+        std::vector<float> confidences;
+        std::vector<cv::Rect> boxes;
+
+        for (int k = 0; k < batch; k++) {  // scan batch
+            auto& frame_meta = frame_meta_with_batch[k];
+            for (int i = 0; i < raw_outputs.size(); ++i) {  // scan heads of output
+                cv::Mat output = cv::Mat(raw_outputs[i].size[0 + dim_offset], raw_outputs[i].size[1 + dim_offset], CV_32F, const_cast<uchar*>(raw_outputs[i].ptr(k)));
+
+                auto data = (float*)output.data;
+                for (int j = 0; j < output.rows; ++j, data += output.cols) {
+                    
+                    float confidence = data[4];
+                    // check confidence threshold
+                    if (confidence < confidence_threshold) {
+                        continue;
+                    }
+
+                    cv::Mat scores = output.row(j).colRange(5, output.cols);
+                    cv::Point class_id;
+                    double max_score;
+                    // Get the value and location of the maximum score
+                    cv::minMaxLoc(scores, 0, &max_score, 0, &class_id);
+
+                    // check score threshold
+                    if (max_score >= score_threshold) {
+                        // smaller than 1.0 means relative value 
+                        // greater than 1.0 means absolute value which need to be converted to relative value
+                        int centerX = (int)((data[0] <= 1 ? data[0] : data[0] / input_width) * frame_meta->frame.cols);
+                        int centerY = (int)((data[1] <= 1 ? data[1] : data[1] / input_height) * frame_meta->frame.rows);
+                        int width = (int)((data[2] <= 1 ? data[2] : data[2] / input_width) * frame_meta->frame.cols);
+                        int height = (int)((data[3] <= 1 ? data[3] : data[3] / input_height) * frame_meta->frame.rows);
+                        int left = centerX - width / 2;
+                        int top = centerY - height / 2;
+
+                        class_ids.push_back(class_id.x);
+                        confidences.push_back((float)confidence);
+                        boxes.push_back(cv::Rect(left, top, width, height));
+                    }
+                }
+            }
+
+            // nms
+            std::vector<int> indices;
+            cv::dnn::NMSBoxes(boxes, confidences, confidence_threshold, nms_threshold, indices);
+
+            // create target
+            for (int i = 0; i < indices.size(); ++i) {
+                int idx = indices[i];
+                auto box = boxes[idx];
+                auto confidence = confidences[idx];
+                auto class_id = class_ids[idx];
+                auto label = (labels.size() < class_id + 1) ? "" : labels[class_id];
+
+                // check value range
+                box.x = std::max(box.x, 0);
+                box.y = std::max(box.y, 0);
+                box.width = std::min(box.width, frame_meta->frame.cols - box.x);
+                box.height = std::min(box.height, frame_meta->frame.rows - box.y);
+                if (box.width <= 0 || box.height <= 0) {
+                    continue;
+                }
+                
+                // apply offset to class id since multi detectors can exist at front of this one.
+                // later we MUST use the new class id (applied offset) instead of orignal one.
+                class_id += class_id_offset;
+
+                auto target = std::make_shared<vp_objects::vp_frame_target>(box.x, box.y, box.width, box.height, class_id, confidence, frame_meta->frame_index, frame_meta->channel_index, label);
+
+                // insert target back to frame meta
+                frame_meta->targets.push_back(target);
+            }
+
+            class_ids.clear();
+            confidences.clear();
+            boxes.clear();
+        }
+    }
+}
--- a/nodes/infers/vp_yolo_detector_node.h
+++ b/nodes/infers/vp_yolo_detector_node.h
@@ -0,0 +1,35 @@
+
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+
+namespace vp_nodes {
+    // yolo detector, support yolov3/4/5
+    // https://github.com/pjreddie/darknet
+    class vp_yolo_detector_node: public vp_primary_infer_node
+    {
+    private:
+        float score_threshold;
+        float confidence_threshold;
+        float nms_threshold;
+    protected:
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_yolo_detector_node(std::string node_name, 
+                            std::string model_path, 
+                            std::string model_config_path = "", 
+                            std::string labels_path = "", 
+                            int input_width = 416, 
+                            int input_height = 416, 
+                            int batch_size = 1,
+                            int class_id_offset = 0,
+                            float score_threshold = 0.5,
+                            float confidence_threshold = 0.5,
+                            float nms_threshold = 0.5,
+                            float scale = 1 / 255.0,
+                            cv::Scalar mean = cv::Scalar(0),
+                            cv::Scalar std = cv::Scalar(1),
+                            bool swap_rb = true);
+        ~vp_yolo_detector_node();
+    };
+}
--- a/nodes/infers/vp_yunet_face_detector_node.cpp
+++ b/nodes/infers/vp_yunet_face_detector_node.cpp
@@ -0,0 +1,229 @@
+
+#include "vp_yunet_face_detector_node.h"
+
+
+namespace vp_nodes {
+        
+    vp_yunet_face_detector_node::vp_yunet_face_detector_node(std::string node_name, 
+                                                            std::string model_path, 
+                                                            float score_threshold, 
+                                                            float nms_threshold, 
+                                                            int top_k):
+                                                            vp_primary_infer_node(node_name, model_path),
+                                                            scoreThreshold(score_threshold),
+                                                            nmsThreshold(nms_threshold),
+                                                            topK(top_k) {
+        this->initialized();
+    }
+    
+    vp_yunet_face_detector_node::~vp_yunet_face_detector_node() {
+        deinitialized();
+    }
+    
+    void vp_yunet_face_detector_node::postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) {
+        using namespace cv;
+        // 3 heads of output
+        assert(raw_outputs.size() == 3);
+        assert(frame_meta_with_batch.size() == 1);
+        auto& frame_meta = frame_meta_with_batch[0];
+
+        // Extract from output_blobs
+        Mat loc = raw_outputs[0];
+        Mat conf = raw_outputs[1];
+        Mat iou = raw_outputs[2];
+
+        // we need generate priors if input size changed or priors is not initialized
+        if (loc.rows != priors.size()) {
+            inputW = frame_meta->frame.cols;
+            inputH = frame_meta->frame.rows;
+            generatePriors();
+        }
+        
+        assert(loc.rows == priors.size());
+        assert(loc.rows == conf.rows);
+        assert(loc.rows == iou.rows);
+
+        // Decode from deltas and priors
+        const std::vector<float> variance = {0.1f, 0.2f};
+        float* loc_v = (float*)(loc.data);
+        float* conf_v = (float*)(conf.data);
+        float* iou_v = (float*)(iou.data);
+        Mat faces;
+        // (tl_x, tl_y, w, h, re_x, re_y, le_x, le_y, nt_x, nt_y, rcm_x, rcm_y, lcm_x, lcm_y, score)
+        // 'tl': top left point of the bounding box
+        // 're': right eye, 'le': left eye
+        // 'nt':  nose tip
+        // 'rcm': right corner of mouth, 'lcm': left corner of mouth
+        Mat face(1, 15, CV_32FC1);
+        for (size_t i = 0; i < priors.size(); ++i) {
+            // Get score
+            float clsScore = conf_v[i*2+1];
+            float iouScore = iou_v[i];
+            // Clamp
+            if (iouScore < 0.f) {
+                iouScore = 0.f;
+            }
+            else if (iouScore > 1.f) {
+                iouScore = 1.f;
+            }
+            float score = std::sqrt(clsScore * iouScore);
+            face.at<float>(0, 14) = score;
+
+            // Get bounding box
+            float cx = (priors[i].x + loc_v[i*14+0] * variance[0] * priors[i].width)  * inputW;
+            float cy = (priors[i].y + loc_v[i*14+1] * variance[0] * priors[i].height) * inputH;
+            float w  = priors[i].width  * exp(loc_v[i*14+2] * variance[0]) * inputW;
+            float h  = priors[i].height * exp(loc_v[i*14+3] * variance[1]) * inputH;
+            float x1 = cx - w / 2;
+            float y1 = cy - h / 2;
+            face.at<float>(0, 0) = x1;
+            face.at<float>(0, 1) = y1;
+            face.at<float>(0, 2) = w;
+            face.at<float>(0, 3) = h;
+
+            // Get landmarks
+            face.at<float>(0, 4) = (priors[i].x + loc_v[i*14+ 4] * variance[0] * priors[i].width)  * inputW;  // right eye, x
+            face.at<float>(0, 5) = (priors[i].y + loc_v[i*14+ 5] * variance[0] * priors[i].height) * inputH;  // right eye, y
+            face.at<float>(0, 6) = (priors[i].x + loc_v[i*14+ 6] * variance[0] * priors[i].width)  * inputW;  // left eye, x
+            face.at<float>(0, 7) = (priors[i].y + loc_v[i*14+ 7] * variance[0] * priors[i].height) * inputH;  // left eye, y
+            face.at<float>(0, 8) = (priors[i].x + loc_v[i*14+ 8] * variance[0] * priors[i].width)  * inputW;  // nose tip, x
+            face.at<float>(0, 9) = (priors[i].y + loc_v[i*14+ 9] * variance[0] * priors[i].height) * inputH;  // nose tip, y
+            face.at<float>(0, 10) = (priors[i].x + loc_v[i*14+10] * variance[0] * priors[i].width)  * inputW; // right corner of mouth, x
+            face.at<float>(0, 11) = (priors[i].y + loc_v[i*14+11] * variance[0] * priors[i].height) * inputH; // right corner of mouth, y
+            face.at<float>(0, 12) = (priors[i].x + loc_v[i*14+12] * variance[0] * priors[i].width)  * inputW; // left corner of mouth, x
+            face.at<float>(0, 13) = (priors[i].y + loc_v[i*14+13] * variance[0] * priors[i].height) * inputH; // left corner of mouth, y
+
+            faces.push_back(face);
+        }
+
+        if (faces.rows > 1)
+        {
+            // Retrieve boxes and scores
+            std::vector<Rect2i> faceBoxes;
+            std::vector<float> faceScores;
+            for (int rIdx = 0; rIdx < faces.rows; rIdx++)
+            {
+                faceBoxes.push_back(Rect2i(int(faces.at<float>(rIdx, 0)),
+                                           int(faces.at<float>(rIdx, 1)),
+                                           int(faces.at<float>(rIdx, 2)),
+                                           int(faces.at<float>(rIdx, 3))));
+                faceScores.push_back(faces.at<float>(rIdx, 14));
+            }
+
+            std::vector<int> keepIdx;
+            dnn::NMSBoxes(faceBoxes, faceScores, scoreThreshold, nmsThreshold, keepIdx, 1.f, topK);
+
+            // Get NMS results
+            Mat nms_faces;
+            for (int idx: keepIdx)
+            {
+                nms_faces.push_back(faces.row(idx));
+            }
+
+            // insert face target back to frame meta
+            for (int i = 0; i < nms_faces.rows; i++) {
+                auto x = int(nms_faces.at<float>(i, 0));
+                auto y = int(nms_faces.at<float>(i, 1));
+                auto w = int(nms_faces.at<float>(i, 2));
+                auto h = int(nms_faces.at<float>(i, 3));
+                
+                // check value range
+                x = std::max(x, 0);
+                y = std::max(y, 0);
+                w = std::min(w, frame_meta->frame.cols - x);
+                h = std::min(h, frame_meta->frame.rows - y);
+
+                auto kp1 = std::pair<int, int>(int(nms_faces.at<float>(i, 4)), int(nms_faces.at<float>(i, 5)));
+                auto kp2 = std::pair<int, int>(int(nms_faces.at<float>(i, 6)), int(nms_faces.at<float>(i, 7)));
+                auto kp3 = std::pair<int, int>(int(nms_faces.at<float>(i, 8)), int(nms_faces.at<float>(i, 9)));
+                auto kp4 = std::pair<int, int>(int(nms_faces.at<float>(i, 10)), int(nms_faces.at<float>(i, 11)));
+                auto kp5 = std::pair<int, int>(int(nms_faces.at<float>(i, 12)), int(nms_faces.at<float>(i, 13)));
+                auto score = nms_faces.at<float>(i, 14);
+
+                auto face_target = std::make_shared<vp_objects::vp_frame_face_target>(x, y, w, h, score, std::vector<std::pair<int, int>>{kp1, kp2, kp3, kp4, kp5});
+
+                frame_meta->face_targets.push_back(face_target);
+            }
+            
+        }
+    }
+
+    // refer to vp_infer_node::preprocess
+    void vp_yunet_face_detector_node::preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) {
+        cv::dnn::blobFromImages(mats_to_infer, blob_to_infer);
+    }
+    
+    // refer to vp_infer_node::infer
+    void vp_yunet_face_detector_node::infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) {
+        // blob_to_infer is a 4D matrix
+        // the first dim is number of batch, MUST be 1
+        assert(blob_to_infer.dims == 4);
+        assert(blob_to_infer.size[0] == 1);
+        assert(!net.empty());
+
+        net.setInput(blob_to_infer);
+        net.forward(raw_outputs, out_names);
+    }
+
+    void vp_yunet_face_detector_node::generatePriors() {
+        using namespace cv;
+        // Calculate shapes of different scales according to the shape of input image
+        Size feature_map_2nd = {
+            int(int((inputW+1)/2)/2), int(int((inputH+1)/2)/2)
+        };
+        Size feature_map_3rd = {
+            int(feature_map_2nd.width/2), int(feature_map_2nd.height/2)
+        };
+        Size feature_map_4th = {
+            int(feature_map_3rd.width/2), int(feature_map_3rd.height/2)
+        };
+        Size feature_map_5th = {
+            int(feature_map_4th.width/2), int(feature_map_4th.height/2)
+        };
+        Size feature_map_6th = {
+            int(feature_map_5th.width/2), int(feature_map_5th.height/2)
+        };
+
+        std::vector<Size> feature_map_sizes;
+        feature_map_sizes.push_back(feature_map_3rd);
+        feature_map_sizes.push_back(feature_map_4th);
+        feature_map_sizes.push_back(feature_map_5th);
+        feature_map_sizes.push_back(feature_map_6th);
+
+        // Fixed params for generating priors
+        const std::vector<std::vector<float>> min_sizes = {
+            {10.0f,  16.0f,  24.0f},
+            {32.0f,  48.0f},
+            {64.0f,  96.0f},
+            {128.0f, 192.0f, 256.0f}
+        };
+        CV_Assert(min_sizes.size() == feature_map_sizes.size()); // just to keep vectors in sync
+        const std::vector<int> steps = { 8, 16, 32, 64 };
+
+        // Generate priors
+        priors.clear();
+        for (size_t i = 0; i < feature_map_sizes.size(); ++i)
+        {
+            Size feature_map_size = feature_map_sizes[i];
+            std::vector<float> min_size = min_sizes[i];
+
+            for (int _h = 0; _h < feature_map_size.height; ++_h)
+            {
+                for (int _w = 0; _w < feature_map_size.width; ++_w)
+                {
+                    for (size_t j = 0; j < min_size.size(); ++j)
+                    {
+                        float s_kx = min_size[j] / inputW;
+                        float s_ky = min_size[j] / inputH;
+
+                        float cx = (_w + 0.5f) * steps[i] / inputW;
+                        float cy = (_h + 0.5f) * steps[i] / inputH;
+
+                        Rect2f prior = { cx, cy, s_kx, s_ky };
+                        priors.push_back(prior);
+                    }
+                }
+            }
+        }
+    }
+}
--- a/nodes/infers/vp_yunet_face_detector_node.h
+++ b/nodes/infers/vp_yunet_face_detector_node.h
@@ -0,0 +1,34 @@
+
+#pragma once
+
+#include "../vp_primary_infer_node.h"
+#include "../../objects/vp_frame_face_target.h"
+
+namespace vp_nodes {
+    // face detector based on YunNet
+    // https://github.com/opencv/opencv/blob/4.x/modules/objdetect/src/face_detect.cpp
+    // https://github.com/ShiqiYu/libfacedetection
+    class vp_yunet_face_detector_node: public vp_primary_infer_node
+    {
+    private:
+        // names of output layers in yunet
+        const std::vector<std::string> out_names = {"loc", "conf", "iou"};
+        float scoreThreshold = 0.7;
+        float nmsThreshold = 0.5;
+        int topK = 50;
+        int inputW;
+        int inputH;
+        std::vector<cv::Rect2f> priors;
+        void generatePriors();
+    protected:
+        // override infer and preprocess as yunet has a different logic
+        virtual void infer(const cv::Mat& blob_to_infer, std::vector<cv::Mat>& raw_outputs) override;
+        virtual void preprocess(const std::vector<cv::Mat>& mats_to_infer, cv::Mat& blob_to_infer) override;
+
+        virtual void postprocess(const std::vector<cv::Mat>& raw_outputs, const std::vector<std::shared_ptr<vp_objects::vp_frame_meta>>& frame_meta_with_batch) override;
+    public:
+        vp_yunet_face_detector_node(std::string node_name, std::string model_path, float score_threshold = 0.7, float nms_threshold = 0.5, int top_k = 50);
+        ~vp_yunet_face_detector_node();
+    };
+
+}
--- a/nodes/osd/README.md
+++ b/nodes/osd/README.md
@@ -0,0 +1,5 @@
+# Summary
+
+Two situations to create a new osd node:
+1. Need a new osd style for the same type of target
+2. OSD for different types of targets, such as `vp_frame_target` (by default), `vp_frame_pose_target`, `vp_frame_face_target` ...
--- a/nodes/osd/vp_ba_crossline_osd_node.cpp
+++ b/nodes/osd/vp_ba_crossline_osd_node.cpp
@@ -0,0 +1,91 @@
+
+#include "vp_ba_crossline_osd_node.h"
+
+namespace vp_nodes {
+    
+    vp_ba_crossline_osd_node::vp_ba_crossline_osd_node(std::string node_name, std::string font): vp_node(node_name) {
+        if (!font.empty()) {
+            ft2 = cv::freetype::createFreeType2();
+            ft2->loadFontData(font, 0);   
+        }      
+        this->initialized();
+    }
+    
+    vp_ba_crossline_osd_node::~vp_ba_crossline_osd_node() {
+        deinitialized();
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_ba_crossline_osd_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // operations on osd_frame
+        if (meta->osd_frame.empty()) {
+            meta->osd_frame = meta->frame.clone();
+        }
+
+        auto& canvas = meta->osd_frame;
+        // scan targets
+        for (auto& i : meta->targets) {
+            // track_id
+            auto id = std::to_string(i->track_id);
+            auto labels_to_display = i->primary_label;
+
+            // tracked
+            if (i->track_id != -1) {
+                labels_to_display = "#" + id + " " + labels_to_display;
+            }
+            
+            for (auto& label : i->secondary_labels) {
+                labels_to_display += "|" + label;
+            }
+            
+            // draw tracks if size>=2
+            if (i->tracks.size() >= 2) {
+                for (int n = 0; n < (i->tracks.size() - 1); n++) {
+                    auto p1 = i->tracks[n].track_point();
+                    auto p2 = i->tracks[n + 1].track_point();
+                    cv::line(canvas, cv::Point(p1.x, p1.y), cv::Point(p2.x, p2.y), cv::Scalar(0, 255, 255), 1, cv::LINE_AA);
+                }
+            }
+
+            cv::rectangle(canvas, cv::Rect(i->x, i->y, i->width, i->height), cv::Scalar(255, 255, 0), 2);
+            if (ft2 != nullptr) {
+                ft2->putText(canvas, labels_to_display, cv::Point(i->x, i->y), 20, cv::Scalar(255, 0, 255), cv::FILLED, cv::LINE_AA, true);
+            }
+            else {               
+                //cv::putText(canvas, labels_to_display, cv::Point(i->x, i->y), 1, 1, cv::Scalar(255, 0, 255));
+                int baseline = 0;
+                auto size = cv::getTextSize(labels_to_display, 1, 1, 1, &baseline);
+                vp_utils::put_text_at_center_of_rect(canvas, labels_to_display, cv::Rect(i->x, i->y - size.height, size.width, size.height), true, 1, 1, cv::Scalar(), cv::Scalar(179, 52, 255), cv::Scalar(179, 52, 255));
+            }
+
+            // scan sub targets
+            for (auto& sub_target: i->sub_targets) {
+                cv::rectangle(canvas, cv::Rect(sub_target->x, sub_target->y, sub_target->width, sub_target->height), cv::Scalar(255));
+                if (ft2 != nullptr) {
+                    ft2->putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 20, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA, true);
+                }
+                else {
+                    cv::putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 1, 1, cv::Scalar(0, 0, 255));
+                }
+            }
+        }
+        
+        /* crossline draw for current channel */
+        auto& total_crossline = all_total_crossline[meta->channel_index];      
+        auto& line = all_lines[meta->channel_index];
+        
+        // scan ba results and ONLY deal with crossline
+        for (auto& i : meta->ba_results) {
+            if (i->type == vp_objects::vp_ba_type::CROSSLINE) {
+                // line has 2 points
+                assert(i->involve_region_in_frame.size() == 2);
+                line = vp_objects::vp_line(i->involve_region_in_frame[0], i->involve_region_in_frame[1]);
+                total_crossline += 1;
+            }
+        }
+
+        // draw crossline data
+        cv::line(canvas, cv::Point(line.start.x, line.start.y), cv::Point(line.end.x, line.end.y), cv::Scalar(0, 255, 0), 2, cv::LINE_AA);
+        cv::putText(canvas, vp_utils::string_format("total crossline targets: [%d]", total_crossline), cv::Point(20, 20), 1, 2, cv::Scalar(0, 0, 255), 2);
+        return meta;
+    }
+}
--- a/nodes/osd/vp_ba_crossline_osd_node.h
+++ b/nodes/osd/vp_ba_crossline_osd_node.h
@@ -0,0 +1,26 @@
+#pragma once
+
+#include <map>
+#include <opencv2/freetype.hpp>
+#include "../vp_node.h"
+#include "../../objects/shapes/vp_point.h"
+#include "../../objects/shapes/vp_line.h"
+
+namespace vp_nodes {
+    // osd node for behaviour analysis of crossline
+    class vp_ba_crossline_osd_node: public vp_node
+    {
+    private:
+        // support chinese font
+        cv::Ptr<cv::freetype::FreeType2> ft2;
+
+        // support multi channels
+        std::map<int, int> all_total_crossline;
+        std::map<int, vp_objects::vp_line> all_lines;
+    protected:
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
+    public:
+        vp_ba_crossline_osd_node(std::string node_name,  std::string font = "");
+        ~vp_ba_crossline_osd_node();
+    };
+}
--- a/nodes/osd/vp_ba_jam_osd_node.cpp
+++ b/nodes/osd/vp_ba_jam_osd_node.cpp
@@ -0,0 +1,109 @@
+
+#include "vp_ba_jam_osd_node.h"
+
+namespace vp_nodes {
+    
+    vp_ba_jam_osd_node::vp_ba_jam_osd_node(std::string node_name, std::string font): vp_node(node_name) {
+        if (!font.empty()) {
+            ft2 = cv::freetype::createFreeType2();
+            ft2->loadFontData(font, 0);   
+        }      
+        this->initialized();
+    }
+    
+    vp_ba_jam_osd_node::~vp_ba_jam_osd_node() {
+        deinitialized();
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_ba_jam_osd_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // operations on osd_frame
+        if (meta->osd_frame.empty()) {
+            meta->osd_frame = meta->frame.clone();
+        }
+
+        auto& canvas = meta->osd_frame;
+        // scan targets
+        for (auto& i : meta->targets) {
+            // track_id
+            auto id = std::to_string(i->track_id);
+            auto labels_to_display = i->primary_label;
+
+            // tracked
+            if (i->track_id != -1) {
+                labels_to_display = "#" + id + " " + labels_to_display;
+            }
+            
+            for (auto& label : i->secondary_labels) {
+                labels_to_display += "|" + label;
+            }
+            
+            // draw tracks if size>=2
+            if (i->tracks.size() >= 2) {
+                for (int n = 0; n < (i->tracks.size() - 1); n++) {
+                    auto p1 = i->tracks[n].track_point();
+                    auto p2 = i->tracks[n + 1].track_point();
+                    cv::line(canvas, cv::Point(p1.x, p1.y), cv::Point(p2.x, p2.y), cv::Scalar(0, 255, 255), 1, cv::LINE_AA);
+                }
+            }
+
+            cv::rectangle(canvas, cv::Rect(i->x, i->y, i->width, i->height), cv::Scalar(255, 255, 0), 2);
+            if (ft2 != nullptr) {
+                ft2->putText(canvas, labels_to_display, cv::Point(i->x, i->y), 20, cv::Scalar(255, 0, 255), cv::FILLED, cv::LINE_AA, true);
+            }
+            else {               
+                //cv::putText(canvas, labels_to_display, cv::Point(i->x, i->y), 1, 1, cv::Scalar(255, 0, 255));
+                int baseline = 0;
+                auto size = cv::getTextSize(labels_to_display, 1, 1, 1, &baseline);
+                vp_utils::put_text_at_center_of_rect(canvas, labels_to_display, cv::Rect(i->x, i->y - size.height, size.width, size.height), true, 1, 1, cv::Scalar(), cv::Scalar(179, 52, 255), cv::Scalar(179, 52, 255));
+            }
+
+            // scan sub targets
+            for (auto& sub_target: i->sub_targets) {
+                cv::rectangle(canvas, cv::Rect(sub_target->x, sub_target->y, sub_target->width, sub_target->height), cv::Scalar(255));
+                if (ft2 != nullptr) {
+                    ft2->putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 20, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA, true);
+                }
+                else {
+                    cv::putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 1, 1, cv::Scalar(0, 0, 255));
+                }
+            }
+        }
+        
+        /* jam draw for current channel */    
+        auto& region = all_jam_regions[meta->channel_index];
+        auto& jam_result = all_jam_results[meta->channel_index];
+        auto& involve_ids = all_involve_ids[meta->channel_index];
+
+        // scan ba results and ONLY deal with stop and unstop
+        for (auto& i : meta->ba_results) {
+            if (i->type == vp_objects::vp_ba_type::JAM) {
+                region = i->involve_region_in_frame;
+                involve_ids = i->involve_target_ids_in_frame;
+                jam_result = true;
+            }
+            if (i->type == vp_objects::vp_ba_type::UNJAM) {
+                region = i->involve_region_in_frame;
+                involve_ids = i->involve_target_ids_in_frame;  // not used later 
+                jam_result = false;
+            }
+        }
+
+        // draw jam data
+        auto poly_vertexs = [&]() {
+            std::vector<cv::Point> vertexs;
+            for(auto& p: region) {
+                vertexs.push_back(cv::Point(p.x, p.y));
+            }
+            return vertexs;
+        };
+        if (jam_result) {
+            cv::polylines(canvas, poly_vertexs(), true, cv::Scalar(0, 0, 255), 2, cv::LINE_AA); // region
+            // targets in jan region, highlight
+            auto targets = meta->get_targets_by_ids(involve_ids);
+            for (auto& t: targets) {
+                cv::rectangle(canvas,  cv::Rect(t->x, t->y, t->width, t->height), cv::Scalar(0, 0, 255), 2, cv::LINE_AA);
+            }
+        }
+        return meta;
+    }
+}
--- a/nodes/osd/vp_ba_jam_osd_node.h
+++ b/nodes/osd/vp_ba_jam_osd_node.h
@@ -0,0 +1,27 @@
+#pragma once
+
+#include <map>
+#include <opencv2/freetype.hpp>
+#include "../vp_node.h"
+#include "../../objects/shapes/vp_point.h"
+#include "../../objects/shapes/vp_line.h"
+
+namespace vp_nodes {
+    // osd node for behaviour analysis of stop
+    class vp_ba_jam_osd_node: public vp_node
+    {
+    private:
+        // support chinese font
+        cv::Ptr<cv::freetype::FreeType2> ft2;
+
+        // support multi channels
+        std::map<int, std::vector<vp_objects::vp_point>> all_jam_regions;   // channel -> jam region
+        std::map<int, bool> all_jam_results;  // channel -> jam status
+        std::map<int, std::vector<int>> all_involve_ids;  // channel -> target ids when enter jam status
+    protected:
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
+    public:
+        vp_ba_jam_osd_node(std::string node_name,  std::string font = "");
+        ~vp_ba_jam_osd_node();
+    };
+}
--- a/nodes/osd/vp_ba_stop_osd_node.cpp
+++ b/nodes/osd/vp_ba_stop_osd_node.cpp
@@ -0,0 +1,114 @@
+
+#include "vp_ba_stop_osd_node.h"
+
+namespace vp_nodes {
+    
+    vp_ba_stop_osd_node::vp_ba_stop_osd_node(std::string node_name, std::string font): vp_node(node_name) {
+        if (!font.empty()) {
+            ft2 = cv::freetype::createFreeType2();
+            ft2->loadFontData(font, 0);   
+        }      
+        this->initialized();
+    }
+    
+    vp_ba_stop_osd_node::~vp_ba_stop_osd_node() {
+        deinitialized();
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_ba_stop_osd_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // operations on osd_frame
+        if (meta->osd_frame.empty()) {
+            meta->osd_frame = meta->frame.clone();
+        }
+
+        auto& canvas = meta->osd_frame;
+        // scan targets
+        for (auto& i : meta->targets) {
+            // track_id
+            auto id = std::to_string(i->track_id);
+            auto labels_to_display = i->primary_label;
+
+            // tracked
+            if (i->track_id != -1) {
+                labels_to_display = "#" + id + " " + labels_to_display;
+            }
+            
+            for (auto& label : i->secondary_labels) {
+                labels_to_display += "|" + label;
+            }
+            
+            // draw tracks if size>=2
+            if (i->tracks.size() >= 2) {
+                for (int n = 0; n < (i->tracks.size() - 1); n++) {
+                    auto p1 = i->tracks[n].track_point();
+                    auto p2 = i->tracks[n + 1].track_point();
+                    cv::line(canvas, cv::Point(p1.x, p1.y), cv::Point(p2.x, p2.y), cv::Scalar(0, 255, 255), 1, cv::LINE_AA);
+                }
+            }
+
+            cv::rectangle(canvas, cv::Rect(i->x, i->y, i->width, i->height), cv::Scalar(255, 255, 0), 2);
+            if (ft2 != nullptr) {
+                ft2->putText(canvas, labels_to_display, cv::Point(i->x, i->y), 20, cv::Scalar(255, 0, 255), cv::FILLED, cv::LINE_AA, true);
+            }
+            else {               
+                //cv::putText(canvas, labels_to_display, cv::Point(i->x, i->y), 1, 1, cv::Scalar(255, 0, 255));
+                int baseline = 0;
+                auto size = cv::getTextSize(labels_to_display, 1, 1, 1, &baseline);
+                vp_utils::put_text_at_center_of_rect(canvas, labels_to_display, cv::Rect(i->x, i->y - size.height, size.width, size.height), true, 1, 1, cv::Scalar(), cv::Scalar(179, 52, 255), cv::Scalar(179, 52, 255));
+            }
+
+            // scan sub targets
+            for (auto& sub_target: i->sub_targets) {
+                cv::rectangle(canvas, cv::Rect(sub_target->x, sub_target->y, sub_target->width, sub_target->height), cv::Scalar(255));
+                if (ft2 != nullptr) {
+                    ft2->putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 20, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA, true);
+                }
+                else {
+                    cv::putText(canvas, sub_target->label, cv::Point(sub_target->x, sub_target->y), 1, 1, cv::Scalar(0, 0, 255));
+                }
+            }
+        }
+        
+        /* stop draw for current channel */    
+        auto& region = all_stop_regions[meta->channel_index];
+        auto& total_stop = all_total_stop[meta->channel_index];
+        auto& stop_count = all_stop_count[meta->channel_index];
+
+        // scan ba results and ONLY deal with stop and unstop
+        for (auto& i : meta->ba_results) {
+            if (i->type == vp_objects::vp_ba_type::STOP) {
+                region = i->involve_region_in_frame;
+                stop_count++;
+                total_stop.push_back(i->involve_target_ids_in_frame[0]); // only 1 target
+            }
+            if (i->type == vp_objects::vp_ba_type::UNSTOP) {
+                region = i->involve_region_in_frame;
+                for (auto j = total_stop.begin(); j < total_stop.end();) {
+                    if (*j == i->involve_target_ids_in_frame[0]) {
+                        j = total_stop.erase(j);
+                        break;
+                    }
+                    j++;
+                }
+            }
+        }
+
+        // draw stop data
+        auto poly_vertexs = [&]() {
+            std::vector<cv::Point> vertexs;
+            for(auto& p: region) {
+                vertexs.push_back(cv::Point(p.x, p.y));
+            }
+            return vertexs;
+        };
+        cv::polylines(canvas, poly_vertexs(), true, cv::Scalar(0, 255, 0), 2, cv::LINE_AA);
+        // cv::putText(canvas, vp_utils::string_format("total stop targets: [%d]", stop_count), cv::Point(20, 20), 1, 2, cv::Scalar(0, 0, 255), 2);
+        
+        // stop targets, highlight
+        auto stop_targets = meta->get_targets_by_ids(total_stop);
+        for (auto& t: stop_targets) {
+            cv::rectangle(canvas,  cv::Rect(t->x, t->y, t->width, t->height), cv::Scalar(0, 0, 255), 2, cv::LINE_AA);
+        }
+        return meta;
+    }
+}
--- a/nodes/osd/vp_ba_stop_osd_node.h
+++ b/nodes/osd/vp_ba_stop_osd_node.h
@@ -0,0 +1,27 @@
+#pragma once
+
+#include <map>
+#include <opencv2/freetype.hpp>
+#include "../vp_node.h"
+#include "../../objects/shapes/vp_point.h"
+#include "../../objects/shapes/vp_line.h"
+
+namespace vp_nodes {
+    // osd node for behaviour analysis of stop
+    class vp_ba_stop_osd_node: public vp_node
+    {
+    private:
+        // support chinese font
+        cv::Ptr<cv::freetype::FreeType2> ft2;
+
+        // support multi channels
+        std::map<int, std::vector<int>> all_total_stop;  // channel -> stop target ids
+        std::map<int, std::vector<vp_objects::vp_point>> all_stop_regions;   // channel -> stop region
+        std::map<int, int> all_stop_count;  // channel -> stop count
+    protected:
+        virtual std::shared_ptr<vp_objects::vp_meta> handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override;
+    public:
+        vp_ba_stop_osd_node(std::string node_name,  std::string font = "");
+        ~vp_ba_stop_osd_node();
+    };
+}
--- a/nodes/osd/vp_cluster_node.cpp
+++ b/nodes/osd/vp_cluster_node.cpp
@@ -0,0 +1,214 @@
+#include "vp_cluster_node.h"
+#include "../../third_party/bhtsne/tsne.h"
+
+namespace vp_nodes {
+    vp_cluster_node::vp_cluster_node(std::string node_name, 
+                        bool use_tSNE, 
+                        std::vector<std::string> s_labels_to_display,
+                        int sampling_frequency,
+                        int min_sampling_width,
+                        int min_sampling_height,
+                        int max_sample_num_for_tsne,
+                        int max_sample_num_per_category):
+                        vp_node(node_name),
+                        use_tSNE(use_tSNE),
+                        s_labels_to_display(s_labels_to_display),
+                        sampling_frequency(sampling_frequency),
+                        min_sampling_width(min_sampling_width),
+                        min_sampling_height(min_sampling_height),
+                        max_sample_num_for_tsne(max_sample_num_for_tsne),
+                        max_sample_num_per_category(max_sample_num_per_category) {
+        this->initialized();
+    }
+    
+    vp_cluster_node::~vp_cluster_node() {
+        deinitialized();        
+    }
+
+    // please refer to ../../third_party/trt_vehicle/main/vehicle_cluster.cpp
+    void vp_cluster_node::reduce_dims_using_tsne(std::vector<std::vector<float>>& low_features,
+                                int no_dims, int max_iter, double perplexity, 
+                                double theta, int rand_seed, bool skip_random_init, 
+                                int stop_lying_iter, int mom_switch_iter) {
+        assert(cache_high_features.size() != 0);
+        auto N = cache_high_features.size();
+        auto D = cache_high_features[0].second.size();  // all the same as the first feature's dims
+
+        // prepare input
+        double data[N * D];
+        double Y[N * no_dims];
+        for (int i = 0; i < N; i++) {
+            for (int j = 0; j < D; j++) {
+                data[i * D + j] = cache_high_features[i].second[j];
+            }
+        }
+        
+        // call t-SNE
+        TSNE::run(data, N, D, Y, no_dims, perplexity, theta, rand_seed, skip_random_init, max_iter, stop_lying_iter, mom_switch_iter);
+
+        // prepare output
+        for (int i = 0; i < N; i++) {
+            std::vector<float> low_dims_feature;
+            for (int j = 0; j < no_dims; j++) {
+                low_dims_feature.push_back(float(Y[i * no_dims + j]));
+            }
+            low_features.push_back(low_dims_feature);
+        }
+    }
+
+
+    std::shared_ptr<vp_objects::vp_meta> vp_cluster_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) {
+        // sampling frequency
+        if (std::chrono::duration_cast<std::chrono::milliseconds>(NOW - last_sampling_time) < std::chrono::milliseconds(sampling_frequency)) {
+            return meta;
+        }
+        last_sampling_time = NOW;
+        
+        /* t-SNE */
+        if (use_tSNE) {
+            for (int i = 0; i < meta->targets.size(); i++) {
+                if (meta->targets[i]->width < min_sampling_width || meta->targets[i]->height < min_sampling_height) {
+                    continue;
+                }
+                
+                if (cache_high_features.size() >= max_sample_num_for_tsne) {
+                    // too many cache, we need control number of samples
+                    cache_high_features.erase(cache_high_features.begin());
+                }
+                // save mat->feature pair to cache
+                cv::Mat m;
+                cv::Mat roi(meta->frame, cv::Rect(meta->targets[i]->x, meta->targets[i]->y, meta->targets[i]->width, meta->targets[i]->height));
+                roi.copyTo(m);
+                std::pair<cv::Mat, std::vector<float>> p {m, meta->targets[i]->embeddings};
+                cache_high_features.push_back(p);
+            }
+
+            // at least 10 samples to t-SNE
+            if (cache_high_features.size() >= 10) {
+                // reduce dims first
+                std::vector<std::vector<float>> low_features;
+                reduce_dims_using_tsne(low_features);
+
+                // now display on screen
+                // normalize low dims feature to coordinate of [0:1] and display them on 2D screen
+                auto max_x = 0.0f, max_y = 0.0f, min_x = 0.0f, min_y = 0.0f;
+                for(int i = 0; i < low_features.size(); i++) {
+                    auto& f = low_features[i];
+                    // 2 values in f
+                    max_x = std::max(max_x, f[0]);
+                    max_y = std::max(max_y, f[1]);
+                    min_x = std::min(min_x, f[0]);
+                    min_y = std::min(min_y, f[1]);
+                }
+                auto x_range = max_x - min_x;
+                auto y_range = max_y - min_y;
+
+                // draw on (tsne_canvas_w_h + tsne_thumbnail_w_h) * (tsne_canvas_w_h + tsne_thumbnail_w_h)
+                cv::Mat canvas(tsne_canvas_w_h + tsne_thumbnail_w_h, tsne_canvas_w_h + tsne_thumbnail_w_h, CV_8UC3, cv::Scalar(127, 127, 127));
+                for(int i = 0; i < low_features.size(); i++) {
+                    auto& f = low_features[i];
+                    // convert to [0:1]
+                    f[0] = (f[0] - min_x) / x_range;
+                    f[1] = (f[1] - min_y) / y_range;
+
+                    auto& img = cache_high_features[i].first;
+                    cv::Mat img_tmp;
+                    cv::resize(img, img_tmp, cv::Size(tsne_thumbnail_w_h, tsne_thumbnail_w_h));
+                    cv::rectangle(img_tmp, cv::Rect(0, 0, img_tmp.cols, img_tmp.rows), cv::Scalar(255, 0, 0));
+                    cv::Mat roi(canvas, cv::Rect(int(f[0] * tsne_canvas_w_h), int(f[1] * tsne_canvas_w_h), tsne_thumbnail_w_h, tsne_thumbnail_w_h));
+                    img_tmp.copyTo(roi);
+                }
+
+                cv::imshow("cluster using features powered by t-SNE", canvas);
+            }
+        }
+
+        /* categories */
+        for (int i = 0; i < meta->targets.size(); i++) {
+            auto& t = meta->targets[i];
+            if (t->width < min_sampling_width || t->height < min_sampling_height) {
+                continue;
+            }
+
+            cv::Mat m;
+            cv::Mat roi(meta->frame, cv::Rect(t->x, t->y, t->width, t->height));
+            roi.copyTo(m);
+
+            for (int j = 0; j < t->secondary_labels.size(); j++) {
+                auto& category = t->secondary_labels[j];
+                bool filter_pass = false;
+                // all
+                if (s_labels_to_display.size() == 0) {
+                    cache_categories[category].push_back(m);
+                    filter_pass = true;
+                }
+                else {
+                    // has a filter
+                    if (std::find(s_labels_to_display.begin(), s_labels_to_display.end(), category) != s_labels_to_display.end()) {
+                        cache_categories[category].push_back(m);
+                        filter_pass = true;
+                    }
+                }
+
+                if (filter_pass) {                
+                    // too many cache, we need control number of samples
+                    if (cache_categories[category].size() >= max_sample_num_per_category) {
+                        cache_categories[category].erase(cache_categories[category].begin());
+                    }
+                }
+            }
+        }
+        // display on screen
+        // calculate total number of rows
+        auto rows_num = 0; 
+        for (auto& p: cache_categories) {
+            auto num = p.second.size() / category_num_per_row;
+            if (p.second.size() % category_num_per_row != 0 || num == 0) {
+                num++;
+            }
+            rows_num += num;
+        }
+        
+        // draw on (category_canvas_w) * (category_canvas_h)
+        auto category_canvas_w = (category_thumbnail_w_h + category_gap) * (category_num_per_row + 1);
+        auto category_canvas_h = (category_thumbnail_w_h + category_gap) * (rows_num + 1);
+        cv::Mat canvas(category_canvas_h, category_canvas_w, CV_8UC3, cv::Scalar(127, 127, 127));
+
+        auto row_index = 0;
+        for (auto& p: cache_categories) {
+            auto col_index = 0;
+            auto& category_items = p.second;
+            auto& category_name = p.first;
+            cv::putText(canvas, category_name, cv::Point(10, (category_gap + category_thumbnail_w_h) * row_index + category_gap), 1, 1, cv::Scalar(0, 0, 255));
+
+            bool new_row = false;
+            for (int i = 0; i < category_items.size(); i++) {
+                cv::Mat img_tmp;
+                cv::resize(category_items[i], img_tmp, cv::Size(category_thumbnail_w_h, category_thumbnail_w_h));
+                cv::Mat roi(canvas, cv::Rect((category_gap + category_thumbnail_w_h) * col_index + category_gap, (category_gap + category_thumbnail_w_h) * row_index + category_gap, category_thumbnail_w_h, category_thumbnail_w_h));
+                img_tmp.copyTo(roi);
+
+                col_index++;
+                if ((col_index + 1) % category_num_per_row == 0) {
+                    col_index = 0;
+                    row_index ++;
+                    new_row = true;
+                } 
+                else {
+                    new_row = false;
+                }
+            }
+            if (!new_row) {
+                row_index++;
+            }
+        }
+
+        cv::imshow("cluster using labels", canvas);
+
+        return meta;
+    }
+
+    std::shared_ptr<vp_objects::vp_meta> vp_cluster_node::handle_control_meta(std::shared_ptr<vp_objects::vp_control_meta> meta) {
+        return meta;
+    }
+}
--- a/Show More
+++ b/Show More