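In production, automated grasping devices often replace manual handling, and how accurately they grasp depends largely on how accurately the vision system recognizes the target. In vision-guided grasping, whether a target can be grasped accurately depends on obtaining a precise outline of its edges.
This article uses an instance segmentation example to segment the edge contours of the target objects (blocks) so that they can be grasped.
For how to annotate the data, see: 24.人工智能:计算机视觉任务——数据格式和标注 (24. Artificial Intelligence: Computer Vision Tasks, Data Formats and Annotation).
Note: data annotated with LabelMe must first be converted to MSCOCO format before it can be used to train an instance segmentation model.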
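To make the target format concrete, here is a minimal sketch of what such a conversion produces for a single polygon. It is not the PaddleX converter: the function name `polygon_to_coco_annotation` and the sample square are hypothetical, and only the standard COCO annotation fields (`segmentation`, `bbox`, `area`, `iscrowd`) are assumed.

```python
def polygon_to_coco_annotation(points, image_id, category_id, ann_id):
    """Build one COCO-style annotation dict from a LabelMe-style polygon.

    `points` is a list of [x, y] vertices, the way LabelMe stores a shape.
    Hypothetical helper for illustration only.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    width, height = max(xs) - x_min, max(ys) - y_min

    # Shoelace formula for the polygon area.
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    area = abs(twice_area) / 2.0

    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores each polygon as a flat [x1, y1, x2, y2, ...] list.
        "segmentation": [[float(c) for p in points for c in p]],
        # COCO boxes are [x_min, y_min, width, height].
        "bbox": [float(x_min), float(y_min), float(width), float(height)],
        "area": area,
        "iscrowd": 0,
    }


# Hypothetical square block of 100 x 100 pixels.
ann = polygon_to_coco_annotation(
    [[10, 20], [110, 20], [110, 120], [10, 120]],
    image_id=1, category_id=1, ann_id=1)
print(ann["bbox"], ann["area"])  # [10.0, 20.0, 100.0, 100.0] 10000.0
```

In practice the conversion command handles this for the whole dataset, along with the `images` and `categories` lists that a COCO file also needs.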
>paddlex --data_conversion --source labelme --to MSCOCO --pics block-grab/JPEGImages --annotations block-grab/Annotations --save_dir block-grab-coco
Split the dataset:
>paddlex --split_dataset --format COCO --dataset_dir block-grab-coco --val_value 0.2 --test_value 0.1
[Figure: split result]
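The effect of the two ratios can be illustrated with a small sketch. This is not how PaddleX implements the split internally: the function `split_ids`, the fixed seed, and the rounding rule are assumptions made for the example.

```python
import random


def split_ids(image_ids, val_ratio=0.2, test_ratio=0.1, seed=0):
    """Shuffle image ids and cut them into train / val / test lists.

    Hypothetical helper; the rounding behaviour is an assumption.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_val = round(len(ids) * val_ratio)
    n_test = round(len(ids) * test_ratio)
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]
    return train, val, test


# A hypothetical dataset of 30 images.
train, val, test = split_ids(range(30))
print(len(train), len(val), len(test))  # 21 6 3
```

For a dataset of 30 images this gives 21 training, 6 validation, and 3 test images, which would be consistent with the 21 steps per epoch (at batch size 1) and the 6 evaluation samples that appear in the training log later in this article.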
II. Model Selection and Training
The MaskRCNN model is used here.
import paddlex as pdx
from paddlex import transforms as T

# Define the preprocessing and augmentation used for training and evaluation
train_transforms = T.Compose([
    T.RandomResizeByShort(
        short_sizes=[640, 672, 704, 736, 768, 800],
        max_size=1333,
        interp='CUBIC'),
    T.RandomHorizontalFlip(),
    T.Normalize(
        mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
eval_transforms = T.Compose([
    T.ResizeByShort(
        short_size=800, max_size=1333, interp='CUBIC'),
    T.Normalize(
        mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
# Define the datasets used for training and validation
train_dataset = pdx.datasets.CocoDetection(
    data_dir='block-grab-coco/JPEGImages',
    ann_file='block-grab-coco/train.json',
    transforms=train_transforms,
    shuffle=True)
eval_dataset = pdx.datasets.CocoDetection(
    data_dir='block-grab-coco/JPEGImages',
    ann_file='block-grab-coco/val.json',
    transforms=eval_transforms)
# Initialize the model and train it
num_classes = len(train_dataset.labels)
model = pdx.det.MaskRCNN(
    num_classes=num_classes, backbone='ResNet50', with_fpn=True)
model.train(
    num_epochs=12,
    train_dataset=train_dataset,
    train_batch_size=1,
    eval_dataset=eval_dataset,
    pretrain_weights='COCO',
    learning_rate=0.00125,
    lr_decay_epochs=[8, 11],
    warmup_steps=10,
    warmup_start_lr=0.0,
    save_dir='output/mask_rcnn_r50_fpn-block',
    use_vdl=False)
Partial training log:
......
2022-06-28 10:34:56 [INFO] There are 301/307 variables loaded into MaskRCNN.
2022-06-28 10:34:58 [INFO] [TRAIN] Epoch=1/12, Step=10/21, loss_mask=0.133326, loss_rpn_cls=0.038073, loss_rpn_reg=0.011600, loss_bbox_cls=0.521337, loss_bbox_reg=0.898016, loss=1.602352, lr=0.001125, time_each_step=0.21s, eta=0:0:52
2022-06-28 10:34:59 [INFO] [TRAIN] Epoch=1/12, Step=20/21, loss_mask=0.137111, loss_rpn_cls=0.039009, loss_rpn_reg=0.016852, loss_bbox_cls=0.335344, loss_bbox_reg=0.753059, loss=1.281375, lr=0.001250, time_each_step=0.15s, eta=0:0:36
2022-06-28 10:35:00 [INFO] [TRAIN] Epoch 1 finished, loss_mask=1.0921271, loss_rpn_cls=0.19929981, loss_rpn_reg=0.03650659, loss_bbox_cls=0.50937736, loss_bbox_reg=0.80119514, loss=2.6385062 .
2022-06-28 10:35:00 [INFO] Start to evaluate(total_samples=6, total_steps=6)...
2022-06-28 10:35:17 [INFO] Start evaluate...
......
2022-06-28 10:37:09 [INFO] [EVAL] Finished, Epoch=12, bbox_mmap=0.916755, segm_mmap=0.947328 .
2022-06-28 10:37:09 [INFO] Current evaluated best model on eval_dataset is epoch_10, bbox_mmap=0.9170318016259392
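2022-06-28 10:37:10 [INFO] Model saved in output/mask_rcnn_r50_fpn-block/epoch_12.
Judging from the final training results on the 6 validation images, the best model (epoch 10) reaches bbox_mmap=0.9170318016259392 and the last epoch reports segm_mmap=0.947328, which is a very good result.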
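The `lr` values in the log follow from the warm-up and decay settings passed to `model.train`. The sketch below reproduces them under stated assumptions: linear warm-up counted from a 0-based step, step decay applied once training passes each epoch in `lr_decay_epochs`, and a decay factor `gamma=0.1`, which is assumed here because the training call does not set it. The function `learning_rate_at` is hypothetical and is not part of PaddleX.

```python
def learning_rate_at(step, steps_per_epoch, base_lr=0.00125,
                     warmup_steps=10, warmup_start_lr=0.0,
                     decay_epochs=(8, 11), gamma=0.1):
    """Return the learning rate for a 0-based global training step.

    Hypothetical helper; gamma and the decay boundaries are assumptions.
    """
    if step < warmup_steps:
        # Linear warm-up from warmup_start_lr towards base_lr.
        frac = step / warmup_steps
        return warmup_start_lr + (base_lr - warmup_start_lr) * frac
    epoch = step // steps_per_epoch  # 0-based epoch index
    # Multiply by gamma once for every decay boundary already passed.
    n_decays = sum(1 for e in decay_epochs if epoch >= e)
    return base_lr * gamma ** n_decays


steps_per_epoch = 21  # taken from "Step=10/21" in the log
for label, step in [("epoch 1, step 10", 9),
                    ("epoch 1, step 20", 19),
                    ("epoch 9, first step", 8 * steps_per_epoch),
                    ("epoch 12, first step", 11 * steps_per_epoch)]:
    print(f"{label}: lr={learning_rate_at(step, steps_per_epoch):.7f}")
```

Under these assumptions the first two values come out as 0.001125 and 0.00125, matching the `lr` fields logged at Step=10/21 and Step=20/21. The later values depend on the assumed `gamma` and should be checked against the full log.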
III. Model Export and Prediction
Export the deployment model:
>paddlex --export_inference --model_dir best_model --save_dir infer
import os

import cv2
import paddlex as pdx

# os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
# Directory of the exported inference model; adjust it to where the export step saved the model
predictor = pdx.deploy.Predictor(
    "output/mask_rcnn_r50-block/inference_model", use_gpu=True)
image_name = "block-grab-coco/JPEGImages/Image_20210615204354811.bmp"
img = cv2.imread(image_name)
result = predictor.predict(img)
# print(result)
# Print the category, bounding box, and score of every detection scoring at least 0.5
for dt in result:
    cname, bbox, score = dt['category'], dt['bbox'], dt['score']
    if score < 0.5:
        continue
    print(cname, bbox, score)
# Draw the detections above the same threshold and show them in a window
vis_img = pdx.det.visualize(img, result, threshold=0.5, save_dir=None)
cv2.namedWindow("result", cv2.WINDOW_NORMAL)
cv2.imshow("result", vis_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Prediction output: the position and confidence of each target, which are used for grasping.
total_time(ms): 73650.5, img_num: 1, batch_size: 1
average latency time(ms): 73650.50, QPS: 0.013578
preprocess_time_per_im(ms): 327.80, inference_time_per_batch(ms): 72922.20, postprocess_time_per_im(ms): 400.50
block [1082.9652099609375, 499.8653259277344, 272.3438720703125, 261.7666931152344] 0.9946463704109192
block [924.103515625, 1490.95556640625, 245.0869140625, 248.14501953125] 0.9959415793418884
block [1395.4515380859375, 1393.887451171875, 253.7200927734375, 248.3074951171875] 0.9967671632766724
block [1078.8236083984375, 826.9075927734375, 258.23779296875, 250.3233642578125] 0.9965422749519348
block [1498.30615234375, 1070.516357421875, 236.033203125, 248.6375732421875] 0.9964565634727478
block [1445.41015625, 812.9586791992188, 245.572265625, 242.78192138671875] 0.9943506717681885
block [1165.2928466796875, 1146.5408935546875, 242.5567626953125, 250.6881103515625] 0.9964064955711365
[Figure: prediction result]
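One simple way to turn these detections into grasp targets is to take the centre of each bounding box. The sketch below assumes the boxes are laid out as [x_min, y_min, width, height], which the values above suggest. The function `grasp_points` is hypothetical, the first two detections are copied from the output above, and the third, low-score entry is made up to show the filtering.

```python
def grasp_points(results, score_threshold=0.5):
    """Return (category, center_x, center_y, score) tuples, best score first.

    Hypothetical helper; assumes bbox = [x_min, y_min, width, height].
    """
    points = []
    for det in results:
        if det["score"] < score_threshold:
            continue  # skip low-confidence detections
        x_min, y_min, width, height = det["bbox"]
        center_x = x_min + width / 2.0
        center_y = y_min + height / 2.0
        points.append((det["category"], center_x, center_y, det["score"]))
    # Most confident detection first.
    return sorted(points, key=lambda p: p[3], reverse=True)


results = [
    {"category": "block", "score": 0.9946463704109192,
     "bbox": [1082.9652099609375, 499.8653259277344,
              272.3438720703125, 261.7666931152344]},
    {"category": "block", "score": 0.9959415793418884,
     "bbox": [924.103515625, 1490.95556640625,
              245.0869140625, 248.14501953125]},
    # Made-up low-confidence detection, dropped by the threshold.
    {"category": "block", "score": 0.2, "bbox": [0.0, 0.0, 10.0, 10.0]},
]

pts = grasp_points(results)
for category, cx, cy, score in pts:
    print(f"{category}: center=({cx:.1f}, {cy:.1f}), score={score:.4f}")
```

These centres are in image pixels. Mapping them to robot coordinates requires a camera-to-robot calibration that this article does not cover, and a centre or orientation derived from the predicted mask would follow the object outline more closely than the box centre.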
This article mainly demonstrates how instance segmentation can be applied in real production and offers one possible approach.