VISUAL OBJECT DETECTION BY COLOR, SHAPE AND DIMENSION (DETECCIÓN VISUAL DE OBJETOS POR COLOR, FORMA Y DIMENSIÓN)

Alejandro Israel Barranco Gutiérrez, Saúl Martínez Díaz, Juan Prado Olivarez

Abstract
3D neural object detection by color, shape and dimension (3DOD-CSD) is a tool for building object detection systems in environments with uncontrolled illumination. Inspired by the global structure of the human visual system, it uses a neural network in the classification stage and determines the physical dimensions of the object's features using commercial digital cameras calibrated in a stereo configuration. This makes it possible to tell apart objects that have the same shape and color but different dimensions, such as scaled replicas or photographs of a photograph of a 3D object. The method does not require a fixed distance between the camera and the object under analysis, which is essential for a dynamic recognition system operating in changing conditions. The results show strong discrimination between desired and undesired objects. Possible applications include face identification and object selection in varying environments, utility pole detection, and coin detection.
Keywords: Color, dimensional features, invariant features, neural network, stereo vision, 3D object recognition.
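
The abstract's claim that physical dimension separates same-looking objects at any distance can be illustrated with a minimal sketch. It assumes an idealized, rectified stereo pair (parallel optical axes, one shared focal length expressed in pixels, a known baseline), which is a simplification and not the calibration procedure used in the paper. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (m) of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


def physical_width(width_px, focal_px, baseline_m, disparity_px):
    """Physical width (m) of a feature from its width in pixels and its disparity.

    Pinhole model: width_px = f * W / Z, so W = width_px * Z / f.
    """
    depth_m = depth_from_disparity(focal_px, baseline_m, disparity_px)
    return width_px * depth_m / focal_px


if __name__ == "__main__":
    # Assumed (hypothetical) camera parameters.
    FOCAL_PX = 800.0    # focal length in pixels
    BASELINE_M = 0.10   # distance between the two cameras in meters

    # Two objects that both appear 80 pixels wide in the image:
    # one is farther away (disparity 40 px), the other closer (disparity 80 px).
    far_object = physical_width(80, FOCAL_PX, BASELINE_M, 40)
    near_object = physical_width(80, FOCAL_PX, BASELINE_M, 80)

    print(f"far object width:  {far_object:.3f} m")
    print(f"near object width: {near_object:.3f} m")
```

In this sketch both objects occupy the same number of pixels, so a single camera could not separate them by apparent size. The disparity differs, so the recovered physical widths differ as well, which is why a scaled replica can be rejected without fixing the camera-to-object distance.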

Resumen
La detección de objetos 3D por color, forma y dimensión (3DOD-CSD) es una herramienta para construir sistemas de detección de objetos en entornos con iluminación incontrolada. Inspirado en la estructura global del sistema visual humano, utiliza una red neuronal en la etapa de clasificación y determina la dimensión física de las características del objeto utilizando cámaras digitales comerciales calibradas en configuración estéreo. Esto permite el análisis de imágenes de objetos con la misma forma y color, pero de diferentes dimensiones, como réplicas a escala o fotografías de una fotografía de un objeto 3D. Con este método, no es necesaria una distancia fija entre la cámara y el objeto a analizar, algo esencial para un sistema de reconocimiento dinámico en condiciones cambiantes. Los resultados muestran una fuerte discriminación entre objetos deseados y no deseados. Este sistema tiene muchas aplicaciones posibles, incluida la identificación de rostros y la selección de objetos en entornos variantes, detección de postes, detección de monedas, entre otros.
Palabras Clave: Color, características dimensionales, características invariantes, reconocimiento de objetos 3D, red neuronal, visión estéreo.

Full text: 100-119 (PDF)

License URL: https://creativecommons.org/licenses/by/3.0/deed.es


Pistas Educativas is licensed under a Creative Commons Attribution 3.0 Unported License.

TECNOLÓGICO NACIONAL DE MÉXICO / INSTITUTO TECNOLÓGICO DE CELAYA

Antonio García Cubas Pte #600 esq. Av. Tecnológico, Celaya, Gto. México

Tel. 461 61 17575 Ext. 5450 and 5146

pistaseducativas@itcelaya.edu.mx

http://pistaseducativas.celaya.tecnm.mx/index.php/pistas