Visual representation learning enables computers or systems to simulate the function of the retinas, optic nerves, and visual cortex in the human brain and to derive meaningful information from digital images, videos, and other visual inputs. Learning effective representations of visual data is essential for many computer vision and artificial intelligence applications, ranging from energy and utilities to manufacturing and automotive. Current popular deep learning-based visual representation learning methods do not fully consider the nature of the biological visual nervous system and lack interpretability. To address visual representation well, psychological or neuroscientific approaches need to be integrated to enhance the cognition of visual data.
Visual representation extraction is a complex neural process, and the performance of a visual representation is closely tied to the specific vision task. In the past decades, deep learning-based approaches have achieved state-of-the-art performance on a broad range of topics, such as visual classification, detection, tracking, retrieval, segmentation, and video understanding. Despite attracting a surge of research interest, some core issues in visual representation remain open problems, such as how to design and optimize efficient 2D/3D visual representation models and architectures for complex scenes, and how to handle large amounts of heterogeneous or unlabeled visual data. This Research Topic aims to provide a platform for exchanging research work, technical trends, and practical experience related to visual representation learning theories and their applications, using deep neural networks together with psychological and neuroscientific approaches.
The Editors solicit original papers and especially encourage submissions that integrate multiple approaches to achieve a more complete understanding. Topics of interest include (but are not limited to):
1. Neuroscience-driven visual representation learning theories/algorithms.
2. Efficient visual representation architectures through model compression, distillation, or other mechanisms.
3. Low-level visual processing (e.g., image/video feature detection and matching, noise reduction, and enhancement).
4. High-level visual semantic understanding (e.g., classification, detection, tracking, registration, segmentation, recognition, and affective computing).
5. Medical/biomedical image analysis.
6. Multi-modal learning with visual data (e.g., visual captioning, visual grounding, image/video-text retrieval, and cross-modal visual generation).
7. Visual representation learning-inspired applications for robots, human-computer interaction, and autonomous driving.
8. New datasets and problems for visual representation learning and visual understanding.