What is Wrong with Visual Brain Decoding? A Saliency-based Investigation
Mohammad Moradi;Morteza Moradi;Marco Grassia;Giuseppe Mangioni
2025-01-01
Abstract
Recent advancements in diffusion-based image generation and large vision/language models have revolutionized visual brain decoding (VBD), driving progress in neuroscience and brain-computer interfaces. While state-of-the-art models produce high-quality reconstructed images, a significant semantic gap remains between original stimuli and reconstructed images, posing challenges for applications like forensics, medical treatments, and human-robot interactions. This gap arises from VBD models’ limitations in interpreting brain signals and generating accurate representations. To address this, we analyze the issue through the lens of salient object detection, statistically comparing the similarity between visual stimuli and reconstructed images with a focus on salient objects. To our knowledge, this is the first study to evaluate fMRI-based VBD models from this perspective. Our findings provide measurable insights to guide the development of VBD models that align more closely with human perception.


