From Questions to Neural Insights: Towards Query-Based fMRI Decoding

Finocchiaro M.; Calcagno S.; Sorrenti A.; Pennisi M.; Salanitri F. P.; Spampinato C.
2025-01-01

Abstract

Decoding brain activity to extract meaningful visual and semantic representations remains a fundamental challenge in neuroscience. In this study, we introduce a novel multimodal framework that integrates BLIP-2, a vision-language model, with Stable Diffusion to decode functional magnetic resonance imaging (fMRI) data. Our approach translates neural signals into structured textual descriptions, extracts semantic information to guide visual reconstruction, and enables fine-grained neural feature analysis. By aligning fMRI-derived embeddings with a shared visual-textual space, we capture hierarchical information ranging from low-level perceptual features to high-level semantic abstractions. Experimental results demonstrate the efficacy of this framework in preserving key visual attributes within neural representations, highlighting its potential for advancing fMRI-based multimodal decoding and providing deeper insights into the neural mechanisms underlying visual perception.

Clinical relevance: This work provides a further step toward the development of tools for decoding and reconstructing mental imagery, which could aid practicing clinicians in diagnosing and treating neurological conditions. By enabling a deeper understanding of how the brain processes and represents visual and semantic information, this framework holds potential for applications in rehabilitation, early detection of cognitive decline, and the development of brain-to-text interfaces for communication in patients with severe motor disabilities.

Files in this item:
From_Questions_to_Neural_Insights_Towards_Query-Based_fMRI_Decoding.pdf
Access: archive managers only
Type: Publisher's version (PDF)
License: Not public (private/restricted access)
Size: 2.11 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/721052