XLNet with Data Augmentation to Profile Cryptocurrency Influencers
Siino M.
			2023-01-01
Abstract
In this work we propose an application of XLNet to the task hosted at PAN@CLEF 2023 on Profiling Cryptocurrency Influencers with Few-shot Learning. Our approach fine-tunes XLNet on an augmented version of the training set provided for the competition. Given the few-shot nature of the task, we found it useful to employ a data augmentation strategy similar to one proposed in a previous edition of a PAN task: each sample in the training set is augmented with its back-translated version through a target language. The target languages used for our two submissions were German and Italian. After fine-tuning, the XLNet model predicts the labels for the unlabeled test set. We also evaluated the fine-tuned model on the original, non-augmented training set, computing the F1 score for each label and reporting the Macro F1 across all labels. Our results show that on the original training set our approach achieves a maximum Macro F1 of 0.6937 and a maximum accuracy of 0.6893.
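The back-translation augmentation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' actual code: the `translate` callable is a hypothetical stand-in for a real machine-translation model (the paper round-trips English through German or Italian), and the dataset format is assumed to be simple (text, label) pairs.

```python
# Sketch of back-translation data augmentation (hypothetical interfaces:
# `translate(text, src, tgt)` stands in for a real MT model).

def back_translate(text: str, translate) -> str:
    """Round-trip a text through a target language (e.g. English -> German -> English)."""
    pivot = translate(text, src="en", tgt="de")
    return translate(pivot, src="de", tgt="en")

def augment_dataset(samples, translate):
    """Return the original samples plus their back-translated copies,
    keeping each sample's label unchanged (doubles the dataset size)."""
    augmented = list(samples)
    for text, label in samples:
        augmented.append((back_translate(text, translate), label))
    return augmented

if __name__ == "__main__":
    # Toy usage with an identity "translator" standing in for a real MT model.
    fake_translate = lambda text, src, tgt: text
    data = [("sample tweet one", "label_a"), ("sample tweet two", "label_b")]
    print(len(augment_dataset(data, fake_translate)))  # prints 4
```

In the real pipeline, the augmented set would then be used to fine-tune XLNet, and the paraphrases produced by the round-trip translation provide lexical variety that the small few-shot training set lacks.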
| File | Type | License | Size | Format |
|---|---|---|---|---|
| paper-231.pdf | Publisher's version (PDF), open access | Creative commons | 959.19 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


