IbPRIA 2025: 12th Iberian Conference on Pattern Recognition and Image Analysis
Coimbra, Portugal. June 30 - July 3, 2025
IbPRIA 2025 Accepted Papers
Oral Session 4 - Machine and Deep Learning 2

Causal-SHAP: Feature Selection with Explainability and Causal Analysis
Asmae Lamsaf, Pranita Samale, João C. Neves
Abstract:
Feature selection is a critical step in machine learning, particularly when dealing with high-dimensional datasets that contain redundant or irrelevant features. Traditional feature selection methods rely on correlation-based selection, which can misidentify important features due to spurious relationships rather than true causal effects. In this paper, we introduce Causal-SHAP, a novel feature selection method that integrates explainability with causal analysis. Our approach combines SHAP values, which measure feature importance, with causal analysis, which determines whether a feature has a genuine causal impact on the target variable rather than merely contributing to predictive performance. By merging these two techniques, Causal-SHAP reduces dataset complexity while keeping only the most meaningful features, leading to better model accuracy and interpretability. The experimental validation on diverse datasets and machine learning tasks demonstrates that our method outperforms conventional feature selection techniques. Our findings underscore the importance of causality in feature selection, paving the way for more robust and trustworthy machine learning models.
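The general idea of fusing an importance score with a causal score can be sketched as follows. This is an illustrative toy, not the authors' actual Causal-SHAP algorithm: the `shap_importance` and `causal_effect` arrays stand in for per-feature mean-absolute SHAP values and estimated causal effects, and the equal-weight averaging is an assumed fusion rule.

```python
import numpy as np

def select_features(shap_importance, causal_effect, k):
    """Rank features by a blend of importance and causal scores, keep top k.

    Both inputs are per-feature, non-negative arrays; the 50/50 blend below
    is a hypothetical fusion rule chosen for illustration only.
    """
    s = shap_importance / shap_importance.sum()          # normalize importances
    c = np.abs(causal_effect) / np.abs(causal_effect).sum()  # normalize effects
    combined = 0.5 * s + 0.5 * c
    return np.argsort(combined)[::-1][:k]                # indices of top-k features

# Toy scores: feature 0 is important and causal, feature 2 is important
# but only weakly causal, so the blend demotes it.
shap_imp = np.array([0.6, 0.1, 0.2, 0.1])
causal   = np.array([0.1, 0.5, 0.3, 0.1])
selected = select_features(shap_imp, causal, k=2)
print(selected)  # → [0 1]
```

A correlation-only selector would have kept feature 2 over feature 1 here; weighting in the causal score reverses that choice, which is the intuition the abstract describes.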

Impact of label-level noise on multi-label learning: a case study on the k-Nearest Neighbor classifier
Antonio Requena, Antonio Javier Gallego, Jose J. Valero-Mas
Abstract:
Multi-label classification represents the learning paradigm that categorizes a given instance with an undetermined number of labels. While this framework has progressively gained attention as its formulation naturally suits many real-life tasks, numerous limitations still hinder its practical application. This work addresses one such limitation by analyzing the impact of label-level noise on k-Nearest Neighbor (kNN)-based multi-label classifiers. Specifically, it examines six different noise induction mechanisms at the label level and assesses their effects on six representative multi-label datasets and three well-known kNN-based classifiers, along with three mechanisms for improving their efficiency. The results show that while some recognition strategies are severely affected by label-level noise, others naturally exhibit greater robustness. Moreover, the effectiveness of each approach varies depending on the specific type of noise, highlighting the need for tailored mitigation strategies.
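A simple example of label-level noise in the multi-label setting is independent bit-flipping on the binary label matrix. The function below is a generic illustration of one such induction mechanism, not one of the six specific mechanisms studied in the paper.

```python
import numpy as np

def flip_label_noise(Y, rate, rng):
    """Flip each label bit of a multi-label indicator matrix independently.

    Y    : (n_samples, n_labels) binary matrix, one column per label.
    rate : probability of flipping any individual bit (a simple noise model;
           the paper evaluates several label-level mechanisms).
    """
    mask = rng.random(Y.shape) < rate
    return np.where(mask, 1 - Y, Y)

rng = np.random.default_rng(42)
Y = np.array([[1, 0, 1],
              [0, 1, 0]])
print(flip_label_noise(Y, 0.0, rng))  # rate 0: labels unchanged
print(flip_label_noise(Y, 1.0, rng))  # rate 1: every bit flipped
```

Because the noise acts on individual label bits rather than on whole instances, an example can end up with a partly correct label set, which is exactly the regime where kNN-based multi-label classifiers behave differently from one another.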

Learning to Detect and Describe a Wireframe
Iván Ferre, Luis Baumela, Iago Suárez
Abstract:
Image matching is essential in computer vision, enabling applications such as pose estimation, object retrieval, 3D reconstruction, and SLAM. These applications consist of three phases: Front-End (feature detection and descriptor extraction), Middle-End (feature matching and filtering), and Back-End (global map integration). This entire pipeline is usually designed under a given computational budget. For on-device inference, sparse methods efficiently implement the Front-End by representing images as keypoints and lines that can be associated in a wireframe. We propose a unified neural network that produces wireframes with meaningful descriptors associated with each node of the wireframe, reducing the computational cost and simplifying the process. We also show that our model can be trained using multi-task distillation from two state-of-the-art point and line extractors. We show competitive results in the accuracy vs. efficiency trade-off.

Using LoRA and Reinforcement Learning in Interactive Machine Translation
Ángel Navarro, Francisco Casacuberta
Abstract:
The use of large language models (LLMs) is rapidly expanding due to their impressive performance across various tasks. However, as newer versions continue to improve results, their increasing size poses challenges for maintaining multiple domain-specialized versions. The Low-Rank Adaptation (LoRA) method offers a solution to this limitation by enabling fine-tuning modifications to be stored in a file of just a few megabytes, significantly reducing storage requirements. In Machine Translation (MT), models are often specialized for specific domains or language pairs. In our case, we apply these models within Interactive Machine Translation (IMT), where generating high-quality translations and adapting effectively to user modifications are crucial. We integrate Reinforcement Learning (RL) techniques, optimizing the model with respect to various evaluation metrics. Our results demonstrate that these methods effectively improve translation quality; however, in some cases, this improvement comes at the cost of a slight reduction in generalization capability.
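The storage saving LoRA provides comes from replacing a full weight update with a low-rank factorization, W' = W + (α/r)·BA. The sketch below illustrates that arithmetic on a single weight matrix; the dimensions, rank, and scaling factor are illustrative choices, not values from the paper.

```python
import numpy as np

d, k, r = 1024, 1024, 8                  # layer shape and LoRA rank (example values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))          # frozen pretrained weight, never updated
A = rng.standard_normal((r, k)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                     # zero-initialized, so training starts from W
alpha = 16                               # LoRA scaling hyperparameter

delta = (alpha / r) * (B @ A)            # rank-r update; only A and B are stored
W_adapted = W + delta                    # equals W exactly while B is still zero

# Parameters saved per adapter vs. a full fine-tuned copy of the layer:
ratio = (A.size + B.size) / W.size
print(ratio)  # → 0.015625, i.e. ~1.6% of the full matrix
```

One such adapter file per domain or language pair is what makes it practical to keep many specialized variants of one base model, which is the limitation the abstract points at.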


Publisher

Endorsed by
IAPR

Technical Sponsors
AERFAI
APRP