IbPRIA 2025: 12th Iberian Conference on Pattern Recognition and Image Analysis

Coimbra, Portugal. June 30 - July 3, 2025

IbPRIA 2025 Accepted Papers

Oral Session 5 - Applications

Deciphering the Silent Signals: Unveiling Frequency Importance for Wi-Fi-Based Human Pose Estimation with Explainability

Leonardo Capozzi, Leonardo Ferreira, Tiago Gonçalves, Ana Rebelo, Jaime S. Cardoso, Ana F. Sequeira

Abstract:
The rapid advancement of wireless technologies, particularly Wi-Fi, has spurred significant research into indoor human activity detection across various domains (e.g., healthcare, security, and industry). This work explores the non-invasive and cost-effective Wi-Fi paradigm and the application of deep learning for human activity recognition using Wi-Fi signals. Focusing on the challenges in machine interpretability, motivated by the increase in data availability and computational power, this paper uses explainable artificial intelligence to understand the inner workings of transformer-based deep neural networks designed to estimate human pose (i.e., human skeleton key points) from Wi-Fi channel state information. Using different strategies to assess the most relevant sub-carriers (i.e., rollout attention and masking attention) for the model predictions, we evaluate the performance of the model when it uses a given number of sub-carriers as input, selected randomly or by ascending (high-attention) or descending (low-attention) order. We concluded that the models trained with fewer (but relevant) sub-carriers are competitive with the baseline (trained with all sub-carriers) but better in terms of computational efficiency (i.e., processing more data per second).
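
The sub-carrier ranking described in this abstract can be illustrated with a minimal sketch of attention rollout followed by a top-k selection. It is not the authors' implementation: it assumes each transformer token corresponds to one sub-carrier, that the attention matrices are already averaged over heads, and the function names (`attention_rollout`, `top_k_subcarriers`) are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(layers):
    """Combine per-layer attention maps into one token-to-token relevance map.

    layers: list of n x n head-averaged attention matrices, first layer first.
    Each layer is mixed with the identity (to account for the residual
    connection) and the mixed matrices are multiplied across layers.
    """
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0)
                  for j in range(n)] for i in range(n)]
        rollout = matmul(mixed, rollout)
    return rollout


def top_k_subcarriers(rollout, k):
    """Return the indices of the k tokens (assumed: sub-carriers) that
    receive the most attention on average, in increasing index order."""
    n = len(rollout)
    scores = [sum(rollout[i][j] for i in range(n)) / n for j in range(n)]
    ranked = sorted(range(n), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy example: every token attends mostly to token 2.
attn = [[0.1, 0.1, 0.8] for _ in range(3)]
rollout = attention_rollout([attn])
selected = top_k_subcarriers(rollout, 1)
```

A model restricted to the selected indices would then be trained and compared with the all-sub-carrier baseline, as the abstract describes.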

Efficient Malicious UAV Detection Using Autoencoder-TSMamba Integration

Azim Akhtarshenas, Ramin Toosi, David López-Pérez, Tohid Alizadeh, Alireza Hosseini

Abstract:
Malicious Unmanned Aerial Vehicles (UAVs) present a significant threat to next-generation networks (NGNs), posing risks such as unauthorized surveillance, data theft, and the delivery of hazardous materials. This paper proposes an integrated autoencoder (AE)-classifier system to detect malicious UAVs. The proposed AE, based on a 4-layer Tri-orientated Spatial Mamba (TSMamba) architecture, effectively captures complex spatial relationships crucial for identifying malicious UAV activities. The first phase involves generating residual values through the AE, which are subsequently processed by a ResNet-based classifier. This classifier leverages the residual values to achieve lower complexity and higher accuracy. Our experiments demonstrate significant improvements in both binary and multi-class classification scenarios, achieving up to 99.8% recall compared to 96.7% in the benchmark. Additionally, our method reduces computational complexity, making it more suitable for large-scale deployment. These results highlight the robustness and scalability of our approach, offering an effective solution for malicious UAV detection in NGN environments.

On the Correction of GFS Wind Speed Forecasts in Portugal Using LSTM Networks

Vasco Gomes, David Carvalho, Sónia Gouveia

Abstract:
Correcting forecast errors improves weather prediction accuracy, which is vital for optimizing decision-making in renewable energy production. This study investigates the application of Long Short-Term Memory (LSTM) models to correct 24-hour wind speed forecast errors produced by the Global Forecast System (GFS), a physical meteorological model. The focus is on the forecast error, quantified as the difference between observed and predicted values. The dataset includes 6-hourly data over a 24-hour forecasting horizon at 20 locations across Portugal. LSTM models are trained to predict forecast errors, which are subsequently used to generate a corrected time series by applying a mean bias correction to GFS forecasts and adding the LSTM-predicted error. This corrected series is then compared to the original GFS forecasts. The results show that the corrected GFS forecast yields relevant improvements in performance (RMSE, MAE and R²) in comparison with the GFS forecast, with the largest improvements observed in locations where GFS traditionally underperforms, particularly in high-altitude regions. This methodology shows promise in improving wind speed forecasting accuracy for renewable energy applications in Portugal, enhancing operational efficiency and supporting sustainable energy goals.
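
The correction step and the three evaluation metrics can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the error is taken as observed minus forecast, the LSTM output is assumed to be the error remaining after the mean bias is removed, and `predicted_error` below is a made-up stand-in for that model output.

```python
import math


def mean_bias(observed, forecast):
    """Average of (observed - forecast) over the training period."""
    return sum(o - f for o, f in zip(observed, forecast)) / len(observed)


def correct_forecast(forecast, bias, predicted_error):
    """Corrected series = raw forecast + mean bias + model-predicted error."""
    return [f + bias + e for f, e in zip(forecast, predicted_error)]


def rmse(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))


def mae(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy wind speeds in m/s (illustrative values only).
observed = [5.0, 6.0, 7.0, 8.0]
gfs = [4.0, 5.5, 5.5, 7.5]
bias = mean_bias(observed, gfs)
predicted_error = [0.1, -0.3, 0.5, -0.3]  # hypothetical LSTM output
corrected = correct_forecast(gfs, bias, predicted_error)
```

In practice the bias would be estimated on training data only, so that the comparison between `corrected` and `gfs` on held-out data stays fair.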

Statistical Downscaling of Wind Speed in the Iberian Peninsula Using Machine Learning

João Vieitas, José Contente, David Carvalho, Sónia Gouveia

Abstract:
Accurate wind speed forecasting is essential for optimizing renewable energy production and supporting climate resilience initiatives. This study explores the application of machine learning, specifically Long Short-Term Memory (LSTM) networks, for downscaling wind speed data from a coarse 100 km grid resolution to a finer 10 km grid. Using a dataset spanning 30 years from Portugal and a small part of Spain, the new model is based on historical information on both coarse- and fine-resolution wind speed data to improve the quality of the predictions. A one-for-all model was developed using randomly selected, equal-sized samples from each location and day of the year, drawn from the available data. The model training and validation were conducted via 5-fold cross-validation. The model including both fine and coarse information showed the best results, achieving an RMSE of 0.52 ± 0.01 and R² of 0.53 ± 0.01 over validation. Furthermore, the spatial distribution of predicted wind speeds is closely aligned with observed patterns, further confirming the ability of the model to adequately capture the spatial variations of wind speed. Additionally, residuals from the model exhibited no statistically significant autocorrelation, indicating that the model has successfully captured the underlying temporal evolution of the time series data. These findings highlight the importance of combining coarse- and fine-resolution data to enhance wind speed predictions, with implications for renewable energy production and climate studies.
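
The residual check mentioned in this abstract can be illustrated with a minimal sketch of the sample autocorrelation function. The abstract does not say which test the authors used; the sketch assumes the common approximate 95% white-noise bound of ±1.96/√n, and the function names are hypothetical.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def significant_lags(residuals, max_lag):
    """Lags whose autocorrelation exceeds the approximate 95% bound
    expected for white noise (assumed test, see lead-in)."""
    bound = 1.96 / math.sqrt(len(residuals))
    return [k for k in range(1, max_lag + 1)
            if abs(autocorrelation(residuals, k)) > bound]


# Residuals that still contain a trend are flagged at lag 1.
trending = [float(t) for t in range(50)]
flagged = significant_lags(trending, 3)
```

An empty list from `significant_lags` would be consistent with residuals that carry no remaining temporal structure at the tested lags.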

Using gait to monitor health: an experimental baseline

Jorge Zafra-Palma, Nuria Marín-Jiménez, José Castro-Piñero, Magdalena Cuenca-García, Rafael Muñoz-Salinas, Manuel J. Marín-Jimenez

Abstract:
In this work, we explore the use of gait as an indicator of health and anthropometric measurements. We support the idea that with the progress of Computer Vision techniques, the visual information associated with the walking pattern of people may encode useful information that can be estimated in a non-invasive way, i.e., without the need for attached or task-specific expensive sensors. The main contribution of this work is an experimental baseline that shows promising results in the inference tasks of demographic and anthropometric characteristics based on visual information and gait parameters obtained from the recent Health & Gait dataset. We employ various data representations, including silhouette, semantic segmentation, and optical flow, as well as gait parameters measured by sensor-based systems and estimated from video-based pose information. Our findings highlight the potential of gait analysis as a valuable tool for biometric identification and health monitoring, offering a non-intrusive and accessible alternative to traditional methods.

Zipf's Curves of Plainchant Melodies

Aitana Menárguez-Box, Enrique Vidal, Alejandro H. Toselli

Abstract:
Just as knowing the words in a text is important, identifying melodic patterns in musical scores is of utmost importance in many applications, such as indexing for information search and retrieval. We explore here methods to set up appropriate musical units that extend beyond individual notes into higher-level units of communication (i.e., melodic patterns, dubbed "musical words"). This prospect is studied following two main ideas for musical encoding, understood as the process of transforming musical content into musical text based on a musical word vocabulary: 1) based on the association of notes of musical scores and corresponding lyrics text (available in some datasets annotated with such explicit alignments); 2) based on the unsupervised grouping of individual notes into note sequences using a technique known as Byte Pair Encoding (BPE). To assess the appropriateness of these musical encodings, we resort to Zipf's curves of the encoded musical texts and measure how close such curves are to a Zipfian law. Results suggest that the musical words derived from the alignment with lyrics words produce the most natural melodic patterns and that better results are achieved if the encoding is based on note pitch intervals rather than notes encoded as absolute pitch symbols. Additionally, BPE appears to be a promising unsupervised way to encode musical content into sequences of fully automatically discovered musical words.
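
The unsupervised encoding route can be sketched in three steps: convert pitches to intervals, group frequent adjacent intervals with Byte Pair Encoding, and rank the resulting "musical words" by frequency to obtain a Zipf curve. This is a minimal illustration, not the authors' code: pitches are assumed to be integer semitone numbers, and all function names are hypothetical.

```python
from collections import Counter


def pitch_intervals(pitches):
    """Differences between consecutive pitches (in semitones)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def bpe_merge(sequences, num_merges):
    """Repeatedly merge the most frequent adjacent token pair.

    Tokens are tuples of symbols, so a merged token keeps its contents.
    Stops early when no pair occurs at least twice.
    """
    seqs = [[(s,) for s in seq] for seq in sequences]
    for _ in range(num_merges):
        pairs = Counter()
        for seq in seqs:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break
        merged_seqs = []
        for seq in seqs:
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
                    out.append(left + right)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            merged_seqs.append(out)
        seqs = merged_seqs
    return seqs


def zipf_curve(seqs):
    """(rank, frequency) pairs, most frequent token first."""
    counts = Counter(tok for seq in seqs for tok in seq)
    return [(rank, freq)
            for rank, (_, freq) in enumerate(counts.most_common(), start=1)]


intervals = pitch_intervals([60, 62, 60, 62, 60, 62, 60])
words = bpe_merge([intervals], 1)
curve = zipf_curve(words)
```

Plotting `curve` on log-log axes and comparing it with a straight line is one way to judge how close an encoding comes to a Zipfian law.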
