-
GUDIE: a flexible, user-defined method to extract subgraphs of interest from large graphs
Authors:
Maria Inês Silva,
David Aparício,
Beatriz Malveiro,
João Tiago Ascensão,
Pedro Bizarro
Abstract:
Large, dense, small-world networks often emerge from social phenomena, including financial networks, social media, or epidemiology. As networks grow in importance, it is often necessary to partition them into meaningful units of analysis. In this work, we propose GUDIE, a message-passing algorithm that extracts relevant context around seed nodes based on user-defined criteria. We design GUDIE for rich, labeled graphs, and expansions consider node and edge attributes. Preliminary results indicate that GUDIE expands to insightful areas while avoiding unimportant connections. The resulting subgraphs contain the relevant context for a seed node and can accelerate and extend analysis capabilities in finance and other critical networks.
Submitted 20 August, 2021;
originally announced August 2021.
-
Finding NeMo: Fishing in banking networks using network motifs
Authors:
Xavier Fontes,
David Aparício,
Maria Inês Silva,
Beatriz Malveiro,
João Tiago Ascensão,
Pedro Bizarro
Abstract:
Banking fraud causes billion-dollar losses for banks worldwide. In fraud detection, graphs help to understand complex transaction patterns and to discover new fraud schemes. This work explores graph patterns in a real-world transaction dataset by extracting and analyzing its network motifs. Since banking graphs are heterogeneous, we focus on heterogeneous network motifs. Additionally, we propose a novel network randomization process that generates valid banking graphs. From our exploratory analysis, we conclude that network motifs extract insightful and interpretable patterns.
Submitted 10 August, 2021;
originally announced August 2021.
-
A mechanism of Individualistic Indirect Reciprocity with internal and external dynamics
Authors:
Mario Ignacio González Silva,
Ricardo Armando González Silva,
Héctor Alfonso Juárez López,
Antonio Aguilera Ontiveros
Abstract:
The cooperation mechanism of indirect reciprocity has been studied by making multiple variations of its parts. This research proposes a new variant of the Nowak and Sigmund model, focused on agents' attitude; it is called Individualistic Indirect Reciprocity. In our model, an agent reinforces its strategy to the extent to which it makes a profit. We also include conditions related to the environment, visibility of agents, cooperation demand, and the attitude of an agent to maintain its cooperation strategy. Using an agent-based model and a data science method, we show through simulation results that the discriminatory stance of the agents prevails in most cases. In general, cooperators only appear in conditions with low visibility of reputation and a high degree of cooperation demand. The results also show that when the reputation of others is unknown, with high obstinacy and high cooperation demand, a heterogeneous society is obtained. The simulations show a wide diversity of scenarios: centralized, polarized, and mixed societies.
Submitted 28 May, 2021;
originally announced May 2021.
-
GuiltyWalker: Distance to illicit nodes in the Bitcoin network
Authors:
Catarina Oliveira,
João Torres,
Maria Inês Silva,
David Aparício,
João Tiago Ascensão,
Pedro Bizarro
Abstract:
Money laundering is a global phenomenon with wide-reaching social and economic consequences. Cryptocurrencies are particularly susceptible due to their anonymity and the lack of control by authorities. Thus, it is important to develop new techniques to detect and prevent illicit cryptocurrency transactions. In our work, we propose new features based on the structure of the graph and past labels to boost the performance of machine learning methods to detect money laundering. Our method, GuiltyWalker, performs random walks on the Bitcoin transaction graph and computes features based on the distance to illicit transactions. We combine these new features with features proposed by Weber et al. and observe an improvement of about 5pp regarding illicit classification. Namely, we observe that our proposed features are particularly helpful during a black market shutdown, where the algorithm by Weber et al. performed poorly.
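The idea of deriving features from random-walk distances to illicit transactions can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the adjacency-dictionary graph format, the walk parameters, and the chosen aggregates (`min`, `mean`, `hit_rate`) are all assumptions made for illustration.

```python
import random

def walk_distances(graph, illicit, start, num_walks=10, max_steps=20, seed=0):
    """Return, for each random walk from `start`, the number of steps taken
    to reach a node in `illicit`, or None if the walk never reaches one.

    `graph` is a hypothetical adjacency dict: node -> list of neighbours.
    """
    rng = random.Random(seed)
    distances = []
    for _ in range(num_walks):
        node, hit = start, None
        for step in range(1, max_steps + 1):
            neighbours = graph.get(node, [])
            if not neighbours:
                break  # dead end: the walk stops without a hit
            node = rng.choice(neighbours)
            if node in illicit:
                hit = step
                break
        distances.append(hit)
    return distances

def distance_features(distances):
    """Aggregate walk distances into a few illustrative features."""
    hits = [d for d in distances if d is not None]
    if not hits:
        return {"min": None, "mean": None, "hit_rate": 0.0}
    return {
        "min": min(hits),
        "mean": sum(hits) / len(hits),
        "hit_rate": len(hits) / len(distances),
    }

# Toy chain a -> b -> c, where c is labelled illicit.
toy_graph = {"a": ["b"], "b": ["c"], "c": []}
features = distance_features(walk_distances(toy_graph, {"c"}, "a", num_walks=3))
```

In a setting like the one the abstract describes, such aggregates would be appended to each transaction's existing feature vector before training a classifier.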
Submitted 21 July, 2021; v1 submitted 10 February, 2021;
originally announced February 2021.
-
TripMD: Driving patterns investigation via Motif Analysis
Authors:
Maria Inês Silva,
Roberto Henriques
Abstract:
Processing driving data and investigating driving behavior have received increasing interest in recent decades, with applications ranging from car insurance pricing to policy making. A common strategy to analyze driving behavior is to study the maneuvers being performed by the driver. In this paper, we propose TripMD, a system that extracts the most relevant driving patterns from sensor recordings (such as acceleration) and provides a visualization that allows for an easy investigation. Additionally, we test our system using the UAH-DriveSet dataset, a publicly available naturalistic driving dataset. We show that (1) our system can extract a rich number of driving patterns from a single driver that are meaningful to understand driving behaviors and (2) our system can be used to identify the driving behavior of an unknown driver from a set of drivers whose behavior we know.
Submitted 5 July, 2021; v1 submitted 7 July, 2020;
originally announced July 2020.
-
Machine learning methods to detect money laundering in the Bitcoin blockchain in the presence of label scarcity
Authors:
Joana Lorenz,
Maria Inês Silva,
David Aparício,
João Tiago Ascensão,
Pedro Bizarro
Abstract:
Every year, criminals launder billions of dollars acquired from serious felonies (e.g., terrorism, drug smuggling, or human trafficking), harming countless people and economies. Cryptocurrencies, in particular, have developed as a haven for money laundering activity. Machine Learning can be used to detect these illicit patterns. However, labels are so scarce that traditional supervised algorithms are inapplicable. Here, we address money laundering detection assuming minimal access to labels. First, we show that existing state-of-the-art solutions using unsupervised anomaly detection methods are inadequate to detect the illicit patterns in a real Bitcoin transaction dataset. Then, we show that our proposed active learning solution is capable of matching the performance of a fully supervised baseline by using just 5% of the labels. This solution mimics a typical real-life situation in which a limited number of labels can be acquired through manual annotation by experts.
Submitted 5 October, 2021; v1 submitted 29 May, 2020;
originally announced May 2020.
-
Exploring time-series motifs through DTW-SOM
Authors:
Maria Inês Silva,
Roberto Henriques
Abstract:
Motif discovery is a fundamental step in data mining tasks for time-series data such as clustering, classification and anomaly detection. Even though many papers have addressed the problem of how to find motifs in time-series by proposing new motif discovery algorithms, not much work has been done on the exploration of the motifs extracted by these algorithms. In this paper, we argue that visually exploring time-series motifs computed by motif discovery algorithms can be useful to understand and debug results. To explore the output of motif discovery algorithms, we propose the use of an adapted Self-Organizing Map, the DTW-SOM, on the list of motif centers. In short, DTW-SOM is a vanilla Self-Organizing Map with three main differences, namely (1) the use of the Dynamic Time Warping distance instead of the Euclidean distance, (2) the adoption of two new network initialization routines (a random sample initialization and an anchor initialization) and (3) the adjustment of the Adaptation phase of the training to work with variable-length time-series sequences. We test DTW-SOM on a synthetic motif dataset and two real time-series datasets from the UCR Time Series Classification Archive. After an exploration of results, we conclude that DTW-SOM is capable of extracting relevant information from a set of motifs and displaying it in a visualization that is space-efficient.
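The Dynamic Time Warping distance that replaces the Euclidean distance in point (1) can be sketched with the textbook dynamic-programming recurrence. This is a generic illustration, not the DTW-SOM code: the function name `dtw_distance` and the absolute-difference local cost are assumptions, and the sketch handles one-dimensional sequences only.

```python
import math

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW between two 1-D numeric sequences.

    Unlike the Euclidean distance, the sequences may differ in length.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]

# A sequence and a time-stretched copy of it align at zero cost.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

Because the alignment can repeat elements, a motif and a slower or faster version of the same shape stay close under this distance, which is what makes it suitable for variable-length sequences.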
Submitted 17 April, 2020;
originally announced April 2020.
-
Finding manoeuvre motifs in vehicle telematics
Authors:
Maria Inês Silva,
Roberto Henriques
Abstract:
Driving behaviour has a great impact on road safety. A popular way of analysing driving behaviour is to move the focus to the manoeuvres, as they give useful information about the driver who is performing them. In this paper, we investigate a new way of identifying manoeuvres from vehicle telematics data, through motif detection in time-series. We implement a modified version of the Extended Motif Discovery (EMD) algorithm, a classical variable-length motif detection algorithm for time-series, and apply it to the UAH-DriveSet, a publicly available naturalistic driving dataset. After a systematic exploration of the extracted motifs, we conclude that the EMD algorithm is capable of extracting not only simple manoeuvres such as accelerations, brakes and curves, but also more complex manoeuvres, such as lane changes and overtaking manoeuvres, which validates motif discovery as a worthwhile line for future research.
Submitted 10 February, 2020;
originally announced February 2020.