Showing 1–16 of 16 results for author: Aparício, D

  1. arXiv:2309.08647  [pdf, other]

    cs.CL cs.AI

    Intent Detection at Scale: Tuning a Generic Model using Relevant Intents

    Authors: Nichal Narotamo, David Aparicio, Tiago Mesquita, Mariana Almeida

    Abstract: Accurately predicting the intent of customer support requests is vital for efficient support systems, enabling agents to quickly understand messages and prioritize responses accordingly. While different approaches exist for intent detection, maintaining separate client-specific or industry-specific models can be costly and impractical as the client base expands. This work proposes a system to sc…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 6 pages, 6 tables, 2 figures, ICMLA 2023

  2. arXiv:2308.15239  [pdf, other]

    cs.AI

    Natural language to SQL in low-code platforms

    Authors: Sofia Aparicio, Samuel Arcadinho, João Nadkarni, David Aparício, João Lages, Mariana Lourenço, Bartłomiej Matejczyk, Filipe Assunção

    Abstract: One of the developers' biggest challenges in low-code platforms is retrieving data from a database using SQL queries. Here, we propose a pipeline allowing developers to write natural language (NL) to retrieve data. In this study, we collect, label, and validate data covering the SQL queries most often performed by OutSystems users. We use that data to train a NL model that generates SQL. Alongside…

    Submitted 29 August, 2023; originally announced August 2023.

  3. arXiv:2307.13787  [pdf, other]

    cs.LG cs.CR

    The GANfather: Controllable generation of malicious activity to improve defence systems

    Authors: Ricardo Ribeiro Pereira, Jacopo Bono, João Tiago Ascensão, David Aparício, Pedro Ribeiro, Pedro Bizarro

    Abstract: Machine learning methods to aid defence systems in detecting malicious activity typically rely on labelled data. In some domains, such labelled data is unavailable or incomplete. In practice this can lead to low detection rates and high false positive rates, which characterise for example anti-money laundering systems. In fact, it is estimated that 1.7--4 trillion euros are laundered annually and…

    Submitted 25 July, 2023; originally announced July 2023.

  4. arXiv:2307.08433  [pdf, other]

    cs.LG

    From random-walks to graph-sprints: a low-latency node embedding framework on continuous-time dynamic graphs

    Authors: Ahmad Naser Eddin, Jacopo Bono, David Aparício, Hugo Ferreira, João Ascensão, Pedro Ribeiro, Pedro Bizarro

    Abstract: Many real-world datasets have an underlying dynamic graph structure, where entities and their interactions evolve over time. Machine learning models should consider these dynamics in order to harness their full potential in downstream tasks. Previous approaches for graph representation learning have focused on either sampling k-hop neighborhoods, akin to breadth-first search, or random walks, akin…

    Submitted 16 February, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 9 pages, 5 figures, 7 tables

  5. arXiv:2209.10254  [pdf, other]

    cs.LG cs.DB

    T5QL: Taming language models for SQL generation

    Authors: Samuel Arcadinho, David Aparício, Hugo Veiga, António Alegria

    Abstract: Automatic SQL generation has been an active research area, aiming at streamlining the access to databases by writing natural language with the given intent instead of writing SQL. Current SOTA methods for semantic parsing depend on LLMs to achieve high predictive accuracy on benchmark datasets. This reduces their applicability, since LLMs require expensive GPUs. Furthermore, SOTA methods are ungr…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: 11 pages, 5 figures

  6. arXiv:2112.07508  [pdf, ps, other]

    cs.LG

    Anti-Money Laundering Alert Optimization Using Machine Learning with Graphs

    Authors: Ahmad Naser Eddin, Jacopo Bono, David Aparício, David Polido, João Tiago Ascensão, Pedro Bizarro, Pedro Ribeiro

    Abstract: Money laundering is a global problem that concerns legitimizing proceeds from serious felonies (1.7-4 trillion euros annually) such as drug dealing, human trafficking, or corruption. The anti-money laundering systems deployed by financial institutions typically comprise rules aligned with regulatory frameworks. Human investigators review the alerts and report suspicious cases. Such systems suffer…

    Submitted 17 June, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: 8 pages, 5 figures

    MSC Class: I.2.1; J.4

  7. arXiv:2108.09200  [pdf, other]

    cs.SI

    GUDIE: a flexible, user-defined method to extract subgraphs of interest from large graphs

    Authors: Maria Inês Silva, David Aparício, Beatriz Malveiro, João Tiago Ascensão, Pedro Bizarro

    Abstract: Large, dense, small-world networks often emerge from social phenomena, including financial networks, social media, or epidemiology. As networks grow in importance, it is often necessary to partition them into meaningful units of analysis. In this work, we propose GUDIE, a message-passing algorithm that extracts relevant context around seed nodes based on user-defined criteria. We design GUDIE for…

    Submitted 20 August, 2021; originally announced August 2021.

    Comments: 16 pages, 8 figures, accepted at GEM2021

  8. arXiv:2108.04494  [pdf, other]

    cs.SI

    Finding NeMo: Fishing in banking networks using network motifs

    Authors: Xavier Fontes, David Aparício, Maria Inês Silva, Beatriz Malveiro, João Tiago Ascensão, Pedro Bizarro

    Abstract: Banking fraud causes billion-dollar losses for banks worldwide. In fraud detection, graphs help understand complex transaction patterns and discover new fraud schemes. This work explores graph patterns in a real-world transaction dataset by extracting and analyzing its network motifs. Since banking graphs are heterogeneous, we focus on heterogeneous network motifs. Additionally, we propose a no…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: 6 pages, 6 figures, accepted at SEAData 2021

  9. arXiv:2102.05373  [pdf, other]

    cs.LG cs.SI

    GuiltyWalker: Distance to illicit nodes in the Bitcoin network

    Authors: Catarina Oliveira, João Torres, Maria Inês Silva, David Aparício, João Tiago Ascensão, Pedro Bizarro

    Abstract: Money laundering is a global phenomenon with wide-reaching social and economic consequences. Cryptocurrencies are particularly susceptible due to the lack of control by authorities and their anonymity. Thus, it is important to develop new techniques to detect and prevent illicit cryptocurrency transactions. In our work, we propose new features based on the structure of the graph and past labels to…

    Submitted 21 July, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: 5 pages, 3 figures

  10. arXiv:2005.14635  [pdf, other]

    cs.LG stat.ML

    Machine learning methods to detect money laundering in the Bitcoin blockchain in the presence of label scarcity

    Authors: Joana Lorenz, Maria Inês Silva, David Aparício, João Tiago Ascensão, Pedro Bizarro

    Abstract: Every year, criminals launder billions of dollars acquired from serious felonies (e.g., terrorism, drug smuggling, or human trafficking) harming countless people and economies. Cryptocurrencies, in particular, have developed as a haven for money laundering activity. Machine Learning can be used to detect these illicit patterns. However, labels are so scarce that traditional supervised algorithms a…

    Submitted 5 October, 2021; v1 submitted 29 May, 2020; originally announced May 2020.

    Comments: 8 pages, 7 figures

  11. arXiv:2002.06075  [pdf, other]

    cs.LG cs.AI cs.DB stat.ML

    ARMS: Automated rules management system for fraud detection

    Authors: David Aparício, Ricardo Barata, João Bravo, João Tiago Ascensão, Pedro Bizarro

    Abstract: Fraud detection is essential in financial services, with the potential of greatly reducing criminal activities and saving considerable resources for businesses and customers. We address online fraud detection, which consists of classifying incoming transactions as either legitimate or fraudulent in real-time. Modern fraud detection systems consist of a machine learning model and rules defined by h…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: 11 pages, 12 figures, submitted to KDD '20 Applied Data Science Track

  12. A Survey on Subgraph Counting: Concepts, Algorithms and Applications to Network Motifs and Graphlets

    Authors: Pedro Ribeiro, Pedro Paredes, Miguel E. P. Silva, David Aparicio, Fernando Silva

    Abstract: Computing subgraph frequencies is a fundamental task that lies at the core of several network analysis methodologies, such as network motifs and graphlet-based metrics, which have been widely used to categorize and compare networks from multiple domains. Counting subgraphs is however computationally very expensive and there has been a large body of work on efficient algorithms and strategies to ma…

    Submitted 28 October, 2019; originally announced October 2019.

    Comments: 35 pages

    Journal ref: ACM Computing Surveys, Volume 54, Issue 2, March 2022, Article No. 28, pp. 1-36

  13. arXiv:1808.08195  [pdf, ps, other]

    cs.LG cs.SI stat.ML

    GoT-WAVE: Temporal network alignment using graphlet-orbit transitions

    Authors: David Aparício, Pedro Ribeiro, Tijana Milenković, Fernando Silva

    Abstract: Global pairwise network alignment (GPNA) aims to find a one-to-one node mapping between two networks that identifies conserved network regions. GPNA algorithms optimize node conservation (NC) and edge conservation (EC). NC quantifies topological similarity between nodes. Graphlet-based degree vectors (GDVs) are a state-of-the-art topological NC measure. Dynamic GDVs (DGDVs) were used as a dynamic…

    Submitted 24 August, 2018; originally announced August 2018.

  14. arXiv:1806.07209  [pdf, other]

    cs.CR

    Formal verification of the YubiKey and YubiHSM APIs in Maude-NPA

    Authors: Antonio González-Burgueño, Damián Aparicio, Santiago Escobar, Catherine Meadows, José Meseguer

    Abstract: In this paper, we perform an automated analysis of two devices developed by Yubico: YubiKey, designed to authenticate a user to network-based services, and YubiHSM, Yubico's hardware security module. Both are analyzed using the Maude-NPA cryptographic protocol analyzer. Although previous work has been done applying automated tools to these devices, to the best of our knowledge there has been no com…

    Submitted 19 June, 2018; originally announced June 2018.

  15. arXiv:1707.04572  [pdf, other]

    cs.SI physics.soc-ph

    Temporal Network Comparison using Graphlet-orbit Transitions

    Authors: David Aparício, Pedro Ribeiro, Fernando Silva

    Abstract: Networks are widely used to model real-world systems and uncover their topological features. Network properties such as the degree distribution and shortest path length have been computed in numerous real-world networks, and most of them have been shown to be both scale-free and small-world networks. Graphlets and network motifs are subgraph patterns that capture richer structural information than…

    Submitted 14 July, 2017; originally announced July 2017.

  16. arXiv:1511.01964  [pdf, other]

    cs.SI physics.soc-ph q-bio.MN

    Network comparison using directed graphlets

    Authors: David Aparício, Pedro Ribeiro, Fernando Silva

    Abstract: With recent advances in high-throughput cell biology the amount of cellular biological data has grown drastically. Such data is often modeled as graphs (also called networks) and studying them can lead to new insights into molecule-level organization. A possible way to understand their structure is by analysing the smaller components that constitute them, namely network motifs and graphlets. Graph…

    Submitted 5 November, 2015; originally announced November 2015.

    Comments: 9 pages