Publication List

2026

  1. Assessing LLMs for Serendipity Discovery in Knowledge Graphs: A Case for Drug Repurposing
    The 40th AAAI Conference on Artificial Intelligence (AAAI 2026, Acceptance Rate: 17.6%)
    Mengying Wang, Chenhui Ma, Ao Jiao, Tuo Liang, Pengjun Lu, Shrinidhi Hegde, Yu Yin, Evren Gurkan-Cavusoglu, Yinghui Wu

2025

  1. ML-Asset Management: Curation, Discovery, and Utilization
    The 51st International Conference on Very Large Data Bases (VLDB 2025, Tutorial)
    Mengying Wang, Moming Duan, Yicong Huang, Chen Li, Bingsheng He, Yinghui Wu
  2. Position: Current Model Licensing Practices are Dragging Us into a Quagmire of Legal Noncompliance
    The 42nd International Conference on Machine Learning (ICML 2025, Selected as Oral (top ~1%))
    Moming Duan, Mingzhe Du, Rui Zhao, Mengying Wang, Yinghui Wu, Nigel Shadbolt, Bingsheng He
  3. Generating Skyline Datasets for Data Science Models
    The 28th International Conference on Extending Database Technology (EDBT 2025)
    Mengying Wang, Hanchao Ma, Yiyang Bian, Yangxin Fan, Yinghui Wu
  4. Graph Data Management and Graph Machine Learning: Synergies and Opportunities
    SIGMOD Record
    Arijit Khan, Xiangyu Ke, Yinghui Wu

2024

  1. ModsNet: Performance-aware Top-𝑘 Model Search using Exemplar Datasets
    The 50th International Conference on Very Large Data Bases (VLDB 2024)
    Mengying Wang*, Hanchao Ma*, Sheng Guan, Yiyang Bian, Haolai Che, Abhishek Daundkar, Alp Sehirlioglu, Yinghui Wu
  2. Generating Robust Counterfactual Witnesses for Graph Neural Networks
    The 40th IEEE International Conference on Data Engineering (ICDE 2024)
    Dazhuo Qiu*, Mengying Wang*, Arijit Khan, Yinghui Wu
  3. GraphLingo: Domain Knowledge Exploration by Synchronizing Knowledge Graphs and Large Language Models
    The 40th IEEE International Conference on Data Engineering (ICDE 2024)
    Duy Le, Kris Zhao, Mengying Wang, Yinghui Wu
  4. View-based Explanations for Graph Neural Networks
    Proceedings of the ACM on Management of Data, 2024 (SIGMOD 2024)
    Tingyang Chen, Dazhuo Qiu, Yinghui Wu, Arijit Khan, Xiangyu Ke, Yunjun Gao

2023

  1. Selecting Top-𝑘 Data Science Models by Example Dataset
    The 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023)
    Mengying Wang, Sheng Guan, Hanchao Ma, Yiyang Bian, Haolai Che, Abhishek Daundkar, Alp Sehirlioglu, Yinghui Wu
  2. Fair Group Summarization with Graph Patterns
    The 39th IEEE International Conference on Data Engineering (ICDE 2023)
    Hanchao Ma, Sheng Guan, Mengying Wang, Yinghui Wu
  3. GALE: Active Adversarial Learning for Erroneous Node Detection in Graphs
    The 39th IEEE International Conference on Data Engineering (ICDE 2023)
    Sheng Guan, Hanchao Ma, Mengying Wang, Yinghui Wu

2022

  1. CRUX: Crowdsourced Materials Science Resource and Workflow Exploration
    The 31st ACM International Conference on Information and Knowledge Management (CIKM 2022)
    Mengying Wang, Hanchao Ma, Abhishek Daundkar, Sheng Guan, Yiyang Bian, Alp Sehirlioglu, Yinghui Wu
  2. RoboGNN: Robustifying Node Classification under Link Perturbation
    The 31st International Joint Conference on Artificial Intelligence (IJCAI 2022) [Video]
    Sheng Guan, Hanchao Ma, Yinghui Wu
  3. Subgraph Query Generation with Fairness and Diversity Constraints
    The 38th IEEE International Conference on Data Engineering (ICDE 2022)
    Hanchao Ma, Sheng Guan, Mengying Wang, Yen-shuo Chang, Yinghui Wu
  4. Diversified Subgraph Query Generation with Group Fairness
    The 15th International Conference on Web Search and Data Mining (WSDM 2022)
    Hanchao Ma, Sheng Guan, Christopher Toomey, Yinghui Wu

2021

  1. GEDet: Detecting Erroneous Nodes with A Few Examples
    The 47th International Conference on Very Large Data Bases (VLDB 2021)
    Sheng Guan, Hanchao Ma, Sutanay Choudhury, Yinghui Wu
  2. GRIP: Constraint-based Explanation of Missing Entities in Graph Search
    The 2021 ACM SIGMOD International Conference on Management of Data (SIGMOD 2021) [Video]
    Qi Song, Peng Lin, Hanchao Ma, Yinghui Wu
  3. Explaining Missing Data in Graphs: A Constraint-based Approach
    The 37th IEEE International Conference on Data Engineering (ICDE 2021)
    Qi Song, Peng Lin, Hanchao Ma, Yinghui Wu

System & Software

CRUX Platform

CRUX is a crowdsourced scientific ML-Asset Management platform with dedicated support for materials science. It provides an end-to-end ecosystem for curating, searching, and discovering AI/ML assets through three major components:

  1. Asset Curation. Upload assets with self/lab/public permissions; assets are standardized to CRUX schemas and integrated into the CRUX AI/ML Asset Graph for consistent representation, cross-asset linking, and scalable metadata management.
  2. Asset Search. Intelligent, domain-aware search across tags and metadata with LLM-enhanced subgraph reasoning and custom interfaces tailored for materials scientists.
  3. Asset Discovery. Automated and hybrid workflows for data-driven model selection (see ModsNet (VLDB’24, CIKM’23)) and model-driven data discovery (see MODis (EDBT’25)).
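
To make the curation and search components concrete, here is a minimal Python sketch of assets with self/lab/public permissions stored in a small linked asset graph and filtered by tag. It is an illustration only and not the CRUX implementation or API: the names `Visibility`, `Asset`, `AssetGraph`, `can_view`, `search`, and `neighbors`, along with the example users, labs, and tags, are all hypothetical, and plain tag matching stands in for the LLM-enhanced search described above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    # The three permission levels named in the description (hypothetical encoding).
    SELF = "self"
    LAB = "lab"
    PUBLIC = "public"


@dataclass
class Asset:
    asset_id: str
    kind: str                      # e.g. "model" or "dataset"
    owner: str
    lab: str
    visibility: Visibility
    tags: set = field(default_factory=set)


class AssetGraph:
    """Toy asset graph: nodes are assets, undirected edges are cross-asset links."""

    def __init__(self):
        self.assets = {}           # asset_id -> Asset
        self.links = {}            # asset_id -> set of linked asset_ids

    def add(self, asset):
        self.assets[asset.asset_id] = asset
        self.links.setdefault(asset.asset_id, set())

    def link(self, a, b):
        self.links[a].add(b)
        self.links[b].add(a)

    def can_view(self, asset, user, user_lab):
        # Owners always see their own assets; otherwise visibility decides.
        if asset.owner == user or asset.visibility is Visibility.PUBLIC:
            return True
        if asset.visibility is Visibility.LAB:
            return asset.lab == user_lab
        return False

    def search(self, tag, user, user_lab):
        # Plain tag match restricted to assets the user is allowed to see.
        return sorted(
            a.asset_id
            for a in self.assets.values()
            if tag in a.tags and self.can_view(a, user, user_lab)
        )

    def neighbors(self, asset_id, user, user_lab):
        # Linked assets that the user is allowed to see.
        return sorted(
            n for n in self.links[asset_id]
            if self.can_view(self.assets[n], user, user_lab)
        )


graph = AssetGraph()
graph.add(Asset("m1", "model", "alice", "lab-a", Visibility.PUBLIC, {"perovskite"}))
graph.add(Asset("d1", "dataset", "alice", "lab-a", Visibility.LAB, {"perovskite"}))
graph.add(Asset("d2", "dataset", "bob", "lab-b", Visibility.SELF, {"perovskite"}))
graph.link("m1", "d1")

print(graph.search("perovskite", user="carol", user_lab="lab-a"))  # ['d1', 'm1']
print(graph.search("perovskite", user="bob", user_lab="lab-b"))    # ['d2', 'm1']
print(graph.neighbors("m1", user="bob", user_lab="lab-b"))         # []
```

In this sketch the permission check is applied at query time, so one shared graph can serve users with different access levels.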

Launch the CRUX platform to explore the end-to-end workflow.