Hi, my name is Yin Lin. I am a senior algorithm engineer at Tongyi Lab, Alibaba Group, where I explore AI-driven solutions for enhancing data analytics and management. My current research focuses on the intersection of AI agents and data management, developing systems that leverage intelligent agents to better process, analyze, and reason over data.
Before joining Alibaba, I earned my Ph.D. from the University of Michigan, Ann Arbor, where I was advised by Dr. H. V. Jagadish. My thesis research focused on data equity systems.
If you are looking for a research intern position or are interested in collaborating on AI agents, data management, or related areas, feel free to reach out!
Ph.D., Computer Science and Engineering
B.S., Computer Science
Research Intern, Data Analytics and Intelligence Lab (DAIL), Damo Academy
Research Intern, Data Management, Exploration and Mining (DMX)
Summer Intern, Software Architecture Group
I am currently exploring how AI agents can better manage, reason over, and interact with data. Here are some of my ongoing projects:
Natural language queries to databases are often ambiguous. AmbiSQL is an interactive system that detects and resolves ambiguities in text-to-SQL translation, enabling users to clarify their intent and receive accurate query results.
Semantic operators bring AI-powered data transformations to large-scale data processing pipelines. These operators are now part of Data-Juicer, enabling intelligent filtering, enrichment, and transformation of training data for foundation models.
DojoZero is an agent arena where AI agents react to real-time data streams to make predictions for sports events. It serves as a testbed for evaluating agents' capabilities in dynamic, time-sensitive decision-making scenarios.
Exploring novel agent applications for data analytics, including using LLMs for feature engineering and data engineering tasks.
SIGMOD 2026 (Demo)
IEEE Data Engineering Bulletin 2025
CoRR, 2024, arXiv:2412.16864
ICDE 2024
CIDR 2024
SIGMOD 2023 (Best Paper Award)
ACM Computing Surveys
Highly cited survey in the field of AI fairness
VLDB 2022 (Demo)
VLDB 2022
VLDB 2020
CoRR, 2020, arXiv:2010.08807
MLG@KDD 2020
DEXA 2018
Program Committees / Reviewers: NeurIPS, TKDE, CIKM, IEEE BigData, AIBSD (AAAI Workshop on AI with Biased or Scarce Data), ReLM (AAAI Workshop on Responsible Language Models)