20

Lily

chenyingli

https://scholar.google.com/citations?user=iSgs5r0AAAAJ&hl=en&authuser=2

upvoted a paper 5 months ago

CellForge: Agentic Design of Virtual Cell Models

Paper • 2508.02276 • Published Aug 4, 2025 • 39

upvoted 4 papers 6 months ago

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research

Paper • 2507.13300 • Published Jul 17, 2025 • 19

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers

Paper • 2507.10787 • Published Jul 14, 2025 • 12

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Paper • 2507.06229 • Published Jul 8, 2025 • 75

Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers

Paper • 2507.02694 • Published Jul 3, 2025 • 19

upvoted 2 papers 7 months ago

Can LLMs Generate High-Quality Test Cases for Algorithm Problems? TestCase-Eval: A Systematic Evaluation of Fault Coverage and Exposure

Paper • 2506.12278 • Published Jun 13, 2025 • 16

Table-R1: Inference-Time Scaling for Table Reasoning

Paper • 2505.23621 • Published May 29, 2025 • 93

upvoted 2 papers 8 months ago

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 76

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 54

upvoted 3 papers 9 months ago

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published Apr 1, 2025 • 27

Z1: Efficient Test-time Scaling with Code

Paper • 2504.00810 • Published Apr 1, 2025 • 26

PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving

Paper • 2503.21821 • Published Mar 26, 2025 • 21

upvoted 7 papers 10 months ago

MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search

Paper • 2503.20757 • Published Mar 26, 2025 • 11

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20, 2025 • 95

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Charting and Navigating Hugging Face's Model Atlas

Paper • 2503.10633 • Published Mar 13, 2025 • 92

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 170

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

Paper • 2503.04644 • Published Mar 6, 2025 • 21

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6, 2025 • 113

upvoted a paper 12 months ago

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21, 2025 • 84