EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published 4 days ago • 33
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction Paper • 2512.04987 • Published 5 days ago • 69
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published 16 days ago • 247
How Far Are We from Genuinely Useful Deep Research Agents? Paper • 2512.01948 • Published 8 days ago • 50
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 12 days ago • 168
Monet: Reasoning in Latent Visual Space Beyond Images and Language Paper • 2511.21395 • Published 13 days ago • 15
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs Paper • 2511.07250 • Published 29 days ago • 17
Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation Paper • 2510.24821 • Published Oct 28 • 37
MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues Paper • 2510.17722 • Published Oct 20 • 19
IF-VidCap: Can Video Caption Models Follow Instructions? Paper • 2510.18726 • Published Oct 21 • 24
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs Paper • 2510.18876 • Published Oct 21 • 36