Inui
Norm
AI & ML interests
Video Diffusion; Large Language Model; Object Detection; OCR
Recent Activity
liked a dataset 11 days ago
wsdwJohn1231/DreamLIP_capion_csv_w_key
liked a dataset 11 days ago
Jyuhamdik/RealSyn15M
upvoted a paper 2 months ago
Revisiting Multimodal Positional Encoding in Vision-Language Models