Wenbo (Gordon) Hu

whu at cs dot ucla dot edu

Hi! I am Wenbo. I'm a graduate student in the Computer Science program at the University of California, Los Angeles. I joined the PLUS lab at UCLA as a research assistant advised by Prof. Nanyun Peng. Before that, I worked at the Machine Learning, Perception, and Cognition Lab (mlPC), advised by Prof. Zhuowen Tu. I graduated from the University of California, San Diego in March 2023 with a major in Data Science. I was advised by Prof. Tsui-Wei Weng for my undergraduate capstone project.

My goal is to align vision and language so that large multimodal models can perceive and comprehend in ways that follow human values. My primary research interests lie in multimodal machine learning, vision-language models, and language generation. I also have prior experience in unsupervised image classification under language supervision, 2D and 3D object detection, and model-free reinforcement learning for generalizable manipulation skills.

CV  /  GitHub  /  Google Scholar /  LinkedIn  /  Email  /  Twitter

  • 04/2024: Released VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models. VALOR-BENCH is a comprehensive human-annotated benchmark covering objects, attributes, and relations, with challenging images based on associative bias. VALOR-EVAL generalizes previous methods by introducing semantic matching and incorporating both faithfulness and coverage evaluation. It can handle complex object, attribute, and relation hallucinations in open-vocabulary captions from LVLMs.
  • 09/2023: Joined the PLUS lab at UCLA as a graduate research assistant advised by Prof. Nanyun Peng, working on the evaluation of hallucinations in multimodal LLMs.
  • 08/2023: Released BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions, which helps the LLM capture intricate details of visual information potentially missed during the query decoding process.
  • 04/2023: Joined the Machine Learning, Perception, and Cognition Lab (mlPC), working on multimodal large language models (LLMs).
  • 01/2023: Machine Learning Teaching Assistant with Professor Sanjoy Dasgupta, teaching CSE151A: Intro to Machine Learning at UCSD.
  • 09/2022: Researched improving CLIP's performance with multimodal prompt engineering, feature adapters, etc., mentored by Prof. Tsui-Wei Weng.
  • 02/2022: Joined Hao Su Lab for computer vision and robotics research.
Research
Work Experience

Deep Learning Research Intern at Synthesis Electronic Technology, Computer Vision Group
  • Accelerate and compress lightweight object detection models such as YOLOv5 so that they can run on small devices (CPU chips and mobile devices).
  • Improve object detection model accuracy on the company's working datasets to fit business requirements.
  • Convert models from different frameworks to NCNN, ONNX, and TensorRT so that they can run on mobile devices, and deploy them.

Software Engineering Intern at Inspur Groups
  • Launch a project with the MyBatis and Spring MVC frameworks under Maven and Tomcat.
  • Develop and support the big data back end, including writing SQL to select data series from the database and developing the controller, DAO, data, and service layers in Java to serve requests from the front end.
  • Use customer-provided data to build systems and web pages that help customer companies make business decisions.

