Zhaoyuan “Andy” Fang

I'll be working as a software engineer at Google Sunnyvale.

I got my M.S. in Robotics at Carnegie Mellon University, working with Prof. Katerina Fragkiadaki on Computer Vision and Machine Learning. You can find my MSR thesis here.

Before that, I got my B.S. in Electrical Engineering and Maths from the University of Notre Dame. During my undergrad, I was fortunate to work with Prof. Adam Czajka and Prof. Kevin Bowyer on Biometrics. I was also lucky to have interned with Prof. David Held and collaborated with Prof. Hang Zhao.

Email  /  CV  /  Google Scholar  /  Github

Research

I'm interested in and currently working on computer vision and machine learning. During my undergrad, I worked mostly on Biometrics, while my first research experience was in nanophotonics. For a full list, please see Google Scholar. Selected works are listed below (* = equal contribution).

A Simple Baseline for BEV Perception Without LiDAR
Adam W. Harley, Zhaoyuan Fang, Jie Li, Rares Ambrus, Katerina Fragkiadaki
Preprint, 2022
project page / code

We build a surprisingly simple baseline for BEV perception that obtains state-of-the-art performance, and explore the factors behind its good performance.

Particle Videos Revisited: Tracking Through Occlusions Using Point Trajectories
Adam W. Harley, Zhaoyuan Fang, Katerina Fragkiadaki
European Conference on Computer Vision (ECCV), 2022 - Oral
project page / code

We re-build the classic “particle video” approach using components that drive the current state-of-the-art in flow and object tracking, such as dense cost maps, iterative optimization, and learned appearance updates.

TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Common Sense Priors
Gabriel Sarch, Zhaoyuan Fang, Adam W. Harley, Paul Schydlo, Michael J. Tarr, Saurabh Gupta, Katerina Fragkiadaki
European Conference on Computer Vision (ECCV), 2022
project page / code

TIDEE is an embodied agent that tidies up a disordered scene based on learned commonsense object placement and room arrangement priors.

Move to See Better: Self-Improving Embodied Object Detection
Zhaoyuan Fang*, Ayush Jain*, Gabriel Sarch*, Adam W. Harley, Katerina Fragkiadaki
British Machine Vision Conference (BMVC), 2021
project page / code

Assuming an embodied agent with a 2D pre-trained detector, a depth sensor, and approximate egomotion, we show how the agent can improve its 2D and 3D detection performance in a new environment under occlusions and uncommon viewpoints, simply by moving around.

Robust Iris Presentation Attack Detection Fusing 2D and 3D Information
Zhaoyuan Fang, Adam Czajka, Kevin W. Bowyer
IEEE Transactions on Information Forensics and Security (T-IFS), 2020
code / video

Experiments show that 2D textural and 3D shape features are complementary for iris presentation attack detection, and fusing them together results in robust performance under various open-set testing scenarios.

Open Source Iris Recognition Hardware and Software with Presentation Attack Detection
Zhaoyuan Fang, Adam Czajka
IEEE International Joint Conference on Biometrics (IJCB), 2020
code / video

An open-source, low-cost, fast, and accurate iris recognition prototype with presentation attack detection, based on Raspberry Pi and Python.

GSIR: Generalizable 3D Shape Interpretation and Reconstruction
Jianren Wang, Zhaoyuan Fang
European Conference on Computer Vision (ECCV), 2020
project page / video

A model designed for joint shape interpretation and reconstruction improves performance on both tasks.

AlignNet: A Unifying Approach to Audio-Visual Alignment
Jianren Wang*, Zhaoyuan Fang*, Hang Zhao
IEEE Winter Conf. on Applications of Computer Vision (WACV), 2020
project page / code / video

End-to-end dense correspondence between each frame of a video and an audio track can be learned by a model built on simple, well-established principles: attention, pyramidal processing, warping, and an affinity function.

Iris Presentation Attack Detection Based on Photometric Stereo Features
Adam Czajka, Zhaoyuan Fang, Kevin W. Bowyer
IEEE Winter Conf. on Applications of Computer Vision (WACV), 2019
code

Photometric stereo, a traditional 3D reconstruction technique, provides surprisingly simple but effective features for distinguishing real iris images from fake ones.

Misc.

If you're bored, here's my favorite song of: 07/22, 04/22, 03/22, 11/21, 08/21, 07/21, 06/21



Awesome template borrowed from here.