Short Bio

Philip A. Chou has longstanding interests in data compression, signal processing, machine learning, communication theory, and information theory, and their applications to processing media such as holograms, video, images, audio, speech, and documents. He did the first work on multiple reference frame video coding (used in all modern video codecs), and he pioneered rate-distortion optimization for codecs (used in all modern video codecs and still driving further advances there). He performed the seminal work on client-driven network-adaptive streaming media on demand, including fast start (used in all modern video streaming) and multi-bitrate streaming (also used in all modern video streaming), leading up to Microsoft IIS Smooth Streaming and the MPEG DASH standard. He is one of the inventors of practical network coding using random codes, and one of the inventors of wireless network coding. He was the first, or among the first, to use entropy as a splitting criterion for decision tree design, and was the first to develop a method for decision DAG design. He worked on prosody in the Speech Plus CallText 5010 text-to-speech system (used by Stephen Hawking). He was among the first to define a probabilistic language grammar over images (used in Xerox and Google OCR systems) and the first to use it for optical character recognition of mathematical notation. As a research manager, he helped his group develop error correction codes for data centers, saving Microsoft on the order of a billion dollars in infrastructure and operational costs. Further, he initiated his group’s research in immersive telepresence, which over many years led to Holoportation (for which he was personally responsible for the spatial audio). His recent algorithms appear in 8i products as well as in the emerging MPEG point cloud compression standard. He holds degrees in electrical engineering and computer science from Princeton, Berkeley, and Stanford.
He has been a member of the research staff or research manager at AT&T Bell Laboratories, Xerox PARC, Microsoft, and Google. He has played key roles in startups Telesensory Systems, Speech Plus, VXtreme (acquired by Microsoft), and 8i. He has been an affiliate faculty member at Stanford, the University of Washington, and the Chinese University of Hong Kong. He has been associate or guest editor for the IEEE Trans. Information Theory, the IEEE Trans. Image Processing, and the IEEE Trans. Multimedia. He has been an organizer or technical co-chair for the inaugural NetCod, ICASSP’07, MMSP’09, ICIP’15, ICME’16, and ICIP’17, among others. He is an IEEE Fellow and has served on the IEEE Fellow evaluation committees of the IEEE Computer and Signal Processing societies, as well as on the Board of Governors of the IEEE Signal Processing Society. He has been an active participant in MPEG, where he instigated the work on the file format (.mp4), and contributed datasets, algorithms, and code used as a starting point for point cloud compression. He has won or co-authored best paper awards in the IEEE Trans. Signal Processing, the IEEE Trans. Multimedia, ICME, and ICASSP. He is co-editor of a book on multimedia communication. He was with a startup, 8i.com, where he led the effort to compress and communicate volumetric media, popularly known as holograms, for virtual and augmented reality. In September 2018, he joined Google Daydream as a senior staff research scientist. In April 2023, he retired from the Perception team at Google Research. Research.com has listed him as one of the top 1000 computer scientists worldwide.
