Conduct design and development to build and optimize deep learning software.
Design, develop, and optimize deep learning training and inference frameworks.
Implement distributed algorithms, such as model and data parallelism, parameter servers, and dataflow-based asynchronous data communication, in deep learning frameworks.
Transform computational graph representations of neural network models.
Develop deep learning primitives in math libraries.
Profile distributed DL models to identify performance bottlenecks and propose solutions across individual component teams.
Optimize code for various computing hardware backends.
Interact with deep learning researchers and work with deep learning frameworks.
In this position, you will be part of the IOTG VPU IP team, developing the software stack for deploying Deep Learning and traditional Computer Vision algorithms on the hardware accelerator.
Your responsibilities will include:
Integration of Computer Vision algorithms in complex video processing applications
Design and implementation of task and resource scheduling algorithms
Implementation of low-level kernels for embedded-class hardware
BS / MS in Computer Science, Computer Engineering, or a similar field
At least 3-4 years of experience in programming
Excellent C programming skills
Understanding of multithreading and multithreaded application design
Understanding of modern compiler architecture and code compilation process
The following will be an additional advantage:
Experience in low-level optimization and vectorization
Knowledge of modern processor architecture
Knowledge of Intel architecture
Experience with Deep Learning frameworks (TF, Caffe, PyTorch, OpenCV, etc.)
Experience in Linux system programming or embedded development
Spoken and written English (upper-intermediate level or advanced)