Amazon Alexa and Rekognition services are based on machine learning, and process millions of requests every second. By switching to AWS Inferentia-based Amazon EC2 Inf1 instances from GPU-based instances for machine learning inference, these services saved 45% of their inference costs while boosting performance. If you are a developer or business building machine learning capabilities into your applications to run at scale, this tech talk will take you inside Alexa’s and Rekognition’s architecture and show