Amazon Selection and Catalog Systems is looking to hire an Applied Scientist who will push the boundaries to develop the best possible title for a product. Our vision is to innovate and automate product titles that adapt to different customer experiences such as website, mobile, Echo devices, Kindle and so on. The title of a product is the most important attribute that relates to a customer when shopping on Amazon, and it significantly influences the customer's search experience and buying decisions. It is used across hundreds of systems within Amazon.
You will analyze how Amazon's product titles affect our customers and help devise short-term and long-term strategy for product titles. You will have the opportunity to design new data analytical workflows at a scale rarely available elsewhere, utilizing state-of-the-art data science and machine learning tools such as Spark, Python, and Theano and Amazon's cloud computing technologies such as Elastic MapReduce (EMR), Kinesis, and Redshift. You will apply your knowledge in data science by creating algorithmic solutions that combine techniques such as clustering, pattern mining, predictive modeling, deep learning, statistical testing, information retrieval, and natural language processing, and by applying them to the voluminous data describing the products in the catalog and the customer interactions. You will evaluate your solutions with scientific rigor and provide inputs to business strategy and technical direction. You will collaborate with software engineering teams to integrate your algorithmic solutions into large-scale, highly complex Amazon production systems.
Our organization has a strong focus on, and a great track record of, growing our employees.
You will encounter many challenges, including:
* scale (build models for billions of products in the catalog utilizing trillions of customer interactions),
* accuracy (extreme requirements for precision or recall due to the impact of getting it wrong),
* speed (generate predictions for millions of new or changed products with low latency),
* diversity (titles for the same product in different languages, and titles presented based on the context in which the customer is viewing them),
* high dimensionality,
* noise (build models robust to varying quality of data provided by millions of sellers and labels derived implicitly or collected from humans).
You will need to be creative and go far beyond textbook solutions to deal with these challenges. Meeting the business requirements will involve combining several different machine learning algorithms with domain knowledge into complex data analytical workflows that automate what can be automated and efficiently utilize experts when needed to mitigate risk.
You will help us to:
* Define the strategy for title generation: how to gather the best possible information about products and use it in the titles.
* Continuously innovate and adapt to customer needs, never settling for the status quo.
* Lead, guide, and influence other applied scientists in the Selection Contribution Platform team.
* Identify which product information matters most to our customers.
* Extract product information from unstructured data to augment the titles.
* Determine the appropriate title lengths for experiences such as mobile, Echo devices, PC and so on.
* Determine the sequence of information represented in the title.
* Dynamically generate titles for products not seen before.
* Perform automated quality analysis on titles to detect and prevent bad titles from flowing to the website.
* Automate the manual audit process by taking the learnings from audits and building them into the models.
* Estimate the financial impact of title improvements.
Your solutions will directly impact the customer experience by making products discoverable and presenting them in the right place, with complete and accurate product information to enable informed purchase decisions.
Amazon is a company operating a marketplace for consumers, sellers, and content creators.