How Etsy Uses Thermodynamics to Help You Search for “Geeky”
Etsy shoppers love the large and diverse selection of our marketplace. But, for those who don’t know exactly what they’re looking for, the sheer number and variety of items available can be more...
Managing Hadoop Job Submission to Multiple Clusters
At Etsy we have been running a Hadoop cluster in our datacenter since 2012. This cluster handled both our scheduled production jobs and all ad hoc jobs. After several years of running our...
Introducing Arbiter: A Utility for Generating Oozie Workflows
At Etsy we have been using Apache Oozie for managing our production workflows on Hadoop for several years. We’ve even recently started using Oozie for managing our ad hoc Hadoop jobs as well. Oozie has...
SEO Title Tag Optimization at Etsy: Experimental Design and Causal Inference
External search engines like Google and Bing are a major source of traffic for Etsy, especially for our longer-tail, harder-to-find items, and thus Search Engine Optimization (SEO) is important in...
Modeling Spelling Correction for Search at Etsy
When a user searches for an item on Etsy, they don’t always type what they mean. Sometimes they type the query jewlery when they’re looking for jewelry; sometimes they just accidentally...
Modeling User Journeys via Semantic Embeddings
Etsy is a global marketplace for unique goods. This means that as soon as an item becomes popular, it runs the risk of selling out. Machine learning solutions that simply memorize the popular items are...
How Etsy Handles Peeking in A/B Testing
Etsy relies heavily on experimentation to improve our decision-making process. We leverage our internal A/B testing tool when we launch new features, polish the look and feel of our site, or even make...
Double-bucketing in A/B Testing
Previously, we’ve posted about the importance we place on Etsy’s experimentation systems for our decision-making process. In a continuation of that theme, this post will dive deep into an interesting...
boundary-layer : Declarative Airflow Workflows
When Etsy decided last year to migrate our operations to Google Cloud Platform (GCP), one of our primary motivations was to enable our machine learning teams with scalable resources and the latest...
Executing a Sunset
We all know how exciting it is to build new products, the thrill of a pile of new ideas waiting to be tested, new customers to reach, knotty problems to solve, and dreams of upward-sloping graphs. But...
The Causal Analysis of Cannibalization in Online Products
Nowadays an internet company typically has a wide range of online products to fulfill customer needs. It is common for users to interact with multiple online products on the same...
Cloud Jewels: Estimating kWh in the Cloud
Image: Lightning Storm Earrings, GojoDesign on Etsy. Etsy has been increasingly enjoying the perks of public cloud infrastructure for a few years, but has been missing a crucial feature: we’ve been...
How to Pick a Metric as the North Star for Algorithms to Optimize Business...
This article draws on our paper published at KDD 2020 (Oral Presentation, Selection Rate: 5.8%, 44 out of 756). It is common in the internet industry to develop algorithms that power...
Increasing experimentation accuracy and speed by using control variates
At Etsy, we strive to nurture a culture of continuous learning and rapid innovation. To ensure that new products and functionalities built by teams — from polishing the look and feel of our app and...