The Gilt technology organization. We make gilt.com work.

Gilt Tech

NYC Scrum User Group - January 19th

meetups

We’ll be hosting our first meetup of 2017 in partnership with the NYC Scrum User Group on Thursday, January 19th. This is our first time hosting this group and we’re off to a great start: Ken Rubin will be joining us to lead a talk on Agile.

More on Ken:

Ken is the author of Amazon’s #1 best selling book Essential Scrum: A Practical Guide to the Most Popular Agile Process. As an agile thought leader, he founded Innolution where he helps organizations thrive through the application of agile principles in an effective and economically sensible way. He has coached over 200 companies ranging from startups to Fortune 10, and is an angel investor and mentor to numerous exciting startups. As a Certified Scrum Trainer, Ken has trained over 24,000 people in agile / Scrum as well as object-oriented technology. He was the first managing director of the worldwide Scrum Alliance, a nonprofit organization focused on transforming the world of work using Scrum.

The talk, titled “Agile Transition Lessons That Address Practical Questions”, will address questions like:

  • Is there a way to quantify the cost of the transition?
  • How many teams or what scope should the initial transition effort cover?
  • Should we use an internal coach or hire an external coach?
  • How does training fit into the adoption?
  • How do we measure our success?
  • Should we use an existing scaling framework or develop our own?

If you plan to attend, please RSVP on the NYC Scrum User Group Meetup page. As always there will be refreshments, networking opportunities and a chance to chat with the speaker. We hope to see you there!


BackOffice Hike

Ryan Martin culture

Now that winter is here and the cold is keeping us inside, I thought it would be good to dream of warm days and look back at a BackOffice team outing from October.

On what turned out to be an unseasonably warm and beautiful Monday, a bunch of us took the day off and hiked the Breakneck Ridge Trail. It’s a 3.5-mile loop that climbs quickly (read: steep scrambling) to amazing views of the Hudson River Valley. It’s accessible via Metro-North out of NYC, so it’s a great target for a day trip from the City.

Here are some photos of the trip - can’t wait for the next one.



Deep Learning at GILT

Pau Carré Cardona machine learning, deep learning

Cognitive Fashion Industry Challenges

In the fashion industry there are many tasks that require human-level cognitive skills, such as detecting similar products or identifying facets in products (e.g. sleeve length or silhouette types in dresses).

At GILT we are building automated cognitive systems to detect dresses based on their silhouette, neckline, sleeve type and occasion. On top of that, we are also developing systems to detect dress similarity, which can be useful for product recommendations. Furthermore, when integrated with automated tagging, our customers will be able to find similar products with different facets. For instance, a customer might be very interested in a particular dress, but would prefer a different neckline or sleeve length.

For these automated cognitive tasks we are leveraging the power of a technology called Deep Learning that recently managed to achieve groundbreaking results thanks to mathematical and algorithmic advances together with the massive parallel processing power of modern GPUs.

GILT Automated dress faceting

GILT Automated dress similarity

Deep Learning

Deep learning is based on what are called deep neural networks. A neural network is a sequence of numerical parameters that transforms an input into an output. The input can be the raw pixels in an image, and the output can be the probability that the image is of a specific type (for example, a dress with a boat neckline).

To achieve these results it’s necessary to set the right numerical parameters in the network so it can make accurate predictions. This process is called neural network training and, most of the time, involves some form of a base algorithm called backpropagation. The training is done using a set of inputs (e.g. images of dresses) and known output targets (e.g. the probability of each dress being of a given silhouette) called the training set. The training set is used by the backpropagation algorithm to update the network parameters: given an input image, backpropagation refines the parameters so that the output gets closer to the target. Iterating many times through backpropagation leads to a model that is able to produce, for a given input, outputs very close to the target.
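The training loop can be illustrated with a toy one-parameter model (a made-up example, not our actual training code): for a model this small, backpropagation reduces to the derivative of the squared error with respect to the single parameter.

```javascript
// Toy "network": a single parameter w predicting y = w * x.
// Each iteration nudges w toward the value that matches the targets.
const trainingSet = [
  { x: 1, target: 2 },
  { x: 2, target: 4 },
  { x: 3, target: 6 }
]; // the underlying relationship is y = 2x

let w = 0;                  // arbitrary starting parameter
const learningRate = 0.05;

for (let epoch = 0; epoch < 100; epoch++) {
  for (const { x, target } of trainingSet) {
    const output = w * x;                        // forward pass
    const gradient = 2 * (output - target) * x;  // d(error^2)/dw
    w -= learningRate * gradient;                // parameter update
  }
}

console.log(w.toFixed(2)); // '2.00'
```

After enough iterations the parameter converges on the value that reproduces the targets, which is exactly what backpropagation does at scale with millions of parameters.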

Once training is done, if the model has high accuracy and is not affected by overfitting, then whenever the network is fed a brand new image it should be able to produce accurate predictions.

For example, say that we train a neural network to detect necklines in dresses using a dataset of images of dresses with known necklines. We’d expect that, if the network parameters are properly set, when we feed the network an image of a cowl neckline, the output probability for the cowl neckline should be close to 1 (100% confidence). The accuracy of the model can be computed using a set of inputs and expected targets called the test set. The test set is never used during training and thus provides an objective view of how the network would behave with new data.

Neural networks are structured in layers, which are atomic forms of neural networks. Each layer takes as input the output of the previous layer, computes a new output with its numerical parameters and feeds it forward into the next layer’s input. The first layers usually extract low-level features in images such as edges, corners and curves; the deeper the layer, the higher-level the features it extracts. Deep neural networks have many layers, usually stacked one on top of the other.
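The layer-to-layer flow can be sketched in a few lines (illustrative only, with hand-picked weights): each layer is a weight matrix whose output, passed through a non-linearity, becomes the next layer’s input.

```javascript
// ReLU non-linearity applied element-wise to a vector.
const relu = (v) => v.map((x) => Math.max(0, x));

// Matrix-vector product: one layer's linear transformation.
const matVec = (m, v) =>
  m.map((row) => row.reduce((sum, w, i) => sum + w * v[i], 0));

const layers = [
  [[1, -1], [0.5, 0.5]], // layer 1: 2 inputs -> 2 outputs
  [[1, 1]]               // layer 2: 2 inputs -> 1 output
];

// Feed each layer's output forward into the next layer.
const forward = (input) =>
  layers.reduce((activation, weights) => relu(matVec(weights, activation)), input);

console.log(forward([3, 1])); // [ 4 ]
```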

Deep Neural Network Diagram

Dress Faceting

Automatic dress faceting is one of the new initiatives GILT is working on. GILT is currently training deep neural networks to tag occasion, silhouette, neckline and sleeve type in dresses.

Dress Faceting Model

The model used for training is Facebook’s open source Torch implementation of Microsoft’s ResNet. Facebook’s project is an image classifier, with models already trained on ImageNet. We’ve added a few additional features to the original open source project:

  • Selection of dataset for training and testing (silhouette, occasion, neckline…)

  • Weighted loss for imbalanced datasets

  • Inference given a file path of an image

  • Store and load models in/from AWS S3

  • Automatic synchronization of image labels with the imported dataset

  • Tolerance to corrupted or invalid images

  • Custom ordering of labels

  • Test and train F1 Score accuracy computation for each class as well as individual predictions for each image across all tags.

  • Spatial transformer attachment in existing networks

The models are trained on P2 GPU EC2 instances deployed using CloudFormation, with EBS volumes attached. We plan to replace EBS with EFS (Elastic File System) to be able to share data across many GPU instances.

We are also investing effort in trying to achieve similar results using TensorFlow and GoogLeNet v3.

Data and Quality Management

To keep track of the results that our model is generating we’ve built a Play web application to analyze results, keep a persistent dataset, and change the tags of the samples if we detect they are wrong.

Model Accuracy Analysis

The most basic view to analyze machine learning results is the F1 Score, which provides a good metric that takes into account both false positive and false negative errors.
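For reference, the F1 Score is the harmonic mean of precision and recall, which is why it penalizes both kinds of error; a minimal computation looks like this (the counts below are made up):

```javascript
// F1 = harmonic mean of precision and recall.
function f1Score(truePositives, falsePositives, falseNegatives) {
  const precision = truePositives / (truePositives + falsePositives);
  const recall = truePositives / (truePositives + falseNegatives);
  return (2 * precision * recall) / (precision + recall);
}

// e.g. a neckline classifier with 80 correct detections,
// 20 false alarms and 10 misses:
console.log(f1Score(80, 20, 10).toFixed(3)); // '0.842'
```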

On top of that, we provide a few views to be able to analyze results, specifically intended to make sure samples are properly tagged.

F1 Score View

Image Tagging Refining

The accuracy analysis allows us to detect which images the model is struggling to classify properly. Oftentimes these images are mistagged; they have to be manually retagged and the model retrained with the new test and training sets. Once the model is retrained, its accuracy very often increases and it’s possible to spot further mistagged images.

It’s important to note here that an image in the test set or the training set always remains in that set; it’s only the tag that changes: for example, a long sleeve could be retagged as a three-quarters sleeve.

To scale the system we are attempting to automate the retagging using Amazon Mechanical Turk.

False Negatives View

Image Tagging Refining Workflow

Alternatives using SaaS

There are other alternatives for image tagging from SaaS companies. We’ve tried them without success: the problem with most of these platforms is that, at this point in time, they are neither accurate nor detailed enough when it comes to fashion tagging.

Amazon Rekognition short sleeve dress image tagging results

Dress Similarity

Product similarity will allow us to offer our customers recommendations based on how similar products are. It’ll also allow our customers to find visually similar products with other facets.

Dress Similarity Model

For the machine learning model we are using TiefVision.

TiefVision is based on reusing an existing network already trained to classify the ImageNet dataset, and swapping its last layers with a new network specialized for another purpose. This technique is known as transfer learning.

The first trained networks are used to locate the dress in the image, following Yann LeCun’s OverFeat paper. This location algorithm trains two networks using transfer learning:

  • Background detection: detects background and foreground (dress) patches.

  • Dress location network: locates a dress in an image given a patch of a dress.

Combination of Dress Location and Background detection to accurately detect the Location of the dress

Once the dress is located, the next step is to detect whether two dresses are similar or not. This can be done using unsupervised learning on the embeddings taken from the output of one of the last layers. Another approach is to train a network to learn dress similarity (supervised learning).

For the supervised side, we follow Google’s DeepRank paper. The supervised learning network takes three images as input: a reference dress, a dress similar to the reference, and another dissimilar to the reference. Using a Siamese network trained with a hinge loss function, the network learns to detect dress similarities.
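The hinge loss over such a triplet can be sketched as follows (a simplified version using plain Euclidean distance between precomputed embedding vectors; in the real network the embedding itself is what is being learned):

```javascript
// Euclidean distance between two embedding vectors.
const distance = (a, b) =>
  Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));

// Hinge loss: penalize the network unless the similar dress is closer
// to the reference than the dissimilar one, by at least `margin`.
function tripletHingeLoss(reference, similar, dissimilar, margin = 1) {
  return Math.max(
    0,
    margin + distance(reference, similar) - distance(reference, dissimilar)
  );
}

// A well-separated triplet contributes zero loss:
console.log(tripletHingeLoss([0, 0], [0.1, 0], [5, 0])); // 0
// A triplet where the dissimilar dress is closer gets penalized:
console.log(tripletHingeLoss([0, 0], [3, 0], [1, 0])); // 3
```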

Similarity Network Topology

To compute the similarity between a dress and the other dresses we have in our database, TiefVision performs the following two steps:

  • The dress is first cropped using the location and background detection networks.

  • Then the dress similarity network computes the similarity between the cropped dress and the cropped dresses we have in our database. It’s also possible to compute similarity using unsupervised learning.

For more information about TiefVision you can take a look at this presentation.


From Monolithic to Microservices - Gilt's Journey to Microservices on AWS

John Coghlan conferences

Watch Emerson Loureiro’s talk from AWS re:Invent 2016

At AWS re:Invent, Emerson Loureiro, Senior Software Engineer at Gilt, led two well-received sessions on our journey from a single monolithic Rails application to more than 300 Scala/Java microservices deployed in the cloud. We learned many important lessons along the way, so we’re very happy to share this video of Emerson’s talk with you.


December Meetups at HBC Digital

John Coghlan continuous integration

We’re closing out 2016 with two more meetups at our offices in Brookfield Place.

New York Scala University - December 13

On Tuesday, December 13 at 6:30pm, Puneet Arya, Senior Application Developer at HBC Digital, will talk about how MongoDB and the Play Framework get along.

The talk will touch on:

  • What MongoDB is and how it can be used.
  • How MongoDB works with the Play Framework.
  • What configuration is needed to make MongoDB work with Play.
  • CRUD operations for MongoDB with Scala and the Play Framework.

Please RSVP here if you’d like to attend.

HBC Digital Technology Meetup - December 15

We’re partnering with the NYC PostgreSQL User Group on Thursday, December 15 at 6:30pm to host Bruce Momjian.

Here’s the description of Bruce’s talk:

Postgres 9.6 adds many features that take the database to a new level of scalability, with parallelism and multi-CPU-socket scaling improvements.

Easier maintenance is achieved by reduced cleanup overhead and adding snapshot and idle-transaction controls. Enhanced monitoring simplifies the job of administrators. Foreign data wrapper and full text improvements are also included in this release. The talk will also cover some of the major focuses of the next major release, Postgres 10.

Speaker: Bruce Momjian is co-founder and core team member of the PostgreSQL Global Development Group, and has worked on PostgreSQL since 1996. He has been employed by EnterpriseDB since 2006. He has spoken at many international open-source conferences and is the author of PostgreSQL: Introduction and Concepts, published by Addison-Wesley. Prior to his involvement with PostgreSQL, Bruce worked as a consultant, developing custom database applications for some of the world’s largest law firms.

As an academic, Bruce holds a Masters in Education, was a high school computer science teacher, and lectures internationally at universities.

Please RSVP here if you’d like to attend.


Increasing Build IQ with Travis CI

Andrew Powell continuous integration

Continuous Integration is a must these days. And for social, open source projects it’s crucial. Our tool of choice for automated testing is Travis CI. Like most tools, Travis does what it does well. Unfortunately it’s not very “smart”. Heaven help you if you have a large or modular project with a multitude of tests - you’ll be waiting an eternity between builds.

And that’s exactly what we ran into. We have a repository that contains 30 NPM modules, each with their own specs (tests). These modules are part of an aging assets pipeline that I briefly mentioned last week. As such, each module is subject to a litany of tasks each time a change is made. Travis CI is hooked into the repo and, for each Pull Request, would run specs for every module, assuring that there are no errors in the code changes contained in the PR. When you’re only working on one or two modules, the run time for the tasks is relatively low, typically 1-2 minutes. That of course depends on things such as npm install time, as each module requires an install for testing. Multiply that by 30 and you start to see where the problem arises.

Waiting Sucks

Without targeted build testing we’re left waiting or task-shifting until the build completes successfully. Our need was clear: figure out what files were affected, map and filter the results, and run only the specs for the modules changed in any particular Pull Request or push. That’s where travis-target comes into play.

Here’s a snippet of our repo structure, for reference:

ui-tracking
|--src
   |--common
      |--event_registry
      |--tracking_metadata
   |--tracking
      |--cart
      |--etc..

Target Acquired

In order to target modules, we need to know what their normalized names are; the names that we publish them to NPM under. Because of some legacy stuff baked into our pipeline, we store modules at group/name but publish them as group.name. So let’s fire up travis-target (bear in mind we’re using ES6 syntax that Node v7 supports).

const target = require('travis-target');

// even with --harmony-async-await, await must live inside an async function
(async () => {
  let targets = await target();
})();

Let’s pretend that the common.event_registry and tracking.cart modules were both modified in one Pull Request (a common pattern for us) - our results could look like this:

[
  'README.md',
  'src/common/event_registry/js/event_registry.js',
  'src/common/event_registry/js/registry.js',
  'src/common/event_registry/package.json',
  'src/tracking/cart/js/cart.js',
  'src/tracking/cart/package.json'
]

But that’s just silly, so let’s give travis-target some options to work with:

const target = require('travis-target');
const pattern = /^src\//;

(async () => {
  let targets = await target({
    pattern: pattern,
    map: (result) => {
      let parts;

      result = result.replace(pattern, '');
      parts = result.split('/');

      return parts.slice(0, 2).join('.');
    }
  });
})();

By passing pattern in options, we’re telling travis-target to filter on (or return only those results which match) the regular expression pattern. That gives us an initial result set of directories starting with src/.

[
  'src/common/event_registry/js',
  'src/tracking/cart/js'
]

You’ll notice that the initial example of results contained some duplicate directories; travis-target cleans that up for you.

Next, we specify the map function on options. That’ll let us transform each element in the Array of results so that it’s ready to use. Our results would now look like this:

[
  'common.event_registry',
  'tracking.cart'
]

Great Justice

Using this last result set, we now know which modules were affected in the PR, and which modules to run specs for. Our next step is firing off a sequence of shell commands using @exponent/spawn-async, which plays nicely with the async/await patterns now supported in Node v7.1 behind the --harmony-async-await flag.

Since we implemented this pattern, build times for our PRs are in the 1-2 minute range; a vast improvement and one sure to bring developer happiness in some small degree.

Cheers!

Originally published at shellscape.org on November 16, 2016.


Linting NPM Version Conflicts

Andrew Powell node

Say you had the need for shared front-end assets (scripts, stylesheets, images, etc.) and a need for an entire org to access them, independently, for reliable builds of many different apps which use those assets. NPM might be a good choice - with NPM’s move to a flat-ish install tree, it’s still a relevant choice. But what about package version conflicts?

At Gilt, the choice was made years ago; before Bower, JSPM, and the host of other package managers came to be. NPM was the logical choice then. And we’re still using it.

Tooling

Traditionally hard. With the nested NPM installs of yesteryear it was compounded: not only did we have to detect and move around scripts and other assets from within module packages, we also had to check versions against one another. That was essential in building the final script bundles and combining CSS for a production deployment.

The Scenario

Let’s say that fictitious module-a depends on module-b and module-util. And that module-b also depends on module-util. That’s a pretty straightforward tree and the bundle for the scripts of that tree should be easy. You’d think. But consider that scenario if module-a depends on module-util@1.0.0 and module-b depends on module-util@0.5.0. Now we’ve got a conflict, and that could totally hose our production bundle.

The Trees

In prior versions of NPM, the npm install tree would look like this:

node_modules
  module-a
    node_modules
      module-util@1.0.0
      module-b
        node_modules
          module-util@0.5.0

In today’s NPM it looks like this:

node_modules
  module-a
  module-util@1.0.0
  module-b
    node_modules
      module-util@0.5.0

NPM is quarantining outlier versions of shared modules so that everything plays nicely in a Node.js environment. That’s cool for Node, but not for us… using this as a front-end assets package manager.

A Solution?

What we ended up doing was installing the entire package tree to a temporary directory, and then polling every package.json in that directory, building a dependency tree and looking through the tree for conflicts. It worked. It wasn’t a bad method, and it’s one that we duplicated in three generations of tooling.

A Better Solution

That’s a heck of a lot of work to perform after we make NPM do a heck of a lot of work. There’s a better, faster way. Using NPM’s ability to pull metadata quickly for modules, we can leverage Node 7’s async/await capabilities to produce some elegant code that quickly retrieves and maps an NPM module’s version dependency tree.
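The script itself isn’t reproduced here, but its core idea can be sketched as a pure function over an already-fetched dependency tree (function and variable names are hypothetical; the real script pulls its metadata from the npm registry):

```javascript
// Walk a resolved dependency tree and build the same map the script
// prints: moduleName -> [{ version, parent }, ...]. Any module listed
// with more than one distinct version is a conflict.
function collectVersions(name, node, parent = '', map = {}) {
  const { version, dependencies = {} } = node;
  (map[name] = map[name] || []).push({ version, parent });
  for (const [depName, depNode] of Object.entries(dependencies)) {
    collectVersions(depName, depNode, name, map);
  }
  return map;
}

function findConflicts(map) {
  return Object.keys(map).filter(
    (name) => new Set(map[name].map((entry) => entry.version)).size > 1
  );
}

// The fictitious module-a tree from the scenario above:
const tree = {
  version: '0.0.1',
  dependencies: {
    'module-b': {
      version: '0.0.1',
      dependencies: { 'module-util': { version: '0.5.0' } }
    },
    'module-util': { version: '1.0.0' }
  }
};

const versionMap = collectVersions('module-a', tree);
console.log(findConflicts(versionMap)); // [ 'module-util' ]
```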

Running that script for koa, we get:

...
koa: [ { version: '1.2.4', parent: '' } ],
  'koa-compose': [ { version: '2.5.1', parent: 'koa' } ],
  'koa-is-json': [ { version: '1.0.0', parent: 'koa' } ],
  'media-typer': [ { version: '0.3.0', parent: 'type-is' } ],
  'mime-db':
   [ { version: '1.24.0', parent: 'mime-types' },
     { version: '1.24.0', parent: 'mime-types' },
     { version: '1.24.0', parent: 'mime-types' } ],
...

In which we can clearly see that there are no version conflicts. That’s a snippet of the much larger tree returned, but the result is the same. Running that script on our fictitious module-a, the result would look like:

{
  'module-b': [ { version: '0.0.1', parent: 'module-a' } ],
  'module-util': [
    { version: '1.0.0', parent: 'module-a' },
    { version: '0.5.0', parent: 'module-b' }
  ]
}

And our conflict is visible.

Robustednesseses

While interesting, this isn’t inherently useful by itself. Moving forward, we’ll be wrapping this into a Gulp plugin with proper reporting output and blocking, run against the local package.json file.

Cheers!

Originally published at shellscape.org on November 9, 2016.


Running with Scissors: Koa2 and Vue.js

Andrew Powell node

We love Koa at Gilt. (Hell, I love Koa.) Embarking on a new project, I wanted to try something that wasn’t React or Angular. After poking at the alternatives I landed on Vue.js. I picked up the sharpest pair of scissors I could find and started running.

The Chosen Tech

If you haven’t heard of Koa:

Koa is a new web framework designed by the team behind Express, which aims to be a smaller, more expressive, and more robust foundation for web applications and APIs.

Koa is Express without the bells and whistles from the factory. Koa is the stock car that you bolt aftermarket parts onto, based on your needs. It’s fast as hell. It’s small. It’s a dang joy to work with. Koa2 improves on performance and makes use of async/await patterns.
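The middleware model behind that is worth a quick sketch. Here’s a toy version of Koa’s middleware composition (the idea behind koa-compose, not its actual source), showing how `await next()` hands control down the stack and resumes afterwards:

```javascript
// Each middleware receives the context plus a `next` function that
// runs the rest of the stack - Koa's "onion" model.
function compose(middleware) {
  return (ctx) => {
    const dispatch = (i) => {
      if (i === middleware.length) return Promise.resolve();
      return Promise.resolve(middleware[i](ctx, () => dispatch(i + 1)));
    };
    return dispatch(0);
  };
}

const app = compose([
  async (ctx, next) => {   // outer middleware runs first and last
    ctx.trace.push('in');
    await next();
    ctx.trace.push('out');
  },
  async (ctx) => {         // innermost middleware produces the body
    ctx.trace.push('handler');
    ctx.body = 'Hello from Koa2';
  }
]);

const ctx = { trace: [] };
app(ctx).then(() => console.log(ctx.trace)); // [ 'in', 'handler', 'out' ]
```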

If you haven’t heard of Vue.js:

Vue is a progressive framework for building user interfaces. …Vue is designed from the ground up to be incrementally adoptable. …is focused on the view layer only, and is also perfectly capable of powering sophisticated Single-Page Applications.

Basically Vue.js takes the good parts from React, Angular, and Aurelia and bundles them into a single lib. Who wants to go fast? React folks won’t like it because there’s no JSX and there’s two-way binding. I dig it precisely because I personally think JSX is an abomination, but more specifically because Vue.js uses <template> elements, which are very close to web components and will make that transition easy in the future, if and when widespread support for them ever drops. I digress.

Setting Up

It’s worth noting that we’re using Node v7, which supports nearly all of ES6 and some newer syntax, with BabelJS thrown in to fill the gaps.

Koa

I always start my Koa projects with a few base modules: koa (duh), koa-router, and koa-static. koa-static allows you to specify and serve one or more directories as static assets - a necessity. koa-router typically handles app routes; that is to say, the different endpoints for your app. The code is pretty straightforward; here’s a basic app.js:

'use strict';
import 'colors';
import Koa from 'koa';
import serve from 'koa-static';

const app = new Koa(),
  port = 3000;

app
  .use(serve(`${__dirname}/public`))
  .listen(port, () => {
    console.log('Server Started ∹'.green, 'http://localhost:'.grey + port.toString().blue);
  });

export default app;

Vue.js

To start off a project with Vue, we used vue-cli. That allows us to create boilerplate apps. Since we’re running with scissors here, we jumped right in and went with the simplest template.

$ vue init webpack-simple .

That gives us a basic (as in pumpkin spice latte) Vue app which displays a single page at index.html with a logo and some links. That also generates the App.vue file, which is the main “Vue file” for the app. All of that resides in src now. That’s all well and good, but what about that Webpack file created?

Enter Webpack

Getting a Vue app working right means bundling correctly. We use Webpack, but there are a lot of examples out there with Browserify if you prefer that. Webpack is a wonderful bundling tool and can perform a host of functions. So much so that it can be considered a full-fledged build tool. You’ll need a webpack.config.js file to kick things off:
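The config itself was published as a gist that isn’t reproduced here; below is a minimal sketch along those lines for a webpack 1 era vue-loader setup. The paths, loader names and plugin wiring are assumptions for illustration, not our exact config:

```javascript
// webpack.config.js - a minimal sketch, not our exact config.
// Assumes vue-loader, babel-loader and extract-text-webpack-plugin
// (webpack 1 era) are installed.
const path = require('path');
const ExtractTextPlugin = require('extract-text-webpack-plugin');

module.exports = {
  entry: './src/main.js',
  output: {
    path: path.join(__dirname, 'public', 'assets'),
    filename: 'bundle.js'
  },
  module: {
    loaders: [
      {
        test: /\.vue$/,
        loader: 'vue',
        // pull the <style> blocks out of .vue files...
        vue: { loaders: { css: ExtractTextPlugin.extract('css') } }
      },
      { test: /\.js$/, loader: 'babel', exclude: /node_modules/ }
    ]
  },
  plugins: [
    // ...and combine them into a single CSS file served by Koa
    new ExtractTextPlugin('bundle.css')
  ]
};
```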

And then you’ll need Webpack itself:

$ npm install webpack -g

You’ll note that we did a few things differently with webpack.config.js, beyond what the vue-cli generated for us: we moved assets to assets and introduced a Webpack plugin to combine the CSS in Vue files into one single CSS file to be served by Koa.

Tying it together

If you’re following our example and using Babel as well, you’ll need a Node entry point file to set Babel up for development. We named this index.js:

require('babel-core/register');
require('babel-polyfill');
require('./app');

Assuming you have a solid .babelrc and your dependencies set up correctly, you’re ready to try running this beast. First, fire up Webpack and create your bundles:

$ webpack

Assuming everything succeeded, you’re ready to start the server:

$ node index

And you should see something like this in your console:

→ node index
Server Started ∹ http://localhost:3000

Hit that address in your browser and you should see the Vue demo app. How do you like them apples? I love ‘em.

Running With Machetes

We’ve conquered scissors, let’s get silly and run around with something larger and sharper - in a following post I’ll walk through using middleware with Koa2 to allow live development and Hot Module Reloading. That means you don’t have to restart your server to rebuild your bundles, and you see changes nearly instantly. It’s cool, right?

Cheers!

Originally published at shellscape.org on November 3, 2016.


Watch Gilt's QCon New York talks

John Coghlan conferences

Held each year in June, QCon New York is one of the world’s leading software development conferences. This year, we had speakers from HBC Digital present how our teams work and how we have leveraged containers.

What we’ve learned about building great teams and improving team communication

Heather Fleming, our VP, People Operations & Product Delivery PMO, delivered a talk on two frameworks that can be used to improve communication, increase empathy and establish the psychologically safe environment a team needs to thrive. She also demonstrated how we build teams around initiatives using a “Team Ingredients” framework that focuses on each individual’s strengths and talents and what they contribute to the team.

Watch here: How to Unlock the “Secret Sauce” of Great Teams

What we’ve learned from using container technology at HBC Digital

Adrian Trenaman, our SVP, Engineering, gives detailed examples of how HBC Digital has used Docker, where container technology has paid off and, pragmatically, which aspects of the technology haven’t panned out as expected.

Watch here: How Containers Have Panned Out


Where to find our team in October

John Coghlan meetups

Here’s where to find Gilt and HBC Digital this month:

  • Oct 11 - Dana Pylayeva, Agile Coach, is leading a Product Discovery Game Workshop at the Agile/Lean Practitioners Meetup - RSVP
  • Oct 13 - Sophie Huang, Product Manager, is on the “From Downloads to Daily Active Users” panel at the 🍕🍺📱 (Pizza + Beer + Mobile) Meetup - RSVP
  • Oct 18 - Justin Riservato, Director of Data Engineering & Data Science, is on the “Push Data to Every Department” panel at Looker Join - SOLD OUT
  • Oct 18 - We’re hosting the HBC Digital Technology Meetup featuring Tom Beute, Front End Engineer, and Jared Stetzelberg, UX Designer, talking about “Creating A Live Styleguide: Bridge the Gap Between Tech & Design Teams” - RSVP
  • Oct 25 - We’re hosting the Dublin Swift Meetup at our Dublin office. This meetup will feature 3 great speakers including our own Ievgen Salo giving a talk on Advanced Collection in Swift. - RSVP
  • Oct 26 - We’re co-hosting the Dublin Scala Users Group with our friends from the Functional Kats Meetup. This Meetup will feature Juan Manuel Serrano on Scala and be held at Workday’s Dublin office. - RSVP

See you soon!
