The Gilt technology organization. We make gilt.com work.

Gilt Tech

ION-Roller - Immutable software deployment service for AWS

Natalia Bartol aws deployment

Gilt has been at the forefront of the wave of microservice architecture. With the advantage of many individual services that each do one thing well comes the increased complexity of managing those services.

Our previous system was based on a reasonably traditional model: deployment scripts replaced the software on machines in order, performing health checks before proceeding to the next server. A number of issues caused friction when releasing software. One was that, in spite of services being run as microservices, many deployment decisions had to be made in lockstep, due to the shared platform. Another was that services that took a long time to start up (cache loads, etc.) paid a large cost whenever they were stopped; if a new release had to be rolled back, this could take more time than people might like.

Two ideas for software deployment arose from this. One was to separate the environments for each piece of software; because of this, Gilt became an early leader in trials of the Docker deployment platform. The other was inspired by the principles of functional programming, where data is not changed in place but is modified by creating a new copy with the changes applied. This idea, “immutable infrastructure”, allows new software releases without the risk attached to shutting down the previous software version, since rollbacks become quick and painless.
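
To make the immutable-release idea concrete, here is a rough sketch of the flow in JavaScript. It is illustrative pseudocode only; the function and dependency names are ours, not ION-Roller's API.

// A minimal sketch of an immutable release: build a brand-new environment for the new
// version, health-check it, then shift traffic. The previous environment keeps running
// untouched, so a rollback is simply a matter of pointing traffic back at it.
function releaseImmutably(deps, newVersion) {
  // deps is assumed to provide buildEnvironment, isHealthy and shiftTraffic, each returning a Promise
  return deps.buildEnvironment(newVersion)          // e.g. fresh EC2 instances running the new Docker image
    .then(function (candidate) {
      return deps.isHealthy(candidate).then(function (ok) {
        if (!ok) {
          throw new Error('health checks failed; previous environment still serving traffic');
        }
        return deps.shiftTraffic(candidate).then(function () {
          return candidate;                         // old environment is kept around for fast rollback
        });
      });
    });
}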

A third constraint, not core to the concept but grounded in the realities of Gilt's deployment environments, was that AWS services and the EC2 computing platform were going to be central to anything we created.

ION-Roller brings these ideas together: a deployment tool for immutable releases of software, using Docker environments, on top of AWS infrastructure.

Following our article about deploying microservices, published on InfoQ, we have now made our work publicly available. Check out a short demo!

Since development of the product started, Amazon has released a number of features in their own products (including the release of, and subsequent enhancements to, the CodeDeploy service for deploying applications to EC2). In many cases this is sufficient, especially for small companies that do not want to run their own deployment services. The major difference between the approaches is that CodeDeploy is not currently “immutable”; that is, it replaces the software in place as it installs new versions. We expect future changes to AWS services to further close the gap in functionality.


Private NPM Modules: A Song of Fire and Ice

Andrew Powell npm

Grab a coffee or pop and a snack; this is a long read.

Several years ago, Gilt Tech decided to adopt a new strategy for developing the overall Gilt platform. It was determined that many small applications should comprise the platform, rather than one (or a few) large monolithic applications. We called this strategy 'Lots of Small Applications,' or 'LOSA.' Key to LOSA were decentralized, distributed front-end assets. Enter NPM.

Our requirements for how this was to work were varied, and we authored build tools to facilitate them. Our build tools solved many problems that, at the time, didn't have solutions. We also knew we couldn't rely on external services to host our modules, because the registry these little guys were stored in needed to always be available. NPM was still a baby back then, and third-party SaaS providers weren't a thing for this yet. And so we spun up our own NPM registry in-house. This worked for years, and worked well - until one day it became obvious that we were hopelessly out of date.

June 2014: A framework for developing NodeJS applications at Gilt took shape, and the need to start installing modules from the public npm registry became real. The architects of the in-house registry had long since parted ways with Gilt, and we scrambled to try and update an aged system. We attempted to implement a proxy from our private registry to the public one. Things worked, but not well, and it was decided that we'd look to a third party for our registry needs. We eventually settled on a well-known company in the NodeJS space. We were able to migrate our internal registry to their service, and everything worked very, very well.

In February of this year we were informed that a very large entity, we'll call them "MovableUncle," had purchased the registry-hosting company and they would cease their registry business. We knew we'd have to move off of the platform and began carefully and cautiously considering alternatives. We were told that we'd have until December 2015 - and then someone rolled in the dumpster and lit that baby afire. The registry company experienced massive attrition, including all of their support staff, resulting in a near-complete inability to get support. Our data was locked there, and despite exhausting many channels of communication, we were unable to procure a backup archive of our data. Some amount of panic set in when their website was updated to show August 2015 as the cutoff date.

Without knowing for sure just when the service would go down, we knew it was time to act, and quickly. We identified the three most viable options for moving forward: hosting in-house again, going with another private registry SaaS provider that Gilt was already using for a different service, or going with the newly announced Private Modules offering from npm.

After much discussion and weighing the options against one another, we decided to bite the bullet and go with npm. That bullet meant a lot of changes internally. Their scoping mechanism, while elegant for installing and organizing modules, was an enormous pain point for Gilt - it meant we had to update the names and dependencies for 220 modules. We'd also be losing our history of published modules covering nearly four years. Representatives for npm explicitly told us there was no way to import a registry, even if we could get a backup out of the dumpster fire. That also meant that we'd have to republish 220 modules.

Ain't Got Time To Bleed, as August was fast approaching. This process required either a metric poop-ton of man-hours or some quasi-clever scripts. We put together two NodeJS scripts:

  1. npm-registry-backup would pull down a manual backup of our registry. That would mean iterating over all of our modules in the repo, fetching the npm info for each, and downloading the tarball for each revision.
  2. npm-scope would iterate through a target directory looking for package.json files, update the name by adding our scope, add the scope to any private modules in the dependencies property, and then finally publish the module to npm; a rough sketch of the idea follows this list. This script also allowed us to automagically update our apps that had private module dependencies.
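
Here is that sketch. It is illustrative only: the scope name, the naming convention used to spot private dependencies, and the publish step are hypothetical placeholders rather than the actual Gilt script.

// npm-scope sketch: walk a directory tree, find package.json files, prefix the package
// name with an npm scope, and do the same for any private dependencies we recognize.
var fs = require('fs');
var path = require('path');

var SCOPE = '@myorg';           // hypothetical scope
var PRIVATE_PREFIX = 'gilt-';   // hypothetical naming convention for private modules

function scopePackage(file) {
  var pkg = JSON.parse(fs.readFileSync(file, 'utf8'));
  if (pkg.name.indexOf(SCOPE + '/') !== 0) {
    pkg.name = SCOPE + '/' + pkg.name;
  }
  Object.keys(pkg.dependencies || {}).forEach(function (dep) {
    if (dep.indexOf(PRIVATE_PREFIX) === 0) {
      pkg.dependencies[SCOPE + '/' + dep] = pkg.dependencies[dep];
      delete pkg.dependencies[dep];
    }
  });
  fs.writeFileSync(file, JSON.stringify(pkg, null, 2) + '\n');
  // in the real script, `npm publish` of the rescoped module would follow here
}

function walk(dir) {
  fs.readdirSync(dir).forEach(function (entry) {
    if (entry === 'node_modules') { return; }
    var full = path.join(dir, entry);
    if (fs.statSync(full).isDirectory()) {
      walk(full);
    } else if (entry === 'package.json') {
      scopePackage(full);
    }
  });
}

walk(process.argv[2] || '.');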

We'll make those scripts publicly available in the coming week(s) once we're out of the forest. From start to finish the process took about 9 man-hours (3 hours / 3 fellas) for backup and update of 220 modules, distributed through 14 different git repositories. Getting everything just right across the org was another matter altogether. Switching module paradigms isn't a trivial thing when the org has relied on one model for a few years.

Knowing Is Half The Battle, and to say we learned a lot from this ordeal would be an understatement the size of Texas.

Build Tools

One of my favorite Gilt-isms is "It solved the problem at the time." Our primary build tool for front-end assets (named ui-build) was written a few years ago, when we still thought Rake was all the hotness. It's a brilliant piece of code and a study in Ruby polymorphism - unfortunately it's also infested with massively complex regular expressions and hardcoded assumptions about filesystem locations, and it's chock-full of black magic. "Black magic" is the term we use at Gilt for the things in ui-build that were written by people no longer with the company, and whose workings we don't fully understand. Once we updated ui-build to handle scoped modules, we learned that we had to republish around 80 modules, due to black magic. We require publishing of modules to go through the build tool so that things like linting and verification of certain data and standards are performed prior to actually publishing to npm. We learned in that process that things like automatic compilation of LESS and manipulation of vendor code are done SILENTLY.

While our next-generation, gulp-based build tool is being polished and rolled out, we've used this lesson to ensure that we don't have gaps in knowledge and documentation like we experienced with ui-build. We're also using this opportunity to explore how we can use the standard processes of npm to perform pre-publish or post-install steps and remove the need for black magic.
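
For instance, a build step that ui-build performs silently can instead be declared as an npm lifecycle script, visible to anyone reading the module. A minimal, hypothetical example (the module name and file paths are placeholders, not a real Gilt module):

{
  "name": "@myorg/example-module",
  "version": "1.0.0",
  "scripts": {
    "prepublish": "lessc src/styles.less dist/styles.css",
    "postinstall": "node scripts/check-environment.js"
  }
}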

Maintenance, Maintenance, Maintenance

Some of our apps are so out-of-date they might as well be wearing bell-bottoms and driving Gremlins. So, so, so many apps had dependencies on module versions that were a year+ old. Remember - we lost that revision history in the registry when we moved to npm. Things blew up in epic fashion when attempting builds of many of our less-frequently-maintained apps. Some of the dependencies were so stubborn that we had to republish the tarballs for old versions that we had pulled down from npm-registry-backup.

We need to be better at updating our apps. It doesn't always make cost-benefit sense to management, but it helps eliminate the technical debt that we paid for in man-hours during this process. On top of the original 9 man-hours, we had to spend roughly an additional 32 man-hours (8 hours / 4 fellas) on subsequent clean-up. There is progress on this front, however; Gilt Tech recently embarked on an internal ownership campaign called the 'Genome Project,' which should help address this concern.

Conclusion, or Potty Break

Overall the move was a success. We're still chasing the occasional edge case and build problem, but the major hurdles have been overcome. The improvement in speed and reliability of NPM over the SaaS registry host has been marked. We're able to use the latest features of the NPM registry and have regained the ability to unpublish, something we lost with the SaaS host. The technical improvements, coupled with the knowledge we've gained, have made the endeavor worthwhile. And unless "MovableUncle" inexplicably acquires npm, we've set the company up for stability in this area for some time to come.


Gilt Tech + Jekyll = ♥

Andrew Powell tech blog

Gilt Tech Blog Moves to Github

In continuing the march towards a more open Gilt Tech, we've made the switch to Github Pages and Jekyll from Tumblr.

You can view the entirety of this blog's code and content here:

Why We Did This

First and foremost: control. We've got control over the layout and design of the blog from top to bottom, without restriction and without any kind of funky theming engine to wrestle with. We're really good at this kind of thing (we should be, after all) and we've ended up with a blog design that matches elements of gilt.com.

We've also got better control over how we write posts. We're a fairly large, distributed organization, and coordinating the setup, permissions, et al. for Tumblr was kind of a pain. There was also very little positive feedback about the authoring experience. With the move, everyone is able to author in their editor of choice and go through the already familiar process of submitting the post to the tech-blog repository.

We Used Technology

Jekyll Bootstrap is a fantastic bootstrap/scaffolding package put together to quickly spin up a blog. While we customized the bejeezus out of it (which was incredibly quick and painless) and rolled our own 'gilt' theme, we used a healthy chunk of what it ships with by default. We did roll our own helpful tidbits, like the authors page, for which you can view the source here.

Github Pages. If you haven't looked into Github Pages for simple hosted websites or project websites yet, give it a gander. It's a phenomenal way to quickly spin up a site, and its Jekyll support is pretty great.


We hope you like the new digs, and we're forecasting a lot more great content to come.


Sean Smith to Present at Microservices Day!

Sean Smith conferences

What is Microservices Day?


Microservices Day is a conference by the enterprise, for the enterprise. It brings together people from organizations that have adopted (or are planning to adopt) a microservices approach to running their business, and it focuses on the issues those companies face. It gives participants a forum to discuss and share their experiences with microservices through advice, tips, and tricks, as well as driving forward both the technology and business communities.

Come join us at The Altman Building, 135 West 18th Street in New York, on July 13th! The event will consist of a range of talks from industry leaders who have deep experience building and operating microservices systems at scale. Panel discussions and active audience participation will also be part of the event, as the implications of enabling business to innovate at a faster pace than ever before will be a key discussion point. Come join the early adopters at the dawn of a new era in business computing!

Sean Smith is a software engineer from Canada, currently working at Gilt as a Principal Software Engineer. He has worked on a variety of services at Gilt, ranging from third-party warehouse systems integration to email content generation and delivery.

Register now for this great event!  See you on July 13th!


Introducing Gumshoe

Andrew Powell analytics

An analytics and event tracking sleuth.

In late 2014 Gilt received word that a core feature of Google Analytics (henceforth known as GA) would be removed in the next iteration of the platform. The ability to redirect collected data and events to a secondary endpoint had grown critical to our analysis of user behavior, patterns, and the effectiveness of our user experience, as well as to sales forecasting and predictive modeling. Our two most viable options for retaining this ability were to implement another tracking platform from a different vendor, or to create and implement our own, tailored to our specific needs.

After taking a hard look at the heavy hitters of the analytics platform sphere it was decided that we’d take the divergent route and roll our own analytics and event tracking library. We won’t go into detail on why we passed on big players like Snowplow, Omniture, CoreMetrics, et al. Rather, we’ll focus on Gumshoe and its merits.

What Did We Need?

To anyone in development, marketing, or data sciences on the web, our list of needs will sound quite familiar.

First and foremost we required parity with GA’s base page data. The industry has come to rely on, and has standardized around, the kind of data that GA collects for various reporting, including vital year-over-year reporting. So no shockers there.

As performance-minded developers we also had a low page footprint in mind.

We desperately needed organized event names and data. One major shortcoming of GA was the lack of enforcement of any kind of standard in event naming and data. Arbitrary event names and varying event data formats produced migraines the size of Mount Fuji for our Data Scientists. It wasn’t uncommon to have the same target event with several different names, in different applications, with different data in different delimited formats.

Related to the variance in event data, we were looking for a high degree of data integrity and confidence.

And last but not least, a low delivery failure or miss rate for the data collection.

Rolling Our Own

Events are events are events. In Gumshoe, everything is an event. There’s no distinction between page views, virtual page views, or custom user events. Everything is an event, and that’s a core tenet of Gumshoe.

Right off the bat, parity with GA was key. We started by breaking down the data that GA collected from each page view, and moved forward from there. A basic Gumshoe event consists of the following:


  • pageData consists of the GA parity data.
  • eventName is self-explanatory; the name of the event we’re passing.
  • eventData is custom data passed by the consumer which should be associated with an event.
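
For illustration, an event carrying those three pieces might look something like this; the fields inside pageData shown here are a simplified stand-in for the full GA-parity data, not Gumshoe’s exact schema.

// Illustrative only: a page-view event with GA-parity page data and consumer-supplied event data.
var event = {
  eventName: 'page.view',
  pageData: {
    url: 'https://www.example.com/sale/womens-shoes',
    referrer: 'https://www.example.com/',
    title: "Women's Shoes",
    viewportWidth: 1440,
    viewportHeight: 900
  },
  eventData: {
    saleId: 'abc123'   // arbitrary data the consumer wants associated with this event
  }
};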

Wunderbar! We had the recipe for sending a page view event. But how do we set this up so that others can easily send these events to wherever they’d like? Enter the transport.

Transports are the mechanism(s) by which consumers instruct Gumshoe on how it should send the data for each event. One can configure Gumshoe to send to multiple transports or a single transport; it’s your choice. A brief example from the repository:

(function (root) {

  var gumshoe = root.gumshoe;

  // register the transport with Gumshoe under a name consumers can refer to later
  gumshoe.transport({

    name: 'example-transport',

    send: function (data) {
      console.log('Gumshoe: Test Transport: Sending...');
    },

    map: function (data) {
      return {
        customData: {
          someData: true
        },
        ipAddress: ''
      };
    }

  });

})(this);



The send method is responsible for actually sending the data. Gumshoe bundles the reqwest.js library, which any transport can freely use. At Gilt we use reqwest.js to send the data to our backend event stack, which we’ll cover in subsequent blog posts.

The map method allows consumers to extend the data which Gumshoe is sending through the transport. At Gilt we use this method to attach a giltData object that is sent alongside pageData and contains vital Gilt-specific data for every single event that we send.

Using Gumshoe

Once you’ve got Gumshoe included on your page and you’ve registered a transport, you’ll need to initialize the library. It’s as simple as specifying the transport by name:

// tell gumshoe to use our transport  
window.gumshoe({ transport: 'example-transport' });  

And sending an event:

window.gumshoe.send('page.view', {});  

What’s Next

In forthcoming posts we’ll be talking more about the other goals we listed, specifically how we internally solved event organization, event naming standardization, data integrity, and the remainder of the stack that takes over once Gumshoe has delivered the payload.

We’ve been using Gumshoe on gilt.com for some time now, and have been running constant parity comparisons against GA; we’re very happy with the results thus far. However, we’d be bonkers to claim that Gumshoe is perfect. We’ve tailored Gumshoe to be extensible, but so far it’s only been targeted for use at Gilt. We’d love feedback from the community at large. And if you feel you might be able to use Gumshoe, but find it lacking, we’d love to talk about how it can be improved.


Welcome to our Summer Interns!

Ryan Caloras culture

Introducing Gilt Tech’s first official Summer Apprentice Program class! Here’s our class described in their own words :D

Alex Luo

  • Has no middle name
  • Lived for more than two years in three different countries (China, US, Israel)
  • Walks to class in the snow in April because he goes to Cornell
  • Once volunteered by distributing medicine to poorer families in rural Dominican Republic
  • Voted “best eyes” for high school yearbook
  • Ate a cricket once
  • Fixed a graphics card by baking it in the oven
  • Broke a bone from accidentally falling off a building
  • Got kicked out of the public library once for “hacking” a computer
  • Actually can believe it’s not butter
  • Named by Time Magazine as 2006’s Person of the Year
  • Purposefully uses Comic Sans in documents to tick people off

Courtney Ligh

Courtney is a native New Yorker studying Computer Science at Dartmouth College. She is a future software developer and is excited about her summer internship here with Gilt’s Development Team. Besides coding, Courtney has a passion for food and prides herself on knowing all the best eateries around the city.

Yogisha Dixit

The only thing you need to know about me is that I’m obsessed with chocolate. Literally…

Okay, maybe not. I’m interested in harnessing the power of computers to help make the world a better place. And I’m really looking forward to learning a lot and being challenged this summer.

That’s it. I swear.

And a very special Welcome Back to

Helena Wu

Helena was born and raised in São Paulo, Brazil. As a Computer Science student at Cornell University, she’s been on the Engineering Merit List and volunteers for different international organizations on-campus. When she’s not hacking on a project, you can find her rocking to metal songs! She also loves to stargaze in open fields on warm nights. She is excited for the summer and ready to explore NYC!


Is Agile a Competitive Advantage?

Heather Fleming agile

Registration is open for this Webinar on May 27th!

Summary below from

A new Harvard Business Review report asserts that agile development practices have steadily risen to become one of the most trusted and preferred methods of development across software teams in almost every industry. The study also discovered that by using agile frameworks, organizations can respond to market changes faster, deliver higher-quality software, and gain a significant competitive edge. For these reasons, HBR has dubbed agile development the competitive advantage for a digital age.

Join Jake Brereton, Sr. Brand Manager at Atlassian, as he discusses these HBR findings with Heather Fleming (Sr. Director, PMO, Gilt), Nate Van Dusen (Engineering Program Management Director, Trulia), and Maira Benjamin (Director of Engineering, Pandora); three seasoned agile experts who have successfully led agile transformations and witnessed tangible, positive results firsthand.

In this webinar you’ll learn:

  • How Trulia benefited from taking engineering agile principles and spreading them throughout their entire Enterprise
  • What agreements engineering made with Gilt leadership to improve their unique agile development process
  • The way Pandora’s agile approach to implementing agile allowed them to keep what worked and throw out what didn’t, quickly

Last Chance to Join the Meetup Tonight!!!

Heather Fleming conferences

Adrian Trenaman, VP of Engineering at Gilt, will be speaking tonight at our Gilt offices about how we adopted Solr to power our personalized search results!

We will be raffling off copies of Solr In Action after the presentation! Drinks and sandwiches will be served.

Registration closes soon, so sign up now!  It’s free!

Gilt Security requires all reservations to have a FIRST AND LAST name in order to enter the building.  Please make sure you include this when you RSVP.


Importing Google Trends data

Igor Elbert analytics

Google Trends offers a trove of data for analysis. It’s not used nearly enough, partially because the good folks at Google did not provide an API to access the data. You can play with Trends in your browser and embed it into your web pages, but it’s not that simple to get the raw data behind it for use in your own analysis.

There are a number of packages in Python, Perl, and R that pull the data for you, but none of them did what I needed: compare hundreds of trends against each other.

You see, not content with the lack of an API, Google returns trends on a 1-to-100 scale, so it’s hard to compare numbers across many different trends. You can plot several trends on the same graph, but you will not be able to tell how they stand relative to another set of trends.

For example:


Above: “Game of Thrones” vs. “House of Cards”.
Below: “Orange Is the New Black” vs. “The Newsroom”

Since each set is rescaled, there is no way to tell how “House…” stands against “Orange…”.


I needed to get the trends for hundreds of fashion brands Gilt deals with and compare them against each other.

The logical solution seemed to be to use one search term as a baseline in every set and then rescale the results relative to that baseline term.
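
The rescaling itself is simple arithmetic. Here is a sketch of the idea, in JavaScript purely for illustration (the actual work, described below, was done in R): every query batch includes the same baseline term, so values from different batches can be put onto one common scale by dividing through the baseline’s peak in each batch.

// Rescale one batch of Google Trends series so that the shared baseline term peaks at 100.
// Batches rescaled this way against the same baseline become directly comparable.
function rescaleBatch(batch, baselineTerm) {
  var baselinePeak = Math.max.apply(null, batch[baselineTerm]);
  var scaled = {};
  Object.keys(batch).forEach(function (term) {
    scaled[term] = batch[term].map(function (v) {
      return (v / baselinePeak) * 100;
    });
  });
  return scaled;
}

// e.g. rescaleBatch(batchA, 'kate spade') and rescaleBatch(batchB, 'kate spade')
// can now be compared term-to-term across batches.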

I borrowed heavily from the GTrendsR package and came up with an R script that pulls trends and rescales them relative to the baseline.

It took some hacking: I took Google Trends export link that looks something like,+/m/0tlwzvq,+/m/0b6hm_f,+/m/0c5_m3,+/m/0f4w93&geo=US&date=2/2013+25m&cmpt=q&tz&tz&content=1&cid=TIMESERIES_GRAPH_0&export=5

and after looking at various export options I found the one (export=3) that returns the raw data as a JSON-like string. For example:


// Data table response
 {"id":"query0","label":"Agent Provocateur","type":"number","pattern":""},
 {"id":"query1","label":"Kate Spade","type":"number","pattern":""}
 {"v":new Date(2015,3,18),"f":"Saturday, April 18, 2015"},
 {"v":new Date(2015,3,19),"f":"Sunday, April 19, 2015"},
 {"v":new Date(2015,3,20),"f":"Monday, April 20, 2015"},
 {"v":new Date(2015,3,21),"f":"Tuesday, April 21, 2015"},
 {"v":new Date(2015,3,22),"f":"Wednesday, April 22, 2015"},
 {"v":new Date(2015,3,23),"f":"Thursday, April 23, 2015"},
 {"v":new Date(2015,3,24),"f":"Friday, April 24, 2015"},

This is turned into valid JSON with four lines of R code and then parsed into R data structures with the rjson package.

The result can now be analyzed, plotted, and joined with other data.


Thanks to the power of SQL/MapReduce in our Teradata Aster database, we can pull the trends and join them with our relational data on the fly in SQL:

      ON (SELECT brand_name FROM dim_brands)
      PARTITION BY brand_name
      OUTPUTS('week DATE, term VARCHAR, trend INTEGER')

Any comments on Google Trends, the R style, and the overall approach are appreciated.

Happy trending!
