The Gilt technology organization. We make gilt.com work.

Gilt Tech

Akka HTTP Talk with Cake Solutions

meetups

We are thrilled to be hosting Aleksandr Ivanov of Cake Solutions on Tuesday, April 12th. He’ll be presenting an excellent talk on Akka HTTP. Who is Aleksandr? We’re glad you asked:

Aleksandr Ivanov is a senior software engineer at Cake Solutions, one of the leading European companies building Reactive Software with Scala. Scala has been his main language of choice since 2011. He's taken part in various projects, from developing backends for trivial web applications to enterprise-level reactive systems and machine learning apps.

Besides engineering, he takes an active part in the life of the developer community, giving talks at local meetups, writing articles and helping others on the mailing list, Gitter and Stack Overflow. He's always ready to hear about and discuss interesting projects and events, share his experience or simply have a nice conversation over a drink.

Refreshments will be served!

Important: You must provide First Name & Last Name in order to enter the building.

Please RSVP!

Tags: meetup, cake solutions, akka, akka http, scala

Urgency vs. Panic

Hilah Almog tech blog


My first initiative as a product manager at Gilt was something called "Urgency". It was formed under the premise that Gilt customers had become numb to the flash model, and that we in tech could find ways to reinvigorate the sense of urgency that once existed while shopping at Gilt: the noon rush, wherein products were flying off the virtual shelves and customers knew that if they liked something, they had precious few minutes to purchase it before it'd be gone forever. I came into this initiative not only new to Gilt but also new to e-commerce, and I felt an acute sensitivity towards the customer.

At Gilt we walk a fine line between creating urgency and inciting panic, and it's something I personally grappled with continuously. The former's outcome is positive: the shopping experience becomes gamified, and the customer's win is also ours. The latter's outcome is negative: the customer has a stressful and unsuccessful shopping experience, and then churns. This fine line meant that we as a team couldn't just conceive of features; we also had to find the perfect logical balance as to when they should appear – and more importantly, when they shouldn't.

Cart Reservation Time

Our first feature was reducing the customer’s reservation time by half once they add a product to their cart. This tested well, but felt mean. We therefore held its release until we could build a product marketing campaign around it that communicated the shorter time as an effort in fairness: “if other customers can’t hoard without real intention to buy, then you get the most coveted products faster”. The customer service calls ended once our shoppers felt the feature was for their protection, not harm.

Live Inventory Badging

We wanted to continue running with this theme of helpful urgency, leading us to our second feature: live inventory badges. When we have less than 3 of any given item, a gold badge appears atop the product image saying “Only 3 Left”. It then animates in real time as inventory of that item changes. If you are ever on Gilt right at noon, notice how our sales come alive through these badges. Unlike the cart reservation time, this feature felt like a one-two punch. Not only were we creating urgency, but we were also giving the customer something they rarely get while shopping online – a view of the store shelf.

Timer in Nav + Alerts

Our third feature was our biggest challenge with regard to striking the right balance between urgency and panic. We added a persistent cart timer in the navigation, showing how much of your aforementioned five-minute reservation had elapsed. The timer's partner in crime is an alert, in the form of a banner that appears at the bottom of the page when only a minute is left on your item's reservation, urging you to check out before it's gone.

To keep ourselves on the right side of the line, we implemented stringent rules around when this banner could appear, limiting it to low-inventory products (fewer than 3 left in your size) and to once per session.

Live Views

We faced an altogether different challenge when it came to our final feature, live product views. Here, the feature itself wasn't strong enough on its own; the view counts had to carry their weight. We again were forced to think through very specific thresholds, depending on inventory levels and view count, in order to determine under which circumstances we show the feature and under which we hide it.

Each of these features was tested individually, and each yielded positive results. After they were all released, we saw a combined 4% increase in our key performance indicators on revenue within the first hour of a sale. The line was walked successfully, without panic but with the intended effect. And to our customers we say: because you're mine, I walk the line.

Tags: KPI, urgency, panic, shopping

Breaking the Mold: Megaservice Architecture at Gilt

Adrian Trenaman aws

Today we announce a novel approach to software and system architecture that we've been experimenting with for the last while at Gilt: internally, we've been referring to it as 'mega-service' architecture, and the name seems to have stuck. We're pretty excited about it, as it represents a real paradigm shift for us.

In a mega-service architecture, you take all your code and you put it in one single software repository, the mega-service. There are so many advantages to having a single repository: only one code-base; no confusion where anything is; you make a change - it’s done, and will go out with the next deploy. It all compiles, from source, 100% of the time at least 50% of the time. Software ownership is a perpetual challenge for any tech organisation: in the mega-service model, there are many, many owners which means of course that the code is really, really well owned.

The mega-service is deployed to one really big machine: we prefer to run this in our own ‘data centre’ as we believe we can provision and run our hardware more reliably and cost-effectively than existing cloud players. The benefits of having a mega-service application are manifold: there’s one way to do everything and it’s all automated; code instrumentation, configs and metrics are all consistently applied, and, all eyes are on the same project, scripts and code, so people are more familiar with more parts of the system.

We’ve abandoned the sophisticated distributed code control mechanisms of recent lore in favour of a big ‘directory’ hosted on a shared ‘file server’. We’ve resorted to an optimistic, non-locking, non-blocking, zero-merge, high-conflict algorithm called ‘hope’ for contributing code changes: we copy the changes into the directory, and then ‘hope’ that it works. Rather than work with multiple different programming languages and paradigms, we’ve settled on an ‘imperative’ programming style using a framework we’ve recently adopted called Dot Net. Aligning previous lambda-based actor-thinking to a world of mutable variables, for-loops and ‘threads’ has not been easy for us; however, we suspect that the challenges and difficulties we’re experiencing are mere birthing pains and a clear sign that we’re heading in the right direction: if it’s hard, then we must be onto something.

This new architectural approach is an optimization on Neward’s ‘Box-Arrow-Box-Arrow-Cylinder’ pattern, reduced to a profoundly simple ‘Box -Arrow-Cylinder’ diagram (despite forming an elegant visual, the solution is just slightly too large to fit in the margin). We typically draw a box (our monolithic code) on top of a cylinder (our monolithic database), both connected with a line of some fashion; however, some have drawn the box to the left, right or bottom of the cylinder depending on cultural preference. Distinguished Engineers at Gilt have postulated a further simplification towards a single ‘lozenge’ architecture incorporating both code and data store in a single lozenge: while that architecture is theoretically possible, current thinking is that it is unlikely that we will get to prototype this within the next ten years.

New architectures require new thinking about organisational structure: everything so far points to a need for a software organisation of about five Dunbars in size to maintain our code-base, structured with a golden-ratio proportion of about eight non-engineering staff to every five engineers. Additionally, the benefits of really thinking about and formalizing requirements, following through with formal design, code and test over long periods, in a style we refer to as 'Radical Waterfall', bring us to a rapid release cycle of one or two releases per solar year.

While most readers will be familiar with open-source contributions from Gilt on http://github.com/gilt and our regular talks and meetups, the innovations described in this post are subject to patent, and available through a proprietary licence and submission of a non-disclosure agreement. We'll be releasing details of same in our next blog post, due for publication a year from now on April 1st, 2017.

Tags: aws, codedeploy, newrelic, notifications, micro-services, april-fool

Front End Engineering Lightning Talks with HBC Digital

meetups

Join us for an evening of lightning talks by four of HBC Digital's Front End Engineers, with an introduction by Steve Jacobs, SVP, Digital Technology and Demand Marketing.

  • Ozgur Uksal - Front End Engineer: Working with Typescript
  • Lei Zhu - Front End Engineer: Redux
  • Norman Chou - Front End Engineer: Using React router
  • Rinat Ussenov - Front End Engineer: Juggling bits in Javascript

Refreshments will be served!

Important: You must provide First Name & Last Name in order to enter the building.

Please RSVP!

Tags: meetup, hbc digital, frontend, typescript, redux, react, javascript

OSX, Docker, NFS and packet filter firewall

Andrey Kartashov docker

The Mobile Services team at Gilt uses Docker to both build and run software. In addition to the usual Docker benefits for software deployments, moving toolchains to Docker has a few advantages:

  • it’s easy to (re)create a development environment
  • the environment is preserved in a stable binary form (libs, configs, CLI tools, etc. don't bit-rot as the main OS packages or the OS itself evolve)
  • it's easy to support multiple divergent environments where different versions of tools/libs are the default, e.g. Java 7/8, Python, Ruby, Scala, etc. (a minimal toolchain image is sketched below)
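
To make the idea concrete, here is a rough sketch of what such a toolchain image might look like. The base image, packages and build entry point are illustrative assumptions, not our actual setup:

# Hypothetical toolchain image: the whole build environment pinned as one artifact
FROM java:8-jdk
RUN apt-get update && \
    apt-get install -y --no-install-recommends git curl make && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /workspace
# mount the project and run its build inside the container, e.g.
#   docker run --rm -v "$PWD":/workspace toolchain ./build.sh
CMD ["./build.sh"]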

We develop primarily on OSX, but since Docker is a Linux-specific tool, we must use docker-machine and VirtualBox to actually run it. Toolchains rely on having access to the host OS's filesystem, and by default /Users is exposed in the Docker VM. Unfortunately, the default setup uses VBOXFS, which is very slow. This can be really painful when building larger projects or relying on build steps that require a lot of IO, such as the sbt-assembly plugin.

Here’s a great comparison of IO performance.

There’s really no good solution for this problem at the moment, but some folks have come up with a reasonable hack: use NFS.

One of them was even nice enough to wrap it up into a shell script that “just works”.
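
The gist of what such a script automates looks roughly like the following; the export options, uid/gid and docker-machine address are examples, not a drop-in configuration:

# /etc/exports on the OSX host: export /Users to the docker-machine VM only
/Users -alldirs -mapall=501:20 192.168.99.100

# /etc/nfs.conf: accept mounts from non-privileged ports (the VM's NFS client uses them)
nfs.server.mount.require_resv_port = 0

After that, the script restarts the NFS daemon with sudo nfsd restart and remounts /Users over NFS inside the docker-machine VM.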

Indeed, with NFS enabled, project build times begin to approach "native" speeds, so it's tempting. The issue with NFS continues to be its aging design and its intent to function in a trusted network environment where access is given to hosts, not to authenticated users. While this is a reasonable access model for secure production networks, it's hard to guarantee anything about the random networks you may have to connect to with your laptop, and having /Users exposed via NFS on untrusted networks is a scary prospect.

OSX has not one but two built-in firewalls. There's a simplified app-centric firewall available from the Preferences panel. Unfortunately, all it can do is either block all NFS traffic (the docker VM can't access your exported file system) or open up NFS traffic on all interfaces (insecure), so it doesn't really work for this case.

Fortunately, under the hood there's also a much more flexible built-in packet-level firewall that can be configured. It's called PF (packet filter) and its main CLI tool is pfctl. Here's a nice intro.

With that, one possible solution is to disable the firewall in the Preferences panel and add this section at the end of the /etc/pf.conf file instead:

# Do not filter anything on private interfaces
set skip on lo0
set skip on vboxnet0
set skip on vboxnet1

# Allow all traffic between host and docker VM
table <docker> const { 192.168.99/24 }
docker_if = "{" bridge0  vboxnet0  vboxnet1 "}"
pass quick on $docker_if inet proto icmp from <docker> to <docker>
pass quick on $docker_if inet proto udp from <docker> to <docker> keep state
pass quick on $docker_if inet proto tcp from <docker> to <docker> keep state

# Allow icmp
pass in quick inet  proto icmp
pass in quick inet6 proto ipv6-icmp

# Bonjour
pass in quick proto udp from any to any port 5353

# DHCP Client
pass in quick proto udp from any to any port 68

# Block all incoming traffic by default
block drop in
pass out quick
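
To load the new rules immediately, without waiting for a reboot, you can enable PF and point it at the updated config:

sudo pfctl -E -f /etc/pf.conf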

Then turn it on at system boot time by adding /Library/LaunchDaemons/com.yourcompany.pfctl.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Disabled</key>
	<false/>
	<key>Label</key>
	<string>com.yourcompany.pfctl</string>
	<key>WorkingDirectory</key>
	<string>/var/run</string>
	<key>Program</key>
	<string>/sbin/pfctl</string>
	<key>ProgramArguments</key>
	<array>
		<string>pfctl</string>
		<string>-E</string>
		<string>-f</string>
		<string>/etc/pf.conf</string>
	</array>
	<key>RunAtLoad</key>
	<true/>
</dict>
</plist>

And enable it by running:

sudo launchctl load -w /Library/LaunchDaemons/com.yourcompany.pfctl.plist

The main difference from /System/Library/LaunchDaemons/com.apple.pfctl.plist here is the addition of the -E parameter, which enables PF.

You can check that it starts by default after a reboot with

sudo pfctl -s info

And check the rules with

sudo pfctl -s rules

It should be ‘Enabled’ and you should see the configured rules.

You can verify your overall setup by running nmap from a different node against your laptop, e.g.

sudo nmap -P0 -sU  YOUR_IP
sudo nmap -P0 -sT  YOUR_IP

And check for open ports.

After that configuration you should see a noticeable improvement in Docker performance for filesystem-intensive workloads. Hopefully this will no longer be necessary in future versions of the Docker VM, so check the docs to be sure.

Tags: docker, osx, nfs, firewall

gulp-scan • Find Yourself Some Strings

Andrew Powell gulp

We recently ran across the need to simply scan a file for a particular term during one of our build processes. Surprisingly enough, we didn't find a Gulp plugin that performed only that one simple task. And so gulp-scan was born and now resides on npmjs.org.

Simply put - gulp-scan is a Gulp plugin to scan a file for a particular string or (regular) expression.

Setting Up

As per usual, you’ll have to require the module.

var gulp = require('gulp');
var scan = require('gulp-scan');

Doing Something Useful

gulp.task('default', function () {
	return gulp.src('src/file.ext')
		.pipe(scan({ term: '@import', fn: function (match) {
			// do something with {String} match
		}}));
});

Or if regular expressions are more your speed:

gulp.task('default', function () {
	return gulp.src('src/file.ext')
		.pipe(scan({ term: /\@import/gi, fn: function (match) {
			// do something with {String} match
		}}));
});

Pretty simple. There's always room for improvement, and we welcome contributions on GitHub.

Tags: gulp, javascript

Codedeploy Notifications as a Service

Emerson Loureiro aws

After moving our software stack to AWS, some of us here at Gilt have started deploying our services to production using AWS's Codedeploy. Before that, in a not-so-distant past, we used an in-house tool for deployments - IonCannon. One of the things IonCannon provided was deployment notifications. In particular, it would:

  1. Send an email to the developer who pushed the deployment, for successful and failed deployments;
  2. Send a new deployment notification to Newrelic;
  3. Optionally, send a Hipchat message to a pre-configured room, also for successful and failed deployments.

These notifications had a few advantages.

  1. If you - like me - prefer to switch to something else while the deployment is ongoing, you would probably want to be notified when it is finished; i.e., “don’t call me, I’ll call you”-sort of thing. The email notifications were a good fit for that;
  2. Having notifications sent via more open channels, like Newrelic and Hipchat, meant that anyone in the team - or in the company, really - could quickly check when a given service was released, which version was released, whether it was out on a canary or on all production nodes, etc. In Newrelic, in particular, one can see, for example, all deployments for a given time range and filter out errors based on specific deployments. This can come in handy when trying to identify a potentially broken release.

Codedeploy, however, doesn't provide anything out of the box for deployment notifications. With that in mind, we started looking at the different options available to achieve that. For example, AWS itself has the necessary components to get that working - e.g., SNS topics and Codedeploy hooks - but that means you have to do the gluing between your application and those components yourself and, with Codedeploy hooks in particular, on an application-by-application basis. Initially, what some of us did was a really simple Newrelic deployment notification, by hitting Newrelic's deployment API from the Codedeploy healthcheck script. This approach worked well for successful deployments: because the healthcheck script is the last hook called by Codedeploy, it was safe to assume the deployment was successful. It was also good for realtime purposes, i.e., the deployment notification would be triggered at the same time as the deployment itself.
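
That early hook-based approach amounted to little more than a single HTTP call at the end of the healthcheck script, along these lines (the application ID, API key and revision values are placeholders; the request shape is that of Newrelic's public v2 deployments API):

curl -X POST "https://api.newrelic.com/v2/applications/$NEWRELIC_APP_ID/deployments.json" \
     -H "X-Api-Key: $NEWRELIC_API_KEY" \
     -H 'Content-Type: application/json' \
     -d '{ "deployment": { "revision": "'"$DEPLOYMENT_ID"'", "user": "codedeploy" } }'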

Despite that, one can easily think of more complicated workflows. For example, let’s say I want to notify on failed deployments now. Since a failure can happen at any stage of the deployment, the healthcheck hook will not even be called in those cases. Apart from failed deployments, it’s reasonable to think about notifications via email, SNS topics, and so on. All of that essentially means adding various logic to different Codedeploy hooks, triggering the notifications “manually” from there - which for things like sending an email isn’t as simple as hitting an endpoint. Duplication of that logic across different services is then inevitable. An alternative to that would be Cloudtrail and a Lambda. However, given the delay for delivering Cloudtrail log files to S3, we would lose too much on the realtime aspect of the notifications. One good aspect of this approach, though, is that it could handle different applications with a single Lambda.

So, the ideal approach here would be one that could deliver realtime notifications - or as close to that as possible - and handle multiple Codedeploy applications. Given that, the solution we have been using to some extent here at Gilt is to provide deployment notifications in a configurable way, as a service, by talking directly to Codedeploy. Below is a high-level view of our solution. In essence, our codedeploy notifications service gets deployments directly from Codedeploy and relies on a number of different channels - e.g., SNS, SES, Newrelic, Hipchat - for sending out deployment notifications. These channels are implemented and plugged in as we need them, so they are not really part of the core of our service. DynamoDB is used for persisting registrations - more on that below - and successful notifications, to prevent duplicates.

(Figure: high-level view of the codedeploy notifications service.)

We have decided to require explicit registration for any application that we want deployment notifications for. There are two reasons for doing this. First, our service runs in an account where different applications - from different teams - are running, so we wanted the ability to select which of those would have deployment notifications triggered. Second, as part of registering the application, we wanted the ability to define over which channels those notifications would be triggered. So our service provides an endpoint that takes care of registering a Codedeploy application. Here's what a request to this endpoint looks like:

curl -H 'Content-type: application/json' -X POST -d '{ "codedeploy_application_name": "CODE_DEPLOY_APPLICATION_NAME", "notifications": [ { "newrelic_notification": { "application_name": "NEWRELIC_APPLICATION_NAME" } } ] }' 'http://localhost:9000/registrations'

This will register a Codedeploy application and set it up for Newrelic notifications. The Codedeploy application name - CODE_DEPLOY_APPLICATION_NAME above - is used for fetching deployments, so it needs to be the exact name of the application in Codedeploy. The Newrelic application name - NEWRELIC_APPLICATION_NAME - on the other hand, is used to tell Newrelic which application the deployment notification belongs to. Even though we have only illustrated a single channel above, multiple ones can be provided, each always containing setup specific to that channel - e.g., an SMTP server for emails, a topic name for SNS.

For each registered application, the service then queries Codedeploy for all deployments across all of their deployment groups, for a given time window. Any deployment marked as successful will have a notification triggered over all channels configured for that application. Each successful notification is then saved in Dynamo. That's done on a scheduled basis - i.e., every time a pre-configured amount of time passes, the service checks again for deployments. This means we can make the deployment notifications as close to realtime as we like, by just adjusting the scheduling frequency.
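
The polling itself is plain AWS SDK usage. The snippet below is not the internals of our service, just a sketch of the kind of query it runs on each tick; the application and deployment group names, the window size and the final notification step are placeholders:

import java.util.Date
import scala.collection.JavaConverters._
import com.amazonaws.services.codedeploy.AmazonCodeDeployClient
import com.amazonaws.services.codedeploy.model.{DeploymentStatus, ListDeploymentsRequest, TimeRange}

val codedeploy = new AmazonCodeDeployClient()

// look at the last five minutes of successful deployments for one application/group
val windowStart = new Date(System.currentTimeMillis() - 5 * 60 * 1000)
val request = new ListDeploymentsRequest()
  .withApplicationName("CodedeployApplicationName")
  .withDeploymentGroupName("production")
  .withIncludeOnlyStatuses(DeploymentStatus.Succeeded)
  .withCreateTimeRange(new TimeRange().withStart(windowStart).withEnd(new Date()))

// each deployment id not yet recorded in Dynamo would be fanned out to the configured channels
codedeploy.listDeployments(request).getDeployments.asScala.foreach { deploymentId =>
  println(s"would notify for deployment $deploymentId")
}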

Our idea of a notification channel is completely generic. In other words, it's independent of the reason behind the notification. In that sense, it would be perfectly possible to register Newrelic notifications for failed deployments - even though in practical terms that would be a bit of nonsense, given Newrelic notifications are meant for successful deployments only. We leave it up to those registering their applications to make sure the setup is sound.

Even though we have talked about a single service doing all of the above, our solution, in fact, is split into two projects. One is a library - codedeploy-notifications - which provides an API for adding registrations, listing Codedeploy deployments, and triggering notifications. The service is then separate, simply integrating with the library. For example, for the registration endpoint we described above, the service uses the following API from codedeploy-notifications under the hood.

val amazonDynamoClient = ...
val registrationDao = new DynamoDbRegistrationDao(amazonDynamoClient)
val newRelicNotificationSetup = NewrelicNotificationSetup("NewrelicApplicationName")
val newRegistration = NewRegistration("CodedeployApplicationName", Seq(newRelicNotificationSetup))
registrationDao.newRegistration(newRegistration)

Splitting things this way means that the library - being free from anything Gilt-specific - can be open-sourced much more easily. It also gives users the freedom to choose how to integrate with it. Whereas we currently have it integrated with a small dedicated service, on a T2 Nano instance, others may find it better to integrate it with a service responsible for doing multiple things. Even though the service itself isn't something open-sourcable - as it would contain API keys, passwords, and such - and it's currently owned by one team only, it is still generic enough that it can be used by other teams.

We have been using this approach for some of our Codedeploy applications, and have been quite happy with the results. The notifications are triggered with minimal delay - within a couple of seconds for the vast majority of deployments and under 10 seconds in the worst-case scenarios. The codedeploy-notifications library is open source from day 1 and available here. It currently supports Newrelic notifications for successful deployments, and there is ongoing work to support emails as well as notifications for failed deployments. Suggestions, comments, and contributions are, of course, always welcome.

Tags: aws, codedeploy, newrelic, notifications

Using tvOS, the Focus Engine & Swift to build Apple TV apps

Evan Maloney tvOS

At the Apple product launch event in September, Gilt demonstrated its upcoming “Gilt on TV” app for the Apple TV. (If you missed the demo and would like to see it, Apple has the video available on their website. Eddy Cue’s introduction of Gilt begins at the 74:20 mark.)

Last night, at the iOSoho meetup, I presented some of the things our team learned while developing that application, such as:

  • Similarities and differences with iOS development
  • The tvOS interaction model: What is “focus” and how does it work?
  • Interacting with the Focus Engine to set up your initial UI state
  • How the Focus Engine sees your views onscreen, and how that affects navigation
  • Using focus guides to make your UI easier to navigate
  • Swift: Is it ready for full-scale app development?

Slides from the talk are available here.

Tags: Apple TV, software development, Swift, Focus Engine

ION-Roller - Immutable software deployment service for AWS

Natalia Bartol aws deployment

Gilt has been at the forefront of the wave of microservice architecture. With the advantage of many individual services that do one thing well comes increased complexity of service management.

Our previous system was based on a reasonably traditional model: deployment scripts that simply replaced software on machines in order, performing health checks before proceeding to the next server. A number of issues caused friction in attempts to release software. One was that, in spite of services being run as microservices, many deployment decisions had to be made in lockstep, due to the shared platform. Another was that services that took a long time to start up (cache loads, etc.) had a large cost when the service was stopped; if a new release had to be rolled back, this could take more time than people might like.

Two ideas for software deployment arose from this. One was to separate the environments for each piece of software, and due to this, Gilt became an early leader in trials of the Docker deployment platform. Another idea was inspired by the principles of functional programming, where data was not changed in-place, but was modified by creating a new copy with the changes applied. This idea, “immutable infrastructure”, could allow new software releases, without the risk attached to shutting down the previous software version, as rollbacks would be quick and painless.

A third constraint appeared, not core to the concept but based on the realities of Gilt's deployment environments: AWS services, and the EC2 computing platform, were going to be central to any service we created.

ION-Roller took these ideas and combined them into a deployment tool for immutable releases of software, using Docker environments, on top of AWS infrastructure.

Following our article about deploying microservices published in InfoQ, we have now made our work publicly available: https://github.com/gilt/ionroller. Check out a short demo!

Since development of the product started, Amazon has released a number of features in their own products (including the release of, and then enhancements to, the CodeDeploy service for deploying applications with EC2). In many cases, this is sufficient, especially for small companies which do not want to run their own deployment services. The major difference between the approaches is that CodeDeploy is not currently “immutable”, that is, it replaces the software as it installs new versions. We expect future changes to AWS services to further close the gap in functionality.

Tags: Gary Coady, Natalia Bartol, aws, docker, deployment

Private NPM Modules: A Song of Fire and Ice

Andrew Powell npm

Grab a coffee or pop and a snack, this is a read.

Several years ago, Gilt Tech decided to adopt a new strategy for developing the overall Gilt platform. It was determined that many small applications should make up Gilt.com, rather than one (or a few) large monolithic applications. We called this strategy 'Lots of Small Applications,' or 'LOSA.' Key to LOSA were decentralized, distributed front-end assets. Enter NPM.

Our requirements for how this was to work were varied, and we authored build tools to facilitate them. Our build tools solved many problems that, at the time, didn't have solutions. We also knew we couldn't rely on external services to host our modules, because the registry these little guys were stored in needed to always be available. NPM was still new, and npmjs.org was still a baby back then. Third-party SaaS providers weren't a thing for this yet. And so we spun up our own NPM registry in-house. This worked for years, and worked well - until one day it became obvious that we were hopelessly out of date.

June 2014: A framework for developing NodeJS applications at Gilt took shape, and the need to start installing public modules from npmjs.org became real. The architects of the in-house registry had long since parted ways with Gilt, and we scrambled to try to update an aged system. We attempted to implement a proxy from our private registry to the public one. Things worked, but not well, and it was decided that we'd look to a third party for our registry needs. We eventually settled on a well-known company in the NodeJS space. We were able to migrate our internal registry to their service, and everything worked very, very well.

In February of this year we were informed that a very large entity, we’ll call them “MovableUncle,” had purchased the registry-hosting company and they would cease their registry business. We knew we’d have to move off of the platform and began carefully and cautiously considering alternatives. We were told that we’d have until December 2015 - and then someone rolled in the dumpster and lit that baby afire. The registry company experienced massive attrition, including all of their support staff, resulting in a near-complete inability to get support. Our data was locked there, and despite exhausting many channels of communication, we were unable to procure a backup archive of our data. Some amount of panic set in when their website was updated to show August 2015 as the cutoff date.

Without knowing for sure just when the service would go down, we knew it was time to act, and quickly. We identified the three most viable options for moving forward: Hosting in-house again, going with another private registry SaaS provider that Gilt was already using for a different service, or going with the newly announced Private Modules offering on npmjs.org.

After much discussion and weighing the options against one another, we decided to bite the bullet and go with npmjs.org. That bullet meant a lot of changes internally. Their scoping mechanism, while elegant for installing and organizing modules, was an enormous pain point for Gilt - it meant we had to update the names and dependencies of 220 modules. We'd also be losing our history of published modules covering nearly four years. Representatives for npmjs.org explicitly told us there was no way to import a registry, even if we could get a backup out of the dumpster fire. That also meant that we'd have to republish 220 modules.

Ain't Got Time To Bleed, as August was fast approaching. This process required either a metric poop-ton of man-hours or some quasi-clever scripts. We put together two NodeJS scripts:

  1. npm-registry-backup would pull down a manual backup of our registry. That meant iterating over all of our modules in the repo, fetching the npm info for each, and downloading the tarball for each revision (roughly as sketched after this list).
  2. npm-scope would iterate through a target directory looking for package.json files, update each name by adding our npmjs.org scope, add the scope to any private modules in the dependencies property, and then finally publish the module to npmjs.org. This script also allowed us to automagically update our apps that had private module dependencies.
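
In spirit, the backup step boils down to a couple of npm commands per module; the registry URL and the package/version variables are placeholders:

# list every published version of a module in the old registry
npm view --registry "$OLD_REGISTRY" "$PKG" versions --json

# for each version, resolve the tarball URL and download it
TARBALL=$(npm view --registry "$OLD_REGISTRY" "$PKG@$VERSION" dist.tarball)
curl -sfO "$TARBALL"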

We’ll make those scripts publicly available in the coming week(s) once we’re out of the forest. From start to finish the process took about 9 man-hours (3 hours / 3 fellas) for backup and update of 220 modules, distributed through 14 different git repositories. Getting everything just right across the org was another matter altogether. Switching module paradigms isn’t a trivial thing when the org has relied on one model for a few years.

Knowing Is Half The Battle, and to say we learned a lot from this ordeal would be an understatement the size of Texas.

Build Tools

One of my favorite Gilt-isms is "It solved the problem at the time." Our primary build tool for front-end assets (named ui-build) was written a few years ago, when we still thought Rake was all the hotness. It's a brilliant piece of code and a study in Ruby polymorphism - unfortunately, it's infested with massively complex regular expressions and hardcoded assumptions about filesystem locations, and it's chock-full of black magic. "Black magic" is the term we use at Gilt for all of the things in ui-build that were written by people no longer with the company and whose behavior we don't fully understand. Once we updated ui-build to handle scoped modules, we learned that we had to republish around 80 modules, due to black magic. We require publishing of modules to go through the build tool so that things like linting and verification of certain data and standards are performed prior to actually publishing to npm. We learned that in that process, things like automatic compilation of LESS and manipulation of vendor code are done SILENTLY.

While our next-generation, gulp-based build tool is being polished and rolled-out, we’ve used this lesson to ensure that we don’t have gaps in knowledge and documentation like we experienced with ui-build. We’re also using this opportunity to explore how we can use the standard processes of npm to perform pre-publish or post-install steps and remove the need for black magic.

Maintenance, Maintenance, Maintenance

Some of our apps are so out-of-date they might as well be wearing bell-bottoms and driving Gremlins. So so so many apps had dependencies on module versions that were a year+ old. Remember - we lost that revision history in the registry when we moved to npmjs.org. Things blew up in epic fashion when attempting builds of many of our less-frequently-maintained apps. Some of the dependencies were so stubborn that we had to republish the tarballs for old versions that we had pulled down from npm-registry-backup.

We need to be better at updating our apps. It doesn't always make cost-benefit sense to management, but it helps eliminate technical debt that we paid for in man-hours during this process. On top of the original 9 man-hours, we had to spend roughly an additional 32 man-hours (8 hours / 4 fellas) on subsequent clean-up. There is progress on this front, however; Gilt Tech recently embarked on an ownership campaign internally called the 'Genome Project' which should help address this concern.

Conclusion, or Potty Break

Overall the move was a success. We're still chasing the occasional edge case and build problem, but the major hurdles have been overcome. The improvement in speed and reliability of NPM over the SaaS registry host has been marked. We're able to use the latest features of the NPM registry and have regained the ability to unpublish, something we lost with the SaaS host. The technical improvements coupled with the knowledge we've gained have made the endeavor worthwhile. And unless "MovableUncle" inexplicably acquires npmjs.org, we've set the company up for stability in this area for some time to come.

Tags: npm, private modules, tools, rake, gulp