The Gilt technology organization. We make work.

Gilt Tech

HBC Tech Talks: February 2017 through July 2017

HBC Tech conferences

We’ve had a busy 2017 at HBC. The great work of our teams has created opportunities to share what we’ve learned with audiences around the world. This year our folks have been on stage in Austin, Sydney, Portland, Seattle, San Diego, Boston, London, Israel and on our home turf in NYC and Dublin. The talks have covered deep learning, design thinking, data streaming and developer experience to name just a few.

Lucky for you, if you haven’t been able to check out our talks in person, we’ve compiled the decks and videos from a bunch of our talks right here. Enjoy!

  • Sean Sullivan spoke at Scala Up North and the Portland Java User Group about ApiBuilder.
  • Sophie Huang spoke at the Customer Love Summit in Seattle.
  • Kyla Robinson gave a keynote on Key to Success: Creating A Mobile–First Mentality.
  • Sera Chin and Yi Cao spoke at the NYC Scrum User Group about HBC’s Design Sprints.

Sundial or AWS Batch, Why not both?

Kevin O'Riordan data

Sundial on AWS Batch

About a year ago, we (the Gilt/HBC personalization team) open sourced Sundial, a batch job orchestration system leveraging Amazon EC2 Container Service (ECS).

We built Sundial to provide the following features on top of the standard ECS setup:

  • Streaming Logs (to Cloudwatch and S3 and live in Sundial UI)
  • Metadata collection (through Graphite and displayed live in Sundial UI)
  • Dependency management between jobs
  • Retry strategies for failed jobs
  • Cron style scheduling for jobs
  • Email status reporting for jobs
  • Pagerduty integration for notifying team members about failing critical jobs

Other solutions available at the time didn’t suit our needs. We considered Chronos, which lacked the features we needed and required a Mesos cluster; Spotify’s Luigi; and Airbnb’s Airflow, which was immature at the time.

At the time, we chose ECS because we hoped to take advantage of AWS features such as autoscaling to save costs by scaling the cluster up and down with demand. In practice, this required too much manual effort and too many moving parts, so we lived with a long-running cluster scaled to handle peak load.

Since then, our needs have grown, and we now have jobs ranging in size from a couple of hundred MB of memory to 60GB of memory. Keeping a cluster scaled to handle peak load across all these job sizes has become too expensive. Most job failure noise has been due to cluster resources not being available, or to smaller jobs taking up space on instances meant to be dedicated to bigger jobs (ECS is weak when it comes to task placement strategies).

Thankfully AWS have come along with their own enhancements on top of ECS in the form of AWS Batch.

What we love about Batch

  • Managed compute environment. This means AWS handles scaling up and down the cluster in response to workload.
  • Heterogeneous instance types (useful when we have outlier jobs taking large amounts of CPU/memory resources)
  • Spot instances (save over half on on-demand instance costs)
  • Easy integration with Cloudwatch Logs (stdout and stderr captured automatically)

What sucks

  • Not being able to run “linked” containers (We relied on this for metadata service and log upload to S3)
  • Needing a custom AMI to configure extra disk space on the instances.

What we’d love for Batch to do better

  • Make disk space on managed instances configurable. Currently the workaround is to create a custom AMI with the disk space you need if you have jobs that store a lot of data on disk (Not uncommon in a data processing environment). Gilt has a feature request open with Amazon on this issue.

Why not dump Sundial in favour of using Batch directly?

Sundial still provides features that Batch doesn’t provide:

  • Email reporting
  • Pagerduty integration
  • Easy transition, processes can be a mixed workload of jobs running on ECS and Batch.
  • Configurable backoff strategy for job retries.
  • Time limits for jobs. If a job hangs, we can kill and retry after a certain period of time
  • Nice dashboard of processes (At a glance see what’s green and what’s red)

Sure enough, some of the above can be configured through hooking up lambdas/SNS messages etc. but Sundial gives it to you out of the box.

What next?

Sundial with AWS Batch backend now works great for the use cases we encounter doing personalization. We may consider enhancements such as Prometheus push gateway integration (to replace the Graphite service we had with ECS and to keep track of metrics over time) and UI enhancements to Sundial.

In the long term we may consider other open source solutions, as maintaining a job system counts as technical debt that distracts from product-focused tasks. The HBC data team, who have very similar requirements to us, have started adopting Airflow (by Airbnb). As part of their adoption, they have contributed to an open source effort to make Airflow support Batch as a backend. If it works well, this is a solution we may adopt in the future.


Visually Similar Recommendations

Chris Curro personalization

Previously we’ve written about Tiefvision, a technical demo showcasing the ability to automatically find dresses similar to a particular one of interest. For example:

Since then, we’ve worked on taking the ideas at play in Tiefvision and making them usable in a scalable, production-ready way that allows us to roll out to new product categories besides dresses quickly and efficiently. Today, we’re excited to announce that we’ve rolled out visually similar recommendations on Gilt for all dresses, t-shirts, and handbags, as well as to women’s shoes, women’s denim, women’s pants, and men’s outerwear.

Let’s start with a brief overview. Consider the general task at hand. We have a landing page for every product on our online stores. For the Gilt store, we refer to this as the product detail page (PDP). On the PDP we would like to offer the user a variety of alternatives to the product they are looking at, so that they can best make a purchasing decision. There exist a variety of approaches to selecting other products to display as alternatives; a particularly popular approach is called collaborative filtering which leverages purchase history across users to make recommendations. However this approach is what we call content-agnostic – it has no knowledge of what a particular garment looks like. Instead, we’d like to look at the photographs of garments and recommend similar looking garments within the same category.

Narrowing our focus a little bit, our task is to take a photograph of a garment and find similar looking photographs. First, we need to come up with some similarity measure for photographs, then we will need to be able to quickly query for the most similar photographs from our large catalog.

This is something we need to do numerically. Recall that we can represent a photograph as some tensor (in other words, a three-dimensional array with entries between 0 and 1). Given that we have a numerical representation for a photograph, you might think we could do something simple to measure the similarity between two photographs $x_1$ and $x_2$. Consider:

$$\lVert x_1 - x_2 \rVert_F$$

which we’d refer to as the Frobenius norm of the difference between the two photographs. The problem with this, although it is simple, is that we’re not measuring the difference between semantically meaningful features. Consider these three dresses: a red floral print, pink stripes, and a blue floral print.

With this “pixel-space” approach the red floral print and the pink stripes are more likely to be recognized as similar than the red floral print and the blue floral print, because they have pixels of similar colors at similar locations. The “pixel-space” approach ignores locality and global reasoning, and has no insight into semantic concepts.
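Written out in full, the pixel-space distance makes this weakness easy to see. The notation here is assumed for illustration: $x_1$ and $x_2$ are two photographs of height $H$, width $W$ and $C$ color channels.

```latex
\lVert x_1 - x_2 \rVert_F
  = \sqrt{\sum_{i=1}^{H} \sum_{j=1}^{W} \sum_{c=1}^{C}
      \left( (x_1)_{ijc} - (x_2)_{ijc} \right)^2}
```

Every term compares one pixel with the pixel at the same position in the other photograph, so matching colors at matching locations dominate the result, regardless of what the garment actually is.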

What we’d like to do is find some function $\phi$ that extracts semantically meaningful features. We can then compute our similarity metric in the feature-space rather than the pixel-space. Where do we get this $\phi$? In our case, we leverage deep neural networks (deep learning) for this function. Neural networks are hierarchical functions composed of typically sequential connections of simple building blocks. This structure allows us to take a neural network trained for a specific task, like arbitrary object recognition, and pull from some intermediate point in the network. For example, say we take a network, trained to recognize objects in the ImageNet dataset, composed of building blocks $f_1, f_2, \dots, f_n$:

$$f(x) = f_n(f_{n-1}(\cdots f_2(f_1(x)) \cdots))$$

We might take the output of $f_k$, for some $k < n$, and call those our features:

$$\phi(x) = f_k(f_{k-1}(\cdots f_2(f_1(x)) \cdots))$$

In the case of convolutional networks like the VGG, Inception, or Resnet families our output features would lie in some vector space $\mathbb{R}^{H \times W \times C}$. The first two dimensions correspond to the original spatial dimensions (at some reduced resolution) while the third dimension corresponds to some set of feature types. So in other words, if one of our feature types detects a human face, we might see a high numerical value in the spatial position near where a person’s face is in the photograph. In our use cases, we’ve determined that this spatial information isn’t nearly as important as the feature types that we detect, so at this point we aggregate over the spatial dimensions to get a vector in $\mathbb{R}^{C}$. A simple way to do this aggregation is with a simple arithmetic mean but other methods work as well.
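As a worked form of the aggregation step, assume the feature tensor is written $\phi(x) \in \mathbb{R}^{H \times W \times C}$ (this notation is chosen here for illustration). The arithmetic mean over the spatial dimensions is then:

```latex
v_c = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \phi(x)_{i,j,c},
\qquad c = 1, \dots, C
```

The result $v \in \mathbb{R}^{C}$ keeps one number per feature type and discards where in the photograph that feature fired.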

From there we could build up some matrix $X \in \mathbb{R}^{N \times C}$ where $N$ is the number of items in a category of interest. We could then construct an $N \times N$ similarity matrix

$$S = X X^{\top}$$

Then to find the most similar items to a query $i$, we look at the locations of the highest values in row $i$ of the matrix.

This approach is infeasible as $N$ becomes large, as it has computational complexity $O(N^2 C)$ and space complexity $O(N^2)$. To alleviate this issue, we can leverage a variety of approximate nearest neighbor methods. We empirically find that approximate neighbors are sufficient. Also when we consider that our feature space represents some arbitrary embedding with no guarantees of any particular notion of optimality, it becomes clear there’s no grounded reason to warrant exact nearest neighbor searches.
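To make the exact approach concrete, here is a minimal sketch of the brute-force lookup for a single query. It assumes inner-product similarity between embedding vectors; `dot`, `mostSimilar` and the toy `embeddings` are hypothetical names and data chosen for illustration, not taken from the production system.

```swift
/// Dot product of two equal-length embedding vectors.
func dot(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "embeddings must have the same length")
    return zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
}

/// Returns the indices of the `k` items most similar to item `query`,
/// i.e. the locations of the highest values in one row of the
/// similarity matrix (excluding the query itself).
func mostSimilar(to query: Int, in embeddings: [[Double]], k: Int) -> [Int] {
    let scores = embeddings.enumerated()
        .filter { $0.offset != query }
        .map { (index: $0.offset, score: dot(embeddings[query], $0.element)) }
    return scores
        .sorted { $0.score > $1.score }
        .prefix(k)
        .map { $0.index }
}

// Toy catalog of three items with two feature types each.
let embeddings: [[Double]] = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
]
let neighbors = mostSimilar(to: 0, in: embeddings, k: 2)
print(neighbors)
```

Each query touches every other item once, so answering it for all items repeats that work for every row. That repeated scan is the cost approximate nearest neighbor indexes are designed to avoid.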

How do we do it?

We leverage several open source technologies, as well as established results from published research to serve visually similar garments. As far as open source technology is concerned, we use Tensorflow, and (our very own) Sundial. Below you can see a block diagram of our implementation:

Let’s walk through this process. First, we have a Sundial job that accomplishes two tasks. We check for new products, and then we compute embeddings using Tensorflow and a pretrained network of a particular type for particular categories of products. We persist the embeddings on AWS S3. Second, we have another Sundial job, again with two tasks. This job filters the set of products to ones of some particular interest and generates a nearest neighbors index for fast nearest neighbor look-ups. The job completes, persisting the index on AWS S3. Finally, we wrap a cluster of servers in a load balancer. Our product recommendation service can query these nodes to get visually similar recommendations as desired.

Now, we can take a bit of a deeper dive into the thought process behind some of the decisions we make as we roll out to new categories. First, and perhaps the most important, is what network type and where to tap it off so that we can compute embeddings. If we recall that neural networks produce hierarchical representations, we can deduce (and notice empirically) that deeper tap-points (more steps removed from the input) produce embeddings that pick up on “higher level” concepts rather than “low level” textures. So, for example, if we wish to pick up on basic fabric textures we might pull from near the input, and if we wish to pick up something higher level like silhouette type we might pull from deeper in the network.

The filtering step before we generate an index is also critically important. At this point we can narrow down our products to only come from one particular category, or even some further sub-categorization to leverage the deep knowledge of fashion present at HBC.

Finally, we must select the parameters for the index generation, which control the error rate and performance trade-off in the approximate nearest neighbors search. We can select these parameters empirically. We utilize our knowledge of fashion, once again, to determine a good operating point.

What’s next?

We’ll be working to roll out to more and more categories, and even do some cross category elements, perhaps completing outfits based on their visual compatibility.


How Large Is YOUR Retrospective?

Dana Pylayeva agile

Can you recall the size and length of your typical retrospective? If your team operates by The Scrum Guide, your retrospectives likely have fewer than ten people in one room and last about an hour for a two-week Sprint.

What if your current team is larger than a typical Scrum team and the retrospective period is longer than a month? What if the team members are distributed across locations, countries, time zones and multiple third-party vendors? Is this retrospective doomed to fail? Not quite. These factors just add complexity and call for a different facilitation approach.

Last month at HBC we facilitated a large-scale mid-project retrospective for a 60-person project team. While this project certainly didn’t start as an agile project, bringing in an agile retrospective practice helped identify significant improvements. Here is how we did it.

From Inquiry to Buy-in

This all started with one of the project sub-teams reaching out with an inquiry: “Can you facilitate a retrospective for us?” That didn’t sound like anything major. We’ve been advocating for and facilitating retrospectives on various occasions at HBC: regular Sprint retrospectives, process retrospectives, new hire onboarding retrospectives etc.

Further digging into a list of participants revealed that this retro would be unlike any others. We were about to pull together a group of 60 people from HBC and five consulting companies(!) In spite of working on the same project for a long time, these people never had a chance to step back and reflect on how they could work together differently.

In order to make it successful, we needed buy-in from the leadership team to bring the entire team (including consultants) into the retrospective. Our first intent was to bring everyone into the same space (physical and virtual) and facilitate a retrospective with Open Space Technology. Initial response wasn’t promising:

“We have another problem with this retro […] is concerned that it is all day and that the cost of doing this meeting is like $25K-$50K”

We had to go back and re-think the retrospective approach. How can we reduce the cost of this event without affecting the depth and breadth of the insights?

Options we considered

Thanks to the well-documented large retrospective experiments by other agile practitioners, there were a number of options to evaluate:

1) Full project team, full day, face-to-face, Open Space-style retro
2) Decentralized, themes-based retros with learnings collected over a period of time and shared with the group
3) Decentralized retrospectives using Innovation Games Online platform
4) Overall retrospective (LeSS framework)

Around the same time, I was fortunate to join a Retrospective Facilitator’s Gathering (RFG2017), an annual event that brought together the most experienced retrospective facilitators from around the world. Learning from their experience as well as brainstorming together on the possible format was really helpful. Thank you Tobias Baier, Allan Jepsen, Joanne Perold, George Dinwiddie and many others for sharing your insights! I was especially grateful for the in-depth conversation with Diana Larsen, in which she advised me to

“Clarify the goal and commitment of the key stakeholders before you start designing how to run the retrospective.”

Back to the drawing board again! More conversations, clarifications and convincing… With some modifications and adjustments, we finally were able to get the buy-in and moved forward with the retrospective.

What worked for us – a tiered format.

Tiered Retro

Individual team-level retrospectives

We had a mix of co-located and distributed sub-teams on this project and chose to enlist some help from multiple facilitators. To simplify data consolidation, each facilitator received a data gathering format along with a sample retrospective facilitation plan. Each individual sub-team was asked to identify two types of action items: ones that they felt were in their power to address and others that required a system-level thinking and the support from the larger project community. The former were selected by the sub-teams and put in motion by their respective owners. The latter were passed to the main facilitator for analysis and aggregation to serve as a starting point for the final retrospective.

Final retrospective

For the final retrospective we brought together two types of participants:

1) Leads and delegates from individual sub-teams who participated actively at all times.
2) Senior leaders of the organization who joined in the last hour to review and support the team’s recommendations.

The goal of this workshop was to review the ideas from sub-teams, explore system level improvements and get the support from senior leadership to put the system-level changes into motion.

Retrospective plans

Each retrospective was structured according to the classic five-steps framework and included a number of activities selected from Retromat.

Example of an in-room sub-team retrospective (1 - 1.5 hours)

Set the Stage

We used a happiness histogram to get things started and get a sense of how people felt about the overall project.

Happiness Histogram

Instead of reading the Prime Directive once at the beginning with the team, we opted for displaying it in the room on a large poster as a visible reminder throughout the retrospective.

Gather Data

Everyone was instructed to think about the things they liked about the project (What worked well?) and the ones that could’ve been better (What didn’t work so well?). In a short time-boxed silent brainstorming each team member had to come up with at least two items in each category.

Next we facilitated a pair-share activity in a “speed dating” format. Forming two lines, we asked participants to face each other and take turns discussing what each of them wrote on their post-its. After two minutes the partners were switched and the new pairs were formed to continue discussions with the new set of people.

Pair Share

At the end of the timebox, we asked the last pairs to walk together to the four posters on the wall and place their post-its into the respective categories:

1) Worked Well/Can’t control
2) Worked Well/Can control
3) Didn’t work so well/Can’t control
4) Didn’t work so well/Can control

After affinity mapping and dot-voting, the group selected the top three issues that they felt were in their control to address.

Generate Insights/Decide What To Do

Every selected issue got picked up by a self-organized sub-group. Using a template, each sub-group designed a team-level experiment defining the action they proposed to take, the observable behavior they expected to see after taking that action and the specific measurement that would confirm the success of the experiment.


Close the Retro

We closed the retro by getting feedback on the retro format and taking photos of the insights generated by the team. These were passed on to the main facilitator for further analysis and preparation for the final retrospective event.

Modifications for distributed teams

For teams that had remote members or were fully distributed, we used the FunRetro tool. The flexibility to configure columns and the number of votes, along with an easy user interface, fun colors and no cost, made this tool a good substitute for an in-room retrospective.

Fun Retro

Final Retrospective (3 hours)

Once all individual sub-team retrospectives were completed, we consolidated the project-level improvement proposals. These insights were reviewed, analyzed for trends and systemic issues and then shared during the Tier 2 Final Retrospective.

Set the stage

We used story cubes to reflect and share how each of the participants felt about this project. This is a fun way to run a check in activity, equally effective with introverted and extraverted participants. The result is a collection of images that build a shared story about the project:

Story Cubes

We also reviewed a happiness histogram aggregated across the individual sub-teams to learn about the mood of the 60 people on this project.

Gather data

Since the retrospective period was very long, building a timeline together was really helpful in reconstructing the full view of the project. We asked participants to sort the events into the ones that had a positive impact on the project (placing them above the timeline) and the ones that had a negative impact on the project (placing them below the timeline). The insights we gained from this exercise alone were invaluable!


Generate Insights

Next we paired the participants and asked them to walk to the consolidated recommendations posters. As a pair, they were tasked with selecting the most pressing issues and bringing them back for a follow up discussion at their table.

What Worked What Didn't

Each table used the LeanCoffee format to vote on the selected issues, prioritize them into a discussion backlog and explore as many of them as the timebox allowed. Participants used Roman voting as a way to decide if they were ready to move on to the next topic or needed more discussion about the current one. Closing each discussion, participants recorded their recommended action. At the end of the timebox all actions from each table were shared with the rest of the group to get feedback.


Decide What To Do/Close

In the final hour of the retrospective the action owners shared their proposed next steps with the senior leadership team and reviewed the insights from the consolidated teams’ feedback.


Was this experiment successful? Absolutely! One of the biggest benefits of this retrospective was this collective experience of working across sub-teams and designing organizational improvements together.

Could we have done it better? You bet! As the project continues, we will be looking to run the retrospectives more frequently and will take into account things we learnt in this experiment.

What did we learn?

  • Designing a retrospective of this size is a project in itself. You need to be clear about the vision, the stakeholders and the success criteria for the retrospective.
  • Do your research, tap into the knowledge of agile community and get inspired by the experience of others. Take what you like and then adapt to make it work in the context of your organization.
  • Ask for help. Involve additional facilitators to get feedback, speed up the execution and create a safe space for individual sub-teams.
  • Inclusion trumps exclusion. Invite consultants as well as full-time employees into your retrospective to better understand the project dynamic.
  • Beware of potential confusion around retrospective practice. Be ready to explain the benefits and highlight the differences between a retrospective and a postmortem.
  • Bringing senior leaders into the last hour of final retrospective can negatively affect the dynamics of the discussions. Either work on prepping them better or plan on re-establishing the safe space after they join.

What would we like to do next?

  • Continue promoting the retrospective practice across the organization.
  • Offer a retrospective facilitator training to Scrum Masters, Agile Project Managers and anyone who is interested in learning how to run an effective retro.
  • Establish retrospective facilitator circle to help maintain and improve the practice for all teams.

Inspired by our experiment? Have your own experience worth sharing? We’d love to hear from you and learn what works in your environment. Blog about it and tweet your questions at @hbcdigital.

World Retrospective Day

Whether you are a retrospective pro, have never tried one in the past or your experience is anywhere in between, please do yourself a favor and mark February 6, 2018 on your calendar. A group of experienced retrospective facilitators is currently planning a record-breaking World Retrospective Day with live local workshops on every continent and in every time zone, along with many online learning opportunities. We are engaging with industry thought leaders to make this one of the best and most engaging learning experiences. We hope to see you there!


Advanced tips for building an iOS Notification Service Extension

Kyle Dorman ios

The Gilt iOS team is officially rolling out support for “rich notifications” in the coming days. By “rich notifications”, I mean the ability to include media (images/gifs/video/audio) with push notifications. Apple announced rich notifications as a part of iOS 10 at WWDC last year (2016). For a mobile first e-commerce company with high quality images, adding media to push notifications is an exciting way to continue to engage our users.

This post details four helpful advanced tips I wish I had when I started building a Notification Service Extension (NSE) for the iOS app. Although all of this information is available through different blog posts and Apple documentation, I am putting it all in one place in the context of building an NSE in the hopes that it saves someone the time I spent hunting and testing this niche feature. Specifically, I will go over things I learned after the point where I was actually seeing modified push notifications on a real device (even something as simple as appending MODIFIED to the notification title).

If you’ve stumbled upon this post, you’re most likely about to start building an NSE or have started already and hit an unexpected roadblock. If you have not already created the shell of your extension, I recommend reading the official Apple documentation and some other helpful blog posts found here and here. These posts give a great overview of how to get started receiving and displaying push notifications with media.

Tip 0: Sending notifications

When working with NSEs it is extremely helpful to have a reliable way of sending yourself push notifications. Whether you use a third party push platform or a home grown platform, validate that you can send yourself test notifications before going any further. Additionally, validate that you have the ability to send modified push payloads.

Tip 1: Debugging

Being able to debug your code while you work is paramount. If you’ve ever built an app extension this tip may be old hat to you, but as a first-time extension builder it was a revelation to me! Because an NSE is not actually a part of your app, but an extension, it does not run on the same process id as your application. When you install your app on an iOS device from Xcode, the Xcode debugger and console are only listening to the process id of your application. This means any print statements and breakpoints you set in the NSE won’t show up in the Xcode console and won’t pause the execution of your NSE.

You actually can see all of your print statements in the Mac Console app, but the Console also includes every print/log statement of every process running on your iOS device, and filtering these events is more pain than it’s worth.

Fortunately, there is another way. You can actually have Xcode listen to any of the processes running on your phone, including low-level processes like wifid; Xcode just happens to default to your application.

To attach to the NSE, you first need to send your device a notification to start up the NSE. Once you receive the notification, in Xcode go to the “Debug” menu, scroll down to “Attach to Process” and look to see if your NSE is listed under “Likely Targets”.

If you don’t see it, try sending another notification to your device. If you do, attach to it! If you successfully attached to your NSE process, you should see it grayed out when you go back to Debug > Attach to Process.

You should also be able to select the NSE from the Xcode debug area.

To validate that both the debugger and print statements are working, add a breakpoint and a print statement to your NSE. Note: Every time you rebuild the app, you will unfortunately have to repeat the process of sending yourself a notification before attaching to the NSE process.

Amazing! Your NSE development experience will now be 10x faster than my own. I spent two days appending “print statements” to the body of the actual notification before I discovered the ability to attach to multiple processes.

Tip 2: Sharing data between your application and NSE

Although your NSE is bundled with your app, it is not part of your app, does not run on the same process id (see above), and does not have the same bundle identifier. Because of this, your application and NSE cannot talk to each other and cannot use the same file system. If you have any information you would like to share between the app and the NSE, you will need to add them both to an App Group. For the specifics of adding an app group check out Apple’s Sharing Data with Your Containing App.

This came up in Gilt’s NSE because we wanted to have the ability to get logs from the NSE and include them with the rest of the app. For background, the Gilt iOS team uses our own open sourced logging library, CleanroomLogger. The library writes log files in the app’s allocated file system. To collect the log files from the NSE in the application, we needed to save the log files from the NSE to the shared app group.

Another feature you get once you set up the App Group is the ability to share information using the app group’s NSUserDefaults. We aren’t using this feature right now, but might in the future.
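As a rough sketch of both sharing mechanisms, the snippet below resolves a directory in the shared container and opens the shared defaults. The group identifier `group.com.example.myapp`, the `Logs` folder name, the `lastNotificationReceivedAt` key and the `sharedLogDirectory` helper are all hypothetical; substitute the App Group you configured for both targets.

```swift
import Foundation

// Hypothetical App Group identifier; use the one enabled on both the
// app target and the NSE target.
let appGroupID = "group.com.example.myapp"

/// Directory inside the shared App Group container where both the app and
/// the NSE can read and write log files. Returns nil if the App Group is
/// not available to the current target.
func sharedLogDirectory() -> URL? {
    guard let container = FileManager.default
        .containerURL(forSecurityApplicationGroupIdentifier: appGroupID) else {
        return nil
    }
    let logs = container.appendingPathComponent("Logs", isDirectory: true)
    try? FileManager.default.createDirectory(at: logs,
                                             withIntermediateDirectories: true)
    return logs
}

// Key-value storage shared through the same App Group.
let sharedDefaults = UserDefaults(suiteName: appGroupID)
sharedDefaults?.set(Date(), forKey: "lastNotificationReceivedAt")
```

The NSE would point its logger at `sharedLogDirectory()`, and the app would read from the same location when collecting logs.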

Tip 3: Using frameworks in your NSE

If you haven’t already realized, rich notifications don’t send actual media but just links to media, which your NSE will download. If you’re a bolder person than me, you might decide to forgo the use of an HTTP framework in your extension and re-implement any functions/classes you need. For the rest of us, it’s a good idea to include additional frameworks in your NSE. In the simplest case, adding a framework to an NSE is the same as including a framework in another framework or your container app. Unfortunately, not all frameworks can be used in an extension.

To use a framework in your extension, the framework must have the “App Extensions” box checked.

Most popular open source frameworks are already set up to work with extensions, but it’s something you should look out for. The Gilt iOS app has one internal framework which we weren’t able to use in extensions, and I had to re-implement a few functions in the NSE. If you come across a framework that you think should work in an extension, but doesn’t, check out Apple’s Using an Embedded Framework to Share Code.
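For comparison, the download step mentioned at the start of this tip can be done with URLSession alone. The following is a minimal sketch, not the Gilt implementation: the `media-url` payload key and the `media` attachment identifier are assumptions, and error handling is reduced to delivering the notification without media.

```swift
import UserNotifications

final class NotificationService: UNNotificationServiceExtension {
    private var contentHandler: ((UNNotificationContent) -> Void)?
    private var bestAttemptContent: UNMutableNotificationContent?

    override func didReceive(_ request: UNNotificationRequest,
                             withContentHandler contentHandler: @escaping (UNNotificationContent) -> Void) {
        self.contentHandler = contentHandler
        bestAttemptContent = request.content.mutableCopy() as? UNMutableNotificationContent

        // "media-url" is a hypothetical payload key; use whatever your
        // push platform actually sends.
        guard let content = bestAttemptContent,
              let urlString = request.content.userInfo["media-url"] as? String,
              let mediaURL = URL(string: urlString) else {
            contentHandler(request.content)
            return
        }

        URLSession.shared.downloadTask(with: mediaURL) { location, _, _ in
            // Always deliver the notification, with or without media.
            defer { contentHandler(content) }
            guard let location = location else { return }

            // Attachments need a recognizable file extension, so move the
            // downloaded temp file to a path that keeps the original one.
            let destination = URL(fileURLWithPath: NSTemporaryDirectory())
                .appendingPathComponent(UUID().uuidString)
                .appendingPathExtension(mediaURL.pathExtension)
            do {
                try FileManager.default.moveItem(at: location, to: destination)
                let attachment = try UNNotificationAttachment(identifier: "media",
                                                              url: destination,
                                                              options: nil)
                content.attachments = [attachment]
            } catch {
                // Fall through: the notification is delivered without media.
            }
        }.resume()
    }

    override func serviceExtensionTimeWillExpire() {
        // Called shortly before the system terminates the extension;
        // deliver whatever content we have so far.
        if let contentHandler = contentHandler, let content = bestAttemptContent {
            contentHandler(content)
        }
    }
}
```

A framework earns its place once you need retries, caching or shared request configuration; for a single file download the built-in APIs are usually enough.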

Tip 4: Display different media for thumbnail and expanded view

When the rich notification comes up on the device, users see a small thumbnail image beside the notification title and message.

And when the user expands the notification, iOS shows a larger image.

In the simple case (example above), you might just have a single image to use as the thumbnail and the large image. In this case setting a single attachment is fine. In the Gilt app, we came across a case where we wanted to show a specific square image as the thumbnail and a specific rectangular image when the notification is expanded. This is possible because UNMutableNotificationContent allows you to set a list of UNNotificationAttachment. Although this is not a documented feature it is possible.

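let bestAttemptContent = request.content.mutableCopy() as? UNMutableNotificationContent
// The initializer takes an identifier and can throw; the identifiers here are illustrative.
let expandedAttachment = try? UNNotificationAttachment(identifier: "expanded", url: expandedURL, options: [UNNotificationAttachmentOptionsThumbnailHiddenKey : true])
let thumbnailAttachment = try? UNNotificationAttachment(identifier: "thumbnail", url: thumbnailURL, options: [UNNotificationAttachmentOptionsThumbnailHiddenKey : false])
// Order matters: the expanded image goes first, then the thumbnail.
bestAttemptContent?.attachments = [expandedAttachment, thumbnailAttachment].compactMap { $0 }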

This code snippet sets two attachments on the notification. This may be confusing because, currently, iOS only allows an app to show one attachment. If we can only show one attachment, then why set two attachments on the notification? I am setting two attachments because I want to show different images in the collapsed and expanded notification views. The first attachment in the array, expandedAttachment, is hidden in the collapsed view (UNNotificationAttachmentOptionsThumbnailHiddenKey : true). The second attachment, thumbnailAttachment, is not. In the collapsed view, iOS will select the first attachment where UNNotificationAttachmentOptionsThumbnailHiddenKey is false. But when the notification is expanded, the first attachment in the array, in this case expandedAttachment, is displayed. If that is confusing, see the example images below. Notice that this is not one rectangular image cropped for the thumbnail.

alt image

alt image

Note: There is a way to specify a clipping rectangle using the UNNotificationAttachmentOptionsThumbnailClippingRectKey option, but our backend system doesn’t include cropping rectangle information and we do have multiple appropriate crops of product/sale images available.


That’s it! I hope this post was helpful and you will now fly through building a Notification Service Extension for your app. If there is anything you think I missed and should add to the blog, please let us know.

alt image


Open Source Friday

HBC Tech open source

From the 54 public repos maintained at to the name of our tech blog (displayed in this tab’s header), open source has been part of our team’s DNA for years. Check out this blog post from 2015 if you’re not convinced.

Our open source love is why we’re excited to participate in our first Open Source Friday on June 30. Open Source Friday is an effort being led by GitHub to make it easier to contribute to the open source community. This blog post has more detail on the who, what and why. We’re hoping to make this a regular activity to help our teams foster an open-source-first culture as we grow and evolve.

Some of the projects we’ll be working on:

  • CleanroomLogger - Evan Maloney will be tackling a specific long-standing user request: custom “named subsystems” for logging. Some background here:
  • ApiBuilder - Ryan Martin will be working on fixing some edge cases in the Swagger Generator for ApiBuilder.
  • gfc-guava - Sean Sullivan will be updating the documentation for gfc-guava and working on compatibility with Google Guava 22.0.

If you’re inspired but don’t know where to start, head to our directory of open source projects, visit this list by GitHub or ping us on Twitter and we can help point you in the right direction.


Hudson's Bay Company at QCon

HBC Tech conferences

Heading to QCon? Don’t miss these two sessions! If you can’t make it, stay tuned here for slides and recordings from the conference.

Removing Friction In the Developer Experience

If you follow this blog at all, you know that we talk a lot about how we work here. Whether it’s our approach to adopting new technology, the work of our POps team or our team ingredients framework, we’re not shy when it comes to our people and culture.

With that in mind, it only makes sense that Ade Trenaman, SVP Engineering at Hudson’s Bay Company, will be part of the Developer Experience track at QCon New York in June. In his session, titled “Removing Friction In the Developer Experience”, Ade will highlight a number of the steps we’ve taken as an organisation to improve how we work. The session will cover:

  • how we blend microservice / serverless architectures, continuous deployment, and cloud technology to make it easy to push code swiftly, safely and frequently and operate it reliably in production.
  • the organisational tools like team self-selection, team ingredients (see above), voluntary adoption and internal startups that allow us to decentralise and decouple high-performing teams.

Survival of the Fittest - Streaming Architectures

Michael Hansen will also be at QCon this year. Mike’s talk will help guide the audience through a process to adopt the best streaming model for their needs (because there is no perfect solution).

In his own words: “Frameworks come and go, so this talk is not about the ‘best’ framework or platform to use; rather, it’s about core principles that will stand the tests of streaming evolution.”

His talk will also cover:

  • major potential pitfalls that you may stumble over on your path to streaming and how to avoid them
  • the next evolutionary step in streaming at Hudson’s Bay Company

Hope to see you there!


Let’s run an experiment! Self-selection at HBC Digital

Dana Pylayeva agile


Inspired by Opower’s success story, we ran a self-selection experiment at HBC Digital.

Dubbed “the most anticipated event of the year”, it enabled 39 team members to self-select into 4 project teams. How did they do it? By picking a project they wanted to work on and the teammates they wanted to work with, and by keeping a “Do what’s best for the company” attitude. Read on to learn about our experience and consider giving self-selection a try!

A little bit of introduction:

Who are we?

HBC Digital is the group that drives the digital retail/ecommerce and digital customer experience across all HBC retail banners including Hudson’s Bay, Lord & Taylor, Saks Fifth Avenue, Gilt, and Saks OFF 5TH.

Our process, trifectas and team ingredients

Our development process is largely inspired by the original Gilt process and has the ideas of intrinsic motivation at its core. What agile flavor do we use? It depends on the team.

Each team has full autonomy in selecting Scrum, Kanban, XP, a combination thereof or none of the above as their process. As long as they remain small, nimble, able to collaborate and continuously deliver value, they can tailor the process to their needs.

We do keep certain key components standard across all teams. One of them is a “Trifecta” – a group of servant-leaders in each team: a Product Manager, an Agile Project Manager and a Tech Lead. They work together to support their team and enable the team’s success. We value continuous learning and facilitate role blending by instilling our Team Ingredients framework. Originally designed by Heather Fleming, the Team Ingredients framework facilitates team-level conversations about the team strengths, learning interests and cross-training opportunities.

Over the years the framework evolved from being a management tool for assessing teams from “outside in” to being a team tool that supports self-organizing and learning discussions. After a major revamp and gamification of the framework in 2016, we now use it as part of our Liftoff sessions and team working agreement conversations.

Just like our Team Ingredients framework, our process continues to evolve. We experiment with new ideas and practices to facilitate teams’ effectiveness and create an environment for teams to thrive. The self-selection is our latest experiment and this blog post is a glimpse into how it went.

Self-selection triggers and enablers

Organizational change

As an organization that grew through acquisitions, at one point we found ourselves dealing with an unhealthy mix of cultures, duplicate roles and clashing mindsets. To remain lean and agile, we went through a restructuring at all levels.

Inspiring case studies

When we were evaluating the best ways to re-form the teams, we came across Amber King and Jess Huth’s talk on self-selection at the Business Agility 2017 Conference. The lightbulb went on! Amber and Jess were describing exactly the situation we were in at that time and were reporting the positive effect of running a self-selection with the teams at Opower. We followed up with them on Skype afterwards. Hearing their compelling story again and being encouraged by their guidance, we left the call fired up to give self-selection a try!

Self-selection manual

When it is your turn to plan for self-selection, pick up a copy of Sandy Mamoli and David Mole’s book “Creating Great Teams: How Self-Selection Lets People Excel”. This very detailed facilitation guide from the inventors of the self-selection process is indispensable in preparing for and facilitating a self-selection event.

Past success

What worked in our favor was the fact that Gilt had tried running a self-selection in 2012 as part of a transition to “two-pizza” teams. The self-selection event was called Speed Dating and involved 50 people and 6 projects. Fun fact: a number of today’s leaders took part in the 2012 event as regular participants.


We kept the preparation process very transparent. Dedicated Slack channel, Confluence page with progress updates and participants’ info, communication at the tech all-hands meetings and Q&A sessions – everything to avoid creating discomfort and to reduce the fear factor amongst team members.

Self-selection in seven steps

Seven Steps of Self-Selection

1. Get Leadership Buy-In

One of the first steps in a self-selection is getting buy-in from your leadership team. Whether you start from feature teams or component teams, a self-selection event has the potential to impact the existing reporting structure in your organization. Have an open conversation with each of the leaders to clarify the process, understand their concerns and answer questions.

Is there a small modification you can make to the process to mitigate these concerns and turn the leaders into your supporters? From our experience, making a self-selection invitational and positioning it as “an experiment” fast-tracked its acceptance in the organization.

2. Identify Participants

How many people will be involved in your self-selection? Will it include all of your existing project teams or a subset?

Reducing the size of the self-selection to only a subset of the teams at HBC Digital made our experiment more feasible. By the same token, it created a bit of confusion around who was in vs. who was not.

If you are running a self-selection for a subset of your teams, make sure that the list of participants is known and publicly available to everyone. Verify that the total number of participants is equal to or smaller than the number of open spots on the new teams.

Pre-selected vs. free-moving participants

Decide if you need to have any of the team members pre-selected in each team. For us, the only two pre-selected roles in each team were a Product Manager and a Tech Lead. They were the key partners in pitching the initiative to team members. All others (including Agile Project Managers) were invited to self-select into new teams.

FTEs vs. Contractors

If you have contractors working on your projects alongside the full-time employees, you will need to figure out if limiting self-selection to full-time employees makes sense in your environment.

Since our typical team had a mix of full-time employees and contractors, it was logical for us to invite both groups to participate in the self-selection. After all, individuals were selecting the teams based on a business idea, a technology stack and the other individuals that they wanted to work with. We did make one adjustment to the process and asked contractors to give employees “first dibs” at selecting their new teams. Everyone had equal opportunity after the first round of the self-selection.


Usually, you would want to limit participation to those directly involved in a self-selection. In our case, there was so much interest in the self-selection experiment across the organization, that we had to compromise by introducing an observer role. Observers were invited to join in the first part of the self-selection event. They could check out how the room was set up, take a peek at the participants’ cards. They could listen to initiative pitches for all teams, without making an actual selection. Observers were asked to leave after the break and before the start of actual teams’ selection.

3. Work with Your Key Partners

Adjust the process to fit your needs

During our prep work we discovered that some team members felt very apprehensive about self-selection processes. To some extent, it reminded them of a negative experience they had in their childhood with a selection into sports teams. We collaborated with current teams’ Trifectas to reduce potential discomfort with the following adjustments:

  • We modified the “I have no squad” poster into “Available to help” poster for a more positive spin.
  • We made a compromise on consultants’ participation, asking them to add their cards to “I am available to help” poster in the first round and letting them participate equally starting from the second round.
  • We introduced a “No first come first serve” rule to keep the options open for everyone and avoid informal pre-selection.

Product Manager and Tech Lead pitches

Coach them to inspire people with their short pitches about a product vision and a technology stack:

  • Why is this initiative important to our business?
  • How can you make a difference if you join?
  • What exciting technologies will you get a chance to work with if you become a part of this team?
  • What kind of team are we looking to build?

Establish the team formula

This part is really critical.

Your team formula may include the core team only or, as in our case, include members from the larger project community (Infrastructure Engineers, UX Designers etc.). As a facilitator, you want to understand very well the needs of each project in terms of specific roles and the number of people required for each role. Cross-check the total number of people based on the team formula with the number of people invited to participate in the self-selection. Avoid putting people into “musical chairs” at all costs!

4. Evangelize

Take the uncertainty out of the self-selection! Clarify questions, address concerns, play the “what-ifs”, collect questions and make answers available to everyone.

We learnt to use a variety of channels to spread the word about the self-selection:

  • announcements at Tech All-hands meetings
  • dedicated Q&A sessions with each existing group
  • Confluence Q&A page
  • #self-selection Slack channel
  • formal and informal one-on-one conversations (including hallway and elevator chats)
  • discussion between the Tech Leads and Product Managers and their potential team members

5. Prepare


It was important for us to find the right space and set the right mood for the actual self-selection event. The space that worked for us met all of our criteria:

1) Appropriate for the size of the group

2) Natural light

3) Separate space for pitches and for team posters

4) Away from the usual team spaces (to minimize distractions)


Speaking of the right mood, we had enough good snacks brought in for all participants and observers!

Depending on the time of the day, you may plan on bringing breakfast, lunch or snacks into your self-selection event. We ran ours in the afternoon and brought in a selection of European chocolate, popcorn and juices.


Help the participants remember the rules and find the team corners by preparing posters. Be creative, make them visually appealing. Here is what worked for us:

1) One team poster per team with the project/team name, team formula and a team mascot.

2) Rules posters:

  • “Do what’s best for the company”
  • “Equal team selection opportunity”
  • “Teams have to be capable of delivering end to end”

3) “Available to help” poster. This is very similar to the “I have no squad” poster from Sandy Mamoli’s book. However, we wanted to make the message on that poster a little bit more positive.

Participants Cards

At a minimum, have a printed photo prepared for each participant and color-coded labels to indicate different roles.

We invested a little more time in making the participants’ cards look like game cards and included:

  • a LinkedIn photo of the participant
  • their name
  • a current role
  • their proficiency and learning interests in the eleven team ingredients
  • a space to indicate their first, second and third choices of the team (during the event).

Using our Team ingredients framework and Kahoot! survey platform we created a gamified self-assessment to collect the data for these cards.

Participants rated their skill levels and learning interests for each of the ingredients using the following scale:

3 – I can teach it

2 – I can do it

1 – I’d like to learn it

0 – Don’t make me do it

6. Run

It took us exactly one month to get to this point. On the day of the self-selection the group walked into the room. The product managers, tech leads and the facilitator were already there. The room was set and ready for action!

Initiative Pitches

Participants picked up their cards and settled in their chairs, prepared to hear the initiative pitches and to make their selections. This was one of the most attentive audiences we’ve seen! We didn’t even have to set the rules around device usage - everyone was giving the pitches their undivided attention.

After a short introduction from the facilitator and a “blessing” from one of the leaders, Product Managers and Tech Leads took the stage.

For each initiative they presented their vision of the product, the technology stack and their perspective on the team they’d like to build. It was impressive to see how each pair worked together to answer questions and inspire people. At the end of the pitches, we took a short break. It was a signal for observers to leave the room.

Two rounds of self-selection

After the break, Product Managers and Tech Leads took their places in the team corners. We ran two rounds of self-selection, ten minutes each.

During the first self-selection round people walked around, checked the team formula, chatted with others and placed their cards on the poster of their first-choice team. Contractors and others who didn’t want to make a selection in the first round placed their cards on the “Available to help” poster. At the end of the round, each tech lead was asked to give an update on the following:

  • Was the team complete after this round?
  • Were there any ingredients or skills missing in the team after the first round?

During the second round, there were more conversations, more negotiations and more movement between the teams. Some people agreed to move to their second choice teams to help fill the project needs. The “Do what’s best for the company” poster served as a good reminder during this process.

The debrief revealed that three teams out of four had been fully formed by the end of the second round. The last team still had open spots. It was decided that those would be filled later by hiring new people with the required skillset.

The self-selection event was complete. It was time to celebrate and to start planning the work with the new teams.

7. Support New Teams

Transition Plan

With the self-selection exercise, our teams formed a vision for their ideal “end state”. Afterwards, we needed to figure out how to achieve that vision. Tech leads worked with their new team members to figure out the systems they supported and the projects they were involved with at that time, and mapped out the transition plan.

Team Working Agreement

Once all members of the new teams were available to start, we facilitated Liftoff workshops to help them get more details on the product purpose, establish team working agreements and understand the larger organizational context.

Coaching/Measuring Happiness

Our experiment didn’t stop there. We continue checking in with the teams through coaching, measuring happiness (we use a gamified version of the Spotify Squad Health Check) and facilitating regular retrospectives.

What’s next?

As our roadmap continues to change and as we get more people joining the organization, we may consider running a self-selection again with a new group. Or we may decide to move away from “large batches” of self-selection and experiment with a flow of Dynamic Reteaming.

Time will tell. One thing is clear - we will continue learning and experimenting.

How can you learn more?

We hope this blog post inspired you to think about a self-selection for your teams. Still have questions after reading it? Get in touch with us, we’d love to tell you more!

We are speaking

Join our talks and workshops around the world:

  1. “The New Work Order” keynote at Future of Work by Heather Fleming, VP People Operations & PMO

  2. Removing Friction In the Developer Experience at QCon New York by Adrian Trenaman, SVP Engineering

  3. Discover Your Dream Teams Through Self-Selection with a Team Ingredients Game at Global Scrum Gathering Dublin by Dana Pylayeva, Agile Coach

Great books that inspired us

  1. Sandy Mamoli, David Mole, “Creating Great Teams: How Self-Selection Lets People Excel”
  2. Diana Larsen, Ainsley Nies, “Liftoff: Launching Agile Teams & Projects”
  3. Heidi Shetzer Helfand, “Dynamic Reteaming: The Art and Wisdom of Changing Teams”

CloudFormation Nanoservice

Ryan Martin aws

One of the big HBC Digital initiatives for 2017 is “buy online, pickup in store” - somewhat awkwardly nicknamed “BOPIS” internally. This is the option for the customer to, instead of shipping an order to an address, pick it up in a store that has the items in inventory.

A small part of this new feature is the option to be notified of your order status (e.g. when you can pick up the order) via SMS. An even smaller part of the SMS option is what to do when a customer texts “STOP” (or some other similar stop word) in response to one of the SMS notifications. Due to laws such as the Telephone Consumer Protection Act (TCPA) and CAN-SPAM Act, we are required to immediately stop sending additional messages to a phone number once that person has requested an end to further messaging.

Our SMS provider is able to receive the texted response from the customer and POST it to an endpoint of our choosing. We could wrap such an endpoint into one of our existing microservices, but the one that sends the SMS (our customer-notification-service) is super-simple: it receives order events and sends notifications (via email or SMS) based on the type of event. It is essentially a dumb pipe that doesn’t care about orders or users; it watches for events and sends messages to customers based on those events. Wrapping subscription information into this microservice felt like overstepping the bounds of the simple, clean job that it does.

So this is the story of how I found myself writing a very small service (nanoservice, if you will) that does one thing - and does it with close-to-zero maintenance, infrastructure, and overall investment. Furthermore, I decided to see if I could encapsulate it entirely within a single CloudFormation template.

How we got here

Here are the two things this nanoservice needs to do:

  1. Receive the texted response and unsubscribe the customer if necessary
  2. Allow the customer notification service (CNS) to check the subscription status of a phone number before sending an SMS

In thinking about the volume of traffic for these two requests, we consider the following:

  1. This is on [] only (for the moment)
  2. Of the online Saks orders, only a subset of inventory is available to be picked up in the store
  3. Of the BOPIS-eligible items, only a subset of customers will choose to pick up in store
  4. Of those who choose to pick up in store, only a subset will opt-in for SMS messages
  5. Of those who opt-in for SMS, only a subset will attempt to stop messages after opting-in

For the service’s endpoints, the request volume for the unsub endpoint (#1 above) is roughly the extreme edge case of #5; the CNS check (#2) is the less-edgy-but-still-low-volume #4 above. So we’re talking about a very small amount of traffic: at most a couple dozen requests per day. This hardly justifies spinning up a microservice - even if it runs on a t2.nano, you still have the overhead of multiple nodes (for redundancy), deployment, monitoring, and everything else that comes with a new microservice. Seems like a perfect candidate for a serverless approach.

The architecture

As mentioned above, a series of order events flows to the customer notification service, which checks to make sure that the destination phone number is not blacklisted. If it is not, CNS sends the SMS message through our partner, who in turn delivers the SMS to the customer. If the customer texts a response, our SMS partner proxies that message back to our blacklist service.

The blacklist service is a few Lambda functions behind API Gateway; those Lambda functions simply write to and read from DynamoDB. Because the stack is so simple, it felt like I could define the entire thing in a single artifact: one CloudFormation template. Not only would that be a geeky because-I-can coding challenge, it also felt really clean to be able to deploy a service using only one resource with no dependencies. It’s open source, so anyone can literally copy-paste the template into CloudFormation and have the fully-functioning service in the amount of time it takes to spin up the resources - with no further knowledge necessary. Plus, the template is in JSON (which I’ll explain later) and the functions are in Node.js, so it’s a bit of


Here at HBC Digital, we’ve really started promoting the idea of API-driven development (ADD). I like it a lot because it forces you to fully think through the most important models in your API, how they’re defined, and how clients should interact with them. You can iron out a lot of the kinks (Do I really need this property? Do I need a search? How does the client edit? What needs to be exposed vs locked-down? etc) before you write a single line of code.

I like to sit down with a good API document editor such as SwaggerHub and define the entire API at the beginning. The ADD approach worked really well for this project because we needed a quick turnaround time: the blacklist was something we weren’t expecting to own internally until very late in the project, so we had to get it in place and fully tested within a week or two. With an API document in hand (particularly one defined in Swagger), I was able to go from API definition to fully mocked endpoints (in API Gateway) in about 30 mins. The team working on CNS could then generate a client (we like the clients in Apidoc, an open-source tool developed internally that supports Swagger import) and immediately start integrating against the API. This then freed me to work on the implementation of the blacklist service without being a blocker for the remainder of the team. We settled on the blacklist approach one day; less than 24 hours later we had a full API defined with no blockers for development.

The API definition is fairly generic: it supports blacklisting any uniquely-defined key for any type of notification. The main family of endpoints looks like this:

/{notification_type}/{blacklist_id}


notification_type currently only supports sms, but could very easily be expanded to support things like email, push, facebook-messenger, etc. With this, you could blacklist phone numbers for sms independently from email addresses for email independently from device IDs for push.

A simple GET checks to see if the identifier of the destination is blacklisted for that type of notification:

> curl https://your-blacklist-root/sms/555-555-5555
{"message":"Entry not blacklisted"}

This endpoint is used by CNS to determine whether or not it should send the SMS to the customer. In addition to the GET endpoint, the API defines a PUT and a DELETE for manual debugging/cleanup - though a client could also use them directly to maintain the blacklist.
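
As a rough illustration of how a caller such as CNS might use this endpoint, the sketch below builds the GET URL and maps the HTTP status code to a send/don't-send decision. The helper names (blacklistUrl, blacklistStatus, shouldSend) are hypothetical, and the 200/404 convention is taken from the GET handler shown further down in this post. The real CNS client is generated from the API definition and may well look different.

```javascript
// Hypothetical helpers; not part of the published service or its generated client.

// Build the GET URL for a notification type and destination identifier.
function blacklistUrl(root, notificationType, blacklistId) {
  return root.replace(/\/+$/, '') + '/' +
    encodeURIComponent(notificationType) + '/' +
    encodeURIComponent(blacklistId);
}

// Interpret the status code: 200 means blacklisted, 404 means not blacklisted.
// Anything else is reported as 'unknown' so the caller can decide how to fail.
function blacklistStatus(statusCode) {
  if (statusCode === 200) return 'blacklisted';
  if (statusCode === 404) return 'not-blacklisted';
  return 'unknown';
}

// Assumption: only send when the service positively reports "not blacklisted".
function shouldSend(statusCode) {
  return blacklistStatus(statusCode) === 'not-blacklisted';
}

console.log(blacklistUrl('https://your-blacklist-root/', 'sms', '555-555-5555'));
console.log(shouldSend(404)); // true
console.log(shouldSend(200)); // false
```

Treating an unexpected status as "do not send" is a design assumption made here because of the legal requirement to stop messaging on request. A real client might instead retry or raise an alert.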

The second important endpoint is a POST that receives an XML document with details about the SMS response:

<?xml version="1.0" encoding="UTF-8"?>
<moMessage messageId="123456789" receiptDate="YYYY-MM-DD HH:MM:SS Z" attemptNumber="1">
    <source address="+15555555555" carrier="" type="MDN" />
    <destination address="12345" type="SC" />
    <message>Stop texting me</message>
</moMessage>

The important bits are the source address (the phone number that sent the message) and the message itself. With those, the API can determine whether or not to add the phone number to the blacklist. If it does, the next time CNS calls the GET endpoint for that phone number, the API will return a positive result for the blacklist and CNS will not send the SMS. The POST to /mo_message lives at the top-level because it is only through coincidence that it results in blacklisting for SMS; one could imagine other endpoints at the top-level that blacklist from other types of notifications - or even multiple (depending on the type of event).
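
To make that decision concrete, here is a minimal sketch of pulling the two important fields out of the document and deciding whether to blacklist. The function names and the stop-word list are assumptions made for illustration, not the service's actual configuration. Regular expressions are used instead of a full XML parser purely to keep the example short.

```javascript
// Illustrative only: the stop-word list and helper names are assumptions.
const STOP_WORDS = /^\s*(stop|unsubscribe|cancel|quit|end)\b/i;

// Extract the sender's phone number and the message text from the moMessage XML.
function parseMoMessage(xml) {
  const source = xml.match(/<source\s+[^>]*address="([^"]*)"/);
  const message = xml.match(/<message>([\s\S]*?)<\/message>/);
  return {
    sourceAddress: source ? source[1] : null,
    message: message ? message[1] : null
  };
}

// Blacklist only when both fields are present and the message starts with a stop word.
function shouldBlacklist(parsed) {
  return Boolean(parsed.sourceAddress && parsed.message && STOP_WORDS.test(parsed.message));
}

const sample = [
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<moMessage messageId="123456789" attemptNumber="1">',
  '    <source address="+15555555555" carrier="" type="MDN" />',
  '    <destination address="12345" type="SC" />',
  '    <message>Stop texting me</message>',
  '</moMessage>'
].join('\n');

const parsed = parseMoMessage(sample);
console.log(parsed.sourceAddress);    // +15555555555
console.log(shouldBlacklist(parsed)); // true
```

Anchoring the pattern at the start of the message is a deliberate (and debatable) choice in this sketch: a reply such as "Thanks, see you soon" is ignored, but so is "please stop".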

Let’s see some code

First there are a couple functions shared across all the endpoints (and their backing Lambda functions):

// Calls callback with the normalized notification type, or responds 400 if it is unsupported.
function withSupportedType(event, context, lambdaCallback, callback) {
  const supportedTypes = ['sms'];
  if (supportedTypes.indexOf(event.pathParameters.notification_type.toLowerCase()) >= 0) {
    callback(event.pathParameters.notification_type.toLowerCase());
  } else {
    lambdaCallback(null, { statusCode: 400, body: JSON.stringify({ message: 'Notification type [' + event.pathParameters.notification_type + '] not supported.' }) });
  }
}

// Strips non-digits and drops a leading US country code so numbers compare consistently.
function sanitizeNumber(raw) {
  var numbers = raw.replace(/[^\d]+/g, '');
  if (numbers.match(/^1\d{10}$/)) numbers = numbers.substring(1, 11);
  return numbers;
}

These are there to ensure that each Lambda function is a) dealing with invalid notification_types and b) cleaning up the phone number in the same manner across all functions. Given those common functions, the amount of code for each function is fairly minimal.
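
To see why the shared clean-up matters, here is a small standalone check of sanitizeNumber (repeated so the snippet runs on its own). The sample inputs are made up. The point is that differently formatted versions of the same US number collapse to a single blacklist key.

```javascript
// Same normalization as the shared helper: keep digits only and drop a leading US "1".
function sanitizeNumber(raw) {
  var numbers = raw.replace(/[^\d]+/g, '');
  if (numbers.match(/^1\d{10}$/)) numbers = numbers.substring(1, 11);
  return numbers;
}

// Made-up sample inputs: one phone number written four different ways.
const samples = ['+1 (555) 555-5555', '555-555-5555', '15555555555', '555.555.5555'];
const keys = samples.map(sanitizeNumber);

console.log(keys);               // each entry is '5555555555'
console.log(new Set(keys).size); // 1
```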

The GET endpoint simply queries DynamoDB for the unique combination of notification_type and blacklist_id:

const AWS = require('aws-sdk'),
      dynamo = new AWS.DynamoDB();

exports.handler = (event, context, callback) => {
  const blacklistId = sanitizeNumber(event.pathParameters.blacklist_id);
  withSupportedType(event, context, callback, function(notificationType) {
    dynamo.getItem({
      TableName: event.stageVariables.TABLE_NAME,
      Key: { Id: { S: blacklistId }, Type: { S: notificationType } }
    }, function(err, data) {
      if (err) return callback(err);
      if ((data && data.Item && afterNow(data, "DeletedAt")) || !onWhitelist(blacklistId, event.stageVariables.WHITELIST)) {
        callback(null, { statusCode: 200, body: JSON.stringify({ id: blacklistId }) });
      } else {
        callback(null, { statusCode: 404, body: JSON.stringify({ message: "Entry not blacklisted" }) });
      }
    });
  });
};

// True when the timestamp property is absent or lies in the future.
function afterNow(data, propertyName) {
  if (data && data.Item && data.Item[propertyName] && data.Item[propertyName].S) {
    return Date.parse(data.Item[propertyName].S) >= new Date();
  } else {
    return true;
  }
}

// Set the whitelist in staging to only allow certain entries.
function onWhitelist(blacklistId, whitelist) {
  if (whitelist && whitelist.trim() != '') {
    const whitelisted = whitelist.split(',');
    return whitelisted.findIndex(function(item) { return blacklistId == item.trim(); }) >= 0;
  } else {
    return true;
  }
}

Disregarding the imports at the top and some minor complexity around a whitelist (which we put in place only for staging/test environments so we don’t accidentally spam people while testing), it’s about a dozen lines of code (depending on spacing) - with minimal boilerplate. This is the realization of one of the promises of the serverless approach: very little friction against getting directly to the meat of what you’re trying to do. There is nothing here about request routing or dependency-injection or model deserialization; the meaningful-code-to-boilerplate ratio is extremely high (though we’ll get to deployment later).
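
The interaction between soft-deletes and the staging whitelist is easier to follow with a few concrete cases. The sketch below repeats the two helpers and wraps the handler's condition in a hypothetical function, treatedAsBlacklisted, that runs against plain objects instead of DynamoDB. The phone numbers and timestamps are made up.

```javascript
// True when the timestamp property is absent or lies in the future.
function afterNow(data, propertyName) {
  if (data && data.Item && data.Item[propertyName] && data.Item[propertyName].S) {
    return Date.parse(data.Item[propertyName].S) >= new Date();
  }
  return true;
}

// With no whitelist configured, every number counts as whitelisted.
function onWhitelist(blacklistId, whitelist) {
  if (whitelist && whitelist.trim() != '') {
    return whitelist.split(',').findIndex(function(item) { return blacklistId == item.trim(); }) >= 0;
  }
  return true;
}

// Hypothetical wrapper around the same condition the GET handler evaluates.
function treatedAsBlacklisted(data, blacklistId, whitelist) {
  return Boolean((data && data.Item && afterNow(data, 'DeletedAt')) || !onWhitelist(blacklistId, whitelist));
}

const active = { Item: { Id: { S: '5551110000' } } };
const softDeleted = { Item: { Id: { S: '5551110000' }, DeletedAt: { S: '2000-01-01T00:00:00.000Z' } } };
const missing = {};

console.log(treatedAsBlacklisted(active, '5551110000'));      // true: entry exists and is not deleted
console.log(treatedAsBlacklisted(softDeleted, '5551110000')); // false: entry was soft-deleted in the past
console.log(treatedAsBlacklisted(missing, '5551110000'));     // false: no entry, no whitelist configured
console.log(treatedAsBlacklisted(missing, '5559999999', '5551110000, 5552220000')); // true: not on the staging whitelist
```

The last case is the staging safety net: when a whitelist is configured, any number not on it is reported as blacklisted, so test traffic can only reach the listed numbers.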

The PUT (add an entry to the blacklist, managing soft-deletes correctly)

exports.handler = (event, context, callback) => {
  const blacklistId = sanitizeNumber(event.pathParameters.blacklist_id);
  withSupportedType(event, context, callback, function(notificationType) {
    dynamo.updateItem({
      TableName: event.stageVariables.TABLE_NAME,
      Key: { Id: { S: blacklistId }, Type: { S: notificationType } },
      ExpressionAttributeNames: { '#l': 'Log' },
      ExpressionAttributeValues: {
        ':d': { S: (new Date()).toISOString() },
        ':m': { SS: [ toMessageString(event) ] }
      },
      // Upsert: stamp UpdatedAt, append to the Log set, and clear any earlier soft-delete.
      UpdateExpression: 'SET UpdatedAt=:d ADD #l :m REMOVE DeletedAt'
    }, function(err, data) {
      if (err) return callback(err);
      callback(null, { statusCode: 200, body: JSON.stringify({ id: blacklistId }) });
    });
  });
};

and DELETE (soft-delete entries when present)

exports.handler = (event, context, callback) => {
  const blacklistId = sanitizeNumber(event.pathParameters.blacklist_id);
  withSupportedType(event, context, callback, function(notificationType) {
    dynamo.updateItem({
      TableName: event.stageVariables.TABLE_NAME,
      Key: { Id: { S: blacklistId }, Type: { S: notificationType } },
      ExpressionAttributeNames: { '#l': 'Log' },
      ExpressionAttributeValues: {
        ':d': { S: (new Date()).toISOString() },
        ':m': { SS: [ toMessageString(event) ] }
      },
      // Soft-delete: keep the row and its Log, but mark it deleted as of now.
      UpdateExpression: 'SET DeletedAt=:d, UpdatedAt=:d ADD #l :m'
    }, function(err, data) {
      if (err) return callback(err);
      callback(null, { statusCode: 200, body: JSON.stringify({ id: blacklistId }) });
    });
  });
};

functions are similarly succinct. The POST endpoint that receives the moMessage XML is a bit more verbose, but only because of a few additional corner cases (i.e. when the origin phone number or the message isn’t present).

exports.handler = (event, context, callback) => {
  const moMessageXml = event.body || ''; // a missing body falls through to the 400 below
  const messageMatch = moMessageXml.match(/<message>(.*)<\/message>/);
  if (messageMatch) {
    if (messageMatch[1].toLowerCase().match(process.env.STOP_WORDS)) { // STOP_WORDS should be a Regex
      const originNumberMatch = moMessageXml.match(/<\s*source\s+.*?address\s*=\s*["'](.*?)["']/);
      if (originNumberMatch) {
        const originNumber = sanitizeNumber(originNumberMatch[1]);
        dynamo.updateItem({
          TableName: event.stageVariables.TABLE_NAME,
          Key: { Id: { S: originNumber }, Type: { S: 'sms' } },
          ExpressionAttributeNames: { '#l': 'Log' },
          ExpressionAttributeValues: {
            ':d': { S: (new Date()).toISOString() },
            ':m': { SS: [ moMessageXml ] }
          },
          UpdateExpression: 'SET UpdatedAt=:d ADD #l :m REMOVE DeletedAt'
        }, function(err, data) {
          if (err) return callback(err);
          callback(null, { statusCode: 200, body: JSON.stringify({ id: originNumber }) });
        });
      } else {
        callback(null, { statusCode: 400, body: JSON.stringify({ message: 'Missing source address' }) });
      }
    } else {
      // Not a stop word: acknowledge the message without blacklisting anyone.
      callback(null, { statusCode: 200, body: JSON.stringify({ id: '' }) });
    }
  } else {
    callback(null, { statusCode: 400, body: JSON.stringify({ message: 'Invalid message xml' }) });
  }
};

A couple things to call out here. First - and I know this looks terrible - this function doesn’t parse the XML - it instead uses regular expressions to pull out the data it needs. This is because Node.js doesn’t natively support XML parsing and importing a library to do it is not possible given my chosen constraints (the entire service defined in a CloudFormation template); I’ll explain further below. Second, there is expected to be a Lambda environment variable named STOP_WORDS that contains a regular expression to match the desired stop words (things like stop, unsubscribe, fuck you, etc).
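As a rough illustration of the regex approach, the sketch below runs the same two patterns against a made-up payload. The helper name extractStopRequest, the sample XML (only the message element and the source address attribute are taken from the handler; everything else is invented), and the stop-word pattern are all assumptions, and the real handler additionally passes the number through sanitizeNumber.

```javascript
// Hypothetical helper: returns the origin number when the message is a stop request, else null.
function extractStopRequest(moMessageXml, stopWords) {
  const messageMatch = moMessageXml.match(/<message>(.*)<\/message>/);
  if (!messageMatch) return null;                                   // no message element
  if (!messageMatch[1].toLowerCase().match(stopWords)) return null; // not a stop word
  const originMatch = moMessageXml.match(/<\s*source\s+.*?address\s*=\s*["'](.*?)["']/);
  return originMatch ? originMatch[1] : null;                       // null if the address is missing
}

// Made-up payload for illustration only.
const sampleXml =
  '<moMessage>' +
  '<source carrier="0" address="+12125550100"/>' +
  '<message>STOP</message>' +
  '</moMessage>';

// In Lambda this would come from process.env.STOP_WORDS as a pattern string;
// String.prototype.match converts a string argument to a RegExp.
const stopWords = '^\\s*(stop|unsubscribe)\\s*$';

console.log(extractStopRequest(sampleXml, stopWords));                          // +12125550100
console.log(extractStopRequest(sampleXml.replace('STOP', 'hello'), stopWords)); // null
```

Note that the pattern for the message body uses `.`, which does not match newlines, so a multi-line message would be treated as invalid XML by this approach.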

That’s pretty much the extent of the production code.

Deployment - CloudFormation

Here’s where this project gets a little verbose. Feel free to reference the final CloudFormation template as we go through this. In broad strokes, this template matches the simple architecture diagram above: API Gateway calls Lambda functions which each interact with the same DynamoDB table. The bottom of the stack (i.e. the top of the template) is fairly simple: two DynamoDB tables (one for prod, one for stage) and an IAM role that allows the Lambda functions to access them.

On top of that are the four Lambda functions - which contain the Node.js code (this is the “YO DAWG” part, since the JavaScript is in the JSON template) - plus individual permissions for API Gateway to call each function. This section (at the bottom of the template) is long but is mostly code-generated (we’ll get to that later).

In the middle of the template lie a bunch of CloudFormation resources that define the API Gateway magic: a top-level Api record; resources that define the path components under that Api; methods that define the endpoints and which Lambda functions they call; separate configurations for stage vs prod. At this point, we’re just going to avert our eyes and reluctantly admit that, okay, fine, serverless still requires some boilerplate (just not inline with the code, damn it!). At some level, every service needs to define its endpoints; this is where our blacklist nanoservice does it.

All-in, the CloudFormation template approaches 1000 lines (fully linted, mind you, so there are a bunch of lines with just tabs and curly brackets). “But wait!” you say, “Doesn’t CloudFormation support YAML now?” Why yes, yes it does. I even started writing the template in YAML until I realized I shouldn’t.

Bringing CloudFormation together with Node.js

To fully embed the Node.js functions inside the CloudFormation template would have been terrible. How would you run the code? How would you test it? A cycle of: tweak the code => deploy the template to the CloudFormation stack => manually QA - that would be a painful way of working. It’s unequivocally best to be able to write fully isolated and functioning Node.js code, plus unit tests in a standard manner. The problem is that Node.js code then needs to be zipped and uploaded to S3 and referenced by the CloudFormation template - which would create a dependency for the template and would not have achieved the goal of defining the entire service in a single template with no dependencies.

To resolve this, I wrote a small packaging script that reads the app’s files and embeds them in the CloudFormation template. This can then be run after every code change (which obviously would have unit tests and a passing CI build), to keep the template in line with all code changes. The script is written in Node.js (hey, if you’re running tests locally, you must already have Node.js installed locally), so a CloudFormation template written in JSON (as opposed to YAML) is essentially native - no parsing necessary. The script can load the template as JSON, inject a CloudFormation resource for each function in the /app directory, copy that function’s code into the resource, and iterate. Which brings us to

The other thing to note about going down the path of embedding the Node.js code directly in the CloudFormation template (as opposed to packaging it in a zip file): all code for a function must be fully contained within that function definition (other than the natively supported AWS SDK). This has two implications: first, we can’t include external libraries such as an XML parser or a Promise framework (notice all the code around callbacks, which makes the functions a little more verbose than I’d like). Second, we can’t DRY out the functions by including common functions in a shared library; thus they are repeated in the code for each individual function.
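The embedding step described above could be sketched roughly as follows. This is not the actual packaging script: the function name embedFunctions, the resource naming scheme, the options object, and the sample template are assumptions. It relies only on the fact that an AWS::Lambda::Function resource can carry its source inline through Code.ZipFile, in which case the handler is addressed as index.handler. To keep it runnable anywhere, it takes the sources as an in-memory map; a real script would presumably read them from the /app directory with fs.readFileSync and write the result back out with JSON.stringify.

```javascript
// Hypothetical sketch: returns a copy of the template with one inline Lambda resource per function.
function embedFunctions(template, sources, options) {
  const resources = Object.assign({}, template.Resources);
  Object.keys(sources).forEach(function(name) {
    resources[name + 'Function'] = {
      Type: 'AWS::Lambda::Function',
      Properties: {
        Handler: 'index.handler', // inline code is exposed as a module named "index"
        Runtime: options.runtime,
        Role: { 'Fn::GetAtt': [options.roleLogicalId, 'Arn'] },
        Code: { ZipFile: sources[name] } // the function's entire source, as a string
      }
    };
  });
  return Object.assign({}, template, { Resources: resources });
}

// Made-up inputs: a tiny template and a single function's source.
const template = {
  AWSTemplateFormatVersion: '2010-09-09',
  Resources: { LambdaRole: { Type: 'AWS::IAM::Role' } }
};
const sources = {
  Get: 'exports.handler = (event, context, callback) => callback(null, { statusCode: 200 });'
};

const packaged = embedFunctions(template, sources, {
  runtime: 'nodejs16.x',       // placeholder; use whichever Node.js runtime applies to you
  roleLogicalId: 'LambdaRole'  // logical id of the IAM role already in the template
});

console.log(Object.keys(packaged.Resources));
console.log(JSON.stringify(packaged.Resources.GetFunction.Properties.Code));
```

Because each function's source travels as a single string, anything it needs (shared helpers included) has to be inside that string, which is exactly why the common functions end up repeated.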


So that’s it: we end up with a 1000-line CloudFormation template that entirely defines a blacklist nanoservice that exposes four endpoints and runs entirely serverless. It is fully tested, can run as a true Node.js app (if you want), and will likely consume so few resources that it is essentially free. We don’t need to monitor application servers, we don’t need to administer databases, we don’t need any non-standard deployment tooling. And there are even separate stage and production versions.

You can try it out for yourself by building a CloudFormation stack using the template. Enjoy!


The POps Up Plant Shop

HBC Digital culture

How do we keep our teams happy and high-performing? That’s the focus for the People Operations (POps) team.

The POps mission is:

To build and maintain the best product development teams in the world through establishing the models around how we staff and organize our teams, how we plan and execute our work, and how we develop our people and our culture.

Our work includes:

We like to have some fun, too.

Surprise and Delight

This week we coordinated an intercontinental “POps Up Plant Shop” for our people in NYC and Dublin. Between the two offices, we distributed 350 plants. Crotons, ivies, succulents and more were on offer. Everyone loved the surprise. While POps is focused on working with our tech teams, we noticed a few folks from other departments at HBC taking plants for their desks - a good indicator that what we’re doing is working!

Beyond adding a dash of color to the office, office plants are proven to increase happiness and productivity, which aligns perfectly with the mission of the POps team.
