Use-case Cake

Cake brings together all your existing bank accounts and transactions. It analyses your financial data and habits, to make your life better. Cake makes your bank accounts pay off again, by sharing its profits with you. It’s the app that rewrites the rules. To make sure you will always get your piece.

Intro

IN4IT was selected as the go-to cloud partner for Cake, a financial data analytics firm. A boutique consultancy specialising in Amazon Web Services (AWS), IN4IT was chosen for its expertise in cloud architecture. Cake was working on its own greenfield cloud architecture, and IN4IT supported its development and implementation. Throughout the implementation, IN4IT worked on a host of robust security policies to ensure a secure environment using a least-privilege approach.

Excellent knowledge of AWS services, architecture and operations. Very fast response times on any question or request we give them. Very transparent service agreements and simply a correct way of doing business. Highly recommended!

Quote by Pieter Schelfhout, Head of Engineering at Cake

Technology

Network Security

AWS has introduced a number of security features to its services in recent years. One such feature, which Cake is leveraging to great effect, is Virtual Private Cloud (VPC) endpoints. Anyone who has experience working with AWS will be familiar with the problem which VPC endpoints solve so elegantly.

The problem occurs when you deploy all your services in a VPC with the aim of controlling your egress traffic: you can end up inadvertently sending all your AWS traffic over the NAT gateway. The reason is that AWS service endpoints are, by default, reachable only on public IP addresses. VPC endpoints solve this by making virtually every AWS service you could need reachable privately from within the VPC. This improves the overall transparency of the network and makes network security far easier.
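
One way to picture the effect is a small check on where a service hostname resolves. The Python sketch below is illustrative only and is not part of Cake's setup: the function name, the VPC CIDR and the IP addresses are assumptions. It applies to interface endpoints with private DNS enabled, where the service hostname resolves to addresses inside the VPC; gateway endpoints (used for S3 and DynamoDB) work through route tables instead and would not show up this way.

```python
import ipaddress


def stays_inside_vpc(resolved_ips, vpc_cidr):
    """Return True if every resolved address falls within the VPC CIDR.

    resolved_ips: addresses a service hostname resolved to (hypothetical input).
    vpc_cidr: the VPC's address range, e.g. "10.0.0.0/16" (hypothetical value).
    """
    network = ipaddress.ip_network(vpc_cidr)
    addresses = [ipaddress.ip_address(ip) for ip in resolved_ips]
    # An empty answer tells us nothing, so treat it as "not private".
    return bool(addresses) and all(addr in network for addr in addresses)


if __name__ == "__main__":
    # With an interface endpoint, the hostname resolves inside the VPC.
    print(stays_inside_vpc(["10.0.1.25", "10.0.2.31"], "10.0.0.0/16"))
    # Without one, it resolves to a public address and traffic leaves via NAT.
    print(stays_inside_vpc(["52.94.5.1"], "10.0.0.0/16"))
```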

Data Security

On the journey to a more comprehensive security setup, deploying all services within a VPC was only the beginning. To protect data, AWS allows customers to write Identity and Access Management (IAM) policies that control where requests come from. We build these policies so that users and roles can only gain access if the traffic originates from within the VPC. This significantly reduces the attack surface, since merely holding the credentials of a user or role is not sufficient to use the permissions it grants; the request must also come from within a specific VPC network.
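
As a rough illustration of such a policy, the Python sketch below builds an IAM policy document that denies every action unless the request arrived through a given VPC. It relies on the `aws:SourceVpc` global condition key, which AWS only includes when a request travels through a VPC endpoint. The function name and the VPC ID are placeholders, and the blanket `Action` and `Resource` values are assumptions for brevity; this is not Cake's actual policy, which would likely be scoped more narrowly.

```python
import json


def vpc_restricted_policy(vpc_id):
    """Build a policy that denies all actions for requests from outside the VPC.

    Because StringNotEquals also matches when the key is absent, requests
    that did not pass through a VPC endpoint are denied as well.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideVpc",
                "Effect": "Deny",
                "Action": "*",       # assumption: scope this down in practice
                "Resource": "*",     # assumption: scope this down in practice
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpc": vpc_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # "vpc-0123456789abcdef0" is a placeholder VPC ID.
    print(json.dumps(vpc_restricted_policy("vpc-0123456789abcdef0"), indent=2))
```

An explicit deny is used here because it overrides any allow attached elsewhere, so the restriction holds regardless of which other permissions the user or role has.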

A Unified Platform

Elastic Container Service (ECS) provides a unified platform on which developers and data scientists deploy their applications. Every microservice is dockerized on the Continuous Integration/Continuous Delivery (CI/CD) platform, pushed to the Elastic Container Registry (ECR) and scheduled as a service on ECS. IN4IT uses specialist ECS-deploy software to act as the glue between CI/CD and ECS, enabling users to easily deploy and roll back applications using YAML files in a microservice's Git repository.

In the same architecture, we also deploy ingress and egress gateways. The ingress is provided by Roxprox, which holds a definition for every REST endpoint and handles authentication and authorisation. Egress traffic passes through a bespoke forward proxy that filters HTTP and HTTPS requests by hostname. With this architecture, all inbound and outbound traffic can be filtered at a fine-grained level. The platform is also AWS App Mesh ready, which can enable enhanced communication and observability between the microservices.
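
To give an idea of what hostname filtering on a forward proxy involves, here is a minimal Python sketch. It is not the bespoke proxy's implementation: the allowlist entries and function names are hypothetical. For HTTPS the proxy never sees the encrypted request, only the `CONNECT host:port` line, which is why filtering happens on the hostname.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical allowlist; "*" matches any run of characters, including dots.
ALLOWED_HOSTS = ["api.example.com", "*.amazonaws.com"]


def host_from_request_line(request_line):
    """Extract the target hostname from a proxy request line."""
    method, target, _version = request_line.split(" ", 2)
    if method.upper() == "CONNECT":
        # HTTPS tunnel: the target has the form "host:port".
        host = target.rsplit(":", 1)[0]
    else:
        # Plain HTTP via a proxy uses an absolute URL as the target.
        host = urlsplit(target).hostname or ""
    return host.lower().rstrip(".")


def is_allowed(host, patterns=ALLOWED_HOSTS):
    """Return True if the hostname matches any allowlist pattern."""
    return any(fnmatchcase(host, pattern) for pattern in patterns)


if __name__ == "__main__":
    for line in (
        "CONNECT s3.eu-west-1.amazonaws.com:443 HTTP/1.1",
        "GET http://evil.example.org/ HTTP/1.1",
    ):
        host = host_from_request_line(line)
        print(host, "allowed" if is_allowed(host) else "blocked")
```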

Data Processing

The application layer captures data, but does not by itself provide any financial insights. The microservices therefore send the data to Kinesis Streams, which aggregates everything in the S3-based data lake.

Extract, Transform, Load (ETL) is performed by Spark jobs in AWS Glue. The main benefit of this approach is that the entire ETL pipeline is managed by AWS. This means that larger data volumes can be handled by simply increasing the number of Data Processing Units (DPUs) allocated to a job. Glue Workflows also enable the data science team to create specific workflows that trigger Glue jobs and crawlers.

Finally, once Glue has finished processing the data, Athena gives users an interface for ad-hoc analysis, so they can explore the data in a meaningful way.

Want to learn more on how we can help your organization? Let's talk!

Jorn Jambers
Published on November 14, 2019