By: Sebastian Bassi (Solution Architect – Associate)
Here is a summary of the AWS Summit San Francisco 2018 for all interested people who could not attend.
The event began with a keynote by Dr. Werner Vogels, CTO of Amazon.com. In a concert-like setting, he opened the event, made some key announcements (more on this later) and introduced several co-hosts who joined him on stage for about two hours. One of these co-hosts was Dr. Matt Wood, who talked about Amazon's commitment to Artificial Intelligence, a commitment shared by the rest of the industry, I might add.
Werner also presented startups that are using AWS services. Below we mention two: Peloton and Intuit.
- Peloton: Sells an internet-connected stationary bike that lets people work out at home while connected with Peloton instructors in NYC or with friends who have the same connected bike. They have more than a million customers, who together have burned more than 7 billion calories. The CTO and co-founder of Peloton told the audience about one event where 13,000 riders joined a communal ride at Thanksgiving, generating 20K requests per second that were fed to the leaderboard in almost real time using AWS. This was called “The Turkey Ride” and was a Guinness world record event. They also announced their next product: Peloton Tread.
You may want to watch this video on how to stream live spin classes to thousands with Loggly & AWS: https://www.youtube.com/watch?v=E0kgsGXkvu8
- Intuit: If you are in the US, you have most likely used their TurboTax software at some point. Nhung Ho, Head of Data Science at Intuit, told us that they use almost the full range of AWS services. They need elasticity because the tax preparation industry is highly seasonal. For example, on last year's peak day they handled about 2 million tax return preparations. When there is a bug, it must be fixed quickly, which is why they use all the CI/CD tools AWS provides. Data submitted for tax preparation sometimes includes medical records, so they need HIPAA compliance, which AWS provides. They have also started working on AI, with a team of 20 data scientists doing only machine learning.
With SageMaker, the scientists can focus on machine learning rather than DevOps and other roles, since many of the features they used to set up themselves now come out of the box. This reduces model implementation time from six months to one week.
They also plan to use Amazon Transcribe and Amazon Comprehend (Sentiment Analysis) to improve the user experience.
There were about a dozen announcements made during the keynote. Here is a summary of what I believe are the most important (for a complete list, see below):
- AWS Certificate Manager Private Certificate Authority (ACMPCA): ACMPCA is a managed private CA service that lets you manage the lifecycle of private certificates. To access this private CA, go to: https://console.aws.amazon.com/acm
- AWS Firewall Manager (AFM): AFM lets organizations use multiple AWS accounts and host applications in any desired region while maintaining centralized control over their organization’s security settings and profiles. Amazon already had WAF (Web Application Firewall) and Shield (DDoS protection), but AFM integrates both solutions in a centralized way. For WAF and Shield customers, AFM has a monthly fee for each protection policy created per AWS Region.
- AWS Secrets Manager: A secrets management service that helps protect access to your applications, services, and IT resources. Until today, this was a missing piece. We had to rely on leaving the keys in environment variables and using KMS to encrypt the API keys and credentials. Now we can rotate, manage and retrieve these secrets using a GUI or an API. You pay for the secrets you store in AWS Secrets Manager and for the use of these secrets.
- S3 One Zone Infrequent Access: A new storage class. The files are stored in only one availability zone. It has the same durability as S3 Standard and S3 Standard-Infrequent Access (99.999999999%) but lower availability (99.5% instead of 99.9% for Standard-IA or 99.99% for Standard). It is 20% cheaper than Standard-IA. This comes at the cost of losing resilience in the event of data center destruction. The common use case is infrequently accessed data that is re-creatable, such as secondary backup copies. See this page to compare all available storage classes.
- S3 Select: This was previously available only to selected customers and is now generally available. It uses SQL statements to retrieve subsets of data from an S3 object, so you don’t need to download the whole object to process it and extract a subset. This can lead to an 80% cost reduction and a 400% improvement in access speed. See an example in Python here.
- Amazon Translate: Like S3 Select, this service was available in a limited way and is now available to everyone. It supports translating text between English and six major languages: Spanish, Chinese, French, German, Arabic, and Portuguese. Six additional languages will be supported in the coming months: Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech. You can try it online without calling the API.
- Amazon Transcribe: This service was announced at re:Invent 2017. It supports English and Spanish, generates easy-to-read transcriptions (with capitalization and punctuation), recognizes multiple speakers (to attribute the text) and generates timestamps (for video captioning).
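To make the S3 Select item above more concrete, here is a minimal Python sketch of what such a query could look like. It assumes the object is a CSV file with a header row, and that `s3_client` is a boto3 S3 client created elsewhere (for example with `boto3.client("s3")`). The function name `select_rows` and any bucket or key names are hypothetical and used only for illustration.

```python
def select_rows(s3_client, bucket, key, sql):
    """Run an S3 Select query and return the matching rows as text.

    Sketch only: assumes a CSV object whose first line is a header.
    """
    response = s3_client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=sql,
        # Assumption: CSV input with a header line, CSV output.
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    chunks = []
    # The payload is a stream of events; only "Records" events carry row data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")
```

A call might look like `select_rows(client, "my-bucket", "customers.csv", "SELECT s.name FROM S3Object s WHERE s.city = 'NYC'")`, where the bucket, key and column names are placeholders. Only the selected rows travel over the network, which is where the savings come from.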
Not all announcements were service related. There was also a demo of the upcoming DeepLens AI-powered camera. This camera is a computer with 8 GB of RAM running Ubuntu 16.04, with custom models trained to do image recognition in real time and integration with AWS services.
The AWS Expo
There was a floor full of AWS clients and partners. The event had an AWS Certification Lounge, where certified individuals had access to comfortable armchairs, chargers, food and beverages. This is a nice recognition. They also had a reception at the end of the day, with a cocktail service only for certification holders, for networking purposes.
Between all the workshops, sessions, and talks there were more than 100 activities, and I had no chance to attend even half of the talks I wanted to. Anyway, Amazon usually publishes most of the talks on their YouTube channel, so stay tuned.
I went to Docker and Lambda related talks, because I am interested in these technologies and also think that this is where the market is heading. In this regard, it is now easier to see that the evolution of this market is more or less:
This is not as linear as the schema implies. Some workloads can’t be handled with Lambdas and may need containers or even a whole VM. In any case, the most interesting developments are related to the orchestration of Lambdas and containers, so there was a lot of activity on these subjects.
There was a startup section where new companies that use AWS showcased their architectures, challenges and solutions. There was one related to my expertise that I would like to share: Powering CRISPR with AWS Lambda, by Vineet Gopal from Benchling.com. They need to find a small DNA sequence in a genome. This sequence is used to find a target site for the CAS9 endonuclease to work on. The CAS9 complex works as molecular scissors, cutting the DNA at a specific site. It is important to cut only at this site and not at any other, to avoid unexpected effects. To find this match, they reduced the search space by using bits and hashes. Even with these transformations, they needed to scale the search horizontally, so they came up with a Lambda setup that reduced the time from hours to seconds and the costs from 100K/yr to 3.2K/yr (this was only the AWS budget, which does not factor in the DevOps cost reduction).
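The talk did not go into implementation details that I can reproduce here, so the following is only my own minimal sketch of the general "bits and hashes" idea, not Benchling's actual method. It packs each base into 2 bits and stores every fixed-length window of a genome in a hash table, so an exact lookup of a target sequence becomes a single dictionary access. All function names are hypothetical, and real guide searches also need to handle near matches, which this sketch ignores.

```python
from collections import defaultdict

# Each DNA base fits in 2 bits.
BASE_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def encode(seq):
    """Pack a DNA string into an integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_BITS[base]
    return value


def build_index(genome, k):
    """Map the 2-bit code of every k-length window to its start positions."""
    index = defaultdict(list)
    if len(genome) < k:
        return index
    mask = (1 << (2 * k)) - 1  # keeps only the last k bases
    value = encode(genome[:k])
    index[value].append(0)
    for i in range(k, len(genome)):
        # Slide the window one base to the right without re-encoding it.
        value = ((value << 2) | BASE_BITS[genome[i]]) & mask
        index[value].append(i - k + 1)
    return index


def find_sites(index, target):
    """Return the start positions where target occurs exactly."""
    return index.get(encode(target), [])
```

For example, after `index = build_index("ACGTACGTTTGACGT", 4)`, calling `find_sites(index, "ACGT")` returns `[0, 4, 11]`. Since each chunk of a genome can be indexed and searched independently, this kind of work splits naturally across many parallel function invocations.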
Another talk worth mentioning was IoT Building Blocks: From Edge Devices to Analytics in the Cloud, which showed how to integrate data captured by IoT devices with the AWS infrastructure. The example the speakers gave involved a factory or warehouse where all the workers wore a bracelet that collected accelerometer and environment data (light, dust, temperature, not body data). They integrated the information so they could know each employee’s location and also their work environment.
I thought this was going to be a regular event, but I was happily wrong. I went to as many talks and activities as time allowed. The speakers confirmed the thoughts I had regarding the industry and also opened my mind to new directions. ECS, EKS and Fargate are the places to go for DevOps, while Lambdas with Step Functions are what developers should currently be evaluating. The AI field is still on the rise, with new services and products. If you are interested in this topic, you may consider learning SageMaker.