AWS Technical Essentials
This post distills key insights from my study of "AWS Technical Essentials" from AWS Skillbuilder.
Cloud computing deployment models
On-premises
Before the cloud, companies ran their own data centers and paid for costly hardware and operations. Those high costs made scaling and experimenting difficult.
Cloud
Cloud computing offers on-demand IT resources over the internet with pay-as-you-go pricing. Providers like AWS handle hardware, letting businesses focus on innovation.
Hybrid
Hybrid setups connect on-premises and cloud resources, allowing organizations to scale by combining both infrastructures.
Comparison
On-premises requires time and money to set up environments, delaying development. Cloud enables quick replication of production setups, saving time and allowing focus on unique business needs. AWS simplifies repetitive tasks, providing scalable, cost-effective services.
AWS Global Infrastructure
Regions
The AWS Cloud spans 93 Availability Zones across 29 geographic Regions worldwide. Regions are independent locations hosting AWS data centers, named after their geographic areas (e.g., us-east-1 for Northern Virginia).
When choosing a Region, consider latency, pricing, service availability, and compliance.
Examples of Region Codes
- us-east-1: Northern Virginia
- ap-northeast-1: Tokyo
Choosing the Right Region
Latency: Pick a Region close to your users for better performance.
Pricing: Costs vary by Region due to economic factors.
Service Availability: Not all services are available in every Region.
Compliance: Choose a Region that meets data storage regulations.
Availability Zones
Each Region contains multiple Availability Zones (AZs), which are clusters of data centers with redundant power, networking, and connectivity. AZs are identified by appending letters to Region codes (e.g., us-east-1a).
For high availability, replicate workloads across multiple AZs.
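The naming convention above can be illustrated with a small helper. This is only a sketch: the regular expression is my assumption based on the examples in this post (a Region code followed by one lowercase letter), and `split_az_name` is a hypothetical function, not part of any AWS tool.

```python
import re

# Assumed pattern: Region code (e.g. "us-east-1") plus a single zone letter.
AZ_PATTERN = re.compile(r"^(?P<region>[a-z]{2}(?:-[a-z]+)+-\d+)(?P<zone>[a-z])$")


def split_az_name(az_name: str) -> tuple[str, str]:
    """Split an AZ name such as 'us-east-1a' into (region_code, zone_letter)."""
    match = AZ_PATTERN.match(az_name)
    if match is None:
        raise ValueError(f"not a recognised AZ name: {az_name!r}")
    return match.group("region"), match.group("zone")


print(split_az_name("us-east-1a"))       # ('us-east-1', 'a')
print(split_az_name("ap-northeast-1c"))  # ('ap-northeast-1', 'c')
```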
Scope of AWS Services
Services are scoped at the Availability Zone, Region, or Global level. Region-scoped services handle data durability and availability automatically, while AZ-scoped services require user configuration for resilience.
Maintaining Resiliency
Use Region-scoped managed services or replicate workloads across at least two AZs to ensure high availability and resilience.
Edge Locations
Edge locations are global points where content is cached for faster access. Amazon CloudFront uses more than 400 edge locations to deliver content with low latency and high performance.
Interacting with AWS
AWS Management Console
The AWS Management Console is a web-based interface for managing cloud resources. It is ideal for beginners to create and manage resources.
Key features:
- Services Menu: Access AWS services by category (Compute, Storage, etc.).
- Region Selector: Change the Region to make requests to services in that location. The URL updates to reflect the selected Region.
AWS CLI
The AWS CLI is a unified command-line tool for managing AWS services. It is useful for automation and handling repetitive tasks with scripts.
Features:
- Open-source tool available for Windows, Linux, and macOS.
- Allows programmatic interaction with AWS services via API calls.
AWS SDKs
AWS SDKs allow developers to integrate AWS services into their applications using popular programming languages such as Python, JavaScript, and Java.
Features:
- Open source and maintained by AWS.
- Supports integration with various AWS services through programming languages.
AWS Root User Best Practices
Root User Overview
The AWS root user has full access to all services and resources. You sign in as the root user with the email address and password used during account creation. Programmatic access relies on access keys, which consist of an access key ID and a secret access key.
Best Practices
- Limit Root User Usage: Use it only for critical tasks.
- Delete Access Keys: Avoid creating root user access keys. If you must have them, store them securely.
- Enable MFA: Add a second layer of security using supported MFA devices.
Multi-Factor Authentication (MFA)
MFA adds a second layer of security by combining:
- Something you know: Password or PIN.
- Something you have: One-time code from a device.
Types of MFA
- Virtual MFA: Software apps (e.g., Google Authenticator) generate time-based codes.
- Hardware TOTP Tokens: Physical devices generate one-time codes.
- FIDO Security Keys: Physical USB or NFC keys for secure access.
Activating MFA on the root user is a critical AWS security best practice.
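To make "time-based codes" more concrete, here is a minimal sketch of the standard TOTP algorithm (RFC 6238). I am assuming HMAC-SHA1 and a 30-second time step, which are the RFC defaults; the parameters a particular MFA app or hardware token uses are not covered by the course, and the shared secret below is the RFC's published test secret, not anything AWS-specific.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Return the time-based one-time code for a shared secret at a given time."""
    # Both sides derive the same counter from the current time window.
    counter = unix_time // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte slice.
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


# Test secret and timestamp taken from the RFC 6238 test vectors.
rfc_secret = b"12345678901234567890"
print(totp(rfc_secret, 59, digits=8))  # 94287082
```

Because the code depends only on the shared secret and the clock, the device and the server can each compute it independently and compare results.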
AWS Identity and Access Management
Authentication
Authentication verifies the user's identity. It answers the question: "Are you who you say you are?" The most common forms of authentication are usernames and passwords, though it can also include token-based authentication or biometric methods like fingerprints.
Authorization
Authorization determines what actions a user can perform on AWS resources. After authentication, it answers: "What can you do?" It grants specific permissions, like creating, editing, or deleting resources.
IAM Overview
AWS Identity and Access Management (IAM) helps you manage access to your AWS account. IAM is used to define who can access your account and what they can do once inside. It offers a granular way of managing permissions to AWS services and resources.
IAM Features
- Global integration with AWS services
- Shared access
- Multi-factor authentication
- Identity federation
- Free to use
IAM Users
An IAM user represents an individual or service interacting with AWS. Users can be granted access to the AWS Management Console or programmatic access via the AWS CLI/API.
IAM users have long-term credentials that remain valid until an administrator rotates or removes them.
IAM Groups
Groups are collections of IAM users that inherit permissions assigned to the group. Groups help manage permissions at scale. For example, a "developer" group could include all users working on development tasks.
Key Group Features:
- Users can belong to multiple groups.
- Groups cannot belong to other groups.
IAM Policies
IAM policies define permissions to access AWS resources. They are attached to IAM users, groups, or roles. When a request is made by an IAM identity, AWS evaluates the associated policies to determine if the action should be allowed or denied.
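As a rough illustration of the allow-or-deny decision, the sketch below builds a policy document as a Python dictionary and evaluates an action against it. The `Version`, `Statement`, `Effect`, `Action`, and `Resource` keys are the real policy fields, but the statements themselves are made up, and `is_allowed` is a deliberately simplified stand-in that I wrote for this post: it ignores resources, conditions, and every other policy type, and only shows two rules, namely that requests are denied by default and that an explicit Deny overrides any Allow.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical policy used only for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject", "s3:ListBucket"], "Resource": "*"},
        {"Effect": "Deny", "Action": "s3:DeleteObject", "Resource": "*"},
    ],
}


def is_allowed(policy: dict, action: str) -> bool:
    """Simplified check: default deny, explicit Deny wins over Allow."""
    allowed = False
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        # Wildcards such as "s3:*" are matched with shell-style patterns here.
        if any(fnmatchcase(action, pattern) for pattern in actions):
            if statement["Effect"] == "Deny":
                return False
            allowed = True
    return allowed


print(json.dumps(policy, indent=2))
print(is_allowed(policy, "s3:GetObject"))      # True
print(is_allowed(policy, "s3:DeleteObject"))   # False
print(is_allowed(policy, "ec2:RunInstances"))  # False
```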
Amazon ECS (Elastic Container Service)
- End-to-end container orchestration service.
- Runs on:
  - AWS Fargate: Serverless infrastructure.
  - Amazon EC2 Instances: Manual cluster management with ECS Agent.
- Uses Task Definitions (JSON) to configure resources like CPU, memory, and networking.
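For a feel of what a task definition contains, here is a sketch that builds one as a Python dictionary and prints it as JSON. The field names follow the task definition format as I understand it; the family name, image, and sizes are placeholders I chose for illustration.

```python
import json

# Hypothetical task definition: names, image, and sizes are placeholders.
task_definition = {
    "family": "example-web",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "256",     # CPU units for the whole task
    "memory": "512",  # memory in MiB for the whole task
    "containerDefinitions": [
        {
            "name": "web",
            "image": "example/web:latest",
            "essential": True,
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        }
    ],
}

rendered = json.dumps(task_definition, indent=2)
print(rendered)
```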
Amazon EKS (Elastic Kubernetes Service)
- Managed Kubernetes service for containerized workloads.
- Uses worker nodes (Kubernetes nodes) to run containers.
- Containers are organized in Pods.
- Best for teams familiar with Kubernetes needing advanced orchestration.
Key Differences
Feature | Amazon ECS | Amazon EKS |
---|---|---|
Technology | AWS native | Kubernetes |
Compute Options | Fargate, EC2 | Kubernetes nodes |
Containers | Tasks | Pods |
Use Case | Simple, AWS-focused. | Kubernetes-based, advanced control. |
Removing Undifferentiated Heavy Lifting with AWS
The Challenge
When running workloads on Amazon EC2, AWS manages the physical hardware, but you handle:
- Guest operating system
- Security and patching
- Networking and scaling
With Amazon ECS or EKS, AWS manages container deployment and clustering, but you’re still responsible for maintaining EC2 instances.
Go Serverless
Serverless computing removes the burden of managing servers. Key features:
- No servers to provision or manage
- Automatic scaling with usage
- Pay only for active resources
- Built-in availability and fault tolerance
Serverless Services
- AWS Fargate: Run containers without managing EC2 instances.
- AWS Lambda: Execute code in response to events, only paying for execution time.
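A Lambda function in Python is essentially a handler that receives an event and a context object. The sketch below uses the conventional `lambda_handler(event, context)` signature; the event shape (a `name` field) and the response format are assumptions I made so the example can be run locally.

```python
import json


def lambda_handler(event, context):
    """Build a greeting from the incoming event (hypothetical event shape)."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local invocation with a hand-written event; no AWS account is involved.
print(lambda_handler({"name": "AWS"}, None))
```

In the real service the function runs only when an event arrives, which is why you pay for execution time rather than for an idle server.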
Networking Basics
What is Networking?
Networking connects computers worldwide, enabling communication. AWS demonstrates this through its global infrastructure of data centers, Availability Zones, and Regions.
Networking Analogy: Sending a Letter
- The letter represents the data being sent.
- The sender's address is the source of the message.
- The recipient's address is the destination of the message.
IP Addresses
Just like a home has a mailing address, computers have IP addresses to ensure messages are routed correctly.
Example of a 32-bit address in binary: 11000000 10101000 00000001 00011110
IP addresses are typically displayed in decimal format as IPv4 addresses, e.g., 192.168.1.30.
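The conversion between the two notations is just base-2 to base-10 on each group of eight bits. A small sketch (the helper names are mine):

```python
import ipaddress


def binary_to_dotted(binary_text: str) -> str:
    """Convert four space-separated 8-bit groups into dotted decimal."""
    return ".".join(str(int(octet, 2)) for octet in binary_text.split())


def dotted_to_binary(address: str) -> str:
    """Convert a dotted-decimal IPv4 address into four 8-bit groups."""
    bits = format(int(ipaddress.IPv4Address(address)), "032b")
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))


print(binary_to_dotted("11000000 10101000 00000001 00011110"))  # 192.168.1.30
print(dotted_to_binary("192.168.1.30"))  # 11000000 10101000 00000001 00011110
```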
CIDR Notation
Classless Inter-Domain Routing (CIDR) allows representation of IP address ranges. For instance:
- 192.168.1.30: A single IP address.
- 192.168.1.0/24: A range with the first 24 bits fixed, leaving 256 possible addresses.
In AWS, CIDR ranges vary from /28 (16 IPs) to /16 (65,536 IPs).
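The address counts follow from the number of bits left free: a /24 leaves 32 - 24 = 8 bits, and 2^8 = 256. Python's standard `ipaddress` module can confirm the numbers above; the 10.0.0.0 ranges are arbitrary examples.

```python
import ipaddress


def address_count(cidr: str) -> int:
    """Number of addresses in a CIDR block, i.e. 2 ** (32 - prefix length)."""
    return ipaddress.ip_network(cidr).num_addresses


print(address_count("192.168.1.0/24"))  # 256
print(address_count("10.0.0.0/28"))     # 16
print(address_count("10.0.0.0/16"))     # 65536

# A single address is inside a range when the fixed bits match.
print(ipaddress.ip_address("192.168.1.30") in ipaddress.ip_network("192.168.1.0/24"))  # True
```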
What is a VPC?
A Virtual Private Cloud (VPC) is an isolated network you create within the AWS Cloud, similar to a traditional on-premises network. When creating a VPC, you specify:
- Name of the VPC
- Region: The VPC spans all Availability Zones within the selected region.
- IP Range: Defined in CIDR notation, supporting up to five CIDRs.
AWS provisions the network and allocates IP addresses based on this information.
Creating Subnets
Subnets are smaller networks within a VPC, comparable to VLANs in traditional networks. Subnets can be:
- Public: For internet-connected resources.
- Private: For resources not connected to the internet.
When creating a subnet, you specify:
- The VPC to associate it with.
- The Availability Zone where it resides.
- The IPv4 CIDR block, which must be a subset of the VPC's range.
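The "subset of the VPC's range" rule can be checked before creating anything. This sketch assumes a VPC range of 10.0.0.0/16 and AZ names that I picked for illustration; it carves two /24 subnets out of the VPC and verifies that a candidate block fits.

```python
import ipaddress

VPC_CIDR = "10.0.0.0/16"  # assumed VPC range for this example


def fits_in_vpc(subnet_cidr: str, vpc_cidr: str = VPC_CIDR) -> bool:
    """True when the subnet block lies entirely inside the VPC block."""
    return ipaddress.ip_network(subnet_cidr).subnet_of(ipaddress.ip_network(vpc_cidr))


# Take the first two /24 blocks of the VPC and pair them with example AZs.
blocks = list(ipaddress.ip_network(VPC_CIDR).subnets(new_prefix=24))[:2]
plan = dict(zip(["us-east-1a", "us-east-1b"], (str(block) for block in blocks)))

print(plan)                        # {'us-east-1a': '10.0.0.0/24', 'us-east-1b': '10.0.1.0/24'}
print(fits_in_vpc("10.0.1.0/24"))  # True
print(fits_in_vpc("10.1.0.0/24"))  # False
```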
High Availability
To ensure fault tolerance, create at least two subnets across different Availability Zones. If one AZ fails, resources in the other AZ remain accessible.
Reserved IPs
AWS reserves five IP addresses per subnet for routing, DNS, and network management. For example, in a /24 subnet with 256 IP addresses, only 251 are available for use.
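The arithmetic is easy to reproduce. In the sketch below, the reserved addresses are the first four and the last one in the subnet, which is how I understand the VPC documentation; the 10.0.0.0 ranges are arbitrary examples.

```python
import ipaddress

RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Addresses left for your resources after the five reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - RESERVED_PER_SUBNET


def reserved_addresses(cidr: str) -> list[str]:
    """First four addresses plus the last address of the subnet."""
    network = ipaddress.ip_network(cidr)
    return [str(network[i]) for i in range(4)] + [str(network[-1])]


print(usable_addresses("10.0.0.0/24"))    # 251
print(usable_addresses("10.0.0.0/28"))    # 11
print(reserved_addresses("10.0.0.0/24"))  # ['10.0.0.0', '10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.255']
```

The /28 case shows why very small subnets fill up quickly: almost a third of the block is reserved.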
Gateways
Internet Gateway
Enables communication between your VPC and the internet. Highly available and scalable, it acts like a modem for your network.
Virtual Private Gateway
Used to connect your VPC to another private network via a VPN. Requires a customer gateway on the other side of the connection.
AWS Direct Connect
Provides a dedicated physical connection between your on-premises data center and your VPC over an Ethernet link. Traffic stays on the AWS global network instead of crossing the public internet.
Main Route Table
When a VPC is created, AWS automatically creates a main route table. This table contains rules (routes) that dictate the direction of network traffic. By default, the main route table allows traffic between all subnets in the local network.
- You cannot delete the main route table.
- A gateway route table cannot be set as the main route table.
- You can replace the main route table with a custom subnet route table.
- Routes in the main route table can be added, removed, or modified.
- Subnets can be explicitly associated with the main route table, even if they are already implicitly associated.
Custom Route Tables
Subnets without an explicitly associated route table use the main route table. However, for specific use cases, such as isolating application components, you can create custom route tables. These allow distinct routes for each subnet, such as:
- Frontend subnet with routes for external access.
- Database subnet with routes limited to internal access.
Custom route tables include a default local route for communication between resources and subnets inside the VPC. For better control, explicitly associate subnets with custom route tables, leaving the main route table unchanged.
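To see how a route table decides where traffic goes, here is a simplified lookup that picks the most specific matching route. The two tables, the 10.0.0.0/16 VPC range, and the `igw-example` target are hypothetical; real route tables support more target types than this sketch models.

```python
import ipaddress

# Hypothetical tables: a public subnet with an internet route, and a
# database subnet that only has the default local route.
frontend_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}
database_routes = {"10.0.0.0/16": "local"}


def pick_target(route_table: dict, destination_ip: str):
    """Return the target of the most specific route matching the destination."""
    destination = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if destination in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route, so the traffic cannot leave the subnet
    return max(matches, key=lambda item: item[0].prefixlen)[1]


print(pick_target(frontend_routes, "10.0.1.25"))      # local
print(pick_target(frontend_routes, "93.184.216.34"))  # igw-example
print(pick_target(database_routes, "93.184.216.34"))  # None
```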
Storage Types
1. File Storage
Data is stored in file hierarchies with metadata such as name and path. Ideal for centralized file sharing and management.
Use cases:
- Web serving
- Analytics
- Media
- Home directories
2. Block Storage
Data is split into addressable blocks for fast and efficient access. Ideal for critical and high-performance applications.
Use cases:
- Transactional workloads
- Containers
- Virtual machines
3. Object Storage
Data is stored as objects in a flat structure within buckets. Scalable for large or unstructured data sets.
Use cases:
- Data archiving
- Backup and recovery
- Rich media
Comparison with Traditional Systems
- Block Storage: Similar to DAS/SAN.
- File Storage: Similar to NAS servers.
- Object Storage: Best for scalability and flexibility.
Cloud Benefits
With cloud storage, you can create and modify storage solutions in minutes, unlike the rigidity of traditional data centers.
1. Amazon Elastic File System (Amazon EFS)
Amazon EFS is a scalable file system that automatically adjusts capacity as files are added or removed. It provides consistent performance without the need for manual provisioning. EFS supports both AWS compute services and on-premises resources, allowing multiple instances to connect simultaneously.
Key Features:
- Simple web interface for quick setup.
- No minimum fees or setup costs.
- Pay only for used storage.
- Support for multiple storage classes.
Storage Classes:
- EFS Standard: Multi-AZ resilience, high durability, and availability.
- EFS Standard-IA (Infrequent Access): Cost-effective, lower access frequency.
- EFS One Zone: Saves data in a single availability zone for cost savings.
- EFS One Zone-IA: Lower cost for infrequent access data within a single zone.
2. Amazon FSx
Amazon FSx is a fully managed service offering a variety of high-performance file systems. It provides reliability, security, scalability, and flexibility for running file systems in the cloud.
Available File Systems:
- Lustre: High-performance file system for fast processing.
- NetApp ONTAP: Ideal for enterprise-grade data management and storage solutions.
- OpenZFS: Open-source file system with advanced features.
- Windows File Server: Managed Windows file system for seamless integration with Windows workloads.
Benefits:
- Fully managed for ease of use.
- Multiple file system options for different workloads.
- Scalable and secure.
Amazon S3 Overview
Bucket Names
Bucket names must be 3-63 characters, contain only lowercase letters, numbers, dots, and hyphens, and start and end with a letter or number.
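The rules listed above translate into a short regular expression. This sketch checks only those rules; S3 documents further restrictions that it does not cover, and `is_valid_bucket_name` is a helper name I made up.

```python
import re

# First and last character: letter or digit. Middle: letters, digits, dots, hyphens.
# Total length 3 to 63 characters.
BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the length and character rules listed in this post."""
    return BUCKET_NAME.fullmatch(name) is not None


print(is_valid_bucket_name("my-bucket-2024"))  # True
print(is_valid_bucket_name("My_Bucket"))       # False
print(is_valid_bucket_name("ab"))              # False
print(is_valid_bucket_name("-bucket"))         # False
```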
Object Keys
Object keys uniquely identify objects, and prefixes/delimiters can simulate folder structures in the console.
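The folder illusion can be reproduced with plain string operations. The sketch below is my own approximation of how a listing with a prefix and a delimiter groups keys; the key names are invented and no S3 call is made.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Split keys under a prefix into direct 'files' and one-level 'folders'."""
    files, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts like a subfolder.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            files.append(key)
    return sorted(files), sorted(folders)


keys = ["photos/2024/cat.jpg", "photos/2024/dog.jpg", "photos/readme.txt", "index.html"]

print(list_folder(keys))             # (['index.html'], ['photos/'])
print(list_folder(keys, "photos/"))  # (['photos/readme.txt'], ['photos/2024/'])
```

The bucket itself stays flat: "photos/2024/" exists only because two keys happen to share that prefix.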
Use Cases
- Backup and storage
- Media hosting
- Software delivery
- Data lakes
- Static websites
Security
By default, S3 resources are private. You can adjust permissions with IAM and S3 policies, and encrypt data in transit and at rest.
Storage Classes
S3 offers various storage classes, such as Standard, Intelligent-Tiering, Glacier, and others, based on access frequency and cost.
Amazon EBS
Amazon EBS is designed for data that changes frequently and must persist through instance stops, terminations, or hardware failures. It has two types of volumes: SSD-backed and HDD-backed.
- SSD-backed volumes: Performance depends on IOPS, ideal for transactional workloads like databases and boot volumes.
- HDD-backed volumes: Performance depends on MBps, suitable for throughput-intensive workloads like big data, data warehouses, log processing, and sequential data I/O.
Key Features of Amazon EBS:
- Block storage.
- Pay for what you provision (storage must be provisioned in advance).
- Volumes are replicated across multiple servers in a single Availability Zone.
- Most EBS volumes can only be attached to a single EC2 instance at a time.
Databases on AWS
AWS Service(s) | Database Type | Use Cases |
---|---|---|
Amazon RDS, Aurora, Amazon Redshift | Relational | Traditional applications, ERP, CRM, ecommerce |
DynamoDB | Key-value | High-traffic web applications, ecommerce systems, gaming applications |
Amazon ElastiCache for Memcached, Amazon ElastiCache for Redis | In-memory | Caching, session management, gaming leaderboards, geospatial applications |
Amazon DocumentDB | Document | Content management, catalogs, user profiles |
Amazon Keyspaces | Wide column | High-scale industrial applications for equipment maintenance, fleet management, route optimization |
Neptune | Graph | Fraud detection, social networking, recommendation engines |
Timestream | Time series | IoT applications, Development Operations (DevOps), industrial telemetry |
Amazon QLDB | Ledger | Systems of record, supply chain, registrations, banking transactions |
Breaking up applications and databases
As the industry changes, applications and databases change too. Larger applications today are rarely supported by a single database. Instead, applications are broken into smaller services, each backed by its own purpose-built database. This shift replaces the idea of a one-size-fits-all database with a complementary database strategy, so you can give each database the functionality, performance, and scale its workload requires.
Amazon CloudWatch Overview
Amazon CloudWatch is a monitoring and observability service that provides insights into AWS resources and applications. It automatically collects metrics from AWS services, enabling users to monitor resources like EC2 instances, set up dashboards, and create alarms. In this demo, the goal was to create a CloudWatch dashboard to track EC2 instance CPU utilization and set up an alarm for notifications when CPU usage exceeds 70% for more than 5 minutes.
Steps Covered:
1. Creating a Dashboard:
- Navigate to CloudWatch in the AWS console.
- Create a dashboard to display CPU utilization for an EC2 instance.
- Use a line graph to visualize the metric and save the dashboard.
2. Using CloudWatch Alarms:
- Set up an alarm to trigger when CPU utilization exceeds 70% for 5 minutes.
- Configure actions such as sending an email alert using Amazon SNS when the alarm is triggered.
- Create an SNS topic and specify the recipient for notifications.
CloudWatch Alarm States:
Amazon CloudWatch Alarms can be in three possible states:
- OK: This state means that the metric being monitored is within the defined threshold or normal range. The alarm is not triggered because the metric is operating as expected.
- ALARM: The alarm enters this state when the metric crosses the threshold defined for the alarm, indicating that there may be a potential issue. For example, if CPU utilization exceeds 70% for a certain period, the alarm would enter this state.
- INSUFFICIENT_DATA: This state occurs when CloudWatch does not have enough data to determine whether the metric is within an acceptable range or not. This could happen when an instance is in a starting or stopping state, or if no data has been sent for that metric.
Each of these states allows you to configure actions that can be triggered, such as sending notifications through Amazon SNS when the state changes (e.g., from OK to ALARM).
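The three states can be modelled with a few lines of Python. This is a simplified sketch, not how CloudWatch is implemented: I am assuming one CPU datapoint per minute, that all of the last five datapoints must breach the threshold, and that a missing datapoint is represented by `None`.

```python
def alarm_state(datapoints, threshold=70.0, periods=5):
    """Classify the most recent datapoints as OK, ALARM, or INSUFFICIENT_DATA."""
    recent = datapoints[-periods:]
    if len(recent) < periods or any(value is None for value in recent):
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


print(alarm_state([50, 75, 80, 90, 85, 72]))  # ALARM
print(alarm_state([80, 80, 60, 80, 80]))      # OK
print(alarm_state([80, 80]))                  # INSUFFICIENT_DATA
```

The second case shows why the duration matters: a single dip below 70% within the window keeps the alarm from firing.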
CloudWatch allows for real-time monitoring of AWS resources, providing essential insights into application performance. Custom metrics can also be added to CloudWatch for a more comprehensive view of application health.
Types of Load Balancers
In this guide, we will cover three types of load balancers: Application Load Balancer (ALB), Network Load Balancer (NLB), and Gateway Load Balancer (GLB).
Load Balancer Types Overview
- Application Load Balancer (ALB): Operates at Layer 7 of the OSI model, ideal for HTTP/HTTPS traffic.
- Network Load Balancer (NLB): Operates at Layer 4 of the OSI model, best for TCP/UDP traffic.
- Gateway Load Balancer (GLB): Helps manage third-party virtual appliances, supports Layer 3 gateway and Layer 4 load balancing.
Application Load Balancer (ALB)
An ALB is ideal for balancing HTTP and HTTPS traffic. It evaluates listener rules to route traffic based on request content.
Key Features:
- Routes traffic based on request data
- Sends responses directly to the client
- Uses TLS offloading
- Authenticates users
- Secures traffic
- Supports sticky sessions
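Routing on request content can be pictured as an ordered list of rules with a default at the end. The sketch below models path-based rules only; the patterns and target group names are hypothetical, and real listener rules can match on more than the path.

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order; the first match wins.
listener_rules = [
    ("/api/*", "api-targets"),
    ("/images/*", "image-targets"),
]
DEFAULT_TARGET_GROUP = "web-targets"


def route(path: str) -> str:
    """Return the target group for a request path, falling back to the default."""
    for pattern, target_group in listener_rules:
        if fnmatchcase(path, pattern):
            return target_group
    return DEFAULT_TARGET_GROUP


print(route("/api/orders/42"))    # api-targets
print(route("/images/logo.png"))  # image-targets
print(route("/index.html"))       # web-targets
```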
Network Load Balancer (NLB)
A Network Load Balancer works at Layer 4, routing TCP and UDP traffic based on IP protocol data.
Key Features:
- Sticky sessions: Routes requests from the same client to the same target
- Low latency: Ideal for latency-sensitive applications
- Preserves source IP address
- Supports static IP and Elastic IP addresses
- DNS failover: Uses Amazon Route 53 for traffic management
Gateway Load Balancer (GLB)
A Gateway Load Balancer helps distribute traffic to third-party appliances like firewalls, intrusion detection systems, and more.
Key Features:
- High availability: Routes traffic through healthy virtual appliances
- Monitoring: Monitored using CloudWatch metrics
- Streamlined deployments: Deploy virtual appliances from the AWS Marketplace
- Private connectivity: Connects resources over a private network
Choosing the Right Load Balancer
You can select between the ELB service types based on the features required for your application. Below is a comparison table of major features:
Feature | ALB | NLB | GLB |
---|---|---|---|
Load Balancer Type | Layer 7 | Layer 4 | Layer 3 gateway and Layer 4 load balancing |
Target Type | IP, instance, Lambda | IP, instance, ALB | IP, instance |
Protocol Listeners | HTTP, HTTPS | TCP, UDP, TLS | IP |
Static IP and Elastic IP Address | No | Yes | No |
Preserve Source IP Address | Yes | Yes | Yes |
Fixed Response | Yes | No | No |
User Authentication | Yes | No | No |