W H I T E P A P E R
www.persistent.com
© 2017 Persistent Systems Ltd. All rights reserved.
10.2.2 AWS Components in detail
10.2.2.1 AWS Data Pipeline
AWS Data Pipeline is a web service that helps to reliably process and move data between different AWS compute
and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, we can
regularly access data where it is stored, transform and process it at scale, and efficiently transfer the results to
AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR. AWS Data Pipeline
is a reliable, easy-to-use, flexible, scalable, and transparent web service for data processing and data transfer.
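To make "at specified intervals" concrete, the following is a minimal sketch of how a daily schedule might be expressed as pipeline objects. It assumes the id/name/fields layout (with `stringValue` and `refValue` entries) accepted by the `put_pipeline_definition` call of the boto3 `datapipeline` client. The helper names `to_pipeline_object` and `daily_schedule_objects` and the object id `DailySchedule` are hypothetical and are not taken from this whitepaper.

```python
def to_pipeline_object(object_id, name, string_fields, ref_fields=None):
    """Convert plain dicts into the id/name/fields layout of a pipeline object.

    string_fields become {"key", "stringValue"} entries; ref_fields become
    {"key", "refValue"} entries that point at another object's id.
    """
    fields = [{"key": k, "stringValue": v} for k, v in string_fields.items()]
    fields += [{"key": k, "refValue": v} for k, v in (ref_fields or {}).items()]
    return {"id": object_id, "name": name, "fields": fields}


def daily_schedule_objects():
    """Build a Default object that refers to a once-a-day Schedule object."""
    schedule = to_pipeline_object(
        "DailySchedule",  # hypothetical object id
        "DailySchedule",
        {
            "type": "Schedule",
            "period": "1 day",
            "startAt": "FIRST_ACTIVATION_DATE_TIME",
        },
    )
    default = to_pipeline_object(
        "Default",
        "Default",
        {"scheduleType": "cron"},
        {"schedule": "DailySchedule"},
    )
    return [default, schedule]


if __name__ == "__main__":
    for obj in daily_schedule_objects():
        print(obj)
```

In a real deployment, a list like this would be combined with data node and activity objects and passed as `pipelineObjects` to `put_pipeline_definition`, together with the pipeline ID returned by `create_pipeline`. The sketch only builds the structure and makes no AWS calls.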
References
https://aws.amazon.com/datapipeline/
https://aws.amazon.com/blogs/aws/category/aws-data-pipeline/
10.2.2.2 AWS Lambda
AWS Lambda is a compute service that lets users run code without provisioning or managing servers. AWS
Lambda executes code only when needed and scales automatically, from a few requests per day to thousands
per second. AWS Lambda can run code in response to events, such as changes to data in an
Amazon S3 bucket or an Amazon DynamoDB table, or in response to HTTP requests through Amazon API Gateway.
Code can also be invoked directly through API calls made with the AWS SDKs.
Lambda performs operational and administrative activities for users, including capacity provisioning, scaling, high
availability, monitoring fleet health, applying security patches, deploying the code, running a web service front
end, and monitoring and logging the user’s functions. Supported runtimes include Node.js, Python, Java, and C#
through .NET Core.
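As an illustration of the event-driven model described above, the following is a minimal sketch of a Python handler that reacts to an Amazon S3 event notification. It assumes the standard S3 notification layout (`Records`, each with `s3.bucket.name` and `s3.object.key`). The returned summary dictionary and the sample bucket and key are illustrative choices, not part of this whitepaper.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Entry point that Lambda calls once per S3 event notification.

    Collects the location of every object mentioned in the event. Object keys
    arrive URL-encoded in S3 notifications, so they are decoded first.
    """
    touched = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        touched.append(f"s3://{bucket}/{key}")
    # Illustrative return value; a real function would process each object here.
    return {"processed": len(touched), "objects": touched}


if __name__ == "__main__":
    # Hypothetical event for local testing; no AWS account is needed.
    sample_event = {
        "Records": [
            {
                "s3": {
                    "bucket": {"name": "demo-bucket"},
                    "object": {"key": "reports/2017+q1.csv"},
                }
            }
        ]
    }
    print(lambda_handler(sample_event, None))
```

Because the handler is a plain function, it can be exercised locally with a hand-built event before it is deployed. Lambda supplies the real `event` and `context` arguments at run time.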
References
http://docs.aws.amazon.com/lambda/latest/dg/welcome.html
10.2.2.3 Amazon Redshift
Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud.
Amazon Redshift significantly lowers the cost of a data warehouse and makes it easy to analyze large amounts
of data quickly. It is optimized specifically for data warehousing and provides petabyte scale, automated
backups, encryption, network isolation, and fault tolerance.
References
https://aws.amazon.com/redshift/
http://docs.aws.amazon.com/redshift/latest/mgmt/overview.html
https://en.wikipedia.org/wiki/Amazon_Redshift
10.2.2.4 Amazon RDS
Amazon Relational Database Service (Amazon RDS) is a distributed relational database service by Amazon
Web Services. It is a web service running in the cloud, designed to simplify the setup, operation, and scaling
of a relational database for use in applications. Complex administration processes such as patching the database
software, backing up databases, and enabling point-in-time recovery are managed automatically. Storage
and compute resources can be scaled with a single API call. Amazon RDS provides six familiar database
engines to choose from: Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL
Server.
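The single-call scaling mentioned above can be sketched as follows, assuming the boto3 RDS client and its `modify_db_instance` operation. The function name `scale_db_instance`, the stand-in `RecordingClient` class, and the instance identifier, instance class, and storage size are hypothetical values chosen for illustration.

```python
def scale_db_instance(rds_client, instance_id, instance_class, storage_gib):
    """Request new compute and storage sizes for one DB instance in one call.

    rds_client is expected to behave like boto3.client("rds").
    """
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=instance_class,   # compute size
        AllocatedStorage=storage_gib,     # storage size in GiB
        ApplyImmediately=True,            # do not wait for the maintenance window
    )


class RecordingClient:
    """Hypothetical stand-in for a boto3 RDS client; it only records requests."""

    def __init__(self):
        self.calls = []

    def modify_db_instance(self, **kwargs):
        self.calls.append(kwargs)
        return {"DBInstance": {"DBInstanceIdentifier": kwargs["DBInstanceIdentifier"]}}


if __name__ == "__main__":
    client = RecordingClient()  # swap in boto3.client("rds") against a real account
    scale_db_instance(client, "reporting-db", "db.m4.large", 200)
    print(client.calls[0])
```

Passing the client in as an argument keeps the sketch runnable without AWS credentials. With a real client, the request is sent to the RDS API, which applies the change to the named instance.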
References
https://aws.amazon.com/rds/
https://en.wikipedia.org/wiki/Amazon_Relational_Database_Service