WHITEPAPER
www.persistent.com
© 2017 Persistent Systems Ltd. All rights reserved.
10.1.2 GCP Components in detail
10.1.2.1 Cloud Dataflow
Cloud Dataflow is a fully managed data processing service that supports both stream and batch execution of
pipelines. It is used to transfer data from multiple sources, such as Avro files, BigQuery tables, Bigtable, Datastore,
Pub/Sub, and text files, to multiple sinks. Developers can create a custom source by extending one of the
Dataflow SDK’s abstract Source subclasses, such as BoundedSource or UnboundedSource, and a custom sink by
extending the abstract Sink base class. It provides features such as autoscaling and integrates easily with tools
such as Cloud Storage, Cloud Pub/Sub, Cloud Datastore, Cloud Bigtable, and BigQuery.
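The distinction between bounded (batch) and unbounded (stream) sources can be illustrated with a small toy model. The sketch below is not the Dataflow SDK API: the names ToyBoundedSource, ToyUnboundedSource, ToySink, and run_pipeline are hypothetical and use only the Python standard library. It shows only the idea that the same transform can run over a finite input or over an endless one.

```python
import itertools
from abc import ABC, abstractmethod
from typing import Callable, Iterable, Iterator, List, Optional


class ToySource(ABC):
    """Hypothetical stand-in for an abstract source; not the Dataflow SDK class."""

    @abstractmethod
    def read(self) -> Iterator[str]:
        """Yield records one at a time."""


class ToyBoundedSource(ToySource):
    """Finite input, comparable to reading a text file in batch mode."""

    def __init__(self, lines: Iterable[str]) -> None:
        self._lines = list(lines)

    def read(self) -> Iterator[str]:
        return iter(self._lines)


class ToyUnboundedSource(ToySource):
    """Endless input, comparable to a message stream."""

    def read(self) -> Iterator[str]:
        return (f"event-{i}" for i in itertools.count())


class ToySink:
    """Hypothetical sink that collects records in memory."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def write(self, record: str) -> None:
        self.records.append(record)


def run_pipeline(source: ToySource, sink: ToySink,
                 transform: Callable[[str], str],
                 limit: Optional[int] = None) -> None:
    """Apply the same transform to any source; `limit` stops an endless stream."""
    records = source.read()
    if limit is not None:
        records = itertools.islice(records, limit)
    for record in records:
        sink.write(transform(record))


batch_sink = ToySink()
run_pipeline(ToyBoundedSource(["a", "b"]), batch_sink, str.upper)

stream_sink = ToySink()
run_pipeline(ToyUnboundedSource(), stream_sink, str.upper, limit=3)
```

In a real pipeline the service, rather than a `limit` argument, decides how an unbounded source is windowed and when results are emitted.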
References
https://cloud.google.com/dataflow/model/custom-io-java
https://cloud.google.com/dataflow/
10.1.2.2 Cloud Functions
Google Cloud Functions is a lightweight, event-based, asynchronous compute solution that allows users to
create small, single-purpose functions that respond to cloud events without the need to manage a server or a
runtime environment. Events from Google Cloud Storage and Google Cloud Pub/Sub can trigger Cloud Functions
asynchronously, or you can use HTTP invocation for synchronous execution. It runs in a fully managed, serverless
environment where GCP handles the servers, operating systems, and runtime environments, so developers can
focus on building solutions. It provides event-based services, Cloud Pub/Sub triggers, Cloud Storage triggers,
HTTPS invocation, logging and monitoring, and GitHub/Bitbucket support. Functions are written in Node.js and
deployed using gcloud.
References
https://cloud.google.com/functions/
10.1.2.3 Cloud Pub/Sub
Cloud Pub/Sub is a fully-managed real-time messaging service. It allows developers to send and receive
messages between independent applications. Cloud Pub/Sub’s flexibility can be leveraged to decouple systems
and components hosted on Google Cloud Platform or elsewhere on the Internet. By building on the same
technology Google uses, Cloud Pub/Sub is designed to provide “at least once” delivery at low latency with
on-demand scalability to 1 million messages per second (and beyond). It encrypts all messages on the wire and at
rest, providing data security and protection.
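Because delivery is “at least once”, a subscriber may receive the same message more than once and should be written to tolerate duplicates. The following is a minimal sketch of one common approach, deduplicating by message ID. It uses only the Python standard library and does not call the Cloud Pub/Sub client; the class IdempotentSubscriber, its receive method, and the in-memory ID set are assumptions made for illustration.

```python
from typing import Callable, List, Set


class IdempotentSubscriber:
    """Processes each message ID once, even if the message is delivered repeatedly."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self._handler = handler
        # In-memory only (an assumption of this sketch); a real system would
        # persist the seen IDs so that they survive restarts.
        self._seen_ids: Set[str] = set()

    def receive(self, message_id: str, payload: str) -> bool:
        """Return True if the payload was processed, False if it was a duplicate."""
        if message_id in self._seen_ids:
            return False  # redelivery: skip the handler
        self._handler(payload)
        self._seen_ids.add(message_id)  # record only after the handler succeeds
        return True


processed: List[str] = []
subscriber = IdempotentSubscriber(processed.append)

# Simulated deliveries: message "m1" arrives twice, as at-least-once delivery permits.
deliveries = [("m1", "order created"), ("m2", "order paid"), ("m1", "order created")]
results = [subscriber.receive(mid, body) for mid, body in deliveries]
print(results)    # [True, True, False]
print(processed)  # ['order created', 'order paid']
```

Recording the ID only after the handler returns means a failed handler call leaves the message eligible for reprocessing on redelivery.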
References
https://cloud.google.com/pubsub/
10.1.2.4 StackDriver Logging
StackDriver Logging is a GCP component that allows users to store, search, analyze, monitor, and alert on log data
and events from GCP and Amazon Web Services (AWS). It integrates with Cloud Pub/Sub, Splunk, Google Cloud
Storage, and Google BigQuery for handling log entries, and it allows users to browse log data and create metrics
from log data. The StackDriver Logs Viewer, the APIs, and the gcloud CLI can be used to access Audit Logs, which
capture all the admin and data access events within GCP.
References
https://cloud.google.com/logging/