

WHITEPAPER
© 2017 Persistent Systems Ltd. All rights reserved.
www.persistent.com
—
Cloud-based services, whether they operate as high-level analytics services or as foundational platform services,
address some security requirements while introducing new challenges. The service provider may secure the
platform and network to a high degree of assurance yet offer no visibility into who has accessed what. In
such cases, it is desirable to build an auditing framework (or reuse an existing one) on premises to control data
access.
—
Compliance: Compliance requirements can come from both internal and external sources. Organizations
may have to adhere to regulatory requirements or to requirements imposed by customers or partners. Typical
data-related compliance requirements that might affect a cloud provider include PCI DSS, HIPAA, and SOX.
—
Visibility: Extended governance requires visibility into cloud operations, including ETL, archiving, and the like.
Cloud providers offer tools and protection strategies to (i) avoid problems that may occur during normal
operations and (ii) support service-level agreements. These are summarized in the table of concerns and
protection strategies below.
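The on-premises auditing framework mentioned above can be illustrated with a minimal Python sketch. It is not taken from any particular product: the class `AuditedStore`, the `fetch` callback standing in for a call to the cloud service, and the in-memory access-control dictionary are all hypothetical names chosen for illustration. The idea is only that every access attempt, whether allowed or denied, is recorded locally before the request reaches the provider.

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List, Set


@dataclass
class AuditRecord:
    timestamp: float
    user: str
    action: str
    resource: str
    allowed: bool


class AuditedStore:
    """Wraps a data-access function and records who accessed what."""

    def __init__(self, fetch: Callable[[str], bytes], acl: Dict[str, Set[str]]):
        self._fetch = fetch  # hypothetical call into the cloud service
        self._acl = acl      # user -> set of resources the user may read
        self.log: List[AuditRecord] = []

    def read(self, user: str, resource: str) -> bytes:
        allowed = resource in self._acl.get(user, set())
        # Record the attempt before acting, so denied requests are audited too.
        self.log.append(AuditRecord(time.time(), user, "read", resource, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not read {resource}")
        return self._fetch(resource)

    def export(self) -> str:
        """Serialize the audit trail as JSON lines for an on-premises sink."""
        return "\n".join(json.dumps(asdict(r)) for r in self.log)
```

In a real deployment the log would be written to durable, tamper-resistant storage rather than kept in memory, and the access-control data would come from the organization's identity system.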
7.3.4 Big data
The following best practices apply to the performance and security management of a big data environment.
7.3.4.1 Performance
—
Hadoop is a flexible, general-purpose environment for the many forms of processing presented in section
3.2 above. The same data in Hadoop can be accessed and transformed with Hive, Pig, HBase, Spark and
MapReduce (MR) code written in a variety of languages, even simultaneously. Choose the tooling that
provides optimal performance for your use case as depicted below.
—
MR applies massively parallel computation to the data, but it is a batch operation and is too slow for interactive
workloads. Hive on MapReduce and Pig inherit this problem.
—
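Partitioning the data sets is the single most important best practice for speeding up computations on data
lakes, because a query that filters on the partition key reads only the matching partitions instead of
scanning the whole data set.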
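The effect of partitioning can be sketched in a few lines of plain Python. The example below is only an illustration: it mimics the `key=value` directory layout commonly used for partitioned tables, and the helper names `write_partitioned` and `read_partition` are hypothetical, not part of Hive or any other tool. It assumes every row has the same columns in the same order. The point is that a read restricted to one partition value opens only that directory.

```python
import csv
import os
import tempfile
from typing import Dict, Iterator, List


def write_partitioned(root: str, rows: List[Dict[str, str]], key: str) -> None:
    """Write rows into one directory per distinct value of `key` (key=value/)."""
    for row in rows:
        part_dir = os.path.join(root, f"{key}={row[key]}")
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-0.csv")
        is_new = not os.path.exists(path)
        fields = [f for f in row if f != key]  # the key lives in the path
        with open(path, "a", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=fields)
            if is_new:
                writer.writeheader()
            writer.writerow({f: row[f] for f in fields})


def read_partition(root: str, key: str, value: str) -> Iterator[Dict[str, str]]:
    """Read only the directory for key=value; other partitions are never opened."""
    path = os.path.join(root, f"{key}={value}", "part-0.csv")
    if not os.path.exists(path):
        return
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            row[key] = value  # recover the partition value from the path
            yield row
```

Choosing the partition key is the main design decision: it should match the filter used by most queries (a date is a common choice) without producing a very large number of tiny partitions.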
| Concern | Protection strategy |
|---|---|
| Accidental information disclosure | Permissions; file, partition, volume or application-level encryption |
| Data integrity compromise | Permissions; data integrity checks; backup / restore; versioning |
| Accidental deletion | Permissions; backup; versioning |
| System, infrastructure, hardware or software availability | Backup / restore; replication |
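As an illustration of the "data integrity checks" strategy listed in the table, the following Python sketch records a SHA-256 checksum per object and later reports objects that are missing or altered. The function names and the in-memory dictionary standing in for stored objects are assumptions made for this example; cloud providers expose their own checksum mechanisms, which are not shown here.

```python
import hashlib
from typing import Dict, List


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(objects: Dict[str, bytes]) -> Dict[str, str]:
    """Record one checksum per object name at write time."""
    return {name: checksum(data) for name, data in objects.items()}


def verify(objects: Dict[str, bytes], manifest: Dict[str, str]) -> List[str]:
    """Return the names that are missing or whose content no longer matches."""
    bad = []
    for name, expected in manifest.items():
        data = objects.get(name)
        if data is None or checksum(data) != expected:
            bad.append(name)
    return sorted(bad)
```

A check like this only detects a problem. Recovery still relies on the backup, restore and versioning strategies listed alongside it.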