

[19] M. Hammer and J. Champy, Reengineering the Corporation: A Manifesto for Business Revolution. New York,
HarperCollins, 1993.
[20] D.B. Rubin. Multiple imputation after 18+ years. Journal of the American Statistical Association, 91:473-489,
1996.
[21] T. Dasu et al., FIT to Monitor Feed Quality, VLDB 2015,
http://www.vldb.org/pvldb/vol8/p1728-dasu.pdf
[22] A. Gruenheid et al., Incremental Record Linkage, VLDB 2014,
http://www.vldb.org/pvldb/vol7/p697-gruenheid.pdf
[23] Jeff Dean, Numbers you should know, slides 14 to 19 in presentation,
https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf
[24] David DeWitt and Jim Gray, Parallel Database Systems: The Future of High Performance Database Systems,
CACM June 1992 Vol 35 No 6,
http://people.eecs.berkeley.edu/~brewer/cs262/5-dewittgray92.pdf
[25] TPC-H: an ad-hoc, decision support benchmark,
http://www.tpc.org/tpch/
[26] Beyond SQL: Speeding up Spark with DataFrames,
http://www.slideshare.net/databricks/spark-sqlsse2015public
[27] S. Agarwal, B. Mozafari, A. Panda, H. Milner, S. Madden, I. Stoica, BlinkDB: Queries with Bounded Errors and
Bounded Response Times on Very Large Data, ACM EuroSys 2013, Prague, Czech Republic.
[28] http://kylin.apache.org/
[29] http://druid.io/druid.html
[30] https://lens.apache.org/
[31] http://www.atscale.com/
[32] https://nifi.apache.org/
[33] https://storm.apache.org/
[34] https://kafka.apache.org/
[35] http://airbnb.io/projects/airflow/
[36] Martin Fowler and Rebecca Parsons. Domain-Specific Languages. Addison-Wesley, 2011.