Monthly archive for March 2016

Augmented Data Warehouse



Over the last few years, you have invested heavily in building your Data Warehouse and Business Intelligence capabilities. If you have done it right, you have been reaping meaningful benefits from them too. But we all know that the warehouse handles only a moderate amount of your structured data. We are now in the Big Data era, and you need to draw insights from large data sets, structured and unstructured, internal as well as external.

Read more


Getting Started With Apache Flink 1.0



Apache Flink is the new star in town. It is stealing the thunder from Apache Spark, at least in stream processing, where Spark has been creating buzz for some time now. This is because Spark Streaming is built on top of RDDs, which are essentially collections, not streams. So now is the right time to try your hand at Flink, even more so since Flink 1.0 was released last week.

Read more


Flink Streaming – Tumbling and Sliding Windows



This article looks at two types of windows in Flink: the Tumbling Window and the Sliding Window. The main difference between them is that tumbling windows never overlap, whereas sliding windows can overlap.
In this article, I explain these two windows and show how to write a Scala program for each of them. The code used in this blog is also available on my GitHub.

Read more
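To make the overlap difference concrete, here is a minimal sketch in plain Python. It is not Flink code and does not use the Flink API; the function names `tumbling_window` and `sliding_windows` are hypothetical, and the sketch assumes integer timestamps with windows aligned to time zero.

```python
def tumbling_window(timestamp, size):
    """Return the single [start, end) window that contains timestamp."""
    start = timestamp - (timestamp % size)
    return (start, start + size)


def sliding_windows(timestamp, size, slide):
    """Return every [start, end) window that contains timestamp.

    Windows start at multiples of `slide` and are `size` long, so an
    element can fall into several windows when slide < size.
    """
    # Latest window start that is not after the timestamp.
    start = timestamp - (timestamp % slide)
    windows = []
    # Walk backwards while the window still covers the timestamp.
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


if __name__ == "__main__":
    # Tumbling: each element belongs to exactly one window.
    print(tumbling_window(7, size=10))
    # Sliding: with size 10 and slide 5, each element belongs to two windows.
    print(sliding_windows(7, size=10, slide=5))
```

With a size of 10 and a slide of 5, the element at timestamp 7 lands in both `(0, 10)` and `(5, 15)`, while the tumbling version places it only in `(0, 10)`. Setting the slide equal to the size makes the sliding case behave like the tumbling one.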


Spark RDDs Simplified – Part 2



This is Part 2 of the blog Spark RDDs Simplified. In this part, I cover Persistence, Broadcast variables and Accumulators. You can read the first part here, where I talked about Partitions, Actions/Transformations and Caching.

Read more



ADDRESS

222 S Church St
Charlotte, NC 28202
Phone: (+1) 704 804 1090
Website: http://datalakes.com

PRIVACY POLICY

Important: Datalakes is committed to protecting the privacy of our subscribers and prospective subscribers. We want to provide a safe, secure user experience. Please review our Privacy Policy and Terms of Use.