Apache Spark for Big Analytics (Updated for Spark Summit and Release 1.0.1)
Updated and bumped July 10, 2014. For a PowerPoint version on SlideShare, go here. Introduction: Apache Spark is an open source distributed computing framework for advanced analytics in Hadoop....
Spark Summit 2014 Roundup
Key highlights from the 2014 Spark Summit: Spark is the single most active project in the Hadoop ecosystem; among Hadoop distributors, Cloudera and MapR are clear leaders with Spark; SAP now offers a...
Spark Summit East: A Report (Updated)
Updated with links to slides where available. Some links are broken; conference organizers have been notified. Spark Summit East 2015 met on March 18 and 19 at the Sheraton Times Square in New York...
Big Analytics Roundup (May 11, 2015)
Lots of news this week, to compensate for last week’s lame haul. In an excellent post on O’Reilly Radar, Ben Lorica surveys the landscape of workbooks, notebooks and workflow tools, which he...
Big Analytics Roundup (June 22, 2015)
Last week’s Spark Summit is the big news driver for this roundup: on the Databricks blog, Scott Walent recaps the summit here; Anmol Rajpurohit writes KDnuggets’ play-by-play for Day One and Day Two; My...
Big Analytics Roundup (July 13, 2015)
Light news this week, likely due to summer vacations. Story of the week: Microsoft announces Spark in Azure. Shivon Zilis spends three months compiling a list of 2,529 analytic startups, creates this...
Big Analytics Roundup (July 27, 2015)
Top stories this week: Palantir’s valuation grows, Continuum Analytics gets a bump, Cloudera announces a Python interface for Impala, and we have a winner in KDD Cup 2015. Nate Desmond chronicles...
Looking Ahead: Big Analytics in 2016
Every year around this time I review last year’s forecast and publish some thoughts about the coming year. 2015 Assessment. First, a brief review of my predictions for 2015: (1) Apache Spark usage will...
Big Analytics Roundup (July 5, 2016)
Quite a few open source announcements this week. One of the most interesting is Apache Bahir, which includes a number of bits spun out from Apache Spark. It’s another indicator of the size and strength...
Big Analytics Roundup (September 12, 2016)
On a Google blog, Kaz Sato describes how a Japanese farmer uses TensorFlow to classify cucumbers. Very good. Perhaps now Google can set TensorFlow to work figuring out how to comply with EU...
Big Analytics Roundup (September 26, 2016)
Note to readers: Recently, I’ve noticed that news about events that occur on Tuesdays seems stale by the time I publish on Monday. Beginning this week, I’m shifting to a new publication model, posting...
Spark is the Future of Analytics
At the 2016 Spark Summit, Gartner Research Director Nick Heudecker asked: Is Spark the Future of Data Analysis? It’s an interesting question, and it requires a little parsing. Nobody believes that...