Something recently caught my eye: Google’s cloud Hadoop nodes offering. It is a really powerful tool, and I’d like to do my part to spread the word. In a nutshell, Hadoop is a framework for distributed storage and processing of large data sets, and Google makes it incredibly easy to spin up. Many organizations have embraced Hadoop for big data, but they often run it on on-site hardware. That hardware is easy to max out, which is what makes building your own Hadoop cluster by renting nodes from Google so appealing.
What Google Has to Say
Google boasts quick start-up times, fast I/O, and the ability to run at scale.
Hadoop scales fast on Google Cloud Platform
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
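The “simple programming models” mentioned above refers to MapReduce: you write a map function and a reduce function, and the framework handles distribution, grouping, and failure recovery. As a rough sketch of what that model looks like, here is a tiny in-process word count in Python; the function names and the single-machine “shuffle” step are illustrative only, not Hadoop’s actual API.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit (word, 1) pairs for each word in a line of input.
    for word in line.split():
        yield (word.lower(), 1)


def reduce_phase(word, counts):
    # Reduce step: sum all counts emitted for one word.
    return (word, sum(counts))


def run_wordcount(lines):
    # Shuffle step: group intermediate pairs by key, as the framework
    # would do across the cluster before calling each reducer.
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_phase(line):
            groups[key].append(value)
    return dict(reduce_phase(k, v) for k, v in groups.items())


if __name__ == "__main__":
    print(run_wordcount(["big data big cluster", "data node"]))
```

On a real cluster the map calls run in parallel on many nodes, and the framework re-runs any task whose node fails, which is how Hadoop delivers high availability at the application layer.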
Sign Me Up
Yeah, we’d like to leverage this internally at Arke. We would love to move our “Map My Brand” project for our Twitter account onto it, which would let us monitor our social traffic better. To use it, you simply feed keywords into the system, and it then watches Twitter for those keywords. A lot of people geotag their tweets; we take that location data and plot the matching tweets as a heat map. The biggest motivator driving this project is that we’ve hit a wall with our hardware.
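The core of that workflow, filtering tweets by keyword and bucketing their coordinates into grid cells for a heat map, can be sketched in a few lines. This is only an illustration of the idea: the tweet tuples, keyword list, and one-degree grid size are all assumptions, not our production code.

```python
from collections import Counter


def matches(text, keywords):
    # Keep a tweet if it mentions any of the monitored keywords.
    text = text.lower()
    return any(k.lower() in text for k in keywords)


def heat_map_bins(tweets, keywords, cell_deg=1.0):
    """Bucket matching geotagged tweets into lat/lon grid cells.

    `tweets` is assumed to be (lat, lon, text) tuples; each cell's
    count becomes one intensity value on the heat map.
    """
    heat = Counter()
    for lat, lon, text in tweets:
        if matches(text, keywords):
            cell = (int(lat // cell_deg), int(lon // cell_deg))
            heat[cell] += 1
    return heat


if __name__ == "__main__":
    sample = [
        (33.7, -84.4, "love my brand"),
        (33.9, -84.3, "brand rocks"),
        (40.7, -74.0, "unrelated chatter"),
    ]
    print(heat_map_bins(sample, ["brand"]))
```

The appeal of Hadoop here is that the binning step is embarrassingly parallel: each node can count its own slice of the tweet stream, and a reduce step merges the per-cell counts.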
A quick search on the web turns up businesses leveraging Hadoop in many ways, from Facebook to eBay to Netflix and beyond. The possibilities are wide open. Ask yourself how you could use Hadoop nodes now, given that the hardware problem is so easy to eliminate.