Keeping the Trains Running: Effective Troubleshooting for Hadoop

September 12, 2016
Dez Blanchfield


Image credit: flickr @anonymous

Pepperdata in the Tech Lab – Keeping the Trains Running: Effective Troubleshooting for Hadoop

blog post by Dez Blanchfield, Chief Data Scientist, The Bloor Group

Pepperdata Tech Lab – Part 2


In Tech Lab 2, the second of our four-part series with Pepperdata, we delved deeper into the road to success with Hadoop. We covered many of the key challenges faced by almost all Hadoop deployments. In particular, we discussed how even a small cluster will before long present challenges, and how resolving those challenges takes time when approached in a manual or human-dependent manner.

It quickly becomes obvious to anyone who sets out to deploy and maintain a Hadoop cluster of any size that the day-to-day administration and performance management of the underlying hardware, operating system, security, patching, storage, log files, and the general ecosystem is far from trivial.

Add to that the reality that even for the smallest of clusters (such as the small twelve-node compute farm we deployed in Pepperdata Tech Lab 1), there are still many hurdles to overcome once the cluster is up and running.

On the webcast, we discussed one of the biggest questions on everyone’s mind when they set out to deploy a Hadoop cluster:

“How can your organization avoid the common pitfalls with distributed computing,
and many of the common challenges of multiple users & mixed workloads?”

If you were not able to join us for the live broadcast of this second webcast, you can still sign up for the on-demand version to join me and the Pepperdata team as we demonstrate how to identify and resolve some of the more vexing issues with parallel computing.

In this second of four episodes, I had the opportunity to discuss key insights on the topic of distributed computing on Hadoop with Kirk Lewis of Pepperdata. Kirk gave us an invaluable insider’s look at the Pepperdata platform, and how it has been designed to provide increased visibility for troubleshooting, along with automated, adaptive performance management to improve cluster performance and guarantee service levels for critical Hadoop workloads.

Kirk and I discussed the deployment and installation of a full production-grade Pepperdata instance, which I built from scratch on our Tech Lab cluster, the whole “zero to hero journey”, in very little time and with little difficulty.

You’ll be interested to note that throughout the installation and deployment of the Pepperdata software, and the configuration of the centralised dashboard, my only tech support question was a quick email to one of Pepperdata’s Field Engineers, to verify the location to place the Pepperdata software license file once the Pepperdata software tool suite was installed.

Embarrassingly, I later realised that the very question I had emailed about was clearly answered, with foolproof instructions and screenshots, on the very next page of the manual. I would have found it had I just read the whole chapter on Installing Pepperdata. Evidence indeed that it’s important to read the manual in full.

Of course, the upside was that it proved just how straightforward the deployment was: my one and only question throughout the installation was already documented, had I simply read the full chapter. That is a testament to both the Pepperdata development team and the technical documentation team. Naturally, I had a good laugh at myself as a result.

We did go a little over time on this episode, but it was worth the extra ten minutes, as our audience put an excellent series of questions to us. Be sure not to miss what we covered during this bonus ten minutes of Q&A.

I would go so far as to say we were peppered (pardon the pun) with a wonderful array of probing questions, which let us help attendees better understand how clusters can be optimized at machine speed with the Pepperdata tools. Many of them are likely to be questions you too will have if you are deploying Hadoop.

If you missed the live episode of this webinar, don’t panic, we record them for you to enjoy in your own time.

Pepperdata Tech Lab Webinar 2: Keeping the Trains Running: Effective Troubleshooting for Hadoop

Also, if you missed the live episode of part 1 in the series, relax, we do record them for you to enjoy in your own time.

Pepperdata Tech Lab Webinar 1: The Perils & Pitfalls of Distributed Computing – How to Get Your Operations Back on Track

For more information about our sponsor partner Pepperdata, please visit their website:

