ISC14 - Interview: Overlay HPC cluster stack information
At ISC14 Christian had an interview with Rich Brueckner from insideHPC about his QNIBTerminal BoF session. Slides of the talk can be found in this post.
At ISC14 I gave a Birds-of-a-Feather talk about the benefits of overlaying multiple information layers within the HPC cluster stack. The topic debuted at OSDC14 (post with video here). I also had a videotaped interview with Rich Brueckner from insideHPC, which is available here.
Yesterday I pimped the way to build the cluster; now it is time to start the beast. For now it is a simple bash function; there must be a smarter way... fabfile, I heard... :)
The cluster looks like this...
Last time I gave a walkthrough of the docker cluster. This time around I would like to enable more people to bootstrap it.
So I polished the bashrc functions to fetch and build the necessary git repositories. The next post should use Python Fabric to spin it up.
In my previous post I described what drove me to give docker a spin and create a virtual HPC cluster stack.
This post provides a step-by-step guide to running a basic QNIBTerminal with four nodes. Getting this one going does not take a lot of horsepower. I ran it on a 3-core AMD machine from back in the day. Even a VM should be able to lift it.
On my way home (at least to an intermediate stop at my mother's) from OSDC2014, I guess it's time to recap the last couple of weeks.
I gave a talk whose title reads 'Understand your data-center by overlaying multiple information layers'. The pain point I had in mind when I submitted the talk came from my SysOps days, debugging an InfiniBand problem that was connected to other layers of the stack we were dealing with. After being frustrated about it, I chose to use my BSc thesis to tackle this problem. The outcome was an OpenSM plug-in to monitor InfiniBand that did not scale. :) But the basics were not that bad, so I revisited the topic with some state-of-the-art log management (logstash) and performance measurement (graphite) experience I gained over the last couple of months. Et voilà, it scales better...
At OSDC14 in Berlin Christian debuted QNIBTerminal, a framework to spin up a complete cluster software stack. The talk was about overlaying multiple information layers to correlate metrics and events throughout the cluster stack.
In 2013 Christian was at the ISC13 conference in Leipzig to talk about the current state of monitoring with regard to HPC systems (event detail).
In 2012 Christian teamed up with a then-colleague and gave a talk about the state of InfiniBand monitoring at ISC12.
ISC 12 BoF: InfiniBand? Problems? Do you care? from slideshare.net/sciecomp
Christian's first public talk was given at the MMB & DFT Conference in 2012, where he presented the tool he developed during his BSc report (link to proceedings).