Category: Development

  • Querying Apache Hadoop Resource Manager with Python.

    I was recently asked to write a script that would monitor the running applications on the Apache Hadoop Resource Manager.

    I wandered over to the Apache Hadoop Cluster Application Statistics API. The API lets you query most of the information that you see in the web UI: cluster status, cluster metrics, scheduler information, information about the nodes in the cluster, and information about the applications on the cluster.

    I started by querying the cluster info.

    # Python 2 (urllib2 does not exist in Python 3)
    import urllib2
    import json

    # Base URL of the Resource Manager web services
    resource_manager = 'http://resourcemanager:8088'

    info_url = resource_manager + "/ws/v1/cluster/info"

    request = urllib2.Request(info_url)

    # If you prefer to work with XML, replace json below with xml
    request.add_header('Accept', 'application/json')

    response = urllib2.urlopen(request)
    data = json.loads(response.read())

    # Pretty-print the parsed response
    print json.dumps(data, sort_keys=True, indent=4, separators=(',', ': '))
    
    

    This returns the following:

    {
        "clusterInfo": {
            "haState": "ACTIVE",
            "hadoopBuildVersion": "2.6.0-cdh5.7.0 from c00978c67b0d3fe9f3b896b5030741bd40bf541a by jenkins source checksum b2eabfa328e763c88cb14168f9b372",
            "hadoopVersion": "2.6.0-cdh5.7.0",
            "hadoopVersionBuiltOn": "2016-03-23T18:36Z",
            "id": 1478120586043,
            "resourceManagerBuildVersion": "2.6.0-cdh5.7.0 from c00978c67b0d3fe9f3b896b5030741bd40bf541a by jenkins source checksum deb0fdfede32bbbb9cfbda6aa7e380",
            "resourceManagerVersion": "2.6.0-cdh5.7.0",
            "resourceManagerVersionBuiltOn": "2016-03-23T18:43Z",
            "rmStateStoreName": "org.apache.hadoop.yarn.server.resourcemanager.recovery.NullRMStateStore",
            "startedOn": 1478120586043,
            "state": "STARTED"
        }
    }
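
    The listing above is Python 2 (urllib2 and the print statement). On Python 3, roughly the same request can be sketched with urllib.request. The helper names build_request and fetch_json are my own, not part of any Hadoop client library, and the host name is a placeholder.

```python
import json
import urllib.request


def build_request(base_url, path):
    """Build a GET request that asks the Resource Manager for JSON."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    request.add_header("Accept", "application/json")
    return request


def fetch_json(request, opener=urllib.request.urlopen):
    """Send the request and parse the JSON body.

    `opener` is injectable so the function can be exercised without a cluster.
    """
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (needs a reachable Resource Manager; the host name is a placeholder):
#   info = fetch_json(build_request("http://resourcemanager:8088", "/ws/v1/cluster/info"))
#   print(json.dumps(info, sort_keys=True, indent=4))
```

    Keeping the request construction separate from the network call makes it easy to point the same code at other endpoints under /ws/v1/cluster.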
    

    Now on to what I actually needed: querying the Resource Manager about running applications. The Cluster Applications API lets you retrieve a collection of resources, each of which represents an application. Multiple query parameters can be specified to filter the results; for the full list, see the Cluster Applications API documentation.

    However, I only needed the information on running applications, which looks something like this:

    # Python 2 (urllib2 does not exist in Python 3)
    import urllib2
    import json

    # Base URL of the Resource Manager web services
    resource_manager = 'http://resourcemanager:8088'

    # Only return applications that are currently running
    info_url = resource_manager + "/ws/v1/cluster/apps?states=running"

    request = urllib2.Request(info_url)

    # If you prefer to work with XML, replace json below with xml
    request.add_header('Accept', 'application/json')

    response = urllib2.urlopen(request)
    data = json.loads(response.read())

    # Pretty-print the parsed response
    print json.dumps(data, sort_keys=True, indent=4, separators=(',', ': '))
    

    This returns something like:

    {
        "apps": {
            "app": [
                {
                    "allocatedMB": 24576,
                    "allocatedVCores": 3,
                    "amContainerLogs": "http://resourcemanager:8042/node/containerlogs/container_1478120586043_15232_01_000001/hdfs",
                    "amHostHttpAddress": "resourcemanager:8042",
                    "applicationTags": "",
                    "applicationType": "MAPREDUCE",
                    "clusterId": 1478120586043,
                    "diagnostics": "",
                    "elapsedTime": 18009,
                    "finalStatus": "UNDEFINED",
                    "finishedTime": 0,
                    "id": "application_1478120586043_15232",
                    "logAggregationStatus": "NOT_START",
                    "memorySeconds": 431865,
                    "name": "SELECT 1 AS `number_of_records...TIMESTAMP))(Stage-1)",
                    "numAMContainerPreempted": 0,
                    "numNonAMContainerPreempted": 0,
                    "preemptedResourceMB": 0,
                    "preemptedResourceVCores": 0,
                    "progress": 54.07485,
                    "queue": "root.hdfs",
                    "runningContainers": 3,
                    "startedTime": 1479156085020,
                    "state": "RUNNING",
                    "trackingUI": "ApplicationMaster",
                    "trackingUrl": "http://resourcemanager:8088/proxy/application_1478120586043_15232/",
                    "user": "hdfs",
                    "vcoreSeconds": 51
                }
            ]
        }
    }
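
    For a monitoring script, the full dump is usually more than you need. Here is a minimal Python 3 sketch that boils the parsed response down to one line per application. The function name summarize_apps and the output format are my own choices, and the guard for a null "apps" value assumes that is how the Resource Manager reports "no matching applications"; check that against your cluster.

```python
def summarize_apps(data):
    """Return one summary line per application in a /ws/v1/cluster/apps response."""
    # Guard against a missing or null "apps" value (assumed to mean "no matches").
    apps = (data.get("apps") or {}).get("app", [])
    lines = []
    for app in apps:
        lines.append(
            "{id} user={user} queue={queue} progress={progress:.1f}% "
            "mem={allocatedMB}MB vcores={allocatedVCores}".format(**app)
        )
    return lines


# Trimmed copy of the response shown above
sample = {
    "apps": {
        "app": [
            {
                "id": "application_1478120586043_15232",
                "user": "hdfs",
                "queue": "root.hdfs",
                "progress": 54.07485,
                "allocatedMB": 24576,
                "allocatedVCores": 3,
            }
        ]
    }
}

for line in summarize_apps(sample):
    print(line)
```

    The same function can be fed the data dictionary produced by the query script, since it only relies on field names that appear in the response.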
    

    Straightforward and simple to use.

  • MongoDB Script for counting records in collections in all the databases

    Here is a quick script I wrote for a co-worker.

    var host = "localhost";
    var port = 27000;
    // List every database on the server we are connected to
    var dbslist = db.adminCommand('listDatabases');

    for (var d = 0; d < dbslist.databases.length; d++) {
        var dbName = dbslist.databases[d].name;
        // Connect to this database without overwriting the shell's global db
        var currentDb = connect(host + ":" + port + "/" + dbName);
        var collections = currentDb.getCollectionNames();
        for (var i = 0; i < collections.length; i++) {
            var name = collections[i];
            // Skip collections whose names start with 'system'
            if (name.substr(0, 6) != 'system') {
                print("\t" + dbName + "." + name + ' = ' + currentDb[name].count() + ' records');
            }
        }
    }
    

     

  • 2014 AT&T Developer Summit

    I will be attending the AT&T Developer Summit in Las Vegas. I will also be taking part in the Summit Hackathon.

    “The AT&T Summit Hackathon is the premier hackathon of the year for the AT&T Developer Program. This year will be focused on wearable technologies and participants will be able to choose between a Wearables Track and an AT&T API Track. Finalists from each track will be featured in live fast pitches on stage with our executives during the keynote at the AT&T Developer Summit on January 6th. In addition, competitors will also have the ability to complete in accelerator challenges, details to be announced, which will offer prizes of up to $10,000 for eligible teams”

  • SIC:// AT&T Hackathon 2013

    Well, since my last blog entry I went to my first hackathon, the Seattle Interactive Conference / SIC:// AT&T Hackathon. I ended up joining a team with two complete strangers, Joan Jasak and Arunabh Verma, and we ended up presenting at the conference and taking 3rd place. Check out the video: http://vimeo.com/78582527

  • Something new, writing Android App in HTML 5

    I wanted to write an App for my Android Tablet, something easy so I could get a basic understanding of what it would take to write an app. I figured that I could knock something out fairly quickly, but I didn’t know it would be this easy.

    I’m a fairly big user of Eclipse, so I installed the Android SDK and off I went.

    I created a new Android project and edited main.xml so it looked like the following.

    --- code starts ---

    <?xml version="1.0" encoding="utf-8"?>

    <LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
                         android:layout_width="fill_parent"
                         android:layout_height="fill_parent"
                         android:orientation="vertical">
      <WebView
                         android:layout_width="fill_parent"
                         android:layout_height="fill_parent"
                         android:id="@+id/webView" />
    </LinearLayout>

    --- code stops ---

    I then updated my Java code so that it would use the WebView toolkit and open my HTML5 app.

    --- code starts ---

    package com.rogerhosto.myandroidapp;
    import android.app.Activity;
    import android.os.Bundle;
    import android.webkit.WebView;
    public class MyAndroidAppActivity extends Activity {
    	/** Called when the activity is first created. */
    	@Override
    	public void onCreate(Bundle savedInstanceState) {
    		super.onCreate(savedInstanceState);
    		setContentView(R.layout.main);
    		WebView webView = (WebView)findViewById(R.id.webView);
    		webView.getSettings().setJavaScriptEnabled(true);
    		webView.loadUrl("file:///android_asset/www/index.html");
    	}
    }
    
    

    --- code stops ---

    Then I created a www directory in the existing assets directory and created index.html for all my HTML5 code. That’s it.

    Pretty easy.

     

  • Install Git From Source On Linux

    If you are like me and want to install git-core from source instead of using one of the many binary packages out there, or you just have a distro that does not have a binary for it, here is what you will need to get it installed.

    • POSIX-compliant shell
    • GCC (GNU C compiler)
    • GNU Interactive Tools
    • Perl 5.8 or later
    • OpenSSL
    • OpenSSH
    • zlib
    • libcurl
    • expat
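
    Before building, it can save time to confirm that the command-line tools are actually on your PATH. Here is a small Python 3 sketch of that check. The mapping from the list above to command names (sh, gcc, perl, openssl, ssh) is my assumption, and make is included because the build steps call it. The libraries (zlib, libcurl, expat) are not covered, since how to check for their development headers depends on the distro.

```python
import shutil

# Commands assumed to correspond to the tools in the list above.
# zlib, libcurl and expat are libraries, so they are not covered here.
REQUIRED_COMMANDS = ["sh", "gcc", "make", "perl", "openssl", "ssh"]


def missing_commands(commands):
    """Return the commands that cannot be found on the current PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


missing = missing_commands(REQUIRED_COMMANDS)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All required commands were found.")
```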

    Once you have verified or installed all of the required packages, you can download the source from the Git homepage.

    shell$ wget http://git-core.googlecode.com/files/git-1.x.x.tar.gz

    shell$ tar -zxf git-1.x.x.tar.gz

    shell$ cd git-1.x.x

    shell$ ./configure --prefix=[install_path]

    shell$ make all

    shell$ sudo make install

    A little old school, and not that hard.

    Resources:
    http://git-scm.com/
    http://ruby.about.com/od/git/a/2.htm

  • Here is what happens when I recover from surgery.

    After being told that I would need to take five days to recover from a recent surgery, and being informed that they would prefer I not work on any of the systems at work while under the influence of painkillers, I started looking for things to kill time.

    While trying to find something to do for 5 days and not having much luck, I was surfing the web using Google’s Chrome web browser and started looking at its “New Tab Apps Feature”, or whatever it’s called. I started thinking that it would be really simple to create a web-based version of the application that did roughly the same thing but wasn’t dependent on the web browser. It would also be handy for all those tablet devices that have web browsers. It’s kind of a pain to hit those little links with fat fingers like mine. After doing a quick design in my head, I figured what the heck; I’ve got nothing better to do.

    I decided to go old school LAMP on it with Linux, Apache, MySQL, and Perl. I decided to use mod_perl and Mason as my framework, Mason being a throwback from my short stint at Amazon.com. I went with the old MVC architecture, since it’s the easiest for a one-node system and because of my state-of-the-art dual Pentium Pro 180 MHz, with 256 MB of RAM and a 14 GB 5400 RPM hard drive, for the server that I was building it on.

    After 5 days, plus a few additional weeknights and Saturdays, here is what I came up with: http://www.myapplinks.com. It is still in the alpha/beta stage, but not a bad start.

  • Simple HTTP Server with Python

    Ever needed a quick web server to share something with a Windows user from your Linux box? Python has a really easy-to-use built-in HTTP server. Just try the following.

    shell> python -m SimpleHTTPServer 9001

    And point your web browser at http://localhost:9001 and see what happens.
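
    SimpleHTTPServer is the Python 2 module name. In Python 3 the same functionality lives in http.server, so the one-liner becomes python3 -m http.server 9001. The same server can also be started from a script. The sketch below (Python 3.7 or later) serves a temporary directory, fetches one file from it, and shuts down again. The helper make_server, the file name, and the use of port 0 (let the OS pick a free port) are my own choices for the demonstration.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading


def make_server(directory, port=9001):
    """Create an HTTP server that serves the files under `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("", port), handler)


with tempfile.TemporaryDirectory() as shared_dir:
    # Something to share
    pathlib.Path(shared_dir, "hello.txt").write_text("hello from Linux\n")

    # Port 0 asks the OS for any free port.
    server = make_server(shared_dir, port=0)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # Fetch the file the way a browser would.
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    conn.request("GET", "/hello.txt")
    response = conn.getresponse()
    status, body = response.status, response.read().decode("utf-8")
    conn.close()

    server.shutdown()
    server.server_close()

print(status, body.strip())
```

    For real sharing you would pass the directory you want to expose and a fixed port, then leave serve_forever running until you are done.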

    — Cheers

     

  • Slacking over the Holidays

    So December was an interesting month for me, with a few changes in my personal career and the holidays. I took a week off, which of course led to me spending two weeks catching up on work. To top it off, my boss or someone above him decided that we need to take a look at our service architecture. So of course I am now in research mode, which should hopefully lead to some good blog posts. Stay tuned.

  • Connecting PHP to An Oracle Instance on RedHat or CentOS 5

    Lately it seems that everyone wants to connect to Oracle, but I have to admit this was the first time someone asked me to get PHP to talk to Oracle. It was a lot less painful than I thought it would be, so here is what I did.

    Along with the standard PHP RPMs, you need to install a couple of additional RPMs from Oracle. These are oracle-instantclient-basic and oracle-instantclient-devel, which can be downloaded from http://www.oracle.com/technetwork/database/features/instant-client/index-100365.html. You will also need the php-oci8 RPM, which can be downloaded from http://oss.oracle.com/projects/php/files/EL5/.

    So after you have downloaded the RPMs, go ahead and install the packages and create the symlink for libclntsh.so.

    $ rpm -Uvh oracle-instantclient-basic-##.#.#.rpm
    $ rpm -Uvh oracle-instantclient-devel-##.#.#.rpm
    $ cd /usr/lib/oracle/##.#/[client|client64]/lib
    $ ln -s libclntsh.so.##.# libclntsh.so

    Now you are going to want to set up your environment. It is important to set all Oracle environment variables before starting Apache or running a PHP script, so that the OCI8 process environment is correctly initialized. Setting environment variables in PHP scripts can lead to obvious or non-obvious problems. You can also add the Instant Client library path to /etc/ld.so.conf.

    $ LD_LIBRARY_PATH=/usr/lib/oracle/##.#/[client|client64]/lib:${LD_LIBRARY_PATH}
    $ export LD_LIBRARY_PATH
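
    A quick way to sanity-check the library path from the same shell is to ask the dynamic loader for the client library. Here is a minimal Python 3 sketch using ctypes. The helper name can_load is mine, and the check only reflects the environment of the shell you run it from, not the environment Apache is started with.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and load `library_name`."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# "libclntsh.so" is the symlink created in the previous step.
print("libclntsh.so loadable:", can_load("libclntsh.so"))
```

    If this prints False, LD_LIBRARY_PATH (or /etc/ld.so.conf followed by ldconfig) most likely does not yet point at the Instant Client lib directory.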

    And now for the big finish. Here is a simple connection script to test it all out.

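    <?php
    // Replace the placeholders with your own credentials and Easy Connect string.
    // Optionally append /INSTANCE_NAME to target a specific instance.
    $c = oci_connect('USERNAME', 'PASSWORD', 'SERVERNAME:PORT/SERVICE_NAME');

    if ($c) {
        // List the tables visible to this user
        $s = oci_parse($c, 'SELECT TABLE_NAME FROM all_tables');
        oci_execute($s);

        while ($res = oci_fetch_array($s, OCI_ASSOC)) {
            echo $res['TABLE_NAME'] . "\n";
        }
    } else {
        // Report why the connection failed
        $e = oci_error();
        echo $e['message'] . "\n";
    }
    ?>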

    For a complete list of functions and additional install resources, check out the following sites:

    http://php.net/manual/en/book.oci8.php
    http://wiki.oracle.com/page/PHP
    http://www.oracle.com/technetwork/articles/technote-php-instant-084410.html