Category: System Administration

  • Disable 70-persistent-net.rules generation on CentOS 6 VM

    If you’re like me, you probably have an environment running on some virtual platform, and like everyone else you have built a template to spin up Linux systems. One of the things we kept running into lately was the “70-persistent-net.rules” file, which associates MAC addresses with network interfaces.

    The easiest way I have found to disable this is the following; it’s not pretty, but it works.

    rm /etc/udev/rules.d/70-persistent-net.rules

    echo "#" > /lib/udev/rules.d/75-persistent-net-generator.rules
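A minimal sketch wrapping both steps into a function you could drop into a template-prep script. The optional root parameter is my own addition, so it can also be run against a mounted template image instead of the live system:

```shell
# Remove the cached rules and neuter the generator so new clones
# don't pin eth0 to the template's MAC address.
# $1 is an optional alternate root (e.g. a mounted image); defaults to /.
disable_persistent_net() {
  local root="${1:-}"
  rm -f "$root/etc/udev/rules.d/70-persistent-net.rules"
  echo "#" > "$root/lib/udev/rules.d/75-persistent-net-generator.rules"
}
```

Called with no argument it does exactly the two commands above.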

    Happy hacking.

  • Work Blog: Managing Your Linux Deployments with Spacewalk

    I have been using Spacewalk for a while now and really like a lot of the built-in functionality. I have been using it to build out and manage a lot of my Red Hat and CentOS installations.

    The latest thing I have been using it for is managing my Hadoop cluster build-out and configuration updates. It helps to be able to control as much as possible from one management system. I know there are applications like Ambari out there, but to be honest, who wants to add another tool if they don’t have to?

    Here’s the link to my work blog about it.

    http://gotomojo.com/managing-your-linux-deployments-with-spacewalk/

  • What does Facebook consider an average day's worth of data?

    Well, according to this article from gigaom.com, the average day looks something like this:

    • 2.5 billion content items shared per day (status updates + wall posts + photos + videos + comments)
    • 2.7 billion Likes per day
    • 300 million photos uploaded per day
    • 100+ petabytes of disk space in one of FB’s largest Hadoop (HDFS) clusters
    • 105 terabytes of data scanned via Hive, Facebook’s Hadoop query language, every 30 minutes
    • 70,000 queries executed on these databases per day
    • 500+ terabytes of new data ingested into the databases every day

    I also love this quote from the VP of Infrastructure.

    “If you aren’t taking advantage of big data, then you don’t have big data, you have just a pile of data,” said Jay Parikh, VP of infrastructure at Facebook on Wednesday. “Everything is interesting to us.”

  • CentOS 6.4 service virt-who won't start – workaround

    Here is the problem.

    [root@bob ~]# service virt-who start
    Démarrage de virt-who : Traceback (most recent call last):
      File "/usr/share/virt-who/virt-who.py", line 33, in <module>
        from subscriptionmanager import SubscriptionManager, SubscriptionManagerError
      File "/usr/share/virt-who/subscriptionmanager.py", line 24, in <module>
        import rhsm.connection as rhsm_connection
    ImportError: No module named rhsm.connection
    [FAILED]

    There is a simple workaround: install the Scientific Linux 6 python-rhsm package.

     

    Name        : python-rhsm
    Version     : 1.1.8
    Release     : 1.el6
    Vendor      : Scientific Linux
    Date        : 2013-02-22 01:54:26
    Group       : Development/Libraries
    Source RPM  : python-rhsm-1.1.8-1.el6.src.rpm
    Size        : 0.27 MB
    Packager    : Scientific Linux
    Summary     : A Python library to communicate with a Red Hat Unified Entitlement Platform
    Description :
    A small library for communicating with the REST interface of a Red Hat Unified
    Entitlement Platform. This interface is used for the management of system
    entitlements, certificates, and access to content.

    First, install python-simplejson:

    [root@bob ~]# yum install python-simplejson

    Then pick a mirror from http://rpm.pbone.net/index.php3/stat/4/idpl/20813982/dir/scientific_linux_6/com/python-rhsm-1.1.8-1.el6.x86_64.rpm.html, download python-rhsm-1.1.8-1.el6.x86_64.rpm, and install it:

    [root@bob ~]# rpm --install python-rhsm-1.1.8-1.el6.x86_64.rpm
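Before restarting the service, it's worth confirming the module now imports cleanly. This check is my own addition, using the system Python 2 on CentOS 6:

```shell
[root@bob ~]# python -c "import rhsm.connection"
```

If that prints nothing, the import works and virt-who should start.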

     

    Then start virt-who:

    [root@bob ~]# service virt-who start

  • Hadoop to Hadoop Copy

    Recently I needed to copy the contents of one Hadoop cluster to another for geo-redundancy. Thankfully, instead of having to write something to do it, Hadoop supplies a handy tool for the job: “DistCp” (distributed copy).

     

    DistCp is a tool used for large inter/intra-cluster copying. It uses Map/Reduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list. Its Map/Reduce pedigree has endowed it with some quirks in both its semantics and execution. The purpose of this document is to offer guidance for common tasks and to elucidate its model.

     

    Here is the basic usage:

    bash$ hadoop distcp hdfs://nn1:8020/foo/bar \
                        hdfs://nn2:8020/bar/foo

     

    This will expand the namespace under /foo/bar on nn1 into a temporary file, partition its contents among a set of map tasks, and start a copy on each TaskTracker from nn1 to nn2. Note that DistCp expects absolute paths.

     

    Here is how you can handle multiple source directories on the command line:

     

    bash$ hadoop distcp hdfs://nn1:8020/foo/a \
                        hdfs://nn1:8020/foo/b \
                        hdfs://nn2:8020/bar/foo
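For the geo-redundancy case, where the destination already holds an earlier copy, DistCp also has an -update flag that skips files already present and unchanged at the target (the paths here are placeholders):

```shell
bash$ hadoop distcp -update hdfs://nn1:8020/foo \
                            hdfs://nn2:8020/foo
```

There is also -overwrite to unconditionally replace files at the destination; see the DistCp guide for the full flag list.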

  • Amazon Relational Database Service (Amazon RDS)

    It appears that Amazon is introducing a new service specifically targeted at relational databases. You can choose from MySQL, Oracle, and Microsoft SQL Server.

    Amazon Relational Database Service (Amazon RDS) is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you up to focus on your applications and business.

  • Using Iozone for Filesystem Benchmarking

    If you have been around computer systems long enough, you know how important disk performance is, especially with database systems. There’s the standard hdparm -tT and dd test that everyone does, but it really doesn’t give you the whole picture. What you really want is to test read, write, re-read, re-write, read backwards, read strided, fread, fwrite, random read, pread, mmap, aio_read, and aio_write. For that I would recommend using Iozone. It gives you a better idea of what’s going on.

    http://www.iozone.org
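As a starting point, something like the following exercises all of those tests automatically. The mount point and size limit are placeholders; check `iozone -h` for the exact flags your build supports:

```shell
# -a       full automatic mode (all tests over a range of record/file sizes)
# -g 4G    cap the maximum file size -- make it larger than RAM to defeat caching
# -f       the scratch file, placed on the filesystem under test
# -R -b    also write the results out as a spreadsheet
iozone -a -g 4G -f /mnt/data/iozone.tmp -R -b iozone-results.xls
```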

     

  • Install Git From Source On Linux

    If you are like me and want to install git-core from source instead of one of the many binary packages out there, or you just have a distro that doesn’t have a binary for it, here is what you will need to get it installed.

    • POSIX-compliant shell
    • GCC – gnu c compiler
    • GNU Interactive Tools
    • Perl 5.8 or Later
    • Openssl
    • Openssh
    • zlib
    • libcurl
    • expat

    Once you have verified or installed the required packages, you can download the source from the Git homepage.

    shell$ wget http://git-core.googlecode.com/files/git-1.x.x.tar.gz
    shell$ tar -zxf git-1.x.x.tar.gz
    shell$ cd git-1.x.x
    shell$ ./configure --prefix=[install_path]
    shell$ make all
    shell$ sudo make install

    A little old school, and not that hard.

    Resources:
    http://git-scm.com/
    http://ruby.about.com/od/git/a/2.htm

  • Using CURL to manage Tomcat

    The other day a few of my colleagues and I were talking about an easy way to deploy and undeploy WAR files from the command line, like you can through the Tomcat Web Application Manager portal. Being on a Python kick, I started writing it in Python. After an hour or two I realized that I had made this way more complex than I needed to. I had been reading the Apache Tomcat 6.0 Manager App HOW-TO, and I was using curl to test all the commands from localhost.

    shell> curl --anyauth -u admin:password http://localhost:8080/manager/start?path=/myapp

    So now, after slapping myself in the forehead and saying “duh!”, I decided I could write this as a shell script and have it knocked out in 20 minutes.

    So here’s what I came up with: tomcat-cli.sh.
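The core of a wrapper like that is just building the Manager URL and handing it to curl. A rough sketch (host, port, and credentials are placeholders, not necessarily what the real script uses):

```shell
TOMCAT_BASE="${TOMCAT_BASE:-http://localhost:8080/manager}"
TOMCAT_AUTH="${TOMCAT_AUTH:-admin:password}"

# Build the Manager URL for a command (list/start/stop/undeploy) and context path.
manager_url() {
  echo "$TOMCAT_BASE/$1?path=$2"
}

# Fire the command at the Manager app, e.g.: tomcat_cmd stop /myapp
tomcat_cmd() {
  curl --anyauth -u "$TOMCAT_AUTH" "$(manager_url "$1" "$2")"
}
```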

    –Cheers

     

  • Simple HTTP Server with Python

    Ever needed a quick web server to share something with a Windows user from your Linux box? Python has a really easy-to-use embedded HTTP server. Just try the following:

    shell> python -m SimpleHTTPServer 9001

    And point your web browser at http://localhost:9001 and see what happens.
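One caveat: that module name is Python 2. On Python 3 the module was renamed, so the equivalent is:

```shell
shell> python3 -m http.server 9001
```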

    — Cheers