Introduction to Apache Cassandra.


What is Cassandra?

Apache Cassandra is a non-relational database maintained by the Apache Software Foundation. Facebook open-sourced Cassandra in 2008, and it is now developed as an Apache project.

Traditional relational databases store data as rows, whereas Cassandra stores data in a column-oriented format as key-value pairs. This column-based storage can give it higher performance than relational databases.

Cassandra can handle many terabytes of data if need be and can easily handle millions of rows, even on a smaller cluster. Cassandra can reach around 20K inserts per second.

Cassandra’s performance is high. Keeping read performance up mostly depends on the hardware, the configuration and the number of nodes in your cluster, and this can be managed in Cassandra without much trouble.

But there is no SQL, so how do you query?

Data is inserted and retrieved through client APIs. One of them is built on the Thrift framework, which is essentially a communication protocol used not just by Cassandra but by many other projects.

Who is using Apache Cassandra?

Cassandra is in use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX, and more companies that have large, active data sets. The largest production cluster has over 100 TB of data in over 150 machines.

For RDBMS users, adopting Cassandra will take some time.

Terminology of Cassandra:

Column – A column is a tuple with a binary, variable-length name and value, along with a timestamp. To keep it simple, ignore the timestamp for the moment.

Super Column – Essentially a container for one or more columns. It is again a tuple, with a binary name and a map of columns in which each key is the same as the name of the column. The main difference between a Column and a Super Column is that a Column holds a single value (such as a string), while a Super Column holds a hash table of columns.

Column Family – A structure which keeps an unbounded number of rows, just like a traditional table. Each row has a binary key and a map of columns in which, again, each key is the same as the name of the column.

Super Column Family – Same as a column family, except that each row has a map of super columns instead of columns. The map is keyed with the name of each SuperColumn and the value is the SuperColumn itself.

Keyspace – Like a schema in a relational database, it contains the column families.

Sorting – Data is sorted as soon as it is written to the cluster and it stays that way. There is no way to sort while fetching the data, which makes it all the more necessary to plan the sort order around the access path.
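The terminology above can be pictured as nested, sorted maps. The sketch below is only an illustration in plain Java: the class names (CassandraModelSketch, Column, ColumnFamily, Keyspace) are hypothetical and are not part of any Cassandra client API, names and values are shown as Strings rather than binary data, and super columns are left out to keep it short.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Illustration only: hypothetical classes, NOT a Cassandra client API.
class CassandraModelSketch {

    // A column is a (name, value, timestamp) tuple.
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // A column family maps a row key to a map of columns keyed by column name.
    // TreeMap keeps the columns sorted at insert time, not at read time.
    static final class ColumnFamily {
        final SortedMap<String, SortedMap<String, Column>> rows =
                new TreeMap<String, SortedMap<String, Column>>();

        void insert(String rowKey, Column column) {
            rows.computeIfAbsent(rowKey, k -> new TreeMap<String, Column>())
                .put(column.name, column);
        }

        String get(String rowKey, String columnName) {
            SortedMap<String, Column> row = rows.get(rowKey);
            if (row == null) {
                return null;
            }
            Column column = row.get(columnName);
            return column == null ? null : column.value;
        }
    }

    // A keyspace maps a column family name to the column family.
    static final class Keyspace {
        final SortedMap<String, ColumnFamily> columnFamilies =
                new TreeMap<String, ColumnFamily>();
    }

    public static void main(String[] args) {
        Keyspace keyspace = new Keyspace();
        ColumnFamily users = new ColumnFamily();
        keyspace.columnFamilies.put("Users", users);

        long now = System.currentTimeMillis();
        users.insert("alice", new Column("email", "alice@example.com", now));
        users.insert("alice", new Column("city", "Hyderabad", now));

        // Columns come back in sorted order regardless of insert order.
        System.out.println(users.rows.get("alice").keySet()); // prints [city, email]
    }
}
```

A super column family would add one more map level between the row key and the columns.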

What might be the reason for developing Cassandra in Java?

* Security: it’s easier to write secure software in Java than in C++ (remember the buffer overflows?)

* Performance: it’s not THAT much worse. It’s definitely worse at startup, but once the code is up and running, it’s not a big thing. Actually, you have to remember an important point here: Java code is continually optimized by the VM, so in some circumstances it gets faster than C++.

Features of Cassandra:

Fault Tolerant : Data is automatically replicated to multiple nodes. Losing a node doesn’t bring down the cluster.

Flexible Schema : We are talking in terms of columns, super columns and column families instead of rows and tables. This follows the BigTable data model.

Symmetric : No single point of failure. Every node within the cluster is identical and there are no network bottlenecks.

Scalable : Linear with the addition of new machines, with no downtime or interruption to applications. Read and write throughput increase linearly as new machines are added.

Support for Large Data : The ability to scale to many hundreds of gigabytes of data.

Written in Java : Originally built for Facebook and then made open source.

Some useful links

http://cassandra.apache.org/
http://schabby.de/cassandra-installation-configuration/
http://wiki.apache.org/cassandra/ThriftExamples
http://prettyprint.me/2010/02/23/hector-a-java-cassandra-client/
http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/
http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/
http://wiki.apache.org/cassandra/API
http://anismiles.wordpress.com/2010/05/24/connecting-to-cassandra-1/

Tips to follow while writing Java code


  • Counting down (i.e. for (int i = n; i > 0; i--)) is twice as fast as counting up: my machine can count down to 144 million in a second, but up to only 72 million.
  • Calling Math.max(a, b) is 7 times slower than (a > b) ? a : b. This is the cost of a method call.
  • Arrays are 15 to 30 times faster than Vectors. Hashtables are 2/3 as fast as Vectors.
  • Use the System.arraycopy(firstArray, 0, secondArray, 0, firstArray.length) method instead of iterating over the first array and copying each element into the second array.
  • Use compound assignment operators (+=, -=, *=, and /=) instead of the normal operators. For example, a = a + b takes longer to execute than a += b; in fact, the two forms cause different Java bytecode to be generated.
  • Eliminate unnecessary code inside loops, and avoid declaring variables unnecessarily within loops.
  • Use int instead of other primitive types. Operations on int primitives generally execute faster than on any other primitive type supported by Java, so use int values whenever possible; char and short values are promoted to int automatically before arithmetic operations.
  • Use notify() instead of notifyAll() where waking a single waiting thread is enough; notify() executes faster than notifyAll().
  • When joining several Strings, use a StringBuffer instead of String concatenation.
  • Don’t convert Strings to upper or lower case just to compare them; equalsIgnoreCase() compares without creating new Strings.
  • In the String class, prefer charAt() to startsWith() when checking a single leading character. From a performance perspective, startsWith() makes quite a few comparisons preparing itself to compare its prefix with another string.
  • Don’t initialize an instance variable in the constructor if it is already initialized at its declaration. Field initializers are executed as part of every constructor, so the variable would be initialized twice.
  • Vector provides the following methods to insert elements.
    insertElementAt(e, index)
    addElement(e)
    add(e)
    add(index, e)
  • Of these, try to avoid insertElementAt(e, index) and add(index, e). These methods shift every element between the insertion point and the end of the Vector down by one to make space for the new element. The same applies to deleting an element at a particular index. If you need these operations often, consider using a different data structure.
  • If the approximate size of the Vector is known in advance, use it. Instead of declaring the Vector as,

Vector v = new Vector();
declare it as,
Vector v = new Vector(40);
or
Vector v = new Vector(40, 25);

  • The last form sets the initial capacity of the Vector to 40 and grows it by 25 elements per expansion. When no increment is given, an expansion allocates a new internal array of double the current size, copies all the elements from the old array into it, and leaves the old array to be discarded (during GC). This has a major effect on performance.
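A few of the tips above can be put together in one small sketch. The class and method names (JavaTipsSketch, copyOf, join, presized) are made up for this example; it only shows the API calls the tips mention and makes no timing claims of its own.

```java
import java.util.Vector;

// Hypothetical helper class illustrating three of the tips above.
class JavaTipsSketch {

    // Copy an array with System.arraycopy instead of a hand-written loop.
    static int[] copyOf(int[] source) {
        int[] target = new int[source.length];
        System.arraycopy(source, 0, target, 0, source.length);
        return target;
    }

    // Join strings with a StringBuffer instead of repeated String concatenation.
    static String join(String[] parts) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < parts.length; i++) {
            buffer.append(parts[i]);
        }
        return buffer.toString();
    }

    // Pre-size a Vector when the approximate number of elements is known,
    // so that no expansion happens while it is being filled.
    static Vector<Integer> presized(int expected) {
        Vector<Integer> vector = new Vector<Integer>(expected, 25);
        for (int i = 0; i < expected; i++) {
            vector.addElement(i); // appends at the end, no shifting of elements
        }
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"a", "b", "c"}));  // prints abc
        System.out.println(copyOf(new int[] {1, 2, 3}).length);  // prints 3
        System.out.println(presized(40).capacity());             // prints 40
    }
}
```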

Enable JMX Remote port in WebSphere


By default the JMX remote port is not enabled in WebSphere; it has to be enabled manually. Here are the steps to enable the JMX remote port in WebSphere.

This has been done with WebSphere 7.0.
After installing WebSphere Application Server 7.0, follow the steps below to configure the remote JMX port.

STEP 1:
Log in to the admin console of any WebSphere profile (server); a shortcut is available in the Start menu programs.
Deploy the PerfServletApp.ear application if it is not deployed already.
Go to Applications in the left pane and click WebSphere Enterprise Applications to check whether PerfServletApp.ear is deployed.
If it is not, click New Application under Applications, browse to WebSphere directory -> AppServer -> InstallableApps, and follow the steps.
STEP 2:
Enable the PMI data and set all the statistics to enabled.
Go to Monitoring and Tuning in the left pane and click Performance Monitoring Infrastructure (PMI). In the Configuration tab, enable PMI and select all statistics. Also select all statistics in the Runtime tab. Save the changes.

STEP 3:

Set the generic JVM argument = -Djavax.management.builder.initial= -Dcom.sun.management.jmxremote

Servers -> Server Types -> WebSphere Application Servers shows the list of servers. Click on the server you want.
In the right pane, under Server Infrastructure -> Java and Process Management, click Process definition, then under Additional Properties in the Configuration tab click Java Virtual Machine.
Put -Djavax.management.builder.initial= -Dcom.sun.management.jmxremote in the Generic JVM arguments field and save the changes.

STEP 4:

To enable the JMX remote port, open the properties file below and add the following lines.

FILE : WebSphere directory \AppServer\java\jre\lib\management\management.properties

CODE :
com.sun.management.jmxremote.port=9001
com.sun.management.jmxremote.ssl=false
com.sun.management.jmxremote.authenticate=false

STEP 5:

Save the changes to the master configuration, then stop and start the server to load them.

Enabling JMX port in JBOSS


By default the JMX port is not enabled in JBoss. To enable it, add the following properties to the run.bat file, which is located in the bin directory of %JBOSS_HOME%.

set JAVA_OPTS=%JAVA_OPTS% -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=8007 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false

To add the above properties to run.bat, follow these steps:

1. Find the following line:

set JAVA_OPTS=%JAVA_OPTS% -Xms128m -Xmx512m

2. Replace it with:

set JAVA_OPTS=%JAVA_OPTS% -Xms128m -Xmx512m -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=8007 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false

Enabling JMX port in Tomcat


By default the JMX port is not enabled in Tomcat.
To enable it, add the following properties to the catalina.bat file, which is located in the bin directory of %CATALINA_HOME%.

Add the following as the first line of catalina.bat (all on one line):

set JAVA_OPTS=%JAVA_OPTS% -Dcom.sun.management.jmxremote.port=7009 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false
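
Once the server has been restarted, one way to check that the port is really open is a small JMX client. The sketch below uses only the standard javax.management.remote classes. The class name JmxPortCheck and its methods are made up for this example, and the host and port defaults (localhost, 7009) simply mirror the Tomcat settings above. It assumes authentication and SSL are disabled, as configured in this post.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper: connects to a JVM started with
// -Dcom.sun.management.jmxremote.port=<port> and counts its MBeans.
class JmxPortCheck {

    // Builds the RMI service URL used by the built-in JMX remote agent.
    static JMXServiceURL serviceUrl(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad host or port", e);
        }
    }

    // Opens a connection without credentials and returns the number of
    // MBeans registered in the remote MBean server.
    static int countMBeans(String host, int port) throws IOException {
        JMXConnector connector = JMXConnectorFactory.connect(serviceUrl(host, port));
        try {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            return connection.getMBeanCount();
        } finally {
            connector.close();
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7009;
        try {
            System.out.println("MBeans visible over JMX: " + countMBeans(host, port));
        } catch (IOException e) {
            System.out.println("Could not connect to " + host + ":" + port + " - " + e.getMessage());
        }
    }
}
```

The same check should work for the JBoss and WebSphere ports above by passing a different port number (8007 or 9001) as the second argument.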

SCJP Questions


1

class C {
    public static void main(String[] args) {
        int[]a1[]=new int[3][3]; //3
        int a2[4]={3,4,5,6}; //4
        int a2[5]; //5
    }
}

What is the result of attempting to compile and run the program?

1. Compile-time error at lines 3, 4, 5

2. Compile-time error at lines 4, 5

3. Compile-time error at line 3

4. Runtime exception

5. None of the above

Ans: 2

Explanation:

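An array size must not be specified inside the brackets of a declaration, as is done on lines 4 and 5. The size is given only when the array is constructed (as in new int[3][3] on line 3) or is taken from the initializer.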
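
For comparison, here is a sketch of how the same three arrays can be declared so that they compile. The class and method names are made up for this example.

```java
// Hypothetical class showing legal forms of the declarations in the question.
class ArrayDeclarations {

    // Brackets may be split around the variable name; sizes go in the
    // array creation expression on the right-hand side.
    static int[][] grid() {
        int[] a1[] = new int[3][3];
        return a1;
    }

    // With an initializer the size is inferred, so the brackets stay empty.
    static int[] initialized() {
        int a2[] = {3, 4, 5, 6};
        return a2;
    }

    // Without an initializer the size is given with new.
    static int[] sized() {
        int a3[] = new int[5];
        return a3;
    }

    public static void main(String[] args) {
        System.out.println(grid().length);        // prints 3
        System.out.println(initialized().length); // prints 4
        System.out.println(sized().length);       // prints 5
    }
}
```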

Difference between Abstract Class and Interface:


Many people confuse abstract classes with interfaces. Here I am sharing some information with examples; I hope it will help you.

Simple abstract class looks like this:

public abstract class KarateFight {

    public void bowOpponent() {
        // implementation for bowing, which is common for every participant
    }

    public void takeStand() {
        // implementation which is common for every participant
    }

    // this is abstract because it differs from person to person
    public abstract boolean fight(Opponent op);

}

The basic interface looks like this:

public interface KarateFight {

    public boolean fight(Opponent op);

    public Integer timeOfFight(String person);

}

The differences between an abstract class and an interface are as follows:

1.  An abstract class has a constructor, but an interface doesn’t.

2.  An abstract class can have implementations for some of its members (methods), but an interface can’t have an implementation for any of its members (this changed in Java 8, which introduced default methods).

3.  An abstract class must have subclasses, otherwise it is useless.

4. An interface must be implemented by other classes, otherwise it is useless.

5. Only an interface can extend another interface, but any class can extend an abstract class.

6.  All variables in interfaces are public, static and final by default.

7. Interfaces provide a form of multiple inheritance. A class can extend only one other class.

8. Interfaces are limited to public methods and constants with no implementation. Abstract classes can have a partial implementation, protected parts, static methods, etc.

9.  A class may implement several interfaces, but it may extend only one abstract class.

10. Interface method calls can be slower, as they require extra indirection to find the corresponding method in the actual class. Calls on abstract classes are faster.

11. Access modifiers (public, protected, private) are allowed on the members of an abstract class. Interface members are always public.

12.  An abstract class may contain complete or incomplete methods. An interface can contain only the signature of a method but no body. Thus an abstract class can implement methods but an interface cannot.

13.  An abstract class can contain instance fields and constructors. An interface cannot contain instance fields or constructors; it has only constants and method signatures.

14. Modifiers such as abstract, protected, public, static and final are useful in abstract classes but not in interfaces, whose methods are implicitly public and abstract.

15.  The scope of an abstract class extends to its derived classes.

16.  The scope of an interface extends to any level of its inheritance chain.
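
Several of these points can be seen together in a short sketch. It reuses the KarateFight idea from above, but the wrapper class AbstractVsInterfaceSketch, the Timed interface, the BlackBeltFight class and the skill field on Opponent are all made up for this example.

```java
// Hypothetical example combining an abstract class and an interface.
class AbstractVsInterfaceSketch {

    static class Opponent {
        final int skill;

        Opponent(int skill) {
            this.skill = skill;
        }
    }

    // Abstract class: has a constructor, a field and a partial implementation.
    static abstract class KarateFight {
        protected final String fighter;

        protected KarateFight(String fighter) {
            this.fighter = fighter;
        }

        // Common to every participant, so it is implemented here.
        public String bowOpponent() {
            return fighter + " bows";
        }

        // Differs from person to person, so subclasses must implement it.
        public abstract boolean fight(Opponent op);
    }

    // Interface: only a method signature, no constructor and no state.
    interface Timed {
        Integer timeOfFight(String person);
    }

    // A class extends exactly one class but may implement several interfaces.
    static class BlackBeltFight extends KarateFight implements Timed {

        BlackBeltFight(String fighter) {
            super(fighter);
        }

        @Override
        public boolean fight(Opponent op) {
            return op.skill < 10; // wins against anyone below skill 10
        }

        @Override
        public Integer timeOfFight(String person) {
            return 3; // fixed duration in minutes, for illustration only
        }
    }

    public static void main(String[] args) {
        BlackBeltFight fight = new BlackBeltFight("Ravi");
        System.out.println(fight.bowOpponent());          // prints Ravi bows
        System.out.println(fight.fight(new Opponent(5))); // prints true
    }
}
```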

The basic Properties of Interface


1. An interface must be declared with the keyword ‘interface’.

2. All interface methods are implicitly public and abstract. In other words, you don’t need to actually type the public or abstract modifiers in the method declaration, but the method is still always public and abstract.

3. All variables defined in an interface are public, static, and final. In other words, interfaces can declare only constants, not instance variables.

4. Interface methods must not be static.

5. Because interface methods are abstract, they cannot be marked final, strictfp, or native.

6. An interface can extend one or more other interfaces.

7. An interface cannot implement another interface or class.

8. Interface types can be used polymorphically.
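
A short sketch of properties 2, 3, 6 and 8. All the names in it (InterfacePropertiesSketch, Named, Scored, Contestant, Student) are made up for this example.

```java
// Hypothetical example of the interface properties listed above.
class InterfacePropertiesSketch {

    interface Named {
        String name(); // implicitly public and abstract
    }

    interface Scored {
        int MAX_SCORE = 10; // implicitly public, static and final

        int score();
    }

    // An interface can extend more than one interface.
    interface Contestant extends Named, Scored {
    }

    static class Student implements Contestant {
        // Implementations must be declared public, because the
        // interface methods are public.
        public String name() {
            return "student";
        }

        public int score() {
            return 7;
        }
    }

    // Polymorphic use: any Contestant implementation is accepted here.
    static String describe(Contestant contestant) {
        return contestant.name() + " " + contestant.score() + "/" + Scored.MAX_SCORE;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Student())); // prints student 7/10
    }
}
```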

SCJP


Today I am starting to post topics on the SCJP. I am preparing for the SCJP (Sun Certified Java Programmer) exam, so I am happy to share my preparation and to discuss every important point I come across. I also want to work out tips for success that can help others.

Today I firmly decided to complete the exam within three months. Let’s see what happens.

I hope I will get your support.

Wish me success.