Java


Java handles sub-strings in a way that isn’t immediately obvious and which can have some nasty side effects if you don’t realize what is going on under the hood. When the substring method is used it appears to return a new String, but in older JVMs (before Java 7 update 6) it isn’t an independent String at all. What the method does there is create a new String descriptor that maintains a pointer to the original String’s character data plus a start offset and length: in other words, the relevant part of the old String. This is all well and good most of the time and works because Strings are immutable. It can, however, lead to memory exhaustion problems in some cases.

The cause of the potential memory exhaustion problem is simple. The newly created sub-string maintains a reference to the old String’s data, so even when the old String appears to have no more references its characters won’t be garbage collected while there are references to the sub-string. Imagine for a moment that the old String was several megabytes in size and the sub-string was just the first few characters. Those few characters are actually keeping many megabytes of memory alive! (Since Java 7 update 6, substring copies the characters it needs, so the sub-string no longer pins the original.) The example program below shows this in action.

Interestingly, the behaviour described above even happens with a completely empty String (e.g. substring is called with the same start and end index), so it’s quite easy to end up with an apparently empty String holding on to vast chunks of memory. A way to avoid this problem is to intern the sub-string by calling intern() on it. This returns a reference to the equal String held in the JVM’s intern pool, adding it to the pool if necessary. The String in the pool is a real, independent String, so it only takes up the amount of space you would expect.
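As a rough sketch of the idea (not part of the original example program), the helper below detaches a short sub-string from its large parent. The class name DetachedSubstring and the method detachedPrefix are hypothetical, and the sketch uses the String copy constructor as an alternative to intern(). On JVMs where substring shares the parent's character data, as described above, the copy is expected to hold only its own characters; on JVMs where substring already copies, the extra step is harmless.

```java
class DetachedSubstring {
    // Hypothetical helper: returns the first `length` characters of `source`
    // as an independent String that does not share storage with `source`.
    static String detachedPrefix(String source, int length) {
        // new String(String) copies the characters that are actually in use
        return new String(source.substring(0, length));
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 1000000; i++) {
            builder.append('x');
        }
        String huge = builder.toString();
        String little = detachedPrefix(huge, 5);
        huge = null; // the large String is no longer referenced from here
        System.out.println("Little String: " + little);
    }
}
```

Unlike intern(), the copy is an ordinary heap object, so it is collected as soon as the last reference to it goes away.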

It is also worth pointing out that sub-strings aren’t interned automatically, even if the original String was interned. You might have expected that if the original String was interned its sub-strings would be as well, but that is not the case. The reason for this behaviour is probably that intern always creates a complete new String in the pool, whereas substring can simply use references into an old String. For long Strings and their sub-strings the current approach is probably the most efficient. It is even more efficient if neither the original String nor its sub-string descendants are long lived, which is often the case in real-world applications.

public class Substring {
    public static void main( String[] args ) {
        Substring substring = new Substring();
        substring.memoryUsageExample();
        substring.internExample();
    }

    private void memoryUsageExample() {
        Runtime runtime = Runtime.getRuntime();

        System.gc();
        // Show the base memory load
        System.out.println( "Memory Usage (1): " + ( runtime.totalMemory() - runtime.freeMemory() ) );

        // Build a huge String
        StringBuilder builder = new StringBuilder();
        for( int i = 0, n = 10000000; i < n; i++ ) {
            builder.append( "x" );
        }

        String hugeString = builder.toString();
        // At this point we have the StringBuilder and the String in memory
        System.out.println( "Memory Usage (2): " + ( runtime.totalMemory() - runtime.freeMemory() ) );

        builder = null;
        System.gc();
        // We should now only have hugeString in memory
        System.out.println( "Memory Usage (3): " + ( runtime.totalMemory() - runtime.freeMemory() ) );

        String littleString = hugeString.substring( 0, 5 );

        // Intern littleString to remove the reference to hugeString
        // littleString = littleString.intern();

        hugeString = null;
        System.gc();

        // Even though hugeString is dereferenced the memory usage is still high.
        // Try commenting out the display of littleString below and you will
        // notice that the memory drops as you would expect.
        System.out.println( "Memory Usage (4): " + ( runtime.totalMemory() - runtime.freeMemory() ) );

        System.out.println( "Little String: " + littleString );
    }

    private void internExample() {
        String test = "This is a test string";
        test = test.intern();
        String sub = test.substring( 10, 14 );

        // False as you would expect
        System.out.println( "Are the test string and sub-string the same? " + ( test == sub ? "True" : "False" ) );

        // False but you might expect it to be true. After all, the original test
        // String was interned so you would be forgiven for expecting its
        // sub-strings to be interned as well.
        System.out.println( "Is the sub-string interned? " + ( "test" == sub ? "True" : "False" ) );

        sub = sub.intern();

        // True. String literals are automatically interned and now our sub-string is
        // the same String.
        System.out.println( "Is the sub-string interned now? " + ( "test" == sub ? "True" : "False" ) );
    }
}

Reference:
http://www.crazysquirrel.com/computing/java/basics/substring.jspx

In Java, String is a reference type: it is a class, not an array of characters, and String variables refer to objects. An important feature of String is immutability, which means the contents of a string cannot be changed after it is constructed. There are some concepts people need to understand when using String objects. When we create strings from literals, so that the values are available at compile time, Java uses a technique called interning to reduce the space used by duplicate strings.

For example

// Declare two Strings, s1 and s2, and initialize them with the same value.
// You might expect s1 and s2 to be two different String objects, but
// both refer to the same object on the heap, which means only one
// object exists for "JAVA" and it is referred to by both s1 and s2.
// This is called automatic interning of Strings.
String s1 = "JAVA";
String s2 = "JAVA";

This can be shown by using the == operator. When used with primitive data types, the equality operator (==) checks whether the values are the same, but when it is used with objects it checks identity (whether both references point to the same instance), not the equality of the objects’ state or data. To check the equality of the data, we need to use the object’s overridden equals() method.

String s1 = "JAVA";
String s2 = "JAVA";
/* The equality operator (==) checks whether s1 and s2 refer to the same
   String instance. In this case it returns true, as both refer to the
   same interned object. */
if(s1==s2) System.out.println("Same Instance");
else System.out.println("Different Instance");

So you may have a question: if I change the content of the String through s1, will the change be visible from s2? Here is the catch: as mentioned before, String is immutable, so the contents cannot be changed after the initial construction. So what happens when you assign a new value to an existing String variable?

String s1 = "JAVA";
String s2 = "JAVA";
s1 = "SUN";

In the above scenario, s1 is made to refer to a different String object with the value “SUN”, while s2 still refers to “JAVA”; the original object is not modified. Similarly, when we call a String method that produces a modified value, such as concat() or replace(), a new String object is constructed and returned. Java does not perform interning automatically if the String values are determined at run time. If you want to perform the interning explicitly, you can call the intern method of the String.

char[] chars = {'A', ' ', 'S', 't', 'r', 'i', 'n', 'g'};
// The String org has a value available at compile time
String org = "A String";
// The value of aRuntimeString is not known until run time, as it is
// constructed from characters during execution
String aRuntimeString = new String(chars);
// The equality operator will return false, because aRuntimeString is a
// different instance even though it has the same value.
// Interning did not happen because the value was only known at run time
if(org == aRuntimeString) System.out.println("Same Instance");
else System.out.println("Different Instance");
// We can explicitly ask for interning by calling the intern method.
// This method checks the intern table for the same value; if it exists, that reference is returned.
aRuntimeString = aRuntimeString.intern();
// Now both org and aRuntimeString reference the same String object, since intern was called,
// so here the equality operator returns true
if(org == aRuntimeString) System.out.println("Same Instance");
else System.out.println("Different Instance");
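
The same compile-time versus run-time distinction applies to concatenation. The sketch below (the class and method names are made up for illustration) relies on the fact that concatenating two literals is a compile-time constant and is therefore interned, whereas concatenation involving a non-final variable happens at run time and produces a separate object until intern() is called.

```java
class InternConcatDemo {
    // "JA" + "VA" is folded by the compiler into the constant "JAVA"
    static boolean constantConcatIsSameInstance() {
        String compileTime = "JA" + "VA";
        return compileTime == "JAVA";
    }

    // `part` is not final, so the concatenation is performed at run time
    static boolean runtimeConcatIsSameInstance() {
        String part = "JA";
        String runtime = part + "VA";
        return runtime == "JAVA";
    }

    // Explicit interning maps the run-time value back to the pooled literal
    static boolean internedConcatIsSameInstance() {
        String part = "JA";
        String runtime = (part + "VA").intern();
        return runtime == "JAVA";
    }

    public static void main(String[] args) {
        System.out.println("Constant concatenation: " + constantConcatIsSameInstance());
        System.out.println("Run-time concatenation: " + runtimeConcatIsSameInstance());
        System.out.println("After intern():         " + internedConcatIsSameInstance());
    }
}
```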

If you want to avoid interning when strings are created, create the string using the constructor.

//In this case, two different String objects are created.
String s1 = new String("NIIT");
String s2 = new String("NIIT");
//it will return false as both of them belong to different instances
if(s1==s2) System.out.println("Same Instance");
else System.out.println("Different Instance");
//equals methods checks equality of the contents of the String
if(s1.equals(s2)) System.out.println("String values are Same");
else System.out.println("String values are different");

References http://javatechniques.com/blog/string-equality-and-interning/
http://mindprod.com/jgloss/interned.html

An immutable object is one whose externally visible state cannot change after it is instantiated. The String, Integer, and BigDecimal classes in the Java class library are examples of immutable objects — they represent a single value that cannot change over the lifetime of the object.

Mutable objects are ones whose state or data can be changed at any point in time. StringBuffer is an example of a mutable object.

Benefits of immutability

Immutable classes, when used properly, can greatly simplify programming. They can only be in one state, so as long as they are properly constructed, they can never get into an inconsistent state. The advantages are:

* You can freely share and cache references to immutable objects without having to copy or clone them;
* Immutable classes generally make the best map keys.
* They are inherently thread-safe, so you don’t have to synchronize access to them across threads.

Better Candidate for Caching

Because there is no danger of immutable objects changing their value, you can freely cache references to them and be confident that the reference will refer to the same value later. Similarly, because their properties cannot change, you can cache their fields and the results of their methods.

Automatically Thread Safe, No Need to Synchronize

Most thread-safety issues arise when multiple threads are trying to modify an object’s state concurrently (write-write conflicts) or when one thread is trying to access an object’s state while another thread is modifying it (read-write conflicts). To prevent such conflicts, you must synchronize access to shared objects so that other threads cannot access them while they are in an inconsistent state. This can be difficult to do correctly, requires significant documentation to ensure that the program is extended correctly, and may have negative performance consequences as well. As long as immutable objects are constructed properly (which means not letting the object reference escape from the constructor), they are exempt from the requirement to synchronize access, because their state cannot be changed and therefore cannot have write-write or read-write conflicts.

Makes the Best Keys in Collections

Immutable objects make the best HashMap or HashSet keys. Some mutable objects will change their hashCode() value depending on their state (like the StringHolder example class in Listing 2). If you use such a mutable object as a HashSet key, and then the object changes its state, the HashSet implementation will become confused — the object will still be present if you enumerate the set, but it may not appear to be present if you query the set with contains(). Needless to say, this could cause some confusing behavior.
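
A minimal sketch of that failure mode follows. MutableKey is a hypothetical class written only for this illustration (it is not the StringHolder class mentioned above); its hashCode() depends on a field that can change after the object has been added to the set.

```java
import java.util.HashSet;
import java.util.Set;

class MutableKeyDemo {
    // Hypothetical mutable class whose hashCode() depends on mutable state
    static final class MutableKey {
        private String value;

        MutableKey(String value) { this.value = value; }

        void setValue(String value) { this.value = value; }

        @Override
        public boolean equals(Object other) {
            return other instanceof MutableKey
                    && ((MutableKey) other).value.equals(value);
        }

        @Override
        public int hashCode() { return value.hashCode(); }
    }

    // Adds a key, mutates it, then asks the set whether it contains the key
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<MutableKey>();
        MutableKey key = new MutableKey("a");
        set.add(key);
        key.setValue("b"); // the hash code changes while the key is in the set
        return set.contains(key);
    }

    // The element is still stored, so the size is unaffected by the mutation
    static int sizeAfterMutation() {
        Set<MutableKey> set = new HashSet<MutableKey>();
        MutableKey key = new MutableKey("a");
        set.add(key);
        key.setValue("b");
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("contains() after mutation: " + foundAfterMutation());
        System.out.println("size() after mutation: " + sizeAfterMutation());
    }
}
```

The set looks for the key using its new hash code, while the entry is still filed under the old one, so the lookup misses even though the element has not been removed.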

Here is an Example below

Guidelines for writing immutable classes

Writing immutable classes is easy. A class will be immutable if all of the following are true:

* All of its fields are final
* The class is declared final
* The this reference is not allowed to escape during construction
* Any fields that contain references to mutable objects, such as arrays, collections, or mutable classes like Date:
    o Are private
    o Are never returned or otherwise exposed to callers; if a value must be returned, a defensive copy is returned instead
    o Are the only reference to the objects that they reference
    o Do not change the state of the referenced objects after construction


class ImmutableArrayHolder {

 private final int[] theArray;

 // Right way to write a constructor -- copy the array
 public ImmutableArrayHolder(int[] anArray) {
 this.theArray = (int[]) anArray.clone();
 }

 // Wrong way to write a constructor -- copy the reference
 // The caller could change the array after the call to the constructor
 //public ImmutableArrayHolder(int[] anArray) {
 //  this.theArray = anArray;
 //}

 // Right way to write an accessor -- don't expose the array reference
 public int getArrayLength() { return theArray.length; }
 public int getArray(int n)  { return theArray[n]; }

 // Right way to write an accessor -- use clone()
 public int[] getArray()       { return (int[]) theArray.clone(); }

 // Wrong way to write an accessor -- expose the array reference
 // A caller could get the array reference and then change the contents
 //public int[] getArray()       { return theArray; }
}
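
To see why the copies matter, here is a small sketch that tampers with both the array passed in and the array handed back. Holder is a compact stand-in for the class above, and the names are illustrative assumptions rather than part of the original listing.

```java
class DefensiveCopyDemo {
    // Compact stand-in for an immutable array holder; names are illustrative
    static final class Holder {
        private final int[] values;

        Holder(int[] values) { this.values = values.clone(); } // copy on the way in

        int[] getValues() { return values.clone(); }           // copy on the way out

        int get(int n) { return values[n]; }
    }

    // Returns the holder's first element after two attempts to change it
    static int firstElementAfterTampering() {
        int[] original = {1, 2, 3};
        Holder holder = new Holder(original);
        original[0] = 99;           // the caller changes its own array
        holder.getValues()[0] = 42; // the caller changes the returned copy
        return holder.get(0);
    }

    public static void main(String[] args) {
        System.out.println("First element: " + firstElementAfterTampering());
    }
}
```

Neither write reaches the holder's private array, because the caller only ever sees copies of it.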

 

References :

http://www.javapractices.com/topic/TopicAction.do?Id=29
http://www.ibm.com/developerworks/java/library/j-jtp02183.html
http://www.javaranch.com/journal/2003/04/immutable.htm

Levenshtein Distance (or) Edit Distance

Levenshtein distance (LD) is a measure of the similarity between two strings, which we will refer to as the source string (s) and the target string (t). The distance is the minimum number of deletions, insertions, or substitutions required to transform s into t.

For example,

    * If s is “test” and t is “test”, then LD(s,t) = 0, because no transformations are needed. The strings are already identical.
    * If s is “test” and t is “tent”, then LD(s,t) = 1, because one substitution (change “s” to “n”) is sufficient to transform s into t.

The greater the Levenshtein distance, the more different the strings are.

Levenshtein distance is named after the Russian scientist Vladimir Levenshtein, who devised the algorithm in 1965. If you can’t spell or pronounce Levenshtein, the metric is also sometimes called edit distance.

The Levenshtein distance algorithm has been used in:

    * Spell checking
    * Speech recognition
    * DNA analysis
    * Plagiarism detection

The algorithm is as follows:

Set n to be the length of s.
Set m to be the length of t.

If n = 0, return m and exit.
If m = 0, return n and exit.

Construct a matrix d containing rows 0..n and columns 0..m.

Initialize the first column to 0..n (d[i,0] = i).
Initialize the first row to 0..m (d[0,j] = j).

Examine each character of s (i from 1 to n).
  Examine each character of t (j from 1 to m).
    If s[i] equals t[j], the cost is 0.
    If s[i] doesn’t equal t[j], the cost is 1.
    Set cell d[i,j] of the matrix equal to the minimum of:
      a. The cell immediately above plus 1: d[i-1,j] + 1.
      b. The cell immediately to the left plus 1: d[i,j-1] + 1.
      c. The cell diagonally above and to the left plus the cost: d[i-1,j-1] + cost.

After the iteration steps are complete, the distance is found in cell d[n,m].

The Java Implementation of the above algorithm is as follows

public class EditDistance {
 
 private int getMinimum(int a, int b, int c)
 {
  return Math.min(a,Math.min(b, c));
 }
 
 public int findCost(String s1,String s2)
 {
  
  //Find the length of strings
  int s1Length = s1.length();
  int s2Length = s2.length();
  //declare two dimension array of string length + 1
  int matrix[][] = new int[s1Length+1][s2Length+1];
  
  //If Length of any one of the String is zero then return the length of the other as Cost
  if(s1Length==0) return s2Length;
  if(s2Length==0) return s1Length;
  
  //Initialize the first column with 0 to s1Length
  for(int i=0;i<=s1Length;i++)
   matrix[i][0] = i;
  
  //Initialize the first row with 0 to s2Length
  for(int i=0;i<=s2Length;i++)
   matrix[0][i] = i;  
  
  //Iterate through the Matrix
  for(int i=1;i<=s1Length;i++)
  {
   for(int j=1;j<=s2Length;j++)
   {
    
    int cost;
    //Check the Character at each location, if they are same then the Cost is Zero else Cost is One
    if(s1.charAt(i-1)==s2.charAt(j-1))
     cost=0;
    else
     cost=1;
    
    //Set the value of matrix[i][j] based on the logic of the minimum value
    //  of 1) Value of the Cell on the Top + 1
    //     2) Value of the Cell on the Left + 1
    //     3) Value of the Cell on the Left Top Diagonal + cost
    matrix[i][j] = getMinimum(matrix[i][j-1]+1, matrix[i-1][j]+1, matrix[i-1][j-1]+cost);
   }
  }
  
  // Used for debugging the Matrix
  /*
  for(int i=0;i<=s1Length;i++)
  {
   for(int j=0;j<=s2Length;j++)
    System.out.print(matrix[i][j]  + " ");
   System.out.println();
  }  
  */
  return matrix[s1Length][s2Length];
 }
 public static void main(String[] args)
 {
  EditDistance d = new EditDistance();
  System.out.println("Distance is " + d.findCost("worse", "world"));
 }
}
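
Each cell in the algorithm above depends only on the current row and the row directly above it, so the full matrix is not strictly required. The following is a sketch of a two-row variant under that observation; it is a different arrangement from the implementation above rather than a replacement for it, and the class and method names are assumptions made for this example.

```java
class TwoRowEditDistance {
    // Computes the Levenshtein distance while keeping only two rows in memory
    static int distance(String s, String t) {
        int[] previous = new int[t.length() + 1];
        int[] current = new int[t.length() + 1];

        // Row 0: turning the empty prefix of s into the first j characters of t
        for (int j = 0; j <= t.length(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= s.length(); i++) {
            current[0] = i; // column 0: delete the first i characters of s
            for (int j = 1; j <= t.length(); j++) {
                int cost = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            // The row just filled becomes the "row above" for the next pass
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[t.length()];
    }

    public static void main(String[] args) {
        System.out.println("Distance is " + distance("worse", "world"));
    }
}
```

The trade-off is that the intermediate matrix is no longer available for debugging or for recovering the actual sequence of edits.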

 Reference Links:

http://www.merriampark.com/ld.htm

http://en.wikipedia.org/wiki/Levenshtein_distance

This code sample shows how to format a currency value stored in a BigDecimal object. The code uses the US currency locale and formats the value with comma separators.


import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Locale;

class MyDecimalFormatter
{
    public void formatDecimal(BigDecimal b)
    {
        NumberFormat n = NumberFormat.getCurrencyInstance(Locale.US);
        String s;
        if (b.signum() < 0)
        {
            // Format the absolute value and prefix the minus sign ourselves
            s = "-" + n.format(b.abs());
        }
        else
        {
            // Passing the BigDecimal directly avoids the rounding that
            // converting to double can introduce for large values
            s = n.format(b);
        }
        System.out.println(s);
    }

    public static void main(String[] args)
    {
        BigDecimal payment = new BigDecimal("-1234523423.67");
        MyDecimalFormatter mf = new MyDecimalFormatter();
        mf.formatDecimal(payment);
    }
}

Thanks kamal
WSAD Shortcuts
 

 

Navigational Shortcuts
Ctrl+L Go to line number.
Ctrl+I Indent the highlighted text.
Ctrl+Q Go to last modified editor position
Ctrl+K Go to next occurrence of the selected text
Ctrl+Shift+K Go to previous occurrence of the selected text
Ctrl+F6 Navigate between open Editors (or, files in an editor)
Ctrl+F7 Navigate between open Views
Ctrl+F8 Navigate between open Perspectives
Ctrl+Shift+P Navigate to matching brace
F12 Jump to the open Editor
Ctrl+Shift+W List all files open in an editor
Alt+left arrow Back (last editing position)
Alt+right arrow Forward (next editing position)

 

Coding Assistants
Ctrl+Space Brings up coding/context assistant. Make the required selection and press Enter or double-click. After selection, Javadoc will appear in hover Help.
Ctrl+Shift+M Add import
Ctrl+Shift+O Organize Imports
Ctrl+B Executes an incremental build of a project in the navigation view
Ctrl+E Goes to the next error.
Alt+Shift+M Extract Method
Alt+Shift+R Rename method/variable
Ctrl+Shift+F Format Code
Ctrl+Shift+E Delete till end-of-line

Search Shortcuts
F3 Open Declaration of Selected Element
F4 Open hierarchy
Ctrl+F Find/replace
Ctrl+H Brings up Java search with the selected item in the search table.
Ctrl+Shift+T Type Browser: search for a class/interface by typing its name (wildcards ‘*’ and ‘?’ can also be used)
Ctrl+Shift+H Hierarchical Browser: works much the same as the Type Browser
Ctrl+Shift+G Search References in workspace
Ctrl+O Search data members or methods in the class (works similarly to Ctrl+Shift+T, i.e. type search, but the scope is only the class file currently open in the editor)
Ctrl+Shift+U Search occurrences (of selected methodName or dataMember) within the same Class file.

 

New features in Workbench
1. Users can now customize the key bindings (Key short-cuts) using preferences.
Windows-> Preferences->Workbench-> Keys
2. The editor now keeps the navigation history. So if you open a second editor you can use the Navigate -> Back option to go back. You can also use the key shortcut Alt+Left arrow.
3. The text editor has been improved to provide line numbers, current line highlighting, print margins, etc.
4. There is a new Ant view (Window > Show View > Ant) that makes it easier to run Ant buildfiles.
5. There is a new Ant editor that makes it easier to edit Ant buildfiles. The Ant editor provides content assist, syntax highlighting, an outline, and error reporting.

Set

Set is a type of Collection in Java. It is similar to List but does not allow duplicate elements to be inserted. Set is an interface, and HashSet is one of its concrete implementations.

Example

import java.util.HashSet;
import java.util.Set;

//Create a Set object
Set<Integer> s = new HashSet<Integer>();

//The add method adds an element to the Set and returns a boolean.
//If the element is a duplicate, it is not inserted and add returns false;
//if the element is unique, it is added to the set and add returns true.
boolean a = s.add(Integer.valueOf(1));

if(a) { System.out.println("Inserted"); }
else { System.out.println("Duplicate Element. Not Inserted"); }
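
The boolean returned by add can also be used directly, for instance to count duplicates in some input. The sketch below is only an illustration; the class DuplicateCounter and its method are names invented for this example.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateCounter {
    // Counts how many values were rejected by Set.add as duplicates
    static int countDuplicates(int[] values) {
        Set<Integer> seen = new HashSet<Integer>();
        int duplicates = 0;
        for (int value : values) {
            // add returns false when the element is already in the set
            if (!seen.add(value)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        int[] sample = {1, 2, 2, 3, 1};
        System.out.println("Duplicates: " + countDuplicates(sample));
    }
}
```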