Tuesday, 30 June 2015

Data Mining

Data Mining:
The purpose of data mining is to analyse data and find useful patterns in it.

The non-trivial extraction of implicit, previously unknown and potentially useful information from data is known as Data Mining.

KDD: Knowledge Discovery in Databases.

Data Mining is the extraction of hidden information from data.

The following techniques are used for these purposes:
Association Rules
Regression Equation
Decision Tree Induction

Monday, 29 June 2015

Introduction to BigData: purposes

Big Data and Analytics Introduction:
Types of Data:
Relational Data [Structured Data]
Text Data [Web: Comments, tweets, etc]
Semi Structured Data [XML]
Graph Data [Social Networks]
Streaming Data [This type of data is available for scanning only once.]

Operations of Data:
Aggregation and Statistics :
Data Warehouse and OLAP [Online Analytical Processing]
Indexing Searching and Querying:
Keyword based search
pattern matching
Knowledge Discovery:
Data Mining
Statistical Modeling

The 4th V added to the earlier 3V model of Big Data is Veracity, which is the handling of data uncertainty due to inconsistency and incompleteness.

Business Intelligence:
Data Mining [Knowledge, Patterns, classification, Estimation]

Expert Systems

Types of Analytics:
Data Science
Business intelligence

Friday, 26 June 2015

.bashrc file Ubuntu

boot block
super block
Inode block
Data block

Device file: Linux treats a hard drive as a file too.

/dev : all device files are in here. Everything is a file.

cat a1>/dev/lp1
$mount: shows everything that is currently mounted.

$mkdir /usr/bin/mnt
$mount /dev/sdb1 /usr/bin/mnt   (/dev/sdb1 is the device file of a pendrive, /usr/bin/mnt is the folder to mount it on)
After this command the contents of the pendrive will be in the mnt directory.

For every mount there must be an unmount (the umount command) after the work is complete.

du command shows the disk usage.

df command shows the disk free space of all mounted devices.

.bashrc :
Make a file in your home directory with the name .bashrc
cat> .bashrc
Write commands inside this file and they will run whenever you start a new bash shell. Such files are called configuration files.

Thursday, 25 June 2015

Redirection symbols in Ubuntu

Redirection Symbol:
>, <, >>, |
> redirect standard o/p to a file (the file is overwritten)
< take standard i/p from a file
>> append the o/p to a file
| pipe

sort < a1 > a2
This means sort a1 and write the result to a2.

cat a | less (This will show the contents of the file page by page)

similarly ls | sort
Here the output of the command on the left side of the pipe becomes the input of the command on the right side.

sort < a1 | tail > a2
sorts a1 and writes the last ten lines of the sorted output to a2

cat < a1 >> a2
appends the contents of a1 at the end of a2.

t=`ls` back quotes run the command and put its output in the variable t.

Wednesday, 24 June 2015

Configuring ssh server in Ubuntu

To run the commands of a file in a subshell using sh:
$sh filename

In Linux, success is represented by 0 and failure is represented by anything but 0.

$ps lists the processes running in the current terminal (use ps -e to list all the processes running on the system)

$whoami tells the currently logged in user.

$echo $PATH
The : is the separator between two paths.
The above command displays the directories where an external command might be. UNIX searches these locations in order for the file and then runs it. The value shown is the default for PATH, so change it according to your requirements.

A . in PATH means that the user wants PATH to include the present working directory as well.
This is the same as setting the classpath for Java class files.
For example, we could use: PATH=.:$PATH

export PATH makes the change to the PATH variable global, i.e. visible to child processes.
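
The PATH search described above can be sketched in Java. This is only an illustration of the idea, not how the shell itself is written: the class name PathLookup and its methods are made up for this example, and empty PATH entries are simply skipped.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class PathLookup {
    // Split a PATH-style string on ':' (the separator used on Linux).
    // Empty entries are skipped in this sketch.
    static List<String> splitPath(String path) {
        List<String> dirs = new ArrayList<>();
        for (String dir : path.split(":")) {
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    // Return the first directory in 'path' that contains a file named
    // 'command', or null if no directory contains it.
    static String findCommand(String command, String path) {
        for (String dir : splitPath(path)) {
            if (new File(dir, command).isFile()) {
                return dir;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        if (path != null) {
            System.out.println(splitPath(path));
            System.out.println(findCommand("ls", path));
        }
    }
}
```

The directories are tried from left to right, which is why putting . at the front of PATH makes files in the present working directory win over system commands with the same name.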

On starting a filename with a dot(.), that file will become hidden.
To see all hidden files along with others use: ls -a

Use a variable to hold the path of a frequently used folder.
x=/usr/a2/95 (export x to make this global)
Now simply do
cd $x and you're done.

Tuesday, 23 June 2015

File permissions in Ubuntu

ls -l //details of every file
After the first character (the file type), the next 9 characters show the permissions.

owner - u
group - g
others - o

read r 4
write w 2
execute x 1

  r w x  r w x  r w x
_  _ _ _  _ _ _  _ _ _

  owner  group  others

The first place can have three values:
d- directory
l- link
-  file

chmod g+w a1  (adds write permission for the group to the present permissions of a1)

Absolute method:
chmod 464 a1
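
To see what an absolute mode such as 464 means, each digit can be decoded with the values read=4, write=2, execute=1. The small Java sketch below does this; the class name ModeBits is made up for this example and it assumes a three-digit mode.

```java
class ModeBits {
    // Convert one octal digit (0-7) into its rwx form: read=4, write=2, execute=1.
    static String digitToRwx(int digit) {
        return ((digit & 4) != 0 ? "r" : "-")
             + ((digit & 2) != 0 ? "w" : "-")
             + ((digit & 1) != 0 ? "x" : "-");
    }

    // Convert a mode such as "464" into the nine permission characters
    // in the order owner, group, others.
    static String toSymbolic(String octalMode) {
        StringBuilder sb = new StringBuilder();
        for (char c : octalMode.toCharArray()) {
            sb.append(digitToRwx(c - '0'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSymbolic("464")); // r--rw-r--
        System.out.println(toSymbolic("777")); // rwxrwxrwx
    }
}
```

So chmod 464 a1 gives read to the owner, read and write to the group, and read to others.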

Monday, 22 June 2015

Some commands in Ubuntu

ls -i filename gives the inode number of the file

ln a1 a2
this will give a2 the inode no. of a1 // this is a hard link: two names for the same file

ls > filename
The list of files of the current folder will go into the specified file

ls a* means give the list of all the files starting with 'a'

ls a? means give the list of all the files starting with 'a' and having exactly one character after that.

ln -s a1 a2
this will make a2 a shortcut (symbolic link) for a1
a2 will have a different inode number i.e. its own disk space, which carries only the path of a1.

tail filename // display last portion of the file

head filename // display starting portion of the file

rmdir // delete the folder if it is empty

rm -r // stands for remove recursively

mv //changes the names of the file or their locations

cal // displays the calendar

sort // sorts the list of files or contents of a file.

$chmod 777 filename or $chmod u+x filename
follow this by
./filename
this file will now be executed.

Following any command with the symbol '&', eg ./filename&, will run the command in the background, so the shell prompt returns immediately while the command keeps running.

jobs: shows all the background jobs started from the current shell.

Friday, 19 June 2015

Commands from Root user in Ubuntu

Logging in as the root user from the terminal:
sudo su
The # symbol in the prompt shows that you are the root user.

We can add new users from the root users as:
#useradd newuser
#passwd newuser
Enter the password.

The cd command stands for change Directory.

ls -F //Distinction between files and folders

cat > filename
The above structure redirects the data. Now whatever you type will go into a new file.

Thursday, 18 June 2015

Ubuntu Introduction

Ubuntu is an open source operating system.
The root user has all the permissions a user can possibly have in Ubuntu.
In order to get to a text terminal where root can log in:
ctrl+Alt+f1 (at the time of booting into the system)

$ means that you are not yet the administrator, and just a regular user.

Trick to get the list of users:
cat /etc/passwd

The details of a user are stored in the following manner in the above file:
username:encrypted password:1003:1003::/home/username/:/bin/sh
The first 1003 is where the user id goes.
And the 1003 right next to the user id is where the id of the group to which the user belongs is mentioned.
On current systems the password field holds only an x, and the encrypted password is kept in /etc/shadow.
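
As an illustration of the field layout, the following Java sketch splits one line of /etc/passwd on the : separator. The class name PasswdEntry and the sample line are made up for this example; the seven-field order (name, password, user id, group id, comment, home, shell) is the standard one.

```java
class PasswdEntry {
    final String username;
    final String password;   // usually just "x" on current systems
    final int uid;
    final int gid;
    final String home;
    final String shell;

    PasswdEntry(String line) {
        // The limit -1 keeps empty fields, such as an empty comment field.
        String[] f = line.split(":", -1);
        username = f[0];
        password = f[1];
        uid = Integer.parseInt(f[2]);
        gid = Integer.parseInt(f[3]);
        // f[4] is the comment field, not used here
        home = f[5];
        shell = f[6];
    }

    public static void main(String[] args) {
        PasswdEntry e = new PasswdEntry("username:x:1003:1003::/home/username/:/bin/sh");
        System.out.println(e.username + " uid=" + e.uid + " gid=" + e.gid + " shell=" + e.shell);
    }
}
```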

Wednesday, 17 June 2015

Operating System

Operating System:
File System:
A typical file system has the following main components:
Boot Block:
This portion of the file system contains all the necessary routines and programs needed for the booting process of the computer.
Super Block:
Administrative block, manages partitions and other control information.
Inode Block:
This is the store of the inode of every file. An inode number is uniquely assigned to every file.
Data Block:
This is the biggest block. This contains the actual data divided into several segments. All the files and documents are chopped into pieces and stored in a fixed manner in this portion of the file system.

Disk management is something done by the Operating System. This is amongst many other functions an operating system performs for the computer.

Tuesday, 16 June 2015


JDBC stands for Java Database Connectivity.
Java uses JDBC to interact with database products from various vendors.
Apart from JDBC, the database to be used must also have a driver for proper functioning.
JDBC drivers are of the following kinds:
Native API
Native protocols (eg Java and Oracle)
Net protocols (TCP/IP)

Monday, 15 June 2015

Connectivity Pseudocode

Connectivity pseudocode:
1. Load JDBC driver
2. Specify the name and location of the database being used.
3. Connect to the data base with Connection object.
4. Execute query.
5. Get the result in ResultSet object.
6. Finish by closing the ResultSet, Statement and Connection objects.

String url="jdbc:mysql://localhost/dbname";   //JDBC URL naming the database; this value is only an example
Connection conn=DriverManager.getConnection(url);
//or, with credentials: DriverManager.getConnection(url, username, password);
Statement st=conn.createStatement();
String query="select * from tablename";
ResultSet rs=st.executeQuery(query);   //executeUpdate is used for insert, update and delete queries

PreparedStatement obj=conn.prepareStatement(str);   //str is a query containing ? placeholders
obj.setString(1, "hey");   //passing of parameters: 1 is the position of the first ?

while(rs.next())
System.out.print(rs.getString(1));   //first column of every row, read as a String

Enclose everything in a try catch block.
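
Putting the steps together, a minimal sketch could look like the one below. The class name JdbcSketch, the URL and the table name are placeholders made up for this example, and a matching driver jar is assumed to be on the classpath (with JDBC 4.0 and later the driver is normally loaded automatically, so step 1 needs no code). try-with-resources is used here instead of closing each object by hand.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class JdbcSketch {
    // Run 'query' against the database at 'url' and collect the first
    // column of every row. If an SQLException occurs (for example when no
    // suitable driver is found), the error is printed and whatever was
    // read so far is returned.
    static List<String> firstColumn(String url, String query) {
        List<String> values = new ArrayList<>();
        // The resources are closed automatically, in reverse order:
        // ResultSet, then Statement, then Connection.
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(query)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        } catch (SQLException e) {
            System.err.println("Database error: " + e.getMessage());
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical URL and table name; replace them with real ones.
        System.out.println(firstColumn("jdbc:mysql://localhost/dbname", "select * from tablename"));
    }
}
```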

Friday, 12 June 2015

Database connectivity

MySql is an open source database management system.
The following commands are used commonly:
mysql>create database dbname;
mysql>use dbname;
mysql>create table tablename (
column1 datatype,
column2 datatype
);

popular datatypes: varchar, char, int, date (number and raw are Oracle datatypes)

insert into tablename values(1,'abc');
insert into tablename (column1, column2,..) values(1,'xyz',..);
select * from tablename where condition;
select column from tablename where columnname IS NULL;
where salary between 1200 and 1800;
where salary IN (15000, 20000) //only the values in this set shall be matched
where ename like 'A%' //Name starting with A

update tablename set name='c' where id=2;

delete from tablename where id=2;

Thursday, 11 June 2015

Packages in Java

Package and Interface:
The package statement must be the very first statement of the source file.
There has to be a folder with the same name as that of every package.
package p1;
class A
class B

The folder containing these files should also be named p1.
To run a class, move out of the folder containing the package in cmd and use the package name, eg java p1.A

Access Specifiers:
public:
Accessible in all parts of the package and even across packages.
protected:
It behaves similarly to public members within a package.
Across the package it is inherited but cannot be used directly through the object.
private:
Accessible only within the class.
No specifier:
behaves like public within a package
same as private across the package

To extend a class across packages, the class has to be public and its constructor has to be public (or protected). And the name of the file containing the class to be extended should be the same as the class name.

One source file can have only one public class.

Packages from different folders can be imported just by setting classpath to the location of the package folder.

javadoc converts everything inside the documentation comments into HTML files.

Wednesday, 10 June 2015


Interfaces are like classes with restrictions.
In an interface, variables are always public, static and final (constants).
Apart from the variables, there are restrictions on methods in an interface.
Methods don't have a body (default and static methods, added in Java 8, are the exception).
They are just signatures for other classes to implement in specific ways.

interface A

class B extends C implements A, A2, A3...
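
A small runnable sketch of the same shape, one parent class plus more than one interface, is given below. All the names in it (HasArea, Named, Base, SquareTile) are made up for this example.

```java
interface HasArea {
    double UNIT = 1.0;   // implicitly public static final
    double area();       // implicitly public abstract: no body
}

interface Named {
    String name();
}

class Base {
    String describe() {
        return "base";
    }
}

// One parent class, but any number of interfaces.
class SquareTile extends Base implements HasArea, Named {
    private final double side;

    SquareTile(double side) {
        this.side = side;
    }

    // Implementations of interface methods must be public.
    public double area() {
        return side * side * UNIT;
    }

    public String name() {
        return "square";
    }

    public static void main(String[] args) {
        SquareTile t = new SquareTile(2);
        HasArea h = t;   // an interface type can be used as a reference type
        System.out.println(t.name() + " " + h.area() + " " + t.describe());
    }
}
```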

Tuesday, 9 June 2015

Abstract and Final Keywords

abstract keyword:
In C, the value of a const variable can still be changed through a pointer (the result is undefined behaviour, so it should not be relied on):
const int a=10;
int *p;

abstract keyword is used with class and method only.

final keyword used with class and method and variable

abstract class A
abstract int myMethod(int x); // this is the part of the signature
class B extends A
//now every child of A has to have an overridden method with the same signature int myMethod(int x), otherwise B must be declared abstract as well

Now objects of class A can't be created in memory; only reference variables of type A can be declared.

The main use of abstract is to force every child to have a method for overriding.

final int x=10;
x=20;  // this is not allowed, a final variable cannot be reassigned

final int mymethod() {} now this method can't be overridden

final class A{} This class can't be extended

abstract and final cannot be used simultaneously on the same thing because abstract requires overriding and the final keyword prohibits it.
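
The rules above can be seen together in one short sketch. The class names (Animal, Dog, AbstractFinalDemo) are made up for this example; the lines mentioned in the comments as not compiling are left out so that the rest runs.

```java
abstract class Animal {
    // abstract: no body here, every concrete child must override it
    abstract String sound();

    // final: children cannot override this method
    final String describe() {
        return "I say " + sound();
    }
}

// final: no class can extend Dog
final class Dog extends Animal {
    @Override
    String sound() {
        return "woof";
    }
}

class AbstractFinalDemo {
    public static void main(String[] args) {
        final int x = 10;        // writing x = 20; after this would not compile
        Animal a = new Dog();    // new Animal() would not compile
        System.out.println(a.describe() + " " + x);
    }
}
```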

Monday, 8 June 2015

Runtime Polymorphism

Runtime Polymorphism:
class shape            //has a method area()
class square extends shape
class circle extends shape
class rectangle extends shape
shape S;   //reference variable of the parent type
S.O.P.(1 for square, 2 for circle, 3 for rectangle);
case 1: S=new square();
case 2: S=new circle();
case 3: S=new rectangle();
S.area(); //which area() will be called depends on user input at run time.
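
A runnable version of this idea is sketched below. The sizes of the shapes are fixed values chosen only for this example, and the choice is read from the command line instead of a menu.

```java
class Shape {
    double area() {
        return 0;
    }
}

class Square extends Shape {
    @Override
    double area() {
        return 2 * 2;          // side 2
    }
}

class Circle extends Shape {
    @Override
    double area() {
        return Math.PI * 1 * 1; // radius 1
    }
}

class Rectangle extends Shape {
    @Override
    double area() {
        return 2 * 3;          // 2 by 3
    }
}

class ShapeDemo {
    // The parent type is returned; the actual object depends on 'choice'.
    static Shape choose(int choice) {
        switch (choice) {
            case 1: return new Square();
            case 2: return new Circle();
            case 3: return new Rectangle();
            default: throw new IllegalArgumentException("choice must be 1, 2 or 3");
        }
    }

    public static void main(String[] args) {
        int choice = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        Shape s = choose(choice);
        // Which area() runs is decided at run time by the object s refers to.
        System.out.println(s.area());
    }
}
```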

Friday, 5 June 2015

Inheritance in Java

the extends keyword is used

multiple inheritance of classes is not allowed in java:
class A extends B, C // this is not allowed

multiple inheritance is possible via interfaces

constructors are never inherited. Other (non-private) members are.

consider First obj=new First();
on printing obj, the class name followed by @ and the hash code of the object in hexadecimal will be printed. This can be thought of as the address contained in obj [its value, as obj is nothing but a reference variable i.e. something containing an address]

class A
int a;

class B extends A
int a;

here a means B's a
and super.a means A's a.
the same concept can be used to call methods of the parent class that have the same names as methods in the child class.
the super keyword is needed only when there are duplicated names.

super() has to be the very first statement in the child class's constructor.

only a child class can use the super keyword.
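
The three uses of super (field, method and constructor) are shown in the sketch below. The names Parent and Child and the values are made up for this example.

```java
class Parent {
    int a;

    Parent(int a) {
        this.a = a;
    }

    String who() {
        return "parent";
    }
}

class Child extends Parent {
    int a = 2;               // hides Parent's a

    Child() {
        super(10);           // must be the first statement of the constructor
    }

    @Override
    String who() {
        return "child of " + super.who();   // calls Parent's who()
    }

    int parentA() {
        return super.a;      // Parent's a, not Child's a
    }

    public static void main(String[] args) {
        Child c = new Child();
        System.out.println(c.a);          // 2
        System.out.println(c.parentA());  // 10
        System.out.println(c.who());      // child of parent
    }
}
```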

Thursday, 4 June 2015

Overloading in Java

Static members (eg main()) can't access non-static members (any instance variable) directly i.e. without an object.
The default access specifier in Java is similar to public but not exactly public i.e. it is public within a package; across packages, access will not be allowed.

System.out.print("Enter a number\n"); int a=sc.nextInt();   //nextInt() does not accept a prompt string

float is a primitive data type
Float is a class [wrapper class]

Method overloading is also known as static (compile-time) polymorphism
1) same method names
2) NO. of arguments:
if same:
check data types:
if same:
then not overloaded

3) the return type of a method does not affect whether methods are overloaded or not

Example: the method to find the areas of different polygons can be implemented using the overloading concept.
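
A sketch of that area example follows. The class name AreaCalc and the choice of shapes are made up for this example; the point is that all three methods share one name and differ only in the number or the types of their arguments.

```java
class AreaCalc {
    // One argument: area of a square.
    static double area(double side) {
        return side * side;
    }

    // Two double arguments: area of a rectangle.
    static double area(double length, double width) {
        return length * width;
    }

    // Same number of arguments as above but different types:
    // area of a triangle from an int base and an int height.
    static double area(int base, int height) {
        return 0.5 * base * height;
    }

    public static void main(String[] args) {
        System.out.println(area(2.0));       // 4.0  (square)
        System.out.println(area(3.0, 4.0));  // 12.0 (rectangle)
        System.out.println(area(3, 4));      // 6.0  (triangle)
    }
}
```

The compiler picks the method from the argument list at compile time, which is why this is called static polymorphism.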

Wednesday, 3 June 2015

Bitwise and Logical operators in Java, Jump Statements

^ is the bitwise xor operator
>>> is the shift right zero fill

(cond1 & cond2 & cond3)
in the above statement we have used the logical AND, which will evaluate every condition and then perform the logical AND.

(cond1 && cond2 && cond3)
this is the short circuit AND, which will stop as soon as a false condition is found.

The result will be the same in both cases (as long as the conditions have no side effects), and the same can be done for logical OR (|) and short circuit OR (||).

The choice of taking & as logical or bitwise is dependent on the operands.
a&b, a and b are int then & is bitwise
a&b, a and b are boolean then & is logical
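
The difference between & and && can be observed by counting how many conditions actually get evaluated. In the sketch below the class name ShortCircuitDemo and the helper check() are made up for this example.

```java
class ShortCircuitDemo {
    static int calls = 0;   // how many conditions were actually evaluated

    static boolean check(boolean value) {
        calls++;
        return value;
    }

    // Logical AND (&): every operand is evaluated.
    static int callsWithLogicalAnd() {
        calls = 0;
        boolean result = check(false) & check(true) & check(true);
        return calls;
    }

    // Short circuit AND (&&): evaluation stops at the first false.
    static int callsWithShortCircuitAnd() {
        calls = 0;
        boolean result = check(false) && check(true) && check(true);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithLogicalAnd());       // 3
        System.out.println(callsWithShortCircuitAnd());  // 1
        // With int operands the same symbols are bitwise operators.
        System.out.println(6 & 3);      // 2  (110 & 011 = 010)
        System.out.println(6 ^ 3);      // 5  (110 ^ 011 = 101)
        System.out.println(-8 >> 28);   // -1 (the sign bit is copied in)
        System.out.println(-8 >>> 28);  // 15 (zeros are shifted in)
    }
}
```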

Jump statements: break, continue and return.

If a source file contains a public class, the file name must be the name of the public class. A source file can only have one public class.

After obj2=obj1, both reference variables hold the same memory address, and from there onwards obj1 and obj2 will affect the same variables.

Tuesday, 2 June 2015

Introduction to Java

Analysis can be done only on data in structured form.
char takes 2 bytes using the Unicode encoding standard.
boolean data type has two values: true or false.
Pillars of OOPs: Encapsulation, polymorphism and Inheritance.
Inheritance is needed for overriding and runtime polymorphism.

class name
instance variable (Normal variable)
static variable (shared by all the objects)
constructors (needed most for inheritance)
initialization block
Unnamed block/anonymous block

initialization block:
class A
int a;
{ a=10; }   //will execute on creation of an object, before the constructor runs.
int b;

the various blocks execute in the exact order as written in the class.

static block:
static int a;
static { a=10; } // this will execute once, when the class is first loaded, before any static member is used.
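
The order of execution can be checked with a small sketch that records each step in a log. The class name InitOrder and the log field are made up for this example.

```java
class InitOrder {
    static StringBuilder log = new StringBuilder();
    static int count;
    int a;

    // Static block: runs once, when the class is first initialized.
    static {
        count = 10;
        log.append("static;");
    }

    // Initialization block: runs for every new object, before the constructor body.
    {
        a = 10;
        log.append("init;");
    }

    InitOrder() {
        log.append("ctor;");
    }

    // Create two objects and return everything logged so far.
    static String demo() {
        new InitOrder();
        new InitOrder();
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());   // static;init;ctor;init;ctor;
    }
}
```

The static block appears only once in the log, while the initialization block and the constructor appear once per object.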

There is no destructor in java, instead there is a garbage collector.

floating point:
float f=4.96f
or f=4.96F
or f=496e-2f

String is an inbuilt class in Java.

Lifetime and scope of a variable:

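Type conversion and casting:
float b;
int a;
b=a;        //automatic (widening) conversion from int to float
a=(int)b;   //explicit cast is needed from float to int

in java, byte, short and char operands are automatically promoted to int in expressions
eg: byte a=2,b=4,c;
c=a*b //not possible
this is because a*b is promoted to int and c is still in byte format.
instead do this:
c=(byte)(a*b);
or int c=a*b;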
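
Both ways of storing the product are shown in the sketch below. The class name PromotionDemo is made up for this example. Note that the cast to byte keeps only the lowest 8 bits, so a product outside the range -128 to 127 wraps around.

```java
class PromotionDemo {
    // a * b is computed as an int, so a cast is needed to store it in a byte.
    static byte multiplyAsByte(byte a, byte b) {
        return (byte) (a * b);
    }

    // Keeping the result as an int needs no cast.
    static int multiplyAsInt(byte a, byte b) {
        return a * b;
    }

    public static void main(String[] args) {
        byte a = 2, b = 4;
        System.out.println(multiplyAsByte(a, b));                    // 8
        System.out.println(multiplyAsInt(a, b));                     // 8
        System.out.println(multiplyAsInt((byte) 100, (byte) 2));     // 200
        System.out.println(multiplyAsByte((byte) 100, (byte) 2));    // -56, since 200 does not fit in a byte
    }
}
```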

Arrays are treated as objects in java.
Memory for an array is always allocated at run time.
int a[];
int []a;
and then this is followed by,
a=new int[10];
int a[][];
or int [][]a;
or int []a[];

int a[][]={{1,2},{1},{3,2,2}};   //rows can have different lengths
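
Since every row of a two-dimensional array is itself an array object, each row carries its own length. The sketch below walks over the array declared above; the class name JaggedArrayDemo is made up for this example.

```java
class JaggedArrayDemo {
    // Add up all elements, using each row's own length.
    static int sum(int[][] rows) {
        int total = 0;
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < rows[i].length; j++) {
                total += rows[i][j];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {1}, {3, 2, 2}};
        System.out.println(a.length);      // 3 rows
        System.out.println(a[1].length);   // 1 element in the second row
        System.out.println(sum(a));        // 11
    }
}
```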

Monday, 1 June 2015

First Day: Industrial Training - Big Data

The database used in Big Data is based on the No SQL approach.

HADOOP: It is a Java framework which grew out of the "NUTCH" project.

It has strong ties to the SMAC principles: Social, Mobile, Analytics, Cloud.

HDFS: stands for Hadoop Distributed File System. It keeps a table of where the pieces of each file are stored, similar in concept to a FAT: File Allocation Table.

The distributed feature of HDFS refers to the fact that the data is spread over many machines which are monitored by the same software.

The main purposes of Hadoop are the MapReduce framework and the ability to handle flat files.

flat files contain data in a non-tabular format, eg JSON files.

3Vs: Volume, Velocity and Variety.

Partial Failure Support: the property of maintaining the availability of data even when the data at some servers is lost.

Scalability: Smooth performance transition on increasing the load on the same algorithm or software.

In Hadoop, each block of data is stored three times (by default) on different nodes; this is known as replication.

DoS attack: Denial of Service attack. This attack sends so many requests to a server that genuine clients may not get service.