Saturday, 23 March 2013

ITBAL-Session 9 (19.03.13)

FLOT

Flot is a JavaScript library for plotting line graphs and bar charts. It is built on jQuery and works in all browsers that support canvas. In addition to the plotting functions, it provides numerous callback functions, which make it possible to apply our own style and code. One major limitation is that it is aimed mainly at line graphs and bar charts, so it is suited only to common plotting tasks.

Industrial Applications
Education: Used to create learning material and plot statistical graphs.
Financial: Used for plotting repayment schedules, financial graphing, tracking stock information and interest rate movements.
Healthcare: To plot wellness statistics, BMI, workout routine progress etc.
IT Monitoring: Used to monitor services, network packet statistics etc.
Science: To display infrared spectra, plot galaxies and other celestial objects.
Sports: To plot wave forecasts for surfing, plot motorcycle racing speeds.
Technology: To visualise statistics for major cities, prices of automobiles etc.
Apart from these, Flot is also used in Gaming, Physical Data Monitoring etc.

Usage
1. Basic Usage: A simple graph can be created directly from the given data; only a call to the plot function is required. Flot supports lines, points, bars, filled areas or any combination of these. There are a number of options to control the look of the graph. It can also be used with AJAX to load and plot real-time data.

2. Interactivity: Flot can create plots in which individual series are switched on and off.
Flot supports selection through the "selection plugin". It lets the user select a rectangular region or a one-dimensional range along a single axis. Such selections are useful for zooming. It can also show the mouse position. Flot has a "resize plugin" to resize the plot according to the available space.

3. Additional Features:
- Plots can be marked with various symbols using the "symbol plugin".
- It supports multiple axes.
- The "threshold plugin" can be used to apply a different colour to data below a certain threshold.
- With the "stack plugin" Flot can be used to stack the series.
- The "errorbars plugin" can be used to plot error bars to show standard deviation and other useful statistical properties.
Many other features are also available which make the graphs more attractive.

Thus Flot is an easy and simple, yet very attractive, tool for plotting line graphs and bar charts.







Thursday, 7 February 2013

ITBAL:Session 5(05.02.2013)


Asgn1: To find and plot returns for NSE data of more than     months.

> z<-read.csv(file.choose(),header=T)
> head(z)
         Date    Open    High     Low   Close Shares.Traded Turnover..Rs..Cr.
1 02-Jul-2012 5283.85 5302.15 5263.35 5278.60     126161441           4991.57
2 03-Jul-2012 5298.85 5317.00 5265.95 5287.95     133117055           5161.82
3 04-Jul-2012 5310.40 5317.65 5273.30 5302.55     155995887           5750.10
4 05-Jul-2012 5297.05 5333.65 5288.85 5327.30     118915392           4709.79
5 06-Jul-2012 5324.70 5327.20 5287.75 5316.95     113300726           4760.51
6 09-Jul-2012 5283.70 5300.60 5257.75 5275.15     101169926           4189.25
> open<-z$Open[10:95]
> open.ts<-ts(open,deltat=1/252)
> open.ts
Time Series:
Start = c(1, 1)
End = c(1, 86)
Frequency = 252
 [1] 5242.75 5232.35 5228.05 5199.10 5249.85 5233.55 5163.25 5128.80 5118.40
[10] 5126.30 5124.30 5129.75 5214.85 5220.70 5233.10 5195.60 5260.85 5295.40
[19] 5345.25 5348.30 5308.20 5316.35 5343.25 5385.95 5368.60 5368.70 5395.75
[28] 5426.15 5392.60 5387.85 5348.05 5343.85 5268.60 5298.20 5276.50 5249.15
[37] 5243.90 5217.65 5309.45 5343.65 5361.90 5336.10 5404.45 5435.20 5528.35
[46] 5631.75 5602.40 5536.95 5577.00 5691.95 5674.90 5653.40 5673.75 5684.80
[55] 5704.75 5727.70 5751.55 5815.00 5751.85 5708.15 5671.15 5663.50 5681.70
[64] 5674.25 5705.60 5681.10 5675.30 5703.30 5667.60 5715.65 5688.80 5683.55
[73] 5665.20 5656.35 5596.75 5609.85 5696.35 5693.05 5694.10 5718.60 5709.00
[82] 5731.10 5688.45 5689.70 5650.35 5624.80
> summary(open.ts)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   5118    5281    5431    5474    5682    5815
> z.diff<-diff(open.ts)
> z.diff
Time Series:
Start = c(1, 2)
End = c(1, 86)
Frequency = 252
 [1] -10.40  -4.30 -28.95  50.75 -16.30 -70.30 -34.45 -10.40   7.90  -2.00
[11]   5.45  85.10   5.85  12.40 -37.50  65.25  34.55  49.85   3.05 -40.10
[21]   8.15  26.90  42.70 -17.35   0.10  27.05  30.40 -33.55  -4.75 -39.80
[31]  -4.20 -75.25  29.60 -21.70 -27.35  -5.25 -26.25  91.80  34.20  18.25
[41] -25.80  68.35  30.75  93.15 103.40 -29.35 -65.45  40.05 114.95 -17.05
[51] -21.50  20.35  11.05  19.95  22.95  23.85  63.45 -63.15 -43.70 -37.00
[61]  -7.65  18.20  -7.45  31.35 -24.50  -5.80  28.00 -35.70  48.05 -26.85
[71]  -5.25 -18.35  -8.85 -59.60  13.10  86.50  -3.30   1.05  24.50  -9.60
[81]  22.10 -42.65   1.25 -39.35 -25.55
> returns<-cbind(open.ts,z.diff,lag(open.ts,k=-1))
> returns
Time Series:
Start = c(1, 1)
End = c(1, 87)
Frequency = 252
         open.ts z.diff lag(open.ts, k = -1)
1.000000 5242.75     NA                   NA
1.003968 5232.35 -10.40              5242.75
1.007937 5228.05  -4.30              5232.35
1.011905 5199.10 -28.95              5228.05
1.015873 5249.85  50.75              5199.10
1.019841 5233.55 -16.30              5249.85
1.023810 5163.25 -70.30              5233.55
1.027778 5128.80 -34.45              5163.25
1.031746 5118.40 -10.40              5128.80
1.035714 5126.30   7.90              5118.40
1.039683 5124.30  -2.00              5126.30
1.043651 5129.75   5.45              5124.30
1.047619 5214.85  85.10              5129.75
1.051587 5220.70   5.85              5214.85
1.055556 5233.10  12.40              5220.70
1.059524 5195.60 -37.50              5233.10
1.063492 5260.85  65.25              5195.60
1.067460 5295.40  34.55              5260.85
1.071429 5345.25  49.85              5295.40
1.075397 5348.30   3.05              5345.25
1.079365 5308.20 -40.10              5348.30
1.083333 5316.35   8.15              5308.20
1.087302 5343.25  26.90              5316.35
1.091270 5385.95  42.70              5343.25
1.095238 5368.60 -17.35              5385.95
1.099206 5368.70   0.10              5368.60
1.103175 5395.75  27.05              5368.70
1.107143 5426.15  30.40              5395.75
1.111111 5392.60 -33.55              5426.15
1.115079 5387.85  -4.75              5392.60
1.119048 5348.05 -39.80              5387.85
1.123016 5343.85  -4.20              5348.05
1.126984 5268.60 -75.25              5343.85
1.130952 5298.20  29.60              5268.60
1.134921 5276.50 -21.70              5298.20
1.138889 5249.15 -27.35              5276.50
1.142857 5243.90  -5.25              5249.15
1.146825 5217.65 -26.25              5243.90
1.150794 5309.45  91.80              5217.65
1.154762 5343.65  34.20              5309.45
1.158730 5361.90  18.25              5343.65
1.162698 5336.10 -25.80              5361.90
1.166667 5404.45  68.35              5336.10
1.170635 5435.20  30.75              5404.45
1.174603 5528.35  93.15              5435.20
1.178571 5631.75 103.40              5528.35
1.182540 5602.40 -29.35              5631.75
1.186508 5536.95 -65.45              5602.40
1.190476 5577.00  40.05              5536.95
1.194444 5691.95 114.95              5577.00
1.198413 5674.90 -17.05              5691.95
1.202381 5653.40 -21.50              5674.90
1.206349 5673.75  20.35              5653.40
1.210317 5684.80  11.05              5673.75
1.214286 5704.75  19.95              5684.80
1.218254 5727.70  22.95              5704.75
1.222222 5751.55  23.85              5727.70
1.226190 5815.00  63.45              5751.55
1.230159 5751.85 -63.15              5815.00
1.234127 5708.15 -43.70              5751.85
1.238095 5671.15 -37.00              5708.15
1.242063 5663.50  -7.65              5671.15
1.246032 5681.70  18.20              5663.50
1.250000 5674.25  -7.45              5681.70
1.253968 5705.60  31.35              5674.25
1.257937 5681.10 -24.50              5705.60
1.261905 5675.30  -5.80              5681.10
1.265873 5703.30  28.00              5675.30
1.269841 5667.60 -35.70              5703.30
1.273810 5715.65  48.05              5667.60
1.277778 5688.80 -26.85              5715.65
1.281746 5683.55  -5.25              5688.80
1.285714 5665.20 -18.35              5683.55
1.289683 5656.35  -8.85              5665.20
1.293651 5596.75 -59.60              5656.35
1.297619 5609.85  13.10              5596.75
1.301587 5696.35  86.50              5609.85
1.305556 5693.05  -3.30              5696.35
1.309524 5694.10   1.05              5693.05
1.313492 5718.60  24.50              5694.10
1.317460 5709.00  -9.60              5718.60
1.321429 5731.10  22.10              5709.00
1.325397 5688.45 -42.65              5731.10
1.329365 5689.70   1.25              5688.45
1.333333 5650.35 -39.35              5689.70
1.337302 5624.80 -25.55              5650.35
1.341270      NA     NA              5624.80
> returns<-z.diff/lag(open.ts,k=-1)
> returns
Time Series:
Start = c(1, 2)
End = c(1, 86)
Frequency = 252
 [1] -1.983692e-03 -8.218105e-04 -5.537437e-03  9.761305e-03 -3.104851e-03
 [6] -1.343256e-02 -6.672154e-03 -2.027765e-03  1.543451e-03 -3.901449e-04
[11]  1.063560e-03  1.658950e-02  1.121796e-03  2.375160e-03 -7.165925e-03
[16]  1.255870e-02  6.567380e-03  9.413831e-03  5.706001e-04 -7.497710e-03
[21]  1.535360e-03  5.059862e-03  7.991391e-03 -3.221344e-03  1.862683e-05
[26]  5.038464e-03  5.634064e-03 -6.183021e-03 -8.808367e-04 -7.386991e-03
[31] -7.853330e-04 -1.408161e-02  5.618191e-03 -4.095731e-03 -5.183360e-03
[36] -1.000162e-03 -5.005816e-03  1.759413e-02  6.441345e-03  3.415269e-03
[41] -4.811727e-03  1.280898e-02  5.689756e-03  1.713828e-02  1.870359e-02
[46] -5.211524e-03 -1.168249e-02  7.233224e-03  2.061144e-02 -2.995458e-03
[51] -3.788613e-03  3.599604e-03  1.947566e-03  3.509358e-03  4.022963e-03
[56]  4.163975e-03  1.103181e-02 -1.085985e-02 -7.597556e-03 -6.481960e-03
[61] -1.348933e-03  3.213561e-03 -1.311227e-03  5.524959e-03 -4.294027e-03
[66] -1.020929e-03  4.933660e-03 -6.259534e-03  8.478015e-03 -4.697628e-03
[71] -9.228660e-04 -3.228616e-03 -1.562169e-03 -1.053683e-02  2.340644e-03
[76]  1.541931e-02 -5.793183e-04  1.844354e-04  4.302699e-03 -1.678733e-03
[81]  3.871081e-03 -7.441852e-03  2.197435e-04 -6.916006e-03 -4.521844e-03
> plot(returns)
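
The return calculation above can be condensed into a few lines. This is only a minimal sketch, using the first five values of open.ts printed above. The names price, simple_ret and log_ret are illustrative and were not part of the session. Log returns are shown as a common alternative; the assignment itself uses simple returns.

```r
# First five opening values of the series printed above
price <- c(5242.75, 5232.35, 5228.05, 5199.10, 5249.85)

# Simple return: (P[t] - P[t-1]) / P[t-1]
simple_ret <- diff(price) / head(price, -1)

# Log return: log(P[t] / P[t-1]); close to the simple return for small moves
log_ret <- diff(log(price))

print(round(simple_ret, 6))
print(round(log_ret, 6))
```

The simple returns here should match the first four values of the returns series above, since both divide the day-to-day difference by the previous day's value.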

Asgn 2: Do logit analysis for 700 data points and then predict for 150 data points.

z<-read.csv(file.choose(),header=T)

head(z)

z.data<-z[1:700,1:9]

sapply(z.data,mean)

z.data$ed<-factor(z.data$ed)

logit.est<-glm(default~age+employ+address+income+debtinc+creddebt+othdebt,data=z.data,family="binomial")

summary(logit.est)

confint.default(logit.est)

logit.eg2<-with(z[701:850,1:8],data.frame(age=age,employ=employ,address=address,income=income,debtinc=debtinc,creddebt=creddebt,othdebt=othdebt,ed=factor(ed)))

logit.eg2$prob<-predict(logit.est,newdata=logit.eg2,type="response")

head(logit.eg2)
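
The fit-then-predict pattern above can be tried without the original file. This is a rough sketch on simulated data, so every number in it is made up: the data frame sim, the two predictors and the coefficients used to generate the outcome are assumptions for illustration and not the loan data from the assignment. The 0.5 cutoff is also an arbitrary choice.

```r
set.seed(1)  # make the simulated data reproducible

# Simulated stand-in for the loan data: 850 rows, two predictors (assumption)
n <- 850
sim <- data.frame(
  debtinc = runif(n, 0, 40),  # debt-to-income ratio, made-up range
  employ  = rpois(n, 8)       # years with employer, made-up distribution
)
# Made-up "true" model, used only to generate a 0/1 outcome
sim$default <- rbinom(n, 1, plogis(-1.5 + 0.10 * sim$debtinc - 0.15 * sim$employ))

# First 700 rows to estimate, last 150 rows to predict
train   <- sim[1:700, ]
newdata <- sim[701:850, c("debtinc", "employ")]

fit <- glm(default ~ debtinc + employ, data = train, family = "binomial")

# type = "response" returns probabilities instead of log-odds
newdata$prob <- predict(fit, newdata = newdata, type = "response")

# Turn probabilities into a 0/1 prediction (cutoff chosen for illustration)
newdata$pred <- as.integer(newdata$prob > 0.5)

head(newdata)
```

The columns passed in newdata must have the same names as the predictors used in the model formula, otherwise predict() cannot match them.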





Tuesday, 22 January 2013

Session 3(22.01.2013)

Q1.(a) Do regression for the data set (mileage vs grooves) and comment on the applicability of regression.


> z<-read.csv(file.choose(),header=T)
> z
  Mileage Groove  X
1       0 394.33 NA
2       4 329.50 NA
3       8 291.00 NA
4      12 255.17 NA
5      16 229.33 NA
6      20 204.83 NA
7      24 179.00 NA
8      28 163.83 NA
9      32 150.33 NA
> x<-z$Mileage
> y<-z$Groove
> reg1<-lm(x~y)
> reg1

Call:
lm(formula = x ~ y)

Coefficients:
(Intercept)            y 
    47.9446      -0.1308 

> res<-resid(reg1)
> res
         1          2          3          4          5          6          7
 3.6502499 -0.8322206 -1.8696280 -2.5576878 -1.9386386 -1.1442614 -0.5239038
         8          9
 1.4912269  3.7248633
> plot(y,res)


> sres<-rstandard(reg1)
> sres
         1          2          3          4          5          6          7
 2.0960030 -0.3763146 -0.7964870 -1.0654888 -0.8084405 -0.4840129 -0.2284168
         8          9
 0.6674135  1.7170098
> plot(y,sres)



> qqnorm(res)
> qqline(res)

Result: The residual plot is not random; it shows a parabolic pattern. This indicates that the relationship is non-linear, so a straight-line regression is not appropriate for this data.
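
The curved residual pattern can be checked without reading the plot. This is a minimal sketch that re-enters the nine rows printed above and fits the same model as the session (Mileage on Groove). The quadratic model fit_quad is an assumption added for comparison and was not part of the session; it is one possible way to allow for curvature, not the only one.

```r
# The nine observations printed above
tyre <- data.frame(
  Mileage = seq(0, 32, by = 4),
  Groove  = c(394.33, 329.50, 291.00, 255.17, 229.33,
              204.83, 179.00, 163.83, 150.33)
)

# Same model as in the session: Mileage regressed on Groove
fit_lin <- lm(Mileage ~ Groove, data = tyre)
res_lin <- resid(fit_lin)

# Positive at both ends and negative in the middle suggests curvature
print(sign(res_lin))

# Assumption for illustration: add a squared term to capture the curve
fit_quad <- lm(Mileage ~ Groove + I(Groove^2), data = tyre)
print(summary(fit_lin)$r.squared)
print(summary(fit_quad)$r.squared)

# Residual plot, drawn only in an interactive session
if (interactive()) {
  plot(tyre$Groove, res_lin, xlab = "Groove", ylab = "Residual",
       main = "Residuals of the straight-line fit")
  abline(h = 0, lty = 2)
}
```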

Q1.(b) Do regression for the data set (alpha vs pluto) and comment on the applicability of regression.

> z<-read.csv(file.choose(),header=T)
> z
   Alpha Pluto
1  0.150    20
2  0.004     0
3  0.069    10
4  0.030     5
5  0.011     0
6  0.004     0
7  0.041     5
8  0.109    20
9  0.068    10
10 0.009     0
11 0.009     0
12 0.048    10
13 0.006     0
14 0.083    20
15 0.037     5
16 0.039     5
17 0.132    20
18 0.004     0
19 0.006     0
20 0.059    10
21 0.051    10
22 0.002     0
23 0.049     5
24 0.049     5
> x<-z$Alpha
> y<-z$Pluto
> reg1<-lm(y~x)
> summary(reg1)

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-4.0830 -0.8682 -0.3614  0.4530  6.9820

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  -0.6893     0.6627   -1.04     0.31   
x           165.1490    10.9977   15.02  4.8e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.187 on 22 degrees of freedom
Multiple R-squared: 0.9111,     Adjusted R-squared: 0.9071
F-statistic: 225.5 on 1 and 22 DF,  p-value: 4.804e-13

> res<-resid(reg1)
> res
          1           2           3           4           5           6
-4.08300561  0.02874929 -0.70593611  0.73487513 -1.12729375  0.02874929
          7           8           9          10          11          12
-1.08176394  2.68810364 -0.54078710 -0.79699574 -0.79699574  2.76219302
         13          14          15          16          17          18
-0.30154872  6.98197780 -0.42116791 -0.75146592 -1.11032350  0.02874929
         19          20          21          22          23          24
-0.30154872  0.94555395  2.26674600  0.35904730 -2.40295599 -2.40295599
> plot(x,res)

> sres<-rstandard(reg1)
> sres
          1           2           3           4           5           6
-2.26945649  0.01373201 -0.33242731  0.34427383 -0.53463647  0.01373201
          7           8           9          10          11          12
-0.50545136  1.33090528 -0.25449470 -0.37869994 -0.37869994  1.29061750
         13          14          15          16          17          18
-0.14372047  3.32737186 -0.19690489 -0.35120473 -0.58062905  0.01373201
         19          20          21          22          23          24
-0.14372047  0.44295830  1.05953921  0.17189232 -1.12288360 -1.12288360
> plot(x,sres)

> qqnorm(res)
> qqline(res)

Result: The plot of residuals is random in nature, which indicates linearity, so linear regression can be applied. Also, the points in the qqnorm plot lie around the straight qq line, which suggests that the residuals are approximately normally distributed.
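
As a complement to reading the plots, the standardized residuals can be screened for unusually large values. This is a minimal sketch that re-enters the 24 rows printed above. The cutoff of 2 is a common rule of thumb and an assumption here; it was not used in the session.

```r
# The 24 observations printed above
pu <- data.frame(
  Alpha = c(0.150, 0.004, 0.069, 0.030, 0.011, 0.004, 0.041, 0.109,
            0.068, 0.009, 0.009, 0.048, 0.006, 0.083, 0.037, 0.039,
            0.132, 0.004, 0.006, 0.059, 0.051, 0.002, 0.049, 0.049),
  Pluto = c(20, 0, 10, 5, 0, 0, 5, 20, 10, 0, 0, 10,
            0, 20, 5, 5, 20, 0, 0, 10, 10, 0, 5, 5)
)

# Same model as in the session: Pluto regressed on Alpha
fit <- lm(Pluto ~ Alpha, data = pu)
sres <- rstandard(fit)

# Observations whose standardized residual exceeds 2 in absolute value
flagged <- which(abs(sres) > 2)
print(round(sres[flagged], 3))
```

In the output above, observations 1 and 14 are the ones beyond this cutoff, so they may be worth a second look before relying on the fit.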

Q2. Using ANOVA, test the null hypothesis that the 3 types of chairs are similar.

> z<-read.csv(file.choose(),header=T)
> z
   Chair Comfort Chair1
1      1       2      a
2      1       3      a
3      1       5      a
4      1       3      a
5      1       2      a
6      1       3      a
7      2       5      b
8      2       4      b
9      2       5      b
10     2       4      b
11     2       1      b
12     2       3      b
13     3       3      c
14     3       4      c
15     3       4      c
16     3       5      c
17     3       1      c
18     3       2      c
> x<-z$Comfort
> x
 [1] 2 3 5 3 2 3 5 4 5 4 1 3 3 4 4 5 1 2
> y<-z$Chair1
> anova<-aov(x~y)
> summary(anova)
            Df Sum Sq Mean Sq F value Pr(>F)
y            2  1.444  0.7222   0.385  0.687
Residuals   15 28.167  1.8778              

Conclusion: p-value: 0.687
Since the p-value is well above 0.05, we cannot reject the null hypothesis. The data give no evidence that the three types of chairs differ in comfort.
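
To see where the numbers in the ANOVA table come from, the sums of squares and the F value can be computed by hand. This is a minimal sketch that re-enters the 18 comfort scores printed above. The variable names are illustrative; the last line repeats the aov() call for comparison.

```r
# Comfort scores printed above, six per chair type
comfort <- c(2, 3, 5, 3, 2, 3,   # chair a
             5, 4, 5, 4, 1, 3,   # chair b
             3, 4, 4, 5, 1, 2)   # chair c
chair <- factor(rep(c("a", "b", "c"), each = 6))

grand_mean <- mean(comfort)
group_mean <- tapply(comfort, chair, mean)
group_n    <- tapply(comfort, chair, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_n * (group_mean - grand_mean)^2)
ss_within  <- sum((comfort - ave(comfort, chair))^2)  # ave() gives each row its group mean

df_between <- nlevels(chair) - 1
df_within  <- length(comfort) - nlevels(chair)

# F is the ratio of the two mean squares
f_value <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

print(c(F = f_value, p = p_value))

# Same test through aov(), for comparison
p_aov <- summary(aov(comfort ~ chair))[[1]][["Pr(>F)"]][1]
print(p_aov)
```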




Tuesday, 8 January 2013

ITBAL-1st Lecture


Assignment 0: Plot a histogram for a simple vector.

Assignment 1: Plot a histogram from the "High" column of the NSE indices data.

Assignment 2: Plot the point and line graph of the "High" column, along with a title for the graph and labels for the axes.

Assignment 3: Draw a scatterplot for two columns of the NSE indices data.

Assignment 4: To find the volatility in the data.
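
The assignments can be sketched in R as follows. This is only a rough outline, not the solution worked out in class. It re-enters the six NSE rows shown in the Session 5 post instead of reading the full file, and it takes volatility to mean the standard deviation of daily returns, scaled by sqrt(252) for an annual figure; that definition is an assumption. The plots are drawn only in an interactive session.

```r
# Six rows of NSE data as printed in the Session 5 post
nse <- data.frame(
  Date  = c("02-Jul-2012", "03-Jul-2012", "04-Jul-2012",
            "05-Jul-2012", "06-Jul-2012", "09-Jul-2012"),
  High  = c(5302.15, 5317.00, 5317.65, 5333.65, 5327.20, 5300.60),
  Low   = c(5263.35, 5265.95, 5273.30, 5288.85, 5287.75, 5257.75),
  Close = c(5278.60, 5287.95, 5302.55, 5327.30, 5316.95, 5275.15)
)

# Assignment 4: volatility as the standard deviation of daily returns (assumption)
ret        <- diff(nse$Close) / head(nse$Close, -1)
daily_vol  <- sd(ret)
annual_vol <- daily_vol * sqrt(252)  # 252 trading days per year, by convention
print(c(daily = daily_vol, annual = annual_vol))

if (interactive()) {
  # Assignment 0: histogram of a simple vector
  hist(c(1, 2, 2, 3, 3, 3, 4, 4, 5))

  # Assignment 1: histogram of the "High" column
  hist(nse$High, main = "NSE High", xlab = "High")

  # Assignment 2: point and line graph with a title and axis labels
  plot(nse$High, type = "b", main = "NSE High by day",
       xlab = "Trading day", ylab = "High")

  # Assignment 3: scatterplot of two columns
  plot(nse$Low, nse$High, xlab = "Low", ylab = "High",
       main = "High vs Low")
}
```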