Monday, January 13, 2014

Differentiate time-series data

Aim: to differentiate time-series data and to check the effect of choosing different time intervals

Initialization

clear all; close all; clc;

Differentiate a symbolic expression

syms x
f=sin(x);
fd=diff(f);
ezplot(f,[0 2*pi])
hold on,
h=ezplot(fd,[0 2*pi]);
set(h,'Color','g')
ylim([-1 1])
legend('sin(x)', 'deriv of sin(x)')

Differentiate discrete time-series data

Suppose these are measured data

timeInterval=0.01;
t=0:timeInterval/pi:2*pi;
y=sin(t);
yd=diff(y)./(timeInterval/pi); % forward difference divided by the sample spacing
figure(2),plot(t,y,'b.',t(1:end-1),yd,'g.')
legend('sin(x)', 'deriv of sin(x)')

Discrete data, using a larger time interval

Suppose these are measured data

timeIntervalL=0.2;
tL=0:timeIntervalL/pi:2*pi;
yL=sin(tL);
ydL=diff(yL)./(timeIntervalL/pi);
figure(3),plot(tL,yL,'b.',tL(1:end-1),ydL,'g.')
legend('sin(x)', 'deriv of sin(x)')
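
The plots above place each difference quotient at the left end of its interval, t(1:end-1). A forward difference actually matches the true derivative more closely at the midpoint of the interval, which matters more as the spacing grows. The following is a minimal sketch of that comparison, assuming the same spacing as timeIntervalL/pi; the names hq, tq, dq, errLeft and errMid are introduced here for illustration only.

```matlab
% Compare the forward difference of sin(t) with the exact derivative cos(t),
% evaluated at the left endpoint and at the midpoint of each interval.
hq   = 0.2/pi;                       % same spacing as timeIntervalL/pi above
tq   = 0:hq:2*pi;
dq   = diff(sin(tq)) ./ hq;          % forward difference, one value per interval
tMid = tq(1:end-1) + hq/2;           % midpoint of each interval

errLeft = max(abs(dq - cos(tq(1:end-1))));   % error if plotted at left endpoints
errMid  = max(abs(dq - cos(tMid)));          % error if plotted at midpoints

fprintf('max error at left endpoints: %.4g\n', errLeft);
fprintf('max error at midpoints:      %.4g\n', errMid);
```

If the midpoint error comes out much smaller, plotting ydL against tL(1:end-1)+timeIntervalL/(2*pi) would remove most of the visible shift in figure 3.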

overlay different simulations

figure(4)
ezplot(fd),hold on
plot(t(1:end-1),yd,'bo');
plot(tL(1:end-1),ydL,'ro');
xlim([0 2*pi])
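
The overlay shows the effect of the time interval visually. To put a number on it, the sketch below computes the largest deviation of the forward difference from the exact derivative cos(t) for both spacings used in this post. The variable names stepSizes and maxErr are assumptions made for this example and do not appear in the code above.

```matlab
% Maximum forward-difference error for the two sample spacings used above.
stepSizes = [0.01 0.2] / pi;         % timeInterval/pi and timeIntervalL/pi
maxErr = zeros(size(stepSizes));

for k = 1:numel(stepSizes)
    h  = stepSizes(k);
    tk = 0:h:2*pi;
    dk = diff(sin(tk)) ./ h;                       % forward difference
    maxErr(k) = max(abs(dk - cos(tk(1:end-1))));   % compare with exact derivative
end

fprintf('spacing %.4g -> max error %.4g\n', [stepSizes; maxErr]);
```

For a forward difference the error is expected to shrink roughly in proportion to the spacing, so the smaller interval should give the smaller error.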

Saturday, January 4, 2014

Data and Questions: regression with factor variables

I was taking the Data Analysis course by Jeff Leek on Coursera. The video 'Regression with Factor Variables' was very useful to me. However, I am a Matlab guy and don't like R very much, so I did all the analysis in Matlab. The code is published here:

http://matlabnewbie.blogspot.com/2014/01/matlab-code-regression-with-factor.html

and here's the video.


Matlab code: regression with factor variables

<!-- This HTML was auto-generated from MATLAB code. To make changes, update the MATLAB code and republish this document. MovieData_Practice

% Practice of Regression with categorical covariates
% By Segovia on 01/03/2014
clear;clc;close;
load MovieData.mat

Draw a scatter plot of boxoffice against score, grouped by rating

figure();
gscatter(score,boxoffice, rating,'bgr','x.o^');
title('boxoffice vs. score, grouped by rating')

Create dataset array, convert rating to a nominal array

Movie=dataset(boxoffice, score,rating);
Movie.rating=nominal(Movie.rating);

Question 1 in Matlab

% Fit a regression model
% in R2013a, use the function LinearModel.fit
% in R2013b, use the function fitlm
fit=LinearModel.fit(Movie, 'score~rating')
fit = 


Linear regression model:
    score ~ 1 + rating

Estimated Coefficients:
                    Estimate    SE        tStat      pValue    
    (Intercept)       67.65     7.1933     9.4046    1.7256e-16
    rating_PG       -12.593     7.8486    -1.6045       0.11093
    rating_PG-13    -11.815     7.4113    -1.5941       0.11323
    rating_R         -12.02     7.4755    -1.6079       0.11017


Number of observations: 140, Error degrees of freedom: 136
Root Mean Squared Error: 14.4
R-squared: 0.0199,  Adjusted R-Squared -0.00177
F-statistic vs. constant model: 0.918, p-value = 0.434

Question 2 in Matlab

%Fit a regression model and use "R" as reference level in rating
Movie2=Movie;
Movie2.rating=reorderlevels(Movie2.rating, {'R','G','PG','PG-13'});
fit2=LinearModel.fit(Movie2,'score~rating')
fit2 = 


Linear regression model:
    score ~ 1 + rating

Estimated Coefficients:
                    Estimate    SE        tStat       pValue    
    (Intercept)        55.63    2.0346      27.342    4.0302e-57
    rating_G           12.02    7.4755      1.6079       0.11017
    rating_PG       -0.57286    3.7411    -0.15313       0.87852
    rating_PG-13     0.20538    2.7062    0.075893       0.93962


Number of observations: 140, Error degrees of freedom: 136
Root Mean Squared Error: 14.4
R-squared: 0.0199,  Adjusted R-Squared -0.00177
F-statistic vs. constant model: 0.918, p-value = 0.434
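
With a single categorical predictor, the intercept is the mean score of the reference rating and every other coefficient is the difference between that rating's mean and the reference mean. The sketch below checks this against the two fits, using the group means reported by multcompare further down in this post. The names names, means, coefRefG and coefRefR are introduced here for illustration.

```matlab
% Group means of score per rating, copied from the multcompare output (m) in this post
names = {'R','G','PG','PG-13'};
means = [55.6300 67.6500 55.0571 55.8354];

% Question 1: G is the reference level, so intercept = mean(G)
coefRefG = means - means(2);
% Question 2: R is the reference level, so intercept = mean(R)
coefRefR = means - means(1);

for k = 1:numel(names)
    fprintf('%-6s  relative to G: %8.3f   relative to R: %8.3f\n', ...
            names{k}, coefRefG(k), coefRefR(k));
end
```

The printed differences can be compared with the Estimate columns of fit and fit2. Reordering the levels changes only which group the others are compared against, which is why R-squared and the F-statistic are identical in both fits.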

Question 3 in Matlab

anova(fit)
ans = 

              SumSq     DF     MeanSq    F          pValue 
    rating    570.12      3    190.04    0.91818    0.43398
    Error      28149    136    206.98                      
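
The ANOVA table ties directly to the summary lines printed under the fits. As a rough check, the sketch below recomputes the F-statistic, R-squared and root mean squared error from the sums of squares and degrees of freedom shown above. The variable names are assumptions made for this example.

```matlab
% Values copied from the anova(fit) table above
ssRating = 570.12;  dfRating = 3;
ssError  = 28149;   dfError  = 136;

msRating = ssRating / dfRating;          % mean square for rating
msError  = ssError  / dfError;           % mean square error

F    = msRating / msError;               % F-statistic vs. constant model
R2   = ssRating / (ssRating + ssError);  % R-squared
rmse = sqrt(msError);                    % root mean squared error

fprintf('F = %.3f, R-squared = %.4f, RMSE = %.1f\n', F, R2, rmse);
```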

[~,~,st]=anova1(Movie2.score, Movie2.rating,'off');
[c,m,h,nms]=multcompare(st,'display','off','ctype','hsd')
c =

    1.0000    2.0000  -31.2248  -12.0200    7.1848
    1.0000    3.0000   -9.0380    0.5729   10.1837
    1.0000    4.0000   -7.1578   -0.2054    6.7470
    2.0000    3.0000   -7.5703   12.5929   32.7560
    2.0000    4.0000   -7.2254   11.8146   30.8546
    3.0000    4.0000  -10.0553   -0.7782    8.4988


m =

   55.6300    2.0346
   67.6500    7.1933
   55.0571    3.1394
   55.8354    1.7844


h =

     []


nms = 

    'R'
    'G'
    'PG'
    'PG-13'

-->
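
In the MATLAB release used here, each row of c holds the two group indices, then the lower confidence bound, the estimated difference, and the upper confidence bound. A pair differs significantly only if its interval excludes zero. The sketch below applies that rule to the matrix printed above; sig is a name introduced for this example.

```matlab
% Pairwise comparison matrix and group names, copied from the output above
c = [1 2 -31.2248 -12.0200  7.1848
     1 3  -9.0380   0.5729 10.1837
     1 4  -7.1578  -0.2054  6.7470
     2 3  -7.5703  12.5929 32.7560
     2 4  -7.2254  11.8146 30.8546
     3 4 -10.0553  -0.7782  8.4988];
nms = {'R'; 'G'; 'PG'; 'PG-13'};

% A difference is significant when the confidence interval excludes zero
sig = c(:,3) > 0 | c(:,5) < 0;

for i = 1:size(c,1)
    fprintf('%-6s vs %-6s  CI [%9.4f, %9.4f]  significant: %d\n', ...
            nms{c(i,1)}, nms{c(i,2)}, c(i,3), c(i,5), sig(i));
end
```

Every interval in this output contains zero, which agrees with the non-significant p-value in the ANOVA table.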

my-alpine and docker-compose.yml

```
version: '1'
services:
    man:
      build: .
      image: my-alpine:latest
```

Dockerfile:

```
FROM alpine:latest
ENV PYTH...
```