Saturday, January 4, 2014

Matlab code: regression with factor variables

<!-- This HTML was auto-generated from MATLAB code. To make changes, update the MATLAB code and republish this document. MovieData_Practice

% Practice of Regression with categorical covariates
% By Segovia on 01/03/2014
clear;clc;close;
load MovieData.mat

Draw a scatter plot of boxoffice against score, grouped by rating

figure();
gscatter(score,boxoffice, rating,'bgr','x.o^');
title('boxoffice vs. score, grouped by rating')
[Figure: scatter plot of boxoffice vs. score, grouped by rating]

Create dataset array, convert rating to a nominal array

Movie=dataset(boxoffice, score,rating);
Movie.rating=nominal(Movie.rating);

Question 1 in Matlab

%Fit a regression model
% in 2013a version, use function LinearModel.fit
% in 2013b version, use function fitlm
fit=LinearModel.fit(Movie, 'score~rating')
fit = 


Linear regression model:
    score ~ 1 + rating

Estimated Coefficients:
                    Estimate    SE        tStat      pValue    
    (Intercept)       67.65     7.1933     9.4046    1.7256e-16
    rating_PG       -12.593     7.8486    -1.6045       0.11093
    rating_PG-13    -11.815     7.4113    -1.5941       0.11323
    rating_R         -12.02     7.4755    -1.6079       0.11017


Number of observations: 140, Error degrees of freedom: 136
Root Mean Squared Error: 14.4
R-squared: 0.0199,  Adjusted R-Squared -0.00177
F-statistic vs. constant model: 0.918, p-value = 0.434
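
Reading the coefficients: G does not appear in the table, so it is the reference level. The intercept is then the mean score of G-rated movies, and each rating coefficient is a difference from that mean. A short worked sketch using the estimates above (the beta symbols and indicator notation are introduced here for illustration and are not part of the MATLAB output):

```latex
\begin{aligned}
\widehat{\text{score}} &= \hat\beta_0
  + \hat\beta_{PG}\,\mathbf{1}[\text{PG}]
  + \hat\beta_{PG13}\,\mathbf{1}[\text{PG-13}]
  + \hat\beta_{R}\,\mathbf{1}[\text{R}] \\
\bar y_{G}    &= \hat\beta_0 = 67.65 \\
\bar y_{PG}   &= 67.65 - 12.593 = 55.057 \\
\bar y_{PG13} &= 67.65 - 11.815 = 55.835 \\
\bar y_{R}    &= 67.65 - 12.02 = 55.63
\end{aligned}
```

These values agree with the group means that multcompare reports further down in this post.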

Question 2 in Matlab

%Fit a regression model and use "R" as reference level in rating
Movie2=Movie;
Movie2.rating=reorderlevels(Movie2.rating, {'R','G','PG','PG-13'});
fit2=LinearModel.fit(Movie2,'score~rating')
fit2 = 


Linear regression model:
    score ~ 1 + rating

Estimated Coefficients:
                    Estimate    SE        tStat       pValue    
    (Intercept)        55.63    2.0346      27.342    4.0302e-57
    rating_G           12.02    7.4755      1.6079       0.11017
    rating_PG       -0.57286    3.7411    -0.15313       0.87852
    rating_PG-13     0.20538    2.7062    0.075893       0.93962


Number of observations: 140, Error degrees of freedom: 136
Root Mean Squared Error: 14.4
R-squared: 0.0199,  Adjusted R-Squared -0.00177
F-statistic vs. constant model: 0.918, p-value = 0.434
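
Changing the reference level only re-parameterizes the same model, which is why the RMSE, R-squared and F-statistic are identical to Question 1. A sketch of how the new estimates follow from the Question 1 estimates (primed symbols denote the R-referenced fit; the notation is an assumption made here for illustration):

```latex
\begin{aligned}
\hat\beta_0'      &= \hat\beta_0 + \hat\beta_{R} = 67.65 - 12.02 = 55.63 \\
\hat\beta_{G}'    &= -\hat\beta_{R} = 12.02 \\
\hat\beta_{PG}'   &= \hat\beta_{PG} - \hat\beta_{R} = -12.593 + 12.02 = -0.573 \\
\hat\beta_{PG13}' &= \hat\beta_{PG13} - \hat\beta_{R} = -11.815 + 12.02 = 0.205
\end{aligned}
```

Up to rounding, these match the fit2 table above.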

Question 3 in Matlab

anova(fit)
ans = 

              SumSq     DF     MeanSq    F          pValue 
    rating    570.12      3    190.04    0.91818    0.43398
    Error      28149    136    206.98                      
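
The ANOVA table ties back to the summary lines printed under the regression output. As a rough check, using the rounded values shown in the table (so the results are approximate):

```latex
\begin{aligned}
F   &= \frac{MS_{\text{rating}}}{MS_{\text{Error}}}
     = \frac{190.04}{206.98} \approx 0.918 \\
R^2 &= \frac{SS_{\text{rating}}}{SS_{\text{rating}} + SS_{\text{Error}}}
     = \frac{570.12}{570.12 + 28149} \approx 0.0199
\end{aligned}
```

Both agree with the "F-statistic vs. constant model" and "R-squared" lines reported for fit.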

20/20

[~,~,st]=anova1(Movie2.score, Movie2.rating,'off');
[c,m,h,nms]=multcompare(st,'display','off','ctype','hsd')
c =

    1.0000    2.0000  -31.2248  -12.0200    7.1848
    1.0000    3.0000   -9.0380    0.5729   10.1837
    1.0000    4.0000   -7.1578   -0.2054    6.7470
    2.0000    3.0000   -7.5703   12.5929   32.7560
    2.0000    4.0000   -7.2254   11.8146   30.8546
    3.0000    4.0000  -10.0553   -0.7782    8.4988


m =

   55.6300    2.0346
   67.6500    7.1933
   55.0571    3.1394
   55.8354    1.7844


h =

     []


nms = 

    'R'
    'G'
    'PG'
    'PG-13'

-->
