Team 1 Pull request #5

Open
wants to merge 78 commits into
base: master
78 commits
7450ee9
erverv
Premtonh May 8, 2019
a8cbacd
Converted from factor to numeric.
VilmaShehu May 8, 2019
1424f74
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
Premtonh May 8, 2019
c25e0c9
transformation
NedelcuAlin May 8, 2019
c4a8678
test
NedelcuAlin May 8, 2019
7438cec
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
Premtonh May 8, 2019
b221da3
Team 1 Project create
WonderAnn May 8, 2019
35641d3
Create file plot
WonderAnn May 8, 2019
af62396
model script added
Premtonh May 8, 2019
0475485
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
Premtonh May 8, 2019
161301a
model script
Premtonh May 8, 2019
aca57bc
model_function
NedelcuAlin May 8, 2019
58b6668
dcwevcwe
Premtonh May 8, 2019
169c72e
test
NedelcuAlin May 8, 2019
eab28a6
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
NedelcuAlin May 8, 2019
cbb5f48
commit
NedelcuAlin May 8, 2019
8c58b46
Adding Transform file
VilmaShehu May 8, 2019
20a7a72
prediction function
Premtonh May 8, 2019
e597de3
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
Premtonh May 8, 2019
2cd645d
Add ploting functions
WonderAnn May 8, 2019
6f22ee6
Adding Transform file
VilmaShehu May 8, 2019
50e3c64
change_object_function
NedelcuAlin May 8, 2019
b175b30
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
NedelcuAlin May 8, 2019
b11c3a1
Transform file
VilmaShehu May 8, 2019
1023b3e
Description added
WonderAnn May 8, 2019
1addeb8
Creating CLEAR function
JuliaPetranova May 8, 2019
1452849
Two cleaning functions modified
JuliaPetranova May 8, 2019
3f388c7
adding new function for clear
JuliaPetranova May 8, 2019
436c085
Add description changes
WonderAnn May 8, 2019
184efe7
prediction model
Premtonh May 8, 2019
fa3f9f9
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
Premtonh May 8, 2019
a871fc9
model performance
NedelcuAlin May 8, 2019
1802582
argument added to predictions
NedelcuAlin May 8, 2019
cfddd2e
Import for libraries and Documentation for Plot.R
WonderAnn May 8, 2019
88f8363
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
WonderAnn May 8, 2019
114f3d6
model_performance
Premtonh May 8, 2019
e73042e
model performance update
Premtonh May 8, 2019
f59fc37
update perfomance
Premtonh May 8, 2019
d9e534c
Documentation of Transform file
VilmaShehu May 8, 2019
50a8c58
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
VilmaShehu May 8, 2019
99675a9
Docs for Model.R
WonderAnn May 8, 2019
87eaf4a
Plot docs changes
WonderAnn May 8, 2019
02addde
Model comments changes
WonderAnn May 8, 2019
296e7e1
add documentation
JuliaPetranova May 8, 2019
dee93eb
Docs prediction
WonderAnn May 8, 2019
4d5a068
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
WonderAnn May 8, 2019
78d2534
model performance docs
WonderAnn May 8, 2019
9e186e2
move everything
WonderAnn May 8, 2019
b134379
Fix the issue
WonderAnn May 9, 2019
2985361
change the title
WonderAnn May 9, 2019
f7ac131
Delete b.rd
WonderAnn May 9, 2019
3b34924
New .Rd files
WonderAnn May 9, 2019
64e23af
Some fixes for Docs
WonderAnn May 9, 2019
558de39
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
WonderAnn May 9, 2019
2d0a0ba
Fix description
WonderAnn May 9, 2019
5297ae9
Namespacce fix
WonderAnn May 9, 2019
9e90b71
Fix description
WonderAnn May 9, 2019
2a63b17
Fix the description
WonderAnn May 9, 2019
1b070f8
Fix import libraries
WonderAnn May 9, 2019
840c07e
import fix
WonderAnn May 9, 2019
3f0d7ef
Delete bmarketing.rd
WonderAnn May 9, 2019
17a46f1
Delete newdata.rd
WonderAnn May 9, 2019
b1af62a
delete predictions.rd
WonderAnn May 9, 2019
9e68713
delete dt_model.rd
WonderAnn May 9, 2019
ad825e5
delete team1_package.rd
WonderAnn May 9, 2019
8edb405
Function fix.
WonderAnn May 9, 2019
3e98a86
Export added, generated new namespace
WonderAnn May 9, 2019
b0bee9e
Documentation changes
WonderAnn May 9, 2019
78d6386
readme changes
WonderAnn May 9, 2019
7b3ac25
description changes
WonderAnn May 9, 2019
2c926d7
doc changes
VilmaShehu May 9, 2019
4c1e848
Document changes
WonderAnn May 9, 2019
9a93fa4
Merge branch 'master' of https://github.com/WonderAnn/bmarketing
WonderAnn May 9, 2019
3640342
New docs files
WonderAnn May 9, 2019
fbbbeec
Doc changes
VilmaShehu May 9, 2019
19798d2
New docs and function fix
WonderAnn May 9, 2019
3e6d7f2
Docs changes
WonderAnn May 9, 2019
9c46dad
Docs changes
WonderAnn May 9, 2019
4 changes: 4 additions & 0 deletions .Rproj.user/E6F84AE4/sources/prop/5446D37A
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
{
"cursorPosition" : "45,0",
"scrollLine" : "28"
}
6 changes: 6 additions & 0 deletions .Rproj.user/E6F84AE4/sources/prop/INDEX
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
~%2Fbmarketinggg%2FTeam1%2FR%2FPlot.R="85A4D0EF"
~%2Fbmarketinggg%2FTeam1%2FR%2Fmodel.R="78826CD7"
~%2Fbmarketinggg%2Fbmarketing.R="5446D37A"
~%2Fbmarketinggggg%2FTeam1%2FR%2Fmodel.R="2432FCD0"
~%2Fbmarketinggggg%2FTeam1%2FR%2Fpredict.R="4BB9E5C6"
~%2Fbmarketinggggg%2Fbmarketing.R="A971490D"
25 changes: 25 additions & 0 deletions .Rproj.user/E6F84AE4/sources/s-5EE927D0/B6001EF7
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
{
"collab_server" : "",
"contents" : "",
"created" : 1557311882456.000,
"dirty" : false,
"encoding" : "UTF-8",
"folds" : "",
"hash" : "0",
"id" : "B6001EF7",
"lastKnownWriteTime" : 1557319656,
"last_content_update" : 1557319656,
"path" : "~/bmarketinggggg/bmarketing.R",
"project_path" : "bmarketing.R",
"properties" : {
"cursorPosition" : "51,0",
"scrollLine" : "28"
},
"read_only" : false,
"read_only_alternatives" : [
],
"relative_order" : 1,
"source_on_save" : false,
"source_window" : "",
"type" : "r_source"
}
58 changes: 58 additions & 0 deletions .Rproj.user/E6F84AE4/sources/s-5EE927D0/B6001EF7-contents
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
library(tidyverse)

#################Loading data into the environment#################
bmarketing <- read.csv2("bmarketing.csv")

bmarketing$euribor3m = as.numeric(as.character(bmarketing$euribor3m))
bmarketing$cons.conf.idx = as.numeric(as.character(bmarketing$cons.conf.idx))
bmarketing$emp.var.rate = as.numeric(as.character(bmarketing$emp.var.rate))
bmarketing$cons.price.idx = as.numeric(as.character(bmarketing$cons.price.idx))
bmarketing$nr.employed = as.numeric(as.character(bmarketing$nr.employed))

# Let's look at the dataset and get an initial understanding of the column types
str(bmarketing)
summary(bmarketing)

view (bmarketing)

# A quick check:
# if newdata has the same number of observations, no NA values are present
# is.na(bmarketing)
newdata <- na.omit(bmarketing)
nrow(newdata) == nrow(bmarketing)

# A deeper check for a particular column, here the target y
if (anyNA(bmarketing$y)) {
  print("Missing Value found in the specified column")
} else {
  print("All okay: No Missing Value found in the specified column")
}

# Let's find the range of individual variables
summary(bmarketing)

## ------------------------------------------------------------------------
bmarketing %>%
ggplot() + geom_histogram(aes(age), bins = 30) +
geom_vline(aes(xintercept= median(age)), color = "red")

# TODO: do boxplots for each data
# boxplot(duration~y,data=bmarketing_sub,col="red")

#################Decision Tree#################
library(rpart)
library(rpart.plot)

dt_model<- rpart(y ~ ., data = bmarketing)
# dt_model<- rpart(y ~ poutcome + emp.var.rate + cons.price.idx + cons.conf.idx + euribor3m + nr.employed, data = bmarketing)
rpart.plot(dt_model)
summary(dt_model)

#################Testing Decision Tree #################
predictions <- predict(dt_model, bmarketing, type = "class")

## Compute the accuracy (measured on the training data, so it is optimistic)
mean(bmarketing$y == predictions)

# Let's look at the confusion matrix
table(predictions, bmarketing$y)
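The script above scores the tree on the same rows it was trained on. A minimal sketch of a hold-out evaluation follows, assuming a 70/30 random split is acceptable. The helper name `holdout_accuracy` and its arguments are hypothetical and not part of this pull request, and the `kyphosis` data shipped with rpart stands in for `bmarketing.csv`, which is not available here.

```r
library(rpart)

# Hypothetical helper (not part of the PR): fit a classification tree on a
# random training split and report the accuracy on the held-out rows.
holdout_accuracy <- function(data, target, train_fraction = 0.7, seed = 1) {
  set.seed(seed)
  n <- nrow(data)
  train_idx <- sample(n, size = floor(train_fraction * n))
  train <- data[train_idx, , drop = FALSE]
  test <- data[-train_idx, , drop = FALSE]

  # Build e.g. Kyphosis ~ . from the name of the target column
  fit <- rpart(reformulate(".", response = target), data = train, method = "class")
  pred <- predict(fit, test, type = "class")
  mean(pred == test[[target]])
}

# Stand-in data set from rpart; with the real data this might be
# holdout_accuracy(bmarketing, "y")
acc <- holdout_accuracy(kyphosis, "Kyphosis")
print(acc)
```

The split is seeded so the number is reproducible; a different seed or cross-validation would give a different, and arguably more robust, estimate.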
Empty file.
Empty file.
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
.Rproj.user
.Rhistory
.RData
.Ruserdata
14 changes: 14 additions & 0 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
Package: Team1
Type: Package
Title: Data Management Package
Version: 0.1.0
Date: 2019-05-08
Author: Team 1 project
Maintainer: Hanna Popova <[email protected]>
Imports:
tidyverse,
rpart,
rpart.plot
Description: Functions for data cleaning, data transformation, decision tree modelling, tree plotting, prediction, and model performance evaluation.
License: open source
RoxygenNote: 6.1.1
15 changes: 15 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# Generated by roxygen2: do not edit by hand

export(clear1)
export(clear2)
export(clear3)
export(histplot)
export(intonum)
export(logaritmic)
export(model)
export(model_performance)
export(model_pred)
export(treeplot)
import(rpart)
import(rpart.plot)
import(tidyverse)
98 changes: 98 additions & 0 deletions R/CLEAR.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
#'Function to find missing data in a data frame
#'
#'Compares the number of rows before and after \code{na.omit} and prints
#'whether any row contains a missing value.
#'
#'@param data A data frame to check for missing values.
#'
#'@keywords na.omit, nrow
#'
#'@export clear1
#'
#'@examples
#'clear1(bmarketing)
#'

# A quick check:
# if newdata has the same number of rows, no NA values are present
# is.na(bmarketing)

clear1 <- function(data) {
  newdata <- na.omit(data)
  if (nrow(newdata) == nrow(data)) {
    print("All okay: No Missing Value found")
  } else {
    print("Error: Missing Values found")
  }
}

#'Function to find missing data inside a column
#'
#'Prints whether the named column of a data frame contains missing values.
#'
#'@param data A data frame.
#'@param target Name of the column to check, as a character string.
#'
#'@keywords is.na
#'
#'@export clear2
#'
#'@examples
#'clear2(bmarketing, "duration")
#'

# A deeper check for a particular column, e.g. "age"

clear2 <- function(data, target) {
  # data$target would look for a column literally named "target",
  # so index by the column name held in `target` instead
  if (anyNA(data[[target]])) {
    print("Missing Value found in the specified column")
  } else {
    print("All okay: No Missing Value found in the specified column")
  }
}
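A small illustration of how `$` and `[[` differ when the column name is held in a variable, which is the situation a function with a `target` argument is in. The data frame `df` and the variable `col` are made up for the example.

```r
# Toy data frame with one missing value (illustrative only)
df <- data.frame(duration = c(10, NA, 30))
col <- "duration"

# `$` does not evaluate `col`; it looks for a column literally named "col"
via_dollar <- df$col

# `[[` evaluates `col` and returns the "duration" column
via_brackets <- df[[col]]

print(is.null(via_dollar))
print(anyNA(via_brackets))
```

Because `df$col` is `NULL`, a missing-value check written with `$` finds nothing to report regardless of what the column contains.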

#'Remove a column (and report a warning) if it contains more than 50 percent NAs
#'
#'@param data A data frame.
#'@param target Name of the column to check, as a character string.
#'
#'@return The data frame, without the column if more than half of its values were missing.
#'
#'@keywords is.na
#'
#'@export clear3
#'
#'@examples
#'bmarketing <- clear3(bmarketing, "duration")
#'

# Remove a column (and report a warning) if it contains more than 50% NAs
clear3 <- function(data, target) {
  if (mean(is.na(data[[target]])) > 0.5) {
    warning("More than 50% missing values in column '", target, "'; column removed")
    # Drop the column from the local copy; the caller must keep the returned data frame
    data[[target]] <- NULL
  } else {
    print("All okay: There are no more than 50% of Missing Value in the specified column")
  }
  data
}
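The 50 percent rule can also be applied to every column at once. The following is a minimal sketch under that assumption; the helper `drop_sparse_columns` and the toy data frame are hypothetical and not part of this pull request.

```r
# Hypothetical helper: drop every column whose share of NA values
# exceeds `threshold`, warning about the columns that are removed.
drop_sparse_columns <- function(data, threshold = 0.5) {
  na_share <- colMeans(is.na(data))          # fraction of NAs per column
  sparse <- names(na_share)[na_share > threshold]
  if (length(sparse) > 0) {
    warning("Dropping columns with too many NAs: ",
            paste(sparse, collapse = ", "))
  }
  data[, na_share <= threshold, drop = FALSE]
}

# Column a is 75% NA, column b is 25% NA, column c has no NAs
toy <- data.frame(a = c(1, NA, NA, NA),
                  b = c(1, 2, NA, 4),
                  c = letters[1:4])
cleaned <- suppressWarnings(drop_sparse_columns(toy))
print(names(cleaned))
```

Returning the reduced data frame, rather than assigning into the global environment, leaves it to the caller to decide whether to overwrite the original object.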


38 changes: 38 additions & 0 deletions R/Plot.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@

#'Plot a histogram of a column with its median marked
#'
#'@param data A data frame.
#'@param column A numeric vector to plot, e.g. \code{data$age}.
#'
#'@keywords plot
#'
#'@import rpart.plot
#'@import rpart
#'@import tidyverse
#'
#'@export histplot
#'
#'@examples
#'histplot(bmarketing, bmarketing$age)
#'

histplot <- function(data, column) {
  data %>%
    ggplot() + geom_histogram(aes(column), bins = 30) +
    geom_vline(aes(xintercept = median(column)), color = "red")
}

#'Plot a fitted decision tree
#'
#'@param model A fitted \code{rpart} model.
#'
#'@export treeplot
#'
#'@examples
#'treeplot(rpart(y ~ ., data = bmarketing))
#'

treeplot <- function(model) {
  rpart.plot(model)
}
60 changes: 60 additions & 0 deletions R/Transform.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
#'Transform numeric variables using the log
#'
#'@param x A numeric vector.
#'
#'@return The natural logarithm of \code{x}.
#'
#'@keywords log
#'
#'@import rpart.plot
#'@import rpart
#'@import tidyverse
#'
#'@export logaritmic
#'
#'@examples
#'bmarketing$age<-logaritmic(bmarketing$age)
#'bmarketing$duration<-logaritmic(bmarketing$duration)
#'


#################Transform numeric variables using the log #################

logaritmic <- function(x) {
  # Return the result visibly instead of as the value of an assignment
  log(x)
}

#'Transform factors into numeric variables
#'
#'@param data The data frame the column belongs to. It is not modified.
#'@param column A factor whose labels are numbers, e.g. \code{data$euribor3m}.
#'
#'@return A numeric vector with the values of \code{column}.
#'
#'@keywords factor, numeric
#'
#'@import rpart.plot
#'@import rpart
#'@import tidyverse
#'
#'@export intonum
#'
#'@examples
#'bmarketing$emp.var.rate<-intonum(bmarketing, bmarketing$emp.var.rate)
#'bmarketing$euribor3m<-intonum(bmarketing, bmarketing$euribor3m)
#'

#################Transform factors into numeric variables#################

intonum <- function(data, column) {
  # Convert via character so that the labels, not the internal level codes,
  # become the numbers; the caller assigns the result back to the column
  as.numeric(as.character(column))
}
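The detour through `as.character` matters because calling `as.numeric` directly on a factor returns its internal level codes. A short illustration with a made-up factor `rate`:

```r
# A factor whose labels are numbers, as produced when a numeric column
# is read in as text (values are illustrative only)
rate <- factor(c("4.857", "1.344", "4.857", "0.634"))

# Direct conversion yields the integer level codes, not the values
codes <- as.numeric(rate)

# Converting the labels first recovers the actual numbers
values <- as.numeric(as.character(rate))

# Equivalent alternative: convert the levels once, then index by the factor
values_alt <- as.numeric(levels(rate))[rate]

print(codes)
print(values)
```

Both `values` and `values_alt` hold the original numbers; the second form converts each distinct level only once, which can be faster on long vectors.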
26 changes: 26 additions & 0 deletions R/model.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
#'Fit a decision tree model
#'
#'@param dataset A data frame containing the predictors and the target.
#'
#'@param target Name of the target column, as a character string.
#'
#'@keywords model, decision tree
#'
#'@import rpart.plot
#'@import rpart
#'@import tidyverse
#'
#'@export model
#'
#'@examples
#'model(bmarketing, "y")
#'

# model function with 2 arguments: dataset and the name of the target column
model <- function(dataset, target) {
  # Build e.g. y ~ . from the column name, so that the target column
  # is not also picked up as a predictor by the dot
  rpart(stats::reformulate(".", response = target), data = dataset)
}
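One thing to watch with a formula of the form `target ~ .` where `target` is a vector rather than a column name: the dot expands to every column of the data frame, including the target column itself. The sketch below compares that with a formula built from the column name. The function names `fit_leaky` and `fit_by_name` are hypothetical, and the `kyphosis` data from rpart stands in for `bmarketing`.

```r
library(rpart)

# Target passed as a vector: the dot still includes the target column
fit_leaky <- function(dataset, target) {
  rpart(target ~ ., data = dataset)
}

# Target passed as a column name: the formula becomes e.g. Kyphosis ~ .
fit_by_name <- function(dataset, target) {
  rpart(reformulate(".", response = target), data = dataset)
}

leaky <- fit_leaky(kyphosis, kyphosis$Kyphosis)
clean <- fit_by_name(kyphosis, "Kyphosis")

# Predictors each model was allowed to use
leaky_predictors <- attr(leaky$terms, "term.labels")
clean_predictors <- attr(clean$terms, "term.labels")
print(leaky_predictors)
print(clean_predictors)
```

When the target appears among the predictors, the tree can split on it directly, so accuracy figures from such a model say little about how it would perform on new data.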
27 changes: 27 additions & 0 deletions R/model_performance.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
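#'Evaluate model performance
#'
#'@param target The observed classes.
#'
#'@param predictions The predicted classes, in the same order as \code{target}.
#'
#'@return A list with the accuracy and the confusion matrix.
#'
#'@keywords model, decision tree, accuracy
#'
#'@import rpart
#'@import tidyverse
#'
#'@export model_performance
#'
#'@examples
#'dt_model <- rpart(y ~ ., data = bmarketing)
#'predictions <- predict(dt_model, bmarketing, type = "class")
#'model_performance(bmarketing$y, predictions)
#'

model_performance <- function(target, predictions) {
  # Return both the accuracy and the confusion matrix; a function returns
  # only its last expression, so the accuracy was previously discarded
  list(accuracy = mean(target == predictions),
       confusion_matrix = table(predictions, target))
}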
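A worked example of the two quantities computed above, using five made-up observations. The vectors are illustrative and not taken from the bank marketing data.

```r
# Observed and predicted classes for five hypothetical clients
target <- factor(c("no", "no", "yes", "yes", "no"), levels = c("no", "yes"))
predictions <- factor(c("no", "yes", "yes", "no", "no"), levels = c("no", "yes"))

# Accuracy: share of positions where the prediction matches the target
accuracy <- mean(target == predictions)

# Confusion matrix: rows are predictions, columns are observed classes
confusion <- table(predictions, target)

# The same accuracy can be read off the diagonal of the confusion matrix
accuracy_from_matrix <- sum(diag(confusion)) / sum(confusion)

print(accuracy)
print(confusion)
```

Three of the five predictions match, so the accuracy is 0.6, and the diagonal of the matrix (2 correct "no", 1 correct "yes") gives the same figure.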