The chief purpose of this project is to compare the Decision Tree algorithm and the Backpropagation algorithm. The study resulted in an understanding of which datasets performed well with each of the two algorithms. The classification accuracy and the performance of the algorithms were analysed under the following parameters:
Classification Accuracy:
The percentage of instances in a dataset that are classified correctly by a particular classification algorithm.
Kappa Statistic:
The Kappa Statistic is a value used to measure the rate of classification of categorical data.
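As a concrete illustration of how this statistic is computed (a generic Python sketch, not part of the Weka experiments; the function name and labels are my own), Cohen's kappa compares the observed agreement with the agreement expected by chance:

```python
from collections import Counter

def cohens_kappa(observed, predicted):
    """Kappa = (p_o - p_e) / (1 - p_e): observed agreement p_o,
    corrected for the agreement p_e expected by chance."""
    n = len(observed)
    # observed agreement: fraction of instances labelled identically
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    obs, pred = Counter(observed), Counter(predicted)
    # chance agreement from the class marginals of each labelling
    p_e = sum(obs[c] * pred.get(c, 0) for c in obs) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

When the predicted and observed labels agree everywhere, p_o = 1 and the statistic equals 1, matching the description above.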
Root Mean Squared Error and Mean Absolute Error:
With these two values, we can find the average error and estimate the measure of accuracy. If both error values are high, the accuracy is low; if the error values are low, the accuracy is high.
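The two error measures can be sketched as follows (generic Python, not the Weka implementation):

```python
import math

def mean_absolute_error(actual, predicted):
    # average of the absolute differences
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    # square root of the average squared difference
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

Because squaring weights large deviations more heavily, the RMS error is always at least as large as the mean absolute error on the same predictions.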
ALGORITHMS AND WEKA TOOL:
I have made use of a tool called Weka [ 1 ], which contains implementations of all the required algorithms. This tool was very useful for testing an algorithm on a given dataset. The version of the tool I used for conducting the comparative study was Weka 3.
For executing the Decision Tree algorithm in Weka, I used J48. While using J48, I had the option of running it with or without pruning, and I selected the pruned J48 for the Decision Tree experiments. To evaluate the performance of the Backpropagation algorithm in Weka, I selected the Multilayer Perceptron. Both J48 and the Multilayer Perceptron are included in the Weka 3.6 package, which is free to download from the [ 1 ] website.
Testing Procedure:
The tests on the datasets were performed with the two algorithms, Decision Tree and Backpropagation. When I started the experiment, I had to fix some parameters as constants throughout the test procedure for all the data. The main parameters for the Decision Tree algorithm, kept the same for all datasets, are:
Confidence Factor: 0.25
minNumObj: 2
numFolds: 3
All three parameters are set to their default values and we are not going to change them. The confidence factor is set to reduce the problem of overfitting; the next two parameters are the minimum number of instances per leaf and the number of folds used for error pruning. I have used the pruned J48 for evaluating the decision tree.
In the Backpropagation algorithm also, I fixed some parameters at their default values before experimenting. They are given below:
Learning Rate: 0.3
Momentum: 0.2
Training Time: 500
Validation Set Size: 0
Validation Threshold: 20
For both algorithms, I performed the procedure using k-fold cross-validation. Cross-validation is nothing but dividing the data into k (k = 10 in our case) subsets. For some datasets, I also considered the values of F-Measure and ROC Area for comparison.
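The splitting step of 10-fold cross-validation can be sketched like this (a hypothetical helper, not Weka's internal code):

```python
import random

def k_fold_indices(n, k=10, seed=1):
    """Yield (train, test) index lists: each of the k folds serves once
    as the test set while the other k-1 folds form the training set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)       # fixed seed for repeatability
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f in range(k) if f != i for j in folds[f]]
        yield train, test
```

Every instance appears in exactly one test fold, so each instance is predicted exactly once over the whole procedure.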
DATASETS:
The datasets used for conducting this study are real-valued categorical datasets. As this is a comparison of two different algorithms, I tried to take more than five datasets and finally ended up with eight. All the datasets were chosen from the UCI machine learning repository [ 2 ] and they do not have any missing values. After the datasets were chosen, I converted all of them into ARFF format [ 3 ], rather than using the direct link in Weka. The datasets are multivariate in character and their basic task is classification.
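ARFF is a plain-text format: a header of @relation and @attribute declarations followed by @data rows. A minimal sketch of such a conversion (the helper name and sample values are mine, not the actual script used):

```python
def to_arff(relation, attributes, rows):
    """attributes: list of (name, type) pairs, where type is either
    the string 'numeric' or a list of nominal values."""
    lines = ["@relation " + relation, ""]
    for name, typ in attributes:
        if isinstance(typ, list):  # nominal attribute: enumerate its values
            lines.append("@attribute %s {%s}" % (name, ",".join(typ)))
        else:
            lines.append("@attribute %s %s" % (name, typ))
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines)
```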
The information about the datasets is tabulated below:
Dataset Name            | Number of Instances | Number of Attributes | Number of Classes | Missing Values
Ecoli                   | 336                 | 8                    | 8                 | No
Glass Identification    | 214                 | 10                   | 7                 | No
Ionosphere              | 351                 | 34                   | 2                 | No
Iris Plant              | 150                 | 4                    | 3                 | No
Magic Gamma Telescope   | 19020               | 11                   | 2                 | No
Image Segmentation      | 2310                | 19                   | 7                 | No
Sonar – Mines vs. Rocks | 208                 | 60                   | 2                 | No
Blood Transfusion       | 748                 | 5                    | 2                 | No
TEST RESULTS:
The results of the algorithms on the datasets are tabulated below.
Name                    | Correctly Classified Instances | Kappa Statistic | Mean Absolute Error | Root Mean Squared Error
Ecoli                   | 86.0119 %                      | 0.8066          | 0.0484              | 0.1704
Glass Identification    | 96.2617 %                      | 0.9492          | 0.0196              | 0.0946
Ionosphere              | 91.1681 %                      | 0.7993          | 0.0938              | 0.2786
Iris Plant              | 97.3333 %                      | 0.96            | 0.0327              | 0.1291
Magic Gamma Telescope   | 85.8728 %                      | 0.6776          | 0.1934              | 0.327
Image Segmentation      | 96.0606 %                      | 0.954           | 0.0159              | 0.097
Sonar – Mines vs. Rocks | 82.2115 %                      | 0.6419          | 0.1901              | 0.3964
Blood Transfusion       | 78.2086 %                      | 0.2844          | 0.2958              | 0.3931
The table above gives the results of the eight datasets with the Multilayer Perceptron algorithm. The results of the datasets with pruned J48 are given below:
Name                    | Correctly Classified Instances | Kappa Statistic | Mean Absolute Error | Root Mean Squared Error
Ecoli                   | 84.2262 %                      | 0.7824          | 0.0486              | 0.1851
Glass Identification    | 96.729 %                       | 0.9557          | 0.0093              | 0.0967
Ionosphere              | 91.453 %                       | 0.8096          | 0.0938              | 0.2901
Iris Plant              | 96 %                           | 0.94            | 0.035               | 0.1586
Magic Gamma Telescope   | 85.0578 %                      | 0.6614          | 0.1955              | 0.3509
Image Segmentation      | 96.9264 %                      | 0.9641          | 0.0104              | 0.0914
Sonar – Mines vs. Rocks | 71.1538 %                      | 0.422           | 0.2863              | 0.5207
Blood Transfusion       | 77.8075 %                      | 0.3424          | 0.3037              | 0.3987
The Kappa Statistic [ 4 ] is a measure of agreement between the predicted and the observed classifications of a dataset. If the predicted and observed values are the same, the Kappa Statistic is equal to 1. With the RMS error and the Mean Absolute error, the average error value can be found. If both error values are high, the accuracy is low, and vice versa. Correctly Classified Instances is the percentage of instances which have been classified correctly.
EVALUATING THE RESULTS:
From the results, we can see that both algorithms classified the datasets according to the number of instances and the number of attributes. The accuracy and performance of the two algorithms are not the same across all the datasets, and the variations are considerable. So, I have divided this section into eight parts discussing the accuracy and performance on each of the given datasets.
ECOLI:
The Ecoli dataset has 336 instances with 8 attributes. The accuracy of classifying the instances was better with MLP, at 86.01 %, than with J48, which classified 84.23 % of the instances correctly.
Name                           | MLP       | J48
Correctly Classified Instances | 86.0119 % | 84.2262 %
Kappa Statistic                | 0.8066    | 0.7824
Mean Absolute Error            | 0.0484    | 0.0486
Root Mean Squared Error        | 0.1704    | 0.1851
It was found that, out of the eight classes, the True Positive rate and False Positive rate for the last two classes were 0 in the outputs of both algorithms, though the weighted average ROC area of MLP is higher than that of J48. The kappa statistic value was also better with MLP. The only drawback of the Multilayer Perceptron on this dataset is that the time taken for training is longer than the time taken by J48; otherwise it is better in terms of accuracy and performance.
GLASS IDENTIFICATION:
Both algorithms classified the instances correctly, with J48 performing slightly better than the Multilayer Perceptron. The output is tabulated below for ease of reference. The root mean squared error was about the same for both algorithms, but the mean absolute error from MLP was double the value of J48.
This training dataset achieved a high kappa statistic value from both algorithms, and it is noted that the weighted average ROC area of the Backpropagation algorithm is 0.991, which is also a parameter for checking the classification ratio. If it is closer to 1.0, the classification is assumed to be good.
Name                           | MLP       | J48
Correctly Classified Instances | 96.2617 % | 96.729 %
Kappa Statistic                | 0.9492    | 0.9557
Mean Absolute Error            | 0.0196    | 0.0093
Root Mean Squared Error        | 0.0946    | 0.0967
The table given above contains the data for the Glass Identification dataset on the two algorithms. Overall, from this data, I conclude that the Decision Tree performed better, with only a small difference from the Backpropagation algorithm.
IONOSPHERE:
The mean absolute error value obtained from MLP and pruned J48 is the same, and the classification accuracy of J48 is slightly higher than that of MLP. But on examining the RMS error, I found that it is higher with pruned J48.
Name                           | MLP       | J48
Correctly Classified Instances | 91.1681 % | 91.453 %
Kappa Statistic                | 0.7993    | 0.8096
Mean Absolute Error            | 0.0938    | 0.0938
Root Mean Squared Error        | 0.2738    | 0.2901
This dataset has only two classes, and from the confusion matrices we can easily identify that the difference between the two algorithms is one instance: MLP misclassified one more instance than the pruned J48. Given below are the confusion matrices for the Ionosphere dataset.
Multilayer Perceptron:
  a   b   <-- classified as
 98  28 |  a = b
  3 222 |  b = g
Pruned J48:
   a   b   <-- classified as
 104  22 |  a = b
   8 217 |  b = g
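The correctly-classified percentages reported above can be recovered from these matrices: correct predictions lie on the diagonal, so accuracy is the diagonal sum over the total. A small sketch (the helper name is my own):

```python
def accuracy_from_confusion(matrix):
    # correctly classified instances are the diagonal entries
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

mlp = [[98, 28], [3, 222]]   # Ionosphere, Multilayer Perceptron
j48 = [[104, 22], [8, 217]]  # Ionosphere, pruned J48
```

These give 91.1681 % and 91.453 % respectively, matching the table, with MLP misclassifying 31 instances against 30 for J48.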
For the Ionosphere dataset, I consider that the Decision Tree algorithm was better, because MLP had some disadvantages in performance and accuracy on this dataset.
IRIS:
Name                           | MLP       | J48
Correctly Classified Instances | 97.3333 % | 96 %
Kappa Statistic                | 0.96      | 0.94
Mean Absolute Error            | 0.0327    | 0.035
Root Mean Squared Error        | 0.1291    | 0.1586
The Iris dataset consists of three classes and four attributes. This dataset achieved a high classification rate with both algorithms. On comparing the two, the classification rate of the Multilayer Perceptron was better than that of the Decision Tree algorithm. The mean absolute error and RMS error were very low and the kappa statistic value was high enough in both outputs. Although they were very similar in performance, I conclude that MLP was better than pruned J48 on the Iris dataset.
MAGIC GAMMA TELESCOPE:
Out of the eight datasets I chose, this one has the highest number of instances, with 11 attributes and two classes. From the output on this dataset, it is possible to get accurate values for the classification rate as well as the performance.
Name                           | MLP       | J48
Correctly Classified Instances | 85.8728 % | 85.0578 %
Kappa Statistic                | 0.6776    | 0.6614
Mean Absolute Error            | 0.1934    | 0.1955
Root Mean Squared Error        | 0.327     | 0.3509
MLP outperformed the Decision Tree algorithm in terms of all the parameters. The classification percentage and the Kappa Statistic value are higher with MLP when compared to pruned J48. The Mean Absolute Error and RMS Error were high in both outputs but, on inspection, MLP was better than J48.
IMAGE SEGMENTATION:
The instances were classified at a high percentage by both algorithms, and the Kappa Statistic is also near 1.0. On examining which was best, it was found that J48 was more suitable than MLP. But when I checked the weighted average ROC Area [ 5 ], which is normally represented as a graph of True Positive rate against False Positive rate, it was noted that the value from MLP was 0.995 while the ROC of J48 was 0.988.
Name                           | MLP       | J48
Correctly Classified Instances | 96.0606 % | 96.9264 %
Kappa Statistic                | 0.954     | 0.9641
Mean Absolute Error            | 0.0159    | 0.0104
Root Mean Squared Error        | 0.097     | 0.0914
Normally an ROC Area of 1.0 is considered a perfect test. So, in terms of ROC, MLP was good. But overall, it is concluded that J48 performed better on this dataset.
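For a two-class problem, the ROC area discussed here can be computed directly from class scores via the rank (Mann-Whitney) view: it is the probability that a random positive instance outranks a random negative one. A generic sketch (the labels and scores below are invented for illustration):

```python
def roc_area(labels, scores):
    """AUC: probability that a randomly chosen positive instance gets a
    higher score than a randomly chosen negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means every positive instance is ranked above every negative one, the "perfect test" mentioned above.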
SONAR – MINES VS ROCKS:
The Sonar dataset has 208 instances, 60 attributes and 2 classes. MLP performed well on this dataset, but the classification rate and the Kappa Statistic were moderate. The Kappa Statistic of J48 was less than 0.5, whereas MLP achieved more than 0.5 (although it is still quite low).
Name                           | MLP       | J48
Correctly Classified Instances | 82.2115 % | 71.1538 %
Kappa Statistic                | 0.6419    | 0.422
Mean Absolute Error            | 0.1901    | 0.2863
Root Mean Squared Error        | 0.3964    | 0.5207
This dataset resulted in high error values from both algorithms. But on comparing the two for this dataset, I would conclude that MLP is better.
BLOOD TRANSFUSION:
The Blood Transfusion dataset, which has 748 instances, 5 attributes and 2 classes, achieved a classification accuracy of less than 80 % from both algorithms.
Name                           | MLP       | J48
Correctly Classified Instances | 78.2086 % | 77.8075 %
Kappa Statistic                | 0.2844    | 0.3424
Mean Absolute Error            | 0.2958    | 0.3037
Root Mean Squared Error        | 0.3931    | 0.3987
Of the two, MLP was better at 78.2086 %, with J48 only about 0.5 % below it. The Kappa Statistic was also very low, with high errors in both outputs. Even when the kappa statistic is not very high, it is normally around 0.7 or 0.8; but in our case it is very low for both outputs. Although the performance of MLP and J48 was roughly the same, on overall comparison MLP was better than pruned J48.
CONCLUSION:
I chose the datasets in such a way that they had different ranges of instances, classes and attributes. This let me obtain accuracy and performance outputs for datasets with various properties. On examining the tabulated results, it is seen that five out of the eight datasets were classified better using MLP, and three datasets were classified better using pruned J48.
We cannot come to the conclusion that MLP is the best classifier and J48 is not a good one, because MLP also has some disadvantages of its own. When I was splitting the data according to the datasets, I noted the training time taken for each procedure. Going through that information, I understood that the training time is longer for MLP than for pruned J48. On most of the training data, J48 takes only a small amount of time, whereas MLP takes about 5 to 8 times what pruned J48 takes.
I would like to conclude that the Multilayer Perceptron performed well on our datasets, and that pruned J48 also performed well, with only a minor difference in accuracy and performance from the Multilayer Perceptron.