# Liberty University Statistics K-Nearest Neighbor Classification Essay

k-Nearest Neighbor Classification

The purpose of this assignment is to perform k-Nearest Neighbor classification, interpret the results, and analyze whether or not the information generated can be used to address a specific business problem.

For this assignment, you will use the “Adult Incomes” data set from the Topic Materials.

ABC Survey Company collects data via surveys that it then sells to marketing departments. Marketing departments typically do not like missing data. Since survey takers typically do not like to answer questions regarding their salary, the one question usually missing from the survey results is, “Is your annual salary $50,000 or more?”

You are the analyst who has been tasked with finding a way to impute (i.e., fill in) the answer to the question, “Is your annual salary $50,000 or more?” This answer can best be imputed from how individuals answer other survey questions related to their marital status, educational level, occupation, and familial relationship status. If this important question can be accurately imputed, then the worth of the survey data provided by ABC Survey Company increases dramatically.

*Question 1:* Using only the “Marital_Status,” “Education,” “Occupation,” and “Relationship” variables, find the number of neighbors (k) that minimizes the error rate. Use a range of k between 3 and 10. Include the “k Selection Error Log” output when submitting the answer.
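The search over k can be illustrated outside IBM SPSS Modeler with a small Python sketch. Everything here is an assumption made for illustration: the toy rows are invented (they are not the Adult Incomes data), distance is a simple count of mismatched categorical fields, and the error rate is estimated by leave-one-out, so the numbers will not match the Modeler “k Selection Error Log.”

```python
from collections import Counter

# Hypothetical toy rows: (Marital_Status, Education, Occupation, Relationship), income.
# Invented for illustration; they are not taken from the Adult Incomes data set.
ROWS = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Prof-specialty", "Wife"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Sales", "Husband"), ">50K"),
    (("Married-civ-spouse", "Masters", "Prof-specialty", "Husband"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Never-married", "HS-grad", "Other-service", "Not-in-family"), "<=50K"),
    (("Divorced", "HS-grad", "Adm-clerical", "Unmarried"), "<=50K"),
    (("Divorced", "Masters", "Exec-managerial", "Not-in-family"), ">50K"),
    (("Never-married", "Some-college", "Sales", "Own-child"), "<=50K"),
    (("Divorced", "Some-college", "Craft-repair", "Not-in-family"), "<=50K"),
]


def hamming(a, b):
    """Count the fields on which two categorical records disagree."""
    return sum(x != y for x, y in zip(a, b))


def predict(train, query, k):
    """Return the majority income label among the k rows closest to query."""
    nearest = sorted(train, key=lambda row: hamming(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_error(rows, k):
    """Leave-one-out error rate: classify each row using all the other rows."""
    wrong = sum(
        predict(rows[:i] + rows[i + 1:], features, k) != label
        for i, (features, label) in enumerate(rows)
    )
    return wrong / len(rows)


# Error rate for every candidate k in the range the question asks for (3 to 10).
error_log = {k: loo_error(ROWS, k) for k in range(3, 11)}
best_k = min(error_log, key=error_log.get)  # the smallest k wins a tie

for k, err in error_log.items():
    print(f"k={k}: error rate {err:.3f}")
print("selected k:", best_k)
```

The printed lines play the role of an error log: one error rate per candidate k, followed by the k with the lowest rate.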

*Question 2:* Using the same variables and the k selected in Question 1, rerun the nearest neighbor model using the feature selection option in IBM SPSS Modeler. What is the set of variables that minimizes the error rate? Include the “Predictor Selection Error Log” output when submitting the answer.
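One common way to pick predictors is greedy forward selection: add the variable that lowers the error rate most, and stop when no addition helps. The sketch below assumes that approach; the feature selection option in IBM SPSS Modeler may work differently. The rows, the value of `K`, the mismatch-count distance, and the leave-one-out error estimate are all hypothetical choices.

```python
from collections import Counter

FEATURES = ["Marital_Status", "Education", "Occupation", "Relationship"]
K = 3  # hypothetical stand-in for the k chosen in Question 1

# Hypothetical toy rows, invented for illustration (not the Adult Incomes data).
ROWS = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Prof-specialty", "Wife"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Never-married", "Masters", "Prof-specialty", "Not-in-family"), ">50K"),
    (("Divorced", "HS-grad", "Adm-clerical", "Unmarried"), "<=50K"),
    (("Divorced", "Masters", "Exec-managerial", "Not-in-family"), ">50K"),
]


def loo_error(rows, k, cols):
    """Leave-one-out error rate using only the feature positions in cols."""
    errors = 0
    for i, (x, label) in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        # Distance = number of selected fields on which the two rows disagree.
        nearest = sorted(train, key=lambda r: sum(r[0][c] != x[c] for c in cols))[:k]
        predicted = Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
        errors += predicted != label
    return errors / len(rows)


def forward_select(rows, k):
    """Greedily add the feature that lowers the error most; stop when none does."""
    selected, log, best = [], [], None
    remaining = list(range(len(FEATURES)))
    while remaining:
        scores = {c: loo_error(rows, k, selected + [c]) for c in remaining}
        candidate = min(scores, key=scores.get)
        if selected and scores[candidate] >= best:
            break  # no remaining feature improves on the current set
        selected.append(candidate)
        remaining.remove(candidate)
        best = scores[candidate]
        log.append(([FEATURES[i] for i in selected], best))
    return [FEATURES[i] for i in selected], log


chosen, selection_log = forward_select(ROWS, K)
for names, err in selection_log:
    print(f"{names}: error rate {err:.3f}")
print("selected predictors:", chosen)
```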

*Question 3:* Using the value of k and the set of variables that minimizes the error rate, rerun the k-Nearest Neighbor model. What is the classification table? Include the pivot table output when submitting the answer.
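A classification table cross-tabulates actual against predicted income levels. The following sketch builds one in plain Python under stated assumptions: the rows are invented, `K` is a placeholder, distance is a mismatch count, and each row is classified from all the others. It shows the shape of the table, not the output IBM SPSS Modeler would produce.

```python
from collections import Counter

K = 3  # hypothetical stand-in for the k chosen in Question 1
LABELS = ["<=50K", ">50K"]

# Hypothetical toy rows, invented for illustration (not the Adult Incomes data).
ROWS = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Prof-specialty", "Wife"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Never-married", "Masters", "Prof-specialty", "Not-in-family"), ">50K"),
    (("Divorced", "HS-grad", "Adm-clerical", "Unmarried"), "<=50K"),
    (("Divorced", "Masters", "Exec-managerial", "Not-in-family"), ">50K"),
]


def predict(train, query, k):
    """Majority label among the k rows with the fewest mismatched fields."""
    nearest = sorted(
        train, key=lambda row: sum(a != b for a, b in zip(row[0], query))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Tally (actual, predicted) pairs, classifying each row from all the others.
table = Counter()
for i, (features, actual) in enumerate(ROWS):
    table[(actual, predict(ROWS[:i] + ROWS[i + 1:], features, K))] += 1

# Overall accuracy is the share of rows on the diagonal of the table.
accuracy = sum(table[(lab, lab)] for lab in LABELS) / len(ROWS)

print(f"{'actual':<10}" + "".join(f"{'pred ' + lab:>12}" for lab in LABELS))
for actual in LABELS:
    print(f"{actual:<10}" + "".join(f"{table[(actual, p)]:>12}" for p in LABELS))
print(f"accuracy: {accuracy:.3f}")
```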

*Question 4:* Consider the following individual: Marital_Status=Never-married, Education=Masters, Occupation=Sales, and Relationship=Not-in-family. Based on the k-Nearest Neighbor model from Question 3, how would this individual be classified? Provide the predicted income level (“>50K” or “<=50K”) and explain the process that you used to determine the income level. Include the table illustrating the data when submitting the answer.
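Classifying one individual comes down to ranking the training rows by distance to that person, keeping the k closest, and taking a majority vote. The sketch below walks through those steps for the individual described in the question. The training rows and `K` are hypothetical and the distance is a simple mismatch count, so the predicted label printed here says nothing about the answer obtained from the real data.

```python
from collections import Counter

K = 3  # hypothetical stand-in for the k chosen in Question 1
QUERY = ("Never-married", "Masters", "Sales", "Not-in-family")

# Hypothetical toy rows, invented for illustration (not the Adult Incomes data).
ROWS = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Prof-specialty", "Wife"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Never-married", "Masters", "Prof-specialty", "Not-in-family"), ">50K"),
    (("Divorced", "HS-grad", "Adm-clerical", "Unmarried"), "<=50K"),
    (("Divorced", "Masters", "Exec-managerial", "Not-in-family"), ">50K"),
]


def hamming(a, b):
    """Count the fields on which two categorical records disagree."""
    return sum(x != y for x, y in zip(a, b))


# Rank the training rows by distance to the individual and keep the K closest.
# sorted() is stable, so rows at equal distance keep their original order.
ranked = sorted(ROWS, key=lambda row: hamming(row[0], QUERY))
neighbors = [(hamming(f, QUERY), f, label) for f, label in ranked[:K]]

# Majority vote among the neighbors decides the predicted income level.
votes = Counter(label for _, _, label in neighbors)
predicted = votes.most_common(1)[0][0]

for distance, features, label in neighbors:
    print(distance, features, label)
print("votes:", dict(votes))
print("predicted income level:", predicted)
```

Printing the neighbors alongside their distances gives the kind of supporting table the question asks for: it shows which records drove the vote.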

*Question 5:* Describe the model building process you used to determine whether or not a particular survey taker earned an annual salary of $50,000 or more. Include discussion of the accuracy of the k-Nearest Neighbor model and how it can be used in practice to impute the answer to the question, “Is your annual salary $50,000 or more?”
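In practice, imputation means fitting on the respondents who answered the salary question and predicting for those who did not. A minimal sketch of that step follows, assuming invented survey rows in which `None` marks a skipped answer, a placeholder `K`, and a mismatch-count distance. It is an illustration of the idea rather than the IBM SPSS Modeler workflow.

```python
from collections import Counter

K = 3  # hypothetical stand-in for the k chosen in Question 1

# Hypothetical survey rows; None marks a respondent who skipped the salary question.
SURVEY = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Prof-specialty", "Wife"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Divorced", "Masters", "Exec-managerial", "Not-in-family"), ">50K"),
    (("Married-civ-spouse", "Masters", "Prof-specialty", "Husband"), None),
    (("Never-married", "HS-grad", "Adm-clerical", "Own-child"), None),
]


def predict(train, query, k):
    """Majority label among the k rows with the fewest mismatched fields."""
    nearest = sorted(
        train, key=lambda row: sum(a != b for a, b in zip(row[0], query))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Only respondents with a recorded answer can serve as neighbors.
known = [(f, label) for f, label in SURVEY if label is not None]

# Keep recorded answers as they are; fill each missing one with the k-NN vote.
imputed = [
    (f, label if label is not None else predict(known, f, K))
    for f, label in SURVEY
]

for (f, before), (_, after) in zip(SURVEY, imputed):
    flag = "imputed" if before is None else "recorded"
    print(f, after, f"({flag})")
```

How far the imputed answers can be trusted depends on the error rate measured on respondents whose answers are known, which is why the accuracy discussion belongs alongside the imputation step.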

Course Code: MIS-655

Class Code: MIS-655-O500

Assignment Title: k-Nearest Neighbor Classification

Total Points: 95.0

| Criteria | Percentage | Unsatisfactory (0.00%) | Less than Satisfactory (65.00%) | Satisfactory (75.00%) | Good (85.00%) | Excellent (100.00%) | Comments | Points Earned |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Content | 100.0% | | | | | | | |
| k-Nearest Neighbor Classification Questions 1-4 | 30.0% | Answers to k-Nearest Neighbor questions are not included. | Answers to k-Nearest Neighbor questions are incomplete or incorrect. | Answers to k-Nearest Neighbor questions are included but contain some errors in the classification as evidenced by associated pivot charts, tables, and outputs. | Answers to k-Nearest Neighbor questions are complete and supported by associated pivot charts, tables, and outputs that are generally accurate. | Answers to k-Nearest Neighbor questions are expertly crafted and supported by pivot charts, tables, and outputs that are completely accurate. | | |
| Charts, Graphs, and Calculations | 30.0% | Charts, graphs, and calculations to support and communicate visual representations of data are not included. | Charts, graphs, and calculations to support and communicate visual representations of data are incomplete or incorrect. | Charts, graphs, and calculations to support and communicate visual representations of data are mostly included and mostly correct. | Charts, graphs, and calculations to support and communicate visual representations of data are complete and correct. | Charts, graphs, and calculations to support and communicate visual representations of data are expertly crafted. | | |
| Model Building Process and Accuracy | 30.0% | Description of the model building process and model accuracy is not included. | Description of the model building process and model accuracy is incomplete or incorrect. | Description of the model building process and model accuracy is included but lacks relevant supporting details. | Description of the model building process and model accuracy is complete and includes relevant supporting details. | Description of the model building process and model accuracy is extensive and includes numerous relevant supporting details. | | |
| Mechanics of Writing (includes spelling, punctuation, grammar, language use) | 10.0% | Surface errors are pervasive enough that they impede communication of meaning. Inappropriate word choice or sentence construction is used. | Frequent and repetitive mechanical errors distract the reader. Inconsistencies in language choice (register) or word choice are present. Sentence structure is correct but not varied. | Some mechanical errors or typos are present, but they are not overly distracting to the reader. Correct and varied sentence structure and audience-appropriate language are employed. | Prose is largely free of mechanical errors, although a few may be present. The writer uses a variety of effective sentence structures and figures of speech. | Writer is clearly in command of standard, written, academic English. | | |
| Total Weightage | 100% | | | | | | | |

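| ID | Age | Age_Category |
| --- | --- | --- |
| 1 | 34 | 25-34 |
| 2 | 36 | 35-44 |
| 3 | 38 | 35-44 |
| 4 | 49 | 45-54 |
| 5 | 47 | 45-54 |
| 6 | 54 | 45-54 |
| 7 | 29 | 25-34 |
| 8 | 41 | 35-44 |
| 9 | 39 | 35-44 |
| 10 | 29 | 25-34 |
| 11 | 22 | |
| 12 | 44 | |
| 13 | 31 | |
| 14 | 61 | |
| 15 | 35 | |
| 16 | 54 | |
| 17 | 29 | |
| 18 | 22 | |
| 19 | 29 | |
| 20 | 50 | |
| 21 | 32 | |
| 22 | 70 | |
| 23 | 26 | |
| 24 | 32 | |
| 25 | 55 | |
| 26 | 38 | |
| 27 | 25 | |
| 28 | 51 | |
| 29 | 17 | |
| 30 | 83 | |
| 31 | 52 | |
| 32 | 37 | |
| 33 | 35 | |
| 34 | 43 | |
| 35 | 25 | |
| 36 | 19 | |
| 37 | 44 | |
| 38 | 21 | |
| 39 | 30 | |
| 40 | 22 | |
| 41 | 28 | |
| 42 | 54 | |
| 43 | 36 | |
| 44 | 50 | |
| 45 | 46 | |
| 46 | 34 | |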
