BAST 1206 HCT Statistics Managerial Statistics Exam Practice
HIGHER COLLEGE OF TECHNOLOGY
DEPARTMENT: BUSINESS STUDIES
Final Examination: Assignment Based Assessment
Semester: II
A. Y.: 2019 / 2020
Start Date: Tuesday, 19 May
Time: 09.00 AM
Due Date: Thursday, 21 May
Time: 09.00 AM
Student Name:
Student ID:
Specialization:
Section: 01 to 11
Level: DIPLOMA FIRST YEAR
Course Name: MANAGERIAL STATISTICS FOR BUSINESS
Course Code: BAST1206
For official Use Only
Question No.     Max. Marks     Obtained Marks
1                5
2                5
3                5
4                5
5                5
6                5
7                5
8                5
9                5
10               5
Grand Total      50
First Marker:
Second Marker:
Date:
Date:
Guidelines for Students to Submit the Assignment:
HCTBUSDEPTAY2019-20SEM-II
1) The final assessment for Semester 2, 2019-20 will be conducted through a comprehensive assignment carrying a maximum of 50 marks. The schedule of the final assessment is available on the college website:
https://www.hct.edu.om/about/the-college/announcements/final-assessment-timetable-041620
2) Students are expected to have only one assignment at a time. Students who have more than one assignment on the same day should report this to the exam committee at exam.bus@hct.edu.om as soon as possible.
3) All students are given 48 hours to complete and submit each assignment, counted from the day, date and time the assignment is uploaded. Students are advised not to wait until the last moment before the deadline to submit the assignment.
4) Students can view the assignment at any time and any number of times once it has opened. The answer to the assignment needs to be uploaded to e-learning within 48 hours.
5) The answer to the assignment can be uploaded only once. No requests for resubmission of the assignment will be entertained.
6) Students may contact the following e-mail IDs if they face any difficulties related to the final assignment.
For academic support:
Business Courses: anand.kalimani@hct.edu.om, karri.krishna@hct.edu.om
Technical Writing 1: ramil-ecot@hct.edu.om
Technical Writing 2: khulood.aiadi@hct.edu.om
Technical Communication: jocelyn.balili@hct.edu.om
For issues related to e-mail accounts and Microsoft Teams: office365support@hct.edu.om
For issues related to E-Learning (Moodle): support.elearning@hct.edu.om
For any other IT troubleshooting: helpdesk2@hct.edu.om
7) Students may contact their respective lecturer through college email (within the 48-hour period given) if they have any doubts or need clarification on the assignment.
8) Students should be aware that this assignment is an independent assessment. Students are not allowed to get help from any other person during the assessment period.
9) Students' assignments will be checked for plagiarism using Turnitin software. This assignment will be assessed as per the College Assessment Policy. Students will be investigated in case of plagiarism as per the College policy and procedures. The maximum acceptable similarity index is 25%.
10) If students face any technical issues with the submission of the assignment, the answer can be e-mailed to the concerned lecturer within the 48-hour period using college email.
11) Any assignment submitted after the 48-hour period will not be considered for evaluation.
12) The assignment should be submitted only as an MS Word document. No other format is acceptable (e.g. pictures, JPEG, PDF, etc.).
13) Students need to answer the assignment in the prescribed number of words as mentioned in the assignment.
14) Students need to follow this format while preparing the assignment:
Font Style: Times New Roman
Font Size: 12 point for body and 14 point for headings
Line Spacing: 1.5
Margin: 2.54 cm (one inch) on all sides
Page Number: At the bottom right-hand corner of each page
Colour: All words should be in black
15) Students who fail to submit their assignment by the deadline are required to make an online appeal, with valid excuses, following the guidelines that will be announced through the college website or e-learning portal, within three days of the submission deadline.
ANSWER ALL THE QUESTIONS, SHOW ALL CALCULATIONS, WORKING NOTES AND EXPLANATION
(EACH QUESTION WILL CARRY 05 MARKS)
Q 1: Mr. Irfan, Plant Manager of Al Khuwair Furniture LLC, has recently installed two plants, A and B, for the production of 2-seater polyester sofas.
The productivity of plant A for the past 10 days is 9, 14, 10, 8, 12, 16, 9, 12, 8 and 14 sofas.
The productivity of plant B for the past 10 days is 10, 14, 7, 9, 10, 11, 8, 13, 10 and 9 sofas.
a) Find out which plant is more consistent in productivity based on Standard Deviation (SD) and
give reason for your answer.
b) Which method will give more precise results, Coefficient of Variation (CV) or Standard
Deviation? Discuss analytically.
(3+2=5 Marks)
Q 2: During COVID-19 lockdown, Mr. Umair, a Sales Manager of Oasis Retail LLC is worried about
the sales and revenue of the store. He decided to find out the average sales revenue for the last 60 days.
He collected the revenue details of last 60 days from the accounts.
Revenue (’00 RO)     No. of days
00-10                9
10-20                5
20-30                10
30-40                10
40-50                9
50-60                8
60-70                3
70-80                6
a) Find out the average revenue of the Oasis Retail LLC for the last 60 days
b) Find out the value which has occurred most frequently in the given data set.
c) Critically compare measures of central tendency mean and mode.
(2+2+1=5 Marks)
Q 3: The Times of Oman, a leading newspaper, organized a reading competition for students. The
competition was attended by 40 male students and 20 female students. The results show that
18 male students and 15 female students obtained an A grade, and the remaining students got either a B or a C
grade.
a) If a student is selected at random, what is the probability that the student is a
female student?
b) If a student is selected at random, what is the probability that the student is a male student?
c) If a student is selected at random, what is the probability of getting a male student or an A-grade
student?
d) If a student is selected at random, what is the probability of getting a male student or a female
student?
e) What will happen if a probability is more than 1? Discuss critically.
(5Q*1M=5 Marks)
Q 4: Mr. Ahmed, a manager at PVR Multiplex Muscat, has conducted a survey to investigate
the number of people who visited the theatre in the last 30 days.
The numbers of people who visited the theatre in the last 30 days are: 30, 36, 39, 37, 36, 54, 38, 33, 37, 33,
33, 36, 32, 48, 42, 38, 36, 35, 30, 33, 39, 37, 33, 36, 30, 39, 44, 32, 44 and 50.
a) Find out the grouped frequency distribution using 5 classes.
b) Which graph will be suitable to present the grouped frequency distribution? Give reasons to
your answer
c) Cumulative frequency and relative frequency are same. Discuss critically
(3+1+1=5 Marks)
Q 5: Al Matra LLC has recently hired Mr. Musab, a professional market researcher. Mr. Musab is
interested to find out relationship between cost incurred and revenue earned by the company. He
collected the data of last 7 years from the accounts department, but he is confused about the method
which can find out the relationship.
Year     Cost Incurred (OMR in Millions)     Revenue Earned (OMR in Millions)
2013     8                                   20
2014     10                                  22
2015     9                                   22
2016     7                                   18
2017     11                                  20
2018     12                                  24
2019     10                                  25
a) Suggest Mr. Musab a suitable method to find out the relationship between cost and revenue.
b) Find out the correlation value and give interpretation of the result
c) If correlation value is 1.25, discuss the result analytically.
(1+3+1=5 Marks)
Q 6 : (Using the Q. 5 data) If the cost incurred is OMR 20 Million, what will be the volume of revenue?
(05 Marks)
Q.7: Mr. Hamood, a research scholar of Business Department in HCT, would like to find out the
association between the mid-term marks, quiz marks, assignment marks and end semester marks with
the Total Marks/GP obtained by the students in the final exam. He has decided to collect the data from
the college students studying in the Sultanate of Oman. He came to know that there are more than
138,000 students studying in all the different colleges in Oman and realized that it is not possible to
collect the data from all the students considering the time to complete the study.
(5Q*1M=5 Marks)
a) How will you decide the sample size? Discuss with reasons
b) Find out the dependent variable and independent variables
c) Who are the respondents?
d) How will you collect the data? Discuss the instrument and process
e) Inferential statistics is applicable. Discuss analytically
Q.8: Following are the details of survey conducted by a survey firm Mini LLC. They would like to
find out the trend of urban population since 2013.
Year     Population     Urban Population
2013     2,843,415      2,086,983
2014     3,041,434      2,285,997
2015     4,267,348      3,416,565
2016     4,479,219      3,650,429
2017     4,665,928      3,874,042
2018     4,829,473      4,083,206
2019     4,974,986      4,273,762
2020     5,106,626      4,442,970
a) Which method, moving average or semi-average, is suitable for finding the trend?
b) Find the 3-yearly trend values using the moving average method and comment on the trend.
c) Critically compare the moving average and semi-average methods.
(1+3+1=5 Marks)
Q 9: In Bank Muscat Al Khuwair branch, there are 32 employees and average weight of the branch is
48 kg and average height is 155cm excluding branch manager.
Find out the following:
a) If 08 employees having weight of 44 kg, 46 kg, 48 kg, 47 kg, 40 kg, 42 kg, 41 kg and 40 kg
have been terminated, what will be the average weight of the branch excluding branch
manager?
(02 Marks)
b) If 08 new employees having height 140 cm, 135 cm, 137 cm, 135 cm, 145 cm, 142 cm, 143
cm and 140 cm joined, what will be the mean height of the branch excluding branch manager?
(02 Marks)
c) If the weight of the branch manager (76 kg) is included, what will be the mean weight of the
branch?
(01 Mark)
Q 10: Mr. Adheel is a manager with a construction company in Ibra, and his friend Mr. Anees
is a manager with a construction company in Muscat. Their average weekly expenditure on milk
products over the last 07 years is given below:
Year     Expenditure in Muscat (RO)     Expenditure in Ibra (RO)
2013     30                             12
2014     33                             15
2015     30                             30
2016     28                             27
2017     34                             18
2018     24                             23
2019     27                             24
a) If cost of living is positively correlated with milk expenditure, which city will have consistent
cost of living? Find out using coefficient of variation
b) Why do you think coefficient of variation is more precise than Standard deviation? Discuss
critically
(3+2=5 Marks)
INTRODUCTION TO STATISTICS
Statistics
❑ The science of collecting, organizing, presenting,
analyzing, and interpreting data to assist in making
more effective decisions
❑ Statistical analysis – used to manipulate, summarize,
and investigate data to provide useful information
for decision-making.
Why study statistics?
1. Data are used everywhere.
2. Statistical techniques are used to make many decisions that affect
our day to day lives
3. Irrespective of your career, you will make professional decisions
that involve data. An understanding of statistical methods will help
you make these decisions effectively.
Types of statistics
• Descriptive statistics – Methods of organizing, summarizing, and
presenting data in an informative way
• Inferential statistics – The methods used to draw conclusions about a
population on the basis of a sample
• Population – The entire set of individuals or objects of interest, or the
measurements obtained from all individuals or objects of interest.
In everyday usage, "population" means a number of people, but in statistics it
refers to any complete set of individuals, objects, or measurements of interest.
• Sample – A portion, or part, of the population of
interest
Descriptive Statistics
• Collect data
• e.g., Survey
• Present data
• e.g., Tables and graphs
• Summarize data
• e.g., Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
Examples of Descriptive Statistics Tools
• Mean
• Weighted mean
• Median
• Quartiles
• Mode
• Variance
• Range
• Mid range
Inferential Statistics
• Inference is the process of drawing conclusions or making
decisions based on sample results about a population
• Estimation. For example, estimate the population average
height using the sample average height.
• Hypothesis testing. For example, test the claim that the
population average height is 161 cm.
• A hypothesis is an assumption, presumption, or claim.
Sampling
A sample should have the same characteristics of the population from which sample is drawn.
Sampling methods can be:
• Random sampling: each member of the population has an equal chance of being selected. A
simple random sample is an unbiased surveying technique.
• Non-random sampling: a sampling technique in which samples are gathered by a process that
does not give all individuals in the population an equal chance of being selected (biased).
The actual process of sampling causes sampling errors. For example, the sample may not be large
enough or representative of the population.
Factors not related to the sampling process cause non-sampling errors. A defective counting device
can cause a non-sampling error.
Statistical data
The collection of data that are relevant to the problem being
studied is commonly the most difficult, expensive, and time-consuming part of the entire research project.
Statistical data are usually obtained by counting or measuring
items.
• Primary data are collected for the first time,
specifically for the analysis desired. Methods are:
➢ Questionnaire
➢ Interview
➢ Observation
➢ Projective technique
• Secondary data have already been compiled
and are available for further statistical analysis
Questionnaire
Definition: Questionnaire is a
set of questions for obtaining
statistically useful or personal
information from individuals.
(www.merriam-webster.com)
Source: Bateh et al. (2015), Using Statistics for Better Business Decisions, Business Expert Press, ProQuest Ebook Central, http://ebookcentral.proquest.com/lib/momp/detail.action?docID=4201910.
Types of Data
Data means information in raw or unorganized form (such as alphabets, numbers, or
symbols) that refer to, or represent, conditions, ideas, or objects. Data is limitless and
present everywhere in the universe.
(Read more:http://www.businessdictionary.com/definition/data.html)
Statistical data are usually obtained by counting or measuring items. Most data can be put
into the following categories:
• Qualitative – Qualitative data are generally described by words or letters (gender, blood
groups, hair colour, etc.). Many numerical techniques do not apply to qualitative
data. For example, it does not make sense to find an average hair colour or blood type.
• Quantitative – Quantitative data are observations that are measured on
a numerical scale (distance traveled to college, number of
children in a family, etc.)
Quantitative data can be separated into two subgroups:
• discrete, if it is the result of counting (the number of students of a given ethnic group in
a class, the number of books on a shelf, …)
• continuous, if it is the result of measuring (distance traveled, weight of luggage, …)
Variables and Distributions
We finally need a flexible term to denote what is being measured
through the sample survey. That term is variable.
A variable is an item of interest that can take on many different
numerical values.
• Variable is the term used to record a particular characteristic of the
population we are studying. Example: Marks, Age, Gender, etc
• For example, if our population consists of pictures taken from Mars,
we might use the following variables to capture various
characteristics of our population:
• Quality of a picture • Title of a picture • Latitude and longitude of the
center of a picture • Date the picture was taken
Variables and Distributions
• It is useful to put variables into different categories, as different statistical procedures
apply to different types of variables. Variables can be categorized into two broad
categories, numerical and categorical:
• Categorical variables are variables that have a limited number of distinct values or
categories. They are sometimes called discrete variables.
Categorical variables again split up into two groups, ordinal and nominal variables.
• Ordinal variables represent categories with some intrinsic order (e.g., low,
medium, high; or strongly agree, agree, disagree, strongly disagree). Ordinal
variables could consist of numeric values that represent distinct categories (e.g., 1 =
low, 2 = medium, 3 = high). These numbers are merely codes.
• Nominal variables represent categories with no intrinsic order (e.g., job category,
company division, and race). Nominal variables could also consist of numeric values
that represent distinct categories (e.g., 1 = male, 2 = female).
• Numeric variables refer to characteristics that have a numeric value. They are usually
continuous variables, that is, all values in an interval are possible.
Variables and Distributions
• Example:
An experiment is conducted to test whether a particular drug will
successfully lower the blood pressure of people. The data collected
consists of the sex of each patient, the blood pressure measured, and
the date the measurement took place. The blood pressure is measured
three times, once before the patient was treated, then one hour after
administrating the drug, and again two days after administrating the
drug. What variables comprise this experiment?
Variables and Distributions
• The distribution of a variable refers to the set of all possible
values of a variable and the associated frequencies or
probabilities with which these values occur.
• Sometimes variables are distributed so that all outcomes are
equally, or nearly equally likely. Other variables show results that
“cluster” around one (or more) particular value.
• A heterogeneous distribution is a distribution of values of a
variable where all outcomes are nearly equally likely.
• A homogeneous distribution is a distribution of values of a
variable that cluster around one or more values, while other
values are occurring with very low frequencies or probabilities.
Statistics and Microsoft Excel
We recommend spending some time using the resources available on
Microsoft’s website at https://support.office.com. Search for “Basic
Tasks in Excel.”
Frequency Distributions and
Graphical Presentation of Data
Frequency distribution
Frequency distribution is the organization of raw data in a table form,
using classes and frequencies.
Types of frequency distribution
• Categorical frequency distribution
• Ungrouped frequency distribution
• Grouped frequency distribution
Categorical frequency distribution
• Categorical Frequency Distributions are used for data that can be
placed in specific categories, such as nominal or ordinal level data.
Example: Educational Qualifications of 40 individuals.
Example
Following are the grades scored by the students in Business
Mathematics. You are required to construct a frequency distribution.
A  C  D  D  C  D  A
B  B  C  C  B  C  B
A  A  C  F  A  C  A
C  A  D  F  A  D  C
D  D  A  D  D  A  D
F  D  A  A  D  A  F
B  F  B  A  F  B  B
D  F  B  C  F  B  D
A  F  A  D  F  A  A
Answer
Grade     Tally Bars     Frequency
A
B
C
D
F
Ungrouped Frequency Distribution
• The frequency is the number of times a particular data point occurs in
the set of data.
Example:
Given below are the marks obtained by 20 students in Accounting out
of 25.
21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25, 21, 19, 19,
19
Marks     Tally Bars     Frequency
12        I              1
15        II             2
17        III            3
19        IIIII          5
21        III            3
23        IIII           4
25        II             2
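The tally above can be checked programmatically. The following is a minimal Python sketch, assuming the standard-library `collections.Counter`; the variable names `marks` and `frequency` are illustrative and not part of the course material.

```python
from collections import Counter

# Marks of the 20 students, as listed in the example above.
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

# Counter maps each distinct mark to the number of times it occurs.
frequency = Counter(marks)

# Print the ungrouped frequency distribution in ascending order of marks.
for mark in sorted(frequency):
    print(mark, frequency[mark])
```

Each printed row corresponds to one row of the Marks / Frequency table.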
Example
• From the below information construct an ungrouped frequency
distribution table.
Answer
Variable
5
10
15
20
Tally Bar
Frequency
Grouped Frequency Distribution
• In grouped frequency distribution, we need to find classes and frequencies.
Important terms in Grouped Frequency Distribution:
1.Class interval/width
2.Class limit
3.Inclusive method
4.Exclusive method
Steps:
1. Find the highest and lowest values
2. Find the range
3. Decide the number of classes required
4. Find the class width
5. Find the upper and lower class limits
6. Tally the data
7. Find the frequencies
Example
Weekly wages for the 15 workers in a company are listed below. Construct a frequency
distribution with 5 classes.
21, 10, 16, 32, 23, 23, 26, 20, 29, 34, 18, 22, 19, 27, 20
Solution:
Step 1: Determine the classes.
➢Find the highest and lowest value. H = 34 and L = 10
➢Find the range: R = Highest value – Lowest value
➢R = 34 – 10 = 24
➢Select the number of classes desired (usually between 5 and 20). In this case, 5 classes
➢Find the class width by dividing the range by the number of classes
➢Width = (Range)/(Number of classes) = 24/5 = 4.8 (round off to 5)
Example-continued
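• Step 2: Tally the data
• Step 3: Find the numerical frequencies from the tallies
Classes follow the exclusive method: a value equal to an upper class limit (for example 20) is counted in the next class.
Class Limit     Tally      Frequencies
10 – 15         I          1
15 – 20         III        3
20 – 25         IIIIII     6
25 – 30         III        3
30 – 35         II         2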
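The steps of this example can be sketched in Python as follows. This is a minimal illustration, assuming the exclusive method (a value equal to an upper class limit is counted in the next class) and a class width rounded up to a whole number; the variable names are illustrative only.

```python
# Weekly wages of the 15 workers from the example above.
wages = [21, 10, 16, 32, 23, 23, 26, 20, 29, 34, 18, 22, 19, 27, 20]

num_classes = 5
lowest, highest = min(wages), max(wages)
data_range = highest - lowest            # 34 - 10 = 24
width = -(-data_range // num_classes)    # ceiling division: 24 / 5 rounds up to 5

# Exclusive method: a class holds the values with lower <= x < upper.
classes = [(lowest + i * width, lowest + (i + 1) * width)
           for i in range(num_classes)]
frequencies = [sum(1 for x in wages if lower <= x < upper)
               for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower} - {upper}: {f}")
```

Counting directly from the raw data in this way is a useful check on a hand tally, since a value sitting on a class boundary is easy to place in the wrong class.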
Cumulative Frequencies and Relative Frequencies
Class Limit     Frequencies     Cumulative Frequencies     Relative Frequencies (%)
10 – 15         1               1                          1÷15 = 6.67%
15 – 20         3               1+3 = 4                    3÷15 = 20%
20 – 25         6               1+3+6 = 10                 6÷15 = 40%
25 – 30         3               1+3+6+3 = 13               3÷15 = 20%
30 – 35         2               1+3+6+3+2 = 15             2÷15 = 13.33%
Total           15                                         100%
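Cumulative and relative frequencies follow mechanically from the class frequencies. The sketch below is a hedged illustration that recounts the frequencies from the 15 weekly wages of the earlier example (exclusive method assumed) and uses the standard-library `itertools.accumulate`; the names are illustrative.

```python
from itertools import accumulate

# Weekly wages from the grouped frequency example, and its class limits.
wages = [21, 10, 16, 32, 23, 23, 26, 20, 29, 34, 18, 22, 19, 27, 20]
limits = [10, 15, 20, 25, 30, 35]

# Frequency of each class, with lower <= x < upper (exclusive method).
frequencies = [sum(1 for x in wages if lower <= x < upper)
               for lower, upper in zip(limits, limits[1:])]
total = sum(frequencies)

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Relative frequency: each class frequency as a percentage of the total.
relative = [round(100 * f / total, 2) for f in frequencies]

for row in zip(limits, limits[1:], frequencies, cumulative, relative):
    print(row)
```

The last cumulative frequency always equals the total number of observations, and the relative frequencies add up to 100% (apart from rounding).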
Exercise 1
• Following are the marks obtained by 30 students in an
examination. Prepare a grouped frequency distribution with 6 classes.
Also find out cumulative frequencies and relative frequencies.
25, 15, 26, 65, 41
55, 35, 54, 78, 55
65, 22, 32, 62, 22
45, 13, 15, 16, 46
62, 19, 42, 33, 24
Exercise 2
Construct a grouped frequency distribution with 7
classes from the data below. Also find out cumulative frequencies and
relative frequencies.
102, 104, 140, 136, 152, 132, 158, 193
128, 141, 130, 133, 147, 148, 141, 129
133, 137, 179, 147, 152, 114, 124, 138
129, 164, 135, 128, 139, 154, 168, 148
152, 116, 107, 136, 167, 143, 139, 152
Graphical Presentation
BAR DIAGRAM/CHART
• A bar chart presents categorical data/ungrouped frequency distributions.
• A bar diagram makes it easy to compare sets of data between different
groups at a glance.
• The graph represents categories on one axis and frequencies in the other.
• Examples:-
Example
• The table shown here displays the number of students joined in
different specialization in a college during 2018. Construct a bar chart
for the data.
Specializations
Frequency
Accounting
Human
Resource
200
Marketing
120
E-Business
90
260
Pie Diagram/Chart
• Pie charts represent categorical data. Not suitable for numerical data.
• Pie charts are useful when the categories are not numerous( eight
categories or less)
• Pie charts are generally used to show percentage or proportional data.
• Each percentage is represented by a slice of pie.
Example
• The favorite flavors of ice-cream for the children in a locality are given
below. Draw a pie chart to represent the given information.
Flavors         Number of children
Vanilla         50
Strawberry      30
Chocolate       20
Kesar-Pista     60
Mango Zap       40
Answer
Flavors         Percentage of Children
Vanilla         (50/200)*100 = 25%
Strawberry      15%
Chocolate       10%
Kesar-Pista     30%
Mango Zap       20%
[Pie chart: Vanilla 25%, Strawberry 15%, Chocolate 10%, Kesar-Pista 30%, Mango Zap 20%]
Note: Pie chart is drawn based on percentage of frequencies
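The percentages in the answer can be reproduced with a short Python sketch. This is an illustration only; the slice angles (each share of 360 degrees) are an addition that the original answer does not show, and the variable names are assumptions.

```python
# Number of children preferring each flavor, from the example above.
children = {
    "Vanilla": 50,
    "Strawberry": 30,
    "Chocolate": 20,
    "Kesar-Pista": 60,
    "Mango Zap": 40,
}

total = sum(children.values())  # 200 children in all

# Percentage of the whole represented by each category.
percentages = {flavor: 100 * count / total for flavor, count in children.items()}

# Angle of each pie slice in degrees (not shown in the original answer).
angles = {flavor: 360 * count / total for flavor, count in children.items()}

for flavor in children:
    print(flavor, percentages[flavor], angles[flavor])
```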
Histogram
Histograms are used to represent the grouped frequency distribution.
Steps for drawing histogram:
• Step 1: Draw and label the x and y axes. The x axis is always the horizontal
axis, and the y axis is always the vertical axis.
• Step 2: Represent the frequency on the y axis and the class limits on the x
axis.
• Step 3: Using the frequencies as the heights, draw vertical bars for each
class.
Example
Class Limit     Frequencies
10 – 15         1
15 – 20         3
20 – 25         5
25 – 30         4
30 – 35         2
Draw a histogram from the above data.
CHAPTER 3
DESCRIPTIVE STATISTICS
MEASURES OF CENTRAL TENDENCY
ARITHMETIC MEAN/AVERAGE
Definition: The mean represents the average of all observations.
Mean = (sum of all measurements)/(number of measurements).
The Greek letter μ (mu) is used to denote the mean of the entire
population, or population mean.
The symbol x̅ (read as “x bar”) is used to denote the mean of a
sample, or sample mean.
Mean from Raw data.
• Formula: $\bar{x} = \frac{\sum X}{N}$
• Example:1. Find Mean from the following score by a student in 10 different
tests.
25, 65, 32, 46, 28, 34, 52, 64, 68, 36
2. Find the average profit of 6 small trading companies in OMR
250, 630, -330, 450, 350, 510
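Both examples apply the raw-data formula directly: add the values and divide by their count. A minimal Python sketch, with illustrative variable names, is:

```python
# Example 1: scores of a student in 10 different tests.
scores = [25, 65, 32, 46, 28, 34, 52, 64, 68, 36]

# Example 2: profits of 6 small trading companies in OMR (a loss is negative).
profits = [250, 630, -330, 450, 350, 510]

# Mean = sum of all observations / number of observations.
mean_score = sum(scores) / len(scores)
mean_profit = sum(profits) / len(profits)

print(mean_score, mean_profit)
```

Note that the negative value in the second list (a loss) is added with its sign, so it lowers the average.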
Mean from Ungrouped data
• Formula: $\bar{x} = \frac{\sum fX}{N}$
Example: Find the mean wages of 100 workers
Wages in OMR (X)     No. of workers (f)
5                    15
8                    18
10                   30
12                   15
15                   10
18                   7
20                   5
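For a frequency table, each value is multiplied by its frequency before summing. The following sketch applies this to the wages example; it is a minimal illustration with assumed variable names.

```python
# Wages in OMR (X) and the number of workers earning each wage (f).
wages = [5, 8, 10, 12, 15, 18, 20]
workers = [15, 18, 30, 15, 10, 7, 5]

n = sum(workers)                                       # N = total frequency
sum_fx = sum(f * x for x, f in zip(wages, workers))    # sum of f * X

mean_wage = sum_fx / n
print(n, sum_fx, mean_wage)
```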
Example No. 2 – Find the average cash flow of a shop
for 49 days.
Daily income in RO     No. of days
100                    5
-150                   8
230                    12
350                    10
400                    7
425                    4
450                    3
Mean from grouped data
Example: Find the average marks of 33 students in a class.
Marks     No. of Students
0-10      2
10-20     6
20-30     8
30-40     10
40-50     4
50-60     2
60-70     1
Formula: $\bar{x} = \frac{\sum f X_m}{N}$, where $X_m$ is the mid-point of each class of X values.
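With grouped data, the mid-point of each class stands in for the individual values. A minimal Python sketch for the marks example, using illustrative names, is:

```python
# Class limits (lower, upper) and number of students in each class.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
students = [2, 6, 8, 10, 4, 2, 1]

# Mid-point of each class: (lower + upper) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

n = sum(students)
sum_fxm = sum(f * xm for xm, f in zip(midpoints, students))

mean_marks = sum_fxm / n
print(n, sum_fxm, round(mean_marks, 2))
```

Because mid-points replace the actual marks, the result is an estimate of the mean rather than its exact value.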
Example No. 2
• Find the average monthly income of 52 families in Muscat
Income in OMR     No. of Families
00-50             5
50-100            10
100-150           12
150-200           15
200-250           6
250-300           3
300-350           1
What are the Advantages and Disadvantages of
Mean?
Advantages:
The mean is easier to compute than the median since it does not
require sorted observations.
• The mean is based on each value of a variable that makes it more
useful than the median.
Disadvantage:
• The mean is influenced by extreme values.
Median
• The median is the value that divides an ordered series of numbers into two equal parts.
• Definition: The median is that number from a population or sample
chosen so that half of all numbers are larger and half of the numbers are
smaller than that number.
Median from Raw data
• Rule: 1. Arrange the data in ascending order.
2. If N is an odd number, the median is the (N+1)/2th value.
3. If N is an even number, (N+1)/2 falls between two positions; take the
average of the two middle values (the N/2th and the (N/2 + 1)th values).
Median from ungrouped data
• Step 1. Find cumulative frequency
• Step 2 Use the rule (N+1)/2th value.
• Example: Find the median from the following marks
Marks     Number of Students     Cumulative frequency
10        2
20        5
30        10
40        12
50        4
60        3
70        1
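The two steps above can be sketched in Python. This is a minimal illustration that assumes N is odd, as it is here (N = 37), so that (N+1)/2 is a whole position; the variable names are illustrative.

```python
from itertools import accumulate

# Marks and the number of students obtaining each mark.
marks = [10, 20, 30, 40, 50, 60, 70]
students = [2, 5, 10, 12, 4, 3, 1]

n = sum(students)
cumulative = list(accumulate(students))   # Step 1: cumulative frequency
position = (n + 1) / 2                    # Step 2: the (N+1)/2 th value

# The median is the first mark whose cumulative frequency reaches the position.
median = next(m for m, cf in zip(marks, cumulative) if cf >= position)

print(cumulative, position, median)
```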
Median from Grouped data
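• Step 1. Find the cumulative frequency
• Step 2. Use the rule N/2th value to locate the median class.
• Step 3. Use the following formula to find the exact value of the median:
$\text{Median} = L + \frac{\frac{N}{2} - cf}{f} \times c$
Where L = lower limit of the median class
cf = cumulative frequency of the class preceding the median class
f = frequency of the median class
c = class width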
Median from Grouped data-Example
• Find the median income of 52 families in Muscat
Income in OMR     No. of Families
00-50             5
50-100            10
100-150           12
150-200           15
200-250           6
250-300           3
300-350           1
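A hedged Python sketch of this example follows. It assumes the standard interpolation formula for the median of grouped data, Median = L + ((N/2 − cf) / f) × c, where L is the lower limit of the median class, cf the cumulative frequency before it, f its frequency and c the class width; the variable names are illustrative.

```python
# Lower class limits and number of families per class (class width 50).
lower_limits = [0, 50, 100, 150, 200, 250, 300]
families = [5, 10, 12, 15, 6, 3, 1]
width = 50

n = sum(families)
half = n / 2                 # the N/2 th value locates the median class

cum_before = 0               # cumulative frequency before the current class
for lower, f in zip(lower_limits, families):
    if cum_before + f >= half:
        # Median class found: interpolate within it.
        median = lower + (half - cum_before) / f * width
        break
    cum_before += f

print(n, half, lower, round(median, 2))
```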
Mode
• The mode is the value that occurs most often (with the highest
frequency) in the data set.
• A data set can have more than one mode or no mode at all.
• A data set may be
➢Uni-modal : Only one mode
➢Bi-modal : Two modes
➢Multi-modal: More than two modes
Mode from Raw data
• Value that has highest frequency is the mode.
Examples:
• 1,2,3,4,5,6,7,8,9,10 – No Mode
• 1,2,2,3,4,5,6,7,8,9 – Mode is 2 (Uni-modal)
• 1,2,2,3,4,5,5,6,7,8 – Modes are 2 and 5( Bi-Modal)
• 1,1,2,3,3,4,5,5,6,7,7– Modes are 1,3,5 and 7 ( Multi modal)
• 1,4,3,3,2,4,3,5,4,2,6,4,7,4– Mode is 4 ( Uni modal)
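The examples above can be checked with a small helper. This is a sketch only: `modes` is a hypothetical function name, and it follows the convention used in these notes that a data set in which every value occurs once has no mode.

```python
from collections import Counter

def modes(values):
    """Return the list of modes, or an empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []            # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 2, 3, 4, 5, 5, 6, 7, 8]))
```

A result with one element is uni-modal, two elements bi-modal, and more than two multi-modal.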
Mode from Ungrouped data
• Value that has highest frequency is the mode.
• Find the mode from the following data:
Size of Shoe     Frequency
38               34
39               45
40               60
41               23
42               22
Mode from Grouped data
Step 1: Find the modal class, that is, the class with the highest frequency.
Step 2: Use the following formula to find the exact value of the mode:
$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times c$
Where L = Lower limit of the modal class
f1 = frequency of the modal class
f0 = frequency of the class preceding the modal class
f2 = frequency of the class succeeding the modal class
c = class width
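As an illustration of the two steps, the sketch below assumes the standard grouped-data formula Mode = L + ((f1 − f0) / (2·f1 − f0 − f2)) × c and applies it to the income table of 52 families used earlier in these notes (reusing that table here is a choice made for illustration). It also assumes the modal class is neither the first nor the last class.

```python
# Lower class limits and frequencies from the 52-families income table.
lower_limits = [0, 50, 100, 150, 200, 250, 300]
families = [5, 10, 12, 15, 6, 3, 1]
c = 50                                   # class width

# Step 1: the modal class is the class with the highest frequency.
i = families.index(max(families))
L = lower_limits[i]                      # lower limit of the modal class
f1 = families[i]                         # frequency of the modal class
f0 = families[i - 1]                     # frequency of the preceding class
f2 = families[i + 1]                     # frequency of the succeeding class

# Step 2: interpolate within the modal class.
mode = L + (f1 - f0) / (2 * f1 - f0 - f2) * c
print(L, f0, f1, f2, mode)
```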
Mode from Grouped data-Example
Find Mode from the following marks:
MEASURES OF VARIABILITY
Meaning of Variability
• Variability measures how much values in a set of data differ from
each other.
• Variability is also called dispersion or spread.
• Data sets with similar values are said to have little variability, while
data sets that have values that are spread out have high variability.
Methods of measuring variability
• Range
• Quartile deviation
• Variance
• Standard deviation
• Co-efficient of variation
RANGE
• The range is the difference between the largest and the smallest
value of the data set.
Example: Suppose two machines produce nails which are on average
10 inches long. A sample of 11 nails is selected from each machine.
Machine A: 6, 8, 8, 10, 10, 10, 10, 10, 12, 12, 14
Machine B: 6, 4, 6, 8, 8, 10, 12, 12, 14, 14, 14
• For Machine A data, the range is 14 – 6 = 8
• For Machine B data, the range is 14 – 4 = 10
Conclusion: Performance of machine A is better than machine B as nails
produced by machine A show less variation compared to machine B.
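The range calculation for the two machines can be written as a short Python sketch; the function name `data_range` is illustrative.

```python
# Lengths of the 11 sampled nails from each machine.
machine_a = [6, 8, 8, 10, 10, 10, 10, 10, 12, 12, 14]
machine_b = [6, 4, 6, 8, 8, 10, 12, 12, 14, 14, 14]

def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

range_a = data_range(machine_a)
range_b = data_range(machine_b)
print(range_a, range_b)
```

The range uses only the two extreme values, so it is quick to compute but ignores how the remaining values are spread.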
Quartile Deviation from raw data
• Quartile deviation= (Quartile3-Quartile 1)/2
Quartile Deviation = (29.50-16.25)/2= 6.625
Quartile deviation from ungrouped data
• Q1 = (n + 1) / 4 th value (using cf)
• Q3 =3 (n + 1) / 4 th value( using cf)
• Quartile deviation= (Quartile3-Quartile 1)/2
• Example: Find quartile deviation from below information:
Quartile deviation from grouped data
Formula :Modify median formula accordingly.
Q1– N/4 instead of N/2
Q3 — 3N/4 instead of N/2
Quartile deviation= (Quartile3-Quartile 1)/2
Standard Deviation and Variance
• The Standard deviation measures the spread of the data around the
mean.
• Standard deviation is the absolute measure of dispersion which
expresses variation in the same units as the original data.
• Variance is the square of standard deviation.
• Standard deviation is denoted by “σ” or “s”
• Variance is denoted by “σ²”or “s²”
Standard Deviation and Variance from raw data
• Formula: $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$, and the variance is $\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$
Example: Suppose two machines produce nails which are on average 10 inches long. A
sample of 7 nails is selected from each machine.
Machine A: 6, 10, 8, 10, 12, 10, 14
Machine B: 8, 4, 8, 10, 12, 14, 14
Find the value of standard deviation and variance of size of nails for each machine.
Standard Deviation and Variance from raw data
Answer
Machine B
X      X − X̄     (X − X̄)²
8      −2        4
4      −6        36
8      −2        4
10     0         0
12     2         4
14     4         16
14     4         16
∑(X − X̄)² = 80
Machine A: Mean = 70/7 = 10, Variance = 40/7 = 5.71, Standard deviation = √5.71 = 2.39
Machine B: Mean = 70/7 = 10, Variance = 80/7 = 11.43, Standard deviation = √11.43 = 3.38
Conclusion: Based on standard deviation, nails produced by machine A are better than nails
produced by machine B, as variability is comparatively less for nails produced by machine A.
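The calculation can be verified with Python's standard-library `statistics` module. This sketch assumes the population formulas (dividing by N), which is what the Machine A figures of 5.71 and 2.39 imply, and takes Machine B's values from the working table (8, 4, 8, 10, 12, 14, 14); the variable names are illustrative.

```python
from statistics import pstdev, pvariance

# Nail lengths for each machine (Machine B as shown in the working table).
machine_a = [6, 10, 8, 10, 12, 10, 14]
machine_b = [8, 4, 8, 10, 12, 14, 14]

# pvariance and pstdev divide by N (population formulas).
var_a, sd_a = pvariance(machine_a), pstdev(machine_a)
var_b, sd_b = pvariance(machine_b), pstdev(machine_b)

print(round(var_a, 2), round(sd_a, 2))
print(round(var_b, 2), round(sd_b, 2))
```

If the data were treated as a sample estimate instead, `statistics.variance` and `statistics.stdev` (dividing by N − 1) would be the corresponding functions.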
Standard Deviation and Variance from Ungrouped Data
• Formula for standard deviation: $\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2}$, where $N = \sum f$
• Example:
Marks     Number of students
10        2
20        5
30        10
40        8
50        3
Find standard deviation and variance
Standard Deviation and Variance from Ungrouped Data
Answer
X      f      f(x)     x²     f(x²)
10     2
20     5
30     10
40     8
50     3
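The columns of the answer table (f(x), x² and f(x²)) suggest the computational formula σ = √(Σfx²/N − (Σfx/N)²); that formula is an assumption here, since it is not printed in the notes. A minimal Python sketch under that assumption:

```python
from math import sqrt

# Marks (x) and number of students (f) from the example above.
marks = [10, 20, 30, 40, 50]
students = [2, 5, 10, 8, 3]

n = sum(students)                                          # N = sum of f
sum_fx = sum(f * x for x, f in zip(marks, students))       # sum of f(x)
sum_fx2 = sum(f * x ** 2 for x, f in zip(marks, students)) # sum of f(x^2)

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2
sd = sqrt(variance)

print(n, sum_fx, sum_fx2, round(variance, 2), round(sd, 2))
```

The same steps apply to grouped data once each class is replaced by its mid-point.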
Standard Deviation and Variance from Grouped Data
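Formula: $\sigma = \sqrt{\frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2}$, where x is the mid-point of each class and $N = \sum f$
Marks     Students (f)     Mid-point (x)     f(x)     x²     f(x²)
0-10      2
10-20     5
20-30     10
30-40     8
40-50     3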
Co-efficient of Variation(CV)
• The Coefficient of Variation(CV) expresses the standard deviation as a
percentage of mean.
• Researchers can easily compare variability of more than one variable
using CV.
• Formula: $CV = \frac{\text{Standard deviation}}{\text{Mean}} \times 100$
Co-efficient of Variation(CV)-Example 1
• Following are the mean and standard deviation of marks scored by
students in two different sections of Managerial Statistics.
• Section 1– Mean = 64.43, SD = 12.02,
• Section 2– Mean = 46.68, SD = 14.76,
1. Calculate coefficient of variation.
2. Which section shows higher performance of students? Give reason.
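As a quick check of Example 1, the sketch below computes CV = (SD / mean) × 100 for both sections. The function name is hypothetical and chosen only for illustration.

```python
def coefficient_of_variation(mean, sd):
    """Return the standard deviation as a percentage of the mean."""
    return sd / mean * 100

cv_section_1 = coefficient_of_variation(64.43, 12.02)
cv_section_2 = coefficient_of_variation(46.68, 14.76)
print(round(cv_section_1, 2), round(cv_section_2, 2))
```

Section 1 has both the higher mean and the lower CV, so its marks are higher on average and relatively less variable.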
Co-efficient of Variation(CV)-Example 2
Following are the average weekly wages in Riyals and standard deviations in Riyals in two
factories located in Gala Industrial Area:
Factory                                  X       Y
Average weekly wages (Riyals)            24.5    38.5
Standard deviation of wages (Riyals)     4       6
No. of workers                           512     624
1. Which factory X or Y pays out a larger amount as weekly wages?
2. Which factory X or Y has greater variability in individual wages?
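One way to approach both questions, sketched in Python: the total weekly wage bill is taken as average wage × number of workers, and variability is compared through the CV. The dictionary layout and key names are assumptions made for this illustration.

```python
factories = {
    "X": {"mean_wage": 24.5, "sd": 4, "workers": 512},
    "Y": {"mean_wage": 38.5, "sd": 6, "workers": 624},
}

# Question 1: total weekly wages paid = average wage * number of workers
total_wages = {k: f["mean_wage"] * f["workers"] for k, f in factories.items()}

# Question 2: relative variability = SD as a percentage of the mean
cv = {k: f["sd"] / f["mean_wage"] * 100 for k, f in factories.items()}

print(total_wages)
print({k: round(v, 2) for k, v in cv.items()})
```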
Co-efficient of Variation(CV)-Example 3
Prices of a particular commodity in six years in Muscat and Salalah are given below:
Price in Muscat     22   20   19   23   16   18
Price in Salalah    10   20   18   12   15   16
Which city has more stable prices?
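A possible check of Example 3 in Python. It assumes the population standard deviation (dividing by n); the city with the smaller CV has the more stable prices.

```python
from statistics import mean, pstdev

muscat = [22, 20, 19, 23, 16, 18]
salalah = [10, 20, 18, 12, 15, 16]

def cv(prices):
    # Population SD expressed as a percentage of the mean (assumption: divide by n)
    return pstdev(prices) / mean(prices) * 100

print(round(cv(muscat), 2), round(cv(salalah), 2))
```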
Chapter 4 & 5
PROBABILITY, NORMAL DISTRIBUTION, ESTIMATION AND HYPOTHESIS TESTING
What is a probability?
What does it mean to say that the probability of heads on a fair coin is one half,
or that the chances I pass this class are 80 percent?
First, think of some event where the outcome is uncertain.
Examples of such events would be
➢ the roll of a die,
➢ the amount of rain that we get tomorrow,
➢ the state of the economy in one month.
In each case, we don’t know for sure what will happen. For example, we don’t know
exactly how much rain we will get tomorrow.
http://www-math.bgsu.edu/~albert/m115/probability/interp.htm
• A probability is a numerical measure of the likelihood of the event. It is a
number that we attach to an event, say the event that we’ll get over an inch of
rain tomorrow, which reflects the likelihood that we will get this much rain.
• A probability is a number from 0 to 1.
• If we assign a probability of 0 to an event, this indicates that this
event never will occur.
• A probability of 1 attached to a particular event indicates that this
event always will occur.
• What if we assign a probability of .5? This means that it is just as likely for
the event to occur as for the event to not occur.
THE PROBABILITY SCALE
+----------------------------+----------------------------+
0                            .5                           1
event never          event and “not event”        event always
will occur           are equally likely           will occur
                     to occur
http://www-math.bgsu.edu/~albert/m115/probability/interp.htm
Experiment
• An experiment is defined as the process that generates or
provides an outcome (result).
• The sample point is an individual outcome of an experiment.
• The set of all possible sample points in an experiment is
called a sample space.
• An event of a random Experiment is defined as a subset of
the sample space of the random Experiment.
• When you toss one coin the possible outcomes or the
sample space will include a head or tail.
Classical probability and Empirical probability.
• The classical probability uses the sample space to determine the
numerical probability that an event will occur.
• It assumes that all outcomes in the sample space have the same probability of
occurring (equally likely). For example, if you toss a coin the probability that you
get a head is 1/2.
• The empirical probability relies on actual experience to determine
the likelihood of outcomes. For example, suppose we have data on the
number of people with different blood types and we get the
following:
• A: 22, B: 5, AB: 2, O: 21 (total 50)
• The probability that a person has type O is 21/50, which is 0.42.
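The empirical probability above is simply a relative frequency, which can be sketched in Python as follows (variable names are illustrative):

```python
from fractions import Fraction

blood_counts = {"A": 22, "B": 5, "AB": 2, "O": 21}
total = sum(blood_counts.values())

# Empirical probability = observed frequency / total observations
p_type_o = Fraction(blood_counts["O"], total)
print(p_type_o, float(p_type_o))
```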
Some sample spaces for various probability experiments are shown here
Experiment               Sample space
Tossing a coin           S = {H, T}
Tossing a coin twice     S = {HH, HT, TH, TT}
Throwing a die           S = {1, 2, 3, 4, 5, 6}
Throwing two dice        S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
                              (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
                              (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
                              (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
                              (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
                              (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
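These sample spaces can also be generated mechanically. The sketch below uses Python's `itertools.product` to list every ordered outcome for two coin tosses and for two dice.

```python
from itertools import product

# Tossing a coin twice: every ordered pair of H and T
coin_twice = ["".join(outcome) for outcome in product("HT", repeat=2)]

# Throwing two dice: every ordered pair of faces 1..6
two_dice = list(product(range(1, 7), repeat=2))

print(coin_twice)
print(len(two_dice))
```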
EVENT OF A PROBABILITY EXPERIMENT
• An event of a random experiment is defined as a subset of the sample
space of the random experiment.
• Example 1
• Consider the random experiment of throwing a die and getting an
even number. Here the sample space is
S = {1, 2, 3, 4, 5, 6} and the event of getting an even number is E = {2, 4, 6}.
• Example 2
• There are 2 children in a family. Find the event that:
• Both children are boys
• Only one of the children is a girl
• There is at least one girl
• Here S = {BB, BG, GB, GG}
• Answer?
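One way to work through Example 2 is to filter the sample space for each condition. This is a sketch with illustrative variable names, where B stands for boy and G for girl.

```python
sample_space = ["BB", "BG", "GB", "GG"]

# Each event is the subset of outcomes that satisfies the condition
both_boys = [s for s in sample_space if s.count("B") == 2]
exactly_one_girl = [s for s in sample_space if s.count("G") == 1]
at_least_one_girl = [s for s in sample_space if "G" in s]

print(both_boys, exactly_one_girl, at_least_one_girl)
```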
Probability of an Event
• Probability of an event E is given by
P(E) = (Total number of outcomes in E) / (Total number of outcomes in the sample space)
• The probability is denoted by
• $P(E) = \dfrac{n(E)}{n(S)}$
Probability Rules
Rule 1: The probability of any event E is a number between and
including 0 and 1.
• (probability cannot be negative or greater than 1 )
Rule 2: If an event E cannot occur, its probability is 0.
• Example: When a single die is rolled, find the probability of getting 9.
• Sample space S = {1, 2, 3, 4, 5, 6}
• The outcome 9 is not in the sample space.
• ∴ P(getting 9) = 0/6 = 0
Probability Rules
Rule 3: If an event E is certain, then the probability of E is 1.
• Example: When a single die is rolled, what is the probability of
getting a number less than 7?
• Sample space S = {1, 2, 3, 4, 5, 6}. The event of getting a number less
than 7 is certain.
• P(getting a number less than 7) = 6/6 = 1
Rule 4: The sum of the probabilities of the outcomes in the sample
space is 1.
Example: Tossing a single coin has two outcomes (head or tail).
The probability of getting a head is 1/2 and the probability of getting a tail is 1/2.
The total probability is 1/2 + 1/2 = 1.
Addition Rule for probability
• Addition Rule 1: When two events, A and B, are mutually exclusive,
the probability that A or B will occur is the sum of the probability of
each event.
• P(A or B) = P(A) + P(B)
• Experiment 1: A day of the week is selected at random. Find the
probability that it is a weekend day.
• P(Friday or Saturday) = P(Friday) + P(Saturday) = 1/7 + 1/7 = 2/7
• Experiment 2: A single 6-sided die is rolled. What is the probability of
rolling a 2 or a 5?
P(2 or 5) = P(2) + P(5) = 1/6 + 1/6 = 2/6 = 1/3
• Addition Rule 2: If A and B are not mutually exclusive, then
P(A or B) = P(A) + P(B) − P(A and B).
• Example
• In a hospital unit there are 8 nurses and 5 physicians; 7 nurses and 3
physicians are females. If a staff person is selected, find the probability that
the subject is a nurse or a male.
• Solution
• The sample space is shown here.
Staff         Females   Males   Total
Nurses        7         1       8
Physicians    3         2       5
Total         10        3       13
Answer: The probability is
P(nurse or male) = P(nurse) + P(male) − P(male and nurse)
= 8/13 + 3/13 − 1/13 = 10/13
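The hospital example can be verified with exact fractions. The sketch below stores the table as a dictionary (an illustrative layout) and applies P(A or B) = P(A) + P(B) − P(A and B).

```python
from fractions import Fraction

# Counts from the staff table, keyed by (role, gender)
staff = {
    ("nurse", "female"): 7, ("nurse", "male"): 1,
    ("physician", "female"): 3, ("physician", "male"): 2,
}
total = sum(staff.values())

p_nurse = Fraction(sum(c for (role, _), c in staff.items() if role == "nurse"), total)
p_male = Fraction(sum(c for (_, sex), c in staff.items() if sex == "male"), total)
p_male_nurse = Fraction(staff[("nurse", "male")], total)

# Subtract the overlap so male nurses are not counted twice
p_nurse_or_male = p_nurse + p_male - p_male_nurse
print(p_nurse_or_male)
```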
NORMAL DISTRIBUTION
• NORMAL DISTRIBUTION is a continuous probability distribution in
which the relative frequencies of a continuous variable are distributed
according to the normal probability law.
• In other words, it is a symmetrical distribution in which the frequencies
are distributed evenly about the mean of the distribution.
• The normal distribution of a variable, when represented graphically,
takes the shape of a symmetrical curve, known as the Normal Curve.
• It helps us to find the proportion of
measurements that falls within a
certain range above, or below or
between selected values.
Symmetrical curve or Normal Curve.
Summary of the properties of the Theoretical Normal Distribution
• The normal distribution curve is bell-shaped.
• The mean, median and mode are equal and located at the center of the distribution.
• The normal distribution curve is unimodal (i.e., it has only one mode).
ESTIMATION AND HYPOTHESIS TESTING
Point Estimate
• A point estimate is a specific numerical value estimate of a
parameter. The best point estimate of the population mean (µ) is the
sample mean (x̄).
• Suppose a College Dean wishes to estimate the average age of
students attending classes this semester. The College Dean could select
a random sample of 100 students and find the average age of these
students (x̄), say, 22.3 years. From the sample mean, the College Dean
could infer that the average age of all the students (µ) is 22.3 years. This
type of estimate is called a point estimate.
• Sample measures are used to estimate population measures.
These statistics are called estimators.
• A good estimator should satisfy the three properties described
below.
Three properties of Good Estimator
• The estimator should be an unbiased estimator. That is, the
expected value, or the mean, of the estimates obtained from samples of
a given size is equal to the parameter being estimated.
• The estimator should be consistent. For a consistent estimator, as
sample size increases, the value of the estimator approaches the value
of the parameter estimated.
• The estimator should be relatively efficient estimator. That is, of all
the statistics that can be used to estimate a parameter, the relatively
efficient estimator has the smallest variance.
Interval estimate
• An interval estimate of a parameter is an interval or a range of values
used to estimate the parameter. This estimate may or may not
contain the value of the parameter being estimated.
• In an interval estimate, the parameter is specified as being between
two values. For example, an interval estimate for the average age of
all students might be 26.9 < µ < 27.7
Confidence Level and Confidence Interval
• The Confidence Level of an interval estimate of a parameter is the
probability that the interval estimate will contain the parameter.
• A Confidence Interval is a specific interval estimate of a parameter
determined by using data obtained from a sample and by using the
specific confidence level of the estimate.
HYPOTHESIS TESTING
• Hypothesis testing is a decision – making process for evaluating claims
about a population.
• In hypothesis testing, the researcher must define the population under
study, state the particular hypotheses that will be investigated, give the
significance level, select a sample from the population, perform the
calculations required for the statistical test, and reach a conclusion.
• A statistical hypothesis is a conjecture about a population parameter.
• This conjecture may or may not be true.
• There are two types of statistical hypotheses for each situation: the null
hypothesis and the alternative hypothesis.
Null Hypothesis
• The null hypothesis symbolized by H0, is a statistical hypothesis that
states that there is no difference between a parameter and a specific
value, or there is no difference between two parameters.
Alternative Hypothesis
• The alternative hypothesis symbolized by H1, is a statistical
hypothesis that states the existence of a difference between a
parameter and a specific value, or states that there is a difference
between two parameters.
Chapter 6
Correlation and Regression
Correlation
Correlation is a statistical technique used to determine the degree to
which two variables are related.
For example:
❑ whether the volume of sales for a given month is related to the amount of
advertising
❑ whether the number of hours a student studies is related to the student’s
score on a particular exam
❑ whether a person’s age and his or her blood pressure are related
These are only a few of the many questions that can be answered by using
the techniques of correlation and regression analysis.
Types of relationship between variables
There are two types of relationships
• Simple relationship: Relationship between only two
variables, X and Y.
➢ For example, the relationship between oil price and economic
development
• A simple relationship can also be positive or negative.
Types of relationship between variables
• Positive relationship or positive correlation :
• If the value of two variables, X and Y, move in the same direction such
as:
❖an increase in the value of one variable results, on an average, in a
corresponding increase in the values of the other variable.
❖a decrease in the value of one variable results, on an average, in a
corresponding decrease in the other variable .
• Example: The height and weight of a growing child.
Types of relationship between variables
• Negative relationship or Negative correlation: the correlation is said
to be negative or inverse if the two variables X and Y deviate in the
opposite direction.
• i.e. if the increase (or decrease) in the values of one variable results,
on an average, in a corresponding decrease (or increase) in the
values of other variable
Example: the price and demand of a commodity .
Types of relationship between variables
• Multiple relationships or multiple correlation: Examines the relationship
between more than two variables.
• For example an educator may wish to investigate the relationship
between a student’s success in college and factors such as the number of
hours devoted to studying, the student’s previous GPA and the student’s
high school background.
Types of variables in correlation analysis
• Independent variable: the variable in regression that can be
controlled or manipulated. For example, in a study of study hours and exam
grades, “number of hours of study” is the independent variable and is
designated as the x variable.
• Dependent variable: the variable in regression that cannot be
controlled or manipulated. The grade the student received on the
exam is the dependent variable, designated as the y variable.
How to measure correlation?
Correlation can be measured using:
• Scatter diagram( graphical method)
• Correlation coefficient( Numerical method)
Scatter Diagram
• The scatter diagram is drawn with two variables, usually the first
variable is independent and the second variable is dependent on the
first variable
Source:https://www.statisticshomeworktutors.com/Scatter-DiagramAssignment-Homework-Help.php
Scatter Diagram-Example
Draw a scatter diagram from the following information
Hours of study x    Grade y (%)
6                   92
3                   63
1                   47
5                   88
2                   58
4                   75
[Scatter chart: hours of study on the horizontal axis (0 to 7), grade in % on the vertical axis (0 to 100)]
Exercise : Scatter Diagram
Draw scatter diagrams from the following data sets
Data set 1      Data set 2      Data set 3
X     Y         X     Y         X     Y
50    10        10    50        50    20
40    8         20    40        30    50
30    6         30    30        20    40
20    4         40    20        40    10
10    2         50    10        10    30
Interpretation of scatter diagrams
Computation of Correlation Co-efficient
• Correlation coefficient is a numerical measure of correlation.
• Correlation Co-efficient measures the strength and direction of a linear
relationship between two variables.
Types of Correlation Co-efficient
➢Karl Pearson's Correlation Co-efficient
➢Spearman's Correlation Co-efficient
Karl Pearson's Correlation Co-efficient
• The symbol for the Karl Pearson's correlation coefficient is r.
• The range of the correlation coefficient is from -1 to +1.
• If there is a strong positive relationship between the variables, the
value of r will be close to +1.
• If there is a strong negative relationship between the variables, the
value of r will be close to -1.
• When there is no relationship between variables or only a weak
relationship, the value of r will be close to 0 (zero).
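Formula-Karl Pearson's Correlation Co-efficient
$r = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{\left[n(\sum x^2) - (\sum x)^2\right]\left[n(\sum y^2) - (\sum y)^2\right]}}$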
https://study.com/academy/lesson/pearson-correlation-coefficient-formula-examplesignificance.html
Example 1
Compute the value of the Karl Pearson’s Correlation Coefficient for the data
obtained in the study of age and blood pressure
Student   Age (x)   Pressure y
A         43        12
B         48        12
C         56        13
D         61        14
E         67        14
F         70        15
Answer
Student   Age x   Pressure y   XY      X²      Y²
A         43      12
B         48      12
C         56      13
D         61      14
E         67      14
F         70      15
Total     ∑x=     ∑y=          ∑xy=    ∑X²=    ∑Y²=
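The table totals and r can be checked with a short Python sketch. It assumes the usual computational form of Pearson's coefficient, r = [n∑xy − ∑x∑y] / √([n∑x² − (∑x)²][n∑y² − (∑y)²]), which matches the columns XY, X² and Y² in the table. Variable names are illustrative.

```python
from math import sqrt

age = [43, 48, 56, 61, 67, 70]        # x
pressure = [12, 12, 13, 14, 14, 15]   # y

n = len(age)
sum_x, sum_y = sum(age), sum(pressure)
sum_xy = sum(x * y for x, y in zip(age, pressure))
sum_x2 = sum(x * x for x in age)
sum_y2 = sum(y * y for y in pressure)

r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(sum_x, sum_y, sum_xy, sum_x2, sum_y2, round(r, 3))
```

A value of r this close to +1 indicates a strong positive linear relationship between age and pressure in this data set.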
Interpretation of correlation coefficient
https://www.chegg.com/homework-help/definitions/pearson-correlation-coefficient-pcc-31
Exercise : Compute the value of Karl Pearson’s Correlation Coefficient from
the following data.
X     Y     XY    X²    Y²
12    5
14    10
16    6
18    10
20    12
22    9
24    10
Spearman's Rank Correlation Coefficient
• The symbol for the Spearman’s correlation coefficient is rs.
• The range of the correlation coefficient is from -1 to +1.
• If there is a strong positive relationship between the variables, the
value of rs will be close to +1.
• If there is a strong negative relationship between the variables, the
value of rs will be close to -1.
• When there is no relationship between variables or only a weak
relationship, the value of rs will be close to 0 (zero).
Spearman's Rank Correlation Coefficient: Steps and formula
• Rank the two data sets (Rx and Ry). Ranking is achieved by giving the
ranking '1' to the biggest number in a column, '2' to the second biggest
value and so on. The smallest value in the column will get the lowest
ranking. This should be done for both sets of measurements.
• Tied scores are given the mean (average) rank.
• Find Rx − Ry = d
• Square the differences (d²) to remove negative values and then sum
them (∑d²).
Formula
$r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$
n = number of paired values in the given data set
Example 1
Compute the value of the Spearman’s Correlation Coefficient for the data obtained in
the study of age and blood pressure
Student   Age (x)   Pressure y
A         43        12
B         48        12
C         56        13
D         61        14
E         67        14
F         70        15
Answer
Student   Age x   Pressure y   Rank X   Rank Y   Rx − Ry = d   d²
A         43      12           6        5.5
B         48      12           5        5.5
C         56      13           4        4
D         61      14           3        2.5
E         67      14           2        2.5
F         70      15           1        1
                                                               ∑d²=
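The ranking steps can be sketched in Python as below. The helper `average_ranks` is a hypothetical name introduced for this illustration; it gives rank 1 to the largest value and the mean rank to ties. The coefficient is then computed with rs = 1 − 6∑d² / (n(n² − 1)), assumed here to be the formula intended by the slide.

```python
def average_ranks(values):
    """Rank 1 goes to the largest value; tied values share the mean rank."""
    ordered = sorted(values, reverse=True)
    # A value first seen at 0-based index i with c copies occupies ranks
    # i+1 .. i+c, whose mean is (2*i + 1 + c) / 2.
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]

age = [43, 48, 56, 61, 67, 70]
pressure = [12, 12, 13, 14, 14, 15]

rank_x = average_ranks(age)
rank_y = average_ranks(pressure)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))

n = len(age)
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(rank_x, rank_y, sum_d2, round(rs, 4))
```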
Exercise : Compute the value of the Spearman’s Correlation Coefficient
for the following data.
X     Y     Rank X   Rank Y   Rx − Ry = d   d²
10    6
12    12
10    10
9     5
11    9
10    5
Regression
Regression analysis is a basic and commonly used
type of predictive analysis.
Regression analysis is used to predict the value of the dependent
variable (Y) when the value of an independent
variable (X) is varied.
Difference Between Correlation and Regression
• Correlation is described as the analysis which lets us
know the association or the absence of the relationship
between two variables ‘x’ and ‘y’.
• Regression analysis predicts the value of the
dependent variable based on the known value of the
independent variable, assuming an average
mathematical relationship between two or more
variables.
• https://keydifferences.com/difference-between-correlation-and-regression.html
Difference Between Correlation and Regression
Correlation describes the strength of a linear relationship
between two variables. Linear means “straight line”
Regression tells us how to draw the straight line described by
the correlation. Also known as “best-fit” line .
• Using scatter diagram, one must be able to draw the line of best fit
• Best fit means that the sum of the squares of the vertical distances
from each point to the line is at a minimum
Formula to calculate Regression line
Formulas for the regression line y′ = a + bx
$a = \dfrac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$
$b = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$
Where a is the y′ intercept
And b is the slope of the line
Example
Using formula find the equation of the regression line from the following data
Students   Age (x)   Pressure y
A          43        128
B          48        120
C          56        135
D          61        143
E          67        141
F          70        152
Solution:
Step 1:
Find the values of xy and x² using the following table.
Students   Age (x)   Pressure y   XY      X²
A          43        128
B          48        120
C          56        135
D          61        143
E          67        141
F          70        152
Total      ∑x=345    ∑y=819       ∑xy=    ∑X²=
Solution
• Step 2:
• $a = \dfrac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$
• $a = \dfrac{(819)(20399) - (345)(47634)}{6(20399) - (345)^2} = 81.048$
• Step 3:
• $b = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$
• $b = \dfrac{6(47634) - (345)(819)}{6(20399) - (345)^2} = 0.964$
Step 4:
The equation of the regression line is
y = a + bx
y = 81.048 + 0.964 x
Question:
If the value of Age (X) = 80, find the value of
Pressure (Y).
Answer:
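The steps above can be sketched in Python to confirm a and b and to evaluate the line at Age = 80. Variable names are illustrative; the formulas are the ones used in Steps 2 and 3.

```python
age = [43, 48, 56, 61, 67, 70]              # x
pressure = [128, 120, 135, 143, 141, 152]   # y

n = len(age)
sum_x, sum_y = sum(age), sum(pressure)
sum_xy = sum(x * y for x, y in zip(age, pressure))
sum_x2 = sum(x * x for x in age)

denominator = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # intercept
b = (n * sum_xy - sum_x * sum_y) / denominator        # slope

predicted = a + b * 80   # predicted pressure at age 80
print(round(a, 3), round(b, 3), round(predicted, 1))
```

Using the unrounded a and b gives a predicted pressure of about 158.2; substituting the rounded coefficients 81.048 and 0.964 by hand gives 158.168, which agrees to one decimal place.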
Exercise:
Find the equation of the regression line from the following data. Also
find the value of sales when the advertisement expenditure is OMR
10 million.
Advertisement Expenditure   Sales
(OMR in Millions)           (OMR in Millions)
3                           12
8                           16
6                           13
1                           10
7                           14
7                           15
Chapter 7: Time series
Definition
• According to Merriam Webster Dictionary, time series is a set of
data collected sequentially and usually at fixed intervals of time.
• Example : The number of packets of milk sold in a small shop
Day          No. of packets of milk sold
Monday       90
Tuesday      88
Wednesday    85
Thursday     75
Friday       72
Saturday     90
Sunday       102
Meaning
Time series data is a sequence of observations
• collected from a process
• with equally spaced periods of time.
Example: Oil production in Oman in Barrels/Day
Example: Population in Oman ( in millions)
Importance of Time Series Analysis in Business
• Profit Planning
• Sales Forecasting
• Stock Market Analysis
• Process and Quality Control
• Economic Forecasting
• Risk Analysis & Evaluation of changes
COMPONENTS OF TIME SERIES
• Any time series can contain some or all of the following components:
1. Secular trend
• The increase or decrease in the movements of a time series is called
Secular trend.
• Secular trend is a long term movement in a time series.
• A time series data may show upward trend or downward trend for a
period of years and this may be due to factors like:
• Increase in population
• Change in technological progress,
• Large scale shift in consumers’ demands
COMPONENTS OF TIME SERIES
2. Seasonal variation
• Seasonal variation are short-term fluctuation in a time series which occur
periodically in a year.
• This continues to repeat year after year.
• The major factors are weather conditions and customs of people.
• More woolen clothes are sold in winter than in summer.
• Each year more ice cream is sold in summer and very little in winter.
• Sales in department stores are higher during festive seasons than
on normal days.
COMPONENTS OF TIME SERIES
3. Cyclical variation:
• Cyclical variations are recurrent upward or downward
movements in a time series but the period of cycle is greater
than a year.
• Also, these variations are not as regular as seasonal variations.
COMPONENTS OF TIME SERIES
4. Irregular variation:
➢ Irregular variations are fluctuations in a time series that are short in
duration, erratic in nature and follow no regularity in the
occurrence pattern.
➢ Irregular fluctuations result from the occurrence of unforeseen
factors.
➢ This component is unpredictable.
Examples:
• Floods
• Earthquakes
• Wars
• Famines
MEASUREMENT OF SECULAR TREND
• Free hand curve method or eye inspection method
• Semi average method
• Method of moving average
Free hand curve method or eye inspection method
In this method the data is plotted on graph paper.
Show “Time” on the “X” axis and “Data” on the “Y” axis.
On the graph there will be a point for every point of time.
Draw a smooth freehand curve through these plotted points.
Example : Draw a free hand curve on the basis of the following data:
Year               1989   1990   1991    1992    1993    1994    1995    1996
Profit (in ‘000)   148    149    149.5   150.5   152.2   153.7   153.7   153
Solution: Free hand curve
Semi-Average Method
• In this method the given data are divided in to two parts, preferably
with the equal number of years.
• An average is obtained for each part.
• Each such average is shown against the mid-point of the half period.
• Obtain two points on a graph paper based on the averages of each
part.
• By joining these points, a straight line trend is obtained.
Example:
• Find the trend line from the following data by semi-Average method:
Year         1989   1990   1991   1992   1993   1994   1995   1996
Production   150    152    153    151    154    153    156    158
Solution: Semi-Average Method
• There are total 8 years .
• Divide it into equal parts of 4 years.
• Calculate average for each part.
• First part = (150+152+153+151)/4 = 151.50
• Second part = (154+153+156+158)/4 = 155.25
• Obtain two points on a graph paper based
on the averages of each part. Join these
points to get a straight line trend.
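A minimal Python sketch of the semi-average calculation, assuming an even number of years so the series splits into two equal halves (variable names are illustrative):

```python
production = {
    1989: 150, 1990: 152, 1991: 153, 1992: 151,
    1993: 154, 1994: 153, 1995: 156, 1996: 158,
}

values = list(production.values())
half = len(values) // 2

# Average of each half; these two points define the straight-line trend
first_avg = sum(values[:half]) / half
second_avg = sum(values[half:]) / (len(values) - half)
print(first_avg, second_avg)
```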
Moving Average Method
• This method is based on a series of arithmetic means as shown in the below example.
• Calculate 3-yearly moving average for the following data.
Year         Production   3-Yearly moving average (trend values)
2000-2001    40
2001-2002    45           (40+45+40)/3 = 41.67
2002-2003    40           (45+40+42)/3 = 42.33
2003-2004    42           (40+42+46)/3 = 42.67
2004-2005    46           (42+46+52)/3 = 46.67
2005-2006    52           (46+52+56)/3 = 51.33
2006-2007    56           (52+56+61)/3 = 56.33
2007-2008    61
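The same 3-yearly moving averages can be produced with a sliding window. This is a sketch with illustrative names; each average belongs to the middle year of its window, so the first and last years have no trend value.

```python
production = [40, 45, 40, 42, 46, 52, 56, 61]
window = 3

# Mean of each run of 3 consecutive values, rounded to 2 decimals
moving_avg = [
    round(sum(production[i:i + window]) / window, 2)
    for i in range(len(production) - window + 1)
]
print(moving_avg)
```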