RMIT Classification Applying Business Statistics to Data Analysis and Manipulation
Business statistics focuses on analyzing and manipulating collected data to identify the impact of one variable on others. It plays a crucial role in tax preparation, data cleaning and data mining when assessing the performance of a firm. This report analyzes the pay gap between the public and private sectors using the wage rates of 219 employees working in the two sectors. Simple linear and multiple regression models are used to analyze how the different variables relate to the pay gap between the sectors.
The pay gap has a direct impact on the efficiency and engagement of employees in an entity. The research was conducted by collecting the gender, age, wages, marital status and sector of 219 employees to identify the relationship of these variables with wages. A gender gap was identified in the public sector, where the wage rate of female employees was higher than that of male employees, which has created a biased working environment. Wages in the public sector were 5.1% higher than in the private sector, which has created a lack of coordination and collaboration among employees.
2.2 Research aim
The aim of the research is to investigate wages in the public sector and the private sector. The research explores whether wages are influenced by differing wage rates across sectors, qualifications and gender.
The research proceeds with a preliminary analysis to identify the relationship between workers' wages and their sector, gender, qualifications, age and marital status.
3.1 Regression analysis
In this section, the Analysis ToolPak in Excel is used to identify the relationship between the earnings (wages) of the 219 employees and the five independent variables. A “linear regression test” is run for each independent variable to identify its relationship with wages.
- a) Wages to sector
The “linear regression” test is used to identify the relationship between the wages and sector of employees. A linear equation was derived from the probability plots of the selected variables in the test.
Assumptions: Wages of the employees are taken as the dependent variable and the sector is taken as the independent variable. Public sector employees are represented by 1 and private sector employees by 0.
The above figure shows a positive relationship between the wages paid (dependent) and the sector (independent). The Multiple R of the test is 0.16, which means there is a direct relationship between the variables. Tested at the 5% significance level, the relationship between the variables is weak.
The above graph shows the regression equation, which takes the form “Y = ax + b”. The Multiple R reflects the correlation among the variables, while the value of a is 0.270 and b is 10.21.
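As a minimal sketch of how a regression on a 0/1 sector dummy produces an equation of the form Y = ax + b, the Python code below fits the line by ordinary least squares. The wage figures are hypothetical and are not the 219-employee data set, and the function name simple_ols is illustrative only.

```python
from math import sqrt


def simple_ols(x, y):
    """Fit y = slope * x + intercept by least squares.

    Returns (slope, intercept, multiple_r), where multiple_r is the
    absolute value of the correlation between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    multiple_r = abs(sxy) / sqrt(sxx * syy)
    return slope, intercept, multiple_r


# Hypothetical data: sector is 1 for public and 0 for private.
sector = [0, 0, 0, 1, 1, 1]
wages = [10.0, 12.0, 14.0, 11.0, 13.0, 18.0]

slope, intercept, multiple_r = simple_ols(sector, wages)
print(f"Y = {slope:.3f}x + {intercept:.3f}, Multiple R = {multiple_r:.2f}")
```

With a 0/1 predictor, the intercept b equals the mean wage of the group coded 0 and the slope a equals the difference between the two group means, which is why a positive a indicates higher average wages for the group coded 1.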
- b) Wages to gender
In the provided case study, female employees were found to be paid more than male employees in the public and private sectors.
Assumptions: Wages of the employees are taken as the dependent variable and gender is taken as the independent variable. Male employees are represented by 1 and female employees by 0.
The above figure shows a positive relationship between the wages paid and the gender of the employees. The Multiple R is 0.23, which is greater than zero and reflects a positive but weak relationship between the independent and dependent variables.
- c) Wages to degree
This test examines whether the educational attainment or degree of the employees makes any difference to the wages paid. It is measured with the “linear regression test” using these two variables.
Assumptions: Employees’ wages are taken as the dependent variable and degree or educational attainment is taken as the independent variable. Employees who have a higher degree are represented by 1 and employees with a lower degree by 0.
The above figure shows a positive Multiple R, which means the dependent variable is correlated with the independent variable. The Multiple R, t-stat and significance level are 0.21, 23.31 and 0.0016 respectively. This reflects a positive and statistically significant relationship between degree and wage rate, so employees with a higher degree earn higher wages.
The linear equation of the test is shown in the above graph with the help of the probability plot of the relationship.
- d) Wages to age
The relationship between the wages paid and the age of the employees is tested with a regression test.
Assumptions: Employees’ wages are taken as the dependent variable and the age of the employees is taken as the independent variable.
The above figure shows the relationship between the wages paid to the employees and their age. The Multiple R is 0.32, which reflects a direct relationship between the wages paid and the age of the employees. In other words, older employees in the company tend to earn higher wages than younger employees. There is a moderate positive relationship between the selected variables. The P-value is 1.21E-07, which means 1.21 × 10^-7.
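In regression output, the E-notation denotes a power of ten, so E-07 means "times ten to the power of minus seven". The short Python check below is only an illustration of how to read the reported P-value and compare it with the 5% significance level; the variable names are arbitrary.

```python
from math import isclose

p_value = float("1.21E-07")  # value as printed in the regression output
alpha = 0.05                 # 5% significance level

# 1.21E-07 is the same number as 1.21 * 10^-7
same_value = isclose(p_value, 1.21 * 10 ** -7)

# Written out as a plain decimal
as_decimal = f"{p_value:.9f}"

# The relationship is significant when the P-value is below alpha
is_significant = p_value < alpha

print(as_decimal, is_significant)
```

Because the P-value is far below 0.05, the relationship between age and wages is statistically significant at the 5% level.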
- e) Wages to marital status
The regression model is used to assess the relationship between the two variables in the analysis.
Assumptions: Wages paid to employees are taken as the dependent variable and the marital status of the employees is taken as the independent variable. Married employees are denoted by Yes and represented by 1. Unmarried employees are denoted by No and represented by 0 in determining the relationship with wages.
The above figure shows the output of the regression test used to identify the relationship between the selected independent and dependent variables. The regression line gives a clear picture of the relationship. However, the correlation among the variables is 0.159, which indicates a weak positive relationship. This means the wages paid to employees vary only slightly with marital status.
3.2 Regression of wage rate and variables in the Public sector
The regression test is conducted by structuring a new model that relates the wages paid to public sector employees to their gender (Murray and Wilson, 2021). Here, Model A: Wages of the public sector = β0 + β1 Gender, where both male and female employees are considered.
- a) Estimated equation of regression test
The regression equation gives the coefficient and intercept of the selected variables for the test. The general regression equation is “Y = ax + b”; the simple linear regression test produced a positively sloped regression line.
The above graph shows that regressing the wages paid to public employees on gender gives Y = 0.562x + 19.27. This represents a positive relationship in the public sector Model A.
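To illustrate how the fitted equation can be read, the sketch below plugs the two gender codes into Y = 0.562x + 19.27. It assumes the coding used for the hypothesis test in this section (female = 0, male = 1); the function name predict_public_wage is hypothetical.

```python
def predict_public_wage(gender):
    """Predicted wage from Model A: Y = 0.562x + 19.27.

    gender is the dummy variable x (assumed 0 = female, 1 = male).
    """
    return 0.562 * gender + 19.27


female_wage = predict_public_wage(0)  # equals the intercept
male_wage = predict_public_wage(1)    # intercept plus slope
gap = male_wage - female_wage         # equals the slope

print(f"female: {female_wage:.3f}, male: {male_wage:.3f}, gap: {gap:.3f}")
```

Under this coding, the intercept is the predicted wage for the group coded 0 and the slope is the predicted difference between the two groups.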
- b) Two-tailed hypothesis test
A regression test on the variables selected in Model A is used to conduct the hypothesis test in the study.
Assumption: Wages paid to the employees in the public sector are taken as the dependent variable and the gender of the employees as the independent variable. Female and male employees are represented by 0 and 1 respectively for the test.
Null Hypothesis (H0): There is no relationship between wages and the gender of public sector employees.
Alternative Hypothesis (H1): There exists a relation between the dependent and independent variables.
- c) Interpretation of results
The Multiple R of the test is 0.24 while the R-square is 0.05, which means there is a positive relationship between the dependent and independent variables, with gender explaining about 5% of the variation in wages. H0 is therefore rejected and H1 is accepted, as there is a positive relationship among the selected variables in the test.
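The link between Multiple R, R-square and the two-tailed test can be sketched as follows. In a simple regression, R-square is the square of Multiple R, and the t statistic for the slope is t = r·sqrt(n − 2)/sqrt(1 − r²). The number of public sector employees is not stated here, so n = 100 is an assumed value, and the P-value uses a normal approximation to the t distribution, which is reasonable only for large samples.

```python
from math import sqrt
from statistics import NormalDist


def slope_test(multiple_r, n, alpha=0.05):
    """Two-tailed test of H0: slope = 0 in a simple linear regression.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2) and approximates the
    t distribution with the standard normal (large-sample assumption).
    """
    r_squared = multiple_r ** 2
    t_stat = multiple_r * sqrt(n - 2) / sqrt(1 - r_squared)
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    reject_h0 = p_value < alpha
    return r_squared, t_stat, p_value, reject_h0


# Multiple R = 0.24 comes from the text; n = 100 is an assumption.
r_squared, t_stat, p_value, reject_h0 = slope_test(0.24, 100)
print(f"R-square = {r_squared:.4f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Under these assumptions H0 is rejected at the 5% level, which agrees with the conclusion above. A smaller sample would give a smaller t statistic, so the decision depends on the actual number of observations.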
3.3 Multiple regression
The “multiple regression model” captures the relationship between a dependent variable and more than one independent variable (Gholipour et al. 2018).
- a) Outputs of the model
The multiple regression test was run using the “data analysis tools” in Excel.
The above figure shows the relationship between wages paid and the age, gender, qualifications, marital status and sector of the 219 employees, who are aged between 21 and 65.
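Excel is the tool used in this report. As a rough sketch of what a multiple regression computes, the Python code below estimates the coefficients by solving the normal equations (X'X)b = X'y. The rows and wages are hypothetical, generated from assumed coefficients so that the fit can be checked, and only three predictors are used to keep the example short. All function names are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the right-hand side is reduced too.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] for y = intercept + b1*x1 + b2*x2 + ..."""
    x = [[1.0] + list(r) for r in rows]  # prepend a column of ones
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical employees: (age, gender, sector), with 0/1 dummies.
rows = [
    (25, 0, 0), (30, 1, 0), (35, 0, 1), (40, 1, 1),
    (45, 0, 0), (50, 1, 1), (55, 1, 0), (60, 0, 1),
]
# Wages generated from assumed coefficients: 5 + 0.2*age + 2*gender + 1*sector
wages = [5.0 + 0.2 * a + 2.0 * g + 1.0 * s for a, g, s in rows]

coefficients = fit_multiple_regression(rows, wages)
print([round(c, 4) for c in coefficients])
```

Each coefficient is the estimated change in wages for a one-unit change in that variable while the other variables are held constant, which is what distinguishes the multiple regression from the separate simple regressions.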
- b) Estimated equation from the test
The above test gives the equation that depicts the relationship in the graph.
The above graph depicts the regression equation of the multiple regression, which is reported as y = 0.270x + 10.21.
- c) Two-tailed hypothesis on the basis of coefficient, 5% significance level and p-value
The hypothesis test is conducted to identify whether a relationship exists among the selected variables.
Assumptions: Wages paid to the employees are taken as the dependent variable. Age, gender, degree, sector and marital status are taken as the independent variables for the test.
Null hypothesis (H0): There is no relationship between the dependent and independent variables.
Alternative Hypothesis (H1): There is a relationship between the dependent and independent variables.
- d) Overall test using P-value
The P-value of age is the smallest of the five variables, which means age is the most statistically significant contributor to the gap in the pay scale of the private and public sectors. The P-value of gender is 3.18E-05, age is 9.96E-06, degree is 0.00078, marital status is 0.113 and sector is 0.41. This means age is a major reason for the pay gap.
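The P-values listed above can be ranked and compared against the 5% significance level with a few lines of Python. This is only an illustrative sketch; the values are the ones reported in this section and the variable names are arbitrary.

```python
ALPHA = 0.05  # 5% significance level

# P-values as reported for the multiple regression
p_values = {
    "gender": 3.18e-05,
    "age": 9.96e-06,
    "degree": 0.00078,
    "marital status": 0.113,
    "sector": 0.41,
}

# Smallest P-value first: the smaller the value, the stronger the evidence
ranked = sorted(p_values.items(), key=lambda item: item[1])
significant = [name for name, p in ranked if p < ALPHA]
not_significant = [name for name, p in ranked if p >= ALPHA]

print("significant at 5%:", significant)
print("not significant at 5%:", not_significant)
```

A smaller P-value indicates stronger evidence against H0, so the variable at the top of the ranking is the one most clearly related to wages once the others are controlled for.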
- e) Interpretation of the hypothesis test
The coefficient of the intercept is 12.006 and the Multiple R is 0.48. This shows a positive but moderate relationship between the variables in the test. The reported P-value and significance value are 0.002 and 6E-11, both of which are below the 5% significance level. H0 is therefore rejected in favour of H1.
- f) Interpretation of findings
It has been identified that employees' wages are highly influenced by age, gender and marital status compared to the other variables. The five independent variables are positively related to the dependent variable, which is supported by the Multiple R of 0.48 in the “multiple regression model”.
3.4 Comparison of the variables of the public in Model A and Model E
The results of Model A and Model E differ because Model A is tested with a simple linear regression model, while Model E is tested with a multiple regression model that captures the impact of more than one independent variable. Discrimination in wages by age, gender and marital status is the basic reason for the difference.
3.5 Prediction of earning on the basis of Model E
Male employees have higher earnings than female employees at the age of 40; this was estimated using the filter option in Excel. Only three male employees are 40 years of age and meet the requirements of Model E.
3.6 Impact of change in the age from 40 to 50 years of employees on the output of Model E
The average wages of the male and female employees changed when the age was changed from 40 to 50 years in the test. Only one male and one female employee are 50 years of age, and the wages of the male employee are higher than those of the female employee. The male employee earned 43.85 and the female employee earned 31.11.
3.7 Consistency of data
The regression test shows that the wage rate changes with the gender of the employees. The significance level is lower than 0.005, which reflects a statistically significant relationship between the variables. The wages paid fluctuate slightly with gender. The data are consistent but cover limited areas of analysis.
3.8 Additional data
The data set ignores the skills of the employees, so it cannot show whether skills have an impact on wages. Including skill as a variable would add rigour to the research in analysing the pay gap of employees in the market.
Gholipour, K., Asghari-Jafarabadi, M., Iezadi, S., Jannati, A. and Keshavarz, S., 2018. Modelling the prevalence of diabetes mellitus risk factors based on artificial neural network and multiple regression. Eastern Mediterranean Health Journal, 24(8).
Murray, L.L. and Wilson, J.G., 2021. Generating data sets for teaching the importance of regression analysis. Decision Sciences Journal of Innovative Education, 19(2), pp.157-166.
Lai, J., Zou, Y., Zhang, J. and Peres-Neto, P., 2021. rdacca.hp: an R package for generalizing hierarchical and variation partitioning in multiple regression and canonical analysis. bioRxiv.
Mahuteau, S., Mavromaras, K., Richardson, S. and Zhu, R., 2017. Public–private sector wage differentials in Australia. Economic Record, 93, pp.105-121.
Miočić, J., Zekanović-Korona, L. and Hotti, L., 2019. Importance of regression analysis in sports information systems at evaluation of sports and sports associations.