

1. Improve Recruitment Effectiveness

HR analytics lets HR make better decisions based on historical data about employee performance. For example, if the data suggests that your best talent shares a certain educational background, set of hobbies or profile, you can screen the candidate pool for applicants who are most likely to succeed. This means lower recruitment cost, reduced attrition in the future and better business results. The availability of online databases, applications, social media profiles, career directories, documents and so on makes it easy to learn more about applicants and improve the effectiveness of recruitment.

Similarly, we can use online databases and career directories to build profiles and job descriptions based on how other organizations define such roles and on the availability of talent in the market. This leads to a higher success rate not only in recruitment but also in retention.

2. Build a Productive Workforce

Using historical data on employee performance and the specific conditions under which an employee performed better, HR managers can use clustering models to put together teams of like-minded employees in which every individual performs at his or her best. Similarly, inconsistent performance, and spikes or drops in performance, can help HR analysts identify the key drivers behind such patterns.
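Not part of the original article: a minimal sketch of such a clustering model in Python; the employee attributes and figures below are entirely hypothetical.

# Hypothetical illustration only: cluster employees on a few made-up attributes.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

employees = pd.DataFrame({
    "tenure_years":        [1, 3, 7, 2, 9, 4],
    "peer_rating":         [3.2, 4.5, 4.1, 2.8, 4.8, 3.9],
    "collaboration_score": [55, 80, 78, 40, 90, 70],
})
X = StandardScaler().fit_transform(employees)          # put features on a common scale
employees["cluster"] = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(X)
print(employees)                                       # candidate groupings for team design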

Join “HR Analytics Live Online Course”
3. Reduce Attrition by Predicting It

This is one of the most widely cited applications of HR analytics. Using historical employee data, Machine Learning (ML) classification models can predict, often with high accuracy, which employees are most likely to leave the organization. This is called a Predictive Model for Employee Attrition. The model provides the propensity, or probability, that an employee will leave in the near future. This data-based approach can replace the RAG (Red/Amber/Green) colour codes that HRBPs use to flag employees at high flight risk.
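Not part of the original article: a minimal sketch of how such an attrition-propensity model might look in Python. The data set and column names (tenure_years, salary_band, engagement_score, left) are made up purely for illustration.

# Hypothetical illustration only: predict the probability that an employee leaves.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

hr = pd.DataFrame({
    "tenure_years":     [1, 5, 2, 8, 3, 10, 1, 6, 4, 2],
    "salary_band":      [1, 3, 1, 4, 2, 4, 1, 3, 2, 1],
    "engagement_score": [40, 75, 50, 85, 60, 90, 35, 70, 65, 45],
    "left":             [1, 0, 1, 0, 0, 0, 1, 0, 0, 1],   # 1 = employee left
})
X_train, X_test, y_train, y_test = train_test_split(
    hr.drop(columns="left"), hr["left"], test_size=0.3, random_state=0, stratify=hr["left"])
model = LogisticRegression().fit(X_train, y_train)
flight_risk = model.predict_proba(X_test)[:, 1]   # propensity to leave, between 0 and 1
print(flight_risk)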

4. Performance Management

Linking performance to pay is an evergreen topic in HR. With performance data that goes beyond the performance rating, C&B professionals can build statistical models to validate whether increased compensation and benefits for an individual result in a justifiable improvement in business performance. Further, data analytics can be used to profile employees based on the value they see in the various benefits the organization provides, and to personalize the package.

5. L&D Effectiveness

L&D can play a pivotal role in enhancing business performance and building a future-fit workforce by using data to identify training needs, establish quantitative effectiveness measures for L&D interventions and statistically prove the effectiveness of a program. For example, using wearables, L&D professionals can capture real-time data such as employees' heart rates to gauge the effectiveness of the learning module covered in training. This data can then be used to design more effective interventions.

Most importantly, L&D can rescue itself from the perception of being a provider of assorted career development programmes that deplete a large part of the company's budget.

#nilakantasrinivasan-j #canopus-business-management-group #B2B-client-centric-growth #HR-analytics

In recent years, most business functions have been transformed by the power of Big Data, cloud storage and analytics. The digitization wave sweeping the industry now is an outcome of the synergy of various technology developments over the past two decades, and HR is no exception. HR analytics and Big Data give HR leaders the ability to take "intuition", long the norm, out of their decisions and replace it with informed decisions based on data. The use of HR analytics has made these decisions more reliable and accurate.

For this reason, many companies today invest tremendous resources in talent management tools and skilled staff, including data scientists and analysts.

Nevertheless, there is a lot more to do in this area. According to a Deloitte survey, three out of four businesses (75%) believe the use of data analytics is "important", but only 8% think their organisation is strong in analytics (the same figure as in 2014).

HR analytics can touch every division of HR and improve its decision making, including Talent Acquisition and Management, Compensation and Benefits, Performance Management, HR Operations, Learning and Development, Leadership Development and more.

Most organizations today sit on a pile of data, thanks to HRMS and cloud storage. However, in the absence of a proper HR analytics tool or the necessary capability among HR professionals, this useful data tends to remain scattered and unused. Organizations are now coming to accept that analytics is more about capability and less about acquiring fancy technological tools.

An HR professional with the right analytics capability can interpret this valuable data and, using HR and big data analytics, transform it into useful statistics and insights. Once trends are made visible, HR can determine what to do on the basis of the results. Analytics can also quantify the impact of HR metrics on organisational performance, enabling leaders to take proactive decisions.

HR analytics can also help address specific problems organizations face. For example, do high performers exit the organisation more often than low performers, and if so, what drives that turnover? Data-based insights empower business leaders to take the right decisions regarding talent rather than relying on intuition or finger pointing between HR and the business.

 Here are 6 big Benefits of HR Analytics

Improve HR alignment to Business Strategy

It is very common to see the HR function operating in isolation from the business. If you don't agree, observe what business leaders do when HR slides come up in Management Committee presentations, and what the HR head does when business slides come up. Most HR metrics, processes and policies are benchmarked against industry and competition, but very rarely are they aligned to the hard-hitting reality of their own business. For example, just by aligning HR metrics to business metrics, such as HR cost per revenue, HR cost per unit sold, revenue per employee or average lead time to productivity, HR professionals can take the first step towards better alignment with business strategy.
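Not part of the original article: a minimal sketch of how such business-aligned HR metrics could be computed; every figure and column name below is hypothetical.

# Hypothetical figures purely for illustration; replace with your organization's data.
import pandas as pd

metrics = pd.DataFrame({
    "year":      [2017, 2018],
    "revenue":   [120_000_000, 135_000_000],   # annual revenue
    "hr_cost":   [3_600_000, 3_900_000],        # total cost of the HR function
    "headcount": [850, 910],
})
metrics["hr_cost_per_revenue"]  = metrics["hr_cost"] / metrics["revenue"]
metrics["revenue_per_employee"] = metrics["revenue"] / metrics["headcount"]
print(metrics[["year", "hr_cost_per_revenue", "revenue_per_employee"]])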

Complex decisions regarding hiring, employee performance, career progression, internal movements and so on have a direct impact on business strategy. When HR analytics can show, based on historical data, which employee is most likely to be productive in a new role, who is most likely to accept an internal job movement, or how long it is likely to take to close a critical position, HR seamlessly aligns itself with business needs and strategy.

Join “HR Analytics Live Online Course”
Creating meaningful HR Processes

Not long ago, HR was marred by policy paralysis. Organizations had HR policies for everything, and processes were built for those policies rather than for the people who would use, manage or benefit from them. HR automation has in many ways helped organizations standardize their processes, whether for leave approval, employee escalations, reimbursements, payroll or anything else.

When we have meaningful data that gives us insights about processes, we can take the decisions that matter most to our employees. For example, one organization introduced flexible working hours for its executives simply because everyone else in the market was doing it, and because a few employees asked for it. A few weeks into the flexi-working system, they were surprised to find that most employees did not avail themselves of the benefit. The data showed that 90% of employees commute to work on the company shuttle, because the organization is located in an industrial suburb. So just by looking at the data, organizations can build processes that are meaningful rather than merely follow industry norms.

Another popular example is that of Google reducing the number of rounds of interviews based on data, thereby improving candidate experience, interviewer experience and cutting down on the lead time to hire.

Enhance Employee Experience

Insights from data across the employee lifecycle can help HR managers connect emotionally with employees, personalize their experience and more. For example, if an employee struggles to comply with certain HR policies, the data can provide timely insights into how the organization can support that employee and improve his or her experience during their tenure, creating a win-win situation.

Improve HR Effectiveness

Data insights from HR analytics can suggest which candidates are likely to be selected and which are likely to perform well if selected, enabling the business to raise its performance and success rate. Such insights can be used not only in hiring, but also in career progression, retention, learning and development and more. For example, it would be an invaluable insight if HR could suggest which employees are likely to work together without conflict when the business wants to put a new team together.

Reduce HR related costs

HR analytics can help HR managers identify blind spots where cost leakage occurs. For example, how much of an increment should we offer a candidate, or what increment slabs should the organization set so as to keep employee attrition within a certain level, and so on.

Build a Great Place to Work

Ultimately, it is every HR head's dream to build an organization that employees love to work for, one where employees wake up every morning and say, 'here's another great day'. Instead of copying and experimenting with what works for other best employers in your industry or country, delving into your own data can surface insights about what your employees love, relish and dislike.

#nilakantasrinivasan-j #canopus-business-management-group #B2B-client-centric-growth #HR-analytics #big-data #HR-metrics


Is there a difference between Six Sigma and Lean Six Sigma?

Lean and Six Sigma are close cousins in the process improvement world and have a lot in common. Here we will look at the difference between Six Sigma and Lean Six Sigma.

Six Sigma uses a data-centric, analytical approach to problem solving and process improvement. That means significant time and effort go into data collection and analysis. While this sounds logical for any problem-solving approach, there can be practical challenges.

For example, we may sometimes need data and analysis even to prove the obvious. That is wasteful.

On the other hand, Lean Six Sigma brings in some of the principles of Lean. Lean is largely a pragmatic and prescriptive approach, which means we look at the data, practically validate the problem and move on to prescriptive solutions.

Thus, combining Lean with Six Sigma helps reduce the time and effort needed to analyze or improve a situation. Lean brings a set of solutions that are tried and tested for a given situation. For example, if you have high inventory, Lean would suggest you implement Kanban.

Lean is appealing because it most often simplifies the situation, which is not always true of Six Sigma. However, the flip side of Lean is that once a system has been improved several times and has reached a certain level of performance and consistency, Lean cannot deliver further improvement unless we approach the problem with a Six Sigma lens, using extensive data collection and analysis.

Looking at the bodies of knowledge of Six Sigma and Lean Six Sigma, you will find that Lean Six Sigma courses cover the following additional tools:

To learn more, join the Free Lean Six Sigma Primer Certificate Course

To learn more, join the Green Belt Online Certification Course

To learn more, join the Black Belt Online Certification Course

#nilakantasrinivasan-j #canopus-business-management-group #B2B-client-centric-growth #Lean-six-sigma #six-sigma-green-belt-certification #six-sigma-black-belt-certification







 

Pima Diabetes Data Analytics – Neil

Here is the set of analyses that has been run on this data set:

  • Data Cleaning to remove zeros
  • Data Exploration for Y and Xs
  • Descriptive Statistics – Numerical Summary and Graphical (Histograms) for all variables
  • Screening of variables by segmenting them by Outcome
  • Check for normality of dataset
  • Study bivariate relationship between variables using pair plots, correlation and heat map
  • Statistical screening using Logistic Regression
  • Validation of the model, its precision, and plotting of the confusion matrix
 

Importing necessary packages

In [2]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(color_codes=True)
%matplotlib inline
 

Importing the Diabetes CSV data file

  • Import the data and check that all the columns are loaded
  • The data frame has been assigned the name ‘diab’
In [3]:
diab=pd.read_csv("diabetes.csv")
diab.head()
Out[3]:
  Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
0 6 148 72 35 0 33.6 0.627 50 1
1 1 85 66 29 0 26.6 0.351 31 0
2 8 183 64 0 0 23.3 0.672 32 1
3 1 89 66 23 94 28.1 0.167 21 0
4 0 137 40 35 168 43.1 2.288 33 1
 

About data set

In this data set, Outcome is the dependent variable and the remaining eight variables are independent variables.

 

Finding whether there are any null or zero values in the data set

In [29]:
diab.isnull().values.any()
## To check if data contains null values
Out[29]:
False
 

Inference:

  • The data frame doesn’t have any NaN values
  • As a next step, we will do preliminary screening of descriptive stats for the dataset
In [30]:
diab.describe()
## To run numerical descriptive stats for the data set
Out[30]:
  Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
count 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000
mean 3.845052 120.894531 69.105469 20.536458 79.799479 31.992578 0.471876 33.240885 0.348958
std 3.369578 31.972618 19.355807 15.952218 115.244002 7.884160 0.331329 11.760232 0.476951
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.078000 21.000000 0.000000
25% 1.000000 99.000000 62.000000 0.000000 0.000000 27.300000 0.243750 24.000000 0.000000
50% 3.000000 117.000000 72.000000 23.000000 30.500000 32.000000 0.372500 29.000000 0.000000
75% 6.000000 140.250000 80.000000 32.000000 127.250000 36.600000 0.626250 41.000000 1.000000
max 17.000000 199.000000 122.000000 99.000000 846.000000 67.100000 2.420000 81.000000 1.000000
 

Inference at this point

  • Minimum values for many variables are 0.
  • As biological parameters like Glucose, BP, Skin Thickness, Insulin & BMI cannot have zero values, it looks like null values have been coded as zeros
  • As a next step, find out how many Zero values are included in each variable
In [32]:
(diab.Pregnancies == 0).sum(),(diab.Glucose==0).sum(),(diab.BloodPressure==0).sum(),(diab.SkinThickness==0).sum(),(diab.Insulin==0).sum(),(diab.BMI==0).sum(),(diab.DiabetesPedigreeFunction==0).sum(),(diab.Age==0).sum()
## Counting cells with 0 Values for each variable and publishing the counts below
Out[32]:
(111, 5, 35, 227, 374, 11, 0, 0)
 

Inference:

  • As the zero counts of some of the variables are as high as 374 and 227 in a 768-row data set, it is better to remove the zeros uniformly for five variables (excl. Pregnancies & Outcome)
  • As a next step, we’ll drop the zero values and create our new dataset, which can be used for further analysis
In [4]:
## Creating a dataset called 'dia' from the original dataset 'diab' that excludes all rows with zeros in Glucose, BP, SkinThickness, Insulin and BMI, as the other columns can legitimately contain zero values.
drop_Glu=diab.index[diab.Glucose == 0].tolist()
drop_BP=diab.index[diab.BloodPressure == 0].tolist()
drop_Skin = diab.index[diab.SkinThickness==0].tolist()
drop_Ins = diab.index[diab.Insulin==0].tolist()
drop_BMI = diab.index[diab.BMI==0].tolist()
c=drop_Glu+drop_BP+drop_Skin+drop_Ins+drop_BMI
dia=diab.drop(diab.index[c])
In [35]:
dia.info()
 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 392 entries, 3 to 765
Data columns (total 9 columns):
Pregnancies                 392 non-null int64
Glucose                     392 non-null int64
BloodPressure               392 non-null int64
SkinThickness               392 non-null int64
Insulin                     392 non-null int64
BMI                         392 non-null float64
DiabetesPedigreeFunction    392 non-null float64
Age                         392 non-null int64
Outcome                     392 non-null int64
dtypes: float64(2), int64(7)
memory usage: 30.6 KB
 

Inference

  • As above, we created a cleaned-up data frame titled “dia” which has 392 rows of data instead of the original 768
  • It looks like we lost nearly 50% of the data, but our data set is now cleaner than before
  • In fact, the removed rows can be used for testing during modeling, so we haven’t really lost them completely
 

Performing Preliminary Descriptive Stats on the Data set

  • Performing 5 number summary
  • Usually, the first thing to do with a data set is to get a feel for the vital parameters of all variables, such as central tendency and dispersion
In [19]:
dia.describe()
Out[19]:
  Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
count 392.000000 392.000000 392.000000 392.000000 392.000000 392.000000 392.000000 392.000000 392.000000
mean 3.301020 122.627551 70.663265 29.145408 156.056122 33.086224 0.523046 30.864796 0.331633
std 3.211424 30.860781 12.496092 10.516424 118.841690 7.027659 0.345488 10.200777 0.471401
min 0.000000 56.000000 24.000000 7.000000 14.000000 18.200000 0.085000 21.000000 0.000000
25% 1.000000 99.000000 62.000000 21.000000 76.750000 28.400000 0.269750 23.000000 0.000000
50% 2.000000 119.000000 70.000000 29.000000 125.500000 33.200000 0.449500 27.000000 0.000000
75% 5.000000 143.000000 78.000000 37.000000 190.000000 37.100000 0.687000 36.000000 1.000000
max 17.000000 198.000000 110.000000 63.000000 846.000000 67.100000 2.420000 81.000000 1.000000
 

Split the data frame into two sub sets for convenience of analysis

  • As we wish to study the influence of each variable on Outcome (Diabetic or not), we can subset the data by Outcome
  • dia1 subset: all samples with Outcome = 1
  • dia0 subset: all samples with Outcome = 0
In [8]:
dia1 = dia[dia.Outcome==1]
dia0 = dia[dia.Outcome==0]
In [21]:
dia1
Out[21]:
  Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
4 0 137 40 35 168 43.1 2.288 33 1
6 3 78 50 32 88 31.0 0.248 26 1
8 2 197 70 45 543 30.5 0.158 53 1
13 1 189 60 23 846 30.1 0.398 59 1
14 5 166 72 19 175 25.8 0.587 51 1
16 0 118 84 47 230 45.8 0.551 31 1
19 1 115 70 30 96 34.6 0.529 32 1
24 11 143 94 33 146 36.6 0.254 51 1
25 10 125 70 26 115 31.1 0.205 41 1
31 3 158 76 36 245 31.6 0.851 28 1
39 4 111 72 47 207 37.1 1.390 56 1
43 9 171 110 24 240 45.4 0.721 54 1
53 8 176 90 34 300 33.7 0.467 58 1
56 7 187 68 39 304 37.7 0.254 41 1
70 2 100 66 20 90 32.9 0.867 28 1
88 15 136 70 32 110 37.1 0.153 43 1
99 1 122 90 51 220 49.7 0.325 31 1
109 0 95 85 25 36 37.4 0.247 24 1
110 3 171 72 33 135 33.3 0.199 24 1
111 8 155 62 26 495 34.0 0.543 46 1
114 7 160 54 32 175 30.5 0.588 39 1
120 0 162 76 56 100 53.2 0.759 25 1
125 1 88 30 42 99 55.0 0.496 26 1
128 1 117 88 24 145 34.5 0.403 40 1
130 4 173 70 14 168 29.7 0.361 33 1
132 3 170 64 37 225 34.5 0.356 30 1
152 9 156 86 28 155 34.3 1.189 42 1
159 17 163 72 41 114 40.9 0.817 47 1
165 6 104 74 18 156 29.9 0.722 41 1
171 6 134 70 23 130 35.4 0.542 29 1
584 8 124 76 24 600 28.7 0.687 52 1
588 3 176 86 27 156 33.3 1.154 52 1
595 0 188 82 14 185 32.0 0.682 22 1
603 7 150 78 29 126 35.2 0.692 54 1
606 1 181 78 42 293 40.0 1.258 22 1
611 3 174 58 22 194 32.9 0.593 36 1
612 7 168 88 42 321 38.2 0.787 40 1
614 11 138 74 26 144 36.1 0.557 50 1
638 7 97 76 32 91 40.9 0.871 32 1
646 1 167 74 17 144 23.4 0.447 33 1
647 0 179 50 36 159 37.8 0.455 22 1
648 11 136 84 35 130 28.3 0.260 42 1
655 2 155 52 27 540 38.7 0.240 25 1
659 3 80 82 31 70 34.2 1.292 27 1
662 8 167 106 46 231 37.6 0.165 43 1
663 9 145 80 46 130 37.9 0.637 40 1
689 1 144 82 46 180 46.1 0.335 46 1
693 7 129 68 49 125 38.5 0.439 43 1
695 7 142 90 24 480 30.4 0.128 43 1
696 3 169 74 19 125 29.9 0.268 31 1
709 2 93 64 32 160 38.0 0.674 23 1
715 7 187 50 33 392 33.9 0.826 34 1
716 3 173 78 39 185 33.8 0.970 31 1
722 1 149 68 29 127 29.3 0.349 42 1
730 3 130 78 23 79 28.4 0.323 34 1
732 2 174 88 37 120 44.5 0.646 24 1
740 11 120 80 37 150 42.3 0.785 48 1
748 3 187 70 22 200 36.4 0.408 36 1
753 0 181 88 44 510 43.3 0.222 26 1
755 1 128 88 39 110 36.5 1.057 37 1

130 rows × 9 columns

In [36]:
dia0
Out[36]:
  Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
3 1 89 66 23 94 28.1 0.167 21 0
18 1 103 30 38 83 43.3 0.183 33 0
20 3 126 88 41 235 39.3 0.704 27 0
27 1 97 66 15 140 23.2 0.487 22 0
28 13 145 82 19 110 22.2 0.245 57 0
32 3 88 58 11 54 24.8 0.267 22 0
35 4 103 60 33 192 24.0 0.966 33 0
40 3 180 64 25 70 34.0 0.271 26 0
50 1 103 80 11 82 19.4 0.491 22 0
51 1 101 50 15 36 24.2 0.526 26 0
52 5 88 66 21 23 24.4 0.342 30 0
54 7 150 66 42 342 34.7 0.718 42 0
57 0 100 88 60 110 46.8 0.962 31 0
59 0 105 64 41 142 41.5 0.173 22 0
63 2 141 58 34 128 25.4 0.699 24 0
68 1 95 66 13 38 19.6 0.334 25 0
69 4 146 85 27 100 28.9 0.189 27 0
71 5 139 64 35 140 28.6 0.411 26 0
73 4 129 86 20 270 35.1 0.231 23 0
82 7 83 78 26 71 29.3 0.767 36 0
85 2 110 74 29 125 32.4 0.698 27 0
87 2 100 68 25 71 38.5 0.324 26 0
91 4 123 80 15 176 32.0 0.443 34 0
92 7 81 78 40 48 46.7 0.261 42 0
94 2 142 82 18 64 24.7 0.761 21 0
95 6 144 72 27 228 33.9 0.255 40 0
97 1 71 48 18 76 20.4 0.323 22 0
98 6 93 50 30 64 28.7 0.356 23 0
103 1 81 72 18 40 26.6 0.283 24 0
105 1 126 56 29 152 28.7 0.801 21 0
673 3 123 100 35 240 57.3 0.880 22 0
679 2 101 58 17 265 24.2 0.614 23 0
680 2 56 56 28 45 24.2 0.332 22 0
682 0 95 64 39 105 44.6 0.366 22 0
685 2 129 74 26 205 33.2 0.591 25 0
688 1 140 74 26 180 24.1 0.828 23 0
692 2 121 70 32 95 39.1 0.886 23 0
698 4 127 88 11 155 34.5 0.598 28 0
700 2 122 76 27 200 35.9 0.483 26 0
704 4 110 76 20 100 28.4 0.118 27 0
707 2 127 46 21 335 34.4 0.176 22 0
710 3 158 64 13 387 31.2 0.295 24 0
711 5 126 78 27 22 29.6 0.439 40 0
713 0 134 58 20 291 26.4 0.352 21 0
718 1 108 60 46 178 35.5 0.415 24 0
721 1 114 66 36 200 38.1 0.289 21 0
723 5 117 86 30 105 39.1 0.251 42 0
726 1 116 78 29 180 36.1 0.496 25 0
733 2 106 56 27 165 29.0 0.426 22 0
736 0 126 86 27 120 27.4 0.515 21 0
738 2 99 60 17 160 36.6 0.453 21 0
741 3 102 44 20 94 30.8 0.400 26 0
742 1 109 58 18 116 28.5 0.219 22 0
744 13 153 88 37 140 40.6 1.174 39 0
745 12 100 84 33 105 30.0 0.488 46 0
747 1 81 74 41 57 46.3 1.096 32 0
751 1 121 78 39 74 39.0 0.261 28 0
760 2 88 58 26 16 28.4 0.766 22 0
763 10 101 76 48 180 32.9 0.171 63 0
765 5 121 72 23 112 26.2 0.245 30 0

262 rows × 9 columns

 

Graphical screening for variables

  • Now we will start a graphical analysis of Outcome. As the data is nominal (binary), we will run a count plot and compute the percentages of samples that are diabetic and non-diabetic
In [37]:
## creating count plot with title using seaborn
sns.countplot(x=dia.Outcome)
plt.title("Count Plot for Outcome")
Out[37]:
Text(0.5, 1.0, 'Count Plot for Outcome')
 
In [38]:
# Computing the %age of diabetic and non-diabetic in the sample
Out1=len(dia[dia.Outcome==1])   ## count of diabetic samples (Outcome=1)
Out0=len(dia[dia.Outcome==0])   ## count of non-diabetic samples (Outcome=0)
Total=Out0+Out1
PC_of_1 = Out1*100/Total
PC_of_0 = Out0*100/Total
PC_of_1, PC_of_0
Out[38]:
(33.16326530612245, 66.83673469387755)
 

Inference on screening Outcome variable

  • There are 33.2% 1’s (diabetic) and 66.8% 0’s (non-diabetic) in the cleaned data
  • As a next step, we will start screening variables
 

Graphical Screening for Variables

  • We will take each variable, one at a time and screen them in the following manner
  • Study the data distribution (histogram) of each variable – central tendency, spread and distortion (skewness & kurtosis)
  • Visually screen the association between ‘Outcome’ and each variable by plotting histograms & boxplots by Outcome value
 

Screening Variable – Pregnancies

In [40]:
## Creating 3 subplots - 1st for histogram, 2nd for histogram segmented by Outcome and 3rd for representing same segmentation using boxplot
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
sns.set_style("dark")
plt.title("Histogram for Pregnancies")
sns.distplot(dia.Pregnancies,kde=False)
plt.subplot(1,3,2)
sns.distplot(dia0.Pregnancies,kde=False,color="Blue", label="Preg for Outcome=0")
sns.distplot(dia1.Pregnancies,kde=False,color = "Gold", label = "Preg for Outcome=1")
plt.title("Histograms for Preg by Outcome")
plt.legend()
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome,y=dia.Pregnancies)
plt.title("Boxplot for Preg by Outcome")
Out[40]:
Text(0.5, 1.0, 'Boxplot for Preg by Outcome')
 
 

Inference on Pregnancies

  • Visually, the data for the count of pregnancies is right skewed. A large proportion of the participants have a zero pregnancy count. As the data set includes women > 21 yrs, it is likely that many are unmarried
  • Looking at the segmented histograms, a hypothesis is that as the number of pregnancies increases, women are more likely to be diabetic
  • In the boxplots, we find a few outliers in both subsets; in particular, some non-diabetic women have had many pregnancies. I wouldn’t be worried about these
  • To validate this hypothesis, we need a statistical test (one possible test is sketched below)
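
Not part of the original notebook: one way to test this hypothesis, assuming the dia0 and dia1 subsets defined earlier. A nonparametric Mann-Whitney U test is used here because the Pregnancies data is clearly non-normal.

## Comparing Pregnancies between the two Outcome groups with a Mann-Whitney U test
from scipy import stats
u_stat, p_val = stats.mannwhitneyu(dia1.Pregnancies, dia0.Pregnancies, alternative="two-sided")
print("Mann-Whitney U p-value:", p_val)   # p < 0.05 would support an association with Outcome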
 

Screening Variable – Glucose

In [41]:
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
plt.title("Histogram for Glucose")
sns.distplot(dia.Glucose, kde=False)
plt.subplot(1,3,2)
sns.distplot(dia0.Glucose,kde=False,color="Gold", label="Gluc for Outcome=0")
sns.distplot(dia1.Glucose, kde=False, color="Blue", label = "Gluc for Outcome=1")
plt.title("Histograms for Glucose by Outcome")
plt.legend()
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome,y=dia.Glucose)
plt.title("Boxplot for Glucose by Outcome")
Out[41]:
Text(0.5, 1.0, 'Boxplot for Glucose by Outcome')
 
 

Inference on Glucose

  • 1st graph – The histogram of Glucose is slightly right skewed. The diabetic participants (about a third of the cleaned sample) tend to have higher Glucose levels, which pulls the tail to the right. The grand mean of Glucose is 122.
  • 2nd graph – Clearly, the diabetic group has higher glucose than the non-diabetic group.
  • 3rd graph – In the boxplot, the skewness visually seems acceptable (<2) and it is also likely that the confidence intervals of the means do not overlap. So the hypothesis that Glucose is a predictor of Outcome is likely to be true, but it needs to be statistically tested.
 

Screening Variable – Blood Pressure

In [18]:
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
sns.distplot(dia.BloodPressure, kde=False)
plt.title("Histogram for Blood Pressure")
plt.subplot(1,3,2)
sns.distplot(dia0.BloodPressure,kde=False,color="Gold",label="BP for Outcome=0")
sns.distplot(dia1.BloodPressure,kde=False, color="Blue", label="BP for Outcome=1")
plt.legend()
plt.title("Histogram of Blood Pressure by Outcome")
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome,y=dia.BloodPressure)
plt.title("Boxplot of BP by Outcome")
Out[18]:
Text(0.5, 1.0, 'Boxplot of BP by Outcome')
 
 

Inference on Blood Pressure

  • 1st graph – The distribution looks normal. The mean value is 69, well within the normal diastolic value of 80. One would expect this data to be normal, but as we don’t know whether any participants are on antihypertensive medication, we can’t comment much.
  • 2nd graph – Most non-diabetic women seem to have a nominal value around 69, while diabetic women seem to have higher BP.
  • 3rd graph – There are a few outliers in the data; some people have low and some have high BP. So the association between Outcome and BP is a suspect and needs to be statistically validated.
 

Screening Variable – Skin Thickness

In [21]:
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
sns.distplot(dia.SkinThickness, kde=False)
plt.title("Histogram for Skin Thickness")
plt.subplot(1,3,2)
sns.distplot(dia0.SkinThickness, kde=False, color="Gold", label="SkinThick for Outcome=0")
sns.distplot(dia1.SkinThickness, kde=False, color="Blue", label="SkinThick for Outcome=1")
plt.legend()
plt.title("Histogram for SkinThickness by Outcome")
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome, y=dia.SkinThickness)
plt.title("Boxplot of SkinThickness by Outcome")
Out[21]:
Text(0.5, 1.0, 'Boxplot of SkinThickness by Outcome')
 
 

Inferences for Skinthickness

  • 1st graph – Skin thickness seems to be a bit skewed.
  • 2nd graph – As with BP, people who are not diabetic tend to have lower skin thickness. This is a hypothesis that has to be validated, since the non-diabetic data is skewed while the diabetic samples seem to be normally distributed.
 

Screening Variable – Insulin

In [42]:
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
sns.distplot(dia.Insulin,kde=False)
plt.title("Histogram of Insulin")
plt.subplot(1,3,2)
sns.distplot(dia0.Insulin,kde=False, color="Gold", label="Insulin for Outcome=0")
sns.distplot(dia1.Insulin,kde=False, color="Blue", label="Insulin for Outcome=1")
plt.title("Histogram for Insulin by Outcome")
plt.legend()
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome, y=dia.Insulin)
plt.title("Boxplot for Insulin by Outcome")
Out[42]:
Text(0.5, 1.0, 'Boxplot for Insulin by Outcome')
 
 

Inference for Insulin

  • Two-hour serum insulin is expected to be between 16 and 166. Clearly there are outliers in the data. These outliers are a concern for us, and most of the cases with higher insulin values are also diabetic. So this variable is a suspect.
 

Screening Variable – BMI

In [23]:
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
sns.distplot(dia.BMI, kde=False)
plt.title("Histogram for BMI")
plt.subplot(1,3,2)
sns.distplot(dia0.BMI, kde=False,color="Gold", label="BMI for Outcome=0")
sns.distplot(dia1.BMI, kde=False, color="Blue", label="BMI for Outcome=1")
plt.legend()
plt.title("Histogram for BMI by Outcome")
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome, y=dia.BMI)
plt.title("Boxplot for BMI by Outcome")
Out[23]:
Text(0.5, 1.0, 'Boxplot for BMI by Outcome')
 
 

Inference for BMI

  • 1st graph – There are a few outliers. The expected (normal) BMI range is 18 to 25; in general, participants in this dataset are on the obese side
  • 2nd graph – Diabetic people seem to lie on the higher side of BMI. They also contribute more of the outliers
  • 3rd graph – Same inference as 2nd graph
 

Screening Variable – Diabetes Pedigree Function

In [24]:
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
sns.distplot(dia.DiabetesPedigreeFunction,kde=False)
plt.title("Histogram for Diabetes Pedigree Function")
plt.subplot(1,3,2)
sns.distplot(dia0.DiabetesPedigreeFunction, kde=False, color="Gold", label="PedFunction for Outcome=0")
sns.distplot(dia1.DiabetesPedigreeFunction, kde=False, color="Blue", label="PedFunction for Outcome=1")
plt.legend()
plt.title("Histogram for DiabetesPedigreeFunction by Outcome")
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome, y=dia.DiabetesPedigreeFunction)
plt.title("Boxplot for DiabetesPedigreeFunction by Outcome")
Out[24]:
Text(0.5, 1.0, 'Boxplot for DiabetesPedigreeFunction by Outcome')
 
 

Inference of Diabetes Pedigree Function

  • I don’t know exactly what this variable represents, but it doesn’t seem to contribute much to diabetes
  • The data is skewed. I don’t know whether this parameter is expected to be normally distributed; not all natural parameters are
  • As DPF increases, the likelihood of being diabetic seems to increase, but this needs statistical validation
 

Screening Variable – Age

In [25]:
plt.figure(figsize=(20, 6))
plt.subplot(1,3,1)
sns.distplot(dia.Age,kde=False)
plt.title("Histogram for Age")
plt.subplot(1,3,2)
sns.distplot(dia0.Age,kde=False,color="Gold", label="Age for Outcome=0")
sns.distplot(dia1.Age,kde=False, color="Blue", label="Age for Outcome=1")
plt.legend()
plt.title("Histogram for Age by Outcome")
plt.subplot(1,3,3)
sns.boxplot(x=dia.Outcome,y=dia.Age)
plt.title("Boxplot for Age by Outcome")
Out[25]:
Text(0.5, 1.0, 'Boxplot for Age by Outcome')
 
 

Inference for Age

  • Age is skewed. As this is life data, it is likely to follow a Weibull distribution rather than a normal one
  • There is a tendency for people to become diabetic as they age. This needs statistical validation
  • But diabetes itself doesn’t seem to influence longevity; maybe it impacts quality of life, which is not measured in this data set
 

Normality Test

Inference: None of the variables is normally distributed (p < 0.05 for each, so normality is rejected). Some subsets may still be normal.

In [43]:
## importing stats module from scipy
from scipy import stats
## retrieving p value from normality test function
PregnanciesPVAL=stats.normaltest(dia.Pregnancies).pvalue
GlucosePVAL=stats.normaltest(dia.Glucose).pvalue
BloodPressurePVAL=stats.normaltest(dia.BloodPressure).pvalue
SkinThicknessPVAL=stats.normaltest(dia.SkinThickness).pvalue
InsulinPVAL=stats.normaltest(dia.Insulin).pvalue
BMIPVAL=stats.normaltest(dia.BMI).pvalue
DiaPeFuPVAL=stats.normaltest(dia.DiabetesPedigreeFunction).pvalue
AgePVAL=stats.normaltest(dia.Age).pvalue
## Printing the values
print("Pregnancies P Value is " + str(PregnanciesPVAL))
print("Glucose P Value is " + str(GlucosePVAL))
print("BloodPressure P Value is " + str(BloodPressurePVAL))
print("Skin Thickness P Value is " + str(SkinThicknessPVAL))
print("Insulin P Value is " + str(InsulinPVAL))
print("BMI P Value is " + str(BMIPVAL))
print("Diabetes Pedigree Function P Value is " + str(DiaPeFuPVAL))
print("Age P Value is " + str(AgePVAL))
 
Pregnancies P Value is 6.155097831782508e-20
Glucose P Value is 1.3277887088487345e-05
BloodPressure P Value is 0.030164917115239397
Skin Thickness P Value is 0.01548332935449814
Insulin P Value is 8.847272035922274e-43
BMI P Value is 1.4285556992424915e-09
Diabetes Pedigree Function P Value is 1.1325395699626466e-39
Age P Value is 1.0358469089881947e-21
 

Screening of Association between Variables to study Bivariate relationship

  • We will use pairplot to study the association between variables – from individual scatter plots
  • Then we will compute pearson correlation coefficient
  • Then we will summarize the same as heatmap
In [49]:
sns.pairplot(dia, vars=["Pregnancies", "Glucose","BloodPressure","SkinThickness","Insulin", "BMI","DiabetesPedigreeFunction", "Age"],hue="Outcome")
plt.title("Pairplot of Variables by Outcome")
Out[49]:
Text(0.5, 1.0, 'Pairplot of Variables by Outcome')
 
 

Inference from Pair Plots

  • From scatter plots, to me only BMI & SkinThickness and Pregnancies & Age seem to have positive linear relationships. Another likely suspect is Glucose and Insulin.
  • There are no non-linear relationships
  • Let’s check this with the Pearson correlation and a heat map
In [54]:
cor = dia.corr(method ='pearson')
cor
Out[54]:
  Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
Pregnancies 1.000000 0.198291 0.213355 0.093209 0.078984 -0.025347 0.007562 0.679608 0.256566
Glucose 0.198291 1.000000 0.210027 0.198856 0.581223 0.209516 0.140180 0.343641 0.515703
BloodPressure 0.213355 0.210027 1.000000 0.232571 0.098512 0.304403 -0.015971 0.300039 0.192673
SkinThickness 0.093209 0.198856 0.232571 1.000000 0.182199 0.664355 0.160499 0.167761 0.255936
Insulin 0.078984 0.581223 0.098512 0.182199 1.000000 0.226397 0.135906 0.217082 0.301429
BMI -0.025347 0.209516 0.304403 0.664355 0.226397 1.000000 0.158771 0.069814 0.270118
DiabetesPedigreeFunction 0.007562 0.140180 -0.015971 0.160499 0.135906 0.158771 1.000000 0.085029 0.209330
Age 0.679608 0.343641 0.300039 0.167761 0.217082 0.069814 0.085029 1.000000 0.350804
Outcome 0.256566 0.515703 0.192673 0.255936 0.301429 0.270118 0.209330 0.350804 1.000000
In [55]:
sns.heatmap(cor)
Out[55]:
<matplotlib.axes._subplots.AxesSubplot at 0xc4a9350>
 
 

Inference from ‘r’ values and heat map

  • No two factors have a strong linear relationship
  • Age & Pregnancies and BMI & SkinThickness have a moderate positive linear relationship
  • Glucose & Insulin technically show a correlation on the lower side, but 0.58 is close to 0.6, so it can be treated as moderate
 

Final Inference before model building

  • The data set contains many zero values; those rows have been removed and the remaining data has been used for screening and model building
  • About a third (33%) of participants in the cleaned sample are diabetic
  • Visual screening (boxplots and segmented histograms) shows that a few factors seem to influence the outcome
  • Moderate correlation exists between a few factors, and this has to be borne in mind while building the model. If correlated factors are included, it might lead to variance inflation (a quick VIF check is sketched below).

  • As a next step, a binary logistic regression model has been built
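
Not part of the original notebook: a quick variance inflation factor (VIF) check, assuming the cleaned ‘dia’ data frame and the pandas import from the start of the notebook.

## Computing VIFs for the eight predictors; VIF > 5 (some use 10) would flag problematic collinearity
from statsmodels.stats.outliers_influence import variance_inflation_factor
import statsmodels.api as sm
X_vif = sm.add_constant(dia.drop(columns="Outcome"))   # add intercept so the VIFs are meaningful
vif_values = pd.Series([variance_inflation_factor(X_vif.values, i) for i in range(X_vif.shape[1])],
                       index=X_vif.columns)
print(vif_values.round(2))   # ignore the 'const' row when interpreting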
 

Logistic Regression

  • A logistic regression is used when the dependent variable is binary, ordinal or nominal and the independent variables are either continuous or discrete
  • In this scenario, a Logit model has been used to fit the data
  • In this case an event is defined as the occurrence of ‘1’ in Outcome
  • Basically, logistic regression works with odds ratios to build the model (the fitted coefficients can be converted to odds ratios, as sketched after the first fit below)
In [5]:
cols=["Pregnancies", "Glucose","BloodPressure","SkinThickness","Insulin", "BMI","DiabetesPedigreeFunction", "Age"]
X=dia[cols]
y=dia.Outcome
In [7]:
## Importing stats models for running logistic regression
import statsmodels.api as sm
## Defining the model and assigning Y (Dependent) and X (Independent Variables)
logit_model=sm.Logit(y,X)
## Fitting the model and publishing the results
result=logit_model.fit()
print(result.summary())
 
Optimization terminated successfully.
         Current function value: 0.563677
         Iterations 6
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                Outcome   No. Observations:                  392
Model:                          Logit   Df Residuals:                      384
Method:                           MLE   Df Model:                            7
Date:                Thu, 09 May 2019   Pseudo R-squ.:                  0.1128
Time:                        15:31:51   Log-Likelihood:                -220.96
converged:                       True   LL-Null:                       -249.05
                                        LLR p-value:                 8.717e-10
============================================================================================
                               coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------
Pregnancies                  0.1299      0.049      2.655      0.008       0.034       0.226
Glucose                      0.0174      0.005      3.765      0.000       0.008       0.026
BloodPressure               -0.0484      0.009     -5.123      0.000      -0.067      -0.030
SkinThickness                0.0284      0.015      1.898      0.058      -0.001       0.058
Insulin                      0.0019      0.001      1.598      0.110      -0.000       0.004
BMI                         -0.0365      0.022     -1.669      0.095      -0.079       0.006
DiabetesPedigreeFunction     0.4636      0.344      1.347      0.178      -0.211       1.138
Age                          0.0005      0.016      0.031      0.976      -0.031       0.032
============================================================================================
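
Not part of the original notebook: the fitted coefficients above can be converted to odds ratios, assuming the ‘result’ object from the cell above and the numpy import from the start of the notebook.

## Each odds ratio is the multiplicative change in the odds of Outcome=1 per one-unit increase in that variable
odds_ratios = np.exp(result.params)
print(odds_ratios.round(3))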
 

Inference from the Logistic Regression

  • The pseudo R-squared of the model is 0.1128, i.e. the predictors explain only a modest share of the variation in the dependent variable
  • To identify which variables influence the outcome, we look at the p-value of each variable. We expect the p-value to be less than 0.05 (the alpha risk)
  • When the p-value < 0.05, we can say the variable influences the outcome
  • Hence we will eliminate Diabetes Pedigree Function, Age and Insulin, and re-run the model
 

2nd iteration of the Logistic Regression with fewer variables

In [76]:
cols2=["Pregnancies", "Glucose","BloodPressure","SkinThickness","BMI"]
X=dia[cols2]
In [77]:
logit_model=sm.Logit(y,X)
result=logit_model.fit()
print(result.summary2())
 
Optimization terminated successfully.
         Current function value: 0.569365
         Iterations 5
                         Results: Logit
=================================================================
Model:              Logit            Pseudo R-squared: 0.104     
Dependent Variable: Outcome          AIC:              456.3820  
Date:               2019-05-05 22:48 BIC:              476.2383  
No. Observations:   392              Log-Likelihood:   -223.19   
Df Model:           4                LL-Null:          -249.05   
Df Residuals:       387              LLR p-value:      1.5817e-10
Converged:          1.0000           Scale:            1.0000    
No. Iterations:     5.0000                                       
-----------------------------------------------------------------
                   Coef.  Std.Err.    z    P>|z|   [0.025  0.975]
-----------------------------------------------------------------
Pregnancies        0.1291   0.0374  3.4489 0.0006  0.0557  0.2024
Glucose            0.0215   0.0040  5.4447 0.0000  0.0138  0.0293
BloodPressure     -0.0507   0.0089 -5.6868 0.0000 -0.0682 -0.0332
SkinThickness      0.0299   0.0149  2.0073 0.0447  0.0007  0.0592
BMI               -0.0313   0.0215 -1.4537 0.1460 -0.0734  0.0109
=================================================================

 

Inference from 2nd Iteration

  • The p-value of BMI (0.146) is greater than 0.05, so we will now eliminate BMI and re-run the model
 

3rd iteration of Logistic Regression

In [78]:
cols3=["Pregnancies", "Glucose","BloodPressure","SkinThickness"]
X=dia[cols3]
logit_model=sm.Logit(y,X)
result=logit_model.fit()
print(result.summary())
 
Optimization terminated successfully.
         Current function value: 0.572076
         Iterations 5
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                Outcome   No. Observations:                  392
Model:                          Logit   Df Residuals:                      388
Method:                           MLE   Df Model:                            3
Date:                Sun, 05 May 2019   Pseudo R-squ.:                 0.09956
Time:                        22:49:35   Log-Likelihood:                -224.25
converged:                       True   LL-Null:                       -249.05
                                        LLR p-value:                 9.769e-11
=================================================================================
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
Pregnancies       0.1403      0.037      3.820      0.000       0.068       0.212
Glucose           0.0199      0.004      5.297      0.000       0.013       0.027
BloodPressure    -0.0571      0.008     -7.242      0.000      -0.073      -0.042
SkinThickness     0.0160      0.011      1.404      0.160      -0.006       0.038
=================================================================================
 

Inference from 3rd Iteration

  • Now the p-value of SkinThickness (0.160) is greater than 0.05, hence we will eliminate it and re-run the model
 

4th Iteration of Logistic Regression

In [79]:
cols4=["Pregnancies", "Glucose","BloodPressure"]
X=dia[cols4]
logit_model=sm.Logit(y,X)
result=logit_model.fit()
print(result.summary())
 
Optimization terminated successfully.
         Current function value: 0.574607
         Iterations 5
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                Outcome   No. Observations:                  392
Model:                          Logit   Df Residuals:                      389
Method:                           MLE   Df Model:                            2
Date:                Sun, 05 May 2019   Pseudo R-squ.:                 0.09558
Time:                        22:49:53   Log-Likelihood:                -225.25
converged:                       True   LL-Null:                       -249.05
                                        LLR p-value:                 4.597e-11
=================================================================================
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
Pregnancies       0.1405      0.037      3.826      0.000       0.069       0.212
Glucose           0.0210      0.004      5.709      0.000       0.014       0.028
BloodPressure    -0.0525      0.007     -7.449      0.000      -0.066      -0.039
=================================================================================
 

Inference from 4th Run

  • Now the model is clear. We have three variables that influence the Outcome: Pregnancies, Glucose and BloodPressure
  • Luckily, none of these three variables is strongly correlated with another, hence we can safely assume that the model is not inflated
In [34]:
## Importing LogisticRegression from sklearn.linear_model, as the statsmodels function does not give us a classification report and confusion matrix
from sklearn.linear_model import LogisticRegression
logreg = LogisticRegression()
cols4=["Pregnancies", "Glucose","BloodPressure"]
X=dia[cols4]
y=dia.Outcome
logreg.fit(X,y)
## Defining y_pred for the predicted values. I have used the full 392-row 'dia' dataset here; we could also score a separate test dataset
y_pred=logreg.predict(X)
## Calculating the precision of the model
from sklearn.metrics import classification_report
print(classification_report(y,y_pred))
 
              precision    recall  f1-score   support

           0       0.79      0.89      0.84       262
           1       0.71      0.53      0.61       130

   micro avg       0.77      0.77      0.77       392
   macro avg       0.75      0.71      0.72       392
weighted avg       0.77      0.77      0.76       392

 
C:\Users\Neil\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:433: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
  FutureWarning)
 

The overall (weighted average) precision of the model is 77%

In [35]:
from sklearn.metrics import confusion_matrix
## The confusion matrix gives the number of cases where the model accurately predicts the outcomes (both 1 and 0) and the number of false positives and false negatives
cm = confusion_matrix(y, y_pred)   # avoid shadowing the imported function name
print(cm)
 
[[234  28]
 [ 61  69]]
 

The result tells us that 234 + 69 = 303 predictions are correct and 61 + 28 = 89 are incorrect.
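
Not part of the original notebook: a few summary rates derived from the confusion matrix above (the numbers are taken directly from the printed matrix, with rows = actual 0/1).

tn, fp, fn, tp = 234, 28, 61, 69
accuracy    = (tp + tn) / (tp + tn + fp + fn)   # ~0.77, matches the report above
sensitivity = tp / (tp + fn)                    # recall for Outcome=1, ~0.53
specificity = tn / (tn + fp)                    # recall for Outcome=0, ~0.89
print(round(accuracy, 2), round(sensitivity, 2), round(specificity, 2))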

 

 

Is TQM relevant in the age of Artificial Intelligence & Industry 4.0?

 

Digital Transformation, Artificial Intelligence, Industry 4.0, IoT, RPA and the like are buzzwords that send shivers down the spines of many executives. To be fair, many are also excited about the future and the opportunities these tools and methods present.

One side of the coin

A few months ago, I spoke with the Head of Business Excellence of an MNC in the manufacturing sector where TQM and similar practices are deeply rooted. He said that this year their focus is Industry 4.0 and there are no budgets for any other initiative. In his view, TQM, Lean Six Sigma and the like are concepts past their half-life: in this new age everything will be automated sooner or later, so no Kaizens will be needed, no Six Sigma DMAIC projects, and no Value Stream Mapping. And since automated processes are highly efficient, there will be no need for improvement projects. He had a point. Instead of dismissing the idea or accepting it outright, it is worth considering how to navigate these new-age developments.

Now the other side of the coin

Another friend of mine, who steers strategy and business development for a global digital transformation solutions provider across sectors, recently reached out to me. His quest was to find ways to help their clients speed up the adoption of digital technologies and reduce internal resistance. He said the problem was to do with culture. Here is a quick summary of what transpired:

So, there is no doubt that new digital technologies will put you in a new orbit, but soon that orbit will become the slow lane. In the ’90s the ERP wave swept the industry, then came CRM, then BI, then Cloud, then Big Data, and now it is AI, Robotics & IoT.

So ultimately these technology tools enable the business, but nothing can beat an organization that has the following competencies ingrained in its culture:

Whether you call it Agile, DevOps, Six Sigma, Lean, TQM or BE, these frameworks rely on the same fundamental principles mentioned above.

So, to sum up, TQM and other such Business Excellence frameworks are enablers of Digital Transformation and cannot be replaced by AI, IoT or Industry 4.0.

When we talked again a few months later, he said they were still strategizing on Industry 4.0 and had not started any real work.

We have created an assessment to evaluate the Digital Transformation Culture of an organization. There are 3 broad areas –

  1. Leadership Aspiration Index – What leaders are aspiring vs reality
  2. Management Practices Index – What management practices are needed vs reality
  3. People Perception Index – What people think vs reality

The gap assessment will be carried out in the following manner:
