Employing Cluster Analysis and Discriminant Analysis in Data Classification and Building Discriminant Functions

Document type: Research in psychology and mental health

Author

Department of Psychology, College of Education, Taif University, Kingdom of Saudi Arabia

Abstract

This study applied cluster analysis and discriminant function analysis to classify student data into high and low academic performance groups and to assess the accuracy of the resulting cluster classification. The study followed a descriptive approach. The sample was a random sample of data for 62 male and female students at Umm Al-Qura University for the academic year 1434/1435 AH. K-means cluster analysis yielded two clusters. In the first cluster, which contained 28 cases, the distance between cases and the cluster center ranged from 2.968 to 19.775. In the second cluster, which contained 34 cases, the distance ranged from 1.919 to 12.084. The cluster analysis showed that variables X2 and X3 were important in assigning cases to clusters; both were statistically significant at the 0.01 level. The discriminant analysis confirmed the importance of the independent variables X2 and X3 in classifying cases. The discriminant function had a canonical correlation of 0.770, which indicates the strength of the relationship between the variables in the analysis, with a corresponding eigenvalue of 1.453, and the function explained 100% of the variance. Wilks' lambda was 0.408 and the chi-square value was 52.042, statistically significant at the 0.01 level, which indicates that the discriminant function can distinguish between the two groups.
When the classification of cases was checked against the classification produced by the cluster analysis, it was found to be correct for 98.4% of cases, a very high rate that confirms the accuracy of the classification. The study also presented a set of recommendations and proposals.
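The reported statistics can be cross-checked against one another. The following is a minimal sketch, not the study's own procedure (the references suggest SPSS was used). It relies on the standard identities for a single discriminant function: canonical correlation = sqrt(eigenvalue / (1 + eigenvalue)) and Wilks' lambda = 1 / (1 + eigenvalue), plus Bartlett's chi-square approximation. The function names are illustrative. The number of predictors is not stated in the abstract, so the value of 4 used here is an assumption, chosen only because it comes close to the reported chi-square. The count of 61 correct cases is likewise inferred from 98.4% of 62.

```python
import math

# Values reported in the abstract.
EIGENVALUE = 1.453
N_CASES = 62
CLUSTER_SIZES = (28, 34)


def canonical_correlation(eigenvalue):
    """Canonical correlation implied by the eigenvalue of one discriminant function."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def wilks_lambda(eigenvalue):
    """Wilks' lambda when there is a single discriminant function (two groups)."""
    return 1.0 / (1.0 + eigenvalue)


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups=2):
    """Bartlett's chi-square approximation for testing Wilks' lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


rc = canonical_correlation(EIGENVALUE)
lam = wilks_lambda(EIGENVALUE)

# ASSUMPTION: four predictors. The abstract does not state this number.
chi2 = bartlett_chi_square(lam, N_CASES, n_predictors=4)

# 98.4% of 62 cases is consistent with 61 correctly classified cases.
accuracy = 100.0 * 61 / N_CASES

print("cluster sizes sum to sample size:", sum(CLUSTER_SIZES) == N_CASES)
print("canonical correlation:", round(rc, 3))   # reported: 0.770
print("Wilks' lambda:", round(lam, 3))          # reported: 0.408
print("chi-square (assuming 4 predictors):", round(chi2, 2))  # reported: 52.042
print("classification accuracy (%):", round(accuracy, 1))     # reported: 98.4
```

Under these assumptions the eigenvalue, canonical correlation, and Wilks' lambda in the abstract agree with each other to three decimal places, and the two cluster sizes add up to the sample of 62.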

