TY  - JOUR
AU  - Vilenchik, Dan
AU  - Yichye, Barak
AU  - Abutbul, Maor
PY  - 2019/07/06
Y2  - 2024/03/28
TI  - To Interpret or Not to Interpret PCA? This Is Our Question
JF  - Proceedings of the International AAAI Conference on Web and Social Media
JA  - ICWSM
VL  - 13
IS  - 01
SE  - Poster Papers
DO  - 10.1609/icwsm.v13i01.3265
UR  - https://ojs.aaai.org/index.php/ICWSM/article/view/3265
SP  - 655-658
AB  - <p>Principal Component Analysis (PCA) is a central tool for analyzing data, and social media data in particular. Typically, the data is projected onto the first two PCs to obtain a two-dimensional view, and trends and patterns are examined. A key to making sense of the projected data is the semantic interpretation of the new axes (the PCs). To label the PCs, one usually looks at the top <em>k</em> vector entries in absolute value and assigns meaning according to them. The choice of <em>k</em> is made by “eyeballing” the vector. In this work we provide a computational framework to support this process and suggest an <em>interpretability score</em>, which measures how sensitive the interpretation step could be to the choice of <em>k</em>. Furthermore, we give a visual method to choose the optimal <em>k</em>. We study our methodology on four social media platforms and discover that in two of them, Twitter and Instagram, interpretation can be done in a carefree manner, but in Steam and LinkedIn there is no natural labeling of the axes. This separation is clearly reflected in the interpretability score that each dataset received.</p>
ER  - 