Handbook of Learning Analytics

Chapter 7

First Edition

Content Analytics: The Definition, Scope, and an Overview of Published Research

Vitomir Kovanović, Srećko Joksimović, Dragan Gašević, Marek Hatala, & George Siemens


The field of learning analytics has recently attracted attention from educational practitioners and researchers interested in using large amounts of learning data to understand learning processes and improve learning and teaching practices. In this chapter, we introduce content analytics, a particular form of learning analytics focused on the analysis of different forms of educational content. We provide the definition and scope of content analytics and a comprehensive summary of the significant content analytics studies in the published literature to date. Given the early stage of the learning analytics field, this chapter focuses on the key problems and challenges for which existing content analytics approaches are suitable and have been used successfully. We also reflect on current trends in content analytics and their position within the broader domain of educational research.


References (127)

Allen, L. K., Snow, E. L., & McNamara, D. S. (2014). The long and winding road: Investigating the differential writing patterns of high and low skilled writers. In J. Stamper, Z. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining (EDM2014), 4–7 July, London, UK (pp. 304–307). International Educational Data Mining Society.

Allen, L. K., Snow, E. L., & McNamara, D. S. (2015). Are you reading my mind? Modeling students’ reading comprehension skills with natural language processing techniques. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 246–254). New York: ACM. doi:10.1145/2723576.2723617

Anderson, T., & Dron, J. (2012). Learning technology through three generations of technology enhanced distance education pedagogy. European Journal of Open, Distance and E-Learning, 2012(II), 1–14.

Antonelli, F., & Sapino, M. L. (2005). A rule based approach to message board topics classification. In K. S. Candan & A. Celentano (Eds.), Advances in Multimedia Information Systems (pp. 33–48). Springer. http://link.springer.com/chapter/10.1007/11551898_6

Bateman, S., Brooks, C., McCalla, G., & Brusilovsky, P. (2007). Applying collaborative tagging to e-learning. Workshop held at the 16th International World Wide Web Conference (WWW2007), 8–12 May 2007, Banff, AB, Canada. http://www.www2007.org/workshops/paper_56.pdf

Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84. doi:10.1145/2133806.2133826

Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

Blikstein, P. (2011). Using learning analytics to assess students’ behavior in open-ended programming tasks. Proceedings of the 1st International Conference on Learning Analytics and Knowledge (LAK ʼ11), 27 February–1 March 2011, Banff, AB, Canada (pp. 110–116). New York: ACM. doi:10.1145/2090116.2090132

Bosnić, I., Verbert, K., & Duval, E. (2010). Automatic keywords extraction: A basis for content recommendation. Proceedings of the 4th International Workshop on Search and Exchange of e-le@rning Materials (SE@M’10), 27–28 September 2010, Barcelona, Spain (pp. 51–60). http://citeseerx.ist.psu.edu/viewdoc/download?doi=

Bramucci, R., & Gaston, J. (2012). Sherpa: Increasing student success with a recommendation engine. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (LAK ʼ12), 29 April–2 May 2012, Vancouver, BC, Canada (pp. 82–83). New York: ACM. doi:10.1145/2330601.2330625

Brooks, C., Amundson, K., & Greer, J. (2009). Detecting significant events in lecture video using supervised machine learning. Proceedings of the 2009 Conference on Artificial Intelligence in Education: Building Learning Systems That Care: From Knowledge Representation to Affective Modelling (pp. 483–490). Amsterdam, The Netherlands: IOS Press. http://dl.acm.org/citation.cfm?id=1659450.1659523

Brooks, C., Johnston, G. S., Thompson, C., & Greer, J. (2013). Detecting and categorizing indices in lecture video using supervised machine learning. In O. R. Zaïane & S. Zilles (Eds.), Advances in Artificial Intelligence (pp. 241–247). Springer. http://link.springer.com/chapter/10.1007/978-3-642-38457-8_22

Buckingham Shum, S., De Laat, M. F., De Liddo, A., Ferguson, R., Kirschner, P., Ravenscroft, A., … Whitelock, D. (2013). DCLA13: 1st International Workshop on Discourse-Centric Learning Analytics. Proceedings of the 3rd International Conference on Learning Analytics and Knowledge (LAK ’13), 8–12 April 2013, Leuven, Belgium (pp. 282–282). New York: ACM. doi:10.1145/2460296.2460357

Buckingham Shum, S., & Ferguson, R. (2012). Social learning analytics. Journal of Educational Technology & Society, 15(3), 3–26.

Buckingham Shum, S., Knight, S., McNamara, D., Allen, L., Bektik, D., & Crossley, S. (2016). Critical perspectives on writing analytics. Proceedings of the 6th International Conference on Learning Analytics & Knowledge (LAK ʼ16), 25–29 April 2016, Edinburgh, UK (pp. 481–483). New York ACM. doi:10.1145/2883851.2883854

Burrows, S., Gurevych, I., & Stein, B. (2014). The eras and trends of automatic short answer grading. International Journal of Artificial Intelligence in Education, 25(1), 60–117. doi:10.1007/s40593-014-0026-8

Calvo, R., Aditomo, A., Southavilay, V., & Yacef, K. (2012). The use of text and process mining techniques to study the impact of feedback on students’ writing processes. Proceedings of the 10th International Conference of the Learning Sciences (ICLS ʼ12), 2–6 July 2012, Sydney, Australia (pp. 416–423).

Cardinaels, K., Meire, M., & Duval, E. (2005). Automating metadata generation: The simple indexing interface. Proceedings of the 14th International Conference on World Wide Web (WWW ’05), 10–14 May 2005, Chiba, Japan (pp. 548–556). ACM. http://dl.acm.org/citation.cfm?id=1060825

Charleer, S., Santos, J. L., Klerkx, J., & Duval, E. (2014). Improving teacher awareness through activity, badge and content visualizations. In Y. Cao, T. Väljataga, J. K..T. Tang, H. Leung, M. Laanpere (Eds.), New Horizons in Web Based Learning (pp. 143–152). Springer. http://link.springer.com/chapter/10.1007/978-3-319-13296-9_16

Chatti, M. A., Dyckhoff, A. L., Schroeder, U., & Thüs, H. (2012). A reference model for learning analytics. International Journal of Technology Enhanced Learning, 4(5/6), 318–331. doi:10.1504/IJ℡.2012.051815

Chen, B. (2014). Visualizing semantic space of online discourse: The Knowledge Forum case. Proceedings of the 4th International Conference on Learning Analytics and Knowledge (LAK ʼ14), 24–28 March 2014, Indianapolis, IN, USA (pp. 271–272). New York: ACM. doi:10.1145/2567574.2567595

Chen, B., Chen, X., & Xing, W. (2015). “Twitter archeology” of Learning Analytics and Knowledge conferences. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 340–349). New York: ACM. doi:10.1145/2723576.2723584

Chiu, M. M., & Fujita, N. (2014a). Statistical discourse analysis: A method for modeling online discussion processes. Journal of Learning Analytics, 1(3), 61–83.

Chiu, M. M., & Fujita, N. (2014b). Statistical discourse analysis of online discussions: Informal cognition, social metacognition and knowledge creation. Proceedings of the 4th International Conference on Learning Analytics and Knowledge (LAK ʼ14), 24–28 March 2014, Indianapolis, IN, USA (pp. 217–225). New York: ACM. doi:10.1145/2567574.2567580

Cohen, D. J., Troyano, J. F., Hoffman, S., Wieringa, J., Meeks, E., & Weingart, S. (Eds.). (2012). Special Issue on Topic Modeling in Digital Humanities. Journal of Digital Humanities, 2(1).

Cook, D. A., Garside, S., Levinson, A. J., Dupras, D. M., & Montori, V. M. (2010). What do we mean by web-based learning? A systematic review of the variability of interventions. Medical Education, 44(8), 765–774. doi:10.1111/j.1365-2923.2010.03723.x

Corich, S., Hunt, K., & Hunt, L. (2012). Computerised content analysis for measuring critical thinking within discussion forums. Journal of E-Learning and Knowledge Society, 2(1), 47–60.

Crossley, S., Allen, L. K., Snow, E. L., & McNamara, D. S. (2015). Pssst… Textual features… There is more to automatic essay scoring than just you! Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 203–207). New York: ACM. doi:10.1145/2723576.2723595

Crossley, S., Roscoe, R., & McNamara, D. S. (2014). What is successful writing? An investigation into the multiple ways writers can write successful essays. Written Communication, 31(2), 184–214. doi:10.1177/0741088314526354

Crossley, S., Varner, L. K., Roscoe, R. D., & McNamara, D. S. (2013). Using automated indices of cohesion to evaluate an intelligent tutoring system and an automated writing evaluation system. In H. C. Lane, K. Yacef, J. Mostow, & P. Pavlik (Eds.), Artificial Intelligence in Education (pp. 269–278). Springer. http://link.springer.com/chapter/10.1007/978-3-642-39112-5_28

Cui, Y., & Wise, A. F. (2015). Identifying content-related threads in MOOC discussion forums. Proceedings of the 2nd ACM Conference on Learning @ Scale (L@S 2015), 14–18 March 2015, Vancouver, BC, Canada (pp. 299–303). New York: ACM. doi:10.1145/2724660.2728679

De Freitas, S. (2007). Post-16 e-learning content production: A synthesis of the literature. British Journal of Educational Technology, 38(2), 349–364. doi:10.1111/j.1467-8535.2006.00632.x

De Laat, M. F., Lally, V., Lipponen, L., & Simons, R.-J. (2007). Investigating patterns of interaction in networked learning and computer-supported collaborative learning: A role for Social Network Analysis. International Journal of Computer-Supported Collaborative Learning, 2(1), 87–103. doi:10.1007/s11412-007-9006-4

De Wever, B., Schellens, T., Valcke, M., & Van Keer, H. (2006). Content analysis schemes to analyze transcripts of online asynchronous discussion groups: A review. Computers & Education, 46(1), 6–28.

Di Eugenio, B., Fossati, D., Haller, S., Yu, D., & Glass, M. (2008). Be brief, and they shall learn: Generating concise language feedback for a computer tutor. International Journal of Artificial Intelligence in Education, 18(4), 317–345.

Donnelly, R., & Gardner, J. (2011). Content analysis of computer conferencing transcripts. Interactive Learning Environments, 19(4), 303–315.

Dowell, N., Skrypnyk, O., Joksimović, S., Graesser, A. C., Dawson, S., Gašević, D., … Kovanović, V. (2015). Modeling learners’ social centrality and performance through language and discourse. In O. C. Santos, J. G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, P. Mitros, J. M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.), Proceedings of the 8th International Conference on Education Data Mining (EDM2015), 26–29 June 2015, Madrid, Spain (pp. 250–258). International Educational Data Mining Society. http://www.educationaldatamining.org/EDM2015/uploads/papers/paper_211.pdf

Drachsler, H., Hummel, H. G. K., & Koper, R. (2008). Personal recommender systems for learners in lifelong learning networks: The requirements, techniques and model. International Journal of Learning Technology, 3(4), 404–423. doi:10.1504/IJLT.2008.019376

Dufty, D. F., Graesser, A. C., Louwerse, M., & McNamara, D. S. (2006). Assigning grade levels to textbooks: Is it just readability? In R. Sun & N. Miyake (Eds.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006), 26–29 July 2006, Vancouver, British Columbia, Canada (pp. 1251–1256). Austin, TX: Cognitive Science Society.

Dzikovska, M., Steinhauser, N., Farrow, E., Moore, J., & Campbell, G. (2014). BEETLE II: Deep natural language understanding and automatic feedback generation for intelligent tutoring in basic electricity and electronics. International Journal of Artificial Intelligence in Education, 24(3), 284–332. doi:10.1007/s40593-014-0017-9

Ellis, C. (2013). Broadening the scope and increasing the usefulness of learning analytics: The case for assessment analytics. British Journal of Educational Technology, 44(4), 662–664. doi:10.1111/bjet.12028

Ezen-Can, A., Boyer, K. E., Kellogg, S., & Booth, S. (2015). Unsupervised modeling for understanding MOOC discussion forums: A learning analytics approach. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 146–150). New York: ACM. doi:10.1145/2723576.2723589

Ferguson, R. (2012). Learning analytics: Drivers, developments and challenges. International Journal of Technology Enhanced Learning, 4(5/6), 304. doi:10.1504/IJTEL.2012.051816

Ferguson, R., & Buckingham Shum, S. (2012). Social learning analytics: Five approaches. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (LAK ʼ12), 29 April–2 May 2012, Vancouver, BC, Canada (pp. 23–33). New York: ACM. doi:10.1145/2330601.2330616

Foltz, P. W., Laham, D., Landauer, T. K., Foltz, P. W., Laham, D., & Landauer, T. K. (1999). Automated essay scoring: Applications to educational technology. In B. Collis & R. Oliver (Eds.), Proceedings of EdMedia: World Conference on Educational Media and Technology 1999, 19–24 June 1999, Seattle, WA, USA (pp. 939–944). Association for the Advancement of Computing in Education (AACE). https://www.learntechlib.org/p/6607

Foltz, P. W., & Rosenstein, M. (2015). Analysis of a large-scale formative writing assessment system with automated feedback. Proceedings of the 2nd ACM Conference on Learning @ Scale (L@S 2015), 14–18 March 2015, Vancouver, BC, Canada (pp. 339–342). New York: ACM. doi:10.1145/2724660.2728688

Garrison, D. R., Anderson, T., & Archer, W. (2001). Critical thinking, cognitive presence, and computer conferencing in distance education. American Journal of Distance Education, 15(1), 7–23.

Gašević, D., Dawson, S., & Siemens, G. (2015). Let’s not forget: Learning analytics are about learning. TechTrends, 59(1), 64–71. doi:10.1007/s11528-014-0822-x

Gašević, D., Mirriahi, N., & Dawson, S. (2014). Analytics of the effects of video use and instruction to support reflective learning. Proceedings of the 4th International Conference on Learning Analytics and Knowledge (LAK ʼ14), 24–28 March 2014, Indianapolis, IN, USA (pp. 123–132). New York: ACM. doi:10.1145/2567574.2567590

Graesser, A. C., McNamara, D. S., & Kulikowich, J. M. (2011). Coh-Metrix: Providing multilevel analyses of text characteristics. Educational Researcher, 40(5), 223–234. doi:10.3102/0013189X11413260

Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112. doi:10.3102/003465430298487

Hecking, T., & Hoppe, H. U. (2015). A network based approach for the visualization and analysis of collaboratively edited texts. Proceedings of the Workshop on Visual Aspects of Learning Analytics (VISLA ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. XXX–XXX). http://ceur-ws.org/Vol-1518/paper4.pdf

Hever, R., De Groot, R., De Laat, M., Harrer, A., Hoppe, U., McLaren, B. M., & Scheuer, O. (2007). Combining structural, process-oriented and textual elements to generate awareness indicators for graphical e-discussions. In C. Chinn, G. Erkens, & S. Puntambekar (Eds.), Proceedings of the 7th International Conference on Computer-Supported Collaborative Learning (CSCL 2007), 16–21 July 2007, New Brunswick, NJ, USA (pp. 289–291). International Society of the Learning Sciences. http://dl.acm.org/citation.cfm?id=1599600.1599654

Hong, L., & Davison, B. D. (2010). Empirical study of topic modeling in Twitter. Proceedings of the 1st Workshop on Social Media Analytics (SOMA ’10), 25–28 July 2010, Washington, DC, USA (pp. 80–88). New York: ACM. doi:10.1145/1964858.1964870

Hosseini, R., & Brusilovsky, P. (2014). Example-based problem solving support using concept analysis of programming content. In S. Trausan-Matu, K. E. Boyer, M. Crosby, & K. Panourgia (Eds.), Intelligent Tutoring Systems (pp. 683–685). Springer. http://link.springer.com/chapter/10.1007/978-3-319-07221-0_106

Hsiao, I.-H., & Awasthi, P. (2015). Topic facet modeling: Semantic visual analytics for online discussion forums. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 231–235). New York: ACM. doi:10.1145/2723576.2723613

Joksimović, S., Dowell, N., Skrypnyk, O., Kovanović, V., Gašević, D., Dawson, S., & Graesser, A. C. (2015). Exploring the accumulation of social capital in cMOOC through language and discourse. Under Review.

Joksimović, S., Gašević, D., Kovanović, V., Adesope, O., & Hatala, M. (2014). Psychological characteristics in cognitive presence of communities of inquiry: A linguistic analysis of online discussions. The Internet and Higher Education, 22, 1–10.

Joksimović, S., Kovanović, V., Jovanović, J., Zouaq, A., Gašević, D., & Hatala, M. (2015). What do cMOOC participants talk about in social media? A topic analysis of discourse in a cMOOC. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 156–165). New York: ACM.

Knight, S., & Littleton, K. (2015). Discourse centric learning analytics: Mapping the terrain. Journal of Learning Analytics, 2(1), 185–209.

Kovanović, V., Joksimović, S., Gašević, D., & Hatala, M. (2014). Automated content analysis of online discussion transcripts. In K. Yacef & H. Drachsler (Eds.), Proceedings of the Workshops at the LAK 2014 Conference (LAK-WS 2014), 24–28 March 2014, IN, Indiana, USA. http://ceur-ws.org/Vol-1137/LA_machinelearning_submission_1.pdf

Kovanović, V., Joksimović, S., Waters, Z., Gašević, D., Kitto, K., Hatala, M., & Siemens, G. (2016). Towards automated content analysis of discussion transcripts: A cognitive presence case. Proceedings of the 6th International Conference on Learning Analytics & Knowledge (LAK ʼ16), 25–29 April 2016, Edinburgh, UK (pp. 15–24). New York: ACM. doi:10.1145/2883851.2883950

Krippendorff, K. H. (2003). Content analysis: An introduction to its methodology. Thousand Oaks, CA: Sage Publications.

Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25(2–3), 259–284. doi:10.1080/01638539809545028

Lárusson, J. A., & White, B. (2012). Monitoring student progress through their written “point of originality.” Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (LAK ʼ12), 29 April–2 May 2012, Vancouver, BC, Canada (pp. 212–221). New York: ACM. doi:10.1145/2330601.2330653

Leeman-Munk, S. P., Wiebe, E. N., & Lester, J. C. (2014). Assessing elementary students’ science competency with text analytics. Proceedings of the 4th International Conference on Learning Analytics and Knowledge (LAK ʼ14), 24–28 March 2014, Indianapolis, IN, USA (pp. 143–147). New York: ACM. doi:10.1145/2567574.2567620

Manouselis, N., Drachsler, H., Vuorikari, R., Hummel, H., & Koper, R. (2011). Recommender systems in technology enhanced learning. In F. Ricci, L. Rokach, B. Shapira, & P. B. Kantor (Eds.), Recommender systems handbook (pp. 387–415). Springer. http://link.springer.com/chapter/10.1007/978-0-387-85820-3_12

McKlin, T. (2004). Analyzing cognitive presence in online courses using an artificial neural network. Georgia State University, College of Education, Atlanta, GA, United States. https://pdfs.semanticscholar.org/d6af/c0073f2efc53bb2e46a0dd39a677027b1c3d.pdf

McKlin, T., Harmon, S., Evans, W., & Jones, M. (2002, March 21). Cognitive presence in web-based learning: A content analysis of students’ online discussions. IT Forum, 60. https://pdfs.semanticscholar.org/037b/f466c1c2290924e0ba00eec14520c091b57e.pdf

McNamara, D. S., Crossley, S., & McCarthy, P. M. (2009). Linguistic features of writing quality. Written Communication, 27(1), 57–86. doi:10.1177/0741088309351547

McNamara, D. S., Graesser, A. C., McCarthy, P. M., & Cai, Z. (2014). Automated evaluation of text and discourse with Coh-Metrix. Cambridge, UK: Cambridge University Press.

McNamara, D. S., Kintsch, E., Songer, N. B., & Kintsch, W. (1996). Are good texts always better? Interactions of text coherence, background knowledge, and levels of understanding in learning from text. Cognition and Instruction, 14(1), 1–43. doi:10.1207/s1532690xci1401_1

McNamara, D. S., Raine, R., Roscoe, R., Crossley, S., Jackson, G. T., Dai, J., … others. (2012). The Writing-Pal: Natural language algorithms to support intelligent tutoring on writing strategies. In P. M. McCarthy & C. Boonthum-Denecke (Eds.), Applied natural language processing: Identification, investigation and resolution (pp. 298–311). Hershey, PA: IGI Global.

Mehrotra, R., Sanner, S., Buntine, W., & Xie, L. (2013). Improving LDA topic models for microblogs via tweet pooling and automatic labeling. Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’13), 28 July–1 August 2013, Dublin, Ireland (pp. 889–892). New York: ACM. doi:10.1145/2484028.2484166

Mintz, L., Stefanescu, D., Feng, S., D’Mello, S., & Graesser, A. (2014). Automatic assessment of student reading comprehension from short summaries. In J. Stamper, Z. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining (EDM2014), 4–7 July, London, UK. International Educational Data Mining Society. http://www.educationaldatamining.org/conferences/index.php/EDM/2014/paper/view/1372/1338

Mirriahi, N., & Dawson, S. (2013). The pairing of lecture recording data with assessment scores: A method of discovering pedagogical impact. Proceedings of the 3rd International Conference on Learning Analytics and Knowledge (LAK ’13), 8–12 April 2013, Leuven, Belgium (pp. 180–184). New York: ACM. doi:10.1145/2460296.2460331

Moore, M. G. (1989). Editorial: Three types of interaction. American Journal of Distance Education, 3(2), 1–7. doi:10.1080/08923648909526659

Muldner, K., & Conati, C. (2010). Scaffolding meta-cognitive skills for effective analogical problem solving via tailored example selection. International Journal of Artificial Intelligence in Education, 20(2), 99–136.

Niemann, K., Schmitz, H.-C., Kirschenmann, U., Wolpers, M., Schmidt, A., & Krones, T. (2012). Clustering by usage: Higher order co-occurrences of learning objects. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (LAK ʼ12), 29 April–2 May 2012, Vancouver, BC, Canada (pp. 238–247). New York: ACM. doi:10.1145/2330601.2330659

Pérez-Marín, D., & Pascual-Nieto, I. (2010). Showing automatically generated students’ conceptual models to students and teachers. International Journal of Artificial Intelligence in Education, 20(1), 47–72.

Petrushyna, Z., Kravcik, M., & Klamma, R. (2011). Learning analytics for communities of lifelong learners: A forum case. Proceedings of the 11th IEEE International Conference on Advanced Learning Technologies (ICALT ʼ11), 6–8 July 2011, Athens, GA, USA (pp. 609–610). IEEE. doi:10.1109/ICALT.2011.185

Pham, M. C., Derntl, M., Cao, Y., & Klamma, R. (2012). Learning analytics for learning blogospheres. In E. Popescu, Q. Li, R. Klamma, H. Leung, & M. Specht (Eds.), Advances in Web-Based Learning: ICWL 2012 (pp. 258–267). Springer. http://link.springer.com/chapter/10.1007/978-3-642-33642-3_28

Ramachandran, L., Cheng, J., & Foltz, P. (2015). Identifying patterns for short answer scoring using graph-based lexico-semantic text matching. Proceedings of the 10th Workshop on Innovative Use of NLP for Building Educational Applications (NAACL-HLT 2015), 4 June 2015, Denver, CO, USA (pp. 97–106). http://www.aclweb.org/anthology/W15-0612

Ramachandran, L., & Foltz, P. (2015). Generating reference texts for short answer scoring using graph-based summarization. Proceedings of the 10th Workshop on Innovative Use of NLP for Building Educational Applications (NAACL-HLT 2015), 4 June 2015, Denver, CO, USA (pp. 207–212). http://www.aclweb.org/anthology/W15-0624

Ramage, D., Dumais, S. T., & Liebling, D. J. (2010). Characterizing microblogs with topic models. In W. W. Cohen & S. Gosling (Eds.), Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10) 23–26 May 2010, Washington, DC, USA (pp. XXX–XXX). Palo Alto, CA: AAAI Press. http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/view/1528/1846

Ramage, D., Rosen, E., Chuang, J., Manning, C. D., & McFarland, D. A. (2009). Topic modeling for the social sciences. Workshop on Applications for Topic Models: Text and Beyond (NIPS 2009), 11 December 2009, Whistler, BC, Canada. https://ed.stanford.edu/sites/default/files/mcfarland/tmt-nips09-20091122+21-29-34.pdf

Ramesh, A., Goldwasser, D., Huang, B., Daumé III, H., & Getoor, L. (2013). Modeling learner engagement in MOOCs using probabilistic soft logic. NIPS Workshop on Data Driven Education (NIPS-DDE 2013), 9 December 2013, Lake Tahoe, NV, USA. https://www.umiacs.umd.edu/~hal/docs/daume13engagementmooc.pdf

Reich, J., Tingley, D., Leder-Luis, J., Roberts, M. E., & Stewart, B. (2014). Computer-assisted reading and discovery for student generated text in massive open online courses. Journal of Learning Analytics, 2(1), 156–184.

Robinson, R. L., Navea, R., & Ickes, W. (2013). Predicting final course performance from students’ written self-introductions: A LIWC analysis. Journal of Language and Social Psychology, 32(4). doi:10.1177/0261927X13476869

Romero, C., & Ventura, S. (2010). Educational data mining: A review of the state of the art. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 40(6), 601–618. doi:10.1109/TSMCC.2010.2053532

Rosen, D., Miagkikh, V., & Suthers, D. (2011). Social and semantic network analysis of chat logs. Proceedings of the 1st International Conference on Learning Analytics and Knowledge (LAK ʼ11), 27 February–1 March 2011, Banff, AB, Canada (pp. 134–139). New York: ACM. doi:10.1145/2090116.2090137

Roy, D., Sarkar, S., & Ghose, S. (2008). Automatic extraction of pedagogic metadata from learning content. International Journal of Artificial Intelligence in Education, 18(2), 97–118.

Shea, P., Hayes, S., Vickers, J., Gozza-Cohen, M., Uzuner, S., Mehta, R., … Rangan, P. (2010). A re-examination of the community of inquiry framework: Social network and content analysis. The Internet and Higher Education, 13(1–2), 10–21. doi:10.1016/j.iheduc.2009.11.002

Sherin, B. (2012). Using computational methods to discover student science conceptions in interview data. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (LAK ʼ12), 29 April–2 May 2012, Vancouver, BC, Canada (pp. 188–197). New York: ACM. doi:10.1145/2330601.2330649

Siemens, G., Gašević, D., Haythornthwaite, C., Dawson, S., Buckingham Shum, S., Ferguson, R., … Baker, R. S. J. d. (2011, July 28). Open learning analytics: An integrated & modularized platform. SoLAR Concept Paper. http://www.elearnspace.org/blog/wp-content/uploads/2016/02/ProposalLearningAnalyticsModel_SoLAR.pdf

Simsek, D., Buckingham Shum, S., De Liddo, A., Ferguson, R., & Sándor, Á. (2014). Visual analytics of academic writing. Proceedings of the 4th International Conference on Learning Analytics and Knowledge (LAK ʼ14), 24–28 March 2014, Indianapolis, IN, USA (pp. 265–266). New York: ACM. http://dl.acm.org/citation.cfm?id=2567577

Simsek, D., Buckingham Shum, S., Sandor, A., De Liddo, A., & Ferguson, R. (2013). XIP Dashboard: Visual analytics from automated rhetorical parsing of scientific metadiscourse. Presented at the 1st International Workshop on Discourse-Centric Learning Analytics, 8 April 2013, Leuven, Belgium. http://oro.open.ac.uk/37391/1/LAK13-DCLA-Simsek.pdf

Simsek, D., Sandor, A., Buckingham Shum, S., Ferguson, R., De Liddo, A., & Whitelock, D. (2015). Correlations between automated rhetorical analysis and tutors’ grades on student essays. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 355–359). New York: ACM. http://dl.acm.org/citation.cfm?id=2723603

Snow, E. L., Allen, L. K., Jacovina, M. E., Perret, C. A., & McNamara, D. S. (2015). You’ve got style: Detecting writing flexibility across time. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 194–202). New York: ACM. doi:10.1145/2723576.2723592

Southavilay, V., Yacef, K., & Calvo, R. A. (2009). WriteProc: A framework for exploring collaborative writing processes. Proceedings of the 14th Australasian Document Computing Symposium (ADCS 2009), 4 December 2009, Sydney, NSW, Australia (pp. 129–136). New York: ACM. http://es.csiro.au/adcs2009/proceedings/poster-presentation/09-southavilay.pdf

Southavilay, V., Yacef, K., & Calvo, R. A. (2010). Analysis of collaborative writing processes using hidden Markov models and semantic heuristics. In W. Fan, W. Hsu, G. I. Webb, B. Liu, C. Zhang, D. Gunopulos, & X. Wu (Eds.), Proceedings of the 2010 IEEE International Conference on Data Mining Workshops (ICDMW 2010), 14 December 2010, Sydney, Australia (pp. 543–548). doi:10.1109/ICDMW.2010.118

Southavilay, V., Yacef, K., Reimann, P., & Calvo, R. A. (2013). Analysis of collaborative writing processes using revision maps and probabilistic topic models. Proceedings of the 3rd International Conference on Learning Analytics and Knowledge (LAK ’13), 8–12 April 2013, Leuven, Belgium (pp. 38–47). New York: ACM. doi:10.1145/2460296.2460307

Stamper, J., Barnes, T., & Croy, M. (2010). Enhancing the automatic generation of hints with expert seeding. In V. Aleven, J. Kay, & J. Mostow (Eds.), Intelligent tutoring systems (pp. 31–40). Springer. http://link.springer.com/chapter/10.1007/978-3-642-13437-1_4

Strijbos, J.-W., Martens, R. L., Prins, F. J., & Jochems, W. M. G. (2006). Content analysis: What are they talking about? Computers & Education, 46(1), 29–48.

Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29. doi:10.1177/0261927X09351676

Teplovs, C., Fujita, N., & Vatrapu, R. (2011). Generating predictive models of learner community dynamics. Proceedings of the 1st International Conference on Learning Analytics and Knowledge (LAK ʼ11), 27 February–1 March 2011, Banff, AB, Canada (pp. 147–152). New York: ACM. doi:10.1145/2090116.2090139

Varner, L. K., Jackson, G. T., Snow, E. L., & McNamara, D. S. (2013). Linguistic content analysis as a tool for improving adaptive instruction. In H. C. Lane, K. Yacef, J. Mostow, & P. Pavlik (Eds.), Artificial Intelligence in Education (pp. 692–695). Springer Berlin Heidelberg. doi:10.1007/978-3-642-39112-5_90

Vega, B., Feng, S., Lehman, B., Graesser, A., & D’Mello, S. (2013). Reading into the text: Investigating the influence of text complexity on cognitive engagement. In S. K. DʼMello, R. A. Calvo, & A. Olney (Eds.), Proceedings of the 6th International Conference on Educational Data Mining (EDM2013), 6–9 July, Memphis, TN, USA (pp. 296–299). International Educational Data Mining Society/Springer.

Velasquez, N. F., Fields, D. A., Olsen, D., Martin, T., Shepherd, M. C., Strommer, A., & Kafai, Y. B. (2014). Novice programmers talking about projects: What automated text analysis reveals about online scratch users’ comments. Proceedings of the 47th Hawaii International Conference on System Sciences (HICSS-47), 6–9 January 2014, Waikoloa, HI, USA (pp. 1635–1644). IEEE Computer Society. doi:10.1109/HICSS.2014.209

Verbert, K., Manouselis, N., Ochoa, X., Wolpers, M., Drachsler, H., Bosnic, I., & Duval, E. (2012). Context-aware recommender systems for learning: A survey and future challenges. IEEE Transactions on Learning Technologies, 5(4), 318–335. doi:10.1109/TLT.2012.11

Walker, A., Recker, M. M., Lawless, K., & Wiley, D. (2004). Collaborative information filtering: A review and an educational application. International Journal of Artificial Intelligence in Education, 14(1), 3–28.

Waters, Z. (2015). Using structural features to improve the automated detection of cognitive presence in online learning discussions (B.Sc. Thesis). Queensland University of Technology.

Wen, M., Yang, D., & Rosé, C. (2014a). Sentiment analysis in MOOC discussion forums: What does it tell us? In J. Stamper, Z. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining (EDM2014), 4–7 July, London, UK. International Educational Data Mining Society. http://www.cs.cmu.edu/~mwen/papers/edm2014-camera-ready.pdf

Wen, M., Yang, D., & Rosé, C. P. (2014b). Linguistic reflections of student engagement in massive open online courses. Proceedings of the 8th International AAAI Conference on Weblogs and Social Media (ICWSM ’14), 1–4 June 2014, Ann Arbor, Michigan, USA (pp. 525–534). Palo Alto, CA: AAAI Press. http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8057

Weragama, D., & Reye, J. (2014). Analysing student programs in the PHP intelligent tutoring system. International Journal of Artificial Intelligence in Education, 24(2), 162–188. doi:10.1007/s40593-014-0014-z

Whitelock, D., Field, D., Pulman, S., Richardson, J. T. E., & Van Labeke, N. (2014). Designing and testing visual representations of draft essays for higher education students. 2nd International Workshop on Discourse-Centric Learning Analytics (DCLA14), 24 March 2014, Indianapolis, IN, USA. http://oro.open.ac.uk/41845/

Whitelock, D., Twiner, A., Richardson, J. T. E., Field, D., & Pulman, S. (2015). OpenEssayist: A supply and demand learning analytics tool for drafting academic essays. Proceedings of the 5th International Conference on Learning Analytics and Knowledge (LAK ʼ15), 16–20 March 2015, Poughkeepsie, NY, USA (pp. 208–212). New York: ACM. doi:10.1145/2723576.2723599

Wolfe, M. B. W., Schreiner, M. E., Rehder, B., Laham, D., Foltz, P. W., Kintsch, W., & Landauer, T. K. (1998). Learning from text: Matching readers and texts by latent semantic analysis. Discourse Processes, 25(2–3), 309–336. doi:10.1080/01638539809545030

Worsley, M., & Blikstein, P. (2011). What’s an expert? Using learning analytics to identify emergent markers of expertise through automated speech, sentiment and sketch analysis. In M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, & J. Stamper (Eds.), Proceedings of the 4th Annual Conference on Educational Data Mining (EDM2011), 6–8 July 2011, Eindhoven, The Netherlands (pp. 235–240). International Educational Data Mining Society.

Xu, X., Murray, T., Park Woolf, B., & Smith, D. (2013). If you were me and I were you: Mining social deliberation in online communication. In S. K. D’Mello, R. A. Calvo, & A. Olney (Eds.), Proceedings of the 6th International Conference on Educational Data Mining (EDM2013), 6–9 July, Memphis, TN, USA (pp. 208–216). International Educational Data Mining Society/Springer.

Yan, X., Guo, J., Lan, Y., & Cheng, X. (2013). A biterm topic model for short texts. Proceedings of the 22nd International Conference on World Wide Web (WWW ’13), 13–17 May 2013, Rio de Janeiro, Brazil (pp. 1445–1456). New York: ACM.

Yang, D., Wen, M., Kumar, A., Xing, E. P., & Rosé, C. P. (2014). Towards an integration of text and graph clustering methods as a lens for studying social interaction in MOOCs. The International Review of Research in Open and Distributed Learning, 15(5), 214–234.

Yang, D., Wen, M., & Rosé, C. (2014). Towards identifying the resolvability of threads in MOOCs. Proceedings of the Workshop on Modeling Large Scale Social Interaction in Massively Open Online Courses at the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), 25 October 2014, Doha, Qatar (pp. 21–31). http://www.aclweb.org/anthology/W/W14/W14-41.pdf#page=28

Yoo, J., & Kim, J. (2012). Predicting learner’s project performance with dialogue features in online Q&A discussions. In S. A. Cerri, W. J. Clancey, G. Papadourakis, & K. Panourgia (Eds.), Intelligent tutoring systems (pp. 570–575). Springer. http://link.springer.com/chapter/10.1007/978-3-642-30950-2_74

Yoo, J., & Kim, J. (2013). Can online discussion participation predict group project performance? Investigating the roles of linguistic features and participation patterns. International Journal of Artificial Intelligence in Education, 24(1), 8–32. doi:10.1007/s40593-013-0010-8

Zaldivar, V. A. R., García, R. M. C., Burgos, D., Kloos, C. D., & Pardo, A. (2011). Automatic discovery of complementary learning resources. In C. D. Kloos, D. Gillet, R. M. C. García, F. Wild, & M. Wolpers (Eds.), Towards ubiquitous learning (pp. 327–340). Springer. http://link.springer.com/chapter/10.1007/978-3-642-23985-4_26

Zhao, W. X., Jiang, J., Weng, J., He, J., Lim, E.-P., Yan, H., & Li, X. (2011). Comparing Twitter and traditional media using topic models. In P. Clough, C. Foley, C. Gurrin, G. Jones, W. Kraaij, H. Lee, & V. Murdock (Eds.), Proceedings of the 33rd European Conference on Advances in Information Retrieval (ECIR 2011), 18–21 April 2011, Dublin, Ireland (pp. 338–349). Springer. http://dl.acm.org/citation.cfm?id=1996889.1996934

About this Chapter

Content Analytics: The Definition, Scope, and an Overview of Published Research

Book Title
Handbook of Learning Analytics

pp. 77–92




Society for Learning Analytics Research

Vitomir Kovanović¹
Srećko Joksimović²
Dragan Gašević¹˒²
Marek Hatala³
George Siemens⁴

Author Affiliations
1. School of Informatics, University of Edinburgh, UK
2. Moray House School of Education, University of Edinburgh, UK
3. School of Interactive Arts and Technology, Simon Fraser University, Canada
4. LINK Research Lab, University of Texas at Arlington, USA

Charles Lang⁵
George Siemens⁴
Alyssa Wise⁶
Dragan Gašević¹˒²

Editor Affiliations
5. Teachers College, Columbia University, USA
6. Learning Analytics Research Network, New York University, USA

Society for Learning Analytics Research (SoLAR)