diff --git a/research/misconceptions/Makefile b/research/misconceptions/Makefile index 3daa2115..646cc45c 100644 --- a/research/misconceptions/Makefile +++ b/research/misconceptions/Makefile @@ -1,9 +1,18 @@ +LATEXFLAGS= -shell-escape +TEX_PYTHONTEX= yes + .PHONY: all all: article.pdf +article.pdf: article.tex article.pdf: bibliography.bib article.pdf: preamble.tex +article.pdf: introduction.tex +article.pdf: background.tex +article.pdf: method.tex +article.pdf: results-overview.tex + article.pdf: classes.tex article.pdf: conditionals.tex article.pdf: functions-variables.tex @@ -12,10 +21,10 @@ article.pdf: repetitions.tex article.pdf: tools.tex article.pdf: types.tex -article.pdf: article.tex - latexmk -pdf $< - .PHONY: clean clean: latexmk -C ${RM} article.bbl article.run.xml + +INCLUDE_MAKEFILES?=../../makefiles +include ${INCLUDE_MAKEFILES}/tex.mk diff --git a/research/misconceptions/article.tex b/research/misconceptions/article.tex index 48b73ebf..34117fd5 100644 --- a/research/misconceptions/article.tex +++ b/research/misconceptions/article.tex @@ -4,107 +4,113 @@ % This is a simple template for a LaTeX document using the "article" class. % See "book", "report", "letter" for other types of document. -\documentclass[twocolumn]{article} +\documentclass[onecolumn]{article} \usepackage[utf8]{inputenc} % set input encoding (not needed with XeLaTeX) \usepackage[english]{babel} + + \input{preamble.tex} \title{Student misconceptions in programming through the lens of variation theory} -\author{Celina Soori and Daniel Bosk (at the moment)} +\author{% + Celina Soori and + Daniel Bosk\thanks{% + Oskar Ejderby, + Leandros Grigoriadis, + Björn Hickman, + Yousef Hilal, + Beata Johansson, + Mazen Mardini, and + Yasmine Schüllerqvist + helped record students' misconceptions during the course. 
+ }% +} %\date{} % Activate to display a given date or no date (if empty), % otherwise the current date is printed \begin{document} + +\definecolor{light-green}{HTML}{D5F5E3} %this definition needed to be inside the document + \maketitle +\newpage \tableofcontents +\newpage -\section{Introduction} +\input{introduction} -We study misconceptions in introductory programming from the perspective of -variation theory. -We first survey the existing literature on misconceptions in introductory -programming. -We then analyze these misconceptions using variation theory. -According to variation theory \parencite[Ch.~2]{NCOL}, each educational -objective can be divided into different \emph{aspects}. -Consider the educational objective \enquote{the student should be able to use -functions}. -One aspect of this particular educational objective is local and global scope -of variables. -Another aspect is returning values from a function. -For a student to achieve the educational objective, she must be able to discern -the different aspects of the educational objective. -Aspects that the student hasn't yet discerned are critical aspects. -One necessary condition for learning is that the student is introduced to a -series of patterns of variation in the dimension of each critical aspect. -Misconceptions are examples of when a student has failed to discern (at least) -one critical aspect. -This allows us to use misconceptions to inform our designs when designing -teaching according to variation theory. -This study answers the following questions: -\begin{enumerate} - \item What misconceptions in introductory programming has been identifies in - the literature? - \item Based on these misconceptions, what aspects (in terms of variation - theory) of introductory programming can we identify? - \item Where do we need further research? 
-\end{enumerate} +\input{background} -\section{Prerequisites: definitions} +\input{method} -Added this to explain different words we use in the text and how we interpreted them and use them. Maybe we dont need subsections for every word, but I started this way. -Maybe it will be needed, maybe not. Uncertain at the moment. +\input{results-overview.tex} -\subsection{Variation Theory} +\input{functions-variables.tex} +\input{classes.tex} +\input{repetitions.tex} +\input{types.tex} +\input{conditionals.tex} +\input{findings.tex} -Here I'm thinking we should describe what variation theory is so that we can use it in the analysis later on. -\subsection{Misconception} -Do we need to describe what we mean with misconception? Do we maybe want to use another word? +\newpage +\printbibliography -\subsection{Computational thinking} +\newpage -Maybe we want to include something about computational thinking in this article? +\appendix -\subsection{Conceptual Change Theory} +\section{Misconceptions not yet included} -An interesting method to unlearn misconceptions and instead learn the correct conception. Mentioned by Qian \& Lehman. +\input{problem-solving.tex} +\input{tools.tex} +\section{Misconceptions I have noticed that are not mentioned} -\section{Surveying the literature on misconceptions} +\begin{itemize} + \item When returning a value from a function, students fail to +capture it in a variable and do not understand why they cannot use the +returned value later in the programme. + \item A local list in a function will be changed when passed to a +function and then changed in that function. This is because it is a +reference to the list that is sent to the new function, not a copy of +the list. +\end{itemize} -I'm thinking we need a method-section to describe how this research has been -conducted. Something about the literature collecting and so on. +\section{Random stuff removed from the article} -Indeed, you'll need to describe the search queries and the selection criteria. 
+\begin{itemize} + \item There is also a misconception that variables can hold more than +one value at a time \parencite{Doukakis2007}. This misconception can +relate to several things: + \begin{enumerate*} + \item the type system, confusing lists with non-container types, not + seeing a + list as a type itself; + \item the scope of variables, that the same variable identifier can be + used + for different things in different scopes. + \end{enumerate*} + \item Students that know that a return-statement is needed have +difficulties returning the right value or variable from a function +\parencite{KumarVeerasamy2016}. -\section{Important modules in CS1} +\end{itemize} -This section is divided to reflect the different concepts that are teached during CS1. Each section will describe what students often are meant to learn and understand in that module, which is then followed with a summary of what different studies have found is difficult for students in that module. XXX And then we also will try to analyze the findings with the help of variation theory. -\input{functions-variables.tex} -\input{classes.tex} -\input{repetitions.tex} -\input{types.tex} -\input{conditionals.tex} -\input{problem-solving.tex} -\input{tools.tex} -\section{Analysis} -The analyze will be performed with the help of variation theory. -\section{Discussion} -Here we can discuss what we want to analyze and research further. We might also want to discuss how to design the education to avoid everything above. -\printbibliography \end{document} + + diff --git a/research/misconceptions/background.tex b/research/misconceptions/background.tex new file mode 100644 index 00000000..12eb8a58 --- /dev/null +++ b/research/misconceptions/background.tex @@ -0,0 +1,67 @@ +\section{Theoretical background} + +\subsection{Definition of a misconception} + +In order to summarise misconceptions found in earlier research it is +important to define what a misconception is and how the term is used in this +article. 
According to \textcite{NCOL}, a misconception arises when the student +understands some critical aspects but misunderstands others. Another definition states that +\textcquote[p.~1]{KumarVeerasamy2016}{\textins{a} misconception is an +erroneous +belief, which is not true or valid}. This definition is also used by +\textcite{MisconceptionsSurvey2017}, who specify it in a programming +context to include aspects of syntax, concepts, control flow, learned +constructs and debugging programs. A misconception can also include errors +in +conceptual understanding of programming. As one can see, the definition of +a misconception is quite broad, and the term will be used in this article to include +all +errors, misunderstandings, difficulties and so forth. + +\subsection{Variation Theory} + +Learning $\leftarrow$ discernment $\leftarrow$ variation \textbf{XXX This is in your slides +Daniel, but I could not find in the book where Marton draws this +relationship between the three.} + +Step-by-step patterns in variation theory (with the example of learning +what the colour green is): + +\begin{enumerate} + \item Contrast (vary the critical aspect): show a picture of a +green circle and a picture of a blue circle. + \item Generalisation (vary the non-critical aspect): show several +figures that are green, but have different shapes. Here the green colour +is invariant, and we have variation in the shape of the figures. +This pattern is called induction if it comes before contrast. + \item Fusion (vary both critical and non-critical aspects): show +different figures that vary in colour and in shape, to fuse the two +earlier steps. +\end{enumerate} + +One can vary the order of these steps, but it has been found to be more +efficient to start with contrast. + +\textbf{Things that we might need to write about here, if we want to use them +later in the analysis} + +\begin{itemize} + \item Test in a different scenario from the one in which the students were taught. 
+For example: When learning to throw a ball, teach the students in other +places than in the testing place. + \item Grouping: Teaching similar concepts within a subject at the same +time. Students learn more when being taught the differences between +similar concepts/methods within a subject, instead of learning the +concepts separately. + \item Changing variables to pay attention to. Students can learn better +if they focus on one specific attribute of a concept at a time, instead +of trying to understand the entire concept in one iteration. +\end{itemize} + + + + + + + + diff --git a/research/misconceptions/bibliography.bib b/research/misconceptions/bibliography.bib index cd48ef8e..e1583981 100644 --- a/research/misconceptions/bibliography.bib +++ b/research/misconceptions/bibliography.bib @@ -118,7 +118,8 @@ @techreport{Sleeman1984 @article{Ragonis2005OOP, author = { Noa Ragonis and Mordechai Ben-Ari }, - title = {A long-term investigation of the comprehension of OOP concepts by novices}, + title = {A long-term investigation of the comprehension of OOP concepts by +novices}, journal = {Computer Science Education}, volume = {15}, number = {3}, @@ -140,7 +141,8 @@ @article{hatala2003practice } @article{KumarVeerasamy2016, author = {Ashok Kumar Veerasamy and Daryl D'Souza and Mikko-Jussi Laakso}, - title ={Identifying Novice Student Programming Misconceptions and Errors From Summative Assessments}, + title ={Identifying Novice Student Programming Misconceptions and Errors From +Summative Assessments}, journal = {Journal of Educational Technology Systems}, volume = {45}, number = {1}, @@ -241,7 +243,8 @@ @article{Doukakis2007 pages = {}, title = {Understanding the programming variable concept with animated interactive analogies}, -booktitle = {Proceedings of the 8th Hellenic European Research on Computer Mathematics \& Its Applications Conference (HERCMA’07)} +booktitle = {Proceedings of the 8th Hellenic European Research on Computer +Mathematics \& Its Applications 
Conference (HERCMA’07)} } @@ -249,15 +252,29 @@ @article{Doukakis2007 @inproceedings{Brown2014, author = {Brown, Neil C.C. and Altadmri, Amjad}, -title = {Investigating Novice Programming Mistakes: Educator Beliefs vs. Student Data}, +title = {Investigating Novice Programming Mistakes: Educator Beliefs vs. +Student Data}, year = {2014}, isbn = {9781450327558}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/2632320.2632343}, doi = {10.1145/2632320.2632343}, -abstract = {Educators often form opinions on which programming mistakes novices make most often - for example, in Java: "they always confuse equality with assignment", or "they always call methods with the wrong types". These opinions are generally based solely on personal experience. We report a study to determine if programming educators form a consensus about which Java programming mistakes are the most common. We used the Blackbox data set to check whether the educators' opinions matched data from over 100,000 students - and checked whether this agreement was mediated by educators' experience. We found that educators formed only a weak consensus about which mistakes are most frequent, that their rankings bore only a moderate correspondence to the students in the Blackbox data, and that educators' experience had no effect on this level of agreement. These results raise questions about claims educators make regarding which errors students are most likely to commit.}, -booktitle = {Proceedings of the Tenth Annual Conference on International Computing Education Research}, +abstract = {Educators often form opinions on which programming mistakes novices +make most often - for example, in Java: "they always confuse equality with +assignment", or "they always call methods with the wrong types". These opinions +are generally based solely on personal experience. 
We report a study to +determine if programming educators form a consensus about which Java +programming mistakes are the most common. We used the Blackbox data set to +check whether the educators' opinions matched data from over 100,000 students - +and checked whether this agreement was mediated by educators' experience. We +found that educators formed only a weak consensus about which mistakes are most +frequent, that their rankings bore only a moderate correspondence to the +students in the Blackbox data, and that educators' experience had no effect on +this level of agreement. These results raise questions about claims educators +make regarding which errors students are most likely to commit.}, +booktitle = {Proceedings of the Tenth Annual Conference on International +Computing Education Research}, pages = {43–50}, numpages = {8}, keywords = {educators, programming mistakes}, @@ -265,7 +282,8 @@ @inproceedings{Brown2014 series = {ICER '14} } @inproceedings{Kaczmarczyk2010, -author = {Kaczmarczyk, Lisa C. and Petrick, Elizabeth R. and East, J. Philip and Herman, Geoffrey L.}, +author = {Kaczmarczyk, Lisa C. and Petrick, Elizabeth R. and East, J. Philip +and Herman, Geoffrey L.}, title = {Identifying Student Misconceptions of Programming}, year = {2010}, isbn = {9781450300063}, @@ -273,11 +291,25 @@ @inproceedings{Kaczmarczyk2010 address = {New York, NY, USA}, url = {https://doi.org/10.1145/1734263.1734299}, doi = {10.1145/1734263.1734299}, -abstract = {Computing educators are often baffled by the misconceptions that their CS1 students hold. We need to understand these misconceptions more clearly in order to help students form csorrect conceptions. This paper describes one stage in the development of a concept inventory for Computing Fundamentals: investigation of student misconceptions in a series of core CS1 topics previously identified as both important and difficult. 
Formal interviews with students revealed four distinct themes, each containing many interesting misconceptions. Three of those misconceptions are detailed in this paper: two misconceptions about memory models, and data assignment when primitives are declared. Individual misconceptions are related, but vary widely, thus providing excellent material to use in the development of the CI. In addition, CS1 instructors are provided immediate usable material for helping their students understand some difficult introductory concepts.}, -booktitle = {Proceedings of the 41st ACM Technical Symposium on Computer Science Education}, +abstract = {Computing educators are often baffled by the misconceptions that +their CS1 students hold. We need to understand these misconceptions more +clearly in order to help students form csorrect conceptions. This paper +describes one stage in the development of a concept inventory for Computing +Fundamentals: investigation of student misconceptions in a series of core CS1 +topics previously identified as both important and difficult. Formal interviews +with students revealed four distinct themes, each containing many interesting +misconceptions. Three of those misconceptions are detailed in this paper: two +misconceptions about memory models, and data assignment when primitives are +declared. Individual misconceptions are related, but vary widely, thus +providing excellent material to use in the development of the CI. 
In addition, +CS1 instructors are provided immediate usable material for helping their +students understand some difficult introductory concepts.}, +booktitle = {Proceedings of the 41st ACM Technical Symposium on Computer +Science Education}, pages = {107–111}, numpages = {5}, -keywords = {cs1, programming, misconceptions, pedagogy, concept inventory, curriculum}, +keywords = {cs1, programming, misconceptions, pedagogy, concept inventory, +curriculum}, location = {Milwaukee, Wisconsin, USA}, series = {SIGCSE '10} } @@ -291,8 +323,23 @@ @inproceedings{Holland1997 address = {New York, NY, USA}, url = {https://doi.org/10.1145/268084.268132}, doi = {10.1145/268084.268132}, -abstract = {This paper identifies and describes a number of misconceptions observed in students learning about object technology. It identifies simple, concrete, measures course designers and teachers can take to avoid these misconceptions arising. The context for this work centres on an introductory undergraduate course and a postgraduate course. Both these courses are taught by distance education. These courses both use Smalltalk as an introduction to object technology. More particularly, the undergraduate course uses Smalltalk as a first programming language.Distance education can limit the amount and speed of individual feedback that can be given in the early stages of learning. For this reason, particular attention has been paid to characterizing measures for avoiding elementary misconceptions seen in beginning learners. At the same time we also address some misconceptions observed in postgraduate students. 
The pedagogical issues discussed are of particular importance when devising an extended series of examples for teaching or assessment, or when designing a visual microworld to be used for teaching purposes.}, -booktitle = {Proceedings of the Twenty-Eighth SIGCSE Technical Symposium on Computer Science Education}, +abstract = {This paper identifies and describes a number of misconceptions +observed in students learning about object technology. It identifies simple, +concrete, measures course designers and teachers can take to avoid these +misconceptions arising. The context for this work centres on an introductory +undergraduate course and a postgraduate course. Both these courses are taught +by distance education. These courses both use Smalltalk as an introduction to +object technology. More particularly, the undergraduate course uses Smalltalk +as a first programming language.Distance education can limit the amount and +speed of individual feedback that can be given in the early stages of learning. +For this reason, particular attention has been paid to characterizing measures +for avoiding elementary misconceptions seen in beginning learners. At the same +time we also address some misconceptions observed in postgraduate students. 
The +pedagogical issues discussed are of particular importance when devising an +extended series of examples for teaching or assessment, or when designing a +visual microworld to be used for teaching purposes.}, +booktitle = {Proceedings of the Twenty-Eighth SIGCSE Technical Symposium on +Computer Science Education}, pages = {131–134}, numpages = {4}, location = {San Jose, California, USA}, @@ -308,7 +355,8 @@ @inproceedings{Fleury1991 address = {New York, NY, USA}, url = {https://doi.org/10.1145/107004.107066}, doi = {10.1145/107004.107066}, -booktitle = {Proceedings of the Twenty-Second SIGCSE Technical Symposium on Computer Science Education}, +booktitle = {Proceedings of the Twenty-Second SIGCSE Technical Symposium on +Computer Science Education}, pages = {283–286}, numpages = {4}, location = {San Antonio, Texas, USA}, @@ -324,7 +372,8 @@ @inproceedings{Sekiya2013 address = {New York, NY, USA}, url = {https://doi.org/10.1145/2526968.2526978}, doi = {10.1145/2526968.2526978}, - booktitle = {Proceedings of the 13th Koli Calling International Conference on Computing Education Research}, + booktitle = {Proceedings of the 13th Koli Calling International Conference on +Computing Education Research}, pages = {87--95}, numpages = {9}, keywords = {tracing, misconception, CS1, novice programmers}, @@ -332,6 +381,34 @@ @inproceedings{Sekiya2013 series = {Koli Calling '13} } +@article{Snyder2019, +title = {Literature review as a research methodology: An overview and +guidelines}, +journal = {Journal of Business Research}, +volume = {104}, +pages = {333--339}, +year = {2019}, +issn = {0148-2963}, +doi = {10.1016/j.jbusres.2019.07.039}, +url = {https://www.sciencedirect.com/science/article/pii/S0148296319304564}, +author = {Hannah Snyder}, +keywords = {Literature review, Synthesis, Research methodology, Systematic +review, Integrative review}, +abstract = {Knowledge production within the field of business research is +accelerating at a tremendous speed while 
at the same time remaining fragmented +and interdisciplinary. This makes it hard to keep up with state-of-the-art and +to be at the forefront of research, as well as to assess the collective +evidence in a particular area of business research. This is why the literature +review as a research method is more relevant than ever. Traditional literature +reviews often lack thoroughness and rigor and are conducted ad hoc, rather than +following a specific methodology. Therefore, questions can be raised about the +quality and trustworthiness of these types of reviews. This paper discusses +literature review as a methodology for conducting research and offers an +overview of different types of reviews, as well as some guidelines to how to +both conduct and evaluate a literature review paper. It also discusses common +pitfalls and how to get literature reviews published.} +} + %Har hopp om att få citera från artikeln nedan @article{Sleeman1986, @@ -345,5 +422,47 @@ @article{Sleeman1986 doi = {10.2190/2XPP-LTYH-98NQ-BU77}, URL = {https://doi.org/10.2190/2XPP-LTYH-98NQ-BU77}, eprint = {https://doi.org/10.2190/2XPP-LTYH-98NQ-BU77}, -abstract = { A screening test was given to three classes of high school students, who were just completing introductory semester-long courses in Pascal. These tests were graded, and subsequently thirty-five students were given detailed clinical interviews. These interviews showed that errors were made with essentially every Pascal construct. Over half the students were classified as having major difficulties—fewer than 10 percent had no difficulties. The errors noted are discussed in detail in this article. A major finding is that the students attribute to the computer the reasoning power of an average person. The article also speculates about how difficult it might be to remediate the errors found, and concludes with an outline of future work. 
} +abstract = { A screening test was given to three classes of high school +students, who were just completing introductory semester-long courses in +Pascal. These tests were graded, and subsequently thirty-five students were +given detailed clinical interviews. These interviews showed that errors were +made with essentially every Pascal construct. Over half the students were +classified as having major difficulties—fewer than 10 percent had no +difficulties. The errors noted are discussed in detail in this article. A major +finding is that the students attribute to the computer the reasoning power of +an average person. The article also speculates about how difficult it might be +to remediate the errors found, and concludes with an outline of future work. } } + +@inproceedings{GuoMarkelZhang2020, +author = {Guo, Philip J. and Markel, Julia M. and Zhang, Xiong}, +title = {Learnersourcing at Scale to Overcome Expert Blind Spots for +Introductory Programming: A Three-Year Deployment Study on the Python Tutor +Website}, +year = {2020}, +isbn = {9781450379519}, +publisher = {Association for Computing Machinery}, +address = {New York, NY, USA}, +url = {https://doi.org/10.1145/3386527.3406733}, +doi = {10.1145/3386527.3406733}, +abstract = {It is hard for experts to create good instructional resources due +to a phenomenon known as the expert blind spot: They forget what it was like to +be a novice, so they cannot pinpoint exactly where novices commonly struggle +and how to best phrase their explanations. To help overcome these expert blind +spots for computer programming topics, we created a learnersourcing system that +elicits explanations of misconceptions directly from learners while they are +coding. We have deployed this system for the past three years to the +widely-used Python Tutor coding website (pythontutor.com) and collected 16,791 +learner-written explanations. To our knowledge, this is the largest dataset of +explanations for programming misconceptions. 
By inspecting this dataset, we +found surprising insights that we did not originally think of due to our own +expert blind spots as programming instructors. We are now using these insights +to improve compiler and run-time error messages to explain common novice +misconceptions.}, +booktitle = {Proceedings of the Seventh ACM Conference on Learning @ Scale}, +pages = {301–304}, +numpages = {4}, +keywords = {programming, learnersourcing, python tutor, syntax errors}, +location = {Virtual Event, USA}, +series = {L@S '20} +} \ No newline at end of file diff --git a/research/misconceptions/classes.tex b/research/misconceptions/classes.tex index b1cf23bf..9a672bb8 100644 --- a/research/misconceptions/classes.tex +++ b/research/misconceptions/classes.tex @@ -1,10 +1,55 @@ \subsection{Classes and objects} -\subsubsection{Role in the syllabus} +The concept of classes and objects in object-oriented languages is difficult +and a basic understanding of objects is something that many CS1 students lack +\parencite{Kaczmarczyk2010}. This is emphasized by \textcite{Ragonis2005OOP} +who in their study noticed a number of dire misconceptions, two which are +presented below. -\subsubsection{Difficulties that can occur} +\begin{enumerate} + \item An instance of a class can be created within the class' method. -The concept of classes and objects in object-oriented languages is difficult and basic understanding of objects is something that many CS1 students lack \parencite{Kaczmarczyk2010}. This is emphasized by \textcite{Ragonis2005OOP} who dive into this subject in their article \emph{A long-term investigation of the comprehension of OOP concepts by novices}. In their studie they noticed a number of diere misconceptions that the students had when learning about classes and objects, for instance that you can create an object from a method and that you can define a method that does not access any attributes. 
They also found that the students had a hard time to visualize the class as a template for a type of object, instead the students had the image of the class as a collection of objects and that the methods had the power to change, add and delete objects that are class-instances. + \item It is possible to define a method which does not access any of the +class' attributes. +\end{enumerate} -Similar misconceptions has been characterized by \textcite{Holland1997} in their article \emph{Avoiding object misconceptions} where they highlights the misconception that an object is a variable that can only hold one value or several values of the same type, a misconception they trace back to the first class examples that the students see. This misconception is not the only symptome from the first classes the students see. The misconception that a class is strictly a data base is also a misconception that according to \textcite{Holland1997} is the product of that the first classes the students write often are a good substitue to a data base and therefor shapes the student misconception. The last concept that \textcite{Holland1997} discuss is the concept of storing the objects in the programme. Some students believe that the attributes of an object are the objects identifier, which leads to the misconception that there can not be two objects that have the same attributes, and therefor that one attribute of every object must be unique otherwise the programme will not be able to store it. The concept that every object has its own memory space and are stored separatly is hard to grasp for some students \parencite{Holland1997,Ragonis2005OOP}. +XXX Add analysis from variation theory + + +\textcite{Ragonis2005OOP} also found that students have a hard time +visualising the class as a template for a type of object. 
Instead, the +students +picture the class as a collection of objects and believe that the class' methods +have the power to change, add and delete objects that are class-instances. +Similar +misconceptions have been characterised by \textcite{Holland1997}, where the +students believe that \begin{enumerate*} + \item an object is a variable that can only hold one value or several values of +the same type and + \item a class is strictly a database +\end{enumerate*}. These particular misconceptions can be traced back to the +first classes the students write, which are often good substitutes for +databases +and therefore shape these misconceptions. + +XXX Add analysis on how we can teach classes with easy examples and still +manage to avoid the misconception that a class is equal to a database. Maybe +we can +contrast it with the already existing objects in Python (strings, lists, +integers etc.). + +The last concept that \textcite{Holland1997} discuss is how +objects are stored in the programme. Some students believe that the +attributes of an +object are the object's identifier, which leads to the misconception that +there cannot be two objects that have the same attributes, and therefore that +one +attribute of every object must be unique, otherwise the programme will not be +able to store it. The concept that every object has its own memory space and +is +stored separately is hard to grasp for some students +\parencite{Holland1997,Ragonis2005OOP}. + +XXX Add analysis of how we can teach how objects are stored with the help of +variation theory. 
diff --git a/research/misconceptions/conditionals.tex b/research/misconceptions/conditionals.tex index a7c85df2..31fe9573 100644 --- a/research/misconceptions/conditionals.tex +++ b/research/misconceptions/conditionals.tex @@ -1,7 +1,28 @@ \subsection{Conditionals} -\subsubsection{Role in the syllabus} +Common control structures taught in CS1 are if- and else-statements, with which +the students learn how to create simple conditionals that control the +progress of the programme. How a programme interprets and executes an +if-statement is something that students have misconceptions about. A severe +misconception that \textcite{Plass2015Variables} found was that some +students believe that an if-statement can control whether the programme will +keep on executing or shut down, depending on whether the condition is true or +false. These students believe that an if-statement whose condition is false will terminate +the programme, even though a quit-statement has not been introduced. +Another misconception is that when writing an if- and else-statement, both +the if-branch and the else-branch will be executed, even when the condition is true +\parencite{MisconceptionsSurvey2017}. -\subsubsection{Difficulties that can occur} +XXX Add analysis on how we can teach conditionals in a way that helps the +students to understand when and how the code below the conditionals will be +executed. -How a programme will understand and execute an if-statement is something that students might have misconcepted. A severe misconception that \textcite{Plass2015Variables} found was that some students belive that an if-statement can control if the programme will keep on executing or shut down, depending on if the statement is true or false. 
Another misconception is that when writing an if and else statement both if and else will execute, even when the if-statement is true \parencite{MisconceptionsSurvey2017} +A common error occurs when students try to chain conditions in +an if-statement, for example \mintinline{python}{if x != a or b}, where the +intended statement is \mintinline{python}{if x != a and x != b} +\parencite{GuoMarkelZhang2020}. This misconception is believed to originate +from the way the statement is read out loud as \enquote{if x is not equal to a +or b}, which in mathematical terms is the right way to state it. + +XXX Add analysis on how we can help students to grasp the way if-statements +should be stated in code, as opposed to how they are stated in mathematics. diff --git a/research/misconceptions/findings.tex b/research/misconceptions/findings.tex new file mode 100644 index 00000000..f4aef2b3 --- /dev/null +++ b/research/misconceptions/findings.tex @@ -0,0 +1,113 @@ +\section{Misconceptions we have found} + +\subsection{Misconceptions not in articles} + + \begin{itemize} + \item That in order to return the value of a variable from a +function, that variable must be an input argument to the function. + + Example: + \hfill + \begin{minted}{python} + def foo(x): + x = 3 + return x + + x = 0 + x = foo(x) + \end{minted} + \hfill + + \item That the built-in function \mintinline{python}{input()} +automatically converts the given value. For example, if the user writes a +number, \mintinline{python}{input()} will return an integer. + + \item That functions will be automatically called in the correct +order, without the functions being invoked in the programme. + + \item That in order to invoke a function several times if a +condition is true, the function needs to call itself instead of +having a loop-structure that repeatedly invokes the function. 
+ + Example: + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def foo(): + try: + x = int(input("Integer:")) + return x + except ValueError: + foo() + \end{minted} + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def foo(): + while True: + try: + x = int(input("Integer:")) + return x + except ValueError: + continue + \end{minted} + \end{minipage} + \item That even though several parameters are stated in a function +definition, the arguments do not have to be passed +when invoking the function if the parameters' names are global +variables in the programme. + + Example: + + \begin{minted}{python} + + def foo(x,y): + print(x+y) + + x = 3 + y = 4 + + foo() + + \end{minted} + + \item That a return-statement in a function will \enquote{send} the +variable name and value to the place the function was invoked in, +and that the variable name can then be used later by the programme. + + Example: + \hfill + \begin{minted}{python} + def foo(): + x = 3 + + return x + + foo() + print(x) + \end{minted} + \end{itemize} + +\subsection{Misconceptions already mentioned} + + \begin{itemize} + \item That a variable name controls what value can be assigned +to it. For example, that a programme will throw an error if an +integer is assigned to a variable called \mintinline{python}{my_ +string}. + \item That a local variable is reachable outside the function it is +defined in. + \item That a while-loop without a condition will terminate without +a \mintinline{python}{break}-statement. + + + + \end{itemize} + +\subsection{Difficulties that cannot be translated to a misconception} + +\begin{itemize} + \item The word \enquote{iterate} needs to be explained more clearly; it is not a word +all students have heard before. 
+\end{itemize} \ No newline at end of file diff --git a/research/misconceptions/functions-variables.tex b/research/misconceptions/functions-variables.tex index 544370f4..88b2767e 100644 --- a/research/misconceptions/functions-variables.tex +++ b/research/misconceptions/functions-variables.tex @@ -1,128 +1,1416 @@ \subsection{Functions and variables} -According to \textcite{MisconceptionsSurvey2017}, in their article -\citetitle{MisconceptionsSurvey2017}, students have difficulties understanding -variables and that the students usually make assumptions about variables that -are wrong. +Functions and variables is often the first area which is taught in +introductory programming, an area which holds many misconceptions. This +section will be divided into several categories, which will reflect the +different areas that students have trouble grasping. -According to \textcite{Kohn2017VariableEvaluation,Plass2015Variables,Doukakis2007}, +\subsubsection{Conceptual understanding of variables} + +According to \textcite{ +Kohn2017VariableEvaluation,Plass2015Variables,Doukakis2007}, sometimes students believe that variables can hold an entire algorithm and -therefore see a variable as a function (or a mathematical equation). This will -create problems when a student creates a variable in belief that the variable -will dynamically change its value when the equation would change its value, or be updated -when the variable is used in the program. Another misconception that goes hand in hand with the assumption that a +therefore see a variable as a function (or a mathematical equation). +This will create problems when a student creates a variable in belief that the +variable will dynamically change its value when the equation would change its +value, or be updated when the variable is used in the program. 
+ +Another misconception that goes hand in hand with the assumption that a variable holds an equation and not a single value, is that if in the return statement the student returns an equation, the student believes that the return value will be that equation, not the value that the equation represents \parencite{Kohn2017VariableEvaluation}. -\Textcite{Kohn2017VariableEvaluation} explains that this misconception can be -connected to how variable definitions are used in mathematics. -From a variation theoretic perspective, we can say several things about this: -\begin{enumerate*} +From a variation theoretic perspective, we can say several things about +this: +\begin{enumerate} \item That variables and functions are interconnected and should be treated simultaneously (not \enquote{one thing at a time}), to be able to contrast them \parencite[\cf][Ch~6, pp~167--168]{NCOL}. \item Unlike in mathematics, every line in a piece of program (in an - imperative language) is constitutes a new state of the program. + imperative language) constitutes a new state of the program. We must teach this to students through a series of patterns, as dictated by variation theory. -\end{enumerate*} +\end{enumerate} + +So, how should we teach functions and variables according to variation theory? +Let's assume that the students know variables, functions (or relations or maps) +and equations from mathematics and that that's the only prerequisite knowledge. + +From the variation theoretic perspective, one \emph{critical aspect} (dimension +of variation) is the type of object that an identifier refers to. +Functions (as a type) is a \emph{critical feature} in that dimension. +But all non-functional types are also critical features that the students must +learn to discern\footnote{% + Actually, this is not only true for programming, it's equally true for + mathematics. 
+ Unfortunately, it's not until university-level mathematics (at least in the + Swedish context) that most students are taught to specify that \(y\colon + \mathbb{R}\to \mathbb{R}\) is a function and \(x\in\mathbb{R}\) is simply a + variable such that \(y\colon x\mapsto kx + m\). + Those details are usually lost in the \enquote{equationification} of + functions at lower levels. +}. + +Another critical aspect (dimension of variation) contains the features +statefulness (of an algorithm) and statelessness (of equations). + +\subsubsection{Statefulness of variable assignments} + +\begin{description} + \item [Contrast] We must introduce variation in the aspect (dimension of + variation) of statefulness of variables, but keep all other aspects + (dimensions) invariant. + Varying the statefulness of variables is variation in where we notice it + and where we don't. + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateC1] +a = 1 +b = 2 +x = a + b + + + +print(f"x = {x}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateC2][highlightlines={4-5}] +a = 1 +b = 2 +x = a + b +a = 2 +b = 3 + +print(f"x = {x}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + + The contrast is the update of the variables \mintinline{python}{a} and + \mintinline{python}{b}, which doesn't cause any change in + \mintinline{python}{x}, contrary to what the misconception identified by + \textcite{Kohn2017VariableEvaluation,Plass2015Variables,Doukakis2007} would predict: that + sometimes students believe that variables can hold an entire algorithm and + therefore see a variable as a function (or a mathematical equation). + + \item [Generalisation] In the generalisation pattern we want to generalise + the phenomenon (statefulness) to other examples, so that the student can + observe when the phenomenon occurs. 
+ This means that we keep the aspect (dimension of variability) of + statefulness of variables invariant (we can observe it) while we vary + other available aspects (for example variable names or values). + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateG1] +a = 3 +b = 2 +c = a + b +a = 1 +b = 5 + +print(f"c = {c}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateG2] +name = "Ada" +greeting = "Hi" +msg = f"{greeting} {name}!" +name = "Beda" + + +print(msg) + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + + We see that we've varied the variable names and the types, still the same + effect. + + We can finish by updating the original definition after the updates, to + show that then we don't get the effect. + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateG2] +name = "Ada" +greeting = "Hi" +msg = f"{greeting} {name}!" +name = "Beda" + + +print(msg) + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateG2][highlightlines=5] +name = "Ada" +greeting = "Hi" +msg = f"{greeting} {name}!" +name = "Beda" +msg = f"{greeting} {name}!" 
+ +print(msg) + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + + And then the same thing with the first example again: + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateG1] +a = 3 +b = 2 +c = a + b +a = 1 +b = 5 + +print(f"c = {c}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[varstateG1][highlightlines=6] +a = 3 +b = 2 +c = a + b +a = 1 +b = 5 +c = a + b +print(f"c = {c}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} +\end{description} + +Now, this was just one aspect of the misconception that students believe that +variables can hold an entire algorithm and therefore see a variable as a +function (or a mathematical equation). +We also have that expected behaviour, namely through functions. + +\subsubsection{Variability of functions} + +The students' expectation that \(y = kx + m\) is a function +\parencite{Kohn2017VariableEvaluation,Plass2015Variables,Doukakis2007} is +likely due to the mathematical shorthand for \(y(x) = kx + m\), where +\enquote{\((x)\)} is dropped (the \enquote{equationification} of functions). +Above we introduced a series of pattern showing that +\mintinline{python}{y = k*x + m} +is not the same as the dynamic relation in mathematics, +\(y = kx + m\) (or rather \(y(x) = kx + m\) to be precise). +However, we can achieve that dynamic relation in Python too, through functions. +Next, we'll go back and show a series of patterns to achieve the expected +behaviour. + +\begin{description} + \item [Contrast] We must introduce variation in the aspect (dimension of + variation) of functionality of functions, where we have the behaviour the + students expected from mathematics. + (But keep all other aspects/dimensions invariant.) 
+ For this, we return to the original example, but make another contrast this + time. + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcC1][highlightlines=3] +a = 1 +b = 2 +x = a + b +a = 2 +b = 3 + +print(f"x = {x}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcC2][highlightlines={1-2}] +def x(a, b): + return a + b + + + +print(f"x(1, 2) = {x(1, 2)}") +print(f"x(2, 3) = {x(2, 3)}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + + \item[Generalisation] To generalise, we must keep this aspect invariant (the + dynamic behaviour of a function), but vary other aspects (such as function + names and what they do). + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcG1] +def y(x): + return 2*x + 5 + +print(f"y(1) = {y(1)}") +print(f"y(2) = {y(2)}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcG2] +def greet(name, greeting): + return f"{greeting} {name}!" -XXX propose a pattern for this example. +print(greet("Ada", "Hi")) +print(greet("Beda", "Hi")) + \end{pyblock} -There is also a misconception that variables can hold more than one value at -the time \parencite{Doukakis2007}. -This misconception can relate to several things: -\begin{enumerate*} - \item the type system, confusing lists with non-container types, not seeing a - list as a type itself; - \item the scope of variables, that the same variable identifier can be used - for different things in different scopes. 
-\end{enumerate*} + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} -XXX Possible ways to attack this using variation theory \dots + \item[Fusion] Now we'd like to fuse this back with the statefulness of + variables from above. -But it is not only the right side of the variable definition that students can -have misconceptions about. The name of the variable has been misunderstood as + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcG1] +def y(x): + return 2*x + 5 + +y1 = y(1) +y2 = y(2) +y3 = y(5) + +print(f"y1 = {y1}") +print(f"y2 = {y2}") +print(f"y3 = {y3}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcG1] +def y(x): + return 2*x + 5 + + +y1 = y(1) +print(f"y1 = {y1}") +y2 = y(2) +print(f"y2 = {y2}") +y3 = y(5) +print(f"y3 = {y3}") + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + + We also give two versions of the greeting example. + We reuse the same variable and also vary the order of assignments and + printing: + + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcG2] +def greet(name, greeting): + return f"{greeting} {name}!" + +msg1 = greet("Ada", "Hi") +msg2 = greet("Beda", "Hi") + +print(msg1) +print(msg2) + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[funcG2] +def greet(name, greeting): + return f"{greeting} {name}!" + +msg = greet("Ada", "Hi") +print(msg) + +msg = greet("Beda", "Hi") +print(msg) + \end{pyblock} + + \vspace{0.5em} + Which yields the following: + \printpythontex[verbatim] + \end{minipage} + +\end{description} + +XXX Treat the other aspects related to functions: scope, default values, ... 
+ +Another misconception that goes hand in hand with the assumption that a +variable holds an equation and not a single value, is that if in the return +statement the student returns an equation, the student believes that the return +value will be that equation, not the value that the equation represents +\parencite{Kohn2017VariableEvaluation}. + + +\subsubsection{Defining variables and functions} + +But it is not only the right side of the variable definition that students +can +have misconceptions about. The name of the variable has been misunderstood +as having power of the value which it holds \parencite{MisconceptionsSurvey2017,Sleeman1984}. For example, if a student names one variable \emph{max} and another variable \emph{min}, the student -might think that the variables will strictly only hold the maximum value and -the minimum value throughout the program. +might think that the variables will strictly only hold the maximum value +and +the minimum value throughout the program, even though the code does not +carry through this rule. We can both explain this phenomenon and propose a teaching design using variation theory. Let's start with the explanation. -When teaching the students we always use proper (\ie relevant) variable names +When teaching the students we always use proper (\ie relevant) variable +names that relate to the purpose of the variable. If we never show the students any examples where the variable name is not -related to its purpose (bad variable names), they cannot separate the variable +related to its purpose (bad variable names), they cannot separate the +variable naming from its purpose. This, inevitably, leads to the teaching design: we must show the students that the variable names are independent of their -purpose. -We do this by introducing \emph{contrast}. -We show a standard example, then we change a variable name from a relevant to -an irrelevant one. -We show that the program still works. 
-We can then \emph{generalize} this by showing that we can rename the other -variables too, and even show other examples where the variable names are -disconnected from the purpose. -When done, we can point out that we name variables properly for readability -(ease of comprehension), by \emph{contrasting} the same example with and -without relevant variable names. -We can follow this by \emph{generalization}, by showing a previously unseen +purpose. To achieve this we propose this pattern: + +\begin{description} + \item [Contrast] We show a standard example, then we change a variable +name from a relevant to an irrelevant one. We show that the program +still works. + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(values): + maximum = 0 + for value in values: + if value > maximum: + maximum = value + + return maximum + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={2,4-5,7}]{python} + def example(values): + x = 0 + for value in values: + if value > x: + x = value + + return x + \end{minted} + \end{minipage} +\newline + + \item [Generalisation] We can then \emph{generalise} this by showing +that we can rename the other variables too. We can even show other +examples where the variable names are disconnected from the purpose. 
+ + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={}]{python} + def example(l): + maximum = 0 + for i in l: + if i > maximum: + maximum = i + + return maximum + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightcolor=light-green, highlightlines={2,7}]{python} + def example(values): + maximum = 0 + for value in values: + if value < maximum: + maximum = value + + return maximum + \end{minted} + \end{minipage} +\newline + \item [Fusion] When done, we can point out that we name variables +properly for readability (ease of comprehension), by \emph{contrasting} + the same example with and without relevant variable names. We can +follow this by \emph{generalisation}, by showing a previously unseen example with unrelated variable names and trying to read it. + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(l): + x = 0 + for i in l: + if i > x: + x = i + + return x + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(hi): + y = [] + + variable = hi.readlines() + + for i in variable: + x = i.split() + object = Class1(x[0], x[1]) + y.append(object) + + for i in y: + if i.property2 > 18: + print(f"Hi {i.property1}! " + "You are an adult.") + \end{minted} + \end{minipage} +\hfill + +\end{description} +\vspace{5pt} +Other misconceptions that students have when defining variables were found +by \textcite{GuoMarkelZhang2020}: + +\begin{enumerate} + \item When using a variable, students have the misconception that the + \enquote{pronoun} of the variable can be used later in the programme, + instead of the name that was used in the first definition of the + variable. For example, if a list has been defined as + \mintinline{python}{my_list}, the list is later referenced only as + \mintinline{python}{list}. 
+ + From a variation theory perspective, this misconception can be +managed through a similar pattern as described for the misconception +of variable names' power of variable values. First by \emph{ +contrasting} via examples where the name \mintinline{python}{my_list} + is used throughout the programme and changed mid-programme +respectively. + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(): + maximum = 0 + for value in my_list: + if value > maximum: + maximum = value + + return maximum + + my_list = [1, 3, 2, 5, 4] + maximum = example() + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={3}]{python} + def example(): + maximum = 0 + for value in list: + if value > maximum: + maximum = value + + return maximum + + my_list = [1, 3, 2, 5, 4] + maximum = example() + \end{minted} + \end{minipage} +\hfill + + Then, \emph{generalising} by using different variable names, but +keeping them throughout the programme. + \hfill + \begin{minted}{python} + def example(): + for value in my_list: + if value > maximum: + maximum = value + + maximum = 4 + my_list = [1, 3, 2, 5, 4] + example() + \end{minted} +\hfill + +The pattern ends with a fusion of the two. +\hfill + \begin{minted}{python} + def example(): + for value in list: + if value > maximum: + maximum = value + + + maximum = 4 + my_list = [1, 3, 2, 5, 4] + example() + \end{minted} + +\hfill + + \item Defining a variable by using another variable students use + \mintinline{python}{x == y} instead of \mintinline{python}{x = y}, a + misconception supposedly originating from how the statement is read +out + loud as \enquote{x equals y}. + + Here we propose the pattern where we first \emph{contrast} the two +different statements, by printing the output of \mintinline{python}{ +x == y} and \mintinline{python}{x = y} separately. 
Both \mintinline{ +python}{x} and \mintinline{python}{y} can be pre-defined, to avoid +the programme throwing an error. However, in the contrast we could +also include the example where an error is thrown, this to \emph{ +contrast} even further. + + \hfill + \begin{minipage}[t]{0.3\columnwidth} + \begin{minted}{python} + def example(): + x = 1 + y = 2 + x == y + + return x + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.3\columnwidth} + \begin{minted}[highlightlines={4}]{python} + def example(): + x = 1 + y = 2 + x = y + + return x + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.3\columnwidth} + \begin{minted}[highlightlines={3}]{python} + def example(): + y = 2 + x == y + # Will throw an + # error + return x + \end{minted} + \end{minipage} +\hfill + + The \emph{generalisation} for this pattern is easy, and consists of +several variable-definitions where other variables are used in the +definition. Here again we are also including different function +functionalities, to help students understand variables and functions +simultaneously. -If we move on to the relationship between variables and functions we can see -more misconceptions that students have; from where input and output arguments + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(y): + x = y + + + return x + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={}]{python} + def example(): + a = 3 + b = a + + return b + \end{minted} + \end{minipage} +\hfill + + + The last pattern, \emph{Fusion} should consist of both variable- +definitions, and where the variables also are used in comparison- +statements. + + \begin{lstlisting}[language=Python] + + def example1(): + XXX Not really sure of a good example for this... + maybe an absolut-value example? 
+ + \end{lstlisting} + \item Writing definitions of variables from left-to-right + (\mintinline{python}{a+b = c}) instead of right-to-left + (\mintinline{python}{c = a+b}). The same can be seen when using +functions + in the definition of variables, for example instead of writing + \mintinline{python}{x=parse(input())} the students write + \mintinline{python}{parse(x) = input()}. + + This misconception can be seen as a misconception of what the +functionality of the left and right side of a variable definition +is. To \emph{contrast} this, we need to create code which will throw +errors, since it is not possible to do function calls on the left +side of a variable definition. This can be done with examples from +both misconceptions mentioned, where we do it the right way and the +wrong way, using the same examples as \textcite{GuoMarkelZhang2020}. + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(): + a = 1 + b = 2 + c = a + b + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={4}]{python} + def example(): + a = 1 + b = 2 + a + b = c + # Will throw an + # error + \end{minted} + \end{minipage} +\hfill +\vspace{5pt} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={}]{python} + def example(): + x = parse(input()) + \end{minted} + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={2}]{python} + def example(): + parse(x) = input() + # Will throw an + # error + \end{minted} + \end{minipage} + \hfill + + We then \emph{generalise} it by writing an example, with several +function calls which will not throw errors. 
+ + \begin{minted}[highlightlines={}]{python} + def example(): + number = int(input("Write a number ")) + new_number = number + 1 + string_number = str(abs(new_number)) + \end{minted} +\hfill + + In this pattern it will however not be possible to \emph{fuse} the +invariant and variant with each other, since the latter will throw +an error for the programme. XXX Or can you think of a fusion Daniel? +\end{enumerate} + + +Since Python is an interpreted language, the placement of the definition of +a function is important, something that differs from a compiled language. +This gives +room for a misconception for students that have learned to code in for +example Java, where the definition of a function can be below a call of the +function. + +XXX Add analysis \textbf{IF} we want to include this misconception. However +it can be seen as out of the scope of our article, since we focus on novice +programmers. But we still have students that might have been exposed to +Java in high-school, so it might still be interesting to include? + +\subsubsection{Arguments and return values of functions} + +If we move on to the relationship between variables and functions we can +see +more misconceptions that students have; for example where input and output +arguments come from and go to \parencite{Ragonis2005OOP}. -The first difficulty is how students treat return-values. When a function is -supposed to return a value some students miss the return value, expecting the -function to return it by default \parencite{Kurvinen2016,KumarVeerasamy2016}. -A student might also write a function which returns a value, but that value is -not stored nor being used later in the program +The first difficulty is how students understand and treat return-values, +where these student misconceptions have been found: + +\begin{enumerate} + \item Missing to return the variable when a function is supposed to, +expecting the function to return it by default \parencite{ +Kurvinen2016,KumarVeerasamy2016}. 
+ + \item Believe that a print-statement at the end of a function will act +as a return statement \parencite{MisconceptionsSurvey2017}. + +\end{enumerate} + + +These two misconceptions are connected to each other, and what they have +in common is the trouble to return a value, the \emph{right} value, from a +function. However, the two misconceptions can be, and according to us +should be, treated separately. To help the students understand how to +write a correct return-statement for different functions we propose these +two patterns for the two misconceptions: + +\begin{enumerate} + \item To help the students understand that a function will not return +the correct value by default, we start with \emph{contrasting} with +the help of a function which in the first example does not return a +value, and in the second return the value. We name the functions +\mintinline{python}{find_maximum}, this to trigger the misconception +since this might "trick" students into believing that the return value +will automatically be the expected value (in this case the maximum). + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def find_maximum(values): + maximum = 0 + + for value in values: + if value > maximum: + maximum = value + + + + x = find_maximum([1,3,2]) + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={8}]{python} + def find_maximum(values): + maximum = 0 + + for value in values: + if value > maximum: + maximum = value + + return maximum + + x = find_maximum([1,3,2]) + \end{minted} + \end{minipage} +\hfill + + Then we \emph{generalise} by showing two different examples where we +return the expected value. 
+ \hfill + \begin{minted}{python} + def read_file(filename): + file = open(filename, "r") + lines = file.readlines() + + return lines + \end{minted} + +\hfill + + \begin{minted}[highlightlines={}]{python} + def calculate_average(values): + values_sum = sum(values) + average = values_sum/len(values) + + return average + \end{minted} +\hfill + + And the last step of the pattern, the \emph{fusion}, consists of an +example where we use both functions that return the expected value and functions that do not. + \hfill + + \begin{minted}{python} + def read_file(filename): + file = open(filename, "r") + lines = file.readlines() + + return lines + + def calculate_average(values): + values_sum = sum(values) + average = values_sum/len(values) + + def main(): + lines = read_file("test.txt") + values = [] + for line in lines: + values.append(int(line[0])) + average = calculate_average(values) + print(average) + \end{minted} +\hfill + + \item The latter misconception we believe originates from the +students' misconception that what they see in the terminal is what is +happening in the programme. In this view, when we print the value that we want to +be returned, the programme will see it and can later use it. We start +the pattern with \emph{contrasting} this concept, using the same +example as above. 
+ + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def find_maximum(values): + maximum = 0 + for value in values: + if value > maximum: + maximum = value + + print(maximum) + + x = find_maximum([1,3,2]) + print(x) + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={7}]{python} + def find_maximum(values): + maximum = 0 + for value in values: + if value > maximum: + maximum = value + + return maximum + + x = find_maximum([1,3,2]) + print(x) + \end{minted} + \end{minipage} +\hfill + + Here the students will see the maximum in the terminal when calling +\mintinline{python}{find_maximum}, and might therefore believe that \mintinline{python}{x} +will hold the value the function printed. The \emph{generalisation} +for this misconception will be the same as for the first +misconception, but the \emph{fusion} will differ slightly. When we +\emph{fuse} in this pattern, we instead vary between returning the +correct value and only printing it. + \hfill + \begin{minted}{python} + def read_file(filename): + file = open(filename, "r") + lines = file.readlines() + + print(lines) + + def calculate_appearances(lines): + num_appearances = 0 + for line in lines: + if line.startswith("Adam"): + num_appearances += 1 + return num_appearances + + + def main(): + lines = read_file("test.txt") + appearances = calculate_appearances(lines) + print(appearances) + \end{minted} +\hfill + +\end{enumerate} + +A student might also write a function which returns a value, but that +value is neither stored nor used later in the program \parencite{AltadmriBrown2015}. -Some students might also believe that a print-statement at the end of a -function will act as a return statement \parencite{MisconceptionsSurvey2017}. - Some students also have difficulties returning the right value or variable from a function -\parencite{KumarVeerasamy2016}. - -XXX we should probably split some of those to analyse separately.
- -It is not only the return value that is difficult to grasp for students, the -input arguments are also a difficult concept for some students. When calling a -function it shows that the student have trouble using and understanding what -arguments are meant to be used in the function call -\parencite{AltadmriBrown2015}. \Textcite{Fleury1991} researched the -misconceptions students have when using parameters in functions in her article -\citetitle{Fleury1991}. -What she found was that students had constructed their own rules for the using -of global and local variables which were connected to the using of variables in -functions. The first rule that a student had conducted was that when changing a -local variable in a function, the variable was changed for the whole program. -Another assumption made by the students was that if the local variable was not -an argument in the function-call, the program would go back to where the -function was called and search for it there. The assumptions the students had -made about global variables was that if a function references to a global -variable, it will create an error in the program because the global variable -was not an argument in the function call. The students also believed that if a -global variable was changed in the function body, the new value would not be -reachable for the rest of the program if not returned by the function. - -The difference between variables in programming and variables in mathematics is -is something that some students do not grasp. If a student in a variable -definition uses on the right side of the equal symbol a variable that is not -defined, but the variable on the left side is already defined, they think that + + +We draw the conclusion that this misconception can be the product of +students believing that a function has to have a return-statement to end, +and that the programme will throw an error if a function misses a return- +statement. 
We propose this pattern to avoid this misconception: + +\begin{description} + \item[Contrast] To contrast this misconception, we write a function +that is not meant to return a value, with and without a return- +statement. To contrast it even further we print the return-value +\mintinline{python}{None}, to show the students that the two functions +return the same value. + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(): + print("Hello world!") + + return + + print(example()) + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={4}]{python} + def example(): + print("Hello world!") + + + + print(example()) + \end{minted} + \end{minipage} +\hfill + + + \item[Generalisation] In the generalisation of this pattern we write +different functions which all lack a return-statement, to show the +students that the functions still execute and end without it. + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(lines): + file = open("test","w") + for line in lines: + file.write(line) + file.close() + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={}]{python} + def example(): + print("Menu options") + print("A. Open file") + print("B. Add person") + print("C. Delete person") + \end{minted} + \end{minipage} +\hfill + \item[Fusion] We now fuse the invariant and the variant in the +earlier examples, with a programme that has functions with and +without return-statements. + \hfill + \begin{minted}{python} + def print_menu(): + print("Menu options") + print("A. Open file") + print("B. Add person") + print("C. Delete person") + + def add_person(file): + name = input("Name? 
") + file = open("file","a") + file.write(name) + file.close() + + return + + def main(): + file = "names.txt" + print_menu() + option = input() + if option == "B": + add_person(file) + \end{minted} +\hfill + +\end{description} + +It is not only the return value that is difficult to grasp for students, +the +input arguments are also a difficult concept for some students. When +calling a +function \textcite{AltadmriBrown2015} found that students often have +trouble passing the right parameter when invoking a function, for example +by inserting data of wrong type in the function call. + +This misconception students have when invoking functions, shows a gap in +the conceptual understanding of function arguments and parameters. To +avoid this and to help students grasp the relationship between function +arguments and parameters in function calls we propose this pattern: + +\begin{description} + \item[Contrast] First we contrast with an example of when we pass a +parameter in the correct data type to a function, and what happens if +we pass the wrong type. Since there are several common data types used +in CS1, we have created examples for strings, integers and lists. 
+ + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(num): + print("You have "+num) + + num = input("Number:") + example(num) + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={5}]{python} + def example(num): + print("You have "+num) + # Will throw + # an error + num = int(input("Number:")) + example(num) + \end{minted} + \end{minipage} +\hfill + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + import random as r + + def example(num): + random = r.randint(0,9) + new = random + num + + return new + + + num = int(input("Number:")) + print(example(num)) + \end{minted} + \end{minipage} +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={10}]{python} + def example(lines): + for line in lines: + print(line) + + f_name = "test.txt" + file = open(f_name,'r') + content = file.readlines() + example(content) + \end{minted} + \end{minipage} +\hfill + +\hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={7}]{python} + def example(lines): + for line in lines: + print(line) + + f_name = "test.txt" + file = open(f_name,'r') + content = file.readline() + example(content) + \end{minted} + \end{minipage} +\hfill + +The last contrast will not throw an error; however, it might not do what +the students expected. + + \item[Generalisation] To generalise we write several examples where it +is important to include the correct data type in order for the +programme to work as expected.
+ + + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}{python} + def example(a, b): + c = a + b + + return c + \end{minted} + \end{minipage} +\hfill + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{minted}[highlightlines={}]{python} + def example(index, names): + try: + print(names[index]) + except IndexError: + print("Does not exist") + \end{minted} + \end{minipage} +\hfill + + + \item[Fusion] In the fusion we will go back to the last example of the +generalisation, but instead we will show what happens when we, by +mistake, switch the places of the parameters. This will lead to +wrong data types for the example function and lead to an error. + \hfill + \begin{minted}[highlightlines={}]{python} + def example(index, names): + try: + print(names[index]) + except IndexError: + print("Does not exist") + + def main(): + names = ["Eva", "Adam", "Mark"] + index = int(input("What index do you want to print? ")) + example(names, index) + \end{minted} +\hfill + + +\end{description} + + +\Textcite{Fleury1991} researched the +misconceptions students have when using parameters in functions. +She found that students had constructed their own rules for the use +of global and local variables which are connected to the use of variables +in +functions. These are the misconceptions she found, and for each +misconception we propose a pattern: + +\begin{enumerate} + \item when changing a local variable in a function, every variable with +the same name is changed for the whole program + + \begin{description} + \item[Contrast] We begin by contrasting, where we first change the +local variable by returning the value from the example function. +Then we show an example where students might believe that the +local variable \mintinline{python}{num} will change automatically for both functions.
+ + \hfill + \begin{minipage}[t]{0.4\columnwidth} + \begin{minted}{python} + import random as r + + def example(num): + rand = r.randint(0,9) + num = rand + num + + return num + + def main(): + num = 1 + num = example(num) + print(num) + \end{minted} + \end{minipage} +\hfill + \hfill + \begin{minipage}[t]{0.4\columnwidth} + \begin{minted}[highlightlines={7,11}]{python} + import random as r + + def example(num): + rand = r.randint(0,9) + num = rand + num + + + + def main(): + num = 1 + example(num) + print(num) + \end{minted} + \end{minipage} +\hfill + + + \item[Generalisation] We then generalise, where we update the local +variable in a correct way. + + \hfill + \begin{minipage}[t]{0.4\columnwidth} + \begin{minted}{python} + def example(i, values): + print(values[i]) + i += 1 + + return i + + def main(): + values = [1,2,3] + i = 0 + while i < 3: + i = example(i, values) + \end{minted} + \end{minipage} +\hfill + \hfill + \begin{minipage}[t]{0.4\columnwidth} + \begin{minted}[highlightlines={7,11}]{python} + def example(phrase): + phrase += input() + + return phrase + + def main(): + phrase = "Phrase: " + phrase = example(phrase) + print(phrase) + \end{minted} + \end{minipage} +\hfill + + \item[Fusion] Lastly, we fuse the misconception +\end{description} + + + + \item if the local variable is not an argument in the function-call, +the program will go back to where the function was called and search +for it there + + \item if a function references to a global variable that is not a +function argument, it will create an error in the program + + \item if a global variable is changed in the function body, the new +value will not be reachable for the rest of the program if not returned +by the function +\end{enumerate} + + +\subsubsection{Variables in mathematics vs programming} + +The difference between variables in programming and variables in +mathematics is something that some students do not grasp. 
If a student in a +variable +definition uses, on the right side of the equal symbol, a variable that is +not +defined, but the variable on the left side is already defined, they think +that the computer will solve this as an equation \parencite{Plass2015Variables}. -This assumption by the students is also something discovered by -\textcite{Kohn2017VariableEvaluation} when giving the students the definition -\verb'x = x + 1'. If you look at this definition with a mathematical -perspective you will see an unsolvable equation, which is also what some of the -students saw. They did not see that the \verb'x' to the left is the variable, -and that the \verb'x' to the right only holds a value. This definition is +This assumption made by the students was also discovered by +\textcite{Kohn2017VariableEvaluation} when giving the students the +definition +\mintinline{python}{x = x + 1}. If you look at this definition with a +mathematical +perspective you will see an unsolvable equation, which is also what some of +the +students saw. They did not see that the \mintinline{python}{x} to the left +is the +variable, +and that the \mintinline{python}{x} to the right only holds a value. This +definition is easier to understand for a novice programmer, according to -\textcite{Kohn2017VariableEvaluation}, when we instead write this definition as -\verb'x += 1'. +\textcite{Kohn2017VariableEvaluation}, when we instead write this +definition as +\mintinline{python}{x += 1}. + +XXX Add analysis on how to help students understand the difference between +variable definitions in programming and equations in mathematics + +\begin{description} + \item[Contrast] + \item[Generalisation] + \item[Fusion] +\end{description} + +\endinput + +\subsection{Arguments to functions} + +We start with the following example. 
+\begin{pyblock}[greet] +def greet(name, place): + print(f"Hello {name}, so you're from {place}?") + +def main(): + name = "Ada" + place = "Computer Town" + greet(name, place) + +main() +\end{pyblock} +The output of running the code will be: +\stdoutpythontex[verbatim] + +\subsection{Scope of identifiers} + +\begin{pyblock}[greet-scope][highlightlines={5-7}] +def greet(name, place): + print(f"Hello {name}, so you're from {place}?") + +def main(): + the_name = "Ada" + the_place = "Computer Town" + greet(the_name, the_place) + +main() +\end{pyblock} + +\begin{pyblock}[greet-scope-more][highlightlines={5}] +def greet(name, place): + print(f"Hello {name}, so you're from {place}?") + +def main(): + greet("Ada", "Computer Town") + +main() +\end{pyblock} + +\subsection{Order of arguments} + +\begin{pyblock}[greet-order][highlightlines={1,7}] +def greet(name, place): + print(f"Hello {name}, so you're from {place}?") + +def main(): + name = "Ada" + place = "Computer Town" + greet(place, name) + +main() +\end{pyblock} + + diff --git a/research/misconceptions/introduction.tex b/research/misconceptions/introduction.tex new file mode 100644 index 00000000..df53f271 --- /dev/null +++ b/research/misconceptions/introduction.tex @@ -0,0 +1,51 @@ +\section{Introduction} + + +In higher education in technology there are often introductory +programming courses, which are often referred to as \emph{Computer Science 1} +(CS1). When students are introduced to the world of programming some +misconceptions, where students understand some critical aspects but +misunderstand others \parencite{NCOL}, are bound to happen. We have seen these kinds of +misconceptions when educating students in CS1. +These obstacles that students meet when first learning to code made us +interested in mapping which misconceptions are common and understanding +the origin of these misconceptions.
To map these misconceptions we rely on +earlier studies that have been done in the area, combined with our own +findings. With the help of this knowledge we hope to get a better +understanding of how one could develop the courses in introductory +programming to avoid these misconceptions. To develop the course +we use variation theory, which we believe can be an effective tool in +education. + +According to variation theory \parencite[Ch.~2]{NCOL}, each educational +objective can be divided into different \emph{aspects}. +Consider the educational objective \enquote{the student should be able to +use +functions}. +One aspect of this particular educational objective is the local and global +scope +of variables. +Another aspect is returning values from a function. +For a student to achieve the educational objective, she must be able to +discern +the different aspects of the educational objective. +Aspects that the student hasn't yet discerned are critical aspects. +One necessary condition for learning is that the student is introduced to a +series of patterns of variation in the dimension of each critical aspect. +Misconceptions are examples of when a student has failed to discern (at +least) +one critical aspect. +This allows us to use misconceptions to inform our designs when designing +teaching according to variation theory. + +\subsection{Purpose} + +The purpose of this study is to answer the following questions: +\begin{enumerate} + \item What misconceptions in introductory programming have been identified +by + earlier research? + \item Based on these misconceptions, what aspects (in terms of variation + theory) of introductory programming can we identify? + \item Where do we need further research?
+\end{enumerate} \ No newline at end of file diff --git a/research/misconceptions/method.tex b/research/misconceptions/method.tex new file mode 100644 index 00000000..2883c977 --- /dev/null +++ b/research/misconceptions/method.tex @@ -0,0 +1,63 @@ +\section{Method} +\subsection{Literature review} +To map misconceptions in introductory programming a semi-structured +literature review was performed. A semi-structured literature review is used +when the studied area is broad and has been conceptualised in many different +ways \parencite{Snyder2019}. When conducting a semi-structured literature +review the purpose is to find significant findings in the area; however, +it is not mandatory, nor perhaps possible, to review all articles relevant +to the researched subject. Nevertheless, it is still important to keep the +review transparent by stating which search words and criteria have +been +used when browsing the articles available on the subject +\parencite{Snyder2019}. +For this literature review of misconceptions the following criteria have +been used: +\begin{itemize} +\item The word \emph{misconceptions} should be in the title of the article. +\item The article should be focused on \emph{Python} or other languages that +have similarities to Python. Articles focused on object-oriented languages +in +general might also be relevant. +\item The focus in the article should be on introductory programming for +high +school or higher education. +\item Articles with the purpose of deciding which programming language is +the +most efficient one to use when teaching introductory programming have been +sifted out because they are irrelevant to the purpose of this study. +\item Articles which revolve around a tool that measures students' knowledge +instead of common misconceptions have also been sifted out. +\item An iteration with one search word has been stopped when ten articles +have had the same findings or if ten articles in a row have been irrelevant.
+\end{itemize} +\begin{table}[h] +\centering +\begin{tabular}{ll} +\toprule +Database & Search words\\ +\midrule +\multirow{3}{4em}{Google scholar} & common misconceptions in intro to +programming \\ +& common misconceptions cs1 \\ +& programming misconceptions student mistakes \\ +\bottomrule +\end{tabular} +\caption{Databases and search words used in review} +\label{databasesandwords} +\end{table} +When conducting the literature review several databases and search words +have +been used. In \cref{databasesandwords} each word used in each database is +presented. + +The misconceptions found in each chosen article are presented in +\cref{misconceptions}. + + +\subsection{Analysis with variation theory} + +Each misconception found in earlier research will be analysed through +variation theory. The analysis has the purpose of understanding how one can +teach the different aspects revolving around the misconception at hand, in +order to avoid this misconception in the future. \ No newline at end of file diff --git a/research/misconceptions/preamble.tex b/research/misconceptions/preamble.tex index 3bf52928..3a13208b 100644 --- a/research/misconceptions/preamble.tex +++ b/research/misconceptions/preamble.tex @@ -1,19 +1,39 @@ \usepackage{graphicx} % support the \includegraphics command and options -\usepackage[strict]{csquotes} \usepackage[natbib,backend=biber,style=authoryear-comp,maxbibnames=99]{biblatex} \addbibresource{bibliography.bib} +\usepackage[strict]{csquotes} +\SetCiteCommand{\parencite} \usepackage[all]{foreign} +\usepackage{amsfonts} + %%% PACKAGES \usepackage{booktabs} % for much better looking tables \usepackage[inline]{enumitem} \setlist[enumerate]{label=(\arabic*)} +\usepackage[outputdir=ltxobj]{minted} +\setminted{autogobble,linenos} +\usepackage{pythontex} +\setpythontexfv{numbers=left} +\setpythontexoutputdir{.} +\setpythontexworkingdir{..} + \usepackage{verbatim} % adds environment for commenting out blocks of text & for better verbatim 
\usepackage{caption} % make it possible to include more than one captioned figure/table in a single float \usepackage{hyperref} % These packages are all incorporated in the memoir class to one degree or another... -\usepackage{cleveref} +\usepackage[capitalize]{cleveref} + +\usepackage{multirow}%for tables + + +\usepackage[parfill]{parskip} +% C: I think it is easier to read if there are blank rows between paragraphs +% D: I disagree, but I'll leave it for now :-) + +\usepackage{listings} + diff --git a/research/misconceptions/problem-solving.tex b/research/misconceptions/problem-solving.tex index a5529ba0..338b68a4 100644 --- a/research/misconceptions/problem-solving.tex +++ b/research/misconceptions/problem-solving.tex @@ -5,14 +5,27 @@ \subsubsection{Role in the syllabus} \subsubsection{Difficulties that can occur} -Here I want to have some articles about how math-problems will make it harder for students to solve the problem. Quote from Veerasamy et al \emph{This study analysis also explored that novices of programming struggled in writing code for math-related Questions 6 and 7 (refer Table C1). Nearly 66\% of students did not do well in the mathematical problem-based questions though explained and allowed to surf the Internet to seek for more details during the exam hours. A neo-Piagetian theory of cognitive development stated that students who are at the concrete operational stage struggle to write large programs with partial specifications, although they can write small programs from well-defined specifications (Teague et al., 2012).} +Here I want to have some articles about how math-problems will make it harder +for students to solve the problem. Quote from Veerasamy et al \emph{This +study analysis also explored that novices of programming struggled in writing +code for math-related Questions 6 and 7 (refer Table C1). 
Nearly 66\% of +students did not do well in the mathematical problem-based questions though +explained and allowed to surf the Internet to seek for more details during +the exam hours. A neo-Piagetian theory of cognitive development stated that +students who are at the concrete operational stage struggle to write large +programs with partial specifications, although they can write small programs +from well-defined specifications (Teague et al., 2012).} -Also I would want to include difficulties students have when debugging the code -and trying to find errors. Students often have a problem with tracing the code, +Also I would want to include difficulties students have when debugging the +code +and trying to find errors. Students often have a problem with tracing the +code, something that is discussed by \textcite[p.~20]{Sleeman1984}. On the same subject as above: In what order a program will be executed in, Programming misconceptions in an introductory level programming course exam by Einari Kurvinen, Niko Hellgren, Erkki Kaila, Mikko-Jussi Laakso, Tapio Salakoski -Would also maybe like to mention how a lab instruction should be to help students get the right knowledge from the lab. Is discussed somewhere in Yizhou Qian and James Lehmans article I think. +Would also maybe like to mention how a lab instruction should be to help +students get the right knowledge from the lab. Is discussed somewhere in +Yizhou Qian and James Lehmans article I think. diff --git a/research/misconceptions/repetitions.tex b/research/misconceptions/repetitions.tex index d2465432..50002594 100644 --- a/research/misconceptions/repetitions.tex +++ b/research/misconceptions/repetitions.tex @@ -1,22 +1,48 @@ \subsection{Repetitions} -\subsubsection{Role in the syllabus} +In CS1 students usually learn about repetitions, which includes for- and +while-loops and in some cases recursion. 
-\subsubsection{Difficulties that can occur} +Loop constructions can be hard to trace and understand for novice students, +for instance when a loop starts, ends and what is repeated and not repeated +in the loop \parencite{Sekiya2013,KumarVeerasamy2016,Kaczmarczyk2010}. This +was something \textcite{Sleeman1984} also realised when studying high +school students writing and debugging loop-structures. A common +misconception that the students had was that if the loop contained a print- +statement, the students thought that the only thing repeated inside the +loop was the string they saw in the terminal. The difficulties students +have in tracing the code linearly when entering a loop is according to +\textcite{KumarVeerasamy2016} because of the lack of understanding the +students have of the looping technique and the amount of cognitive skills +the tracing takes. -Loop construction is an algorithm that is hard to trace and understand for -novice students, for instance when a loop starts, ends and what is repeated and -not repeated in the loop \parencite{Sekiya2013,KumarVeerasamy2016,Kaczmarczyk2010}. -This was something \textcite{Sleeman1984} also -realized in their article \emph{Pascal and High-School Students: A Study of -Misconceptions} when studying high school students writing and debugging -loop-structures. One of the common misconception that the students had was that -if the loop contained a print-statement, the students thought that the only -thing that was repeated inside the loop was the string they saw in the -terminal. The difficulties students have in tracing the code lineary when -entering a loop is because of the lack of understanding the students have of -the looping technique and the amount of cognitive skills the tracing takes -\parencite{KumarVeerasamy2016}. +XXX Add analysis on how we can help students trace loops and understanding +how the loop-structure works. 
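As a starting point for that analysis, the print-statement misconception reported by \textcite{Sleeman1984} can be illustrated with a minimal sketch. The function name \mintinline{python}{trace_loop} and the \mintinline{python}{trace} list are our own illustrative assumptions and do not come from the cited studies. The loop body updates a running total on every iteration, although only the print call is visible in the terminal.

```python
def trace_loop(values):
    """Sum the values, recording the running total after each iteration."""
    total = 0
    trace = []  # hypothetical helper: one entry per completed iteration
    for value in values:
        total += value           # repeated every iteration, but prints nothing
        trace.append(total)      # also repeated silently
        print(f"value={value}")  # the only repeated statement visible in the terminal
    return total, trace


total, trace = trace_loop([1, 3, 2])
print(trace)
```

Printing \mintinline{python}{trace} after the loop makes the silent updates visible, which may help students see that every statement in the loop body is repeated, not only the one that produces output.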
-Another difficult part of the loop technique is to understand how an if-statement inside a loop is executed. \textcite{Sekiya2013} found in their studies \emph{Tracing quiz set to identify novices' programming misconceptions} that the combination of the two made the way for several student misconceptions. For instance the students thought that the variables in the conditional part of the loop-construction was control variables or the output from the loop. The students in the studies often got confused and started to misplace the different variables that are defined when writing an if-statement in a for-loop. +When students define and use loops, \textcite{GuoMarkelZhang2020} found +three common misconceptions that the students had about the loop-statement: +\begin{enumerate} + \item A misconception about which variable, in a loop defined as + \mintinline{python}{for item in items}, is supposed to be used to + extract information. + \item The misconception that \mintinline{python}{for i in 100} will iterate + a hundred times, even though Python requires an iterable such as \mintinline{python}{range(100)}. + \item When defining a while-loop, the misconception is that one can write + \mintinline{python}{while i <= 100} without initialising + \mintinline{python}{i} beforehand, and that \mintinline{python}{i} will + automatically increase by 1 inside the loop. +\end{enumerate} + +Another difficult part of the loop technique is to understand how an if- +statement inside a loop is executed. \textcite{Sekiya2013} found in their +studies that the combination of the two control structures created +misconceptions. For instance, the students thought that the variables in the +conditional part of the loop-construction were control variables or the +output from the loop. The students in the studies often got confused and +started to misplace the different variables that are defined when writing +an if-statement in a for-loop.
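The second and third loop misconceptions in the list above can be checked directly in Python. The sketch below is our own illustration (all function names are hypothetical): each misconceived form is paired with a working form, and the errors are caught so that the whole block runs.

```python
def iterate_over_integer():
    """Return the name of the error raised by `for i in 100`."""
    try:
        for i in 100:  # an int is not iterable
            pass
    except TypeError as error:
        return type(error).__name__
    return None


def count_with_range():
    """Iterate a hundred times by making the range explicit."""
    iterations = 0
    for i in range(100):
        iterations += 1
    return iterations


def while_without_initialisation():
    """Return True if using an uninitialised loop variable raises NameError."""
    try:
        while i <= 100:  # i has never been assigned
            i += 1
    except NameError:  # inside a function this is UnboundLocalError, a NameError subclass
        return True
    return False


def while_with_initialisation():
    """Initialise and increment the loop variable explicitly."""
    i = 0
    iterations = 0
    while i <= 100:
        iterations += 1
        i += 1  # nothing increments i automatically
    return iterations


print(iterate_over_integer(), count_with_range())
print(while_without_initialisation(), while_with_initialisation())
```

Each pair varies only the aspect behind the misconception, so it could serve as raw material for a contrast pattern. Note that \mintinline{python}{while i <= 100} starting from 0 runs 101 times, which is a separate off-by-one aspect.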
+ +XXX Add analysis on how we can teach the combination of loops and +conditionals in a way which will avoid misconceptions about the different +variables used. \ No newline at end of file diff --git a/research/misconceptions/results-overview.tex b/research/misconceptions/results-overview.tex new file mode 100644 index 00000000..5e160164 --- /dev/null +++ b/research/misconceptions/results-overview.tex @@ -0,0 +1,264 @@ +\section{Misconceptions in introductory programming} +\label{misconceptions} + +This section is divided to reflect the different concepts that are taught +throughout CS1. Each section will describe what students often are meant to +learn and understand in that module. Something which is followed by a +summary +of what different studies have found is difficult for students in that +particular +module and which common misconceptions that students may have. Each common +misconception will be analysed through the lens of variation theory, with +the purpose of explaining how it can be avoided by adjusting the way +the specific term or concept is being introduced or taught. + +Let us start with an example to illustrate the outline. +We will use a particular feature related to default arguments in Python that +few expect, hence most readers will hopefully not know about this and get the +intended experience. +(So this first set of patterns is not intended for students learning to program +for the first time, but rather the instructors for whom this paper is +intended.) + +Variation theory dictates that we teach using specific patterns of variation. +We should start with a contrast pattern in a critical aspect, followed by a +generalization pattern for the same critical aspect and finally tie several +aspects together using a fusion pattern. + +Let's get started. + +\begin{description} + \item[Contrast] The contrast pattern requires two examples to create + contrast. 
+ The left-hand example should be read first and then the right-hand + example will highlight the changes made to the left-hand example to get + the right-hand one. + (So that one can make the changes oneself.) + + We assume that the reader is familiar with default values for arguments in + Python. + We define a function \mintinline{python}{expand} that takes a list as an + argument and expands it by appending an element \mintinline{python}{1} + (left-hand code below). + Then we run some examples. + + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[default1] +def expand(x=[]): + return x + [1] + + +print( + f"[1] -> {expand([1])}\n" + f"[2] -> {expand([2])}\n" + f"() -> {expand()}\n" + f"() -> {expand()}\n" +) + \end{pyblock} + \vspace{0.5em} + This yields the output + \vspace{0.5em} + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[default2][highlightlines={2-3}] +def expand(x=[]): + x.extend([1]) + return x + +print( + f"[1] -> {expand([1])}\n" + f"[2] -> {expand([2])}\n" + f"() -> {expand()}\n" + f"() -> {expand()}\n" +) + \end{pyblock} + + \vspace{0.5em} + This yields the output + \vspace{0.5em} + \printpythontex[verbatim][highlightlines={4}] + \end{minipage} + + On the right-hand side, the change is that we modify \mintinline{python}{x} + before returning the new value. + However, we notice in the output that something weird happens when we use + the default value of the parameter now: + it seems like we actually update the default value every time we run the + function using the default value. + + \item[Generalisation] This brings us to the generalisation pattern. + In the generalisation pattern we vary the non-critical aspects and keep the + critical aspect invariant. + In this case, we just add more print statements to the example. + (For brevity we don't repeat the definition of the + function~\mintinline{python}{expand}.) 
+ + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[default1] +print( + f"() -> {expand()}\n" + f"[3] -> {expand([3])}\n" + f"() -> {expand()}\n" + f"[] -> {expand([])}\n" + f"[] -> {expand([])}\n" + f"() -> {expand()}\n" +) + \end{pyblock} + \vspace{0.5em} + This yields the output + \vspace{0.5em} + \printpythontex[verbatim] + \end{minipage} + \hfill + \begin{minipage}[t]{0.45\columnwidth} + \begin{pyblock}[default2] +print( + f"() -> {expand()}\n" + f"[3] -> {expand([3])}\n" + f"() -> {expand()}\n" + f"[] -> {expand([])}\n" + f"[] -> {expand([])}\n" + f"() -> {expand()}\n" +) + \end{pyblock} + + \vspace{0.5em} + This yields the output + \vspace{0.5em} + \printpythontex[verbatim][highlightlines={1,3,6}] + \end{minipage} + + We can see in the output that the default value of the parameter keeps + expanding. + In a sense, the generalisation pattern is the same as induction or the + scientific method. + + Now, we'll change the example, but keep this property invariant; + or, phrased in terms of the scientific method, we try to falsify our + hypothesis. + (We highlight the lines that keep this property, \ie remain invariant in + the variation-theoretic sense.) + \begin{pyblock}[default1][highlightlines={10-12}] +class Person: + def __init__(self, first, last): + self.first = first + self.last = last + + def __str__(self): + return f"{self.first} {self.last}" + +def create_person(first=None, last=None, + person_base=Person("Gina", "Jones")): + if first: person_base.first = first + if last: person_base.last = last + return person_base + +person_default = create_person() +print(person_default) +person_A = create_person("Ada") +print(person_A) +person_B = create_person("Beda") +print(person_B) +print(person_A) +print(person_default) + \end{pyblock} + \vspace{0.5em} + This yields the output + \vspace{0.5em} + \printpythontex[verbatim][highlightlines={4-5}] + + We can see that the last three lines of the output are the same. 
+ The first is expected, but the last two (highlighted) indicate that + \mintinline{python}{person_A}, + \mintinline{python}{person_B} and \mintinline{python}{person_default} all + refer to the same object. + + This must be due to how Python is constructed. + The default value seems to be constructed when the function is defined, not + when it's called (as it is in other languages, C++ for instance), and then + referenced (not copied) whenever the function is called without the + argument. + Consider this example. + + \begin{pyblock}[default1][highlightlines=5] +class TraceClass: + def __init__(self): + print(f"{self} created") + +def test_function(obj=TraceClass()): + print(f"test_function called with obj = {obj}") + +print("Test code begins") +test_function() + \end{pyblock} + \vspace{0.5em} + This yields the output + \vspace{0.5em} + \printpythontex[verbatim][highlightlines={1}] + + We see that the print statement from the constructor is executed before the + test code is executed (highlighted line), supporting our hypothesis that + the default value is constructed when the function is defined, not when the + function is called, and then referenced throughout. + + \item[Fusion] Now we can fuse this back with our previous understanding of + default arguments, to see that this does not happen with immutable types like + integers. + \begin{pyblock}[default1] +def increment(x=1): + x += 1 + return x + +print(f"(1) -> {increment(1)}") +print(f"(2) -> {increment(2)}") +print(f"() -> {increment()}") +print(f"() -> {increment()}") +print(f"() -> {increment()}") + \end{pyblock} + \vspace{0.5em} + This yields the output + \vspace{0.5em} + \printpythontex[verbatim][highlightlines={3-5}] + + And we can thus conclude that this phenomenon happens only with mutable + objects. +\end{description} + +There are several things to note with this example. 
+First, note how the contrast pattern is designed to focus your (the +reader's) attention on the phenomenon at hand: that default values can change +during execution. +Next, the generalisation pattern broadens our view of when this phenomenon +happens, that the objects are referenced and reused. +Finally, the fusion pattern merges this back into our original view of default +values, namely that they work as usual for immutable types (\eg integers). + +Second, if you are a seasoned programmer, once you have seen that +initial contrast pattern, the remaining patterns (generalisation and fusion) +probably resemble quite closely what you would have tested yourself to make +sense of this phenomenon---it would probably resemble how you would go about +\enquote{debugging} this. + +We actually tested this hypothesis on several colleagues who have been +programming for many years and are well versed in Python, but didn't know of this +phenomenon. +We gave them the contrast above and asked them to \enquote{debug} this +behaviour and later explain it when they understood it. +(To record the data, we asked them to think aloud and recorded their screen and +voice in a Zoom session.) + +% XXX Test with more colleagues. +% Tested with Alexander. +The concrete examples that they tried varied from person to person, and some tried +many more examples than above, but the patterns of variation shown above were +present. +Indeed, they couldn't explain the phenomenon until they had generated all the +patterns above---covering both generalisation and fusion. + +XXX add reference to NCOL that students taught according to variation theory +become better at creating these patterns for themselves. +In the context of programming, this should mean that if we teach them using +variation theory, they would get better at debugging. 
+ diff --git a/research/misconceptions/test.cpp b/research/misconceptions/test.cpp new file mode 100644 index 00000000..5a6e6adf --- /dev/null +++ b/research/misconceptions/test.cpp @@ -0,0 +1,20 @@ +#include <iostream> + +class TraceClass { + public: + TraceClass() { + std::cout << this << " object created" << std::endl; + } +}; + +void test_function(TraceClass obj=TraceClass()) { + std::cout << "test_function called" << std::endl; +} + +int main(void) { + std::cout << "test code starts" << std::endl; + test_function(); + test_function(); + + return 0; +} diff --git a/research/misconceptions/tools.tex b/research/misconceptions/tools.tex index 633c4b0c..bd39aa73 100644 --- a/research/misconceptions/tools.tex +++ b/research/misconceptions/tools.tex @@ -1,4 +1,13 @@ \subsection{Tools} -IDE: What IDE is best for CS1? What difficulties can occur when using different IDEs? Should we recommend one? Quote from Qian \& Lehman \emph{Although many other syntactic-level errors are reported in previous research (see Altadmri and Brown (2015), Hristova et al. (2003), and Sorva (2012)), we do not discuss them in depth here, because problems in syntactic knowledge are often easy to detect and fix. Perhaps that is why they are often noted as the most frequent mistakes novices make (Altadmri and Brown 2015; Jackson et al. 2005). A compiler or a modern integrated development environment (IDE) may be able to find them and then provide error messages or hints for correction.} +IDE: What IDE is best for CS1? What difficulties can occur when using +different IDEs? Should we recommend one? Quote from Qian \& Lehman: +\emph{Although many other syntactic-level errors are reported in previous research +(see Altadmri and Brown (2015), Hristova et al. (2003), and Sorva (2012)), we +do not discuss them in depth here, because problems in syntactic knowledge +are often easy to detect and fix. 
Perhaps that is why they are often noted as +the most frequent mistakes novices make (Altadmri and Brown 2015; Jackson et +al. 2005). A compiler or a modern integrated development environment (IDE) +may be able to find them and then provide error messages or hints for +correction.} diff --git a/research/misconceptions/types.tex b/research/misconceptions/types.tex index 356664b4..dbc0ed2c 100644 --- a/research/misconceptions/types.tex +++ b/research/misconceptions/types.tex @@ -1,12 +1,47 @@ \subsection{Data types} -\subsubsection{Role in the syllabus} +When introduced to programming, students will often encounter several +different data types, for example strings, integers, arrays and +dictionaries. As expected, when learning about several data types during a +short time span, different misconceptions occur. -In this chapter I was thinking that we could combine lists, arrays, maybe dictionaries, strings, charachters and the comparison of different type of variables. Maybe I will found another type that can be included here. +One of the first data types that students encounter is strings, which can +be seen as a simple type. However, \textcite{GuoMarkelZhang2020} found in +their study that students often have a misconception about how one should +use a string in a function call. When passing a string as an argument, for +example \mintinline{python}{foo("Hello World")}, students instead write +\mintinline{python}{foo(Hello World)}; the misconception concerns the syntax of +string literals. -\subsubsection{Difficulties that can occur} +XXX Add analysis on how to teach how to declare string variables. Maybe +this misconception should be in the section on functions and variables? -A data type that is common in CS1 is arrays. \textcite{Kurvinen2016} found in their studies that it was not intuitive for the students that the index of an array starts at 0 instead of 1. 
They also saw that the students had a hard time figuring out how to loop through an arrays elements by using the elements index. The same problem was found by \textcite{KumarVeerasamy2016} that noticed that students often were of by one index when looping through the list using index, causing index-error when trying to extract an element with an index larger then the length of the list. They also saw a tendency to use negative index-numbers when unneccesary to extract elements from an array. +When using data types in CS1, students are often required to +compare variables. This creates situations where +students try to compare values of different data types, which shows +that they have misconceptions about the difference between data +types and how they are initialised and later used +\parencite{Kurvinen2016}. Another common misconception concerns which data +type is returned from terminal input or text files. Both of these return +strings, but students often believe that numbers read +from the terminal or from files are automatically converted to +integers \parencite{GuoMarkelZhang2020}. Unfortunately, this misconception +will not always raise an error, since some operators +work for both strings and integers, and it will instead lead +to subtle errors later on in the program. -When using data types in CS1 the students are often required to use some kind of comparison between variables. This creates situations where the students try to compare different data types to each other, which shows that the students are not completely familiar to the difference between data types and how they are initalized and later used \parencite{Kurvinen2016}. 
When objects are introduced to the students later in the course it will create more confusion when the students are suppose to compare objects with simple data types, which can be made easier if comparison between types are repeated by the teacher at the end of the course \parencite{Kurvinen2016}. +XXX Add analysis on how to introduce different data types so that the +students understand the difference between them + + \textcite{Kurvinen2016} found in their studies that it is not intuitive +for students that the indexing of an array starts at 0 instead of 1. +They also noticed that students have a hard time figuring out how to loop +through an array's elements by using the elements' indices. The same problem +was observed by \textcite{KumarVeerasamy2016}, who found that students often +are off by one when looping through an array by index, causing +an index error when trying to access an element with an index beyond +the end of the array. They also saw a tendency to use negative +indices unnecessarily to extract elements from an array. + + XXX Insert analysis on how we can teach indexing of arrays \ No newline at end of file