| Title | Swedin, Skyler Lee MCS_2025 |
| Alternative Title | Findings on Preferences Between Human and Artificial Data Summaries |
| Creator | Swedin, Skyler Lee |
| Contributors | Ball, Robert (advisor) |
| Collection Name | Master of Computer Science |
| Description | This study examines public preferences for AI-generated versus human-written data summaries, revealing statistically significant differences tied to gender and educational background. Findings show that women with non-STEM backgrounds tend to prefer human-written summaries, while men with STEM backgrounds favor AI-generated ones, with other groups falling in between. |
| Abstract | Data is a common, indispensable utility in the modern world. This utility is so great that we continue to produce and store an unending amount of it, using it to enrich our lives. With its great volume it has become increasingly difficult to process and comprehend at a reasonable speed. We have, therefore, begun to utilize summarizations to quickly process new information from this otherwise intractable data. Artificial intelligence, another useful utility, is often used to create these summaries which are so useful in saving us time, but how receptive is the public to artificial data summaries? We ran an experiment to measure the preferences people exhibit toward data summaries generated by an artificial intelligence vs data summaries written by humans. We found statistically significant differences in the types of summaries people prefer, and how those change depending on the person. Specifically, women with non-STEM (Science, Technology, Engineering, and Mathematics) educational backgrounds report a preference for human written summaries, whereas men with a STEM educational background report the opposite. Women with a STEM background and men with a non-STEM background fall somewhere in between. |
| Subject | Computer science; Artificial intelligence |
| Digital Publisher | Digitized by Special Collections & University Archives, Stewart Library, Weber State University. |
| Date | 2025-07 |
| Medium | theses |
| Type | Text |
| Access Extent | 135 page pdf |
| Conversion Specifications | Adobe Acrobat |
| Language | eng |
| Rights | The author has granted Weber State University Archives a limited, non-exclusive, royalty-free license to reproduce his or her thesis, in whole or in part, in electronic or paper form and to make it available to the general public at no charge. The author retains all other rights. For further information: |
| Source | University Archives Electronic Records: Master of Education. Stewart Library, Weber State University |
| OCR Text | Findings on Preferences Between Human and Artificial Data Summaries by Skyler Lee Swedin. A Thesis in the Field of Computer Science for the Degree of Master of Science in Computer Science. Approved: Dr. Robert Ball, Advisor/Committee Chair; Dr. Dylan Zwick, Committee Member; Professor Joshua Jensen, Committee Member. Weber State University, April 2025. Copyright 2025 Skyler Lee Swedin. Abstract: Data is a common, indispensable utility in the modern world. This utility is so great that we continue to produce and store an unending amount of it, using it to enrich our lives. With its great volume it has become increasingly difficult to process and comprehend at a reasonable speed. We have, therefore, begun to utilize summarizations to quickly process new information from this otherwise intractable data. Artificial intelligence, another useful utility, is often used to create these summaries which are so useful in saving us time, but how receptive is the public to artificial data summaries? We ran an experiment to measure the preferences people exhibit toward data summaries generated by an artificial intelligence vs data summaries written by humans. We found statistically significant differences in the types of summaries people prefer, and how those change depending on the person. Specifically, women with non-STEM (Science, Technology, Engineering, and Mathematics) educational backgrounds report a preference for human written summaries, whereas men with a STEM educational background report the opposite. Women with a STEM background and men with a non-STEM background fall somewhere in between.
Acknowledgments

I thank the United States Air Force for giving me the opportunity to attend graduate school and complete this thesis. Additionally, I express my thanks to my thesis committee for their willingness to assist me in my research. Thank you, Dr. Dylan Zwick and Professor Joshua Jensen, for your excellent instruction and contributions to the data summaries presented in this work. I express my thanks to my colleague, Samuel Reeder, as well. Your assistance and willingness to promote my survey are greatly appreciated. Last, but certainly not least, I offer special thanks to my thesis advisor Dr. Robert Ball. Without your expert guidance and willingness to engage with me I would not have made it this far.

Table of Contents

Acknowledgments
List of Tables
List of Figures
Chapter 1. Introduction
Chapter 2. Related Work
Chapter 3. Approach
    3.1 Data Collection and Summary Creation
    3.2 Survey Creation
        3.2.1 Survey Presentation
        3.2.2 Survey Questions
        3.2.3 Survey Administration
    3.3 Survey Analysis
Chapter 4. Results
    4.1 Statistical Findings
    4.2 Qualitative Findings
Chapter 5. Conclusion
    5.1 Experiment Reflections
    5.2 Further Research
    5.3 Looking to the Future
Appendix 1: Prompts
Appendix 2: Survey
References

List of Tables

Table 1. Human Writer Assignment
Table 2. Pair Distribution
Table 3. Number of Participants by Gender and STEM Educational Background
Table 4. Number of Participants who Completed the Entire Survey by Gender and STEM Educational Background

List of Figures

Figure 1. Example of a Data Summary Pair
Figure 2. Example of an Omission-Type Pair
Figure 3. Survey Question #1
Figure 4. Survey Question #2
Figure 5. Survey Question #3
Figure 6. Survey Question #4
Figure 7. Survey Question #5
Figure 8. Survey Question #6
Figure 9. Survey Question #7
Figure 10. Survey Question #1 for Omissions
Figure 11. Summary Preferences Between the Sexes with and Without a STEM Educational Background
Figure 12. Trust Ratings
Figure 13. Understanding Ratings

Chapter 1. Introduction

Artificial Intelligence (AI), according to IBM, is “technology that enables computers and machines to simulate human learning, comprehension, problem solving, decision making, creativity and autonomy” [1]. So sophisticated is AI becoming that it is expected to reach capabilities approximating those of a human by 2030 [2]. Throughout the latter 2010s and into the 2020s, an important facet in the development of computer science has been the introduction of AI into the mainstream. Thanks to newfound tools such as ChatGPT and an increase in computational resources available through the cloud, AI is becoming a cornerstone in the lives of not just computer scientists, but everyone.
Instead of only being the tool of high-level programmers or computer architects, AI can now be used by the average user to create a grocery shopping list, give recommendations for vacation spots, or, if asked, write a student’s essay, among other tasks. The ease and availability of AI tools to users with little computational knowledge means that interacting with AI on a day-to-day basis is now a reality. Indeed, as we go about our business the ways in which AI touches upon all facets of life may not be immediately obvious, as it has become so natural and entrenched within our daily habits. A smartphone user may use his phone for ten minutes and not realize that he interacted with AI at least a dozen times. In just those ten minutes he may have utilized AI to unlock his phone via face recognition, received autofill recommendations for his search query on Google.com, read five AI generated emails, utilized a dating app with autonomous matching, read a data summary, and so forth.

As AI continues to develop, its impact on people’s sensibilities ought to be continuously studied. To that end, this thesis presents findings which potentially add to the research. We hope that the information contained herein may be useful in advancing the reader’s knowledge of the dynamic between humans and AI. We were specifically interested in analyzing the attitudes people have regarding the presence of AI in data summaries. We designed our experiment considering the following research questions and hypotheses.

Research questions:
• Do people prefer data summaries written by a human or generated by AI?
• Does the foreknowledge of the presence of AI affect people’s preferences towards these data summaries?
• Do a person’s gender and STEM (Science, Technology, Engineering, and Mathematics) educational background affect his or her preference towards these data summaries?

Hypotheses:
• People will generally exhibit a preference for data summaries written by humans.
• The foreknowledge of the presence of AI will affect people’s preferences towards data summaries, with that preference being in favor of human generation.
• A person’s gender and STEM educational background will affect his or her preference towards data summaries, with STEM backgrounds having a greater preference for AI generated summaries than those without a STEM background.

We answered these questions by creating data summaries – both AI generated and human written – and surveying a sample of people on their preferences towards them. This paper is organized as follows: Chapter 2 presents a related work section comprising the relevant literature on AI and human attitudes towards it. Chapter 3 outlines the approach we took and explains how the experiment was conducted. Chapter 4 describes the results, presenting an overview of the participants’ responses. Chapter 5 concludes the work and offers suggestions for future research.

Chapter 2. Related Work

Research into AI and human interaction has existed since at least 1950 with the publication of Alan Turing’s seminal paper Computing Machinery and Intelligence, in which the Imitation Game was first proposed [3]. This game, henceforth referred to as the Turing Test, attempts to determine if a machine has the capability to deceive a human into thinking that it is, in fact, also a human. While the specifics of how the test is conducted vary, Turing proposed it as follows. A human interrogator is placed in a room, and a man (referred to as A) and a woman (referred to as B) are placed in another room. The objective of the interrogator is to correctly identify which is the man and which is the woman. The interrogator puts questions to the two (over teleprinter, else a participant’s voice or handwriting might give them away), and each responds as needed. The objective of A is to fool the interrogator, so his answers might be lies or otherwise attempt to mislead.
B, on the other hand, helps the interrogator, providing him with truths or otherwise attempting to convince him of her legitimacy. If the interrogator can correctly identify the participants after his questioning, he wins; otherwise A wins. If A wins, he has passed the test. The Turing Test is applicable to AI when the man (participant A) is replaced by a machine. In this event, will the interrogator be able to correctly identify A and B as often as when A was a man? This is the question Turing proposed when discussing the capabilities of machines and their ability to think [3]. The Turing Test has since become a litmus test when determining the capabilities of AI. For example, since 2023 ChatGPT has been able to pass the test, fooling interrogators into thinking it is human [4], [5], [6].

Related to this, the debate surrounding what it means for a machine to be “intelligent” or “thinking” is ongoing. Some argue that the Turing Test is not necessarily the optimal metric on which to base an assessment of a machine’s “thinking”, with the machine’s capabilities instead being based on the intelligence of humans [7], [8]. By this reasoning AI is reflective of our own capabilities and its intelligence, insofar as it has any, can only be determined by how we interact with it. AI, then, is a facsimile of a human that is not capable of learning or growing on its own. Other researchers propose the opposite: that a machine is capable of learning and growing like a human, or that the definition of intelligence is irrelevant if we cannot tell a difference between it and ourselves while interacting [9], [10]. Regardless of the semantics surrounding machine intelligence it seems clear that the proliferation of AI is ongoing, embedding itself in many facets of our lives. Examples can be found in [11], [12], [13], [14], [15], [16], [17]. Exploration into the logical models by which AI may be constructed can be found in [18].
Given the increasing importance of AI and the debate thus established, the interaction between humans and AI is especially relevant to this thesis. Research into the preferences humans have regarding AI and human generated content – research which is similar to what is presented in this paper – was recently completed in [19]. In it, Abel and Johnson asked ChatGPT 4 to generate a short story in the style of author Jason Brown. This story was then presented to a group of readers, who were offered a study compensation of $3.50 to read and assess it. Half of these participants were told that the story was AI generated, and the other half were misled into believing it was the genuine work of Jason Brown. After reading the first half of the story participants were asked to rate it up to that point on measures of emotional engagement, creativity/originality, authenticity, and so forth. They were also asked how willing they were to pay to read the rest of the story. This payment came in two forms: how much of their study compensation they would be willing to give up, and how many minutes they would be willing to transcribe provided text. The researchers found that the group of readers who were told the story was AI generated had a much more negative rating on the aforementioned measures, but that both groups were willing to pay the same amount of money and time to finish reading [19]. After the experiment, approximately 40 percent of participants in the human group said they would have paid less if the same story had been written by an AI as opposed to a human [19]. These results suggest that the stated preferences people have toward AI generated short stories are not reflective of their actual choices, seeing as there was no noticeable difference in the willingness to pay of both groups.
The available literature at present has similar conclusions to Abel and Johnson, suggesting that humans tend to prefer AI generated works when the generation method is unknown [20], [21], [22]. In the case of artwork, it has been demonstrated that art labelled as AI generated is rated more negatively than that same artwork labelled as human generated [23], [24], [25], [26], [27].

Of note, there are many types of AI systems. Some of these include computer vision, reinforcement learning, and recommendation engines, among others. Examples of each can be seen in [28], [29], and [30] respectively. For our purposes, however, the type of AI we’ve been discussing thus far is primarily that of generative AI using large language models (LLMs). An LLM is a language processing system that uses a large amount of training data to learn how to communicate in the language of its training. Examples of LLMs include Google’s Gemini, Meta’s LLaMA, and OpenAI’s ChatGPT. We can think of these systems as communication robots capable of generating text (hence the generative AI moniker), as any reader who has had a conversation with one can attest.

To create an LLM researchers will often use a transformer, which is an architecture that allows a system to identify the relative importance of pieces of data in relation to one another. For example, if our training data is text then a transformer will ascertain how important words are in the context of one another, such as when constructing a sentence and deciding which words should appear in what order. The transformer, and by extension the LLM, forms the basis for what we typically call AI today. The transformer was proposed in 2017 by Google [31]. Note that ChatGPT specifically is more accurately classified as a generative pre-trained transformer (GPT) rather than an LLM (although it still is an LLM). Large language models need not necessarily use a transformer in their architecture, whereas a GPT always does and is the product line of OpenAI.
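As a rough illustration of how a transformer weighs the words of a sentence against one another, the following is a minimal sketch of scaled dot-product attention, the core operation of the architecture proposed in [31]. The toy embeddings, the function name, and the choice to reuse one matrix as queries, keys, and values are assumptions made for brevity; a real transformer learns separate projections for each.

```python
import numpy as np


def scaled_dot_product_attention(queries, keys, values):
    """Return (attended values, attention weights) for 2-D input arrays."""
    d_k = queries.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = queries @ keys.T / np.sqrt(d_k)
    # Row-wise softmax (shifted by the row max for numerical stability).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of the value rows.
    return weights @ values, weights


# Hypothetical 4-dimensional embeddings for a 3-word sentence (random, not trained).
rng = np.random.default_rng(seed=0)
embeddings = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(embeddings, embeddings, embeddings)
print(weights.round(2))  # row i: how strongly word i attends to each word
```

Each row of `weights` sums to one, so it can be read as the relative importance the model assigns to every other word when representing a given word.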
The generative pre-trained architecture itself was proposed in 2018 [32]. While virtually all major LLMs today use transformers, there have been attempts to use other architectures, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs). Examples of these can be seen in [33], [34], [35]. The rest of this paper will focus on LLMs – ChatGPT specifically – but the future of AI may or may not involve entirely different systems. Examples of prospective AI futures can be seen in [36], [37], [38], [39], [40].

Chapter 3. Approach

To answer our research questions, we designed an experiment in which participants were shown two data summaries about the same topic – one AI generated and one human written – and were asked to answer questions about them. The following sections explain the methodology and process we followed to create the experiment.

3.1 Data Collection and Summary Creation

To create the data summaries, we obtained data about four topics. These topics were ice cream sales vs temperature, college majors, S&P 500 stocks, and physics particles. One dataset was used for each topic, and the data was sourced from Kaggle.com [41], [42], [43], [44]. The reasoning for choosing each topic was to select data that a layman might be generally familiar with, then gradually raise the difficulty by using data from topics that were increasingly less common. In this regard we suggest that a random person selected from the population might be familiar with the relationship between ice cream sales and temperature but might be increasingly less familiar with college major data, stock trading, and particle physics. Of course, the specifics of any given person’s knowledge are unknown. The topics were chosen at our discretion and may be considered a weakness of the study, but we think they provide a varied distribution of ideas.

After selecting the data, we designed four types of data summaries that could be used with each dataset.
Each type of summary was used twice for each set, once for a human written summary and once for an AI generated summary. The four types of summaries were:
• A univariate analysis, in which just one variable of the data was summarized.
• A bivariate analysis, in which two variables from the data were analyzed and compared to each other.
• A prescriptive analysis, in which the data was used holistically to answer a question related to its subject.
• A speculative analysis, in which the summary writer was free to speculate as to why the data looked like it did.

Two summaries of each type were created for each dataset, leading to a total of eight summaries for each set or 32 summaries total. Of these 32 summaries 16 were written by humans and 16 were generated by AI, and the human writer and AI were given the same prompt for each summary. For example, the univariate prompt for the ice cream sales vs temperature dataset was “Summarize the relationship between sales and temperature (e.g., correlation)”. Since both the human and AI wrote about the same topic, we can say that the 32 total summaries comprise 16 pairs of summaries (or four pairs for each dataset), with each pair reflecting the creations of humans and AI on the same topic. A list of the prompts for each type and dataset can be found in appendix 1.

To help control bias and ensure our human written summaries were reflective of more than just one writer, we used four human writers. Each writer was assigned one summary type per dataset, such that each writer wrote exactly one summary of each type and exactly one summary for each dataset. For example, writer #1 wrote the bivariate summary for the ice cream sales dataset, the speculative summary for the college majors dataset, the univariate summary for the S&P 500 stocks, and the prescriptive summary for physics particles.
Writer #2 was assigned a different combination, covering each summary type and each dataset once in a different arrangement than the other writers. This led to a balanced spread of writing and analysis from the human contributors, whose assignments can be seen in table 1.

Table 1. Human Writer Assignment

| Dataset | Writer #1 | Writer #2 | Writer #3 | Writer #4 |
|---|---|---|---|---|
| Ice Cream Sales | Bivariate | Speculative | Univariate | Prescriptive |
| College Majors | Speculative | Univariate | Prescriptive | Bivariate |
| S&P 500 Stocks | Univariate | Prescriptive | Bivariate | Speculative |
| Physics Particles | Prescriptive | Bivariate | Speculative | Univariate |

The human writing assignments for each dataset.

The AI generated summaries were generated by ChatGPT 4o after the human writers had completed their summaries. It was given the exact same datasets the humans used and the exact same prompts. A stipulation imposed on both humans and AI was to keep each summary to around 100 words.

3.2 Survey Creation

After obtaining all 32 summaries, we designed a survey to gauge their public receptivity. Survey participants were asked to provide demographic data and then respond to questions rating their preferences toward the summaries on various measures. These measures, as well as how the survey was presented, will be discussed below. The survey itself can be seen in appendix 2.

3.2.1 Survey Presentation

Before answering questions, participants were shown two data summaries, as well as a description of what data the summary covered. An example can be seen in figure 1.

Figure 1. Example of a Data Summary Pair
Figure 1 shows the first data summaries participants saw. A brief explanation of what the summaries cover is written at the top.

The order in which the human written and AI generated summaries were presented (that is, which appeared first, at the top, and which appeared second, at the bottom) was random and changed per question.
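The writer assignment in table 1 has the structure of a 4 × 4 Latin square: every summary type appears exactly once per dataset and exactly once per writer. The sketch below rebuilds that assignment by cyclic rotation and checks the balance property. It is an illustration only; the function names are hypothetical, and it is not claimed to be how the table was originally drawn up.

```python
SUMMARY_TYPES = ["Bivariate", "Speculative", "Univariate", "Prescriptive"]
DATASETS = ["Ice Cream Sales", "College Majors", "S&P 500 Stocks", "Physics Particles"]


def build_assignment(types, datasets):
    """Map each dataset to a list of summary types, one per writer (#1..#n).

    Row i is the type list rotated left by i, which yields a Latin square.
    """
    n = len(types)
    return {
        dataset: [types[(row + col) % n] for col in range(n)]
        for row, dataset in enumerate(datasets)
    }


def is_latin_square(assignment):
    """True if every type occurs once in each row (dataset) and column (writer)."""
    rows = list(assignment.values())
    expected = set(rows[0])
    if len(expected) != len(rows):
        return False
    rows_ok = all(len(row) == len(rows) and set(row) == expected for row in rows)
    cols_ok = all(set(col) == expected for col in zip(*rows))
    return rows_ok and cols_ok


assignment = build_assignment(SUMMARY_TYPES, DATASETS)
for dataset, types_by_writer in assignment.items():
    print(f"{dataset:18s}", types_by_writer)
print("balanced:", is_latin_square(assignment))
```

With this ordering of summary types, the rotation reproduces the rows of table 1, so the first entry of each row is the work assigned to writer #1.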
To ascertain the attitudes that people have toward AI, not all the pairs presented to participants truthfully reported the generation method of the summaries. Of the 16 pairs, only nine truthfully labelled the human written and AI generated summary. We call these pairs truths. Three of the pairs, which we call misattributions, labelled a human written summary as being AI generated, and vice versa. The remaining four are called omissions. Omissions did not label each summary with its generation method; they instead labelled the pair of summaries as summary #1 and summary #2. The purpose of including omissions in the survey was to observe whether the preferences people report when the generation method is labelled differ from the preferences they exhibit when it is hidden. An example of an omission can be seen in figure 2. Unlike in figure 1 above, the summaries are not labelled as human written or AI generated.

Figure 2. Example of an Omission-Type Pair

Truths, misattributions, and omissions were distributed throughout the survey, as shown in table 2. The first three pairs participants saw were truths, the fourth was an omission, the fifth was a misattribution, and so forth.

Table 2. Pair Distribution

| Pair Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pair Type | T | T | T | O | M | T | O | T | M | T | O | T | O | T | M | T |

Pair Number: Human and AI data summaries about the same topic, presented in pairs. Participants were asked to read both summaries in each pair and answer questions about which they preferred. Pair Type: T is truth, M is misattribution, O is omission.

The final survey consisted of 16 pairs of data summaries (referred to as Pair Number in table 2) and seven questions for each pair. These questions will be covered in the following section.

3.2.2 Survey Questions

After reading a pair of data summaries, participants were asked the questions shown in figures 3, 4, 5, 6, 7, 8, and 9.

Figure 3. Survey Question #1
This is the first question participants were asked after reading a pair of summaries.
It is a single response type, allowing participants to choose only one option.

Figure 4. Survey Question #2
The only optional question. It was a free response type, allowing participants to input whatever they wanted. The responses they provided will be discussed in chapter 4.

Figure 5. Survey Question #3
Question 3 asked participants to rate their expertise on the topic that pair of summaries presented. Questions 3-7 asked them to rate themselves on a Likert scale of 1-5.

Figure 6. Survey Question #4
The purpose of this question was to see how trust changed over time for the human written summaries.

Figure 7. Survey Question #5
Like the previous question, but for the AI generated summaries.

Figure 8. Survey Question #6
The purpose of this question was to see how understanding changed over time for the human written summaries.

Figure 9. Survey Question #7
Like the previous question, but for the AI generated summaries.

If the presented pair was an omission the first question would not state the generation method, as shown in figure 10.

Figure 10. Survey Question #1 for Omissions
The first survey question for omissions is like question #1 for other types, except for the removal of summary generation methods.

We asked the participants these questions to observe if the lack of information about generation method influenced preferences in favor of AI or human generation. For example, if a participant who typically favored AI generated summaries was unaware of a given summary’s generation method, would he or she exhibit different preferential behavior? This is discussed further in the next chapter. The survey was created using Qualtrics Survey Tool and administered by us.

3.2.3 Survey Administration

We used Qualtrics’ distribution feature to create a public survey URL by which anyone with the link could take the survey. We asked friends and family to share the URL with as many people as possible, posting it to Facebook, X, and Instagram.
Most who responded to the survey were also friends or family interested in our success, meaning the data could be biased due to a limited sample with similar motivation. This distribution method was chosen due to its ease and the limited amount of time we had to complete the experiment. There were a total of 74 responses, although not all responses produced equally valuable data. Of these, only 19 participants completed the entire survey, and 15 did not answer any questions. The remaining 40 participants answered some questions – the exact amount inconsistent between them. Some only answered a few questions, while others completed up to approximately half of the survey. This is unfortunate, as the amount of data we were able to collect was quite low. This is perhaps the biggest weakness of our study. The reason for so few participants completing the survey was likely due to its length and the amount of reading involved. The average time to complete the survey for those who finished it in its entirety was 2810.1 seconds (46.8 minutes). The average time spent on the survey among all participants was 944.8 seconds (15.7 minutes). The median for each was 1701 and 235.5 seconds respectively (3.9 and 28.4 minutes). When we consider that participants were asked to read 32 summaries – each around 100 words – and answer related questions, it is perhaps unsurprising that the completion rate was only 25.7 percent. Additionally, our sample was selected from known acquaintances and relations, leading to a homogenous sample which was not randomly selected. This may have introduced selection bias, limiting the validity of our study. Suggestions to improve the survey are offered in chapter 5. 31 3.3 Survey Analysis After collecting the survey results we used Jupyter Notebooks and the Python libraries Pandas, NumPy, and Matplotlib to analyze the responses. The findings of this analysis will be discussed in the next chapter. 32 Chapter 4. 
Results

The findings from the survey results will be discussed in the below sections. Section 4.1 covers the statistical findings from the numerical responses of the survey, while section 4.2 covers the qualitative data provided by question #2.

4.1 Statistical Findings

Although the number of responses was low, we performed a statistical analysis of the data and found intriguing results. The results were first divided into two categories: one which contained all participants’ answers, and the other which contained the answers of participants who completed the entire survey only, both broken down by gender and STEM educational background. The number of participants in each category is shown in tables 3 and 4.

Table 3. Number of Participants by Gender and STEM Educational Background
Gender STEM Non-STEM Male Female 16 7 14 22
There was a total of 59 participants. 19 of them completed the entire survey.

Table 4. Number of Participants who Completed the Entire Survey by Gender and STEM Educational Background
Gender    STEM    Non-STEM
Male      8       4
Female    4       3
There was a total of 19 participants who completed the entire survey.

Given this information, chi-square analyses and t-tests were performed on the following group distributions:
• Male vs Female responses for all participants.
• STEM vs non-STEM responses for all participants.
• Four-way analysis across Male/STEM, Male/non-STEM, Female/STEM, Female/non-STEM responses for all participants.
• Male vs Female responses for participants who completed the entire survey only.
• STEM vs non-STEM responses for participants who completed the entire survey only.
• Four-way analysis across Male/STEM, Male/non-STEM, Female/STEM, Female/non-STEM responses for participants who completed the entire survey only.
For each of these distributions, 17 chi-square tests were conducted among the comparison dimensions (e.g., male vs female for all participants). The first 16 tests looked for statistical differences in summary preferences for each of the 16 pairs, and the final test was for a summation of all responses from that dimension. For example, the first test was for male and female preferences for the first pair only, the second for the second pair, and so forth, while the final test was conducted on summed preferences between males and females across all pairs. To do this we assumed that questions for each pair were independent from questions for other pairs. While true independence in a survey like this is difficult to measure, the purpose of assuming each question’s independence was to see if any pairs elicited outlier responses.

Additionally, several t-tests were conducted for each group distribution comparing the dimensions’ Likert rating for understanding and trust in the human and AI summaries. Paired t-tests were also performed to compare trust in humans vs AI regardless of group distribution. The noteworthy results for both the chi-square tests and t-tests are as follows. When considering multiple hypothesis testing, however, it is possible that many of our “significant” tests produced noise that isn’t actually significant. In this regard performing so many statistical tests was likely a mistake.

Note that among all group distributions, for all pairs and questions, there were not enough participants who selected “neither” when asked their preference to make a statistical difference. In general, there was not enough data for individual questions to produce statistically significant results across any dimension (exceptions to follow). However, with the number of responses we do have we observed a significant difference in preferences between men and women when STEM educational background is also considered.
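The multiple-testing concern raised above can be made concrete with a short calculation. The following is a minimal sketch and not part of the analysis performed in this study: it assumes the 17 tests within one comparison are independent and uses a conventional significance level of 0.05, which is an assumption since the level used is not stated here. The function names are hypothetical.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return alpha / num_tests


ALPHA = 0.05    # assumed significance level (not stated in the study)
NUM_TESTS = 17  # 16 per-pair tests plus one summed test per comparison

fwer = familywise_error_rate(ALPHA, NUM_TESTS)
threshold = bonferroni_threshold(ALPHA, NUM_TESTS)

print(f"Family-wise error rate over {NUM_TESTS} tests: {fwer:.3f}")
print(f"Bonferroni per-test threshold: {threshold:.5f}")
```

Under these assumptions, the chance of at least one spurious "significant" result within a single comparison is well above the nominal 5 percent, which is why a stricter per-test threshold is often used when many tests are run.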
A four-way chi-square test between males with a STEM educational background, males without such a background, females with the background, and females without revealed a statistically significant difference in data summary preference, as shown in Figure 11.

Figure 11. Summary Preferences Between the Sexes with and Without a STEM Educational Background
Figure 11 shows the difference in preferred summary between men and women who come from a STEM educational background.

While women with STEM backgrounds and men without are relatively even in their stated preference, women without a STEM background strongly favor human written summaries while men with one tend to favor AI generated summaries. In general, non-STEM individuals favor human written summaries more than STEM individuals. With a p-value of 0.0433, we found that there is a difference in data summary preferences between men and women depending on STEM background. This is consistent with the findings in [30], lending support to its conclusion. It should be noted, however, that our small sample size may have led to a biased sample not reflective of the world. Bearing this in mind, the results suggest that there may be a difference between male and female preferences in AI generated vs human written data summaries which could warrant future investigation.

It is possible, however, that a more significant factor in data summary preferences lies in STEM educational background rather than gender. To observe this, we performed chi-square tests measuring just male vs female and just STEM vs non-STEM educational background. We found p-values of 0.4355 and 0.0066 respectively, suggesting that the STEM background on its own is much more influential in a person’s preferences toward data summaries than gender. As above, we note that our sample was small and may not be indicative of the world. Despite this we are hopeful that the information we have discussed can motivate future research.
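A chi-square test of independence of the kind described above can be sketched as follows. This is an illustration under stated assumptions rather than the notebook code used in the study: the counts in `observed` are hypothetical placeholders (the raw preference counts are not reproduced in this chapter), the function name is our own, and the p-value uses the closed form that holds only for one degree of freedom, as in a two-group by two-preference table.

```python
import math


def chi_square_2x2(observed):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if group and preference were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, for illustration only.
# Rows: STEM, non-STEM. Columns: prefers human written, prefers AI generated.
observed = [
    [30, 50],
    [45, 35],
]

statistic, p_value = chi_square_2x2(observed)
print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}")
```

The four-way comparison uses a table with four rows and therefore three degrees of freedom, for which this closed form does not apply. A library routine such as SciPy's `chi2_contingency` handles the general case; SciPy is an assumption here, as it is not among the libraries named in section 3.3.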
We also investigated participants’ reported trust and understanding of both the human and AI summaries. T-tests on both measures revealed the following about how average trust and understanding changed throughout the survey. Figures 12 and 13 illustrate our results for all participants.

Figure 12. Trust Ratings
Figure 12 shows the mean difference between reported trust for all 16 data summary pairs. Bars above zero show a greater trust in AI summaries, and vice versa.

Figure 13. Understanding Ratings
Figure 13 shows the mean difference between reported understanding for all 16 pairs of data summaries. Bars above zero show a greater understanding of the AI summaries, and vice versa.

The mean difference over time in both trust and understanding differs by small amounts, with some noticeable standouts such as pair #6. This pair, an omission, is the first major example of AI generated summaries overtaking human written ones in trust and understanding. A chi-square test comparing all participants vs only those who completed the entire survey reveals no statistical difference in reported trust and understanding compared to the results shown in the above figures. The same is true for gender, but with two exceptions.

Chi-square tests suggest that men understood the AI summary of pair #5 much more than women, and that women understood the human summary of pair #15 much more than men. The p-values of these tests were 0.0174 and 0.0054 respectively. These pairs are both misattribution pair types, but the reason for such low p-values is unclear. Pair #5 covers the particle physics topic. For this pair, many female participants reported not understanding either summary. Pair #15 covers the ice cream sales topic. For this pair, many male participants reported not understanding either summary, and few left a comment explaining why. For pair #5, the average reported understanding on a 1-5 Likert scale for the AI summary was 3.31 for males and 2.00 for females.
For pair #15, the average reported understanding on the same scale for the human summary was 4.17 for males and 5.00 for females. The qualitative findings of both pairs will be covered in the next section.

All other statistical findings were insignificant; the rest of our chi-square and t-tests revealed nothing of note. We also checked if truth, misattribution, or omission pair types were more likely to produce preferences for human written summaries, but found nothing of significance. It is still possible that effects exist which our study did not detect, but we lack the data to find them.

4.2 Qualitative Findings

Participants were able to utilize a free response question type to state specific reasons for choosing their preferred summary. The results of these qualitative answers are covered here.

Most comments left by users contained solely positive or negative ideas, directed toward one generation method. Some comments contained both positive and negative content, praising one side while critiquing the other. Generally, far more comments were positive than negative, in both directions.

The AI was praised for its reasoning, organization, data driven approach, data insights, and presentation. This can be seen in comments such as “It provides more information” and “Less opinionated and felt like it quickly gets into the content and showed understanding of the subject”. The human writers were praised primarily for writing that was simpler and easier to understand than the AI’s. “It is easier to understand” and “The concept in the human version was an easier read. Especially since I’m not interested in the subject matter.” were two such comments reflecting this.

The AI and human summaries received approximately the same number of positive comments. Out of all comments, about 35 percent were positive comments directed at the AI and about 35 percent were positive comments directed at the humans.
For negative remarks, the AI was often criticized for being too complicated and speculating too much. Comments such as “The AI version includes words that are to [sic] advanced which deviates from a simple point.” and “The human written version makes more sense to someone who isn't a statistician” reflect this. Human summaries were criticized for having a poor writing style or not reflecting the data appropriately. These comments included “The Human Written seemed as if it were rambling a bit. I got confused and noticed a grammatical error that distracted me.” and “AI is not emotion based”.

The human summaries received about twice as many negative comments as the AI ones. Out of all comments, about 18 percent were negative toward humans and 10 percent negative toward AI. While we cannot say for certain, it is possible that the human writers would have received fewer negative comments if we used a different set of human writers.

Revisiting pairs #5 and #15, the comments may assist in ascertaining why there was such a stark difference in understanding between men and women. For pair #5, in which men understood the AI summary much more than women, we see comments such as “I don’t really understand the subject matter I am reading about in either summary.”, but there are fewer comments in general than with other pairs. The available data, both quantitative and qualitative, does not suggest anything of note. We may conclude that the subject matter covered in this pair (physics particles) is complicated in general, that this is a type I error due to many tests, or that the results of our chi-square are a fluke driven by an unlucky sample.

For pair #15, in which women understood the human summary much more than men, we see comments such as “This provides a variety of recommendations to consider based on the data suggesting higher profit in warmer conditions” in support of the AI summary.
This pair is a misattribution, and it is possible that more men than women were suspicious of it, hence the lower understanding, or that it, like pair #5, could be a fluke. It does present one of the largest examples of length difference between the human and AI summary, with the human summary being much shorter, which could explain the confusion. There also does not appear to be any data, quantitative or qualitative, that hints at the reason.

Regardless, we can conclude that the survey participants were harsher toward the human summaries than the AI ones with their qualitative responses. It is possible this is due to differing expectations, the quality of our writers, or something else. Future research into human motivation when assessing AI works may shed light on this.

Chapter 5. Conclusion

In this study we were able to test the public receptivity of human written vs artificially generated data summaries. We designed an experiment in which several datasets were summarized by both humans and AI, and the public then rated their preferences toward these summaries via a survey. The following section will answer the research questions first introduced in chapter 1, based on the results of the experiment.

5.1 Experiment Reflections

While the amount of data we collected was small due to issues with exposure, we were still able to show statistically significant results about gender and educational background predicting data summary preference. Given these results, we will now answer each of our research questions and compare the answers to our hypotheses.

• Question 1: Do people prefer data summaries written by a human or generated by an AI? According to our experiment’s results, it depends on the person. Women with non-STEM educational backgrounds report a preference for human written summaries, whereas men with a STEM educational background state the opposite.
Women with a STEM background and men with a non-STEM background fall somewhere in between. These results reject our first hypothesis.
• Hypothesis 1: People will generally exhibit a preference for data summaries written by a human. We found evidence to support the opposite.
• Question 2: Does the foreknowledge of the presence of AI affect people’s preferences towards these summaries? We found nothing to prove or disprove this. Our pair-types (truths, misattributions, and omissions) did not reveal any answers.
• Hypothesis 2: The foreknowledge of the presence of AI will affect people’s preferences towards data summaries, with that preference being in favor of human generation. Our results did not provide any statistically significant evidence in favor of this claim.
• Question 3: Does a person’s gender and STEM (science, technology, engineering, and math) educational background affect his or her preference towards these data summaries? Our findings suggest so: there was a clear difference in the preferences exhibited toward data summaries based on gender and STEM educational background.
• Hypothesis 3: A person’s gender and STEM educational background will affect his or her preference towards data summaries, with STEM backgrounds having a greater preference for AI generated summaries than those without a STEM background. Our findings appear to support this claim. We can conclude based on our own work, and that of others seen in the related works section, that this is likely true.

Reflecting on the experiment, there are things we would change if we were to repeat it in the future. Perhaps the biggest flaw in our design was our inability to collect data. This was due to the limited resources we had, but also the data collection mechanism itself. The survey was lengthy and there were no incentive structures in place to entice participants to complete it. A shorter survey with some form of payment for participating would be ideal in the future.
In addition, a larger number of misattributions and omissions may be useful, or perhaps utilizing just one of the two types. We also collected demographic data that was unused in our analysis, such as age and subject matter expertise. This data was unneeded in the end, but we still spent the time collecting it. A repeat of the experiment would do well to avoid this.

5.2 Further Research

Those who continue the research into the complex dynamic between human and artificial interaction might consider the preferences people exhibit toward AI. In regard to continuing our experiment, further data collection and analysis regarding demographics and data summary topics would be useful. The future researcher might also consider the intriguing differences between the sexes, and how that might manifest in the preferred data summarization technology we have utilized throughout our experiment. Other forms of generative AI might also be studied, not just data summaries.

5.3 Looking to the Future

It seems quite clear that artificial intelligence will continue to embed itself into the bedrock of our society, shaping much of our modern way of life. We hope that the future is a bright one, wherein we can utilize this tool called technology to enrich ourselves and others. It is our hope that the research presented in this paper might contribute some small insight into what comes next, and that the time spent reading it was beneficial.

Appendix 1: Prompts

The following are the prompts which were presented to the human writers and ChatGPT for each dataset and topic. Each of the four human writers wrote one summary for each dataset, as discussed in chapter 3. See table 1 for the assignments. All summaries should be limited to about 100 words.

o Temperature and Ice Cream Sales (Link, Ice Cream Sales – temperatures.csv)
   • 1 – Bivariate
      o Summarize the relationship between sales and temperature (e.g., correlation).
   • 2 – Univariate
      o Summarize sales only (e.g., mean, median).
   • 3 – Prescriptive
      o Give suggestions to an ice cream seller based on this data. How do sellers maximize sales?
   • 4 – Speculative
      o Explain why the data looks like this.
o College Majors (Link, all-ages.csv)
   • 1 – Bivariate
      o For all majors in the “Arts” category, summarize the relationship between total and employed (e.g., correlation).
   • 2 – Univariate
      o For all majors in the “Computers and Mathematics” category, summarize total (e.g., mean, median).
   • 3 – Bivariate (Two-Pair)
      o For all majors in the “Arts” and “Computers and Mathematics” categories, summarize the relationship between total and employed. How does this relationship differ between each major category?
   • 4 – Prescriptive
      o Give suggestions to prospective college students based on this data. What majors yield the best potential careers?
   • 5 – Speculative
      o Explain why the data looks like this.
o S&P 500 Stocks (Link, sp500_companies.csv)
   • 1 – Bivariate
      o For all stocks part of the magnificent 7, summarize the relationship between current price and revenue growth (e.g., correlation).
   • 2 – Univariate
      o Summarize current price (e.g., mean, median, stark differences between high/low stocks, etc.)
   • 3 – Bivariate (Seven-Pair)
      o For each of the magnificent 7, summarize the relationship between current price and revenue growth. How do these relationships compare between each of the 7 stocks?
   • 4 – Prescriptive
      o Give suggestions to investors based on this data. How should they invest to ensure maximum returns?
   • 5 – Speculative
      o Explain why the data looks like this.
o Physics Particles (Link, physics_particles.csv)
   • 1 – Bivariate
      o Summarize the relationship between particle mass and width (e.g., correlation).
   • 2 – Univariate
      o Summarize particle mass (e.g., mean, median).
   • 3 – Prescriptive
      o Give suggestions to physicists based on this data. What information can be gleamed about particles?
   • 4 – Speculative
      o Explain why the data looks like this.
Appendix 2: Survey

The survey begins by asking participants about their demographics, after which data summary pairs are presented. The first question for each pair is a single response type, the second is a free response, and the remaining are 1-5 Likert scales. For more information about the survey and data summary pairs see chapter 3.

Thesis Survey

Start of Block: Demographics

Thank you for taking the time to complete this survey. Your responses are important for shaping future research in artificial intelligence. Before proceeding with the survey please answer the following questions about your demographics, educational background, and interests.

Page Break

What is your sex?
o Male
o Female

What is your age?
Age: ▼ 18 ... 95

Do you have a Science, Technology, Engineering, or Math (STEM) educational background? To qualify for a STEM educational background, you must meet at least one of the following criteria:
• You have graduated from a university with a STEM degree
• You are currently enrolled in a university level STEM program
• You are currently employed in a STEM related job
o Yes
o No

How would you rate yourself as a technology enthusiast? A technology enthusiast is defined as someone who meets at least one of the following criteria:
• You regularly follow news on new technologies, either consumer or commercial
• You are an early adopter of new technologies, i.e., you often purchase new technologies shortly after release
• You have an active interest in technology
o 1 (Not at all a technology enthusiast) o 2 o 3 o 4 o 5 (Very clearly a technology enthusiast)

How would you rate your overall comfort level with artificial intelligence (AI), i.e., how would you rate your knowledge, experience, and familiarity with AI?
o 1 (Not at all comfortable) o 2 o 3 o 4 o 5 (Very comfortable)

Page Break

Please read before continuing.

The purpose of this survey is to learn about preferences in data summary generation.
Your answers will help us determine if people prefer data summaries written by a human or an Artificial Intelligence (AI). To create the questions you are about to see, a human writer and an AI were given a dataset and asked to write a summary about its contents. You will be shown the summaries and asked to choose which you prefer.

The following questions will show you:
• A preface explaining the dataset being used and what kind of summary the human writer and AI were asked to create
• A Human written data summary
• An AI generated data summary

The human writer and AI were asked to generate a summary about the same topic. For example, given a dataset about college majors, both were asked to provide recommendations to prospective students based on the data available. After you state your preference, you will be asked some questions about your trust and understanding of the summaries. You will be asked to answer questions for 16 pairs of data summaries. Please answer honestly and to the best of your abilities. Note that for some questions you will not be told the generation method for the summaries.

End of Block: Demographics

Start of Block: Part 1

This section is about ice cream sales. The human writer and AI were asked to speculate as to why the data looks the way it does.

Summary #1 (AI Generated): The dataset reflects the natural relationship between weather and ice cream consumption. Warmer temperatures typically increase demand for cold, refreshing treats, directly boosting ice cream sales. This behavior aligns with human preferences and seasonal trends, as people are more likely to indulge in ice cream to cool off in the heat. The high correlation between temperature and profits suggests minimal influence from other variables, such as marketing or economic factors, emphasizing temperature as the primary driver.
Variations in profits may result from occasional external factors like local events, competition, or customer preferences, but temperature remains the dominant influence.

Summary #2 (Human Written): The ice cream sales dataset has two columns: temperature, which is represented in Fahrenheit, and ice cream profits, which is represented in United States Dollars. It appears that when the temperature is cold there are few profits and when the temperature is hot there are more profits. In general, the profits are almost perfectly aligned with the temperature. However, it appears that there are slight changes from that pattern at extreme temperatures. In other words, people generally buy more ice cream when it is hotter and buy less ice cream when it is colder.

Which summary do you prefer?
o Summary #1 (AI Generated)
o Summary #2 (Human Written)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (ice cream sales)?
o 1 (Not at all an expert) o 2 o 3 o 4 o 5 (Expert)

How would you rate your overall trust with the human summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your overall trust with the AI summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your understanding of the human summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

How would you rate your understanding of the AI summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

Page Break

This section is about S&P 500 stocks.
The human writer and AI were asked to summarize the current share prices of companies which are a part of the S&P 500.

Summary #1 (Human Written): Presently, the mean and median price for company stocks in the S&P 500 are $217.89 and $118.37 respectively. Comparing this to the highest and lowest valued stock we can see that the homebuilder and mortgage company NVR Inc. has the highest share price at $8276.78, and the packaging company Amcor has the lowest at $9.40. Other noteworthy stock prices include Apple at $254.49 and Netflix at $909.05. NVR Inc. is a particularly standout example – most companies do not have stocks valued over $1000. The second highest stock belongs to the travel company Booking Holdings, valued at $5048.59, which is over $3000 less than NVR Inc.

Summary #2 (AI Generated): The average stock price in the dataset is approximately $217.89, while the median price is notably lower at $118.37, indicating a skewed distribution with some high-priced outliers. The highest stock price is $8,276.78, which starkly contrasts with the lowest price of $9.40. This wide range reflects the diversity of companies in the S&P 500, from smaller firms with lower valuations to dominant market leaders with significantly higher share prices. The gap between the mean and median also suggests that a few high-priced stocks heavily influence the average.

Which summary do you prefer?
o Summary #1 (Human Written)
o Summary #2 (AI Generated)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (S&P 500 stocks)?
o 1 (Not at all an expert) o 2 o 3 o 4 o 5 (Expert)

How would you rate your overall trust with the human summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your overall trust with the AI summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your understanding of the human summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

How would you rate your understanding of the AI summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

Page Break

This section is about physics particles. The human writer and AI were asked to summarize the relationship between particle mass and width.

Summary #1 (AI Generated): There is a strong positive correlation of approximately 0.73 between particle mass and width, indicating that more massive particles generally have higher decay widths. The mass values range from 0 to 172,500 units, while decay widths span from 0 to 2,495.2 units. Some particles have near-zero widths, likely corresponding to stable or long-lived particles. The relationship suggests that heavier particles decay more rapidly, aligning with theoretical expectations in particle physics. However, certain particles deviate from this trend, possibly due to unique interaction mechanisms or conservation laws affecting their stability and decay rates.

Summary #2 (Human Written): When comparing the mass and width of the particles, there is a correlation of R=0.73. That means that there is a relationship between the mass and width of the particles. In general, when the mass goes up the width generally goes up and when the mass goes down the width generally goes down. Even though the dataset represents particles, this is an intuitive result. This dataset is about quantum physics, but the same results from classic physics still hold. For example, the larger the mass of a person, the wider that person generally is.
It appears to be similar with particles.

Which summary do you prefer?
o Summary #1 (AI Generated)
o Summary #2 (Human Written)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (physics particles)?
o 1 (Not at all an expert) o 2 o 3 o 4 o 5 (Expert)

How would you rate your overall trust with the human summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your overall trust with the AI summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your understanding of the human summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

How would you rate your understanding of the AI summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

Page Break

This section is about college majors. The human writer and AI were asked to summarize the relationship between the total number of students and employed graduates for students with degrees in the Arts.

Summary #1: For majors in the "Arts" category, the correlation between the total number of individuals and those employed is extremely high at 0.9966, indicating a nearly perfect linear relationship. On average, 72.5% of individuals with an "Arts" major are employed, with this percentage varying by major. Total graduates in this category range from 8,511 to 571,961, while employed individuals range from 6,431 to 386,961. These statistics highlight that the number of employed graduates strongly depends on the total number of individuals pursuing each major.
Despite strong correlation, the data does not address underemployment or job relevance to their field of study.

Summary #2: On average, 73.8% of graduates in the Arts category are employed, with the highest employment coming from visual and performing arts (81%) and the lowest employment coming from miscellaneous fine arts (63%). For the Computers and Mathematics category, the average employment is 81.7%, with the highest employment coming from information sciences (87%), and the lowest coming from mathematics (69%).

Which summary do you prefer?
o Summary #1
o Summary #2
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (college majors)?
o 1 (Not at all an expert) o 2 o 3 o 4 o 5 (Expert)

How would you rate your overall trust with the human summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your overall trust with the AI summaries at this time?
o 1 (No trust) o 2 o 3 o 4 o 5 (Complete trust)

How would you rate your understanding of the human summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

How would you rate your understanding of the AI summaries at this time?
o 1 (No understanding) o 2 o 3 o 4 o 5 (Complete understanding)

End of Block: Part 1

Start of Block: Part 2

This section is about particle physics. The human writer and AI were asked to give suggestions to physicists about what could be learned about particles from the data.

Summary #1 (AI Generated): This dataset contains information about fundamental particles in physics.
We can use it to learn useful quantitative information about both fermions and bosons, as well as how these particles relate to the fundamental forces of nature. For example, we can see that all quarks, non-neutrino leptons, and W bosons have a charge. Physicists may use the information of the bosons here to determine which particles carry the electromagnetic, gravitational, and nuclear forces. There is also quantum physics information considered, and we can expect the data to be updated should certain quantum particles be physically proven (such as the graviton).

Summary #2 (Human Written): Physicists can use this dataset to explore relationships between particle properties. The strong correlation between mass and decay width suggests that heavier particles decay faster, consistent with theoretical expectations. The wide range of masses, from light mesons to heavy bosons, highlights the hierarchical nature of fundamental particles. The presence of stable particles with near-zero widths indicates long-lived or fundamental states, such as protons and neutrons. Studying deviations from expected mass-width trends may reveal new physics, such as unknown interactions or exotic states. Further investigation into quark compositions can provide insights into hadronic structure and fundamental symmetries.

Which summary do you prefer?
o Summary #1 (AI Generated)
o Summary #2 (Human Written)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (physics particles)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about S&P 500 stocks. The human writer and AI were asked to speculate as to why the data looks the way it does.

Summary #1 (AI Generated): The dataset reflects the diversity of companies in the S&P 500, representing a broad range of industries, sizes, and market capitalizations. Larger companies with higher stock prices, such as technology giants, skew the mean upward, while the median remains lower due to the majority of stocks being more moderately priced. High-priced outliers, like niche or highly specialized firms, may result from investor expectations, profitability, or growth potential. Meanwhile, lower-priced stocks could represent companies facing challenges or operating in less lucrative sectors. This variability highlights the differing risk profiles, financial performances, and market conditions that shape the stock prices of S&P 500 constituents.

Summary #2 (Human Written): The prompt asks me to explain why the data looks like it does. When you ask for everything you're likely to either get it or get nothing, and I can't give you everything, but I'll try to not give you nothing. The most valuable companies in the world are technology companies, and their valuations are enormous. This will likely continue for a while. If you're looking to invest in "the economy", the S&P looks well diversified and would be a good investment.

Which summary do you prefer?
o Summary #1 (AI Generated)
o Summary #2 (Human Written)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (S&P 500 stocks)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about ice cream sales. The human writer and AI were asked to summarize the relationship between ice cream sales and temperature.

Summary #1: The data suggests a strong positive correlation between ice cream sales and temperature. That is, we can assume with very high accuracy that the higher the temperature rises, the greater the amount of ice cream sales. There are only a few examples in the data of ice cream sales decreasing during high temperatures. The highest recorded temperature observed in this data is 101 degrees. It is unknown how ice cream sales are affected beyond this temperature, but profits are near their highest here. The lowest recorded temperature is 39 degrees, with profits near their lowest.

Summary #2: The correlation between temperature and ice cream profits is approximately 0.99, indicating an extremely strong positive relationship.
As the temperature increases, ice cream sales profits rise almost proportionally. This makes intuitive sense, as warmer weather likely drives higher demand for ice cream. The near-perfect correlation suggests that temperature is a key factor influencing sales, making it a reliable predictor of ice cream profits in this dataset. External factors like promotions or holiday seasons might introduce slight variations, but the data strongly supports temperature as the dominant influence.

Which summary do you prefer?
o Summary #1
o Summary #2
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (ice cream sales)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about college majors. The human writer and AI were asked to summarize the total number of students in majors classified as mathematics and computers.

Summary #1 (Human Written): There are eleven majors in the Computers and Mathematics category. These eleven majors represent approximately 1.8 million students, with the average number of students at about 160,000 students per major.
Computer Science is an outlier and has more than twice the number of students than any other major with about 800 thousand majors. On the other end of the spectrum is Mathematics and Computer Science with only about 7 thousand majors. Most majors have about the same unemployment percentage of 5% except for Mathematics and Computer Science with only 2% and Communication Technologies and Computer Programming and Data Processing with about 9%.

Summary #2 (AI Generated): The Computers and Mathematics category comprises 11 majors, with significant variation in the total number of individuals. The mean total is 161,943, while the median is 51,771, indicating that a few large majors skew the average upward. The totals range from 7,184 to 783,292, showing considerable disparity. The 25th percentile is 27,061.5, and the 75th percentile is 165,793.5, suggesting that most majors have a total closer to the lower end. The standard deviation of 243,480.3 further underscores this variability, highlighting the dominance of a few large majors within the category.

Which summary do you prefer?
o Summary #1 (Human Written)
o Summary #2 (AI Generated)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (college majors)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

End of Block: Part 2

Start of Block: Part 3

This section is about S&P 500 Stocks. The human writer and AI were asked to summarize the relationship between the current share price and revenue growth of companies comprising the magnificent 7. The magnificent 7 are the seven highest stocks of technology companies, and they include Apple, Microsoft, Amazon, Google, Meta, Nvidia, and Tesla.

Summary #1 (Human Written): For the "Magnificent 7" stocks, the correlation between current stock price and revenue growth is approximately -0.46, indicating a moderate negative relationship. This suggests that higher revenue growth does not necessarily align with higher stock prices within this group. The inverse relationship might be influenced by other factors like market sentiment, profitability, or differing valuation approaches, highlighting that stock price movements are not solely determined by revenue growth.

Summary #2 (AI Generated): Considering only the Magnificent Seven Apple, Microsoft, Amazon, Google, Meta, Nvidia, and Tesla there is one company with extraordinary growth. Nvidia experienced a growth rate 122.4% in the same period, while the others of the magnificent seven ranged from 6.1% to 18.9%. Because of this extreme outlier in terms of both growth and current stock price, we find a negative correlation coefficient (-0.46) which implies that as growth increases prices tend to decrease. This is further demonstrated by a p-value of 0.3020, showing a lack of statistical correlation. When removing the outlier, a positive correlation exists, but there still is not a significant connection.

Which summary do you prefer?
o Summary #1 (Human Written)
o Summary #2 (AI Generated)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (S&P 500 stocks)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about college majors. The human writer and AI were asked to provide major recommendations to prospective students based on the available data.

Summary #1 (Human Written): When selecting a college major there are three important factors to look at: The demand for the degree demonstrated by the number of employees, the pay rates for the degree which can be examined by looking at the median pay, and the unemployment rate which implies market oversaturation. First examining by pay shows several outliers which are well paid, but lack high-volume, such as Petroleum Engineers. Looking solely by volume, we find Business majors have the most demand but a salary half that of Petroleum Engineers. Instead looking at the intersection between the two concepts a degree in either Accounting/Finance or Computer Science yields the best combination of the three factors.
Summary #2 (AI Generated): Prospective college students should consider majors with strong employment prospects and high earning potential. Fields like Engineering, Computers and Mathematics, and Health Sciences often yield the best career outcomes. These majors typically have low unemployment rates and high median salaries, reflecting strong demand in the job market. For example, majors in Computers and Mathematics offer a median salary exceeding many other fields, paired with substantial employment opportunities. Students passionate about these areas should focus on gaining relevant skills and internships. While other fields like Arts or Humanities may align with personal interests, they often require additional effort to secure stable, well-paying roles.

Which summary do you prefer?
o Summary #1 (Human Written)
o Summary #2 (AI Generated)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (college majors)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about physics particles.
The human writer and AI were asked to summarize particle mass for the particles in the data.

Summary #1: The average particle mass in the dataset is approximately 18,781 MeV, while the median mass is much lower at 939.57 MeV, indicating a skewed distribution with a few very massive particles. The standard deviation is about 45,962 MeV, reflecting a wide spread in mass values. Most particles have masses clustered at lower values, but a few heavy particles significantly increase the mean. This distribution aligns with expectations, as fundamental particles span a broad range from nearly massless neutrinos to heavy bosons like the Higgs or top quark.

Summary #2: The masses of the proton and neutron are almost identical, with the neutron being slightly larger. These are the two most massive particles. The electron is much, much lighter. The other, more exotic, particles have masses that range from almost twice that of the proton (the omega-) to nothing (the neutrinos, although I thought they had some small mass). All antiparticles have the same mass as their corresponding particles.

Which summary do you prefer?
o Summary #1
o Summary #2
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (physics particles)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about ice cream sales. The human writer and AI were asked to summarize sales from the available data.

Summary #1 (Human Written): The ice cream dataset represents one years worth of sales data, represented by profit and temperature. The lowest profit was 11.88, while the largest was 89.29. When examining only the profit, a mostly normalized curve is found with the mean profit being 52.1. The median sales are nearly the same at 53.62 proposing a slight skew to the left. Examining the Interquartile range, approximately 51.15% of all sales is represented here, further demonstrating the connection to a mostly normalized curve. In other words there are slightly more higher-profit days compared to lower-profit days, but the asymmetry is minimal.

Summary #2 (AI Generated): The average ice cream sales profit in the dataset is approximately $52.10, while the median profit is slightly higher at $53.62, indicating a relatively balanced distribution. The maximum profit observed is $89.29, and the minimum is $11.88, highlighting a broad range of sales performance. This variation likely reflects differences in demand due to temperature fluctuations. The close alignment of the mean and median suggests a fairly symmetric distribution of sales data, with no extreme outliers significantly skewing the results.

Which summary do you prefer?
o Summary #1 (Human Written)
o Summary #2 (AI Generated)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (ice cream sales)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

End of Block: Part 3

Start of Block: Part 4

This section is about college majors. The human writer and AI were asked to speculate as to why the data looks like it does.

Summary #1: This dataset represents a snapshot of employment and salary outcomes for various college majors, grouped into broader categories. The data structure includes totals and specific metrics like employment, unemployment, and median salaries to provide insight into job market trends. Variability across majors reflects differences in demand, industry growth, and required skill sets. For example, majors like Engineering and Computers and Mathematics align closely with high-demand fields, resulting in better employment outcomes. Conversely, Arts and Humanities may show higher unemployment rates or lower salaries due to a competitive job market and varied career paths requiring additional qualifications or experience.

Summary #2: The data reflects the employment rate of students in various majors.
It shows that the majors with the greatest unemployment rate include Miscellaneous Fine Arts, Clinical Psychology, and Student School Counseling. Students with these majors have unemployment rates of 10 to 15 percent. Contrasting this, the majors with the lowest unemployment rates include Computer Science, Geophysical Engineering, and Pharmacology. Students with these majors have unemployment rates between zero and two percent. We can conclude that the data is reflective of current job trends, where students in science and mathematics are highly sought after and those in other fields find less work.

Which summary do you prefer?
o Summary #1
o Summary #2
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (college majors)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about physics particles. The human writer and AI were asked to speculate as to why the data looks like it does.
Summary #1 (Human Written): The near-perfect match between the upper and lower mass values of each particle demonstrates a very high confidence in their measured masses. In contrast, the slight variations between the upper and lower boundaries of particle widths indicate a less precise determination of decay properties. Significant gaps in the data may reflect either negligible values or opportunities for further quantification of certain particle dimensions. Interestingly, when examining the relationship between mass and width, a near-linear distribution emerges based on the particle's charge, hinting at an underlying structure or pattern in particle physics.

Summary #2 (AI Generated): The dataset reflects fundamental principles of particle physics, particularly how mass, charge, and quark composition influence particle properties. The wide range of masses arises from different interaction mechanisms, with light particles like pions forming from the strong force and heavier ones like W bosons emerging from weak interactions. The strong correlation between mass and decay width aligns with the idea that heavier particles decay faster due to more available decay channels. The presence of stable particles with near-zero widths suggests they play essential roles in matter, like protons and neutrons. Discrepancies may hint at measurement uncertainties or undiscovered physics.

Which summary do you prefer?
o Summary #1 (Human Written)
o Summary #2 (AI Generated)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (physics particles)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about ice cream sales. The human writer and AI were asked to provide recommendations to ice cream sellers based on the available data.

Summary #1 (AI Generated): There is an extremely strong correlation between ice cream profit and temperature. The higher the temperature, the greater the profit. However, the ice cream seller probably has little control over the temperature, so the best they could do with this information is prepare (and stock) accordingly for the days it's expected to be hot.

Summary #2 (Human Written): To maximize sales, ice cream sellers should prioritize operations during warmer temperatures, as profits strongly correlate with higher temperatures. Focus on marketing and promotions on hot days, such as offering discounts or introducing new flavors. Ensure sufficient stock and staff availability during peak temperature periods to meet increased demand. Consider setting up stands or trucks in high-traffic outdoor areas like parks or beaches where people gather in warm weather. Additionally, use temperature forecasts to anticipate demand and optimize inventory. Leveraging these strategies will align supply and marketing efforts with customer preferences during profitable conditions.

Which summary do you prefer?
o Summary #1 (AI Generated)
o Summary #2 (Human Written)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (ice cream sales)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

Page Break

This section is about S&P 500 stocks. The human writer and AI were asked to provide recommendations to investors based on the available data.

Summary #1 (AI Generated): Investors should consider a diversified approach to balance risk and potential returns. Focus on high-growth sectors like technology and semiconductors, as evidenced by the performance of companies in the "Magnificent 7." Evaluate undervalued stocks with strong revenue growth, as these might offer opportunities for higher returns. Pay attention to market capitalization and industry trends, as larger companies tend to offer stability, while smaller, high-growth companies might carry higher risks but also greater potential rewards. Finally, balance investments across low and high-priced stocks, ensuring portfolio exposure to both established leaders and emerging players to optimize for growth and resilience.
Summary #2 (Human Written): I suggest investing in XOM (Exxon Mobil Corporation), an oil and gas company located in Spring, Texas – near Houston, Texas. Although you could lose money, you also have the highest potential of reward. The main reason for the prediction is that the Joe Biden administration is against oil and gas and has been making it harder for Exxon Mobil Corporation to make money. However, Donald Trump was recently elected. Based on Trump’s campaign promises it will be easier to get gas and oil again and thus Exxon Mobil Corporation will make significant gains in the market.

Which summary do you prefer?
o Summary #1 (AI Generated)
o Summary #2 (Human Written)
o Neither

(Optional) Why do you prefer this summary?
________________________________________________________________

How would you rate your level of expertise in the subject of these summaries (S&P 500 stocks)?
Not at all an expert  o 1  o 2  o 3  o 4  o 5  Expert

How would you rate your overall trust with the human summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your overall trust with the AI summaries at this time?
No trust  o 1  o 2  o 3  o 4  o 5  Complete trust

How would you rate your understanding of the human summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

How would you rate your understanding of the AI summaries at this time?
No understanding  o 1  o 2  o 3  o 4  o 5  Complete understanding

End of Block: Part 4

References

1. IBM, C. Stryker, and E. Kavlakoglu, "What is Artificial Intelligence (AI)?," IBM, https://www.ibm.com/topics/artificial-intelligence (accessed Oct. 24, 2024).
2. G.
Buttazzo, "Rise of Artificial General Intelligence: Risks and Opportunities," Frontiers in Artificial Intelligence, vol. 6, Aug. 2023. doi:10.3389/frai.2023.1226990
3. A. M. Turing, "Computing Machinery and Intelligence," Mind, vol. LIX, no. 236, pp. 433–460, Oct. 1950.
4. C. Biever, "ChatGPT broke the Turing test — the race is on for new ways to assess AI," Nature, vol. 619, no. 7971, pp. 686–689, Jul. 2023. [Online]. Available: https://www.nature.com/articles/d41586-023-02361-7
5. "Study finds ChatGPT’s latest bot behaves like humans, only better," Stanford School of Humanities and Sciences, Feb. 22, 2024. [Online]. Available: https://humsci.stanford.edu/feature/study-finds-chatgpts-latest-bot-behaveshumans-only-better
6. Q. Mei, Y. Xie, W. Yuan, and M. O. Jackson, "A Turing test of whether AI chatbots are behaviorally similar to humans," Proceedings of the National Academy of Sciences, vol. 121, no. 14, Apr. 2024. [Online]. Available: https://www.pnas.org/doi/10.1073/pnas.2313925121
7. T. J. Sejnowski, "Large Language Models and the Reverse Turing Test," Neural Computation, vol. 35, no. 3, pp. 309–342, Mar. 2023. [Online]. Available: https://doi.org/10.1162/neco_a_01563
8. R. Pesonen and S. Reijula, "Would you pass the Turing Test? Mirroring human intelligence with large language models," unpublished manuscript, PhilArchive, Nov. 2024. [Online]. Available: https://philarchive.org/rec/PESWYP
9. B. Agüera y Arcas, "Do Large Language Models Understand Us?," Daedalus, vol. 151, no. 2, pp. 183–197, May 2022. [Online]. Available: https://doi.org/10.1162/daed_a_01909
10. J. Huang et al., "Large Language Models Can Self-Improve," presented at the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), Singapore, Dec. 2023. [Online]. Available: https://aclanthology.org/2023.emnlp-main.67/
11. A.
De, "Statistical Considerations and Challenges for Pivotal Clinical Studies of Artificial Intelligence Medical Tests for Widespread Use: Opportunities for Inter-Disciplinary Collaboration," Statistics in Biopharmaceutical Research, vol. 15, no. 2, pp. 290–299, 2023. [Online]. Available: https://www.tandfonline.com/doi/full/10.1080/19466315.2023.2169752 12. K. Walczak and W. Cellary, "Challenges for Higher Education in the Era of Widespread Access to Generative AI," Economics and Business Review, vol. 9, no. 2, pp. 71–100, 2023. [Online]. Available: https://intapi.sciendo.com/pdf/10.18559/ebr.2023.2.743 13. M. N. Sakib, M. A. Islam, R. Pathak and M. M. Arifin, "Risks, Causes, and Mitigations of Widespread Deployments of Large Language Models (LLMs): A Survey," 2024 2nd International Conference on Artificial Intelligence, Blockchain, and Internet of Things (AIBThings), Mt Pleasant, MI, USA, 2024, pp. 1-7, doi: 10.1109/AIBThings63359.2024.10863356. 14. P. W. Grimm, M. R. Grossman, and G. V. Cormack, "Artificial Intelligence as Evidence," Northwestern Journal of Technology and Intellectual Property, vol. 19, no. 3, pp. 215–241, 2021. [Online]. Available: https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=6919&context=faculty_scholarship 15. M. C. Rillig, I. Mansour, S. Hempel, and M. Bi, "How Widespread Use of Generative AI for Images and Video Can Affect the Environment and the Science of Ecology," Ecology and Evolution, vol. 14, no. 1, e10123, 2024. [Online]. Available: https://onlinelibrary.wiley.com/doi/pdf/10.1111/ele.14397 16. C. Zhang and Y. Lu, "Study on Artificial Intelligence: The State of the Art and Future Prospects," Journal of Industrial Information Integration, vol. 23, 100224, 2021. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2452414X21000248 17. L. Huang, "Ethics of Artificial Intelligence in Education: Student Privacy and Data Protection," Science Insights Education Frontiers, vol. 16, no. 2, 2023. [Online]. 
Available: https://www.bonoi.org/index.php/sief/article/view/1084 18. C. I. Chesñevar, A. G. Maguitman, and R. P. Loui, "Logical models of argument," ACM Computing Surveys, vol. 32, no. 4, pp. 337–383, Dec. 2000, doi: 10.1145/371578.371581. 19. M. Abel and R. Johnson, "AI Bias for Creative Writing: Subjective Assessment Versus Willingness to Pay," IZA Institute of Labor Economics, Discussion Paper No. 17646, Jan. 2025. [Online]. Available: https://docs.iza.org/dp17646.pdf 20. H. Gangadharbatla, "The role of AI attribution knowledge in the evaluation of artwork," Empirical Studies of the Arts, vol. 40, no. 2, pp. 125–142, 2022. 21. S. Grassini and M. Koivisto, "Understanding how personality traits, experiences, and attitudes shape negative bias toward AI-generated artworks," Scientific Reports, vol. 14, no. 1, p. 4113, 2024. 22. J. Park, H. Kang, and H. Y. Kim, "Human, do you think this painting is the work of a real artist?," Int. J. Human–Computer Interaction, vol. 40, no. 18, pp. 5174–5191, 2024. 23. L. Bellaiche, R. Shahi, M. H. Turpin, A. Ragnhildstveit, S. Sprockett, N. Barr, A. Christensen, and P. Seli, "Humans versus AI: Whether and why we prefer human-created compared to AI-created artwork," Cognitive Research: Principles and Implications, vol. 8, no. 1, p. 42, 2023. [Online]. Available: https://doi.org/10.1186/s41235-023-00499-6 24. K. Millet, F. Buehler, G. Du, and M. D. Kokkoris, "Defending humankind: Anthropocentric bias in the appreciation of AI art," Computers in Human Behavior, vol. 143, 107707, 2023. [Online]. Available: https://doi.org/10.1016/j.chb.2023.107707 25. C. B. Horton Jr., M. W. White, and S. S. Iyengar, "Bias against AI art can enhance perceptions of human creativity," Scientific Reports, vol. 13, no. 1, p. 19001, 2023. [Online]. Available: https://doi.org/10.1038/s41598-023-45202-3 26. A. de Rooij, "Bias against artificial intelligence in visual art: A meta-analysis," unpublished, 2024. 27. M. Ragot, N. Martin, and S. 
Cojean, "AI-generated vs. human artworks: A perception bias towards artificial intelligence?," in Proc. CHI Conf. Human Factors in Comput. Syst., Honolulu, HI, USA, 2020, pp. 1–10. [Online]. Available: https://doi.org/10.1145/3334480.3382823 28. X. Zhao et al., "A review of convolutional neural networks in computer vision," Artificial Intelligence Review, vol. 57, no. 4, p. 99, 2024. 29. Y. Matsuo et al., "Deep learning, reinforcement learning, and world models," Neural Networks, vol. 152, pp. 267–275, 2022. 30. S. Reeder, J. Jensen, and R. Ball, "Evaluating Explainable AI (XAI) in Terms of User Gender and Educational Background," in Proceedings of the 25th HCI International Conference, Copenhagen, Denmark, Jul. 2023, pp. 286–304. 31. A. Vaswani et al., "Attention Is All You Need," in Advances in Neural Information Processing Systems 30 (NIPS 2017), I. Guyon et al., Eds. Curran Associates, Inc., 2017, pp. 5998–6008. doi: 10.5555/3295222.3295349. 32. A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving Language Understanding by Generative Pre-Training," OpenAI, 2018. [Online]. Available: https://cdn.openai.com/research-covers/languageunsupervised/language_understanding_paper.pdf 33. M. E. Peters et al., "Deep Contextualized Word Representations," in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA, Jun. 2018, pp. 2227–2237. [Online]. Available: https://aclanthology.org/N18-1202/ 34. N. Kalchbrenner, L. Espeholt, K. Simonyan, A. van den Oord, A. Graves, and K. Kavukcuoglu, "Neural Machine Translation in Linear Time," arXiv preprint arXiv:1610.10099, Oct. 2016. [Online]. Available: https://arxiv.org/abs/1610.10099 35. J. Gehring, M. Auli, D. Grangier, D. Yarats, and Y. N. 
Dauphin, "Convolutional Sequence to Sequence Learning," in Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, NSW, Australia, Aug. 2017, pp. 1243–1252. [Online]. Available: https://proceedings.mlr.press/v70/gehring17a.html 36. M. G. Hanna et al., “Future of Artificial Intelligence (AI)-Machine Learning (ML) Trends in Pathology and Medicine,” Mod. Pathol., vol. 2025, Art. no. 100705, 2025. 37. K.-B. Ooi et al., “The potential of generative artificial intelligence across disciplines: Perspectives and future directions,” J. Comput. Inf. Syst., vol. 65, no. 1, pp. 76–107, 2025. 38. J. Kokina et al., “Challenges and opportunities for artificial intelligence in auditing: Evidence from the field,” Int. J. Account. Inf. Syst., vol. 56, Art. no. 100734, 2025. 39. D. Gursoy and R. Cai, “Artificial intelligence: an overview of research trends and future directions,” Int. J. Contemp. Hosp. Manag., vol. 37, no. 1, pp. 1–17, 2025. 40. A. D. Samala et al., “Unveiling the landscape of generative artificial intelligence in education: a comprehensive taxonomy of applications, challenges, and future prospects,” Educ. Inf. Technol., vol. 30, no. 3, pp. 3239–3278, 2025. 41. R. Manayon, "Temperature and Ice Cream Sales," Kaggle. [Online]. Available: https://www.kaggle.com/datasets/raphaelmanayon/temperature-and-ice-creamsales?resource=download 42. T. Tunguz, "College Majors," Kaggle. [Online]. Available: https://www.kaggle.com/datasets/tunguz/college-majors/data?select=all-ages.csv 43. A. Mvd, "S&P 500 Stocks," Kaggle. [Online]. Available: https://www.kaggle.com/datasets/andrewmvd/sp-500stocks?select=sp500_companies.csv 44. D. S. Felix, "Physics Particles," Kaggle. [Online]. 
Available: https://www.kaggle.com/datasets/dsfelix/physics-particles/data |
| Format | application/pdf |
| ARK | ark:/87278/s6c7w6d3 |
| Setname | wsu_smt |
| ID | 154961 |
| Reference URL | https://digital.weber.edu/ark:/87278/s6c7w6d3 |



