I Hear You: On Human Knowledge and Vocal Intelligence
DOI: https://doi.org/10.21814/rlec.6316
Keywords: voice technology, human-computer interaction, affective computing, large language models
Abstract
This interview explores embodied agency and the evolving dynamics of knowledge creation through practical and experimental engagement with conversational artificial intelligence (AI) systems. Drawing on media archaeology, media theory, and science and technology studies, it examines how the emergence of language interfaces destabilizes distinctions between user and system, collapsing the boundaries between human and artificial modes of expression and understanding. Framed within an artistic research methodology, the project critically engages with the ongoing shift toward machine- and voice-based forms of inquiry, analysing how these technologies reshape the epistemic, linguistic, and ontological conditions of knowledge and research. Departing from keyboard-based interaction, the process emphasizes the decoupling of the body from the machine interface and the increasing fluidity of human-computer correspondence through voice technology. While acknowledging the growing uncertainty of origin and autonomy resulting from this technological shift, it foregrounds indeterminate authorship as both a methodological challenge and a theoretical pivot, underlining the implications for academic accountability and data ethics. Practice-based experimentation serves as a tool to trace the infrastructural, affective, and rhetorical vectors through which intelligent automated speech influences knowledge production. By examining this process, the study contributes to ongoing debates on verification, trust, and the social negotiation of information induced by advanced conversational AI agents. Overall, it argues that voice technologies do not merely transmit content but actively configure the conditions under which knowledge is produced, authenticated, and circulated.
References
*Afzal, S., Khan, H. A., Khan, I. U., Piran, M. J., & Lee, J. W. (2023). A comprehensive survey on affective computing: Challenges, trends, applications, and future directions. arXiv. https://doi.org/10.48550/arXiv.2305.07665
Agrawal, K. (2010). To study the phenomenon of the Moravec’s paradox. arXiv. https://doi.org/10.48550/arXiv.1012.3148
Ardelt, M. (2004). Wisdom as expert knowledge system: A critical review of a contemporary operationalization of an ancient concept. Human Development, 47(5), 257–285. https://doi.org/10.1159/000079154
Arora, S. (2025, April 28). OpenAI CEO Sam Altman admits ChatGPT 4O’s ‘annoying’ personality needs work: “We are working on fixes”. Times Now. https://www.timesnownews.com/technology-science/openai-ceo-sam-altman-admits-chatgpt-4os-annoying-personality-needs-work-we-are-working-on-fixes-article-151522930
Baltes, P. B., & Staudinger, U. M. (2000). Wisdom: A metaheuristic (pragmatic) to orchestrate mind and virtue toward excellence. American Psychologist, 55(1), 122–136. https://doi.org/10.1037/0003-066x.55.1.122
Bandura, A. (2001). Social cognitive theory: An agentic perspective. Annual Review of Psychology, 52, 1–26. https://doi.org/10.1146/annurev.psych.52.1.1
Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., Buch, S., Card, D., Castellon, R., Chatterji, N., Chen, A., Creel, K., Davis, J. Q., Demszky, D., . . . Liang, P. (2021). On the opportunities and risks of foundation models. arXiv. https://doi.org/10.48550/arXiv.2108.07258
*Chervonyi, Y., Trinh, T. H., Olšák, M., Yang, X., Nguyen, H., Menegali, M., Jung, J., Verma, V., Le, Q. V., & Luong, T. (2025). Gold-medalist performance in solving Olympiad geometry with AlphaGeometry2. arXiv. https://arxiv.org/html/2502.03544v1
*Chomsky, N. (2006). Language and mind. Cambridge University Press. https://doi.org/10.1017/cbo9780511791222 (Original work published 1968)
Clarke, L. (2022, November 12). When AI can make art – what does it mean for creativity? The Guardian. https://www.theguardian.com/technology/2022/nov/12/when-ai-can-make-art-what-does-it-mean-for-creativity-dall-e-midjourney
Cohn, M., Pushkarna, M., Olanubi, G. O., Moran, J. M., Padgett, D., Mengesha, Z., & Heldreth, C. (2024). Believing anthropomorphism: Examining the role of anthropomorphic cues on trust in large language models. arXiv. https://doi.org/10.48550/arXiv.2405.06079
*Connolly, F. F., Hjerm, M., & Kalucza, S. (2025). When will AI transform society? Swedish public predictions on AI development timelines. arXiv. https://doi.org/10.48550/arXiv.2504.04180
Crawford, K. (2021). Atlas of AI: Power, politics, and the planetary costs of artificial intelligence. Yale University Press. https://doi.org/10.2307/j.ctv1ghv45t
Crystal, D. (2008). Dictionary of linguistics and phonetics. Wiley-Blackwell. https://doi.org/10.1002/9781444302776
Dada, E. G., Bassi, J. S., Chiroma, H., Abdulhamid, S. M., Adetunmbi, A. O., & Ajibuwa, O. E. (2019). Machine learning for email spam filtering: Review, approaches and open research problems. Heliyon, 5(6), e01802. https://doi.org/10.1016/j.heliyon.2019.e01802
De Waal, F. (2016). Are we smart enough to know how smart animals are? W. W. Norton & Company.
*Dreyfus, H. (2014). What computers can’t do: A critique of artificial reason. In B. Williams (Ed.), Essays and reviews: 1959–2002 (pp. 90–100). Princeton University Press. https://doi.org/10.1515/9781400848393-021
Eidsheim, N. S. (2019). The race of sound: Listening, timbre, and vocality in African American music. Duke University Press. https://doi.org/10.2307/j.ctv11hpntq
Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of anthropomorphism. Psychological Review, 114(4), 864–886. https://doi.org/10.1037/0033-295x.114.4.864
*Farrell, T. J. (1985). Orality and literacy: The technologizing of the word [Book review of Orality and literacy: The technologizing of the word, by W. J. Ong]. College Composition and Communication, 36(3), 363–365. https://doi.org/10.2307/357987
Fedorenko, E., & Varley, R. (2016). Language and thought are not the same thing: Evidence from neuroimaging and neurological patients. Annals of the New York Academy of Sciences, 1369, 132–153. https://doi.org/10.1111/nyas.13046
*Floridi, L., & Illari, P. (Eds.). (2014). The philosophy of information quality. Springer Cham. https://doi.org/10.1007/978-3-319-07121-3
*Freire, S. K., Wang, C., & Niforatos, E. (2024). Conversational assistants in knowledge-intensive contexts: An evaluation of LLM- versus intent-based systems. arXiv. https://doi.org/10.48550/arXiv.2402.04955
French, R. M. (1990). Subcognition and the limits of the Turing test. Mind, XCIX(393), 53–65. https://doi.org/10.1093/mind/XCIX.393.53
Fron, C., & Korn, O. (2019). A short history of the perception of robots and automata from antiquity to modern times. In O. Korn (Ed.), Social robots: Technological, societal and ethical aspects of human–robot interaction (pp. 1–12). Springer Nature. https://doi.org/10.1007/978-3-030-17107-0_1
*Gardavski, K. (2022). Wittgenstein and LaMDA. The Logical Foresight - Journal for Logic and Science, 2(1), 25–42. https://doi.org/10.54889/issn.2744-208x.2022.2.1.25
Harwell, D. (2019, November 6). A face-scanning algorithm increasingly decides whether you deserve the job. The Washington Post. https://www.washingtonpost.com/technology/2019/10/22/ai-hiring-face-scanning-algorithm-increasingly-decides-whether-you-deserve-job/
*He, L., Qi, X., Liao, M., Cheong, I., Mittal, P., Chen, D., & Henderson, P. (2025). The deployment of end-to-end audio language models should take into account the principle of least privilege. arXiv. https://doi.org/10.48550/arXiv.2503.16833
Hillis, K., Petit, M., & Jarrett, K. (2012). Google and the culture of search. Routledge. https://doi.org/10.4324/9780203846261
*Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., & Liu, T. (2024). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems. https://doi.org/10.1145/3703155
Jones, C. R., & Bergen, B. K. (2025). Large language models pass the Turing test. arXiv. https://doi.org/10.48550/arXiv.2503.23674
*Keat, L. C., & Ying, T. X. (2025). Artificial intelligence-based email spam filtering. Journal of Advanced Research in Artificial Intelligence & Its Applications, 2(1), 67–75. https://doi.org/10.5281/zenodo.14264139
Kreps, S., McCain, R. M., & Brundage, M. (2022). All the news that’s fit to fabricate: AI-generated text as a tool of media misinformation. Journal of Experimental Political Science, 9(1), 104–117. https://doi.org/10.1017/xps.2020.37
Lakhani, K. (2023, July 17). How can we counteract generative AI’s hallucinations? Digital Data Design Institute at Harvard. https://d3.harvard.edu/how-can-we-counteract-generative-ais-hallucinations/
Leo-Liu, J. (2023). Loving a “defiant” AI companion? The gender performance and ethics of social exchange robots in simulated intimate interactions. Computers in Human Behavior, 141, 107620. https://doi.org/10.1016/j.chb.2022.107620
Lewandowsky, S., Robertson, R. E., & DiResta, R. (2023). Challenges in understanding human-algorithm entanglement during online information consumption. Perspectives on Psychological Science, 19(5), 758–766. https://doi.org/10.1177/17456916231180809
Li, Y. A., Han, C., Raghavan, V. S., Mischler, G., & Mesgarani, N. (2023). StyleTTS 2: Towards human-level text-to-speech through style diffusion and adversarial training with large speech language models. arXiv. https://doi.org/10.48550/arXiv.2306.07691
*Lin, G., Chiang, C., & Lee, H. (2024). Advancing large language models to capture varied speaking styles and respond properly in spoken conversations. arXiv. https://doi.org/10.48550/arXiv.2402.12786
Lovato, S. B., & Piper, A. M. (2019). Young children and voice search: What we know from human-computer interaction research. Frontiers in Psychology, 10, 1–5. https://doi.org/10.3389/fpsyg.2019.00008
Luscombe, R. (2022, June 12). Google engineer put on leave after saying AI chatbot has become sentient. The Guardian. https://www.theguardian.com/technology/2022/jun/12/google-engineer-ai-bot-sentient-blake-lemoine
Manovich, L. (2002). The language of new media. MIT Press.
Matthias, M. (2023, August 25). Why does AI art screw up hands and fingers? Encyclopaedia Britannica. https://www.britannica.com/topic/Why-does-AI-art-screw-up-hands-and-fingers-2230501
*Mikalson, J. D. (2006). [Review of the book Classical Athens and the Delphic Oracle: Divination and democracy, by H. Bowden]. The Classical Review, 56(2), 406–407. https://doi.org/10.1017/s0009840x06002150
Miller, T., Paloque-Bergès, C., & Dame-Griff, A. (2022). Remembering Netizens: An interview with Ronda Hauben, co-author of Netizens: On the history and impact of Usenet and the internet (1997). Internet Histories, 7(1), 76–98. https://doi.org/10.1080/24701475.2022.2123120
Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. NYU Press. https://doi.org/10.2307/j.ctt1pwt9w5
O’Donnell, J. (2024, September 24). OpenAI released its advanced voice mode to more people. Here’s how to get it. MIT Technology Review. https://www.technologyreview.com/2024/09/24/1104422/openai-released-its-advanced-voice-mode-to-more-people-heres-how-to-get-it/
OpenAI. (2025a, January 30). Advanced voice mode FAQ. https://help.openai.com/en/articles/9617425-advanced-voice-mode-faq
OpenAI. (2025b, April 25). Response generated by ChatGPT (version 4o) [Large language model]. https://openai.com/policies/usage-policies/
Parisi, L. (2019a). Machine sirens and vocal intelligence. In S. Goodman & U. Erlmann (Eds.), Unsound undead (pp. 53–56). MIT Press.
Parisi, L. (2019b). The alien subject of AI. Subjectivity, 12(1), 27–48. https://doi.org/10.1057/s41286-018-00064-3
Pillai, M. (2024). The evolution of customer service: Identifying the impact of artificial intelligence on employment and management in call centres. Journal of Business Management and Information Systems, (special issue), 52–55. https://doi.org/10.48001/jbmis.2024.si1010
Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure. MIT Press. https://doi.org/10.7551/mitpress/4158.001.0001
*Quijano, A., & Ennis, M. (2000). Coloniality of power, eurocentrism, and Latin America. Nepantla: Views from South, 1(3), 533–580. https://muse.jhu.edu/article/23906
*Quinn, K. (2014). Google and the culture of search [Review of the book Google and the culture of search, by K. Hillis, M. Petit, & K. Jarrett]. Journal of Broadcasting & Electronic Media, 58(3), 473–475. https://doi.org/10.1080/08838151.2014.935943
*Raman, R., Kowalski, R., Achuthan, K., Iyer, A., & Nedungadi, P. (2025). Navigating artificial general intelligence development: Societal, technological, ethical, and brain-inspired pathways. Scientific Reports, 15, 8443. https://doi.org/10.1038/s41598-025-92190-7
Schreibelmayr, S., & Mara, M. (2022). Robot voices in daily life: Vocal human likeness and application context as determinants of user acceptance. Frontiers in Psychology, 13, 787499. https://doi.org/10.3389/fpsyg.2022.787499
sculpting_Noise. (2025). I hear you: On human knowledge and vocal intelligence [Audio work]. SoundCloud. https://soundcloud.com/user-432639751-504934319/i-hear-you-on-human-knowledge-and-vocal-intelligence
Sheth, A., Roy, K., & Gaur, M. (2023). Neurosymbolic AI -- Why, what, and how. arXiv. https://doi.org/10.48550/arXiv.2305.00813
*Shum, H., He, X., & Li, D. (2018). From Eliza to XiaoIce: Challenges and opportunities with social chatbots. arXiv. https://doi.org/10.48550/arXiv.1801.01957
Sindoni, M. G. (2024). The femininization of AI-powered voice assistants: Personification, anthropomorphism and discourse ideologies. Discourse, Context & Media, 62, 100833. https://doi.org/10.1016/j.dcm.2024.100833
Sternberg, R. J. (2012). Intelligence. Dialogues in Clinical Neuroscience, 14(1), 19–27. https://doi.org/10.31887/dcns.2012.14.1/rsternberg
Sullivan, D. (2013, June 28). A eulogy for AltaVista, the Google of its time. Search Engine Land. https://searchengineland.com/altavista-eulogy-165366
*Sun, H., Zhao, L., Wu, Z., Gao, X., Hu, Y., Zuo, M., Zhang, W., Han, J., Liu, T., & Hu, X. (2024). Brain-like functional organization within large language models. arXiv. https://doi.org/10.48550/arXiv.2410.19542
1X Technologies. (2025, February 21). Introducing NEO Gamma. https://www.1x.tech/discover/introducing-neo-gamma
Takahashi, M., & Overton, W. F. (2005). Cultural foundations of wisdom: An integrated developmental approach. In R. J. Sternberg & J. Jordan (Eds.), A handbook of wisdom: Psychological perspectives (pp. 32–60). Cambridge University Press. https://doi.org/10.1017/CBO9780511610486.003
*Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Harvard University Press. https://doi.org/10.2307/j.ctv26070v8
Turing, A. (2004). Computing machinery and intelligence (1950). In B. J. Copeland (Ed.), The essential Turing (pp. 433–464). Oxford University Press. https://doi.org/10.1093/oso/9780198250791.003.0017
Wang, J., Ma, W., Sun, P., Zhang, M., & Nie, J. (2024). Understanding user experience in large language model interactions. arXiv. https://doi.org/10.48550/arXiv.2401.08329
Wittgenstein, L. (2009). Philosophical investigations (P. M. S. Hacker & J. Schulte, Eds.; G. E. M. Anscombe, Trans.; 4th ed.). Wiley-Blackwell. (Original work published 1953)
*Yamaguchi, S., & Fukuda, T. (2023). On the limitation of diffusion models for synthesizing training datasets. arXiv. https://doi.org/10.48550/arXiv.2311.13090
Yaqub, M. Z., & Alsabban, A. (2023). Knowledge sharing through social media platforms in the silicon age. Sustainability, 15(8), 6765. https://doi.org/10.3390/su15086765
Yeh, K.-C., Chi, J.-A., Lian, D.-C., & Hsieh, S.-K. (2023). Evaluating interfaced LLM bias. In J.-L. Wu & M.-H. Su (Eds.), Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023) (pp. 292–299). The Association for Computational Linguistics and Chinese Language Processing (ACLCLP). https://aclanthology.org/2023.rocling-1.37/
*Zou, A., Wang, Z., Carlini, N., Nasr, M., Kolter, J. Z., & Fredrikson, M. (2023). Universal and transferable adversarial attacks on aligned language models. arXiv. https://doi.org/10.48550/arXiv.2307.15043
License
Copyright (c) 2025 Moana Ava Holenstein

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright, granting the journal the right of first publication.