[OPR] Schmidt & Marx: Co-Constructing Tele-Presence by Embodying Avatars: Evidence from Let’s Play Videos

On this page you can download the discussion paper that was submitted for publication in the Journal for Media Linguistics. The blogstract summarises the submission in a comprehensible manner. You can comment on the discussion paper and the blogstract below this post. Please use your real name for this purpose. For detailed comments on the discussion paper please refer to the line numbering of the PDF.

This submission is a contribution to the special issue “Co-constructing presence between players and non-players in videogame interactions”.

Discussion Paper (PDF)

Blogstract of

Co-Constructing Tele-Presence by Embodying Avatars: Evidence from Let’s Play Videos

by Axel Schmidt & Konstanze Marx

Our data come from so-called Let’s Plays, which are intended to present and comment on computer gaming on the internet and which are one of the most successful YouTube genres. Let’s Plays can be produced in single-player mode (one person plays and comments) or in multiplayer mode (several people play and comment together).

Video games are attractive because they are highly immersive and interactive (Freyermuth 2015). Precisely these characteristics are lost as soon as Let’s Plays are produced as videos. Recipients no longer have the chance to intervene in the game; they can only watch others play. Thus, the reception situation is comparable to watching a show on TV (Ackermann 2016). We assume that the accompanying moderation of Let’s Plays is crucial to making a computer game ‘watchable’ (Schmidt/Marx 2020). That is, Let’s Players are constantly engaged in embodying their avatars by formulating and explaining their actions in the game and by producing so-called response cries (Goffman 1981) in reaction to game events. In doing so, they make their experiences during the game more transparent for spectators and construct a specific kind of (tele-)presence.

Following an ethnomethodological conversation-analytic approach, our paper focuses on practices of making computer games ‘watchable’. One way to do this is to exploit the participation framework specific to computer games, which comprises at least players and avatars that are connected with one another in several ways (Baldauf-Quilliatre/Colón de Carvajal 2015; Mondada 2012; Keating/Sunakawa 2010). The presentation mode of Let’s Plays usually consists of game activities on a large screen and the simultaneous facial expressions of the players on a small screen (transmitted by a facecam). Obviously, the embodied activities of the players are used to enhance the pleasure of merely watching the game. We are interested in how players use their voices and the facecam either to interact with avatars or non-player characters (mainly in single-player mode) or to animate avatars (frequently in multiplayer mode). Both practices can be read as attempts to ‘embody’ avatars in order to ‘bring them to life’ and to make (watching) the game livelier.

References

Ackermann, Judith (2016) (Ed.): Phänomen Let’s play-Video: Entstehung, Ästhetik, Aneignung und Faszination aufgezeichneten Computerhandelns. Wiesbaden: Springer VS.

Baldauf-Quilliatre, Heike/Colón de Carvajal, Isabel (2015): Is the avatar considered as a participant by the players? A conversational analysis of multi-player videogames interactions. In: PsychNology Journal, 13, 2-3, 127-147.

Freyermuth, Gundolf S. (2015): Games, game design, game studies: eine Einführung. Bielefeld: transcript.

Goffman, Erving (1981): Response Cries. In Goffman, Erving (Ed.): Forms of Talk. Philadelphia: University of Pennsylvania Press, 78-122.

Mondada, Lorenza (2012): Coordinating action and talk-in-interaction in and out of video games. In Ayaß, Ruth/Gerhardt, Cornelia (Ed.): The appropriation of media in everyday life. Philadelphia: Benjamins, 231-270.

Keating, Elizabeth/Sunakawa, Chiho (2010): Participation cues: Coordinating activity and collaboration in complex online gaming worlds. In: Language in Society, 39, 331-356.

Schmidt, Axel/Marx, Konstanze (2020): Making Let’s Plays watchable: An interactional approach to multimodality. In Thurlow, Crispin/Dürscheid, Christa/Diémoz, Federica (Eds.): Visualizing (in) the New Media. London: John Benjamins, 131-150.

Empirical data referred to in the discussion paper as

One Reply to “[OPR] Schmidt & Marx: Co-Constructing Tele-Presence by Embodying Avatars: Evidence from Let’s Play Videos”

  1. Martin Luginbühl, July 15, 2020 at 08:23

    GENERAL
    This is an excellent article on crucial characteristics of Let’s Play videos, a genre with a complex media setting: gamers use their devices to control an avatar that virtually represents these activities and with which they partly conflate, judging by the language use; at the same time, the Let’s Players are constantly oriented towards the viewers in order to make their game “watchable”, as the authors put it. The analysis within the field of multimodal, ethnographic CA (but, as I might add, media linguistics as well) is concerned with questions of media affordances, participation framework, and accounts of immediacy and presence that are performed with specific practices. Analyzing examples of a German-speaking player, the authors focus on two important practices: formulating one’s own actions and animating avatars via response cries. While formulating one’s own actions makes the Let’s Play more transparent for viewers, the response cries are part of a (multimodal, as is shown very nicely in the analysis) embodiment of avatars. The main hypothesis – whose correctness is shown very convincingly – is that these practices interact on the split screen with game actions as well as with the facecam of the gamers in order to embody the avatars, resulting in tele-presence, which enables the viewers to experience the gamer’s presence that has shifted from the real world to the virtual world.
     
     
    INTRO
    Overall a very informative introduction that really made me want to read on. I was confused by the use of “co-construction” and asked myself whether the analysis would be on games that are played together. It is only in the conclusion that you explain your understanding of co-construction. I asked myself whether this term is adequate in this context, as the Let’s Plays you are looking at are asynchronous one-way communication. If you describe the practices of the Let’s Players and speak of co-construction, then probably every one-way communication could be labelled co-constructed, as it aims at a preferred reading by the viewers. And in the following section (lines 95 ff.) you point out that viewers cannot interact – so how can they co-construct (in a narrower sense of the word)?
     
    SECTION 2
    A very good description of the participation framework that explains crucial terms for the following analysis. As you show in your analysis, not only the verbs but also the body behavior is oriented to the viewers. So the “extra level” you are talking about and the staged “intimate interaction” are also (and probably to an important extent) the result of showing body movements and facial expressions in close-up videos by the facecam (similar to TV hosts, as shown in different studies on TV news). This becomes very clear in your analysis (e.g. lines 418f.), but in this section you only mention “embodied conduct” (line 88) in passing. Perhaps you could emphasize this aspect more clearly here.
    I would have liked to read some theoretical considerations when it comes to the media ‘infrastructure’ of Let’s Plays; you mention the media involved and you mention “mediation” and “re-mediation” in the Conclusion (lines 783 and 790). Also, you speak of affordances.
    I wonder how you would describe the affordances of such a combination on a theoretical level and how notions of mediation and re-mediation fit in here (we have two media within a third here). In a way you do that implicitly in your article, but I think you could make an additional theoretical point here. Can you grasp on a theoretical level what happens in terms of media affordances, mediation and re-mediation when gamers not just play for themselves but play for a Let’s Play?
     
    SECTION 3
    Again, a very readable and stimulating chapter. I was wondering whether you would restrict the term tele-presence to the gamers or also extend it to the viewers. You speak of the gamers (in a quote in line 236, the “media users”) and it is clear that gamers are tele-present in the games. But what about the viewers? Are they – due to the practices of the gamers you describe – tele-present as well? I think you would say so, but I am not sure. And if yes: do we need a sub-classification of tele-presence (gamers vs. viewers)?
     
    SECTION 4
    Chapter describing data and method, no comments on that.
     
    SECTION 5
    Very convincing, plausible and detailed (but never boring) analysis, with very good examples showing different subtypes of the practices mentioned above (by the way: video 8 is just hilarious). While in the previous chapters you sometimes speak of the “inner state” of the gamers, you write “she displays a stance” here (line 337). Watching video 2, you truly get the impression that the player got scared – but of course the Let’s Players probably perform these displays quite consciously, so I prefer referring to these phenomena as a ‘display of a stance’. (You are surely aware of all of this, but I still think it is important to highlight this difference, especially because it remains unclear what is authentic and what is staged, see below.)
    I think the argument that formulating one’s own actions makes the action more transparent and therefore more attractive is very convincing – as is the conflation of gamer and avatar, e.g. by using “I” (line 377).
    As mentioned above, I would be careful about talking of “inner states” as well as “spontaneous reactions” (line 566); some of them might be spontaneous, others just well staged for the entertainment of a bigger audience. The same is true for emotions (“the player’s emotions”, line 594, similarly line 758). There is some work on reality TV that also addresses this issue, as you never really know what is authentic and what is staged. This is in line with your observation of “liveliness” (line 664), a term created by Tolson (2006, perhaps you could add this reference) that points out that “live” is not just a temporal phenomenon, but can be staged with different means (therefore liveliness). I think that here the notion of “immediacy” is also important – in addition to presence.
     
    CONCLUSION
    Good summary with a very interesting outlook on future research. Again, I am skeptical about your conceptualization of “co-construction” (lines 745ff.): Does taking viewers into account equal co-construction?
     
    RECOMMENDATION
    Consider minor revisions mentioned above.
     
     
     
    Minor remarks:
    Line 209: It should probably read “… is only possible within a functional cycle of sensing”, not “within in”.
    Line 213: At first I struggled to connect the first sentence to the preceding paragraph. Perhaps you could elaborate: “What occurs WHEN PLAYERS DO XYZ…”
    Line 307: In this transcript “GS” and “Fig” are (at least in my PDF) not aligned correctly.
