5. Method
5.1 Participants
We recruited 20 native English speakers (mean age in years: 22.6, SD: 3.5; mean
years of education: 16.4, SD: 2.0) and 20 native Cantonese speakers (mean age
in years: 34.7, SD: 5.3; mean years of education: 16.5, SD: 2.3) to participate as
listeners. To be included in the study, listeners could not have any functional
ability in, or protracted exposure to, the non-native language, as determined by an
initial screening interview (which was always carried out in the participant’s native
language). All English participants were native speakers of Canadian English
from Montreal and southern Ontario, and were undergraduate students attending
McGill University. All Cantonese participants were born, raised, and educated in
either the city of Hong Kong or Guangzhou (i.e., Cantonese environments) and
each was a recent immigrant to the province of Quebec (Canada). All Cantonese
participants continued to carry out their daily activities predominantly or exclu-
sively in the Cantonese language.
Recognizing sarcasm without language 21
5.2 Materials
The stimuli were a subset of recorded utterances taken from our previous studies
(see Cheang and Pell 2008, 2009 for complete details regarding stimulus rationale
and construction). Stimulus elicitation, recording, and perceptual validation pro-
cedures were highly comparable in each of the two language conditions and are
only summarized briefly here.
a. Stimulus elicitation. For each language, six young adults (three male, three
female) were recruited as native speakers to enact each of the four target attitudes
(sarcasm, sincerity, humorous irony, and neutrality) in their respective native lan-
guage. The speakers produced short target sentences as part of a scripted dialogue;
these sentences were semantically and syntactically comparable in the two lan-
guages and the text of each utterance allowed the speakers to produce the same
item to express each of the four attitudes on separate occasions during the recording
session. The text of the tokens consisted of the following English sentences and
their Cantonese analogues: “I suppose; it’s a respectful gesture / 係啩,呢個係個好客氣嘅表示”; “Is that so; she is a healthy lady / 係咩; 佢係個好健康嘅女人”; “Oh boy; he is a superior chef / 嘩哎;佢係個好鬼叻嘅廚師”; “Yeah, right; what a spectacular result / 係囉; 呢個係個犀利嘅結果”. A pilot reading study involving native speakers
of the respective target language was run to establish that the text of each utter-
ance did not strongly bias one of the target attitudes (Cheang and Pell 2008, 2009).
Each speaker produced 96 recorded utterances. Recordings were conducted in a
sound-attenuated booth using a high quality head-mounted mono microphone
positioned approximately one inch from the speaker’s mouth (sampling rate of
recordings: 44.1 kHz, 16 bit, mono).
b. Stimulus validation and selection. For the purpose of our acoustic studies
(Cheang and Pell 2008, 2009), separate groups of English and Cantonese listeners
were recruited from the same populations as the speakers to verify the intended
attitudes expressed in the recordings (prior to submitting the tokens to acoustic
analyses). None of these listeners participated
in the current study. In each language condition, 16 native English or Cantonese
speakers were presented all of the items recorded in the same language and were
required to identify the attitude conveyed by each utterance from among the four
possible alternatives (25% recognition represents chance performance). This al-
lowed us to estimate how accurately the target attitude was encoded by each re-
corded utterance.
These perceptual data were used as a basis from which to select utterances
that were recognized as the target attitude. To keep the task manageable for par-
ticipants, only 15% of the best validated utterances were selected as stimuli in the
present experiment. These tokens were recognized as conveying a given attitude by
22 Henry S. Cheang and Marc D. Pell
a minimum of 57% of the native listener group (i.e., more than two times chance).
Note that the items initially constructed for acoustic analysis in each language
varied in linguistic structure and syllable length (i.e., utterances were two, seven,
or eleven syllables in length, Cheang and Pell 2008, 2009). In the present experi-
ment, in order to provide the participants increased exposure to acoustic informa-
tion upon which to base their recognition, only the 11-syllable tokens that met
or exceeded the recognition criteria were entered as stimuli for cross-linguistic
recognition.
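The length and recognition criteria described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Token` record, its field names, and the example data are hypothetical, and the additional step of keeping only the best-validated 15% of utterances is not modelled here.

```python
from dataclasses import dataclass
from typing import List

CHANCE = 1 / 4           # four response alternatives, as stated in the text
MIN_RECOGNITION = 0.57   # minimum proportion of validation listeners
TARGET_SYLLABLES = 11    # only 11-syllable tokens were retained


@dataclass
class Token:
    """Hypothetical record for one recorded utterance."""
    token_id: str
    target: str            # attitude the speaker intended
    syllables: int
    responses: List[str]   # one label per validation listener


def recognition_rate(token: Token) -> float:
    """Proportion of validation listeners who chose the intended attitude."""
    hits = sum(1 for r in token.responses if r == token.target)
    return hits / len(token.responses)


def select_stimuli(tokens: List[Token]) -> List[Token]:
    """Keep tokens that meet both the length and the recognition criterion."""
    return [
        t for t in tokens
        if t.syllables == TARGET_SYLLABLES
        and recognition_rate(t) >= MIN_RECOGNITION
    ]


# Invented example data: 16 validation listeners per token.
tokens = [
    Token("a", "sarcasm", 11, ["sarcasm"] * 12 + ["sincerity"] * 4),
    Token("b", "sarcasm", 11, ["sarcasm"] * 8 + ["neutrality"] * 8),
    Token("c", "sincerity", 2, ["sincerity"] * 16),
]
selected = select_stimuli(tokens)
```

In this invented example only token "a" survives: token "b" falls below the 57% criterion and token "c" has the wrong syllable length.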
In total, 79 English utterances (20 exemplars conveying sarcasm, sincerity, and
neutrality and 19 exemplars of humorous irony) and 77 Cantonese utterances (20
exemplars conveying sarcasm, sincerity, and humorous irony and 17 exemplars of
neutrality) served as the experimental stimuli. As these stimuli represent the best
exemplars of utterances conveying each attitude described in our previous work
(Cheang and Pell 2008, 2009), acoustic features of the selected items mirrored the
major patterns reported in our earlier studies. For example, sarcastic utterances
spoken in Cantonese were marked by higher mean F0 values than corresponding
sincere, humorous, or neutral utterances, whereas sarcastic utterances in English
displayed lower mean F0 values than the other attitudes; in each language, sincere
utterances demonstrated the opposite setting in mean F0, making them most
distinct from sarcasm for this acoustic parameter (see Cheang and Pell 2009 for
complete details).
5.3 Experimental tasks/procedure
The English and Cantonese utterances were blocked for presentation in two sepa-
rate tasks according to the respective language condition. Each of the 40 partici-
pants (20 English-speaking, 20 Cantonese-speaking) completed both the English
and the Cantonese task during a single testing session. The order in which the two
language tasks were presented varied evenly within each participant group and
the sequence of individual trials was always randomized within each task. A total
of 156 experimental trials (79 English, 77 Cantonese stimuli) were judged by each
listener. The experiment was presented by a computer using Superlab 4.0 presenta-
tion software (Cedrus, USA) which also recorded the participants’ responses.
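As a rough illustration of the blocking, counterbalancing, and randomization just described, the sketch below builds one listener's session. It is not the Superlab configuration used in the study: the function name, the stimulus identifiers, and the rule of alternating block order by participant index are all assumptions made for the example.

```python
import random


def build_session(participant_index, english_ids, cantonese_ids, rng):
    """Return [(language, shuffled_trials), ...] for one listener.

    Assumption: block order alternates with the participant's index
    within a group, so each order occurs equally often.
    """
    blocks = [("English", list(english_ids)), ("Cantonese", list(cantonese_ids))]
    if participant_index % 2 == 1:
        blocks.reverse()
    session = []
    for language, trials in blocks:
        rng.shuffle(trials)  # trial order is randomized within each task
        session.append((language, trials))
    return session


# Hypothetical stimulus identifiers matching the reported counts.
english_ids = [f"EN{i:02d}" for i in range(79)]
cantonese_ids = [f"CA{i:02d}" for i in range(77)]

rng = random.Random(0)  # seeded only so the example is reproducible
first = build_session(0, english_ids, cantonese_ids, rng)
second = build_session(1, english_ids, cantonese_ids, rng)
total_trials = sum(len(trials) for _, trials in first)
```

Each listener receives every stimulus exactly once, giving 79 + 77 = 156 trials per session.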
Testing was conducted on an individual basis at McGill University or in a
quiet room in the participant’s home. In all cases, communication between the
examiner and participants was carried out entirely in the native language of the
participant. Participants were informed that they would be listening to individual
utterances, spoken in either English or Cantonese, and that they should judge the
attitude of the speaker in each case from four alternatives: sarcasm, sincerity, hu-
mor, and neutral. Listeners were always instructed to attend to how the sentences
were spoken, since in half of the cases they would not understand the language.
After each sentence was played, written labels appeared on the computer screen
(in the native language) and the participant indicated their judgement with a
mouse click. Before the experiment began, participants were given definitions and short
descriptions of each attitude and of the situations in which the attitudes might
be produced. Following these examples and the administration of instructions,
listeners completed two blocks of practice trials, which were not
included in the experiment, to become accustomed to the experimental procedure and
the sound of the stimuli. The experiment began when all questions regarding the
procedure had been addressed. Each participant was paid $20 CDN after complet-
ing both tasks.
5.4 Statistical procedure
The dependent variable of interest was response accuracy. Data for each attitude
(sarcasm, sincerity, humorous irony, and neutrality) were examined in two ways.
First, responses to stimuli of each attitude from both listener groups were subject-
ed to separate single-sample t-tests; these analyses were conducted to determine
whether listener responses for each attitude category differed significantly from
chance (i.e., chance = 0.25). Second, the data for each attitude were then submit-
ted to separate analyses of variance (ANOVA) with a fixed factor of LANGUAGE
(Cantonese, English) and a repeated factor of LISTENER GROUP (Cantonese,
English). We conducted separate ANOVAs on each attitude in an attempt to focus
our findings on identification differences across listener groups per attitude, as this
was the comparison of greatest theoretical interest. All significant main and inter-
active effects were elaborated using Tukey’s HSD criteria (α = 0.05). Main effects
subsumed by higher-order interactions are reported but not described.
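For readers less familiar with the first analysis, the following is a minimal sketch of a single-sample t statistic computed against the chance level of 0.25. The function name and the accuracy values are invented for illustration and are not data from this study; a p-value would additionally require the t distribution with the returned degrees of freedom, which a statistics package can supply.

```python
import math
import statistics

CHANCE = 0.25  # chance accuracy with four response alternatives


def one_sample_t(accuracies, mu=CHANCE):
    """Return (t, df) for a single-sample t-test of the mean against mu."""
    n = len(accuracies)
    mean = statistics.mean(accuracies)
    sd = statistics.stdev(accuracies)  # sample SD (n - 1 denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Invented proportions correct for one attitude category.
example_accuracies = [0.5, 0.6, 0.7]
t_value, df = one_sample_t(example_accuracies)
```

A large positive t indicates accuracy above chance; in the design described here, one such test would be run per attitude category.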