The Wenzhou Spoken Corpus

Citation for Previous Publication

Newman, J. et al. (2007). The Wenzhou Spoken Corpus. Corpora, 2(1), 97-109.

Description

The creation of the Wenzhou Spoken Corpus, an online searchable corpus of a modern Chinese dialect, presents a number of challenges that are of interest to the corpus linguistic community. We review issues involved with collection of spoken data, its transcription and markup, as well as the functionality of the search tools. The transcription makes use of Chinese characters as well as IPA symbols for Wenzhou colloquial forms not conventionally represented by characters. XML was adopted as the standard for the basic format of files, with file searches expressed in XPath form. The search tools provide the usual options of restricting searches by age, gender, etc., and yield concordances and tables of collocates. Though the collection of data for the corpus was ‘opportunistic’ in some ways, and so not ideally balanced or representative, it is nevertheless proving to be a valuable tool for corpus-based research on Wenzhou.
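The search workflow described above (XML files, XPath queries restricted by speaker metadata, then concordances and collocate tables) can be illustrated with a minimal sketch. It is not the corpus's actual schema or tooling: the element names `u` and `w`, the attributes `gender` and `age`, the helper functions, and the single-letter placeholder tokens are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup (not the real corpus schema): each <u> is an utterance
# carrying speaker metadata, each <w> is one token. Tokens are placeholders.
SAMPLE = """<corpus>
  <u gender="f" age="30"><w>a</w><w>b</w><w>c</w><w>b</w></u>
  <u gender="m" age="55"><w>b</w><w>d</w></u>
  <u gender="f" age="62"><w>c</w><w>b</w><w>e</w></u>
</corpus>"""


def concordance(root, keyword, xpath=".//u", window=1):
    """Return (left, keyword, right) context tuples for every occurrence of
    keyword inside the utterances selected by the XPath expression."""
    lines = []
    for utt in root.findall(xpath):
        tokens = [w.text for w in utt.findall("w")]
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                lines.append((left, tok, right))
    return lines


def collocates(lines):
    """Count the tokens that appear in the context windows of the hits."""
    counts = Counter()
    for left, _, right in lines:
        counts.update(left + right)
    return counts


root = ET.fromstring(SAMPLE)
# Restrict the search to female speakers with an XPath attribute predicate.
female_hits = concordance(root, "b", ".//u[@gender='f']")
for left, kw, right in female_hits:
    print(" ".join(left), f"[{kw}]", " ".join(right))
print(collocates(female_hits).most_common())
```

Python's `xml.etree.ElementTree` supports only a subset of XPath, so a numeric restriction such as an age range would need either a filter in Python on `utt.get("age")` or a fuller XPath engine.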

Item Type

Journal article (http://purl.org/coar/resource_type/c_6501), version of record (http://purl.org/coar/version/c_970fb48d4fbd8a85)

Other License Text / Link

© 2007 Edinburgh University Press

Language

en

Location

China
