Do boys and girls write the same? Analysis of n-grams of morphological categories (¿Niños y niñas escriben igual? Análisis de n-gramas de categorías morfológicas)

IF 1.1 | Zone 4, Education | Q3 EDUCATION & EDUCATIONAL RESEARCH | Culture and Education | Pub Date: 2022-11-08 | DOI: 10.1080/11356405.2022.2121130
Sheila Queralt, Jordi Cicres
Volume 11 1, pages 33-63. Publication type: Journal Article. URL: https://doi.org/10.1080/11356405.2022.2121130

Abstract

The objective of this study is to characterize writing samples in Catalan produced by boys and girls in primary school (aged seven to 12) using syntactic patterns. The corpus contains 169 texts divided by sex (76 by boys and 93 by girls), with an average length of 200 words and a total of 33,763 words. From this corpus, we calculated the 40 most frequent n-grams of morphological categories (bigrams and trigrams). The data were analysed statistically using ANOVA and Linear Discriminant Analysis; the accuracy in predicting the writer’s gender in a cross-validation experiment was 60.4% using both bigrams and trigrams. When the children’s age was taken into account, accuracy exceeded 70% in both the original classification and the cross-validation. Identifying the most discriminating bigrams and trigrams allowed us to determine that girls show greater expressive capacity, superior syntactic maturity, and greater lexical and syntactic richness.
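To make the feature-extraction step concrete, here is a minimal sketch of how bigrams and trigrams of morphological categories might be counted and turned into per-text features. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the tagger, the tagset, or the normalisation used, so the tag labels (DET, NOUN, VERB, ADJ), the function names, and the choice to divide counts by text length are all hypothetical. The classification step (ANOVA and Linear Discriminant Analysis) is not shown.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Ngram = Tuple[str, ...]


def ngrams(tags: Sequence[str], n: int) -> List[Ngram]:
    """Return all contiguous n-grams over a sequence of category tags."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def top_ngrams(texts: Sequence[Sequence[str]], k: int,
               orders: Sequence[int] = (2, 3)) -> List[Ngram]:
    """Return the k most frequent n-grams (bigrams and trigrams by default)
    pooled over the whole corpus."""
    counts: Counter = Counter()
    for tags in texts:
        for n in orders:
            counts.update(ngrams(tags, n))
    return [gram for gram, _ in counts.most_common(k)]


def feature_vector(tags: Sequence[str], vocabulary: Sequence[Ngram]) -> List[float]:
    """Relative frequency of each vocabulary n-gram in one text.

    Assumption: counts are divided by the number of tags in the text so that
    texts of different lengths are comparable.
    """
    counts: Counter = Counter()
    for n in {len(gram) for gram in vocabulary}:
        counts.update(ngrams(tags, n))
    length = len(tags) or 1  # guard against empty texts
    return [counts[gram] / length for gram in vocabulary]


if __name__ == "__main__":
    # Hypothetical, hand-tagged toy texts; the real corpus has 169 texts.
    corpus = [
        ["DET", "NOUN", "VERB", "DET", "NOUN"],
        ["DET", "ADJ", "NOUN", "VERB"],
    ]
    vocab = top_ngrams(corpus, k=2)  # the study uses the 40 most frequent
    for text in corpus:
        print(feature_vector(text, vocab))
```

Each text then becomes a fixed-length vector (40 values in the study's setting) that a discriminant classifier can take as input, with the writer's sex as the label.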