October 28, 2020
 
# -*- coding:utf-8 -*-
lyric = 'The night begin to shine, the night begin to shine'
words = lyric.split()
print(words)

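# Word frequency count
# https://docs.python.org/zh-cn/3/library/stdtypes.html#str.split
# split(): returns a list of the words in the string, using sep as the delimiter string
# https://docs.python.org/zh-cn/3/library/stdtypes.html#str.count
# count(): returns the number of non-overlapping occurrences of substring sub in the range [start, end]. The optional arguments start and end are interpreted as in slice notation.
# Note: the loop below calls count() on the list of words (list.count), which counts the elements equal to the given value
# https://docs.python.org/zh-cn/3/tutorial/inputoutput.html#methods-of-file-objects
# read() method of file objects: reads the contents of the file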
path = 'C:/Users/harveymei/Desktop/news.txt'
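with open(path, 'r') as text:  # open the file read-only and bind it to the variable text
    words = text.read().split()  # read the file contents, split into words, and assign to words
    print(words)
    for word in words:  # loop over every element of the list
        print('{} -{} times'.format(word, words.count(word)))  # insert the word and its count into the string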

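# Problems:
# 1. Words with attached punctuation are counted separately
# 2. Some words are reported multiple times
# 3. Words with a capitalized first letter are counted separately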
C:\Users\harveymei\PycharmProjects\mod3736\venv\Scripts\python.exe C:/Users/harveymei/PycharmProjects/mod3736/static.py
['The', 'night', 'begin', 'to', 'shine,', 'the', 'night', 'begin', 'to', 'shine']
['The', 'Party', 'leads', 'the', 'formulation', 'work,', 'and', 'the', 'government', 'and', 'the', 'legislature', 'play', 'their', 'due', 'roles', 'in', 'the', 'process,', 'Li', 'said,', 'adding', 'that', 'such', 'a', 'practice', 'is', 'a', 'good', 'experience', 'that', 'the', 'CPC', 'has', 'created', 'in', 'the', 'governance', 'of', 'China.', 'The', 'Party', 'leads', 'the', 'formulation', 'work,', 'and', 'the', 'government', 'and', 'the', 'legislature', 'play', 'their', 'due', 'roles', 'in', 'the', 'process,', 'Li', 'said,', 'adding', 'that', 'such', 'a', 'practice', 'is', 'a', 'good', 'experience', 'that', 'the', 'CPC', 'has', 'created', 'in', 'the', 'governance', 'of', 'China.']
The -2 times
Party -2 times
leads -2 times
the -12 times
formulation -2 times
work, -2 times
and -4 times
the -12 times
government -2 times
and -4 times
the -12 times
legislature -2 times
play -2 times
their -2 times
due -2 times
roles -2 times
in -4 times
the -12 times
process, -2 times
Li -2 times
said, -2 times
adding -2 times
that -4 times
such -2 times
a -4 times
practice -2 times
is -2 times
a -4 times
good -2 times
experience -2 times
that -4 times
the -12 times
CPC -2 times
has -2 times
created -2 times
in -4 times
the -12 times
governance -2 times
of -2 times
China. -2 times
The -2 times
Party -2 times
leads -2 times
the -12 times
formulation -2 times
work, -2 times
and -4 times
the -12 times
government -2 times
and -4 times
the -12 times
legislature -2 times
play -2 times
their -2 times
due -2 times
roles -2 times
in -4 times
the -12 times
process, -2 times
Li -2 times
said, -2 times
adding -2 times
that -4 times
such -2 times
a -4 times
practice -2 times
is -2 times
a -4 times
good -2 times
experience -2 times
that -4 times
the -12 times
CPC -2 times
has -2 times
created -2 times
in -4 times
the -12 times
governance -2 times
of -2 times
China. -2 times

Process finished with exit code 0
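
The three problems listed in the script's comments can probably be handled in one pass. Below is a minimal sketch, not part of the original script: it assumes that stripping leading and trailing ASCII punctuation with `string.punctuation` and lowercasing each word is acceptable normalization, and it swaps the `words.count(word)` loop for `collections.Counter`, so each distinct word is reported only once. The helper name `count_words` is made up for illustration.

```python
# -*- coding:utf-8 -*-
import string
from collections import Counter


def count_words(text):
    """Return a Counter mapping each normalized word to its frequency."""
    words = []
    for raw in text.split():
        # Problems 1 and 3: drop surrounding punctuation and ignore case
        word = raw.strip(string.punctuation).lower()
        if word:  # skip tokens that consisted of punctuation only
            words.append(word)
    # Problem 2: a Counter holds one entry per distinct word
    return Counter(words)


lyric = 'The night begin to shine, the night begin to shine'
for word, count in count_words(lyric).most_common():
    print('{} - {} times'.format(word, count))
```

To run this against the file instead, pass `text.read()` to `count_words` inside the `with open(path, 'r') as text:` block. Note that `strip` only removes punctuation at the ends of a word, so apostrophes and hyphens inside words such as "don't" are left alone.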
