Fixing the NotFittedError: Vocabulary not fitted or provided error

NotFittedError: Vocabulary not fitted or provided

Traceback (most recent call last)

Cell In[57], line 11
      9 # print the top words for each topic
     10 n_top_words = 5
---> 11 tf_feature_names = tf_vectorizer.get_feature_names()
     12 topic_word = print_top_words(lda, tf_feature_names, n_top_words)

File ~\anaconda3\lib\site-packages\sklearn\utils\deprecation.py:88, in deprecated._decorate_fun.<locals>.wrapped(*args, **kwargs)
     85 @functools.wraps(fun)
     86 def wrapped(*args, **kwargs):
     87     warnings.warn(msg, category=FutureWarning)
---> 88     return fun(*args, **kwargs)

File ~\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py:1429, in CountVectorizer.get_feature_names(self)
   1417 @deprecated(
   1418     'get_feature_names is deprecated in 1.0 and will be removed '
   1419     'in 1.2. Please use get_feature_names_out instead.'
   1420 )
   1421 def get_feature_names(self):
   1422     '''Array mapping from feature integer indices to feature name.
   1423 
   1424     Returns
   (...) 
   1427         A list of feature names.
   1428     '''
-> 1429     self._check_vocabulary()
   1431     return [t for t, i in sorted(self.vocabulary_.items(), key=itemgetter(1))]

File ~\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py:498, in _VectorizerMixin._check_vocabulary(self)
    496     self._validate_vocabulary()
    497     if not self.fixed_vocabulary_:
--> 498         raise NotFittedError('Vocabulary not fitted or provided')
    500 if len(self.vocabulary_) == 0:
    501     raise ValueError('Vocabulary is empty')

NotFittedError: Vocabulary not fitted or provided

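This error occurs because tf_vectorizer has not been fitted, so it has no learned vocabulary from which to return feature names. In a notebook this often happens when the vectorizer is re-created (or its defining cell is re-run) after the cell that fitted it, which leaves a fresh, unfitted object under the same name. To fix it, fit tf_vectorizer before requesting the feature names.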

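Call fit_transform() on your data before asking for the feature names. It learns the vocabulary and returns the document-term matrix in one step:

tf = tf_vectorizer.fit_transform(your_data)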

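Then retrieve the feature names. As the traceback notes, get_feature_names() is deprecated in scikit-learn 1.0 and will be removed in 1.2, so use get_feature_names_out() instead:

tf_feature_names = tf_vectorizer.get_feature_names_out()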

With the vectorizer fitted, the NotFittedError no longer occurs.
