Commit

Fix minor errors in assertions
Allen Riddell committed Mar 27, 2015
1 parent 12d1294 commit 9e9ddb4
Showing 3 changed files with 11 additions and 3 deletions.
1 change: 1 addition & 0 deletions source/feature_selection.rst
@@ -89,6 +89,7 @@ appealing feature of word rates per 1,000 words is that readers are familiar
with documents of this length (e.g., a newspaper article).

.. ipython:: python
:okwarning:
import os
import nltk
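
As a quick aside for readers skimming the diff, a word's rate per 1,000 words is just its raw count scaled by the length of the document; a tiny illustrative calculation with invented counts::

    word_count = 19            # occurrences of one word in a document (invented)
    total_words = 6349         # length of that document in words (invented)
    rate_per_1000 = 1000.0 * word_count / total_words
    round(rate_per_1000, 2)    # 2.99 occurrences per 1,000 words
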
8 changes: 6 additions & 2 deletions source/topic_model_mallet.rst
@@ -230,7 +230,7 @@ called ``grouper`` using ``itertools.izip_longest`` that solves our problem.
assert np.all(doctopic > 0)
assert np.allclose(np.sum(doctopic, axis=1), 1)
assert len(doctopic) == len(filenames)
assert all(doctopic_orig == doctopic)
assert np.all(doctopic_orig == doctopic)
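
The replacement of the built-in ``all`` with ``np.all`` in this hunk is the substance of the fix: ``all`` iterates over the rows of a two-dimensional comparison and tries to reduce each row to a single boolean, which raises ``ValueError``, whereas ``np.all`` checks every element. A minimal sketch with an invented array::

    import numpy as np

    doctopic = np.array([[0.2, 0.8],
                         [0.5, 0.5]])      # invented topic shares
    doctopic_orig = doctopic.copy()

    np.all(doctopic_orig == doctopic)      # True: every element compares equal
    # all(doctopic_orig == doctopic)       # ValueError: truth value of an array is ambiguous
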
Now we will calculate the average of the topic shares associated with each
novel. Recall that we have been working with small sections of novels. The
@@ -252,6 +252,8 @@ novel.
@suppress
assert len(set(novel_names)) == 6
@suppress
doctopic_orig = doctopic.copy()
# use method described in preprocessing section
num_groups = len(set(novel_names))
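
The ``doctopic_orig = doctopic.copy()`` added in this hunk preserves the per-section topic shares, presumably because ``doctopic`` is subsequently replaced by novel-level averages (which would also explain why a later assertion now compares ``dtm.shape[0]`` against ``doctopic_orig.shape[0]``). The averaging itself, the "method described in the preprocessing section", is not shown here; one hedged sketch of such grouping, with invented data and names::

    import numpy as np

    # invented example: four text sections drawn from two novels
    doctopic = np.array([[0.1, 0.9],
                         [0.3, 0.7],
                         [0.8, 0.2],
                         [0.6, 0.4]])
    novel_names = np.array(['Austen_Emma', 'Austen_Emma',
                            'CBronte_Jane_Eyre', 'CBronte_Jane_Eyre'])

    doctopic_orig = doctopic.copy()        # keep the per-section shares around
    num_groups = len(set(novel_names))
    doctopic_grouped = np.zeros((num_groups, doctopic.shape[1]))
    for i, name in enumerate(sorted(set(novel_names))):
        # average the topic shares of all sections belonging to the same novel
        doctopic_grouped[i, :] = np.mean(doctopic[novel_names == name, :], axis=0)
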
@@ -294,6 +296,7 @@ and fashioned a representation that preserves important features in a matrix
that is 813 by 20 (5% the size of the original).

.. ipython:: python
:okwarning:
from sklearn.feature_extraction.text import CountVectorizer
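
``CountVectorizer`` builds the much larger document-term matrix that the 813-by-20 topic representation condenses; a small sketch with toy documents rather than the tutorial's corpus::

    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["Emma Woodhouse handsome clever and rich",
            "It is a truth universally acknowledged"]
    vectorizer = CountVectorizer()                   # pass input='filename' to read files from disk
    dtm = vectorizer.fit_transform(docs).toarray()   # documents x vocabulary matrix of word counts
    vocab = vectorizer.get_feature_names()           # get_feature_names_out() in newer scikit-learn
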
@@ -325,7 +328,7 @@ that is 813 by 20 (5% the size of the original).
.. ipython:: python
:suppress:
assert dtm.shape[0] == doctopic.shape[0]
assert dtm.shape[0] == doctopic_orig.shape[0]
# NOTE: the IPython directive seems less prone to errors when these blocks
# are split up.
xs, ys = pos[:, 0], pos[:, 1]
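
The ``pos`` array referenced here holds two-dimensional coordinates produced by multidimensional scaling earlier in the tutorial; the MDS step is not part of this diff, but a hedged sketch with an invented matrix shows roughly how such coordinates arise::

    import numpy as np
    from sklearn.manifold import MDS
    from sklearn.metrics.pairwise import euclidean_distances

    # invented (novels x topics) matrix standing in for the averaged topic shares
    doctopic_grouped = np.array([[0.2, 0.8],
                                 [0.7, 0.3],
                                 [0.5, 0.5]])
    dist = euclidean_distances(doctopic_grouped)     # pairwise distances between novels
    mds = MDS(n_components=2, dissimilarity='precomputed', random_state=1)
    pos = mds.fit_transform(dist)                    # one (x, y) coordinate pair per novel
    xs, ys = pos[:, 0], pos[:, 1]
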
@@ -426,6 +429,7 @@ Now we have everything we need to list the words associated with each topic.

.. ipython:: python
N_WORDS_DISPLAY = 10
for t in range(len(topic_words)):
print("Topic {}: {}".format(t, ' '.join(topic_words[t][:N_WORDS_DISPLAY])))
5 changes: 4 additions & 1 deletion source/topic_model_python.rst
@@ -58,6 +58,7 @@ As always we need to give Python access to our corpus. In this case we will work
with our familiar document-term matrix.

.. ipython:: python
:okwarning:
import numpy as np # a conventional alias
import sklearn.feature_extraction.text as text
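
This file fits a non-negative matrix factorization (NMF) topic model with scikit-learn; a condensed, hedged sketch of that pipeline, with toy documents and an invented number of topics, may help orient readers of the diff::

    import numpy as np                        # a conventional alias
    import sklearn.feature_extraction.text as text
    from sklearn import decomposition

    docs = ["Emma Woodhouse handsome clever and rich",
            "It is a truth universally acknowledged",
            "There was no possibility of taking a walk"]
    vectorizer = text.CountVectorizer(min_df=1)
    dtm = vectorizer.fit_transform(docs).toarray()

    num_topics = 2                            # invented; the tutorial uses more
    clf = decomposition.NMF(n_components=num_topics, random_state=1)
    doctopic = clf.fit_transform(dtm)         # document-topic weights
    doctopic = doctopic / np.sum(doctopic, axis=1, keepdims=True)   # rows sum to 1
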
@@ -144,6 +145,8 @@ Now we will average those topic shares associated with the same novel together
@suppress
assert len(set(novel_names)) == 6
@suppress
doctopic_orig = doctopic.copy()
# use method described in preprocessing section
num_groups = len(set(novel_names))
@@ -192,7 +195,7 @@ The topics (or components) of the NMF fit preserve the distances between novels
.. ipython:: python
:suppress:
assert dtm.shape[0] == doctopic.shape[0]
assert dtm.shape[0] == doctopic_orig.shape[0]
# NOTE: the IPython directive seems less prone to errors when these blocks
# are split up.
xs, ys = pos[:, 0], pos[:, 1]
