Friday, May 2, 2014

Who Wrote What Joseph Smith Wrote?

Summary

  • Joseph Smith appears to have a naturally diffuse or generic style when compared with other 19th century New Englanders and early Mormons as measured by noncontextual word use.
  • Smith's style significantly overlaps the styles of Cowdery, Pratt, Rigdon, and Spalding.
  • Having dictated so many of his personal writings through scribes, Smith's already diffuse or multiply overlapping style may have been further diffused through the influence of individual scribal styles.

Preface

I've dreaded writing about these papers, a little, because I find the conclusions embarrassing, frankly. I'm going to give Matthew Jockers the credit I think he is due, up front. When I requested access to these papers from him, and then asked him a follow-up question regarding his results, he explicitly told me some things I had come to believe through reading the 2013 paper. Jockers performed most of the data analysis and computation. He is not emotionally invested in the issue like I am or like many of the other authors on Book of Mormon authorship papers (including one of his coauthors). His conclusions from his 2013 paper are the most reasonable conclusions to be drawn from the data if one is convinced beforehand that the Book of Mormon must have been produced strictly by 19th-century authors, and if one does not consider the possibility of unknown authors. His conclusions in the 2013 paper are also much more modest than those put forth in the more ideologically charged 2008 paper. Jockers did not write the historical analysis presented in the 2008 paper, although anyone who puts their name on a paper is taking some responsibility for the entirety of its content. I have no intention of addressing the historical evidence for the Spaulding-Rigdon hypothesis.

In my opinion, the internal evidence of the studies alone shows that the methods employed by Jockers and coworkers are embarrassingly misapplied to the questions of Book of Mormon authorship, even without referencing the problems due to closed-set attribution, and that is what I'm going to illustrate here.

The 2013 Paper

http://www.matthewjockers.net/publications/ (Preprints available)
Jockers, Matthew L. “Testing Authorship in the Personal Writings of Joseph Smith Using NSC Classification.” Literary and Linguistic Computing 28.3 (2013): 371-381.
Jockers, Matthew L., Daniela M. Witten, and Craig S. Criddle. “Reassessing Authorship of the Book of Mormon Using Delta and Nearest Shrunken Centroid Classification.” Literary and Linguistic Computing 23.4 (2008): 465-492.
http://www.jstor.org/stable/2982671

Beginning with the 2013 paper, we see the problems inherent in applying the Nearest Shrunken Centroid methodology to the particular data set chosen for study. I emphasize this point because NSC is a proven method--when applied to the right kind of data. I don't really know what those data are, but we will see how ineffective the method choice is for the writings of Joseph Smith.

In the 2013 paper, Jockers tested whether documents supposedly dictated by Joseph Smith to scribes could be identified as having come from Joseph Smith. He collected handwritten examples of Joseph Smith's style into a training set:
"[T]he corpus of Smith material available for training contained 25 documents in Smith's handwriting. These works ranged in length from 112 words to 2,300 words with an average length of 527 (13,172 words total). The test corpus contained 96 additional documents attributed to Smith but in the handwriting of one of 23 different scribes. These works ranged in length from 105 words to 10,927 words with an average length of 1,168 words."
Jockers was able to compare 106 noncontextual words rather than the approximately 40 considered in the earliest Book of Mormon stylometry paper. In this regard the measures are potentially more powerful, but the samples in this study are much smaller, on average, and vary in length. Both these factors reduce the ability to identify authors. As controls, Jockers included texts by Isaiah and Malachi, Henry Wadsworth Longfellow, and Joel Barlow--all authors loosely associated with Joseph Smith (in the Book of Mormon or from the same time period) who wrote about some similar topics, but who were known to have differing styles. Personal writings from most of the 23 scribes were not included for testing. Now I need to give you Jockers's main conclusion:

Joseph Smith did not have a clear style when he was dictating or writing. This means it was the correct choice to leave Joseph Smith's style out of the tests when they were trying to identify which 19th century authors wrote the Book of Mormon.

I agree that it was a reasonable choice to leave Joseph Smith out of the Book of Mormon analysis--three other studies (Holmes's study is the only one I've written about, so far) had all previously established that Joseph Smith's personal style was measurably different from any of the styles in the Book of Mormon. Those studies would all claim there is no point in including Joseph Smith in Book of Mormon authorship studies because he had a measurably distinct style. Of course, Jockers points out that including the many dictated documents might not have shown Joseph Smith's personal style, but his scribes' styles, or a mix of styles. So let's take a look at what Jockers found to see what claims are best supported.

The following is an excerpt from the first table in the 2013 study. It shows the number of the 96 texts that identified a particular author as the 1st or 2nd most similar style to the text:
Table 1

Identified Author   1st choice   2nd choice
Barlow                       1            0
Cowdery                     32           21
IsaiahMalachi                3            1
Longfellow                   2            4
Pratt                       24           12
Rigdon                      12           10
Smith                       15           25
Spalding                     7           23

Notice that:
  • 13/96 (Barlow, Longfellow, Isaiah/Malachi, and Spalding) were attributed to authors with no connection to the texts.
  • 32/96 were assigned to Cowdery
  • 24/96 were assigned to Pratt
  • 12/96 were assigned to Rigdon
  • 15/96 were assigned to Smith
On the surface this would seem to say that the method got 13.5% 'wrong', 15.6% 'right', and that 70.8% were at least assigned to the scribes. That's not bad, but let's look a little closer. How many of the texts assigned to Cowdery, Pratt, and Rigdon were penned by those scribes?

Scribe    # texts assigned to scribe   # of those texts for which scribe acted as scribe
Cowdery                           32                                                   2
Pratt                             24                                                   1
Rigdon                            12                                                   0

In summary, the method does a terrible job of identifying either the dictator or the scribe. In addition, Rigdon and Spalding yielded a total of 19.8% false positives. Taken altogether, the method was able to identify the correct author or scribe as the 1st most likely candidate a total of 18.8% of the time. About the only thing the study consistently got right was assigning low probabilities that most texts were written by Barlow, Longfellow, or Isaiah/Malachi.
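The percentages quoted in this section can be rechecked from the two tables with a few lines of Python. This is only a sketch of my own arithmetic, not anything from Jockers's paper: the variable names are made up for illustration, and the 'correct' tally assumes (as I did above) that the texts credited to Smith and the texts credited to their actual scribe do not overlap.

```python
# First-choice assignments from Table 1 of the 2013 paper, as quoted above.
first_choice = {
    "Barlow": 1, "Cowdery": 32, "IsaiahMalachi": 3, "Longfellow": 2,
    "Pratt": 24, "Rigdon": 12, "Smith": 15, "Spalding": 7,
}
# Of the texts assigned to each scribe, how many that scribe actually penned.
scribe_hits = {"Cowdery": 2, "Pratt": 1, "Rigdon": 0}

total = sum(first_choice.values())  # number of test documents


def pct(count):
    """Share of all test documents, as a percentage."""
    return 100 * count / total


# Authors with no connection to the texts.
unconnected = sum(first_choice[a] for a in
                  ("Barlow", "Longfellow", "IsaiahMalachi", "Spalding"))
# Texts assigned to Cowdery, Pratt, or Rigdon.
to_scribes = sum(first_choice[a] for a in scribe_hits)
# Rigdon never acted as scribe and Spalding has no connection to the texts.
rigdon_spalding = first_choice["Rigdon"] + first_choice["Spalding"]
# Assumption: Smith's texts and the scribe hits are counted as disjoint.
correct = first_choice["Smith"] + sum(scribe_hits.values())

for label, count in [("unconnected authors", unconnected),
                     ("assigned to Smith", first_choice["Smith"]),
                     ("assigned to the three scribes", to_scribes),
                     ("Rigdon + Spalding", rigdon_spalding),
                     ("correct author or scribe", correct)]:
    print(f"{label}: {count}/{total} = {pct(count):.1f}%")
```

Running it gives the same counts used in the text: 13, 15, 68, 19, and 18 out of 96.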

Jockers explored an alternative approach. Considering the possibility that Joseph was collaborating, maybe we should see paired authorship signatures. When looking at 1st and 2nd place assignments together, Smith is paired with Cowdery 32 times, and 7 more times with Rigdon and Pratt--all influential contributors to early Mormon thought and documents. It sounds superficially impressive until you remember that Cowdery only had a hand in 2 of those 32, Pratt in 1 of the 7, and Rigdon in none.

An Alternative Interpretation

I would like to propose an alternative set of conclusions that is consistent with Jockers's data, as reported, and also with the data of Holmes's studies. One problem with these data is that none of them tell us on some absolute scale how much stylometric variation there is among the various texts. All we know is relative similarities. What would the results show if Smith had a moderately ambiguous style? From the adversarial authorship studies, there were hints that some authors might have more generic writing styles than others. If Smith were one of these, his style could well show a broader range of stylometric measures than most authors. What if parts of this style then overlapped the styles of Cowdery, Rigdon, Pratt, and Spalding? Any texts which fell within the regions of Cowdery, for example, would be assigned as most likely to have come from Cowdery. This would happen because Cowdery's signature is much more defined than Smith's, not because the text doesn't match Smith's style or because it wasn't written by Smith. Here's how a graph of the style overlaps might look:
Smith's style is represented by the blue circle. Some of the other authors' styles are represented in different colors. Orange circles represent the unmeasured, personal styles of a number of other scribes. Because Smith's style is so diffuse, any texts that fall in the Pratt section of his style will be classified as the more clearly defined Pratt, any that fall in the Cowdery section will be assigned to Cowdery, etc. If scribes influenced Smith's style, it could diffuse his style even further. This is imagined with the dashed blue oval. You can see how, in a scenario like this, many of the texts would be assigned to other authors even if they were all in Smith's style. If Smith's style was further diffused through the influence of scribes (whose personal styles could be anywhere, since most of them weren't tested), then the probability of false assignment increases greatly, and the false assignments would appear to suggest a dual authorship influence. Smith's signal could even expand to give a mistaken match for very different authors like Barlow. While allowing for a dual author influence, this hypothesis avoids claiming that Smith has no style of his own and allows for explanations of why Smith's dictations would be assigned to pairings like Pratt and Spalding without either one having taken part in their creation.
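To make the picture concrete, here is a toy simulation of the scenario. Everything in it is an assumption for illustration: the two-dimensional "style space", the author positions, and the spreads are invented, and the classifier is a plain nearest-centroid rule rather than the Nearest Shrunken Centroid procedure Jockers actually used. It only shows that a diffuse author surrounded by tightly defined neighbors can lose most of his own texts to them.

```python
import math
import random

# Hypothetical 2D "style space"; every number here is invented.
CENTROIDS = {
    "Smith": (0.0, 0.0),
    "Cowdery": (1.5, 0.0),
    "Pratt": (-1.5, 0.0),
    "Rigdon": (0.0, 1.5),
    "Spalding": (0.0, -1.5),
}
# Smith is given a diffuse style; the others are tightly defined.
SPREAD = {"Smith": 2.0, "Cowdery": 0.4, "Pratt": 0.4,
          "Rigdon": 0.4, "Spalding": 0.4}


def sample_text(author, rng):
    """Draw one synthetic 'text' scattered around the author's centroid."""
    cx, cy = CENTROIDS[author]
    s = SPREAD[author]
    return (rng.gauss(cx, s), rng.gauss(cy, s))


def nearest_centroid(point):
    """Assign a text to the author whose centroid is closest (Euclidean)."""
    px, py = point
    return min(CENTROIDS,
               key=lambda a: math.hypot(px - CENTROIDS[a][0],
                                        py - CENTROIDS[a][1]))


def assignment_shares(author, n=2000, seed=0):
    """Fraction of one author's texts that gets assigned to each candidate."""
    rng = random.Random(seed)
    counts = {a: 0 for a in CENTROIDS}
    for _ in range(n):
        counts[nearest_centroid(sample_text(author, rng))] += 1
    return {a: c / n for a, c in counts.items()}


smith_shares = assignment_shares("Smith")
cowdery_shares = assignment_shares("Cowdery")
print("Texts truly by Smith are assigned to:", smith_shares)
print("Texts truly by Cowdery are assigned to:", cowdery_shares)
```

With these made-up settings, most texts drawn from Smith's own distribution are handed to one of the four tighter neighbors, while Cowdery's texts almost always come back as Cowdery. The real analysis works in a 106-dimensional word-frequency space with estimated centroids, so this proves nothing about the actual data. It just shows the mechanism is plausible.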

I wondered if any of the pairings gave consistent results, for example, whether the texts assigned to Pratt and Spalding in 1st and 2nd place line up with documents dictated by Smith and recorded by one or two particular scribes. Jockers provided the data to answer this in an appendix, and the answer is no. Maybe there are some weak correlations, but none show strong consistency. This is unsurprising, given the normal statistical distributions of stylometric signatures.

I can't prove the truth of this hypothesis from these data, but it seems like a reasonable, and likely testable, hypothesis. It is also consistent with conclusions from earlier studies that Smith had a style distinct from LDS scripture. If Smith had no distinction to his style, the data of all the earlier studies would have to be meaningless. In addition, Jockers didn't include vocabulary richness measures or noncontextual word pairings as used by two of the earlier studies. It seems unlikely that Holmes did such a bad job, even if you don't want to credit the BYU studies that I haven't covered, yet.

The Take Away

Joseph Smith's personal style is somewhat generic when compared with the writings of Cowdery, Pratt, Rigdon, and Spalding. Scribal influence may be responsible for further diffusing this style. Furthermore, false positives appear to be the norm for these methods when applied to the writings of Joseph Smith. I don't believe this means the data and methods are useless, just that they don't support the principal interpretations discussed by Jockers.

Next Time

We will take a look at the 2008 paper which supposedly supports the Rigdon-Spalding hypothesis for authorship of the Book of Mormon. Having examined the 2013 paper, we will be in a better position to interpret the results presented in the 2008 paper.
