- Under certain conditions, authors with no training in linguistics or statistical authorship attribution can obscure their own writing style, or imitate the style of another author.
- When authors make no attempt to obscure or change their style, many current stylometric methods assign authorship with high accuracy, even with large sets of candidate authors and relatively small text samples.
- Machine translation through as many as two additional languages does not obscure authorship style (e.g. writing in English, then translating through German and Japanese before returning to English).
- The conditions for successful obfuscation and imitation are different in a number of ways from those necessary for Joseph Smith to have copied, forged, imitated, or invented Mormon scripture.
- The differences represent the stylometric challenge needed to establish the plausibility of fraud on the part of Joseph Smith.
Michael Brennan, Sadia Afroz, and Rachel Greenstadt. 2012. Adversarial stylometry: Circumventing authorship recognition to preserve privacy and anonymity. ACM Trans. Inf. Syst. Secur. 15, 3, Article 12 (November 2012).
Michael Brennan and Rachel Greenstadt. 2009. Practical attacks against authorship recognition techniques. In Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence (IAAI), Pasadena, California (July 2009).
The papers aren't too technical and are pretty fun reads, if you are interested. Most of the technical parts are in other, referenced papers, since the authors employed already established methods. I'm just going to give you the bottom-line results and the details of the writing data sets that are relevant to thinking about authorship of Mormon scripture.
The text samples
- Authors submitted 5000 or 6500 words of previously written academic text, like essays, papers, articles, business letters, etc.
- Authors were told to write 500 words describing their neighborhood to a friend who hadn't seen it before, but to disguise their style.
- Authors were then given a 2500 word passage from a very distinctive author, and told to describe their day, from the time they woke up, in the style of this author. They were to use third person, and again, they each wrote approximately 500 words.
- The texts from 15-45 authors were then split up randomly into many smaller groups of 2-40 authors. The randomization was thorough, and many different combinations were tried, sometimes every possible combination, depending on the stylometric test being used and computational feasibility.
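The grouping step in the last bullet can be sketched roughly like this. It is only my illustration of the idea, not code from the papers: the function name `author_subsets`, the `max_exhaustive` cutoff, the number of random draws, and the author labels are all assumptions I made up for the example.

```python
import itertools
import math
import random


def author_subsets(authors, size, max_exhaustive=1000, n_random=100, seed=0):
    """Return groups of `size` authors drawn from `authors`.

    If the number of possible groups is small enough, return every
    combination; otherwise return `n_random` randomly sampled groups
    (which may include repeats). The cutoff and the sample count are
    illustrative assumptions, not values from the papers.
    """
    total = math.comb(len(authors), size)
    if total <= max_exhaustive:
        return [list(c) for c in itertools.combinations(authors, size)]
    rng = random.Random(seed)
    return [sorted(rng.sample(authors, size)) for _ in range(n_random)]


# Hypothetical author labels, using the smallest pool size mentioned (15).
authors = [f"author_{i:02d}" for i in range(15)]
small = author_subsets(authors, 2)  # 105 possible pairs: enumerate them all
large = author_subsets(authors, 7)  # 6435 possible groups: sample instead
print(len(small), len(large))  # prints: 105 100
```

The point of the cutoff is just that exhaustive enumeration blows up quickly as the group size grows, which is presumably why the authors only tried every combination when it was computationally feasible.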
The results
- All of the stylometric measures tried were able to assign normal texts to the correct authors with high probability (>90%, typically).
- Almost none of the tests did better than random guessing when trying to assign texts where the author disguised his or her style (some authors did better than others, and some authors had disguised styles assigned to them more often than others).
- All of the tests were very likely to assign the imitation styles to the author being imitated.
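To make "assigning a text to an author" and "random guessing" a little more concrete, here is a toy sketch. It is not one of the methods used in the papers; it just compares function-word frequencies against each candidate's reference text and picks the closest one. The word list, the sample sentences, and the function names are all invented for illustration.

```python
from collections import Counter

# A tiny, arbitrary list of function words (an assumption for illustration).
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "but", "with"]


def profile(text):
    """Relative frequency of each function word in `text`."""
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(p, q):
    """Manhattan distance between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def attribute(unknown_text, reference_texts):
    """Return the candidate whose reference profile is closest."""
    target = profile(unknown_text)
    return min(
        reference_texts,
        key=lambda name: distance(target, profile(reference_texts[name])),
    )


def chance_accuracy(n_candidates):
    """Expected accuracy of uniform random guessing among the candidates."""
    return 1 / n_candidates


# Made-up reference samples for two hypothetical authors.
references = {
    "A": "the cat and the dog and the bird sat in the sun",
    "B": "it was a day that a man met a friend with a map",
}
unknown = "the fox and the hen and the owl hid in the barn"

print(attribute(unknown, references))            # prints: A
print(chance_accuracy(2), chance_accuracy(40))   # prints: 0.5 0.025
```

With groups of 2-40 candidate authors, random guessing is right somewhere between 50% and 2.5% of the time, so "no better than random" for disguised texts is a long way down from better than 90% on normal texts.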
What does this mean for studies of Book of Mormon authorship? From Holmes's data, some type of fraud seems to be the most scientifically defensible alternative if one rules out claims involving some sort of divine intervention or other magical explanations. Thus, our ability to detect obfuscation or imitation in Mormon scripture texts would lend support to this conclusion, while finding that fraud is unlikely lends support to selecting among the various 'miraculous' or 'supernatural' claims. Here is my tentative comparison and estimate of what pursuing the hypothesis of fraud will mean for Mormon scripture.
Comparison of Adversarial Authorship Studies with Mormon Scripture Historical and Stylometric Studies
| Fooling Stylometric Measures | Mormon Scripture | Favors/Disfavors Fraud |
|---|---|---|
| 6500 word reference samples | Multiple 10000 word reference samples | Disfavors. Longer reference texts give more information regarding authors' styles. |
| 500 word samples for classification | 10000 word samples for classification | Disfavors. It is presumably harder to hide your style over longer texts. |
| Simple, familiar topics | Complex, unfamiliar topics | Disfavors. Greenstadt and her coauthors assume it is harder to concentrate on obfuscation or imitation while inventing or remembering complex, new material. |
| Written in short times | Written or dictated in short times | |
| 'Dumbed down' to obscure personal style | Less rich vocabulary in Mormon scripture than in Joseph Smith's personal papers. The same is seen for Joanna Southcott's prophetic voice, although to a lesser degree. | Favors. This was the most common technique to obscure a personal style. |
| Distinctive authorial style for imitation | No historically verified texts being imitated | Disfavors. Joseph Smith apparently created distinct and consistent authorial styles for Nephi, the Doctrine and Covenants, Moroni, and Alma, and nearly consistent for Mormon, all without having any known reference authors to copy. (A rigorous stylometric comparison with The Late War could suggest a text from which one or more of the Book of Mormon styles was imitated. This is an obvious test to be attempted by those interested in Book of Mormon authorship.) |
| Closed set of authors | Open set of authors | Favors. If we can't detect fraud within a closed set, how can we hope to in an open set? |
| Adversarial authorship attack known | No direct evidence of fraud | Disfavors. Stylometric methods are demonstrably highly effective at identifying authors when sample sizes are as large as those from Mormon scripture. The only time this is known to be untrue is when authors are deliberately disguising their style or copying another. |
| Machine translation doesn't disguise style | Claimed to be translations | Disfavors. Authors' styles are preserved through multiple machine translations, consistent with Joseph Smith having 'translated' texts by multiple authors. |