If you are interested in the most recent results concerning pirated content you should have a look at this article:
Until now, no in-depth investigation of plagiarism in TechNet Wiki has been done, so some empirical data might be interesting. Even though this is not a long-term study, it gives interesting insights and allows some rules of thumb to be deduced.
Between June 7th and July 5th 2013, a search for plagiarisms was conducted in 7 “search sessions”. This analysis counted an article as a plagiarism only if an overwhelming percentage of its content was stolen. This approach was chosen to avoid discussions about borderline cases or minor mistakes by members who are not acquainted with citation standards.
The search looked at (a) most new articles of the last month, (b) selected googled articles with just one revision, (c) some articles of members who appear in one of the Wiki leaderboards, and (d) some other articles of members who had already published a plagiarism. The analysis was restricted to articles written in English.
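The criterion “overwhelming percentage of stolen content” can be made concrete with a small sketch. The analysis described here was done by hand with search engines; the following Python code is only an illustration of one possible way to estimate how much of an article also appears in a suspected source. The function names, the shingle size of 5 words and the 0.8 cut-off are assumptions made for this example and are not taken from the study.

```python
import re


def word_shingles(text, size=5):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(article, source, size=5):
    """Fraction of the article's shingles that also occur in the source."""
    article_shingles = word_shingles(article, size)
    if not article_shingles:
        # Text shorter than one shingle: nothing to compare.
        return 0.0
    shared = article_shingles & word_shingles(source, size)
    return len(shared) / len(article_shingles)


# Hypothetical cut-off; the study does not state a numeric threshold.
OVERWHELMING = 0.8


def is_overwhelmingly_copied(article, source, threshold=OVERWHELMING):
    """True if at least `threshold` of the article's shingles match the source."""
    return copied_fraction(article, source) >= threshold
```

A value close to 1.0 means that nearly every five-word sequence of the article also occurs in the source. Such a score can only flag candidates; deciding whether an article really is a plagiarism still requires a human look at both texts.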
Let’s have a look at the observations and some first deductions (“rules of thumb”).
Observations:
Observation:
Observation: The table shows, for each anonymized plagiarist (NN 1 – NN 13), the number of plagiarized articles and some core profile data. The analysis did not check all articles of all listed members (NN 1 – NN 13), so for some members the plagiarism count is only a lower bound. These cases are indicated by the “≥” sign.
Observation: The table shows how many sources are used in a plagiarized article:
Explanation of the columns:
Observation: A popular plagiarism algorithm goes like this:
During the analysis, we found some interesting samples and refinements:
Copy a complete article as it is and create exactly one revision.
Example: The article “SharePoint 2013: What is SkyDrive Pro” has already been deleted from TechNet Wiki. However, it was a 1:1 copy, including formatting, of this smart blog post.
Find a promising MS-related buzzword like “Visual Studio 2012”, take the complete article content from a Microsoft source, and link the article prominently – for example on the Wiki: Development Portal:
Example: Visual Studio 2012
Copy a complete article and add “pinged back from <url>”, “Content taken from <url>” or “From: <url>” at the very end of the article or in a posted comment.
Member NN 12 used this technique in at least 22 of 94 articles. (This was only a quick check with Google. Because the overlap with other plagiarized articles already detected was not checked, these cases were not added to my statistics.)
Example: Migrate RADIUS config...
Copy a complete article and use algorithm steps 2.1 to 2.3.
Do not mention the author, do not change a personal intro, make only slight text changes and avoid writing a new text paragraph.
Example: How To View the MAC Address...
Copy as many sources as you can find into your article.
A more structured approach: Use a link list to a topic as a starting point and replace all links by subsets of the referenced articles.
Example: Windows trust migration...
My personal summary:
Plagiarism is a problem which occurs on an order of magnitude similar to spam. All detected cases are serious, because the complete or nearly complete article is plagiarized. In most cases a plagiarism is not a single mistake. It is even possible to identify plagiarism strategies.
These documents handle plagiarism:
“Rights to Provide Customer Data. (…) We do not accept any additional obligations that may apply to Customer Data.”
“Respect content creators. Do not copy content from another author (such as a blog post) unless you have permission. If you do have permission, mention it at the top of the article and include a link to the original source and author.”
(1) “Do not just copy/paste from TechNet/MSDN or other websites, or blogs, or other sources of material that you did not create. If you do this and then save this without any editing, rewriting and improvement by you, you are plagiarizing another's work. If you are working on material that started on a blog or another website, it is important to link to the original material at the top of the article.”
(2) “Copy/pasting and then saving under your name is plagiarism. We frown on that here. Violations of copyright will be deleted.”
Terms of Use and Wiki: Code of Conduct exclude all cases of plagiarism – copying whole articles or parts of them. I.e. zero tolerance for plagiarism.
Anyone who reproduces copyrighted material can be prosecuted. Even altering (“enhancing”) the content makes no difference as long as the content is substantially similar to the original.
You may argue that a plagiarist may add valuable enhancements to a plagiarized article. Please note that there are always (!) alternatives to plagiarism that are nicer, more trustworthy and more respectful towards the original author.
Let’s have a look at some real life scenarios:
You found a fantastic article and want to share it with the TechNet Wiki Community. Don’t copy the article. Instead add a link to an existing Wiki article. Or even better: Add a commented link to an existing Wiki article and explain why it is worth following this link.
You want to give an overview of a broad topic. Instead of creating a mashup article which plagiarizes (subsets of) a lot of articles, you should create a landing page or an article with a commented link list.
You think a non-TechNet Wiki article (a blog post, an MSDN Library article, …) contains a mistake that should be corrected. In case of a blog post you can post a comment. MSDN Library also supports comments and feedback. You can also contact the author.
You think you can enhance an article. Write your own article and link to the original you want to enhance instead of copying it. In some cases you may find that your enhancements do not justify another article. In this case you may decide to post a comment instead or even to discard the idea of a new article. This reduces your own effort and that of your readers.
You write an article about a sophisticated topic and need an introduction which supplies the reader with the necessary background information to follow the rest of your article. Instead of copying original sources you should mention the prerequisites a reader of your article should know and link to background articles.
None of these techniques requires copying other articles. By the way: This may remind you of the DRY principle – don’t repeat yourself.
What about citation? Citation is OK, but keep in mind: Citation is like a spice – it is not the whole meal.
Plagiarism and “enhanced” plagiarized articles can have legal consequences for the plagiarists and Microsoft:
(1) According to the Copyright Act, anyone who reproduces copyrighted material can be prosecuted. Even altering (“enhancing”) the content makes no difference as long as the content is substantially similar to the original.
(2) Imagine we are only adding source references to plagiarisms: In the event of a legal complaint it could be argued that Microsoft can easily detect those plagiarisms and has neglected its duties.
Maybe it is a good idea to contact Microsoft’s legal department (if adding source references to plagiarisms is preferred instead of deletions).
There is no alternative to a zero tolerance policy for plagiarism.
Let’s play an intellectual game: Imagine for a moment that TechNet Wiki contains no plagiarized content (e.g. because we have zero tolerance for plagiarism, community members check new articles, or members are plagiarism-aware).
A community, magazine or organization which accepts and tolerates plagiarism loses its reputation. This behavior also damages the non-plagiarized top quality content.
An article like the sample article can be copied and published within minutes.
Even in parts of the Community Council there is a slight tendency to think that copying/plagiarizing MSDN content is not too bad.
If a plagiarist is smart, he will focus on MSDN content: The content has a high quality. He can create lots of articles on the fly.
If the frequency of articles with source references increases, it creates the wrong impressions: “Plagiarism is OK as long as no one complains.” and “Copying content is an accepted article creation technique”.
If we don't delete plagiarism and add source references instead, we silently accept plagiarism. But even the lax “How to Contribute” article states: “Copy/pasting and then saving under your name is plagiarism. We frown on that here.”
Taken to extremes, this means “Anything goes” and “Plagiarized articles have a right of continuance”.
If we follow this track, we are lost: What’s right, what’s wrong? Where is the threshold for plagiarism?
No one writes a plagiarism without knowing it.
TechNet Wiki was inspired by Wikipedia. Can you imagine that plagiarism is accepted at Wikipedia? Wikipedia goes even further: Even summarized information should have proof and a reference.
Maybe not everyone is acquainted with (scientific) citation rules. However, plagiarism has never been an accepted behavior.
Plagiarism starts affecting the Top Contributor Awards. Examples:
“This week's largest document …looks like it's been copied directly from another source …”
“sayedissahassan in second place, and leading most new articles for the last month, but most seem just copied in, and may have to be removed :/”
By the way: 26 of his 41 (now 53) articles were checked, and they were all plagiarisms.
To make a long story short:
Suggestions:
It is time to act. It is up to you and the Community Council. Accepting plagiarism is not a long-term option.
This section was moved to the article Wiki: Pirated Content Portal.