November 29, 2017

Public Comments to the Federal Communications Commission About Net Neutrality Contain Many Inaccuracies and Duplicates

Fully 57% of comments used temporary or duplicate email addresses, and seven popular comments accounted for 38% of all submissions

Correction: This report initially noted that 450,000 comments were submitted to the FCC during its previous open comment period on net neutrality. That data point was based only on the initial comment period, spanning Feb. 9-July 18, 2014. The FCC subsequently reopened the comment period through Sept. 15, 2014, and the report now reflects the total number of comments received during the entirety of the 2014 public comment period. In addition, a reference to John Oliver in a sentence referring to the most popular pro-net-neutrality comment has been removed. Pew Research Center has issued a statement regarding concerns raised about this analysis.

For the second time in less than four years, the U.S. Federal Communications Commission (FCC) is considering regulations regarding net neutrality – the principle that internet service providers must treat all data the same, regardless of the origin or purpose of that data. Opponents of net neutrality regulations argue that ISPs should have the right to prioritize traffic and charge for their services as they wish. Meanwhile, supporters of net neutrality suggest that so-called fast lanes are anti-competitive and would prevent start-ups and smaller companies from competing with more well-established companies that can afford to pay for prioritized web traffic.

From April 27 to Aug. 30, 2017, the FCC allowed members of the public to formally submit comments on the subject. In total, 21.7 million comments were submitted electronically and posted online for review. This figure dwarfs the number received during the initial comment period when the FCC last accepted comments on this topic in 2014, as well as the nearly four million total submissions received during the entirety of the comment process that year.1 Net neutrality regulations underpin the digital lives of many Americans, yet it is challenging to survey the public on such an inherently complex and technical subject. For this reason, Pew Research Center set out to analyze the opinions of those who had taken the time to submit their thoughts to the FCC.

However, the Center’s analysis of these submissions finds that the comments present challenges to anyone hoping to understand the attitudes of the concerned public regarding net neutrality. It also highlights the ways in which individuals and groups are using modern digital tools to engage in the long-standing practice of speaking out in order to influence government policy decisions. Among the most notable findings:

  • Many submissions seemed to include false or misleading personal information. Some 57% of the comments utilized either duplicate email addresses or temporary email addresses created with the intention of being used for a short period of time and then discarded. In addition, many individual names appeared thousands of times in the submissions. As a result, it is often difficult to determine if any given comment came from a specific citizen or from an unknown person (or entity) submitting multiple comments using unverified names and email addresses.
  • There is clear evidence of organized campaigns to flood the comments with repeated messages. Of the 21.7 million comments posted, 6% were unique. The other 94% were submitted multiple times – in some cases, hundreds of thousands of times. In fact, the seven most-submitted comments (six of which argued against net neutrality regulations) comprise 38% of all the submissions over the four-month comment period.
  • Often, thousands of comments were submitted at precisely the same moment. On nine different occasions, more than 75,000 comments were submitted at the very same second – often including identical or highly similar comments. Three of these nine instances featured variations of a popular pro-net-neutrality message, while the others promoted several different anti-net-neutrality statements.

The Center conducted its analysis by downloading all the comments from the FCC’s publicly available API. All data and comments used in this report are stored on the FCC’s site and are freely available to the public. Researchers then used various data analysis techniques to summarize the comments and to look for duplicates or invalid information. Most notably, the Center utilized a measure of textual similarity to determine the share of highly similar comments that were submitted multiple times.2 Full details of the contents of this dataset and the techniques used in this analysis can be found in the methodology at the end of this report.
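As a rough illustration of what a textual-similarity measure does, the sketch below computes cosine similarity over simple word counts in Python. It is not the Center's code: the word-count features, the tokenizing pattern and the function names are all assumptions made for this example, and only the .95 cutoff comes from the report.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Lowercase the text and count its word tokens (an assumed feature choice)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the two word-count vectors, from 0 to 1."""
    va, vb = word_counts(a), word_counts(b)
    dot = sum(count * vb[token] for token, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty comment is treated as matching nothing
    return dot / (norm_a * norm_b)


def same_comment(a: str, b: str, threshold: float = 0.95) -> bool:
    """Apply the report's grouping rule: a similarity of at least .95."""
    return cosine_similarity(a, b) >= threshold
```

Under these assumptions, two comments that differ only in capitalization or punctuation score close to 1 and would be grouped together, while comments that share no words score 0.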

Many submissions contained false or misleading personal information

Collecting large-scale data from the public is always challenging. It is difficult to ensure that a person online is indeed who they claim to be, and falsification of someone’s personal information can be accomplished with relatively minimal effort. The Center’s analysis finds evidence that many people did not reveal their true identities when submitting comments to the FCC. Some of these instances may have been accidental, but in many cases patterns in the comments indicate those submitting the comments intentionally entered false or misleading personal information.

Many common names – as well as other words – appeared thousands of times as “authors” of comments

The most common “name” included as an author was not, in fact, a name. In nearly 17,000 instances, the name of the commenter filing their views on the FCC site was written as “Net Neutrality” (this term also appeared as the author of more than 5,000 comments in lower-case form). “The Internet” also appeared as the name in almost 7,500 submissions. Of the top 15 names that appeared in the FCC submissions, eight included the common last names of “Smith” or “Johnson,” and four were not names at all.

These submissions often featured email addresses that were nonfunctional, frequently repeated, or disposable

In theory, the process for submitting a comment to the FCC included a validation technique to ensure the email address submitted with each comment came from a legitimate account. The submission form clearly stated that all information submitted, including names and addresses, would be publicly available via the FCC site.

However, the Center’s analysis shows that the FCC site does not appear to have utilized this email verification process on a consistent basis. According to this analysis of the data from the FCC, only 3% of the comments definitively went through this validation process. In the vast majority of cases, it is unclear whether any attempt was made to validate the email address provided.

As a result, in many cases commenters were able to use generic or bogus email addresses and still have their comments accepted by the FCC and posted online. For instance, the email address example@example.com appeared in 7,513 comments, more than any other address. The email address john_oliver@yahoo.com (television host John Oliver advocated on his show for net neutrality earlier this year) was also used in 1,002 comments. All told, the Center’s analysis identified 1.4 million email addresses that appeared multiple times in the comments.

Additionally, in 9,190 cases the email address supplied did not contain the “@” character necessary to serve as a functioning email account. Moreover, 10% of the comments submitted did not include an email address at all.

Along with using duplicate or potentially fraudulent addresses, the Center’s analysis finds more than 8 million submissions included email addresses from temporary email accounts designed to disappear within hours and leave no trace of email exchanges behind.3 Taken together, some 57% of the comments submitted to the FCC utilized either a temporary email address or an email address that was also included with at least one other comment.
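The combined share described above can be approximated with a short Python sketch. It assumes the emails are available as a plain list of strings, one per comment, and it lists only two of the temporary domains named in the report's footnote. The function names and the handling of blank emails are illustrative assumptions, not the Center's implementation.

```python
from collections import Counter

# Illustrative subset only: two of the domains named in the report's footnote.
TEMPORARY_DOMAINS = {"gustr.com", "armyspy.com"}


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so trivially different copies match."""
    return email.strip().lower()


def is_temporary(email: str) -> bool:
    """True when the domain after the last '@' is on the temporary-provider list."""
    _, sep, domain = normalize_email(email).rpartition("@")
    return bool(sep) and domain in TEMPORARY_DOMAINS


def share_temporary_or_duplicate(emails: list[str]) -> float:
    """Fraction of comments whose email is temporary or appears more than once.

    Blank emails are never counted as duplicates of each other (an assumption).
    """
    cleaned = [normalize_email(e) for e in emails]
    counts = Counter(e for e in cleaned if e)
    flagged = sum(1 for e in cleaned if e and (is_temporary(e) or counts[e] > 1))
    return flagged / len(emails) if emails else 0.0
```

Matching on the domain alone mirrors the limitation the report notes: providers that generate random domain names would not be caught by a fixed list like this one.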

Many submissions highlight organized efforts to influence the commentary period

The Center’s analysis of these data suggests the net neutrality comment period was marked by several organized efforts aimed at conveying the public’s feelings on this subject.

Some 6% of the comments posted were unique submissions. Six of the seven most-common submissions in the remaining 94% argued against net neutrality and can be traced back to websites of a handful of organizations

This analysis finds that 6% of the 21.7 million comments were submitted a single time. The remaining 94% were each submitted multiple times, in some cases hundreds of thousands of times. In fact, five comments were submitted more than 800,000 times each. Taken together, the seven most-submitted comments alone account for more than 8 million submissions, representing 38% of the total over the entirety of the comment period.
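The two shares in this paragraph (comments submitted once, and the portion taken up by the most-submitted texts) can be sketched in Python as follows. This simplified version treats comments as identical only when their text matches exactly after lowercasing and collapsing whitespace, which is an assumption made for brevity; the report itself groups comments by textual similarity. The function and key names are hypothetical.

```python
from collections import Counter


def normalize_text(text: str) -> str:
    """Collapse whitespace and case so trivially different copies match."""
    return " ".join(text.lower().split())


def duplication_summary(comments: list[str], top_n: int = 7) -> dict[str, float]:
    """Share of comments submitted once, and share held by the top_n texts."""
    counts = Counter(normalize_text(c) for c in comments)
    total = len(comments)
    if total == 0:
        return {"unique_share": 0.0, "top_share": 0.0}
    submitted_once = sum(1 for n in counts.values() if n == 1)
    top_total = sum(n for _, n in counts.most_common(top_n))
    return {"unique_share": submitted_once / total, "top_share": top_total / total}
```

Because exact matching misses near-duplicates with small wording changes, a sketch like this would tend to overstate the unique share relative to a similarity-based grouping.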

The single comment submitted more times than any other was a pro-net-neutrality statement that appeared 2.8 million times, accounting for 13% of all submissions. At the same time, seven of the top 10 comments argued against net neutrality and encouraged the FCC to roll back Title II regulations.4 The seven most-popular anti-net-neutrality posts made up 27% of all the comments submitted, while the three most-popular comments in favor of net neutrality made up 17% of the total submitted.

Whether they argued for or against net neutrality, the text of many of the top comments can be traced back to a small number of organizations. For example, the single most-popular comment was a pro-net-neutrality statement that appeared as a submission form on the website battleforthenet.com. Similarly, the wording for three popular comments opposing net neutrality (representing the second-, sixth- and ninth-most submitted overall) appeared on the website for an organization known as the Taxpayers Protection Alliance. Combined, the text from these three suggested comments appeared in almost 2.4 million submissions, making up 11% of the total.

In many instances, thousands of comments were submitted simultaneously – down to the second

Other research has suggested that some share of the FCC comments may have been submitted in bulk using automated processes, such as organized bot campaigns. The Center’s analysis finds support for this argument, based on the fact that many comments were submitted at precisely the same instant. The FCC assigned a precise timestamp to each comment as it was submitted, and an analysis of those timestamps shows that on numerous occasions, thousands of posts were submitted at exactly the same time – a sign that these submissions were likely automated.

On more than 100 different occasions, 25,000 or more comments were submitted to the FCC at precisely the same second. And on nine different occasions, 75,000 messages or more were posted simultaneously. The three most numerous of these nine moments featured variations of the most popular pro-net-neutrality message. The remaining six included several different anti-net-neutrality statements.
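Counting submissions per second is a simple grouping exercise. The Python sketch below assumes each comment's timestamp is available as an ISO 8601 string (for example "2017-07-19T14:57:15"); the actual format of the FCC data may differ, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def submissions_per_second(timestamps: list[str]) -> Counter:
    """Count comments per one-second bucket from ISO 8601 timestamp strings."""
    buckets = Counter()
    for raw in timestamps:
        # Drop any fractional seconds so each comment lands in a whole second.
        moment = datetime.fromisoformat(raw).replace(microsecond=0)
        buckets[moment] += 1
    return buckets


def burst_seconds(timestamps: list[str], threshold: int) -> list[tuple[datetime, int]]:
    """Seconds that received at least `threshold` comments, largest first."""
    counts = submissions_per_second(timestamps)
    return [(t, n) for t, n in counts.most_common() if n >= threshold]
```

With a threshold of 25,000 or 75,000, the length of the returned list would correspond to the kinds of occasion counts reported above, assuming the timestamps are recorded at one-second resolution or finer.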

In the most prolific example, 475,482 comments were submitted on July 19 at precisely 2:57:15 p.m. EDT. Almost all of those comments were pro-net neutrality and offered variations of text that appeared on the site battleforthenet.com. In some cases, the only difference was the name of the submitter: the same text was “signed” 286 times by “Andrew,” 265 times by “Michael” and 235 times by “Ryan,” among other names.

A deeper analysis of these simultaneous comments highlights several variations in how they were submitted. In some cases, the comments were highly similar but with minor variations. The 86,237 comments submitted at precisely 7:18:04 p.m. on May 24 offer an example of this approach. No two were exactly the same, but all featured consistent patterns. Most began with variations of a similar theme, such as: “Dear [FCC Chairman] Mr. Pai, I am a voter worried about regulations on the Internet,” “Dear Chairman Pai, I am a voter worried about Title 2 and net neutrality,” or “Dear Commissioners: I’m concerned about Internet regulation and net neutrality.”

The body of these comments also featured similar phrases. One post charged, “Obama’s policy to take over the web is a betrayal of net neutrality. It reversed a free-market policy that functioned supremely well for decades with both parties’ backing.” Another stated, “The previous administration’s policy to control the Internet is a betrayal of the open Internet. It disrupted a free-market system that functioned fabulously smoothly for decades with bipartisan approval.”

In other cases, the content of these simultaneous submissions was entirely identical. On May 28 at exactly 8:23:51 p.m. EDT, the FCC received 90,458 comments with this exact message: “Title II is a Depression-era regulatory framework designed for a telephone monopoly that no longer exists. It was wrong to apply it to the Internet and the FCC should repeal it and go back to the free-market approach that worked so well.” Indeed, this example was not an isolated incident. The Center identified at least five separate occasions when the exact same text was submitted more than 24,000 times at precisely the same moment.

Off-topic comments

Some comments submitted to the FCC had nothing to do with net neutrality and appeared to be attempts by users to further complicate the data collection:
• At least 34 comments included references to Bee Movie, some of which contained portions of the movie’s script.
• Fully 108 comments had more non-alphanumeric characters – such as equal signs (=) or ampersands (&) – than alphanumeric characters.
• Others consisted entirely of short messages without a clear meaning, such as: “get a hobby,” “Democracy,” “cat videos,” “google it,” “SAD!” and “!!!!!!!!!!!!!!!!!!!!!!!!!”
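The character-count test in the second item above could be expressed in Python roughly as follows. Whether whitespace counts as a non-alphanumeric character is not stated in the report; this sketch ignores whitespace entirely, which is an assumption, and the function name is hypothetical.

```python
def mostly_symbols(comment: str) -> bool:
    """True when non-alphanumeric characters outnumber alphanumeric ones.

    Whitespace is ignored on both sides of the comparison (an assumption).
    """
    visible = [ch for ch in comment if not ch.isspace()]
    alnum = sum(1 for ch in visible if ch.isalnum())
    return len(visible) - alnum > alnum
```

Under this rule a string of exclamation points is flagged, while a short phrase such as “get a hobby” is not.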

Of course, the fact that many comments were submitted at precisely the same time does not mean the organization or webpage where the text first appeared was responsible for automating or standardizing those submissions. It is possible a third party used the text and submitted these comments on its own. Nor is there anything inherently wrong or sinister about bulk filing of comments. This analysis simply highlights the scale at which digital tools are being brought to bear in the long-standing practice of commenting on proposed government rules.

The comment period was marked by bursts of intense activity and long stretches with few submissions

During the four-month period in which the FCC accepted comments on net neutrality, an average of 172,246 posts were submitted per day. But the comment period featured several long stretches with few submissions, punctuated by bursts of intense activity.
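The daily average can be checked against the dates and total given earlier in the report. The Python sketch below assumes both the opening and closing dates count as full days and uses the rounded total of 21.7 million, so it only approximates the reported figure.

```python
from datetime import date

first_day = date(2017, 4, 27)
last_day = date(2017, 8, 30)

# Count both endpoints, since comments were accepted on each of them.
days_open = (last_day - first_day).days + 1

# 21.7 million is the report's rounded total, so the result is approximate.
approx_total = 21_700_000
approx_daily_average = approx_total / days_open
```

This gives 126 days and roughly 172,222 comments per day. Working in the other direction, the reported average of 172,246 multiplied by 126 days gives 21,702,996, which is consistent with the rounded 21.7 million total.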

The comment period officially opened on April 27, and only 453 comments were submitted on that day. On Sunday, May 7, two major events occurred that coincided with a significant increase in submissions. That evening, comedian John Oliver broadcast a nearly 20-minute segment on his HBO show Last Week Tonight defending net neutrality and encouraging his viewers to submit comments supporting his position. When the FCC last considered net neutrality, in 2014, a Pew Research Center analysis showed that John Oliver’s program also led to a spike in the number of comments submitted.

Also on May 7, the FCC issued a news release stating that a distributed denial-of-service (DDoS) attack had occurred against the electronic filing system. Some critics have questioned whether an actual DDoS attack occurred, noting that the FCC did not provide documentation regarding the attack following a Freedom of Information Act request by the website Gizmodo. And two Democratic members of the House Energy and Commerce Committee have since requested an investigation into the matter.

More than 2.1 million comments were submitted in the five days following those two events (May 8-12). Those comments made up 10% of all the comments submitted during the entire submission period.

In response to this surge of submissions, the FCC released a public notice on May 11 that announced a “sunshine period” for the week spanning May 12-18 in which the FCC would temporarily stop taking public comments due to the large number of submissions. According to the FCC’s statement:

“This means that during this brief period of time, members of the public cannot make presentations to FCC employees who are working on the matter, and are likely to be involved in making a decision on it, if the underlying content of the communication concerns the outcome of the proceeding … The Commission adopted these rules to provide FCC decision-makers with a period of repose during which they can reflect on the upcoming items.”

Although the FCC claimed it would not accept comments during this period, the Center’s analysis finds that more than 93,000 posts submitted on those days were included in the final database made available for public review.

The rate of comments slowed significantly over the next few weeks. From May 30 to July 8, the number of comments declined to an average of only 5,832 posts per day. In mid-July, activity increased dramatically and remained relatively high until the originally scheduled end of the comment period.

The single day with the most submissions was July 12. Online activists dubbed the day “Net Neutrality Day of Action” or “Day of Action to Save Net Neutrality,” and numerous sites altered their pages to include statements favoring net neutrality. On that day alone, 1.4 million comments were submitted electronically to the FCC.

  1. The 2014 Pew Research Center study was released after the initial comment period was closed and the initial data had been made available to the public. The FCC subsequently extended the comment period. Additionally, the FCC allows for submissions by phone or letter, but those comments are not publicly accessible and are excluded from this report.
  2. The analysis used is known as cosine similarity, which represents each document as a vector of text features (such as character or word counts) and measures how closely two such vectors align. Throughout this report (unless explicitly noted) comments with a cosine similarity of at least 0.95 on a scale from 0 to 1 are grouped together and considered the same.
  3. The Center identified disposable email addresses by matching their domains to a list of known providers of temporary or disposable email accounts. The specific domains include @pornhub.com, as well as the 10 domains provided as an option by the site FakeMailGenerator.com (which include @gustr.com and @armyspy.com, among others). Other sites, such as 10minutemail.com, also offer disposable accounts. But these sites utilize random domain names that cannot be tracked and as a result there was no way to identify these accounts for the purposes of this analysis.
  4. The shorthand “Title II” is often used to refer to how the FCC implemented net neutrality regulations in 2015. The FCC reclassified internet providers as common carriers under Title II of the Communications Act of 1934. This decision allowed the FCC to implement net neutrality regulations.