  • Rights holders got Google to remove 6 billion links from Search over 10 years


    Karlston

    • 246 views
    • 7 minutes

    Experts say policymakers mostly ignore Google's transparency reports.

     

    Over the past decade, Google has consistently documented its efforts to remove links from its search results to content that the tech giant considers pirated, and recently, the total number of Google takedowns since its reporting began has shot past 6 billion. It's a milestone that Torrent Freak suggested shows that, "[w]hile copyright infringement can't be eradicated entirely, Google is slowly but steadily presenting itself as a willing partner in the anti-piracy fight."

     

    Google's slow evolution into an anti-piracy champion began ramping up in 1998. That's when the Federal Communications Commission granted safe harbor to online service providers like Google, protecting them from copyright infringement claims about third-party content, on the condition that the providers disclose information on any users alleged to be infringers. Roughly a decade later, in 2009, it seemed that Google wasn't doing enough, and the FCC intervened again, responding to news publishers who were lashing out at Google and others. At that time, the publishers accused service providers of profiting off ad placements next to links from aggregators and scrapers, which were accused of grabbing and republishing news content without permission.

     

    Back then, Google promised to address the issue by making it easier for rights holders to flag infringing content in search results. It launched its first transparency report in 2010, but that initial report shared information only on government requests for takedowns. Two years later, Google expanded its report, publicly counting every takedown notice that it received and "providing information about who sends us copyright removal notices, how often, on behalf of which copyright owners and for which websites."

     

    More recently, Google went one step further by creating a preemptive blocklist in 2018. That move stopped copyright-infringing URLs from ever being indexed in search results, and those links are included in the total of 6 billion delisted URLs that Google documents today.

     

    According to Torrent Freak, 326,575 copyright holders identified 4,041,845 separate domain names, adding up to 6 billion takedowns since 2012. But not all reports were valid. Torrent Freak itself was counted among "false positive" reported domains, along with "websites of the White House, the FBI, Disney, Netflix, and the New York Times."

     

    In a 2012 blog post, Fred von Lohmann, Google's senior copyright counsel, wrote that the company's efforts to be more transparent about takedowns, which have now run for a decade, were intended to help inform policy choices as the Internet evolves.

     

    "As policymakers and Internet users around the world consider the pros and cons of different proposals to address the problem of online copyright infringement, we hope this data will contribute to the discussion," von Lohmann wrote.

     

    Google did not immediately respond to Ars' request for comment on policy impacts of its transparency reports.

     

    Google's partner in tracking all of its takedown notices for transparency purposes is Lumen, whose project manager, Adam Holland, told Ars that Google submits more data than any other company that Lumen partners with, such as Twitter, Wikipedia, or Reddit. Holland said that the majority of requests for Google data come from academics interested in analyzing long-term trends and, increasingly, from media and nongovernmental organizations, but rarely from policymakers.

     

    "We don't actually get a lot of interest directly from lawmakers," Holland told Ars. "Personally, that disappoints me, but that's the reality."

     

    Recently, however, Holland said, Lumen has begun advising European Union policymakers as they work to implement the new transparency requirements for online service providers included in the EU's newly passed Digital Services Act. As a neutral data resource, though, Lumen doesn't make influencing policy its primary goal. Holland told Ars the only stance that Lumen takes on the issue is remaining firmly against invisible takedowns by online service providers, because "[o]ur unofficial motto is good policy requires good data." And because of its global reach, Google remains the key supplier of Lumen data.

     

    Harvard Law School copyright expert Rebecca Tushnet told Ars that she thinks the key benefit of Google's report is showing lawmakers how difficult it is for service providers to classify content. Tushnet made that point when advising the US Senate Committee on the Judiciary Subcommittee on Intellectual Property in 2020, and more recently she told Ars that reports like Google's prove that for every path created to report infringing content, scammers will find creative ways to exploit it by "doing their very best to mimic people who have valid claims."

    "Unfortunately, I'm not sure [Google's transparency reporting] has influenced policy," Tushnet told Ars. "The best use of transparency reports, I suppose, is to make clear how difficult this is and how even if you're right 99.9 percent of the time [taking down infringing content], you're going to be wrong a lot."
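
    Tushnet's point is easy to check with rough arithmetic. The sketch below applies her hypothetical 99.9 percent accuracy figure to the roughly 6 billion takedowns cited above. Pairing those two numbers is an assumption made purely for illustration, and the function name is invented here; nothing in Google's report states an actual error rate.

```python
# Illustrative arithmetic only. The 99.9 percent figure is Tushnet's
# hypothetical, not a measured accuracy rate for Google.
TOTAL_TAKEDOWNS = 6_000_000_000  # approximate total cited in the article


def expected_errors(total: int, correct_per_thousand: int) -> int:
    """Return how many decisions would be wrong at the given accuracy.

    Accuracy is expressed as correct decisions per thousand (999 means
    99.9 percent) so the calculation stays in exact integer arithmetic.
    """
    return total * (1000 - correct_per_thousand) // 1000


wrong = expected_errors(TOTAL_TAKEDOWNS, 999)
print(f"{wrong:,} wrong decisions")  # 6,000,000 wrong decisions
```

    Under those assumptions, being right 99.9 percent of the time would still leave about 6 million mistaken decisions.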

    How Google decides a complaint is valid

    When Google receives a takedown notice, it says in its report that "our teams carefully review it for completeness and check for other problems. If the notice is complete and we find no other issues, we delist the URL from Search results."

     

    Torrent Freak reported that "the majority of these requests were indeed removed or put on a preemptive blacklist," and in its report, Google provides a handful of examples of what constitutes a valid or invalid takedown request. Invalid requests, von Lohmann said in 2012, can sometimes be erroneous or abusive.

     

    In one example, Google notes that it doesn't delist URLs from Google Search if a business, celebrity, religious organization, or politician files a claim in an attempt to remove articles criticizing their work or views. Google also won't delist a URL if it discovers that someone back-dated a post to claim rights, as happened in a recent invalid report when "an individual claiming to represent a news site filed a copyright complaint against a second, reputable news site for use of their article." Additionally, scammers have attempted and failed to fool Google by impersonating rights holders. For each example, Google explains how it reached its decision to delist or not.
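
    The rule Google states in its report (delist when a notice is complete and review finds no other issues) can be sketched in a few lines. This is only a reading of that one sentence, not Google's actual process: the `TakedownNotice` record, its field names, and the issue labels are all hypothetical, and the real review involves human judgment that code like this cannot capture.

```python
from dataclasses import dataclass, field


@dataclass
class TakedownNotice:
    """Hypothetical notice record; these field names are not Google's."""
    url: str
    complete: bool  # whether all required information was supplied
    issues: list[str] = field(default_factory=list)  # problems found in review


def review(notice: TakedownNotice) -> str:
    """Delist only when the notice is complete and review found no issues."""
    if notice.complete and not notice.issues:
        return "delist"
    return "keep"


# The issue label below mirrors one of the examples in Google's report.
backdated = TakedownNotice("https://example.com/article", True, ["back-dated post"])
clean = TakedownNotice("https://example.com/copy", True)
print(review(backdated))  # keep
print(review(clean))      # delist
```

    Under this sketch, any single problem found in review, such as a back-dated post or an impersonated rights holder, is enough to keep the URL in search results.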

     

    Tushnet told Ars these are some of the well-known strategies of scammers and that Google will have to remain flexible as scammer trends shift. It's critical that regulators recognize that Google needs that flexibility, she said, and that any new laws should acknowledge that smaller online service providers do not have the same resources as Google to respond to all infringing content.

     

    Tushnet recommends that lawmakers not focus on finding a one-size-fits-all solution to end copyright infringement, which Torrent Freak reported would be impossible. Instead, she thinks lawmakers should consider reports like Google's and decide, "how much error are we willing to tolerate? And what are our best mechanisms for reducing error?"

     

    Google has said that over the last 10 years it has gotten quicker at removing infringing content. However, even as it gets faster and smarter with time, it seems that the tech giant will be forever stuck seeking ways to catch it all. Tushnet said that "Google is in an inherently reactive position here," because "every tool of protection is also a tool of abuse."

     

    Holland said that improving how online service providers like Google remove infringing content isn't necessarily about being fast, though, because acting too hastily risks invalid takedowns. Instead, Lumen helps different stakeholders find solutions for effectively blocking infringing content according to their institutions' unique goals.

     

    "I think everybody's in favor of improvement," Holland said. "We just don't know what improvement means." However, "transparency is going to need to continue to be a critical part of" improvement, because "if we don't have the ability to evaluate what these companies are doing, and have done, then we don't know whether it's working or not."

     

     
