<?xml version="1.0" encoding="iso-8859-1"?>
<feed version="0.3" xmlns="http://purl.org/atom/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en">
  <title>Gatekeepers of the Web</title>
  <link rel="alternate" type="text/html" href="http://cfp2004.org/blogs/gatekeepers/" />
  <modified>2004-04-21T22:17:14Z</modified>
  <tagline></tagline>
  <id>tag:cfp2004.org,2006:/blogs/gatekeepers//14</id>
  <generator url="http://www.movabletype.org/" version="2.661">Movable Type</generator>
  <copyright>Copyright (c) 2004, abigail</copyright>
  <entry>
    <title>gate keepers</title>
    <link rel="alternate" type="text/html" href="http://cfp2004.org/blogs/gatekeepers/archives/000038.html" />
    <modified>2004-04-21T22:17:14Z</modified>
    <issued>2004-04-21T15:17:14-08:00</issued>
    <id>tag:cfp2004.org,2004:/blogs/gatekeepers//14.38</id>
    <created>2004-04-21T22:17:14Z</created>
    <summary type="text/plain">This panel pitted the fear of discrepancies in search returns against the fear of a uniform result. (Think Hindman&apos;s &apos;All roads lead to Rome&apos; versus Machill&apos;s BBC rule: &apos;Never rely on one source.&apos;) It seems to me the user who...</summary>
    <author>
      <name>abigail</name>
      
      <email>abigail@mercurially.com</email>
    </author>
    
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://cfp2004.org/blogs/gatekeepers/">
      <![CDATA[<p>This panel pitted the fear of discrepancies in search returns against the fear of a uniform result. (Think Hindman's 'All roads lead to Rome' versus Machill's BBC rule: 'Never rely on one source.') It seems to me the user who wants to make some personal headway in correcting both imbalances should resolve to click on random links from search results (instead of the first, second, third, ... in the rankings). In discussing the effect of using one search engine over another, or of relying on multiple search engines instead of just one, we're leaving out the variable of results ranking, which bears equally on these fears.</p>

<p>It makes you wonder if one day there will be a real debate over whether search engines should be regulated. As Ben pointed out, search engines function as gatekeepers. As they have increasing repercussions on the economy, promoting (however inadvertently) certain online businesses while prejudicing others, it seems inevitable that the push for more public control will increase.<br />
</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Getting Google to be more transparent</title>
    <link rel="alternate" type="text/html" href="http://cfp2004.org/blogs/gatekeepers/archives/000036.html" />
    <modified>2004-04-21T22:04:58Z</modified>
    <issued>2004-04-21T15:04:58-08:00</issued>
    <id>tag:cfp2004.org,2004:/blogs/gatekeepers//14.36</id>
    <created>2004-04-21T22:04:58Z</created>
    <summary type="text/plain">Google has to remove some search results from its service because of legal obligations under the DMCA (like section 512). Google, in an attempt to be transparent, adds a few lines to the bottom of search results saying as much....</summary>
    <author>
      <name>joehall</name>
      
      <email>jhall@sims.berkeley.edu</email>
    </author>
    <dc:subject>transparency</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://cfp2004.org/blogs/gatekeepers/">
      <![CDATA[<p>Google has to remove some search results from its service because of legal obligations under the DMCA (like section <a href="http://www4.law.cornell.edu/uscode/17/512.html">512</a>).  Google, in an attempt to be transparent, adds a few lines to the bottom of search results saying as much.  </p>

<p>For example, search for <a href="http://www.google.com/search?q=%22content+on+their+sites.+Whether+it%27s+product+information%22"><code>"content on their sites. Whether it's product information"</code></a>. At the bottom of the page is the following phrase:</p>

<blockquote>
	<p><em>In response to a complaint we received under the <a href="http://www.google.com/dmca.html">Digital Millennium Copyright Act</a>, we have removed 1 result(s) from this page. If you wish, you may <a href="http://chillingeffects.org/dmca512/notice.cgi?NoticeID=1022">read the DMCA complaint</a> for these removed results.</em></p>
</blockquote>

<p>A sharp member of the audience asked, "Why can't this notice be where the removed link would have appeared in the search results instead of the bottom of the page where I don't see it?"  The Google rep (<a href="http://www.cfp2004.org/program/speakers.html#mclaughlina">Andrew McLaughlin</a>) said that sounded reasonable but that he wasn't sure what would have to be done on the engineering side to make this happen.</p>

<p>A not-so-smart member of the audience (<a href="http://pobox.com/~joehall/">me</a>) then said, "Something that wouldn't require any effort on the part of your engineering staff would be to put these notices at the top of the page." Word.</p>

]]>
      
    </content>
  </entry>
  <entry>
    <title>Legitimate Google-bombing?</title>
    <link rel="alternate" type="text/html" href="http://cfp2004.org/blogs/gatekeepers/archives/000034.html" />
    <modified>2004-04-21T21:50:09Z</modified>
    <issued>2004-04-21T14:50:09-08:00</issued>
    <id>tag:cfp2004.org,2004:/blogs/gatekeepers//14.34</id>
    <created>2004-04-21T21:50:09Z</created>
    <summary type="text/plain">So, I just got out of the concurrent session entitled, &quot;Gatekeepers of the Web: The Hidden Power of Search Engine Technology&quot; here at CFP 2004. One thing that was mentioned quite a bit was the process of &quot;Google-bombing&quot; (or &quot;search-engine...</summary>
    <author>
      <name>joehall</name>
      
      <email>jhall@sims.berkeley.edu</email>
    </author>
    
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://cfp2004.org/blogs/gatekeepers/">
      <![CDATA[<p>So, I just got out of the concurrent session entitled, <a href="http://www.cfp2004.org/program/#concurrent3">"Gatekeepers of the Web: The Hidden Power of Search Engine Technology"</a> here at <a href="http://www.cfp2004.org/">CFP 2004</a>. One thing that was mentioned quite a bit was the process of <a href="http://www.wordspy.com/words/Googlebombing.asp">"Google-bombing"</a> (or "search-engine spamming" in a world where search services are competitive).</p>

<p>One question I would have liked to ask is, "Is there such a thing as legitimate google-bombing?" That is, what if I know a page exists, and that it should be the first (or close to first) entry in the search results for a query?  </p>

<p>For example, there is a great band from Olympia, Washington called Veronica Lipgloss and the Evil Eyes.  If you do a Google search for this phrase (with or without quotes) like the following:</p>

<p><a href="http://www.google.com/search?hl=en&amp;q=%22veronica+lipgloss+and+the+evil+eyes%22">http://www.google.com/search?hl=en&amp;q=%22veronica+lipgloss+and+the+evil+eyes%22</a></p>

<p>this band's <a href="http://www.veronicalipglossandtheevileyes.com/">official web site</a> does not appear on the first page of search results (I didn't check other pages).</p>

<p>So, how does someone go about correcting this? Well, I introduce legitimate google-bombing:</p>

<p><a href="http://www.veronicalipglossandtheevileyes.com/">Veronica Lipgloss and the Evil Eyes</a> <br />
<a href="http://www.veronicalipglossandtheevileyes.com/">Veronica Lipgloss and the Evil Eyes</a> <br />
<a href="http://www.veronicalipglossandtheevileyes.com/">Veronica Lipgloss and the Evil Eyes</a> <br />
<a href="http://www.veronicalipglossandtheevileyes.com/">Veronica Lipgloss and the Evil Eyes</a> <br />
<a href="http://www.veronicalipglossandtheevileyes.com/">Veronica Lipgloss and the Evil Eyes</a>  </p>

<h5>And yes, I've already attempted to <a href="http://www.google.com/webmasters/">add this site</a> to the Google crawl.</h5>

<b>UPDATE (2004-04-21 16:38:47):</b> As Josh pointed out in the comments to <a href="http://pobox.com/~joehall/nqb/archives/000241.html">my blog</a>, there is the possibility of google-bombing wars!

<blockquote>
<p>On the topic of socially-engineered Google bombing:  you may have heard about
the recent controversy about searching Google on the word "Jew" -- until
recently, the first result was an anti-semitic site.  (See <a
href="http://www.adl.org/rumors/google_search_rumors.asp">Anti-Defamation
League explanation</a>.)</p>
<p>I first found out about this when a friend emailed me about adding my name to
an online petition to try to force Google into removing the link (a solution I
wasn't thrilled about).  But then soon afterwards a different friend contacted
me about a google-bombing campaign to raise the ranking of the 2nd and 3rd
hits, thus pushing the anti-semitic site down...  Within about two days, the
offending site was pushed down to #2.</p>
<p>Also interestingly enough, Google put an <a
href="http://www.google.com/explanation.html">apology/explanation</a> on the
results page for that query.</p>
</blockquote>]]>
      
    </content>
  </entry>
  <entry>
    <title>Gatekeepers of the Net: Search Engines and Regulatory Implications</title>
    <link rel="alternate" type="text/html" href="http://cfp2004.org/blogs/gatekeepers/archives/000032.html" />
    <modified>2004-04-21T21:27:38Z</modified>
    <issued>2004-04-21T14:27:38-08:00</issued>
    <id>tag:cfp2004.org,2004:/blogs/gatekeepers//14.32</id>
    <created>2004-04-21T21:27:38Z</created>
    <summary type="text/plain">&quot;Gatekeepers of the Web&quot; is, unofficially, the &quot;Berkman Center&quot; panel here @ CFP: it includes Andrew McLaughlin, a long-time Berkman fellow, who is here representing Google, and Ben Edelman, a former fellow and affiliate @ Berkman, who&apos;ll discuss his filtering...</summary>
    <author>
      <name>Donna Wentworth</name>
      <url>http://www.corante.com/copyfight</url>
      
    </author>
    
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://cfp2004.org/blogs/gatekeepers/">
      <![CDATA[<p>"Gatekeepers of the Web" is, unofficially, the "<a href="http://cyber.law.harvard.edu">Berkman Center</a>" panel here @ CFP: it includes Andrew McLaughlin, a long-time Berkman fellow, who is here representing Google, and Ben Edelman, a former fellow and affiliate @ Berkman, who'll discuss his filtering research. </p>

<p>Others on the panel: German researcher Dr. Marcel Machill, who proposed the session, and Matthew Hindman of Harvard's Kennedy School.</p>

<p>Dr. Machill begins with a description of a study he has conducted on search engines and attitudes toward regulating them:</p>

<p>Machill: We used surveys, explored market questions, and did performance testing; we also asked people about attitudes toward regulating search engines.  </p>

<p>So: Over 90 percent of German people regularly use search engines.  69 percent of Germans used Google -- so we see predominance of one search engine.  </p>

<p>What we found is cooperation behind the scenes among search engines.  Google operates with other search engines.  </p>

<p>Many people think search engines are neutral and objective.  We try to show with this study that this is a myth.  Results can be manipulated.  </p>

<p>The term "spamming" is also applied to manipulating websites to achieve better ranking.  What are the methods?  "Google bombing" -- building up hundreds/thousands of websites that contain one link.  Can also use words inappropriately.  Can use "invisible text": robot reads it, increases rank.  [...]</p>

<p>Opinion poll: people use search engines for different reasons.  Biggest reason: works better/easier to use.  Majority of those who don't use Google use their search engine because...they've always used it.</p>

<p>Only 11 percent of the population uses a second search engine.  Fewer use a third. </p>

<p>People know very little about search engines.  Comparable to the situation in the '60s with TV -- a mysterious black box.  We had to learn to be critical.  </p>

<p>We asked about regulation: one third said no one should regulate.  </p>

<p>We did a lab experiment: examined "partner" links -- paid content v. normal content.  Google is transparent about this; others are not.</p>

<p>(Shows how search can go terribly wrong -- eBay hijacks your search, porn sites do typo-squatting, etc.; shows the number one result @ Google for NSDAP, which is an anti-Semitic site.  Wonders whether this should be changed.)</p>

<p>Main challenges: Google monopoly, gatekeeper function of search engines, paid links, etc.  These are classic questions of media policy/concentration, etc.  Classical questions in a new light.  </p>

<p>Moderator:  What Marcel is saying is that "neutrality" may lead to problems; Ben's presentation will explore the problems with "activist" search engines.</p>

<p>Ben Edelman: </p>

<p>Three points -- country-specific omissions; attempts to take porn out of search engine, Google does best job, but error-filled; what's different about search engines v. other businesses.</p>

<p>Country-specific: "Stormfront" in U.S. Google, it's there; "stormfront" in German Google, it's gone.  Most people don't know to compare. (Provides lists of omissions.)</p>

<p>Google not entirely clear about how it does this "filtering" -- can't seem to find out in an official way. </p>

<p>How serious is this problem?  Not terribly, in comparison to the problems w/filtering porn.</p>

<p>Google doesn't claim perfection -- and there are indeed some problems.  What is the definition of adult content?  Google waffles, but seems to over-exclude (with "Safesearch").  Search for "Library of Congress" using Safesearch--it's missing! Why?  Could be any number of reasons.</p>

<p>Also can't get "Northeastern University" if you have "Safesearch" on.  Basically, this thing doesn't work.</p>

<p>This must be a hard problem to fix, or Google would have got it right already.</p>

<p>Thoughts on transparency: the "black box" (secret sauce) problem.  To some extent, this is a business secret.  I understand this.  On the other hand there are good reasons to try to fix things.</p>

<p>[...missed a bit...]</p>

<p>Matt Hindman: </p>

<p>Because of link/traffic patterns, we may be facing a situation in which all search engines are returning the same results.  </p>

<p>How did we get here?  How did we enter the age of Google?  Mr. Page had the bright idea that links contain a lot of intelligence.  In all of this, paying attention to links was the critical shift.  </p>

<p>Structure of the Web -- inbound and outbound links -- "power law."  Traffic patterns are power-law distributed as well.  Eternal myth of openness prevents people from recognizing power law.  My work finds these power laws in politics, etc.  What we have is a fractally organized Web that is dominated by the power law.</p>

<p>All roads lead to Rome; all search engines return same results.  </p>

<p>I've been talking here almost exclusively about links.  [...] What I would submit is that the problem is not Google -- it's the Web itself.  Power-law structure is the real problem.  </p>

<p>Andrew M.: Obviously, this stuff matters.  It matters what gets excluded.  Two points: 1) steps for increased transparency (not fully baked, but we're working on it), 2) "Safesearch" stuff Ben raised -- there is an answer.</p>

<p>Transparency: One of the things that Google does is turn our C&Ds over to <a href="http://www.chillingeffects.org">Chilling Effects</a>.  We put a note and people can see the take-down requests.  This is a great model.  We'll expand this to all legal orders.  Google.de exclusions would be covered by this.  Say we get an order from a court -- we'll submit the document to Chilling Effects.  "Something used to be here."  Before too long, we'll have something stable for Google to publish.  </p>

<p>Another thing: pages are sometimes pulled because of things like malicious script.  Talking to engineers.  This may take longer.  But the goal is transparency.</p>

<p>Okay, so back to "Safesearch" -- it's intended to be conservative.  If we haven't crawled the page, we don't label it.  So we don't move it into the "green zone." So Ben pointed out Thomas.  We haven't crawled the page and don't have a cached copy -- we can't tell whether it's safe, so we leave it out.  Also, some have asked not to be indexed; we don't index them.  If the page says, "Don't crawl this," we don't.</p>

<p>Our to-do is to describe this in an FAQ.  No reason why not.  </p>

<p>Ben: Andrew, I think the lag-time is a little silly.</p>

<p>Andrew: Okay.</p>

<p>Audience Q & A:</p>

<p>[...]</p>

<p>Q: It wouldn't take much engineering skill to put the "Something was here" notice at the top.</p>

<p>Andrew: It's a valid point.</p>

<p>Q: What's the problem w/convergence?</p>

<p>Matt: [...] We need to fight the myth of Internet openness.  Eyeballs are actually more concentrated on the Web than in other media.  </p>

<p>Q: Hasn't research shown that there is a relatively fluid power law?  </p>

<p>Matt: No magic number you need to jump over. Important to bear in mind: well, everyone wants to be a pop star.  There's "no real barrier" to becoming a star...but if you're a star on the Net, you're going to stay a star. In weblogs, for example -- top is very stable.</p>

<p>Machill: What about my question: Should we do "human editing" of results, or not? </p>

<p>Andrew: Should we? Straw poll?</p>

<p>Audience mainly votes for "neutral" approach.</p>

<p>Andrew: Impossible to human-edit -- which to move up or down.  Instead, let people know that they need to change their concept that the number one result is the most authoritative -- search engine no substitute for the human facility.  </p>

]]>
      
    </content>
  </entry>

</feed>