PurpleFilm

The Information Within

Archive for the ‘Information Retrieval’ Category

Information retrieval (IR) is the science of searching for information in documents, searching for documents themselves, searching for metadata which describe documents, or searching within databases, whether relational stand-alone databases or hypertextually-networked databases such as the World Wide Web. There is a common confusion, however, between data retrieval, document retrieval, information retrieval, and text retrieval, and each of these has its own body of literature, theory, praxis and technologies. IR is, like most nascent fields, interdisciplinary, based on computer science, mathematics, library science, information science, cognitive psychology, linguistics, statistics and physics.

Behind the scenes with universal search

Posted by purplefilm on May 24, 2007

From Official Google Blog

So when we were asked to make the vision Marissa describes about universal search into a reality, we admit we were a little daunted. Googlers had tried before to do this without success — several times. Finding the best answer across multiple content types is a well-known hard problem in the search field. Besides that, we wondered if we had become too big a company to pull off a project this complex.

Here’s the challenge in a nutshell: Until now, we’ve only been able to show news, books, local and other such results at the top of the page, like this example for [trends in education]. But it’s a tall order to earn placement at the top of our search results, so plenty often we end up not showing these kinds of results even when they might be useful. If only we could smartly place such results elsewhere on the page when they don’t quite deserve the top, we could share the benefits of these great Google features with people much more often.

One challenge was being able to regularly search through all of the additional content types to find relevant results. After all, you don’t know if there might be a minor news story or an obscure book relevant to your query unless you go and check. But Google’s massive compute cluster — and much effort by our infrastructure experts — gave us a leg up on that one, and we can now search these disparate types of information about as efficiently as we search our massive index of web pages. We may have melted down a data center or two along the way, but then bugs are part of life in this business!

The next challenge was deciding when and where such results should blend in. Fortunately we have some of the world’s experts on ranking, and have been able to apply the lessons learned on web search to ensure that we show news only for newsworthy queries, scanned books only when there aren’t better web results, etc. It can be tricky. As we learned the hard way, just because everyone under the sun is writing about Anna Nicole Smith doesn’t mean news about her should show up for the search [baby names].
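The "when and where" decision described above can be sketched in a few lines. This is only an illustration, not Google's actual ranking: the `Result` type, the `MIN_SCORE` thresholds, and the assumption that scores are comparable across content types are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    source: str   # content type, e.g. "web", "news", "books"
    score: float  # hypothetical relevance score, assumed comparable across sources


# Hypothetical per-source minimum scores. A news or book result that scores
# below its threshold is dropped instead of being blended into the page.
MIN_SCORE = {"web": 0.0, "news": 0.6, "books": 0.5}


def blend(results_by_source, limit=10):
    """Merge per-source result lists into one list ordered by score."""
    kept = [
        r
        for results in results_by_source.values()
        for r in results
        # Unknown sources get a threshold of 1.0, so they rarely qualify.
        if r.score >= MIN_SCORE.get(r.source, 1.0)
    ]
    kept.sort(key=lambda r: r.score, reverse=True)
    return kept[:limit]


# Toy candidates for a query like [baby names]: the news story is popular
# but scores low for this query, so it is left out.
candidates = {
    "web": [Result("Popular baby names", "web", 0.9),
            Result("Name meanings", "web", 0.7)],
    "news": [Result("Celebrity news story", "news", 0.3)],
    "books": [Result("The Baby Name Book", "books", 0.8)],
}
page = blend(candidates)
```

Because every kept result is sorted on one scale, a book or news item can land in the middle of the page rather than only at the top, which is the placement problem the post describes.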

Lastly, we faced the challenge of the user interface (UI) you see on the screen. The new UI for these results is subtle, but this is one reason why the project is fun for our designers and usability experts: they get to focus on creating a simple experience for you. For example, with news results they designed a compact look for the result that includes helpful items like an image and a date, but is limited to just the most salient information. Or take our book search results, which call out the author and number of pages in the book. (Of course, we learned that sometimes you don’t even need to design a user interface. In one early usability study, shortly after Barry Bonds broke Babe Ruth’s home run record, we asked people “how many home runs has Barry Bonds hit?” hoping they would type [barry bonds] into the search box. Instead, each and every one simply blurted out “715”.)

We also called on experts from each of our feature areas such as News and Local, and were delighted to find our startup mindset is alive and well. Folks from all over found spare time and pitched in to get us to the finish line. There were many nights when we went to bed knowing that plenty of the team’s IM status still reported they were online.

And after all this elbow grease, finally we have something that works. What does it mean for you?

Although it’s just a beginning, this first pass of universal search focuses on video, news, local and books. Now you’ll be able to get more information Google knows about directly from within the search results. You won’t have to know about specialized areas of content. If you’re looking for the [atkins southwestern pork fajitas] recipe, we can now link you right to that page in the book. Or if, like me, you’ve been busy these past few days and have not caught up with your Tivo, don’t type [sopranos] into Google, because our news result will be a giant spoiler. The search for [rachmaninoff concerto 3] includes a video of Vladimir Horowitz performing this piece (scroll down to see it), and [Animator vs. Animation 2] is pretty cool as well. (And as Johanna notes: I was delighted to see that when querying for my son’s name a video showed up too.)

This is just the tip of the iceberg in making Google results more comprehensive and useful. It has involved launching a number of new systems that will make it much easier for us to continue making improvements so you get the most relevant information from our varied content areas. We hope you like it. And finally, we’re especially happy to know that Google is still very much a place where we can get big things done!

Posted in Information Retrieval

Google Begins Move to Universal Search

Posted by purplefilm on May 24, 2007

Google Introduces New Search Features and Unveils New Homepage Design

MOUNTAIN VIEW, Calif. (May 16, 2007) – Google Inc. (NASDAQ: GOOG) today announced its critical first steps toward a universal search model that will offer users a more integrated and comprehensive way to search for and view information online. The company also introduced an updated homepage design and several new navigation features that make it faster and easier for users to find the information they are looking for.

“Our focus has always been making our users’ search experience as simple and straightforward as possible,” said Marissa Mayer, vice president of search products and user experience at Google. “The ultimate goal of universal search is to break down the silos of information that exist on the web and provide the very best answer every time a user enters a query. While we still have a long way to go, today’s announcements are a big step in that direction.”

Google’s vision for universal search is to ultimately search across all its content sources, compare and rank all the information in real time, and deliver a single, integrated set of search results that offers users precisely what they are looking for. Beginning today, the company will incorporate information from a variety of previously separate sources – including videos, images, news, maps, books, and websites – into a single set of results. At first, universal search results may be subtle. Over time users will recognize additional types of content integrated into their search results as the company advances toward delivering a truly comprehensive search experience.

For example, a user searching for information on the Star Wars character Darth Vader is likely interested in all the information related to the character and the actor – not just web pages that mention the movie. Google will now deliver a single set of blended search results that include a humorous parody of the movie, images of the Darth Vader character, news reports on the latest Lucas film, as well as websites focused on the actor James Earl Jones – all ranked in order of relevance to the query. Users no longer have to visit several different Google search properties to find such a wide array of information on the topic.

The Power of Google Technology
Google is also in the process of deploying a new technical infrastructure that will enable the search engine to handle the computationally intensive tasks required to produce universal search results. The company is also releasing the first stage of an upgraded ranking mechanism that automatically and objectively compares different types of information. As always, Google™ search results are ranked automatically by algorithms to deliver the best results to users anywhere in the world.

“Google has continued to concentrate on improving the quality of search,” said Udi Manber, vice president of engineering at Google. “The level and speed of search innovation at Google has increased. Most of this innovation addresses basic ranking algorithms and is often not obvious to users. Users just see more accurate results, more often, in more languages, which is our primary goal.”

New Navigation Features
New dynamically generated navigation links have been added above the search results to suggest additional information that is relevant to a user’s query. For example, a search for “python” will now generate links to Google Blog Search™, Google Book Search™, Google Groups™, and Google Code™, to let the user know there is additional information on his or her query in each of those areas. As a result, users can find a wider array of information on their topic, including data types they might not have initially considered.
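One way to picture these dynamically generated links is as a filter over per-property hit counts. The sketch below is a guess at the general idea, not the released feature: the `navigation_links` function, the `min_hits` cutoff, and the counts themselves are invented for illustration.

```python
def navigation_links(hit_counts, min_hits=1):
    """Return the properties worth linking to for one query.

    hit_counts maps a property name (e.g. "blogs") to the number of
    results that property has for the query. Order follows the input.
    """
    return [name for name, hits in hit_counts.items() if hits >= min_hits]


# Made-up counts for the query "python": maps has nothing, so no link.
python_hits = {
    "web": 1000,
    "blogs": 40,
    "books": 12,
    "groups": 7,
    "code": 300,
    "maps": 0,
}
links = navigation_links(python_hits)
```

A different query would produce different counts and therefore a different set of links, which is what makes the links contextual.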

Google’s homepage and a number of applications have also been updated with a new navigation bar to provide easier access to popular Google products. Now, instead of having links above the Google.com homepage search box, users will see a navigation bar on the top left side of the page with various Google search properties and popular products including Gmail™, Google Calendar™, Google Docs & Spreadsheets™, and Picasa Web Albums™.

Experience the Experiments
Google also announced today a new experimental version of its popular search service called Google Experimental™, available on Google Labs™. This new test site provides users an opportunity to try out some of the latest search experiments and innovations and provide Google with feedback. One of the first experiments to be featured on the site enables users to view their search results on a map or timeline. For instance, when someone searches for “Albert Einstein” on Google Experimental, they can choose to view the search results on a map that shows locations mentioned within web pages about Albert Einstein or on a timeline that illustrates the history of Albert Einstein’s life. More information on Google Experimental search is available at Google Labs at http://labs.google.com.

About Google Inc.
Google’s innovative search technologies connect millions of people around the world with information every day. Founded in 1998 by Stanford Ph.D. students Larry Page and Sergey Brin, Google today is a top web property in all major global markets. Google’s targeted advertising program provides businesses of all sizes with measurable results, while enhancing the overall web experience for users. Google is headquartered in Silicon Valley with offices throughout the Americas, Europe and Asia. For more information, visit www.google.com.

From Official Google Press Center

Universal Search: The best answer is still the best answer

Posted by purplefilm on May 24, 2007

From Official Google Blog

Back in 2001, Eric asked for a brainstorm of a few “splashy” ideas in search. A designer and product manager at the time, I made a few mockups — one of which was for ‘universal search.’ It was a sample search results page for Britney Spears that, in addition to web results, also had news, images, and groups results right on the same page. Even then, we could see that people could easily become overwhelmed with the number of different search tools available on Google — let alone those that would be created over the next few years. This proliferation of tools, while useful, has outgrown the old model of search. We want to help you find the very best answer, even if you don’t know where to look.

That mockup and early observations were the motivation behind the universal search effort we announced earlier today. And while that Britney Spears mockup was the start of Google’s universal search vision, it was instantly obvious that this would be one of the biggest architectural, ranking, and interface challenges we would face at Google. Over several years, with the help of more than 100 people, we’ve built the infrastructure, search algorithms, and presentation mechanisms to provide what we see as just the first step in the evolution toward universal search. Today, we’re making that first step available on google.com by launching the new architecture and using it to blend content from Images, Maps, Books, Video, and News into our web results.

With universal search, we’re attempting to break down the walls that traditionally separated our various search properties and integrate the vast amounts of information available into one simple set of search results.

Here are a few of my favorite searches that show off the power of universal search:

In addition, we’ve rolled out a few new navigation elements and experimental features to help our users better navigate our site and find the information they’re looking for. These include contextual navigation links above the search results that help users “drill down” to specific types of information. For instance, developers who search for [python] will see links for “web,” “blogs,” “books,” “groups,” and “code,” whereas [downtown los angeles] will show a different set of links. Also, in terms of integration and navigation, today we introduced a new universal navigation bar at the top of all Google web pages to provide easier navigation to your favorite Google products, such as Gmail.

While today’s releases are big steps in making the world’s information more easily accessible, these are just the beginning steps toward the universal search vision. Stay tuned!

Personal Bibliography Management Software

Posted by purplefilm on May 14, 2007

Personal Bibliography Management Software
Analysis and Comparison of Some Packages

In his latest post on the Burioni Forum, Francesco Dell’Orso analyzes the problems and possibilities of citation management platforms, looking for the best choice and underlining how user intentions and behaviours can be influenced by market trends.

Full text at http://www.burioni.it/forum/dellorso/bms/text/index.html

(Not) Everything is Miscellaneous (a review)

Posted by purplefilm on May 8, 2007

To the librarians. So begins Everything is Miscellaneous, David Weinberger’s mesmerizing new book about organization, authority, and knowledge. I received my advance copy last week and read it in a single day. I found it interesting and inspiring, and I recommend it highly.

But I don’t agree that everything is or will be or should be miscellaneous, and I don’t believe David is entirely fair to librarians, information architects, and other professional organizers.

The troubles begin with David’s taxonomy which divides the history of organization into a first, second, and third order of order. Implicit in this taxonomy lies the assertion of linear progress, a lossless swap of old for new, similar to the presumed step change into Web 2.0 that I challenged in Information Architecture 3.0.

In David’s book, the inevitability and desirability of this migration to miscellany is also made explicit:

When you go to a commercial Web site, the business owns and controls the information it wants to give you, the way you’ll navigate through that information, and the experience you’ll have while doing so…the miscellanizing of information, knowledge, and ideas rips these assets out of the hands of individual businesses…the most successful businesses will have to get over the second-order assumption that they own the customer’s experience. In a truly miscellaneous world, a successful business owns nothing but what it wants to sell us. The rest is ours.

It’s not that I disagree with David about the power and potential of user participation in the creation and organization of knowledge. But, I do believe that the old serves as foundation for and coexists with the new, or as I explained in Ambient Findability:

We don’t have to choose. Ontologies, taxonomies, and folksonomies are not mutually exclusive. In many contexts, such as corporate web sites, the formal structure of ontologies and taxonomies is worth the investment. In others, like the blogosphere, the casual serendipity of folksonomies is certainly better than nothing. And in some contexts, such as intranets and knowledge networks, a hybrid metadata ecology that combines elements of each may be ideal.

And, in fact, it’s the “third order” information-as-commodity companies like Amazon and eBay that have most aggressively and successfully integrated traditional and novel organization methods to create a positive (and profitable) customer experience.

Furthermore, while I agree with David that “second-order organization is often as much about authority as about making things easier to find” and that all taxonomies embed bias, the same can be said of search engines, books, blogs, Amazon, eBay, and the Wikipedia. This doesn’t negate the value and good intentions of librarians, information architects, authors, editors, designers, and users who labor to improve findability, accessibility, and understanding for all.

It simply suggests that we must all be more aware, as consumers and creators, of the incentives, biases, and weaknesses inherent in all sources and structures of authority and knowledge.

Despite, or perhaps because of, these points of contention, I really did find Everything is Miscellaneous to be an exhilarating read. David has done a masterful job of weaving the histories of library science and information architecture into a hot and sexy page-turner of a story.

Of course, I can’t help but wonder about the dedication.

  1. To the librarians. Thanks for nothing?
  2. To the librarians. Thanks for everything?
  3. To the librarians. May they rest in peace?

After reading the book, I’m still not sure. What do you think?

By Peter Morville

(Thanks to Bonaria Biancu, who originally posted it on her great blog)

Introducing the D-Lib Alliance

Posted by purplefilm on May 7, 2007

Since its launch in 1995, D-Lib Magazine has been freely available on the Internet, and the magazine’s expanding archive provides much of the core literature on digital libraries. The organization that publishes D-Lib, CNRI, hopes to continue to provide the magazine to readers, free of charge, but to do so we need financial and advisory support from the worldwide community that D-Lib serves. Initiated in February 2007, the D-Lib Alliance provides a venue for those organizations wishing to support the ongoing, open access publication of D-Lib Magazine.

We know from our own experience and from talking with others that the digital library community relies on D-Lib Magazine, and now the magazine relies on the digital library community for its continuing existence. Please join the D-Lib Alliance as a way to engage in this mutually beneficial relationship.

As participants in the D-Lib Alliance, institutional supporters will be prominently acknowledged in the magazine, and we are discussing other benefits for D-Lib Alliance participants as well.

More info in D-Lib Magazine 2007, 13, 3-4

Information Design for the New Web

Posted by purplefilm on May 2, 2007

Information design for the Web has changed.

People are changing the way that they consume online information, as well as their expectations about its delivery. The social nature of the Web brings with it an expectation of interaction with information, and modern Web design is reflecting that. There are now alternate forms of navigation, including the ability to browse by user, tag clouds, tabbed navigation, etc. Advances in technology along with these shifts in user expectations are affecting the way that information is laid out on a webpage. Today’s websites are aiming for intuitive and usable interfaces which are continuously evolving in response to user needs. Website designers are approaching information design differently and designing simple, interactive websites which incorporate advancements in Web interface design, current Web philosophies, and user needs. Information design for the New Web is simple, it is social, and it embraces alternate forms of navigation.

(Ellyssa Kroski, posted at Computers in Libraries 2007; full post on InfoTangle)

Posted in Information Architecture, Information Retrieval