The new libraries of Alexandria

Google Print, Amazon's Search Inside the Book, full-text search online: what is going on here? Are we talking about the future, or do we want to get to work?

The project "full-text search online" of the publishers' committee of the Börsenverein des Deutschen Buchhandels was recently presented at the booksellers' convention in Berlin. According to the press release, it is meant to create "a separate central platform for full-text search on the internet". This is an important impulse for the industry, and it came from Matthias Ulmer, publisher of Ulmer Verlag and initiator of the working group. Because, as Ulmer put it in conversation, "it's about the future of publishing, i.e. our very own future". That is formulated to the point, and yet the initiative may already have failed at its own ambition. A few days ago the Google Print website went online again, with a volume of approx. 1.1 – 1.2 million digitized books.

Google Print is (once again) online

There has been a lot of talk about Google Print since we began reporting on it in the spring of 2004 (The VLBs of the web: Google Print and Amazon). Some readers will have been surprised that the original Google Print website disappeared relatively quickly after the first wave of reporting, and stayed gone. Whatever may have led the two founders of Google, Sergey Brin and Larry Page, who toured Europe again this spring to promote Google Print, to take the website off the internet for almost a year, the time of darkness is over. Since 26 May 2005 the website has been back, slightly refreshed and equipped with new features. From now on, everyone can get an idea of what Google Print really is and what it can do.

"No web or newsgroup results: just books, and nothing but the books", the announcement says. In other words, Google Print is nothing but a full-text search engine that searches the content of books for keywords or terms. At the moment we can search only English-language books, with a maximum of 20% of a book's pages available for viewing.

Let's start with a simple name, say our favorite author Neal Stephenson (The "science fiction" of the past), and it becomes clear why the website disappeared for so long. Where there were 42 search results a year ago, "neal stephenson" is now found on 3,210 book pages in 0.14 seconds! A year ago we assumed that about 60,000 – 100,000 books were fully digitized in Google Print. At that time, however, only one hit per book was displayed. At an average of 4 hits per book, we arrive at approx. 800 books today for the simple query "neal stephenson".

If we apply the simple rule of three (calculated with 60,000 books), this means that Google Print must now hold 1.1 – 1.2 million digitized books. That is probably a realistic order of magnitude, if not a low one, given the effort Google puts into digitizing and indexing books by machine. How many titles come from the university libraries project cannot be determined. That project has been running since last summer, and the holdings of the university libraries of Michigan, Stanford, Harvard and Oxford and parts of the New York Public Library, together around 20-25 million books, are to be incorporated through it.
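The rule of three above can be retraced in a few lines. This is only a sketch of the estimate, using the figures quoted in the text; the function name and its parameters are illustrative, and the approach assumes that the share of books mentioning "neal stephenson" is the same in last year's collection as in today's.

```python
def estimate_collection_size(page_hits_now, hits_per_book,
                             books_matched_before, collection_before):
    """Scale last year's collection size by the growth in matching books.

    Assumes the proportion of matching books in the collection is constant.
    """
    books_matched_now = page_hits_now / hits_per_book
    return books_matched_now * collection_before / books_matched_before


# Figures quoted in the text for the query "neal stephenson".
books_matched_now = 3_210 / 4  # roughly 800 books at 4 hits per book

# Last year: 42 hits at one hit per book, i.e. 42 matching books.
low = estimate_collection_size(3_210, 4, 42, 60_000)
high = estimate_collection_size(3_210, 4, 42, 100_000)

print(f"books matching today: about {books_matched_now:.0f}")
print(f"estimate based on 60,000 books:  {low:,.0f}")
print(f"estimate based on 100,000 books: {high:,.0f}")
```

Starting from 60,000 books the estimate lands inside the 1.1 – 1.2 million range; starting from the upper bound of 100,000 the same proportion would give roughly 1.9 million, which is why the figure in the text may well be on the low side.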

The search results, i.e. the books in which "neal stephenson" occurs, are presented with title, authors, year of publication, page count and, if available, the cover. The description gives the exact page number and the text passage in which "neal stephenson" was found, as well as the further search results within the displayed book. Interesting: both Weltbild and books.de have already placed commercial Google ads on this overview page. The revenue from these Google ads is to be shared with the publishers.

If we click on a search result, we are shown the actual book page on which "neal stephenson" was found. The keyword is highlighted graphically; above the page are the book title and the authors, including the publisher and the ISBN. The left navigation column prominently displays the publisher's logo and offers a direct search for further occurrences of the term in the displayed book, or for completely different terms. Under "About this book" there is a good short description of the book, all bibliographic information, excerpts from reviews and links to other websites that have dealt with the volume. Likewise available for viewing are the entire table of contents, the index and, under "Copyright", the imprint of the title. And finally there is a list of links showing where we can buy the book. Best of all, just try it yourself. Printing the page is not possible for the layperson, and even the somewhat experienced web user has to resort to wild tricks.

This is how the business works: the Google Print deal and 2 questions

As already mentioned, a maximum of 20% of a digitized book is displayed. A not insignificant innovation, however, is that Google Print is tied to Google Accounts, and in many cases we can only read further if we have such an account. This in turn requires registering with Google, just as for the use of Google's webmail (Gmail) or the news notification service Google Alerts. The deal that Google offers here is thus: personal information in exchange for content.

This is fair, because only minimal personal data such as name or e-mail address are requested, and we give away more information every time we use our credit card. It becomes complicated because all Google services can communicate with each other and can thus easily assemble an overall picture of a person: what do we search for on Google, which books are we interested in on Google Print, what do we buy via Google's shopping search Froogle, which keywords do we follow via Google Alerts, and who sends e-mails to our Gmail account? The answer to all these questions is Google. But let us not be holier than the Pope: this form of "data mining" is just as common in "real life". Every silly raffle and every customer or discount card demands a wealth of personal data, up to and including Schufa credit information.

Two questions remain: is Google Print really useful, and does this service pose a danger to publishers? The first is already answered by Google's supremacy on the web: what does not exist on Google does not exist. This goes so far that the conclusion "it does not exist on the web because Google does not find it" turns into the belief "it does not exist at all". The dividing line between the digital online world and the analog offline world thus continues to lose sharpness and contour, all the more so as people use the web ever more broadly and deeply.

Not being present on Google currently means, whether we like it or not, not being perceived. Who can afford that in the age of the attention economy? Publishers in particular, with their specialized and rather low-priced products, can hardly do so. Whether Google Print will be accepted by users is another matter, but if we consider only the category of out-of-print books and think of the success of ZVAB, it is more than likely.

The second question, whether Google Print is a danger for publishers, has been wrongly posed from the start. It really ought to read: is Google Print an opportunity for publishers? And we would answer that with a resounding yes. For, unlike Amazon's Search Inside the Book service, which lags behind Google Print both visually and in terms of usability, with Google Print all rights remain with the publishers.

This means that the basic legal situation between rights holders and publishers would have to be clarified as soon as possible, for example for all books that are no longer in print and would then remain available at least in digital form. In the case of Amazon Germany, by contrast, all rights are transferred to Amazon as soon as a publisher wants to sell its book via Amazon. Here, once again, is the link for reading up on this. To ask which rights should actually be transferred to Amazon for participation in the Search Inside the Book service is therefore moot: Amazon already has them all. That Amazon can be moved, at least in Germany, is shown by the successful renegotiations of various German publishers.

Web wishes and web realities

It costs Google approx. 20 US dollars to digitize a book, and the book-scanning line specially built by Google is said to process about 5,000 books a day. That makes 100,000 US dollars invested in Google Print every day and, with more than 1.1 million digitized books, a first 22 million dollars already spent. Over 300 American working days a year, this adds up to $30 million and 1.5 million digitized books in Google Print. Whether these figures already include the cost of preparing the content or keywords for Google's special search algorithm, nobody knows exactly. To believe that Google is overreaching itself with this investment would be far off the mark. The stock-market newcomer, whose share is currently trading at an annual high, closed the first quarter of 2005 with sales of $1.256 billion. The profit amounted to $369 million, and cash and cash equivalents amounted to no less than 2.5 billion US dollars.
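The cost figures above are simple multiplication and can be checked quickly. The sketch below uses only the numbers quoted in the text; the constant names are illustrative. Note that 300 working days at $100,000 a day come to $30 million a year.

```python
# Figures quoted in the text; all amounts in US dollars.
COST_PER_BOOK_USD = 20
BOOKS_PER_DAY = 5_000
WORKING_DAYS_PER_YEAR = 300
BOOKS_DIGITIZED_SO_FAR = 1_100_000
Q1_2005_PROFIT_USD = 369_000_000

daily_cost = COST_PER_BOOK_USD * BOOKS_PER_DAY
spent_so_far = COST_PER_BOOK_USD * BOOKS_DIGITIZED_SO_FAR
books_per_year = BOOKS_PER_DAY * WORKING_DAYS_PER_YEAR
yearly_cost = daily_cost * WORKING_DAYS_PER_YEAR

# How one year of scanning compares with a single quarter's profit.
share_of_quarterly_profit = yearly_cost / Q1_2005_PROFIT_USD

print(f"daily cost:      ${daily_cost:,}")
print(f"spent so far:    ${spent_so_far:,}")
print(f"books per year:  {books_per_year:,}")
print(f"yearly cost:     ${yearly_cost:,}")
print(f"share of Q1 2005 profit: {share_of_quarterly_profit:.1%}")
```

On these assumptions a full year of scanning costs less than a tenth of the profit Google reported for one quarter, which puts the size of the investment into perspective.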

In view of these numbers, the well-intentioned announcement by the Börsenverein des Deutschen Buchhandels that it wants to install an independent platform for full-text search in books on the web seems rather unrealistic. Apart from that, and this is the crux of the matter, what is lacking is exactly what Matthias Ulmer, member of the board of the Börsenverein's publishers' committee, claims for the industry: "We have the potential, the experience and the technical know-how for such a solution…".

The potential, the books, the authors: very well. But the technical know-how and especially the experience have so far not even been sufficient to establish a simple, neutral platform for books, publishers, authors and book enthusiasts on the web that could come close to the quasi-monopoly of the quasi-wholesaler Amazon. The fact is that so far nobody has succeeded (and this is not about competing with Amazon) in creating a neutral platform on the web that presents the contents of the books and their authors, links them in a meaningful way and offers them an appropriate forum, although the potential is there given the many authors and their weblogs. Nor has anyone succeeded in giving the entire publishing industry, from the author to the antiquarian bookseller, a voice on the web and thus attention. Why, then, should it suddenly be possible to install a full-text book search engine on the web, which is technologically more complex, organizationally more complicated and, in terms of investment, far more expensive than a platform for books?

Because all we need is the meaningful networking of what already exists at the publishers. We will define the data formats together and then build the infrastructure that is necessary for a common full-text search engine.

Matthias Ulmer

That sounds simple and feasible, yet for anyone who knows the internet even a little it is, to put it carefully, devoid of all reality. What really awaits us every day on the internet when we operate a high-traffic website has been described precisely by John Walker, the founder of Autodesk and a veteran of the internet, in his account of life on the internet, and any system administrator or webmaster will confirm it. And rest assured: the man not only knows what he is talking about, he is not even exaggerating.

Rough estimates for such a first web book platform, as discussed last year at a meeting of various publishers in Switzerland and in Germany, range from approx. 1.2 – 1.8 million euros for the first two years. That would be a first important step, one that leaves enough room for gaining experience, which might someday flow into a full-text book search engine. The days of the egg-laying wool-milk-sow, the do-everything solution stamped out of the ground as an infrastructure measure, should be over by now, especially on the internet. What pays off is continuous, persistent work away from the hype, work that creates identity and thus attention and added value. See Amazon, eBay, Firefox, Google, Wikipedia, Yahoo ….

Even in the book industry, a lot of important web work has already been done. The already mentioned Zentrales Verzeichnis Antiquarischer Bücher (ZVAB) is as much a part of it as the website of the Frankfurt Book Fair. For years, in sometimes laborious detail work, the directory of publishers and translators and the who's who of the industry have been maintained there. Often enough, the book fair website is not just the only source that can provide data and information at all, but also an up-to-date and reliable one. This development, too, took years and was, as Marife Boix-Garcia, head of the eServices division of the book fair, confirms, "rock-hard work".

If, as Matthias Ulmer says, the future of (digital) publishing is at stake, then we should first recall the last 10 years of the web. There were tops and flops. Stephen King failed with his attempt to win buyers for a serial novel on the web (The Plant grows no more online). The weblogger and e-text enthusiast Cory Doctorow made his book Down and Out in the Magic Kingdom available for free download in 22 different text file formats, got over 150,000 downloads and sold his physical book splendidly (Reproduction by separation of the thumbs). The grandly launched BOL failed, as did the muscle-flexing strategy of Barnes & Noble. Amazon took the pot in the book business and mutated into a web department store with a second-hand section. Wikipedia is outranking Brockhaus, despite the strategy approaches that exist there, and Directmedia knows how to convert Wikipedia's success into sold products. The mail-order company Zweitausendeins is not only established, but also better positioned on the web than Weltbild, and so on.

Many more examples could be given. They all have one thing in common: every successful web offering or conversion began very simply and then became more complex or more varied, or branched out. All of them had development and learning periods of several years. Google, which started as a simple search engine and now comes with umpteen services, from Google's blogs and the community platform Orkut to Google Maps, is the best example of this, just like Amazon USA with its now 31 departments, club membership and its own search engine A9.com, which also searches specifically for books.

So let us not fool ourselves: without Google, there would be no Google Print today. The Google years (the company was founded in 1998), however, still lie ahead of us if we want to start with the obvious and the simple. And we should hurry to finally get to work.

Excursus: Amazon's Web Services and the consequences

Amazon's book catalogue is made up of the databases of VLB, Libri and KNO/KV. All books listed there are automatically displayed at Amazon. Authors or publishers can take action themselves if they are not satisfied with how the data, i.e. the information, is presented. Any registered user can also enter books that have an ISBN but that Amazon does not (or no longer) carry, in order to offer them via the zShops or the Marketplace.

All these data are not only shown at Amazon, but are also available to third parties via Amazon's Web Services. There it says:

"Amazon Web Services allows web developers to seamlessly integrate features and content from Amazon.de into their websites. Operators of websites can use it to:

  1. Bring continuously updated information about Amazon.de products to their websites
  2. Offer Amazon.de products for sale on their websites
  3. Create tailor-made links and dynamic advertising placements

Web developers can use it to develop productivity applications for other Amazon.de customers, for merchants, partners and website operators."

In other words, we can load Amazon's entire content onto our own website and display it there. This may explain why Amazon is so rigid about rights ownership. As Kathleen Ohlson reported on 1 May on addtmag.com, more than 80,000 developers worldwide take part in the Amazon Web Services program. That is a considerable number, and it makes clear how deeply Amazon is anchored in the web. Amazon benefits twice here. First, it benefits from passing on its content for a fee; here, however, it would have to be clarified what share Amazon actually passes on to the rights holders. But above all, Amazon benefits from the programs and applications that developers write on the basis of Amazon's data.

Epilogue: "Good artists are ready" (Steve Jobs)

Ergo, a first step for the Börsenverein could be to compile a complete database from VLB, Libri, KNO/KV, ZVAB and other sources and to make it available to all websites that have anything to do with books, as well as to internet, price and product search engines. The past shows, as Diogenes Verlag has managed to do with Paulo Coelho for example, that it pays to draw on the authors themselves as sources.

Merely presenting the collected data in an assured, high and even meaningful quality is likely to amount to a Sisyphean task. To aim from the outset at a later full-text search within the books is certainly reasonable. But given the mountain of work at hand, it is rather a later step. It is the simple way of beginning that makes it possible to get finished and to become visible and perceptible on the web, before the eternal dream of a library of Alexandria once again ends as a castle in the air.

Google Print's offer cannot currently be topped: free digitization and indexing of the books, a share for the publishers of the advertising revenue generated by the Google ads, and all rights remaining with the publishers. The two crucial questions that Google must answer are: is the right to digitize a book exclusive? And to whom, besides Google Print, is the digitized book made available, and what remuneration is envisaged for that?

Anyone who considers this hair-splitting at first glance should be reminded once again of Amazon's Web Services, and above all of the fact that the web has become a development laboratory for the computer and software industry. Here things can change practically overnight, or so it at least appears. How long the work behind the scenes often takes is another story. And how surprising it is when one of the big players on the web flexes its muscles is being shown by Yahoo in America right now. Since last week the search service, which offers numerous features such as web mail, music downloads, a price and product search engine, etc., has quite simply been competing with eBay. When the new Yahoo auction platform will also be available in Germany is open. That it will come, however, is as certain as the amen in church.