{"id":3900,"date":"2010-12-18T15:12:35","date_gmt":"2010-12-18T14:12:35","guid":{"rendered":"http:\/\/textundblog.de\/?p=3900"},"modified":"2010-12-18T15:43:35","modified_gmt":"2010-12-18T14:43:35","slug":"mit-books-ngram-viewer-einen-teil-der-bucher-der-welt-abfragen","status":"publish","type":"post","link":"https:\/\/textundblog.de\/?p=3900","title":{"rendered":"Querying a Portion of the World's Books with the Google Books Ngram Viewer"},"content":{"rendered":"<p>As we all know, Google is scanning the world's printed literature on a massive scale and making it searchable through <a href=\"http:\/\/books.google.de\">Google Books<\/a>. The scale involved is illustrated by this passage from the ZEIT article &laquo;<a href=\"http:\/\/www.zeit.de\/wissen\/2010-12\/kulturomik-google-books\">Google Books \u2013 Wie oft kam Gott?<\/a>&raquo;:<\/p>\n<blockquote><p>\nNo one in the world has enough time to read even all the books published in a single year, but as more and more books are digitized, the information is being detached from the printed page and translated into a language computers can understand. Not all of the roughly 129 million books ever written are available digitally. But internet giant Google says it has by now scanned at least 15 million books in university libraries around the world.<\/p><\/blockquote>\n<p>The research project <a href=\"http:\/\/www.culturomics.org\/\">culturomics.org<\/a> has selected a subset of this enormous collection: 5.2 million books containing roughly 500 billion words. The study is described both in the ZEIT article cited above and in the New York Times: <a href=\"http:\/\/www.nytimes.com\/2010\/12\/17\/books\/17words.html\">In 500 Billion Words, New Window on Culture<\/a>. 
The findings of the <a href=\"http:\/\/www.culturomics.org\/cultural-observatory-at-harvard\/People\">team<\/a> led by Erez Lieberman Aiden of Harvard University have been published in <em>Science<\/em>: <a href=\"http:\/\/www.sciencemag.org\/content\/early\/2010\/12\/15\/science.1199644\">Quantitative Analysis of Culture Using Millions of Digitized Books<\/a>.<\/p>\n<p>Best of all, you can do your own field research independently of these cultural and linguistic analyses, using the <a href=\"http:\/\/ngrams.googlelabs.com\/\">Books Ngram Viewer<\/a> tool.<\/p>\n<p>A few examples (please note: queries are <em>case-sensitive<\/em>, i.e. uppercase and lowercase letters are treated as distinct):<\/p>\n<p><strong>Kontrollverlust, Privatsph\u00e4re, Datenschutz:<\/strong><br \/>\n<a href=\"http:\/\/ngrams.googlelabs.com\/graph?content=Kontrollverlust%2CPrivatsph%C3%A4re%2CDatenschutz&#038;year_start=1900&#038;year_end=2008&#038;corpus=8&#038;smoothing=3\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/textundblog.de\/wp-content\/ngram-kontr.jpg?resize=500%2C250&#038;ssl=1\" width=\"500\" height=\"250\" alt=\"Kontrollverlust, Privatsph\u00e4re, Datenschutz\" title=\"Kontrollverlust, Privatsph\u00e4re, Datenschutz\" \/><\/a><\/p>\n<p><strong>Saarland:<\/strong><br \/>\n<a href=\"http:\/\/ngrams.googlelabs.com\/graph?content=Saarland&#038;year_start=1900&#038;year_end=2000&#038;corpus=8&#038;smoothing=3\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/textundblog.de\/wp-content\/ngram-saarland.jpg?resize=500%2C256&#038;ssl=1\" width=\"500\" height=\"256\" alt=\"Saarland\" title=\"Saarland\" \/><\/a><\/p>\n<p><strong>das Blog, der Blog:<\/strong><br \/>\n<a href=\"http:\/\/ngrams.googlelabs.com\/graph?content=das+Blog%2Cder+Blog&#038;year_start=2000&#038;year_end=2008&#038;corpus=8&#038;smoothing=0\"><img 
data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/textundblog.de\/wp-content\/ngram-blog.jpg?resize=500%2C246&#038;ssl=1\" width=\"500\" height=\"246\" alt=\"das Blog, der Blog\" title=\"das Blog, der Blog\" \/><\/a><\/p>\n<p>For more information, see also the article in Libreas: <a href=\"http:\/\/libreas.wordpress.com\/2010\/12\/17\/kulturkurven-fur-achtjahrige-ein-kurzer-blick-auf-googles-ngrammatologie\/\">Kulturkurven f\u00fcr Achtj\u00e4hrige: Ein kurzer Blick auf Googles Ngrammatologie<\/a>.<\/p>\n<p>Otherwise, just try it out for yourself: <a href=\"http:\/\/ngrams.googlelabs.com\/\">Books Ngram Viewer<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As we all know, Google is scanning the world's printed literature on a massive scale and making it searchable through Google Books. The scale involved is illustrated by this passage from the ZEIT article &laquo;Google Books \u2013 Wie oft kam Gott?&raquo;: No one in the world has enough time to read even all the books published in a single 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ngg_post_thumbnail":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[11,21],"tags":[],"class_list":["post-3900","post","type-post","status-publish","format-standard","hentry","category-literatur","category-software"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p4uzZ-10U","jetpack-related-posts":[{"id":5319,"url":"https:\/\/textundblog.de\/?p=5319","url_meta":{"origin":3900,"position":0},"title":"Arte: Google und die Macht des Wissens","author":"Markus","date":"1\/4\/2013","format":false,"excerpt":"http:\/\/youtu.be\/RZkdkobK99A In 2002, Google began scanning world literature. It signed contracts with the largest university libraries, such as Michigan, Harvard and Stanford in the USA, the Bodleian Library in England and the Library of Catalonia in Spain. 
The goal was not only to build a huge global library, but also all this knowledge\u2026","rel":"","context":"In &quot;Literatur&quot;","block_context":{"text":"Literatur","link":"https:\/\/textundblog.de\/?cat=11"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/img.youtube.com\/vi\/RZkdkobK99A\/0.jpg?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1014,"url":"https:\/\/textundblog.de\/?p=1014","url_meta":{"origin":3900,"position":1},"title":"Who should digitize our books?","author":"Markus","date":"23\/6\/2006","format":false,"excerpt":"Two Spanish daily newspapers are currently covering the digitization of literature. La Vanguardia: \u00abGoogle quiere 'sacar del limbo' a los libros olvidados\u00bb. Marco Marinucci of Google Book Search comments here. La idea de Google Book es \"integrar y compartir el conocimiento de la humanidad\" al digitalizarlo y\u2026","rel":"","context":"In &quot;Literatur&quot;","block_context":{"text":"Literatur","link":"https:\/\/textundblog.de\/?cat=11"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":515,"url":"https:\/\/textundblog.de\/?p=515","url_meta":{"origin":3900,"position":2},"title":"Google is no longer scanning books","author":"Markus","date":"15\/8\/2005","format":false,"excerpt":"... reports netzeitung: Google's book project had come under heavy criticism. 
For the time being, Google is now digitizing only older books.","rel":"","context":"In &quot;Literatur&quot;","block_context":{"text":"Literatur","link":"https:\/\/textundblog.de\/?cat=11"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":306,"url":"https:\/\/textundblog.de\/?p=306","url_meta":{"origin":3900,"position":3},"title":"Digitizing Europe's literature","author":"Markus","date":"20\/3\/2005","format":false,"excerpt":"Chirac puts French books online Because Google favors English-language works in its digitization, a European initiative is meant to restore the balance. [...] In the coming weeks, Chirac wants to win over further European countries as supporters of his plan and thus bring Europe's most important books online in a coordinated way. [via futurezone.ORF.at] Article\u2026","rel":"","context":"In &quot;Literatur&quot;","block_context":{"text":"Literatur","link":"https:\/\/textundblog.de\/?cat=11"},"img":{"alt_text":"Google digitization","src":".\/wp-content\/google.gif","width":350,"height":200},"classes":[]},{"id":499,"url":"https:\/\/textundblog.de\/?p=499","url_meta":{"origin":3900,"position":4},"title":"Kulturkrieg im Cyberspace&#8230;","author":"Markus","date":"5\/8\/2005","format":false,"excerpt":"... is the title Michael M\u00f6nninger gives his ZEIT dossier, which is well worth reading (DIE ZEIT 04.08.2005 Nr.32). France issues a call to arms. To counter the advance of Google's online library, Europe is to put its own libraries online. 
But the Europeans will only be able to keep up with Google in the long run if they substantially expand their data network -\u2026","rel":"","context":"In &quot;Artikel&quot;","block_context":{"text":"Artikel","link":"https:\/\/textundblog.de\/?cat=5"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":816,"url":"https:\/\/textundblog.de\/?p=816","url_meta":{"origin":3900,"position":5},"title":"Google does not scan in archival quality","author":"Markus","date":"17\/2\/2006","format":false,"excerpt":"Anyone who believes that the Google Book Search project amounts to serious archival provision for the world's cultural memory is in for a disappointment. The key point that I took away from this is that Google book project IS NOT an alternative to library\/archive\/archival\/preservation scans. Libraries will still have an important role to play (as\u2026","rel":"","context":"In &quot;Internet&quot;","block_context":{"text":"Internet","link":"https:\/\/textundblog.de\/?cat=4"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/textundblog.de\/index.php?rest_route=\/wp\/v2\/posts\/3900","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/textundblog.de\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/textundblog.de\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/textundblog.de\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/textundblog.de\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3900"}],"version-history":[{"count":0,"href":"https:\/\/textundblog.de\/index.php?rest_route=\/wp\/v2\/posts\/3900\/revisions"}],"wp:attachment":[{"href":"https:\/\/textundblog.de\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3900"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/textundblog.de\/index.php?rest_route=%2Fwp%2Fv2%2Fcategor
ies&post=3900"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/textundblog.de\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3900"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}