Off topic: See how often certain words were used in books between 1500 and 2008
Thread poster: Suzan Hamer
Suzan Hamer
Suzan Hamer  Identity Verified
Netherlands
Local time: 10:16
English
+ ...
Dec 18, 2010

This is really cool!


Type in a word to see how often it appeared in books published between 1500 and 2008.

http://ngrams.googlelabs.com/



Google Ngram Database Tracks Popularity Of 500 Billion Words

Did you know that ''google'' appeared in print as early as 1908? Or that ''email'' first popped up in 1524?

Google has quietly released Ngram, a free tool that allows users to sift through 500 billion words contained in nearly 5.2 million books published between 1500 and 2008 in English, French, Spanish, German, Chinese and Russian. The tool can compare words and phrases and provide a year-by-year breakdown of when and how often they appeared in print. The entire dataset that powers the tool is also available for download.

The tool is expected to prove invaluable for researchers and academics, and provides a fascinating window into the past for the rest of us. For example, a quick search shows that references to communism peaked around 1960 at the height of the Cold War, while the word ''internet'' only appeared sporadically in print before usage skyrocketed in the late 1980s.

According to the New York Times, the project was a collaboration between Google and Harvard. Erez Lieberman Aiden, a junior fellow at Harvard's Society of Fellows, told the newspaper the goal was to make it simple for people to browse cultural trends as shown in books.

''We wanted to show what becomes possible when you apply very high-turbo data analysis to questions in the humanities,'' he said.

On the tech front, Microsoft will be pleased to know that ''bing'' first appeared in print in 1650, and it will come as no surprise to learn that ''apple'' was around as early as the late 1500s. Sadly, ''Neowin'' does not appear once in the centuries of text indexed by the tool. Those looking for a special treat should try entering the phrase ''never gonna give you up'' into the tool.
http://www.neowin.net/news/google-ngram-database-tracks-popularity-of-500-billion-words



[Edited at 2010-12-18 13:37 GMT]

[Edited at 2010-12-18 14:19 GMT]


 
Alison Sabedoria (X)
Alison Sabedoria (X)  Identity Verified
United Kingdom
French to English
+ ...
Wow! Dec 18, 2010

This is amazing!

I'm used to checking dates for words to avoid anachronisms, but this is much more fun. The "compare" function is particularly interesting: tomato v. potato (1600-1800), sofa v. settee (1800-2000), canal v. railway (1700-2000).

There'll be no stopping me now...


 
Suzan Hamer
Suzan Hamer  Identity Verified
Netherlands
Local time: 10:16
English
+ ...
TOPIC STARTER
Can't figure out why "computer" Dec 18, 2010

shows up as a blip in the early 1600's though....

 
Kim Metzger
Kim Metzger  Identity Verified
Mexico
Local time: 03:16
German to English
Computer Dec 18, 2010

Suzan Hamer wrote:

shows up as a blip in the early 1600's though....


OED - computer: A person who makes calculations, a person employed in an observatory etc M17

M17 = 1630 - 1669

The OED explains that it gives the age of a word as the date of its first recorded use, expressed as a date range.


 
Suzan Hamer
Suzan Hamer  Identity Verified
Netherlands
Local time: 10:16
English
+ ...
TOPIC STARTER
Aha! Dec 18, 2010

Of course. Thanks, Kim.

[Edited at 2010-12-18 16:28 GMT]


 
Bilbo Baggins
Bilbo Baggins
Catalan to English
+ ...
OCR errors Dec 18, 2010

Suzan Hamer wrote:

This is really cool!


Type in a word to see how often it appeared in books published between 1500 and 2008.

http://ngrams.googlelabs.com/



Did you know that ''google'' appeared in print as early as 1908? Or that ''email'' first popped up in 1524?




OCR errors surely:-)


 
Emma Goldsmith
Emma Goldsmith  Identity Verified
Spain
Local time: 10:16
Member (2004)
Spanish to English
Thanks for sharing, Suzan Dec 18, 2010

It's fun and useful.

 
Henk Peelen
Henk Peelen  Identity Verified
Netherlands
Local time: 10:16
Member (2002)
German to Dutch
+ ...
SITE LOCALIZER
Nice resource! Less adverbs from the 1960's on? Dec 18, 2010

I tried some words; among other things I wanted to find out whether the adverbs "good" and "well" would show complementary behaviour over the last centuries (especially because in German and Dutch the good-equivalents goed and gut are much more used than the well-equivalent). I was surprised to see both well and good decline after 1960. I searched to see whether they had perhaps been replaced by fine, excellent, perfect, nice or the opposites bad / evil. I realize my survey was too poorly thought out to draw solid conclusions, but it seems to me the use of adverbs did decline from 1960 on. Probably because people started to speak more directly and in a more mass-oriented way. Class became less important and daily consumption of chocolate, beer, wine, cars, telephones and so on was affordable for nearly anyone, so there was less need for distinction. "A good glass of wine" simply became "(a glass of) wine", is what I think.
Secondly I was curious about my first name, because in English-speaking countries it's always changed into / lives on as Hank. Not much imagination is necessary to accept that the search for "henk" exactly shows the immigration waves from NL/BE to US, Australia, Canada and so on.
By the way, I also found some misreadings. The Optical Character Recognition programme sometimes misreads letters / numbers. Still a great source for any linguistic or demographic research, in my opinion.



Errr ... sorry for that "daily consumption of cars and telephones"




[Edited at 2010-12-18 18:51 GMT]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 10:16
Member (2006)
English to Afrikaans
+ ...
History and words Dec 18, 2010

Henk Peelen wrote:
Not much imagination is necessary to accept that the search for "henk" exactly shows the immigration waves from NL/BE to US, Australia, Canada and so on.


I also found some correlations between words and history. Try "voortrekker", for example, which shows a peak around 1840 and another one at 1900. The same applies to "zulu" -- it peaks around the time when the Zulu and white people started meeting more often (1840ish) and then again when the British had their war against the Zulus in 1880.


 
Lingua 5B
Lingua 5B  Identity Verified
Bosnia and Herzegovina
Local time: 10:16
Member (2009)
English to Croatian
+ ...
Very useful... Dec 18, 2010

Thanks very much for sharing this, Suzan.

[Edited at 2010-12-18 18:50 GMT]


 
Karoline Spiessl
Karoline Spiessl  Identity Verified
New Zealand
Local time: 22:16
English to German
+ ...
Books Ngram Viewer and translation Dec 18, 2010

Now there have been a few posts here about the Books Ngram Viewer as a tool to play around with. But how can we use Google's new data source for translation?

First: Rule out everything that will be published online. Google Insights, AdWords Keyword Tool and Google Trends are the tools of choice for search engine optimization and other forms of web-based translation.

Secondly: How many of us are translating books? For those who are: The tool will be a better help for you if you are not sure of the use of certain words, mostly that of two similar words. By using Ngram you can see what is more common for the time frame your book sits in.

Thirdly: "Email" etc. before the 20th century. This is most probably an OCR (Optical Character Recognition) mistake - see Google reference. http://ngrams.googlelabs.com/info

You can apply the tool if you are doing some linguistic research, even potentially in translation studies. Example: Choose 2 or 3 synonyms for which you want to know how often they were used in books over a certain period of time. Then compare 2 or 3 translations of these synonyms in your target language - and you should get a rough idea of the translation in that period!
Have a look at the corpora first before drawing any conclusions!

To summarize: Ngram Book Viewer's use is limited for translators, but it can, with reservations and depending on its overall academic acceptance, generate new approaches in research.


 
Henk Peelen
Henk Peelen  Identity Verified
Netherlands
Local time: 10:16
Member (2002)
German to Dutch
+ ...
SITE LOCALIZER
Also useful for Google itself: Google Translate Dec 18, 2010

I also think its use for translators is limited. Not so many translators need to know the use of voortrekker, kindergarten, Oktoberfest in English or colour versus color and realize versus realise. I guess Google in the first instance uses it for its own research, maybe to improve Google Translate. I think better OCR techniques, more corpora and combination with other online sources may bring more reliable information ... not really a sensational conclusion. But it tells a lot about trends, immigration, hypes, movements, developments, wars, celebs and so on.


 
Suzan Hamer
Suzan Hamer  Identity Verified
Netherlands
Local time: 10:16
English
+ ...
TOPIC STARTER
That's why I posted it "off topic", Dec 18, 2010

Karoline and Henk. I wasn't thinking of it as being of any use to translators, but perhaps of interest in general to people who work (and play) with words.

[Edited at 2010-12-18 20:49 GMT]


 
Rachel Fell
Rachel Fell  Identity Verified
United Kingdom
Local time: 09:16
French to English
+ ...
Compare? Dec 18, 2010

Wordeffect wrote:

The "compare" function is particularly interesting: tomato v. potato (1600-1800), sofa v. settee (1800-2000), canal v. railway (1700-2000).

There'll be no stopping me now...

I don't see where the Compare function is, but agree it looks an interesting time consumer!


 
Suzan Hamer
Suzan Hamer  Identity Verified
Netherlands
Local time: 10:16
English
+ ...
TOPIC STARTER
To compare terms, Dec 18, 2010

type (at least two) words, separated by commas, in the search box. I think it should work for phrases separated by commas also.

Yes, in light of another forum thread about focusing on your work, I apologize for posting this. It is a time eater, if you can't control yourself.

[Edited at 2010-12-18 23:50 GMT]
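
For anyone who wants to script comparisons like tomato v. potato (1600-1800), here is a minimal Python sketch of the comma-separated query described above. Note the assumptions: the base URL and the parameter names `content`, `year_start` and `year_end` are not given anywhere in this thread and may not match what the viewer actually accepts, and the helper names `build_compare_query` and `build_compare_url` are made up for illustration.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL and parameter names are not stated in the thread;
# adjust them to whatever the Ngram Viewer actually accepts.
BASE_URL = "https://books.google.com/ngrams/graph"


def build_compare_query(terms):
    """Join words or phrases with commas, as typed into the search box."""
    cleaned = [t.strip() for t in terms if t.strip()]
    if len(cleaned) < 2:
        raise ValueError("need at least two terms to compare")
    return ",".join(cleaned)


def build_compare_url(terms, year_start, year_end):
    """Build a (hypothetical) viewer URL for a comparison over a year range."""
    params = {
        "content": build_compare_query(terms),
        "year_start": year_start,
        "year_end": year_end,
    }
    # urlencode percent-encodes the commas and turns spaces into '+'
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_compare_url(["tomato", "potato"], 1600, 1800))
    print(build_compare_url(["sofa", "settee"], 1800, 2000))
```

The comma-joining is the only part taken from the thread; everything about the URL is a guess to be checked against the tool itself.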


 