Friday, May 25, 2007

Washington Scandals and Baby Names

Updated 5/26/07: Added Technical Appendix

The appearance this week of Monica Goodling before the House Judiciary Committee sparked a conversation in the Political Arithmetik household about a previous Monica-related Washington scandal. It perhaps says something about our household that this provoked a search for empirical evidence concerning the effect of the Clinton-Lewinsky scandal on the popularity of Monica as a name. Was it urban legend that the scandal had an effect? Was the effect large or small? Was it immediate? Let's run the numbers.

Monica was a reasonably popular name in the 1970s, ranking between 39th and 56th over the decade. As it happens, both Monica Lewinsky and Monica Goodling were born in the summer of 1973, two weeks apart, when the name was ranked 40th, its second-highest ranking. (Monica ranked between 59th and 141st in the 1960s.) [My thanks to my colleagues at the coffee shop for suggesting I check tennis player Monica Seles, who turns out to also be a 1973 baby. Granted, she isn't connected to a DC scandal despite being born in 1973, and being born in the former Yugoslavia makes the relevance to our current investigation a tad suspect.]

If we were going to pick a name to go with a DC scandal from babies born in 1973, better bets would have been Jennifer, Amy, Michelle, Kimberly, Lisa, Melissa, Angela, Heather, Stephanie or Rebecca, the top 10 girls' names that year. But Monica at 40th wasn't rare by any means.

The 1970s were the peak years for Monicas. By the 1990s the name had slowly but steadily declined to rank between 76th and 88th during 1990-1997.

And then the events of 1998 intervened. The Clinton-Lewinsky scandal broke on January 21, 1998, reached its fevered peak by the end of 1998 with the impeachment of President Clinton, and was resolved by the Senate's failure to convict on February 12, 1999. Of course that didn't prevent late-night comics from continuing to milk the material for months, years, perhaps forever after.

The impact on parents was immediate, but not as drastic as I had expected. There were 11 months of 1998 in which the scandal's impact could be felt. And the ranking of Monica dropped from 79 in 1997 to 105 in 1998, a substantial but not precipitous drop. Of course events were unfolding during this year, so perhaps it is reasonable to focus on 1999, by which time surely every expectant parent in America would be aware of the Clinton scandal.

And in 1999 the ranking of Monica did fall dramatically, to 151, just a bit below where it stood in 1960.

So indeed, the impact of the scandal produced an immediate and substantial response, as one would surely expect. No urban legend this.

But what I find fascinating is the continued decline since 1999. I would expect the impact to be greatest in the immediate aftermath of the infamous episode and to level off or perhaps even abate thereafter. Instead, the data suggest a much slower response and a much longer diffusion of unpopularity through the population. Having dropped 72 places between 1997 and 1999, the popularity of Monica dropped ANOTHER 99 places from 1999 through 2006, the last year for which we have data, to now stand at the 250th name on the popularity list.

One interesting speculation is to consider the effect of the Clinton-Lewinsky scandal on the parents who are just now having baby girls. Many of them would have been in their teens or early 20s during the height of the scandal, compared to parents of 1999 or 2000 who would have been on average 7 or 8 years older. I wonder if the impact of the scandal was larger on teenage and college-age parents-to-be. These are ages not noted for consumption of political news, but they are ages extremely well known for crude sexual humor, for which Kenneth Starr provided an abundant supply of raw material. So I wonder if this cohort that is now giving birth was somewhat more affected by the scandal than were even slightly older cohorts who were past the age of campus humor as well as early sexual development. That could explain the continued and steady decline in use of Monica as a girl's name. It would also predict a leveling off once births come to be dominated by cohorts who were too young to understand the Clinton-Lewinsky scandal at the time.

The alternative is a slow diffusion of unpopularity throughout the culture, which is having an increasing effect regardless of personal experience with the scandal. If so, there is little reason to expect a leveling off of ranking. But there is also a puzzle about why the cultural diffusion is as slow as it has been.

It seems unlikely that Monica Goodling's testimony will significantly reduce the already declining popularity of the name. But given the current standing of "Monica", it is much less likely that a DC scandal in 2035 or so will feature a Monica in the starring role.

Prospective parents may want to visit the source of these data, the Social Security Administration's Popular Baby Names site.

A superb academic study of the sociology of naming babies is A Matter of Taste: How Names, Fashion and Culture Change, by Stanley Lieberson.


Technical Appendix (added 5/26/07)

Warning: This is the really geeky part. Unless you think log2(x) is really cool, you might want to turn back now!

"Professor M" posted a comment on the cross post of this article at Pollster.com. Rather than "geek up" Pollster, I'm replying here. (This was supposed to be a "just for fun" post, after all.)

His/her comment is:

"Hmmm. Try graphing the percent of babies given the name Monica in each year instead of the popularity rank. I think your discussion might change."

The good Professor M makes an excellent point. Let's think why. The actual rate of name use is quite small, even for the most popular names. For example, in 2006 the most popular name for girls was Emily. That name was used for 1.0267% of girls born. The number 2 name was Emma, 0.9159%. This difference in percentages is actually rather large. When we get down to ranks 101 and 102 we find Mya at 0.1602% and Amanda at 0.1599%. When we get down to Monica at 250, the rate is 0.0650% and for Carly at 251 the rate is 0.0649%.

So the rate of name use gets closer together for adjacent ranks as we go from more popular to less popular ranks. In my plots above, a change of one rank is the same vertical distance in the plot whether we are going from 1 to 2 or 100 to 101 or 250 to 251. But the percentage rates would not be changing by the same amount for each of those ranks. Instead, the difference in percentage rate would be getting smaller as we go from more popular to less popular rankings. In techie terms, the relationship between rank and percentage use is non-linear. And that can produce a different look to the plot, as Professor M suggests. So let's take a look.
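The shrinking gaps can be checked directly from the 2006 figures quoted above. The short Python sketch below is only an illustration: the dictionary and function names are made up for this example, and rates are expressed per 10,000 girls born (the percentage times 100).

```python
# Percent of girls given the name at each 2006 rank, as quoted in the text.
PCT_BY_RANK = {
    1: 1.0267,    # Emily
    2: 0.9159,    # Emma
    101: 0.1602,  # Mya
    102: 0.1599,  # Amanda
    250: 0.0650,  # Monica
    251: 0.0649,  # Carly
}


def adjacent_gap_per_10k(rank):
    """Drop in rate per 10,000 girls when moving from `rank` to `rank + 1`."""
    gap_pct = PCT_BY_RANK[rank] - PCT_BY_RANK[rank + 1]
    return round(gap_pct * 100, 2)  # percent -> per 10,000


for rank in (1, 101, 250):
    print(f"rank {rank} -> {rank + 1}: {adjacent_gap_per_10k(rank)} per 10,000")
```

One step in rank is worth about 11 per 10,000 at the top of the list, but only about 0.01 per 10,000 down near rank 250.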

I've converted the percentages into rates per 10,000 girls born, just to avoid the decimal points. That makes no difference to the look of the plot. So let's look at what Professor M suggests:

And behold! As Professor M suggested, the look is a bit different. What appears as a continued sharp drop after 1999 in my plot of rankings now looks like a continued but gentler decline, with much more of the decline coming between 1997 and 1999. Also, the declining popularity of Monica between 1973 and 1997 appears more substantial, dropping from 41 per 10000 to 22 per 10000.

So Professor M's point is well taken. The changes in rates are significantly different from the changes in ranks. The popularity of Monica has continued to decline since 1999, but not nearly so dramatically as it appears in my ranking graph.

But...

Is the raw percentage (or per 10,000) rate the right measure either? As the rate approaches zero, it becomes impossible to decrease by a constant amount. From 1973 to 1997, the rate of use of Monica fell from 41.0 to 22.1 per 10,000, a decline of 18.9. But in 1999 the rate was 10.96 per 10,000. It would be impossible for that to decline by another 18.9, lest we end up with a negative rate of name use! The point is, a constant change in the raw rate is impossible as we approach low incidence of the name. So perhaps linear change in the rate is also not a good way to model this.
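The arithmetic behind that objection is small enough to spell out. This is just a sketch using the three rates quoted above; the variable names are illustrative.

```python
# Rates of "Monica" per 10,000 girls born, as quoted in the text.
RATE_1973 = 41.0
RATE_1997 = 22.1
RATE_1999 = 10.96

# Size of the 1973-1997 decline in raw rate units.
earlier_drop = RATE_1973 - RATE_1997

# Apply the same absolute drop again, starting from the 1999 rate.
projected = RATE_1999 - earlier_drop

print(f"1973-1997 drop: {earlier_drop:.1f} per 10,000")
print(f"1999 rate minus the same drop: {projected:.2f} per 10,000")
```

The projected value comes out below zero, which no rate of name use can be.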

An alternative is to think of the "half life" of the name use. This equates a fall of 1/2 from say 40 to 20 per 10000 with an equivalent proportionate change from 10 to 5 per 10000. This makes proportionate declines equal across the entire range of name rates. In effect, this says a fall of 1/2 in usage rate is the same wherever it occurs.

A simple way to measure this is to use the log base 2 of the rate per 10000. In base 2, each unit increase on the log2 scale is a doubling of the rate. So 1=log2(2), 2=log2(4), 3=log2(8), 4=log2(16), 5=log2(32) and 6=log2(64). Those values cover the range of Monica rates, and the critical point is that each 1 unit increase is a doubling and each 1 unit decrease is a halving of the rate of use.
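A quick way to see the "half life" property is to compare the two halvings mentioned above (40 to 20 and 10 to 5) on the log2 scale. The helper name below is hypothetical, chosen only for this sketch.

```python
import math


def log2_change(rate_from, rate_to):
    """Change in log2(rate) when the rate moves from rate_from to rate_to."""
    return math.log2(rate_to) - math.log2(rate_from)


# Two halvings at very different levels of usage.
print("40 -> 20:", log2_change(40, 20))
print("10 ->  5:", log2_change(10, 5))

# The reference points listed in the text: log2 of 2, 4, 8, 16, 32, 64.
print([math.log2(x) for x in (2, 4, 8, 16, 32, 64)])
```

Both halvings register as a change of one unit downward, even though the raw drops (20 versus 5 per 10000) differ by a factor of four.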

Replotting the data on this log2(rate per 10000) scale produces the following:

Now we see that from 1973 to 1997 the log2 rate fell from 5.4 to 4.5, or almost a full unit, representing nearly a halving of the rate. From 1997 to 1999 it fell from 4.5 to 3.5, another halving. And from 1999 to 2006 it fell from 3.5 to 2.7, a bit less than another halving.

On this scale of proportionate change then, the drop from 1997 to 1999 is huge, a full halving of the rate (from 22.1 per 10,000 to 10.96) in just 2 years. The subsequent decline from 10.96 to 6.50 is a 41% decrease in rate over 7 years.
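For readers who want to reproduce these figures, here is a minimal sketch using only the four rates quoted in this appendix. The dictionary and function names are invented for the example.

```python
import math

# Rates of "Monica" per 10,000 girls born, as quoted in the text.
RATES = {1973: 41.0, 1997: 22.1, 1999: 10.96, 2006: 6.50}


def log2_rate(year):
    """log2 of the per-10,000 rate, rounded to one decimal as in the text."""
    return round(math.log2(RATES[year]), 1)


def pct_decline(start, end):
    """Percentage decrease in the raw rate between two years."""
    return 100 * (RATES[start] - RATES[end]) / RATES[start]


for year in RATES:
    print(year, log2_rate(year))

for start, end in ((1973, 1997), (1997, 1999), (1999, 2006)):
    print(f"{start}-{end}: {pct_decline(start, end):.0f}% decline")
```

Dividing each change by the number of years it took would give a per-year pace, which is another way to compare the two-year drop with the seven-year one.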

Now this plot is not identical to my ranking plot, but it is pretty close. The qualitative description in my original post applies about as well to this one as it did to the ranking plot. So I stand by my original comments.

I had not looked at these issues before Professor M's comment, so I am very grateful to him/her for pointing this out. And indeed, as we saw above, the raw rates do look somewhat different. But on reflection, prompted by that comment, I think the log2 rate is probably the most reasonable way to look at this. The ranks alone can be misleading because the equal intervals between ranks distort the changes in rate. But the raw rates are also misleading because changes cannot remain constant when there is a lower limit of zero usage which we approach. Proportionate change seems more compelling in this case, and log2 is a convenient and easy-to-understand approach to this.

And one last technical point. The plot of rate against rank is strongly non-linear, as Professor M implies. The plot of log2(rate) against rank is much closer to linear, though with some continued bend. This is why my final log2 plot above more closely resembles the rank plot. Since log2 rate is close to linear with rank, the two plots must look quite similar.
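A rough check of this claim is possible with just three of the 2006 data points quoted earlier (ranks 1, 101 and 250). The sketch below compares how much the slope against rank changes between the upper and lower stretches, first for the raw rate and then for log2(rate). With only three points this is an illustration rather than a fit, and the helper names are hypothetical.

```python
import math

# Percent of girls given the name at three 2006 ranks, as quoted in the text.
PCT_BY_RANK = {1: 1.0267, 101: 0.1602, 250: 0.0650}

# Convert to rate per 10,000 and to log2 of that rate.
rate = {rank: pct * 100 for rank, pct in PCT_BY_RANK.items()}
log2_rate = {rank: math.log2(r) for rank, r in rate.items()}


def slope(values, rank_lo, rank_hi):
    """Average change in `values` per step in rank between two ranks."""
    return (values[rank_hi] - values[rank_lo]) / (rank_hi - rank_lo)


# If a measure were exactly linear in rank, this ratio would be 1.
raw_bend = slope(rate, 1, 101) / slope(rate, 101, 250)
log_bend = slope(log2_rate, 1, 101) / slope(log2_rate, 101, 250)

print(f"raw rate: upper slope is {raw_bend:.1f}x the lower slope")
print(f"log2 rate: upper slope is {log_bend:.1f}x the lower slope")
```

On these three points the raw rate is roughly 14 times steeper across ranks 1 to 101 than across ranks 101 to 250, while the log2 rate is only about 3 times steeper: much closer to a straight line, though still with some bend.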




12 comments:

Anonymous said...

Was Katie-Lynn (and its derivatives) popular ~20 years ago? I find the Katie-Lynns I see to be somewhat scandalous.

Rodolfo Espino said...
This post has been removed by the author.
mikelicht said...

This may be in the Lieberson book, but the choice of everyday European first-names-of-convenience by Asian Americans who keep their original names as legal names is fascinating. There seems to be a distinct set or pool of European names and it is not clear to me if these are related in sound or meaning to original Asian names.

Emma said...

This is fascinating material! I noticed the same thing happening in The Netherlands, on a sadder occasion, though. At the time Pim Fortuyn was murdered in 2002, he was the most influential politician in NL and about to win the elections. Link: http://www.angus-reid.com/polls/index.cfm/fuseaction/viewItem/itemID/15730

His name had never been more popular. All this changed in 2002. Instead of over 300 boys per year (which is a lot in NL) there were just over 100 who were named 'Pim' in 2003. The name seems to be gaining interest though, now that the murder is in the back of Dutch minds, but the charismatic leader is well remembered. I copied one of your graphics to my site, where I wrote about your post. You can read a bit more about the name Pim too. All in Dutch :-)

Anonymous said...

I do not know how to blog but it strikes me that one reason Monica was used so often as a girl's name was that Saint Monica, the mother of Saint Augustine, was/is very much looked up to as a wonderful example of a mother who did not give up on a wayward son.

Toejam Miami said...

It's even worse for Hillary

txmichael said...

When I looked at the graph, it seemed that after two decades of a steady decline curve, the 90's (up until 98) seemed to flatten out and even had a slight uptick pattern happening.

This is probably due to the hugely popular television show, Friends.

If you take those years out, the downward trend pretty much matches the end result of the drop between 97 and 99.

The Washington sex scandal basically cancelled out the Hollywood fame effect.

Jody said...

The first two graphs in your technical appendix raise a competing issue at work in the Monica popularity: the TV effect. "Friends" came on-air in 1994 and achieved cultural saturation in 1995, at just the moment when Monica bumped out of its long-term declining trend. And of course Monica was not only the name of one of the lead characters, but also the name of the one character most recognizable in those early years of broadcast.

I would hazard a guess that Monica might have kept its higher per-10000 status into the late 1990s if not for the Clinton-Lewinsky scandal, simply because of the ongoing cultural importance of "Friends" to young women in their twenties and early thirties. And I find your arguments about the sexualization of the name for then-teenagers convincing. But I did want to point out the obvious cultural effect of the TV show as demonstrated in those two graphs, because it's so striking.

[The sudden and continued surge in popularity for the name "Phoebe" after 1995 supports the argument that "Friends" had some effect on baby-name choice, although none of the other names used by the characters enjoyed that same surge, with the brief exception of "Chandler."]

Dwight McCabe said...

While I find your analysis compelling, I would want to see how other reasonably popular names fare over several decades. There is obviously a long term variance in the popularity of baby names and I would want to see how the pattern for Monica compared to the expected variance for any similar name.

For example, when my wife and I were discussing possible girl's names for our soon to be born first child, we quickly discarded our mothers' and grandmothers' names. Our grandmothers' names in particular seemed very old fashioned. (My daughter owes us her undying gratitude for not naming her Mildred or Geraldine.)

MSS said...

Somewhere on some blog or other website, I read something rather similar about the name Madison for a girl. The short story was that there was something of an urban legend about this name, too: that it took off around the time of the movie "Splash," which featured Daryl Hannah's mermaid character looking up at a street sign and picking "Madison" as her name.

It turned out that the name did indeed enjoy a surge in popularity, but it was delayed. That is, it was girls who had been just before child-bearing age at the time the movie was popular who began using the name in substantial numbers, rather than the young women who were already childbearing at the time. Pretty much the same pattern as we see with the decline in popularity of "Monica."

Oh, and I pass the geek test, because not only did I find this fascinating, but as soon as I looked at your scale on the first graph, I had Professor M's objection, too!

Anonymous said...

Has anybody created longitudinal graphs for all the characters on Friends, not just for Monica?

Did the popularity of that sit-com cause "upticks" for, say, Chandler?

Cleveland Kent Evans said...

Yes, Friends actually had its greatest impact on Chandler. In 1993, the year before the show began, there were 703 boys born in the USA named Chandler. In 1995, there were 1,854. Chandler's peak year of use was 1999, when 2,393 American boys received the name. It has been receding since, with only 741 born in 2006.

There was much less impact of the show on Ross or Joey, because these were not "new" names at the time. New parents aren't naming their children "after" characters on television shows; they are using the TV shows to find the "different but not too different" names they are seeking. So a character with a name common in the generation of the parents or grandparents will not give much of a boost to the name's popularity. It is the names which sound "new" to young parents that are affected.