MLA Language Map FAQ
Of the 62,431,447 people (21% of the entire United States population over five years old) whom the American Community Survey calculates speak languages other than English at home, 59.3% report that they also speak English “very well,” 19.2% that they speak English “well,” 14.8% that they speak English “not well,” and 6.8% that they do not speak English at all (2015 American Community Survey, 5-Year Estimates, Table B16004 [aggregate data]). The MLA Language Map Data Center sometimes collapses these four categories into two, suggesting an upper and a lower range of English ability. This is done on the assumption that gradations between “well” and “very well” and between “not well” and “not at all” will vary according to factors such as an individual’s sense of what ability ought to be, whereas a broad distinction between the upper and lower ranges of ability is likely to be less subjective. The American Community Survey sometimes contrasts the first category (speaks English “very well”) with a single figure that combines the other three categories as “less than very well.” For a discussion of this approach, see Robert Kominski, “How Good Is ‘How Well’? An Examination of the Census English-Speaking Ability Question,” 1989.
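The collapsing described above is simple addition over the four reported shares. The following is a minimal sketch in Python using the percentages quoted in this answer; the variable names and grouping labels are illustrative assumptions, not part of the Data Center.

```python
# Shares of people who speak a language other than English at home,
# by self-reported English ability (percentages quoted in the answer above).
ability = {
    "very well": 59.3,
    "well": 19.2,
    "not well": 14.8,
    "not at all": 6.8,
}

# Two-way split: an upper range and a lower range of English ability.
upper = ability["very well"] + ability["well"]
lower = ability["not well"] + ability["not at all"]

# ACS-style contrast: "very well" versus the other three categories combined.
less_than_very_well = sum(v for k, v in ability.items() if k != "very well")

print(f"upper range: {upper:.1f}%")
print(f"lower range: {lower:.1f}%")
print(f"less than very well: {less_than_very_well:.1f}%")
```

Because the published shares are rounded, the two collapsed ranges may not sum to exactly 100%.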
“America Speaks: A Demographic Profile of Foreign-Language Speakers for the United States: 2000,” published in 2006 by the Census Bureau, summarizes data by language spoken in terms of age, sex, race and Hispanic origin, nativity, citizenship, year of entry, place of birth, level of school enrollment, educational attainment, employment status, and employment class. View the 104 tables reporting these data. Languages are collapsed into four categories in this profile: Spanish, other Indo-European languages, Asian and Pacific Island languages, and all other languages. The ages of speakers in the tables are broken out in eighteen intervals. See also “Language Use in the United States: 2019.”
All data were recorded in response to the question “Does the person [responding] speak a language other than English at home?” Respondents who answer in the affirmative are then asked to name the language spoken. Responses for each home are provided by one person, who reports on each individual household member. Note that some people who do not speak a language consistently at home may declare themselves speakers or include household members as speakers out of loyalty to family, culture, or nation; some people, by contrast, may prefer not to mention that they speak a language other than English.
In the United States, 2,455 two- and four-year colleges and universities reported language enrollments of 1,182,562 in fall 2021. The languages most frequently studied in 2016 were, in alphabetical order, American Sign Language, Arabic, Chinese/Mandarin, French, German, Ancient Greek, Biblical Hebrew, Modern Hebrew, Italian, Japanese, Korean, Latin, Portuguese, Russian, and Spanish. Ranked in descending order of number of college enrollments, they are as follows: Spanish, French, American Sign Language, Japanese, German, Chinese/Mandarin, Italian, Arabic, Latin, Korean, Russian, Ancient Greek, Biblical Hebrew, Portuguese, and Modern Hebrew. In addition, 228 other languages, from Aaniiih to Zulu, were studied in US postsecondary institutions in 2021 (see Lusin et al., Enrollments in Languages Other Than English in United States Institutions of Higher Education, Fall 2021). The MLA Language Enrollment Database provides access to language enrollments in the United States since 1958 by state and by institution. You can also select Institutional Enrollments on the map to see the locations of language programs in US colleges and universities in fall 2021. Language programs are identified by bubbles, sized by numbers of enrollments; click on language program bubbles to see institution names, languages, and enrollment numbers.
Native Languages of the Americas provides a comprehensive list of languages indigenous to the Americas and, along with language families and language lists, includes maps and information on cultures, tribes, and resources. Ethnologue and Glottolog also include comprehensive information on Indigenous languages in the Americas. The information across all three databases is substantially similar, but there are variations in family names, language trees, and the spelling of languages, among other details.
In a number of counties, two or more languages are tied as the most predominant after English, or after English and Spanish, with the same number of enrollments in each. In those instances, the map shows whichever language comes first in alphabetical order. Learn more.
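The tie-breaking rule described above can be sketched as follows. This is an illustration only: the function name and the county figures are hypothetical, and the map's actual implementation is not documented here.

```python
def language_to_display(counts):
    """Return the language with the highest count; break ties alphabetically."""
    top = max(counts.values())
    tied = [language for language, n in counts.items() if n == top]
    # min() on strings picks the name that comes first in alphabetical order.
    return min(tied)

# Hypothetical county in which two languages are tied for the top count.
county = {"Tagalog": 120, "German": 120, "Vietnamese": 95}
print(language_to_display(county))
```

With these hypothetical figures, German and Tagalog are tied, so the alphabetically earlier name is the one displayed.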
The US Department of Education encourages the learning of languages, particularly less commonly taught languages (LCTLs), through its sixteen Title VI Language Resource Centers (LRCs) nationally. The LRCs define a less commonly taught language as any language other than French, German, or Spanish (LRC Brochure 14). More information is available on the main site, but here are some of the LRCs that are especially helpful to educators and language learners of LCTLs:
- CARLA - The University of Minnesota's Center for Advanced Research on Language Acquisition (CARLA) features a database of postsecondary institutions in the United States that teach less commonly taught languages. There is also a section on developing classroom materials for teaching LCTLs.
- CeLCAR - Indiana University Bloomington’s Center for Languages of the Central Asian Region (CeLCAR) has collected textbooks, phrasebooks, alphabet charts, and many other resources for central Asian languages from Armenian to Uzbek, some available as a free download, in the Materials section.
- COERLL - The University of Texas at Austin’s Center for Open Educational Resources and Language Learning (COERLL) produces and disseminates free teaching and learning resources for many languages, including LCTLs, under open license agreements.
- NRCAL - The National Resource Center for Asian Languages (NRCAL) at California State University, Fullerton, develops resources for Chinese, Japanese, Khmer, Korean, and Vietnamese.
- SEELRC - Duke University’s Slavic and Eurasian Language Resource Center (SEELRC) develops teaching materials and maintains a list of digital resources for some 40 Slavic and Eurasian languages.
Also, the National Council of Less Commonly Taught Languages (NCOLCTL) provides links to organizations and resources that may help.
The Ethnologue language name index has an alphabetical listing of over 7,000 of the world's languages. Ethnologue provides such details as countries in which languages are spoken, numbers of speakers, variations in language names, dialect variants, and related languages. Glottolog is another comprehensive database of the world’s languages, identifying families and language trees, as well as dialects. Both Ethnologue and Glottolog include maps indicating where languages originate, estimates of numbers of speakers, and whether the language is considered endangered; while most of the information across the databases is substantially similar, there are variations in family names, language trees, and the spelling of languages, among other details.
Estimates differ as to the numbers of primary (mother tongue) speakers of the world's most-spoken languages. The following listing is based on figures published in the 1990s: Mandarin Chinese (726 million; all Chinese languages, 1,071 million); English (427 million); Spanish (266 million); Hindi/Urdu (223 million); Arabic (181 million); Portuguese (165 million); Bengali (162 million); Russian (158 million); Japanese (124 million); German (121 million); French (116 million); Javanese (75 million); Korean (66 million); Italian (65 million); Panjabi (60 million); Marathi (58 million); Vietnamese (57 million); Telugu (55 million); Turkish (53 million); Tamil (49 million); Ukrainian (45 million); Polish (42 million). These figures do not include second-language speakers (David Crystal, The Cambridge Encyclopedia of Language, 3rd ed., Cambridge UP, 2010, p. 297).
Ethnologue includes more recent estimates of both native speakers and total speakers; the four languages with the most native speakers are Mandarin Chinese (940 million), Spanish (485 million), English (380 million), and Hindi (345 million) (“What is the most spoken language,” Ethnologue, accessed 9 Jan. 2024). The World Factbook lists the 2018 most-spoken first languages by percentages, with the top languages being Mandarin Chinese (12.3%), Spanish (6%), English (5.1%), Arabic (5.1%), Hindi (3.5%), Bengali (3.4%), Russian (3.2%), Portuguese (3.2%), and Urdu (2.9%) (“Languages,” The World Factbook 2021, Central Intelligence Agency, 2021).
If all language speakers, including those for whom a language is not the mother tongue, are counted, the list changes. According to Ethnologue, the top 10 most-spoken languages in 2023 are English (1.5 billion), Mandarin Chinese (1.1 billion), Hindi (609.5 million), Spanish (559.1 million), French (309.8 million), Standard Arabic (274 million), Bengali (272.8 million), Portuguese (263.6 million), Russian (255 million), and Urdu (231.7 million) (“What are the top 200 most spoken languages,” Ethnologue, accessed 9 Jan. 2024). This ordering is largely mirrored in the CIA’s estimates for 2022: English (18.8%), Mandarin Chinese (13.8%), Hindi (7.5%), Spanish (6.9%), French (3.4%), Arabic (3.4%), Bengali (3.4%), Russian (3.2%), Portuguese (3.2%), and Urdu (2.9%) (“Languages,” The World Factbook 2021, Central Intelligence Agency, 2021).
Estimates also differ as to the total number of languages currently spoken; David Crystal notes that “[m]ost reference books give a figure of 6,000–7,000, but estimates have varied from 3,000 to 10,000” (294). The CIA estimates there are “just over 7,151 languages spoken in the world” as of 2022, with about 80% of these languages “spoken by less than 100,000 people” (The World Factbook).
Between 2000 and 2021, the foreign-born population of the United States recorded by the United States Census and the American Community Survey increased by 14.16 million; this is an increase of over 1.5 million since 2016. This number includes, among other categories identified by the Migration Policy Institute, naturalized US citizens; lawful permanent residents; legal nonresidents, such as those holding student or work visas; and authorized and unauthorized immigrants and migrants. The Migration Policy Institute provides an interactive map showing both national and state-by-state demographic, education, workforce, and income data for immigrant and migrant populations in the United States in 1990, 2000, and 2021. View additional information regarding immigrants and migrants in the United States.
The data used for the maps on this website are drawn from the 2011–2015 American Community Survey (ACS), Aggregate Data, 5-Year Estimates. The 2010 and 2015 US, Regional, and State data in the data center are drawn from ACS 5-Year Estimates and Public Use Microdata Sample (PUMS). Both ACS and PUMS are estimates based on 60 months of data and are considered to be comparable to data collected in 2000 on the US Census long form.
Annual data collection by ACS provides up-to-date information throughout the decade, solicited from about three million households each year and producing about two million completed records. ACS data are reported in one-year, three-year, and five-year estimates.
County data for 2000 and all zip code data used in the data center were collected on the US Census Long Form (2000 Census Summary File 3), which was distributed to approximately one in six US households. County data for 2010 used in the data center are drawn from the 2006–10 ACS, Aggregate Data, 5-Year Estimates. County data for 2015 used in the data center are drawn from the 2011–15 ACS, Aggregate Data, 5-Year Estimates. US, Regional, and State data for 2000 in the data center were collected on the 2000 Census Long Form and reported in a special tabulation of Census 2000 data (STP 258), commissioned by the Modern Language Association. The 2005 data in the data center are taken from the 2005 ACS one-year estimate.
Although data from one-year estimates may be more current, five-year estimates have a larger sample size and are therefore generally more precise, particularly when studying small areas such as counties, small population groups, or languages spoken by relatively few people. All ACS and US Census data about language are based on sampling and may be somewhat different from data that would have been obtained if all the census respondents had been asked about their language use. The Census Bureau uses statistical formulas to determine the possible degree of error in a given sample. Because sample size varies depending on the size of the language community (e.g., Spanish in Los Angeles vs. Yiddish in Detroit), the possible degree of error also varies, but in all cases it is very small. Because different formulas are used to calculate different estimates (e.g., Aggregate Data vs. Public Use Microdata Sample), totals may vary between estimates. For more information about methodology, see the US Census Bureau’s Design and Methodology Report. In addition, extensive documentation on PUMS data is available on the Census Bureau’s site.
After 2015, because of privacy regulations, data on languages spoken at home are not available by county. Instead, the ACS uses Public Use Microdata Areas (PUMAs). The MLA is exploring the option of adding PUMAs to the map at some point and will then update the map to use the most recent US Census data.
Visit the US Census website, where users can explore Census data. For more information, see the following resources:
- American Community Survey (ACS)
- Understanding and Using American Community Survey Data: What Researchers Need to Know
- Design and Methodology, American Community Survey
- 2011-2015 PUMS Accuracy of the Data
- PUMS Documentation
The Census Bureau’s American Community Survey presents this information in a number of different ways; for example, in a report on language use in the United States in 2019 (Sandy Dietrich and Erik Hernandez, Language Use in the United States: 2019, US Census Bureau, Aug. 2022), an inset on page 6 lists four major language groups:
Spanish includes Spanish, Spanish Creole, and Ladino.
Other Indo-European Languages include most languages of Europe and the Indic languages of India. These include the Germanic languages such as German, Yiddish, and Dutch; the Scandinavian languages such as Swedish and Norwegian; the Romance languages such as French, Italian, and Portuguese; the Slavic languages such as Russian, Polish, and Serbo-Croatian; the Indic languages such as Hindi, Gujarati, Punjabi, and Urdu; Celtic languages; Greek; Baltic languages; and Iranian languages.
Asian and Pacific Island Languages include Chinese; Korean; Japanese; Vietnamese; Hmong; Khmer; Lao; Thai; Tagalog or Filipino; the Dravidian languages of India such as Telugu, Tamil, and Malayalam; and other languages of Asia and the Pacific, including the Philippine, Polynesian, and Micronesian languages.
All Other Languages include Uralic languages such as Hungarian; the Semitic languages such as Arabic and Hebrew; languages of Africa; Native North American languages, including the American Indian and Alaska Native languages; and indigenous languages of Central and South America.
According to this report, “The full list of languages is not available in data products or public-use files due to confidentiality restrictions that apply to all data released by the Census Bureau. The most detailed tables released for 2019 contain 42 language categories” (6).
In table 2, found on page 8 in this same report, the languages and language groupings listed, in addition to English and Spanish, are as follows (note that languages within each category in table 2 have been reordered alphabetically in the lists below):
Asian and Pacific Island Languages
Chinese (including Mandarin and Cantonese), Hmong, Ilocano/Samoan/Hawaiian/Other Austronesian Languages, Japanese, Khmer, Korean, Malayalam/Kannada/Other Dravidian Languages, Other Languages of Asia, Tagalog (including Filipino), Tamil, Telugu, Thai/Lao/Other Tai-Kadai Languages, Vietnamese
Other Indo-European Languages
Armenian, Bengali, French (including Cajun), German, Greek, Gujarati, Haitian, Hindi, Italian, Nepali/Marathi/Other Indic Languages, Other Indo-European Languages, Pennsylvania Dutch/Yiddish/Other West Germanic Languages, Persian (including Farsi and Dari), Polish, Portuguese, Punjabi, Russian, Serbo-Croatian, Ukrainian or Other Slavic Languages, Urdu, Yiddish
Other Languages
Arabic, Hebrew, Amharic/Somali/Other Afro-Asiatic Languages, Navajo, Other Native Languages of North America, Other and Unspecified Languages, Swahili or Other Languages of Southern Africa, Yoruba/Twi/Igbo/Other Languages of West Africa
Further information about language names and families can be found at Ethnologue, Glottolog, and Native Languages of the Americas.
View the continuously updated estimates provided by the US Census Bureau.
The United States Post Office provides a Zip Code Lookup tool.
Different percentage ranges are used for different languages because languages vary greatly in maximum density. In a set of ranges appropriate to a commonly spoken language, the gradations are not fine enough to bring out the variations in density of a less commonly spoken language. The percentage ranges used in the Language Map are determined by natural breaks in the data.
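One common way to find natural breaks is to choose class boundaries that minimize the total squared deviation within each class, which is the idea behind the Jenks natural breaks method. The sketch below does this by brute force for a small, hypothetical set of density percentages. It is an assumption for illustration; the source does not say which procedure the Language Map uses.

```python
from itertools import combinations

def sum_sq_dev(values):
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def natural_breaks(values, n_classes):
    """Split the sorted values into n_classes contiguous groups that minimize
    the total within-group squared deviation. Brute force: small inputs only."""
    data = sorted(values)
    best_cost, best_groups = None, None
    # Try every way of placing n_classes - 1 cut points between elements.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0, *cuts, len(data))
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_sq_dev(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups

# Hypothetical percentages of speakers of one language across eight areas.
density = [0.1, 0.2, 0.3, 2.0, 2.4, 2.5, 9.0, 9.5]
for group in natural_breaks(density, 3):
    print(f"{group[0]}% to {group[-1]}%")
```

Because the breaks follow the gaps in each language's own data, a less commonly spoken language gets finer ranges than a fixed scale tuned to a commonly spoken one would give it.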
The MLA Language Map uses the Mercator projection. Mercator is a cylindrical map projection: the meridians and parallels are straight, not curved, and they intersect at 90-degree angles. All maps inevitably contain some distortion, because they reduce the three-dimensional earth to a two-dimensional representation, the equivalent of trying to make an orange peel lie flat on a table. The distortion in the Mercator projection means that the farther one goes from the equator, the more the size of land masses is exaggerated. States in the northern part of the United States therefore appear larger than states of similar size in the South.
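The size of this distortion can be estimated directly. On a spherical Mercator map, lengths at a given latitude are stretched by the secant of that latitude relative to the equator, so areas are stretched by the secant squared. The short Python sketch below computes that factor; the function name and the sample latitudes are illustrative choices, not values taken from the Language Map.

```python
import math

def mercator_area_exaggeration(lat_deg):
    """Area scale factor of the spherical Mercator projection at a latitude,
    relative to the equator: sec^2(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

# Illustrative latitudes, in degrees north.
for label, lat in [("equator", 0.0), ("30 N", 30.0), ("45 N", 45.0)]:
    print(f"{label}: areas drawn x{mercator_area_exaggeration(lat):.2f}")
```

Under this approximation, a region near 45 degrees north is drawn about 1.5 times as large, relative to its true area, as a region of the same size near 30 degrees north.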
© 2024 Modern Language Association of America